This is an archive of the discontinued LLVM Phabricator instance.

Differential D40037

[CallSiteSplitting] Remove some indirection (NFC).
ClosedPublic

Authored by fhahn on Nov 14 2017, 9:54 AM.

Details

Reviewers

junbuml
mcrosier
davidxl

Commits

rG2a266a343fec: [CallSiteSplitting] Remove some indirection (NFC).
rL318593: [CallSiteSplitting] Remove some indirection (NFC).

Summary

With this patch I tried to reduce the complexity of the code slightly by
removing some indirection. Please let me know what you think.

Diff Detail

Event Timeline

fhahn created this revision.Nov 14 2017, 9:54 AM

Thanks Florian for cleaning up this pass. Overall it looks good, but this doesn't seem to be NFC, and it can miss an opportunity for a constant PHI in a non-OR structure. Please see my inline comments.

lib/Transforms/Scalar/CallSiteSplitting.cpp
374	In case HeaderBB has more than two successors (e.g., if a switch is its terminator), this change could allow splitting. If you want to keep this change NFC, the number of successors of HeaderBB may need to be checked.
381	We also split a call-site if it uses a PHI with all constant incoming values. In this case, Preds are not necessarily an OR structure. For example, in the diamond pattern below, we should split the call site.

    define i32 @test(i32* %a, i32 %v) {
    entry:
      br i1 undef, label %TBB0, label %TBB1
    TBB0:
      br i1 undef, label %Tail, label %End
    TBB1:
      br i1 undef, label %Tail, label %End
    Tail:
      %p = phi i32 [1, %TBB0], [2, %TBB1]
      %r = call i32 @callee(i32* %a, i32 %v, i32 %p)
      ret i32 %r
    End:
      ret i32 %v
    }
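
The "PHI with all constant incoming values" condition from the comment above can be sketched on a toy value model. This is only an illustration under stated assumptions: `ToyValue`, `isPHIWithAllConstantIncoming`, and `hasPHIPredicatedArgument` are hypothetical names, not LLVM classes or functions, and the pass's own `isPredicatedOnPHI` performs further checks that are not modeled here.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for an IR value; this is NOT llvm::Value.
struct ToyValue {
  bool IsConstant = false; // true for literals such as `i32 1`
  // Non-empty only for a PHI: (incoming value, name of the incoming block).
  std::vector<std::pair<const ToyValue *, std::string>> Incoming;
  bool isPHI() const { return !Incoming.empty(); }
};

// Returns true if V is a PHI whose incoming values are all constants.
bool isPHIWithAllConstantIncoming(const ToyValue &V) {
  if (!V.isPHI())
    return false;
  for (const auto &In : V.Incoming) {
    assert(In.first && "incoming value must not be null");
    if (!In.first->IsConstant)
      return false;
  }
  return true;
}

// Returns true if at least one call argument is such a PHI.
bool hasPHIPredicatedArgument(const std::vector<const ToyValue *> &Args) {
  for (const ToyValue *A : Args)
    if (A && isPHIWithAllConstantIncoming(*A))
      return true;
  return false;
}
```

In the diamond example, `%p` would be modeled as a `ToyValue` with two constant incoming entries (from `TBB0` and `TBB1`), so the call to `@callee` qualifies even though its predecessors do not form an OR structure.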

Added tryToSplitOnOrPredicatedArgument and tryToSplitOnPHIPredicatedArgument to make clear which checks are required for each. This changed the order of the predecessors in 2 test cases, as now we do not determine the HeaderBB for PHI predicated arguments.

I have also added the test case Jun suggested to make sure this change does not regress things.

fhahn added inline comments.Nov 15 2017, 5:03 AM

lib/Transforms/Scalar/CallSiteSplitting.cpp
374	I think for the OR case the checks for ICMP would catch this case later (as a switch on i1 can only have 2 successors, I think), but I could add an explicit check.
381	Done! I've separated the code paths splitting on PHI and on Or, as they have different legality checks.

junbuml added inline comments.Nov 15 2017, 10:24 AM

lib/Transforms/Scalar/CallSiteSplitting.cpp
374	Currently, the below test case is not handled, but with this change, we can split the call-site. Although it's not the OR structure, it doesn't seem to break the original intention of this pass. Looks like we can even extend this pass to support switch better. However, if we want to support this in this change, I think we should run tests.

    define i32 @test(i32* %a, i32 %i, i32 %v) {
    Header:
      switch i32 %i, label %End [
        i32 0, label %BB0
        i32 1, label %BB1
        i32 2, label %BB2
        i32 3, label %Tail
        i32 4, label %Tail
      ]
    BB0:
      br label %End
    BB1:
      br label %End
    BB2:
      %cmp = icmp eq i32 %v, 1
      br i1 %cmp, label %Tail, label %End
    Tail:
      %r = call i32 @callee(i32* %a, i32 %v)
      ret i32 %r
    End:
      ret i32 %v
    }
test/Transforms/CallSiteSplitting/callsite-split-or-phi.ll
232 ↗	(On Diff #123012)	Maybe test_cfg_no_or --> test_cfg_no_or_phi ?

In D40037#925314, @junbuml wrote:

Thanks Florian for cleaning up this pass. Overall it looks good, but this doesn't seem to be NFC, and it can miss an opportunity for a constant PHI in a non-OR structure. Please see my inline comments.

Just a side comment: this looks more and more like a more general code cloning optimization to enable constant/range propagation. You may want to consider renaming the pass and making it more general.

In D40037#926426, @davidxl wrote:

Just a side comment: this looks more and more like a more general code cloning optimization to enable constant/range propagation. You may want to consider renaming the pass and making it more general.

Yep, one of the reasons I tried to reduce some indirection is to make it easier to extend the pass :)

fhahn added inline comments.Nov 15 2017, 11:20 AM

lib/Transforms/Scalar/CallSiteSplitting.cpp
374	I think it would be best to keep this change NFC and go from there. What do you think? I plan to update the patch to only do the transform if the header has 2 successors in the OR case, and to post a follow-up patch enabling the additional transform after that.

junbuml added inline comments.Nov 15 2017, 11:41 AM

lib/Transforms/Scalar/CallSiteSplitting.cpp
374	Sounds good to me.

Add check to ensure we only split call sites in tryToSplitOnOrPredicatedArgument if the header block has exactly 2 successors. I plan to post a patch relaxing that soonish.
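
The legality check discussed in this thread (one predecessor of the call-site block is the single predecessor of the other, and the header has exactly 2 successors) can be sketched on a toy CFG. This is a rough illustration, not the code from the patch: `ToyBlock`, `addEdge`, `OrShape`, and `findOrShape` are hypothetical names, and the toy blocks only mimic the idea of `getSinglePredecessor()`.

```cpp
#include <cassert>
#include <vector>

// Toy basic block; this is NOT llvm::BasicBlock.
struct ToyBlock {
  std::vector<ToyBlock *> Preds;
  std::vector<ToyBlock *> Succs;
  // Mimics the idea of getSinglePredecessor(): the unique predecessor, or
  // nullptr if there are zero or several incoming edges.
  ToyBlock *singlePredecessor() const {
    return Preds.size() == 1 ? Preds.front() : nullptr;
  }
};

// Adds one CFG edge From -> To and keeps both edge lists in sync.
void addEdge(ToyBlock &From, ToyBlock &To) {
  From.Succs.push_back(&To);
  To.Preds.push_back(&From);
}

// Result of the shape check: the header and the second compare block.
struct OrShape {
  ToyBlock *Header = nullptr;
  ToyBlock *OrBB = nullptr;
};

// Returns true and fills Out if the two predecessors of CallSiteBB form an
// OR-like shape and the header has exactly two successors.
bool findOrShape(const ToyBlock &CallSiteBB, OrShape &Out) {
  if (CallSiteBB.Preds.size() != 2)
    return false;
  ToyBlock *P0 = CallSiteBB.Preds[0];
  ToyBlock *P1 = CallSiteBB.Preds[1];
  assert(P0 && P1 && "predecessors must not be null");

  ToyBlock *Header = nullptr;
  ToyBlock *OrBB = nullptr;
  if (P1->singlePredecessor() == P0) {
    Header = P0;
    OrBB = P1;
  } else if (P0->singlePredecessor() == P1) {
    Header = P1;
    OrBB = P0;
  } else {
    return false;
  }

  // The explicit check agreed on in the review: reject headers that end in
  // something like a multi-way switch.
  if (Header->Succs.size() != 2)
    return false;

  Out.Header = Header;
  Out.OrBB = OrBB;
  return true;
}
```

With edges Header -> Tail, Header -> TBB, TBB -> Tail, and TBB -> End, the sketch reports Header and TBB as the OR shape. Adding a third edge out of Header makes it return false, which is the behavior the explicit successor check is meant to preserve.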

LGTM. Please run clang-format before committing.

lib/Transforms/Scalar/CallSiteSplitting.cpp
117	Looks like this line is longer than 80 columns?

This revision is now accepted and ready to land.Nov 16 2017, 9:44 AM

fhahn closed this revision.Nov 18 2017, 10:14 AM

Revision Contents

Path: lib/Transforms/Scalar/CallSiteSplitting.cpp
Size: 171 lines

Diff 122861

lib/Transforms/Scalar/CallSiteSplitting.cpp

(92 lines not shown)	static void setConstantInArgument(Instruction CallI, Instruction &NewCallI,
unsigned ArgNo = 0;		unsigned ArgNo = 0;
for (auto &I : CS.args()) {		for (auto &I : CS.args()) {
if (&*I == Op)		if (&*I == Op)
CS.setArgument(ArgNo, ConstValue);		CS.setArgument(ArgNo, ConstValue);
++ArgNo;		++ArgNo;
}		}
}		}

static bool createCallSitesOnOrPredicatedArgument(		static bool isCondRelevantToAnyCallArgument(ICmpInst *Cmp, CallSite CS) {
		assert(isa<Constant>(Cmp->getOperand(1)) && "Expected a constant operand.");
		Value *Op0 = Cmp->getOperand(0);
		unsigned ArgNo = 0;
		for (CallSite::arg_iterator I = CS.arg_begin(), E = CS.arg_end(); I != E;
		++I, ++ArgNo) {
		// Don't consider constant or arguments that are already known non-null.
		if (isa<Constant>(*I) || CS.paramHasAttr(ArgNo, Attribute::NonNull))
		continue;

		if (*I == Op0)
		return true;
		}
		return false;
		}

		static SmallVector<BranchInst *, 2> findOrCondRelevantToCallArgument(CallSite CS) {
		SmallVector<BranchInst *, 2> BranchInsts;
		for (auto PredBB : predecessors(CS.getInstruction()->getParent())) {
		auto *PBI = dyn_cast<BranchInst>(PredBB->getTerminator());
		if (!PBI || !PBI->isConditional())
		continue;

		CmpInst::Predicate Pred;
		Value *Cond = PBI->getCondition();
		if (!match(Cond, m_ICmp(Pred, m_Value(), m_Constant())))
		continue;
		ICmpInst *Cmp = cast<ICmpInst>(Cond);
		if (Pred == ICmpInst::ICMP_EQ || Pred == ICmpInst::ICMP_NE)
		if (isCondRelevantToAnyCallArgument(Cmp, CS))
		BranchInsts.push_back(PBI);
		}
		return BranchInsts;
		}

		static bool tryCreateCallSitesOnOrPredicatedArgument(
CallSite CS, Instruction *&NewCSTakenFromHeader,		CallSite CS, Instruction *&NewCSTakenFromHeader,
Instruction *&NewCSTakenFromNextCond,		Instruction &NewCSTakenFromNextCond, BasicBlock HeaderBB) {
SmallVectorImpl<BranchInst > &BranchInsts, BasicBlock HeaderBB) {		auto BranchInsts = findOrCondRelevantToCallArgument(CS);
assert(BranchInsts.size() <= 2 &&		assert(BranchInsts.size() <= 2 &&
"Unexpected number of blocks in the OR predicated condition");		"Unexpected number of blocks in the OR predicated condition");
Instruction *Instr = CS.getInstruction();		Instruction *Instr = CS.getInstruction();
BasicBlock *CallSiteBB = Instr->getParent();		BasicBlock *CallSiteBB = Instr->getParent();
TerminatorInst *HeaderTI = HeaderBB->getTerminator();		TerminatorInst *HeaderTI = HeaderBB->getTerminator();
bool IsCSInTakenPath = CallSiteBB == HeaderTI->getSuccessor(0);		bool IsCSInTakenPath = CallSiteBB == HeaderTI->getSuccessor(0);

for (unsigned I = 0, E = BranchInsts.size(); I != E; ++I) {		for (auto *PBI : BranchInsts) {
BranchInst *PBI = BranchInsts[I];
assert(isa<ICmpInst>(PBI->getCondition()) &&		assert(isa<ICmpInst>(PBI->getCondition()) &&
"Unexpected condition in a conditional branch.");		"Unexpected condition in a conditional branch.");
ICmpInst *Cmp = cast<ICmpInst>(PBI->getCondition());		ICmpInst *Cmp = cast<ICmpInst>(PBI->getCondition());
Value *Arg = Cmp->getOperand(0);		Value *Arg = Cmp->getOperand(0);
assert(isa<Constant>(Cmp->getOperand(1)) &&		assert(isa<Constant>(Cmp->getOperand(1)) &&
"Expected op1 to be a constant.");		"Expected op1 to be a constant.");
Constant *ConstVal = cast<Constant>(Cmp->getOperand(1));		Constant *ConstVal = cast<Constant>(Cmp->getOperand(1));
CmpInst::Predicate Pred = Cmp->getPredicate();		CmpInst::Predicate Pred = Cmp->getPredicate();
(62 lines not shown)	static bool canSplitCallSite(CallSite CS) {
// Allow splitting a call-site only when there is no instruction before the		// Allow splitting a call-site only when there is no instruction before the
// call-site in the basic block. Based on this constraint, we only clone the		// call-site in the basic block. Based on this constraint, we only clone the
// call instruction, and we do not move a call-site across any other		// call instruction, and we do not move a call-site across any other
// instruction.		// instruction.
BasicBlock *CallSiteBB = Instr->getParent();		BasicBlock *CallSiteBB = Instr->getParent();
if (Instr != CallSiteBB->getFirstNonPHI())		if (Instr != CallSiteBB->getFirstNonPHI())
return false;		return false;

pred_iterator PII = pred_begin(CallSiteBB);
pred_iterator PIE = pred_end(CallSiteBB);
unsigned NumPreds = std::distance(PII, PIE);

// Allow only one extra call-site. No more than two from one call-site.
if (NumPreds != 2)
return false;

// Cannot split an edge from an IndirectBrInst.		// Cannot split an edge from an IndirectBrInst.
BasicBlock Preds[2] = {PII++, *PII};		SmallVector<BasicBlock*, 2> Preds(predecessors(CallSiteBB));
if (isa<IndirectBrInst>(Preds[0]->getTerminator()) ||		if (isa<IndirectBrInst>(Preds[0]->getTerminator()) ||
isa<IndirectBrInst>(Preds[1]->getTerminator()))		isa<IndirectBrInst>(Preds[1]->getTerminator()))
return false;		return false;

return CallSiteBB->canSplitPredecessors();		return CallSiteBB->canSplitPredecessors();
}		}

/// Return true if the CS is split into its new predecessors which are directly		/// Return true if the CS is split into its new predecessors which are directly
/// hooked to each of its orignial predecessors pointed by PredBB1 and PredBB2.		/// hooked to each of its orignial predecessors pointed by PredBB1 and PredBB2.
/// Note that PredBB1 and PredBB2 are decided in findPredicatedArgument(),		/// In OR predicated case, PredBB1 will point the header, and PredBB2 will point
/// especially for the OR predicated case where PredBB1 will point the header,		/// to the second compare block. CallInst1 and CallInst2 will be the new
/// and PredBB2 will point the the second compare block. CallInst1 and CallInst2		/// call-sites placed in the new predecessors split for PredBB1 and PredBB2,
/// will be the new call-sites placed in the new predecessors split for PredBB1		/// repectively. Therefore, CallInst1 will be the call-site placed
/// and PredBB2, repectively. Therefore, CallInst1 will be the call-site placed
/// between Header and Tail, and CallInst2 will be the call-site between TBB and		/// between Header and Tail, and CallInst2 will be the call-site between TBB and
/// Tail. For example, in the IR below with an OR condition, the call-site can		/// Tail. For example, in the IR below with an OR condition, the call-site can
/// be split		/// be split
///		///
/// from :		/// from :
///		///
/// Header:		/// Header:
/// %c = icmp eq i32* %a, null		/// %c = icmp eq i32* %a, null
(74 lines not shown)	static void splitCallSite(CallSite CS, BasicBlock PredBB1, BasicBlock PredBB2,
DEBUG(dbgs() << " " << *CallInst1 << " in " << SplitBlock1->getName()		DEBUG(dbgs() << " " << *CallInst1 << " in " << SplitBlock1->getName()
<< "\n");		<< "\n");
DEBUG(dbgs() << " " << *CallInst2 << " in " << SplitBlock2->getName()		DEBUG(dbgs() << " " << *CallInst2 << " in " << SplitBlock2->getName()
<< "\n");		<< "\n");
Instr->eraseFromParent();		Instr->eraseFromParent();
NumCallSiteSplit++;		NumCallSiteSplit++;
}		}

static bool isCondRelevantToAnyCallArgument(ICmpInst *Cmp, CallSite CS) {
assert(isa<Constant>(Cmp->getOperand(1)) && "Expected a constant operand.");
Value *Op0 = Cmp->getOperand(0);
unsigned ArgNo = 0;
for (CallSite::arg_iterator I = CS.arg_begin(), E = CS.arg_end(); I != E;
++I, ++ArgNo) {
// Don't consider constant or arguments that are already known non-null.
if (isa<Constant>(*I) || CS.paramHasAttr(ArgNo, Attribute::NonNull))
continue;

if (*I == Op0)
return true;
}
return false;
}

static void findOrCondRelevantToCallArgument(
CallSite CS, BasicBlock PredBB, BasicBlock OtherPredBB,
SmallVectorImpl<BranchInst > &BranchInsts, BasicBlock &HeaderBB) {
auto *PBI = dyn_cast<BranchInst>(PredBB->getTerminator());
if (!PBI || !PBI->isConditional())
return;

if (PBI->getSuccessor(0) == OtherPredBB ||
PBI->getSuccessor(1) == OtherPredBB)
if (PredBB == OtherPredBB->getSinglePredecessor()) {
assert(!HeaderBB && "Expect to find only a single header block");
HeaderBB = PredBB;
}

CmpInst::Predicate Pred;
Value *Cond = PBI->getCondition();
if (!match(Cond, m_ICmp(Pred, m_Value(), m_Constant())))
return;
ICmpInst *Cmp = cast<ICmpInst>(Cond);
if (Pred == ICmpInst::ICMP_EQ || Pred == ICmpInst::ICMP_NE)
if (isCondRelevantToAnyCallArgument(Cmp, CS))
BranchInsts.push_back(PBI);
}

// Return true if the call-site has an argument which is a PHI with only		// Return true if the call-site has an argument which is a PHI with only
// constant incoming values.		// constant incoming values.
static bool isPredicatedOnPHI(CallSite CS) {		static bool isPredicatedOnPHI(CallSite CS) {
Instruction *Instr = CS.getInstruction();		Instruction *Instr = CS.getInstruction();
BasicBlock *Parent = Instr->getParent();		BasicBlock *Parent = Instr->getParent();
if (Instr != Parent->getFirstNonPHI())		if (Instr != Parent->getFirstNonPHI())
return false;		return false;

(12 lines not shown)	if (PHINode *PN = dyn_cast<PHINode>(&BI)) {
return true;		return true;
}		}
}		}
break;		break;
}		}
return false;		return false;
}		}

// Return true if an agument in CS is predicated on an 'or' condition.		static bool tryToSplitCallSite(CallSite CS) {
// Create new call-site with arguments constrained based on the OR condition.		if (!CS.arg_size())
static bool findPredicatedOnOrCondition(CallSite CS, BasicBlock *PredBB1,
BasicBlock *PredBB2,
Instruction *&NewCallTakenFromHeader,
Instruction *&NewCallTakenFromNextCond,
BasicBlock *&HeaderBB) {
SmallVector<BranchInst *, 4> BranchInsts;
findOrCondRelevantToCallArgument(CS, PredBB1, PredBB2, BranchInsts, HeaderBB);
findOrCondRelevantToCallArgument(CS, PredBB2, PredBB1, BranchInsts, HeaderBB);
if (BranchInsts.empty() || !HeaderBB)
return false;

// If an OR condition is detected, try to create call sites with constrained
// arguments (e.g., NonNull attribute or constant value).
return createCallSitesOnOrPredicatedArgument(CS, NewCallTakenFromHeader,
NewCallTakenFromNextCond,
BranchInsts, HeaderBB);
}

static bool findPredicatedArgument(CallSite CS, Instruction *&CallInst1,
Instruction *&CallInst2,
BasicBlock &PredBB1, BasicBlock &PredBB2) {
BasicBlock *CallSiteBB = CS.getInstruction()->getParent();
pred_iterator PII = pred_begin(CallSiteBB);
pred_iterator PIE = pred_end(CallSiteBB);
assert(std::distance(PII, PIE) == 2 && "Expect only two predecessors.");
(void)PIE;
BasicBlock Preds[2] = {PII++, *PII};
BasicBlock *&HeaderBB = PredBB1;
if (!findPredicatedOnOrCondition(CS, Preds[0], Preds[1], CallInst1, CallInst2,
HeaderBB) &&
!isPredicatedOnPHI(CS))
return false;		return false;

if (!PredBB1)		BasicBlock *HeaderBB = nullptr;
PredBB1 = Preds[0];		BasicBlock *OrBB = nullptr;
		BasicBlock *CallSiteBB = CS.getInstruction()->getParent();
PredBB2 = PredBB1 == Preds[0] ? Preds[1] : Preds[0];		SmallVector<BasicBlock*, 2> Preds(predecessors(CallSiteBB));
return true;		if (Preds.size() != 2)
}		return false;

static bool tryToSplitCallSite(CallSite CS) {		// Check if one of the predecessors is a single predecessors of the other.
if (!CS.arg_size())		// This is a requirement for control flow modeling an OR. HeaderBB points to
		// the single predecessor and OrBB points to other node. HeaderBB potentially
		// contains the first compare of the OR and OrBB the second.
		if (Preds[1]->getSinglePredecessor() == Preds[0]) {
		HeaderBB = Preds[0];
		OrBB = Preds[1];
		} else if (Preds[0]->getSinglePredecessor() == Preds[1]) {
		HeaderBB = Preds[1];
		OrBB = Preds[0];
		} else
return false;		return false;

BasicBlock *PredBB1 = nullptr;
BasicBlock *PredBB2 = nullptr;
Instruction *CallInst1 = nullptr;		Instruction *CallInst1 = nullptr;
Instruction *CallInst2 = nullptr;		Instruction *CallInst2 = nullptr;
if (!canSplitCallSite(CS) ||		if (!canSplitCallSite(CS))
!findPredicatedArgument(CS, CallInst1, CallInst2, PredBB1, PredBB2)) {		return false;

		if (!tryCreateCallSitesOnOrPredicatedArgument(CS, CallInst1, CallInst2,
		HeaderBB) &&
		!isPredicatedOnPHI(CS)) {
assert(!CallInst1 && !CallInst2 && "Unexpected new call-sites cloned.");		assert(!CallInst1 && !CallInst2 && "Unexpected new call-sites cloned.");
return false;		return false;
}		}
splitCallSite(CS, PredBB1, PredBB2, CallInst1, CallInst2);		splitCallSite(CS, HeaderBB, OrBB, CallInst1, CallInst2);
return true;		return true;
}		}

static bool doCallSiteSplitting(Function &F, TargetLibraryInfo &TLI) {		static bool doCallSiteSplitting(Function &F, TargetLibraryInfo &TLI) {
bool Changed = false;		bool Changed = false;
for (Function::iterator BI = F.begin(), BE = F.end(); BI != BE;) {		for (Function::iterator BI = F.begin(), BE = F.end(); BI != BE;) {
BasicBlock &BB = *BI++;		BasicBlock &BB = *BI++;
for (BasicBlock::iterator II = BB.begin(), IE = BB.end(); II != IE;) {		for (BasicBlock::iterator II = BB.begin(), IE = BB.end(); II != IE;) {
(55 lines not shown)