This is an archive of the discontinued LLVM Phabricator instance.


Differential D10631

[Inliner][NFCI] Add an InlineSite abstraction.
Abandoned, Public

Authored by sanjoy on Jun 22 2015, 7:14 PM.

Details

Reviewers

chandlerc
reames
swaroop.sridhar
pgavlin
nlewycky

Summary

The InlineSite abstraction is currently a thin layer around CallSite, but later changes will have it multiplex between a CallSite and a Statepoint.

Diff Detail

Event Timeline

sanjoy updated this revision to Diff 28191. Jun 22 2015, 7:14 PM

sanjoy retitled this revision from to [Inliner][NFCI] Add an InlineSite abstraction..

sanjoy updated this object.

sanjoy edited the test plan for this revision.

sanjoy added reviewers: reames, chandlerc, nlewycky.

sanjoy added parent revisions: D10630: [Statepoints][NFC] Rename variables to llvm style., D10629: [Statepoints][NFC] Add Statepoint::operator bool(), D10627: [Statepoints][NFC] Add Statepoint::getGCResult..

sanjoy mentioned this in D10632: [Inlining][NFC] Introduce a InlineFunction that takes a Statepoint..

sanjoy added a subscriber: Unknown Object (MLST).

Please ignore this change. I thought I had run the test suite after this, but I had not, and a bunch of tests fail with this applied. I'll put up an updated and fixed change tomorrow.

update with fixed patch

sanjoy added reviewers: pgavlin, swaroop.sridhar. Jun 23 2015, 2:19 PM

Comments inline.

lib/Transforms/Utils/InlineFunction.cpp
66	I'm not convinced of the need for a new abstraction here. What does this really add over having a function which takes a call site and extracts the underlying callee from a statepoint and the surface callee from a normal call? A getUnwrappedCallee(CallSite CS) function seems to serve the same purpose with much less complexity. Even if I accept the need for an extra abstraction, I strongly object to the duplication of the underlying fields. Add proxying access if needed, but please do not duplicate each field from the underlying call site.
119	Placement wise, this should be below the existing InlineFunction code.
131	I'm confused by this bit. It seems like possibly an unrelated refactoring? Or is this the statepoint related bits for gc.result?
1367	Unintentional diff?

This revision now requires changes to proceed. Jun 24 2015, 11:51 AM

sanjoy added inline comments. Jun 24 2015, 12:08 PM

lib/Transforms/Utils/InlineFunction.cpp
66	> What does this really add over having a function which takes a call site and extracts the underlying callee from a statepoint and the surface callee from a normal call?

It is not just the call destination -- I will have to update all of the places where the Inliner accesses, say, paramHasAttr and change it to do something different if the call site is a statepoint. I suppose I could have a `getUnwrappedXXX` for each of those properties, but that's semantically what this class is. Plus, by having a separate type I can verify that I've not accidentally used `CS.getFoo()` where it would be incorrect to do so.

> Even if I accept the need for an extra abstraction, I strongly object to the duplication of the underlying fields. Add proxying access if needed, but please do not duplicate each field from the underlying call site.

Ok.
131	I moved this part so that I could RAUW `gc_result` easily in the following patch. I'll split this bit out into its own refactoring.
1367	See previous comment.
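
For readers following the exchange above, here is a rough, self-contained sketch of the "separate wrapper type with proxying access" idea. It does not use the real LLVM classes: `ToyCall`, `ToyInlineSite`, `describe`, and the statepoint argument layout (wrapped callee in argument 0) are all hypothetical stand-ins chosen only to keep the example small.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a call instruction; NOT the LLVM CallSite class.
struct ToyCall {
  std::string Callee;            // surface callee
  std::vector<std::string> Args; // surface arguments
};

// Assumption for this sketch only: a statepoint has the surface callee
// "gc.statepoint", its first argument names the wrapped callee, and the
// remaining arguments are the wrapped call's arguments.
class ToyInlineSite {
  const ToyCall &Call;

  bool isStatepoint() const { return Call.Callee == "gc.statepoint"; }
  std::size_t argOffset() const { return isStatepoint() ? 1 : 0; }

public:
  explicit ToyInlineSite(const ToyCall &C) : Call(C) {
    assert((!isStatepoint() || !Call.Args.empty()) &&
           "statepoint without a wrapped callee");
  }

  // Accessors proxy to the underlying call instead of copying its fields.
  const std::string &getCalledFunction() const {
    return isStatepoint() ? Call.Args[0] : Call.Callee;
  }
  std::size_t arg_size() const { return Call.Args.size() - argOffset(); }
  const std::string &getArgument(std::size_t Idx) const {
    assert(Idx < arg_size() && "arg index out of bounds!");
    return Call.Args[Idx + argOffset()];
  }
};

// Code written against ToyInlineSite only sees the unwrapped view, so it
// cannot accidentally read the surface callee or surface arguments.
std::string describe(const ToyInlineSite &IS) {
  std::string Out = IS.getCalledFunction() + "(";
  for (std::size_t I = 0; I < IS.arg_size(); ++I)
    Out += (I ? ", " : "") + IS.getArgument(I);
  return Out + ")";
}
```

In this toy model, `describe` gives the same answer for a plain call to `foo` and for a statepoint wrapping `foo`, which is the property the inliner helpers would rely on.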

sanjoy added parent revisions: D10756: [NFC] Make the Statepoint class more like CallSite, D10755: [Statepoints][NFC] Constify accessors on Statepoint.. Jun 25 2015, 5:52 PM

Make the InlineSite class a lot more lightweight. It is now a thin
wrapper around CallSite. A later change will have it multiplex over
Statepoint and CallSite.

sanjoy mentioned this in D10758: [Inliner] Teach LLVM to inline through statepoints.. Jun 25 2015, 8:00 PM

sanjoy updated this object. Jun 26 2015, 12:41 AM

sanjoy edited edge metadata.

Started looking and really thinking about this.

The more I think about it, the more I think that *if* we need to support inlining through statepoints, then CallSite should be able to wrap a statepoint just like it does an invoke. I don't think we want two abstractions here.

But the more I think about that, the more I think that the current statepoint IR form makes that really ... icky. But I don't know if there is realistically a better one because I've not sufficiently internalized the constraints the statepoints were designed under.

So prior to going further on this patch, I'm going to do two things to educate myself:

Chat with Sanjoy (probably in IRC) so he can teach me more about statepoints and maybe point me at the right design discussions to just go read.

Follow up on the statepoint inlining design discussion thread to ask for some more high-level details that shouldn't end up buried on this review.

Hi Chandler,

What is your current stance on this? Should I start making gc_statepoint calls / invokes managed by CallSite?

The only potential downside I see to making CallSites manage gc_statepoint is that it will make CallSite more complex than a simple dispatch over CallInst vs. InvokeInst. CallSite will now have to dispatch over the cross product of {CallInst, InvokeInst} and { isStatepoint, !isStatepoint }. There will be a slight performance penalty[1] and some bits of LLVM that depend on CallSite being uncomplicated may require updating. I don't mind doing all of this if we can all agree that CallSite managed statepoints are the right way to do this.
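
As an illustration of the cross-product dispatch mentioned above, the following is a small standalone sketch of storing two flag bits in the low bits of an aligned pointer, in the spirit of a two-bit `PointerIntPair`. It is not the LLVM class; `ToyInst`, `TaggedInstPtr`, and the bit assignments are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <cstdint>

// Toy instruction type; alignas(4) guarantees the two low pointer bits are 0.
struct alignas(4) ToyInst {
  int Opcode;
};

// Hypothetical bit assignments for this sketch.
constexpr std::uintptr_t CallBit = 1;       // call (set) vs. invoke (clear)
constexpr std::uintptr_t StatepointBit = 2; // statepoint (set) or not
constexpr std::uintptr_t TagMask = CallBit | StatepointBit;

// A pointer plus two flags packed into one word.
class TaggedInstPtr {
  std::uintptr_t Bits;

public:
  TaggedInstPtr(ToyInst *I, bool IsCall, bool IsStatepoint)
      : Bits(reinterpret_cast<std::uintptr_t>(I) |
             (IsCall ? CallBit : 0) | (IsStatepoint ? StatepointBit : 0)) {
    assert((reinterpret_cast<std::uintptr_t>(I) & TagMask) == 0 &&
           "pointer is not sufficiently aligned");
  }

  // Mask off the tag bits to recover the original pointer.
  ToyInst *getInstruction() const {
    return reinterpret_cast<ToyInst *>(Bits & ~TagMask);
  }
  bool isCall() const { return (Bits & CallBit) != 0; }
  bool isStatepoint() const { return (Bits & StatepointBit) != 0; }
};
```

All four combinations of {call, invoke} and {statepoint, not statepoint} fit in the same word, so the wrapper stays pointer-sized; the remaining cost is the extra branch on the second bit wherever behavior differs.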

<hand-waving>

Here is how I think about what's going on, in case that is helpful:

The mental model I have for inlining through statepoints is that the inliner "sees through" some control and data flow in the gc_statepoint intrinsic and inlines two levels at once. It can then inline through / simplify the attached gc_relocate and gc_result calls as well.

For instance, an imaginary implementation of gc_statepoint that will allow the current (i.e. implemented in the current patch set) form of inlining looks like:

define i32 @gc_statepoint(func_ptr %f, ...args) {
  (arg0, arg1, ...) = unpack(...args) ;; can't be represented in LLVM IR
  %result = %f(arg0, arg1, ...)
  ret pack(%result)  ;; can't be easily represented in LLVM IR
}

with the axiom

gc_result(pack(%result)) == %result

The inliner then, semantically, first inlines through the gc_statepoint call, then recognizes that the call to %f is now direct, and then inlines through that call as well. It then simplifies the associated gc_(result|relocate) calls[2].
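
The mental model above can be mimicked in ordinary C++ as a rough analogy. This is only a sketch: `Packed` is a hypothetical wrapper type, `add` is a made-up callee, and none of the deoptimization or GC state that a real statepoint carries is modeled.

```cpp
#include <cassert>
#include <utility>

// Hypothetical opaque token standing in for the value a statepoint returns.
template <typename T> struct Packed {
  T Value;
};

template <typename T> Packed<T> pack(T V) { return Packed<T>{std::move(V)}; }

// The axiom from the comment: gc_result(pack(r)) == r.
template <typename T> T gc_result(const Packed<T> &P) { return P.Value; }

// Toy gc_statepoint: forwards the arguments to Fn and packs the result.
template <typename F, typename... Args>
auto gc_statepoint(F Fn, Args... As) -> Packed<decltype(Fn(As...))> {
  return pack(Fn(As...));
}

// Made-up callee for the example.
int add(int A, int B) { return A + B; }
```

Under this analogy, `gc_result(gc_statepoint(add, 1, 2))` and `add(1, 2)` compute the same value, which is what lets an inliner look through both levels at once and then fold away the `gc_result` call.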

There are two places where this view of things breaks down:

1. the "internals" of @gc_statepoint are not representable in LLVM IR, especially once we take into account deoptimization and gc state.
2. even if (1) was somehow addressed, it won't be okay to inline only through @gc_statepoint and not subsequently inline through %f.

But semantically, this is one way to think about what's going on.

</hand-waving>

[1]: Most of this may be mitigated by s/PointerIntPair<InstrTy*, 1, bool>/PointerIntPair<InstrTy*, 2> in CallSite
[2]: Currently we only simplify gc_result calls; simplification logic for gc_relocate will come later.

These have been superseded by the operand bundles work.

Revision Contents

Path: lib/Transforms/Utils/InlineFunction.cpp
Size: 215 lines

Diff 28283

lib/Transforms/Utils/InlineFunction.cpp

[... 57 lines not shown ...]	bool llvm::InlineFunction(CallInst *CI, InlineFunctionInfo &IFI,
return InlineFunction(CallSite(CI), IFI, InsertLifetime);		return InlineFunction(CallSite(CI), IFI, InsertLifetime);
}		}
bool llvm::InlineFunction(InvokeInst *II, InlineFunctionInfo &IFI,		bool llvm::InlineFunction(InvokeInst *II, InlineFunctionInfo &IFI,
bool InsertLifetime) {		bool InsertLifetime) {
return InlineFunction(CallSite(II), IFI, InsertLifetime);		return InlineFunction(CallSite(II), IFI, InsertLifetime);
}		}

namespace {		namespace {
		class InlineSite {
		Function *CalledFunction;
		Instruction *CallSiteInst;
		CallSite::arg_iterator ArgBegin;
		CallSite::arg_iterator ArgEnd;
		unsigned ArgSize;
		bool DoesNotThrow;

		#ifdef NDEBUG
		void verify() const {}
		#else
		void verify() const;
		#endif

		bool paramHasAttr(unsigned i, Attribute::AttrKind A) const {
		assert(i < (ArgSize + 1) && "out of bounds!");
		if (auto *II = dyn_cast<InvokeInst>(CallSiteInst))
		return II->paramHasAttr(i, A);
		else
		return cast<CallInst>(CallSiteInst)->paramHasAttr(i, A);
		}

		public:
		explicit InlineSite(CallSite CS)
		: CalledFunction(CS.getCalledFunction()),
		CallSiteInst(CS.getInstruction()), ArgBegin(CS.arg_begin()),
		ArgEnd(CS.arg_end()), ArgSize(CS.arg_size()),
		DoesNotThrow(CS.doesNotThrow()) {
		verify();
		}

		Function *getCalledFunction() const { return CalledFunction; }

		Function *getCaller() const;

		Instruction *getInstruction() const { return CallSiteInst; }

		Value *getArgument(unsigned aidx) const {
		assert(aidx < ArgSize && "arg index out of bounds!");
		return *(ArgBegin + aidx);
		}

		bool isByValArgument(unsigned ArgNo) const {
		return paramHasAttr(ArgNo + 1, Attribute::ByVal);
		}

		bool doesNotThrow() const { return DoesNotThrow; }
		unsigned arg_size() const { return ArgSize; }
		CallSite::arg_iterator arg_begin() const { return ArgBegin; }
		CallSite::arg_iterator arg_end() const { return ArgEnd; }
		};
		}

		static bool InlineFunctionImpl(InlineSite IS, InlineFunctionInfo &IFI,
		bool InsertLifetime, Value *&ReturnVal,
		bool &InlinedMustTailCalls);

		bool llvm::InlineFunction(CallSite CS, InlineFunctionInfo &IFI,
		bool InsertLifetime) {
		Value *ReturnValOut = nullptr;
		bool InlinedMustTailCalls = false;
		if (!InlineFunctionImpl(InlineSite(CS), IFI, InsertLifetime, ReturnValOut,
		InlinedMustTailCalls))
		return false;

		Instruction *TheCall = CS.getInstruction();
		if (!TheCall->use_empty()) {
		if (ReturnValOut == TheCall) // Can happen in unreachable code
		ReturnValOut = UndefValue::get(TheCall->getType());
		TheCall->replaceAllUsesWith(ReturnValOut);
		}

		BasicBlock *ContainingBlock = TheCall->getParent();

		// Entire block may have become unreachable.
		if (InlinedMustTailCalls && pred_empty(ContainingBlock))
		ContainingBlock->eraseFromParent();
		else
		TheCall->eraseFromParent();

		return true;
		}

		namespace {
/// A class for recording information about inlining through an invoke.		/// A class for recording information about inlining through an invoke.
class InvokeInliningInfo {		class InvokeInliningInfo {
BasicBlock *OuterResumeDest; ///< Destination of the invoke's unwind.		BasicBlock *OuterResumeDest; ///< Destination of the invoke's unwind.
BasicBlock *InnerResumeDest; ///< Destination for the callee's resume.		BasicBlock *InnerResumeDest; ///< Destination for the callee's resume.
LandingPadInst *CallerLPad; ///< LandingPadInst associated with the invoke.		LandingPadInst *CallerLPad; ///< LandingPadInst associated with the invoke.
PHINode *InnerEHValuesPHI; ///< PHI for EH values from landingpad insts.		PHINode *InnerEHValuesPHI; ///< PHI for EH values from landingpad insts.
SmallVector<Value*, 8> UnwindDestPHIValues;		SmallVector<Value*, 8> UnwindDestPHIValues;

[... 79 lines not shown ...]	InnerEHValuesPHI = PHINode::Create(CallerLPad->getType(), PHICapacity,
"eh.lpad-body", InsertPoint);		"eh.lpad-body", InsertPoint);
CallerLPad->replaceAllUsesWith(InnerEHValuesPHI);		CallerLPad->replaceAllUsesWith(InnerEHValuesPHI);
InnerEHValuesPHI->addIncoming(CallerLPad, OuterResumeDest);		InnerEHValuesPHI->addIncoming(CallerLPad, OuterResumeDest);

// All done.		// All done.
return InnerResumeDest;		return InnerResumeDest;
}		}

		#ifndef NDEBUG
		void InlineSite::verify() const {
		FunctionType *FTy = CalledFunction->getFunctionType();
		unsigned aidx = 0;
		for (auto AI = ArgBegin; AI != ArgEnd; ++AI, aidx++)
		assert((aidx >= FTy->getNumParams() ||
		FTy->getParamType(aidx) == (*AI)->getType()) &&
		"Calling a function with a bad signature!");

		assert(aidx == ArgSize && "invalid ArgSize!");
		}
		#endif

		Function *InlineSite::getCaller() const {
		return CallSiteInst->getParent()->getParent();
		}

/// Forward the 'resume' instruction to the caller's landing pad block.		/// Forward the 'resume' instruction to the caller's landing pad block.
/// When the landing pad block has only one predecessor, this is a simple		/// When the landing pad block has only one predecessor, this is a simple
/// branch. When there is more than one predecessor, we need to split the		/// branch. When there is more than one predecessor, we need to split the
/// landing pad block after the landingpad instruction and jump to there.		/// landing pad block after the landingpad instruction and jump to there.
void InvokeInliningInfo::forwardResume(ResumeInst *RI,		void InvokeInliningInfo::forwardResume(ResumeInst *RI,
SmallPtrSetImpl<LandingPadInst*> &InlinedLPads) {		SmallPtrSetImpl<LandingPadInst*> &InlinedLPads) {
BasicBlock *Dest = getInnerResumeDest();		BasicBlock *Dest = getInnerResumeDest();
BasicBlock *Src = RI->getParent();		BasicBlock *Src = RI->getParent();
[... 111 lines not shown ...]

/// When inlining a function that contains noalias scope metadata,		/// When inlining a function that contains noalias scope metadata,
/// this metadata needs to be cloned so that the inlined blocks		/// this metadata needs to be cloned so that the inlined blocks
/// have different "unqiue scopes" at every call site. Were this not done, then		/// have different "unqiue scopes" at every call site. Were this not done, then
/// aliasing scopes from a function inlined into a caller multiple times could		/// aliasing scopes from a function inlined into a caller multiple times could
/// not be differentiated (and this would lead to miscompiles because the		/// not be differentiated (and this would lead to miscompiles because the
/// non-aliasing property communicated by the metadata could have		/// non-aliasing property communicated by the metadata could have
/// call-site-specific control dependencies).		/// call-site-specific control dependencies).
static void CloneAliasScopeMetadata(CallSite CS, ValueToValueMapTy &VMap) {		static void CloneAliasScopeMetadata(InlineSite IS, ValueToValueMapTy &VMap) {
const Function *CalledFunc = CS.getCalledFunction();		const Function *CalledFunc = IS.getCalledFunction();
SetVector<const MDNode *> MD;		SetVector<const MDNode *> MD;

// Note: We could only clone the metadata if it is already used in the		// Note: We could only clone the metadata if it is already used in the
// caller. I'm omitting that check here because it might confuse		// caller. I'm omitting that check here because it might confuse
// inter-procedural alias analysis passes. We can revisit this if it becomes		// inter-procedural alias analysis passes. We can revisit this if it becomes
// an efficiency or overhead problem.		// an efficiency or overhead problem.

for (Function::const_iterator I = CalledFunc->begin(), IE = CalledFunc->end();		for (Function::const_iterator I = CalledFunc->begin(), IE = CalledFunc->end();
[... 62 lines not shown ...]	if (!NI)
continue;		continue;

if (MDNode *M = NI->getMetadata(LLVMContext::MD_alias_scope)) {		if (MDNode *M = NI->getMetadata(LLVMContext::MD_alias_scope)) {
MDNode *NewMD = MDMap[M];		MDNode *NewMD = MDMap[M];
// If the call site also had alias scope metadata (a list of scopes to		// If the call site also had alias scope metadata (a list of scopes to
// which instructions inside it might belong), propagate those scopes to		// which instructions inside it might belong), propagate those scopes to
// the inlined instructions.		// the inlined instructions.
if (MDNode *CSM =		if (MDNode *CSM =
CS.getInstruction()->getMetadata(LLVMContext::MD_alias_scope))		IS.getInstruction()->getMetadata(LLVMContext::MD_alias_scope))
NewMD = MDNode::concatenate(NewMD, CSM);		NewMD = MDNode::concatenate(NewMD, CSM);
NI->setMetadata(LLVMContext::MD_alias_scope, NewMD);		NI->setMetadata(LLVMContext::MD_alias_scope, NewMD);
} else if (NI->mayReadOrWriteMemory()) {		} else if (NI->mayReadOrWriteMemory()) {
if (MDNode *M =		if (MDNode *M =
CS.getInstruction()->getMetadata(LLVMContext::MD_alias_scope))		IS.getInstruction()->getMetadata(LLVMContext::MD_alias_scope))
NI->setMetadata(LLVMContext::MD_alias_scope, M);		NI->setMetadata(LLVMContext::MD_alias_scope, M);
}		}

if (MDNode *M = NI->getMetadata(LLVMContext::MD_noalias)) {		if (MDNode *M = NI->getMetadata(LLVMContext::MD_noalias)) {
MDNode *NewMD = MDMap[M];		MDNode *NewMD = MDMap[M];
// If the call site also had noalias metadata (a list of scopes with		// If the call site also had noalias metadata (a list of scopes with
// which instructions inside it don't alias), propagate those scopes to		// which instructions inside it don't alias), propagate those scopes to
// the inlined instructions.		// the inlined instructions.
if (MDNode *CSM =		if (MDNode *CSM =
CS.getInstruction()->getMetadata(LLVMContext::MD_noalias))		IS.getInstruction()->getMetadata(LLVMContext::MD_noalias))
NewMD = MDNode::concatenate(NewMD, CSM);		NewMD = MDNode::concatenate(NewMD, CSM);
NI->setMetadata(LLVMContext::MD_noalias, NewMD);		NI->setMetadata(LLVMContext::MD_noalias, NewMD);
} else if (NI->mayReadOrWriteMemory()) {		} else if (NI->mayReadOrWriteMemory()) {
if (MDNode *M = CS.getInstruction()->getMetadata(LLVMContext::MD_noalias))		if (MDNode *M = IS.getInstruction()->getMetadata(LLVMContext::MD_noalias))
NI->setMetadata(LLVMContext::MD_noalias, M);		NI->setMetadata(LLVMContext::MD_noalias, M);
}		}
}		}
}		}

/// If the inlined function has noalias arguments,		/// If the inlined function has noalias arguments,
/// then add new alias scopes for each noalias argument, tag the mapped noalias		/// then add new alias scopes for each noalias argument, tag the mapped noalias
/// parameters with noalias metadata specifying the new scope, and tag all		/// parameters with noalias metadata specifying the new scope, and tag all
/// non-derived loads, stores and memory intrinsics with the new alias scopes.		/// non-derived loads, stores and memory intrinsics with the new alias scopes.
static void AddAliasScopeMetadata(CallSite CS, ValueToValueMapTy &VMap,		static void AddAliasScopeMetadata(InlineSite IS, ValueToValueMapTy &VMap,
const DataLayout &DL, AliasAnalysis *AA) {		const DataLayout &DL, AliasAnalysis *AA) {
if (!EnableNoAliasConversion)		if (!EnableNoAliasConversion)
return;		return;

const Function *CalledFunc = CS.getCalledFunction();		const Function *CalledFunc = IS.getCalledFunction();
SmallVector<const Argument *, 4> NoAliasArgs;		SmallVector<const Argument *, 4> NoAliasArgs;

for (Function::const_arg_iterator I = CalledFunc->arg_begin(),		for (Function::const_arg_iterator I = CalledFunc->arg_begin(),
E = CalledFunc->arg_end(); I != E; ++I) {		E = CalledFunc->arg_end(); I != E; ++I) {
if (I->hasNoAliasAttr() && !I->hasNUses(0))		if (I->hasNoAliasAttr() && !I->hasNUses(0))
NoAliasArgs.push_back(I);		NoAliasArgs.push_back(I);
}		}

[... 204 lines not shown ...]	if (const Instruction *I = dyn_cast<Instruction>(VMI->first)) {
MDNode::concatenate(NI->getMetadata(LLVMContext::MD_alias_scope),		MDNode::concatenate(NI->getMetadata(LLVMContext::MD_alias_scope),
MDNode::get(CalledFunc->getContext(), Scopes)));		MDNode::get(CalledFunc->getContext(), Scopes)));
}		}
}		}
}		}

/// If the inlined function has non-byval align arguments, then		/// If the inlined function has non-byval align arguments, then
/// add @llvm.assume-based alignment assumptions to preserve this information.		/// add @llvm.assume-based alignment assumptions to preserve this information.
static void AddAlignmentAssumptions(CallSite CS, InlineFunctionInfo &IFI) {		static void AddAlignmentAssumptions(InlineSite IS, InlineFunctionInfo &IFI) {
if (!PreserveAlignmentAssumptions)		if (!PreserveAlignmentAssumptions)
return;		return;
auto &DL = CS.getCaller()->getParent()->getDataLayout();		auto &DL = IS.getCaller()->getParent()->getDataLayout();

// To avoid inserting redundant assumptions, we should check for assumptions		// To avoid inserting redundant assumptions, we should check for assumptions
// already in the caller. To do this, we might need a DT of the caller.		// already in the caller. To do this, we might need a DT of the caller.
DominatorTree DT;		DominatorTree DT;
bool DTCalculated = false;		bool DTCalculated = false;

Function *CalledFunc = CS.getCalledFunction();		Function *CalledFunc = IS.getCalledFunction();
for (Function::arg_iterator I = CalledFunc->arg_begin(),		for (Function::arg_iterator I = CalledFunc->arg_begin(),
E = CalledFunc->arg_end();		E = CalledFunc->arg_end();
I != E; ++I) {		I != E; ++I) {
unsigned Align = I->getType()->isPointerTy() ? I->getParamAlignment() : 0;		unsigned Align = I->getType()->isPointerTy() ? I->getParamAlignment() : 0;
if (Align && !I->hasByValOrInAllocaAttr() && !I->hasNUses(0)) {		if (Align && !I->hasByValOrInAllocaAttr() && !I->hasNUses(0)) {
if (!DTCalculated) {		if (!DTCalculated) {
DT.recalculate(const_cast<Function&>(*CS.getInstruction()->getParent()		DT.recalculate(const_cast<Function &>(
->getParent()));		*IS.getInstruction()->getParent()->getParent()));
DTCalculated = true;		DTCalculated = true;
}		}

// If we can already prove the asserted alignment in the context of the		// If we can already prove the asserted alignment in the context of the
// caller, then don't bother inserting the assumption.		// caller, then don't bother inserting the assumption.
Value *Arg = CS.getArgument(I->getArgNo());		Value *Arg = IS.getArgument(I->getArgNo());
if (getKnownAlignment(Arg, DL, CS.getInstruction(),		if (getKnownAlignment(Arg, DL, IS.getInstruction(),
&IFI.ACT->getAssumptionCache(*CalledFunc),		&IFI.ACT->getAssumptionCache(*CalledFunc),
&DT) >= Align)		&DT) >= Align)
continue;		continue;

IRBuilder<>(CS.getInstruction())		IRBuilder<>(IS.getInstruction())
.CreateAlignmentAssumption(DL, Arg, Align);		.CreateAlignmentAssumption(DL, Arg, Align);
}		}
}		}
}		}

/// Once we have cloned code over from a callee into the caller,		/// Once we have cloned code over from a callee into the caller,
/// update the specified callgraph to reflect the changes we made.		/// update the specified callgraph to reflect the changes we made.
/// Note that it's possible that not all code was copied over, so only		/// Note that it's possible that not all code was copied over, so only
/// some edges of the callgraph may remain.		/// some edges of the callgraph may remain.
static void UpdateCallGraphAfterInlining(CallSite CS,		static void UpdateCallGraphAfterInlining(InlineSite IS,
Function::iterator FirstNewBlock,		Function::iterator FirstNewBlock,
ValueToValueMapTy &VMap,		ValueToValueMapTy &VMap,
InlineFunctionInfo &IFI) {		InlineFunctionInfo &IFI) {
CallGraph &CG = *IFI.CG;		CallGraph &CG = *IFI.CG;
const Function *Caller = CS.getInstruction()->getParent()->getParent();		const Function *Caller = IS.getInstruction()->getParent()->getParent();
const Function *Callee = CS.getCalledFunction();		const Function *Callee = IS.getCalledFunction();
CallGraphNode *CalleeNode = CG[Callee];		CallGraphNode *CalleeNode = CG[Callee];
CallGraphNode *CallerNode = CG[Caller];		CallGraphNode *CallerNode = CG[Caller];

// Since we inlined some uninlined call sites in the callee into the caller,		// Since we inlined some uninlined call sites in the callee into the caller,
// add edges from the caller to all of the callees of the callee.		// add edges from the caller to all of the callees of the callee.
CallGraphNode::iterator I = CalleeNode->begin(), E = CalleeNode->end();		CallGraphNode::iterator I = CalleeNode->begin(), E = CalleeNode->end();

// Consider the case where CalleeNode == CallerNode.		// Consider the case where CalleeNode == CallerNode.
[... 41 lines not shown ...]	if (!I->second->getFunction())
continue;		continue;
}		}

CallerNode->addCalledFunction(CallSite(NewCall), I->second);		CallerNode->addCalledFunction(CallSite(NewCall), I->second);
}		}

// Update the call graph by deleting the edge from Callee to Caller. We must		// Update the call graph by deleting the edge from Callee to Caller. We must
// do this after the loop above in case Caller and Callee are the same.		// do this after the loop above in case Caller and Callee are the same.
CallerNode->removeCallEdgeFor(CS);		CallerNode->removeCallEdgeFor(CallSite(IS.getInstruction()));
}		}

static void HandleByValArgumentInit(Value *Dst, Value *Src, Module *M,		static void HandleByValArgumentInit(Value *Dst, Value *Src, Module *M,
BasicBlock *InsertBlock,		BasicBlock *InsertBlock,
InlineFunctionInfo &IFI) {		InlineFunctionInfo &IFI) {
Type *AggTy = cast<PointerType>(Src->getType())->getElementType();		Type *AggTy = cast<PointerType>(Src->getType())->getElementType();
IRBuilder<> Builder(InsertBlock->begin());		IRBuilder<> Builder(InsertBlock->begin());

[... 175 lines not shown ...]
/// This function inlines the called function into the basic block of the		/// This function inlines the called function into the basic block of the
/// caller. This returns false if it is not possible to inline this call.		/// caller. This returns false if it is not possible to inline this call.
/// The program is still in a well defined state if this occurs though.		/// The program is still in a well defined state if this occurs though.
///		///
/// Note that this only does one level of inlining. For example, if the		/// Note that this only does one level of inlining. For example, if the
/// instruction 'call B' is inlined, and 'B' calls 'C', then the call to 'C' now		/// instruction 'call B' is inlined, and 'B' calls 'C', then the call to 'C' now
/// exists in the instruction stream. Similarly this will inline a recursive		/// exists in the instruction stream. Similarly this will inline a recursive
/// function by one level.		/// function by one level.
bool llvm::InlineFunction(CallSite CS, InlineFunctionInfo &IFI,		bool InlineFunctionImpl(InlineSite IS, InlineFunctionInfo &IFI,
bool InsertLifetime) {		bool InsertLifetime, Value *&ReturnVal,
Instruction *TheCall = CS.getInstruction();		bool &InlinedMustTailCalls) {
		Instruction *TheCall = IS.getInstruction();
assert(TheCall->getParent() && TheCall->getParent()->getParent() &&		assert(TheCall->getParent() && TheCall->getParent()->getParent() &&
"Instruction not in function!");		"Instruction not in function!");

// If IFI has any state in it, zap it before we fill it in.		// If IFI has any state in it, zap it before we fill it in.
IFI.reset();		IFI.reset();

const Function *CalledFunc = CS.getCalledFunction();		const Function *CalledFunc = IS.getCalledFunction();
if (!CalledFunc || // Can't inline external function or indirect		if (!CalledFunc || // Can't inline external function or indirect
CalledFunc->isDeclaration() || // call, or call to a vararg function!		CalledFunc->isDeclaration() || // call, or call to a vararg function!
CalledFunc->getFunctionType()->isVarArg()) return false;		CalledFunc->getFunctionType()->isVarArg()) return false;

// If the call to the callee cannot throw, set the 'nounwind' flag on any		// If the call to the callee cannot throw, set the 'nounwind' flag on any
// calls that we inline.		// calls that we inline.
bool MarkNoUnwind = CS.doesNotThrow();		bool MarkNoUnwind = IS.doesNotThrow();

BasicBlock *OrigBB = TheCall->getParent();		BasicBlock *OrigBB = TheCall->getParent();
Function *Caller = OrigBB->getParent();		Function *Caller = OrigBB->getParent();

// GC poses two hazards to inlining, which only occur when the callee has GC:		// GC poses two hazards to inlining, which only occur when the callee has GC:
// 1. If the caller has no GC, then the callee's GC must be propagated to the		// 1. If the caller has no GC, then the callee's GC must be propagated to the
// caller.		// caller.
// 2. If the caller has a differing GC, it is invalid to inline.		// 2. If the caller has a differing GC, it is invalid to inline.
[... 36 lines not shown ...]	bool InlineFunctionImpl(InlineSite IS, InlineFunctionInfo &IFI,

{ // Scope to destroy VMap after cloning.		{ // Scope to destroy VMap after cloning.
ValueToValueMapTy VMap;		ValueToValueMapTy VMap;
// Keep a list of pair (dst, src) to emit byval initializations.		// Keep a list of pair (dst, src) to emit byval initializations.
SmallVector<std::pair<Value*, Value*>, 4> ByValInit;		SmallVector<std::pair<Value*, Value*>, 4> ByValInit;

auto &DL = Caller->getParent()->getDataLayout();		auto &DL = Caller->getParent()->getDataLayout();

assert(CalledFunc->arg_size() == CS.arg_size() &&		assert(CalledFunc->arg_size() == IS.arg_size() &&
"No varargs calls can be inlined!");		"No varargs calls can be inlined!");

// Calculate the vector of arguments to pass into the function cloner, which		// Calculate the vector of arguments to pass into the function cloner, which
// matches up the formal to the actual argument values.		// matches up the formal to the actual argument values.
CallSite::arg_iterator AI = CS.arg_begin();		CallSite::arg_iterator AI = IS.arg_begin();
unsigned ArgNo = 0;		unsigned ArgNo = 0;
for (Function::const_arg_iterator I = CalledFunc->arg_begin(),		for (Function::const_arg_iterator I = CalledFunc->arg_begin(),
E = CalledFunc->arg_end(); I != E; ++I, ++AI, ++ArgNo) {		E = CalledFunc->arg_end(); I != E; ++I, ++AI, ++ArgNo) {
Value *ActualArg = *AI;		Value *ActualArg = *AI;

// When byval arguments actually inlined, we need to make the copy implied		// When byval arguments actually inlined, we need to make the copy implied
// by them explicit. However, we don't do this if the callee is readonly		// by them explicit. However, we don't do this if the callee is readonly
// or readnone, because the copy would be unneeded: the callee doesn't		// or readnone, because the copy would be unneeded: the callee doesn't
// modify the struct.		// modify the struct.
if (CS.isByValArgument(ArgNo)) {		if (IS.isByValArgument(ArgNo)) {
ActualArg = HandleByValArgument(ActualArg, TheCall, CalledFunc, IFI,		ActualArg = HandleByValArgument(ActualArg, TheCall, CalledFunc, IFI,
CalledFunc->getParamAlignment(ArgNo+1));		CalledFunc->getParamAlignment(ArgNo+1));
if (ActualArg != *AI)		if (ActualArg != *AI)
ByValInit.push_back(std::make_pair(ActualArg, (Value*) *AI));		ByValInit.push_back(std::make_pair(ActualArg, (Value*) *AI));
}		}

VMap[I] = ActualArg;		VMap[I] = ActualArg;
}		}

// Add alignment assumptions if necessary. We do this before the inlined		// Add alignment assumptions if necessary. We do this before the inlined
// instructions are actually cloned into the caller so that we can easily		// instructions are actually cloned into the caller so that we can easily
// check what will be known at the start of the inlined code.		// check what will be known at the start of the inlined code.
AddAlignmentAssumptions(CS, IFI);		AddAlignmentAssumptions(IS, IFI);

// We want the inliner to prune the code as it copies. We would LOVE to		// We want the inliner to prune the code as it copies. We would LOVE to
// have no dead or constant instructions leftover after inlining occurs		// have no dead or constant instructions leftover after inlining occurs
// (which can happen, e.g., because an argument was constant), but we'll be		// (which can happen, e.g., because an argument was constant), but we'll be
// happy with whatever the cloner can do.		// happy with whatever the cloner can do.
CloneAndPruneFunctionInto(Caller, CalledFunc, VMap,		CloneAndPruneFunctionInto(Caller, CalledFunc, VMap,
/*ModuleLevelChanges=*/false, Returns, ".i",		/*ModuleLevelChanges=*/false, Returns, ".i",
&InlinedFunctionInfo, TheCall);		&InlinedFunctionInfo, TheCall);

// Remember the first block that is newly cloned over.		// Remember the first block that is newly cloned over.
FirstNewBlock = LastBlock; ++FirstNewBlock;		FirstNewBlock = LastBlock; ++FirstNewBlock;

// Inject byval arguments initialization.		// Inject byval arguments initialization.
for (std::pair<Value*, Value*> &Init : ByValInit)		for (std::pair<Value*, Value*> &Init : ByValInit)
HandleByValArgumentInit(Init.first, Init.second, Caller->getParent(),		HandleByValArgumentInit(Init.first, Init.second, Caller->getParent(),
FirstNewBlock, IFI);		FirstNewBlock, IFI);

// Update the callgraph if requested.		// Update the callgraph if requested.
if (IFI.CG)		if (IFI.CG)
UpdateCallGraphAfterInlining(CS, FirstNewBlock, VMap, IFI);		UpdateCallGraphAfterInlining(IS, FirstNewBlock, VMap, IFI);

// Update inlined instructions' line number information.		// Update inlined instructions' line number information.
fixupLineNumbers(Caller, FirstNewBlock, TheCall);		fixupLineNumbers(Caller, FirstNewBlock, TheCall);

// Clone existing noalias metadata if necessary.		// Clone existing noalias metadata if necessary.
CloneAliasScopeMetadata(CS, VMap);		CloneAliasScopeMetadata(IS, VMap);

// Add noalias metadata if necessary.		// Add noalias metadata if necessary.
AddAliasScopeMetadata(CS, VMap, DL, IFI.AA);		AddAliasScopeMetadata(IS, VMap, DL, IFI.AA);

// FIXME: We could register any cloned assumptions instead of clearing the		// FIXME: We could register any cloned assumptions instead of clearing the
// whole function's cache.		// whole function's cache.
if (IFI.ACT)		if (IFI.ACT)
IFI.ACT->getAssumptionCache(*Caller).clear();		IFI.ACT->getAssumptionCache(*Caller).clear();
}		}

// If there are any alloca instructions in the block that used to be the entry		// If there are any alloca instructions in the block that used to be the entry
Show All 36 Lines	for (BasicBlock::iterator I = FirstNewBlock->begin(),
AI, I);		AI, I);
}		}
// Move any dbg.declares describing the allocas into the entry basic block.		// Move any dbg.declares describing the allocas into the entry basic block.
DIBuilder DIB(*Caller->getParent());		DIBuilder DIB(*Caller->getParent());
for (auto &AI : IFI.StaticAllocas)		for (auto &AI : IFI.StaticAllocas)
replaceDbgDeclareForAlloca(AI, AI, DIB, /*Deref=*/false);		replaceDbgDeclareForAlloca(AI, AI, DIB, /*Deref=*/false);
}		}

bool InlinedMustTailCalls = false;		InlinedMustTailCalls = false;
if (InlinedFunctionInfo.ContainsCalls) {		if (InlinedFunctionInfo.ContainsCalls) {
CallInst::TailCallKind CallSiteTailKind = CallInst::TCK_None;		CallInst::TailCallKind CallSiteTailKind = CallInst::TCK_None;
if (CallInst *CI = dyn_cast<CallInst>(TheCall))		if (CallInst *CI = dyn_cast<CallInst>(TheCall))
CallSiteTailKind = CI->getTailCallKind();		CallSiteTailKind = CI->getTailCallKind();

for (Function::iterator BB = FirstNewBlock, E = Caller->end(); BB != E;		for (Function::iterator BB = FirstNewBlock, E = Caller->end(); BB != E;
++BB) {		++BB) {
for (Instruction &I : *BB) {		for (Instruction &I : *BB) {
Show All 150 Lines	if (Returns.size() == 1 && std::distance(FirstNewBlock, Caller->end()) == 1) {

// If the call site was an invoke instruction, add a branch to the normal		// If the call site was an invoke instruction, add a branch to the normal
// destination.		// destination.
if (InvokeInst *II = dyn_cast<InvokeInst>(TheCall)) {		if (InvokeInst *II = dyn_cast<InvokeInst>(TheCall)) {
BranchInst *NewBr = BranchInst::Create(II->getNormalDest(), TheCall);		BranchInst *NewBr = BranchInst::Create(II->getNormalDest(), TheCall);
NewBr->setDebugLoc(Returns[0]->getDebugLoc());		NewBr->setDebugLoc(Returns[0]->getDebugLoc());
}		}

// If the return instruction returned a value, replace uses of the call with		ReturnVal = Returns[0]->getReturnValue();
		reames: Unintentional diff?
		sanjoy: See previous comment.
// uses of the returned value.
if (!TheCall->use_empty()) {
ReturnInst *R = Returns[0];
if (TheCall == R->getReturnValue())
TheCall->replaceAllUsesWith(UndefValue::get(TheCall->getType()));
else
TheCall->replaceAllUsesWith(R->getReturnValue());
}
// Since we are now done with the Call/Invoke, we can delete it.
TheCall->eraseFromParent();

// Since we are now done with the return instruction, delete it also.		// Since we are now done with the return instruction, delete it.
Returns[0]->eraseFromParent();		Returns[0]->eraseFromParent();

// We are now done with the inlining.		// We are now done with the inlining.
return true;		return true;
}		}

// Otherwise, we have the normal case, of more than one block to inline or		// Otherwise, we have the normal case, of more than one block to inline or
// multiple return sites.		// multiple return sites.
Show All 45 Lines	bool InlineFunctionImpl(InlineSite IS, InlineFunctionInfo &IFI,
if (Returns.size() > 1) {		if (Returns.size() > 1) {
// The PHI node should go at the front of the new basic block to merge all		// The PHI node should go at the front of the new basic block to merge all
// possible incoming values.		// possible incoming values.
if (!TheCall->use_empty()) {		if (!TheCall->use_empty()) {
PHI = PHINode::Create(RTy, Returns.size(), TheCall->getName(),		PHI = PHINode::Create(RTy, Returns.size(), TheCall->getName(),
AfterCallBB->begin());		AfterCallBB->begin());
// Anything that used the result of the function call should now use the		// Anything that used the result of the function call should now use the
// PHI node as their operand.		// PHI node as their operand.
TheCall->replaceAllUsesWith(PHI);		ReturnVal = PHI;
}		}

// Loop over all of the return instructions adding entries to the PHI node		// Loop over all of the return instructions adding entries to the PHI node
// as appropriate.		// as appropriate.
if (PHI) {		if (PHI) {
for (unsigned i = 0, e = Returns.size(); i != e; ++i) {		for (unsigned i = 0, e = Returns.size(); i != e; ++i) {
ReturnInst *RI = Returns[i];		ReturnInst *RI = Returns[i];
assert(RI->getReturnValue()->getType() == PHI->getType() &&		assert(RI->getReturnValue()->getType() == PHI->getType() &&
Show All 14 Lines	if (Returns.size() > 1) {
}		}
// We need to set the debug location to somewhere inside the		// We need to set the debug location to somewhere inside the
// inlined function. The line number may be nonsensical, but the		// inlined function. The line number may be nonsensical, but the
// instruction will at least be associated with the right		// instruction will at least be associated with the right
// function.		// function.
if (CreatedBranchToNormalDest)		if (CreatedBranchToNormalDest)
CreatedBranchToNormalDest->setDebugLoc(Loc);		CreatedBranchToNormalDest->setDebugLoc(Loc);
} else if (!Returns.empty()) {		} else if (!Returns.empty()) {
// Otherwise, if there is exactly one return value, just replace anything		// If there is exactly one return value, we do not need to introduce PHI
// using the return value of the call with the computed value.		// nodes.
if (!TheCall->use_empty()) {		ReturnVal = Returns[0]->getReturnValue();
if (TheCall == Returns[0]->getReturnValue())
TheCall->replaceAllUsesWith(UndefValue::get(TheCall->getType()));
else
TheCall->replaceAllUsesWith(Returns[0]->getReturnValue());
}

// Update PHI nodes that use the ReturnBB to use the AfterCallBB.		// Update PHI nodes that use the ReturnBB to use the AfterCallBB.
BasicBlock *ReturnBB = Returns[0]->getParent();		BasicBlock *ReturnBB = Returns[0]->getParent();
ReturnBB->replaceAllUsesWith(AfterCallBB);		ReturnBB->replaceAllUsesWith(AfterCallBB);

// Splice the code from the return block into the block that it will return		// Splice the code from the return block into the block that it will return
// to, which contains the code that was after the call.		// to, which contains the code that was after the call.
AfterCallBB->getInstList().splice(AfterCallBB->begin(),		AfterCallBB->getInstList().splice(AfterCallBB->begin(),
ReturnBB->getInstList());		ReturnBB->getInstList());

if (CreatedBranchToNormalDest)		if (CreatedBranchToNormalDest)
CreatedBranchToNormalDest->setDebugLoc(Returns[0]->getDebugLoc());		CreatedBranchToNormalDest->setDebugLoc(Returns[0]->getDebugLoc());

// Delete the return instruction now and empty ReturnBB now.		// Delete the return instruction now and empty ReturnBB now.
Returns[0]->eraseFromParent();		Returns[0]->eraseFromParent();
ReturnBB->eraseFromParent();		ReturnBB->eraseFromParent();
} else if (!TheCall->use_empty()) {		} else {
// No returns, but something is using the return value of the call. Just		// No returns, hence no well-defined return value
// nuke the result.		ReturnVal = UndefValue::get(TheCall->getType());
TheCall->replaceAllUsesWith(UndefValue::get(TheCall->getType()));
}		}

// Since we are now done with the Call/Invoke, we can delete it.
TheCall->eraseFromParent();

// If we inlined any musttail calls and the original return is now
// unreachable, delete it. It can only contain a bitcast and ret.
if (InlinedMustTailCalls && pred_begin(AfterCallBB) == pred_end(AfterCallBB))
AfterCallBB->eraseFromParent();

// We should always be able to fold the entry block of the function into the		// We should always be able to fold the entry block of the function into the
// single predecessor of the block...		// single predecessor of the block...
assert(cast<BranchInst>(Br)->isUnconditional() && "splitBasicBlock broken!");		assert(cast<BranchInst>(Br)->isUnconditional() && "splitBasicBlock broken!");
BasicBlock *CalleeEntry = cast<BranchInst>(Br)->getSuccessor(0);		BasicBlock *CalleeEntry = cast<BranchInst>(Br)->getSuccessor(0);

// Splice the code entry block into calling block, right before the		// Splice the code entry block into calling block, right before the
// unconditional branch.		// unconditional branch.
CalleeEntry->replaceAllUsesWith(OrigBB); // Update PHI nodes		CalleeEntry->replaceAllUsesWith(OrigBB); // Update PHI nodes
Show All 9 Lines	bool InlineFunctionImpl(InlineSite IS, InlineFunctionInfo &IFI,
// the entries are the same or undef). If so, remove the PHI so it doesn't		// the entries are the same or undef). If so, remove the PHI so it doesn't
// block other optimizations.		// block other optimizations.
if (PHI) {		if (PHI) {
auto &DL = Caller->getParent()->getDataLayout();		auto &DL = Caller->getParent()->getDataLayout();
if (Value *V = SimplifyInstruction(PHI, DL, nullptr, nullptr,		if (Value *V = SimplifyInstruction(PHI, DL, nullptr, nullptr,
&IFI.ACT->getAssumptionCache(*Caller))) {		&IFI.ACT->getAssumptionCache(*Caller))) {
PHI->replaceAllUsesWith(V);		PHI->replaceAllUsesWith(V);
PHI->eraseFromParent();		PHI->eraseFromParent();
		assert(ReturnVal == PHI &&
		"PHI was created to represent the return value!");
		ReturnVal = V;
}		}
}		}

return true;		return true;
}		}
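
The right-hand side of the diff stops calling `TheCall->replaceAllUsesWith(...)` in each branch and instead records the chosen value in `ReturnVal`. The toy sketch below illustrates that "select first, rewrite once" shape outside of LLVM. Everything in it is an assumption made for illustration: values are plain strings rather than `llvm::Value` pointers, and `selectReturnValue` and `replaceCallUses` are hypothetical names that do not appear in the patch. Where the real patch consumes `ReturnVal` is not visible in the lines shown, so the second function is only one plausible arrangement.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Toy model: a value is just a name, not an llvm::Value pointer.
using ValueName = std::string;

// Hypothetical counterpart of the ReturnVal assignments in the diff.
// It only picks the replacement value; it does not touch any uses.
ValueName selectReturnValue(const std::vector<ValueName> &Returns) {
  if (Returns.size() > 1)
    return "phi";      // several returns: a merged PHI stands for the result
  if (!Returns.empty())
    return Returns[0]; // exactly one return: use its value directly
  return "undef";      // no returns, hence no well-defined return value
}

// Hypothetical follow-up step: rewrite every use of the call in one place.
void replaceCallUses(std::vector<ValueName> &Uses, const ValueName &TheCall,
                     ValueName ReturnVal) {
  // Mirrors the pre-patch check `TheCall == R->getReturnValue()`:
  // a call whose result is itself is replaced with undef.
  if (ReturnVal == TheCall)
    ReturnVal = "undef";
  std::replace(Uses.begin(), Uses.end(), TheCall, ReturnVal);
}
```

With this split, a caller holding a list of uses such as `{"call", "x", "call"}` would first ask `selectReturnValue` for the replacement and then pass it to `replaceCallUses`, so the decision about how uses are rewired sits in a single location instead of being repeated per branch.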