This is an archive of the discontinued LLVM Phabricator instance.

Paths

llvm/trunk/include/llvm/Analysis/CGSCCPassManager.h
llvm/trunk/include/llvm/Transforms/IPO/Inliner.h
llvm/trunk/include/llvm/Transforms/IPO/InlinerPass.h
llvm/trunk/include/llvm/Transforms/Utils/Cloning.h
llvm/trunk/lib/Analysis/InlineCost.cpp
llvm/trunk/lib/Passes/PassBuilder.cpp
llvm/trunk/lib/Passes/PassRegistry.def
llvm/trunk/lib/Transforms/IPO/AlwaysInliner.cpp
llvm/trunk/lib/Transforms/IPO/InlineSimple.cpp
llvm/trunk/lib/Transforms/IPO/Inliner.cpp
llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp
llvm/trunk/test/Transforms/Inline/basictest.ll
llvm/trunk/test/Transforms/Inline/cgscc-update.ll
llvm/trunk/test/Transforms/Inline/last-callsite.ll
llvm/trunk/test/Transforms/Inline/nested-inline.ll

Differential D24226

[PM] Provide an initial, minimal port of the inliner to the new pass manager.
ClosedPublic

Authored by chandlerc on Sep 5 2016, 1:08 AM.


Details

Reviewers

sanjoy

Commits

rG1d9631144761: [PM] Provide an initial, minimal port of the inliner to the new pass manager.
rL290161: [PM] Provide an initial, minimal port of the inliner to the new pass manager.

Summary

This doesn't implement *every* feature of the existing inliner, but
tries to implement the most important ones for building a functional
optimization pipeline and beginning to sort out bugs, regressions, and
other problems.

Notable, but intentional omissions:

- No alloca merging support. Why? Because it isn't clear we want to do this at all. Active discussion and investigation is going on to remove it, so for simplicity I omitted it.
- No support for trying to iterate on "internally" devirtualized calls. Why? Because it adds what I suspect is inappropriate coupling for little or no benefit. We will have an outer iteration system that tracks devirtualization including that from function passes and iterates already. We should improve that rather than approximate it here.
- Optimization remarks. Why? Purely to make the patch smaller, no other reason at all.

Note that of these, the last two are the ones I expect to need to
implement eventually. The last one I'll probably do almost immediately.
But I wanted to skip it in the initial patch to try to focus the change
as much as possible as there is already a lot of code moving around and
both of these *could* be skipped without really disrupting the core
logic.

A summary of the different things happening here:

1) Adding the usual new PM class and rigging.

2) Fixing minor underlying assumptions in the inline cost analysis or inline logic that don't generally hold in the new PM world.

3) Adding the core pass logic which is in essence a loop over the calls in the nodes in the call graph. This is a bit duplicated from the old inliner, but only a handful of lines could realistically be shared. (I tried at first, and it really didn't help anything.) All told, this is only about 100 lines of code, and most of that is the mechanics of wiring up analyses from the new PM world.

4) Updating the LazyCallGraph (in the new PM) based on the *newly inlined* calls and references. This is very minimal because we cannot form cycles.

5) When inlining removes the last use of a function, eagerly updating the call graph and deleting the function so that any "one use remaining" inline cost heuristics are immediately refined. This is pretty minor in the inliner at 15 or so lines.

6) After all the inlining for a particular function, updating the LazyCallGraph and the CGSCC pass manager to reflect the removed call edges and function references. Both of these can happen just by removing the call instruction that is inlined, but I've implemented this as a generalized "DCE" update because we run InstSimplify as we process inlined instructions and I don't want to constrain how far we constant fold. Ultimately, it adds no real complexity to handle the full "DCE" cases. The logic here is similar but not precisely the same as the fully general function-pass update. The reason for the difference is for efficiency. We can build the necessary sets up-front and early exit in the cases where all called functions or referenced functions are accounted for. All told this is just over 100 lines. I'd still like to find a way to simplify this, as it feels like more code than should be necessary but I've not found anything that is really an improvement. (Mostly found things that make it shorter but harder to read and understand.)

7) Refactoring the existing CGSCC update logic to share as much code as possible when implementing #6.
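
The core pass logic summarized above (a loop over the calls in a function that also picks up the calls exposed by inlining) can be pictured with a toy model. This is only a sketch in plain C++ and not the code in the patch: `ToyFunction`, `ToyModule`, and `inlineAllCalls` are hypothetical names, a function body is reduced to a list of callee names, "inlining" just splices the callee's calls into the caller, and the inline cost analysis is replaced by a simple budget.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical toy IR: a function is just the ordered list of callees it calls.
// A name missing from the module models a declaration without a body.
using ToyFunction = std::vector<std::string>;
using ToyModule = std::map<std::string, ToyFunction>;

// Inline calls in `Caller` until no inlinable call remains or `Budget` inlines
// were performed. Returns the number of calls inlined.
int inlineAllCalls(ToyModule &M, const std::string &Caller, int Budget) {
  assert(Budget >= 0 && "budget must not be negative");
  int NumInlined = 0;
  ToyFunction &Calls = M.at(Caller);
  // Calls exposed by inlining are spliced in at position I, so they are
  // revisited without rescanning the whole function.
  for (std::size_t I = 0; I < Calls.size();) {
    const std::string Callee = Calls[I];
    auto It = M.find(Callee);
    // Skip declarations, direct recursion, and anything over budget.
    if (It == M.end() || Callee == Caller || NumInlined >= Budget) {
      ++I;
      continue;
    }
    // "Inline": replace the call with a copy of the callee's calls.
    const ToyFunction Body = It->second;
    Calls.erase(Calls.begin() + I);
    Calls.insert(Calls.begin() + I, Body.begin(), Body.end());
    ++NumInlined;
  }
  return NumInlined;
}
```

The budget stands in for the cost heuristic and also bounds the loop when functions are mutually recursive.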

While the patch delta is somewhat large, much of this is moving code out
to helper functions. I can separate these into NFC refactoring patches
that I land ahead of time but it didn't make sense as I would have no
way to explain the *particular* refactoring without adding the second
caller to that API here.

The really substantial delta is 100 lines of inliner and under 300 lines
of CGSCC update, which seems fairly reasonable for the core of all this
madness. =] The rest are comments, APIs in headers, and boilerplate for
the new pass.

Depends on D24225

Diff Detail

Repository: rL LLVM

Event Timeline

chandlerc updated this revision to Diff 70307. Sep 5 2016, 1:08 AM

chandlerc retitled this revision to [PM] Provide an initial, minimal port of the inliner to the new pass manager.

chandlerc updated this object.

chandlerc added a reviewer: sanjoy.

chandlerc added a parent revision: D24225: [LCG] Add the necessary functionality to the LazyCallGraph to support inlining..

chandlerc added a subscriber: llvm-commits.

Herald added subscribers: eraman, mcrosier, mehdi_amini. Sep 5 2016, 1:08 AM

eastig added a subscriber: eastig. Sep 5 2016, 7:33 AM

davidxl added a subscriber: davidxl. Sep 5 2016, 10:51 PM

davidxl added inline comments.

lib/Transforms/IPO/Inliner.cpp
842 ↗	(On Diff #70307)	Where is this behavior documented? It seems very fragile to depend on implementation details here -- there is not even a way to assert it.
851 ↗	(On Diff #70307)	It is quite unfortunate IR will need to be traversed again here -- there is compile time overhead here. Why can't IFI be used to get the newly exposed callsite?
863 ↗	(On Diff #70307)	If a CGSCC pass has to go through hoops to get SCC graph update properly, I feel that something is wrong. Having multiple levels of SCC is one thing whose complexity can probably be hidden, but requiring every pass to pay the penalty is a different story.

eraman added inline comments. Sep 6 2016, 10:49 AM

lib/Transforms/IPO/Inliner.cpp
848 ↗	(On Diff #70307)	If only one block is cloned into the caller, it is spliced into the block containing the call instruction. This logic to walk the newly added basic blocks does not work in that case.
894 ↗	(On Diff #70307)	Should this be if (!DidInline) continue; Changed = true; ?

Some small nits.

lib/Analysis/CGSCCPassManager.cpp
433 ↗	(On Diff #70307)	maybe auto
lib/Analysis/LazyCallGraph.cpp
28 ↗	(On Diff #70307)	emplace/try_emplace?

Prazek added inline comments. Sep 12 2016, 2:30 PM

lib/Analysis/CGSCCPassManager.cpp
140–141 ↗	(On Diff #70307)	why not using?
199–200 ↗	(On Diff #70307)	same.
269 ↗	(On Diff #70307)	I know that it is not your change, but auto
388–391 ↗	(On Diff #70307)	usings?
432 ↗	(On Diff #70307)	auto?
434 ↗	(On Diff #70307)	auto

Most of the comments down below are stylistic issues, except: InlinerPass::run in this patch does not handle the full generality of InlineFunction. That is the most important thing that needs to be addressed.

include/llvm/Analysis/CGSCCPassManager.h
527 ↗	(On Diff #70307)	Please document the `RC` and `C` parameters-- it isn't obvious what they represent. Are they `SourceRC` and `SourceC`? Also, the return value needs to be documented.
lib/Analysis/CGSCCPassManager.cpp
144 ↗	(On Diff #70307)	If `RC` and `C` are the source `RefSCC` and `SCC`, then please add an assert to make that obvious.
163 ↗	(On Diff #70307)	This should be mentioned on the declaration of `RefSCC` itself.
182 ↗	(On Diff #70307)	I'd just assert that the first element of `NewRefSCCs` is `RC` and then skip the first element in the loop. Checking `NewRC != RC` in every iteration may let a bug slip through undetected.
393 ↗	(On Diff #70307)	Add a `Node::empty()`?
421 ↗	(On Diff #70307)	What is the extra check on `Visited.insert(Callee).second` getting you here? Why not just try to `DeadCallTargets.erase(Callee)` directly?
429 ↗	(On Diff #70307)	I don't buy this reason for doing two walks. :) I think you should be able to do a combined walk and break out if `DeadRefTargets.empty() && DeadCallTargets.empty()`, and predicate the `DeadCallTargets` on `!DeadCallTargets.empty()` and corresponding for `DeadRefTargets`. We'll have the overhead of repeated calls to `SmallPtrSet::empty()` but that sounds cheaper than walking all instructions twice. However, walking twice looks simpler. If you'd rather keep it this way to make things readable, I can buy that. :)
454 ↗	(On Diff #70307)	I'd s/`DeadTargets`/`DeadEdges`/
455 ↗	(On Diff #70307)	I'd s/`DemotedRefTargets`/`DemotedCallTargets` (i.e. these were call targets that were demoted). Alternatively `TargetsDemotedToRef` (actually, this one sounds better).
lib/Transforms/IPO/Inliner.cpp
848 ↗	(On Diff #70307)	Looks like this one wasn't addressed.
861 ↗	(On Diff #70307)	As mentioned in `D24226`, `InlineFunction` can insert non-trivial call edges (by promoting ref edges), causing SCCs to be formed.
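
The combined walk with an early exit suggested in the comment on line 429 can be sketched on a toy model. This is an assumption-laden illustration, not the code under review: `ToyInst` and `pruneLiveTargets` are hypothetical names, and `std::set` stands in for `SmallPtrSet`.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical toy instruction: it either calls `Target` or only references it.
struct ToyInst {
  std::string Target;
  bool IsCall;
};

// Start with every previously known target assumed dead, then erase the ones
// that are still used. Returns how many instructions were visited.
int pruneLiveTargets(const std::vector<ToyInst> &Insts,
                     std::set<std::string> &DeadCallTargets,
                     std::set<std::string> &DeadRefTargets) {
  int Visited = 0;
  for (const ToyInst &I : Insts) {
    // Early exit: every candidate is already accounted for.
    if (DeadCallTargets.empty() && DeadRefTargets.empty())
      break;
    ++Visited;
    if (I.IsCall)
      DeadCallTargets.erase(I.Target);
    // In this toy model a call also counts as a reference to its callee.
    DeadRefTargets.erase(I.Target);
  }
  assert(Visited <= static_cast<int>(Insts.size()));
  return Visited;
}
```

Whatever is left in the two sets after the walk corresponds to edges that no longer exist in the function body.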

This revision now requires changes to proceed. Sep 29 2016, 11:35 PM

Significant improvement of update logic to address core issues pointed out by
Sanjoy and Easwaran and revealed with further testing. New test cases added as
well. Extraneous changes to other code removed leaving this patch even more
focused on the inliner port.

Herald added a subscriber: fhahn. Dec 8 2016, 5:22 PM

With the latest update I've now addressed essentially all of the comments (that remained relevant). The patch is now, if anything, smaller and more focused too. =] Some responses to specific comments below.

include/llvm/Analysis/CGSCCPassManager.h
527 ↗	(On Diff #70307)	This code is gone now.
lib/Analysis/CGSCCPassManager.cpp
140–141 ↗	(On Diff #70307)	Consistency with other code, but all of this is gone now.
163 ↗	(On Diff #70307)	I'll fold this into the larger LCG documentation update ongoing. Also note that this existing comment just moved around (and is no longer moved after the updates to this patch).
182 ↗	(On Diff #70307)	Done but in a separate cleanup CL.
269 ↗	(On Diff #70307)	I will do a separate cleanup CL to add some use of auto in relevant places, but this doesn't belong in this change.
393 ↗	(On Diff #70307)	This code is gone so I've not added this. I can though if it comes back up.
421 ↗	(On Diff #70307)	All this code is gone now. Happy to revisit the walk approach in the other code though to either improve it or to change it to look like this walk looked. But not in the inliner patch. =]
429 ↗	(On Diff #70307)	See above, all this is gone now. We can talk about whether the existing code should be simplified in a separate patch.
432 ↗	(On Diff #70307)	I think the type helps readability here.
433 ↗	(On Diff #70307)	And here.
434 ↗	(On Diff #70307)	But will put auto here in some follow-up patch.
lib/Analysis/LazyCallGraph.cpp
28 ↗	(On Diff #70307)	The insert was already there. If there is some value to emplace (not sure there is) then we can go through and systematically use it, but I'd rather not blindly mix and match as code happens to be touched.
lib/Transforms/IPO/Inliner.cpp
842 ↗	(On Diff #70307)	It's not really documented at all, and it only actually holds for the clone API, not the InlineFunction API. See the discussion with Easwaran below for details, but I'm no longer dealing with any of this.
848 ↗	(On Diff #70307)	This is a great comment Easwaran, and in fact there are more issues. I was thinking of the clone API when I wrote this, InlineFunction does way too much to let this work. There were actually other issues here as well, and I've switched entirely to a simpler approach where we add all of the edges in the call graph for all of the inlined functions, and then prune out the ones based on simplifications at the end of this. For iterating in the inliner itself, I've added a tiny bit of code inside `InlineFunction` where we were already walking the cloned instructions to compute the list of inlined calls even without a call graph, and used that here. Hopefully this works better -- I even already had a test case that was hitting this and just hadn't looked at it carefully enough.
851 ↗	(On Diff #70307)	I've actually switched to getting the newly exposed callsite inside `InlineFunction` in the latest version (see above for why). It still walks the newly inlined instructions, but there really isn't any other way and there shouldn't be much cost here (we walk the inlined instructions several times inside `InlineFunction` from what I can see).
861 ↗	(On Diff #70307)	Yep. I've added the test cases for this and it is now handled much better.
863 ↗	(On Diff #70307)	I'm not sure what the concern is here? This is even simpler now, but either way I think the Inliner is somewhat unusual in that it is mutating the call graph. Passes which do that will have to do work to correctly update SCCs.
894 ↗	(On Diff #70307)	Uh, quite. =D Fixed!

Overall this looks great! Minor comments inline.

include/llvm/Transforms/IPO/Inliner.h
96 ↗	(On Diff #80851)	The explicit `llvm::` qualification is not needed.
lib/Transforms/IPO/Inliner.cpp
667 ↗	(On Diff #80851)	Line looks too long, clang-format?
781 ↗	(On Diff #80851)	I'd mildly prefer spelling out `ProfileSummaryInfo` here, since the type is not obvious from context.
799 ↗	(On Diff #80851)	s/`llvm::getInlineCost`/`getInlineCost`/
857 ↗	(On Diff #80851)	I'm not confident that the call sites you've put in `Calls` here and above will stay valid across the future iterations of this loop. That is, say we're looking at the `main` SCC in: `bool f() { return false; } void g() { } void main() { if (f() /* CS0 */) g() /* CS1 */; }` Before entering the loop, we'll put `CS0` and `CS1` in `Calls`. Once we inline through `CS0`, it is reasonable (though I don't know if it does this today) for `InlineFunction` to want to simplify away and delete `CS1`, leaving a dangling pointer in `Calls`. Is there a reason why that ^ won't happen?
865 ↗	(On Diff #80851)	This bit needs a unit test -- nothing in `make check` failed when I commented this out.
895 ↗	(On Diff #80851)	Two periods intentional?
897 ↗	(On Diff #80851)	Why is `DebugLogging` true by default?
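
The hazard described in the comment on line 857 can be made concrete with a toy model. The sketch below shows one defensive alternative, a worklist of stable IDs that are checked for liveness before use. It is not what the patch does, and `ToyCallSites` and `visitLiveCalls` are hypothetical names.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical toy caller: call sites keyed by a stable ID.
using ToyCallSites = std::map<int, std::string>;

// Visit the call sites captured in `Worklist`, skipping any that were erased
// in the meantime. `Simplify` stands in for an inlining step that may delete
// other call sites in the same caller.
template <typename SimplifyFn>
std::vector<std::string> visitLiveCalls(ToyCallSites &Sites,
                                        const std::vector<int> &Worklist,
                                        SimplifyFn Simplify) {
  std::vector<std::string> Visited;
  for (int ID : Worklist) {
    auto It = Sites.find(ID);
    if (It == Sites.end())
      continue; // The call site was deleted after the worklist was built.
    Visited.push_back(It->second);
    Simplify(Sites, ID);
  }
  assert(Visited.size() <= Worklist.size());
  return Visited;
}
```

With raw pointers in the worklist, the skipped entry would instead be a dangling pointer, which is the scenario the comment asks about.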

This revision now requires changes to proceed. Dec 11 2016, 7:27 PM

Thanks! Mostly minor comments below.

include/llvm/Transforms/IPO/Inliner.h
1 ↗	(On Diff #70307)	The header above should read Inliner.h
include/llvm/Transforms/Utils/Cloning.h
200 ↗	(On Diff #80851)	typo: of -> if
lib/Transforms/IPO/Inliner.cpp
857 ↗	(On Diff #80851)	When Calls is initially populated, only callees for which isDeclaration is false are added. That is not the case with the callees in IFI.InlinedCallSites. We check and bail out if Callee->isDeclaration() is true in many places (getInlineCost for example), so this probably doesn't break anything now, but it's preferable to filter out callees without their body when we augment Calls above.
881 ↗	(On Diff #80851)	This might add edges that don't exist anymore in the IR: assuming a->b->c originally, and b first gets inlined and then c gets inlined into 'a' this would still add a->c edge.
lib/Transforms/Utils/InlineFunction.cpp
1645 ↗	(On Diff #80851)	Partial inlining implementation creates IFI with an empty CG and this code unnecessarily adds the inlined callsites when invoked in the context of partial inlining. Not a big deal, but maybe you can change InlinedCallSites to a pointer and guard the below code based on that?

sanjoy added inline comments. Dec 12 2016, 5:23 PM

lib/Transforms/IPO/Inliner.cpp
857 ↗	(On Diff #80851)	If this is a reply to what I said above, then I'm not sure how `isDeclaration` is relevant to the problem I'm trying to point out. I'm trying to sketch a scenario where `InlineFunction` ends up excising a call site to a function (that has a body) after we've put a pointer to the `CallInst` or `InvokeInst` for the said call site in `Calls` therefore ending up with a dangling pointer. It is possible that `InlineFunction` does not do such simplifications (simplifications in blocks that logically belonged to the caller, that is); but if that is the case then that invariant should be documented and tested.

eraman added inline comments. Dec 12 2016, 5:31 PM

lib/Transforms/IPO/Inliner.cpp
857 ↗	(On Diff #80851)	Sorry, that wasn't a reply to your comment. Mine was a separate unrelated comment on line 857 above.

Update to address review comments (and rebase).

Thanks so much for the review! Updated patch and some responses below.

lib/Transforms/IPO/Inliner.cpp
857 ↗	(On Diff #80851)	Nope, no reason other than "it doesn't do that at the moment". At least, that I'm aware of... I may be missing something, but a casual look through the old inliner's code makes me think it has the same fundamental assumption. Perhaps as a consequence, the current inliner does DCE on call instructions rather than InlineFunction doing this.... I think we should just document this (in a separate patch probably)...
857 ↗	(On Diff #80851)	Sanjoy, thanks for the thought. Sadly we rely on this already and just don't have any testing for it. I've added a test case based on your idea with lots of dead code to try and make sure we don't get this wrong at some point in the future. It also showcases exactly why we should do some DCE in the caller when inlining because we leave trivially dead code around. I've also added a comment while I'm here so we don't forget. It's a small change. Good thought Easwaran, totally agree about inserting the minimal number of calls. I've adjusted the code accordingly.
865 ↗	(On Diff #80851)	Yea, deleting the analogous line above also doesn't cause anything to fail. :: sigh ::
865 ↗	(On Diff #80851)	I've had to add numerous tests to cover all of the "last callsite" heuristics under this inliner. They include crafty test cases that cover this. I'm not really trying to make them work with the original inliner though as the approach used there is fundamentally harder to test.
881 ↗	(On Diff #80851)	Yes, this is an over approximation. But we then clean all of these (dead) edges up with the below CG update routine.
897 ↗	(On Diff #80851)	.... Because I didn't thread an explicit DebugLogging flag through this code (and I don't want to), and I haven't taught the CGSCC update logic code to support a side channel like DEBUG_PASS yet, so I hard coded it. Anyways, removed. The right way to make this stuff debuggable is to support DEBUG_PASS, but that's a separate issue and nothing to do with this patch.
lib/Transforms/Utils/InlineFunction.cpp
1645 ↗	(On Diff #80851)	I don't think this is really a big deal, and I'd rather not deal with ownership issues of a pointer.... The partial inlining code isn't really heavily used, and it remains correct. In fact, I suspect that if we actually wanted to keep the partial inlining code (but I don't think we do long-term), we'd want to use this exact data structure to iterate the same way we do in the inliner. So if it's OK, I'd rather keep this code simple and pay a (very small I think) overhead of building this when it isn't used in that context.
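
The over-approximation discussed for line 881 (add every edge of an inlined callee, then clean up the dead ones) can be sketched as two steps on a toy edge set. This is an illustrative model only. `EdgeSet`, `addInlinedEdges`, and `pruneDeadEdges` are hypothetical names and do not correspond to the LazyCallGraph API.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

using EdgeSet = std::set<std::string>;

// Step 1: after inlining a callee, conservatively add every edge the callee
// had, even if a later simplification or inline removes the call again.
void addInlinedEdges(EdgeSet &CallerEdges, const EdgeSet &CalleeEdges) {
  CallerEdges.insert(CalleeEdges.begin(), CalleeEdges.end());
}

// Step 2: once inlining for the caller is done, drop edges that have no
// remaining call in the caller's body. Returns the number of edges removed.
int pruneDeadEdges(EdgeSet &CallerEdges, const std::vector<std::string> &Body) {
  const EdgeSet Live(Body.begin(), Body.end());
  int NumRemoved = 0;
  for (auto I = CallerEdges.begin(); I != CallerEdges.end();) {
    if (Live.count(*I)) {
      ++I;
    } else {
      I = CallerEdges.erase(I); // erase returns the next valid iterator
      ++NumRemoved;
    }
  }
  assert(CallerEdges.size() <= Live.size());
  return NumRemoved;
}
```

In the a->b->c scenario from the review, `a` temporarily carries edges to both `b` and `c` and loses them again in the pruning step once neither call remains in its body.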

lgtm with minor nits

lib/Transforms/IPO/Inliner.cpp
783 ↗	(On Diff #81552)	Super minor, but it seems natural to have a `SCC::getModule()`. Maybe in a later change?
808 ↗	(On Diff #81552)	Why not `SmallVector<LazyCallGraph::Node *, 16> Nodes(InitialC.begin(), InitialC.end());`?
853 ↗	(On Diff #81552)	How about s/`Calls`/`CallsInCurrentNode`? Given the nested loops, it is a little hard to keep track of whether `Calls` has all the call sites in the SCC or just the ones in the current node.
868 ↗	(On Diff #81552)	(This is minor, please don't hold this patch over resolving this. We can add this (or not) in a later change.) I'd rather add to `Calls` if either `CS` is indirect OR is direct to a function with a body, since a later inline can make that call direct. That way, we get this case: `fnptr_t f() { return printf; } void g(fnptr_t val) { val(""); } void main() { fnptr_t val = f(); g(val); }` If we're lucky enough to first inline `g` and then `f`, we want to consider inlining the call to `val` that came in from `g` (that is now direct, after inlining `f`). Of course, we have to be somewhat smarter to always get this case, but the probability is higher if we reconsider indirect calls.
915 ↗	(On Diff #81552)	We'll only ever try to insert a function once into `DeadFunctions`, right? Since if we're trying to insert `InlinedCallee` into `DeadFunctions` then it has no uses, and we could not possibly try to inline through a call to it again later, so it should never appear again in `InlinedCallees`. If that ^ is correct, perhaps we can add an assert here (and we could even use a `SmallVector` instead of `SmallPtrSet`, but I'd especially like to have the assert in that case).
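
The vector-plus-assert idea from the comment on line 915 might look roughly like this. It is a minimal sketch with `std::vector` standing in for `SmallVector`, and `DeadFunctionQueue` is a hypothetical name rather than anything in the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the queue of functions awaiting deletion.
struct DeadFunctionQueue {
  std::vector<std::string> Names;

  // Queue a function that just lost its last use. A function can only become
  // dead once, so seeing it a second time indicates a logic error.
  void push(const std::string &Name) {
    assert(std::find(Names.begin(), Names.end(), Name) == Names.end() &&
           "function queued for deletion twice");
    Names.push_back(Name);
  }
};
```

The linear search only runs in builds with assertions enabled, so release builds pay nothing for the uniqueness check that a set would otherwise provide.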

This revision is now accepted and ready to land. Dec 15 2016, 10:22 PM

Updates from latest review by Sanjoy.

Patch updated and responses below. Some discussion here so will let you give a final OK before I land. Also would like to get an OK from Easwaran if possible although he seemed pretty happy on the last iteration.

lib/Transforms/IPO/Inliner.cpp
783 ↗	(On Diff #81552)	Will do.
808 ↗	(On Diff #81552)	Because it doesn't compile -- we need pointers not references. Eventually with a range constructor and an adaptor this will work, but today the current code seemed like the least typing. =[
853 ↗	(On Diff #81552)	I'm not sure about this... Seems a long name when we don't have two of them. I've cleaned up the comment a bit. Also, the per-node loop is the outer loop. The only reason this isn't scoped to the per-node region is to avoid re-allocation.
868 ↗	(On Diff #81552)	Yeah, this is really similar to the devirtualization cases already. I think I'll do this in a follow-up to make it more focused and get more testing (and because I have a few other things I'd like to get moving). Generally, I want to pursue a strategy of optimistic iteration here rather than exhaustive. Because we're going to end up with an iteration construct around all of this to handle cross-pass cases anyways, and we can rely on that handling hard to spot cases. The focus here should be to quickly find as many opportunities as we can without too much waste. For example, the current inliner does another scan of the entire function every time anything gets inlined. This basically doubles the cost as we do a full run of computing inline cost and not inlining anything. Then we pop out to the devirt iteration and in some cases re-run it ... again.... I have a good idea of how to handle the majority of easy cases here though by looking at the users of inlined calls when they are inlined. But it'll require a bit of work and threading things through as well as more test cases so I'd like to circle back to it.
915 ↗	(On Diff #81552)	All of this stemmed from an intermediate state where I tried to actually delete the function here. Now that it is fully deferred, I can directly put it in the queue above when we discover it. This avoids the duplicated predicate and this extra loop, etc. I've switched to a vector and added the assert because yea, we can't inline the same call twice and have it become dead both times. =] At least, not unless there is some really cool reincarnation going on here...

LGTM!

lib/Transforms/IPO/Inliner.cpp
808 ↗	(On Diff #81552)	Ah, okay; I missed the `&`.
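
The compile problem discussed for line 808 can be reproduced with standard containers. In this sketch `ToyNode`, `ToySCC`, and `collectNodePointers` are hypothetical stand-ins: iterating the container yields references, so the iterator-pair constructor of a vector of pointers is rejected and the addresses have to be taken explicitly.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins: iterating a ToySCC yields ToyNode&, not ToyNode*.
struct ToyNode {
  int ID;
};

struct ToySCC {
  std::vector<ToyNode> Nodes;
  std::vector<ToyNode>::iterator begin() { return Nodes.begin(); }
  std::vector<ToyNode>::iterator end() { return Nodes.end(); }
};

// `std::vector<ToyNode *> Ptrs(C.begin(), C.end());` would not compile here,
// because a ToyNode cannot be converted to a ToyNode *.
std::vector<ToyNode *> collectNodePointers(ToySCC &C) {
  std::vector<ToyNode *> Ptrs;
  for (ToyNode &N : C)
    Ptrs.push_back(&N); // take the address of each element explicitly
  assert(Ptrs.size() == C.Nodes.size());
  return Ptrs;
}
```

The pointers stay valid only as long as the underlying container is not resized.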

Closed by commit rL290161: [PM] Provide an initial, minimal port of the inliner to the new pass manager. (authored by chandlerc). Dec 19 2016, 7:26 PM

This revision was automatically updated to reflect the committed changes.

Revision Contents

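Path	Size
llvm/trunk/include/llvm/Analysis/CGSCCPassManager.h	13 lines
llvm/trunk/include/llvm/Transforms/IPO/Inliner.h	108 lines
llvm/trunk/include/llvm/Transforms/IPO/InlinerPass.h	87 lines
llvm/trunk/include/llvm/Transforms/Utils/Cloning.h	12 lines
llvm/trunk/lib/Analysis/InlineCost.cpp	6 lines
llvm/trunk/lib/Passes/PassBuilder.cpp	1 line
llvm/trunk/lib/Passes/PassRegistry.def	1 line
llvm/trunk/lib/Transforms/IPO/AlwaysInliner.cpp	10 lines
llvm/trunk/lib/Transforms/IPO/InlineSimple.cpp	13 lines
llvm/trunk/lib/Transforms/IPO/Inliner.cpp	189 lines
llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp	10 lines
llvm/trunk/test/Transforms/Inline/basictest.ll	1 line
llvm/trunk/test/Transforms/Inline/cgscc-update.ll	145 lines
llvm/trunk/test/Transforms/Inline/last-callsite.ll	269 lines
llvm/trunk/test/Transforms/Inline/nested-inline.ll	1 line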

Diff 82058

llvm/trunk/include/llvm/Analysis/CGSCCPassManager.h

[325 lines not shown; context: PreservedAnalyses run(Module &M, ModuleAnalysisManager &AM)]
     // iterating off the worklists.
     SmallPtrSet<LazyCallGraph::RefSCC *, 4> InvalidRefSCCSet;
     SmallPtrSet<LazyCallGraph::SCC *, 4> InvalidSCCSet;

     CGSCCUpdateResult UR = {RCWorklist, CWorklist, InvalidRefSCCSet,
                             InvalidSCCSet, nullptr, nullptr};

     PreservedAnalyses PA = PreservedAnalyses::all();
-    for (LazyCallGraph::RefSCC &InitialRC : CG.postorder_ref_sccs()) {
+    for (auto RCI = CG.postorder_ref_scc_begin(),
+              RCE = CG.postorder_ref_scc_end();
+         RCI != RCE;) {
       assert(RCWorklist.empty() &&
              "Should always start with an empty RefSCC worklist");
       // The postorder_ref_sccs range we are walking is lazily constructed, so
       // we only push the first one onto the worklist. The worklist allows us
       // to capture new RefSCCs created during transformations.
       //
       // We really want to form RefSCCs lazily because that makes them cheaper
       // to update as the program is simplified and allows us to have greater
       // cache locality as forming a RefSCC touches all the parts of all the
       // functions within that RefSCC.
-      RCWorklist.insert(&InitialRC);
+      //
+      // We also eagerly increment the iterator to the next position because
+      // the CGSCC passes below may delete the current RefSCC.
+      RCWorklist.insert(&*RCI++);

       do {
         LazyCallGraph::RefSCC *RC = RCWorklist.pop_back_val();
         if (InvalidRefSCCSet.count(RC)) {
           if (DebugLogging)
             dbgs() << "Skipping an invalid RefSCC...\n";
           continue;
         }
[60 lines not shown; context: for (auto RCI = CG.postorder_ref_scc_begin(),]
             // FIXME: If we ever start having RefSCC passes, we'll want to
             // iterate there too.
             RC = UR.UpdatedRC ? UR.UpdatedRC : RC;
             C = UR.UpdatedC ? UR.UpdatedC : C;
             if (DebugLogging && UR.UpdatedC)
               dbgs() << "Re-running SCC passes after a refinement of the "
                         "current SCC: "
                      << *UR.UpdatedC << "\n";

+            // Note that both `C` and `RC` may at this point refer to deleted,
+            // invalid SCC and RefSCCs respectively. But we will short circuit
+            // the processing when we check them in the loop above.
           } while (UR.UpdatedC);

         } while (!CWorklist.empty());
       } while (!RCWorklist.empty());
     }

     // By definition we preserve the call garph, all SCC analyses, and the
     // analysis proxies by handling them above and in any nested pass managers.
[179 lines not shown]
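
The reason the loop in CGSCCPassManager.h switches from a range-based `for` to an explicit iterator that is incremented eagerly can be shown with a standard container. This is a generic sketch, with `std::list` standing in for the RefSCC sequence and `visitAllowingErase` a hypothetical helper name.

```cpp
#include <cassert>
#include <list>

// Visit every element of `Items`, letting `Visit` erase the element it is
// handed. Returns the number of elements visited.
template <typename VisitFn>
int visitAllowingErase(std::list<int> &Items, VisitFn Visit) {
  int NumVisited = 0;
  for (auto I = Items.begin(), E = Items.end(); I != E;) {
    // Advance before running the callback: erasing the current element of a
    // std::list invalidates only iterators that point at that element.
    auto Current = I++;
    ++NumVisited;
    Visit(Items, Current);
  }
  assert(NumVisited >= 0);
  return NumVisited;
}
```

A range-based `for` would increment the iterator after the body runs, which is too late once the body has deleted the current element.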

llvm/trunk/include/llvm/Transforms/IPO/Inliner.h

Property	Old Value	New Value
cvs2svn:cvs-rev	null	1.1
svn:eol-style	null	native
svn:keywords	null	Author Date Id Revision

//===- Inliner.h - Inliner pass and infrastructure --------------*- C++ -*-===//
//
// The LLVM Compiler Infrastructure
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//

#ifndef LLVM_TRANSFORMS_IPO_INLINER_H
#define LLVM_TRANSFORMS_IPO_INLINER_H

#include "llvm/Analysis/CGSCCPassManager.h"
#include "llvm/Analysis/CallGraphSCCPass.h"
#include "llvm/Analysis/InlineCost.h"
#include "llvm/Analysis/LazyCallGraph.h"
#include "llvm/Analysis/TargetTransformInfo.h"
#include "llvm/Transforms/Utils/ImportedFunctionsInliningStatistics.h"

namespace llvm {
class AssumptionCacheTracker;
class CallSite;
class DataLayout;
class InlineCost;
class OptimizationRemarkEmitter;
class ProfileSummaryInfo;

/// This class contains all of the helper code which is used to perform the
/// inlining operations that do not depend on the policy. It contains the core
/// bottom-up inlining infrastructure that specific inliner passes use.
struct LegacyInlinerBase : public CallGraphSCCPass {
  explicit LegacyInlinerBase(char &ID);
  explicit LegacyInlinerBase(char &ID, bool InsertLifetime);

  /// For this class, we declare that we require and preserve the call graph.
  /// If the derived class implements this method, it should always explicitly
  /// call the implementation here.
  void getAnalysisUsage(AnalysisUsage &Info) const override;

  bool doInitialization(CallGraph &CG) override;

  /// Main run interface method, this implements the interface required by the
  /// Pass class.
  bool runOnSCC(CallGraphSCC &SCC) override;

  using llvm::Pass::doFinalization;
  /// Remove now-dead linkonce functions at the end of processing to avoid
  /// breaking the SCC traversal.
  bool doFinalization(CallGraph &CG) override;

  /// This method must be implemented by the subclass to determine the cost of
  /// inlining the specified call site. If the cost returned is greater than
  /// the current inline threshold, the call site is not inlined.
  virtual InlineCost getInlineCost(CallSite CS) = 0;

  /// Remove dead functions.
  ///
  /// This also includes a hack in the form of the 'AlwaysInlineOnly' flag
  /// which restricts it to deleting functions with an 'AlwaysInline'
  /// attribute. This is useful for the InlineAlways pass that only wants to
  /// deal with that subset of the functions.
  bool removeDeadFunctions(CallGraph &CG, bool AlwaysInlineOnly = false);

  /// This function performs the main work of the pass. The default of
  /// Inlinter::runOnSCC() calls skipSCC() before calling this method, but
  /// derived classes which cannot be skipped can override that method and call
  /// this function unconditionally.
  bool inlineCalls(CallGraphSCC &SCC);

private:
  // Insert @llvm.lifetime intrinsics.
  bool InsertLifetime;

protected:
  AssumptionCacheTracker *ACT;
  ProfileSummaryInfo *PSI;
  ImportedFunctionsInliningStatistics ImportedFunctionsStats;
};

/// The inliner pass for the new pass manager.
///
/// This pass wires together the inlining utilities and the inline cost
/// analysis into a CGSCC pass. It considers every call in every function in
/// the SCC and tries to inline if profitable. It can be tuned with a number of
/// parameters to control what cost model is used and what tradeoffs are made
/// when making the decision.
///
/// It should be noted that the legacy inliners do considerably more than this
/// inliner pass does. They provide logic for manually merging allocas, and
/// doing considerable DCE including the DCE of dead functions. This pass makes
/// every attempt to be simpler. DCE of functions requires complex reasoning
/// about comdat groups, etc. Instead, it is expected that other more focused
/// passes be composed to achieve the same end result.
class InlinerPass : public PassInfoMixin<InlinerPass> {
public:
  InlinerPass(InlineParams Params = getInlineParams())
      : Params(std::move(Params)) {}

  PreservedAnalyses run(LazyCallGraph::SCC &C, CGSCCAnalysisManager &AM,
                        LazyCallGraph &CG, CGSCCUpdateResult &UR);

private:
  InlineParams Params;
};

} // End llvm namespace

#endif

llvm/trunk/include/llvm/Transforms/IPO/InlinerPass.h

	//===- InlinerPass.h - Code common to all inliners --------------*- C++ -*-===//
	//
	// The LLVM Compiler Infrastructure
	//
	// This file is distributed under the University of Illinois Open Source
	// License. See LICENSE.TXT for details.
	//
	//===----------------------------------------------------------------------===//
	//
	// This file defines a simple policy-based bottom-up inliner. This file
	// implements all of the boring mechanics of the bottom-up inlining, while the
	// subclass determines WHAT to inline, which is the much more interesting
	// component.
	//
	//===----------------------------------------------------------------------===//

	#ifndef LLVM_TRANSFORMS_IPO_INLINERPASS_H
	#define LLVM_TRANSFORMS_IPO_INLINERPASS_H

	#include "llvm/Analysis/CallGraphSCCPass.h"
	#include "llvm/Analysis/InlineCost.h"
	#include "llvm/Analysis/TargetTransformInfo.h"
	#include "llvm/Transforms/Utils/ImportedFunctionsInliningStatistics.h"

	namespace llvm {
	class AssumptionCacheTracker;
	class CallSite;
	class DataLayout;
	class InlineCost;
	class OptimizationRemarkEmitter;
	class ProfileSummaryInfo;
	template <class PtrType, unsigned SmallSize> class SmallPtrSet;

	/// This class contains all of the helper code which is used to perform the
	/// inlining operations that do not depend on the policy.
	struct Inliner : public CallGraphSCCPass {
	explicit Inliner(char &ID);
	explicit Inliner(char &ID, bool InsertLifetime);

	/// For this class, we declare that we require and preserve the call graph.
	/// If the derived class implements this method, it should always explicitly
	/// call the implementation here.
	void getAnalysisUsage(AnalysisUsage &Info) const override;

	bool doInitialization(CallGraph &CG) override;

	/// Main run interface method, this implements the interface required by the
	/// Pass class.
	bool runOnSCC(CallGraphSCC &SCC) override;

	using llvm::Pass::doFinalization;
	/// Remove now-dead linkonce functions at the end of processing to avoid
	/// breaking the SCC traversal.
	bool doFinalization(CallGraph &CG) override;

	/// This method must be implemented by the subclass to determine the cost of
	/// inlining the specified call site. If the cost returned is greater than
	/// the current inline threshold, the call site is not inlined.
	virtual InlineCost getInlineCost(CallSite CS) = 0;

	/// Remove dead functions.
	///
	/// This also includes a hack in the form of the 'AlwaysInlineOnly' flag
	/// which restricts it to deleting functions with an 'AlwaysInline'
	/// attribute. This is useful for the InlineAlways pass that only wants to
	/// deal with that subset of the functions.
	bool removeDeadFunctions(CallGraph &CG, bool AlwaysInlineOnly = false);

	/// This function performs the main work of the pass. The default of
	/// Inliner::runOnSCC() calls skipSCC() before calling this method, but
	/// derived classes which cannot be skipped can override that method and call
	/// this function unconditionally.
	bool inlineCalls(CallGraphSCC &SCC);

	private:
	// Insert @llvm.lifetime intrinsics.
	bool InsertLifetime;

	protected:
	AssumptionCacheTracker *ACT;
	ProfileSummaryInfo *PSI;
	ImportedFunctionsInliningStatistics ImportedFunctionsStats;
	};

	} // End llvm namespace

	#endif
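
The split described in this header (mechanics in the base class, policy in the subclass, and an always-inline variant that bypasses the skip check) can be illustrated with a minimal, self-contained sketch. Nothing below is LLVM code: `FakeCallSite`, `FakeSCC`, `InlinerBaseSketch`, `SimpleSketch`, `AlwaysSketch`, and the threshold value of 100 are hypothetical stand-ins chosen only to show the shape of the design.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not LLVM types.
struct FakeCallSite {
  std::string Callee;
  int Cost;
};

struct FakeSCC {
  std::vector<FakeCallSite> Calls;
  bool Skippable = false; // Stands in for whatever makes skipSCC() return true.
};

// Mechanics live in the base class; the cost policy is a virtual hook.
class InlinerBaseSketch {
public:
  virtual ~InlinerBaseSketch() = default;

  // Default entry point: honor the skip check, then do the real work.
  virtual bool runOnSCC(FakeSCC &SCC) {
    if (skipSCC(SCC))
      return false;
    return inlineCalls(SCC);
  }

  std::vector<std::string> Inlined; // Callees that were "inlined".

protected:
  // Policy hook: the subclass decides what a call site costs.
  virtual int getInlineCost(const FakeCallSite &CS) = 0;

  bool skipSCC(const FakeSCC &SCC) const { return SCC.Skippable; }

  // A call site whose cost exceeds the threshold is not inlined.
  bool inlineCalls(FakeSCC &SCC) {
    bool Changed = false;
    for (const FakeCallSite &CS : SCC.Calls) {
      if (getInlineCost(CS) > Threshold)
        continue;
      Inlined.push_back(CS.Callee);
      Changed = true;
    }
    return Changed;
  }

  int Threshold = 100; // Arbitrary value for the sketch.
};

// Cost-driven policy that keeps the default runOnSCC().
class SimpleSketch : public InlinerBaseSketch {
  int getInlineCost(const FakeCallSite &CS) override { return CS.Cost; }
};

// Cannot be skipped: overrides runOnSCC() and calls inlineCalls() directly.
class AlwaysSketch : public InlinerBaseSketch {
public:
  bool runOnSCC(FakeSCC &SCC) override { return inlineCalls(SCC); }

private:
  int getInlineCost(const FakeCallSite &) override { return 0; }
};
```

In this sketch a `SimpleSketch` run over a skippable SCC returns `false` without looking at any call site, while an `AlwaysSketch` run over the same SCC still processes every call.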

llvm/trunk/include/llvm/Transforms/Utils/Cloning.h

[188 lines not shown]	public:
/// StaticAllocas - InlineFunction fills this in with all static allocas that		/// StaticAllocas - InlineFunction fills this in with all static allocas that
/// get copied into the caller.		/// get copied into the caller.
SmallVector<AllocaInst *, 4> StaticAllocas;		SmallVector<AllocaInst *, 4> StaticAllocas;

/// InlinedCalls - InlineFunction fills this in with callsites that were		/// InlinedCalls - InlineFunction fills this in with callsites that were
/// inlined from the callee. This is only filled in if CG is non-null.		/// inlined from the callee. This is only filled in if CG is non-null.
SmallVector<WeakVH, 8> InlinedCalls;		SmallVector<WeakVH, 8> InlinedCalls;

		/// All of the new call sites inlined into the caller.
		///
		/// 'InlineFunction' fills this in by scanning the inlined instructions, and
		/// only if CG is null. If CG is non-null, instead the value handle
		/// `InlinedCalls` above is used.
		SmallVector<CallSite, 8> InlinedCallSites;

void reset() {		void reset() {
StaticAllocas.clear();		StaticAllocas.clear();
InlinedCalls.clear();		InlinedCalls.clear();
		InlinedCallSites.clear();
}		}
};		};

/// InlineFunction - This function inlines the called function into the basic		/// InlineFunction - This function inlines the called function into the basic
/// block of the caller. This returns false if it is not possible to inline		/// block of the caller. This returns false if it is not possible to inline
/// this call. The program is still in a well defined state if this occurs		/// this call. The program is still in a well defined state if this occurs
/// though.		/// though.
///		///
/// Note that this only does one level of inlining. For example, if the		/// Note that this only does one level of inlining. For example, if the
/// instruction 'call B' is inlined, and 'B' calls 'C', then the call to 'C' now		/// instruction 'call B' is inlined, and 'B' calls 'C', then the call to 'C' now
/// exists in the instruction stream. Similarly this will inline a recursive		/// exists in the instruction stream. Similarly this will inline a recursive
/// function by one level.		/// function by one level.
///		///
		/// Note that while this routine is allowed to clean up and optimize the
		/// inlined code to minimize the actual inserted code, it must not delete
		/// code in the caller as users of this routine may have pointers to
		/// instructions in the caller that need to remain stable.
bool InlineFunction(CallInst *C, InlineFunctionInfo &IFI,		bool InlineFunction(CallInst *C, InlineFunctionInfo &IFI,
AAResults *CalleeAAR = nullptr, bool InsertLifetime = true);		AAResults *CalleeAAR = nullptr, bool InsertLifetime = true);
bool InlineFunction(InvokeInst *II, InlineFunctionInfo &IFI,		bool InlineFunction(InvokeInst *II, InlineFunctionInfo &IFI,
AAResults *CalleeAAR = nullptr, bool InsertLifetime = true);		AAResults *CalleeAAR = nullptr, bool InsertLifetime = true);
bool InlineFunction(CallSite CS, InlineFunctionInfo &IFI,		bool InlineFunction(CallSite CS, InlineFunctionInfo &IFI,
AAResults *CalleeAAR = nullptr, bool InsertLifetime = true);		AAResults *CalleeAAR = nullptr, bool InsertLifetime = true);

/// \brief Clones a loop \p OrigLoop. Returns the loop and the blocks in \p		/// \brief Clones a loop \p OrigLoop. Returns the loop and the blocks in \p
[18 lines not shown]
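
The bookkeeping rule documented above (new call sites are reported through `InlinedCalls` when a call graph is supplied and through `InlinedCallSites` when it is not) can be modeled with a small sketch. This is an illustration under stated assumptions rather than the real `InlineFunctionInfo`: `CallGraphSketch`, `InlineInfoSketch`, and `inlineOneLevel` are hypothetical names, call sites are reduced to strings, and calling `reset()` at the start of each inline is an assumption of the sketch.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a call graph; only its presence matters here.
struct CallGraphSketch {};

// Simplified model of the bookkeeping struct. Call sites are plain strings.
struct InlineInfoSketch {
  explicit InlineInfoSketch(CallGraphSketch *CG = nullptr) : CG(CG) {}

  CallGraphSketch *CG;
  std::vector<std::string> InlinedCalls;     // Filled only when CG is non-null.
  std::vector<std::string> InlinedCallSites; // Filled only when CG is null.

  void reset() {
    InlinedCalls.clear();
    InlinedCallSites.clear();
  }
};

// Models one level of inlining: the callee's own calls now exist in the
// caller and are reported through exactly one of the two lists.
void inlineOneLevel(InlineInfoSketch &IFI,
                    const std::vector<std::string> &CallsInCallee) {
  IFI.reset(); // Assumption: results from a previous inline are discarded.
  std::vector<std::string> &Dest =
      IFI.CG ? IFI.InlinedCalls : IFI.InlinedCallSites;
  Dest.insert(Dest.end(), CallsInCallee.begin(), CallsInCallee.end());
}
```

A caller that constructs the info object without a call graph, as the new pass manager inliner in this patch does, would therefore read `InlinedCallSites` to find follow-up inlining candidates.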

llvm/trunk/lib/Analysis/InlineCost.cpp

[632 lines not shown]	void CallAnalyzer::updateThreshold(CallSite CS, Function &Callee) {
// and reduce the threshold if the caller has the necessary attribute.		// and reduce the threshold if the caller has the necessary attribute.
if (Caller->optForMinSize())		if (Caller->optForMinSize())
Threshold = MinIfValid(Threshold, Params.OptMinSizeThreshold);		Threshold = MinIfValid(Threshold, Params.OptMinSizeThreshold);
else if (Caller->optForSize())		else if (Caller->optForSize())
Threshold = MinIfValid(Threshold, Params.OptSizeThreshold);		Threshold = MinIfValid(Threshold, Params.OptSizeThreshold);

bool HotCallsite = false;		bool HotCallsite = false;
uint64_t TotalWeight;		uint64_t TotalWeight;
if (CS.getInstruction()->extractProfTotalWeight(TotalWeight) &&		if (PSI && CS.getInstruction()->extractProfTotalWeight(TotalWeight) &&
PSI->isHotCount(TotalWeight)) {		PSI->isHotCount(TotalWeight)) {
HotCallsite = true;		HotCallsite = true;
}		}

// Listen to the inlinehint attribute or profile based hotness information		// Listen to the inlinehint attribute or profile based hotness information
// when it would increase the threshold and the caller does not need to		// when it would increase the threshold and the caller does not need to
// minimize its size.		// minimize its size.
bool InlineHint = Callee.hasFnAttribute(Attribute::InlineHint) ||		bool InlineHint = Callee.hasFnAttribute(Attribute::InlineHint) ||
PSI->isFunctionEntryHot(&Callee);		(PSI && PSI->isFunctionEntryHot(&Callee));
if (InlineHint && !Caller->optForMinSize())		if (InlineHint && !Caller->optForMinSize())
Threshold = MaxIfValid(Threshold, Params.HintThreshold);		Threshold = MaxIfValid(Threshold, Params.HintThreshold);

if (HotCallsite && !Caller->optForMinSize())		if (HotCallsite && !Caller->optForMinSize())
Threshold = MaxIfValid(Threshold, Params.HotCallSiteThreshold);		Threshold = MaxIfValid(Threshold, Params.HotCallSiteThreshold);

bool ColdCallee = PSI->isFunctionEntryCold(&Callee);		bool ColdCallee = PSI && PSI->isFunctionEntryCold(&Callee);
// For cold callees, use the ColdThreshold knob if it is available and reduces		// For cold callees, use the ColdThreshold knob if it is available and reduces
// the threshold.		// the threshold.
if (ColdCallee)		if (ColdCallee)
Threshold = MinIfValid(Threshold, Params.ColdThreshold);		Threshold = MinIfValid(Threshold, Params.ColdThreshold);

// Finally, take the target-specific inlining threshold multiplier into		// Finally, take the target-specific inlining threshold multiplier into
// account.		// account.
Threshold *= TTI.getInliningThresholdMultiplier();		Threshold *= TTI.getInliningThresholdMultiplier();
[944 lines not shown]
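
The effect of the added `PSI &&` guards can be shown with a reduced model of the threshold update: with no profile summary available, only the attribute-based hint can move the threshold, and the hot/cold adjustments are skipped rather than dereferencing a null pointer. This is a sketch, not the code in `InlineCost.cpp`: `MaybeInt`, `ProfileSketch`, `ThresholdParams`, `computeThreshold`, and all numeric values are hypothetical, and the size-optimization and target-multiplier steps are left out.

```cpp
#include <algorithm>
#include <cassert>

// Minimal optional-int so the sketch does not depend on a library type.
struct MaybeInt {
  MaybeInt() = default;
  MaybeInt(int V) : Valid(true), Value(V) {}
  bool Valid = false;
  int Value = 0;
};

// Hypothetical stand-in for profile summary queries about one callee.
struct ProfileSketch {
  bool CalleeHot = false;
  bool CalleeCold = false;
};

// Hypothetical knobs; the numbers used with it below are arbitrary.
struct ThresholdParams {
  int Default = 200;
  MaybeInt HintThreshold;
  MaybeInt ColdThreshold;
};

int minIfValid(int T, MaybeInt B) { return B.Valid ? std::min(T, B.Value) : T; }
int maxIfValid(int T, MaybeInt B) { return B.Valid ? std::max(T, B.Value) : T; }

// PSI may be null, so every profile query is guarded.
int computeThreshold(const ThresholdParams &P, bool HasInlineHint,
                     const ProfileSketch *PSI) {
  int Threshold = P.Default;

  // Raise the threshold for hinted or hot callees.
  bool Hint = HasInlineHint || (PSI && PSI->CalleeHot);
  if (Hint)
    Threshold = maxIfValid(Threshold, P.HintThreshold);

  // Lower it for cold callees, but only if the knob is set.
  bool Cold = PSI && PSI->CalleeCold;
  if (Cold)
    Threshold = minIfValid(Threshold, P.ColdThreshold);

  return Threshold;
}
```

Passing `nullptr` for the profile models the new pass manager case in this patch, where the summary is fetched with `getCachedResult` and may be absent.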

llvm/trunk/lib/Passes/PassBuilder.cpp

	[66 lines not shown]
	#include "llvm/Transforms/IPO/ElimAvailExtern.h"			#include "llvm/Transforms/IPO/ElimAvailExtern.h"
	#include "llvm/Transforms/IPO/ForceFunctionAttrs.h"			#include "llvm/Transforms/IPO/ForceFunctionAttrs.h"
	#include "llvm/Transforms/IPO/FunctionAttrs.h"			#include "llvm/Transforms/IPO/FunctionAttrs.h"
	#include "llvm/Transforms/IPO/FunctionImport.h"			#include "llvm/Transforms/IPO/FunctionImport.h"
	#include "llvm/Transforms/IPO/GlobalDCE.h"			#include "llvm/Transforms/IPO/GlobalDCE.h"
	#include "llvm/Transforms/IPO/GlobalOpt.h"			#include "llvm/Transforms/IPO/GlobalOpt.h"
	#include "llvm/Transforms/IPO/GlobalSplit.h"			#include "llvm/Transforms/IPO/GlobalSplit.h"
	#include "llvm/Transforms/IPO/InferFunctionAttrs.h"			#include "llvm/Transforms/IPO/InferFunctionAttrs.h"
				#include "llvm/Transforms/IPO/Inliner.h"
	#include "llvm/Transforms/IPO/Internalize.h"			#include "llvm/Transforms/IPO/Internalize.h"
	#include "llvm/Transforms/IPO/LowerTypeTests.h"			#include "llvm/Transforms/IPO/LowerTypeTests.h"
	#include "llvm/Transforms/IPO/PartialInlining.h"			#include "llvm/Transforms/IPO/PartialInlining.h"
	#include "llvm/Transforms/IPO/SCCP.h"			#include "llvm/Transforms/IPO/SCCP.h"
	#include "llvm/Transforms/IPO/StripDeadPrototypes.h"			#include "llvm/Transforms/IPO/StripDeadPrototypes.h"
	#include "llvm/Transforms/IPO/WholeProgramDevirt.h"			#include "llvm/Transforms/IPO/WholeProgramDevirt.h"
	#include "llvm/Transforms/InstCombine/InstCombine.h"			#include "llvm/Transforms/InstCombine/InstCombine.h"
	#include "llvm/Transforms/InstrProfiling.h"			#include "llvm/Transforms/InstrProfiling.h"
	[748 lines not shown]

llvm/trunk/lib/Passes/PassRegistry.def

	[81 lines not shown]
	CGSCC_ANALYSIS("fam-proxy", FunctionAnalysisManagerCGSCCProxy())			CGSCC_ANALYSIS("fam-proxy", FunctionAnalysisManagerCGSCCProxy())
	#undef CGSCC_ANALYSIS			#undef CGSCC_ANALYSIS

	#ifndef CGSCC_PASS			#ifndef CGSCC_PASS
	#define CGSCC_PASS(NAME, CREATE_PASS)			#define CGSCC_PASS(NAME, CREATE_PASS)
	#endif			#endif
	CGSCC_PASS("invalidate<all>", InvalidateAllAnalysesPass())			CGSCC_PASS("invalidate<all>", InvalidateAllAnalysesPass())
	CGSCC_PASS("function-attrs", PostOrderFunctionAttrsPass())			CGSCC_PASS("function-attrs", PostOrderFunctionAttrsPass())
				CGSCC_PASS("inline", InlinerPass())
	CGSCC_PASS("no-op-cgscc", NoOpCGSCCPass())			CGSCC_PASS("no-op-cgscc", NoOpCGSCCPass())
	#undef CGSCC_PASS			#undef CGSCC_PASS

	#ifndef FUNCTION_ANALYSIS			#ifndef FUNCTION_ANALYSIS
	#define FUNCTION_ANALYSIS(NAME, CREATE_PASS)			#define FUNCTION_ANALYSIS(NAME, CREATE_PASS)
	#endif			#endif
	FUNCTION_ANALYSIS("aa", AAManager())			FUNCTION_ANALYSIS("aa", AAManager())
	FUNCTION_ANALYSIS("assumptions", AssumptionAnalysis())			FUNCTION_ANALYSIS("assumptions", AssumptionAnalysis())
	[130 lines not shown]
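
The registry file works as a list of macro invocations whose meaning is supplied by whoever expands it, which is why adding the one `CGSCC_PASS("inline", InlinerPass())` line is enough to make the pass nameable. A self-contained sketch of that idea follows. Because a single example file cannot `#include` a separate `.def` file, it uses a list macro instead; `SKETCH_CGSCC_PASSES`, `SKETCH_ENTRY`, `lookupPass`, and the two pass structs are hypothetical and do not reflect how `PassBuilder` actually parses pipelines.

```cpp
#include <cassert>
#include <string>

// Hypothetical pass types; each only exposes a name for the lookup below.
struct InlinerSketchPass {
  static const char *name() { return "InlinerSketchPass"; }
};
struct NoOpSketchPass {
  static const char *name() { return "NoOpSketchPass"; }
};

// Stand-in for the .def file: one entry per pass, written against a macro
// parameter X that the expansion site defines.
#define SKETCH_CGSCC_PASSES(X)                                                 \
  X("inline", InlinerSketchPass)                                               \
  X("no-op-cgscc", NoOpSketchPass)

// Maps a textual pass name to the registered class name, or "" if unknown.
std::string lookupPass(const std::string &Name) {
#define SKETCH_ENTRY(NAME, CLASS)                                              \
  if (Name == NAME)                                                            \
    return CLASS::name();
  SKETCH_CGSCC_PASSES(SKETCH_ENTRY)
#undef SKETCH_ENTRY
  return "";
}
```

Adding a pass in this model means adding one line to the list; every expansion site picks it up without further changes.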

llvm/trunk/lib/Transforms/IPO/AlwaysInliner.cpp

	[20 lines not shown]
	#include "llvm/Analysis/TargetLibraryInfo.h"			#include "llvm/Analysis/TargetLibraryInfo.h"
	#include "llvm/IR/CallSite.h"			#include "llvm/IR/CallSite.h"
	#include "llvm/IR/CallingConv.h"			#include "llvm/IR/CallingConv.h"
	#include "llvm/IR/DataLayout.h"			#include "llvm/IR/DataLayout.h"
	#include "llvm/IR/Instructions.h"			#include "llvm/IR/Instructions.h"
	#include "llvm/IR/IntrinsicInst.h"			#include "llvm/IR/IntrinsicInst.h"
	#include "llvm/IR/Module.h"			#include "llvm/IR/Module.h"
	#include "llvm/IR/Type.h"			#include "llvm/IR/Type.h"
	#include "llvm/Transforms/IPO/InlinerPass.h"			#include "llvm/Transforms/IPO.h"
				#include "llvm/Transforms/IPO/Inliner.h"
	#include "llvm/Transforms/Utils/Cloning.h"			#include "llvm/Transforms/Utils/Cloning.h"

	using namespace llvm;			using namespace llvm;

	#define DEBUG_TYPE "inline"			#define DEBUG_TYPE "inline"

	PreservedAnalyses AlwaysInlinerPass::run(Module &M, ModuleAnalysisManager &) {			PreservedAnalyses AlwaysInlinerPass::run(Module &M, ModuleAnalysisManager &) {
	InlineFunctionInfo IFI;			InlineFunctionInfo IFI;
	[19 lines not shown]
	}			}

	namespace {			namespace {

	/// Inliner pass which only handles "always inline" functions.			/// Inliner pass which only handles "always inline" functions.
	///			///
	/// Unlike the \c AlwaysInlinerPass, this uses the more heavyweight \c Inliner			/// Unlike the \c AlwaysInlinerPass, this uses the more heavyweight \c Inliner
	/// base class to provide several facilities such as array alloca merging.			/// base class to provide several facilities such as array alloca merging.
	class AlwaysInlinerLegacyPass : public Inliner {			class AlwaysInlinerLegacyPass : public LegacyInlinerBase {

	public:			public:
	AlwaysInlinerLegacyPass() : Inliner(ID, /*InsertLifetime*/ true) {			AlwaysInlinerLegacyPass() : LegacyInlinerBase(ID, /*InsertLifetime*/ true) {
	initializeAlwaysInlinerLegacyPassPass(*PassRegistry::getPassRegistry());			initializeAlwaysInlinerLegacyPassPass(*PassRegistry::getPassRegistry());
	}			}

	AlwaysInlinerLegacyPass(bool InsertLifetime) : Inliner(ID, InsertLifetime) {			AlwaysInlinerLegacyPass(bool InsertLifetime)
				: LegacyInlinerBase(ID, InsertLifetime) {
	initializeAlwaysInlinerLegacyPassPass(*PassRegistry::getPassRegistry());			initializeAlwaysInlinerLegacyPassPass(*PassRegistry::getPassRegistry());
	}			}

	/// Main run interface method. We override here to avoid calling skipSCC().			/// Main run interface method. We override here to avoid calling skipSCC().
	bool runOnSCC(CallGraphSCC &SCC) override { return inlineCalls(SCC); }			bool runOnSCC(CallGraphSCC &SCC) override { return inlineCalls(SCC); }

	static char ID; // Pass identification, replacement for typeid			static char ID; // Pass identification, replacement for typeid

	[47 lines not shown]

llvm/trunk/lib/Transforms/IPO/InlineSimple.cpp

	[19 lines not shown]
	#include "llvm/IR/CallSite.h"			#include "llvm/IR/CallSite.h"
	#include "llvm/IR/CallingConv.h"			#include "llvm/IR/CallingConv.h"
	#include "llvm/IR/DataLayout.h"			#include "llvm/IR/DataLayout.h"
	#include "llvm/IR/Instructions.h"			#include "llvm/IR/Instructions.h"
	#include "llvm/IR/IntrinsicInst.h"			#include "llvm/IR/IntrinsicInst.h"
	#include "llvm/IR/Module.h"			#include "llvm/IR/Module.h"
	#include "llvm/IR/Type.h"			#include "llvm/IR/Type.h"
	#include "llvm/Transforms/IPO.h"			#include "llvm/Transforms/IPO.h"
	#include "llvm/Transforms/IPO/InlinerPass.h"			#include "llvm/Transforms/IPO/Inliner.h"

	using namespace llvm;			using namespace llvm;

	#define DEBUG_TYPE "inline"			#define DEBUG_TYPE "inline"

	namespace {			namespace {

	/// \brief Actual inliner pass implementation.			/// \brief Actual inliner pass implementation.
	///			///
	/// The common implementation of the inlining logic is shared between this			/// The common implementation of the inlining logic is shared between this
	/// inliner pass and the always inliner pass. The two passes use different cost			/// inliner pass and the always inliner pass. The two passes use different cost
	/// analyses to determine when to inline.			/// analyses to determine when to inline.
	class SimpleInliner : public Inliner {			class SimpleInliner : public LegacyInlinerBase {

	InlineParams Params;			InlineParams Params;

	public:			public:
	SimpleInliner() : Inliner(ID), Params(llvm::getInlineParams()) {			SimpleInliner() : LegacyInlinerBase(ID), Params(llvm::getInlineParams()) {
	initializeSimpleInlinerPass(*PassRegistry::getPassRegistry());			initializeSimpleInlinerPass(*PassRegistry::getPassRegistry());
	}			}

	explicit SimpleInliner(InlineParams Params) : Inliner(ID), Params(Params) {			explicit SimpleInliner(InlineParams Params)
				: LegacyInlinerBase(ID), Params(Params) {
	initializeSimpleInlinerPass(*PassRegistry::getPassRegistry());			initializeSimpleInlinerPass(*PassRegistry::getPassRegistry());
	}			}

	static char ID; // Pass identification, replacement for typeid			static char ID; // Pass identification, replacement for typeid

	InlineCost getInlineCost(CallSite CS) override {			InlineCost getInlineCost(CallSite CS) override {
	Function *Callee = CS.getCalledFunction();			Function *Callee = CS.getCalledFunction();
	TargetTransformInfo &TTI = TTIWP->getTTI(*Callee);			TargetTransformInfo &TTI = TTIWP->getTTI(*Callee);
	[37 lines not shown]
	}			}

	Pass *llvm::createFunctionInliningPass(InlineParams &Params) {			Pass *llvm::createFunctionInliningPass(InlineParams &Params) {
	return new SimpleInliner(Params);			return new SimpleInliner(Params);
	}			}

	bool SimpleInliner::runOnSCC(CallGraphSCC &SCC) {			bool SimpleInliner::runOnSCC(CallGraphSCC &SCC) {
	TTIWP = &getAnalysis<TargetTransformInfoWrapperPass>();			TTIWP = &getAnalysis<TargetTransformInfoWrapperPass>();
	return Inliner::runOnSCC(SCC);			return LegacyInlinerBase::runOnSCC(SCC);
	}			}

	void SimpleInliner::getAnalysisUsage(AnalysisUsage &AU) const {			void SimpleInliner::getAnalysisUsage(AnalysisUsage &AU) const {
	AU.addRequired<TargetTransformInfoWrapperPass>();			AU.addRequired<TargetTransformInfoWrapperPass>();
	Inliner::getAnalysisUsage(AU);			LegacyInlinerBase::getAnalysisUsage(AU);
	}			}

llvm/trunk/lib/Transforms/IPO/Inliner.cpp

//===- Inliner.cpp - Code common to all inliners --------------------------===//		//===- Inliner.cpp - Code common to all inliners --------------------------===//
//		//
// The LLVM Compiler Infrastructure		// The LLVM Compiler Infrastructure
//		//
// This file is distributed under the University of Illinois Open Source		// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.		// License. See LICENSE.TXT for details.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
//		//
// This file implements the mechanics required to implement inlining without		// This file implements the mechanics required to implement inlining without
// missing any calls and updating the call graph. The decisions of which calls		// missing any calls and updating the call graph. The decisions of which calls
// are profitable to inline are implemented elsewhere.		// are profitable to inline are implemented elsewhere.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

		#include "llvm/Transforms/IPO/Inliner.h"
#include "llvm/ADT/SmallPtrSet.h"		#include "llvm/ADT/SmallPtrSet.h"
#include "llvm/ADT/Statistic.h"		#include "llvm/ADT/Statistic.h"
#include "llvm/Analysis/AliasAnalysis.h"		#include "llvm/Analysis/AliasAnalysis.h"
#include "llvm/Analysis/AssumptionCache.h"		#include "llvm/Analysis/AssumptionCache.h"
#include "llvm/Analysis/BasicAliasAnalysis.h"		#include "llvm/Analysis/BasicAliasAnalysis.h"
#include "llvm/Analysis/CallGraph.h"		#include "llvm/Analysis/CallGraph.h"
#include "llvm/Analysis/InlineCost.h"		#include "llvm/Analysis/InlineCost.h"
#include "llvm/Analysis/OptimizationDiagnosticInfo.h"		#include "llvm/Analysis/OptimizationDiagnosticInfo.h"
#include "llvm/Analysis/ProfileSummaryInfo.h"		#include "llvm/Analysis/ProfileSummaryInfo.h"
#include "llvm/Analysis/TargetLibraryInfo.h"		#include "llvm/Analysis/TargetLibraryInfo.h"
#include "llvm/IR/CallSite.h"		#include "llvm/IR/CallSite.h"
#include "llvm/IR/DataLayout.h"		#include "llvm/IR/DataLayout.h"
#include "llvm/IR/DiagnosticInfo.h"		#include "llvm/IR/DiagnosticInfo.h"
		#include "llvm/IR/InstIterator.h"
#include "llvm/IR/Instructions.h"		#include "llvm/IR/Instructions.h"
#include "llvm/IR/IntrinsicInst.h"		#include "llvm/IR/IntrinsicInst.h"
#include "llvm/IR/Module.h"		#include "llvm/IR/Module.h"
#include "llvm/Support/Debug.h"		#include "llvm/Support/Debug.h"
#include "llvm/Support/raw_ostream.h"		#include "llvm/Support/raw_ostream.h"
#include "llvm/Transforms/IPO/InlinerPass.h"
#include "llvm/Transforms/Utils/Cloning.h"		#include "llvm/Transforms/Utils/Cloning.h"
#include "llvm/Transforms/Utils/Local.h"		#include "llvm/Transforms/Utils/Local.h"
using namespace llvm;		using namespace llvm;

#define DEBUG_TYPE "inline"		#define DEBUG_TYPE "inline"

STATISTIC(NumInlined, "Number of functions inlined");		STATISTIC(NumInlined, "Number of functions inlined");
STATISTIC(NumCallsDeleted, "Number of call sites deleted, not inlined");		STATISTIC(NumCallsDeleted, "Number of call sites deleted, not inlined");
[28 lines not shown]	cl::opt<InlinerFunctionImportStatsOpts> InlinerFunctionImportStats(
cl::init(InlinerFunctionImportStatsOpts::No),		cl::init(InlinerFunctionImportStatsOpts::No),
cl::values(clEnumValN(InlinerFunctionImportStatsOpts::Basic, "basic",		cl::values(clEnumValN(InlinerFunctionImportStatsOpts::Basic, "basic",
"basic statistics"),		"basic statistics"),
clEnumValN(InlinerFunctionImportStatsOpts::Verbose, "verbose",		clEnumValN(InlinerFunctionImportStatsOpts::Verbose, "verbose",
"printing of statistics for each inlined function")),		"printing of statistics for each inlined function")),
cl::Hidden, cl::desc("Enable inliner stats for imported functions"));		cl::Hidden, cl::desc("Enable inliner stats for imported functions"));
} // namespace		} // namespace

Inliner::Inliner(char &ID) : CallGraphSCCPass(ID), InsertLifetime(true) {}		LegacyInlinerBase::LegacyInlinerBase(char &ID)
		: CallGraphSCCPass(ID), InsertLifetime(true) {}

Inliner::Inliner(char &ID, bool InsertLifetime)		LegacyInlinerBase::LegacyInlinerBase(char &ID, bool InsertLifetime)
: CallGraphSCCPass(ID), InsertLifetime(InsertLifetime) {}		: CallGraphSCCPass(ID), InsertLifetime(InsertLifetime) {}

/// For this class, we declare that we require and preserve the call graph.		/// For this class, we declare that we require and preserve the call graph.
/// If the derived class implements this method, it should		/// If the derived class implements this method, it should
/// always explicitly call the implementation here.		/// always explicitly call the implementation here.
void Inliner::getAnalysisUsage(AnalysisUsage &AU) const {		void LegacyInlinerBase::getAnalysisUsage(AnalysisUsage &AU) const {
AU.addRequired<AssumptionCacheTracker>();		AU.addRequired<AssumptionCacheTracker>();
AU.addRequired<ProfileSummaryInfoWrapperPass>();		AU.addRequired<ProfileSummaryInfoWrapperPass>();
AU.addRequired<TargetLibraryInfoWrapperPass>();		AU.addRequired<TargetLibraryInfoWrapperPass>();
getAAResultsAnalysisUsage(AU);		getAAResultsAnalysisUsage(AU);
CallGraphSCCPass::getAnalysisUsage(AU);		CallGraphSCCPass::getAnalysisUsage(AU);
}		}

typedef DenseMap<ArrayType *, std::vector<AllocaInst *>> InlinedArrayAllocasTy;		typedef DenseMap<ArrayType *, std::vector<AllocaInst *>> InlinedArrayAllocasTy;
[308 lines not shown]	assert(unsigned(InlineHistoryID) < InlineHistory.size() &&
"Invalid inline history ID");		"Invalid inline history ID");
if (InlineHistory[InlineHistoryID].first == F)		if (InlineHistory[InlineHistoryID].first == F)
return true;		return true;
InlineHistoryID = InlineHistory[InlineHistoryID].second;		InlineHistoryID = InlineHistory[InlineHistoryID].second;
}		}
return false;		return false;
}		}

bool Inliner::doInitialization(CallGraph &CG) {		bool LegacyInlinerBase::doInitialization(CallGraph &CG) {
if (InlinerFunctionImportStats != InlinerFunctionImportStatsOpts::No)		if (InlinerFunctionImportStats != InlinerFunctionImportStatsOpts::No)
ImportedFunctionsStats.setModuleInfo(CG.getModule());		ImportedFunctionsStats.setModuleInfo(CG.getModule());
return false; // No changes to CallGraph.		return false; // No changes to CallGraph.
}		}

bool Inliner::runOnSCC(CallGraphSCC &SCC) {		bool LegacyInlinerBase::runOnSCC(CallGraphSCC &SCC) {
if (skipSCC(SCC))		if (skipSCC(SCC))
return false;		return false;
return inlineCalls(SCC);		return inlineCalls(SCC);
}		}

static bool		static bool
inlineCallsImpl(CallGraphSCC &SCC, CallGraph &CG,		inlineCallsImpl(CallGraphSCC &SCC, CallGraph &CG,
std::function<AssumptionCache &(Function &)> GetAssumptionCache,		std::function<AssumptionCache &(Function &)> GetAssumptionCache,
[198 lines not shown]	for (unsigned CSi = 0; CSi != CallSites.size(); ++CSi) {
Changed = true;		Changed = true;
LocalChange = true;		LocalChange = true;
}		}
} while (LocalChange);		} while (LocalChange);

return Changed;		return Changed;
}		}

bool Inliner::inlineCalls(CallGraphSCC &SCC) {		bool LegacyInlinerBase::inlineCalls(CallGraphSCC &SCC) {
CallGraph &CG = getAnalysis<CallGraphWrapperPass>().getCallGraph();		CallGraph &CG = getAnalysis<CallGraphWrapperPass>().getCallGraph();
ACT = &getAnalysis<AssumptionCacheTracker>();		ACT = &getAnalysis<AssumptionCacheTracker>();
PSI = getAnalysis<ProfileSummaryInfoWrapperPass>().getPSI();		PSI = getAnalysis<ProfileSummaryInfoWrapperPass>().getPSI();
auto &TLI = getAnalysis<TargetLibraryInfoWrapperPass>().getTLI();		auto &TLI = getAnalysis<TargetLibraryInfoWrapperPass>().getTLI();
// We compute dedicated AA results for each function in the SCC as needed. We		// We compute dedicated AA results for each function in the SCC as needed. We
// use a lambda referencing external objects so that they live long enough to		// use a lambda referencing external objects so that they live long enough to
// be queried, but we re-use them each time.		// be queried, but we re-use them each time.
Optional<BasicAAResult> BAR;		Optional<BasicAAResult> BAR;
Optional<AAResults> AAR;		Optional<AAResults> AAR;
auto AARGetter = [&](Function &F) -> AAResults & {		auto AARGetter = [&](Function &F) -> AAResults & {
BAR.emplace(createLegacyPMBasicAAResult(*this, F));		BAR.emplace(createLegacyPMBasicAAResult(*this, F));
AAR.emplace(createLegacyPMAAResults(this, F, BAR));		AAR.emplace(createLegacyPMAAResults(this, F, BAR));
return *AAR;		return *AAR;
};		};
auto GetAssumptionCache = [&](Function &F) -> AssumptionCache & {		auto GetAssumptionCache = [&](Function &F) -> AssumptionCache & {
return ACT->getAssumptionCache(F);		return ACT->getAssumptionCache(F);
};		};
return inlineCallsImpl(SCC, CG, GetAssumptionCache, PSI, TLI, InsertLifetime,		return inlineCallsImpl(SCC, CG, GetAssumptionCache, PSI, TLI, InsertLifetime,
[this](CallSite CS) { return getInlineCost(CS); },		[this](CallSite CS) { return getInlineCost(CS); },
AARGetter, ImportedFunctionsStats);		AARGetter, ImportedFunctionsStats);
}		}

/// Remove now-dead linkonce functions at the end of		/// Remove now-dead linkonce functions at the end of
/// processing to avoid breaking the SCC traversal.		/// processing to avoid breaking the SCC traversal.
bool Inliner::doFinalization(CallGraph &CG) {		bool LegacyInlinerBase::doFinalization(CallGraph &CG) {
if (InlinerFunctionImportStats != InlinerFunctionImportStatsOpts::No)		if (InlinerFunctionImportStats != InlinerFunctionImportStatsOpts::No)
ImportedFunctionsStats.dump(InlinerFunctionImportStats ==		ImportedFunctionsStats.dump(InlinerFunctionImportStats ==
InlinerFunctionImportStatsOpts::Verbose);		InlinerFunctionImportStatsOpts::Verbose);
return removeDeadFunctions(CG);		return removeDeadFunctions(CG);
}		}

/// Remove dead functions that are not included in DNR (Do Not Remove) list.		/// Remove dead functions that are not included in DNR (Do Not Remove) list.
bool Inliner::removeDeadFunctions(CallGraph &CG, bool AlwaysInlineOnly) {		bool LegacyInlinerBase::removeDeadFunctions(CallGraph &CG,
		bool AlwaysInlineOnly) {
SmallVector<CallGraphNode *, 16> FunctionsToRemove;		SmallVector<CallGraphNode *, 16> FunctionsToRemove;
SmallVector<CallGraphNode *, 16> DeadFunctionsInComdats;		SmallVector<CallGraphNode *, 16> DeadFunctionsInComdats;
SmallDenseMap<const Comdat *, int, 16> ComdatEntriesAlive;		SmallDenseMap<const Comdat *, int, 16> ComdatEntriesAlive;

auto RemoveCGN = [&](CallGraphNode *CGN) {		auto RemoveCGN = [&](CallGraphNode *CGN) {
// Remove any call graph edges from the function to its callees.		// Remove any call graph edges from the function to its callees.
CGN->removeAllCalledFunctions();		CGN->removeAllCalledFunctions();

[85 lines not shown]	FunctionsToRemove.erase(
std::unique(FunctionsToRemove.begin(), FunctionsToRemove.end()),		std::unique(FunctionsToRemove.begin(), FunctionsToRemove.end()),
FunctionsToRemove.end());		FunctionsToRemove.end());
for (CallGraphNode *CGN : FunctionsToRemove) {		for (CallGraphNode *CGN : FunctionsToRemove) {
delete CG.removeFunctionFromModule(CGN);		delete CG.removeFunctionFromModule(CGN);
++NumDeleted;		++NumDeleted;
}		}
return true;		return true;
}		}

		PreservedAnalyses InlinerPass::run(LazyCallGraph::SCC &InitialC,
		CGSCCAnalysisManager &AM, LazyCallGraph &CG,
		CGSCCUpdateResult &UR) {
		FunctionAnalysisManager &FAM =
		AM.getResult<FunctionAnalysisManagerCGSCCProxy>(InitialC, CG)
		.getManager();
		const ModuleAnalysisManager &MAM =
		AM.getResult<ModuleAnalysisManagerCGSCCProxy>(InitialC, CG).getManager();
		bool Changed = false;

		assert(InitialC.size() > 0 && "Cannot handle an empty SCC!");
		Module &M = *InitialC.begin()->getFunction().getParent();
		ProfileSummaryInfo *PSI = MAM.getCachedResult<ProfileSummaryAnalysis>(M);

		std::function<AssumptionCache &(Function &)> GetAssumptionCache =
		[&](Function &F) -> AssumptionCache & {
		return FAM.getResult<AssumptionAnalysis>(F);
		};

		// Setup the data structure used to plumb customization into the
		// `InlineFunction` routine.
		InlineFunctionInfo IFI(/*cg=*/nullptr);

		auto GetInlineCost = [&](CallSite CS) {
		Function &Callee = *CS.getCalledFunction();
		auto &CalleeTTI = FAM.getResult<TargetIRAnalysis>(Callee);
		return getInlineCost(CS, Params, CalleeTTI, GetAssumptionCache, PSI);
		};

		// We use a worklist of nodes to process so that we can handle if the SCC
		// structure changes and some nodes are no longer part of the current SCC. We
		// also need to use an updatable pointer for the SCC as a consequence.
		SmallVector<LazyCallGraph::Node *, 16> Nodes;
		for (auto &N : InitialC)
		Nodes.push_back(&N);
		auto *C = &InitialC;
		auto *RC = &C->getOuterRefSCC();

		// We also use a secondary worklist of call sites within a particular node to
		// allow quickly continuing to inline through newly inlined call sites where
		// possible.
		SmallVector<CallSite, 16> Calls;

		// Track a set vector of inlined callees so that we can augment the caller
		// with all of their edges in the call graph before pruning out the ones that
		// got simplified away.
		SmallSetVector<Function *, 4> InlinedCallees;

		// Track the dead functions to delete once finished with inlining calls. We
		// defer deleting these to make it easier to handle the call graph updates.
		SmallVector<Function *, 4> DeadFunctions;

		do {
		auto &N = *Nodes.pop_back_val();
		if (CG.lookupSCC(N) != C)
		continue;
		Function &F = N.getFunction();
		if (F.hasFnAttribute(Attribute::OptimizeNone))
		continue;

		// Get the remarks emission analysis for the caller.
		auto &ORE = FAM.getResult<OptimizationRemarkEmitterAnalysis>(F);

		// We want to generally process call sites top-down in order for
		// simplifications stemming from replacing the call with the returned value
		// after inlining to be visible to subsequent inlining decisions. So we
		// walk the function backwards and then process the back of the vector.
		// FIXME: Using reverse is a really bad way to do this. Instead we should
		// do an actual PO walk of the function body.
		for (Instruction &I : reverse(instructions(F)))
		if (auto CS = CallSite(&I))
		if (Function *Callee = CS.getCalledFunction())
		if (!Callee->isDeclaration())
		Calls.push_back(CS);

		bool DidInline = false;
		while (!Calls.empty()) {
		CallSite CS = Calls.pop_back_val();
		Function &Callee = *CS.getCalledFunction();

		// Check whether we want to inline this callsite.
		if (!shouldInline(CS, GetInlineCost, ORE))
		continue;

		if (!InlineFunction(CS, IFI))
		continue;
		DidInline = true;
		InlinedCallees.insert(&Callee);

		// Add any new callsites to defined functions to the worklist.
		for (CallSite &CS : reverse(IFI.InlinedCallSites))
		if (Function *NewCallee = CS.getCalledFunction())
		if (!NewCallee->isDeclaration())
		Calls.push_back(CS);

		// For local functions, check whether this makes the callee trivially
		// dead. In that case, we can drop the body of the function eagerly
		// which may reduce the number of callers of other functions to one,
		// changing inline cost thresholds.
		if (Callee.hasLocalLinkage()) {
		// To check this we also need to nuke any dead constant uses (perhaps
		// made dead by this operation on other functions).
		Callee.removeDeadConstantUsers();
		if (Callee.use_empty()) {
		// Clear the body and queue the function itself for deletion when we
		// finish inlining and call graph updates.
		// Note that after this point, it is an error to do anything other
		// than use the callee's address or delete it.
		Callee.dropAllReferences();
		assert(find(DeadFunctions, &Callee) == DeadFunctions.end() &&
		"Cannot cause a function to become dead twice!");
		DeadFunctions.push_back(&Callee);
		}
		}
		}

		if (!DidInline)
		continue;
		Changed = true;

		// Add all the inlined callees' edges to the caller. These are by
		// definition trivial edges as we already had a transitive call edge to the
		// callee.
		for (Function *InlinedCallee : InlinedCallees) {
		LazyCallGraph::Node &CalleeN = CG.lookup(InlinedCallee);
		for (LazyCallGraph::Edge &E : CalleeN)
		if (E.isCall())
		RC->insertTrivialCallEdge(N, *E.getNode());
		else
		RC->insertTrivialRefEdge(N, *E.getNode());
		}
		InlinedCallees.clear();

		// At this point, since we have made changes we have at least removed
		// a call instruction. However, in the process we do some incremental
		// simplification of the surrounding code. This simplification can
		// essentially do all of the same things as a function pass and we can
		// re-use the exact same logic for updating the call graph to reflect the
		// change.
		C = &updateCGAndAnalysisManagerForFunctionPass(CG, *C, N, AM, UR);
		RC = &C->getOuterRefSCC();
		} while (!Nodes.empty());

		// Now that we've finished inlining all of the calls across this SCC, delete
		// all of the trivially dead functions, updating the call graph and the CGSCC
		// pass manager in the process.
		//
		// Note that this walks a pointer set which has non-deterministic order but
		// that is OK as all we do is delete things and add pointers to unordered
		// sets.
		for (Function *DeadF : DeadFunctions) {
		// Get the necessary information out of the call graph and nuke the
		// function there.
		auto &DeadC = CG.lookupSCC(CG.lookup(*DeadF));
		auto &DeadRC = DeadC.getOuterRefSCC();
		CG.removeDeadFunction(*DeadF);

		// Mark the relevant parts of the call graph as invalid so we don't visit
		// them.
		UR.InvalidatedSCCs.insert(&DeadC);
		UR.InvalidatedRefSCCs.insert(&DeadRC);

		// And delete the actual function from the module.
		M.getFunctionList().erase(DeadF);
		}
		return Changed ? PreservedAnalyses::none() : PreservedAnalyses::all();
		}
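
The control flow of InlinerPass::run above (inline a call site, queue the newly exposed call sites, drop the body of a local callee as soon as it has no uses, and erase dead functions only after the loop) can be illustrated without any LLVM types. The following is a minimal sketch under stated assumptions, not the patch's implementation: ToyFunction, ToyModule, countUses, inlineAllCalls and the Budget parameter are hypothetical names, every call to a defined function is inlined unconditionally instead of consulting a cost model, and Budget only exists to guarantee termination on recursive call graphs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical toy model: a function is just its direct call sites in program
// order plus a flag saying whether it has local (internal) linkage.
struct ToyFunction {
  std::vector<std::string> Calls;
  bool Local = false;
};
using ToyModule = std::map<std::string, ToyFunction>;

// Count the call sites that still reference Name anywhere in the module.
static long countUses(const ToyModule &M, const std::string &Name) {
  long N = 0;
  for (const auto &KV : M)
    N += std::count(KV.second.Calls.begin(), KV.second.Calls.end(), Name);
  return N;
}

// Inline calls inside CallerName, top-down, at most Budget times. Returns the
// names of the local functions that became dead and were erased.
std::vector<std::string> inlineAllCalls(ToyModule &M,
                                        const std::string &CallerName,
                                        int Budget) {
  std::vector<std::string> Dead;
  ToyFunction &Caller = M.at(CallerName);
  std::size_t I = 0;
  while (I < Caller.Calls.size() && Budget > 0) {
    const std::string Callee = Caller.Calls[I];
    auto It = M.find(Callee);
    // Skip declarations (not in the module) and direct self-calls.
    if (It == M.end() || Callee == CallerName) {
      ++I;
      continue;
    }
    // "Inline": replace the call site with the callee's own call sites, so
    // the newly exposed calls are visited next.
    const std::vector<std::string> Body = It->second.Calls;
    const auto Pos = Caller.Calls.begin() + static_cast<std::ptrdiff_t>(I);
    Caller.Calls.insert(Caller.Calls.erase(Pos), Body.begin(), Body.end());
    --Budget;
    // A local callee with no remaining uses is trivially dead: drop its body
    // now (which releases its uses of other functions) and defer the erase.
    if (It->second.Local && countUses(M, Callee) == 0) {
      It->second.Calls.clear();
      assert(std::find(Dead.begin(), Dead.end(), Callee) == Dead.end());
      Dead.push_back(Callee);
    }
  }
  // Deferred deletion, once all inlining in this caller is finished.
  for (const std::string &Name : Dead)
    M.erase(Name);
  return Dead;
}
```

Clearing the body before the function is erased mirrors the dropAllReferences call in the patch: it lowers the use counts of the callee's own callees immediately, so later decisions in the same loop see the updated counts.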

llvm/trunk/lib/Transforms/Utils/InlineFunction.cpp

[1,638 lines not shown]
if (ParentDeopt) {		if (ParentDeopt) {
I->replaceAllUsesWith(NewI);		I->replaceAllUsesWith(NewI);

VH = nullptr;		VH = nullptr;
I->eraseFromParent();		I->eraseFromParent();
}		}
}		}

// Update the callgraph if requested.		// Update the callgraph if requested.
if (IFI.CG)		if (IFI.CG) {
UpdateCallGraphAfterInlining(CS, FirstNewBlock, VMap, IFI);		UpdateCallGraphAfterInlining(CS, FirstNewBlock, VMap, IFI);
		} else {
		// Otherwise just collect the raw call sites that were inlined.
		for (BasicBlock &NewBB :
		make_range(FirstNewBlock->getIterator(), Caller->end()))
		for (Instruction &I : NewBB)
		if (auto CS = CallSite(&I))
		IFI.InlinedCallSites.push_back(CS);
		}

// For 'nodebug' functions, the associated DISubprogram is always null.		// For 'nodebug' functions, the associated DISubprogram is always null.
// Conservatively avoid propagating the callsite debug location to		// Conservatively avoid propagating the callsite debug location to
// instructions inlined from a function whose DISubprogram is not null.		// instructions inlined from a function whose DISubprogram is not null.
fixupLineNumbers(Caller, FirstNewBlock, TheCall,		fixupLineNumbers(Caller, FirstNewBlock, TheCall,
CalledFunc->getSubprogram() != nullptr);		CalledFunc->getSubprogram() != nullptr);

// Clone existing noalias metadata if necessary.		// Clone existing noalias metadata if necessary.
[548 lines not shown]

llvm/trunk/test/Transforms/Inline/basictest.ll

	; RUN: opt < %s -inline -sroa -S | FileCheck %s			; RUN: opt < %s -inline -sroa -S | FileCheck %s
				; RUN: opt < %s -passes='cgscc(inline,function(sroa))' -S | FileCheck %s
	target datalayout = "E-p:64:64:64-a0:0:8-f32:32:32-f64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-v64:64:64-v128:128:128"			target datalayout = "E-p:64:64:64-a0:0:8-f32:32:32-f64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-v64:64:64-v128:128:128"

	define i32 @test1f(i32 %i) {			define i32 @test1f(i32 %i) {
	ret i32 %i			ret i32 %i
	}			}

	define i32 @test1(i32 %W) {			define i32 @test1(i32 %W) {
	%X = call i32 @test1f(i32 7)			%X = call i32 @test1f(i32 7)
	[83 lines not shown]

llvm/trunk/test/Transforms/Inline/cgscc-update.ll

				; RUN: opt < %s -aa-pipeline=basic-aa -passes='cgscc(function-attrs,inline)' -S | FileCheck %s
				; This test runs the inliner and the function attribute deduction. It ensures
				; that when the inliner mutates the call graph it correctly updates the CGSCC
				; iteration so that we can compute refined function attributes. In this way it
				; is leveraging function attribute computation to observe correct call graph
				; updates.

				; Boring unknown external function call.
				; CHECK: declare void @unknown()
				declare void @unknown()

				; Sanity check: this should get annotated as readnone.
				; CHECK: Function Attrs: readnone
				; CHECK-NEXT: declare void @readnone()
				declare void @readnone() readnone

				; The 'test1_' prefixed functions are designed to trigger forming a new direct
				; call in the inlined body of the function. After that, we form a new SCC and
				; using that can deduce precise function attrs.

				; This function should no longer exist.
				; CHECK-NOT: @test1_f()
				define internal void @test1_f(void()* %p) {
				entry:
				call void %p()
				ret void
				}

				; This function should have had 'readnone' deduced for its SCC.
				; CHECK: Function Attrs: noinline readnone
				; CHECK-NEXT: define void @test1_g()
				define void @test1_g() noinline {
				entry:
				call void @test1_f(void()* @test1_h)
				ret void
				}

				; This function should have had 'readnone' deduced for its SCC.
				; CHECK: Function Attrs: noinline readnone
				; CHECK-NEXT: define void @test1_h()
				define void @test1_h() noinline {
				entry:
				call void @test1_g()
				call void @readnone()
				ret void
				}


				; The 'test2_' prefixed functions are designed to trigger forming a new direct
				; call due to RAUW-ing the returned value of a called function into the caller.
				; This too should form a new SCC which can then be reasoned about to compute
				; precise function attrs.

				; This function should no longer exist.
				; CHECK-NOT: @test2_f()
				define internal void()* @test2_f() {
				entry:
				ret void()* @test2_h
				}

				; This function should have had 'readnone' deduced for its SCC.
				; CHECK: Function Attrs: noinline readnone
				; CHECK-NEXT: define void @test2_g()
				define void @test2_g() noinline {
				entry:
				%p = call void()* @test2_f()
				call void %p()
				ret void
				}

				; This function should have had 'readnone' deduced for its SCC.
				; CHECK: Function Attrs: noinline readnone
				; CHECK-NEXT: define void @test2_h()
				define void @test2_h() noinline {
				entry:
				call void @test2_g()
				call void @readnone()
				ret void
				}


				; The 'test3_' prefixed functions are designed to inline in a way that causes
				; call sites to become trivially dead during the middle of inlining callsites of
				; a single function to make sure that the inliner does not get confused by this
				; pattern.

				; CHECK-NOT: @test3_maybe_unknown(
				define internal void @test3_maybe_unknown(i1 %b) {
				entry:
				br i1 %b, label %then, label %exit

				then:
				call void @unknown()
				br label %exit

				exit:
				ret void
				}

				; CHECK-NOT: @test3_f(
				define internal i1 @test3_f() {
				entry:
				ret i1 false
				}

				; CHECK-NOT: @test3_g(
				define internal i1 @test3_g(i1 %b) {
				entry:
				br i1 %b, label %then1, label %if2

				then1:
				call void @test3_maybe_unknown(i1 true)
				br label %if2

				if2:
				%f = call i1 @test3_f()
				br i1 %f, label %then2, label %exit

				then2:
				call void @test3_maybe_unknown(i1 true)
				br label %exit

				exit:
				ret i1 false
				}

				; FIXME: Currently the inliner doesn't successfully mark this as readnone
				; because while it simplifies trivially dead CFGs when inlining callees it
				; doesn't simplify the caller's trivially dead CFG and so we end with a dead
				; block calling @unknown.
				; CHECK-NOT: Function Attrs: readnone
				; CHECK: define void @test3_h()
				define void @test3_h() {
				entry:
				%g = call i1 @test3_g(i1 false)
				br i1 %g, label %then, label %exit

				then:
				call void @test3_maybe_unknown(i1 true)
				br label %exit

				exit:
				call void @test3_maybe_unknown(i1 false)
				ret void
				}
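
The test1_ and test2_ cases above rely on attribute deduction seeing a direct call cycle between g and h once the indirect call has been resolved by inlining. The following is a rough sketch of why that matters, not the function-attrs pass itself: ToyCallGraph, deduceReadNone and the "<indirect>" marker are hypothetical, and the toy only looks at call edges, whereas the real deduction also inspects memory accesses. It starts from the optimistic assumption that every defined function is readnone and removes functions until a fixed point is reached, which lets a cycle of mutually recursive functions keep the attribute.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical toy call graph: defined function name -> names it calls.
// The callee name "<indirect>" models a call through an unknown pointer.
using ToyCallGraph = std::map<std::string, std::vector<std::string>>;

// Optimistic fixed point: assume every defined function is readnone, then
// repeatedly drop any function that calls something not known to be readnone.
std::set<std::string> deduceReadNone(const ToyCallGraph &Defined,
                                     const std::set<std::string> &KnownReadNone) {
  std::set<std::string> ReadNone = KnownReadNone;
  for (const auto &KV : Defined)
    ReadNone.insert(KV.first);

  bool Changed = true;
  while (Changed) {
    Changed = false;
    for (const auto &KV : Defined) {
      if (!ReadNone.count(KV.first))
        continue;
      for (const std::string &Callee : KV.second) {
        if (!ReadNone.count(Callee)) {
          ReadNone.erase(KV.first);
          Changed = true;
          break;
        }
      }
    }
  }
  return ReadNone;
}
```

With g calling f and f making an indirect call, nothing except the declared readnone function survives. With g calling h directly and h calling g and the readnone declaration, both g and h survive, which is the outcome the CHECK lines for test1_g and test1_h expect.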

llvm/trunk/test/Transforms/Inline/last-callsite.ll

				; RUN: opt < %s -passes='cgscc(inline)' -inline-threshold=0 -S | FileCheck %s

				; The 'test1_' prefixed functions test the basic 'last callsite' inline
				; threshold adjustment where we specifically inline the last call site of an
				; internal function regardless of cost.

				define internal void @test1_f() {
				entry:
				%p = alloca i32
				store volatile i32 0, i32* %p
				store volatile i32 0, i32* %p
				store volatile i32 0, i32* %p
				store volatile i32 0, i32* %p
				store volatile i32 0, i32* %p
				store volatile i32 0, i32* %p
				store volatile i32 0, i32* %p
				store volatile i32 0, i32* %p
				ret void
				}

				; Identical to @test1_f but doesn't get inlined because there is more than one
				; call. If this does get inlined, the body used both here and in @test1_f
				; isn't a good test for different threshold based on the last call.
				define internal void @test1_g() {
				entry:
				%p = alloca i32
				store volatile i32 0, i32* %p
				store volatile i32 0, i32* %p
				store volatile i32 0, i32* %p
				store volatile i32 0, i32* %p
				store volatile i32 0, i32* %p
				store volatile i32 0, i32* %p
				store volatile i32 0, i32* %p
				store volatile i32 0, i32* %p
				ret void
				}

				define void @test1() {
				; CHECK-LABEL: define void @test1()
				entry:
				call void @test1_f()
				; CHECK-NOT: @test1_f

				call void @test1_g()
				call void @test1_g()
				; CHECK: call void @test1_g()
				; CHECK: call void @test1_g()

				ret void
				}


				; The 'test2_' prefixed functions test that we can discover the last callsite
				; bonus after having inlined the prior call site. For this to work, we need
				; a callsite dependent cost so we have a trivial predicate guarding all the
				; cost, and set that in a particular direction.

				define internal void @test2_f(i1 %b) {
				entry:
				%p = alloca i32
				br i1 %b, label %then, label %exit

				then:
				store volatile i32 0, i32* %p
				store volatile i32 0, i32* %p
				store volatile i32 0, i32* %p
				store volatile i32 0, i32* %p
				store volatile i32 0, i32* %p
				store volatile i32 0, i32* %p
				store volatile i32 0, i32* %p
				store volatile i32 0, i32* %p
				br label %exit

				exit:
				ret void
				}

				; Identical to @test2_f but doesn't get inlined because there is more than one
				; call. If this does get inlined, the body used both here and in @test2_f
				; isn't a good test for different threshold based on the last call.
				define internal void @test2_g(i1 %b) {
				entry:
				%p = alloca i32
				br i1 %b, label %then, label %exit

				then:
				store volatile i32 0, i32* %p
				store volatile i32 0, i32* %p
				store volatile i32 0, i32* %p
				store volatile i32 0, i32* %p
				store volatile i32 0, i32* %p
				store volatile i32 0, i32* %p
				store volatile i32 0, i32* %p
				store volatile i32 0, i32* %p
				br label %exit

				exit:
				ret void
				}

				define void @test2() {
				; CHECK-LABEL: define void @test2()
				entry:
				; The first call is trivial to inline due to the argument.
				call void @test2_f(i1 false)
				; CHECK-NOT: @test2_f

				; The second call is too expensive to inline unless we update the number of
				; calls after inlining the first.
				call void @test2_f(i1 true)
				; CHECK-NOT: @test2_f

				; Sanity check that two calls with the hard predicate remain uninlined.
				call void @test2_g(i1 true)
				call void @test2_g(i1 true)
				; CHECK: call void @test2_g(i1 true)
				; CHECK: call void @test2_g(i1 true)

				ret void
				}


				; The 'test3_' prefixed functions are similar to the 'test2_' functions but the
				; relative order of the trivial and hard to inline callsites is reversed. This
				; checks that the order of calls isn't significant to whether we observe the
				; "last callsite" threshold difference because the next-to-last gets inlined.
				; FIXME: We don't currently catch this case.

				define internal void @test3_f(i1 %b) {
				entry:
				%p = alloca i32
				br i1 %b, label %then, label %exit

				then:
				store volatile i32 0, i32* %p
				store volatile i32 0, i32* %p
				store volatile i32 0, i32* %p
				store volatile i32 0, i32* %p
				store volatile i32 0, i32* %p
				store volatile i32 0, i32* %p
				store volatile i32 0, i32* %p
				store volatile i32 0, i32* %p
				br label %exit

				exit:
				ret void
				}

				; Identical to @test3_f but doesn't get inlined because there is more than one
				; call. If this does get inlined, the body used both here and in @test3_f
				; isn't a good test for different threshold based on the last call.
				define internal void @test3_g(i1 %b) {
				entry:
				%p = alloca i32
				br i1 %b, label %then, label %exit

				then:
				store volatile i32 0, i32* %p
				store volatile i32 0, i32* %p
				store volatile i32 0, i32* %p
				store volatile i32 0, i32* %p
				store volatile i32 0, i32* %p
				store volatile i32 0, i32* %p
				store volatile i32 0, i32* %p
				store volatile i32 0, i32* %p
				br label %exit

				exit:
				ret void
				}

				define void @test3() {
				; CHECK-LABEL: define void @test3()
				entry:
				; The first call is too expensive to inline unless we update the number of
				; calls after inlining the second.
				call void @test3_f(i1 true)
				; FIXME: We should inline this call without iteration.
				; CHECK: call void @test3_f(i1 true)

				; But the second call is trivial to inline due to the argument.
				call void @test3_f(i1 false)
				; CHECK-NOT: @test3_f

				; Sanity check that two calls with the hard predicate remain uninlined.
				call void @test3_g(i1 true)
				call void @test3_g(i1 true)
				; CHECK: call void @test3_g(i1 true)
				; CHECK: call void @test3_g(i1 true)

				ret void
				}


				; The 'test4_' prefixed functions are similar to the 'test2_' prefixed
				; functions but include unusual constant expressions that make discovering that
				; a function is dead harder.

				define internal void @test4_f(i1 %b) {
				entry:
				%p = alloca i32
				br i1 %b, label %then, label %exit

				then:
				store volatile i32 0, i32* %p
				store volatile i32 0, i32* %p
				store volatile i32 0, i32* %p
				store volatile i32 0, i32* %p
				store volatile i32 0, i32* %p
				store volatile i32 0, i32* %p
				store volatile i32 0, i32* %p
				store volatile i32 0, i32* %p
				br label %exit

				exit:
				ret void
				}

				; Identical to @test4_f but doesn't get inlined because there is more than one
				; call. If this does get inlined, the body used both here and in @test4_f
				; isn't a good test for different threshold based on the last call.
				define internal void @test4_g(i1 %b) {
				entry:
				%p = alloca i32
				br i1 %b, label %then, label %exit

				then:
				store volatile i32 0, i32* %p
				store volatile i32 0, i32* %p
				store volatile i32 0, i32* %p
				store volatile i32 0, i32* %p
				store volatile i32 0, i32* %p
				store volatile i32 0, i32* %p
				store volatile i32 0, i32* %p
				store volatile i32 0, i32* %p
				br label %exit

				exit:
				ret void
				}

				define void @test4() {
				; CHECK-LABEL: define void @test4()
				entry:
				; The first call is trivial to inline due to the argument. However this
				; argument also uses the function being called as part of a complex
				; constant expression. Merely inlining and deleting the call isn't enough to
				; drop the use count here, we need to GC the dead constant expression as
				; well.
				call void @test4_f(i1 icmp ne (i64 ptrtoint (void (i1)* @test4_f to i64), i64 ptrtoint(void (i1)* @test4_f to i64)))
				; CHECK-NOT: @test4_f

				; The second call is too expensive to inline unless we update the number of
				; calls after inlining the first.
				call void @test4_f(i1 true)
				; CHECK-NOT: @test4_f

				; And check that a single call to a function which is used by a complex
				; constant expression cannot be inlined because the constant expression forms
				; a second use. If this part starts failing we need to use more complex
				; constant expressions to reference a particular function with them.
				%sink = alloca i1
				store volatile i1 icmp ne (i64 ptrtoint (void (i1)* @test4_g to i64), i64 ptrtoint(void (i1)* @test4_g to i64)), i1* %sink
				call void @test4_g(i1 true)
				; CHECK: store volatile i1 false
				; CHECK: call void @test4_g(i1 true)

				ret void
				}
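
The order dependence that test2_ and test3_ exercise can be reproduced with a few lines of arithmetic. The sketch below is only an illustration under assumptions: ToyThreshold, ToyLastCallBonus and inlineCallsToLocal are hypothetical, the numbers are made up, and the real cost computation lives in InlineCost.cpp. It makes one top-down pass over the call sites of a single internal function and applies a bonus only when the call site being considered is the last remaining use.

```cpp
#include <vector>

// Hypothetical values, chosen only so the effect is visible.
constexpr int ToyThreshold = 0;        // mirrors -inline-threshold=0
constexpr int ToyLastCallBonus = 1000; // large enough to outweigh any cost here

// Costs holds the per-call-site cost of every call to one internal function,
// in program order. Returns which call sites one top-down pass inlines.
std::vector<bool> inlineCallsToLocal(const std::vector<int> &Costs) {
  int RemainingUses = static_cast<int>(Costs.size());
  std::vector<bool> Inlined;
  for (int Cost : Costs) {
    int Effective = Cost;
    // The bonus applies only when this is the last remaining call site.
    if (RemainingUses == 1)
      Effective -= ToyLastCallBonus;
    const bool DoInline = Effective <= ToyThreshold;
    if (DoInline)
      --RemainingUses; // the use count is updated right after inlining
    Inlined.push_back(DoInline);
  }
  return Inlined;
}
```

Costs of {0, 40} (cheap call first, as in test2) inline both call sites, because the use count has dropped to one by the time the expensive call is considered. Costs of {40, 0} (as in test3) leave the first call in place, since a single pass never revisits it; that is the case the FIXME in test3 describes.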

llvm/trunk/test/Transforms/Inline/nested-inline.ll

	; RUN: opt < %s -inline -S | FileCheck %s			; RUN: opt < %s -inline -S | FileCheck %s
				; RUN: opt < %s -passes='cgscc(inline)' -S | FileCheck %s
	; Test that bar and bar2 are both inlined throughout and removed.			; Test that bar and bar2 are both inlined throughout and removed.
	@A = weak global i32 0 ; <i32*> [#uses=1]			@A = weak global i32 0 ; <i32*> [#uses=1]
	@B = weak global i32 0 ; <i32*> [#uses=1]			@B = weak global i32 0 ; <i32*> [#uses=1]
	@C = weak global i32 0 ; <i32*> [#uses=1]			@C = weak global i32 0 ; <i32*> [#uses=1]

	define fastcc void @foo(i32 %X) {			define fastcc void @foo(i32 %X) {
	entry:			entry:
	; CHECK-LABEL: @foo(			; CHECK-LABEL: @foo(
	[102 lines not shown]

Diff 82058