This is an archive of the discontinued LLVM Phabricator instance.


Differential D126455

[FuncSpec] Make the Function Specializer part of the IPSCCP pass.
ClosedPublic

Authored by labrinea on May 26 2022, 3:09 AM.

Details

Reviewers

llvm-commits
ChuanqiXu
fhahn
nikic
efriedma
chill

Commits

rG8136a0172b3c: [FuncSpec] Make the Function Specializer part of the IPSCCP pass.
rG877a9f9abec6: [FuncSpec] Make the Function Specializer part of the IPSCCP pass.

Summary

The aim of this patch is to minimize the compilation time overhead of running Function Specialization. It is about 40% slower to run as a standalone pass (IPSCCP + FuncSpec vs IPSCCP with FuncSpec) according to my measurements. I compiled the llvm testsuite with NewPM-O3 + LTO and measured single threaded [user + system] time of IPSCCP and FuncSpec by passing the '-time-passes' option to lld. Then I compared the two configurations in terms of Instruction Count of the total compilation (not of the individual passes) as in https://llvm-compile-time-tracker.com. Geomean for non-LTO builds is -0.25% and LTO is -0.5% approximately.

You can find more info below:
https://discourse.llvm.org/t/rfc-should-we-enable-function-specialization/61518

Diff Detail

Repository: rG LLVM Github Monorepo

Unit Tests: Failed

	Time	Test
	60,110 ms	x64 debian > AddressSanitizer-x86_64-linux-dynamic.TestCases::scariness_score_test.cpp
	60,150 ms	x64 debian > AddressSanitizer-x86_64-linux.TestCases::scariness_score_test.cpp
	600 ms	x64 debian > LLVM.Transforms/FunctionSpecialization::bug52821-use-after-free.ll
	600 ms	x64 debian > LLVM.Transforms/FunctionSpecialization::bug55000-read-uninitialized-value.ll
	550 ms	x64 debian > LLVM.Transforms/FunctionSpecialization::function-specialization-always-inline.ll
		View Full Test Results (31 Failed)

Event Timeline

labrinea created this revision.May 26 2022, 3:09 AM

Herald added a project: Restricted Project. · View Herald TranscriptMay 26 2022, 3:09 AM

Herald added subscribers: snehasish, ormris, hiraditya, mgorny. · View Herald Transcript

labrinea requested review of this revision.May 26 2022, 3:09 AM

Herald added a project: Restricted Project. · View Herald TranscriptMay 26 2022, 3:09 AM

This is a proof of concept for RFC: Should we enable Function Specialization?, not ready for review.

labrinea added a child revision: D126456: [SCCP] Notify the Solver when an instruction is removed..May 26 2022, 3:13 AM

Harbormaster completed remote builds in B166438: Diff 432226.May 26 2022, 3:43 AM

labrinea mentioned this in D128822: [FuncSpec] Partially revert rG8b360c69e9e3..Jun 29 2022, 7:26 AM

rebased + fixed tests

Herald added a subscriber: nlopes. · View Herald TranscriptJun 29 2022, 7:29 AM

labrinea added a parent revision: D128822: [FuncSpec] Partially revert rG8b360c69e9e3..Jun 29 2022, 7:30 AM

labrinea added a child revision: D128823: [SCCP] Make it possible to remove predicate info for a given instruction..Jun 29 2022, 7:32 AM

labrinea removed a child revision: D126456: [SCCP] Notify the Solver when an instruction is removed..Jun 29 2022, 7:40 AM

Harbormaster completed remote builds in B172752: Diff 441002.Jun 29 2022, 8:31 AM

labrinea added reviewers: ChuanqiXu, fhahn, eli.friedman.Jul 11 2022, 4:02 AM

The `tryToReplaceWithConstant` method in SCCP does not update the lattice value map in the SCCPSolver, which can lead to the following problem. If

%arg = getelementptr %struct, %struct* @Global, i32 0, i32 3
%tmp0 = call i64 @func2(i64* %arg)

is folded into

%tmp0 = call i64 @func2(i64* getelementptr inbounds %struct, %struct* @Global, i32 0, i32 3)

a new callbase argument appears, but it is not recorded at SCCPSolver, and this leads to problems such as D128822.

Suggestion:
`FunctionSpecializer::tryToReplaceWithConstant` uses the code below to update the propagated argument; `tryToReplaceWithConstant` in SCCP may need a similar change.

for (auto *I : UseInsts)
  Solver.visit(I);

labrinea added a reviewer: nikic.Jul 12 2022, 3:30 AM

labrinea edited reviewers, added: efriedma; removed: eli.friedman.Aug 1 2022, 10:39 AM

In D126455#3644307, @sinan wrote:
tryToReplaceWithConstant method in SCCP does not update the lattice value map at SCCPSolver, and it might lead to a problem that

if
%arg = getelementptr %struct, %struct* @Global, i32 0, i32 3
%tmp0 = call i64 @func2(i64* %arg)
is folded into
%tmp0 = call i64 @func2(i64* getelementptr inbounds %struct, %struct* @Global, i32 0, i32 3)
a new callbase argument appears, but it is not recorded at SCCPSolver, and this leads to problems such as D128822.

Suggestion:
FunctionSpecializer::tryToReplaceWithConstant use the code piece below to update the propagated argument, and maybe we need such a change for tryToReplaceWithConstant as well.
for (auto *I : UseInsts)
  Solver.visit(I);

I am doing this in a later patch (see D126456) as I wanted to keep this one as close as possible to the original implementation.

labrinea edited the summary of this revision. (Show Details)Aug 15 2022, 1:44 AM

ping

fhahn added inline comments.Aug 17 2022, 1:51 AM

llvm/lib/Transforms/IPO/FunctionSpecialization.cpp
301	can you fix the indentation here in a NFC?
764	IIUC this is done in between solver runs, right? Is this needed? Isn't it sufficient to continue with the constant value in the value mapping? This would probably remove the need to tell the solver to forget instructions/values.

labrinea added inline comments.Aug 17 2022, 9:22 AM

llvm/lib/Transforms/IPO/FunctionSpecialization.cpp
301	How? It seems ok.
764	This is not the only invocation of `tryToReplaceWithConstant` in FuncSpec. On this instance we try to replace the arguments of cloned functions. There's another invocation inside the functor `RunSCCPSolver`. On that instance we try to replace the instructions of cloned functions. Both calls occur as many times as `FuncSpecializationMaxIters` is set to. Moreover, the SCCP pass itself does the same thing on arguments of tracked functions and on instructions of executable blocks (with `tryToReplaceWithConstant` and `simplifyInstsInBlock` accordingly). This happens after the Solver runs and before the Function Specializer is invoked. Therefore, I think we still need to tell the Solver to forget instructions/values if we want to merge the two passes.

chill added a subscriber: chill.Oct 10 2022, 4:14 AM

chill added inline comments.

llvm/lib/Transforms/Scalar/SCCP.cpp
484	IMHO, the invocation of the `FunctionSpecialization` pass ought to happen in this place. The general flow would be like:
1. Initialise solver
2. Run solver once (`Solver.solve()` + `resolvedUndefsIn` loop)
3. Run function specialisation
4. Run solver again
5. Optionally go to 2.
6. Do replacements (from line 512 on)
At no point before the last step ought the passes replace or delete anything (well, except the called function operand for cloned functions). If an operand/argument is determined to be a constant, it does not need to be replaced right away, because the passes should consult its lattice value. Yeah, the devil is in the details, but this is the approach to merging the two passes, as I see it.
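
The ordering proposed in the comment above can be sketched as a toy driver that only records which phase runs when. This is a minimal illustration, not the actual `runIPSCCP` code: `runToyPipeline` and the `Specialize` callback are hypothetical names, and the callback merely stands in for the function specializer reporting whether it created new clones.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Toy driver that records the order of phases; all names are hypothetical.
// Specialize(Iter) stands in for the function specializer and returns true
// if it created at least one new specialization in this iteration.
std::vector<std::string>
runToyPipeline(unsigned MaxIters,
               const std::function<bool(unsigned)> &Specialize) {
  std::vector<std::string> Phases;
  Phases.push_back("init");  // 1. initialise the solver
  Phases.push_back("solve"); // 2. run the solver once
  for (unsigned Iter = 0; Iter < MaxIters; ++Iter) {
    if (!Specialize(Iter)) // 3. run function specialisation
      break;
    Phases.push_back("specialize");
    Phases.push_back("solve"); // 4. run the solver again, 5. maybe repeat
  }
  Phases.push_back("replace"); // 6. only now is anything replaced or deleted
  return Phases;
}
```

The point of the sketch is that "replace" appears exactly once, after every "solve", so nothing is modified while lattice values are still being computed.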

I have moved the invocation of the specializer earlier in the ipsccp pass, such that no instructions get deleted until all the solving is done. This essentially makes all of D128822, D128823, D128824, D128825, D126456 and D128827 obsolete. There is one thing I couldn't get working, which is to update the lattice value of the callsites to specialized functions. Unfortunately the semantics of the solver do not allow lattices to move from a generic to a more specific state (i.e. from a wider to a narrower constant range). As a result, the zapping of returned values won't work on specialized functions, nor can we propagate a constant returned value to the function body where the callsite resides. On another note, the function analysis information which is provided to the pass (predicate info, dominator tree) cannot be used on the specialized functions; not sure if that's a problem though.
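
The restriction on lattice updates mentioned above follows from how SCCP-style lattices are merged: a state may only become less precise, never more precise. Below is a minimal sketch with a toy three-state lattice, assuming a simplified model; `ToyLattice`, `mergeIn` and `constantOf` are hypothetical names and do not reflect the `ValueLatticeElement` API.

```cpp
#include <cassert>

// Toy three-state lattice, loosely modelled on SCCP. Not LLVM's real class.
struct ToyLattice {
  enum Kind { Unknown, Constant, Overdefined };
  Kind State = Unknown;
  long Value = 0;

  // Merge RHS into this state. Returns true if this state changed.
  // The state only ever moves Unknown -> Constant -> Overdefined.
  bool mergeIn(const ToyLattice &RHS) {
    if (State == Overdefined || RHS.State == Unknown)
      return false;
    if (RHS.State == Overdefined) {
      State = Overdefined;
      return true;
    }
    // RHS is a constant from here on.
    if (State == Unknown) {
      State = Constant;
      Value = RHS.Value;
      return true;
    }
    if (Value == RHS.Value)
      return false;
    State = Overdefined; // two different constants meet
    return true;
  }
};

// Helper that builds a constant lattice state.
inline ToyLattice constantOf(long V) {
  ToyLattice L;
  L.State = ToyLattice::Constant;
  L.Value = V;
  return L;
}
```

In this model, once the return value of a call has been merged to `Overdefined` from the generic function, redirecting the call to a specialization that always returns the same constant cannot narrow it again, because `mergeIn` never leaves `Overdefined`.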

Harbormaster completed remote builds in B193102: Diff 469041.Oct 19 2022, 2:51 PM

labrinea removed a parent revision: D128822: [FuncSpec] Partially revert rG8b360c69e9e3..Oct 25 2022, 2:58 AM

labrinea added inline comments.Oct 25 2022, 3:01 AM

llvm/lib/Transforms/Scalar/SCCP.cpp
143–154	Just found that we need to do the same inside `replaceSignedInst()` too. I will move this code into a function.

chill edited reviewers, added: chill; removed: momchil.velikov.Oct 27 2022, 7:26 AM

chill added inline comments.Oct 27 2022, 8:30 AM

llvm/lib/Transforms/Scalar/CMakeLists.txt
97	Why add IPO here ?
llvm/lib/Transforms/Scalar/SCCP.cpp
143–154	Would it be possible to call `markUsersAsChanged` here ?
147–148	Is there a specific reason to remove the instruction from the block? If not, I'd suggest doing deletion in a single place, as opposed to spreading parts of it all over.
160	Likewise.

labrinea added inline comments.Oct 28 2022, 4:07 AM

llvm/lib/Transforms/Scalar/CMakeLists.txt
97	I vaguely remember a link time error without this change. See also `llvm/utils/gn/secondary/llvm/lib/Transforms/Scalar/BUILD.gn` at the bottom of this diff. The IPSCCP pass now depends on the FunctionSpecializer whose cpp file is under the IPO directory.
llvm/lib/Transforms/Scalar/SCCP.cpp
143–154	I think we can't because if we replace the uses first then the users of the old value will be empty. Can we markUsersAsChanged before we replaceAllUsesWith the new value? Btw markUsersAsChanged is private for the SCCPInstVisitor, but I suppose I could make it public if need be.
147–148	I am not entirely sure. I wanted to avoid revisiting this instruction accidentally in either of simplifyInstsInBlock(), solve(), or resolvedUndefsIn(). For simplifyInstsInBlock() I could skip the instruction if it's present in `ToDelete`. For the others I don't know what the consequences of revisiting would be. I need to run some tests first.

labrinea added inline comments.Oct 28 2022, 6:09 AM

llvm/lib/Transforms/Scalar/SCCP.cpp
143–154	Actually I could call markUsersAsChanged on the new Instruction after replacing the uses of the old Instruction with it.

chill added inline comments.Oct 28 2022, 7:33 AM

llvm/lib/Transforms/Scalar/CMakeLists.txt
97	IPO already depends on Scalar, i.e. in `IPO/CMakeLists.txt` we have `... COMPONENT_NAME IPO LINK_COMPONENTS ... Scalar ...`. Looks like a circular dependency. Perhaps `FunctionSpecialization` needs to go to `Utils` (alongside `SCCPSolver`). Or `runIPSCCP` needs to go to `IPO/SCCP.cpp`. Or both.
llvm/lib/Transforms/Scalar/SCCP.cpp
143–154	OK, let's leave it hanging for now, until I can take a look on top of the latest trunk. Ideally, we are trying to avoid changing code until the Solver is done. Here we have found that an instruction has a constant lattice value: we should not replace the users' operands right away, but notify the Solver. The Solver in turn would add the instructions that need reexamining to the instructions worklist and update their lattice values the next time we invoke `Solver.solve()`. Most likely `SCCPSolver::visit` should become private. The Solver (and the SCCP algorithm in general) is driven by its worklists, and we should stick to this design: want something done, add it to the worklist.
147–148	I can't see why anything would go wrong if the instruction is revisited. Do we know if the instruction is safe to remove? It could be `SDiv`/`SRem` with a zero divisor.
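
The worklist-driven design argued for in the 143–154 comment above can be illustrated with a toy solver in which clients never recompute a value directly; they only enqueue work and call `solve()`. This is a sketch under simplifying assumptions: values are plain sums over an acyclic graph rather than lattice elements, and `ToySolver` with its members is hypothetical, not the `SCCPSolver` interface.

```cpp
#include <cassert>
#include <deque>
#include <vector>

// Toy dataflow solver: each non-leaf node is the sum of its operands.
class ToySolver {
  struct Node {
    std::vector<int> Operands; // indices of operand nodes, empty for leaves
    long Value;
  };
  std::vector<Node> Nodes;
  std::vector<std::vector<int>> Users; // Users[N] = nodes that use node N
  std::deque<int> WorkList;

  int addNode(const std::vector<int> &Ops, long V) {
    Nodes.push_back(Node{Ops, V});
    Users.emplace_back();
    return static_cast<int>(Nodes.size()) - 1;
  }

public:
  int addLeaf(long V) { return addNode({}, V); }

  int addSum(const std::vector<int> &Ops) {
    int N = addNode(Ops, 0);
    for (int Op : Ops)
      Users[Op].push_back(N);
    WorkList.push_back(N); // computed lazily by the next solve()
    return N;
  }

  // The only way to request re-evaluation: enqueue the users.
  void markUsersAsChanged(int N) {
    for (int U : Users[N])
      WorkList.push_back(U);
  }

  void setLeaf(int N, long V) {
    Nodes[N].Value = V;
    markUsersAsChanged(N);
  }

  // Drain the worklist, propagating changes to users transitively.
  void solve() {
    while (!WorkList.empty()) {
      int N = WorkList.front();
      WorkList.pop_front();
      long New = 0;
      for (int Op : Nodes[N].Operands)
        New += Nodes[Op].Value;
      if (New == Nodes[N].Value)
        continue; // nothing changed, users need no revisit
      Nodes[N].Value = New;
      markUsersAsChanged(N);
    }
  }

  long value(int N) const { return Nodes[N].Value; }
};
```

After `setLeaf`, dependent values stay stale until the next `solve()`, which mirrors the idea that a pass notifies the solver and lets it update lattice values on its own schedule.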

chill added inline comments.Oct 28 2022, 6:26 PM

llvm/lib/Transforms/Scalar/SCCP.cpp
147–148	Actually, never mind, we're not replacing the instruction with a constant but with another instruction.

labrinea added inline comments.Oct 30 2022, 8:52 AM

llvm/lib/Transforms/Scalar/SCCP.cpp
143–154	Update: I tried this. It works for 'some' cases. Instead of replacing values with constants I create mappings from the old to the new value and only after all the solving is done then I replace the uses. The specialization of recursive functions doesn't work because it relies on finding allocas of constant integers. Also the rewriting of callsites doesn't work either if the actual arguments have been constant propagated prior to specialization, but the old value hasn't been replaced yet. In theory I could pass on the mappings from sccp to the specializer but it seems overly complicated to do so.

Changes from last revision:

rebased on top of main; that required adjusting a couple of pass-manager tests as the LoopInfo analysis is now used in the default pipelines by the sccp pass
lazily eraseFromParent instructions which have been replaced instead of removeFromParent and deleteValue later as suggested by @chill
used markUsersAsChanged instead of visit as suggested by @chill (required exposing it to the public interface)
moved/renamed solveWhileResolvedUndefsIn to the solver as it is required from the specializer too (fixes a bug I have added a new testcase for)
changed createSpecialization to return a pointer to the cloned function

Herald added subscribers: wenlei, steven_wu. · View Herald TranscriptOct 31 2022, 8:39 AM

chill added inline comments.Oct 31 2022, 9:39 AM

llvm/lib/Transforms/IPO/SCCP.cpp
41	This part was added for the FunctionSpecialization, if func spec is disabled maybe not pass along the LoopAnalysis?

Harbormaster completed remote builds in B195281: Diff 472022.Oct 31 2022, 9:41 AM

chill added inline comments.Nov 2 2022, 4:08 AM

llvm/lib/Transforms/IPO/SCCP.cpp
48	Now that we added `LoopAnalysis` we may well preserve it too. (I should have included it in the patch which introduced the `LoopAnalysis` here)

Good point. Running the LoopInfo analysis when we do not specialize adds unnecessary compile time overhead. After some investigation I found that notifying the users after replacing a value (instead of notifying only the old users), as well as not removing replaced instructions, both add significant compile time overhead. I am inclined to revert the latter and rework the former so that markUsersAsChanged can accept a list of users.

Changes from last revision:

Predicated the LoopInfo analysis on the cmd-line option that enables function specialization. As a result we no longer need to adjust the pass manager unit-tests.
Reverted the eraseFromParent to removeFromParent/deleteValue for replaced instructions to save compilation time.
Modified markUsersAsChanged to accept a UserList so that we avoid revisiting irrelevant users after replacing a value (saves compilation time).
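
The reason for passing an explicit user list can be shown with a toy use-list model: once all uses are transferred, the old value has no users left, while the new value may have unrelated users that need no revisit. This is a minimal sketch; `ToyValue` and `replaceAllUsesAndCollect` are hypothetical and only imitate the effect of `replaceAllUsesWith`.

```cpp
#include <cassert>
#include <vector>

struct ToyValue {
  std::vector<int> Users; // ids of the instructions that use this value
};

// Transfers all uses of Old to New and returns only the users that were
// actually rewired, so a caller can notify exactly those.
std::vector<int> replaceAllUsesAndCollect(ToyValue &Old, ToyValue &New) {
  std::vector<int> Affected = Old.Users; // snapshot before the list is emptied
  New.Users.insert(New.Users.end(), Old.Users.begin(), Old.Users.end());
  Old.Users.clear();
  return Affected;
}
```

Notifying the returned list rather than every user of the new value skips users that were already attached to it before the replacement.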

Note: The instruction count delta for CTMark with NewPM-O3+LTO between baseline (da5ded4fc9d8c8edfd4a79fa0e75c2ac9165fa7b) and this patch (with funcspec disabled) is about 0.05 % geomean.

labrinea marked 7 inline comments as done.Nov 2 2022, 10:48 AM

labrinea added inline comments.

llvm/lib/Transforms/IPO/SCCP.cpp
48	I tried this but the compiler crashes. Probably because SCCP deletes dead basic blocks.

Harbormaster completed remote builds in B195754: Diff 472680.Nov 2 2022, 11:57 AM

chill added inline comments.Nov 2 2022, 12:26 PM

llvm/lib/Transforms/IPO/FunctionSpecialization.cpp
273	`WorkList` is a too generic name for a parameter and it's not a worklist per se anyway.
388	This is a little bit fragile in the sense the caller may forget to clear the list. It would be nicer if this function itself clears the list first thing when it starts execution. `WorkList` could also be made a return, taking advantage of move semantics, which looks nicer on paper, but may cause a few allocations/deallocations if we iterate.
llvm/lib/Transforms/Scalar/SCCP.cpp
99	Wouldn't it work without the temporary vector? `markUsersAsChanged` would go over each user, look at the user's operands (including `Old`), and find the `New` (which is some constant) from the lattice values map. Thus we would maybe get just: `Solver.markUsersAsChanged(Old); Old->replaceAllUsesWith(New);`
143–154	"Instead of replacing values with constants I create mappings from the old to the new value ..." But isn't this what the `ValueState` already contains? "Also the rewriting of callsites doesn't work either if the actual arguments have been constant propagated prior to specialization, but the old value hasn't been replaced yet." Well, `FunctionSpecializer::rewriteCallSites` and everything else should look up lattice values, not work directly with operands. But OK, let's not make too many changes at once and revisit it later.
489	I would suggest not creating a vector of all the functions in the module, as there could be quite a lot of them (e.g. in LTO), which would trigger several heap allocations for `WorkList`. `solveWhileResolvedUndefIn` is quite small and could be overloaded for a `Module *` parameter. I considered making this function a template along the lines of:

template <typename RangeT> void printNames(RangeT &&R) {
  for (auto &F : R)
    llvm::dbgs() << magic(F)->getName();
}

std::vector<llvm::Function *> v;
llvm::Module *M;

int main() {
  printNames(M->functions());
  printNames(v);
}

but couldn't come up with `magic`. As for `propagateConstants`, it could be done with a few overloads as well:

static bool propagateConstants(SCCPSolver &Solver, Function *F, SmallPtrSetImpl<Instruction *> &ToDelete);

static bool propagateConstants(SCCPSolver &Solver, SmallVectorImpl<Function *> &WorkList, SmallPtrSetImpl<Instruction *> &ToDelete) {
  for (Function *F : WorkList)
    propagateConstants(Solver, F, ToDelete);
}

static bool propagateConstants(SCCPSolver &Solver, Module *M, SmallPtrSetImpl<Instruction *> &ToDelete) {
  for (auto &F : *M)
    propagateConstants(Solver, &F, ToDelete);
}
llvm/lib/Transforms/Utils/SCCPSolver.cpp
1577 ↗	(On Diff #472022)	All the functions here forward to the `Visitor`, this one should also just be forwarding. (Not sure why this proxy class exists at all, but I guess we can address it later).

This revision is somewhat different from the previous ones because we no longer replace instructions/arguments whilst in the main specialization loop of the SCCP pass. Instead we use the lattice value when rewriting callsites and when promoting constant stack values. Last time I checked this was even more lightweight than the previous revision. It successfully builds the llvm-test-suite with aggressive funcspec options; I haven't tried a clang bootstrap yet. It also improves one of the unit tests with recursive functions.

labrinea marked 4 inline comments as done.Nov 7 2022, 9:44 AM

labrinea added inline comments.

llvm/include/llvm/Transforms/IPO/FunctionSpecialization.h
70	I've removed an unused typedef from here ;)
llvm/lib/Transforms/IPO/FunctionSpecialization.cpp
224–225	I am not sure whether this is necessary. The unit tests which exercise recursion are passing without it at least.
329	We now call this function once, not for every clone as it used to be.
821–825	There might be more call sites to rewrite than those in the CallSpecBinding that we have already found, therefore we need to repeat this look up of users here, but at least it now happens once for F compared to Clones.size() times which was the case before.
825	no need to examine lattices of arguments if it's the key of the CallSpecBinding
832–833	We are modifying the list whilst traversing it, so we swap the current element with the last one and reduce the iteration range by one.
835	the condition was different before, but I think this is correct
llvm/lib/Transforms/Utils/SCCPSolver.cpp
465–483 ↗	(On Diff #473696)	I couldn't template these two. The main reason was that one iterates over `Function *` whereas the other over `Function &`.
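
One way the `Function *` versus `Function &` mismatch could be bridged is a pair of overloads that normalise either form to a reference, which is roughly the role the `magic` placeholder was meant to play in the earlier suggestion. The sketch below uses a stand-in `ToyFunction` type rather than `llvm::Function`; `asRef` and `collectNames` are hypothetical names, and this is not what the patch does.

```cpp
#include <cassert>
#include <list>
#include <string>
#include <vector>

// Stand-in for llvm::Function; only the name matters here.
struct ToyFunction {
  std::string Name;
};

// Overloads that accept either a reference or a pointer and always hand
// back a reference, so generic code can treat both kinds of range alike.
inline ToyFunction &asRef(ToyFunction &F) { return F; }
inline ToyFunction &asRef(ToyFunction *F) { return *F; }

// Works for ranges of ToyFunction and for ranges of ToyFunction *.
template <typename RangeT>
std::vector<std::string> collectNames(RangeT &&R) {
  std::vector<std::string> Names;
  for (auto &&F : R)
    Names.push_back(asRef(F).Name);
  return Names;
}
```

With this helper a single template body can walk both a container that owns the functions and a worklist of pointers into it, at the cost of one extra indirection layer to read.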

Harbormaster completed remote builds in B196517: Diff 473696.Nov 7 2022, 11:32 AM

chill added inline comments.Nov 8 2022, 8:02 AM

llvm/lib/Transforms/IPO/FunctionSpecialization.cpp
826–835	You can swap the order of the loops and get rid of `CallSiteToRewrite`.

chill added inline comments.Nov 8 2022, 8:30 AM

llvm/lib/Transforms/IPO/FunctionSpecialization.cpp
826–835	Maybe I'm getting a bit ahead of myself, but you can change the function to rewrite just a single call site and factor out the iteration over call sites. The benefit is that the function becomes more reusable, as you can independently choose the set of call sites it operates upon. (Incidentally, I'm planning to use it that way, but it is generally a good change, even if what I have in mind turns out not to work.)

labrinea added inline comments.Nov 9 2022, 12:52 AM

llvm/lib/Transforms/IPO/FunctionSpecialization.cpp
826–835	You mean to traverse F's users here instead? We are iterating over them while modifying them, which is the reason why CallSiteToRewrite existed in the first place, I believe. Also, the dynamic cast and other checks we do, `CS->getCalledFunction() == F && Solver.isBlockExecutable(CS->getParent())` (note: this one is missing from the current revision), don't need to be repeated on every iteration of the outer loop which walks the specializations.
826–835	If I change it the way you suggest it will regress the current behaviour. Qsort() from SPEC's mcf won't specialize (it's a recursive function) and it's been quite a driver for this work. What if you alter it once you refactor how we rewrite callsites?

chill added inline comments.Nov 9 2022, 2:23 AM

llvm/lib/Transforms/IPO/FunctionSpecialization.cpp
826–835	What we have now is:

forall S : Specialisations {
  forall C : CallSites {
    do_stuff(S, C);
  }
}

What I'm suggesting is to reorder the loops:

forall C : CallSites {
  forall S : Specialisations {
    do_stuff(S, C);
  }
}

That'll avoid the swaps/pop_back. The vector itself stays, good point. And then I'm suggesting to move the outer loop out of the function:

func foo() {
  ...
  forall C : CallSites {
    rewriteCallSite(C)
  }
  ...
}

func rewriteCallSite(C) {
  forall S : Specialisations {
    do_stuff(S, C);
  }
}

Both are NFC.

"Also the dynamic cast and other checks we do: CS->getCalledFunction() == F && Solver.isBlockExecutable(CS->getParent()) (note: this one is missing from the current revision) don't need to be repeated on every iteration of the outer loop which walks the specializations."

I don't understand this. We do nothing between loops, so interchanging them will execute exactly the same operations in the loop body. If you add these checks somewhere, it's another argument to move the iteration over call sites to the outer loop.

chill added inline comments.Nov 9 2022, 3:49 AM

llvm/lib/Transforms/IPO/FunctionSpecialization.cpp
830	This is better placed outside of `rewriteCallSites`, perhaps just after the call to `rewriteCallSites`.

chill added inline comments.Nov 9 2022, 4:24 AM

llvm/lib/Transforms/IPO/FunctionSpecialization.cpp
266	I believe the LLVM convention for these kinds of classes/methods is `run`, e.g. `Vectorizer::run()`, `EarlyCSE::run()`, etc.

chill added inline comments.Nov 10 2022, 2:13 AM

llvm/lib/Transforms/IPO/FunctionSpecialization.cpp
830	Or a better idea: get the initial size of `CallSitesToRewrite`, decrement that number every time you update a call site. At the end if this number drops to zero mark the function unreachable.

chill added inline comments.Nov 10 2022, 2:26 AM

llvm/lib/Transforms/IPO/FunctionSpecialization.cpp
821	Use braces around the `for`, since there are more than two levels of nesting.

labrinea added inline comments.Nov 10 2022, 5:28 AM

llvm/lib/Transforms/IPO/FunctionSpecialization.cpp
830	that won't work for dead recursive functions

Changes from last revision:

renamed specialize() to run() as suggested
renamed rewriteCallSites to updateCallSites
interchanged the loops in updateCallSites as suggested
update callsites of specializations before anything else (changes the test identical-specializations.ll)
used braces for nested loop
clang-formatted

labrinea marked 5 inline comments as done.Nov 10 2022, 5:44 AM

Harbormaster completed remote builds in B197069: Diff 474522.Nov 10 2022, 6:40 AM

Found a bug in the last revision. The swapping idiom violates the expected order of traversing the call sites to update. They are supposed to be sorted by gain.
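
The ordering problem with the swapping idiom can be reproduced in isolation. The sketch below contrasts constant-time swap-and-pop removal with an order-preserving erase on a vector sorted by descending gain; `swapAndPop` and `eraseStable` are hypothetical helper names, not functions from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// O(1) removal: overwrite slot I with the last element, then shrink.
// The relative order of the remaining elements is NOT preserved.
template <typename T> void swapAndPop(std::vector<T> &V, std::size_t I) {
  std::swap(V[I], V.back());
  V.pop_back();
}

// O(n) removal that keeps the remaining elements in their original order.
template <typename T> void eraseStable(std::vector<T> &V, std::size_t I) {
  V.erase(V.begin() + static_cast<std::ptrdiff_t>(I));
}
```

Removing the first element of `{40, 30, 20, 10}` with `swapAndPop` leaves `{10, 30, 20}`, so the lowest gain is now visited first, whereas `eraseStable` leaves `{30, 20, 10}`.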

Harbormaster completed remote builds in B197086: Diff 474555.Nov 10 2022, 9:28 AM

Changes from last revision:

Used a counter for updated callsites instead of revising them at the end to identify dead functions.
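
The counter approach can be sketched on a toy model of call sites: start from the number of call sites and decrement it for each one redirected to a specialization; if it reaches zero, the original function has no callers left. All names here (`ToyCallSite`, `updateCallSites`, the map from constant argument to specialization id) are hypothetical, and the sketch ignores the recursive-function case raised earlier in the review.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

struct ToyCallSite {
  int ConstArg; // constant actual argument at this call site
  int Callee;   // 0 = original function, otherwise a specialization id
};

// Redirects call sites whose argument has a matching specialization.
// Returns true if no call site still targets the original function.
bool updateCallSites(std::vector<ToyCallSite> &CallSites,
                     const std::map<int, int> &SpecForArg) {
  std::size_t Remaining = CallSites.size();
  for (ToyCallSite &CS : CallSites) {
    auto It = SpecForArg.find(CS.ConstArg);
    if (It == SpecForArg.end())
      continue; // no specialization for this argument, keep the original
    CS.Callee = It->second;
    --Remaining;
  }
  return Remaining == 0;
}
```

Compared with rescanning the users after the rewrite, the counter answers the "is the original now dead?" question in the same pass over the call sites.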

Harbormaster completed remote builds in B197169: Diff 474673.Nov 11 2022, 12:49 AM

labrinea added inline comments.Nov 14 2022, 1:47 AM

llvm/lib/Transforms/IPO/FunctionSpecialization.cpp
815	I'll rename this and add a comment to explain what it is used for.

Changes from last revision:

rebased
renamed the variable which determines whether we decrement the counter of left callsites to replace
cached the cloned function pointer to the SpecializationInfo to avoid keeping the Clone vector in sync with the Specialization vector when replacing callsites

Harbormaster completed remote builds in B198028: Diff 475858.Nov 16 2022, 10:12 AM

Also removed the immediate replacement of known callsites. It's better to lookup the lattice value instead. This fixed a compile time regression possibly caused by otherwise dead duplicate specializations that had to be processed by later passes.

labrinea mentioned this in D135463: [FuncSpec] Do not generate multiple copies for identical specializations..Nov 16 2022, 10:18 AM

labrinea added a child revision: D135463: [FuncSpec] Do not generate multiple copies for identical specializations..

Would you, please, mark as done the no longer relevant comments?
I think the only issue left is with the circular dependency between libraries.

In D126455#3936611, @chill wrote:

Would you, please, mark as done the no longer relevant comments?
I think the only issue left is with the circular dependency between libraries.

Ack

labrinea mentioned this in D138654: [IPSCCP] Move the IPSCCP run function under the IPO directory..Nov 24 2022, 4:15 AM

Removed the cyclic dependency between LLVMipo and LLVMScalarOpts and rebased on top of D138654.

labrinea added a parent revision: D138654: [IPSCCP] Move the IPSCCP run function under the IPO directory..Nov 24 2022, 6:17 AM

Harbormaster completed remote builds in B199410: Diff 477766.Nov 24 2022, 6:17 AM

LGTM, but let's give a chance for other people to have a look too. @sinan @fhahn

rebase
migrated all funcspec tests to use the -passes= cmdline option

Harbormaster completed remote builds in B200536: Diff 479303.Dec 1 2022, 11:21 AM

aeubanks added a subscriber: aeubanks.Dec 2 2022, 4:16 PM

chill accepted this revision.Dec 5 2022, 1:30 AM

This revision is now accepted and ready to land.Dec 5 2022, 1:30 AM

fhahn added inline comments.Dec 5 2022, 2:17 AM

llvm/lib/Transforms/IPO/SCCP.cpp
25	move this to the loop below, which uses it
llvm/lib/Transforms/Utils/SCCPSolver.cpp
266 ↗	(On Diff #479303)	I think DomTreeUpdater provides a constructor that doesn't take a DT which could be used unconditionally instead of having all those `if (DTU)` checks spread out across various functions.

Moved a flag close to its use.
Used a DomTreeUpdater without DT/PDT for cloned functions.

labrinea marked 2 inline comments as done.Dec 5 2022, 4:41 AM

labrinea mentioned this in D128827: [WIP][SCCP] Don't track specialized functions unless they are recursive..Dec 5 2022, 5:15 AM

labrinea mentioned this in D128825: [SCCP] Add API for updating the state of the Solver..

labrinea mentioned this in D128824: [SCCP] Add API for AdditionalUsers to the Instruction Visitor..

labrinea mentioned this in D128823: [SCCP] Make it possible to remove predicate info for a given instruction..

labrinea mentioned this in D126456: [SCCP] Notify the Solver when an instruction is removed..

Harbormaster completed remote builds in B201073: Diff 480052.Dec 5 2022, 7:16 AM

chill added a child revision: D139346: [FuncSpec] Global ranking of specialisations.Dec 5 2022, 10:15 AM

This revision was landed with ongoing or failed builds.Dec 8 2022, 4:23 AM

Closed by commit rG877a9f9abec6: [FuncSpec] Make the Function Specializer part of the IPSCCP pass. (authored by labrinea). · Explain Why

This revision was automatically updated to reflect the committed changes.

labrinea mentioned this in rG42c2dc401742: [IPSCCP] Move the IPSCCP run function under the IPO directory..

labrinea added a commit: rG877a9f9abec6: [FuncSpec] Make the Function Specializer part of the IPSCCP pass..

labrinea added a reverting change: rG0f0cb92cb2ad: Revert "[FuncSpec] Make the Function Specializer part of the IPSCCP pass.".Dec 8 2022, 4:42 AM

labrinea added a commit: rG8136a0172b3c: [FuncSpec] Make the Function Specializer part of the IPSCCP pass..Dec 10 2022, 6:50 AM

Revision Contents

Path

Size

llvm/

include/

llvm/

InitializePasses.h

1 line

LinkAllPasses.h

1 line

Transforms/

IPO.h

5 lines

IPO/

FunctionSpecialization.h

180 lines

SCCP.h

8 lines

Scalar/

SCCP.h

9 lines

lib/

Passes/

PassBuilderPipelines.cpp

6 lines

PassRegistry.def

1 line

Transforms/

IPO/

FunctionSpecialization.cpp

1045 lines

IPO.cpp

1 line

PassManagerBuilder.cpp

8 lines

SCCP.cpp

108 lines

Scalar/

CMakeLists.txt

1 line

SCCP.cpp

35 lines

utils/

gn/

secondary/

llvm/

lib/

Transforms/

Scalar/

BUILD.gn

1 line

Diff 432226

llvm/include/llvm/InitializePasses.h

  [...]
  void initializeFixIrreduciblePass(PassRegistry &);
  void initializeFixupStatepointCallerSavedPass(PassRegistry&);
  void initializeFlattenCFGLegacyPassPass(PassRegistry &);
  void initializeFloat2IntLegacyPassPass(PassRegistry&);
  void initializeForceFunctionAttrsLegacyPassPass(PassRegistry&);
  void initializeForwardControlFlowIntegrityPass(PassRegistry&);
  void initializeFuncletLayoutPass(PassRegistry&);
  void initializeFunctionImportLegacyPassPass(PassRegistry&);
- void initializeFunctionSpecializationLegacyPassPass(PassRegistry &);
  void initializeGCMachineCodeAnalysisPass(PassRegistry&);
  void initializeGCModuleInfoPass(PassRegistry&);
  void initializeGVNHoistLegacyPassPass(PassRegistry&);
  void initializeGVNLegacyPassPass(PassRegistry&);
  void initializeGVNSinkLegacyPassPass(PassRegistry&);
  void initializeGlobalDCELegacyPassPass(PassRegistry&);
  void initializeGlobalMergePass(PassRegistry&);
  void initializeGlobalOptLegacyPassPass(PassRegistry&);
  [...]

llvm/include/llvm/LinkAllPasses.h

Show First 20 Lines • Show All 225 Lines • ▼ Show 20 Lines	ForcePassLinking() {
(void) llvm::createFloat2IntPass();		(void) llvm::createFloat2IntPass();
(void) llvm::createEliminateAvailableExternallyPass();		(void) llvm::createEliminateAvailableExternallyPass();
(void)llvm::createScalarizeMaskedMemIntrinLegacyPass();		(void)llvm::createScalarizeMaskedMemIntrinLegacyPass();
(void) llvm::createWarnMissedTransformationsPass();		(void) llvm::createWarnMissedTransformationsPass();
(void) llvm::createHardwareLoopsPass();		(void) llvm::createHardwareLoopsPass();
(void) llvm::createInjectTLIMappingsLegacyPass();		(void) llvm::createInjectTLIMappingsLegacyPass();
(void) llvm::createUnifyLoopExitsPass();		(void) llvm::createUnifyLoopExitsPass();
(void) llvm::createFixIrreduciblePass();		(void) llvm::createFixIrreduciblePass();
(void)llvm::createFunctionSpecializationPass();
(void)llvm::createSelectOptimizePass();		(void)llvm::createSelectOptimizePass();

(void)new llvm::IntervalPartition();		(void)new llvm::IntervalPartition();
(void)new llvm::ScalarEvolutionWrapperPass();		(void)new llvm::ScalarEvolutionWrapperPass();
llvm::Function::Create(nullptr, llvm::GlobalValue::ExternalLinkage)->viewCFGOnly();		llvm::Function::Create(nullptr, llvm::GlobalValue::ExternalLinkage)->viewCFGOnly();
llvm::RGPassManager RGM;		llvm::RGPassManager RGM;
llvm::TargetLibraryInfoImpl TLII;		llvm::TargetLibraryInfoImpl TLII;
llvm::TargetLibraryInfo TLI(TLII);		llvm::TargetLibraryInfo TLI(TLII);

llvm/include/llvm/Transforms/IPO.h

	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	/// createIPSCCPPass - This pass propagates constants from call sites into the			/// createIPSCCPPass - This pass propagates constants from call sites into the
	/// bodies of functions, and keeps track of whether basic blocks are executable			/// bodies of functions, and keeps track of whether basic blocks are executable
	/// in the process.			/// in the process.
	///			///
	ModulePass *createIPSCCPPass();			ModulePass *createIPSCCPPass();

	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	/// createFunctionSpecializationPass - This pass propagates constants from call
	/// sites to the specialized version of the callee function.
	ModulePass *createFunctionSpecializationPass();

	//===----------------------------------------------------------------------===//
	//			//
	/// createLoopExtractorPass - This pass extracts all natural loops from the			/// createLoopExtractorPass - This pass extracts all natural loops from the
	/// program into a function if it can.			/// program into a function if it can.
	///			///
	Pass *createLoopExtractorPass();			Pass *createLoopExtractorPass();

	/// createSingleLoopExtractorPass - This pass extracts one natural loop from the			/// createSingleLoopExtractorPass - This pass extracts one natural loop from the
	/// program into a function if it can. This is used by bugpoint.			/// program into a function if it can. This is used by bugpoint.

llvm/include/llvm/Transforms/IPO/FunctionSpecialization.h

This file was added.

				//===- FunctionSpecialization.h - Function Specialization -----------------===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				//
				// This specialises functions with constant parameters. Constant parameters
				// like function pointers and constant globals are propagated to the callee by
				// specializing the function. The main benefit of this pass at the moment is
				// that indirect calls are transformed into direct calls, which provides inline
				// opportunities that the inliner would not have been able to achieve. That's
				// why function specialisation is run before the inliner in the optimisation
				// pipeline; that is by design. Otherwise, we would only benefit from constant
				// passing, which is a valid use-case too, but hasn't been explored much in
				// terms of performance uplifts, cost-model and compile-time impact.
				//
				// Current limitations:
				// - It does not yet handle integer ranges. We do support "literal constants",
				// but that's off by default under an option.
				// - The cost-model could be further looked into (it mainly focuses on inlining
				// benefits),
				//
				// Ideas:
				// - With a function specialization attribute for arguments, we could have
				// a direct way to steer function specialization, avoiding the cost-model,
				// and thus control compile-times / code-size.
				//
				// Todos:
				// - Specializing recursive functions relies on running the transformation a
				// number of times, which is controlled by option
				// `func-specialization-max-iters`. Thus, increasing this value and the
				// number of iterations, will linearly increase the number of times recursive
				// functions get specialized, see also the discussion in
				// https://reviews.llvm.org/D106426 for details. Perhaps there is a
				// compile-time friendlier way to control/limit the number of specialisations
				// for recursive functions.
				// - Don't transform the function if function specialization does not trigger;
				// the SCCPSolver may make IR changes.
				//
				// References:
				// - 2021 LLVM Dev Mtg “Introducing function specialisation, and can we enable
				// it by default?”, https://www.youtube.com/watch?v=zJiCjeXgV5Q
				//
				//===----------------------------------------------------------------------===//

				#ifndef LLVM_TRANSFORMS_IPO_FUNCTIONSPECIALIZATION_H
				#define LLVM_TRANSFORMS_IPO_FUNCTIONSPECIALIZATION_H

				#include "llvm/Analysis/CodeMetrics.h"
				#include "llvm/Analysis/InlineCost.h"
				#include "llvm/Analysis/LoopInfo.h"
				#include "llvm/Analysis/TargetTransformInfo.h"
				#include "llvm/Transforms/Scalar/SCCP.h"
				#include "llvm/Transforms/Utils/Cloning.h"
				#include "llvm/Transforms/Utils/SCCPSolver.h"
				#include "llvm/Transforms/Utils/SizeOpts.h"

				using namespace llvm;

				namespace llvm {
				// Bookkeeping struct to pass data from the analysis and profitability phase
				// to the actual transform helper functions.
				struct SpecializationInfo {
				SmallVector<ArgInfo, 8> Args; // Stores the {formal,actual} argument pairs.
				InstructionCost Gain; // Profitability: Gain = Bonus - Cost.
				};

				using FuncList = SmallVectorImpl<Function *>;
				Inline comment (labrinea, author): I've removed an unused typedef from here ;)
				using CallArgBinding = std::pair<CallBase *, Constant *>;
				using CallSpecBinding = std::pair<CallBase *, SpecializationInfo>;
				// We are using MapVector because it guarantees deterministic iteration
				// order across executions.
				using SpecializationMap = SmallMapVector<CallBase *, SpecializationInfo, 8>;

				class FunctionSpecializer {

				/// The IPSCCP Solver.
				SCCPSolver &Solver;

				Module &M;

				/// Analyses used to help determine if a function should be specialized.
				std::function<AssumptionCache &(Function &)> GetAC;
				std::function<TargetTransformInfo &(Function &)> GetTTI;
				std::function<const TargetLibraryInfo &(Function &)> GetTLI;

				// The number of functions specialised, used for collecting statistics and
				// also in the cost model.
				unsigned NbFunctionsSpecialized = 0;

				SmallPtrSet<Function *, 4> SpecializedFuncs;
				SmallPtrSet<Function *, 4> FullySpecialized;
				DenseMap<Function *, CodeMetrics> FunctionMetrics;

				public:
				FunctionSpecializer(SCCPSolver &Solver, Module &M,
				std::function<AssumptionCache &(Function &)> GetAC,
				std::function<TargetTransformInfo &(Function &)> GetTTI,
				std::function<const TargetLibraryInfo &(Function &)> GetTLI)
				: Solver(Solver), M(M), GetAC(GetAC), GetTTI(GetTTI), GetTLI(GetTLI) {}

				~FunctionSpecializer() { removeDeadFunctions(); }

				bool specialize(FuncList &FuncDecls);

				private:
				/// Iterate over the argument tracked functions see if there
				/// are any new constant values for the call instruction via
				/// stack variables.
				void propagateConstantArgs(FuncList &WorkList);

				/// Attempt to specialize functions in the module to enable constant
				/// propagation across function boundaries.
				///
				/// \returns true if at least one function is specialized.
				bool specializeFunctions(FuncList &Candidates, FuncList &WorkList);

				/// Clean up fully specialized functions.
				void removeDeadFunctions();

				// Compute the code metrics for function \p F.
				CodeMetrics &analyzeFunction(Function *F);

				/// This function decides whether it's worthwhile to specialize function
				/// \p F based on the known constant values its arguments can take on. It
				/// only discovers potential specialization opportunities without actually
				/// applying them.
				///
				/// \returns true if any specializations have been found.
				bool calculateGains(Function *F, InstructionCost Cost,
				SmallVectorImpl<CallSpecBinding> &WorkList);

				bool isCandidateFunction(Function *F);

				void specializeFunction(Function *F, SpecializationInfo &S,
				FuncList &WorkList);

				/// Compute and return the cost of specializing function \p F.
				InstructionCost getSpecializationCost(Function *F);

				/// Compute a bonus for replacing argument \p A with constant \p C.
				InstructionCost getSpecializationBonus(Argument *A, Constant *C);

				/// Determine if we should specialize a function based on the incoming values
				/// of the given argument.
				///
				/// This function implements the goal-directed heuristic. It determines if
				/// specializing the function based on the incoming values of argument \p A
				/// would result in any significant optimization opportunities. If
				/// optimization opportunities exist, the constant values of \p A on which to
				/// specialize the function are collected in \p Constants.
				///
				/// \returns true if the function should be specialized on the given
				/// argument.
				bool isArgumentInteresting(Argument *A,
				SmallVectorImpl<CallArgBinding> &Constants);

				/// Collect in \p Constants all the constant values that argument \p A can
				/// take on.
				void getPossibleConstants(Argument *A,
				SmallVectorImpl<CallArgBinding> &Constants);

				/// Rewrite calls to function \p F to call function \p Clone instead.
				///
				/// This function modifies calls to function \p F as long as the actual
				/// arguments match those in \p Args. Note that for recursive calls we
				/// need to compare against the cloned formal arguments.
				///
				/// Callsites that have been marked with the MinSize function attribute won't
				/// be specialized and rewritten.
				void rewriteCallSites(Function *Clone, const SmallVectorImpl<ArgInfo> &Args,
				ValueToValueMapTy &Mappings);

				void updateSpecializedFuncs(FuncList &Candidates, FuncList &WorkList);
				};
				} // namespace

				#endif // LLVM_TRANSFORMS_IPO_FUNCTIONSPECIALIZATION_H
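
The header comment above says the main benefit of specialization is turning indirect calls into direct calls that the inliner can then handle. As a rough illustration only, here is a hand-written C++ analogue of that transformation. The pass itself works on LLVM IR, and every name below (`applyTwice`, `increment`, `applyTwiceIncrement`, `callerBefore`, `callerAfter`) is hypothetical and does not appear in the patch.

```cpp
// Hypothetical illustration; not code from this patch and not LLVM API.
using UnaryFn = int (*)(int);

int increment(int X) { return X + 1; }

// Generic version: the callee is only known through a function pointer
// argument, so both calls inside the body are indirect.
int applyTwice(UnaryFn Fn, int X) { return Fn(Fn(X)); }

// Hand-written "specialization" of applyTwice for Fn == increment.
// The indirect calls have become direct calls, which an inliner can fold.
int applyTwiceIncrement(int X) { return increment(increment(X)); }

// A call site that passes the constant `increment` ...
int callerBefore(int X) { return applyTwice(increment, X); }

// ... can be rewritten to call the specialized clone instead.
int callerAfter(int X) { return applyTwiceIncrement(X); }
```

Both callers compute the same value; only the shape of the call graph changes, which is what gives a later inlining step something to work with.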

llvm/include/llvm/Transforms/IPO/SCCP.h

	class Module;			class Module;

	/// Pass to perform interprocedural constant propagation.			/// Pass to perform interprocedural constant propagation.
	class IPSCCPPass : public PassInfoMixin<IPSCCPPass> {			class IPSCCPPass : public PassInfoMixin<IPSCCPPass> {
	public:			public:
	PreservedAnalyses run(Module &M, ModuleAnalysisManager &AM);			PreservedAnalyses run(Module &M, ModuleAnalysisManager &AM);
	};			};

	/// Pass to perform interprocedural constant propagation by specializing
	/// functions
	class FunctionSpecializationPass
	: public PassInfoMixin<FunctionSpecializationPass> {
	public:
	PreservedAnalyses run(Module &M, ModuleAnalysisManager &AM);
	};

	} // end namespace llvm			} // end namespace llvm

	#endif // LLVM_TRANSFORMS_IPO_SCCP_H			#endif // LLVM_TRANSFORMS_IPO_SCCP_H

llvm/include/llvm/Transforms/Scalar/SCCP.h

	/// This pass performs function-level constant propagation and merging.			/// This pass performs function-level constant propagation and merging.
	class SCCPPass : public PassInfoMixin<SCCPPass> {			class SCCPPass : public PassInfoMixin<SCCPPass> {
	public:			public:
	PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM);			PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM);
	};			};

	bool runIPSCCP(Module &M, const DataLayout &DL,			bool runIPSCCP(Module &M, const DataLayout &DL,
	std::function<const TargetLibraryInfo &(Function &)> GetTLI,			std::function<const TargetLibraryInfo &(Function &)> GetTLI,
	function_ref<AnalysisResultsForFn(Function &)> getAnalysis);

	bool runFunctionSpecialization(
	Module &M, const DataLayout &DL,
	std::function<TargetLibraryInfo &(Function &)> GetTLI,
	std::function<TargetTransformInfo &(Function &)> GetTTI,			std::function<TargetTransformInfo &(Function &)> GetTTI,
	std::function<AssumptionCache &(Function &)> GetAC,			std::function<AssumptionCache &(Function &)> GetAC,
	function_ref<AnalysisResultsForFn(Function &)> GetAnalysis);			function_ref<AnalysisResultsForFn(Function &)> getAnalysis);
	} // end namespace llvm			} // end namespace llvm

	#endif // LLVM_TRANSFORMS_SCALAR_SCCP_H			#endif // LLVM_TRANSFORMS_SCALAR_SCCP_H

llvm/lib/Passes/PassBuilderPipelines.cpp

PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level,
// post link pipeline after ICP. This is to enable usage of the type		// post link pipeline after ICP. This is to enable usage of the type
// tests in ICP sequences.		// tests in ICP sequences.
if (Phase == ThinOrFullLTOPhase::ThinLTOPostLink)		if (Phase == ThinOrFullLTOPhase::ThinLTOPostLink)
MPM.addPass(LowerTypeTestsPass(nullptr, nullptr, true));		MPM.addPass(LowerTypeTestsPass(nullptr, nullptr, true));

for (auto &C : PipelineEarlySimplificationEPCallbacks)		for (auto &C : PipelineEarlySimplificationEPCallbacks)
C(MPM, Level);		C(MPM, Level);

// Specialize functions with IPSCCP.
if (EnableFunctionSpecialization && Level == OptimizationLevel::O3)
MPM.addPass(FunctionSpecializationPass());

// Interprocedural constant propagation now that basic cleanup has occurred		// Interprocedural constant propagation now that basic cleanup has occurred
// and prior to optimizing globals.		// and prior to optimizing globals.
// FIXME: This position in the pipeline hasn't been carefully considered in		// FIXME: This position in the pipeline hasn't been carefully considered in
// years, it should be re-analyzed.		// years, it should be re-analyzed.
MPM.addPass(IPSCCPPass());		MPM.addPass(IPSCCPPass());

// Attach metadata to indirect call sites indicating the set of functions		// Attach metadata to indirect call sites indicating the set of functions
// they may target at run-time. This should follow IPSCCP.		// they may target at run-time. This should follow IPSCCP.
▲ Show 20 Lines • Show All 575 Lines • ▼ Show 20 Lines	if (Level.getSpeedupLevel() > 1) {

// Indirect call promotion. This should promote all the targets that are		// Indirect call promotion. This should promote all the targets that are
// left by the earlier promotion pass that promotes intra-module targets.		// left by the earlier promotion pass that promotes intra-module targets.
// This two-step promotion is to save the compile time. For LTO, it should		// This two-step promotion is to save the compile time. For LTO, it should
// produce the same result as if we only do promotion here.		// produce the same result as if we only do promotion here.
MPM.addPass(PGOIndirectCallPromotion(		MPM.addPass(PGOIndirectCallPromotion(
true /* InLTO */, PGOOpt && PGOOpt->Action == PGOOptions::SampleUse));		true /* InLTO */, PGOOpt && PGOOpt->Action == PGOOptions::SampleUse));

if (EnableFunctionSpecialization && Level == OptimizationLevel::O3)
MPM.addPass(FunctionSpecializationPass());
// Propagate constants at call sites into the functions they call. This		// Propagate constants at call sites into the functions they call. This
// opens opportunities for globalopt (and inlining) by substituting function		// opens opportunities for globalopt (and inlining) by substituting function
// pointers passed as arguments to direct uses of functions.		// pointers passed as arguments to direct uses of functions.
MPM.addPass(IPSCCPPass());		MPM.addPass(IPSCCPPass());

// Attach metadata to indirect call sites indicating the set of functions		// Attach metadata to indirect call sites indicating the set of functions
// they may target at run-time. This should follow IPSCCP.		// they may target at run-time. This should follow IPSCCP.
MPM.addPass(CalledValuePropagationPass());		MPM.addPass(CalledValuePropagationPass());

llvm/lib/Passes/PassRegistry.def

	MODULE_PASS("cross-dso-cfi", CrossDSOCFIPass())			MODULE_PASS("cross-dso-cfi", CrossDSOCFIPass())
	MODULE_PASS("deadargelim", DeadArgumentEliminationPass())			MODULE_PASS("deadargelim", DeadArgumentEliminationPass())
	MODULE_PASS("debugify", NewPMDebugifyPass())			MODULE_PASS("debugify", NewPMDebugifyPass())
	MODULE_PASS("dot-callgraph", CallGraphDOTPrinterPass())			MODULE_PASS("dot-callgraph", CallGraphDOTPrinterPass())
	MODULE_PASS("elim-avail-extern", EliminateAvailableExternallyPass())			MODULE_PASS("elim-avail-extern", EliminateAvailableExternallyPass())
	MODULE_PASS("extract-blocks", BlockExtractorPass())			MODULE_PASS("extract-blocks", BlockExtractorPass())
	MODULE_PASS("forceattrs", ForceFunctionAttrsPass())			MODULE_PASS("forceattrs", ForceFunctionAttrsPass())
	MODULE_PASS("function-import", FunctionImportPass())			MODULE_PASS("function-import", FunctionImportPass())
	MODULE_PASS("function-specialization", FunctionSpecializationPass())
	MODULE_PASS("globaldce", GlobalDCEPass())			MODULE_PASS("globaldce", GlobalDCEPass())
	MODULE_PASS("globalopt", GlobalOptPass())			MODULE_PASS("globalopt", GlobalOptPass())
	MODULE_PASS("globalsplit", GlobalSplitPass())			MODULE_PASS("globalsplit", GlobalSplitPass())
	MODULE_PASS("hotcoldsplit", HotColdSplittingPass())			MODULE_PASS("hotcoldsplit", HotColdSplittingPass())
	MODULE_PASS("inferattrs", InferFunctionAttrsPass())			MODULE_PASS("inferattrs", InferFunctionAttrsPass())
	MODULE_PASS("inliner-wrapper", ModuleInlinerWrapperPass())			MODULE_PASS("inliner-wrapper", ModuleInlinerWrapperPass())
	MODULE_PASS("print<inline-advisor>", InlineAdvisorAnalysisPrinterPass(dbgs()))			MODULE_PASS("print<inline-advisor>", InlineAdvisorAnalysisPrinterPass(dbgs()))
	MODULE_PASS("inliner-wrapper-no-mandatory-first", ModuleInlinerWrapperPass(			MODULE_PASS("inliner-wrapper-no-mandatory-first", ModuleInlinerWrapperPass(

llvm/lib/Transforms/IPO/FunctionSpecialization.cpp

#include "llvm/ADT/Statistic.h"		#include "llvm/ADT/Statistic.h"
#include "llvm/Analysis/CodeMetrics.h"		#include "llvm/Analysis/CodeMetrics.h"
#include "llvm/Analysis/InlineCost.h"		#include "llvm/Analysis/InlineCost.h"
#include "llvm/Analysis/LoopInfo.h"		#include "llvm/Analysis/LoopInfo.h"
#include "llvm/Analysis/TargetTransformInfo.h"		#include "llvm/Analysis/TargetTransformInfo.h"
#include "llvm/Analysis/ValueLattice.h"		#include "llvm/Analysis/ValueLattice.h"
#include "llvm/Analysis/ValueLatticeUtils.h"		#include "llvm/Analysis/ValueLatticeUtils.h"
#include "llvm/IR/IntrinsicInst.h"		#include "llvm/IR/IntrinsicInst.h"
		#include "llvm/Transforms/IPO/FunctionSpecialization.h"
#include "llvm/Transforms/Scalar/SCCP.h"		#include "llvm/Transforms/Scalar/SCCP.h"
#include "llvm/Transforms/Utils/Cloning.h"		#include "llvm/Transforms/Utils/Cloning.h"
#include "llvm/Transforms/Utils/SCCPSolver.h"		#include "llvm/Transforms/Utils/SCCPSolver.h"
#include "llvm/Transforms/Utils/SizeOpts.h"		#include "llvm/Transforms/Utils/SizeOpts.h"
#include <cmath>		#include <cmath>

using namespace llvm;		using namespace llvm;

Show All 39 Lines
//		//
// https://llvm-compile-time-tracker.com		// https://llvm-compile-time-tracker.com
// https://github.com/nikic/llvm-compile-time-tracker		// https://github.com/nikic/llvm-compile-time-tracker
static cl::opt<bool> EnableSpecializationForLiteralConstant(		static cl::opt<bool> EnableSpecializationForLiteralConstant(
"function-specialization-for-literal-constant", cl::init(false), cl::Hidden,		"function-specialization-for-literal-constant", cl::init(false), cl::Hidden,
cl::desc("Enable specialization of functions that take a literal constant "		cl::desc("Enable specialization of functions that take a literal constant "
"as an argument."));		"as an argument."));

namespace {
// Bookkeeping struct to pass data from the analysis and profitability phase
// to the actual transform helper functions.
struct SpecializationInfo {
SmallVector<ArgInfo, 8> Args; // Stores the {formal,actual} argument pairs.
InstructionCost Gain; // Profitability: Gain = Bonus - Cost.
};
} // Anonymous namespace

using FuncList = SmallVectorImpl<Function *>;
using CallArgBinding = std::pair<CallBase *, Constant *>;
using CallSpecBinding = std::pair<CallBase *, SpecializationInfo>;
// We are using MapVector because it guarantees deterministic iteration
// order across executions.
using SpecializationMap = SmallMapVector<CallBase *, SpecializationInfo, 8>;

// Helper to check if \p LV is either a constant or a constant		// Helper to check if \p LV is either a constant or a constant
// range with a single element. This should cover exactly the same cases as the		// range with a single element. This should cover exactly the same cases as the
// old ValueLatticeElement::isConstant() and is intended to be used in the		// old ValueLatticeElement::isConstant() and is intended to be used in the
// transition to ValueLatticeElement.		// transition to ValueLatticeElement.
static bool isConstant(const ValueLatticeElement &LV) {		static bool isConstant(const ValueLatticeElement &LV) {
return LV.isConstant() ||		return LV.isConstant() ||
(LV.isConstantRange() && LV.getConstantRange().isSingleElement());		(LV.isConstantRange() && LV.getConstantRange().isSingleElement());
}		}
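
As a reading aid for the `isConstant` helper above, the following is a minimal standalone sketch of the same check on a toy lattice type. `ToyLattice`, `makeConstant`, `makeRange`, and `toyIsConstant` are hypothetical names invented for this sketch; the type is not LLVM's `ValueLatticeElement`, and the half-open range representation is an assumption made for illustration.

```cpp
#include <cstdint>

// Hypothetical, simplified stand-in for a lattice value.
struct ToyLattice {
  enum Kind { Unknown, Constant, Range, Overdefined };
  Kind K;
  int64_t Lower; // inclusive lower bound, meaningful only for Range
  int64_t Upper; // exclusive upper bound, meaningful only for Range
};

ToyLattice makeConstant() { return ToyLattice{ToyLattice::Constant, 0, 0}; }

ToyLattice makeRange(int64_t Lo, int64_t Hi) {
  return ToyLattice{ToyLattice::Range, Lo, Hi};
}

// True for a plain constant, or for a range [Lower, Upper) that holds
// exactly one value.
bool toyIsConstant(const ToyLattice &LV) {
  return LV.K == ToyLattice::Constant ||
         (LV.K == ToyLattice::Range && LV.Upper - LV.Lower == 1);
}
```

Under these assumptions, a range such as [4, 5) is treated like the constant 4, while [4, 6) is not treated as a constant.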
Show All 27 Lines	for (auto *User : Alloca->users()) {
return nullptr;		return nullptr;
}		}
return dyn_cast_or_null<Constant>(StoreValue);		return dyn_cast_or_null<Constant>(StoreValue);
}		}

// A constant stack value is an AllocaInst that has a single constant		// A constant stack value is an AllocaInst that has a single constant
// value stored to it. Return this constant if such an alloca stack value		// value stored to it. Return this constant if such an alloca stack value
// is a function argument.		// is a function argument.
static Constant *getConstantStackValue(CallInst *Call, Value *Val,		static Constant *getConstantStackValue(CallInst *Call, Value *Val) {
SCCPSolver &Solver) {
if (!Val)		if (!Val)
return nullptr;		return nullptr;
Val = Val->stripPointerCasts();		Val = Val->stripPointerCasts();
if (auto *ConstVal = dyn_cast<ConstantInt>(Val))		if (auto *ConstVal = dyn_cast<ConstantInt>(Val))
return ConstVal;		return ConstVal;
auto *Alloca = dyn_cast<AllocaInst>(Val);		auto *Alloca = dyn_cast<AllocaInst>(Val);
if (!Alloca || !Alloca->getAllocatedType()->isIntegerTy())		if (!Alloca || !Alloca->getAllocatedType()->isIntegerTy())
return nullptr;		return nullptr;
Show All 16 Lines
//		//
// @funcspec.arg = internal constant i32 2		// @funcspec.arg = internal constant i32 2
//		//
// define internal void @someFunc(i32* arg1) {		// define internal void @someFunc(i32* arg1) {
// call void @otherFunc(i32* nonnull @funcspec.arg)		// call void @otherFunc(i32* nonnull @funcspec.arg)
// ret void		// ret void
// }		// }
//		//
static void constantArgPropagation(FuncList &WorkList, Module &M,		void FunctionSpecializer::propagateConstantArgs(FuncList &WorkList) {
SCCPSolver &Solver) {
// Iterate over the argument tracked functions see if there		// Iterate over the argument tracked functions see if there
// are any new constant values for the call instruction via		// are any new constant values for the call instruction via
// stack variables.		// stack variables.
for (auto *F : WorkList) {		for (auto *F : WorkList) {

for (auto *User : F->users()) {		for (auto *User : F->users()) {

auto *Call = dyn_cast<CallInst>(User);		auto *Call = dyn_cast<CallInst>(User);
if (!Call)		if (!Call)
continue;		continue;

bool Changed = false;		bool Changed = false;
for (const Use &U : Call->args()) {		for (const Use &U : Call->args()) {
unsigned Idx = Call->getArgOperandNo(&U);		unsigned Idx = Call->getArgOperandNo(&U);
Value *ArgOp = Call->getArgOperand(Idx);		Value *ArgOp = Call->getArgOperand(Idx);
Type *ArgOpType = ArgOp->getType();		Type *ArgOpType = ArgOp->getType();

if (!Call->onlyReadsMemory(Idx) || !ArgOpType->isPointerTy())		if (!Call->onlyReadsMemory(Idx) || !ArgOpType->isPointerTy())
continue;		continue;

auto *ConstVal = getConstantStackValue(Call, ArgOp, Solver);		auto *ConstVal = getConstantStackValue(Call, ArgOp);
if (!ConstVal)		if (!ConstVal)
continue;		continue;

Value *GV = new GlobalVariable(M, ConstVal->getType(), true,		Value *GV = new GlobalVariable(M, ConstVal->getType(), true,
GlobalValue::InternalLinkage, ConstVal,		GlobalValue::InternalLinkage, ConstVal,
"funcspec.arg");		"funcspec.arg");
if (ArgOpType != ConstVal->getType())		if (ArgOpType != ConstVal->getType())
GV = ConstantExpr::getBitCast(cast<Constant>(GV), ArgOpType);		GV = ConstantExpr::getBitCast(cast<Constant>(GV), ArgOpType);

Call->setArgOperand(Idx, GV);		Call->setArgOperand(Idx, GV);
Changed = true;		Changed = true;
}		}

// Add the changed CallInst to Solver Worklist		// Add the changed CallInst to Solver Worklist
if (Changed)		if (Changed)
Solver.visitCall(*Call);		Solver.visitCall(*Call);
		Inline comment (labrinea, author): I am not sure whether this is necessary. The unit tests which exercise recursion are passing without it at least.
}		}
}		}
}		}

// ssa_copy intrinsics are introduced by the SCCP solver. These intrinsics		// ssa_copy intrinsics are introduced by the SCCP solver. These intrinsics
// interfere with the constantArgPropagation optimization.		// interfere with the constantArgPropagation optimization.
static void removeSSACopy(Function &F) {		static void removeSSACopy(Function &F) {
for (BasicBlock &BB : F) {		for (BasicBlock &BB : F) {
Show All 9 Lines	static void removeSSACopy(Function &F) {
}		}
}		}

static void removeSSACopy(Module &M) {		static void removeSSACopy(Module &M) {
for (Function &F : M)		for (Function &F : M)
removeSSACopy(F);		removeSSACopy(F);
}		}

namespace {
class FunctionSpecializer {

/// The IPSCCP Solver.
SCCPSolver &Solver;

/// Analyses used to help determine if a function should be specialized.
std::function<AssumptionCache &(Function &)> GetAC;
std::function<TargetTransformInfo &(Function &)> GetTTI;
std::function<TargetLibraryInfo &(Function &)> GetTLI;

SmallPtrSet<Function *, 4> SpecializedFuncs;
SmallPtrSet<Function *, 4> FullySpecialized;
SmallVector<Instruction *> ReplacedWithConstant;
DenseMap<Function *, CodeMetrics> FunctionMetrics;

public:
FunctionSpecializer(SCCPSolver &Solver,
std::function<AssumptionCache &(Function &)> GetAC,
std::function<TargetTransformInfo &(Function &)> GetTTI,
std::function<TargetLibraryInfo &(Function &)> GetTLI)
: Solver(Solver), GetAC(GetAC), GetTTI(GetTTI), GetTLI(GetTLI) {}

~FunctionSpecializer() {
// Eliminate dead code.
removeDeadInstructions();
removeDeadFunctions();
}

/// Attempt to specialize functions in the module to enable constant		/// Attempt to specialize functions in the module to enable constant
/// propagation across function boundaries.		/// propagation across function boundaries.
///		///
/// \returns true if at least one function is specialized.		/// \returns true if at least one function is specialized.
bool specializeFunctions(FuncList &Candidates, FuncList &WorkList) {		bool FunctionSpecializer::specializeFunctions(FuncList &Candidates,
Inline comment (fhahn): can you fix the indentation here in a NFC?
Inline comment (labrinea, author): How? It seems ok.
		FuncList &WorkList) {
bool Changed = false;		bool Changed = false;
for (auto *F : Candidates) {		for (auto *F : Candidates) {
if (!isCandidateFunction(F))		if (!isCandidateFunction(F))
continue;		continue;

auto Cost = getSpecializationCost(F);		auto Cost = getSpecializationCost(F);
if (!Cost.isValid()) {		if (!Cost.isValid()) {
LLVM_DEBUG(		LLVM_DEBUG(
dbgs() << "FnSpecialization: Invalid specialization cost.\n");		dbgs() << "FnSpecialization: Invalid specialization cost.\n");
continue;		continue;
		Inline comment (chill): I believe the LLVM convention for these kinds of classes/methods is `run`, e.g. `Vectorizer::run()`, `EarlyCSE::run()`, etc.
}		}

LLVM_DEBUG(dbgs() << "FnSpecialization: Specialization cost for "		LLVM_DEBUG(dbgs() << "FnSpecialization: Specialization cost for "
<< F->getName() << " is " << Cost << "\n");		<< F->getName() << " is " << Cost << "\n");

SmallVector<CallSpecBinding, 8> Specializations;		SmallVector<CallSpecBinding, 8> Specializations;
if (!calculateGains(F, Cost, Specializations)) {		if (!calculateGains(F, Cost, Specializations)) {
		Inline comment (chill): `WorkList` is a too generic name for a parameter and it's not a worklist per se anyway.
LLVM_DEBUG(dbgs() << "FnSpecialization: No possible constants found\n");		LLVM_DEBUG(dbgs() << "FnSpecialization: No possible constants found\n");
continue;		continue;
}		}

Changed = true;		Changed = true;
for (auto &Entry : Specializations)		for (auto &Entry : Specializations)
specializeFunction(F, Entry.second, WorkList);		specializeFunction(F, Entry.second, WorkList);
}		}

updateSpecializedFuncs(Candidates, WorkList);		updateSpecializedFuncs(Candidates, WorkList);
NumFuncSpecialized += NbFunctionsSpecialized;		NumFuncSpecialized += NbFunctionsSpecialized;
return Changed;		return Changed;
}		}
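
The control flow of `specializeFunctions` above (skip candidates whose cost is invalid, then keep those for which a profitable binding exists) can be sketched on toy data. This is a simplified model, not the patch's cost model: `ToyCandidate` and `selectProfitable` are hypothetical names, and the rule "a binding is profitable when Bonus - Cost > 0" is an assumption based only on the `Gain = Bonus - Cost` comment in `SpecializationInfo`.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a candidate function and its analysis results.
struct ToyCandidate {
  std::string Name;
  bool CostValid;           // models InstructionCost::isValid()
  int Cost;                 // specialization cost, used only if CostValid
  std::vector<int> Bonuses; // one bonus per possible constant binding
};

// Returns the names of candidates that have a valid cost and at least one
// binding whose gain (Bonus - Cost) is positive.
std::vector<std::string>
selectProfitable(const std::vector<ToyCandidate> &Candidates) {
  std::vector<std::string> Selected;
  for (const ToyCandidate &C : Candidates) {
    if (!C.CostValid)
      continue; // mirrors the "Invalid specialization cost" early exit
    bool FoundGain = false;
    for (int Bonus : C.Bonuses) {
      if (Bonus - C.Cost > 0) {
        FoundGain = true;
        break;
      }
    }
    if (FoundGain)
      Selected.push_back(C.Name);
  }
  return Selected;
}
```

In this model a candidate with cost 10 and bonuses {5, 20} is selected because one binding yields a gain of 10, while a candidate with an invalid cost is skipped regardless of its bonuses.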

void removeDeadInstructions() {		void FunctionSpecializer::removeDeadFunctions() {
for (auto *I : ReplacedWithConstant) {
LLVM_DEBUG(dbgs() << "FnSpecialization: Removing dead instruction " << *I
<< "\n");
I->eraseFromParent();
}
ReplacedWithConstant.clear();
}

void removeDeadFunctions() {
for (auto *F : FullySpecialized) {		for (auto *F : FullySpecialized) {
LLVM_DEBUG(dbgs() << "FnSpecialization: Removing dead function "		LLVM_DEBUG(dbgs() << "FnSpecialization: Removing dead function "
<< F->getName() << "\n");		<< F->getName() << "\n");
F->eraseFromParent();		F->eraseFromParent();
}		}
FullySpecialized.clear();		FullySpecialized.clear();
}		}

bool tryToReplaceWithConstant(Value *V) {		static bool tryToReplaceWithConstant(SCCPSolver &Solver, Value *V) {
if (!V->getType()->isSingleValueType() || isa<CallBase>(V) ||		if (!V->getType()->isSingleValueType() || isa<CallBase>(V) ||
V->user_empty())		V->user_empty())
return false;		return false;

const ValueLatticeElement &IV = Solver.getLatticeValueFor(V);		const ValueLatticeElement &IV = Solver.getLatticeValueFor(V);
if (isOverdefined(IV))		if (isOverdefined(IV))
return false;		return false;
auto *Const =		auto *Const =
isConstant(IV) ? Solver.getConstant(IV) : UndefValue::get(V->getType());		isConstant(IV) ? Solver.getConstant(IV) : UndefValue::get(V->getType());

LLVM_DEBUG(dbgs() << "FnSpecialization: Replacing " << *V		LLVM_DEBUG(dbgs() << "FnSpecialization: Replacing " << *V
<< "\nFnSpecialization: with " << *Const << "\n");		<< "\nFnSpecialization: with " << *Const << "\n");

// Record uses of V to avoid visiting irrelevant uses of const later.		// Record uses of V to avoid visiting irrelevant uses of const later.
SmallVector<Instruction *> UseInsts;		SmallVector<Instruction *> UseInsts;
for (auto *U : V->users())		for (auto *U : V->users())
if (auto *I = dyn_cast<Instruction>(U))		if (auto *I = dyn_cast<Instruction>(U))
if (Solver.isBlockExecutable(I->getParent()))		if (Solver.isBlockExecutable(I->getParent()))
UseInsts.push_back(I);		UseInsts.push_back(I);

V->replaceAllUsesWith(Const);		V->replaceAllUsesWith(Const);

for (auto *I : UseInsts)		for (auto *I : UseInsts)
Solver.visit(I);		Solver.visit(I);

// Remove the instruction from Block and Solver.		// Remove the instruction from Block and Solver.
if (auto *I = dyn_cast<Instruction>(V)) {		if (auto *I = dyn_cast<Instruction>(V)) {
if (I->isSafeToRemove()) {		if (I->isSafeToRemove()) {
ReplacedWithConstant.push_back(I);		I->eraseFromParent();
Solver.removeLatticeValueFor(I);		Solver.removeLatticeValueFor(I);
}		}
}		}
		labrinea (Author): We now call this function once, not for every clone as it used to be.
return true;		return true;
}		}

private:
// The number of functions specialised, used for collecting statistics and
// also in the cost model.
unsigned NbFunctionsSpecialized = 0;

// Compute the code metrics for function \p F.		// Compute the code metrics for function \p F.
CodeMetrics &analyzeFunction(Function *F) {		CodeMetrics &FunctionSpecializer::analyzeFunction(Function *F) {
auto I = FunctionMetrics.insert({F, CodeMetrics()});		auto I = FunctionMetrics.insert({F, CodeMetrics()});
CodeMetrics &Metrics = I.first->second;		CodeMetrics &Metrics = I.first->second;
if (I.second) {		if (I.second) {
// The code metrics were not cached.		// The code metrics were not cached.
SmallPtrSet<const Value *, 32> EphValues;		SmallPtrSet<const Value *, 32> EphValues;
CodeMetrics::collectEphemeralValues(F, &(GetAC)(*F), EphValues);		CodeMetrics::collectEphemeralValues(F, &(GetAC)(*F), EphValues);
for (BasicBlock &BB : *F)		for (BasicBlock &BB : *F)
Metrics.analyzeBasicBlock(&BB, (GetTTI)(*F), EphValues);		Metrics.analyzeBasicBlock(&BB, (GetTTI)(*F), EphValues);

LLVM_DEBUG(dbgs() << "FnSpecialization: Code size of function "		LLVM_DEBUG(dbgs() << "FnSpecialization: Code size of function "
<< F->getName() << " is " << Metrics.NumInsts		<< F->getName() << " is " << Metrics.NumInsts
<< " instructions\n");		<< " instructions\n");
}		}
return Metrics;		return Metrics;
}		}

/// Clone the function \p F and remove the ssa_copy intrinsics added by		/// Clone the function \p F and remove the ssa_copy intrinsics added by
/// the SCCPSolver in the cloned version.		/// the SCCPSolver in the cloned version.
Function *cloneCandidateFunction(Function *F, ValueToValueMapTy &Mappings) {		static Function *cloneCandidateFunction(Function *F,
		ValueToValueMapTy &Mappings) {
Function *Clone = CloneFunction(F, Mappings);		Function *Clone = CloneFunction(F, Mappings);
removeSSACopy(*Clone);		removeSSACopy(*Clone);
return Clone;		return Clone;
}		}

/// This function decides whether it's worthwhile to specialize function		/// This function decides whether it's worthwhile to specialize function
/// \p F based on the known constant values its arguments can take on. It		/// \p F based on the known constant values its arguments can take on. It
/// only discovers potential specialization opportunities without actually		/// only discovers potential specialization opportunities without actually
/// applying them.		/// applying them.
///		///
/// \returns true if any specializations have been found.		/// \returns true if any specializations have been found.
bool calculateGains(Function *F, InstructionCost Cost,		bool FunctionSpecializer::calculateGains(Function *F, InstructionCost Cost,
SmallVectorImpl<CallSpecBinding> &WorkList) {		SmallVectorImpl<CallSpecBinding> &WorkList) {
SpecializationMap Specializations;		SpecializationMap Specializations;
// Determine if we should specialize the function based on the values the		// Determine if we should specialize the function based on the values the
// argument can take on. If specialization is not profitable, we continue		// argument can take on. If specialization is not profitable, we continue
// on to the next argument.		// on to the next argument.
for (Argument &FormalArg : F->args()) {		for (Argument &FormalArg : F->args()) {
// Determine if this argument is interesting. If we know the argument can		// Determine if this argument is interesting. If we know the argument can
// take on any constant values, they are collected in Constants.		// take on any constant values, they are collected in Constants.
SmallVector<CallArgBinding, 8> ActualArgs;		SmallVector<CallArgBinding, 8> ActualArgs;
if (!isArgumentInteresting(&FormalArg, ActualArgs)) {		if (!isArgumentInteresting(&FormalArg, ActualArgs)) {
LLVM_DEBUG(dbgs() << "FnSpecialization: Argument "		LLVM_DEBUG(dbgs() << "FnSpecialization: Argument "
<< FormalArg.getNameOrAsOperand()		<< FormalArg.getNameOrAsOperand()
<< " is not interesting\n");		<< " is not interesting\n");
continue;		continue;
}		}

for (const auto &Entry : ActualArgs) {		for (const auto &Entry : ActualArgs) {
CallBase *Call = Entry.first;		CallBase *Call = Entry.first;
Constant *ActualArg = Entry.second;		Constant *ActualArg = Entry.second;

auto I = Specializations.insert({Call, SpecializationInfo()});		auto I = Specializations.insert({Call, SpecializationInfo()});
SpecializationInfo &S = I.first->second;		SpecializationInfo &S = I.first->second;
		chill: This is a little bit fragile in the sense that the caller may forget to clear the list. It would be nicer if this function itself cleared the list first thing when it starts execution. `WorkList` could also be made a return value, taking advantage of move semantics, which looks nicer on paper but may cause a few allocations/deallocations if we iterate.
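		The trade-off raised in this comment can be sketched in isolation. The following is a minimal, hypothetical illustration in plain C++ with `std::vector`; `Candidate`, `collectInto`, and `collect` are invented names and are not part of the patch or of LLVM. It only contrasts an out-parameter that the callee clears itself with a by-value return that relies on move semantics.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a (call site, specialization info) pair.
struct Candidate {
  int CallId;
  int Gain;
};

// Option 1: out-parameter. The callee clears the list first, so a caller
// that reuses one vector across iterations cannot leak stale entries.
// clear() keeps the capacity, so repeated calls avoid reallocation.
bool collectInto(const std::vector<Candidate> &All, std::size_t MaxClones,
                 std::vector<Candidate> &WorkList) {
  assert(MaxClones > 0 && "expected a positive clone limit");
  WorkList.clear();

  // Keep only profitable candidates.
  for (const Candidate &C : All)
    if (C.Gain > 0)
      WorkList.push_back(C);

  // Sort in descending order of gain; ties keep their original order.
  std::stable_sort(WorkList.begin(), WorkList.end(),
                   [](const Candidate &L, const Candidate &R) {
                     return L.Gain > R.Gain;
                   });

  // Truncate to at most MaxClones candidates.
  if (WorkList.size() > MaxClones)
    WorkList.erase(WorkList.begin() + MaxClones, WorkList.end());

  return !WorkList.empty();
}

// Option 2: return by value. Nothing for the caller to clear, and the
// result is moved (or elided) on return, but every call allocates a
// fresh buffer.
std::vector<Candidate> collect(const std::vector<Candidate> &All,
                               std::size_t MaxClones) {
  std::vector<Candidate> WorkList;
  collectInto(All, MaxClones, WorkList);
  return WorkList;
}
```

		In this sketch, Option 1 suits a loop that calls the function repeatedly with one reused vector, while Option 2 gives a simpler call site at the cost of one allocation per call.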

if (I.second)		if (I.second)
S.Gain = ForceFunctionSpecialization ? 1 : 0 - Cost;		S.Gain = ForceFunctionSpecialization ? 1 : 0 - Cost;
if (!ForceFunctionSpecialization)		if (!ForceFunctionSpecialization)
S.Gain += getSpecializationBonus(&FormalArg, ActualArg);		S.Gain += getSpecializationBonus(&FormalArg, ActualArg);
S.Args.push_back({&FormalArg, ActualArg});		S.Args.push_back({&FormalArg, ActualArg});
}		}
}		}

// Remove unprofitable specializations.		// Remove unprofitable specializations.
Specializations.remove_if(		Specializations.remove_if(
[](const auto &Entry) { return Entry.second.Gain <= 0; });		[](const auto &Entry) { return Entry.second.Gain <= 0; });

// Clear the MapVector and return the underlying vector.		// Clear the MapVector and return the underlying vector.
WorkList = Specializations.takeVector();		WorkList = Specializations.takeVector();

// Sort the candidates in descending order.		// Sort the candidates in descending order.
llvm::stable_sort(WorkList, [](const auto &L, const auto &R) {		llvm::stable_sort(WorkList, [](const auto &L, const auto &R) {
return L.second.Gain > R.second.Gain;		return L.second.Gain > R.second.Gain;
});		});

// Truncate the worklist to 'MaxClonesThreshold' candidates if necessary.		// Truncate the worklist to 'MaxClonesThreshold' candidates if necessary.
if (WorkList.size() > MaxClonesThreshold) {		if (WorkList.size() > MaxClonesThreshold) {
LLVM_DEBUG(dbgs() << "FnSpecialization: Number of candidates exceed "		LLVM_DEBUG(dbgs() << "FnSpecialization: Number of candidates exceed "
<< "the maximum number of clones threshold.\n"		<< "the maximum number of clones threshold.\n"
<< "FnSpecialization: Truncating worklist to "		<< "FnSpecialization: Truncating worklist to "
<< MaxClonesThreshold << " candidates.\n");		<< MaxClonesThreshold << " candidates.\n");
WorkList.erase(WorkList.begin() + MaxClonesThreshold, WorkList.end());		WorkList.erase(WorkList.begin() + MaxClonesThreshold, WorkList.end());
}		}

LLVM_DEBUG(dbgs() << "FnSpecialization: Specializations for function "		LLVM_DEBUG(dbgs() << "FnSpecialization: Specializations for function "
<< F->getName() << "\n";		<< F->getName() << "\n";
for (const auto &Entry		for (const auto &Entry
: WorkList) {		: WorkList) {
dbgs() << "FnSpecialization: Gain = " << Entry.second.Gain		dbgs() << "FnSpecialization: Gain = " << Entry.second.Gain
<< "\n";		<< "\n";
for (const ArgInfo &Arg : Entry.second.Args)		for (const ArgInfo &Arg : Entry.second.Args)
dbgs() << "FnSpecialization: FormalArg = "		dbgs() << "FnSpecialization: FormalArg = "
<< Arg.Formal->getNameOrAsOperand()		<< Arg.Formal->getNameOrAsOperand()
<< ", ActualArg = "		<< ", ActualArg = "
<< Arg.Actual->getNameOrAsOperand() << "\n";		<< Arg.Actual->getNameOrAsOperand() << "\n";
});		});

return !WorkList.empty();		return !WorkList.empty();
}		}

bool isCandidateFunction(Function *F) {		bool FunctionSpecializer::isCandidateFunction(Function *F) {
// Do not specialize the cloned function again.		// Do not specialize the cloned function again.
if (SpecializedFuncs.contains(F))		if (SpecializedFuncs.contains(F))
return false;		return false;

// If we're optimizing the function for size, we shouldn't specialize it.		// If we're optimizing the function for size, we shouldn't specialize it.
if (F->hasOptSize() ||		if (F->hasOptSize() ||
shouldOptimizeForSize(F, nullptr, nullptr, PGSOQueryType::IRPass))		shouldOptimizeForSize(F, nullptr, nullptr, PGSOQueryType::IRPass))
return false;		return false;

// Exit if the function is not executable. There's no point in specializing		// Exit if the function is not executable. There's no point in specializing
// a dead function.		// a dead function.
if (!Solver.isBlockExecutable(&F->getEntryBlock()))		if (!Solver.isBlockExecutable(&F->getEntryBlock()))
return false;		return false;

// It wastes time to specialize a function which would get inlined finally.		// It wastes time to specialize a function which would get inlined finally.
if (F->hasFnAttribute(Attribute::AlwaysInline))		if (F->hasFnAttribute(Attribute::AlwaysInline))
return false;		return false;

LLVM_DEBUG(dbgs() << "FnSpecialization: Try function: " << F->getName()		LLVM_DEBUG(dbgs() << "FnSpecialization: Try function: " << F->getName()
<< "\n");		<< "\n");
return true;		return true;
}		}

void specializeFunction(Function *F, SpecializationInfo &S,		void FunctionSpecializer::specializeFunction(Function *F, SpecializationInfo &S,
FuncList &WorkList) {		FuncList &WorkList) {
ValueToValueMapTy Mappings;		ValueToValueMapTy Mappings;
Function *Clone = cloneCandidateFunction(F, Mappings);		Function *Clone = cloneCandidateFunction(F, Mappings);

// Rewrite calls to the function so that they call the clone instead.		// Rewrite calls to the function so that they call the clone instead.
rewriteCallSites(Clone, S.Args, Mappings);		rewriteCallSites(Clone, S.Args, Mappings);

// Initialize the lattice state of the arguments of the function clone,		// Initialize the lattice state of the arguments of the function clone,
// marking the argument on which we specialized the function constant		// marking the argument on which we specialized the function constant
// with the given value.		// with the given value.
Solver.markArgInFuncSpecialization(Clone, S.Args);		Solver.markArgInFuncSpecialization(Clone, S.Args);

// Mark all the specialized functions		// Mark all the specialized functions
WorkList.push_back(Clone);		WorkList.push_back(Clone);
NbFunctionsSpecialized++;		NbFunctionsSpecialized++;

// If the function has been completely specialized, the original function		// If the function has been completely specialized, the original function
// is no longer needed. Mark it unreachable.		// is no longer needed. Mark it unreachable.
if (F->getNumUses() == 0 || all_of(F->users(), [F](User *U) {		if (F->getNumUses() == 0 || all_of(F->users(), [F](User *U) {
if (auto *CS = dyn_cast<CallBase>(U))		if (auto *CS = dyn_cast<CallBase>(U))
return CS->getFunction() == F;		return CS->getFunction() == F;
return false;		return false;
})) {		})) {
Solver.markFunctionUnreachable(F);		Solver.markFunctionUnreachable(F);
FullySpecialized.insert(F);		FullySpecialized.insert(F);
}		}
}		}

/// Compute and return the cost of specializing function \p F.		/// Compute and return the cost of specializing function \p F.
InstructionCost getSpecializationCost(Function *F) {		InstructionCost FunctionSpecializer::getSpecializationCost(Function *F) {
CodeMetrics &Metrics = analyzeFunction(F);		CodeMetrics &Metrics = analyzeFunction(F);
// If the code metrics reveal that we shouldn't duplicate the function, we		// If the code metrics reveal that we shouldn't duplicate the function, we
// shouldn't specialize it. Set the specialization cost to Invalid.		// shouldn't specialize it. Set the specialization cost to Invalid.
// Or if the lines of code imply that this function is easy to get		// Or if the lines of code imply that this function is easy to get
// inlined so that we shouldn't specialize it.		// inlined so that we shouldn't specialize it.
if (Metrics.notDuplicatable ||		if (Metrics.notDuplicatable ||
(!ForceFunctionSpecialization &&		(!ForceFunctionSpecialization &&
Metrics.NumInsts < SmallFunctionThreshold)) {		Metrics.NumInsts < SmallFunctionThreshold)) {
InstructionCost C{};		InstructionCost C{};
C.setInvalid();		C.setInvalid();
return C;		return C;
}		}

// Otherwise, set the specialization cost to be the cost of all the		// Otherwise, set the specialization cost to be the cost of all the
// instructions in the function and penalty for specializing more functions.		// instructions in the function and penalty for specializing more functions.
unsigned Penalty = NbFunctionsSpecialized + 1;		unsigned Penalty = NbFunctionsSpecialized + 1;
return Metrics.NumInsts * InlineConstants::InstrCost * Penalty;		return Metrics.NumInsts * InlineConstants::InstrCost * Penalty;
}		}

InstructionCost getUserBonus(User *U, llvm::TargetTransformInfo &TTI,		static InstructionCost getUserBonus(User *U, llvm::TargetTransformInfo &TTI,
LoopInfo &LI) {		LoopInfo &LI) {
auto *I = dyn_cast_or_null<Instruction>(U);		auto *I = dyn_cast_or_null<Instruction>(U);
// If not an instruction we do not know how to evaluate.		// If not an instruction we do not know how to evaluate.
// Keep minimum possible cost for now so that it doesn't affect		// Keep minimum possible cost for now so that it doesn't affect
// specialization.		// specialization.
if (!I)		if (!I)
return std::numeric_limits<unsigned>::min();		return std::numeric_limits<unsigned>::min();

auto Cost = TTI.getUserCost(U, TargetTransformInfo::TCK_SizeAndLatency);		auto Cost = TTI.getUserCost(U, TargetTransformInfo::TCK_SizeAndLatency);

// Traverse recursively if there are more uses.		// Traverse recursively if there are more uses.
// TODO: Any other instructions to be added here?		// TODO: Any other instructions to be added here?
if (I->mayReadFromMemory() || I->isCast())		if (I->mayReadFromMemory() || I->isCast())
for (auto *User : I->users())		for (auto *User : I->users())
Cost += getUserBonus(User, TTI, LI);		Cost += getUserBonus(User, TTI, LI);

// Increase the cost if it is inside the loop.		// Increase the cost if it is inside the loop.
auto LoopDepth = LI.getLoopDepth(I->getParent());		auto LoopDepth = LI.getLoopDepth(I->getParent());
Cost *= std::pow((double)AvgLoopIterationCount, LoopDepth);		Cost *= std::pow((double)AvgLoopIterationCount, LoopDepth);
return Cost;		return Cost;
}		}

/// Compute a bonus for replacing argument \p A with constant \p C.		/// Compute a bonus for replacing argument \p A with constant \p C.
InstructionCost getSpecializationBonus(Argument *A, Constant *C) {		InstructionCost FunctionSpecializer::getSpecializationBonus(Argument *A,
		Constant *C) {
Function *F = A->getParent();		Function *F = A->getParent();
DominatorTree DT(*F);		DominatorTree DT(*F);
LoopInfo LI(DT);		LoopInfo LI(DT);
auto &TTI = (GetTTI)(*F);		auto &TTI = (GetTTI)(*F);
LLVM_DEBUG(dbgs() << "FnSpecialization: Analysing bonus for constant: "		LLVM_DEBUG(dbgs() << "FnSpecialization: Analysing bonus for constant: "
<< C->getNameOrAsOperand() << "\n");		<< C->getNameOrAsOperand() << "\n");

InstructionCost TotalCost = 0;		InstructionCost TotalCost = 0;
for (auto *U : A->users()) {		for (auto *U : A->users()) {
TotalCost += getUserBonus(U, TTI, LI);		TotalCost += getUserBonus(U, TTI, LI);
LLVM_DEBUG(dbgs() << "FnSpecialization: User cost ";		LLVM_DEBUG(dbgs() << "FnSpecialization: User cost ";
TotalCost.print(dbgs()); dbgs() << " for: " << *U << "\n");		TotalCost.print(dbgs()); dbgs() << " for: " << *U << "\n");
}		}

// The below heuristic is only concerned with exposing inlining		// The below heuristic is only concerned with exposing inlining
// opportunities via indirect call promotion. If the argument is not a		// opportunities via indirect call promotion. If the argument is not a
// (potentially casted) function pointer, give up.		// (potentially casted) function pointer, give up.
Function *CalledFunction = dyn_cast<Function>(C->stripPointerCasts());		Function *CalledFunction = dyn_cast<Function>(C->stripPointerCasts());
if (!CalledFunction)		if (!CalledFunction)
return TotalCost;		return TotalCost;

// Get TTI for the called function (used for the inline cost).		// Get TTI for the called function (used for the inline cost).
auto &CalleeTTI = (GetTTI)(*CalledFunction);		auto &CalleeTTI = (GetTTI)(*CalledFunction);

// Look at all the call sites whose called value is the argument.		// Look at all the call sites whose called value is the argument.
// Specializing the function on the argument would allow these indirect		// Specializing the function on the argument would allow these indirect
// calls to be promoted to direct calls. If the indirect call promotion		// calls to be promoted to direct calls. If the indirect call promotion
// would likely enable the called function to be inlined, specializing is a		// would likely enable the called function to be inlined, specializing is a
// good idea.		// good idea.
int Bonus = 0;		int Bonus = 0;
for (User *U : A->users()) {		for (User *U : A->users()) {
if (!isa<CallInst>(U) && !isa<InvokeInst>(U))		if (!isa<CallInst>(U) && !isa<InvokeInst>(U))
continue;		continue;
auto *CS = cast<CallBase>(U);		auto *CS = cast<CallBase>(U);
if (CS->getCalledOperand() != A)		if (CS->getCalledOperand() != A)
continue;		continue;

// Get the cost of inlining the called function at this call site. Note		// Get the cost of inlining the called function at this call site. Note
// that this is only an estimate. The called function may eventually		// that this is only an estimate. The called function may eventually
// change in a way that leads to it not being inlined here, even though		// change in a way that leads to it not being inlined here, even though
// inlining looks profitable now. For example, one of its called		// inlining looks profitable now. For example, one of its called
// functions may be inlined into it, making the called function too large		// functions may be inlined into it, making the called function too large
// to be inlined into this call site.		// to be inlined into this call site.
//		//
// We apply a boost for performing indirect call promotion by increasing		// We apply a boost for performing indirect call promotion by increasing
// the default threshold by the threshold for indirect calls.		// the default threshold by the threshold for indirect calls.
auto Params = getInlineParams();		auto Params = getInlineParams();
Params.DefaultThreshold += InlineConstants::IndirectCallThreshold;		Params.DefaultThreshold += InlineConstants::IndirectCallThreshold;
InlineCost IC =		InlineCost IC =
getInlineCost(*CS, CalledFunction, Params, CalleeTTI, GetAC, GetTLI);		getInlineCost(*CS, CalledFunction, Params, CalleeTTI, GetAC, GetTLI);

// We clamp the bonus for this call to be between zero and the default		// We clamp the bonus for this call to be between zero and the default
// threshold.		// threshold.
if (IC.isAlways())		if (IC.isAlways())
Bonus += Params.DefaultThreshold;		Bonus += Params.DefaultThreshold;
else if (IC.isVariable() && IC.getCostDelta() > 0)		else if (IC.isVariable() && IC.getCostDelta() > 0)
Bonus += IC.getCostDelta();		Bonus += IC.getCostDelta();

LLVM_DEBUG(dbgs() << "FnSpecialization: Inlining bonus " << Bonus		LLVM_DEBUG(dbgs() << "FnSpecialization: Inlining bonus " << Bonus
<< " for user " << *U << "\n");		<< " for user " << *U << "\n");
}		}

return TotalCost + Bonus;		return TotalCost + Bonus;
}		}

/// Determine if we should specialize a function based on the incoming values		/// Determine if we should specialize a function based on the incoming values
/// of the given argument.		/// of the given argument.
///		///
/// This function implements the goal-directed heuristic. It determines if		/// This function implements the goal-directed heuristic. It determines if
/// specializing the function based on the incoming values of argument \p A		/// specializing the function based on the incoming values of argument \p A
/// would result in any significant optimization opportunities. If		/// would result in any significant optimization opportunities. If
/// optimization opportunities exist, the constant values of \p A on which to		/// optimization opportunities exist, the constant values of \p A on which to
/// specialize the function are collected in \p Constants.		/// specialize the function are collected in \p Constants.
///		///
/// \returns true if the function should be specialized on the given		/// \returns true if the function should be specialized on the given
/// argument.		/// argument.
bool isArgumentInteresting(Argument *A,		bool FunctionSpecializer::isArgumentInteresting(Argument *A,
SmallVectorImpl<CallArgBinding> &Constants) {		SmallVectorImpl<CallArgBinding> &Constants) {
// For now, don't attempt to specialize functions based on the values of		// For now, don't attempt to specialize functions based on the values of
// composite types.		// composite types.
if (!A->getType()->isSingleValueType() || A->user_empty())		if (!A->getType()->isSingleValueType() || A->user_empty())
return false;		return false;

// If the argument isn't overdefined, there's nothing to do. It should		// If the argument isn't overdefined, there's nothing to do. It should
// already be constant.		// already be constant.
if (!Solver.getLatticeValueFor(A).isOverdefined()) {		if (!Solver.getLatticeValueFor(A).isOverdefined()) {
LLVM_DEBUG(dbgs() << "FnSpecialization: Nothing to do, argument "		LLVM_DEBUG(dbgs() << "FnSpecialization: Nothing to do, argument "
<< A->getNameOrAsOperand()		<< A->getNameOrAsOperand()
<< " is already constant?\n");		<< " is already constant?\n");
return false;		return false;
}		}

// Collect the constant values that the argument can take on. If the		// Collect the constant values that the argument can take on. If the
// argument can't take on any constant values, we aren't going to		// argument can't take on any constant values, we aren't going to
// specialize the function. While it's possible to specialize the function		// specialize the function. While it's possible to specialize the function
// based on non-constant arguments, there's likely not much benefit to		// based on non-constant arguments, there's likely not much benefit to
// constant propagation in doing so.		// constant propagation in doing so.
//		//
// TODO 1: currently it won't specialize if there are over the threshold of		// TODO 1: currently it won't specialize if there are over the threshold of
// calls using the same argument, e.g foo(a) x 4 and foo(b) x 1, but it		// calls using the same argument, e.g foo(a) x 4 and foo(b) x 1, but it
// might be beneficial to take the occurrences into account in the cost		// might be beneficial to take the occurrences into account in the cost
// model, so we would need to find the unique constants.		// model, so we would need to find the unique constants.
//		//
// TODO 2: this currently does not support constants, i.e. integer ranges.		// TODO 2: this currently does not support constants, i.e. integer ranges.
//		//
getPossibleConstants(A, Constants);		getPossibleConstants(A, Constants);

if (Constants.empty())		if (Constants.empty())
return false;		return false;

LLVM_DEBUG(dbgs() << "FnSpecialization: Found interesting argument "		LLVM_DEBUG(dbgs() << "FnSpecialization: Found interesting argument "
<< A->getNameOrAsOperand() << "\n");		<< A->getNameOrAsOperand() << "\n");
return true;		return true;
}		}

/// Collect in \p Constants all the constant values that argument \p A can		/// Collect in \p Constants all the constant values that argument \p A can
/// take on.		/// take on.
void getPossibleConstants(Argument *A,		void FunctionSpecializer::getPossibleConstants(Argument *A,
SmallVectorImpl<CallArgBinding> &Constants) {		SmallVectorImpl<CallArgBinding> &Constants) {
Function *F = A->getParent();		Function *F = A->getParent();

// Iterate over all the call sites of the argument's parent function.		// Iterate over all the call sites of the argument's parent function.
for (User *U : F->users()) {		for (User *U : F->users()) {
if (!isa<CallInst>(U) && !isa<InvokeInst>(U))		if (!isa<CallInst>(U) && !isa<InvokeInst>(U))
continue;		continue;
auto &CS = *cast<CallBase>(U);		auto &CS = *cast<CallBase>(U);
// If the call site has attribute minsize set, that callsite won't be		// If the call site has attribute minsize set, that callsite won't be
// specialized.		// specialized.
if (CS.hasFnAttr(Attribute::MinSize))		if (CS.hasFnAttr(Attribute::MinSize))
continue;		continue;

// If the parent of the call site will never be executed, we don't need		// If the parent of the call site will never be executed, we don't need
// to worry about the passed value.		// to worry about the passed value.
if (!Solver.isBlockExecutable(CS.getParent()))		if (!Solver.isBlockExecutable(CS.getParent()))
continue;		continue;

auto *V = CS.getArgOperand(A->getArgNo());		auto *V = CS.getArgOperand(A->getArgNo());
if (isa<PoisonValue>(V))		if (isa<PoisonValue>(V))
return;		return;

// For now, constant expressions are fine but only if they are function		// For now, constant expressions are fine but only if they are function
// calls.		// calls.
if (auto *CE = dyn_cast<ConstantExpr>(V))		if (auto *CE = dyn_cast<ConstantExpr>(V))
if (!isa<Function>(CE->getOperand(0)))		if (!isa<Function>(CE->getOperand(0)))
return;		return;

// TrackValueOfGlobalVariable only tracks scalar global variables.		// TrackValueOfGlobalVariable only tracks scalar global variables.
if (auto *GV = dyn_cast<GlobalVariable>(V)) {		if (auto *GV = dyn_cast<GlobalVariable>(V)) {
// Check if we want to specialize on the address of non-constant		// Check if we want to specialize on the address of non-constant
// global values.		// global values.
if (!GV->isConstant())		if (!GV->isConstant())
if (!SpecializeOnAddresses)		if (!SpecializeOnAddresses)
return;		return;

if (!GV->getValueType()->isSingleValueType())		if (!GV->getValueType()->isSingleValueType())
return;		return;
}		}

if (isa<Constant>(V) && (Solver.getLatticeValueFor(V).isConstant() ||		if (isa<Constant>(V) && (Solver.getLatticeValueFor(V).isConstant() ||
EnableSpecializationForLiteralConstant))		EnableSpecializationForLiteralConstant))
Constants.push_back({&CS, cast<Constant>(V)});		Constants.push_back({&CS, cast<Constant>(V)});
}		}
}		}

/// Rewrite calls to function \p F to call function \p Clone instead.		/// Rewrite calls to function \p F to call function \p Clone instead.
///		///
/// This function modifies calls to function \p F as long as the actual		/// This function modifies calls to function \p F as long as the actual
/// arguments match those in \p Args. Note that for recursive calls we		/// arguments match those in \p Args. Note that for recursive calls we
/// need to compare against the cloned formal arguments.		/// need to compare against the cloned formal arguments.
///		///
/// Callsites that have been marked with the MinSize function attribute won't		/// Callsites that have been marked with the MinSize function attribute won't
/// be specialized and rewritten.		/// be specialized and rewritten.
void rewriteCallSites(Function *Clone, const SmallVectorImpl<ArgInfo> &Args,		void FunctionSpecializer::rewriteCallSites(Function *Clone,
		const SmallVectorImpl<ArgInfo> &Args,
ValueToValueMapTy &Mappings) {		ValueToValueMapTy &Mappings) {
assert(!Args.empty() && "Specialization without arguments");		assert(!Args.empty() && "Specialization without arguments");
Function *F = Args[0].Formal->getParent();		Function *F = Args[0].Formal->getParent();

SmallVector<CallBase *, 8> CallSitesToRewrite;		SmallVector<CallBase *, 8> CallSitesToRewrite;
for (auto *U : F->users()) {		for (auto *U : F->users()) {
if (!isa<CallInst>(U) && !isa<InvokeInst>(U))		if (!isa<CallInst>(U) && !isa<InvokeInst>(U))
continue;		continue;
auto &CS = *cast<CallBase>(U);		auto &CS = *cast<CallBase>(U);
if (!CS.getCalledFunction() || CS.getCalledFunction() != F)		if (!CS.getCalledFunction() || CS.getCalledFunction() != F)
continue;		continue;
CallSitesToRewrite.push_back(&CS);		CallSitesToRewrite.push_back(&CS);
}		}

LLVM_DEBUG(dbgs() << "FnSpecialization: Replacing call sites of "		LLVM_DEBUG(dbgs() << "FnSpecialization: Replacing call sites of "
<< F->getName() << " with " << Clone->getName() << "\n");		<< F->getName() << " with " << Clone->getName() << "\n");

for (auto *CS : CallSitesToRewrite) {		for (auto *CS : CallSitesToRewrite) {
LLVM_DEBUG(dbgs() << "FnSpecialization: "		LLVM_DEBUG(dbgs() << "FnSpecialization: "
<< CS->getFunction()->getName() << " ->" << *CS		<< CS->getFunction()->getName() << " ->" << *CS
<< "\n");		<< "\n");
if (/* recursive call */		if (/* recursive call */
(CS->getFunction() == Clone &&		(CS->getFunction() == Clone &&
all_of(Args,		all_of(Args,
[CS, &Mappings](const ArgInfo &Arg) {		[CS, &Mappings](const ArgInfo &Arg) {
unsigned ArgNo = Arg.Formal->getArgNo();		unsigned ArgNo = Arg.Formal->getArgNo();
return CS->getArgOperand(ArgNo) == Mappings[Arg.Formal];		return CS->getArgOperand(ArgNo) == Mappings[Arg.Formal];
})) ||		})) ||
/* normal call */		/* normal call */
all_of(Args, [CS](const ArgInfo &Arg) {		all_of(Args, [CS](const ArgInfo &Arg) {
unsigned ArgNo = Arg.Formal->getArgNo();		unsigned ArgNo = Arg.Formal->getArgNo();
return CS->getArgOperand(ArgNo) == Arg.Actual;		return CS->getArgOperand(ArgNo) == Arg.Actual;
})) {		})) {
CS->setCalledFunction(Clone);		CS->setCalledFunction(Clone);
Solver.markOverdefined(CS);		Solver.markOverdefined(CS);
}		}
}		}
}		}

void updateSpecializedFuncs(FuncList &Candidates, FuncList &WorkList) {		void FunctionSpecializer::updateSpecializedFuncs(FuncList &Candidates,
		FuncList &WorkList) {
for (auto *F : WorkList) {		for (auto *F : WorkList) {
SpecializedFuncs.insert(F);		SpecializedFuncs.insert(F);

// Initialize the state of the newly created functions, marking them		// Initialize the state of the newly created functions, marking them
// argument-tracked and executable.		// argument-tracked and executable.
if (F->hasExactDefinition() && !F->hasFnAttribute(Attribute::Naked))		if (F->hasExactDefinition() && !F->hasFnAttribute(Attribute::Naked))
Solver.addTrackedFunction(F);		Solver.addTrackedFunction(F);

Solver.addArgumentTrackedFunction(F);		Solver.addArgumentTrackedFunction(F);
Candidates.push_back(F);		Candidates.push_back(F);
Solver.markBlockExecutable(&F->front());		Solver.markBlockExecutable(&F->front());

// Replace the function arguments for the specialized functions.		// Replace the function arguments for the specialized functions.
for (Argument &Arg : F->args())		for (Argument &Arg : F->args())
if (!Arg.use_empty() && tryToReplaceWithConstant(&Arg))		if (!Arg.use_empty() && tryToReplaceWithConstant(Solver, &Arg))
		fhahn: IIUC this is done in between solver runs, right? Is this needed? Isn't it sufficient to continue with the constant value in the value mapping? This would probably remove the need to tell the solver to forget instructions/values.
		labrinea (author): This is not the only invocation of `tryToReplaceWithConstant` in FuncSpec. In this instance we try to replace the arguments of cloned functions. There's another invocation inside the functor `RunSCCPSolver`, where we try to replace the instructions of cloned functions. Both calls occur up to `FuncSpecializationMaxIters` times. Moreover, the SCCP pass itself does the same thing on arguments of tracked functions and on instructions of executable blocks (with `tryToReplaceWithConstant` and `simplifyInstsInBlock` respectively). This happens after the Solver runs and before the Function Specializer is invoked. Therefore, I think we still need to tell the Solver to forget instructions/values if we want to merge the two passes.
LLVM_DEBUG(dbgs() << "FnSpecialization: Replaced constant argument: "		LLVM_DEBUG(dbgs() << "FnSpecialization: Replaced constant argument: "
<< Arg.getNameOrAsOperand() << "\n");		<< Arg.getNameOrAsOperand() << "\n");
}		}
}		}
};
} // namespace

bool llvm::runFunctionSpecialization(		bool FunctionSpecializer::specialize(FuncList &FuncDecls) {
Module &M, const DataLayout &DL,
std::function<TargetLibraryInfo &(Function &)> GetTLI,
std::function<TargetTransformInfo &(Function &)> GetTTI,
std::function<AssumptionCache &(Function &)> GetAC,
function_ref<AnalysisResultsForFn(Function &)> GetAnalysis) {
SCCPSolver Solver(DL, GetTLI, M.getContext());
FunctionSpecializer FS(Solver, GetAC, GetTTI, GetTLI);
bool Changed = false;		bool Changed = false;

// Loop over all functions, marking arguments to those with their addresses
// taken or that are external as overdefined.
for (Function &F : M) {
if (F.isDeclaration())
continue;
if (F.hasFnAttribute(Attribute::NoDuplicate))
continue;

LLVM_DEBUG(dbgs() << "\nFnSpecialization: Analysing decl: " << F.getName()
<< "\n");
Solver.addAnalysis(F, GetAnalysis(F));

// Determine if we can track the function's arguments. If so, add the
// function to the solver's set of argument-tracked functions.
if (canTrackArgumentsInterprocedurally(&F)) {
LLVM_DEBUG(dbgs() << "FnSpecialization: Can track arguments\n");
Solver.addArgumentTrackedFunction(&F);
continue;
} else {
LLVM_DEBUG(dbgs() << "FnSpecialization: Can't track arguments!\n"
<< "FnSpecialization: Doesn't have local linkage, or "
<< "has its address taken\n");
}

// Assume the function is called.
Solver.markBlockExecutable(&F.front());

// Assume nothing about the incoming arguments.
for (Argument &AI : F.args())
Solver.markOverdefined(&AI);
}

// Determine if we can track any of the module's global variables. If so, add
// the global variables we can track to the solver's set of tracked global
// variables.
for (GlobalVariable &G : M.globals()) {
G.removeDeadConstantUsers();
if (canTrackGlobalVariableInterprocedurally(&G))
Solver.trackValueOfGlobalVariable(&G);
}

auto &TrackedFuncs = Solver.getArgumentTrackedFunctions();
SmallVector<Function *, 16> FuncDecls(TrackedFuncs.begin(),
TrackedFuncs.end());

// No tracked functions, so nothing to do: don't run the solver and remove
// the ssa_copy intrinsics that may have been introduced.
if (TrackedFuncs.empty()) {
removeSSACopy(M);
return false;
}

// Solve for constants.		// Solve for constants.
auto RunSCCPSolver = [&](auto &WorkList) {		auto RunSCCPSolver = [&](auto &WorkList) {
bool ResolvedUndefs = true;		bool ResolvedUndefs = true;

while (ResolvedUndefs) {		while (ResolvedUndefs) {
// Not running the solver unnecessary is checked in regression test		// Not running the solver unnecessary is checked in regression test
// nothing-to-do.ll, so if this debug message is changed, this regression		// nothing-to-do.ll, so if this debug message is changed, this regression
// test needs updating too.		// test needs updating too.
[... 9 lines not shown ...]	auto RunSCCPSolver = [&](auto &WorkList) {

for (auto *F : WorkList) {		for (auto *F : WorkList) {
for (BasicBlock &BB : *F) {		for (BasicBlock &BB : *F) {
if (!Solver.isBlockExecutable(&BB))		if (!Solver.isBlockExecutable(&BB))
continue;		continue;
// FIXME: The solver may make changes to the function here, so set		// FIXME: The solver may make changes to the function here, so set
// Changed, even if later function specialization does not trigger.		// Changed, even if later function specialization does not trigger.
for (auto &I : make_early_inc_range(BB))		for (auto &I : make_early_inc_range(BB))
Changed |= FS.tryToReplaceWithConstant(&I);		Changed |= tryToReplaceWithConstant(Solver, &I);
}		}
}		}
};		};

#ifndef NDEBUG		#ifndef NDEBUG
LLVM_DEBUG(dbgs() << "FnSpecialization: Worklist fn decls:\n");		LLVM_DEBUG(dbgs() << "FnSpecialization: Worklist fn decls:\n");
for (auto *F : FuncDecls)		for (auto *F : FuncDecls)
LLVM_DEBUG(dbgs() << "FnSpecialization: *) " << F->getName() << "\n");		LLVM_DEBUG(dbgs() << "FnSpecialization: *) " << F->getName() << "\n");
#endif		#endif

// Initially resolve the constants in all the argument tracked functions.
RunSCCPSolver(FuncDecls);

SmallVector<Function *, 8> WorkList;		SmallVector<Function *, 8> WorkList;
unsigned I = 0;		unsigned I = 0;
while (FuncSpecializationMaxIters != I++ &&		while (FuncSpecializationMaxIters != I++ &&
FS.specializeFunctions(FuncDecls, WorkList)) {		specializeFunctions(FuncDecls, WorkList)) {
LLVM_DEBUG(dbgs() << "FnSpecialization: Finished iteration " << I << "\n");		LLVM_DEBUG(dbgs() << "FnSpecialization: Finished iteration " << I << "\n");

// Run the solver for the specialized functions.		// Run the solver for the specialized functions.
		labrinea (author): I'll rename this and add a comment to explain what it is used for.
RunSCCPSolver(WorkList);		RunSCCPSolver(WorkList);

// Replace some unresolved constant arguments.		// Replace some unresolved constant arguments.
constantArgPropagation(FuncDecls, M, Solver);		propagateConstantArgs(FuncDecls);

WorkList.clear();		WorkList.clear();
		chill: Use braces around the `for`, since there are more than two levels of nesting.
Changed = true;		Changed = true;
}		}

LLVM_DEBUG(dbgs() << "FnSpecialization: Number of specializations = "		LLVM_DEBUG(dbgs() << "FnSpecialization: Number of specializations = "
		labrinea (author): There might be more call sites to rewrite than those in the CallSpecBinding that we have already found, so we need to repeat this lookup of users here, but at least it now happens once per F rather than `Clones.size()` times, which was the case before.
		labrinea (author): No need to examine the lattices of arguments if it's the key of the CallSpecBinding.
<< NumFuncSpecialized << "\n");		<< NumFuncSpecialized << "\n");

// Remove any ssa_copy intrinsics that may have been introduced.		// Remove any ssa_copy intrinsics that may have been introduced.
removeSSACopy(M);		removeSSACopy(M);
return Changed;		return Changed;
		chill: This is better placed outside of `rewriteCallSites`, perhaps just after the call to `rewriteCallSites`.
		chill: Or a better idea: get the initial size of `CallSitesToRewrite` and decrement that number every time you update a call site. At the end, if this number has dropped to zero, mark the function unreachable.
		labrinea (author): That won't work for dead recursive functions.
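One possible reading of the objection in this thread, sketched as a toy model rather than the patch's actual logic: a recursive function keeps a call site inside its own body, so a counter that only reaches zero when every known call site has been redirected to a clone never fires, even when no other function calls the original any more. The `CallSite` struct and both helper functions below are hypothetical and exist only for illustration; they are not LLVM APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a call site of some callee F.
struct CallSite {
  std::string Caller; // name of the function containing the call
  bool Rewritten;     // true once the call was redirected to a clone
};

// The counter idea: start from the number of known call sites and
// decrement for each rewritten one; the callee counts as dead at zero.
bool deadByCounter(const std::vector<CallSite> &Sites) {
  std::size_t Remaining = Sites.size();
  for (const CallSite &CS : Sites) {
    if (CS.Rewritten) {
      assert(Remaining > 0 && "cannot rewrite more sites than exist");
      --Remaining;
    }
  }
  return Remaining == 0;
}

// Illustrative alternative (an assumption, not what the patch does):
// ignore calls that live inside the callee itself.
bool deadIgnoringSelfCalls(const std::string &Callee,
                           const std::vector<CallSite> &Sites) {
  for (const CallSite &CS : Sites)
    if (!CS.Rewritten && CS.Caller != Callee)
      return false;
  return true;
}
```

With one rewritten external call and one untouched self-call, the counter reports the function as live while the self-call-aware check reports it as dead.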
}		}
		labrinea (author): We are modifying the list whilst traversing it, so we swap the current element with the last one and reduce the iteration range by one.
		labrinea (author): The condition was different before, but I think this is correct.
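The swap-with-last idiom described in the inline note above can be sketched in isolation. This is a minimal standalone illustration over a `std::vector`, not the code from the patch; the helper name `removeBySwapping` is made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Removes every element for which Pred returns true by swapping it with the
// last element still in range and shrinking the range by one. Element order
// is not preserved. Returns the number of elements removed.
template <typename T, typename PredT>
std::size_t removeBySwapping(std::vector<T> &Items, PredT Pred) {
  std::size_t End = Items.size();
  std::size_t Idx = 0;
  while (Idx != End) {
    if (Pred(Items[Idx])) {
      // Do not advance Idx: the element swapped into this slot has not
      // been examined yet.
      std::swap(Items[Idx], Items[End - 1]);
      --End;
    } else {
      ++Idx;
    }
  }
  assert(End <= Items.size() && "the live range can only shrink");
  std::size_t Removed = Items.size() - End;
  // Drop the tail that now holds the removed elements.
  Items.erase(Items.begin() + static_cast<std::ptrdiff_t>(End), Items.end());
  return Removed;
}
```

The trade-off is constant-time removal per element at the cost of reordering the list, which is acceptable only when the traversal order of the remaining elements does not matter.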

llvm/lib/Transforms/IPO/IPO.cpp

[... 26 lines not shown ...]	void llvm::initializeIPO(PassRegistry &Registry) {
initializeArgPromotionPass(Registry);		initializeArgPromotionPass(Registry);
initializeAnnotation2MetadataLegacyPass(Registry);		initializeAnnotation2MetadataLegacyPass(Registry);
initializeCalledValuePropagationLegacyPassPass(Registry);		initializeCalledValuePropagationLegacyPassPass(Registry);
initializeConstantMergeLegacyPassPass(Registry);		initializeConstantMergeLegacyPassPass(Registry);
initializeCrossDSOCFIPass(Registry);		initializeCrossDSOCFIPass(Registry);
initializeDAEPass(Registry);		initializeDAEPass(Registry);
initializeDAHPass(Registry);		initializeDAHPass(Registry);
initializeForceFunctionAttrsLegacyPassPass(Registry);		initializeForceFunctionAttrsLegacyPassPass(Registry);
initializeFunctionSpecializationLegacyPassPass(Registry);
initializeGlobalDCELegacyPassPass(Registry);		initializeGlobalDCELegacyPassPass(Registry);
initializeGlobalOptLegacyPassPass(Registry);		initializeGlobalOptLegacyPassPass(Registry);
initializeGlobalSplitPass(Registry);		initializeGlobalSplitPass(Registry);
initializeHotColdSplittingLegacyPassPass(Registry);		initializeHotColdSplittingLegacyPassPass(Registry);
initializeIROutlinerLegacyPassPass(Registry);		initializeIROutlinerLegacyPassPass(Registry);
initializeAlwaysInlinerLegacyPassPass(Registry);		initializeAlwaysInlinerLegacyPassPass(Registry);
initializeSimpleInlinerPass(Registry);		initializeSimpleInlinerPass(Registry);
initializeInferFunctionAttrsLegacyPassPass(Registry);		initializeInferFunctionAttrsLegacyPassPass(Registry);
[... 100 lines not shown ...]

llvm/lib/Transforms/IPO/PassManagerBuilder.cpp

[... 684 lines not shown ...]	void PassManagerBuilder::populateModulePassManager(
if (AttributorRun & AttributorRunOption::MODULE)		if (AttributorRun & AttributorRunOption::MODULE)
MPM.add(createAttributorLegacyPass());		MPM.add(createAttributorLegacyPass());

addExtensionsToPM(EP_ModuleOptimizerEarly, MPM);		addExtensionsToPM(EP_ModuleOptimizerEarly, MPM);

if (OptLevel > 2)		if (OptLevel > 2)
MPM.add(createCallSiteSplittingPass());		MPM.add(createCallSiteSplittingPass());

// Propage constant function arguments by specializing the functions.
if (OptLevel > 2 && EnableFunctionSpecialization)
MPM.add(createFunctionSpecializationPass());

MPM.add(createIPSCCPPass()); // IP SCCP		MPM.add(createIPSCCPPass()); // IP SCCP
MPM.add(createCalledValuePropagationPass());		MPM.add(createCalledValuePropagationPass());

MPM.add(createGlobalOptimizerPass()); // Optimize out global vars		MPM.add(createGlobalOptimizerPass()); // Optimize out global vars
// Promote any localized global vars.		// Promote any localized global vars.
MPM.add(createPromoteMemoryToRegisterPass());		MPM.add(createPromoteMemoryToRegisterPass());

MPM.add(createDeadArgEliminationPass()); // Dead argument elimination		MPM.add(createDeadArgEliminationPass()); // Dead argument elimination
[... 215 lines not shown ...]	void PassManagerBuilder::addLTOOptimizationPasses(legacy::PassManagerBase &PM) {

// Infer attributes about declarations if possible.		// Infer attributes about declarations if possible.
PM.add(createInferFunctionAttrsLegacyPass());		PM.add(createInferFunctionAttrsLegacyPass());

if (OptLevel > 1) {		if (OptLevel > 1) {
// Split call-site with more constrained arguments.		// Split call-site with more constrained arguments.
PM.add(createCallSiteSplittingPass());		PM.add(createCallSiteSplittingPass());

// Propage constant function arguments by specializing the functions.
if (EnableFunctionSpecialization && OptLevel > 2)
PM.add(createFunctionSpecializationPass());

// Propagate constants at call sites into the functions they call. This		// Propagate constants at call sites into the functions they call. This
// opens opportunities for globalopt (and inlining) by substituting function		// opens opportunities for globalopt (and inlining) by substituting function
// pointers passed as arguments to direct uses of functions.		// pointers passed as arguments to direct uses of functions.
PM.add(createIPSCCPPass());		PM.add(createIPSCCPPass());

// Attach metadata to indirect call sites indicating the set of functions		// Attach metadata to indirect call sites indicating the set of functions
// they may target at run-time. This should follow IPSCCP.		// they may target at run-time. This should follow IPSCCP.
PM.add(createCalledValuePropagationPass());		PM.add(createCalledValuePropagationPass());
[... 210 lines not shown ...]

llvm/lib/Transforms/IPO/SCCP.cpp

	[... 16 lines not shown ...]
	#include "llvm/Analysis/TargetTransformInfo.h"			#include "llvm/Analysis/TargetTransformInfo.h"
	#include "llvm/InitializePasses.h"			#include "llvm/InitializePasses.h"
	#include "llvm/Transforms/IPO.h"			#include "llvm/Transforms/IPO.h"
	#include "llvm/Transforms/Scalar/SCCP.h"			#include "llvm/Transforms/Scalar/SCCP.h"
	#include "llvm/Transforms/Utils/SCCPSolver.h"			#include "llvm/Transforms/Utils/SCCPSolver.h"

	using namespace llvm;			using namespace llvm;

	PreservedAnalyses IPSCCPPass::run(Module &M, ModuleAnalysisManager &AM) {			PreservedAnalyses IPSCCPPass::run(Module &M, ModuleAnalysisManager &AM) {
				fhahn: Move this to the loop below, which uses it.
	const DataLayout &DL = M.getDataLayout();			const DataLayout &DL = M.getDataLayout();
	auto &FAM = AM.getResult<FunctionAnalysisManagerModuleProxy>(M).getManager();			auto &FAM = AM.getResult<FunctionAnalysisManagerModuleProxy>(M).getManager();
	auto GetTLI = [&FAM](Function &F) -> const TargetLibraryInfo & {			auto GetTLI = [&FAM](Function &F) -> const TargetLibraryInfo & {
	return FAM.getResult<TargetLibraryAnalysis>(F);			return FAM.getResult<TargetLibraryAnalysis>(F);
	};			};
				auto GetTTI = [&FAM](Function &F) -> TargetTransformInfo & {
				return FAM.getResult<TargetIRAnalysis>(F);
				};
				auto GetAC = [&FAM](Function &F) -> AssumptionCache & {
				return FAM.getResult<AssumptionAnalysis>(F);
				};
	auto getAnalysis = [&FAM](Function &F) -> AnalysisResultsForFn {			auto getAnalysis = [&FAM](Function &F) -> AnalysisResultsForFn {
	DominatorTree &DT = FAM.getResult<DominatorTreeAnalysis>(F);			DominatorTree &DT = FAM.getResult<DominatorTreeAnalysis>(F);
	return {			return {
	std::make_unique<PredicateInfo>(F, DT, FAM.getResult<AssumptionAnalysis>(F)),			std::make_unique<PredicateInfo>(F, DT, FAM.getResult<AssumptionAnalysis>(F)),
	&DT, FAM.getCachedResult<PostDominatorTreeAnalysis>(F)};			&DT, FAM.getCachedResult<PostDominatorTreeAnalysis>(F)};
				chill: This part was added for FunctionSpecialization; if func spec is disabled, maybe don't pass along the `LoopAnalysis`?
	};			};

	if (!runIPSCCP(M, DL, GetTLI, getAnalysis))			if (!runIPSCCP(M, DL, GetTLI, GetTTI, GetAC, getAnalysis))
	return PreservedAnalyses::all();			return PreservedAnalyses::all();

	PreservedAnalyses PA;			PreservedAnalyses PA;
	PA.preserve<DominatorTreeAnalysis>();			PA.preserve<DominatorTreeAnalysis>();
				chill: Now that we added `LoopAnalysis` we may as well preserve it too. (I should have included it in the patch which introduced the `LoopAnalysis` here.)
				labrinea (author): I tried this but the compiler crashes, probably because SCCP deletes dead basic blocks.
	PA.preserve<PostDominatorTreeAnalysis>();			PA.preserve<PostDominatorTreeAnalysis>();
	PA.preserve<FunctionAnalysisManagerModuleProxy>();			PA.preserve<FunctionAnalysisManagerModuleProxy>();
	return PA;			return PA;
	}			}

	namespace {			namespace {

	//===--------------------------------------------------------------------===//			//===--------------------------------------------------------------------===//
	//			//
	/// IPSCCP Class - This class implements interprocedural Sparse Conditional			/// IPSCCP Class - This class implements interprocedural Sparse Conditional
	/// Constant Propagation.			/// Constant Propagation.
	///			///
	class IPSCCPLegacyPass : public ModulePass {			class IPSCCPLegacyPass : public ModulePass {
	public:			public:
	static char ID;			static char ID;

	IPSCCPLegacyPass() : ModulePass(ID) {			IPSCCPLegacyPass() : ModulePass(ID) {
	initializeIPSCCPLegacyPassPass(*PassRegistry::getPassRegistry());			initializeIPSCCPLegacyPassPass(*PassRegistry::getPassRegistry());
	}			}

	bool runOnModule(Module &M) override {			bool runOnModule(Module &M) override {
	if (skipModule(M))			if (skipModule(M))
	return false;			return false;

	const DataLayout &DL = M.getDataLayout();			const DataLayout &DL = M.getDataLayout();
	auto GetTLI = [this](Function &F) -> const TargetLibraryInfo & {			auto GetTLI = [this](Function &F) -> const TargetLibraryInfo & {
	return this->getAnalysis<TargetLibraryInfoWrapperPass>().getTLI(F);			return this->getAnalysis<TargetLibraryInfoWrapperPass>().getTLI(F);
	};			};
				auto GetTTI = [this](Function &F) -> TargetTransformInfo & {
				return this->getAnalysis<TargetTransformInfoWrapperPass>().getTTI(F);
				};
				auto GetAC = [this](Function &F) -> AssumptionCache & {
				return this->getAnalysis<AssumptionCacheTracker>().getAssumptionCache(F);
				};
	auto getAnalysis = [this](Function &F) -> AnalysisResultsForFn {			auto getAnalysis = [this](Function &F) -> AnalysisResultsForFn {
	DominatorTree &DT =			DominatorTree &DT =
	this->getAnalysis<DominatorTreeWrapperPass>(F).getDomTree();			this->getAnalysis<DominatorTreeWrapperPass>(F).getDomTree();
	return {			return {
	std::make_unique<PredicateInfo>(			std::make_unique<PredicateInfo>(
	F, DT,			F, DT,
	this->getAnalysis<AssumptionCacheTracker>().getAssumptionCache(			this->getAnalysis<AssumptionCacheTracker>().getAssumptionCache(
	F)),			F)),
	nullptr, // We cannot preserve the DT or PDT with the legacy pass			nullptr, // We cannot preserve the DT or PDT with the legacy pass
	nullptr}; // manager, so set them to nullptr.			nullptr}; // manager, so set them to nullptr.
	};			};

	return runIPSCCP(M, DL, GetTLI, getAnalysis);			return runIPSCCP(M, DL, GetTLI, GetTTI, GetAC, getAnalysis);
	}			}

	void getAnalysisUsage(AnalysisUsage &AU) const override {			void getAnalysisUsage(AnalysisUsage &AU) const override {
	AU.addRequired<AssumptionCacheTracker>();			AU.addRequired<AssumptionCacheTracker>();
	AU.addRequired<DominatorTreeWrapperPass>();			AU.addRequired<DominatorTreeWrapperPass>();
	AU.addRequired<TargetLibraryInfoWrapperPass>();			AU.addRequired<TargetLibraryInfoWrapperPass>();
				AU.addRequired<TargetTransformInfoWrapperPass>();
	}			}
	};			};

	} // end anonymous namespace			} // end anonymous namespace

	char IPSCCPLegacyPass::ID = 0;			char IPSCCPLegacyPass::ID = 0;

	INITIALIZE_PASS_BEGIN(IPSCCPLegacyPass, "ipsccp",			INITIALIZE_PASS_BEGIN(IPSCCPLegacyPass, "ipsccp",
	"Interprocedural Sparse Conditional Constant Propagation",			"Interprocedural Sparse Conditional Constant Propagation",
	false, false)			false, false)
	INITIALIZE_PASS_DEPENDENCY(AssumptionCacheTracker)			INITIALIZE_PASS_DEPENDENCY(AssumptionCacheTracker)
	INITIALIZE_PASS_DEPENDENCY(DominatorTreeWrapperPass)			INITIALIZE_PASS_DEPENDENCY(DominatorTreeWrapperPass)
	INITIALIZE_PASS_DEPENDENCY(TargetLibraryInfoWrapperPass)			INITIALIZE_PASS_DEPENDENCY(TargetLibraryInfoWrapperPass)
	INITIALIZE_PASS_END(IPSCCPLegacyPass, "ipsccp",			INITIALIZE_PASS_END(IPSCCPLegacyPass, "ipsccp",
	"Interprocedural Sparse Conditional Constant Propagation",			"Interprocedural Sparse Conditional Constant Propagation",
	false, false)			false, false)

	// createIPSCCPPass - This is the public interface to this file.			// createIPSCCPPass - This is the public interface to this file.
	ModulePass *llvm::createIPSCCPPass() { return new IPSCCPLegacyPass(); }			ModulePass *llvm::createIPSCCPPass() { return new IPSCCPLegacyPass(); }

	PreservedAnalyses FunctionSpecializationPass::run(Module &M,
	ModuleAnalysisManager &AM) {
	const DataLayout &DL = M.getDataLayout();
	auto &FAM = AM.getResult<FunctionAnalysisManagerModuleProxy>(M).getManager();
	auto GetTLI = [&FAM](Function &F) -> TargetLibraryInfo & {
	return FAM.getResult<TargetLibraryAnalysis>(F);
	};
	auto GetTTI = [&FAM](Function &F) -> TargetTransformInfo & {
	return FAM.getResult<TargetIRAnalysis>(F);
	};
	auto GetAC = [&FAM](Function &F) -> AssumptionCache & {
	return FAM.getResult<AssumptionAnalysis>(F);
	};
	auto GetAnalysis = [&FAM](Function &F) -> AnalysisResultsForFn {
	DominatorTree &DT = FAM.getResult<DominatorTreeAnalysis>(F);
	return {std::make_unique<PredicateInfo>(
	F, DT, FAM.getResult<AssumptionAnalysis>(F)),
	&DT, FAM.getCachedResult<PostDominatorTreeAnalysis>(F)};
	};

	if (!runFunctionSpecialization(M, DL, GetTLI, GetTTI, GetAC, GetAnalysis))
	return PreservedAnalyses::all();

	PreservedAnalyses PA;
	PA.preserve<DominatorTreeAnalysis>();
	PA.preserve<PostDominatorTreeAnalysis>();
	PA.preserve<FunctionAnalysisManagerModuleProxy>();
	return PA;
	}

	namespace {
	struct FunctionSpecializationLegacyPass : public ModulePass {
	static char ID; // Pass identification, replacement for typeid
	FunctionSpecializationLegacyPass() : ModulePass(ID) {}

	void getAnalysisUsage(AnalysisUsage &AU) const override {
	AU.addRequired<AssumptionCacheTracker>();
	AU.addRequired<DominatorTreeWrapperPass>();
	AU.addRequired<TargetLibraryInfoWrapperPass>();
	AU.addRequired<TargetTransformInfoWrapperPass>();
	}

	virtual bool runOnModule(Module &M) override {
	if (skipModule(M))
	return false;

	const DataLayout &DL = M.getDataLayout();
	auto GetTLI = [this](Function &F) -> TargetLibraryInfo & {
	return this->getAnalysis<TargetLibraryInfoWrapperPass>().getTLI(F);
	};
	auto GetTTI = [this](Function &F) -> TargetTransformInfo & {
	return this->getAnalysis<TargetTransformInfoWrapperPass>().getTTI(F);
	};
	auto GetAC = [this](Function &F) -> AssumptionCache & {
	return this->getAnalysis<AssumptionCacheTracker>().getAssumptionCache(F);
	};

	auto GetAnalysis = [this](Function &F) -> AnalysisResultsForFn {
	DominatorTree &DT =
	this->getAnalysis<DominatorTreeWrapperPass>(F).getDomTree();
	return {
	std::make_unique<PredicateInfo>(
	F, DT,
	this->getAnalysis<AssumptionCacheTracker>().getAssumptionCache(
	F)),
	nullptr, // We cannot preserve the DT or PDT with the legacy pass
	nullptr}; // manager, so set them to nullptr.
	};
	return runFunctionSpecialization(M, DL, GetTLI, GetTTI, GetAC, GetAnalysis);
	}
	};
	} // namespace

	char FunctionSpecializationLegacyPass::ID = 0;

	INITIALIZE_PASS_BEGIN(
	FunctionSpecializationLegacyPass, "function-specialization",
	"Propagate constant arguments by specializing the function", false, false)

	INITIALIZE_PASS_DEPENDENCY(AssumptionCacheTracker)
	INITIALIZE_PASS_DEPENDENCY(DominatorTreeWrapperPass)
	INITIALIZE_PASS_DEPENDENCY(TargetTransformInfoWrapperPass)
	INITIALIZE_PASS_DEPENDENCY(TargetLibraryInfoWrapperPass)
	INITIALIZE_PASS_END(FunctionSpecializationLegacyPass, "function-specialization",
	"Propagate constant arguments by specializing the function",
	false, false)

	ModulePass *llvm::createFunctionSpecializationPass() {
	return new FunctionSpecializationLegacyPass();
	}

llvm/lib/Transforms/Scalar/CMakeLists.txt

[... 88 lines not shown ...]	add_llvm_component_library(LLVMScalarOpts
COMPONENT_NAME		COMPONENT_NAME
Scalar		Scalar

LINK_COMPONENTS		LINK_COMPONENTS
AggressiveInstCombine		AggressiveInstCombine
Analysis		Analysis
Core		Core
InstCombine		InstCombine
		IPO
		chill (Not Done): Why add IPO here?
		labrinea (author, Not Done): I vaguely remember a link-time error without this change. See also `llvm/utils/gn/secondary/llvm/lib/Transforms/Scalar/BUILD.gn` at the bottom of this diff. The IPSCCP pass now depends on the FunctionSpecializer, whose cpp file is under the IPO directory.
		chill (Not Done): IPO already depends on Scalar, i.e. in `IPO/CMakeLists.txt` we have `... COMPONENT_NAME IPO LINK_COMPONENTS ... Scalar ...`. This looks like a circular dependency. Perhaps `FunctionSpecialization` needs to go to `Utils` (alongside `SCCPSolver`), or `runIPSCCP` needs to go to `IPO/SCCP.cpp`, or both.
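The circular-dependency concern in this thread can be illustrated with a small standalone check. This is a toy model, not part of LLVM's build system: the `DepGraph` type and the `wouldCreateCycle` helper are hypothetical, and the graph contains only the edges mentioned on this page (IPO links Scalar; Scalar links the components listed above).

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Component name -> components it links against (hypothetical model).
using DepGraph = std::map<std::string, std::vector<std::string>>;

// Depth-first search: can From reach To by following existing edges?
static bool reaches(const DepGraph &G, const std::string &From,
                    const std::string &To, std::set<std::string> &Seen) {
  if (!Seen.insert(From).second)
    return false; // already visited
  auto It = G.find(From);
  if (It == G.end())
    return false; // no recorded dependencies
  for (const std::string &Dep : It->second)
    if (Dep == To || reaches(G, Dep, To, Seen))
      return true;
  return false;
}

// Adding the edge From -> To closes a cycle exactly when To can already
// reach From (or when the edge is a self-loop).
bool wouldCreateCycle(const DepGraph &G, const std::string &From,
                      const std::string &To) {
  assert(!From.empty() && !To.empty() && "component names must be set");
  if (From == To)
    return true;
  std::set<std::string> Seen;
  return reaches(G, To, From, Seen);
}
```

In this model, adding `IPO` to Scalar's link components is flagged because `IPO` already reaches `Scalar`, whereas adding a leaf component such as `InstCombine` is not.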
Support		Support
TransformUtils		TransformUtils
)		)

llvm/lib/Transforms/Scalar/SCCP.cpp

[... 45 lines not shown ...]
#include "llvm/IR/User.h"		#include "llvm/IR/User.h"
#include "llvm/IR/Value.h"		#include "llvm/IR/Value.h"
#include "llvm/InitializePasses.h"		#include "llvm/InitializePasses.h"
#include "llvm/Pass.h"		#include "llvm/Pass.h"
#include "llvm/Support/Casting.h"		#include "llvm/Support/Casting.h"
#include "llvm/Support/Debug.h"		#include "llvm/Support/Debug.h"
#include "llvm/Support/ErrorHandling.h"		#include "llvm/Support/ErrorHandling.h"
#include "llvm/Support/raw_ostream.h"		#include "llvm/Support/raw_ostream.h"
		#include "llvm/Transforms/IPO/FunctionSpecialization.h"
#include "llvm/Transforms/Scalar.h"		#include "llvm/Transforms/Scalar.h"
#include "llvm/Transforms/Utils/Local.h"		#include "llvm/Transforms/Utils/Local.h"
#include "llvm/Transforms/Utils/SCCPSolver.h"		#include "llvm/Transforms/Utils/SCCPSolver.h"
#include <cassert>		#include <cassert>
#include <utility>		#include <utility>
#include <vector>		#include <vector>

using namespace llvm;		using namespace llvm;

#define DEBUG_TYPE "sccp"		#define DEBUG_TYPE "sccp"

STATISTIC(NumInstRemoved, "Number of instructions removed");		STATISTIC(NumInstRemoved, "Number of instructions removed");
STATISTIC(NumDeadBlocks , "Number of basic blocks unreachable");		STATISTIC(NumDeadBlocks , "Number of basic blocks unreachable");
STATISTIC(NumInstReplaced,		STATISTIC(NumInstReplaced,
"Number of instructions replaced with (simpler) instruction");		"Number of instructions replaced with (simpler) instruction");

STATISTIC(IPNumInstRemoved, "Number of instructions removed by IPSCCP");		STATISTIC(IPNumInstRemoved, "Number of instructions removed by IPSCCP");
STATISTIC(IPNumArgsElimed ,"Number of arguments constant propagated by IPSCCP");		STATISTIC(IPNumArgsElimed ,"Number of arguments constant propagated by IPSCCP");
STATISTIC(IPNumGlobalConst, "Number of globals found to be constant by IPSCCP");		STATISTIC(IPNumGlobalConst, "Number of globals found to be constant by IPSCCP");
STATISTIC(		STATISTIC(
IPNumInstReplaced,		IPNumInstReplaced,
"Number of instructions replaced with (simpler) instruction by IPSCCP");		"Number of instructions replaced with (simpler) instruction by IPSCCP");

		static cl::opt<bool> SpecializeFunctions("specialize-functions", cl::init(false),
		cl::Hidden, cl::desc("Enable function specialization"));

// Helper to check if \p LV is either a constant or a constant		// Helper to check if \p LV is either a constant or a constant
// range with a single element. This should cover exactly the same cases as the		// range with a single element. This should cover exactly the same cases as the
// old ValueLatticeElement::isConstant() and is intended to be used in the		// old ValueLatticeElement::isConstant() and is intended to be used in the
// transition to ValueLatticeElement.		// transition to ValueLatticeElement.
static bool isConstant(const ValueLatticeElement &LV) {		static bool isConstant(const ValueLatticeElement &LV) {
return LV.isConstant() ||		return LV.isConstant() ||
(LV.isConstantRange() && LV.getConstantRange().isSingleElement());		(LV.isConstantRange() && LV.getConstantRange().isSingleElement());
}		}

// Helper to check if \p LV is either overdefined or a constant range with more		// Helper to check if \p LV is either overdefined or a constant range with more
// than a single element. This should cover exactly the same cases as the old		// than a single element. This should cover exactly the same cases as the old
// ValueLatticeElement::isOverdefined() and is intended to be used in the		// ValueLatticeElement::isOverdefined() and is intended to be used in the
// transition to ValueLatticeElement.		// transition to ValueLatticeElement.
static bool isOverdefined(const ValueLatticeElement &LV) {		static bool isOverdefined(const ValueLatticeElement &LV) {
return !LV.isUnknownOrUndef() && !isConstant(LV);		return !LV.isUnknownOrUndef() && !isConstant(LV);
}		}

static bool tryToReplaceWithConstant(SCCPSolver &Solver, Value *V) {		static bool tryToReplaceWithConstant(SCCPSolver &Solver, Value *V) {
Constant *Const = nullptr;		Constant *Const = nullptr;
		chill: Wouldn't it work without the temporary vector? `markUsersAsChanged` would go over each user, look at the user's operands (including `Old`), and find the `New` (which is some constant) from the lattice values map. Thus we would maybe get just: `Solver.markUsersAsChanged(Old); Old->replaceAllUsesWith(New);`
if (V->getType()->isStructTy()) {		if (V->getType()->isStructTy()) {
std::vector<ValueLatticeElement> IVs = Solver.getStructLatticeValueFor(V);		std::vector<ValueLatticeElement> IVs = Solver.getStructLatticeValueFor(V);
if (llvm::any_of(IVs, isOverdefined))		if (llvm::any_of(IVs, isOverdefined))
return false;		return false;
std::vector<Constant *> ConstVals;		std::vector<Constant *> ConstVals;
auto *ST = cast<StructType>(V->getType());		auto *ST = cast<StructType>(V->getType());
for (unsigned i = 0, e = ST->getNumElements(); i != e; ++i) {		for (unsigned i = 0, e = ST->getNumElements(); i != e; ++i) {
ValueLatticeElement V = IVs[i];		ValueLatticeElement V = IVs[i];
[... 27 lines not shown ...]	if (CB && ((CB->isMustTailCall() && !CB->isSafeToRemove()) ||

LLVM_DEBUG(dbgs() << " Can\'t treat the result of call " << *CB		LLVM_DEBUG(dbgs() << " Can\'t treat the result of call " << *CB
<< " as a constant\n");		<< " as a constant\n");
return false;		return false;
}		}

LLVM_DEBUG(dbgs() << " Constant: " << Const << " = " << V << '\n');		LLVM_DEBUG(dbgs() << " Constant: " << Const << " = " << V << '\n');

// Replaces all of the uses of a variable with uses of the constant.		// Replaces all of the uses of a variable with uses of the constant.
V->replaceAllUsesWith(Const);		V->replaceAllUsesWith(Const);
return true;		return true;
}		}

static bool simplifyInstsInBlock(SCCPSolver &Solver, BasicBlock &BB,		static bool simplifyInstsInBlock(SCCPSolver &Solver, BasicBlock &BB,
		chill: Is there a specific reason to remove the instruction from the block? If not, I'd suggest doing deletion in a single place, as opposed to spreading parts of it all over.
		labrinea (author): I am not entirely sure. I wanted to avoid accidentally revisiting this instruction in any of simplifyInstsInBlock(), solve(), or resolvedUndefsIn(). For simplifyInstsInBlock() I could skip the instruction if it's present in `ToDelete`. For the others I don't know what the consequences of revisiting would be. I need to run some tests first.
		chill: I can't see why anything would go wrong if the instruction is revisited. Do we know if the instruction is safe to remove? It could be `SDiv`/`SRem` with a zero divisor.
		chill: Actually, never mind, we're not replacing the instruction with a constant but with another instruction.
SmallPtrSetImpl<Value *> &InsertedValues,		SmallPtrSetImpl<Value *> &InsertedValues,
Statistic &InstRemovedStat,		Statistic &InstRemovedStat,
Statistic &InstReplacedStat) {		Statistic &InstReplacedStat) {
bool MadeChanges = false;		bool MadeChanges = false;
for (Instruction &Inst : make_early_inc_range(BB)) {		for (Instruction &Inst : make_early_inc_range(BB)) {
if (Inst.getType()->isVoidTy())		if (Inst.getType()->isVoidTy())
		labrinea (author): Just found that we need to do the same inside `replaceSignedInst()` too. I will move this code into a function.
		chill: Would it be possible to call `markUsersAsChanged` here?
		labrinea (author): I think we can't, because if we replace the uses first then the user list of the old value will be empty. Can we call markUsersAsChanged before we replaceAllUsesWith the new value? Btw, markUsersAsChanged is private to the SCCPInstVisitor, but I suppose I could make it public if need be.
		labrinea (author): Actually, I could call markUsersAsChanged on the new instruction after replacing the uses of the old instruction with it.
		chill: OK, let's leave it hanging for now, until I can take a look on top of the latest trunk. Ideally, we are trying to avoid changing code until the Solver is done. Here we have found that an instruction has a constant lattice value: we should not replace the users' operands right away, but notify the Solver. The Solver in turn would add the instructions that need reexamining to the instruction worklist and update their lattice values the next time we invoke `Solver.solve()`. Most likely `SCCPSolver::visit` should become private. The Solver (and the SCCP algorithm in general) is driven by its worklists, and we should stick to this design: if you want something done, add it to the worklist.
		labrinea (author): Update: I tried this. It works for 'some' cases. Instead of replacing values with constants, I create mappings from the old to the new value and replace the uses only after all the solving is done. The specialization of recursive functions doesn't work, because it relies on finding allocas of constant integers. The rewriting of callsites doesn't work either if the actual arguments have been constant propagated prior to specialization but the old value hasn't been replaced yet. In theory I could pass the mappings from SCCP on to the specializer, but that seems overly complicated.
		chill: > Instead of replacing values with constants I create mappings from the old to the new value ..
		But isn't this what the `ValueState` already contains?
		> Also the rewriting of callsites doesn't work either if the actual arguments have been constant propagated prior to specialization, but the old value hasn't been replaced yet.
		Well, `FunctionSpecializer::rewriteCallSites` and everything else should look up lattice values, not work directly with operands. But OK, let's not make too many changes at once and revisit it later.
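The ordering question in the thread above (mark users before or after replacing uses) can be illustrated with a toy model. This is only a sketch: `ToyValue` and the free functions `markUsersAsChanged` and `replaceAllUsesWith` below are hypothetical stand-ins written for this example, not the LLVM classes or the solver's actual methods.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Toy stand-in for an IR value with a use list (NOT llvm::Value).
struct ToyValue {
  std::vector<ToyValue *> Operands; // values this one reads
  std::vector<ToyValue *> Users;    // values that read this one
  bool Queued = false;              // "needs revisiting" flag

  void addOperand(ToyValue *Op) {
    Operands.push_back(Op);
    Op->Users.push_back(this);
  }
};

// Hypothetical analogue of marking users as changed: queue every user of V
// and report how many were queued.
inline unsigned markUsersAsChanged(ToyValue &V) {
  unsigned NumQueued = 0;
  for (ToyValue *U : V.Users) {
    U->Queued = true;
    ++NumQueued;
  }
  return NumQueued;
}

// Hypothetical analogue of replaceAllUsesWith: every user of Old now reads
// New, and Old's use list becomes empty.
inline void replaceAllUsesWith(ToyValue &Old, ToyValue &New) {
  assert(&Old != &New && "cannot replace a value with itself");
  for (ToyValue *U : Old.Users) {
    std::replace(U->Operands.begin(), U->Operands.end(), &Old, &New);
    New.Users.push_back(U);
  }
  Old.Users.clear();
}

// Replace first, then mark through the old value: no user is left to queue.
inline unsigned replaceThenMarkOld() {
  ToyValue Old, New, User;
  User.addOperand(&Old);
  replaceAllUsesWith(Old, New);
  return markUsersAsChanged(Old);
}

// Replace first, then mark through the new value: the user is found again.
inline unsigned replaceThenMarkNew() {
  ToyValue Old, New, User;
  User.addOperand(&Old);
  replaceAllUsesWith(Old, New);
  return markUsersAsChanged(New);
}

// Mark through the old value first, then replace: the user is queued too.
inline unsigned markOldThenReplace() {
  ToyValue Old, New, User;
  User.addOperand(&Old);
  unsigned NumQueued = markUsersAsChanged(Old);
  replaceAllUsesWith(Old, New);
  return NumQueued;
}
```

In this model, marking through the old value after the replacement queues nothing, which matches the concern raised above; either of the other two orderings reaches the user.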
continue;		continue;
if (tryToReplaceWithConstant(Solver, &Inst)) {		if (tryToReplaceWithConstant(Solver, &Inst)) {
if (Inst.isSafeToRemove())		if (Inst.isSafeToRemove())
Inst.eraseFromParent();		Inst.eraseFromParent();

MadeChanges = true;		MadeChanges = true;
		chill: Likewise.
++InstRemovedStat;		++InstRemovedStat;
} else if (isa<SExtInst>(&Inst)) {		} else if (isa<SExtInst>(&Inst)) {
Value *ExtOp = Inst.getOperand(0);		Value *ExtOp = Inst.getOperand(0);
if (isa<Constant>(ExtOp) || InsertedValues.count(ExtOp))		if (isa<Constant>(ExtOp) || InsertedValues.count(ExtOp))
continue;		continue;
const ValueLatticeElement &IV = Solver.getLatticeValueFor(ExtOp);		const ValueLatticeElement &IV = Solver.getLatticeValueFor(ExtOp);
if (!IV.isConstantRange(/*UndefAllowed=*/false))		if (!IV.isConstantRange(/*UndefAllowed=*/false))
continue;		continue;
[... 251 lines hidden, in: if (FeasibleSuccessors.size() == 1) { ...]
llvm_unreachable("Must have at least one feasible successor");		llvm_unreachable("Must have at least one feasible successor");
}		}
return true;		return true;
}		}

bool llvm::runIPSCCP(		bool llvm::runIPSCCP(
Module &M, const DataLayout &DL,		Module &M, const DataLayout &DL,
std::function<const TargetLibraryInfo &(Function &)> GetTLI,		std::function<const TargetLibraryInfo &(Function &)> GetTLI,
		std::function<TargetTransformInfo &(Function &)> GetTTI,
		std::function<AssumptionCache &(Function &)> GetAC,
function_ref<AnalysisResultsForFn(Function &)> getAnalysis) {		function_ref<AnalysisResultsForFn(Function &)> getAnalysis) {
SCCPSolver Solver(DL, GetTLI, M.getContext());		SCCPSolver Solver(DL, GetTLI, M.getContext());
		FunctionSpecializer Specializer(Solver, M, GetAC, GetTTI, GetTLI);

// Loop over all functions, marking arguments to those with their addresses		// Loop over all functions, marking arguments to those with their addresses
// taken or that are external as overdefined.		// taken or that are external as overdefined.
for (Function &F : M) {		for (Function &F : M) {
if (F.isDeclaration())		if (F.isDeclaration())
continue;		continue;

Solver.addAnalysis(F, getAnalysis(F));		Solver.addAnalysis(F, getAnalysis(F));
[... 35 lines hidden, in: while (ResolvedUndefs) { ...]
ResolvedUndefs = false;		ResolvedUndefs = false;
for (Function &F : M) {		for (Function &F : M) {
if (Solver.resolvedUndefsIn(F))		if (Solver.resolvedUndefsIn(F))
ResolvedUndefs = true;		ResolvedUndefs = true;
}		}
if (ResolvedUndefs)		if (ResolvedUndefs)
Solver.solve();		Solver.solve();
}		}

		chill: IMHO, the invocation of the `FunctionSpecialization` pass ought to happen in this place. The general flow would be:
		  1. Initialise the solver.
		  2. Run the solver once (`Solver.solve()` + `resolvedUndefsIn` loop).
		  3. Run function specialisation.
		  4. Run the solver again.
		  5. Optionally go to 2.
		  6. Do replacements (from line 512 on).
		At no point before the last step should the passes replace or delete anything (well, except the called-function operand for cloned functions). If an operand/argument is determined to be a constant, it does not need to be replaced right away, because the passes should consult its lattice value. Yeah, the devil is in the details, but this is the approach to merging the two passes, as I see it.
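The control flow proposed in the comment above can be sketched as a small driver that only records the order of the steps. Everything here is hypothetical: `runMergedPipeline`, its parameters, and the step names are invented for illustration, and "a specialization round is profitable" is reduced to a counter. It is not how the patch is implemented.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy driver for the proposed flow. ProfitableRounds models how many
// specialization attempts find something to specialize; MaxIters bounds the
// "optionally go to 2" loop. Returns the sequence of steps taken.
inline std::vector<std::string> runMergedPipeline(unsigned ProfitableRounds,
                                                  unsigned MaxIters) {
  assert(MaxIters > 0 && "need at least one specialization attempt");
  std::vector<std::string> Log;

  Log.push_back("init");  // 1. Initialise the solver.
  Log.push_back("solve"); // 2. Solve to a fixed point.

  for (unsigned I = 0; I != MaxIters; ++I) {
    Log.push_back("specialize"); // 3. Try to specialize.
    if (ProfitableRounds == 0)
      break; // Nothing was specialized, so the lattice is unchanged.
    --ProfitableRounds;
    Log.push_back("solve"); // 4. Re-solve with the new clones (5. repeat).
  }

  // 6. Only now touch the IR: replacements and deletions happen once.
  Log.push_back("replace");
  return Log;
}
```

The point of the sketch is the single "replace" step at the end: in this ordering, both the solver and the specializer would have to query lattice values rather than rely on operands that have already been rewritten.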
bool MadeChanges = false;		bool MadeChanges = false;

// Iterate over all of the instructions in the module, replacing them with		// Iterate over all of the instructions in the module, replacing them with
// constants if we have found them to be of constant values.		// constants if we have found them to be of constant values.

		chill: I would suggest not creating a vector of all the functions in the module, as they could be quite a lot (e.g. in LTO) and thus trigger several heap allocations for `WorkList`. `solveWhileResolvedUndefIn` is quite small and could be overloaded for a `Module *` parameter. I considered making this function a template along the lines of:
		    template<typename RangeT>
		    void printNames(RangeT &&R) {
		      for (auto &F : R)
		        llvm::dbgs() << magic(F)->getName();
		    }
		    std::vector<llvm::Function *> v;
		    llvm::Module *M;
		    int main() {
		      printNames(M->functions());
		      printNames(v);
		    }
		but couldn't come up with `magic`. As for `propagateConstants`, it could be done with a few overloads as well:
		    static bool propagateConstants(SCCPSolver &Solver, Function *F, SmallPtrSetImpl<Instruction *> &ToDelete);
		    static bool propagateConstants(SCCPSolver &Solver, SmallVectorImpl<Function *> &WorkList, SmallPtrSetImpl<Instruction *> &ToDelete) {
		      for (Function *F : WorkList)
		        propagateConstants(Solve, F, ToDelete);
		    }
		    static bool propagateConstants(SCCPSolver &Solver, Module *M, SmallPtrSetImpl<Instruction *> &ToDelete) {
		      for (auto &F : Module)
		        propagateConstants(Solve, &F, ToDelete);
		    }
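One possible shape for the missing `magic` helper is a pair of overloads that normalise a range element to a reference, whether the range yields objects or pointers. The sketch below uses a toy `ToyFunction` type and the hypothetical names `asFunction` and `collectNames`; it is not LLVM code and only illustrates the overload idea.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for a function object (NOT llvm::Function).
struct ToyFunction {
  std::string Name;
};

// Hypothetical "magic": accept either a reference or a pointer and always
// hand back a reference.
inline ToyFunction &asFunction(ToyFunction &F) { return F; }
inline ToyFunction &asFunction(ToyFunction *F) {
  assert(F && "null function pointer in work list");
  return *F;
}

// Works for ranges of ToyFunction and for ranges of ToyFunction *.
template <typename RangeT>
std::vector<std::string> collectNames(RangeT &&R) {
  std::vector<std::string> Names;
  for (auto &&F : R)
    Names.push_back(asFunction(F).Name);
  return Names;
}
```

With a helper like this, a single template could serve both a module-style range (elements are objects) and a work list (elements are pointers), so no temporary vector of all functions would be needed.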
for (Function &F : M) {		for (Function &F : M) {
if (F.isDeclaration())		if (F.isDeclaration())
continue;		continue;

SmallVector<BasicBlock *, 512> BlocksToErase;		SmallVector<BasicBlock *, 512> BlocksToErase;

if (Solver.isBlockExecutable(&F.front())) {		if (Solver.isBlockExecutable(&F.front())) {
bool ReplacedPointerArg = false;		bool ReplacedPointerArg = false;
[... 57 lines hidden, in: for (Function &F : M) { ...]
BasicBlock *NewUnreachableBB = nullptr;		BasicBlock *NewUnreachableBB = nullptr;
for (BasicBlock &BB : F)		for (BasicBlock &BB : F)
MadeChanges |= removeNonFeasibleEdges(Solver, &BB, DTU, NewUnreachableBB);		MadeChanges |= removeNonFeasibleEdges(Solver, &BB, DTU, NewUnreachableBB);

for (BasicBlock *DeadBB : BlocksToErase)		for (BasicBlock *DeadBB : BlocksToErase)
if (!DeadBB->hasAddressTaken())		if (!DeadBB->hasAddressTaken())
DTU.deleteBB(DeadBB);		DTU.deleteBB(DeadBB);

		if (!SpecializeFunctions) {
		// The Function Specializer will delete those after completion.
for (BasicBlock &BB : F) {		for (BasicBlock &BB : F) {
for (Instruction &Inst : llvm::make_early_inc_range(BB)) {		for (Instruction &Inst : llvm::make_early_inc_range(BB)) {
if (Solver.getPredicateInfoFor(&Inst)) {		if (Solver.getPredicateInfoFor(&Inst)) {
if (auto *II = dyn_cast<IntrinsicInst>(&Inst)) {		if (auto *II = dyn_cast<IntrinsicInst>(&Inst)) {
if (II->getIntrinsicID() == Intrinsic::ssa_copy) {		if (II->getIntrinsicID() == Intrinsic::ssa_copy) {
Value *Op = II->getOperand(0);		Value *Op = II->getOperand(0);
Inst.replaceAllUsesWith(Op);		Inst.replaceAllUsesWith(Op);
Inst.eraseFromParent();		Inst.eraseFromParent();
}		}
}		}
}		}
}		}
}		}
}		}
		}

// If we inferred constant or undef return values for a function, we replaced		// If we inferred constant or undef return values for a function, we replaced
// all call uses with the inferred value. This means we don't need to bother		// all call uses with the inferred value. This means we don't need to bother
// actually returning anything from the function. Replace all return		// actually returning anything from the function. Replace all return
// instructions with return undef.		// instructions with return undef.
//		//
// Do this in two stages: first identify the functions we should process, then		// Do this in two stages: first identify the functions we should process, then
// actually zap their returns. This is important because we can only do this		// actually zap their returns. This is important because we can only do this
[... 91 lines hidden, in: while (!GV->use_empty()) { ...]
StoreInst *SI = cast<StoreInst>(GV->user_back());		StoreInst *SI = cast<StoreInst>(GV->user_back());
SI->eraseFromParent();		SI->eraseFromParent();
MadeChanges = true;		MadeChanges = true;
}		}
M.getGlobalList().erase(GV);		M.getGlobalList().erase(GV);
++IPNumGlobalConst;		++IPNumGlobalConst;
}		}

		if (SpecializeFunctions) {
		SmallVector<Function *> Candidates;
		for (Function *F : Solver.getArgumentTrackedFunctions())
		if (!F->hasFnAttribute(Attribute::NoDuplicate))
		Candidates.push_back(F);

		MadeChanges |= Specializer.specialize(Candidates);
		}

return MadeChanges;		return MadeChanges;
}		}

llvm/utils/gn/secondary/llvm/lib/Transforms/Scalar/BUILD.gn

	static_library("Scalar") {			static_library("Scalar") {
	output_name = "LLVMScalarOpts"			output_name = "LLVMScalarOpts"
	deps = [			deps = [
	"//llvm/include/llvm/Config:llvm-config",			"//llvm/include/llvm/Config:llvm-config",
	"//llvm/lib/Analysis",			"//llvm/lib/Analysis",
	"//llvm/lib/IR",			"//llvm/lib/IR",
	"//llvm/lib/Support",			"//llvm/lib/Support",
	"//llvm/lib/Transforms/AggressiveInstCombine",			"//llvm/lib/Transforms/AggressiveInstCombine",
				"//llvm/lib/Transforms/IPO",
	"//llvm/lib/Transforms/InstCombine",			"//llvm/lib/Transforms/InstCombine",
	"//llvm/lib/Transforms/Utils",			"//llvm/lib/Transforms/Utils",
	]			]
	sources = [			sources = [
	"ADCE.cpp",			"ADCE.cpp",
	"AlignmentFromAssumptions.cpp",			"AlignmentFromAssumptions.cpp",
	"AnnotationRemarks.cpp",			"AnnotationRemarks.cpp",
	"BDCE.cpp",			"BDCE.cpp",
	[... 77 lines hidden ...]

Diff 432226
