This is an archive of the discontinued LLVM Phabricator instance.

Differential D126455

[FuncSpec] Make the Function Specializer part of the IPSCCP pass.
ClosedPublic

Authored by labrinea on May 26 2022, 3:09 AM.

Details

Reviewers

llvm-commits
ChuanqiXu
fhahn
nikic
efriedma
chill

Commits

rG8136a0172b3c: [FuncSpec] Make the Function Specializer part of the IPSCCP pass.
rG877a9f9abec6: [FuncSpec] Make the Function Specializer part of the IPSCCP pass.

Summary

The aim of this patch is to minimize the compilation time overhead of running Function Specialization. It is about 40% slower to run as a standalone pass (IPSCCP + FuncSpec vs IPSCCP with FuncSpec) according to my measurements. I compiled the llvm testsuite with NewPM-O3 + LTO and measured single threaded [user + system] time of IPSCCP and FuncSpec by passing the '-time-passes' option to lld. Then I compared the two configurations in terms of Instruction Count of the total compilation (not of the individual passes) as in https://llvm-compile-time-tracker.com. Geomean for non-LTO builds is -0.25% and LTO is -0.5% approximately.

You can find more info below:
https://discourse.llvm.org/t/rfc-should-we-enable-function-specialization/61518

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

labrinea created this revision.May 26 2022, 3:09 AM

Herald added a project: Restricted Project. · View Herald TranscriptMay 26 2022, 3:09 AM

Herald added subscribers: snehasish, ormris, hiraditya, mgorny. · View Herald Transcript

labrinea requested review of this revision.May 26 2022, 3:09 AM

This is a proof of concept for RFC: Should we enable Function Specialization?, not ready for review.

labrinea added a child revision: D126456: [SCCP] Notify the Solver when an instruction is removed..May 26 2022, 3:13 AM

Harbormaster completed remote builds in B166438: Diff 432226.May 26 2022, 3:43 AM

labrinea mentioned this in D128822: [FuncSpec] Partially revert rG8b360c69e9e3..Jun 29 2022, 7:26 AM

rebased + fixed tests

Herald added a subscriber: nlopes. · View Herald TranscriptJun 29 2022, 7:29 AM

labrinea added a parent revision: D128822: [FuncSpec] Partially revert rG8b360c69e9e3..Jun 29 2022, 7:30 AM

labrinea added a child revision: D128823: [SCCP] Make it possible to remove predicate info for a given instruction..Jun 29 2022, 7:32 AM

labrinea removed a child revision: D126456: [SCCP] Notify the Solver when an instruction is removed..Jun 29 2022, 7:40 AM

Harbormaster completed remote builds in B172752: Diff 441002.Jun 29 2022, 8:31 AM

labrinea added reviewers: ChuanqiXu, fhahn, eli.friedman.Jul 11 2022, 4:02 AM

tryToReplaceWithConstant method in SCCP does not update the lattice value map at SCCPSolver, and it might lead to a problem that

%arg = getelementptr %struct, %struct* @Global, i32 0, i32 3
%tmp0 = call i64 @func2(i64* %arg)

is folded into

%tmp0 = call i64 @func2(i64* getelementptr inbounds %struct, %struct* @Global, i32 0, i32 3)

a new callbase argument appears, but it is not recorded at SCCPSolver, and this leads to problems such as D128822.

Suggestion:
FunctionSpecializer::tryToReplaceWithConstant use the code piece below to update the propagated argument, and maybe we need such a change for tryToReplaceWithConstant as well.

for (auto *I : UseInsts)
  Solver.visit(I);

labrinea added a reviewer: nikic.Jul 12 2022, 3:30 AM

labrinea edited reviewers, added: efriedma; removed: eli.friedman.Aug 1 2022, 10:39 AM

In D126455#3644307, @sinan wrote:
tryToReplaceWithConstant method in SCCP does not update the lattice value map at SCCPSolver, and it might lead to a problem that

if
%arg = getelementptr %struct, %struct* @Global, i32 0, i32 3
%tmp0 = call i64 @func2(i64* %arg)
is folded into
%tmp0 = call i64 @func2(i64* getelementptr inbounds %struct, %struct* @Global, i32 0, i32 3)
a new callbase argument appears, but it is not recorded at SCCPSolver, and this leads to problems such as D128822.

Suggestion:
FunctionSpecializer::tryToReplaceWithConstant use the code piece below to update the propagated argument, and maybe we need such a change for tryToReplaceWithConstant as well.
for (auto *I : UseInsts)
  Solver.visit(I);

I am doing this in a later patch (see D126456) as I wanted to keep this one as close as possible to the original implementation.

labrinea edited the summary of this revision. (Show Details)Aug 15 2022, 1:44 AM

ping

fhahn added inline comments.Aug 17 2022, 1:51 AM

llvm/lib/Transforms/IPO/FunctionSpecialization.cpp
301	can you fix the indentation here in a NFC?
666	IIUC this is done in between solver runs, right? Is this needed? Isn't it sufficient to continue with the constant value in the value mapping? This would probably remove the need to tell the solver to forget instructions/values.

labrinea added inline comments.Aug 17 2022, 9:22 AM

llvm/lib/Transforms/IPO/FunctionSpecialization.cpp
301	How? It seems ok.
666	This is not the only invocation of `tryToReplaceWithConstant` in FuncSpec. On this instance we try to replace the arguments of cloned functions. There's another invocation inside the functor `RunSCCPSolver`. On that instance we try to replace the instructions of cloned functions. Both calls occur as many times as `FuncSpecializationMaxIters` is set to. Moreover, the SCCP pass itself does the same thing on arguments of tracked functions and on instructions of executable blocks (with `tryToReplaceWithConstant` and `simplifyInstsInBlock` accordingly). This happens after the Solver runs and before the Function Specializer is invoked. Therefore, I think we still need to tell the Solver to forget instructions/values if we want to merge the two passes.

chill added a subscriber: chill.Oct 10 2022, 4:14 AM

chill added inline comments.

llvm/lib/Transforms/Scalar/SCCP.cpp
656–657	IMHO, the invocation of the `FunctionSpecialization` pass ought to happen in this place. The general flow would be like:
Initialise solver
Run solver once (`Solver.solve()` + `resolvedUndefsIn` loop)
Run function specialisation
Run solver again
Optionally go to 2.
Do replacements (from line 512 on)
At no point before the last step ought the passes to replace or delete anything (well, except the called function operand for cloned functions). If an operand/argument is determined to be a constant, it does not need to be replaced right away, because the passes should consult its lattice value. Yeah, the devil is in the details, but this is the approach to merging the two passes, as I see it.
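The flow proposed in this comment can be sketched with a toy driver. This is only an illustration under stated assumptions: `ToyPipeline`, `runToyIPSCCP` and `MaxIters` are hypothetical names, not LLVM APIs, and the phases merely log that they ran. The point is the ordering: solving and specialisation alternate, and replacement/deletion happens exactly once at the end.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the solver + specialiser; not an LLVM class.
struct ToyPipeline {
  std::vector<std::string> Log;     // records the order in which phases ran
  unsigned SpecializationsLeft = 0; // rounds that still create new clones

  void solve() { Log.push_back("solve"); }

  // Returns true if this round created at least one specialisation.
  bool specialize() {
    Log.push_back("specialize");
    if (SpecializationsLeft == 0)
      return false;
    --SpecializationsLeft;
    return true;
  }

  // The only phase that is allowed to mutate the (imaginary) IR.
  void replaceAndDelete() { Log.push_back("replace"); }
};

// Solve once, alternate specialise/solve for at most MaxIters rounds,
// and defer every replacement to the very end.
inline void runToyIPSCCP(ToyPipeline &P, unsigned MaxIters) {
  P.solve();
  for (unsigned I = 0; I < MaxIters; ++I) {
    if (!P.specialize())
      break; // nothing new to propagate, stop iterating
    P.solve();
  }
  P.replaceAndDelete();
}
```

With `SpecializationsLeft = 2` and `MaxIters = 5`, the log reads solve, specialize, solve, specialize, solve, specialize, replace: the third specialisation round finds nothing and ends the loop.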

I have moved the invocation of the specializer earlier in the ipsccp pass, such that no instructions get deleted until all the solving is done. This essentially makes all of D128822, D128823, D128824, D128825, D126456 and D128827 obsolete. There is one thing I couldn't get working, which is to update the lattice value of the callsites to specialized functions. Unfortunately the semantics of the solver do not allow lattices to move from a generic to a more specific state (i.e. from a wider to a narrower constant range). That said, the zapping of returned values won't work on specialized functions, nor can we propagate a constant returned value to the function body where the callsite resides. On another note, the function analysis information which is provided to the pass (predication info, dominator tree) cannot be used on the specialized functions; not sure if that's a problem though.

Harbormaster completed remote builds in B193102: Diff 469041.Oct 19 2022, 2:51 PM

labrinea removed a parent revision: D128822: [FuncSpec] Partially revert rG8b360c69e9e3..Oct 25 2022, 2:58 AM

labrinea added inline comments.Oct 25 2022, 3:01 AM

llvm/lib/Transforms/Scalar/SCCP.cpp
161–172	Just found that we need to do the same inside `replaceSignedInst()` too. I will move this code into a function.

chill edited reviewers, added: chill; removed: momchil.velikov.Oct 27 2022, 7:26 AM

chill added inline comments.Oct 27 2022, 8:30 AM

llvm/lib/Transforms/Scalar/CMakeLists.txt
97	Why add IPO here ?
llvm/lib/Transforms/Scalar/SCCP.cpp
161–172	Would it be possible to call `markUsersAsChanged` here ?
237	Is there a specific reason to remove the instruction from the block? If not, I'd suggest doing deletion in a single place, as opposed to spreading parts of it all over.
254	Likewise.

labrinea added inline comments.Oct 28 2022, 4:07 AM

llvm/lib/Transforms/Scalar/CMakeLists.txt
97	I vaguely remember a link time error without this change. See also `llvm/utils/gn/secondary/llvm/lib/Transforms/Scalar/BUILD.gn` at the bottom of this diff. The IPSCCP pass now depends on the FunctionSpecializer whose cpp file is under the IPO directory.
llvm/lib/Transforms/Scalar/SCCP.cpp
161–172	I think we can't because if we replace the uses first then the users of the old value will be empty. Can we markUsersAsChanged before we replaceAllUsesWith the new value? Btw markUsersAsChanged is private for the SCCPInstVisitor, but I suppose I could make it public if need be.
237	I am not entirely sure. I wanted to avoid revisiting this instruction accidentally in either of simplifyInstsInBlock(), solve(), or resolvedUndefsIn(). For simplifyInstsInBlock() I could skip the instruction if it's present in `ToDelete`. For the others I don't know what the consequences of revisiting would be. I need to run some tests first.

labrinea added inline comments.Oct 28 2022, 6:09 AM

llvm/lib/Transforms/Scalar/SCCP.cpp
161–172	Actually I could call markUsersAsChanged on the new Instruction after replacing the uses of the old Instruction with it.
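The ordering question in this thread (notify before or after replacing uses) can be illustrated with a toy use-list model. Everything below is a hypothetical sketch: `ToyValue`, `addOperand` and the free functions only mimic the shape of LLVM's `replaceAllUsesWith` and the solver's `markUsersAsChanged`; they are not the real implementations.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical value with operand and user lists; not llvm::Value.
struct ToyValue {
  std::vector<ToyValue *> Operands;
  std::vector<ToyValue *> Users;
};

// Records that User has Op as an operand, keeping both lists in sync.
inline void addOperand(ToyValue &User, ToyValue &Op) {
  User.Operands.push_back(&Op);
  Op.Users.push_back(&User);
}

// Rewrites every use of Old to New. Afterwards Old has no users left.
inline void replaceAllUsesWith(ToyValue &Old, ToyValue &New) {
  for (ToyValue *U : Old.Users) {
    std::replace(U->Operands.begin(), U->Operands.end(), &Old, &New);
    New.Users.push_back(U);
  }
  Old.Users.clear();
}

// Queues the current users of V so that a solver could revisit them.
inline void markUsersAsChanged(const ToyValue &V,
                               std::vector<ToyValue *> &WorkList) {
  WorkList.insert(WorkList.end(), V.Users.begin(), V.Users.end());
}
```

In this model, calling `markUsersAsChanged(Old, ...)` after the replacement queues nothing, because the user list has already moved to `New`. Notifying through `New` afterwards does work, but it also queues any users `New` had before the replacement.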

chill added inline comments.Oct 28 2022, 7:33 AM

llvm/lib/Transforms/Scalar/CMakeLists.txt
97	IPO already depends on Scalar, i.e. in `IPO/CMakeLists.txt` we have ... COMPONENT_NAME IPO LINK_COMPONENTS ... Scalar ... Looks like a circular dependency. Perhaps `FunctionSpecialization` needs to go to `Utils` (alongside `SCCPSolver`). Or `runIPSCCP` needs to go to `IPO/SCCP.cpp`. Or both.
llvm/lib/Transforms/Scalar/SCCP.cpp
161–172	OK, let's leave it hanging for now, until I can take a look on top of the latest trunk. Ideally, we are trying to avoid changing code until the Solver is done. Here we have found that an instruction has constant lattice value - we should not replace the users' operands right away, but notify the Solver. The Solver in turn would add the instructions that need reexamining to the instructions worklist and update their lattice values the next time we invoke `Solver.solve()`. Most likely `SCCPSolver::visit` should become private, the Solver (and the SCCP algorithm in general) is driven by its worklists, we should stick to this design: want something done - add it to the worklist.
237	I can't see why would anything go wrong if the instruction is revisited. Do we know if the instruction is safe to remove? It could be `SDiv`/`SRem` with a zero divisor.

chill added inline comments.Oct 28 2022, 6:26 PM

llvm/lib/Transforms/Scalar/SCCP.cpp
237	Actually, never mind, we're not replacing the instruction with a constant but with another instruction.

labrinea added inline comments.Oct 30 2022, 8:52 AM

llvm/lib/Transforms/Scalar/SCCP.cpp
161–172	Update: I tried this. It works for 'some' cases. Instead of replacing values with constants I create mappings from the old to the new value and only after all the solving is done then I replace the uses. The specialization of recursive functions doesn't work because it relies on finding allocas of constant integers. Also the rewriting of callsites doesn't work either if the actual arguments have been constant propagated prior to specialization, but the old value hasn't been replaced yet. In theory I could pass on the mappings from sccp to the specializer but it seems overly complicated to do so.

Changes from last revision:

rebased on top of main; that required adjusting a couple of pass-manager tests as the LoopInfo analysis is now used in the default pipelines by the sccp pass
lazily eraseFromParent instructions which have been replaced instead of removeFromParent and deleteValue later as suggested by @chill
used markUsersAsChanged instead of visit as suggested by @chill (required exposing it to the public interface)
moved/renamed solveWhileResolvedUndefsIn to the solver as it is required from the specializer too (fixes a bug I have added a new testcase for)
changed createSpecialization to return a pointer to the cloned function

Herald added subscribers: wenlei, steven_wu. · View Herald TranscriptOct 31 2022, 8:39 AM

chill added inline comments.Oct 31 2022, 9:39 AM

llvm/lib/Transforms/IPO/SCCP.cpp
41	This part was added for the FunctionSpecialization, if func spec is disabled maybe not pass along the LoopAnalysis?

Harbormaster completed remote builds in B195281: Diff 472022.Oct 31 2022, 9:41 AM

chill added inline comments.Nov 2 2022, 4:08 AM

llvm/lib/Transforms/IPO/SCCP.cpp
48	Now that we added `LoopAnalysis` we may well preserve it too. (I should have included it in the patch which introduced the `LoopAnalysis` here)

Good point. Running the LoopInfo analysis when we do not specialize adds unnecessary compile time overhead. After some investigation I found that notifying the users after replacing a value (instead of notifying only the old users), as well as not removing replaced instructions, both add significant compile time overhead. I am inclined to revert the latter and rework on the former so that markUsersAsChanged can accept a list of users.

Changes from last revision:

Predicated the LoopInfo analysis on the cmd-line option that enables function specialization. As a result we no longer need to adjust the pass manager unit-tests.
Reverted the eraseFromParent to removeFromParent/deleteValue for replaced instructions to save compilation time.
Modified markUsersAsChanged to accept a UserList so that we avoid revisiting irrelevant users after replacing a value (saves compilation time).

Note: The instruction count delta for CTMark with NewPM-O3+LTO between baseline (da5ded4fc9d8c8edfd4a79fa0e75c2ac9165fa7b) and this patch (with funcspec disabled) is about 0.05 % geomean.

labrinea marked 7 inline comments as done.Nov 2 2022, 10:48 AM

labrinea added inline comments.

llvm/lib/Transforms/IPO/SCCP.cpp
48	I tried this but the compiler crashes. Probably because SCCP deletes dead basic blocks.

Harbormaster completed remote builds in B195754: Diff 472680.Nov 2 2022, 11:57 AM

chill added inline comments.Nov 2 2022, 12:26 PM

llvm/lib/Transforms/IPO/FunctionSpecialization.cpp
246	`WorkList` is a too generic name for a parameter and it's not a worklist per se anyway.
335	This is a little bit fragile in the sense the caller may forget to clear the list. It would be nicer if this function itself clears the list first thing when it starts execution. `WorkList` could also be made a return, taking advantage of move semantics, which looks nicer on paper, but may cause a few allocations/deallocations if we iterate.
llvm/lib/Transforms/Scalar/SCCP.cpp
116	Wouldn't it work without the temporary vector? `markUsersAsChanged` would go over each user, look at the user's operands (including `Old`), and find the `New` (which is some constant) from the lattice values map. Thus we would maybe get just:
Solver.markUsersAsChanged(Old);
Old->replaceAllUsesWith(New);
161–172	"Instead of replacing values with constants I create mappings from the old to the new value .."
But isn't this what the `ValueState` already contains?
"Also the rewriting of callsites doesn't work either if the actual arguments have been constant propagated prior to specialization, but the old value hasn't been replaced yet."
Well, `FunctionSpecializer::rewriteCallSites` and everything else should lookup lattice values, not work directly with operands. But OK, let's not make too many changes at once and revisit it later.
661	I would suggest not creating a vector of all the functions in the module as they could be quite a lot (e.g. in LTO) and thus trigger several heap allocations for `WorkList`. `solveWhileResolvedUndefsIn` is quite small and could be overloaded for a `Module *` parameter. I considered making this function a template along the lines of:
template<typename RangeT>
void printNames(RangeT &&R) {
  for (auto &F : R)
    llvm::dbgs() << magic(F)->getName();
}
std::vector<llvm::Function *> v;
llvm::Module *M;
int main() {
  printNames(M->functions());
  printNames(v);
}
but couldn't come up with `magic`. As for `propagateConstants` it could be done with a few overloads as well:
static bool propagateConstants(SCCPSolver &Solver, Function *F, SmallPtrSetImpl<Instruction *> &ToDelete);
static bool propagateConstants(SCCPSolver &Solver, SmallVectorImpl<Function *> &WorkList, SmallPtrSetImpl<Instruction *> &ToDelete) {
  for (Function *F : WorkList)
    propagateConstants(Solver, F, ToDelete);
}
static bool propagateConstants(SCCPSolver &Solver, Module *M, SmallPtrSetImpl<Instruction *> &ToDelete) {
  for (auto &F : *M)
    propagateConstants(Solver, &F, ToDelete);
}
llvm/lib/Transforms/Utils/SCCPSolver.cpp
1577 ↗	(On Diff #472022)	All the functions here forward to the `Visitor`, this one should also just be forwarding. (Not sure why this proxy class exists at all, but I guess we can address it later).

This revision is somewhat different from the previous ones because we no longer replace instructions/arguments whilst in the main specialization loop of the SCCP pass. Instead we use the lattice value when rewriting callsites and when promoting constant stack values. Last time I checked this was even more lightweight than the previous revision. It successfully builds the llvm-test-suite with aggressive funcspec options; I haven't tried a clang bootstrap yet. It also improves one of the unit tests with recursive functions.

labrinea marked 4 inline comments as done.Nov 7 2022, 9:44 AM

labrinea added inline comments.

llvm/include/llvm/Transforms/IPO/FunctionSpecialization.h
70	I've removed an unused typedef from here ;)
llvm/lib/Transforms/IPO/FunctionSpecialization.cpp
207–208	I am not sure whether this is necessary. The unit tests which exercise recursion are passing without it at least.
281	We now call this function once, not for every clone as it used to be.
709	no need to examine lattices of arguments if it's the key of the CallSpecBinding
711–715	There might be more call sites to rewrite than those in the CallSpecBinding that we have already found, therefore we need to repeat this look up of users here, but at least it now happens once for F compared to Clones.size() times which was the case before.
712	the condition was different before, but I think this is correct
716–717	We are modifying the list whilst traversing it, so we swap the current element with the last one and reduce the iteration range by one.
llvm/lib/Transforms/Utils/SCCPSolver.cpp
465–483 ↗	(On Diff #473696)	I couldn't template these two. The main reason was that one iterates over `Function *` whereas the other over `Function &`.
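For reference, one way the `Function *` versus `Function &` mismatch could be bridged is a pair of overloads that normalise each element to a reference. This is a hypothetical sketch on a toy type (`ToyFunction`, `asRef` and `collectNames` are made-up names), not what the patch does; the revision keeps two separate overloads. Whether this would fit the real solver interface is untested.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for llvm::Function.
struct ToyFunction {
  std::string Name;
};

// Normalise an element to a reference, whether the range yields
// references or pointers.
inline ToyFunction &asRef(ToyFunction &F) { return F; }
inline ToyFunction &asRef(ToyFunction *F) { return *F; }

// Works for ranges of ToyFunction and for ranges of ToyFunction *.
template <typename RangeT>
std::vector<std::string> collectNames(RangeT &&R) {
  std::vector<std::string> Names;
  for (auto &&F : R)
    Names.push_back(asRef(F).Name);
  return Names;
}
```

The sketch only handles non-const ranges; const-qualified elements would need matching `const` overloads of `asRef`.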

Harbormaster completed remote builds in B196517: Diff 473696.Nov 7 2022, 11:32 AM

chill added inline comments.Nov 8 2022, 8:02 AM

llvm/lib/Transforms/IPO/FunctionSpecialization.cpp
716–725	You can swap the order of the loops and get rid of `CallSiteToRewrite`.

chill added inline comments.Nov 8 2022, 8:30 AM

llvm/lib/Transforms/IPO/FunctionSpecialization.cpp
716–725	Maybe I'm getting a bit ahead of myself, but you can change the function to rewrite just a single call site and factor out the iteration over call sites. The benefit is that the function becomes more reusable, as you can independently choose the set of call sites it operates upon. (Incidentally, I'm planning to use it that way, but it is generally a good change, even if what I have in mind turns out non-working).

labrinea added inline comments.Nov 9 2022, 12:52 AM

llvm/lib/Transforms/IPO/FunctionSpecialization.cpp
716–725	You mean to traverse F's users here instead? We are iterating over them while modifying them, which is the reason why CallSiteToRewrite existed in the first place I believe. Also the dynamic cast and other checks we do:
CS->getCalledFunction() == F && Solver.isBlockExecutable(CS->getParent())
(note: this one is missing from the current revision) don't need to be repeated on every iteration of the outer loop which walks the specializations.
716–725	If I change it the way you suggest it will regress the current behaviour. Qsort() from SPEC's mcf won't specialize (it's a recursive function) and it's been quite a drive for this work. What if you alter it once you refactor how we rewrite callsites?

chill added inline comments.Nov 9 2022, 2:23 AM

llvm/lib/Transforms/IPO/FunctionSpecialization.cpp
716–725	What we have now is:
forall S : Specialisations {
  forall C : CallSites {
    do_stuff(S, C);
  }
}
What I'm suggesting is to reorder the loops:
forall C : CallSites {
  forall S : Specialisations {
    do_stuff(S, C);
  }
}
That'll avoid the swaps/pop_back. The vector itself stays, good point. And then I'm suggesting to move the outer loop out of the function:
func foo() {
  ...
  forall C : CallSites {
    rewriteCallSite(C)
  }
  ...
}
func rewriteCallSite(C) {
  forall S : Specialisations {
    do_stuff(S, C);
  }
}
Both are NFC.
"Also the dynamic cast and other checks we do: CS->getCalledFunction() == F && Solver.isBlockExecutable(CS->getParent()) (note: this one is missing from the current revision) don't need to be repeated on every iteration of the outer loop which walks the specializations."
I don't understand this. We do nothing between loops, so interchanging them will execute exactly the same operations in the loop body. If you add these checks somewhere, it's another argument to move the iteration over call sites to the outer loop.
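The interchange described above can be checked on a toy model. The names below (`Visit`, `specsOuter`, `callSitesOuter`, `sameVisits`) are hypothetical and integers stand in for specialisations and call sites; `rewriteCallSite` only mirrors the name used in the comment. The sketch shows that both loop orders visit the same (specialisation, call site) pairs, only in a different sequence.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// (specialisation id, call site id), in the order do_stuff would see them.
using Visit = std::pair<int, int>;

// Current shape: specialisations in the outer loop.
inline std::vector<Visit> specsOuter(const std::vector<int> &Specs,
                                     const std::vector<int> &CallSites) {
  std::vector<Visit> Out;
  for (int S : Specs)
    for (int C : CallSites)
      Out.emplace_back(S, C);
  return Out;
}

// Factored-out body: handles a single call site.
inline void rewriteCallSite(int C, const std::vector<int> &Specs,
                            std::vector<Visit> &Out) {
  for (int S : Specs)
    Out.emplace_back(S, C);
}

// Suggested shape: call sites in the outer loop, one call per site.
inline std::vector<Visit> callSitesOuter(const std::vector<int> &Specs,
                                         const std::vector<int> &CallSites) {
  std::vector<Visit> Out;
  for (int C : CallSites)
    rewriteCallSite(C, Specs, Out);
  return Out;
}

// True if both traversals visited the same pairs, ignoring order.
inline bool sameVisits(std::vector<Visit> A, std::vector<Visit> B) {
  std::sort(A.begin(), A.end());
  std::sort(B.begin(), B.end());
  return A == B;
}
```

The interchange is only behaviour-preserving as long as the loop body does not depend on the visiting order, which is the assumption behind calling it NFC.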

chill added inline comments.Nov 9 2022, 3:49 AM

llvm/lib/Transforms/IPO/FunctionSpecialization.cpp
707–708	This is better placed outside of `rewriteCallSites`, perhaps just after the call to `rewriteCallSites`.

chill added inline comments.Nov 9 2022, 4:24 AM

llvm/lib/Transforms/IPO/FunctionSpecialization.cpp
239	I believe the LLVM convention for these kinds of classes/methods is `run`, e.g. `Vectorizer::run()`, `EarlyCSE::run()`, etc.

chill added inline comments.Nov 10 2022, 2:13 AM

llvm/lib/Transforms/IPO/FunctionSpecialization.cpp
707–708	Or a better idea: get the initial size of `CallSitesToRewrite`, decrement that number every time you update a call site. At the end if this number drops to zero mark the function unreachable.

chill added inline comments.Nov 10 2022, 2:26 AM

llvm/lib/Transforms/IPO/FunctionSpecialization.cpp
711	Use braces around the `for`, since there are more than two levels of nesting.

labrinea added inline comments.Nov 10 2022, 5:28 AM

llvm/lib/Transforms/IPO/FunctionSpecialization.cpp
707–708	that won't work for dead recursive functions

Changes from last revision:

renamed specialize() to run() as suggested
renamed rewriteCallSites to updateCallSites
interchanged the loops in updateCallSites as suggested
update callsites of specializations before anything else (changes the test identical-specializations.ll)
used braces for nested loop
clang formatted

labrinea marked 5 inline comments as done.Nov 10 2022, 5:44 AM

Harbormaster completed remote builds in B197069: Diff 474522.Nov 10 2022, 6:40 AM

Found a bug in the last revision. The swapping idiom violates the expected order of traversing the call sites to update. They are supposed to be sorted by gain.
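The ordering problem with the swapping idiom is easy to reproduce in isolation. The helpers below are a hypothetical sketch (`swapAndPop` and `eraseStable` are made-up names, and plain integers stand in for gain-sorted entries); they do not show how the patch fixed the bug, only why swap-with-last breaks a sorted traversal.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// O(1) removal: overwrite slot I with the last element, then shrink.
// The relative order of the remaining elements is NOT preserved.
template <typename T>
void swapAndPop(std::vector<T> &V, std::size_t I) {
  assert(I < V.size());
  std::swap(V[I], V.back());
  V.pop_back();
}

// O(n) removal that keeps the remaining elements in their original order.
template <typename T>
void eraseStable(std::vector<T> &V, std::size_t I) {
  assert(I < V.size());
  V.erase(V.begin() + static_cast<std::ptrdiff_t>(I));
}
```

Removing the first element of `{40, 30, 20, 10}` with `swapAndPop` leaves `{10, 30, 20}`, so the lowest-gain entry is now visited first; `eraseStable` leaves `{30, 20, 10}`.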

Harbormaster completed remote builds in B197086: Diff 474555.Nov 10 2022, 9:28 AM

Changes from last revision:

Used a counter for updated callsites instead of revising them at the end to identify dead functions.
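The counter idea can be sketched as follows. This is a toy model under explicit assumptions: `ToyCallSite` and `originalBecomesDead` are hypothetical names, and the rule that a recursive call site (one located inside the function itself) also decrements the counter is my reading of the discussion about dead recursive functions, not something this page states.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical summary of one call site of the original function F.
struct ToyCallSite {
  bool InsideCallee;          // the call is located inside F itself (recursion)
  bool MatchesSpecialization; // the call would be redirected to a clone
};

// Returns true if, after updating call sites, nothing outside F still
// calls the original, so it could be marked unreachable.
inline bool originalBecomesDead(const std::vector<ToyCallSite> &Sites) {
  std::size_t CallsLeft = Sites.size();
  for (const ToyCallSite &CS : Sites) {
    // Assumption: a recursive call cannot keep F alive on its own.
    bool ShouldDecrement = CS.InsideCallee;
    if (CS.MatchesSpecialization)
      ShouldDecrement = true; // redirected to a specialisation
    if (ShouldDecrement)
      --CallsLeft;
  }
  return CallsLeft == 0;
}
```

A single pass over the call sites is enough here, which is the advantage over re-scanning the users at the end.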

Harbormaster completed remote builds in B197169: Diff 474673.Nov 11 2022, 12:49 AM

labrinea added inline comments.Nov 14 2022, 1:47 AM

llvm/lib/Transforms/IPO/FunctionSpecialization.cpp
708	I'll rename this and add a comment to explain what it is used for.

Changes from last revision:

rebased
renamed the variable which determines whether we decrement the counter of left callsites to replace
cached the cloned function pointer to the SpecializationInfo to avoid keeping the Clone vector in sync with the Specialization vector when replacing callsites

Harbormaster completed remote builds in B198028: Diff 475858.Nov 16 2022, 10:12 AM

Also removed the immediate replacement of known callsites. It's better to lookup the lattice value instead. This fixed a compile time regression possibly caused by otherwise dead duplicate specializations that had to be processed by later passes.

labrinea mentioned this in D135463: [FuncSpec] Do not generate multiple copies for identical specializations..Nov 16 2022, 10:18 AM

labrinea added a child revision: D135463: [FuncSpec] Do not generate multiple copies for identical specializations..

Would you, please, mark as done the no longer relevant comments?
I think the only issue left is with the circular dependency between libraries.

In D126455#3936611, @chill wrote:

Would you, please, mark as done the no longer relevant comments?
I think the only issue left is with the circular dependency between libraries.

Ack

labrinea mentioned this in D138654: [IPSCCP] Move the IPSCCP run function under the IPO directory..Nov 24 2022, 4:15 AM

Removed the cyclic dependency between LLVMipo and LLVMScalarOpts and rebased on top of D138654.

labrinea added a parent revision: D138654: [IPSCCP] Move the IPSCCP run function under the IPO directory..Nov 24 2022, 6:17 AM

Harbormaster completed remote builds in B199410: Diff 477766.Nov 24 2022, 6:17 AM

LGTM, but let's give a chance for other people to have a look too. @sinan @fhahn

rebase
migrated all funcspec tests to use the -passes= cmdline option

Harbormaster completed remote builds in B200536: Diff 479303.Dec 1 2022, 11:21 AM

aeubanks added a subscriber: aeubanks.Dec 2 2022, 4:16 PM

chill accepted this revision.Dec 5 2022, 1:30 AM

This revision is now accepted and ready to land.Dec 5 2022, 1:30 AM

fhahn added inline comments.Dec 5 2022, 2:17 AM

llvm/lib/Transforms/IPO/SCCP.cpp
25	move this to the loop below, which uses it
llvm/lib/Transforms/Utils/SCCPSolver.cpp
266 ↗	(On Diff #479303)	I think DomTreeUpdater provides a constructor that doesn't take a DT which could be used unconditionally instead of having all those `if (DTU)` checks spread out across various functions.

Moved a flag close to its use.
Used a DomTreeUpdater without DT/PDT for cloned functions.

labrinea marked 2 inline comments as done.Dec 5 2022, 4:41 AM

labrinea mentioned this in D128827: [WIP][SCCP] Don't track specialized functions unless they are recursive..Dec 5 2022, 5:15 AM

labrinea mentioned this in D128825: [SCCP] Add API for updating the state of the Solver..

labrinea mentioned this in D128824: [SCCP] Add API for AdditionalUsers to the Instruction Visitor..

labrinea mentioned this in D128823: [SCCP] Make it possible to remove predicate info for a given instruction..

labrinea mentioned this in D126456: [SCCP] Notify the Solver when an instruction is removed..

Harbormaster completed remote builds in B201073: Diff 480052.Dec 5 2022, 7:16 AM

chill added a child revision: D139346: [FuncSpec] Global ranking of specialisations.Dec 5 2022, 10:15 AM

This revision was landed with ongoing or failed builds.Dec 8 2022, 4:23 AM

Closed by commit rG877a9f9abec6: [FuncSpec] Make the Function Specializer part of the IPSCCP pass. (authored by labrinea). · Explain Why

This revision was automatically updated to reflect the committed changes.

labrinea mentioned this in rG42c2dc401742: [IPSCCP] Move the IPSCCP run function under the IPO directory..

labrinea added a commit: rG877a9f9abec6: [FuncSpec] Make the Function Specializer part of the IPSCCP pass..

labrinea added a reverting change: rG0f0cb92cb2ad: Revert "[FuncSpec] Make the Function Specializer part of the IPSCCP pass.".Dec 8 2022, 4:42 AM

labrinea added a commit: rG8136a0172b3c: [FuncSpec] Make the Function Specializer part of the IPSCCP pass..Dec 10 2022, 6:50 AM

Revision Contents

Path	Size
llvm/
  include/
    llvm/
      InitializePasses.h	1 line
      LinkAllPasses.h	1 line
      Transforms/
        IPO.h	5 lines
        IPO/
          FunctionSpecialization.h	180 lines
          SCCP.h	8 lines
        Scalar/
          SCCP.h	9 lines
  lib/
    Passes/
      PassBuilderPipelines.cpp	6 lines
      PassRegistry.def	1 line
    Transforms/
      IPO/
        FunctionSpecialization.cpp	1086 lines
        IPO.cpp	1 line
        PassManagerBuilder.cpp	8 lines
        SCCP.cpp	109 lines
      Scalar/
        CMakeLists.txt	1 line
        SCCP.cpp	226 lines
  test/
    Transforms/
      FunctionSpecialization/
        bug52821-use-after-free.ll	15 lines
        bug55000-read-uninitialized-value.ll	2 lines
        function-specialization-always-inline.ll	4 lines
        function-specialization-constant-expression.ll	5 lines
        function-specialization-constant-expression2.ll	2 lines
        function-specialization-constant-expression3.ll	2 lines
        function-specialization-constant-expression4.ll	2 lines
        function-specialization-constant-expression5.ll	2 lines
        function-specialization-constant-integers.ll	4 lines
        function-specialization-loop.ll	4 lines
        function-specialization-minsize.ll	2 lines
        function-specialization-minsize2.ll	2 lines
        function-specialization-minsize3.ll	2 lines
        function-specialization-nodup.ll	2 lines
        function-specialization-nodup2.ll	2 lines
        function-specialization-noexec.ll	2 lines
        function-specialization-nonconst-glob.ll	6 lines
        function-specialization-nothing-todo.ll
        function-specialization-poison.ll	2 lines
        function-specialization-recursive.ll	6 lines
        function-specialization-recursive2.ll	2 lines
        function-specialization-recursive3.ll	2 lines
        function-specialization-recursive4.ll	2 lines
        function-specialization-stats.ll	2 lines
        function-specialization.ll	2 lines
        function-specialization2.ll	28 lines
        function-specialization3.ll	6 lines
        function-specialization4.ll	4 lines
        function-specialization5.ll	2 lines
        identical-specializations.ll	2 lines
        remove-dead-recursive-function.ll	2 lines
        specialize-multiple-arguments.ll	8 lines
  utils/
    gn/
      secondary/
        llvm/
          lib/
            Transforms/
              Scalar/
                BUILD.gn	1 line

Diff 469041

llvm/include/llvm/InitializePasses.h

	Show First 20 Lines • Show All 156 Lines • ▼ Show 20 Lines
	void initializeFinalizeMachineBundlesPass(PassRegistry&);			void initializeFinalizeMachineBundlesPass(PassRegistry&);
	void initializeFixIrreduciblePass(PassRegistry &);			void initializeFixIrreduciblePass(PassRegistry &);
	void initializeFixupStatepointCallerSavedPass(PassRegistry&);			void initializeFixupStatepointCallerSavedPass(PassRegistry&);
	void initializeFlattenCFGLegacyPassPass(PassRegistry &);			void initializeFlattenCFGLegacyPassPass(PassRegistry &);
	void initializeFloat2IntLegacyPassPass(PassRegistry&);			void initializeFloat2IntLegacyPassPass(PassRegistry&);
	void initializeForceFunctionAttrsLegacyPassPass(PassRegistry&);			void initializeForceFunctionAttrsLegacyPassPass(PassRegistry&);
	void initializeForwardControlFlowIntegrityPass(PassRegistry&);			void initializeForwardControlFlowIntegrityPass(PassRegistry&);
	void initializeFuncletLayoutPass(PassRegistry&);			void initializeFuncletLayoutPass(PassRegistry&);
	void initializeFunctionSpecializationLegacyPassPass(PassRegistry &);
	void initializeGCMachineCodeAnalysisPass(PassRegistry&);			void initializeGCMachineCodeAnalysisPass(PassRegistry&);
	void initializeGCModuleInfoPass(PassRegistry&);			void initializeGCModuleInfoPass(PassRegistry&);
	void initializeGVNHoistLegacyPassPass(PassRegistry&);			void initializeGVNHoistLegacyPassPass(PassRegistry&);
	void initializeGVNLegacyPassPass(PassRegistry&);			void initializeGVNLegacyPassPass(PassRegistry&);
	void initializeGVNSinkLegacyPassPass(PassRegistry&);			void initializeGVNSinkLegacyPassPass(PassRegistry&);
	void initializeGlobalDCELegacyPassPass(PassRegistry&);			void initializeGlobalDCELegacyPassPass(PassRegistry&);
	void initializeGlobalMergePass(PassRegistry&);			void initializeGlobalMergePass(PassRegistry&);
	void initializeGlobalOptLegacyPassPass(PassRegistry&);			void initializeGlobalOptLegacyPassPass(PassRegistry&);

llvm/include/llvm/LinkAllPasses.h

Show First 20 Lines • Show All 218 Lines • ▼ Show 20 Lines	ForcePassLinking() {
(void) llvm::createFloat2IntPass();		(void) llvm::createFloat2IntPass();
(void) llvm::createEliminateAvailableExternallyPass();		(void) llvm::createEliminateAvailableExternallyPass();
(void)llvm::createScalarizeMaskedMemIntrinLegacyPass();		(void)llvm::createScalarizeMaskedMemIntrinLegacyPass();
(void) llvm::createWarnMissedTransformationsPass();		(void) llvm::createWarnMissedTransformationsPass();
(void) llvm::createHardwareLoopsPass();		(void) llvm::createHardwareLoopsPass();
(void) llvm::createInjectTLIMappingsLegacyPass();		(void) llvm::createInjectTLIMappingsLegacyPass();
(void) llvm::createUnifyLoopExitsPass();		(void) llvm::createUnifyLoopExitsPass();
(void) llvm::createFixIrreduciblePass();		(void) llvm::createFixIrreduciblePass();
(void)llvm::createFunctionSpecializationPass();
(void)llvm::createSelectOptimizePass();		(void)llvm::createSelectOptimizePass();

(void)new llvm::IntervalPartition();		(void)new llvm::IntervalPartition();
(void)new llvm::ScalarEvolutionWrapperPass();		(void)new llvm::ScalarEvolutionWrapperPass();
llvm::Function::Create(nullptr, llvm::GlobalValue::ExternalLinkage)->viewCFGOnly();		llvm::Function::Create(nullptr, llvm::GlobalValue::ExternalLinkage)->viewCFGOnly();
llvm::RGPassManager RGM;		llvm::RGPassManager RGM;
llvm::TargetLibraryInfoImpl TLII;		llvm::TargetLibraryInfoImpl TLII;
llvm::TargetLibraryInfo TLI(TLII);		llvm::TargetLibraryInfo TLI(TLII);

llvm/include/llvm/Transforms/IPO.h

	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	/// createIPSCCPPass - This pass propagates constants from call sites into the			/// createIPSCCPPass - This pass propagates constants from call sites into the
	/// bodies of functions, and keeps track of whether basic blocks are executable			/// bodies of functions, and keeps track of whether basic blocks are executable
	/// in the process.			/// in the process.
	///			///
	ModulePass *createIPSCCPPass();			ModulePass *createIPSCCPPass();

	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	/// createFunctionSpecializationPass - This pass propagates constants from call
	/// sites to the specialized version of the callee function.
	ModulePass *createFunctionSpecializationPass();

	//===----------------------------------------------------------------------===//
	//			//
	/// createLoopExtractorPass - This pass extracts all natural loops from the			/// createLoopExtractorPass - This pass extracts all natural loops from the
	/// program into a function if it can.			/// program into a function if it can.
	///			///
	Pass *createLoopExtractorPass();			Pass *createLoopExtractorPass();

	/// createSingleLoopExtractorPass - This pass extracts one natural loop from the			/// createSingleLoopExtractorPass - This pass extracts one natural loop from the
	/// program into a function if it can. This is used by bugpoint.			/// program into a function if it can. This is used by bugpoint.

llvm/include/llvm/Transforms/IPO/FunctionSpecialization.h

This file was added.

				//===- FunctionSpecialization.h - Function Specialization -----------------===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				//
				// This specialises functions with constant parameters. Constant parameters
				// like function pointers and constant globals are propagated to the callee by
				// specializing the function. The main benefit of this pass at the moment is
				// that indirect calls are transformed into direct calls, which provides inline
				// opportunities that the inliner would not have been able to achieve. That's
				// why function specialisation is run before the inliner in the optimisation
				// pipeline; that is by design. Otherwise, we would only benefit from constant
				// passing, which is a valid use-case too, but hasn't been explored much in
				// terms of performance uplifts, cost-model and compile-time impact.
				//
				// Current limitations:
				// - It does not yet handle integer ranges. We do support "literal constants",
				// but that's off by default under an option.
				// - The cost-model could be further looked into (it mainly focuses on inlining
				// benefits),
				//
				// Ideas:
				// - With a function specialization attribute for arguments, we could have
				// a direct way to steer function specialization, avoiding the cost-model,
				// and thus control compile-times / code-size.
				//
				// Todos:
				// - Specializing recursive functions relies on running the transformation a
				// number of times, which is controlled by option
				// `func-specialization-max-iters`. Thus, increasing this value and the
				// number of iterations, will linearly increase the number of times recursive
				// functions get specialized, see also the discussion in
				// https://reviews.llvm.org/D106426 for details. Perhaps there is a
				// compile-time friendlier way to control/limit the number of specialisations
				// for recursive functions.
				// - Don't transform the function if function specialization does not trigger;
				// the SCCPSolver may make IR changes.
				//
				// References:
				// - 2021 LLVM Dev Mtg “Introducing function specialisation, and can we enable
				// it by default?”, https://www.youtube.com/watch?v=zJiCjeXgV5Q
				//
				//===----------------------------------------------------------------------===//

				#ifndef LLVM_TRANSFORMS_IPO_FUNCTIONSPECIALIZATION_H
				#define LLVM_TRANSFORMS_IPO_FUNCTIONSPECIALIZATION_H

				#include "llvm/Analysis/CodeMetrics.h"
				#include "llvm/Analysis/InlineCost.h"
				#include "llvm/Analysis/LoopInfo.h"
				#include "llvm/Analysis/TargetTransformInfo.h"
				#include "llvm/Transforms/Scalar/SCCP.h"
				#include "llvm/Transforms/Utils/Cloning.h"
				#include "llvm/Transforms/Utils/SCCPSolver.h"
				#include "llvm/Transforms/Utils/SizeOpts.h"

				using namespace llvm;

				namespace llvm {
				// Bookkeeping struct to pass data from the analysis and profitability phase
				// to the actual transform helper functions.
				struct SpecializationInfo {
				SmallVector<ArgInfo, 8> Args; // Stores the {formal,actual} argument pairs.
				InstructionCost Gain; // Profitability: Gain = Bonus - Cost.
				};

				using CallArgBinding = std::pair<CallBase *, Constant *>;
				labrinea (Author, Done): I've removed an unused typedef from here ;)
				using CallSpecBinding = std::pair<CallBase *, SpecializationInfo>;
				// We are using MapVector because it guarantees deterministic iteration
				// order across executions.
				using SpecializationMap = SmallMapVector<CallBase *, SpecializationInfo, 8>;

				class FunctionSpecializer {

				/// The IPSCCP Solver.
				SCCPSolver &Solver;

				Module &M;

				/// Analyses used to help determine if a function should be specialized.
				std::function<const TargetLibraryInfo &(Function &)> GetTLI;
				std::function<TargetTransformInfo &(Function &)> GetTTI;
				std::function<AssumptionCache &(Function &)> GetAC;

				// The number of functions specialised, used for collecting statistics and
				// also in the cost model.
				unsigned NbFunctionsSpecialized = 0;

				SmallPtrSet<Function *, 32> SpecializedFuncs;
				SmallPtrSet<Function *, 32> FullySpecialized;
				DenseMap<Function *, CodeMetrics> FunctionMetrics;

				public:
				FunctionSpecializer(SCCPSolver &Solver, Module &M,
				std::function<const TargetLibraryInfo &(Function &)> GetTLI,
				std::function<TargetTransformInfo &(Function &)> GetTTI,
				std::function<AssumptionCache &(Function &)> GetAC)
				: Solver(Solver), M(M), GetTLI(GetTLI), GetTTI(GetTTI), GetAC(GetAC) {}

				~FunctionSpecializer() {
				// Eliminate dead code.
				removeDeadFunctions();
				cleanUpSSA();
				}

				bool isClonedFunction(Function *F) { return SpecializedFuncs.count(F); }

				bool specialize(SmallVectorImpl<Function *> &WorkList);

				/// Iterate over the argument tracked functions see if there
				/// are any new constant values for the call instruction via
				/// stack variables.
				void promoteConstantStackValues();

				private:
				/// Clean up fully specialized functions.
				void removeDeadFunctions();

				/// Remove any ssa_copy intrinsics that may have been introduced.
				void cleanUpSSA();

				// Compute the code metrics for function \p F.
				CodeMetrics &analyzeFunction(Function *F);

				/// This function decides whether it's worthwhile to specialize function
				/// \p F based on the known constant values its arguments can take on. It
				/// only discovers potential specialization opportunities without actually
				/// applying them.
				///
				/// \returns true if any specializations have been found.
				bool calculateGains(Function *F, InstructionCost Cost,
				SmallVectorImpl<CallSpecBinding> &WorkList);

				bool isCandidateFunction(Function *F);

				void createSpecialization(Function *F, SpecializationInfo &SpecInfo,
				SmallVectorImpl<Function *> &WorkList);

				/// Compute and return the cost of specializing function \p F.
				InstructionCost getSpecializationCost(Function *F);

				/// Compute a bonus for replacing argument \p A with constant \p C.
				InstructionCost getSpecializationBonus(Argument *A, Constant *C);

				/// Determine if we should specialize a function based on the incoming values
				/// of the given argument.
				///
				/// This function implements the goal-directed heuristic. It determines if
				/// specializing the function based on the incoming values of argument \p A
				/// would result in any significant optimization opportunities. If
				/// optimization opportunities exist, the constant values of \p A on which to
				/// specialize the function are collected in \p Constants.
				///
				/// \returns true if the function should be specialized on the given
				/// argument.
				bool isArgumentInteresting(Argument *A,
				SmallVectorImpl<CallArgBinding> &Constants);

				/// Collect in \p Constants all the constant values that argument \p A can
				/// take on.
				void getPossibleConstants(Argument *A,
				SmallVectorImpl<CallArgBinding> &Constants);

				/// Rewrite calls to function \p F to call function \p Clone instead.
				///
				/// This function modifies calls to function \p F as long as the actual
				/// arguments match those in \p Args. Note that for recursive calls we
				/// need to compare against the cloned formal arguments.
				///
				/// Callsites that have been marked with the MinSize function attribute won't
				/// be specialized and rewritten.
				void rewriteCallSites(Function *Clone, const SmallVectorImpl<ArgInfo> &Args,
				ValueToValueMapTy &Mappings);
				};
				} // namespace

				#endif // LLVM_TRANSFORMS_IPO_FUNCTIONSPECIALIZATION_H
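
The header above records each candidate's profitability as `Gain = Bonus - Cost` and passes the candidates on to the transform helpers. The following is a rough, self-contained sketch of how such a ranking could work. The `Candidate` struct, the `selectProfitable` helper, and the drop, sort, then truncate policy are illustrative assumptions and are not code from this patch; the default cap of 3 only mirrors the default value of `func-specialization-max-clones`.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a specialization candidate; not an LLVM type.
struct Candidate {
  std::string Name; // label for a call site / constant combination
  int Bonus;        // estimated benefit of specializing
  int Cost;         // estimated cost of cloning the function
  int gain() const { return Bonus - Cost; } // Gain = Bonus - Cost
};

// Keep candidates with a positive gain, best first, at most MaxClones of them.
// std::stable_sort preserves the input order of equal gains, so the result is
// deterministic for a given input order.
std::vector<Candidate> selectProfitable(std::vector<Candidate> Cands,
                                        std::size_t MaxClones = 3) {
  // Drop candidates that are not expected to pay for themselves.
  Cands.erase(std::remove_if(Cands.begin(), Cands.end(),
                             [](const Candidate &C) { return C.gain() <= 0; }),
              Cands.end());
  // Sort the remaining candidates by descending gain.
  std::stable_sort(Cands.begin(), Cands.end(),
                   [](const Candidate &A, const Candidate &B) {
                     return A.gain() > B.gain();
                   });
  // Cap the number of clones created for one function.
  if (Cands.size() > MaxClones)
    Cands.resize(MaxClones);
  return Cands;
}
```

Sorting before truncating means the cap discards the least profitable candidates rather than whichever happened to be discovered last.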

llvm/include/llvm/Transforms/IPO/SCCP.h

	class Module;			class Module;

	/// Pass to perform interprocedural constant propagation.			/// Pass to perform interprocedural constant propagation.
	class IPSCCPPass : public PassInfoMixin<IPSCCPPass> {			class IPSCCPPass : public PassInfoMixin<IPSCCPPass> {
	public:			public:
	PreservedAnalyses run(Module &M, ModuleAnalysisManager &AM);			PreservedAnalyses run(Module &M, ModuleAnalysisManager &AM);
	};			};

	/// Pass to perform interprocedural constant propagation by specializing
	/// functions
	class FunctionSpecializationPass
	: public PassInfoMixin<FunctionSpecializationPass> {
	public:
	PreservedAnalyses run(Module &M, ModuleAnalysisManager &AM);
	};

	} // end namespace llvm			} // end namespace llvm

	#endif // LLVM_TRANSFORMS_IPO_SCCP_H			#endif // LLVM_TRANSFORMS_IPO_SCCP_H
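
For readers unfamiliar with the transformation, here is a source-level sketch of what "constant propagation by specializing functions" amounts to when the constant is a function pointer. All names (`applyTwice`, `addOne`, `applyTwiceAddOne`, `callerBefore`, `callerAfter`) are hypothetical. The pass itself works on LLVM IR and creates the clone automatically; this only hand-writes an equivalent of the result.

```cpp
// Generic function: `Op` is only known at run time, so `Op(X)` is an
// indirect call that the inliner cannot see through.
int applyTwice(int (*Op)(int), int X) { return Op(Op(X)); }

int addOne(int X) { return X + 1; }

// Hand-written equivalent of a clone specialized for Op == addOne: both
// calls are now direct and become ordinary inlining candidates.
int applyTwiceAddOne(int X) { return addOne(addOne(X)); }

// A call site that passes the constant `addOne` can be redirected from the
// generic function to the specialized clone without changing the result.
int callerBefore(int X) { return applyTwice(addOne, X); }
int callerAfter(int X) { return applyTwiceAddOne(X); }
```

Call sites that pass a different or non-constant `Op` keep calling the generic `applyTwice`, which is why the original function can only be removed once every call has been redirected.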

llvm/include/llvm/Transforms/Scalar/SCCP.h

	/// This pass performs function-level constant propagation and merging.			/// This pass performs function-level constant propagation and merging.
	class SCCPPass : public PassInfoMixin<SCCPPass> {			class SCCPPass : public PassInfoMixin<SCCPPass> {
	public:			public:
	PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM);			PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM);
	};			};

	bool runIPSCCP(Module &M, const DataLayout &DL,			bool runIPSCCP(Module &M, const DataLayout &DL,
	std::function<const TargetLibraryInfo &(Function &)> GetTLI,			std::function<const TargetLibraryInfo &(Function &)> GetTLI,
	function_ref<AnalysisResultsForFn(Function &)> getAnalysis);

	bool runFunctionSpecialization(
	Module &M, const DataLayout &DL,
	std::function<TargetLibraryInfo &(Function &)> GetTLI,
	std::function<TargetTransformInfo &(Function &)> GetTTI,			std::function<TargetTransformInfo &(Function &)> GetTTI,
	std::function<AssumptionCache &(Function &)> GetAC,			std::function<AssumptionCache &(Function &)> GetAC,
	function_ref<AnalysisResultsForFn(Function &)> GetAnalysis);			function_ref<AnalysisResultsForFn(Function &)> getAnalysis);
	} // end namespace llvm			} // end namespace llvm

	#endif // LLVM_TRANSFORMS_SCALAR_SCCP_H			#endif // LLVM_TRANSFORMS_SCALAR_SCCP_H
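
The declarations above pass analyses in as per-function getter callbacks (`GetTLI`, `GetTTI`, `GetAC`) rather than as concrete objects. A minimal sketch of that pattern follows, using stand-in types: `Fn`, `SizeInfo`, `SizeCache`, and `Specializer` are hypothetical and are not LLVM classes.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-ins; not LLVM classes.
struct Fn {
  std::string Name;
};
struct SizeInfo {
  int NumInsts;
};

// Consumer: stores a getter and asks for the analysis result only for the
// functions it actually inspects.
class Specializer {
  std::function<SizeInfo &(Fn &)> GetSize;

public:
  explicit Specializer(std::function<SizeInfo &(Fn &)> GetSize)
      : GetSize(std::move(GetSize)) {}

  bool isSmall(Fn &F, int Threshold) { return GetSize(F).NumInsts < Threshold; }
};

// Example provider: results live in a map owned by the caller, so the
// references handed out remain valid while the provider is alive.
struct SizeCache {
  std::map<std::string, SizeInfo> Cache;
  SizeInfo &get(Fn &F) { return Cache[F.Name]; }
};
```

The consumer never learns where the results come from, so the same class can be driven by either pass manager as long as the caller supplies suitable lambdas.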

llvm/lib/Passes/PassBuilderPipelines.cpp

Show First 20 Lines • Show All 934 Lines • ▼ Show 20 Lines	PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level,
// post link pipeline after ICP. This is to enable usage of the type		// post link pipeline after ICP. This is to enable usage of the type
// tests in ICP sequences.		// tests in ICP sequences.
if (Phase == ThinOrFullLTOPhase::ThinLTOPostLink)		if (Phase == ThinOrFullLTOPhase::ThinLTOPostLink)
MPM.addPass(LowerTypeTestsPass(nullptr, nullptr, true));		MPM.addPass(LowerTypeTestsPass(nullptr, nullptr, true));

for (auto &C : PipelineEarlySimplificationEPCallbacks)		for (auto &C : PipelineEarlySimplificationEPCallbacks)
C(MPM, Level);		C(MPM, Level);

// Specialize functions with IPSCCP.
if (EnableFunctionSpecialization && Level == OptimizationLevel::O3)
MPM.addPass(FunctionSpecializationPass());

// Interprocedural constant propagation now that basic cleanup has occurred		// Interprocedural constant propagation now that basic cleanup has occurred
// and prior to optimizing globals.		// and prior to optimizing globals.
// FIXME: This position in the pipeline hasn't been carefully considered in		// FIXME: This position in the pipeline hasn't been carefully considered in
// years, it should be re-analyzed.		// years, it should be re-analyzed.
MPM.addPass(IPSCCPPass());		MPM.addPass(IPSCCPPass());

// Attach metadata to indirect call sites indicating the set of functions		// Attach metadata to indirect call sites indicating the set of functions
// they may target at run-time. This should follow IPSCCP.		// they may target at run-time. This should follow IPSCCP.
▲ Show 20 Lines • Show All 584 Lines • ▼ Show 20 Lines	if (Level.getSpeedupLevel() > 1) {

// Indirect call promotion. This should promote all the targets that are		// Indirect call promotion. This should promote all the targets that are
// left by the earlier promotion pass that promotes intra-module targets.		// left by the earlier promotion pass that promotes intra-module targets.
// This two-step promotion is to save the compile time. For LTO, it should		// This two-step promotion is to save the compile time. For LTO, it should
// produce the same result as if we only do promotion here.		// produce the same result as if we only do promotion here.
MPM.addPass(PGOIndirectCallPromotion(		MPM.addPass(PGOIndirectCallPromotion(
true /* InLTO */, PGOOpt && PGOOpt->Action == PGOOptions::SampleUse));		true /* InLTO */, PGOOpt && PGOOpt->Action == PGOOptions::SampleUse));

if (EnableFunctionSpecialization && Level == OptimizationLevel::O3)
MPM.addPass(FunctionSpecializationPass());
// Propagate constants at call sites into the functions they call. This		// Propagate constants at call sites into the functions they call. This
// opens opportunities for globalopt (and inlining) by substituting function		// opens opportunities for globalopt (and inlining) by substituting function
// pointers passed as arguments to direct uses of functions.		// pointers passed as arguments to direct uses of functions.
MPM.addPass(IPSCCPPass());		MPM.addPass(IPSCCPPass());

// Attach metadata to indirect call sites indicating the set of functions		// Attach metadata to indirect call sites indicating the set of functions
// they may target at run-time. This should follow IPSCCP.		// they may target at run-time. This should follow IPSCCP.
MPM.addPass(CalledValuePropagationPass());		MPM.addPass(CalledValuePropagationPass());

llvm/lib/Passes/PassRegistry.def

	MODULE_PASS("cross-dso-cfi", CrossDSOCFIPass())			MODULE_PASS("cross-dso-cfi", CrossDSOCFIPass())
	MODULE_PASS("deadargelim", DeadArgumentEliminationPass())			MODULE_PASS("deadargelim", DeadArgumentEliminationPass())
	MODULE_PASS("debugify", NewPMDebugifyPass())			MODULE_PASS("debugify", NewPMDebugifyPass())
	MODULE_PASS("dot-callgraph", CallGraphDOTPrinterPass())			MODULE_PASS("dot-callgraph", CallGraphDOTPrinterPass())
	MODULE_PASS("elim-avail-extern", EliminateAvailableExternallyPass())			MODULE_PASS("elim-avail-extern", EliminateAvailableExternallyPass())
	MODULE_PASS("extract-blocks", BlockExtractorPass())			MODULE_PASS("extract-blocks", BlockExtractorPass())
	MODULE_PASS("forceattrs", ForceFunctionAttrsPass())			MODULE_PASS("forceattrs", ForceFunctionAttrsPass())
	MODULE_PASS("function-import", FunctionImportPass())			MODULE_PASS("function-import", FunctionImportPass())
	MODULE_PASS("function-specialization", FunctionSpecializationPass())
	MODULE_PASS("globaldce", GlobalDCEPass())			MODULE_PASS("globaldce", GlobalDCEPass())
	MODULE_PASS("globalopt", GlobalOptPass())			MODULE_PASS("globalopt", GlobalOptPass())
	MODULE_PASS("globalsplit", GlobalSplitPass())			MODULE_PASS("globalsplit", GlobalSplitPass())
	MODULE_PASS("hotcoldsplit", HotColdSplittingPass())			MODULE_PASS("hotcoldsplit", HotColdSplittingPass())
	MODULE_PASS("inferattrs", InferFunctionAttrsPass())			MODULE_PASS("inferattrs", InferFunctionAttrsPass())
	MODULE_PASS("inliner-wrapper", ModuleInlinerWrapperPass())			MODULE_PASS("inliner-wrapper", ModuleInlinerWrapperPass())
	MODULE_PASS("inliner-ml-advisor-release", ModuleInlinerWrapperPass(getInlineParams(), true, {}, InliningAdvisorMode::Release, 0))			MODULE_PASS("inliner-ml-advisor-release", ModuleInlinerWrapperPass(getInlineParams(), true, {}, InliningAdvisorMode::Release, 0))
	MODULE_PASS("print<inline-advisor>", InlineAdvisorAnalysisPrinterPass(dbgs()))			MODULE_PASS("print<inline-advisor>", InlineAdvisorAnalysisPrinterPass(dbgs()))

llvm/lib/Transforms/IPO/FunctionSpecialization.cpp

#include "llvm/ADT/Statistic.h"		#include "llvm/ADT/Statistic.h"
#include "llvm/Analysis/CodeMetrics.h"		#include "llvm/Analysis/CodeMetrics.h"
#include "llvm/Analysis/InlineCost.h"		#include "llvm/Analysis/InlineCost.h"
#include "llvm/Analysis/LoopInfo.h"		#include "llvm/Analysis/LoopInfo.h"
#include "llvm/Analysis/TargetTransformInfo.h"		#include "llvm/Analysis/TargetTransformInfo.h"
#include "llvm/Analysis/ValueLattice.h"		#include "llvm/Analysis/ValueLattice.h"
#include "llvm/Analysis/ValueLatticeUtils.h"		#include "llvm/Analysis/ValueLatticeUtils.h"
#include "llvm/IR/IntrinsicInst.h"		#include "llvm/IR/IntrinsicInst.h"
		#include "llvm/Transforms/IPO/FunctionSpecialization.h"
#include "llvm/Transforms/Scalar/SCCP.h"		#include "llvm/Transforms/Scalar/SCCP.h"
#include "llvm/Transforms/Utils/Cloning.h"		#include "llvm/Transforms/Utils/Cloning.h"
#include "llvm/Transforms/Utils/SCCPSolver.h"		#include "llvm/Transforms/Utils/SCCPSolver.h"
#include "llvm/Transforms/Utils/SizeOpts.h"		#include "llvm/Transforms/Utils/SizeOpts.h"
#include <cmath>		#include <cmath>

using namespace llvm;		using namespace llvm;

#define DEBUG_TYPE "function-specialization"		#define DEBUG_TYPE "function-specialization"

STATISTIC(NumFuncSpecialized, "Number of functions specialized");		STATISTIC(NumFuncSpecialized, "Number of functions specialized");

static cl::opt<bool> ForceFunctionSpecialization(		static cl::opt<bool> ForceFunctionSpecialization(
"force-function-specialization", cl::init(false), cl::Hidden,		"force-function-specialization", cl::init(false), cl::Hidden,
cl::desc("Force function specialization for every call site with a "		cl::desc("Force function specialization for every call site with a "
"constant argument"));		"constant argument"));

static cl::opt<unsigned> FuncSpecializationMaxIters(
"func-specialization-max-iters", cl::Hidden,
cl::desc("The maximum number of iterations function specialization is run"),
cl::init(1));

static cl::opt<unsigned> MaxClonesThreshold(		static cl::opt<unsigned> MaxClonesThreshold(
"func-specialization-max-clones", cl::Hidden,		"func-specialization-max-clones", cl::Hidden,
cl::desc("The maximum number of clones allowed for a single function "		cl::desc("The maximum number of clones allowed for a single function "
"specialization"),		"specialization"),
cl::init(3));		cl::init(3));

static cl::opt<unsigned> SmallFunctionThreshold(		static cl::opt<unsigned> SmallFunctionThreshold(
"func-specialization-size-threshold", cl::Hidden,		"func-specialization-size-threshold", cl::Hidden,
Show All 17 Lines
//		//
// https://llvm-compile-time-tracker.com		// https://llvm-compile-time-tracker.com
// https://github.com/nikic/llvm-compile-time-tracker		// https://github.com/nikic/llvm-compile-time-tracker
static cl::opt<bool> EnableSpecializationForLiteralConstant(		static cl::opt<bool> EnableSpecializationForLiteralConstant(
"function-specialization-for-literal-constant", cl::init(false), cl::Hidden,		"function-specialization-for-literal-constant", cl::init(false), cl::Hidden,
cl::desc("Enable specialization of functions that take a literal constant "		cl::desc("Enable specialization of functions that take a literal constant "
"as an argument."));		"as an argument."));

namespace {
// Bookkeeping struct to pass data from the analysis and profitability phase
// to the actual transform helper functions.
struct SpecializationInfo {
SmallVector<ArgInfo, 8> Args; // Stores the {formal,actual} argument pairs.
InstructionCost Gain; // Profitability: Gain = Bonus - Cost.
};
} // Anonymous namespace

using FuncList = SmallVectorImpl<Function *>;
using CallArgBinding = std::pair<CallBase *, Constant *>;
using CallSpecBinding = std::pair<CallBase *, SpecializationInfo>;
// We are using MapVector because it guarantees deterministic iteration
// order across executions.
using SpecializationMap = SmallMapVector<CallBase *, SpecializationInfo, 8>;

// Helper to check if \p LV is either a constant or a constant
// range with a single element. This should cover exactly the same cases as the
// old ValueLatticeElement::isConstant() and is intended to be used in the
// transition to ValueLatticeElement.
static bool isConstant(const ValueLatticeElement &LV) {
return LV.isConstant() ||
(LV.isConstantRange() && LV.getConstantRange().isSingleElement());
}

// Helper to check if \p LV is either overdefined or a constant int.
static bool isOverdefined(const ValueLatticeElement &LV) {
return !LV.isUnknownOrUndef() && !isConstant(LV);
}

static Constant *getPromotableAlloca(AllocaInst *Alloca, CallInst *Call) {		static Constant *getPromotableAlloca(AllocaInst *Alloca, CallInst *Call) {
Value *StoreValue = nullptr;		Value *StoreValue = nullptr;
for (auto *User : Alloca->users()) {		for (auto *User : Alloca->users()) {
// We can't use llvm::isAllocaPromotable() as that would fail because of		// We can't use llvm::isAllocaPromotable() as that would fail because of
// the usage in the CallInst, which is what we check here.		// the usage in the CallInst, which is what we check here.
if (User == Call)		if (User == Call)
continue;		continue;
if (auto *Bitcast = dyn_cast<BitCastInst>(User)) {		if (auto *Bitcast = dyn_cast<BitCastInst>(User)) {
Show All 13 Lines	for (auto *User : Alloca->users()) {
return nullptr;		return nullptr;
}		}
return dyn_cast_or_null<Constant>(StoreValue);		return dyn_cast_or_null<Constant>(StoreValue);
}		}

// A constant stack value is an AllocaInst that has a single constant		// A constant stack value is an AllocaInst that has a single constant
// value stored to it. Return this constant if such an alloca stack value		// value stored to it. Return this constant if such an alloca stack value
// is a function argument.		// is a function argument.
static Constant *getConstantStackValue(CallInst *Call, Value *Val,		static Constant *getConstantStackValue(CallInst *Call, Value *Val) {
SCCPSolver &Solver) {
if (!Val)		if (!Val)
return nullptr;		return nullptr;
Val = Val->stripPointerCasts();		Val = Val->stripPointerCasts();
if (auto *ConstVal = dyn_cast<ConstantInt>(Val))		if (auto *ConstVal = dyn_cast<ConstantInt>(Val))
return ConstVal;		return ConstVal;
auto *Alloca = dyn_cast<AllocaInst>(Val);		auto *Alloca = dyn_cast<AllocaInst>(Val);
if (!Alloca || !Alloca->getAllocatedType()->isIntegerTy())		if (!Alloca || !Alloca->getAllocatedType()->isIntegerTy())
return nullptr;		return nullptr;
Show All 16 Lines
//		//
// @funcspec.arg = internal constant i32 2		// @funcspec.arg = internal constant i32 2
//		//
// define internal void @someFunc(i32* arg1) {		// define internal void @someFunc(i32* arg1) {
// call void @otherFunc(i32* nonnull @funcspec.arg)		// call void @otherFunc(i32* nonnull @funcspec.arg)
// ret void		// ret void
// }		// }
//		//
static void constantArgPropagation(FuncList &WorkList, Module &M,		void FunctionSpecializer::promoteConstantStackValues() {
SCCPSolver &Solver) {
// Iterate over the argument tracked functions see if there		// Iterate over the argument tracked functions see if there
// are any new constant values for the call instruction via		// are any new constant values for the call instruction via
// stack variables.		// stack variables.
for (auto *F : WorkList) {		for (Function &F : M) {
		if (!Solver.isArgumentTrackedFunction(&F))
		continue;

for (auto *User : F->users()) {		for (auto *User : F.users()) {

auto *Call = dyn_cast<CallInst>(User);		auto *Call = dyn_cast<CallInst>(User);
if (!Call)		if (!Call)
continue;		continue;

bool Changed = false;		bool Changed = false;
for (const Use &U : Call->args()) {		for (const Use &U : Call->args()) {
unsigned Idx = Call->getArgOperandNo(&U);		unsigned Idx = Call->getArgOperandNo(&U);
Value *ArgOp = Call->getArgOperand(Idx);		Value *ArgOp = Call->getArgOperand(Idx);
Type *ArgOpType = ArgOp->getType();		Type *ArgOpType = ArgOp->getType();

if (!Call->onlyReadsMemory(Idx) || !ArgOpType->isPointerTy())		if (!Call->onlyReadsMemory(Idx) || !ArgOpType->isPointerTy())
continue;		continue;

auto *ConstVal = getConstantStackValue(Call, ArgOp, Solver);		auto *ConstVal = getConstantStackValue(Call, ArgOp);
if (!ConstVal)		if (!ConstVal)
continue;		continue;

Value *GV = new GlobalVariable(M, ConstVal->getType(), true,		Value *GV = new GlobalVariable(M, ConstVal->getType(), true,
GlobalValue::InternalLinkage, ConstVal,		GlobalValue::InternalLinkage, ConstVal,
"funcspec.arg");		"funcspec.arg");
if (ArgOpType != ConstVal->getType())		if (ArgOpType != ConstVal->getType())
GV = ConstantExpr::getBitCast(cast<Constant>(GV), ArgOpType);		GV = ConstantExpr::getBitCast(cast<Constant>(GV), ArgOpType);

Call->setArgOperand(Idx, GV);		Call->setArgOperand(Idx, GV);
Changed = true;		Changed = true;
}		}

// Add the changed CallInst to Solver Worklist		// Add the changed CallInst to Solver Worklist
if (Changed)		if (Changed)
Solver.visitCall(*Call);		Solver.visitCall(*Call);
		labrinea (Author, Done): I am not sure whether this is necessary. The unit tests which exercise recursion are passing without it at least.
}		}
}		}
}		}

// ssa_copy intrinsics are introduced by the SCCP solver. These intrinsics		// ssa_copy intrinsics are introduced by the SCCP solver. These intrinsics
// interfere with the constantArgPropagation optimization.		// interfere with the promoteConstantStackValues() optimization.
static void removeSSACopy(Function &F) {		static void removeSSACopy(Function &F) {
for (BasicBlock &BB : F) {		for (BasicBlock &BB : F) {
for (Instruction &Inst : llvm::make_early_inc_range(BB)) {		for (Instruction &Inst : llvm::make_early_inc_range(BB)) {
auto *II = dyn_cast<IntrinsicInst>(&Inst);		auto *II = dyn_cast<IntrinsicInst>(&Inst);
if (!II)		if (!II)
continue;		continue;
if (II->getIntrinsicID() != Intrinsic::ssa_copy)		if (II->getIntrinsicID() != Intrinsic::ssa_copy)
continue;		continue;
Inst.replaceAllUsesWith(II->getOperand(0));		Inst.replaceAllUsesWith(II->getOperand(0));
Inst.eraseFromParent();		Inst.eraseFromParent();
}		}
}		}
}		}

static void removeSSACopy(Module &M) {		/// Remove any ssa_copy intrinsics that may have been introduced.
for (Function &F : M)		void FunctionSpecializer::cleanUpSSA() {
removeSSACopy(F);		for (Function *F : SpecializedFuncs)
}		removeSSACopy(*F);

namespace {
class FunctionSpecializer {

/// The IPSCCP Solver.
SCCPSolver &Solver;

/// Analyses used to help determine if a function should be specialized.
std::function<AssumptionCache &(Function &)> GetAC;
std::function<TargetTransformInfo &(Function &)> GetTTI;
std::function<TargetLibraryInfo &(Function &)> GetTLI;

SmallPtrSet<Function *, 4> SpecializedFuncs;
SmallPtrSet<Function *, 4> FullySpecialized;
SmallVector<Instruction *> ReplacedWithConstant;
DenseMap<Function *, CodeMetrics> FunctionMetrics;

public:
FunctionSpecializer(SCCPSolver &Solver,
std::function<AssumptionCache &(Function &)> GetAC,
std::function<TargetTransformInfo &(Function &)> GetTTI,
std::function<TargetLibraryInfo &(Function &)> GetTLI)
: Solver(Solver), GetAC(GetAC), GetTTI(GetTTI), GetTLI(GetTLI) {}

~FunctionSpecializer() {
// Eliminate dead code.
removeDeadInstructions();
removeDeadFunctions();
}		}

/// Attempt to specialize functions in the module to enable constant		/// Attempt to specialize functions in the module to enable constant
/// propagation across function boundaries.		/// propagation across function boundaries.
///		///
/// \returns true if at least one function is specialized.		/// \returns true if at least one function is specialized.
bool specializeFunctions(FuncList &Candidates, FuncList &WorkList) {		bool FunctionSpecializer::specialize(SmallVectorImpl<Function *> &WorkList) {
fhahn: Can you fix the indentation here in an NFC commit?
labrinea (author): How? It seems OK.
		chill: I believe the LLVM convention for these kinds of classes/methods is `run`, e.g. `Vectorizer::run()`, `EarlyCSE::run()`, etc.
bool Changed = false;		bool Changed = false;
for (auto *F : Candidates) {
if (!isCandidateFunction(F))		for (Function &F : M) {
		if (!isCandidateFunction(&F))
continue;		continue;

auto Cost = getSpecializationCost(F);		auto Cost = getSpecializationCost(&F);
		chill: `WorkList` is too generic a name for a parameter, and it's not a worklist per se anyway.
if (!Cost.isValid()) {		if (!Cost.isValid()) {
LLVM_DEBUG(		LLVM_DEBUG(
dbgs() << "FnSpecialization: Invalid specialization cost.\n");		dbgs() << "FnSpecialization: Invalid specialization cost.\n");
continue;		continue;
}		}

LLVM_DEBUG(dbgs() << "FnSpecialization: Specialization cost for "		LLVM_DEBUG(dbgs() << "FnSpecialization: Specialization cost for "
<< F->getName() << " is " << Cost << "\n");		<< F.getName() << " is " << Cost << "\n");

SmallVector<CallSpecBinding, 8> Specializations;		SmallVector<CallSpecBinding, 8> Specializations;
if (!calculateGains(F, Cost, Specializations)) {		if (!calculateGains(&F, Cost, Specializations)) {
LLVM_DEBUG(dbgs() << "FnSpecialization: No possible constants found\n");		LLVM_DEBUG(dbgs() << "FnSpecialization: No possible constants found\n");
continue;		continue;
}		}

Changed = true;		Changed = true;
for (auto &Entry : Specializations)		for (auto &Entry : Specializations)
specializeFunction(F, Entry.second, WorkList);		createSpecialization(&F, Entry.second, WorkList);
}		}

updateSpecializedFuncs(Candidates, WorkList);
NumFuncSpecialized += NbFunctionsSpecialized;		NumFuncSpecialized += NbFunctionsSpecialized;
return Changed;		return Changed;
}		}

void removeDeadInstructions() {		void FunctionSpecializer::removeDeadFunctions() {
for (auto *I : ReplacedWithConstant) {
LLVM_DEBUG(dbgs() << "FnSpecialization: Removing dead instruction " << *I
<< "\n");
I->eraseFromParent();
}
ReplacedWithConstant.clear();
}

void removeDeadFunctions() {
for (auto *F : FullySpecialized) {		for (auto *F : FullySpecialized) {
LLVM_DEBUG(dbgs() << "FnSpecialization: Removing dead function "		LLVM_DEBUG(dbgs() << "FnSpecialization: Removing dead function "
<< F->getName() << "\n");		<< F->getName() << "\n");
F->eraseFromParent();		F->eraseFromParent();
}		}
FullySpecialized.clear();		FullySpecialized.clear();
}		}

bool tryToReplaceWithConstant(Value *V) {
if (!V->getType()->isSingleValueType() || isa<CallBase>(V) ||
V->user_empty())
return false;

const ValueLatticeElement &IV = Solver.getLatticeValueFor(V);
if (isOverdefined(IV))
return false;
auto *Const =
isConstant(IV) ? Solver.getConstant(IV) : UndefValue::get(V->getType());

LLVM_DEBUG(dbgs() << "FnSpecialization: Replacing " << *V
<< "\nFnSpecialization: with " << *Const << "\n");

// Record uses of V to avoid visiting irrelevant uses of const later.
SmallVector<Instruction *> UseInsts;
for (auto *U : V->users())
if (auto *I = dyn_cast<Instruction>(U))
if (Solver.isBlockExecutable(I->getParent()))
UseInsts.push_back(I);

V->replaceAllUsesWith(Const);

for (auto *I : UseInsts)
Solver.visit(I);

// Remove the instruction from Block and Solver.
if (auto *I = dyn_cast<Instruction>(V)) {
if (I->isSafeToRemove()) {
ReplacedWithConstant.push_back(I);
Solver.removeLatticeValueFor(I);
}
}
return true;
}

private:
// The number of functions specialised, used for collecting statistics and
// also in the cost model.
unsigned NbFunctionsSpecialized = 0;

// Compute the code metrics for function \p F.		// Compute the code metrics for function \p F.
CodeMetrics &analyzeFunction(Function *F) {		CodeMetrics &FunctionSpecializer::analyzeFunction(Function *F) {
		labrinea (author): We now call this function once, not for every clone as it used to be.
auto I = FunctionMetrics.insert({F, CodeMetrics()});		auto I = FunctionMetrics.insert({F, CodeMetrics()});
CodeMetrics &Metrics = I.first->second;		CodeMetrics &Metrics = I.first->second;
if (I.second) {		if (I.second) {
// The code metrics were not cached.		// The code metrics were not cached.
SmallPtrSet<const Value *, 32> EphValues;		SmallPtrSet<const Value *, 32> EphValues;
CodeMetrics::collectEphemeralValues(F, &(GetAC)(*F), EphValues);		CodeMetrics::collectEphemeralValues(F, &(GetAC)(*F), EphValues);
for (BasicBlock &BB : *F)		for (BasicBlock &BB : *F)
Metrics.analyzeBasicBlock(&BB, (GetTTI)(*F), EphValues);		Metrics.analyzeBasicBlock(&BB, (GetTTI)(*F), EphValues);

LLVM_DEBUG(dbgs() << "FnSpecialization: Code size of function "		LLVM_DEBUG(dbgs() << "FnSpecialization: Code size of function "
<< F->getName() << " is " << Metrics.NumInsts		<< F->getName() << " is " << Metrics.NumInsts
<< " instructions\n");		<< " instructions\n");
}		}
return Metrics;		return Metrics;
}		}
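
Note on the caching in `analyzeFunction`: the insert-then-check-`second` idiom computes the metrics only the first time a function is seen. A minimal standalone sketch of that pattern follows. It assumes `std::unordered_map` in place of `DenseMap`, and `Metrics`, `MetricsCache`, and `analyze` are hypothetical names, not part of the patch.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical stand-in for CodeMetrics; only the instruction count is modelled.
struct Metrics {
  unsigned NumInsts = 0;
};

// Hypothetical cache keyed by function name instead of Function *.
class MetricsCache {
  std::unordered_map<std::string, Metrics> Cache;
  unsigned Computations = 0;

public:
  // Returns the cached metrics for Name, computing them only on first request.
  Metrics &analyze(const std::string &Name, unsigned InstCount) {
    auto I = Cache.insert({Name, Metrics()});
    Metrics &M = I.first->second;
    if (I.second) { // Newly inserted, so the metrics were not cached.
      M.NumInsts = InstCount;
      ++Computations;
    }
    return M;
  }

  // Number of times the metrics were actually computed.
  unsigned computations() const { return Computations; }
};
```

A second call for the same key returns the stored entry and ignores the new instruction count, which mirrors how the cached `CodeMetrics` are reused.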

/// Clone the function \p F and remove the ssa_copy intrinsics added by		/// Clone the function \p F and remove the ssa_copy intrinsics added by
/// the SCCPSolver in the cloned version.		/// the SCCPSolver in the cloned version.
Function *cloneCandidateFunction(Function *F, ValueToValueMapTy &Mappings) {		static Function *cloneCandidateFunction(Function *F,
		ValueToValueMapTy &Mappings) {
Function *Clone = CloneFunction(F, Mappings);		Function *Clone = CloneFunction(F, Mappings);
removeSSACopy(*Clone);		removeSSACopy(*Clone);
return Clone;		return Clone;
}		}

/// This function decides whether it's worthwhile to specialize function		/// This function decides whether it's worthwhile to specialize function
/// \p F based on the known constant values its arguments can take on. It		/// \p F based on the known constant values its arguments can take on. It
/// only discovers potential specialization opportunities without actually		/// only discovers potential specialization opportunities without actually
/// applying them.		/// applying them.
///		///
/// \returns true if any specializations have been found.		/// \returns true if any specializations have been found.
bool calculateGains(Function *F, InstructionCost Cost,		bool FunctionSpecializer::calculateGains(Function *F, InstructionCost Cost,
SmallVectorImpl<CallSpecBinding> &WorkList) {		SmallVectorImpl<CallSpecBinding> &WorkList) {
SpecializationMap Specializations;		SpecializationMap Specializations;
// Determine if we should specialize the function based on the values the		// Determine if we should specialize the function based on the values the
// argument can take on. If specialization is not profitable, we continue		// argument can take on. If specialization is not profitable, we continue
// on to the next argument.		// on to the next argument.
for (Argument &FormalArg : F->args()) {		for (Argument &FormalArg : F->args()) {
// Determine if this argument is interesting. If we know the argument can		// Determine if this argument is interesting. If we know the argument can
// take on any constant values, they are collected in Constants.		// take on any constant values, they are collected in Constants.
SmallVector<CallArgBinding, 8> ActualArgs;		SmallVector<CallArgBinding, 8> ActualArgs;
if (!isArgumentInteresting(&FormalArg, ActualArgs)) {		if (!isArgumentInteresting(&FormalArg, ActualArgs)) {
LLVM_DEBUG(dbgs() << "FnSpecialization: Argument "		LLVM_DEBUG(dbgs() << "FnSpecialization: Argument "
<< FormalArg.getNameOrAsOperand()		<< FormalArg.getNameOrAsOperand()
<< " is not interesting\n");		<< " is not interesting\n");
continue;		continue;
}		}

for (const auto &Entry : ActualArgs) {		for (const auto &Entry : ActualArgs) {
CallBase *Call = Entry.first;		CallBase *Call = Entry.first;
Constant *ActualArg = Entry.second;		Constant *ActualArg = Entry.second;

auto I = Specializations.insert({Call, SpecializationInfo()});		auto I = Specializations.insert({Call, SpecializationInfo()});
SpecializationInfo &S = I.first->second;		SpecializationInfo &S = I.first->second;
		chill: This is a little fragile in the sense that the caller may forget to clear the list. It would be nicer if this function itself cleared the list first thing when it starts executing. `WorkList` could also be made a return value, taking advantage of move semantics, which looks nicer on paper but may cause a few allocations/deallocations if we iterate.

if (I.second)		if (I.second)
S.Gain = ForceFunctionSpecialization ? 1 : 0 - Cost;		S.Gain = ForceFunctionSpecialization ? 1 : 0 - Cost;
if (!ForceFunctionSpecialization)		if (!ForceFunctionSpecialization)
S.Gain += getSpecializationBonus(&FormalArg, ActualArg);		S.Gain += getSpecializationBonus(&FormalArg, ActualArg);
S.Args.push_back({&FormalArg, ActualArg});		S.Args.push_back({&FormalArg, ActualArg});
}		}
}		}

// Remove unprofitable specializations.		// Remove unprofitable specializations.
Specializations.remove_if(		Specializations.remove_if(
[](const auto &Entry) { return Entry.second.Gain <= 0; });		[](const auto &Entry) { return Entry.second.Gain <= 0; });

// Clear the MapVector and return the underlying vector.		// Clear the MapVector and return the underlying vector.
WorkList = Specializations.takeVector();		WorkList = Specializations.takeVector();

// Sort the candidates in descending order.		// Sort the candidates in descending order.
llvm::stable_sort(WorkList, [](const auto &L, const auto &R) {		llvm::stable_sort(WorkList, [](const auto &L, const auto &R) {
return L.second.Gain > R.second.Gain;		return L.second.Gain > R.second.Gain;
});		});

// Truncate the worklist to 'MaxClonesThreshold' candidates if necessary.		// Truncate the worklist to 'MaxClonesThreshold' candidates if necessary.
if (WorkList.size() > MaxClonesThreshold) {		if (WorkList.size() > MaxClonesThreshold) {
LLVM_DEBUG(dbgs() << "FnSpecialization: Number of candidates exceed "		LLVM_DEBUG(dbgs() << "FnSpecialization: Number of candidates exceed "
<< "the maximum number of clones threshold.\n"		<< "the maximum number of clones threshold.\n"
<< "FnSpecialization: Truncating worklist to "		<< "FnSpecialization: Truncating worklist to "
<< MaxClonesThreshold << " candidates.\n");		<< MaxClonesThreshold << " candidates.\n");
WorkList.erase(WorkList.begin() + MaxClonesThreshold, WorkList.end());		WorkList.erase(WorkList.begin() + MaxClonesThreshold, WorkList.end());
}		}

LLVM_DEBUG(dbgs() << "FnSpecialization: Specializations for function "		LLVM_DEBUG(dbgs() << "FnSpecialization: Specializations for function "
<< F->getName() << "\n";		<< F->getName() << "\n";
for (const auto &Entry		for (const auto &Entry
: WorkList) {		: WorkList) {
dbgs() << "FnSpecialization: Gain = " << Entry.second.Gain		dbgs() << "FnSpecialization: Gain = " << Entry.second.Gain
<< "\n";		<< "\n";
for (const ArgInfo &Arg : Entry.second.Args)		for (const ArgInfo &Arg : Entry.second.Args)
dbgs() << "FnSpecialization: FormalArg = "		dbgs() << "FnSpecialization: FormalArg = "
<< Arg.Formal->getNameOrAsOperand()		<< Arg.Formal->getNameOrAsOperand()
<< ", ActualArg = "		<< ", ActualArg = "
<< Arg.Actual->getNameOrAsOperand() << "\n";		<< Arg.Actual->getNameOrAsOperand() << "\n";
});		});

return !WorkList.empty();		return !WorkList.empty();
}		}
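
Note on the tail of `calculateGains`: it drops unprofitable entries, orders the remainder by descending gain, and truncates to `MaxClonesThreshold`. The sketch below shows those three steps with standard containers only. `Candidate` and `selectCandidates` are hypothetical names, and the limit is passed as a parameter rather than read from a command-line option.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a (call site, SpecializationInfo) pair.
struct Candidate {
  int Id;
  int Gain;
};

// Keep only profitable candidates, best first, at most MaxClones of them.
std::vector<Candidate> selectCandidates(std::vector<Candidate> Cands,
                                        std::size_t MaxClones) {
  // Remove unprofitable specializations (Gain <= 0).
  Cands.erase(std::remove_if(Cands.begin(), Cands.end(),
                             [](const Candidate &C) { return C.Gain <= 0; }),
              Cands.end());

  // Sort in descending order of gain; stable, so ties keep insertion order.
  std::stable_sort(Cands.begin(), Cands.end(),
                   [](const Candidate &L, const Candidate &R) {
                     return L.Gain > R.Gain;
                   });

  // Truncate to the clone limit if necessary.
  if (Cands.size() > MaxClones)
    Cands.erase(Cands.begin() + static_cast<std::ptrdiff_t>(MaxClones),
                Cands.end());
  return Cands;
}
```

Because the sort is stable, candidates with equal gain stay in discovery order, so the truncation is deterministic.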

bool isCandidateFunction(Function *F) {		bool FunctionSpecializer::isCandidateFunction(Function *F) {
		if (F->isDeclaration())
		return false;

		if (F->hasFnAttribute(Attribute::NoDuplicate))
		return false;

		// Can't specialize non argument tracked functions.
		if (!Solver.isArgumentTrackedFunction(F))
		return false;

// Do not specialize the cloned function again.		// Do not specialize the cloned function again.
if (SpecializedFuncs.contains(F))		if (SpecializedFuncs.contains(F))
return false;		return false;

// If we're optimizing the function for size, we shouldn't specialize it.		// If we're optimizing the function for size, we shouldn't specialize it.
if (F->hasOptSize() ||		if (F->hasOptSize() ||
shouldOptimizeForSize(F, nullptr, nullptr, PGSOQueryType::IRPass))		shouldOptimizeForSize(F, nullptr, nullptr, PGSOQueryType::IRPass))
return false;		return false;

// Exit if the function is not executable. There's no point in specializing		// Exit if the function is not executable. There's no point in specializing
// a dead function.		// a dead function.
if (!Solver.isBlockExecutable(&F->getEntryBlock()))		if (!Solver.isBlockExecutable(&F->getEntryBlock()))
return false;		return false;

// It wastes time to specialize a function which would get inlined finally.		// It wastes time to specialize a function which would get inlined finally.
if (F->hasFnAttribute(Attribute::AlwaysInline))		if (F->hasFnAttribute(Attribute::AlwaysInline))
return false;		return false;

LLVM_DEBUG(dbgs() << "FnSpecialization: Try function: " << F->getName()		LLVM_DEBUG(dbgs() << "FnSpecialization: Try function: " << F->getName()
<< "\n");		<< "\n");
return true;		return true;
}		}

void specializeFunction(Function *F, SpecializationInfo &S,		void FunctionSpecializer::createSpecialization(Function *F,
FuncList &WorkList) {		SpecializationInfo &S,
		SmallVectorImpl<Function *> &WorkList) {
ValueToValueMapTy Mappings;		ValueToValueMapTy Mappings;
Function *Clone = cloneCandidateFunction(F, Mappings);		Function *Clone = cloneCandidateFunction(F, Mappings);

// Rewrite calls to the function so that they call the clone instead.		// Rewrite calls to the function so that they call the clone instead.
rewriteCallSites(Clone, S.Args, Mappings);		rewriteCallSites(Clone, S.Args, Mappings);

// Initialize the lattice state of the arguments of the function clone,		// Initialize the lattice state of the arguments of the function clone,
// marking the argument on which we specialized the function constant		// marking the argument on which we specialized the function constant
// with the given value.		// with the given value.
Solver.markArgInFuncSpecialization(Clone, S.Args);		Solver.markArgInFuncSpecialization(Clone, S.Args);

		Solver.addArgumentTrackedFunction(Clone);
		Solver.markBlockExecutable(&Clone->front());

// Mark all the specialized functions		// Mark all the specialized functions
WorkList.push_back(Clone);		WorkList.push_back(Clone);
		SpecializedFuncs.insert(Clone);
NbFunctionsSpecialized++;		NbFunctionsSpecialized++;

// If the function has been completely specialized, the original function		// If the function has been completely specialized, the original function
// is no longer needed. Mark it unreachable.		// is no longer needed. Mark it unreachable.
if (F->getNumUses() == 0 || all_of(F->users(), [F](User *U) {		if (F->getNumUses() == 0 || all_of(F->users(), [F](User *U) {
if (auto *CS = dyn_cast<CallBase>(U))		if (auto *CS = dyn_cast<CallBase>(U))
return CS->getFunction() == F;		return CS->getFunction() == F;
return false;		return false;
})) {		})) {
Solver.markFunctionUnreachable(F);		Solver.markFunctionUnreachable(F);
FullySpecialized.insert(F);		FullySpecialized.insert(F);
}		}
}		}

/// Compute and return the cost of specializing function \p F.		/// Compute and return the cost of specializing function \p F.
InstructionCost getSpecializationCost(Function *F) {		InstructionCost FunctionSpecializer::getSpecializationCost(Function *F) {
CodeMetrics &Metrics = analyzeFunction(F);		CodeMetrics &Metrics = analyzeFunction(F);
// If the code metrics reveal that we shouldn't duplicate the function, we		// If the code metrics reveal that we shouldn't duplicate the function, we
// shouldn't specialize it. Set the specialization cost to Invalid.		// shouldn't specialize it. Set the specialization cost to Invalid.
// Or if the lines of codes implies that this function is easy to get		// Or if the lines of codes implies that this function is easy to get
// inlined so that we shouldn't specialize it.		// inlined so that we shouldn't specialize it.
if (Metrics.notDuplicatable || !Metrics.NumInsts.isValid() ||		if (Metrics.notDuplicatable || !Metrics.NumInsts.isValid() ||
(!ForceFunctionSpecialization &&		(!ForceFunctionSpecialization &&
Metrics.NumInsts < SmallFunctionThreshold)) {		Metrics.NumInsts < SmallFunctionThreshold)) {
InstructionCost C{};		InstructionCost C{};
C.setInvalid();		C.setInvalid();
return C;		return C;
}		}

// Otherwise, set the specialization cost to be the cost of all the		// Otherwise, set the specialization cost to be the cost of all the
// instructions in the function and penalty for specializing more functions.		// instructions in the function and penalty for specializing more functions.
unsigned Penalty = NbFunctionsSpecialized + 1;		unsigned Penalty = NbFunctionsSpecialized + 1;
return Metrics.NumInsts * InlineConstants::getInstrCost() * Penalty;		return Metrics.NumInsts * InlineConstants::getInstrCost() * Penalty;
}		}
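
Note on `getSpecializationCost`: the cost is either invalid (not duplicatable, or too small unless forced) or the instruction count scaled by a per-instruction cost and a penalty that grows with each specialization already made. A rough standalone sketch follows. `Cost` and `specializationCost` are hypothetical names, every input is passed explicitly as an assumption, and the `NumInsts.isValid()` check is omitted because plain integers are always valid.

```cpp
// Hypothetical stand-in for InstructionCost: a value plus a validity flag.
struct Cost {
  bool Valid;
  unsigned Value;
};

// All parameters are illustrative; in the patch they come from CodeMetrics,
// command-line options, InlineConstants, and the specializer's own counter.
Cost specializationCost(unsigned NumInsts, bool NotDuplicatable, bool Force,
                        unsigned SmallFunctionThreshold, unsigned InstrCost,
                        unsigned NumAlreadySpecialized) {
  // Functions that must not be duplicated, or that are small enough to be
  // inlined anyway (unless specialization is forced), get an invalid cost.
  if (NotDuplicatable || (!Force && NumInsts < SmallFunctionThreshold))
    return {false, 0};

  // Otherwise the cost grows with every function already specialized.
  unsigned Penalty = NumAlreadySpecialized + 1;
  return {true, NumInsts * InstrCost * Penalty};
}
```

With this shape, the same function becomes progressively more expensive to specialize as more clones are created.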

InstructionCost getUserBonus(User *U, llvm::TargetTransformInfo &TTI,		static InstructionCost getUserBonus(User *U, llvm::TargetTransformInfo &TTI,
LoopInfo &LI) {		LoopInfo &LI) {
auto *I = dyn_cast_or_null<Instruction>(U);		auto *I = dyn_cast_or_null<Instruction>(U);
// If not an instruction we do not know how to evaluate.		// If not an instruction we do not know how to evaluate.
// Keep minimum possible cost for now so that it doesnt affect		// Keep minimum possible cost for now so that it doesnt affect
// specialization.		// specialization.
if (!I)		if (!I)
return std::numeric_limits<unsigned>::min();		return std::numeric_limits<unsigned>::min();

InstructionCost Cost =		InstructionCost Cost =
TTI.getInstructionCost(U, TargetTransformInfo::TCK_SizeAndLatency);		TTI.getInstructionCost(U, TargetTransformInfo::TCK_SizeAndLatency);

// Traverse recursively if there are more uses.		// Traverse recursively if there are more uses.
// TODO: Any other instructions to be added here?		// TODO: Any other instructions to be added here?
if (I->mayReadFromMemory() || I->isCast())		if (I->mayReadFromMemory() || I->isCast())
for (auto *User : I->users())		for (auto *User : I->users())
Cost += getUserBonus(User, TTI, LI);		Cost += getUserBonus(User, TTI, LI);

// Increase the cost if it is inside the loop.		// Increase the cost if it is inside the loop.
auto LoopDepth = LI.getLoopDepth(I->getParent());		auto LoopDepth = LI.getLoopDepth(I->getParent());
Cost *= std::pow((double)AvgLoopIterationCount, LoopDepth);		Cost *= std::pow((double)AvgLoopIterationCount, LoopDepth);
return Cost;		return Cost;
}		}
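
Note on the loop weighting in `getUserBonus`: the user cost is multiplied by `AvgLoopIterationCount` raised to the loop depth, so a user at depth 0 keeps its base cost. A small sketch of just that scaling follows. `scaleByLoopDepth` is a hypothetical helper and the iteration count is an explicit parameter here; its actual default is not shown in this diff.

```cpp
#include <cmath>

// Scale a base cost by AvgLoopIterationCount^LoopDepth.
// Depth 0 (not inside any loop) leaves the cost unchanged.
double scaleByLoopDepth(double BaseCost, unsigned AvgLoopIterationCount,
                        unsigned LoopDepth) {
  return BaseCost *
         std::pow(static_cast<double>(AvgLoopIterationCount), LoopDepth);
}
```

The bonus therefore grows geometrically with nesting depth, which favours arguments used inside deeply nested loops.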

/// Compute a bonus for replacing argument \p A with constant \p C.		/// Compute a bonus for replacing argument \p A with constant \p C.
InstructionCost getSpecializationBonus(Argument *A, Constant *C) {		InstructionCost FunctionSpecializer::getSpecializationBonus(Argument *A,
		Constant *C) {
Function *F = A->getParent();		Function *F = A->getParent();
DominatorTree DT(*F);		DominatorTree DT(*F);
LoopInfo LI(DT);		LoopInfo LI(DT);
auto &TTI = (GetTTI)(*F);		auto &TTI = (GetTTI)(*F);
LLVM_DEBUG(dbgs() << "FnSpecialization: Analysing bonus for constant: "		LLVM_DEBUG(dbgs() << "FnSpecialization: Analysing bonus for constant: "
<< C->getNameOrAsOperand() << "\n");		<< C->getNameOrAsOperand() << "\n");

InstructionCost TotalCost = 0;		InstructionCost TotalCost = 0;
for (auto *U : A->users()) {		for (auto *U : A->users()) {
TotalCost += getUserBonus(U, TTI, LI);		TotalCost += getUserBonus(U, TTI, LI);
LLVM_DEBUG(dbgs() << "FnSpecialization: User cost ";		LLVM_DEBUG(dbgs() << "FnSpecialization: User cost ";
TotalCost.print(dbgs()); dbgs() << " for: " << *U << "\n");		TotalCost.print(dbgs()); dbgs() << " for: " << *U << "\n");
}		}

// The below heuristic is only concerned with exposing inlining		// The below heuristic is only concerned with exposing inlining
// opportunities via indirect call promotion. If the argument is not a		// opportunities via indirect call promotion. If the argument is not a
// (potentially casted) function pointer, give up.		// (potentially casted) function pointer, give up.
Function *CalledFunction = dyn_cast<Function>(C->stripPointerCasts());		Function *CalledFunction = dyn_cast<Function>(C->stripPointerCasts());
if (!CalledFunction)		if (!CalledFunction)
return TotalCost;		return TotalCost;

// Get TTI for the called function (used for the inline cost).		// Get TTI for the called function (used for the inline cost).
auto &CalleeTTI = (GetTTI)(*CalledFunction);		auto &CalleeTTI = (GetTTI)(*CalledFunction);

// Look at all the call sites whose called value is the argument.		// Look at all the call sites whose called value is the argument.
// Specializing the function on the argument would allow these indirect		// Specializing the function on the argument would allow these indirect
// calls to be promoted to direct calls. If the indirect call promotion		// calls to be promoted to direct calls. If the indirect call promotion
// would likely enable the called function to be inlined, specializing is a		// would likely enable the called function to be inlined, specializing is a
// good idea.		// good idea.
int Bonus = 0;		int Bonus = 0;
for (User *U : A->users()) {		for (User *U : A->users()) {
if (!isa<CallInst>(U) && !isa<InvokeInst>(U))		if (!isa<CallInst>(U) && !isa<InvokeInst>(U))
continue;		continue;
auto *CS = cast<CallBase>(U);		auto *CS = cast<CallBase>(U);
if (CS->getCalledOperand() != A)		if (CS->getCalledOperand() != A)
continue;		continue;

// Get the cost of inlining the called function at this call site. Note		// Get the cost of inlining the called function at this call site. Note
// that this is only an estimate. The called function may eventually		// that this is only an estimate. The called function may eventually
// change in a way that leads to it not being inlined here, even though		// change in a way that leads to it not being inlined here, even though
// inlining looks profitable now. For example, one of its called		// inlining looks profitable now. For example, one of its called
// functions may be inlined into it, making the called function too large		// functions may be inlined into it, making the called function too large
// to be inlined into this call site.		// to be inlined into this call site.
//		//
// We apply a boost for performing indirect call promotion by increasing		// We apply a boost for performing indirect call promotion by increasing
// the default threshold by the threshold for indirect calls.		// the default threshold by the threshold for indirect calls.
auto Params = getInlineParams();		auto Params = getInlineParams();
Params.DefaultThreshold += InlineConstants::IndirectCallThreshold;		Params.DefaultThreshold += InlineConstants::IndirectCallThreshold;
InlineCost IC =		InlineCost IC =
getInlineCost(*CS, CalledFunction, Params, CalleeTTI, GetAC, GetTLI);		getInlineCost(*CS, CalledFunction, Params, CalleeTTI, GetAC, GetTLI);

// We clamp the bonus for this call to be between zero and the default		// We clamp the bonus for this call to be between zero and the default
// threshold.		// threshold.
if (IC.isAlways())		if (IC.isAlways())
Bonus += Params.DefaultThreshold;		Bonus += Params.DefaultThreshold;
else if (IC.isVariable() && IC.getCostDelta() > 0)		else if (IC.isVariable() && IC.getCostDelta() > 0)
Bonus += IC.getCostDelta();		Bonus += IC.getCostDelta();

LLVM_DEBUG(dbgs() << "FnSpecialization: Inlining bonus " << Bonus		LLVM_DEBUG(dbgs() << "FnSpecialization: Inlining bonus " << Bonus
<< " for user " << *U << "\n");		<< " for user " << *U << "\n");
}		}

return TotalCost + Bonus;		return TotalCost + Bonus;
}		}

/// Determine if we should specialize a function based on the incoming values		/// Determine if we should specialize a function based on the incoming values
/// of the given argument.		/// of the given argument.
///		///
/// This function implements the goal-directed heuristic. It determines if		/// This function implements the goal-directed heuristic. It determines if
/// specializing the function based on the incoming values of argument \p A		/// specializing the function based on the incoming values of argument \p A
/// would result in any significant optimization opportunities. If		/// would result in any significant optimization opportunities. If
/// optimization opportunities exist, the constant values of \p A on which to		/// optimization opportunities exist, the constant values of \p A on which to
/// specialize the function are collected in \p Constants.		/// specialize the function are collected in \p Constants.
///		///
/// \returns true if the function should be specialized on the given		/// \returns true if the function should be specialized on the given
/// argument.		/// argument.
bool isArgumentInteresting(Argument *A,		bool FunctionSpecializer::isArgumentInteresting(Argument *A,
SmallVectorImpl<CallArgBinding> &Constants) {		SmallVectorImpl<CallArgBinding> &Constants) {
// For now, don't attempt to specialize functions based on the values of		// For now, don't attempt to specialize functions based on the values of
// composite types.		// composite types.
if (!A->getType()->isSingleValueType() || A->user_empty())		if (!A->getType()->isSingleValueType() || A->user_empty())
return false;		return false;

// If the argument isn't overdefined, there's nothing to do. It should		// If the argument isn't overdefined, there's nothing to do. It should
// already be constant.		// already be constant.
if (!Solver.getLatticeValueFor(A).isOverdefined()) {		if (!Solver.getLatticeValueFor(A).isOverdefined()) {
LLVM_DEBUG(dbgs() << "FnSpecialization: Nothing to do, argument "		LLVM_DEBUG(dbgs() << "FnSpecialization: Nothing to do, argument "
<< A->getNameOrAsOperand()		<< A->getNameOrAsOperand()
<< " is already constant?\n");		<< " is already constant?\n");
return false;		return false;
}		}

// Collect the constant values that the argument can take on. If the		// Collect the constant values that the argument can take on. If the
// argument can't take on any constant values, we aren't going to		// argument can't take on any constant values, we aren't going to
// specialize the function. While it's possible to specialize the function		// specialize the function. While it's possible to specialize the function
// based on non-constant arguments, there's likely not much benefit to		// based on non-constant arguments, there's likely not much benefit to
// constant propagation in doing so.		// constant propagation in doing so.
//		//
// TODO 1: currently it won't specialize if there are over the threshold of		// TODO 1: currently it won't specialize if there are over the threshold of
// calls using the same argument, e.g foo(a) x 4 and foo(b) x 1, but it		// calls using the same argument, e.g foo(a) x 4 and foo(b) x 1, but it
// might be beneficial to take the occurrences into account in the cost		// might be beneficial to take the occurrences into account in the cost
// model, so we would need to find the unique constants.		// model, so we would need to find the unique constants.
//		//
// TODO 2: this currently does not support constants, i.e. integer ranges.		// TODO 2: this currently does not support constants, i.e. integer ranges.
//		//
getPossibleConstants(A, Constants);		getPossibleConstants(A, Constants);

if (Constants.empty())		if (Constants.empty())
return false;		return false;

LLVM_DEBUG(dbgs() << "FnSpecialization: Found interesting argument "		LLVM_DEBUG(dbgs() << "FnSpecialization: Found interesting argument "
<< A->getNameOrAsOperand() << "\n");		<< A->getNameOrAsOperand() << "\n");
return true;		return true;
}		}

/// Collect in \p Constants all the constant values that argument \p A can		/// Collect in \p Constants all the constant values that argument \p A can
/// take on.		/// take on.
void getPossibleConstants(Argument *A,		void FunctionSpecializer::getPossibleConstants(Argument *A,
SmallVectorImpl<CallArgBinding> &Constants) {		SmallVectorImpl<CallArgBinding> &Constants) {
Function *F = A->getParent();		Function *F = A->getParent();

// SCCP solver does not record an argument that will be constructed on		// SCCP solver does not record an argument that will be constructed on
// stack.		// stack.
if (A->hasByValAttr() && !F->onlyReadsMemory())		if (A->hasByValAttr() && !F->onlyReadsMemory())
return;		return;

// Iterate over all the call sites of the argument's parent function.		// Iterate over all the call sites of the argument's parent function.
for (User *U : F->users()) {		for (User *U : F->users()) {
if (!isa<CallInst>(U) && !isa<InvokeInst>(U))		if (!isa<CallInst>(U) && !isa<InvokeInst>(U))
continue;		continue;
auto &CS = *cast<CallBase>(U);		auto &CS = *cast<CallBase>(U);
// If the call site has attribute minsize set, that callsite won't be		// If the call site has attribute minsize set, that callsite won't be
// specialized.		// specialized.
if (CS.hasFnAttr(Attribute::MinSize))		if (CS.hasFnAttr(Attribute::MinSize))
continue;		continue;

// If the parent of the call site will never be executed, we don't need		// If the parent of the call site will never be executed, we don't need
// to worry about the passed value.		// to worry about the passed value.
if (!Solver.isBlockExecutable(CS.getParent()))		if (!Solver.isBlockExecutable(CS.getParent()))
continue;		continue;

auto *V = CS.getArgOperand(A->getArgNo());		auto *V = CS.getArgOperand(A->getArgNo());
if (isa<PoisonValue>(V))		if (isa<PoisonValue>(V))
return;		return;

// TrackValueOfGlobalVariable only tracks scalar global variables.		// TrackValueOfGlobalVariable only tracks scalar global variables.
if (auto *GV = dyn_cast<GlobalVariable>(V)) {		if (auto *GV = dyn_cast<GlobalVariable>(V)) {
// Check if we want to specialize on the address of non-constant		// Check if we want to specialize on the address of non-constant
// global values.		// global values.
if (!GV->isConstant())		if (!GV->isConstant())
if (!SpecializeOnAddresses)		if (!SpecializeOnAddresses)
return;		return;

if (!GV->getValueType()->isSingleValueType())		if (!GV->getValueType()->isSingleValueType())
return;		return;
}		}

if (isa<Constant>(V) && (Solver.getLatticeValueFor(V).isConstant() ||		if (isa<Constant>(V) && (Solver.getLatticeValueFor(V).isConstant() ||
EnableSpecializationForLiteralConstant))		EnableSpecializationForLiteralConstant))
Constants.push_back({&CS, cast<Constant>(V)});		Constants.push_back({&CS, cast<Constant>(V)});
}		}
}		}

/// Rewrite calls to function \p F to call function \p Clone instead.		/// Rewrite calls to function \p F to call function \p Clone instead.
///		///
/// This function modifies calls to function \p F as long as the actual		/// This function modifies calls to function \p F as long as the actual
/// arguments match those in \p Args. Note that for recursive calls we		/// arguments match those in \p Args. Note that for recursive calls we
/// need to compare against the cloned formal arguments.		/// need to compare against the cloned formal arguments.
///		///
		fhahn (Done): IIUC this is done in between solver runs, right? Is this needed? Isn't it sufficient to continue with the constant value in the value mapping? This would probably remove the need to tell the solver to forget instructions/values.
		labrinea (Author, Done): This is not the only invocation of `tryToReplaceWithConstant` in FuncSpec. In this instance we try to replace the arguments of cloned functions. There's another invocation inside the functor `RunSCCPSolver`. In that instance we try to replace the instructions of cloned functions. Both calls occur as many times as `FuncSpecializationMaxIters` is set to. Moreover, the SCCP pass itself does the same thing on arguments of tracked functions and on instructions of executable blocks (with `tryToReplaceWithConstant` and `simplifyInstsInBlock` respectively). This happens after the Solver runs and before the Function Specializer is invoked. Therefore, I think we still need to tell the Solver to forget instructions/values if we want to merge the two passes.
/// Callsites that have been marked with the MinSize function attribute won't		/// Callsites that have been marked with the MinSize function attribute won't
/// be specialized and rewritten.		/// be specialized and rewritten.
void rewriteCallSites(Function *Clone, const SmallVectorImpl<ArgInfo> &Args,		void FunctionSpecializer::rewriteCallSites(Function *Clone,
		const SmallVectorImpl<ArgInfo> &Args,
ValueToValueMapTy &Mappings) {		ValueToValueMapTy &Mappings) {
assert(!Args.empty() && "Specialization without arguments");		assert(!Args.empty() && "Specialization without arguments");
Function *F = Args[0].Formal->getParent();		Function *F = Args[0].Formal->getParent();

SmallVector<CallBase *, 8> CallSitesToRewrite;		SmallVector<CallBase *, 8> CallSitesToRewrite;
for (auto *U : F->users()) {		for (auto *U : F->users()) {
if (!isa<CallInst>(U) && !isa<InvokeInst>(U))		if (!isa<CallInst>(U) && !isa<InvokeInst>(U))
continue;		continue;
auto &CS = *cast<CallBase>(U);		auto &CS = *cast<CallBase>(U);
if (!CS.getCalledFunction() \|\| CS.getCalledFunction() != F)		if (!CS.getCalledFunction() \|\| CS.getCalledFunction() != F)
continue;		continue;
CallSitesToRewrite.push_back(&CS);		CallSitesToRewrite.push_back(&CS);
}		}

LLVM_DEBUG(dbgs() << "FnSpecialization: Replacing call sites of "		LLVM_DEBUG(dbgs() << "FnSpecialization: Replacing call sites of "
<< F->getName() << " with " << Clone->getName() << "\n");		<< F->getName() << " with " << Clone->getName() << "\n");

for (auto *CS : CallSitesToRewrite) {		for (auto *CS : CallSitesToRewrite) {
LLVM_DEBUG(dbgs() << "FnSpecialization: "		LLVM_DEBUG(dbgs() << "FnSpecialization: "
<< CS->getFunction()->getName() << " ->" << *CS		<< CS->getFunction()->getName() << " ->" << *CS
<< "\n");		<< "\n");
if (/* recursive call */		if (/* recursive call */
(CS->getFunction() == Clone &&		(CS->getFunction() == Clone &&
all_of(Args,		all_of(Args,
[CS, &Mappings](const ArgInfo &Arg) {		[CS, &Mappings](const ArgInfo &Arg) {
unsigned ArgNo = Arg.Formal->getArgNo();		unsigned ArgNo = Arg.Formal->getArgNo();
return CS->getArgOperand(ArgNo) == Mappings[Arg.Formal];		return CS->getArgOperand(ArgNo) == Mappings[Arg.Formal];
})) \|\|		})) \|\|
/* normal call */		/* normal call */
all_of(Args, [CS](const ArgInfo &Arg) {		all_of(Args, [CS](const ArgInfo &Arg) {
unsigned ArgNo = Arg.Formal->getArgNo();		unsigned ArgNo = Arg.Formal->getArgNo();
return CS->getArgOperand(ArgNo) == Arg.Actual;		return CS->getArgOperand(ArgNo) == Arg.Actual;
})) {		})) {
CS->setCalledFunction(Clone);		CS->setCalledFunction(Clone);
Solver.markOverdefined(CS);		Solver.markOverdefined(CS);
}		}
}		}
}		}
		labrinea (Author, Done): There might be more call sites to rewrite than those in the CallSpecBinding that we have already found, therefore we need to repeat this lookup of users here, but at least it now happens once for F compared to Clones.size() times, which was the case before.
		labrinea (Author, Done): No need to examine lattices of arguments if it's the key of the CallSpecBinding.
		labrinea (Author, Done): We are modifying the list whilst traversing it, so we swap the current element with the last one and reduce the iteration range by one.
		labrinea (Author, Done): The condition was different before, but I think this is correct.
		chill (Done): You can swap the order of the loops and get rid of `CallSitesToRewrite`.
		chill (Done): Maybe I'm getting a bit ahead of myself, but you can change the function to rewrite just a single call site and factor out the iteration over call sites. The benefit is that the function becomes more reusable, as you can independently choose the set of call sites it operates upon. (Incidentally, I'm planning to use it that way, but it is generally a good change, even if what I have in mind turns out non-working.)
		labrinea (Author, Done): If I change it the way you suggest it will regress the current behaviour. Qsort() from SPEC's mcf won't specialize (it's a recursive function) and it's been quite a driver for this work. What if you alter it once you refactor how we rewrite callsites?
		labrinea (Author, Done): You mean to traverse F's users here instead? We are iterating over them while modifying them, which is the reason why `CallSitesToRewrite` existed in the first place, I believe. Also the dynamic cast and other checks we do: CS->getCalledFunction() == F && Solver.isBlockExecutable(CS->getParent()) (note: this one is missing from the current revision) don't need to be repeated on every iteration of the outer loop which walks the specializations.
		chill (Done): What we have now is: `forall S : Specialisations { forall C : CallSites { do_stuff(S, C); } }`. What I'm suggesting is to reorder the loops: `forall C : CallSites { forall S : Specialisations { do_stuff(S, C); } }`. That'll avoid the swaps/pop_back. The vector itself stays, good point. And then I'm suggesting to move the outer loop out of the function: `func foo() { ... forall C : CallSites { rewriteCallSite(C) } ... }` and `func rewriteCallSite(C) { forall S : Specialisations { do_stuff(S, C); } }`. Both are NFC. Regarding "Also the dynamic cast and other checks we do: CS->getCalledFunction() == F && Solver.isBlockExecutable(CS->getParent()) (note: this one is missing from the current revision) don't need to be repeated on every iteration of the outer loop which walks the specializations": I don't understand this. We do nothing between loops, so interchanging them will execute exactly the same operations in the loop body. If you add these checks somewhere, it's another argument to move the iteration over call sites to the outer loop.
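The loop interchange discussed in the comments above can be sketched in isolation. This is only an illustration under simplifying assumptions, not code from the patch: a call site is modelled as a single integer argument, a specialization as the constant it was created for, and the names `CallSite`, `Spec`, `rewriteSpecMajor`, `rewriteCallSite` and `rewriteCallSiteMajor` are hypothetical. Both orderings bind each call site to the first matching specialization, which is why the interchange would be expected to be NFC in this simplified model.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a call site: one integer argument plus the index
// of the specialization it was rewritten to (-1 means "not rewritten").
struct CallSite {
  int Arg;
  int Target;
  explicit CallSite(int A) : Arg(A), Target(-1) {}
};

// Hypothetical stand-in for a specialization: the constant it was created for.
using Spec = int;

// Specialization-major order. A rewritten call site is swapped with the last
// active element and the active range shrinks by one, so it is not visited
// again by later specializations.
void rewriteSpecMajor(std::vector<CallSite *> Pending,
                      const std::vector<Spec> &Specs) {
  std::size_t End = Pending.size();
  for (std::size_t S = 0; S < Specs.size(); ++S) {
    for (std::size_t I = 0; I < End;) {
      if (Pending[I]->Arg == Specs[S]) {
        Pending[I]->Target = static_cast<int>(S);
        std::swap(Pending[I], Pending[--End]); // re-examine the swapped-in element
      } else {
        ++I;
      }
    }
  }
}

// Call-site-major order, inner part: handle exactly one call site.
void rewriteCallSite(CallSite &CS, const std::vector<Spec> &Specs) {
  for (std::size_t S = 0; S < Specs.size(); ++S) {
    if (CS.Arg == Specs[S]) {
      CS.Target = static_cast<int>(S);
      return; // first matching specialization wins
    }
  }
}

// Call-site-major order, outer part: the caller chooses the set of call sites.
void rewriteCallSiteMajor(const std::vector<CallSite *> &CallSites,
                          const std::vector<Spec> &Specs) {
  for (CallSite *CS : CallSites)
    rewriteCallSite(*CS, Specs);
}
```

In the second form no swap or shrinking range is needed, and `rewriteCallSite` can be reused on any set of call sites the caller selects.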
		chill (Done): This is better placed outside of `rewriteCallSites`, perhaps just after the call to `rewriteCallSites`.
		chill (Done): Or a better idea: get the initial size of `CallSitesToRewrite`, decrement that number every time you update a call site. At the end, if this number drops to zero, mark the function unreachable.
		labrinea (Author, Done): That won't work for dead recursive functions.
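One plausible reading of this objection, sketched under assumptions that the thread does not spell out: if the original function still contains a call to itself that was not rewritten, a counter of rewritten call sites never reaches zero, even though no other function calls it any more. The model below is hypothetical (`Call`, `deadByCounter` and `deadIgnoringSelfCalls` are illustrative names, not part of the patch) and ignores everything else that determines whether a function is really dead.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of a call to function F: the name of the function that
// contains the call, and whether the call was redirected to a clone.
struct Call {
  std::string Caller;
  bool Rewritten;
};

// Counter heuristic: F is considered dead once every call has been rewritten.
bool deadByCounter(const std::vector<Call> &Calls) {
  std::size_t Remaining = Calls.size();
  for (const Call &C : Calls)
    if (C.Rewritten)
      --Remaining;
  return Remaining == 0;
}

// Alternative check (an assumption for illustration): F is considered dead if
// every call that still targets it is located inside F itself.
bool deadIgnoringSelfCalls(const std::vector<Call> &Calls,
                           const std::string &F) {
  for (const Call &C : Calls)
    if (!C.Rewritten && C.Caller != F)
      return false;
  return true;
}
```

With one rewritten external call and one untouched self-recursive call, the counter heuristic keeps the function alive while the second check does not.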
		chill (Done): Use braces around the `for`, since there are more than two levels of nesting.
		labrinea (Author, Done): I'll rename this and add a comment to explain what it is used for.

void updateSpecializedFuncs(FuncList &Candidates, FuncList &WorkList) {
for (auto *F : WorkList) {
SpecializedFuncs.insert(F);

// Initialize the state of the newly created functions, marking them
// argument-tracked and executable.
if (F->hasExactDefinition() && !F->hasFnAttribute(Attribute::Naked))
Solver.addTrackedFunction(F);

Solver.addArgumentTrackedFunction(F);
Candidates.push_back(F);
Solver.markBlockExecutable(&F->front());

// Replace the function arguments for the specialized functions.
for (Argument &Arg : F->args())
if (!Arg.use_empty() && tryToReplaceWithConstant(&Arg))
LLVM_DEBUG(dbgs() << "FnSpecialization: Replaced constant argument: "
<< Arg.getNameOrAsOperand() << "\n");
}
}
};
} // namespace

bool llvm::runFunctionSpecialization(
Module &M, const DataLayout &DL,
std::function<TargetLibraryInfo &(Function &)> GetTLI,
std::function<TargetTransformInfo &(Function &)> GetTTI,
std::function<AssumptionCache &(Function &)> GetAC,
function_ref<AnalysisResultsForFn(Function &)> GetAnalysis) {
SCCPSolver Solver(DL, GetTLI, M.getContext());
FunctionSpecializer FS(Solver, GetAC, GetTTI, GetTLI);
bool Changed = false;

// Loop over all functions, marking arguments to those with their addresses
// taken or that are external as overdefined.
for (Function &F : M) {
if (F.isDeclaration())
continue;
if (F.hasFnAttribute(Attribute::NoDuplicate))
continue;

LLVM_DEBUG(dbgs() << "\nFnSpecialization: Analysing decl: " << F.getName()
<< "\n");
Solver.addAnalysis(F, GetAnalysis(F));

// Determine if we can track the function's arguments. If so, add the
// function to the solver's set of argument-tracked functions.
if (canTrackArgumentsInterprocedurally(&F)) {
LLVM_DEBUG(dbgs() << "FnSpecialization: Can track arguments\n");
Solver.addArgumentTrackedFunction(&F);
continue;
} else {
LLVM_DEBUG(dbgs() << "FnSpecialization: Can't track arguments!\n"
<< "FnSpecialization: Doesn't have local linkage, or "
<< "has its address taken\n");
}

// Assume the function is called.
Solver.markBlockExecutable(&F.front());

// Assume nothing about the incoming arguments.
for (Argument &AI : F.args())
Solver.markOverdefined(&AI);
}

// Determine if we can track any of the module's global variables. If so, add
// the global variables we can track to the solver's set of tracked global
// variables.
for (GlobalVariable &G : M.globals()) {
G.removeDeadConstantUsers();
if (canTrackGlobalVariableInterprocedurally(&G))
Solver.trackValueOfGlobalVariable(&G);
}

auto &TrackedFuncs = Solver.getArgumentTrackedFunctions();
SmallVector<Function *, 16> FuncDecls(TrackedFuncs.begin(),
TrackedFuncs.end());

// No tracked functions, so nothing to do: don't run the solver and remove
// the ssa_copy intrinsics that may have been introduced.
if (TrackedFuncs.empty()) {
removeSSACopy(M);
return false;
}

// Solve for constants.
auto RunSCCPSolver = [&](auto &WorkList) {
bool ResolvedUndefs = true;

while (ResolvedUndefs) {
// Not running the solver unnecessary is checked in regression test
// nothing-to-do.ll, so if this debug message is changed, this regression
// test needs updating too.
LLVM_DEBUG(dbgs() << "FnSpecialization: Running solver\n");

Solver.solve();
LLVM_DEBUG(dbgs() << "FnSpecialization: Resolving undefs\n");
ResolvedUndefs = false;
for (Function *F : WorkList)
if (Solver.resolvedUndefsIn(*F))
ResolvedUndefs = true;
}

for (auto *F : WorkList) {
for (BasicBlock &BB : *F) {
if (!Solver.isBlockExecutable(&BB))
continue;
// FIXME: The solver may make changes to the function here, so set
// Changed, even if later function specialization does not trigger.
for (auto &I : make_early_inc_range(BB))
Changed \|= FS.tryToReplaceWithConstant(&I);
}
}
};

#ifndef NDEBUG
LLVM_DEBUG(dbgs() << "FnSpecialization: Worklist fn decls:\n");
for (auto *F : FuncDecls)
LLVM_DEBUG(dbgs() << "FnSpecialization: *) " << F->getName() << "\n");
#endif

// Initially resolve the constants in all the argument tracked functions.
RunSCCPSolver(FuncDecls);

SmallVector<Function *, 8> WorkList;
unsigned I = 0;
while (FuncSpecializationMaxIters != I++ &&
FS.specializeFunctions(FuncDecls, WorkList)) {
LLVM_DEBUG(dbgs() << "FnSpecialization: Finished iteration " << I << "\n");

// Run the solver for the specialized functions.
RunSCCPSolver(WorkList);

// Replace some unresolved constant arguments.
constantArgPropagation(FuncDecls, M, Solver);

WorkList.clear();
Changed = true;
}

LLVM_DEBUG(dbgs() << "FnSpecialization: Number of specializations = "
<< NumFuncSpecialized << "\n");

// Remove any ssa_copy intrinsics that may have been introduced.
removeSSACopy(M);
return Changed;
}

llvm/lib/Transforms/IPO/IPO.cpp

void llvm::initializeIPO(PassRegistry &Registry) {
initializeOpenMPOptCGSCCLegacyPassPass(Registry);		initializeOpenMPOptCGSCCLegacyPassPass(Registry);
initializeAnnotation2MetadataLegacyPass(Registry);		initializeAnnotation2MetadataLegacyPass(Registry);
initializeCalledValuePropagationLegacyPassPass(Registry);		initializeCalledValuePropagationLegacyPassPass(Registry);
initializeConstantMergeLegacyPassPass(Registry);		initializeConstantMergeLegacyPassPass(Registry);
initializeCrossDSOCFIPass(Registry);		initializeCrossDSOCFIPass(Registry);
initializeDAEPass(Registry);		initializeDAEPass(Registry);
initializeDAHPass(Registry);		initializeDAHPass(Registry);
initializeForceFunctionAttrsLegacyPassPass(Registry);		initializeForceFunctionAttrsLegacyPassPass(Registry);
initializeFunctionSpecializationLegacyPassPass(Registry);
initializeGlobalDCELegacyPassPass(Registry);		initializeGlobalDCELegacyPassPass(Registry);
initializeGlobalOptLegacyPassPass(Registry);		initializeGlobalOptLegacyPassPass(Registry);
initializeGlobalSplitPass(Registry);		initializeGlobalSplitPass(Registry);
initializeHotColdSplittingLegacyPassPass(Registry);		initializeHotColdSplittingLegacyPassPass(Registry);
initializeIROutlinerLegacyPassPass(Registry);		initializeIROutlinerLegacyPassPass(Registry);
initializeAlwaysInlinerLegacyPassPass(Registry);		initializeAlwaysInlinerLegacyPassPass(Registry);
initializeSimpleInlinerPass(Registry);		initializeSimpleInlinerPass(Registry);
initializeInferFunctionAttrsLegacyPassPass(Registry);		initializeInferFunctionAttrsLegacyPassPass(Registry);

llvm/lib/Transforms/IPO/PassManagerBuilder.cpp

cl::opt<bool> EnableMatrix(
"enable-matrix", cl::init(false), cl::Hidden,		"enable-matrix", cl::init(false), cl::Hidden,
cl::desc("Enable lowering of the matrix intrinsics"));		cl::desc("Enable lowering of the matrix intrinsics"));

cl::opt<bool> EnableConstraintElimination(		cl::opt<bool> EnableConstraintElimination(
"enable-constraint-elimination", cl::init(false), cl::Hidden,		"enable-constraint-elimination", cl::init(false), cl::Hidden,
cl::desc(		cl::desc(
"Enable pass to eliminate conditions based on linear constraints."));		"Enable pass to eliminate conditions based on linear constraints."));

cl::opt<bool> EnableFunctionSpecialization(
"enable-function-specialization", cl::init(false), cl::Hidden,
cl::desc("Enable Function Specialization pass"));

cl::opt<AttributorRunOption> AttributorRun(		cl::opt<AttributorRunOption> AttributorRun(
"attributor-enable", cl::Hidden, cl::init(AttributorRunOption::NONE),		"attributor-enable", cl::Hidden, cl::init(AttributorRunOption::NONE),
cl::desc("Enable the attributor inter-procedural deduction pass."),		cl::desc("Enable the attributor inter-procedural deduction pass."),
cl::values(clEnumValN(AttributorRunOption::ALL, "all",		cl::values(clEnumValN(AttributorRunOption::ALL, "all",
"enable all attributor runs"),		"enable all attributor runs"),
clEnumValN(AttributorRunOption::MODULE, "module",		clEnumValN(AttributorRunOption::MODULE, "module",
"enable module-wide attributor runs"),		"enable module-wide attributor runs"),
clEnumValN(AttributorRunOption::CGSCC, "cgscc",		clEnumValN(AttributorRunOption::CGSCC, "cgscc",
void PassManagerBuilder::populateModulePassManager(
if (AttributorRun & AttributorRunOption::MODULE)		if (AttributorRun & AttributorRunOption::MODULE)
MPM.add(createAttributorLegacyPass());		MPM.add(createAttributorLegacyPass());

addExtensionsToPM(EP_ModuleOptimizerEarly, MPM);		addExtensionsToPM(EP_ModuleOptimizerEarly, MPM);

if (OptLevel > 2)		if (OptLevel > 2)
MPM.add(createCallSiteSplittingPass());		MPM.add(createCallSiteSplittingPass());

// Propage constant function arguments by specializing the functions.
if (OptLevel > 2 && EnableFunctionSpecialization)
MPM.add(createFunctionSpecializationPass());

MPM.add(createIPSCCPPass()); // IP SCCP		MPM.add(createIPSCCPPass()); // IP SCCP
MPM.add(createCalledValuePropagationPass());		MPM.add(createCalledValuePropagationPass());

MPM.add(createGlobalOptimizerPass()); // Optimize out global vars		MPM.add(createGlobalOptimizerPass()); // Optimize out global vars
// Promote any localized global vars.		// Promote any localized global vars.
MPM.add(createPromoteMemoryToRegisterPass());		MPM.add(createPromoteMemoryToRegisterPass());

MPM.add(createDeadArgEliminationPass()); // Dead argument elimination		MPM.add(createDeadArgEliminationPass()); // Dead argument elimination

llvm/lib/Transforms/IPO/SCCP.cpp

	#include "llvm/Analysis/TargetTransformInfo.h"			#include "llvm/Analysis/TargetTransformInfo.h"
	#include "llvm/InitializePasses.h"			#include "llvm/InitializePasses.h"
	#include "llvm/Transforms/IPO.h"			#include "llvm/Transforms/IPO.h"
	#include "llvm/Transforms/Scalar/SCCP.h"			#include "llvm/Transforms/Scalar/SCCP.h"
	#include "llvm/Transforms/Utils/SCCPSolver.h"			#include "llvm/Transforms/Utils/SCCPSolver.h"

	using namespace llvm;			using namespace llvm;

	PreservedAnalyses IPSCCPPass::run(Module &M, ModuleAnalysisManager &AM) {			PreservedAnalyses IPSCCPPass::run(Module &M, ModuleAnalysisManager &AM) {
				fhahn (Done): Move this to the loop below, which uses it.
	const DataLayout &DL = M.getDataLayout();			const DataLayout &DL = M.getDataLayout();
	auto &FAM = AM.getResult<FunctionAnalysisManagerModuleProxy>(M).getManager();			auto &FAM = AM.getResult<FunctionAnalysisManagerModuleProxy>(M).getManager();
	auto GetTLI = [&FAM](Function &F) -> const TargetLibraryInfo & {			auto GetTLI = [&FAM](Function &F) -> const TargetLibraryInfo & {
	return FAM.getResult<TargetLibraryAnalysis>(F);			return FAM.getResult<TargetLibraryAnalysis>(F);
	};			};
				auto GetTTI = [&FAM](Function &F) -> TargetTransformInfo & {
				return FAM.getResult<TargetIRAnalysis>(F);
				};
				auto GetAC = [&FAM](Function &F) -> AssumptionCache & {
				return FAM.getResult<AssumptionAnalysis>(F);
				};
	auto getAnalysis = [&FAM](Function &F) -> AnalysisResultsForFn {			auto getAnalysis = [&FAM](Function &F) -> AnalysisResultsForFn {
	DominatorTree &DT = FAM.getResult<DominatorTreeAnalysis>(F);			DominatorTree &DT = FAM.getResult<DominatorTreeAnalysis>(F);
	return {			return {
	std::make_unique<PredicateInfo>(F, DT, FAM.getResult<AssumptionAnalysis>(F)),			std::make_unique<PredicateInfo>(F, DT, FAM.getResult<AssumptionAnalysis>(F)),
	&DT, FAM.getCachedResult<PostDominatorTreeAnalysis>(F)};			&DT, FAM.getCachedResult<PostDominatorTreeAnalysis>(F)};
				chill (Done): This part was added for the FunctionSpecialization; if func spec is disabled, maybe do not pass along the LoopAnalysis?
	};			};

	if (!runIPSCCP(M, DL, GetTLI, getAnalysis))			if (!runIPSCCP(M, DL, GetTLI, GetTTI, GetAC, getAnalysis))
	return PreservedAnalyses::all();			return PreservedAnalyses::all();

	PreservedAnalyses PA;			PreservedAnalyses PA;
	PA.preserve<DominatorTreeAnalysis>();			PA.preserve<DominatorTreeAnalysis>();
				chill (Done): Now that we added `LoopAnalysis` we may as well preserve it too. (I should have included it in the patch which introduced the `LoopAnalysis` here.)
				labrinea (Author, Done): I tried this but the compiler crashes. Probably because SCCP deletes dead basic blocks.
	PA.preserve<PostDominatorTreeAnalysis>();			PA.preserve<PostDominatorTreeAnalysis>();
	PA.preserve<FunctionAnalysisManagerModuleProxy>();			PA.preserve<FunctionAnalysisManagerModuleProxy>();
	return PA;			return PA;
	}			}

	namespace {			namespace {

	//===--------------------------------------------------------------------===//			//===--------------------------------------------------------------------===//
	//			//
	/// IPSCCP Class - This class implements interprocedural Sparse Conditional			/// IPSCCP Class - This class implements interprocedural Sparse Conditional
	/// Constant Propagation.			/// Constant Propagation.
	///			///
	class IPSCCPLegacyPass : public ModulePass {			class IPSCCPLegacyPass : public ModulePass {
	public:			public:
	static char ID;			static char ID;

	IPSCCPLegacyPass() : ModulePass(ID) {			IPSCCPLegacyPass() : ModulePass(ID) {
	initializeIPSCCPLegacyPassPass(*PassRegistry::getPassRegistry());			initializeIPSCCPLegacyPassPass(*PassRegistry::getPassRegistry());
	}			}

	bool runOnModule(Module &M) override {			bool runOnModule(Module &M) override {
	if (skipModule(M))			if (skipModule(M))
	return false;			return false;

	const DataLayout &DL = M.getDataLayout();			const DataLayout &DL = M.getDataLayout();
	auto GetTLI = [this](Function &F) -> const TargetLibraryInfo & {			auto GetTLI = [this](Function &F) -> const TargetLibraryInfo & {
	return this->getAnalysis<TargetLibraryInfoWrapperPass>().getTLI(F);			return this->getAnalysis<TargetLibraryInfoWrapperPass>().getTLI(F);
	};			};
				auto GetTTI = [this](Function &F) -> TargetTransformInfo & {
				return this->getAnalysis<TargetTransformInfoWrapperPass>().getTTI(F);
				};
				auto GetAC = [this](Function &F) -> AssumptionCache & {
				return this->getAnalysis<AssumptionCacheTracker>().getAssumptionCache(F);
				};
	auto getAnalysis = [this](Function &F) -> AnalysisResultsForFn {			auto getAnalysis = [this](Function &F) -> AnalysisResultsForFn {
	DominatorTree &DT =			DominatorTree &DT =
	this->getAnalysis<DominatorTreeWrapperPass>(F).getDomTree();			this->getAnalysis<DominatorTreeWrapperPass>(F).getDomTree();
	return {			return {
	std::make_unique<PredicateInfo>(			std::make_unique<PredicateInfo>(
	F, DT,			F, DT,
	this->getAnalysis<AssumptionCacheTracker>().getAssumptionCache(			this->getAnalysis<AssumptionCacheTracker>().getAssumptionCache(
	F)),			F)),
	nullptr, // We cannot preserve the DT or PDT with the legacy pass			nullptr, // We cannot preserve the DT or PDT with the legacy pass
	nullptr}; // manager, so set them to nullptr.			nullptr}; // manager, so set them to nullptr.
	};			};

	return runIPSCCP(M, DL, GetTLI, getAnalysis);			return runIPSCCP(M, DL, GetTLI, GetTTI, GetAC, getAnalysis);
	}			}

	void getAnalysisUsage(AnalysisUsage &AU) const override {			void getAnalysisUsage(AnalysisUsage &AU) const override {
	AU.addRequired<AssumptionCacheTracker>();			AU.addRequired<AssumptionCacheTracker>();
	AU.addRequired<DominatorTreeWrapperPass>();			AU.addRequired<DominatorTreeWrapperPass>();
	AU.addRequired<TargetLibraryInfoWrapperPass>();			AU.addRequired<TargetLibraryInfoWrapperPass>();
				AU.addRequired<TargetTransformInfoWrapperPass>();
	}			}
	};			};

	} // end anonymous namespace			} // end anonymous namespace

	char IPSCCPLegacyPass::ID = 0;			char IPSCCPLegacyPass::ID = 0;

	INITIALIZE_PASS_BEGIN(IPSCCPLegacyPass, "ipsccp",			INITIALIZE_PASS_BEGIN(IPSCCPLegacyPass, "ipsccp",
	"Interprocedural Sparse Conditional Constant Propagation",			"Interprocedural Sparse Conditional Constant Propagation",
	false, false)			false, false)
	INITIALIZE_PASS_DEPENDENCY(AssumptionCacheTracker)			INITIALIZE_PASS_DEPENDENCY(AssumptionCacheTracker)
	INITIALIZE_PASS_DEPENDENCY(DominatorTreeWrapperPass)			INITIALIZE_PASS_DEPENDENCY(DominatorTreeWrapperPass)
	INITIALIZE_PASS_DEPENDENCY(TargetLibraryInfoWrapperPass)			INITIALIZE_PASS_DEPENDENCY(TargetLibraryInfoWrapperPass)
	INITIALIZE_PASS_END(IPSCCPLegacyPass, "ipsccp",			INITIALIZE_PASS_END(IPSCCPLegacyPass, "ipsccp",
	"Interprocedural Sparse Conditional Constant Propagation",			"Interprocedural Sparse Conditional Constant Propagation",
	false, false)			false, false)

	// createIPSCCPPass - This is the public interface to this file.			// createIPSCCPPass - This is the public interface to this file.
	ModulePass *llvm::createIPSCCPPass() { return new IPSCCPLegacyPass(); }			ModulePass *llvm::createIPSCCPPass() { return new IPSCCPLegacyPass(); }

	PreservedAnalyses FunctionSpecializationPass::run(Module &M,
	ModuleAnalysisManager &AM) {
	const DataLayout &DL = M.getDataLayout();
	auto &FAM = AM.getResult<FunctionAnalysisManagerModuleProxy>(M).getManager();
	auto GetTLI = [&FAM](Function &F) -> TargetLibraryInfo & {
	return FAM.getResult<TargetLibraryAnalysis>(F);
	};
	auto GetTTI = [&FAM](Function &F) -> TargetTransformInfo & {
	return FAM.getResult<TargetIRAnalysis>(F);
	};
	auto GetAC = [&FAM](Function &F) -> AssumptionCache & {
	return FAM.getResult<AssumptionAnalysis>(F);
	};
	auto GetAnalysis = [&FAM](Function &F) -> AnalysisResultsForFn {
	DominatorTree &DT = FAM.getResult<DominatorTreeAnalysis>(F);
	return {std::make_unique<PredicateInfo>(
	F, DT, FAM.getResult<AssumptionAnalysis>(F)),
	&DT, FAM.getCachedResult<PostDominatorTreeAnalysis>(F)};
	};

	if (!runFunctionSpecialization(M, DL, GetTLI, GetTTI, GetAC, GetAnalysis))
	return PreservedAnalyses::all();

	PreservedAnalyses PA;
	PA.preserve<DominatorTreeAnalysis>();
	PA.preserve<PostDominatorTreeAnalysis>();
	PA.preserve<FunctionAnalysisManagerModuleProxy>();
	return PA;
	}

	namespace {
	struct FunctionSpecializationLegacyPass : public ModulePass {
	static char ID; // Pass identification, replacement for typeid
	FunctionSpecializationLegacyPass() : ModulePass(ID) {}

	void getAnalysisUsage(AnalysisUsage &AU) const override {
	AU.addRequired<AssumptionCacheTracker>();
	AU.addRequired<DominatorTreeWrapperPass>();
	AU.addRequired<TargetLibraryInfoWrapperPass>();
	AU.addRequired<TargetTransformInfoWrapperPass>();
	}

	bool runOnModule(Module &M) override {
	if (skipModule(M))
	return false;

	const DataLayout &DL = M.getDataLayout();
	auto GetTLI = [this](Function &F) -> TargetLibraryInfo & {
	return this->getAnalysis<TargetLibraryInfoWrapperPass>().getTLI(F);
	};
	auto GetTTI = [this](Function &F) -> TargetTransformInfo & {
	return this->getAnalysis<TargetTransformInfoWrapperPass>().getTTI(F);
	};
	auto GetAC = [this](Function &F) -> AssumptionCache & {
	return this->getAnalysis<AssumptionCacheTracker>().getAssumptionCache(F);
	};

	auto GetAnalysis = [this](Function &F) -> AnalysisResultsForFn {
	DominatorTree &DT =
	this->getAnalysis<DominatorTreeWrapperPass>(F).getDomTree();
	return {
	std::make_unique<PredicateInfo>(
	F, DT,
	this->getAnalysis<AssumptionCacheTracker>().getAssumptionCache(
	F)),
	nullptr, // We cannot preserve the DT or PDT with the legacy pass
	nullptr}; // manager, so set them to nullptr.
	};
	return runFunctionSpecialization(M, DL, GetTLI, GetTTI, GetAC, GetAnalysis);
	}
	};
	} // namespace

	char FunctionSpecializationLegacyPass::ID = 0;

	INITIALIZE_PASS_BEGIN(
	FunctionSpecializationLegacyPass, "function-specialization",
	"Propagate constant arguments by specializing the function", false, false)

	INITIALIZE_PASS_DEPENDENCY(AssumptionCacheTracker)
	INITIALIZE_PASS_DEPENDENCY(DominatorTreeWrapperPass)
	INITIALIZE_PASS_DEPENDENCY(TargetTransformInfoWrapperPass)
	INITIALIZE_PASS_DEPENDENCY(TargetLibraryInfoWrapperPass)
	INITIALIZE_PASS_END(FunctionSpecializationLegacyPass, "function-specialization",
	"Propagate constant arguments by specializing the function",
	false, false)

	ModulePass *llvm::createFunctionSpecializationPass() {
	return new FunctionSpecializationLegacyPass();
	}

llvm/lib/Transforms/Scalar/CMakeLists.txt

add_llvm_component_library(LLVMScalarOpts
COMPONENT_NAME		COMPONENT_NAME
Scalar		Scalar

LINK_COMPONENTS		LINK_COMPONENTS
AggressiveInstCombine		AggressiveInstCombine
Analysis		Analysis
Core		Core
InstCombine		InstCombine
		IPO
		chill (Not Done): Why add IPO here?
		labrinea (Author, Not Done): I vaguely remember a link time error without this change. See also `llvm/utils/gn/secondary/llvm/lib/Transforms/Scalar/BUILD.gn` at the bottom of this diff. The IPSCCP pass now depends on the FunctionSpecializer, whose cpp file is under the IPO directory.
		chill (Not Done): IPO already depends on Scalar, i.e. in `IPO/CMakeLists.txt` we have `... COMPONENT_NAME IPO LINK_COMPONENTS ... Scalar ...`. Looks like a circular dependency. Perhaps `FunctionSpecialization` needs to go to `Utils` (alongside `SCCPSolver`). Or `runIPSCCP` needs to go to `IPO/SCCP.cpp`. Or both.
Support		Support
TransformUtils		TransformUtils
)		)

llvm/lib/Transforms/Scalar/SCCP.cpp

#include "llvm/IR/User.h"		#include "llvm/IR/User.h"
#include "llvm/IR/Value.h"		#include "llvm/IR/Value.h"
#include "llvm/InitializePasses.h"		#include "llvm/InitializePasses.h"
#include "llvm/Pass.h"		#include "llvm/Pass.h"
#include "llvm/Support/Casting.h"		#include "llvm/Support/Casting.h"
#include "llvm/Support/Debug.h"		#include "llvm/Support/Debug.h"
#include "llvm/Support/ErrorHandling.h"		#include "llvm/Support/ErrorHandling.h"
#include "llvm/Support/raw_ostream.h"		#include "llvm/Support/raw_ostream.h"
		#include "llvm/Transforms/IPO/FunctionSpecialization.h"
#include "llvm/Transforms/Scalar.h"		#include "llvm/Transforms/Scalar.h"
#include "llvm/Transforms/Utils/Local.h"		#include "llvm/Transforms/Utils/Local.h"
#include "llvm/Transforms/Utils/SCCPSolver.h"		#include "llvm/Transforms/Utils/SCCPSolver.h"
#include <cassert>		#include <cassert>
#include <utility>		#include <utility>
#include <vector>		#include <vector>

using namespace llvm;		using namespace llvm;

#define DEBUG_TYPE "sccp"		#define DEBUG_TYPE "sccp"

STATISTIC(NumInstRemoved, "Number of instructions removed");		STATISTIC(NumInstRemoved, "Number of instructions removed");
STATISTIC(NumDeadBlocks , "Number of basic blocks unreachable");		STATISTIC(NumDeadBlocks , "Number of basic blocks unreachable");
STATISTIC(NumInstReplaced,		STATISTIC(NumInstReplaced,
"Number of instructions replaced with (simpler) instruction");		"Number of instructions replaced with (simpler) instruction");

STATISTIC(IPNumInstRemoved, "Number of instructions removed by IPSCCP");		STATISTIC(IPNumInstRemoved, "Number of instructions removed by IPSCCP");
STATISTIC(IPNumArgsElimed ,"Number of arguments constant propagated by IPSCCP");		STATISTIC(IPNumArgsElimed ,"Number of arguments constant propagated by IPSCCP");
STATISTIC(IPNumGlobalConst, "Number of globals found to be constant by IPSCCP");		STATISTIC(IPNumGlobalConst, "Number of globals found to be constant by IPSCCP");
STATISTIC(		STATISTIC(
IPNumInstReplaced,		IPNumInstReplaced,
"Number of instructions replaced with (simpler) instruction by IPSCCP");		"Number of instructions replaced with (simpler) instruction by IPSCCP");

		static cl::opt<bool> SpecializeFunctions("specialize-functions", cl::init(false),
		cl::Hidden, cl::desc("Enable function specialization"));

		static cl::opt<unsigned> FuncSpecializationMaxIters(
		"func-specialization-max-iters", cl::Hidden,
		cl::desc("The maximum number of iterations function specialization is run"),
		cl::init(1));

// Helper to check if \p LV is either a constant or a constant		// Helper to check if \p LV is either a constant or a constant
// range with a single element. This should cover exactly the same cases as the		// range with a single element. This should cover exactly the same cases as the
// old ValueLatticeElement::isConstant() and is intended to be used in the		// old ValueLatticeElement::isConstant() and is intended to be used in the
// transition to ValueLatticeElement.		// transition to ValueLatticeElement.
static bool isConstant(const ValueLatticeElement &LV) {		static bool isConstant(const ValueLatticeElement &LV) {
return LV.isConstant() ||		return LV.isConstant() ||
(LV.isConstantRange() && LV.getConstantRange().isSingleElement());		(LV.isConstantRange() && LV.getConstantRange().isSingleElement());
}		}
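The helper above treats a lattice value as constant either when it is a plain constant or when it is a constant range holding exactly one element. A minimal standalone sketch of that rule follows; `ToyLattice` and `isConstantLike` are hypothetical names used only for illustration and do not model LLVM's `ValueLatticeElement` API.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for a lattice element (NOT LLVM's ValueLatticeElement).
struct ToyLattice {
  enum Kind { Unknown, Constant, Range, Overdefined };
  Kind kind;
  std::int64_t lo; // the constant's value, or the inclusive lower bound
  std::int64_t hi; // the constant's value, or the inclusive upper bound
};

// A value counts as constant if it is a plain constant, or if it is a
// range whose bounds coincide, i.e. a range with a single element.
bool isConstantLike(const ToyLattice &LV) {
  return LV.kind == ToyLattice::Constant ||
         (LV.kind == ToyLattice::Range && LV.lo == LV.hi);
}
```

In this toy model a range such as [4, 4] is accepted, while [4, 5] and overdefined values are rejected.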
Show All 14 Lines	static bool canRemoveInstruction(Instruction *I) {
// those cases by falling through to here.		// those cases by falling through to here.
// TODO: Mark globals as being constant earlier, so		// TODO: Mark globals as being constant earlier, so
// TODO: wouldInstructionBeTriviallyDead() knows that atomic loads		// TODO: wouldInstructionBeTriviallyDead() knows that atomic loads
// TODO: are safe to remove.		// TODO: are safe to remove.
return isa<LoadInst>(I);		return isa<LoadInst>(I);
}		}

static bool tryToReplaceWithConstant(SCCPSolver &Solver, Value *V) {		static bool tryToReplaceWithConstant(SCCPSolver &Solver, Value *V) {
Constant *Const = nullptr;		Constant *Const = nullptr;
		chill: Wouldn't it work without the temporary vector? `markUsersAsChanged` would go over each user, look at the user's operands (including `Old`), and find the `New` value (which is some constant) from the lattice values map. Thus we would maybe get just: `Solver.markUsersAsChanged(Old); Old->replaceAllUsesWith(New);`
if (V->getType()->isStructTy()) {		if (V->getType()->isStructTy()) {
std::vector<ValueLatticeElement> IVs = Solver.getStructLatticeValueFor(V);		std::vector<ValueLatticeElement> IVs = Solver.getStructLatticeValueFor(V);
if (llvm::any_of(IVs, isOverdefined))		if (llvm::any_of(IVs, isOverdefined))
return false;		return false;
std::vector<Constant *> ConstVals;		std::vector<Constant *> ConstVals;
auto *ST = cast<StructType>(V->getType());		auto *ST = cast<StructType>(V->getType());
for (unsigned i = 0, e = ST->getNumElements(); i != e; ++i) {		for (unsigned i = 0, e = ST->getNumElements(); i != e; ++i) {
ValueLatticeElement V = IVs[i];		ValueLatticeElement V = IVs[i];
Show All 28 Lines	if (CB && ((CB->isMustTailCall() &&

LLVM_DEBUG(dbgs() << " Can\'t treat the result of call " << *CB		LLVM_DEBUG(dbgs() << " Can\'t treat the result of call " << *CB
<< " as a constant\n");		<< " as a constant\n");
return false;		return false;
}		}

LLVM_DEBUG(dbgs() << " Constant: " << Const << " = " << V << '\n');		LLVM_DEBUG(dbgs() << " Constant: " << Const << " = " << V << '\n');

		// Record uses of V to avoid visiting irrelevant uses of const later.
		SmallVector<Instruction *> UseInsts;
		for (auto *U : V->users())
		if (auto *I = dyn_cast<Instruction>(U))
		if (Solver.isBlockExecutable(I->getParent()))
		UseInsts.push_back(I);

// Replaces all of the uses of a variable with uses of the constant.		// Replaces all of the uses of a variable with uses of the constant.
V->replaceAllUsesWith(Const);		V->replaceAllUsesWith(Const);

		for (auto *I : UseInsts)
		Solver.visit(I);
		labrinea (author): Just found that we need to do the same inside `replaceSignedInst()` too. I will move this code into a function.
		chill: Would it be possible to call `markUsersAsChanged` here?
		labrinea (author): I think we can't, because if we replace the uses first then the users of the old value will be empty. Can we call `markUsersAsChanged` before we `replaceAllUsesWith` the new value? Btw, `markUsersAsChanged` is private to the `SCCPInstVisitor`, but I suppose I could make it public if need be.
		labrinea (author): Actually, I could call `markUsersAsChanged` on the new Instruction after replacing the uses of the old Instruction with it.
		chill: OK, let's leave it hanging for now, until I can take a look on top of the latest trunk. Ideally, we are trying to avoid changing code until the Solver is done. Here we have found that an instruction has a constant lattice value: we should not replace the users' operands right away, but notify the Solver. The Solver in turn would add the instructions that need reexamining to the instructions worklist and update their lattice values the next time we invoke `Solver.solve()`. Most likely `SCCPSolver::visit` should become private. The Solver (and the SCCP algorithm in general) is driven by its worklists, and we should stick to this design: want something done, add it to the worklist.
		labrinea (author): Update: I tried this. It works for 'some' cases. Instead of replacing values with constants, I create mappings from the old to the new value, and only after all the solving is done do I replace the uses. The specialization of recursive functions doesn't work because it relies on finding allocas of constant integers. Also, the rewriting of callsites doesn't work either if the actual arguments have been constant propagated prior to specialization but the old value hasn't been replaced yet. In theory I could pass on the mappings from sccp to the specializer, but it seems overly complicated to do so.
		chill:
		> Instead of replacing values with constants I create mappings from the old to the new value ..
		But isn't this what the `ValueState` already contains?
		> Also the rewriting of callsites doesn't work either if the actual arguments have been constant propagated prior to specialization, but the old value hasn't been replaced yet.
		Well, `FunctionSpecializer::rewriteCallSites` and everything else should look up lattice values, not work directly with operands. But OK, let's not make too many changes at once and revisit it later.

return true;		return true;
}		}
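The inline thread on `tryToReplaceWithConstant` turns on one ordering constraint: once all uses of the old value are replaced, its user list is empty, so the users have to be recorded before the replacement if they are to be revisited afterwards. The following is a toy def-use model of that constraint, not LLVM code; `ToyValue`, `addOperand`, `replaceAllUsesWith` and `replaceAndRevisit` are hypothetical names, and the visitor callback merely stands in for whatever revisiting the solver would do.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy value with explicit operand and user lists (NOT llvm::Value).
struct ToyValue {
  std::vector<ToyValue *> operands;
  std::vector<ToyValue *> users;
};

// Make `op` an operand of `user` and record the reverse edge.
void addOperand(ToyValue &user, ToyValue &op) {
  user.operands.push_back(&op);
  op.users.push_back(&user);
}

// Redirect every use of Old to New. Afterwards Old has no users left.
void replaceAllUsesWith(ToyValue &Old, ToyValue &New) {
  for (ToyValue *U : Old.users) {
    std::replace(U->operands.begin(), U->operands.end(), &Old, &New);
    New.users.push_back(U);
  }
  Old.users.clear();
}

// Snapshot the users first, then replace, then revisit the snapshot.
// Returns how many users were revisited.
template <typename Visitor>
std::size_t replaceAndRevisit(ToyValue &Old, ToyValue &New, Visitor visit) {
  std::vector<ToyValue *> snapshot = Old.users; // taken before replacement
  replaceAllUsesWith(Old, New);
  for (ToyValue *U : snapshot)
    visit(*U);
  return snapshot.size();
}
```

Reversing the first two statements of `replaceAndRevisit` would leave the snapshot empty, which is the problem the author describes. Walking the users of the new value instead, as the author's follow-up suggests, is a different trade-off that this sketch does not model.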

/// Try to replace signed instructions with their unsigned equivalent.		/// Try to replace signed instructions with their unsigned equivalent.
static bool replaceSignedInst(SCCPSolver &Solver,		static bool replaceSignedInst(SCCPSolver &Solver,
		SmallPtrSetImpl<Instruction *> &ToDelete,
SmallPtrSetImpl<Value *> &InsertedValues,		SmallPtrSetImpl<Value *> &InsertedValues,
Instruction &Inst) {		Instruction &Inst) {
// Determine if a signed value is known to be >= 0.		// Determine if a signed value is known to be >= 0.
auto isNonNegative = [&Solver](Value *V) {		auto isNonNegative = [&Solver](Value *V) {
// If this value was constant-folded, it may not have a solver entry.		// If this value was constant-folded, it may not have a solver entry.
// Handle integers. Otherwise, return false.		// Handle integers. Otherwise, return false.
if (auto *C = dyn_cast<Constant>(V)) {		if (auto *C = dyn_cast<Constant>(V)) {
auto *CInt = dyn_cast<ConstantInt>(C);		auto *CInt = dyn_cast<ConstantInt>(C);
▲ Show 20 Lines • Show All 41 Lines • ▼ Show 20 Lines	static bool replaceSignedInst(SCCPSolver &Solver,
}		}

// Wire up the new instruction and update state.		// Wire up the new instruction and update state.
assert(NewInst && "Expected replacement instruction");		assert(NewInst && "Expected replacement instruction");
NewInst->takeName(&Inst);		NewInst->takeName(&Inst);
InsertedValues.insert(NewInst);		InsertedValues.insert(NewInst);
Inst.replaceAllUsesWith(NewInst);		Inst.replaceAllUsesWith(NewInst);
Solver.removeLatticeValueFor(&Inst);		Solver.removeLatticeValueFor(&Inst);
Inst.eraseFromParent();		Inst.removeFromParent();
		chill: Is there a specific reason to remove the instruction from the block? If not, I'd suggest doing deletion in a single place, as opposed to spreading parts of it all over.
		labrinea (author): I am not entirely sure. I wanted to avoid revisiting this instruction accidentally in any of `simplifyInstsInBlock()`, `solve()`, or `resolvedUndefsIn()`. For `simplifyInstsInBlock()` I could skip the instruction if it's present in `ToDelete`. For the others I don't know what the consequences of revisiting would be. I need to run some tests first.
		chill: I can't see why anything would go wrong if the instruction is revisited. Do we know if the instruction is safe to remove? It could be `SDiv`/`SRem` with a zero divisor.
		chill: Actually, never mind, we're not replacing the instruction with a constant but with another instruction.
		ToDelete.insert(&Inst);
return true;		return true;
}		}
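The right-hand side of this hunk swaps an immediate erase for a two-step scheme: unlink the instruction from its block now, remember it in `ToDelete`, and free it only once solving is finished. A minimal standalone sketch of that deferred-deletion pattern follows. It is a toy model under stated assumptions: `ToyInst`, `ToyBlock`, `removeAndDefer` and `flushDeferred` are hypothetical names, and manual `new`/`delete` only mimics the ownership hand-off.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <set>
#include <string>
#include <utility>

// Toy instruction (NOT llvm::Instruction).
struct ToyInst {
  explicit ToyInst(std::string n) : name(std::move(n)), linked(true) {}
  std::string name;
  bool linked; // true while the instruction still sits in a block
};

// Toy basic block holding non-owning pointers in program order.
struct ToyBlock {
  std::list<ToyInst *> insts;
};

// Unlink I from BB without freeing it, and queue it for later deletion.
void removeAndDefer(ToyBlock &BB, ToyInst *I, std::set<ToyInst *> &ToDelete) {
  BB.insts.remove(I);
  I->linked = false;
  ToDelete.insert(I);
}

// Free every queued instruction; call only when nothing refers to them.
std::size_t flushDeferred(std::set<ToyInst *> &ToDelete) {
  std::size_t freed = ToDelete.size();
  for (ToyInst *I : ToDelete)
    delete I;
  ToDelete.clear();
  return freed;
}
```

Between `removeAndDefer` and `flushDeferred` the pointer remains valid, so any table that still maps it stays safe to read. That is the property the deferred scheme buys, at the cost of deletion being split into two places, which is what the reviewer questions.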

static bool simplifyInstsInBlock(SCCPSolver &Solver, BasicBlock &BB,		static bool simplifyInstsInBlock(SCCPSolver &Solver, BasicBlock &BB,
		SmallPtrSetImpl<Instruction *> &ToDelete,
SmallPtrSetImpl<Value *> &InsertedValues,		SmallPtrSetImpl<Value *> &InsertedValues,
Statistic &InstRemovedStat,		Statistic &InstRemovedStat,
Statistic &InstReplacedStat) {		Statistic &InstReplacedStat) {
bool MadeChanges = false;		bool MadeChanges = false;
for (Instruction &Inst : make_early_inc_range(BB)) {		for (Instruction &Inst : make_early_inc_range(BB)) {
if (Inst.getType()->isVoidTy())		if (Inst.getType()->isVoidTy())
continue;		continue;
if (tryToReplaceWithConstant(Solver, &Inst)) {		if (tryToReplaceWithConstant(Solver, &Inst)) {
if (canRemoveInstruction(&Inst))		if (canRemoveInstruction(&Inst)) {
Inst.eraseFromParent();		Solver.removeLatticeValueFor(&Inst);
		Inst.removeFromParent();
		chill: Likewise.
		ToDelete.insert(&Inst);
		}

MadeChanges = true;		MadeChanges = true;
++InstRemovedStat;		++InstRemovedStat;
} else if (replaceSignedInst(Solver, InsertedValues, Inst)) {		} else if (replaceSignedInst(Solver, ToDelete, InsertedValues, Inst)) {
MadeChanges = true;		MadeChanges = true;
++InstReplacedStat;		++InstReplacedStat;
}		}
}		}
return MadeChanges;		return MadeChanges;
}		}

static bool removeNonFeasibleEdges(const SCCPSolver &Solver, BasicBlock *BB,		static bool removeNonFeasibleEdges(const SCCPSolver &Solver, BasicBlock *BB,
DomTreeUpdater &DTU,		DomTreeUpdater *DTU,
BasicBlock *&NewUnreachableBB);		BasicBlock *&NewUnreachableBB);

// runSCCP() - Run the Sparse Conditional Constant Propagation algorithm,		// runSCCP() - Run the Sparse Conditional Constant Propagation algorithm,
// and return true if the function was modified.		// and return true if the function was modified.
static bool runSCCP(Function &F, const DataLayout &DL,		static bool runSCCP(Function &F, const DataLayout &DL,
const TargetLibraryInfo *TLI, DomTreeUpdater &DTU) {		const TargetLibraryInfo *TLI, DomTreeUpdater &DTU) {
LLVM_DEBUG(dbgs() << "SCCP on function '" << F.getName() << "'\n");		LLVM_DEBUG(dbgs() << "SCCP on function '" << F.getName() << "'\n");
SCCPSolver Solver(		SCCPSolver Solver(
Show All 16 Lines	static bool runSCCP(Function &F, const DataLayout &DL,
}		}

bool MadeChanges = false;		bool MadeChanges = false;

// If we decided that there are basic blocks that are dead in this function,		// If we decided that there are basic blocks that are dead in this function,
// delete their contents now. Note that we cannot actually delete the blocks,		// delete their contents now. Note that we cannot actually delete the blocks,
// as we cannot modify the CFG of the function.		// as we cannot modify the CFG of the function.

		SmallPtrSet<Instruction *, 32> ToDelete;
SmallPtrSet<Value *, 32> InsertedValues;		SmallPtrSet<Value *, 32> InsertedValues;
SmallVector<BasicBlock *, 8> BlocksToErase;		SmallVector<BasicBlock *, 8> BlocksToErase;
for (BasicBlock &BB : F) {		for (BasicBlock &BB : F) {
if (!Solver.isBlockExecutable(&BB)) {		if (!Solver.isBlockExecutable(&BB)) {
LLVM_DEBUG(dbgs() << " BasicBlock Dead:" << BB);		LLVM_DEBUG(dbgs() << " BasicBlock Dead:" << BB);
++NumDeadBlocks;		++NumDeadBlocks;
BlocksToErase.push_back(&BB);		BlocksToErase.push_back(&BB);
MadeChanges = true;		MadeChanges = true;
continue;		continue;
}		}

MadeChanges |= simplifyInstsInBlock(Solver, BB, InsertedValues,		MadeChanges |= simplifyInstsInBlock(Solver, BB, ToDelete, InsertedValues,
NumInstRemoved, NumInstReplaced);		NumInstRemoved, NumInstReplaced);
}		}
		for (Instruction *DeadInst : ToDelete)
		DeadInst->deleteValue();

// Remove unreachable blocks and non-feasible edges.		// Remove unreachable blocks and non-feasible edges.
for (BasicBlock *DeadBB : BlocksToErase)		for (BasicBlock *DeadBB : BlocksToErase)
NumInstRemoved += changeToUnreachable(DeadBB->getFirstNonPHI(),		NumInstRemoved += changeToUnreachable(DeadBB->getFirstNonPHI(),
/*PreserveLCSSA=*/false, &DTU);		/*PreserveLCSSA=*/false, &DTU);

BasicBlock *NewUnreachableBB = nullptr;		BasicBlock *NewUnreachableBB = nullptr;
for (BasicBlock &BB : F)		for (BasicBlock &BB : F)
MadeChanges |= removeNonFeasibleEdges(Solver, &BB, DTU, NewUnreachableBB);		MadeChanges |= removeNonFeasibleEdges(Solver, &BB, &DTU, NewUnreachableBB);

for (BasicBlock *DeadBB : BlocksToErase)		for (BasicBlock *DeadBB : BlocksToErase)
if (!DeadBB->hasAddressTaken())		if (!DeadBB->hasAddressTaken())
DTU.deleteBB(DeadBB);		DTU.deleteBB(DeadBB);

return MadeChanges;		return MadeChanges;
}		}

▲ Show 20 Lines • Show All 108 Lines • ▼ Show 20 Lines	for (BasicBlock &BB : F) {

if (auto *RI = dyn_cast<ReturnInst>(BB.getTerminator()))		if (auto *RI = dyn_cast<ReturnInst>(BB.getTerminator()))
if (!isa<UndefValue>(RI->getOperand(0)))		if (!isa<UndefValue>(RI->getOperand(0)))
ReturnsToZap.push_back(RI);		ReturnsToZap.push_back(RI);
}		}
}		}

static bool removeNonFeasibleEdges(const SCCPSolver &Solver, BasicBlock *BB,		static bool removeNonFeasibleEdges(const SCCPSolver &Solver, BasicBlock *BB,
DomTreeUpdater &DTU,		DomTreeUpdater *DTU,
BasicBlock *&NewUnreachableBB) {		BasicBlock *&NewUnreachableBB) {
SmallPtrSet<BasicBlock *, 8> FeasibleSuccessors;		SmallPtrSet<BasicBlock *, 8> FeasibleSuccessors;
bool HasNonFeasibleEdges = false;		bool HasNonFeasibleEdges = false;
for (BasicBlock *Succ : successors(BB)) {		for (BasicBlock *Succ : successors(BB)) {
if (Solver.isEdgeFeasible(BB, Succ))		if (Solver.isEdgeFeasible(BB, Succ))
FeasibleSuccessors.insert(Succ);		FeasibleSuccessors.insert(Succ);
else		else
HasNonFeasibleEdges = true;		HasNonFeasibleEdges = true;
Show All 10 Lines	assert((isa<BranchInst>(TI) || isa<SwitchInst>(TI) ||
"Terminator must be a br, switch or indirectbr");		"Terminator must be a br, switch or indirectbr");

if (FeasibleSuccessors.size() == 0) {		if (FeasibleSuccessors.size() == 0) {
// Branch on undef/poison, replace with unreachable.		// Branch on undef/poison, replace with unreachable.
SmallPtrSet<BasicBlock *, 8> SeenSuccs;		SmallPtrSet<BasicBlock *, 8> SeenSuccs;
SmallVector<DominatorTree::UpdateType, 8> Updates;		SmallVector<DominatorTree::UpdateType, 8> Updates;
for (BasicBlock *Succ : successors(BB)) {		for (BasicBlock *Succ : successors(BB)) {
Succ->removePredecessor(BB);		Succ->removePredecessor(BB);
if (SeenSuccs.insert(Succ).second)		if (DTU && SeenSuccs.insert(Succ).second)
Updates.push_back({DominatorTree::Delete, BB, Succ});		Updates.push_back({DominatorTree::Delete, BB, Succ});
}		}
TI->eraseFromParent();		TI->eraseFromParent();
new UnreachableInst(BB->getContext(), BB);		new UnreachableInst(BB->getContext(), BB);
DTU.applyUpdatesPermissive(Updates);		if (DTU)
		DTU->applyUpdatesPermissive(Updates);
} else if (FeasibleSuccessors.size() == 1) {		} else if (FeasibleSuccessors.size() == 1) {
// Replace with an unconditional branch to the only feasible successor.		// Replace with an unconditional branch to the only feasible successor.
BasicBlock *OnlyFeasibleSuccessor = *FeasibleSuccessors.begin();		BasicBlock *OnlyFeasibleSuccessor = *FeasibleSuccessors.begin();
SmallVector<DominatorTree::UpdateType, 8> Updates;		SmallVector<DominatorTree::UpdateType, 8> Updates;
bool HaveSeenOnlyFeasibleSuccessor = false;		bool HaveSeenOnlyFeasibleSuccessor = false;
for (BasicBlock *Succ : successors(BB)) {		for (BasicBlock *Succ : successors(BB)) {
if (Succ == OnlyFeasibleSuccessor && !HaveSeenOnlyFeasibleSuccessor) {		if (Succ == OnlyFeasibleSuccessor && !HaveSeenOnlyFeasibleSuccessor) {
// Don't remove the edge to the only feasible successor the first time		// Don't remove the edge to the only feasible successor the first time
// we see it. We still do need to remove any multi-edges to it though.		// we see it. We still do need to remove any multi-edges to it though.
HaveSeenOnlyFeasibleSuccessor = true;		HaveSeenOnlyFeasibleSuccessor = true;
continue;		continue;
}		}

Succ->removePredecessor(BB);		Succ->removePredecessor(BB);
		if (DTU)
Updates.push_back({DominatorTree::Delete, BB, Succ});		Updates.push_back({DominatorTree::Delete, BB, Succ});
}		}

BranchInst::Create(OnlyFeasibleSuccessor, BB);		BranchInst::Create(OnlyFeasibleSuccessor, BB);
TI->eraseFromParent();		TI->eraseFromParent();
DTU.applyUpdatesPermissive(Updates);		if (DTU)
		DTU->applyUpdatesPermissive(Updates);
} else if (FeasibleSuccessors.size() > 1) {		} else if (FeasibleSuccessors.size() > 1) {
SwitchInstProfUpdateWrapper SI(*cast<SwitchInst>(TI));		SwitchInstProfUpdateWrapper SI(*cast<SwitchInst>(TI));
SmallVector<DominatorTree::UpdateType, 8> Updates;		SmallVector<DominatorTree::UpdateType, 8> Updates;

// If the default destination is unfeasible it will never be taken. Replace		// If the default destination is unfeasible it will never be taken. Replace
// it with a new block with a single Unreachable instruction.		// it with a new block with a single Unreachable instruction.
BasicBlock *DefaultDest = SI->getDefaultDest();		BasicBlock *DefaultDest = SI->getDefaultDest();
if (!FeasibleSuccessors.contains(DefaultDest)) {		if (!FeasibleSuccessors.contains(DefaultDest)) {
if (!NewUnreachableBB) {		if (!NewUnreachableBB) {
NewUnreachableBB =		NewUnreachableBB =
BasicBlock::Create(DefaultDest->getContext(), "default.unreachable",		BasicBlock::Create(DefaultDest->getContext(), "default.unreachable",
DefaultDest->getParent(), DefaultDest);		DefaultDest->getParent(), DefaultDest);
new UnreachableInst(DefaultDest->getContext(), NewUnreachableBB);		new UnreachableInst(DefaultDest->getContext(), NewUnreachableBB);
}		}

SI->setDefaultDest(NewUnreachableBB);		SI->setDefaultDest(NewUnreachableBB);
		if (DTU) {
Updates.push_back({DominatorTree::Delete, BB, DefaultDest});		Updates.push_back({DominatorTree::Delete, BB, DefaultDest});
Updates.push_back({DominatorTree::Insert, BB, NewUnreachableBB});		Updates.push_back({DominatorTree::Insert, BB, NewUnreachableBB});
}		}
		}

for (auto CI = SI->case_begin(); CI != SI->case_end();) {		for (auto CI = SI->case_begin(); CI != SI->case_end();) {
if (FeasibleSuccessors.contains(CI->getCaseSuccessor())) {		if (FeasibleSuccessors.contains(CI->getCaseSuccessor())) {
++CI;		++CI;
continue;		continue;
}		}

BasicBlock *Succ = CI->getCaseSuccessor();		BasicBlock *Succ = CI->getCaseSuccessor();
Succ->removePredecessor(BB);		Succ->removePredecessor(BB);
		if (DTU)
Updates.push_back({DominatorTree::Delete, BB, Succ});		Updates.push_back({DominatorTree::Delete, BB, Succ});
SI.removeCase(CI);		SI.removeCase(CI);
// Don't increment CI, as we removed a case.		// Don't increment CI, as we removed a case.
}		}

DTU.applyUpdatesPermissive(Updates);		if (DTU)
		DTU->applyUpdatesPermissive(Updates);
} else {		} else {
llvm_unreachable("Must have at least one feasible successor");		llvm_unreachable("Must have at least one feasible successor");
}		}
return true;		return true;
}		}
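The change to `removeNonFeasibleEdges` makes the dominator-tree updater optional: the parameter becomes a pointer, updates are collected only when it is non-null, and each `applyUpdatesPermissive` call is guarded. The sketch below illustrates that nullable-collaborator pattern in isolation. It is a toy under stated assumptions: `ToyUpdater`, `Edge` and `dropAllEdges` are hypothetical names and blocks are plain integer ids.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

using Edge = std::pair<int, int>; // (from, to) block ids

// Toy stand-in for a dominator-tree updater (NOT llvm::DomTreeUpdater).
struct ToyUpdater {
  std::vector<Edge> deleted;
  void applyUpdates(const std::vector<Edge> &updates) {
    deleted.insert(deleted.end(), updates.begin(), updates.end());
  }
};

// Remove every outgoing edge of `from`. `updater` may be null, in which
// case the edges are still removed but no updates are recorded or applied.
std::size_t dropAllEdges(int from, std::vector<int> &successors,
                         ToyUpdater *updater) {
  std::set<int> seen;
  std::vector<Edge> updates;
  for (int succ : successors)
    // Record one deletion per distinct successor, and only if asked to.
    if (updater && seen.insert(succ).second)
      updates.push_back(Edge(from, succ));

  std::size_t removed = successors.size();
  successors.clear();

  if (updater)
    updater->applyUpdates(updates);
  return removed;
}
```

The CFG edit itself is unconditional; only the bookkeeping depends on the pointer, which mirrors how the diff keeps `Succ->removePredecessor(BB)` outside the `if (DTU)` guards.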

		static bool propagateConstants(SCCPSolver &Solver,
		SmallVectorImpl<Function *> &Functions,
		SmallPtrSetImpl<Instruction *> &ToDelete) {
		bool MadeChanges = false;

		for (Function *F : Functions) {
		if (F->isDeclaration())
		continue;

		if (Solver.isBlockExecutable(&F->front())) {
		bool ReplacedPointerArg = false;
		for (Argument &Arg : F->args()) {
		if (!Arg.use_empty() && tryToReplaceWithConstant(Solver, &Arg)) {
		ReplacedPointerArg |= Arg.getType()->isPointerTy();
		++IPNumArgsElimed;
		}
		}

		// If we replaced an argument, the argmemonly and
		// inaccessiblemem_or_argmemonly attributes do not hold any longer. Remove
		// them from both the function and callsites.
		if (ReplacedPointerArg) {
		AttributeMask AttributesToRemove;
		AttributesToRemove.addAttribute(Attribute::ArgMemOnly);
		AttributesToRemove.addAttribute(Attribute::InaccessibleMemOrArgMemOnly);
		F->removeFnAttrs(AttributesToRemove);

		for (User *U : F->users()) {
		auto *CB = dyn_cast<CallBase>(U);
		if (!CB || CB->getCalledFunction() != F)
		continue;

		CB->removeFnAttrs(AttributesToRemove);
		}
		}
		MadeChanges |= ReplacedPointerArg;
		}

		SmallPtrSet<Value *, 32> InsertedValues;
		for (BasicBlock &BB : *F)
		if (Solver.isBlockExecutable(&BB))
		MadeChanges |= simplifyInstsInBlock(Solver, BB, ToDelete, InsertedValues,
		IPNumInstRemoved, IPNumInstReplaced);
		}

		return MadeChanges;
		}

		static void solve(SCCPSolver &Solver, SmallVectorImpl<Function *> &Functions) {
		bool ResolvedUndefs = true;
		while (ResolvedUndefs) {
		Solver.solve();
		LLVM_DEBUG(dbgs() << "RESOLVING UNDEFS\n");
		ResolvedUndefs = false;
		for (Function *F : Functions)
		ResolvedUndefs |= Solver.resolvedUndefsIn(*F);
		}
		}
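A later inline comment on `runIPSCCP` asks whether helpers like this one could accept either a range of functions or a range of function pointers, and leaves the `magic` adapter open. One possible shape for that adapter, sketched here with toy types only, is a pair of overloads that normalise a reference or a pointer to a pointer. `ToyFunction`, `toPointer` and `collectNames` are hypothetical names; nothing here is taken from the patch or from LLVM.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy function object (NOT llvm::Function).
struct ToyFunction {
  std::string name;
};

// The adapter: accept either a reference or a pointer, return a pointer.
ToyFunction *toPointer(ToyFunction &F) { return &F; }
ToyFunction *toPointer(ToyFunction *F) { return F; }

// Works for ranges of ToyFunction and for ranges of ToyFunction*.
template <typename RangeT>
std::vector<std::string> collectNames(RangeT &&R) {
  std::vector<std::string> names;
  for (auto &&F : R)
    names.push_back(toPointer(F)->name);
  return names;
}
```

With this, one template body serves both a container that owns the functions and a work list of pointers into it, so no intermediate vector of all functions is needed just to call the helper. Whether that is preferable to the plain overloads the reviewer also proposes is a judgment call this sketch does not settle.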

bool llvm::runIPSCCP(		bool llvm::runIPSCCP(
Module &M, const DataLayout &DL,		Module &M, const DataLayout &DL,
std::function<const TargetLibraryInfo &(Function &)> GetTLI,		std::function<const TargetLibraryInfo &(Function &)> GetTLI,
		std::function<TargetTransformInfo &(Function &)> GetTTI,
		std::function<AssumptionCache &(Function &)> GetAC,
function_ref<AnalysisResultsForFn(Function &)> getAnalysis) {		function_ref<AnalysisResultsForFn(Function &)> getAnalysis) {
SCCPSolver Solver(DL, GetTLI, M.getContext());		SCCPSolver Solver(DL, GetTLI, M.getContext());
		FunctionSpecializer Specializer(Solver, M, GetTLI, GetTTI, GetAC);

// Loop over all functions, marking arguments to those with their addresses		// Loop over all functions, marking arguments to those with their addresses
// taken or that are external as overdefined.		// taken or that are external as overdefined.
for (Function &F : M) {		for (Function &F : M) {
if (F.isDeclaration())		if (F.isDeclaration())
continue;		continue;

Solver.addAnalysis(F, getAnalysis(F));		Solver.addAnalysis(F, getAnalysis(F));
Show All 21 Lines	bool llvm::runIPSCCP(
// Determine if we can track any of the module's global variables. If so, add		// Determine if we can track any of the module's global variables. If so, add
// the global variables we can track to the solver's set of tracked global		// the global variables we can track to the solver's set of tracked global
// variables.		// variables.
for (GlobalVariable &G : M.globals()) {		for (GlobalVariable &G : M.globals()) {
G.removeDeadConstantUsers();		G.removeDeadConstantUsers();
if (canTrackGlobalVariableInterprocedurally(&G))		if (canTrackGlobalVariableInterprocedurally(&G))
Solver.trackValueOfGlobalVariable(&G);		Solver.trackValueOfGlobalVariable(&G);
}		}

// Solve for constants.
bool ResolvedUndefs = true;
Solver.solve();
while (ResolvedUndefs) {
LLVM_DEBUG(dbgs() << "RESOLVING UNDEFS\n");
ResolvedUndefs = false;
for (Function &F : M) {
if (Solver.resolvedUndefsIn(F))
ResolvedUndefs = true;
}
if (ResolvedUndefs)
Solver.solve();
}

bool MadeChanges = false;		bool MadeChanges = false;
		chill: IMHO, the invocation of the `FunctionSpecialization` pass ought to happen in this place. The general flow would be like:
		1. Initialise solver
		2. Run solver once (`Solver.solve()` + `resolvedUndefsIn` loop)
		3. Run function specialisation
		4. Run solver again
		5. Optionally go to 2.
		6. Do replacements (from line 512 on)
		At no point before the last step ought the passes to replace or delete anything (well, except the called function operand for cloned functions). If an operand/argument is determined to be a constant, it does not need to be replaced right away, because the passes should consult its lattice value. Yeah, the devil is in the details, but this is the approach to merging the two passes, as I see it.
		SmallVector<Function *> WorkList;
		SmallPtrSet<Instruction *, 32> ToDelete;

// Iterate over all of the instructions in the module, replacing them with		for (Function &F : M)
		chill: I would suggest not creating a vector of all the functions in the module, as they could be quite a lot (e.g. in LTO) and thus trigger several heap allocations for `WorkList`. `solveWhileResolvedUndefIn` is quite small and could be overloaded for a `Module *` parameter. I considered making this function a template along the lines of:
		    template<typename RangeT>
		    void printNames(RangeT &&R) {
		      for (auto &F : R)
		        llvm::dbgs() << magic(F)->getName();
		    }
		    std::vector<llvm::Function *> v;
		    llvm::Module *M;
		    int main() {
		      printNames(M->functions());
		      printNames(v);
		    }
		but couldn't come up with `magic`. As for `propagateConstants`, it could be done with a few overloads as well:
		    static bool propagateConstants(SCCPSolver &Solver, Function *F, SmallPtrSetImpl<Instruction *> &ToDelete);
		    static bool propagateConstants(SCCPSolver &Solver, SmallVectorImpl<Function *> &WorkList, SmallPtrSetImpl<Instruction *> &ToDelete) {
		      for (Function *F : WorkList)
		        propagateConstants(Solver, F, ToDelete);
		    }
		    static bool propagateConstants(SCCPSolver &Solver, Module *M, SmallPtrSetImpl<Instruction *> &ToDelete) {
		      for (auto &F : *M)
		        propagateConstants(Solver, &F, ToDelete);
		    }
// constants if we have found them to be of constant values.		WorkList.push_back(&F);

for (Function &F : M) {		// Solve for constants.
if (F.isDeclaration())		solve(Solver, WorkList);
continue;

SmallVector<BasicBlock *, 512> BlocksToErase;		// Iterate over all of the instructions in the module, replacing them with
		// constants if we have found them to be of constant values.
		MadeChanges |= propagateConstants(Solver, WorkList, ToDelete);

if (Solver.isBlockExecutable(&F.front())) {		if (SpecializeFunctions) {
bool ReplacedPointerArg = false;		unsigned Iters = 0;
for (Argument &Arg : F.args()) {		WorkList.clear();
if (!Arg.use_empty() && tryToReplaceWithConstant(Solver, &Arg)) {		while (Iters++ < FuncSpecializationMaxIters &&
ReplacedPointerArg |= Arg.getType()->isPointerTy();		Specializer.specialize(WorkList)) {
++IPNumArgsElimed;		solve(Solver, WorkList);
		MadeChanges |= propagateConstants(Solver, WorkList, ToDelete);
		Specializer.promoteConstantStackValues();
		WorkList.clear();
}		}
}		}

// If we replaced an argument, the argmemonly and		// Don't delete anything until solving is done.
// inaccessiblemem_or_argmemonly attributes do not hold any longer. Remove		for (Instruction *DeadInst : ToDelete)
// them from both the function and callsites.		DeadInst->deleteValue();
if (ReplacedPointerArg) {
AttributeMask AttributesToRemove;
AttributesToRemove.addAttribute(Attribute::ArgMemOnly);
AttributesToRemove.addAttribute(Attribute::InaccessibleMemOrArgMemOnly);
F.removeFnAttrs(AttributesToRemove);

for (User *U : F.users()) {		for (Function &F : M) {
auto *CB = dyn_cast<CallBase>(U);		if (F.isDeclaration())
if (!CB || CB->getCalledFunction() != &F)
continue;		continue;

CB->removeFnAttrs(AttributesToRemove);		SmallVector<BasicBlock *, 512> BlocksToErase;
}
}
MadeChanges |= ReplacedPointerArg;
}

SmallPtrSet<Value *, 32> InsertedValues;
for (BasicBlock &BB : F) {		for (BasicBlock &BB : F) {
if (!Solver.isBlockExecutable(&BB)) {		if (!Solver.isBlockExecutable(&BB)) {
LLVM_DEBUG(dbgs() << " BasicBlock Dead:" << BB);		LLVM_DEBUG(dbgs() << " BasicBlock Dead:" << BB);
++NumDeadBlocks;		++NumDeadBlocks;

MadeChanges = true;		MadeChanges = true;

if (&BB != &F.front())		if (&BB != &F.front())
BlocksToErase.push_back(&BB);		BlocksToErase.push_back(&BB);
continue;
}		}

MadeChanges |= simplifyInstsInBlock(Solver, BB, InsertedValues,
IPNumInstRemoved, IPNumInstReplaced);
}		}

DomTreeUpdater DTU = Solver.getDTU(F);		Optional<DomTreeUpdater> OptDTU =
		SpecializeFunctions && Specializer.isClonedFunction(&F) ?
		None : Optional<DomTreeUpdater>(Solver.getDTU(F));

		DomTreeUpdater *DTU = OptDTU ? &*OptDTU : nullptr;

// Change dead blocks to unreachable. We do it after replacing constants		// Change dead blocks to unreachable. We do it after replacing constants
// in all executable blocks, because changeToUnreachable may remove PHI		// in all executable blocks, because changeToUnreachable may remove PHI
// nodes in executable blocks we found values for. The function's entry		// nodes in executable blocks we found values for. The function's entry
// block is not part of BlocksToErase, so we have to handle it separately.		// block is not part of BlocksToErase, so we have to handle it separately.
for (BasicBlock *BB : BlocksToErase) {		for (BasicBlock *BB : BlocksToErase) {
NumInstRemoved += changeToUnreachable(BB->getFirstNonPHI(),		NumInstRemoved += changeToUnreachable(BB->getFirstNonPHI(),
/*PreserveLCSSA=*/false, &DTU);		/*PreserveLCSSA=*/false, DTU);
}		}
if (!Solver.isBlockExecutable(&F.front()))		if (!Solver.isBlockExecutable(&F.front()))
NumInstRemoved += changeToUnreachable(F.front().getFirstNonPHI(),		NumInstRemoved += changeToUnreachable(F.front().getFirstNonPHI(),
/*PreserveLCSSA=*/false, &DTU);		/*PreserveLCSSA=*/false, DTU);

BasicBlock *NewUnreachableBB = nullptr;		BasicBlock *NewUnreachableBB = nullptr;
for (BasicBlock &BB : F)		for (BasicBlock &BB : F)
MadeChanges |= removeNonFeasibleEdges(Solver, &BB, DTU, NewUnreachableBB);		MadeChanges |= removeNonFeasibleEdges(Solver, &BB, DTU, NewUnreachableBB);

for (BasicBlock *DeadBB : BlocksToErase)		for (BasicBlock *DeadBB : BlocksToErase) {
if (!DeadBB->hasAddressTaken())		if (!DeadBB->hasAddressTaken()) {
DTU.deleteBB(DeadBB);		if (DTU)
		DTU->deleteBB(DeadBB);
		else
		DeadBB->eraseFromParent();
		}
		}

for (BasicBlock &BB : F) {		for (BasicBlock &BB : F) {
for (Instruction &Inst : llvm::make_early_inc_range(BB)) {		for (Instruction &Inst : llvm::make_early_inc_range(BB)) {
if (Solver.getPredicateInfoFor(&Inst)) {		if (Solver.getPredicateInfoFor(&Inst)) {
if (auto *II = dyn_cast<IntrinsicInst>(&Inst)) {		if (auto *II = dyn_cast<IntrinsicInst>(&Inst)) {
if (II->getIntrinsicID() == Intrinsic::ssa_copy) {		if (II->getIntrinsicID() == Intrinsic::ssa_copy) {
Value *Op = II->getOperand(0);		Value *Op = II->getOperand(0);
Inst.replaceAllUsesWith(Op);		Inst.replaceAllUsesWith(Op);
[116 lines not shown]
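
The right-hand side of the hunk above makes the dominator-tree updater optional: dead blocks go through the updater when one exists and are erased directly otherwise. A minimal, self-contained sketch of that control flow follows. It assumes toy stand-in types (`Block`, `Function`, `Updater`, and `eraseDeadBlocks` are hypothetical names for illustration, not LLVM classes or functions) and makes no claim about how `DomTreeUpdater` behaves internally.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration only; these are NOT the LLVM classes.
struct Block {
  std::string Name;
  bool AddressTaken;
  explicit Block(std::string N, bool Taken = false)
      : Name(std::move(N)), AddressTaken(Taken) {}
};

struct Function {
  std::vector<Block> Blocks;
  // Immediately remove the block with the given name, if present.
  void erase(const std::string &Name) {
    Blocks.erase(std::remove_if(Blocks.begin(), Blocks.end(),
                                [&](const Block &B) { return B.Name == Name; }),
                 Blocks.end());
  }
};

// Toy updater (assumption): queues deletions and applies them on flush().
struct Updater {
  Function &F;
  std::vector<std::string> Pending;
  explicit Updater(Function &Fn) : F(Fn) {}
  void deleteBlock(const std::string &Name) { Pending.push_back(Name); }
  void flush() {
    for (const std::string &Name : Pending)
      F.erase(Name);
    Pending.clear();
  }
};

// Same shape as the loop in the hunk: skip address-taken blocks, then use
// the updater when one is available and fall back to direct erasure if not.
void eraseDeadBlocks(Function &F, const std::vector<std::string> &Dead,
                     Updater *U) {
  for (const std::string &Name : Dead) {
    assert(!Name.empty() && "dead block names must be non-empty");
    auto It = std::find_if(F.Blocks.begin(), F.Blocks.end(),
                           [&](const Block &B) { return B.Name == Name; });
    if (It == F.Blocks.end() || It->AddressTaken)
      continue;
    if (U)
      U->deleteBlock(Name);
    else
      F.erase(Name);
  }
}
```

Passing `nullptr` for the updater removes blocks on the spot, while passing an `Updater` defers removal until `flush()` is called. A nullable pointer lets one loop serve both cases.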

llvm/test/Transforms/FunctionSpecialization/bug52821-use-after-free.ll

	; NOTE: Assertions have been autogenerated by utils/update_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
	; RUN: opt -function-specialization -S < %s | FileCheck %s			; RUN: opt -ipsccp -specialize-functions -S < %s | FileCheck %s

	%mystruct = type { i32, [2 x i64] }			%mystruct = type { i32, [2 x i64] }

	define internal %mystruct* @myfunc(%mystruct* %arg) {			define internal %mystruct* @myfunc(%mystruct* %arg) {
	; CHECK-LABEL: @myfunc(			; CHECK-LABEL: @myfunc(
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
	; CHECK-NEXT: br label [[FOR_COND:%.*]]			; CHECK-NEXT: br label [[FOR_COND:%.*]]
	; CHECK: for.cond:			; CHECK: for.cond:
	; CHECK-NEXT: br i1 true, label [[FOR_COND2:%.*]], label [[FOR_BODY:%.*]]			; CHECK-NEXT: br label [[FOR_COND2:%.*]]
	; CHECK: for.body:
	; CHECK-NEXT: call void @callee(%mystruct* nonnull null)
	; CHECK-NEXT: br label [[FOR_COND]]
	; CHECK: for.cond2:			; CHECK: for.cond2:
	; CHECK-NEXT: br i1 false, label [[FOR_END:%.*]], label [[FOR_BODY2:%.*]]			; CHECK-NEXT: br label [[FOR_BODY2:%.*]]
	; CHECK: for.body2:			; CHECK: for.body2:
	; CHECK-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds [[MYSTRUCT:%.*]], %mystruct* null, i64 0, i32 1, i64 3
	; CHECK-NEXT: br label [[FOR_COND2]]			; CHECK-NEXT: br label [[FOR_COND2]]
	; CHECK: for.end:
	; CHECK-NEXT: ret %mystruct* [[ARG:%.*]]
	;			;
	entry:			entry:
	br label %for.cond			br label %for.cond

	for.cond: ; preds = %for.body, %entry			for.cond: ; preds = %for.body, %entry
	%phi = phi %mystruct* [ undef, %for.body ], [ null, %entry ]			%phi = phi %mystruct* [ undef, %for.body ], [ null, %entry ]
	%cond = icmp eq %mystruct* %phi, null			%cond = icmp eq %mystruct* %phi, null
	br i1 %cond, label %for.cond2, label %for.body			br i1 %cond, label %for.cond2, label %for.body
	[13 lines not shown]
	for.end: ; preds = %for.cond2			for.end: ; preds = %for.cond2
	ret %mystruct* %arg			ret %mystruct* %arg
	}			}

	define %mystruct* @caller() {			define %mystruct* @caller() {
	; CHECK-LABEL: @caller(			; CHECK-LABEL: @caller(
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
	; CHECK-NEXT: [[CALL:%.*]] = call %mystruct* @myfunc(%mystruct* undef)			; CHECK-NEXT: [[CALL:%.*]] = call %mystruct* @myfunc(%mystruct* undef)
	; CHECK-NEXT: ret %mystruct* [[CALL]]			; CHECK-NEXT: ret %mystruct* undef
	;			;
	entry:			entry:
	%call = call %mystruct* @myfunc(%mystruct* undef)			%call = call %mystruct* @myfunc(%mystruct* undef)
	ret %mystruct* %call			ret %mystruct* %call
	}			}

	declare void @callee(%mystruct*)			declare void @callee(%mystruct*)

llvm/test/Transforms/FunctionSpecialization/bug55000-read-uninitialized-value.ll

	; RUN: opt -function-specialization -func-specialization-max-iters=2 -func-specialization-size-threshold=20 -func-specialization-avg-iters-cost=20 -function-specialization-for-literal-constant=true -S < %s | FileCheck %s			; RUN: opt -ipsccp -specialize-functions -force-function-specialization -func-specialization-max-iters=2 -func-specialization-max-clones=1 -function-specialization-for-literal-constant=true -S < %s | FileCheck %s

	declare hidden i1 @compare(ptr) align 2			declare hidden i1 @compare(ptr) align 2
	declare hidden { i8, ptr } @getType(ptr) align 2			declare hidden { i8, ptr } @getType(ptr) align 2

	; CHECK-LABEL: @foo			; CHECK-LABEL: @foo
	; CHECK-LABEL: @foo.1			; CHECK-LABEL: @foo.1
	; CHECK-LABEL: @foo.2			; CHECK-LABEL: @foo.2

	[51 lines not shown]

llvm/test/Transforms/FunctionSpecialization/function-specialization-always-inline.ll

	; RUN: opt -function-specialization -func-specialization-avg-iters-cost=3 -func-specialization-size-threshold=10 -S < %s | FileCheck %s			; RUN: opt -ipsccp -specialize-functions -func-specialization-avg-iters-cost=3 -func-specialization-size-threshold=10 -S < %s | FileCheck %s

	; CHECK-NOT: foo.{{[0-9]+}}			; CHECK-NOT: foo.{{[0-9]+}}

	target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"			target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"

	@A = external dso_local constant i32, align 4			@A = external dso_local constant i32, align 4
	@B = external dso_local constant i32, align 4			@B = external dso_local constant i32, align 4
	@C = external dso_local constant i32, align 4			@C = external dso_local constant i32, align 4
	[43 lines not shown]

	if.else:			if.else:
	%call1 = call i32 @foo(i32 %y, i32* @B, i32* @D)			%call1 = call i32 @foo(i32 %y, i32* @B, i32* @D)
	br label %return			br label %return

	return:			return:
	%retval.0 = phi i32 [ %call, %if.then ], [ %call1, %if.else ]			%retval.0 = phi i32 [ %call, %if.then ], [ %call1, %if.else ]
	ret i32 %retval.0			ret i32 %retval.0
	}			}
	No newline at end of file

llvm/test/Transforms/FunctionSpecialization/function-specialization-constant-expression.ll

	; NOTE: Assertions have been autogenerated by utils/update_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_test_checks.py

	; Test function specialization wouldn't crash due to constant expression.			; Test function specialization wouldn't crash due to constant expression.
	; Note that this test case shows that function specialization pass would			; Note that this test case shows that function specialization pass would
	; transform the function even if no specialization happened.			; transform the function even if no specialization happened.

	; RUN: opt -function-specialization -force-function-specialization -S < %s | FileCheck %s			; RUN: opt -ipsccp -specialize-functions -force-function-specialization -S < %s | FileCheck %s

	%struct = type { i8, i16, i32, i64, i64}			%struct = type { i8, i16, i32, i64, i64}
	@Global = internal constant %struct {i8 0, i16 1, i32 2, i64 3, i64 4}			@Global = internal constant %struct {i8 0, i16 1, i32 2, i64 3, i64 4}

	define internal i64 @func2(i64 *%x) {			define internal i64 @func2(i64 *%x) {
	entry:			entry:
	%val = ptrtoint i64* %x to i64			%val = ptrtoint i64* %x to i64
	ret i64 %val			ret i64 %val
	}			}

	define internal i64 @func(i64 *%x, i64 (i64*)* %binop) {			define internal i64 @func(i64 *%x, i64 (i64*)* %binop) {
	; CHECK-LABEL: @func(			; CHECK-LABEL: @func(
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
	; CHECK-NEXT: [[TMP0:%.*]] = call i64 [[BINOP:%.*]](i64* [[X:%.*]])			; CHECK-NEXT: unreachable
	; CHECK-NEXT: ret i64 [[TMP0]]
	;			;
	entry:			entry:
	%tmp0 = call i64 %binop(i64* %x)			%tmp0 = call i64 %binop(i64* %x)
	ret i64 %tmp0			ret i64 %tmp0
	}			}

	define internal i64 @zoo(i1 %flag) {			define internal i64 @zoo(i1 %flag) {
	; CHECK-LABEL: @zoo(			; CHECK-LABEL: @zoo(
	[43 lines not shown]

llvm/test/Transforms/FunctionSpecialization/function-specialization-constant-expression2.ll

	; NOTE: Assertions have been autogenerated by utils/update_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
	; RUN: opt -function-specialization -force-function-specialization -S < %s | FileCheck %s			; RUN: opt -ipsccp -specialize-functions -force-function-specialization -S < %s | FileCheck %s

	; Check that we don't crash and specialise on a constant expression.			; Check that we don't crash and specialise on a constant expression.

	%struct.pluto = type { %struct.spam }			%struct.pluto = type { %struct.spam }
	%struct.quux = type { i16 }			%struct.quux = type { i16 }
	%struct.spam = type { i16 }			%struct.spam = type { i16 }

	@global.5 = external dso_local global [4 x %struct.pluto], align 1			@global.5 = external dso_local global [4 x %struct.pluto], align 1
	[34 lines not shown]

llvm/test/Transforms/FunctionSpecialization/function-specialization-constant-expression3.ll

	; NOTE: Assertions have been autogenerated by utils/update_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
	; RUN: opt -function-specialization -force-function-specialization -S < %s | FileCheck %s			; RUN: opt -ipsccp -specialize-functions -force-function-specialization -S < %s | FileCheck %s

	define i32 @main() {			define i32 @main() {
	; CHECK-LABEL: @main(			; CHECK-LABEL: @main(
	; CHECK-NEXT: bb:			; CHECK-NEXT: bb:
	; CHECK-NEXT: tail call void @wombat.1(i8* undef, i64 undef, i64 undef, i32 (i8, i8)* bitcast (i32 ()* @quux to i32 (i8, i8)*))			; CHECK-NEXT: tail call void @wombat.1(i8* undef, i64 undef, i64 undef, i32 (i8, i8)* bitcast (i32 ()* @quux to i32 (i8, i8)*))
	; CHECK-NEXT: tail call void @wombat.2(i8* undef, i64 undef, i64 undef, i32 (i8, i8)* bitcast (i32 ()* @eggs to i32 (i8, i8)*))			; CHECK-NEXT: tail call void @wombat.2(i8* undef, i64 undef, i64 undef, i32 (i8, i8)* bitcast (i32 ()* @eggs to i32 (i8, i8)*))
	; CHECK-NEXT: ret i32 undef			; CHECK-NEXT: ret i32 undef
	;			;
	[14 lines not shown]

llvm/test/Transforms/FunctionSpecialization/function-specialization-constant-expression4.ll

	; RUN: opt -function-specialization -force-function-specialization -S < %s | FileCheck %s			; RUN: opt -ipsccp -specialize-functions -force-function-specialization -S < %s | FileCheck %s

	; Check that we don't crash and specialise on a function call with byval attribute.			; Check that we don't crash and specialise on a function call with byval attribute.

	; CHECK-NOT: wombat.{{[0-9]+}}			; CHECK-NOT: wombat.{{[0-9]+}}

	declare i32* @quux()			declare i32* @quux()
	declare i32* @eggs()			declare i32* @eggs()

	[23 lines not shown]

llvm/test/Transforms/FunctionSpecialization/function-specialization-constant-expression5.ll

	; NOTE: Assertions have been autogenerated by utils/update_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
	; RUN: opt -function-specialization -force-function-specialization -func-specialization-on-address -S < %s | FileCheck %s			; RUN: opt -ipsccp -specialize-functions -force-function-specialization -func-specialization-on-address -S < %s | FileCheck %s

	; Check that we don't crash and specialise on a scalar global variable with byval attribute.			; Check that we don't crash and specialise on a scalar global variable with byval attribute.

	; CHECK-NOT: wombat.{{[0-9]+}}			; CHECK-NOT: wombat.{{[0-9]+}}

	%struct.pluto = type { %struct.spam }			%struct.pluto = type { %struct.spam }
	%struct.quux = type { i16 }			%struct.quux = type { i16 }
	%struct.spam = type { i16 }			%struct.spam = type { i16 }
	[36 lines not shown]

llvm/test/Transforms/FunctionSpecialization/function-specialization-constant-integers.ll

	; RUN: opt -function-specialization -function-specialization-for-literal-constant=true -func-specialization-size-threshold=10 -S < %s | FileCheck %s			; RUN: opt -ipsccp -specialize-functions -function-specialization-for-literal-constant=true -func-specialization-size-threshold=10 -S < %s | FileCheck %s

	; Check that the literal constant parameter could be specialized.			; Check that the literal constant parameter could be specialized.
	; CHECK: @foo.1(			; CHECK: @foo.1(
	; CHECK: @foo.2(			; CHECK: @foo.2(

	target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"			target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"

	declare i32 @getValue()			declare i32 @getValue()
	[26 lines not shown]
	}			}

	define dso_local i32 @bar(i32 %x, i32 %y) {			define dso_local i32 @bar(i32 %x, i32 %y) {
	entry:			entry:
	%retval.1 = call i32 @foo(i1 1)			%retval.1 = call i32 @foo(i1 1)
	%retval.2 = call i32 @foo(i1 0)			%retval.2 = call i32 @foo(i1 0)
	%retval = add nsw i32 %retval.1, %retval.2			%retval = add nsw i32 %retval.1, %retval.2
	ret i32 %retval			ret i32 %retval
	}			}
	No newline at end of file

llvm/test/Transforms/FunctionSpecialization/function-specialization-loop.ll

	; RUN: opt -function-specialization -func-specialization-avg-iters-cost=3 -func-specialization-size-threshold=10 -S < %s | FileCheck %s			; RUN: opt -ipsccp -specialize-functions -func-specialization-avg-iters-cost=3 -func-specialization-size-threshold=10 -S < %s | FileCheck %s

	; Check that the loop depth results in a larger specialization bonus.			; Check that the loop depth results in a larger specialization bonus.
	; CHECK: @foo.1(			; CHECK: @foo.1(
	; CHECK: @foo.2(			; CHECK: @foo.2(

	target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"			target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"

	@A = external dso_local constant i32, align 4			@A = external dso_local constant i32, align 4
	[45 lines not shown]

	if.else:			if.else:
	%call1 = call i32 @foo(i32 %y, i32* @B, i32* @D)			%call1 = call i32 @foo(i32 %y, i32* @B, i32* @D)
	br label %return			br label %return

	return:			return:
	%retval.0 = phi i32 [ %call, %if.then ], [ %call1, %if.else ]			%retval.0 = phi i32 [ %call, %if.then ], [ %call1, %if.else ]
	ret i32 %retval.0			ret i32 %retval.0
	}			}
	No newline at end of file

llvm/test/Transforms/FunctionSpecialization/function-specialization-minsize.ll

	; RUN: opt -function-specialization -S < %s | FileCheck %s			; RUN: opt -ipsccp -specialize-functions -S < %s | FileCheck %s

	; CHECK-NOT: @compute.1			; CHECK-NOT: @compute.1
	; CHECK-NOT: @compute.2			; CHECK-NOT: @compute.2

	define i64 @main(i64 %x, i1 %flag) {			define i64 @main(i64 %x, i1 %flag) {
	entry:			entry:
	br i1 %flag, label %plus, label %minus			br i1 %flag, label %plus, label %minus

	[30 lines not shown]

llvm/test/Transforms/FunctionSpecialization/function-specialization-minsize2.ll

	; RUN: opt -function-specialization -func-specialization-size-threshold=3 -S < %s | FileCheck %s			; RUN: opt -ipsccp -specialize-functions -func-specialization-size-threshold=3 -S < %s | FileCheck %s

	; Checks for callsites that have been annotated with MinSize. No specialisation			; Checks for callsites that have been annotated with MinSize. No specialisation
	; expected here:			; expected here:
	;			;
	; CHECK-NOT: @compute.1			; CHECK-NOT: @compute.1
	; CHECK-NOT: @compute.2			; CHECK-NOT: @compute.2

	define i64 @main(i64 %x, i1 %flag) {			define i64 @main(i64 %x, i1 %flag) {
	[35 lines not shown]

llvm/test/Transforms/FunctionSpecialization/function-specialization-minsize3.ll

	; RUN: opt -function-specialization -func-specialization-size-threshold=3 -S < %s | FileCheck %s			; RUN: opt -ipsccp -specialize-functions -func-specialization-size-threshold=3 -S < %s | FileCheck %s

	; Checks for callsites that have been annotated with MinSize. We only expect			; Checks for callsites that have been annotated with MinSize. We only expect
	; specialisation for the call that does not have the attribute:			; specialisation for the call that does not have the attribute:
	;			;
	; CHECK: plus:			; CHECK: plus:
	; CHECK: %tmp0 = call i64 @compute.1(i64 %x, i64 (i64)* @plus)			; CHECK: %tmp0 = call i64 @compute.1(i64 %x, i64 (i64)* @plus)
	; CHECK: br label %merge			; CHECK: br label %merge
	; CHECK: minus:			; CHECK: minus:
	[39 lines not shown]

llvm/test/Transforms/FunctionSpecialization/function-specialization-nodup.ll

	; RUN: opt -function-specialization -force-function-specialization -S < %s | FileCheck %s			; RUN: opt -ipsccp -specialize-functions -force-function-specialization -S < %s | FileCheck %s

	; Function @foo has function attribute 'noduplicate', so check that we don't			; Function @foo has function attribute 'noduplicate', so check that we don't
	; specialize it:			; specialize it:

	; CHECK-NOT: @foo.1(			; CHECK-NOT: @foo.1(
	; CHECK-NOT: @foo.2(			; CHECK-NOT: @foo.2(

	target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"			target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
	[30 lines not shown]

llvm/test/Transforms/FunctionSpecialization/function-specialization-nodup2.ll

	; RUN: opt -function-specialization -force-function-specialization -S < %s | FileCheck %s			; RUN: opt -ipsccp -specialize-functions -force-function-specialization -S < %s | FileCheck %s

	; Check that function foo does not gets specialised as it contains an intrinsic			; Check that function foo does not gets specialised as it contains an intrinsic
	; that is marked as NoDuplicate.			; that is marked as NoDuplicate.
	; Please note that the use of the hardwareloop intrinsic is arbitrary; it's			; Please note that the use of the hardwareloop intrinsic is arbitrary; it's
	; just an easy to use intrinsic that has NoDuplicate.			; just an easy to use intrinsic that has NoDuplicate.

	; CHECK-NOT: @foo.1(			; CHECK-NOT: @foo.1(
	; CHECK-NOT: @foo.2(			; CHECK-NOT: @foo.2(
	[33 lines not shown]

llvm/test/Transforms/FunctionSpecialization/function-specialization-noexec.ll

	; RUN: opt -function-specialization -force-function-specialization -S < %s | FileCheck %s			; RUN: opt -ipsccp -specialize-functions -force-function-specialization -S < %s | FileCheck %s

	; The if.then block is not executed, so check that we don't specialise here.			; The if.then block is not executed, so check that we don't specialise here.

	; CHECK-NOT: @foo.1(			; CHECK-NOT: @foo.1(
	; CHECK-NOT: @foo.2(			; CHECK-NOT: @foo.2(

	target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"			target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"

	[27 lines not shown]

llvm/test/Transforms/FunctionSpecialization/function-specialization-nonconst-glob.ll

	; NOTE: Assertions have been autogenerated by utils/update_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_test_checks.py

	; RUN: opt -function-specialization -force-function-specialization -S < %s | FileCheck %s			; RUN: opt -ipsccp -specialize-functions -force-function-specialization -S < %s | FileCheck %s
	; RUN: opt -function-specialization -force-function-specialization -func-specialization-on-address=0 -S < %s | FileCheck %s			; RUN: opt -ipsccp -specialize-functions -force-function-specialization -func-specialization-on-address=0 -S < %s | FileCheck %s
	; RUN: opt -function-specialization -force-function-specialization -func-specialization-on-address=1 -S < %s | FileCheck %s --check-prefix=ON-ADDRESS			; RUN: opt -ipsccp -specialize-functions -force-function-specialization -func-specialization-on-address=1 -S < %s | FileCheck %s --check-prefix=ON-ADDRESS

	; Global B is not constant. We do not specialise on addresses unless we			; Global B is not constant. We do not specialise on addresses unless we
	; enable that:			; enable that:

	; ON-ADDRESS: call i32 @foo.1(i32 %x, i32* @A)			; ON-ADDRESS: call i32 @foo.1(i32 %x, i32* @A)
	; ON-ADDRESS: call i32 @foo.2(i32 %y, i32* @B)			; ON-ADDRESS: call i32 @foo.2(i32 %y, i32* @B)

	target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"			target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
	[57 lines not shown]

llvm/test/Transforms/FunctionSpecialization/function-specialization-nothing-todo.ll

This file was deleted.

	; REQUIRES: asserts
	; RUN: opt -function-specialization -debug -S < %s 2>&1 | FileCheck %s

	; The purpose of this test is to check that we don't run the solver as there's
	; nothing to do here. For a test that doesn't trigger function specialisation,
	; it is intentionally 'big' because we also want to check that the ssa.copy
	; intrinsics that are introduced by the solver are cleaned up if we bail
	; early. Thus, first check the debug messages for the introduction of these
	; intrinsics:

	; CHECK: FnSpecialization: Analysing decl: foo
	; CHECK: Found replacement{{.*}} call i32 @llvm.ssa.copy.i32
	; CHECK: Found replacement{{.*}} call i32 @llvm.ssa.copy.i32

	; Then, make sure the solver didn't run:

	; CHECK-NOT: Running solver

	; Finally, check the absence and thus removal of these intrinsics:

	; CHECK-LABEL: @foo
	; CHECK-NOT: call i32 @llvm.ssa.copy.i32

	@N = external dso_local global i32, align 4
	@B = external dso_local global i32*, align 8
	@A = external dso_local global i32*, align 8

	define dso_local i32 @foo() {
	entry:
	br label %for.cond

	for.cond:
	%i.0 = phi i32 [ 0, %entry ], [ %inc, %for.body ]
	%0 = load i32, i32* @N, align 4
	%cmp = icmp slt i32 %i.0, %0
	br i1 %cmp, label %for.body, label %for.cond.cleanup

	for.cond.cleanup:
	ret i32 undef

	for.body:
	%1 = load i32, i32* @B, align 8
	%idxprom = sext i32 %i.0 to i64
	%arrayidx = getelementptr inbounds i32, i32* %1, i64 %idxprom
	%2 = load i32, i32* %arrayidx, align 4
	%3 = load i32, i32* @A, align 8
	%arrayidx2 = getelementptr inbounds i32, i32* %3, i64 %idxprom
	store i32 %2, i32* %arrayidx2, align 4
	%inc = add nsw i32 %i.0, 1
	br label %for.cond
	}

llvm/test/Transforms/FunctionSpecialization/function-specialization-poison.ll

	; NOTE: Assertions have been autogenerated by utils/update_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
	; RUN: opt -function-specialization -force-function-specialization -S < %s | FileCheck %s			; RUN: opt -ipsccp -specialize-functions -force-function-specialization -S < %s | FileCheck %s

	; Check that we don't crash and specialise on a poison value.			; Check that we don't crash and specialise on a poison value.

	%struct.quux = type { i16 }			%struct.quux = type { i16 }
	%struct.spam = type { i16 }			%struct.spam = type { i16 }

	@global.12 = external global %struct.quux, align 1			@global.12 = external global %struct.quux, align 1

	[32 lines not shown]

llvm/test/Transforms/FunctionSpecialization/function-specialization-recursive.ll

	; RUN: opt -function-specialization -force-function-specialization -func-specialization-max-iters=2 -inline -instcombine -S < %s | FileCheck %s --check-prefix=ITERS2			; RUN: opt -ipsccp -specialize-functions -force-function-specialization -func-specialization-max-iters=2 -inline -instcombine -S < %s | FileCheck %s --check-prefix=ITERS2
	; RUN: opt -function-specialization -force-function-specialization -func-specialization-max-iters=3 -inline -instcombine -S < %s | FileCheck %s --check-prefix=ITERS3			; RUN: opt -ipsccp -specialize-functions -force-function-specialization -func-specialization-max-iters=3 -inline -instcombine -S < %s | FileCheck %s --check-prefix=ITERS3
	; RUN: opt -function-specialization -force-function-specialization -func-specialization-max-iters=4 -inline -instcombine -S < %s | FileCheck %s --check-prefix=ITERS4			; RUN: opt -ipsccp -specialize-functions -force-function-specialization -func-specialization-max-iters=4 -inline -instcombine -S < %s | FileCheck %s --check-prefix=ITERS4

	@low = internal constant i32 0, align 4			@low = internal constant i32 0, align 4
	@high = internal constant i32 6, align 4			@high = internal constant i32 6, align 4

	define internal void @recursiveFunc(i32* nocapture readonly %lo, i32 %step, i32* nocapture readonly %hi) {			define internal void @recursiveFunc(i32* nocapture readonly %lo, i32 %step, i32* nocapture readonly %hi) {
	%lo.temp = alloca i32, align 4			%lo.temp = alloca i32, align 4
	%hi.temp = alloca i32, align 4			%hi.temp = alloca i32, align 4
	%lo.load = load i32, i32* %lo, align 4			%lo.load = load i32, i32* %lo, align 4
	[48 lines not shown]

llvm/test/Transforms/FunctionSpecialization/function-specialization-recursive2.ll

	; RUN: opt -function-specialization -force-function-specialization -func-specialization-max-iters=2 -S < %s | FileCheck %s			; RUN: opt -ipsccp -specialize-functions -force-function-specialization -func-specialization-max-iters=2 -S < %s | FileCheck %s

	; Volatile store preventing recursive specialisation:			; Volatile store preventing recursive specialisation:
	;			;
	; CHECK: @recursiveFunc.1			; CHECK: @recursiveFunc.1
	; CHECK-NOT: @recursiveFunc.2			; CHECK-NOT: @recursiveFunc.2

	@Global = internal constant i32 1, align 4			@Global = internal constant i32 1, align 4

	[23 lines not shown]

llvm/test/Transforms/FunctionSpecialization/function-specialization-recursive3.ll

	; RUN: opt -function-specialization -force-function-specialization -func-specialization-max-iters=2 -S < %s | FileCheck %s			; RUN: opt -ipsccp -specialize-functions -force-function-specialization -func-specialization-max-iters=2 -S < %s | FileCheck %s

	; Duplicate store preventing recursive specialisation:			; Duplicate store preventing recursive specialisation:
	;			;
	; CHECK: @recursiveFunc.1			; CHECK: @recursiveFunc.1
	; CHECK-NOT: @recursiveFunc.2			; CHECK-NOT: @recursiveFunc.2

	@Global = internal constant i32 1, align 4			@Global = internal constant i32 1, align 4

	[25 lines not shown]

llvm/test/Transforms/FunctionSpecialization/function-specialization-recursive4.ll

	; RUN: opt -function-specialization -force-function-specialization -func-specialization-max-iters=2 -S < %s | FileCheck %s			; RUN: opt -ipsccp -specialize-functions -force-function-specialization -func-specialization-max-iters=2 -S < %s | FileCheck %s

	; Alloca is not an integer type:			; Alloca is not an integer type:
	;			;
	; CHECK: @recursiveFunc.1			; CHECK: @recursiveFunc.1
	; CHECK-NOT: @recursiveFunc.2			; CHECK-NOT: @recursiveFunc.2

	@Global = internal constant i32 1, align 4			@Global = internal constant i32 1, align 4

	[23 lines not shown]

llvm/test/Transforms/FunctionSpecialization/function-specialization-stats.ll

	; REQUIRES: asserts			; REQUIRES: asserts
	; RUN: opt -stats -function-specialization -S -force-function-specialization < %s 2>&1 | FileCheck %s			; RUN: opt -stats -ipsccp -specialize-functions -S -force-function-specialization < %s 2>&1 | FileCheck %s

	; CHECK: 2 function-specialization - Number of functions specialized			; CHECK: 2 function-specialization - Number of functions specialized

	define i64 @main(i64 %x, i1 %flag) {			define i64 @main(i64 %x, i1 %flag) {
	entry:			entry:
	br i1 %flag, label %plus, label %minus			br i1 %flag, label %plus, label %minus

	plus:			plus:
	[29 lines not shown]

llvm/test/Transforms/FunctionSpecialization/function-specialization.ll

	; RUN: opt -function-specialization -func-specialization-size-threshold=3 -S < %s | FileCheck %s			; RUN: opt -ipsccp -specialize-functions -func-specialization-size-threshold=3 -S < %s | FileCheck %s

	define i64 @main(i64 %x, i1 %flag) {			define i64 @main(i64 %x, i1 %flag) {
	;			;
	; CHECK-LABEL: @main(i64 %x, i1 %flag) {			; CHECK-LABEL: @main(i64 %x, i1 %flag) {
	; CHECK: entry:			; CHECK: entry:
	; CHECK-NEXT: br i1 %flag, label %plus, label %minus			; CHECK-NEXT: br i1 %flag, label %plus, label %minus
	; CHECK: plus:			; CHECK: plus:
	; CHECK-NEXT: [[TMP0:%.+]] = call i64 @compute.1(i64 %x, i64 (i64)* @plus)			; CHECK-NEXT: [[TMP0:%.+]] = call i64 @compute.1(i64 %x, i64 (i64)* @plus)
	[56 lines not shown]

llvm/test/Transforms/FunctionSpecialization/function-specialization2.ll

	; NOTE: Assertions have been autogenerated by utils/update_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
	; RUN: opt -function-specialization -deadargelim -force-function-specialization -S < %s | FileCheck %s			; RUN: opt -ipsccp -specialize-functions -deadargelim -force-function-specialization -S < %s | FileCheck %s
	; RUN: opt -function-specialization -func-specialization-max-iters=1 -deadargelim -force-function-specialization -S < %s | FileCheck %s			; RUN: opt -ipsccp -specialize-functions -func-specialization-max-iters=1 -deadargelim -force-function-specialization -S < %s | FileCheck %s
	; RUN: opt -function-specialization -func-specialization-max-iters=0 -deadargelim -force-function-specialization -S < %s | FileCheck %s --check-prefix=DISABLED			; RUN: opt -ipsccp -specialize-functions -func-specialization-max-iters=0 -deadargelim -force-function-specialization -S < %s | FileCheck %s --check-prefix=DISABLED
	; RUN: opt -function-specialization -func-specialization-avg-iters-cost=1 -deadargelim -force-function-specialization -S < %s | FileCheck %s			; RUN: opt -ipsccp -specialize-functions -func-specialization-avg-iters-cost=1 -deadargelim -force-function-specialization -S < %s | FileCheck %s

	; DISABLED-NOT: @func.1(			; DISABLED-NOT: @func.1(
	; DISABLED-NOT: @func.2(			; DISABLED-NOT: @func.2(

	define internal i32 @func(i32* %0, i32 %1, void (i32*)* nocapture %2) {			define internal i32 @func(i32* %0, i32 %1, void (i32*)* nocapture %2) {
	%4 = alloca i32, align 4			%4 = alloca i32, align 4
	store i32 %1, i32* %4, align 4			store i32 %1, i32* %4, align 4
	%5 = load i32, i32* %4, align 4			%5 = load i32, i32* %4, align 4
	[24 lines not shown]
	define internal void @decrement(i32* nocapture %0) {			define internal void @decrement(i32* nocapture %0) {
	%2 = load i32, i32* %0, align 4			%2 = load i32, i32* %0, align 4
	%3 = add nsw i32 %2, -1			%3 = add nsw i32 %2, -1
	store i32 %3, i32* %0, align 4			store i32 %3, i32* %0, align 4
	ret void			ret void
	}			}

	define i32 @main(i32* %0, i32 %1) {			define i32 @main(i32* %0, i32 %1) {
	; CHECK: [[TMP3:%.*]] = call i32 @func.2(i32* [[TMP0:%.*]], i32 [[TMP1:%.*]])			; CHECK: call void @func.2(i32* [[TMP0:%.*]], i32 [[TMP1:%.*]])
	%3 = call i32 @func(i32* %0, i32 %1, void (i32*)* nonnull @increment)			%3 = call i32 @func(i32* %0, i32 %1, void (i32*)* nonnull @increment)
	; CHECK: [[TMP4:%.*]] = call i32 @func.1(i32* [[TMP0]], i32 [[TMP3]])			; CHECK: call void @func.1(i32* [[TMP0]], i32 0)
	%4 = call i32 @func(i32* %0, i32 %3, void (i32*)* nonnull @decrement)			%4 = call i32 @func(i32* %0, i32 %3, void (i32*)* nonnull @decrement)
				; CHECK: ret i32 0
	ret i32 %4			ret i32 %4
	}			}

	; CHECK: @func.1(			; CHECK: @func.1(
	; CHECK: [[TMP3:%.*]] = alloca i32, align 4			; CHECK: [[TMP3:%.*]] = alloca i32, align 4
	; CHECK: store i32 [[TMP1:%.*]], i32* [[TMP3]], align 4			; CHECK: store i32 [[TMP1:%.*]], i32* [[TMP3]], align 4
	; CHECK: [[TMP4:%.*]] = load i32, i32* [[TMP3]], align 4			; CHECK: [[TMP4:%.*]] = load i32, i32* [[TMP3]], align 4
	; CHECK: [[TMP5:%.*]] = icmp slt i32 [[TMP4]], 1			; CHECK: [[TMP5:%.*]] = icmp slt i32 [[TMP4]], 1
	; CHECK: br i1 [[TMP5]], label [[TMP13:%.*]], label [[TMP6:%.*]]			; CHECK: br i1 [[TMP5]], label [[TMP13:%.*]], label [[TMP6:%.*]]
	; CHECK: 6:			; CHECK: 6:
	; CHECK: [[TMP7:%.*]] = load i32, i32* [[TMP3]], align 4			; CHECK: [[TMP7:%.*]] = load i32, i32* [[TMP3]], align 4
	; CHECK: [[TMP8:%.*]] = sext i32 [[TMP7]] to i64			; CHECK: [[TMP8:%.*]] = sext i32 [[TMP7]] to i64
	; CHECK: [[TMP9:%.*]] = getelementptr inbounds i32, i32* [[TMP0:%.*]], i64 [[TMP8]]			; CHECK: [[TMP9:%.*]] = getelementptr inbounds i32, i32* [[TMP0:%.*]], i64 [[TMP8]]
	; CHECK: call void @decrement(i32* [[TMP9]])			; CHECK: call void @decrement(i32* [[TMP9]])
	; CHECK: [[TMP10:%.*]] = load i32, i32* [[TMP3]], align 4			; CHECK: [[TMP10:%.*]] = load i32, i32* [[TMP3]], align 4
	; CHECK: [[TMP11:%.*]] = add nsw i32 [[TMP10]], -1			; CHECK: [[TMP11:%.*]] = add nsw i32 [[TMP10]], -1
	; CHECK: [[TMP12:%.*]] = call i32 @func.1(i32* [[TMP0]], i32 [[TMP11]])			; CHECK: call void @func.1(i32* [[TMP0]], i32 [[TMP11]])
	; CHECK: br label [[TMP13]]			; CHECK: br label [[TMP12:%.*]]
	; CHECK: 13:			; CHECK: 12:
	; CHECK: ret i32 0			; CHECK: ret void
	;			;
	;			;
	; CHECK: @func.2(			; CHECK: @func.2(
	; CHECK: [[TMP3:%.*]] = alloca i32, align 4			; CHECK: [[TMP3:%.*]] = alloca i32, align 4
	; CHECK: store i32 [[TMP1:%.*]], i32* [[TMP3]], align 4			; CHECK: store i32 [[TMP1:%.*]], i32* [[TMP3]], align 4
	; CHECK: [[TMP4:%.*]] = load i32, i32* [[TMP3]], align 4			; CHECK: [[TMP4:%.*]] = load i32, i32* [[TMP3]], align 4
	; CHECK: [[TMP5:%.*]] = icmp slt i32 [[TMP4]], 1			; CHECK: [[TMP5:%.*]] = icmp slt i32 [[TMP4]], 1
	; CHECK: br i1 [[TMP5]], label [[TMP13:%.*]], label [[TMP6:%.*]]			; CHECK: br i1 [[TMP5]], label [[TMP13:%.*]], label [[TMP6:%.*]]
	; CHECK: 6:			; CHECK: 6:
	; CHECK: [[TMP7:%.*]] = load i32, i32* [[TMP3]], align 4			; CHECK: [[TMP7:%.*]] = load i32, i32* [[TMP3]], align 4
	; CHECK: [[TMP8:%.*]] = sext i32 [[TMP7]] to i64			; CHECK: [[TMP8:%.*]] = sext i32 [[TMP7]] to i64
	; CHECK: [[TMP9:%.*]] = getelementptr inbounds i32, i32* [[TMP0:%.*]], i64 [[TMP8]]			; CHECK: [[TMP9:%.*]] = getelementptr inbounds i32, i32* [[TMP0:%.*]], i64 [[TMP8]]
	; CHECK: call void @increment(i32* [[TMP9]])			; CHECK: call void @increment(i32* [[TMP9]])
	; CHECK: [[TMP10:%.*]] = load i32, i32* [[TMP3]], align 4			; CHECK: [[TMP10:%.*]] = load i32, i32* [[TMP3]], align 4
	; CHECK: [[TMP11:%.*]] = add nsw i32 [[TMP10]], -1			; CHECK: [[TMP11:%.*]] = add nsw i32 [[TMP10]], -1
	; CHECK: [[TMP12:%.*]] = call i32 @func.2(i32* [[TMP0]], i32 [[TMP11]])			; CHECK: call void @func.2(i32* [[TMP0]], i32 [[TMP11]])
	; CHECK: br label [[TMP13]]			; CHECK: br label [[TMP12:%.*]]
	; CHECK: ret i32 0			; CHECK: 12:
				; CHECK: ret void
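
As a reading aid for the CHECK lines above, here is a rough Python analogue of what specializing `@func` on its function-pointer argument amounts to. It is only a sketch: the body of `func` is inferred from the visible CHECK lines (part of the test is elided in this view), pointers are modelled as a list plus an index, and the names `func_1`, `func_2`, `main_generic` and `main_specialized` are illustrative stand-ins rather than anything LLVM emits.

```python
def increment(cells, i):
    """Stand-in for @increment: add 1 to the pointed-to element."""
    cells[i] += 1


def decrement(cells, i):
    """Stand-in for @decrement: subtract 1 from the pointed-to element."""
    cells[i] -= 1


def func(cells, n, f):
    """Generic version: the callee f is a run-time argument."""
    if n < 1:
        return 0
    f(cells, n)
    return func(cells, n - 1, f)


def func_1(cells, n):
    """Clone with f bound to decrement (compare the @func.1 CHECK block)."""
    if n < 1:
        return
    decrement(cells, n)
    func_1(cells, n - 1)


def func_2(cells, n):
    """Clone with f bound to increment (compare the @func.2 CHECK block)."""
    if n < 1:
        return
    increment(cells, n)
    func_2(cells, n - 1)


def main_generic(cells, n):
    """Shape of @main before the transformation."""
    r = func(cells, n, increment)
    return func(cells, r, decrement)


def main_specialized(cells, n):
    """Shape of @main in the updated CHECK lines: the clones return nothing,
    the second call receives the constant 0, and main returns 0."""
    func_2(cells, n)
    func_1(cells, 0)
    return 0
```

Both versions of `main` leave the list in the same state and return 0, which is the property the updated `ret i32 0` and `i32 0` CHECK lines rely on.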

llvm/test/Transforms/FunctionSpecialization/function-specialization3.ll

	; RUN: opt -function-specialization -func-specialization-avg-iters-cost=3 -S < %s | \			; RUN: opt -ipsccp -specialize-functions -func-specialization-avg-iters-cost=3 -S < %s | \
	; RUN: FileCheck %s --check-prefixes=COMMON,DISABLED			; RUN: FileCheck %s --check-prefixes=COMMON,DISABLED
	; RUN: opt -function-specialization -force-function-specialization -S < %s | \			; RUN: opt -ipsccp -specialize-functions -force-function-specialization -S < %s | \
	; RUN: FileCheck %s --check-prefixes=COMMON,FORCE			; RUN: FileCheck %s --check-prefixes=COMMON,FORCE
	; RUN: opt -function-specialization -func-specialization-avg-iters-cost=3 -force-function-specialization -S < %s | \			; RUN: opt -ipsccp -specialize-functions -func-specialization-avg-iters-cost=3 -force-function-specialization -S < %s | \
	; RUN: FileCheck %s --check-prefixes=COMMON,FORCE			; RUN: FileCheck %s --check-prefixes=COMMON,FORCE

	; Test for specializing a constant global.			; Test for specializing a constant global.

	target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"			target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"

	@A = external dso_local constant i32, align 4			@A = external dso_local constant i32, align 4
	@B = external dso_local constant i32, align 4			@B = external dso_local constant i32, align 4

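The three RUN lines above share one test file but enable different prefix sets (`COMMON,DISABLED` and `COMMON,FORCE`). The sketch below is a simplified model of how a prefix set selects directive lines. It is not FileCheck's implementation, and the sample directives are hypothetical because the body of this test is elided here.

```python
import re


def active_directives(test_text, prefixes):
    """Return the directive lines enabled by the given check prefixes.

    Simplified model: a directive is PREFIX or PREFIX-SUFFIX followed by a
    colon, for example FORCE: or COMMON-LABEL:.
    """
    alternatives = "|".join(re.escape(p) for p in prefixes)
    pattern = re.compile(r"(?<![\w-])(?:%s)(?:-[A-Z]+)?:" % alternatives)
    return [line.strip() for line in test_text.splitlines()
            if pattern.search(line)]


# Hypothetical directives, for illustration only.
SAMPLE = """\
; COMMON-LABEL: define i32 @main(
; DISABLED-NOT: @foo.1(
; FORCE: @foo.1(
"""
```

With `["COMMON", "FORCE"]` the `DISABLED-NOT` line is ignored, and with `["COMMON", "DISABLED"]` the `FORCE` line is ignored, so one file can cover both the default and the forced behaviour.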
llvm/test/Transforms/FunctionSpecialization/function-specialization4.ll

	; RUN: opt -function-specialization -force-function-specialization \			; RUN: opt -ipsccp -specialize-functions -force-function-specialization \
	; RUN: -func-specialization-max-clones=2 -S < %s | FileCheck %s			; RUN: -func-specialization-max-clones=2 -S < %s | FileCheck %s

	; RUN: opt -function-specialization -force-function-specialization \			; RUN: opt -ipsccp -specialize-functions -force-function-specialization \
	; RUN: -func-specialization-max-clones=1 -S < %s | FileCheck %s --check-prefix=CONST1			; RUN: -func-specialization-max-clones=1 -S < %s | FileCheck %s --check-prefix=CONST1

	target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"			target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"

	@A = external dso_local constant i32, align 4			@A = external dso_local constant i32, align 4
	@B = external dso_local constant i32, align 4			@B = external dso_local constant i32, align 4
	@C = external dso_local constant i32, align 4			@C = external dso_local constant i32, align 4
	@D = external dso_local constant i32, align 4			@D = external dso_local constant i32, align 4

llvm/test/Transforms/FunctionSpecialization/function-specialization5.ll

	; RUN: opt -function-specialization -force-function-specialization -S < %s | FileCheck %s			; RUN: opt -ipsccp -specialize-functions -force-function-specialization -S < %s | FileCheck %s

	; There's nothing to specialize here as both calls are the same, so check that:			; There's nothing to specialize here as both calls are the same, so check that:
	;			;
	; CHECK-NOT: define internal i32 @foo.1(			; CHECK-NOT: define internal i32 @foo.1(
	; CHECK-NOT: define internal i32 @foo.2(			; CHECK-NOT: define internal i32 @foo.2(

	target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"			target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"


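The RUN-line changes in these tests follow one mechanical pattern: the standalone pass flag is replaced by `-ipsccp -specialize-functions`. A minimal sketch of that rewrite follows, assuming whitespace-separated flags. The helper name `migrate_run_line` is hypothetical and is not part of the patch. The rewrite works on whole tokens because `-force-function-specialization` contains `-function-specialization` as a substring, so a plain string replace would corrupt it.

```python
import re

# Old flag spellings mapped to the new ones seen in the updated RUN lines.
FLAG_REWRITES = {
    "-function-specialization": "-ipsccp -specialize-functions",
    "-passes=function-specialization": "-passes=ipsccp -specialize-functions",
}


def migrate_run_line(line):
    """Rewrite pass flags on a lit RUN line; leave every other line alone."""
    if "RUN:" not in line:
        return line
    # Split on whitespace but keep the separators so spacing is preserved.
    tokens = re.split(r"(\s+)", line)
    return "".join(FLAG_REWRITES.get(tok, tok) for tok in tokens)
```

For example, `migrate_run_line("; RUN: opt -function-specialization -force-function-specialization -S < %s | FileCheck %s")` yields the right-hand RUN line shown for function-specialization5.ll.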
llvm/test/Transforms/FunctionSpecialization/identical-specializations.ll

	; NOTE: Assertions have been autogenerated by utils/update_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
	; RUN: opt -passes=function-specialization -force-function-specialization -S < %s | FileCheck %s			; RUN: opt -passes=ipsccp -specialize-functions -force-function-specialization -S < %s | FileCheck %s

	define i64 @main(i64 %x, i64 %y, i1 %flag) {			define i64 @main(i64 %x, i64 %y, i1 %flag) {
	; CHECK-LABEL: @main(			; CHECK-LABEL: @main(
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
	; CHECK-NEXT: br i1 [[FLAG:%.*]], label [[PLUS:%.*]], label [[MINUS:%.*]]			; CHECK-NEXT: br i1 [[FLAG:%.*]], label [[PLUS:%.*]], label [[MINUS:%.*]]
	; CHECK: plus:			; CHECK: plus:
	; CHECK-NEXT: [[CMP0:%.*]] = call i64 @compute.1(i64 [[X:%.*]], i64 [[Y:%.*]], i64 (i64, i64)* @plus, i64 (i64, i64)* @minus)			; CHECK-NEXT: [[CMP0:%.*]] = call i64 @compute.1(i64 [[X:%.*]], i64 [[Y:%.*]], i64 (i64, i64)* @plus, i64 (i64, i64)* @minus)
	; CHECK-NEXT: br label [[MERGE:%.*]]			; CHECK-NEXT: br label [[MERGE:%.*]]

llvm/test/Transforms/FunctionSpecialization/remove-dead-recursive-function.ll

	; RUN: opt -function-specialization -func-specialization-size-threshold=3 -S < %s | FileCheck %s			; RUN: opt -ipsccp -specialize-functions -func-specialization-size-threshold=3 -S < %s | FileCheck %s

	define i64 @main(i64 %x, i1 %flag) {			define i64 @main(i64 %x, i1 %flag) {
	entry:			entry:
	br i1 %flag, label %plus, label %minus			br i1 %flag, label %plus, label %minus

	plus:			plus:
	%tmp0 = call i64 @compute(i64 %x, i64 (i64)* @plus)			%tmp0 = call i64 @compute(i64 %x, i64 (i64)* @plus)
	br label %merge			br label %merge

llvm/test/Transforms/FunctionSpecialization/specialize-multiple-arguments.ll

	; NOTE: Assertions have been autogenerated by utils/update_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
	; RUN: opt -function-specialization -func-specialization-max-clones=0 -func-specialization-size-threshold=14 -S < %s | FileCheck %s --check-prefix=NONE			; RUN: opt -ipsccp -specialize-functions -func-specialization-max-clones=0 -func-specialization-size-threshold=14 -S < %s | FileCheck %s --check-prefix=NONE
	; RUN: opt -function-specialization -func-specialization-max-clones=1 -func-specialization-size-threshold=14 -S < %s | FileCheck %s --check-prefix=ONE			; RUN: opt -ipsccp -specialize-functions -func-specialization-max-clones=1 -func-specialization-size-threshold=14 -S < %s | FileCheck %s --check-prefix=ONE
	; RUN: opt -function-specialization -func-specialization-max-clones=2 -func-specialization-size-threshold=14 -S < %s | FileCheck %s --check-prefix=TWO			; RUN: opt -ipsccp -specialize-functions -func-specialization-max-clones=2 -func-specialization-size-threshold=14 -S < %s | FileCheck %s --check-prefix=TWO
	; RUN: opt -function-specialization -func-specialization-max-clones=3 -func-specialization-size-threshold=14 -S < %s | FileCheck %s --check-prefix=THREE			; RUN: opt -ipsccp -specialize-functions -func-specialization-max-clones=3 -func-specialization-size-threshold=14 -S < %s | FileCheck %s --check-prefix=THREE

	; Make sure that we iterate correctly after sorting the specializations:			; Make sure that we iterate correctly after sorting the specializations:
	; FnSpecialization: Specializations for function compute			; FnSpecialization: Specializations for function compute
	; FnSpecialization: Gain = 608			; FnSpecialization: Gain = 608
	; FnSpecialization: FormalArg = binop1, ActualArg = power			; FnSpecialization: FormalArg = binop1, ActualArg = power
	; FnSpecialization: FormalArg = binop2, ActualArg = mul			; FnSpecialization: FormalArg = binop2, ActualArg = mul
	; FnSpecialization: Gain = 982			; FnSpecialization: Gain = 982
	; FnSpecialization: FormalArg = binop1, ActualArg = plus			; FnSpecialization: FormalArg = binop1, ActualArg = plus

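The comment block in specialize-multiple-arguments.ll lists candidate specializations with their gains, and the RUN lines vary `-func-specialization-max-clones` from 0 to 3. One plausible reading, stated here as an assumption rather than as the pass's actual code, is that candidates are ranked by gain and only the best `max-clones` of them are kept. A minimal sketch of that selection, using the two gains visible above (further candidates and bindings are elided in this view and deliberately left out):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One potential specialization: its gain and (formal, actual) bindings."""
    gain: int
    bindings: tuple


def pick_specializations(candidates, max_clones):
    """Keep the max_clones highest-gain candidates (assumed policy)."""
    # sorted() is stable, so equal-gain candidates keep their input order.
    ranked = sorted(candidates, key=lambda c: c.gain, reverse=True)
    return ranked[:max_clones]


# Gains and bindings as far as they are visible in the test comments.
CANDIDATES = [
    Candidate(608, (("binop1", "power"), ("binop2", "mul"))),
    Candidate(982, (("binop1", "plus"),)),  # remaining bindings elided in the source
]
```

Under this assumed policy a limit of 0 selects nothing and a limit of 1 selects the gain-982 candidate, which is the kind of ordering the NONE and ONE prefixes are set up to distinguish.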
llvm/utils/gn/secondary/llvm/lib/Transforms/Scalar/BUILD.gn

	static_library("Scalar") {			static_library("Scalar") {
	output_name = "LLVMScalarOpts"			output_name = "LLVMScalarOpts"
	deps = [			deps = [
	"//llvm/include/llvm/Config:llvm-config",			"//llvm/include/llvm/Config:llvm-config",
	"//llvm/lib/Analysis",			"//llvm/lib/Analysis",
	"//llvm/lib/IR",			"//llvm/lib/IR",
	"//llvm/lib/Support",			"//llvm/lib/Support",
	"//llvm/lib/Transforms/AggressiveInstCombine",			"//llvm/lib/Transforms/AggressiveInstCombine",
				"//llvm/lib/Transforms/IPO",
	"//llvm/lib/Transforms/InstCombine",			"//llvm/lib/Transforms/InstCombine",
	"//llvm/lib/Transforms/Utils",			"//llvm/lib/Transforms/Utils",
	]			]
	sources = [			sources = [
	"ADCE.cpp",			"ADCE.cpp",
	"AlignmentFromAssumptions.cpp",			"AlignmentFromAssumptions.cpp",
	"AnnotationRemarks.cpp",			"AnnotationRemarks.cpp",
	"BDCE.cpp",			"BDCE.cpp",

Revision Contents

Diff 469041
