This is an archive of the discontinued LLVM Phabricator instance.

Differential D36432

[IPSCCP] Add function specialization ability
Abandoned, Public

Authored by mssimpso on Aug 7 2017, 3:11 PM.

Details

Reviewers

davide
dberlin
chandlerc
efriedma

Summary

This patch adds function specialization to IPSCCP. After an initial IPSCCP data-flow is solved, we may find that a function argument can take on a particular constant value in some contexts. If this is the case, we can create a new version of the function in which the argument is replaced by the constant value and then recompute the data-flow. This allows constant propagation to cross function boundaries when an argument can take on more than one value.

Specialization is controlled by a goal-oriented heuristic that seeks to predict if replacing an argument with a particular constant value would result in any significant optimization opportunities. Currently, we limit this heuristic to exposing inlining opportunities via indirect call promotion, but the kinds of optimizations we look for can be extended in the future.
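To make the transformation concrete, here is a minimal source-level sketch in C++ of what specializing on a function-pointer argument amounts to. It is an illustration only, not code from the patch: `reduce`, `reduce_add`, `add`, `mul`, `sum`, and `product` are hypothetical names, and the clone is written by hand where the pass would create it in IR.

```cpp
#include <cassert>

// All names here are hypothetical; nothing below is taken from the patch.
using BinaryOp = int (*)(int, int);

inline int add(int a, int b) { return a + b; }
inline int mul(int a, int b) { return a * b; }

// Generic version: `op` takes more than one value across the call sites,
// so plain constant propagation cannot replace it and the call through it
// stays indirect.
int reduce(const int *data, int n, int init, BinaryOp op) {
  assert(op != nullptr);
  int acc = init;
  for (int i = 0; i < n; ++i)
    acc = op(acc, data[i]); // indirect call
  return acc;
}

// Hand-written stand-in for the clone that specialization would create for
// call sites passing `add`: the argument is replaced by the constant, so the
// call becomes direct and visible to an inliner.
int reduce_add(const int *data, int n, int init) {
  int acc = init;
  for (int i = 0; i < n; ++i)
    acc = add(acc, data[i]); // direct call
  return acc;
}

// Call site rewritten to use the specialization.
int sum(const int *data, int n) { return reduce_add(data, n, 0); }

// Call site left on the generic version.
int product(const int *data, int n) { return reduce(data, n, 1, mul); }
```

The call inside `reduce_add` is direct, which is the kind of inlining opportunity the heuristic is looking for; the price is a second copy of the loop body.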

Specialization can be disabled by -ipsccp-enable-function-specialization=false.

Function cloning in constant propagation was mentioned on llvm-dev recently, so I thought I would throw this up for review. I've added a simple demonstration test, but we can add more tests to the patch if the approach looks good.

Long ago, there was a separate partial specialization pass. However, this pass was removed (in r123554 and r152759) apparently due to known bugs and lack of maintenance.

Please take a look.

Diff Detail

Build Status

Buildable 9107
Build 9107: arc lint + arc unit

Event Timeline

mssimpso created this revision. Aug 7 2017, 3:11 PM

Herald added subscribers: eraman, mehdi_amini. Aug 7 2017, 3:11 PM

fhahn added a subscriber: fhahn. Aug 7 2017, 3:46 PM

Very neat work, thanks, this was in my todolist for a while :)
Some meta comments, I'll try to find some time to review this more carefully this week.

I wonder if we're at a point where it makes sense to split the lattice solver from the transforms, as with your partial specialization code SCCP is growing a lot :)
This would have also a bunch of other advantages, e.g. we could more easily try to plug arbitrary lattices (for example for variables/ranges/bits).
IIRC Chris Lattner started a propagation engine a while ago, it should still be in tree, but unused. What are your thoughts?

That said, the approach I had in mind (as mentioned on llvm-dev) was that of:

- Implementing "real" jump functions for IPSCCP. To the best of my understanding complicated jump functions never got traction (so, e.g. GCC implements only constant and passthrough).
- Propagating the information down the DAG of SCCs (of the callgraph). From what I can see your approach is instead iterative, so every time you specialize something, you do another round of solving. I'm not sure whether this converges more slowly, or if it matters in practice; it's probably worth taking some numbers.

I think your heuristic is on the right track (it's the same one I used in an early prototype), but it would be nice to collect numbers to tune it further.

I'm a little bit reluctant about having this enabled by default (in the beginning), as unfortunately cloning can have catastrophic consequences (I had a chat with the GCC folks where they reported substantial growth in the executable sizes for large programs without real speedups, and I also had an early prototype that confirmed this thesis).

Thanks again for working on this!

The way this is written, there isn't much point to sticking the code into IPSCCP; you could just as easily make this a separate pass which runs afterwards. (The only thing you're getting is constant propagation inside the cloned function, which you could just as easily run separately.)

lib/Transforms/Scalar/SCCP.cpp
2279	This loop doesn't work the way you want it to. IPSCCP isn't actually complete until the "while (ResolvedUndefs)" loop finishes, and specializing based on a partially unsolved Solver probably won't give you the results you want.

In D36432#835343, @davide wrote:

Very neat work, thanks, this was in my todolist for a while :)
Some meta comments, I'll try to find some time to review this more carefully this week.

Thanks for taking a look!

I wonder if we're at a point where it makes sense to split the lattice solver from the transforms, as with your partial specialization code SCCP is growing a lot :)
This would have also a bunch of other advantages, e.g. we could more easily try to plug arbitrary lattices (for example for variables/ranges/bits).
IIRC Chris Lattner started a propagation engine a while ago, it should still be in tree, but unused. What are your thoughts?

Yes, I think I see what you're talking about. Analysis/SparsePropagation.h? I haven't looked at this closely yet, but I can see some advantages to using it. Any idea why SCCP hasn't already been switched over to use it?

That said, the approach I had in mind (as mentioned on llvm-dev) was that of:

implementing "real" jump functions for IPSCCP. To the best of my understanding complicated jump functions never got traction (so, e.g. GCC implements only constant and passthrough).

Propagate the information down the DAG of SCC (of the callgraph). From what I can see your approach is instead iterative so every time you specialize something, you do another round of solving. I'm not sure whether this converges more slowly, or if it matters in practice, probably worth taking some numbers.

Right, I've incorporated specialization as an iterative solve. I'm not an expert in this area, but the iterative approach seemed to be the most straightforward extension of our current IPSCCP implementation (which seems to be more-or-less the Wegman, Zadeck approach). It makes us iterate, but we only need to visit the specializations on subsequent iterations.

(In the case of indirect call promotion, we could also revisit the called functions, but the current patch doesn't do this. For example, after call promotion, a function may no longer be address-taken, enabling us to then track its arguments. We would have to reset their lattice state, though, since they would have already made it to overdefined.)

Am I right in thinking the comment about jump functions is about our IPSCCP infrastructure in general, rather than specialization specifically? If I understand correctly, jump functions (like in Callahan et al.) are a trade-off between the precision of the analysis and the time/space needed to solve it. They summarize functions and then propagate the summaries through call sites. I think our current IPSCCP implementation should (at least in theory) be more precise than this. For example, if we find out via constant propagation that a call is not executed, this could enable more propagation/specialization, etc. But again, this is a trade-off since we may have to revisit a large portion of the module. I would definitely appreciate your thoughts.

I think your heuristic is on the right track (it's the same one I used in an early prototype), but it would be nice to collect numbers to tune it further.

Here are some numbers (AArch64, LTO). It looks like the current tuning is fairly conservative for the LLVM test suite and SPEC. We only hit a handful of the benchmarks, and of the ones we do, spec2017/mcf is the big winner (16% faster). MultiSource/Benchmarks/McCat/12-IOtest is also improved. I would expect us to hit more benchmarks if/when we consider other optimization opportunities beyond indirect call promotion. The table below shows the performance increase, the code size increase, the number of specialized functions, and the number of specializations of those functions. It also shows the number of extra iterations of the IPSCCP solve loop that were performed due to specialization. (I'm in Phabricator, so hopefully this table will look OK.)

Benchmark	Perf % (+ faster)	Size % (+ larger)	# Added Solver Iters	# Specializations	# Specialized Funcs
llvm/MultiSource/Applications/SPASS	0.30	0.03	2	18	6
llvm/MultiSource/Applications/lua	0.00	0.04	1	2	1
llvm/MultiSource/Applications/siod	-0.17	0.21	0	2	1
llvm/MultiSource/Benchmarks/McCat/12-IOtest	7.04	-7.20	2	18	1
llvm/MultiSource/Benchmarks/Ptrdist/bc	0.11	1.30	1	1	1
spec2000/perlbmk	-0.25	-0.00	1	3	1
spec2006/gcc	0.29	0.32	1	43	6
spec2006/gobmk	0.03	0.10	2	4	3
spec2006/hmmer	0.00	0.57	1	5	1
spec2017/blender	0.58	0.84	14	1441	44
spec2017/gcc	0.08	1.54	8	141	22
spec2017/imagick	-0.44	-0.06	1	5	1
spec2017/mcf	16.01	19.26	1	2	1
spec2017/parest	0.09	-0.02	0	1	1
spec2017/perlbench	-0.44	0.15	2	5	2
spec2017/povray	-0.28	2.19	1	5	1
spec2017/xz	0.31	0.91	2	5	3

I'm a little bit reluctant about having this enabled by default (in the beginning), as unfortunately cloning can have catastrophic consequences (I had a chat with the GCC folks where they reported substantial growth in the executable sizes for large programs without real speedups, and I also had an early prototype that confirmed this thesis).

Thanks again for working on this!

Makes sense. Hopefully folks could try out whatever we end up with on some larger programs.

In D36432#835608, @efriedma wrote:

The way this is written, there isn't much point to sticking the code into IPSCCP; you could just as easily make this a separate pass which runs afterwards. (The only thing you're getting is constant propagation inside the cloned function, which you could just as easily run separately.)

Thanks for the comments! The old specialization pass was indeed separate. I think (long term) it makes sense to incorporate this into IPSCCP. It could be the case that specialization enables further propagation, which then enables additional specialization, etc. I don't think we could get that kind of behavior out of a separate pass. But if we only want to care about indirect call promotion, we could probably achieve the same thing with a separate pass that runs before constant propagation.

lib/Transforms/Scalar/SCCP.cpp
2279	I'm not sure this is really a problem. Sure IPSCCP isn't yet complete, but `ResolvedUndefsIn` only moves values from unknown to constant. If an actual argument does become constant while resolving undefs, we should analyze the corresponding (formal argument, constant) pair in `getSpecializationBonus` the next time we visit the function when looking for specialization opportunities.

mcrosier added inline comments. Aug 9 2017, 11:35 AM

lib/Transforms/Scalar/SCCP.cpp
249	used -> use
2023	The inline cost model already takes into consideration cold call sites, but I was wondering if it would make sense to use this information directly during function specialization. For example, you could filter out cold calls and avoid the overhead of computing the inline cost in cases when you think inlining likely won't happen or where specialization isn't likely to be beneficial. Of course, the inline cost computation will bail sooner with a lower threshold (due to the call site being cold), so you might not be saving too much. Just a thought.

It could be the case that specialization enables further propagation, which then enables additional specialization, etc.

This doesn't really happen at the moment. After you specialize a function, you aren't resetting the solver, so everything that's overdefined will stay overdefined.

lib/Transforms/Scalar/SCCP.cpp
2279	Moving a value from undef to constant can move dependent values from constant to overdefined. So you could end up cloning a function, and then fail to actually do any useful constant propagation into the cloned function.
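The effect described here can be seen on a toy three-level lattice. The sketch below is a deliberate simplification, not the solver's actual `LatticeVal`: `Lattice`, `merge`, `unknown`, `constant`, and `constantValue` are hypothetical names, and a phi is modeled as a plain merge of its incoming values.

```cpp
#include <cassert>

// Toy constant-propagation lattice: Unknown < Constant < Overdefined.
enum class Kind { Unknown, Constant, Overdefined };

struct Lattice {
  Kind kind;
  int value; // meaningful only when kind == Kind::Constant
};

Lattice unknown() { return {Kind::Unknown, 0}; }
Lattice constant(int v) { return {Kind::Constant, v}; }

// Read the constant out of a value that is known to be constant.
int constantValue(Lattice l) {
  assert(l.kind == Kind::Constant);
  return l.value;
}

// Merge two incoming values, as a phi would.
// Unknown is the identity; two different constants give Overdefined.
Lattice merge(Lattice a, Lattice b) {
  if (a.kind == Kind::Unknown)
    return b;
  if (b.kind == Kind::Unknown)
    return a;
  if (a.kind == Kind::Constant && b.kind == Kind::Constant &&
      a.value == b.value)
    return a;
  return {Kind::Overdefined, 0};
}
```

While one incoming value is still unknown, `merge(unknown(), constant(5))` looks like the constant 5. Once that operand is resolved to, say, 0, the same merge becomes `merge(constant(0), constant(5))` and the dependent value drops to overdefined, so a clone made on the strength of the earlier state may have nothing left to propagate.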

In D36432#837082, @efriedma wrote:

It could be the case that specialization enables further propagation, which then enables additional specialization, etc.

This doesn't really happen at the moment. After you specialize a function, you aren't resetting the solver, so everything that's overdefined will stay overdefined.

That's right. I think I mentioned this in my response to Davide's comments. This is a current limitation, but is something I think we will want to work towards.

lib/Transforms/Scalar/SCCP.cpp
2279	Ah, right. I think I see what you mean now.

In D36432#836883, @mssimpso wrote:

In D36432#835343, @davide wrote:

Very neat work, thanks, this was in my todolist for a while :)
Some meta comments, I'll try to find some time to review this more carefully this week.

Thanks for taking a look!

I wonder if we're at a point where it makes sense to split the lattice solver from the transforms, as with your partial specialization code SCCP is growing a lot :)
This would have also a bunch of other advantages, e.g. we could more easily try to plug arbitrary lattices (for example for variables/ranges/bits).
IIRC Chris Lattner started a propagation engine a while ago, it should still be in tree, but unused. What are your thoughts?

Yes, I think I see what you're talking about. Analysis/SparsePropagation.h? I haven't looked at this closely yet, but I can see some advantages to using it. Any idea why SCCP hasn't already been switched over to use it?

Same as anything else - lack of time for someone ;)

We moved GCC's SCCP to a generic sparse propagation engine.
As for performance - i'm with Davide.
Cloning as the end transformation is nice, but it also tends to be a harder cost model to get right.

GCC was unable to get this right in practice. Most of the benefit from IPSCCP is producing range info for each context.

Am I right in thinking the comment about jump functions is about our IPSCCP infrastructure in general, rather than specialization specifically? If I understand correctly, jump functions (like in Callahan et al.) are a trade-off between the precision of the analysis and the time/space needed to solve it. They summarize functions and then propagate the summaries through call sites. I think our current IPSCCP implementation should (at least in theory) be more precise than this. For example, if we find out via constant propagation that a call is not executed, this could enable more propagation/specialization, etc. But again, this is a trade-off since we may have to revisit a large portion of the module. I would definitely appreciate your thoughts.

Almost right. Our modeling should be more precise than *most* jump functions. But not all, i believe. I think what we are doing is equivalent to passthrough in reality [1]. Essentially, we have extended the constant prop lattice, for call functions, to do passthrough of arguments and whatever their value is, and not just constants.
But you can imagine a more powerful jump function. For example, the polynomial one. It can be viewed as "extending the constant prop lattice for formal parameters to be polynomial functions", not just constants.

So even if foo(a) and foo(b) is foo(2) and foo(4), maybe in reality it could be expressed as foo(argument * 2) or something. We would not get this. The polynomial jump function would.
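A small sketch of the difference, under simplifying assumptions: a jump function is modeled as a summary of how a call site's actual argument depends on one formal parameter of the caller, and a linear form stands in for the general polynomial one. All names (`Value`, `JumpFunction`, `evaluate`) are hypothetical, and none of this reflects GCC's or LLVM's actual data structures.

```cpp
#include <cassert>

// A possibly-unknown integer value.
struct Value {
  bool known;
  int value;
};

// Summary of one actual argument at a call site, expressed in terms of one
// formal parameter of the caller.
struct JumpFunction {
  enum Kind { Unknown, Constant, PassThrough, Linear } kind;
  int scale;  // Linear: multiplier applied to the caller's formal
  int offset; // Constant: the value itself; Linear: the addend
};

// Compute the actual argument given what is known about the caller's formal.
Value evaluate(const JumpFunction &jf, Value formal) {
  switch (jf.kind) {
  case JumpFunction::Constant:
    return {true, jf.offset};
  case JumpFunction::PassThrough:
    return formal;
  case JumpFunction::Linear:
    if (!formal.known)
      return {false, 0};
    return {true, jf.scale * formal.value + jf.offset};
  case JumpFunction::Unknown:
    break;
  }
  assert(jf.kind == JumpFunction::Unknown);
  return {false, 0};
}
```

For a call such as `foo(2 * x)` where the caller's `x` is known to be 3, a summary limited to constants and pass-through has to give up (`Unknown`), while `Linear` with scale 2 and offset 0 recovers the value 6.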

So, you are right that as implemented by most(all?) compilers, they use passthrough, and we use passthrough, so we're the same.

[1] I believe what we do can be proven equivalent to passthrough, because we merge the state of incoming arguments from the call sites parameters, without changing the lattice value. If all arguments just pass through, we will pass through the state. IE Without us doing anything else, given a call chain of arguments of foo(a) ->bar(a)->bob(a), a will remain underdefined in our algorithm.

Note: Wegman's SCCP paper covers a variant of IPSCCP.
https://www.cs.utexas.edu/users/lin/cs380c/wegman.pdf
They just link all the SSA procedures together, and run IPSCCP on it. What we do should be identical in practice, i believe.

It should be linear in the total size of the program (or we screwed up :P)
It also talks about integrating it with inlining.

In D36432#837245, @dberlin wrote:

In D36432#836883, @mssimpso wrote:

In D36432#835343, @davide wrote:

Very neat work, thanks, this was in my todolist for a while :)
Some meta comments, I'll try to find some time to review this more carefully this week.

Thanks for taking a look!

I wonder if we're at a point where it makes sense to split the lattice solver from the transforms, as with your partial specialization code SCCP is growing a lot :)
This would have also a bunch of other advantages, e.g. we could more easily try to plug arbitrary lattices (for example for variables/ranges/bits).
IIRC Chris Lattner started a propagation engine a while ago, it should still be in tree, but unused. What are your thoughts?

Yes, I think I see what you're talking about. Analysis/SparsePropagation.h? I haven't looked at this closely yet, but I can see some advantages to using it. Any idea why SCCP hasn't already been switched over to use it?

Same as anything else - lack of time for someone ;)

We moved GCC's SCCP to a generic sparse propagation engine.
As for performance - i'm with Davide.
Cloning as the end transformation is nice, but it also tends to be a harder cost model to get right.

A (somewhat) related bug came to my mind: https://bugs.llvm.org/show_bug.cgi?id=33253

GCC was unable to get this right in practice. Most of the benefit from IPSCCP is producing range info for each context.

Am I right in thinking the comment about jump functions is about our IPSCCP infrastructure in general, rather than specialization specifically? If I understand correctly, jump functions (like in Callahan et al.) are a trade-off between the precision of the analysis and the time/space needed to solve it. They summarize functions and then propagate the summaries through call sites. I think our current IPSCCP implementation should (at least in theory) be more precise than this. For example, if we find out via constant propagation that a call is not executed, this could enable more propagation/specialization, etc. But again, this is a trade-off since we may have to revisit a large portion of the module. I would definitely appreciate your thoughts.

Almost right. Our modeling should be more precise than *most* jump functions. But not all, i believe. I think what we are doing is equivalent to passthrough in reality [1]. Essentially, we have extended the constant prop lattice, for call functions, to do passthrough of arguments and whatever their value is, and not just constants.
But you can imagine a more powerful jump function. For example, the polynomial one. It can be viewed as "extending the constant prop lattice for formal parameters to be polynomial functions", not just constants.

So even if foo(a) and foo(b) is foo(2) and foo(4), maybe in reality it could be expressed as foo(argument * 2) or something. We would not get this. The polynomial jump function would.

So, you are right that as implemented by most(all?) compilers, they use passthrough, and we use passthrough, so we're the same.

Yes. When I read "the" paper about jump functions, https://scholarship.rice.edu/handle/1911/13733, I was under the impression that more sophisticated jump functions don't actually buy you much (and in a brief chat with David Callahan [who was around at the time] I got a confirmation). That said, this was a long time ago, and many things changed, so maybe things can be re-evaluated at some point.

FWIW, GCC doesn't even implement return jump functions (see e.g. https://godbolt.org/g/eahtHR ) and relies on inlining/cloning to get things right, so in this respect LLVM's constant propagation is actually stronger than GCC's [if I remember correctly the reason why they don't implement this is because it's a little complicated to handle the case of mutually recursive functions while walking down the DAG of SCCs, but if we ever go for real jump functions we should consider implementing return(s) as well].

[1] I believe what we do can be proven equivalent to passthrough, because we merge the state of incoming arguments from the call sites parameters, without changing the lattice value. If all arguments just pass through, we will pass through the state. IE Without us doing anything else, given a call chain of arguments of foo(a) ->bar(a)->bob(a), a will remain underdefined in our algorithm.

Note: Wegman's SCCP paper covers a variant of IPSCCP.
https://www.cs.utexas.edu/users/lin/cs380c/wegman.pdf
They just link all the SSA procedures together, and run IPSCCP on it. What we do should be identical in practice, i believe.

It should be linear in the total size of the program (or we screwed up :P)
It also talks about integrating it with inlining.

I agree this is a good path forward.

ashutosh.nema added a subscriber: ashutosh.nema. Aug 11 2017, 2:07 AM

Danny/Davide,

Thanks very much for the feedback. I have some replies to your comments below (abridged, so it's easier to read in Phab.), but I thought I would first summarize things so far. It sounds like your main points are that (1) the cost model for function specialization is difficult to get right in practice, and that (2) our current IPSCCP infrastructure could be improved to do better than pass-through for arguments and returns. I agree with both of these points. So do we think we should iterate on this patch and add function specialization to our current infrastructure?

In D36432#837245, @dberlin wrote:

Almost right. Our modeling should be more precise than *most* jump functions. But not all, i believe. I think what we are doing is equivalent to passthrough in reality [1]. Essentially, we have extended the constant prop lattice, for call functions, to do passthrough of arguments and whatever their value is, and not just constants.

Ah, that's right, we're doing the same thing as pass-through. We would probably need to significantly rework our analysis to do anything better (for ranges, etc.).

Note: Wegman's SCCP paper covers a variant of IPSCCP.
https://www.cs.utexas.edu/users/lin/cs380c/wegman.pdf
They just link all the SSA procedures together, and run IPSCCP on it. What we do should be identical in practice, i believe.

Yes, this is indeed the way we do it now, if I understand correctly.

In D36432#837721, @davide wrote:

A (somewhat) related bug came to my mind: https://bugs.llvm.org/show_bug.cgi?id=33253

Interesting. Thanks for the pointer!

I agree this is a good path forward.

I'm not sure if you're talking about the current patch or IPSCCP improvements in general :)

In D36432#839506, @mssimpso wrote:

Danny/Davide,

Thanks very much for the feedback. I have some replies to your comments below (abridged, so it's easier to read in Phab.), but I thought I would first summarize things so far. It sounds like your main points are that (1) the cost model for function specialization is difficult to get right in practice, and that (2) our current IPSCCP infrastructure could be improved to do better than pass-through for arguments and returns. I agree with both of these points. So do we think we should iterate on this patch and add function specialization to our current infrastructure?

I'm not opposed if we have cases where it matters.
But if we can't find these cases, it's not gonna be turned on by default, and then i think our time would be better spent elsewhere.
IE i don't think it will be horribly useful to have function specialization but not have it good enough to be worth it by default.
At that point, i think we'd be better off exploring different kinds of improvements (IE store function argument range info) to IPSCCP or different mechanisms of using the existing info (IE inlining).

In D36432#839572, @dberlin wrote:

In D36432#839506, @mssimpso wrote:

Danny/Davide,

Thanks very much for the feedback. I have some replies to your comments below (abridged, so it's easier to read in Phab.), but I thought I would first summarize things so far. It sounds like your main points are that (1) the cost model for function specialization is difficult to get right in practice, and that (2) our current IPSCCP infrastructure could be improved to do better than pass-through for arguments and returns. I agree with both of these points. So do we think we should iterate on this patch and add function specialization to our current infrastructure?

I'm not opposed if we have cases where it matters.
But if we can't find these cases, it's not gonna be turned on by default, and then i think our time would be better spent elsewhere.
IE i don't think it will be horribly useful to have function specialization but not have it good enough to be worth it by default.
At that point, i think we'd be better off exploring different kinds of improvements (IE store function argument range info) to IPSCCP or different mechanisms of using the existing info (IE inlining).

I agree. Looking at the biggest performance winners from the current patch (spec2017/mcf, in particular), most of the improvement is coming from inlining (enabled after the indirect call promotion that specialization allows). I wonder if there's a better way to achieve this benefit without specializing. For example, we could propagate constant sets indicating the functions indirect call sites could possibly target. Although we would probably want to limit the size of the sets to something small, the pass could attach the sets via metadata to the calls so that this information could be consumed by later passes. Such metadata could be used for indirect call promotion, intersecting the function attributes of the possible targets (i.e., in CallSite::hasFnAttr), etc. We could perform this here in SCCP or in a separate pass using the generic solver. What do you think?
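One way to picture the constant-set idea is a lattice whose elements are small sets of candidate callees, widening to overdefined once a size bound is exceeded. The sketch below is only that, a picture: `TargetSet`, `merge`, `single`, and the bound `MaxTargets` are hypothetical, and function names stand in for `Function *` values.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>

// Hypothetical bound on how many candidate callees are tracked per value.
constexpr std::size_t MaxTargets = 4;

// Lattice element: a small set of possible call targets, or "overdefined"
// when the set would grow past the bound.
struct TargetSet {
  bool overdefined = false;
  std::set<std::string> targets;
};

// A set containing exactly one known target.
TargetSet single(const std::string &name) {
  assert(!name.empty());
  TargetSet s;
  s.targets.insert(name);
  return s;
}

// Merge is set union, widened to overdefined when the bound is exceeded.
TargetSet merge(const TargetSet &a, const TargetSet &b) {
  TargetSet result;
  if (a.overdefined || b.overdefined) {
    result.overdefined = true;
    return result;
  }
  result.targets = a.targets;
  result.targets.insert(b.targets.begin(), b.targets.end());
  if (result.targets.size() > MaxTargets) {
    result.targets.clear();
    result.overdefined = true;
  }
  return result;
}
```

A later pass could then treat a non-overdefined set at an indirect call site as the complete list of candidates for promotion, which is the role the proposed metadata would play.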

hiraditya added a subscriber: hiraditya. Aug 24 2017, 5:27 AM

mssimpso mentioned this in D37355: Add CalledValuePropagation pass. Aug 31 2017, 2:09 PM

I'm abandoning this for now to close the review. Davide/Danny, please feel free to revive this patch if you want. To summarize the main points in the review, we need to work on the cost model more to enable something like this by default.

dongAxis1944 added a subscriber: dongAxis1944. Oct 27 2020, 1:26 AM

Herald added a subscriber: steven_wu. Oct 27 2020, 1:26 AM

ChuanqiXu added a subscriber: ChuanqiXu. Oct 27 2020, 5:47 AM

yaozhongxiao added a subscriber: yaozhongxiao. Oct 27 2020, 5:53 AM

yaozhongxiao removed a subscriber: yaozhongxiao.

sanwou01 mentioned this in D93838: [SCCP] Add Function Specialization pass. Jan 27 2021, 6:13 AM

SjoerdMeijer mentioned this in rGc4a0969b9c14: Function Specialization Pass. Jun 11 2021, 1:22 AM

Revision Contents

Path	Size
lib/Transforms/Scalar/SCCP.cpp	532 lines
test/Other/new-pm-defaults.ll	2 lines
test/Other/new-pm-lto-defaults.ll	1 line
test/Other/new-pm-thinlto-defaults.ll	2 lines
test/Transforms/SCCP/ipsccp-specialization.ll	52 lines

Diff 110098

lib/Transforms/Scalar/SCCP.cpp

Show All 18 Lines

#include "llvm/Transforms/IPO/SCCP.h"		#include "llvm/Transforms/IPO/SCCP.h"
#include "llvm/ADT/DenseMap.h"		#include "llvm/ADT/DenseMap.h"
#include "llvm/ADT/DenseSet.h"		#include "llvm/ADT/DenseSet.h"
#include "llvm/ADT/PointerIntPair.h"		#include "llvm/ADT/PointerIntPair.h"
#include "llvm/ADT/SmallPtrSet.h"		#include "llvm/ADT/SmallPtrSet.h"
#include "llvm/ADT/SmallVector.h"		#include "llvm/ADT/SmallVector.h"
#include "llvm/ADT/Statistic.h"		#include "llvm/ADT/Statistic.h"
		#include "llvm/Analysis/BlockFrequencyInfo.h"
		#include "llvm/Analysis/CodeMetrics.h"
#include "llvm/Analysis/ConstantFolding.h"		#include "llvm/Analysis/ConstantFolding.h"
#include "llvm/Analysis/GlobalsModRef.h"		#include "llvm/Analysis/GlobalsModRef.h"
		#include "llvm/Analysis/InlineCost.h"
		#include "llvm/Analysis/ProfileSummaryInfo.h"
#include "llvm/Analysis/TargetLibraryInfo.h"		#include "llvm/Analysis/TargetLibraryInfo.h"
		#include "llvm/Analysis/TargetTransformInfo.h"
#include "llvm/IR/CallSite.h"		#include "llvm/IR/CallSite.h"
#include "llvm/IR/Constants.h"		#include "llvm/IR/Constants.h"
#include "llvm/IR/DataLayout.h"		#include "llvm/IR/DataLayout.h"
#include "llvm/IR/DerivedTypes.h"		#include "llvm/IR/DerivedTypes.h"
#include "llvm/IR/InstVisitor.h"		#include "llvm/IR/InstVisitor.h"
#include "llvm/IR/Instructions.h"		#include "llvm/IR/Instructions.h"
#include "llvm/Pass.h"		#include "llvm/Pass.h"
#include "llvm/Support/Debug.h"		#include "llvm/Support/Debug.h"
#include "llvm/Support/ErrorHandling.h"		#include "llvm/Support/ErrorHandling.h"
#include "llvm/Support/raw_ostream.h"		#include "llvm/Support/raw_ostream.h"
#include "llvm/Transforms/IPO.h"		#include "llvm/Transforms/IPO.h"
#include "llvm/Transforms/Scalar.h"		#include "llvm/Transforms/Scalar.h"
#include "llvm/Transforms/Scalar/SCCP.h"		#include "llvm/Transforms/Scalar/SCCP.h"
		#include "llvm/Transforms/Utils/Cloning.h"
#include "llvm/Transforms/Utils/Local.h"		#include "llvm/Transforms/Utils/Local.h"
		#include "llvm/Transforms/Utils/ValueMapper.h"
#include <algorithm>		#include <algorithm>
using namespace llvm;		using namespace llvm;

#define DEBUG_TYPE "sccp"		#define DEBUG_TYPE "sccp"

STATISTIC(NumInstRemoved, "Number of instructions removed");		STATISTIC(NumInstRemoved, "Number of instructions removed");
STATISTIC(NumDeadBlocks , "Number of basic blocks unreachable");		STATISTIC(NumDeadBlocks , "Number of basic blocks unreachable");

STATISTIC(IPNumInstRemoved, "Number of instructions removed by IPSCCP");		STATISTIC(IPNumInstRemoved, "Number of instructions removed by IPSCCP");
STATISTIC(IPNumArgsElimed ,"Number of arguments constant propagated by IPSCCP");		STATISTIC(IPNumArgsElimed ,"Number of arguments constant propagated by IPSCCP");
STATISTIC(IPNumGlobalConst, "Number of globals found to be constant by IPSCCP");		STATISTIC(IPNumGlobalConst, "Number of globals found to be constant by IPSCCP");

		STATISTIC(IPNumSpecializedFuncs, "Number of functions specialized by IPSCCP");

		/// Enable IPSCCP function specialization.
		static cl::opt<bool> EnableFunctionSpecialization(
		"ipsccp-enable-function-specialization", cl::init(true), cl::Hidden,
		cl::desc("Enable IPSCCP function specialization"));

namespace {		namespace {
/// LatticeVal class - This class represents the different lattice values that		/// LatticeVal class - This class represents the different lattice values that
/// an LLVM value may occupy. It is a simple class with value semantics.		/// an LLVM value may occupy. It is a simple class with value semantics.
///		///
class LatticeVal {		class LatticeVal {
enum LatticeValueTy {		enum LatticeValueTy {
/// unknown - This LLVM Value has no known value yet.		/// unknown - This LLVM Value has no known value yet.
unknown,		unknown,
▲ Show 20 Lines • Show All 163 Lines • ▼ Show 20 Lines	public:
bool MarkBlockExecutable(BasicBlock *BB) {		bool MarkBlockExecutable(BasicBlock *BB) {
if (!BBExecutable.insert(BB).second)		if (!BBExecutable.insert(BB).second)
return false;		return false;
DEBUG(dbgs() << "Marking Block Executable: " << BB->getName() << '\n');		DEBUG(dbgs() << "Marking Block Executable: " << BB->getName() << '\n');
BBWorkList.push_back(BB); // Add the block to the work list!		BBWorkList.push_back(BB); // Add the block to the work list!
return true;		return true;
}		}

		/// Mark all of the blocks in function \p F non-executable. Clients can used
		mcrosierUnsubmitted Not Done Reply Inline Actions used -> use mcrosier: used -> use
		/// this method to erase a function from the module (e.g., if it has been
		/// completely specialized and is no longer needed).
		void markFunctionUnreachable(Function *F) {
		for (auto &BB : *F)
		BBExecutable.erase(&BB);
		}

/// TrackValueOfGlobalVariable - Clients can use this method to		/// TrackValueOfGlobalVariable - Clients can use this method to
/// inform the SCCPSolver that it should track loads and stores to the		/// inform the SCCPSolver that it should track loads and stores to the
/// specified global variable if it can. This is only legal to call if		/// specified global variable if it can. This is only legal to call if
/// performing Interprocedural SCCP.		/// performing Interprocedural SCCP.
void TrackValueOfGlobalVariable(GlobalVariable *GV) {		void TrackValueOfGlobalVariable(GlobalVariable *GV) {
// We only track the contents of scalar globals.		// We only track the contents of scalar globals.
if (GV->getValueType()->isSingleValueType()) {		if (GV->getValueType()->isSingleValueType()) {
LatticeVal &IV = TrackedGlobals[GV];		LatticeVal &IV = TrackedGlobals[GV];
[15 unchanged lines elided]	void AddTrackedFunction(Function *F) {
} else		} else
TrackedRetVals.insert(std::make_pair(F, LatticeVal()));		TrackedRetVals.insert(std::make_pair(F, LatticeVal()));
}		}

void AddArgumentTrackedFunction(Function *F) {		void AddArgumentTrackedFunction(Function *F) {
TrackingIncomingArguments.insert(F);		TrackingIncomingArguments.insert(F);
}		}

		/// Return a reference to the set of argument tracked functions.
		SmallPtrSetImpl<Function *> &getArgumentTrackedFunctions() {
		return TrackingIncomingArguments;
		}

/// Solve - Solve for constants and executable blocks.		/// Solve - Solve for constants and executable blocks.
///		///
void Solve();		void Solve();

/// ResolvedUndefsIn - While solving the dataflow for a function, we assume		/// ResolvedUndefsIn - While solving the dataflow for a function, we assume
/// that branches on undef values cannot reach any of their successors.		/// that branches on undef values cannot reach any of their successors.
/// However, this is not a safe assumption. After we solve dataflow, this		/// However, this is not a safe assumption. After we solve dataflow, this
/// method should be used to handle this. If this returns true, the solver		/// method should be used to handle this. If this returns true, the solver
[59 unchanged lines elided]	for (unsigned i = 0, e = STy->getNumElements(); i != e; ++i) {
assert(It != TrackedMultipleRetVals.end());		assert(It != TrackedMultipleRetVals.end());
LatticeVal LV = It->second;		LatticeVal LV = It->second;
if (LV.isOverdefined())		if (LV.isOverdefined())
return false;		return false;
}		}
return true;		return true;
}		}

		/// Mark argument \p A constant with value \p C in a new function
		/// specialization. The argument's parent function is a specialization of the
		/// original function \p F. All other arguments of the specialization inherit
		/// the lattice state of their corresponding values in the original function.
		void markArgInFuncSpecialization(Function *F, Argument *A, Constant *C) {
		assert(F->arg_size() == A->getParent()->arg_size() &&
		"Functions should have the same number of arguments");

		// Mark the argument constant in the new function.
		markConstant(A, C);

		// For the remaining arguments in the new function, copy the lattice state
		// over from the old function.
		for (auto I = F->arg_begin(), J = A->getParent()->arg_begin(),
		E = F->arg_end();
		I != E; ++I, ++J)
		if (J != A && ValueState.count(I)) {
		ValueState[J] = ValueState[I];
		pushToWorkList(ValueState[J], J);
		}
		}

private:		private:
// pushToWorkList - Helper for markConstant/markForcedConstant/markOverdefined		// pushToWorkList - Helper for markConstant/markForcedConstant/markOverdefined
void pushToWorkList(LatticeVal &IV, Value *V) {		void pushToWorkList(LatticeVal &IV, Value *V) {
if (IV.isOverdefined())		if (IV.isOverdefined())
return OverdefinedInstWorkList.push_back(V);		return OverdefinedInstWorkList.push_back(V);
InstWorkList.push_back(V);		InstWorkList.push_back(V);
}		}

[1,345 unchanged lines elided]	INITIALIZE_PASS_BEGIN(SCCPLegacyPass, "sccp",
"Sparse Conditional Constant Propagation", false, false)		"Sparse Conditional Constant Propagation", false, false)
INITIALIZE_PASS_DEPENDENCY(TargetLibraryInfoWrapperPass)		INITIALIZE_PASS_DEPENDENCY(TargetLibraryInfoWrapperPass)
INITIALIZE_PASS_END(SCCPLegacyPass, "sccp",		INITIALIZE_PASS_END(SCCPLegacyPass, "sccp",
"Sparse Conditional Constant Propagation", false, false)		"Sparse Conditional Constant Propagation", false, false)

// createSCCPPass - This is the public interface to this file.		// createSCCPPass - This is the public interface to this file.
FunctionPass *llvm::createSCCPPass() { return new SCCPLegacyPass(); }		FunctionPass *llvm::createSCCPPass() { return new SCCPLegacyPass(); }

		namespace {

		/// FunctionSpecializer is responsible for specializing functions based on the
		/// values of their incoming arguments.
		///
		/// After an initial IPSCCP data-flow is solved, we may find that a formal
		/// argument of a function can take on a particular constant value in some
		/// calling contexts, but not all. If this is the case, we can create a new
		/// version of the function in which the argument is replaced by the constant
		/// value and recompute the data-flow. This allows constant propagation across
		/// function boundaries when an argument can take on more than one value.
		///
		/// Specialization is controlled by a goal-oriented heuristic that seeks to
		/// predict if replacing an argument with a constant value would result in any
		/// significant optimization opportunities. The heuristic is currently limited
		/// to revealing inlining opportunities via indirect call promotion. Inline
		/// cost is used to weigh the expected benefit of specialization against the
		/// cost of doing so.
		class FunctionSpecializer {

		/// The IPSCCP Solver.
		SCCPSolver &Solver;

		/// Analyses used to help determine if a function should be specialized.
		ProfileSummaryInfo *PSI;
		std::function<AssumptionCache &(Function &)> *GetAC;
		Optional<function_ref<BlockFrequencyInfo &(Function &)>> GetBFI;
		std::function<TargetTransformInfo &(Function &)> *GetTTI;

		/// Maps to save specialization costs and bonuses so we don't recompute them
		/// unnecessarily.
		DenseMap<Function *, unsigned> SpecializationCosts;
		DenseMap<std::pair<Argument *, Constant *>, unsigned> SpecializationBonuses;

		/// A mapping from specialized functions to their corresponding functions
		/// from the original module.
		DenseMap<Function *, Function *> OriginalFunctions;

		/// The set of functions from the original module that have been specialized.
		/// This set is primarily used to count the number of functions that have
		/// been specialized.
		SmallPtrSet<Function *, 4> SpecializedFunctions;

		public:
		FunctionSpecializer(
		SCCPSolver &Solver, ProfileSummaryInfo *PSI,
		std::function<AssumptionCache &(Function &)> *GetAC,
		Optional<function_ref<BlockFrequencyInfo &(Function &)>> GetBFI,
		std::function<TargetTransformInfo &(Function &)> *GetTTI)
		: Solver(Solver), PSI(PSI), GetAC(GetAC), GetBFI(GetBFI), GetTTI(GetTTI) {
		}

		/// Attempt to specialize functions in the module to enable constant
		/// propagation across function boundaries.
		///
		/// \returns true if at least one function is specialized.
		bool specializeFunctions() {

		// Holds the newly created functions. We maintain a separate list of these
		// functions to avoid iterator invalidation.
		SmallVector<Function *, 4> Specializations;

		// Attempt to specialize the argument-tracked functions.
		bool Changed = false;
		for (auto *F : Solver.getArgumentTrackedFunctions()) {

		// If the function is not in the OriginalFunctions map, initialize it
		// with an identity relationship.
		if (!OriginalFunctions.count(F))
		OriginalFunctions[F] = F;

		Changed |= specializeFunction(F, Specializations);
		}

		// Initialize the state of the newly created functions, marking them
		// argument-tracked and executable.
		for (auto *F : Specializations) {
		if (F->hasExactDefinition() && !F->hasFnAttribute(Attribute::Naked))
		Solver.AddTrackedFunction(F);
		Solver.AddArgumentTrackedFunction(F);
		Solver.MarkBlockExecutable(&F->front());
		}

		return Changed;
		}

		private:
		/// Attempt to specialize function \p F.
		///
		/// This function decides whether to specialize function \p F based on the
		/// known constant values its arguments can take on. Specialization is
		/// performed on the first interesting argument. Specializations based on
		/// additional arguments will be evaluated on following iterations of the
		/// main IPSCCP solve loop.
		///
		/// \returns true if the function is specialized.
		bool specializeFunction(Function *F,
		SmallVectorImpl<Function *> &Specializations) {

		// If we're optimizing the function for size, we shouldn't specialize it.
		if (F->optForSize())
		return false;

		// Exit if the function is not executable. There's no point in specializing
		// a dead function.
		if (!Solver.isBlockExecutable(&F->getEntryBlock()))
		return false;

		// Determine if we should specialize the function based on the values the
		// argument can take on. If specialization is not profitable, we continue
		// on to the next argument.
		for (Argument &A : F->args()) {

		// True if this will be a partial specialization. We will need to keep
		// the original function around in addition to the added specializations.
		bool IsPartial = true;

		// Determine if this argument is interesting. If we know the argument can
		// take on any constant values, they are collected in Constants. If the
		// argument can only ever equal a constant value in Constants, the
		// function will be completely specialized, and the IsPartial flag will
		// be set to false by isArgumentInteresting (that function only adds
		// values to the Constants list that are deemed profitable).
		SmallVector<Constant *, 4> Constants;
		if (!isArgumentInteresting(&A, Constants, IsPartial))
		continue;

		assert(!Constants.empty() && "No constants on which to specialize");
		DEBUG(dbgs() << "Specializing function " << F->getName() << " on " << A
		<< ": (" << *Constants[0];
		for (unsigned I = 1; I < Constants.size(); ++I) dbgs()
		<< ", " << *Constants[I];
		dbgs() << ")\n");

		// Create a version of the function in which the argument is marked
		// constant with the given value.
		for (auto *C : Constants) {

		// Clone the function. We leave the ValueToValueMap empty to allow
		// IPSCCP to propagate the constant arguments.
		ValueToValueMapTy EmptyMap;
		Function *Clone = CloneFunction(F, EmptyMap);

		// Rewrite calls to the function so that they call the clone instead.
		rewriteCallSites(F, Clone, A.getArgNo(), C);

		// Initialize the lattice state of the arguments of the function clone,
		// marking the argument on which we specialized the function constant
		// with the given value.
		Solver.markArgInFuncSpecialization(F, Clone->arg_begin() + A.getArgNo(),
		C);
		Specializations.push_back(Clone);

		// Set the parent function of the clone.
		OriginalFunctions[Clone] = OriginalFunctions[F];
		}

		// The function has been specialized. Increment the counter.
		if (SpecializedFunctions.insert(OriginalFunctions[F]).second)
		++IPNumSpecializedFuncs;

		// If the function has been completely specialized, the original function
		// is no longer needed. Mark it unreachable.
		if (!IsPartial)
		Solver.markFunctionUnreachable(F);

		return true;
		}

		return false;
		}

		/// Compute the cost of specializing function \p F.
		///
		/// This function computes the cost of specializing the given function.
		/// Specialization is performed for a particular argument if the bonus from
		/// specializing on that argument is greater than the specialization cost
		/// computed here.
		///
		/// \returns the cost of specializing function \p F.
		unsigned getSpecializationCost(Function *F) {

		// If we've already computed the specialization cost for the given
		// function, just return it.
		if (SpecializationCosts.count(F))
		return SpecializationCosts[F];

		// Compute the code metrics for the function.
		SmallPtrSet<const Value *, 32> EphValues;
		CodeMetrics::collectEphemeralValues(F, &(*GetAC)(*F), EphValues);
		CodeMetrics Metrics;
		for (BasicBlock &BB : *F)
		Metrics.analyzeBasicBlock(&BB, (*GetTTI)(*F), EphValues);

		// If the code metrics reveal that we shouldn't duplicate the function, we
		// shouldn't specialize it. Set the specialization cost to the maximum.
		if (Metrics.notDuplicatable)
		return SpecializationCosts[F] = std::numeric_limits<unsigned>::max();

		// Otherwise, set the specialization cost to be the cost of all the
		// instructions in the function.
		return SpecializationCosts[F] =
		Metrics.NumInsts * InlineConstants::InstrCost;
		}

		/// Compute a bonus for replacing argument \p A with constant \p C.
		///
		/// When specializing a function, we replace an argument with a constant.
		/// This function computes the expected benefit from doing so.
		///
		/// The current heuristic is limited to uncovering inlining opportunities.
		/// That is, if the argument is a function pointer used for indirect calls
		/// within the function, we can specialize the function and promote the
		/// indirect calls to direct calls. If the called function is sufficiently
		/// simple, it will be inlined into the caller.
		///
		/// TODO: We should consider expanding the kinds of optimization
		/// opportunities we look for.
		///
		/// \returns a bonus for replacing argument \p A with constant \p C.
		unsigned getSpecializationBonus(Argument *A, Constant *C) {

		// If we've already computed the specialization bonus for the given
		// argument and constant, just return it.
		if (SpecializationBonuses.count(std::make_pair(A, C)))
		return SpecializationBonuses[std::make_pair(A, C)];

		// The current heuristic is only concerned with exposing inlining
		// opportunities via indirect call promotion. If the argument is not a
		// function pointer, give up.
		if (!isa<PointerType>(A->getType()) ||
		!isa<FunctionType>(A->getType()->getPointerElementType()))
		return SpecializationBonuses[std::make_pair(A, C)] = 0;

		// Since the argument is a function pointer, its incoming constant values
		// should be functions or constant expressions. The code below attempts to
		// look through cast expressions to find the function that will be called.
		Value *CalledValue = C;
		while (isa<ConstantExpr>(CalledValue) &&
		cast<ConstantExpr>(CalledValue)->isCast())
		CalledValue = cast<User>(CalledValue)->getOperand(0);
		Function *CalledFunction = dyn_cast<Function>(CalledValue);
		if (!CalledFunction)
		return SpecializationBonuses[std::make_pair(A, C)] = 0;

		// Get TTI for the called function (used for the inline cost).
		auto &CalleeTTI = (*GetTTI)(*CalledFunction);

		// Look at all the call sites whose called value is the argument.
		// Specializing the function on the argument would allow these indirect
		// calls to be promoted to direct calls. If the indirect call promotion
		// would likely enable the called function to be inlined, specializing is a
		// good idea.
		int Bonus = 0;
		for (User *U : A->users()) {
		if (!isa<CallInst>(U) && !isa<InvokeInst>(U))
		continue;
		auto CS = CallSite(U);
		if (CS.getCalledValue() != A)
		continue;

		// Get the cost of inlining the called function at this call site. Note
		// that this is only an estimate. The called function may eventually
		// change in a way that leads to it not being inlined here, even though
		// inlining looks profitable now. For example, one of its called
		// functions may be inlined into it, making the called function too large
		// to be inlined into this call site.
		//
		// We apply a boost for performing indirect call promotion by increasing
		// the default threshold by the threshold for indirect calls.
		auto Params = getInlineParams();
		Params.DefaultThreshold += InlineConstants::IndirectCallThreshold;
		InlineCost IC = getInlineCost(CS, CalledFunction, Params, CalleeTTI,
		[Inline comment] mcrosier: The inline cost model already takes into consideration cold call sites, but I was wondering if it would make sense to use this information directly during functional specialization. For example, you could filter out cold calls and avoid the overhead of computing the inline cost in cases when you think inlining likely won't happen or where specialization isn't likely to be beneficial. Of course, the inline cost computation will bail sooner with a lower threshold (due to the call site being cold), so you might not be saving too much. Just a thought.
		*GetAC, GetBFI, PSI);

		// We clamp the bonus for this call to be between zero and the default
		// threshold.
		if (IC.isAlways())
		Bonus += Params.DefaultThreshold;
		else if (IC.isVariable() && IC.getCostDelta() > 0)
		Bonus += IC.getCostDelta();
		}

		assert(Bonus >= 0 && "Computed negative cumulative bonus");

		// Return the cumulative bonus for replacing the argument with the given
		// constant.
		return SpecializationBonuses[std::make_pair(A, C)] = Bonus;
		}

		/// Determine if we should specialize a function based on the incoming values
		/// of the given argument.
		///
		/// This function implements the goal-directed heuristic. It determines if
		/// specializing the function based on the incoming values of argument \p A
		/// would result in any significant optimization opportunities. If
		/// optimization opportunities exist, the constant values of \p A on which to
		/// specialize the function are collected in \p Constants. If the values in
		/// \p Constants represent the complete set of values that \p A can take on,
		/// the function will be completely specialized, and the \p IsPartial flag is
		/// set to false.
		///
		/// \returns true if the function should be specialized on the given
		/// argument.
		bool isArgumentInteresting(Argument *A,
		SmallVectorImpl<Constant *> &Constants,
		bool &IsPartial) {
		Function *F = A->getParent();

		// For now, don't attempt to specialize functions based on the values of
		// struct types.
		if (A->getType()->isStructTy())
		return false;

		// If the argument isn't overdefined, there's nothing to do. It should
		// already be constant.
		if (!Solver.getLatticeValueFor(A).isOverdefined())
		return false;

		// Collect the constant values that the argument can take on. If the
		// argument can't take on any constant values, we aren't going to
		// specialize the function. While it's possible to specialize the function
		// based on non-constant arguments, there's likely not much benefit to
		// constant propagation in doing so.
		SmallPtrSet<Constant *, 4> PossibleConstants;
		bool AllConstant = getPossibleConstants(A, PossibleConstants);
		if (PossibleConstants.empty())
		return false;

		// Determine if it would be profitable to create a specialization of the
		// function where the argument takes on the given constant value. If so,
		// add the constant to Constants.
		for (auto *C : PossibleConstants)
		if (getSpecializationBonus(A, C) > getSpecializationCost(F))
		Constants.push_back(C);

		// None of the constant values the argument can take on were deemed good
		// candidates on which to specialize the function.
		if (Constants.empty())
		return false;

		// This will be a partial specialization if some of the constants were
		// rejected due to their profitability.
		IsPartial = !AllConstant || PossibleConstants.size() != Constants.size();

		return true;
		}

		/// Collect in \p Constants all the constant values that argument \p A can
		/// take on.
		///
		/// \returns true if all of the values the argument can take on are constant
		/// (e.g., the argument's parent function cannot be called with an
		/// overdefined value).
		bool getPossibleConstants(Argument *A,
		SmallPtrSetImpl<Constant *> &Constants) {
		Function *F = A->getParent();
		bool AllConstant = true;

		// Iterate over all the call sites of the argument's parent function.
		for (User *U : F->users()) {
		if (!isa<CallInst>(U) && !isa<InvokeInst>(U))
		continue;
		auto CS = CallSite(U);

		// If the parent of the call site will never be executed, we don't need
		// to worry about the passed value.
		if (!Solver.isBlockExecutable(CS.getInstruction()->getParent()))
		continue;

		// Get the lattice value for the value the call site passes to the
		// argument. If this value is not constant, move on to the next call
		// site. Additionally, set the AllConstant flag to false.
		if (CS.getArgument(A->getArgNo()) != A &&
		!Solver.getLatticeValueFor(CS.getArgument(A->getArgNo()))
		.isConstant()) {
		AllConstant = false;
		continue;
		}

		// Add the constant to the set.
		if (auto *C = dyn_cast<Constant>(CS.getArgument(A->getArgNo())))
		Constants.insert(C);
		}

		// If the argument can only take on constant values, AllConstant will be
		// true.
		return AllConstant;
		}

		/// Rewrite calls to function \p F to call function \p Clone instead.
		///
		/// This function modifies calls to function \p F whose argument at index \p
		/// ArgNo is equal to constant \p C. The calls are rewritten to call function
		/// \p Clone instead.
		void rewriteCallSites(Function *F, Function *Clone, unsigned ArgNo,
		Constant *C) {
		SmallVector<CallSite, 4> CallSitesToRewrite;
		for (auto *U : F->users()) {
		if (!isa<CallInst>(U) && !isa<InvokeInst>(U))
		continue;
		CallSite CS(U);
		if (!CS.getCalledFunction() || CS.getCalledFunction() != F)
		continue;
		CallSitesToRewrite.push_back(CS);
		}
		for (auto CS : CallSitesToRewrite)
		if (CS.getInstruction()->getParent()->getParent() == Clone ||
		CS.getArgument(ArgNo) == C)
		CS.setCalledFunction(Clone);
		}
		};
		} // namespace
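
The selection rule in isArgumentInteresting can be summarized outside of LLVM. The following is a minimal sketch under stated assumptions, not the patch's code: `Candidate`, `Decision`, and `selectConstants` are hypothetical names, constants are reduced to integer ids, and the bonus and cost are passed in as plain numbers instead of being derived from inline cost and code metrics.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for one (argument, constant) pair.
struct Candidate {
  int ConstantId;  // Identifies the constant value the argument may take.
  unsigned Bonus;  // Estimated benefit of specializing on that constant.
};

struct Decision {
  std::vector<int> Selected;  // Constants worth specializing on.
  bool IsPartial = true;      // True if the original function must be kept.
};

// Keep only constants whose bonus strictly exceeds the cost of cloning the
// function. The specialization is complete only if every incoming value was
// constant (AllConstant) and none of the constants was rejected.
Decision selectConstants(const std::vector<Candidate> &Possible,
                         unsigned Cost, bool AllConstant) {
  Decision D;
  for (const Candidate &C : Possible)
    if (C.Bonus > Cost)
      D.Selected.push_back(C.ConstantId);

  // Nothing profitable: no specialization happens, the original stays.
  if (D.Selected.empty())
    return D;

  D.IsPartial = !AllConstant || D.Selected.size() != Possible.size();
  return D;
}
```

With a cost of 200, candidates with bonuses 500 and 100 would yield a partial specialization on the first constant only, whereas bonuses 500 and 300 with AllConstant set would yield a complete specialization, which is the case in which the patch marks the original function unreachable.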

static bool AddressIsTaken(const GlobalValue *GV) {		static bool AddressIsTaken(const GlobalValue *GV) {
// Delete any dead constantexpr klingons.		// Delete any dead constantexpr klingons.
GV->removeDeadConstantUsers();		GV->removeDeadConstantUsers();

for (const Use &U : GV->uses()) {		for (const Use &U : GV->uses()) {
const User *UR = U.getUser();		const User *UR = U.getUser();
if (const auto *SI = dyn_cast<StoreInst>(UR)) {		if (const auto *SI = dyn_cast<StoreInst>(UR)) {
if (SI->getOperand(0) == GV || SI->isVolatile())		if (SI->getOperand(0) == GV || SI->isVolatile())
[24 unchanged lines elided]	if (!F.hasLocalLinkage() || AddressTakenFunctions.count(&F))
return;		return;

for (BasicBlock &BB : F)		for (BasicBlock &BB : F)
if (auto *RI = dyn_cast<ReturnInst>(BB.getTerminator()))		if (auto *RI = dyn_cast<ReturnInst>(BB.getTerminator()))
if (!isa<UndefValue>(RI->getOperand(0)))		if (!isa<UndefValue>(RI->getOperand(0)))
ReturnsToZap.push_back(RI);		ReturnsToZap.push_back(RI);
}		}

static bool runIPSCCP(Module &M, const DataLayout &DL,		static bool
const TargetLibraryInfo *TLI) {		runIPSCCP(Module &M, const DataLayout &DL, const TargetLibraryInfo *TLI,
		ProfileSummaryInfo *PSI,
		std::function<AssumptionCache &(Function &)> *GetAC,
		Optional<function_ref<BlockFrequencyInfo &(Function &)>> GetBFI,
		std::function<TargetTransformInfo &(Function &)> *GetTTI) {
SCCPSolver Solver(DL, TLI);		SCCPSolver Solver(DL, TLI);
		FunctionSpecializer FS(Solver, PSI, GetAC, GetBFI, GetTTI);

// AddressTakenFunctions - This set keeps track of the address-taken functions		// AddressTakenFunctions - This set keeps track of the address-taken functions
// that are in the input. As IPSCCP runs through and simplifies code,		// that are in the input. As IPSCCP runs through and simplifies code,
// functions that were address taken can end up losing their		// functions that were address taken can end up losing their
// address-taken-ness. Because of this, we keep track of their addresses from		// address-taken-ness. Because of this, we keep track of their addresses from
// the first pass so we can use them for the later simplification pass.		// the first pass so we can use them for the later simplification pass.
SmallPtrSet<Function*, 32> AddressTakenFunctions;		SmallPtrSet<Function*, 32> AddressTakenFunctions;

[37 unchanged lines elided]	runIPSCCP(Module &M, const DataLayout &DL, const TargetLibraryInfo *TLI,
// variables that do not have their 'addresses taken'. If they don't have		// variables that do not have their 'addresses taken'. If they don't have
// their addresses taken, we can propagate constants through them.		// their addresses taken, we can propagate constants through them.
for (GlobalVariable &G : M.globals())		for (GlobalVariable &G : M.globals())
if (!G.isConstant() && G.hasLocalLinkage() &&		if (!G.isConstant() && G.hasLocalLinkage() &&
G.hasDefinitiveInitializer() && !AddressIsTaken(&G))		G.hasDefinitiveInitializer() && !AddressIsTaken(&G))
Solver.TrackValueOfGlobalVariable(&G);		Solver.TrackValueOfGlobalVariable(&G);

// Solve for constants.		// Solve for constants.
bool ResolvedUndefs = true;		bool SolveForConstants = true;
while (ResolvedUndefs) {		while (SolveForConstants) {
Solver.Solve();		Solver.Solve();

DEBUG(dbgs() << "RESOLVING UNDEFS\n");		DEBUG(dbgs() << "RESOLVING UNDEFS\n");
ResolvedUndefs = false;		SolveForConstants = false;
for (Function &F : M)		for (Function &F : M)
ResolvedUndefs |= Solver.ResolvedUndefsIn(F);		SolveForConstants |= Solver.ResolvedUndefsIn(F);

		if (!EnableFunctionSpecialization)
		continue;

		DEBUG(dbgs() << "SPECIALIZING FUNCTIONS\n");
		SolveForConstants |= FS.specializeFunctions();
		[Inline comment] efriedma: This loop doesn't work the way you want it to. IPSCCP isn't actually complete until the "while (ResolvedUndefs)" loop finishes, and specializing based on a partially unsolved Solver probably won't give you the results you want.
		[Inline comment] mssimpso (author): I'm not sure this is really a problem. Sure, IPSCCP isn't yet complete, but `ResolvedUndefsIn` only moves values from unknown to constant. If an actual argument does become constant while resolving undefs, we should analyze the corresponding (formal argument, constant) pair in `getSpecializationBonus` the next time we visit the function when looking for specialization opportunities.
		[Inline comment] efriedma: Moving a value from undef to constant can move dependent values from constant to overdefined. So you could end up cloning a function, and then fail to actually do any useful constant propagation into the cloned function.
		[Inline comment] mssimpso (author): Ah, right. I think I see what you mean now.
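
The concern raised in this thread can be illustrated with a toy lattice. This is a minimal sketch, not the SCCPSolver implementation: `Kind`, `Lattice`, and `meet` are hypothetical names, and the real LatticeVal has more states than the three modeled here.

```cpp
#include <cassert>

// Simplified lattice: Unknown sits at the top, Overdefined at the bottom.
enum class Kind { Unknown, Constant, Overdefined };

struct Lattice {
  Kind K;
  int Value;  // Meaningful only when K == Kind::Constant.
};

// Merge two incoming values, roughly as a PHI node would. An Unknown input is
// optimistically ignored; two different constants collapse to Overdefined.
Lattice meet(Lattice A, Lattice B) {
  if (A.K == Kind::Unknown)
    return B;
  if (B.K == Kind::Unknown)
    return A;
  if (A.K == Kind::Constant && B.K == Kind::Constant && A.Value == B.Value)
    return A;
  return Lattice{Kind::Overdefined, 0};
}
```

While one input is still Unknown, merging it with the constant 7 yields the constant 7, so a dependent value looks like a good specialization candidate. Once undef resolution forces that input to some other constant, say 0, the same merge yields Overdefined, and a clone created on the basis of the earlier state no longer benefits from constant propagation.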
}		}

bool MadeChanges = false;		bool MadeChanges = false;

// Iterate over all of the instructions in the module, replacing them with		// Iterate over all of the instructions in the module, replacing them with
// constants if we have found them to be of constant values.		// constants if we have found them to be of constant values.
//		//
SmallVector<BasicBlock*, 512> BlocksToErase;		SmallVector<BasicBlock*, 512> BlocksToErase;
[118 unchanged lines elided]	runIPSCCP(Module &M, const DataLayout &DL, const TargetLibraryInfo *TLI,
}		}

return MadeChanges;		return MadeChanges;
}		}

PreservedAnalyses IPSCCPPass::run(Module &M, ModuleAnalysisManager &AM) {		PreservedAnalyses IPSCCPPass::run(Module &M, ModuleAnalysisManager &AM) {
const DataLayout &DL = M.getDataLayout();		const DataLayout &DL = M.getDataLayout();
auto &TLI = AM.getResult<TargetLibraryAnalysis>(M);		auto &TLI = AM.getResult<TargetLibraryAnalysis>(M);
if (!runIPSCCP(M, DL, &TLI))		auto &FAM = AM.getResult<FunctionAnalysisManagerModuleProxy>(M).getManager();
		auto &PSI = AM.getResult<ProfileSummaryAnalysis>(M);

		std::function<AssumptionCache &(Function &)> GetAC =
		[&FAM](Function &F) -> AssumptionCache & {
		return FAM.getResult<AssumptionAnalysis>(F);
		};

		std::function<TargetTransformInfo &(Function &)> GetTTI =
		[&FAM](Function &F) -> TargetTransformInfo & {
		return FAM.getResult<TargetIRAnalysis>(F);
		};

		std::function<BlockFrequencyInfo &(Function &)> GetBFI =
		[&FAM](Function &F) -> BlockFrequencyInfo & {
		return FAM.getResult<BlockFrequencyAnalysis>(F);
		};

		if (!runIPSCCP(M, DL, &TLI, &PSI, &GetAC, {GetBFI}, &GetTTI))
return PreservedAnalyses::all();		return PreservedAnalyses::all();
return PreservedAnalyses::none();		return PreservedAnalyses::none();
}		}

namespace {		namespace {
//===--------------------------------------------------------------------===//		//===--------------------------------------------------------------------===//
//		//
/// IPSCCP Class - This class implements interprocedural Sparse Conditional		/// IPSCCP Class - This class implements interprocedural Sparse Conditional
/// Constant Propagation.		/// Constant Propagation.
///		///
class IPSCCPLegacyPass : public ModulePass {		class IPSCCPLegacyPass : public ModulePass {
public:		public:
static char ID;		static char ID;

IPSCCPLegacyPass() : ModulePass(ID) {		IPSCCPLegacyPass() : ModulePass(ID) {
initializeIPSCCPLegacyPassPass(*PassRegistry::getPassRegistry());		initializeIPSCCPLegacyPassPass(*PassRegistry::getPassRegistry());
}		}

bool runOnModule(Module &M) override {		bool runOnModule(Module &M) override {
if (skipModule(M))		if (skipModule(M))
return false;		return false;
const DataLayout &DL = M.getDataLayout();		const DataLayout &DL = M.getDataLayout();
const TargetLibraryInfo *TLI =		const TargetLibraryInfo *TLI =
&getAnalysis<TargetLibraryInfoWrapperPass>().getTLI();		&getAnalysis<TargetLibraryInfoWrapperPass>().getTLI();
return runIPSCCP(M, DL, TLI);
		ProfileSummaryInfo *PSI =
		getAnalysis<ProfileSummaryInfoWrapperPass>().getPSI();
		AssumptionCacheTracker *ACT = &getAnalysis<AssumptionCacheTracker>();

		TargetTransformInfoWrapperPass *TTIWP =
		&getAnalysis<TargetTransformInfoWrapperPass>();

		std::function<AssumptionCache &(Function &)> GetAC =
		[&ACT](Function &F) -> AssumptionCache & {
		return ACT->getAssumptionCache(F);
		};

		std::function<TargetTransformInfo &(Function &)> GetTTI =
		[&TTIWP](Function &F) -> TargetTransformInfo & {
		return TTIWP->getTTI(F);
		};

		return runIPSCCP(M, DL, TLI, PSI, &GetAC, None, &GetTTI);
}		}

void getAnalysisUsage(AnalysisUsage &AU) const override {		void getAnalysisUsage(AnalysisUsage &AU) const override {
		AU.addRequired<AssumptionCacheTracker>();
		AU.addRequired<ProfileSummaryInfoWrapperPass>();
AU.addRequired<TargetLibraryInfoWrapperPass>();		AU.addRequired<TargetLibraryInfoWrapperPass>();
		AU.addRequired<TargetTransformInfoWrapperPass>();
}		}
};		};
} // end anonymous namespace		} // end anonymous namespace

char IPSCCPLegacyPass::ID = 0;		char IPSCCPLegacyPass::ID = 0;
INITIALIZE_PASS_BEGIN(IPSCCPLegacyPass, "ipsccp",		INITIALIZE_PASS_BEGIN(IPSCCPLegacyPass, "ipsccp",
"Interprocedural Sparse Conditional Constant Propagation",		"Interprocedural Sparse Conditional Constant Propagation",
false, false)		false, false)
		INITIALIZE_PASS_DEPENDENCY(AssumptionCacheTracker)
		INITIALIZE_PASS_DEPENDENCY(ProfileSummaryInfoWrapperPass)
INITIALIZE_PASS_DEPENDENCY(TargetLibraryInfoWrapperPass)		INITIALIZE_PASS_DEPENDENCY(TargetLibraryInfoWrapperPass)
		INITIALIZE_PASS_DEPENDENCY(TargetTransformInfoWrapperPass)
INITIALIZE_PASS_END(IPSCCPLegacyPass, "ipsccp",		INITIALIZE_PASS_END(IPSCCPLegacyPass, "ipsccp",
"Interprocedural Sparse Conditional Constant Propagation",		"Interprocedural Sparse Conditional Constant Propagation",
false, false)		false, false)

// createIPSCCPPass - This is the public interface to this file.		// createIPSCCPPass - This is the public interface to this file.
ModulePass *llvm::createIPSCCPPass() { return new IPSCCPLegacyPass(); }		ModulePass *llvm::createIPSCCPPass() { return new IPSCCPLegacyPass(); }

test/Other/new-pm-defaults.ll

	[72 unchanged lines elided]
	; CHECK-O-NEXT: Running analysis: AssumptionAnalysis			; CHECK-O-NEXT: Running analysis: AssumptionAnalysis
	; CHECK-O-NEXT: Running pass: SROA			; CHECK-O-NEXT: Running pass: SROA
	; CHECK-O-NEXT: Running analysis: DominatorTreeAnalysis			; CHECK-O-NEXT: Running analysis: DominatorTreeAnalysis
	; CHECK-O-NEXT: Running pass: EarlyCSEPass			; CHECK-O-NEXT: Running pass: EarlyCSEPass
	; CHECK-O-NEXT: Running analysis: TargetLibraryAnalysis			; CHECK-O-NEXT: Running analysis: TargetLibraryAnalysis
	; CHECK-O-NEXT: Running pass: LowerExpectIntrinsicPass			; CHECK-O-NEXT: Running pass: LowerExpectIntrinsicPass
	; CHECK-O-NEXT: Finished llvm::Function pass manager run.			; CHECK-O-NEXT: Finished llvm::Function pass manager run.
	; CHECK-O-NEXT: Running pass: IPSCCPPass			; CHECK-O-NEXT: Running pass: IPSCCPPass
				; CHECK-O-NEXT: Running analysis: ProfileSummaryAnalysis
	; CHECK-O-NEXT: Running pass: GlobalOptPass			; CHECK-O-NEXT: Running pass: GlobalOptPass
	; CHECK-O-NEXT: Running pass: ModuleToFunctionPassAdaptor<{{.*}}PromotePass>			; CHECK-O-NEXT: Running pass: ModuleToFunctionPassAdaptor<{{.*}}PromotePass>
	; CHECK-O-NEXT: Running pass: DeadArgumentEliminationPass			; CHECK-O-NEXT: Running pass: DeadArgumentEliminationPass
	; CHECK-O-NEXT: Running pass: ModuleToFunctionPassAdaptor<{{.*}}PassManager{{.*}}>			; CHECK-O-NEXT: Running pass: ModuleToFunctionPassAdaptor<{{.*}}PassManager{{.*}}>
	; CHECK-O-NEXT: Starting llvm::Function pass manager run.			; CHECK-O-NEXT: Starting llvm::Function pass manager run.
	; CHECK-O-NEXT: Running pass: InstCombinePass			; CHECK-O-NEXT: Running pass: InstCombinePass
	; CHECK-O-NEXT: Running analysis: OptimizationRemarkEmitterAnalysis			; CHECK-O-NEXT: Running analysis: OptimizationRemarkEmitterAnalysis
	; CHECK-EP-PEEPHOLE-NEXT: Running pass: NoOpFunctionPass			; CHECK-EP-PEEPHOLE-NEXT: Running pass: NoOpFunctionPass
	; CHECK-O-NEXT: Running pass: SimplifyCFGPass			; CHECK-O-NEXT: Running pass: SimplifyCFGPass
	; CHECK-O-NEXT: Finished llvm::Function pass manager run.			; CHECK-O-NEXT: Finished llvm::Function pass manager run.
	; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}GlobalsAA			; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}GlobalsAA
	; CHECK-O-NEXT: Running analysis: GlobalsAA			; CHECK-O-NEXT: Running analysis: GlobalsAA
	; CHECK-O-NEXT: Running analysis: CallGraphAnalysis			; CHECK-O-NEXT: Running analysis: CallGraphAnalysis
	; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}ProfileSummaryAnalysis			; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}ProfileSummaryAnalysis
	; CHECK-O-NEXT: Running analysis: ProfileSummaryAnalysis
	; CHECK-O-NEXT: Running pass: ModuleToPostOrderCGSCCPassAdaptor<{{.*}}LazyCallGraph{{.*}}>			; CHECK-O-NEXT: Running pass: ModuleToPostOrderCGSCCPassAdaptor<{{.*}}LazyCallGraph{{.*}}>
	; CHECK-O-NEXT: Running analysis: InnerAnalysisManagerProxy			; CHECK-O-NEXT: Running analysis: InnerAnalysisManagerProxy
	; CHECK-O-NEXT: Running analysis: LazyCallGraphAnalysis			; CHECK-O-NEXT: Running analysis: LazyCallGraphAnalysis
	; CHECK-O-NEXT: Starting CGSCC pass manager run.			; CHECK-O-NEXT: Starting CGSCC pass manager run.
	; CHECK-O-NEXT: Running pass: InlinerPass			; CHECK-O-NEXT: Running pass: InlinerPass
	; CHECK-O-NEXT: Running analysis: OuterAnalysisManagerProxy<{{.*}}LazyCallGraph{{.*}}>			; CHECK-O-NEXT: Running analysis: OuterAnalysisManagerProxy<{{.*}}LazyCallGraph{{.*}}>
	; CHECK-O-NEXT: Running pass: PostOrderFunctionAttrsPass			; CHECK-O-NEXT: Running pass: PostOrderFunctionAttrsPass
	; CHECK-O-NEXT: Running analysis: FunctionAnalysisManagerCGSCCProxy			; CHECK-O-NEXT: Running analysis: FunctionAnalysisManagerCGSCCProxy

test/Other/new-pm-lto-defaults.ll

	; CHECK-O-NEXT: Running pass: GlobalDCEPass			; CHECK-O-NEXT: Running pass: GlobalDCEPass
	; CHECK-O-NEXT: Running pass: ForceFunctionAttrsPass			; CHECK-O-NEXT: Running pass: ForceFunctionAttrsPass
	; CHECK-O-NEXT: Running pass: InferFunctionAttrsPass			; CHECK-O-NEXT: Running pass: InferFunctionAttrsPass
	; CHECK-O-NEXT: Running analysis: TargetLibraryAnalysis			; CHECK-O-NEXT: Running analysis: TargetLibraryAnalysis
	; CHECK-O2-NEXT: PGOIndirectCallPromotion			; CHECK-O2-NEXT: PGOIndirectCallPromotion
	; CHECK-O2-NEXT: Running analysis: InnerAnalysisManagerProxy<{{.*}}Function			; CHECK-O2-NEXT: Running analysis: InnerAnalysisManagerProxy<{{.*}}Function
	; CHECK-O2-NEXT: Running analysis: OptimizationRemarkEmitterAnalysis			; CHECK-O2-NEXT: Running analysis: OptimizationRemarkEmitterAnalysis
	; CHECK-O2-NEXT: Running pass: IPSCCPPass			; CHECK-O2-NEXT: Running pass: IPSCCPPass
				; CHECK-O2-NEXT: Running analysis: ProfileSummaryAnalysis
	; CHECK-O-NEXT: Running pass: ModuleToPostOrderCGSCCPassAdaptor<{{.*}}PostOrderFunctionAttrsPass>			; CHECK-O-NEXT: Running pass: ModuleToPostOrderCGSCCPassAdaptor<{{.*}}PostOrderFunctionAttrsPass>
	; CHECK-O-NEXT: Running analysis: InnerAnalysisManagerProxy<{{.*}}SCC			; CHECK-O-NEXT: Running analysis: InnerAnalysisManagerProxy<{{.*}}SCC
	; CHECK-O1-NEXT: Running analysis: InnerAnalysisManagerProxy<{{.*}}Function			; CHECK-O1-NEXT: Running analysis: InnerAnalysisManagerProxy<{{.*}}Function
	; CHECK-O-NEXT: Running analysis: LazyCallGraphAnalysis			; CHECK-O-NEXT: Running analysis: LazyCallGraphAnalysis
	; CHECK-O-NEXT: Running analysis: FunctionAnalysisManagerCGSCCProxy			; CHECK-O-NEXT: Running analysis: FunctionAnalysisManagerCGSCCProxy
	; CHECK-O-NEXT: Running analysis: OuterAnalysisManagerProxy<{{.*}}LazyCallGraph{{.*}}>			; CHECK-O-NEXT: Running analysis: OuterAnalysisManagerProxy<{{.*}}LazyCallGraph{{.*}}>
	; CHECK-O-NEXT: Running analysis: AAManager			; CHECK-O-NEXT: Running analysis: AAManager
	; CHECK-O-NEXT: Running analysis: TargetLibraryAnalysis			; CHECK-O-NEXT: Running analysis: TargetLibraryAnalysis

test/Other/new-pm-thinlto-defaults.ll

	; CHECK-O-NEXT: Running analysis: AssumptionAnalysis			; CHECK-O-NEXT: Running analysis: AssumptionAnalysis
	; CHECK-O-NEXT: Running pass: SROA			; CHECK-O-NEXT: Running pass: SROA
	; CHECK-O-NEXT: Running analysis: DominatorTreeAnalysis			; CHECK-O-NEXT: Running analysis: DominatorTreeAnalysis
	; CHECK-O-NEXT: Running pass: EarlyCSEPass			; CHECK-O-NEXT: Running pass: EarlyCSEPass
	; CHECK-O-NEXT: Running analysis: TargetLibraryAnalysis			; CHECK-O-NEXT: Running analysis: TargetLibraryAnalysis
	; CHECK-O-NEXT: Running pass: LowerExpectIntrinsicPass			; CHECK-O-NEXT: Running pass: LowerExpectIntrinsicPass
	; CHECK-O-NEXT: Finished llvm::Function pass manager run.			; CHECK-O-NEXT: Finished llvm::Function pass manager run.
	; CHECK-O-NEXT: Running pass: IPSCCPPass			; CHECK-O-NEXT: Running pass: IPSCCPPass
				; CHECK-O-NEXT: Running analysis: ProfileSummaryAnalysis
	; CHECK-O-NEXT: Running pass: GlobalOptPass			; CHECK-O-NEXT: Running pass: GlobalOptPass
	; CHECK-O-NEXT: Running pass: ModuleToFunctionPassAdaptor<{{.*}}PromotePass>			; CHECK-O-NEXT: Running pass: ModuleToFunctionPassAdaptor<{{.*}}PromotePass>
	; CHECK-O-NEXT: Running pass: DeadArgumentEliminationPass			; CHECK-O-NEXT: Running pass: DeadArgumentEliminationPass
	; CHECK-O-NEXT: Running pass: ModuleToFunctionPassAdaptor<{{.*}}PassManager{{.*}}>			; CHECK-O-NEXT: Running pass: ModuleToFunctionPassAdaptor<{{.*}}PassManager{{.*}}>
	; CHECK-O-NEXT: Starting llvm::Function pass manager run.			; CHECK-O-NEXT: Starting llvm::Function pass manager run.
	; CHECK-O-NEXT: Running pass: InstCombinePass			; CHECK-O-NEXT: Running pass: InstCombinePass
	; CHECK-PRELINK-O-NEXT: Running analysis: OptimizationRemarkEmitterAnalysis			; CHECK-PRELINK-O-NEXT: Running analysis: OptimizationRemarkEmitterAnalysis
	; CHECK-O-NEXT: Running pass: SimplifyCFGPass			; CHECK-O-NEXT: Running pass: SimplifyCFGPass
	; CHECK-O-NEXT: Finished llvm::Function pass manager run.			; CHECK-O-NEXT: Finished llvm::Function pass manager run.
	; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}GlobalsAA			; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}GlobalsAA
	; CHECK-O-NEXT: Running analysis: GlobalsAA			; CHECK-O-NEXT: Running analysis: GlobalsAA
	; CHECK-O-NEXT: Running analysis: CallGraphAnalysis			; CHECK-O-NEXT: Running analysis: CallGraphAnalysis
	; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}ProfileSummaryAnalysis			; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}ProfileSummaryAnalysis
	; CHECK-O-NEXT: Running analysis: ProfileSummaryAnalysis
	; CHECK-O-NEXT: Running pass: ModuleToPostOrderCGSCCPassAdaptor<{{.*}}LazyCallGraph{{.*}}>			; CHECK-O-NEXT: Running pass: ModuleToPostOrderCGSCCPassAdaptor<{{.*}}LazyCallGraph{{.*}}>
	; CHECK-O-NEXT: Running analysis: InnerAnalysisManagerProxy			; CHECK-O-NEXT: Running analysis: InnerAnalysisManagerProxy
	; CHECK-O-NEXT: Running analysis: LazyCallGraphAnalysis			; CHECK-O-NEXT: Running analysis: LazyCallGraphAnalysis
	; CHECK-O-NEXT: Starting CGSCC pass manager run.			; CHECK-O-NEXT: Starting CGSCC pass manager run.
	; CHECK-O-NEXT: Running pass: InlinerPass			; CHECK-O-NEXT: Running pass: InlinerPass
	; CHECK-O-NEXT: Running analysis: OuterAnalysisManagerProxy<{{.*}}LazyCallGraph{{.*}}>			; CHECK-O-NEXT: Running analysis: OuterAnalysisManagerProxy<{{.*}}LazyCallGraph{{.*}}>
	; CHECK-O-NEXT: Running pass: PostOrderFunctionAttrsPass			; CHECK-O-NEXT: Running pass: PostOrderFunctionAttrsPass
	; CHECK-O-NEXT: Running analysis: FunctionAnalysisManagerCGSCCProxy			; CHECK-O-NEXT: Running analysis: FunctionAnalysisManagerCGSCCProxy

test/Transforms/SCCP/ipsccp-specialization.ll

This file was added.

				; RUN: opt -ipsccp -deadargelim -inline -S < %s | FileCheck %s

				target triple = "aarch64-unknown-linux-gnueabi"

				; CHECK-LABEL: @main(i64 %x, i1 %flag) {
				; CHECK: entry:
				; CHECK-NEXT: br i1 %flag, label %plus, label %minus
				; CHECK: plus:
				; CHECK-NEXT: [[TMP0:%.+]] = add i64 %x, 1
				; CHECK-NEXT: br label %merge
				; CHECK: minus:
				; CHECK-NEXT: [[TMP1:%.+]] = sub i64 %x, 1
				; CHECK-NEXT: br label %merge
				; CHECK: merge:
				; CHECK-NEXT: [[TMP2:%.+]] = phi i64 [ [[TMP0]], %plus ], [ [[TMP1]], %minus ]
				; CHECK-NEXT: ret i64 [[TMP2]]
				; CHECK-NEXT: }
				;
				define i64 @main(i64 %x, i1 %flag) {
				entry:
				br i1 %flag, label %plus, label %minus

				plus:
				%tmp0 = call i64 @compute(i64 %x, i64 (i64)* @plus)
				br label %merge

				minus:
				%tmp1 = call i64 @compute(i64 %x, i64 (i64)* @minus)
				br label %merge

				merge:
				%tmp2 = phi i64 [ %tmp0, %plus ], [ %tmp1, %minus]
				ret i64 %tmp2
				}

				define internal i64 @compute(i64 %x, i64 (i64)* %binop) {
				entry:
				%tmp0 = call i64 %binop(i64 %x)
				ret i64 %tmp0
				}

				define internal i64 @plus(i64 %x) {
				entry:
				%tmp0 = add i64 %x, 1
				ret i64 %tmp0
				}

				define internal i64 @minus(i64 %x) {
				entry:
				%tmp0 = sub i64 %x, 1
				ret i64 %tmp0
				}
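
For readers less familiar with IR, the test above corresponds roughly to the C++ sketch below. It is only a source-level analogy, not what the pass emits: `plus_one` and `minus_one` stand in for `@plus` and `@minus`, and `compute_plus`, `compute_minus`, `run_generic`, `run_specialized`, and `check_equivalence` are hypothetical names chosen for illustration. The point is that once each call site targets a clone whose function-pointer argument is a known constant, the indirect call becomes direct and can be inlined.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative analogue of the IR test; none of these names come from the patch.
using BinOp = std::int64_t (*)(std::int64_t);

static std::int64_t plus_one(std::int64_t x) { return x + 1; }
static std::int64_t minus_one(std::int64_t x) { return x - 1; }

// Generic callee: binop takes two different values across call sites,
// so plain constant propagation cannot resolve the indirect call.
static std::int64_t compute(std::int64_t x, BinOp binop) { return binop(x); }

// Hypothetical clones, one per constant argument value. The call inside
// each clone is direct and therefore a candidate for inlining.
static std::int64_t compute_plus(std::int64_t x) { return plus_one(x); }
static std::int64_t compute_minus(std::int64_t x) { return minus_one(x); }

// Before specialization: both call sites share the generic callee.
std::int64_t run_generic(std::int64_t x, bool flag) {
  return flag ? compute(x, plus_one) : compute(x, minus_one);
}

// After specialization: each call site is redirected to its own clone.
std::int64_t run_specialized(std::int64_t x, bool flag) {
  return flag ? compute_plus(x) : compute_minus(x);
}

// Specialization must not change observable behavior.
void check_equivalence() {
  const std::int64_t samples[] = {-3, 0, 41};
  for (std::int64_t x : samples) {
    assert(run_generic(x, true) == run_specialized(x, true));
    assert(run_generic(x, false) == run_specialized(x, false));
  }
}
```

After inlining, `run_specialized` reduces to an add on one branch and a subtract on the other, which is the shape the CHECK lines for `@main` expect.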