This is an archive of the discontinued LLVM Phabricator instance.

Paths

include/llvm/Analysis/BasicAliasAnalysis.h
include/llvm/Analysis/InvariantInfo.h
include/llvm/Analysis/PostDominators.h
include/llvm/InitializePasses.h
include/llvm/Transforms/Scalar/GVN.h
lib/Analysis/BasicAliasAnalysis.cpp
lib/Analysis/CMakeLists.txt
lib/Analysis/InvariantInfo.cpp
lib/Analysis/MemoryDependenceAnalysis.cpp
lib/Analysis/PostDominators.cpp
lib/Passes/PassBuilder.cpp
lib/Passes/PassRegistry.def
lib/Transforms/InstCombine/InstructionCombining.cpp
lib/Transforms/Scalar/GVN.cpp
test/Analysis/InvariantInfo/gvn-basic.ll
test/Analysis/InvariantInfo/gvn.ll
test/Analysis/InvariantInfo/instcombine.ll
test/Analysis/InvariantInfo/print.ll
test/Analysis/InvariantInfo/verify.ll

Differential D15124

Use @llvm.invariant.start/end intrinsics to extend basic AA with invariant range analysis for GVN-based load elimination purposes [Local objects only]
Needs Review, Public

Authored by lvoufo on Dec 1 2015, 12:14 PM.

Details

Reviewers

chandlerc
dnovillo
nlewycky
hfinkel

Summary

The invariant range analysis is supported under a new pass, -invariant-range-analysis, and uses both dominator tree and post-dominator tree analyses to determine if a given instruction modifies some given memory.
For starters, this is focused only on local variables and call instructions as potential clobbers.

To avoid efficiency concerns with postdom analysis, and pending upcoming benchmark results, the effects of this analysis are also localized to GVN only (via a flag that is set only in GVN).
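
The summary does not spell out how the two trees are combined. As a rough illustration only, here is a toy block-level sketch under one plausible reading: a query block lies inside an invariant range when the block holding the start call dominates it and the block holding the end call post-dominates it. The graph encoding and the names `dominatorSets` and `inInvariantRange` are hypothetical, instruction order within a block is ignored, and none of this is taken from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy CFG: G[B] lists the successors of basic block B.
using Graph = std::vector<std::vector<int>>;
using DomSets = std::vector<std::vector<bool>>;

// Returns G with every edge reversed.
Graph reverseGraph(const Graph &G) {
  Graph R(G.size());
  for (std::size_t From = 0; From < G.size(); ++From)
    for (int To : G[From])
      R[To].push_back(static_cast<int>(From));
  return R;
}

// Iterative dominator sets for a graph rooted at Root:
// Dom[B][A] is true iff block A dominates block B.
DomSets dominatorSets(const Graph &G, int Root) {
  const std::size_t N = G.size();
  const Graph Preds = reverseGraph(G);
  DomSets Dom(N, std::vector<bool>(N, true));
  Dom[Root].assign(N, false);
  Dom[Root][Root] = true;
  for (bool Changed = true; Changed;) {
    Changed = false;
    for (std::size_t B = 0; B < N; ++B) {
      if (static_cast<int>(B) == Root)
        continue;
      // Intersect the dominator sets of all predecessors, then add B itself.
      std::vector<bool> New(N, true);
      for (int P : Preds[B])
        for (std::size_t A = 0; A < N; ++A)
          New[A] = New[A] && Dom[P][A];
      New[B] = true;
      if (New != Dom[B]) {
        Dom[B] = New;
        Changed = true;
      }
    }
  }
  return Dom;
}

// Assumed rule: QueryBB is in the range if StartBB dominates it and
// EndBB post-dominates it (post-dominance = dominance on the reversed CFG).
bool inInvariantRange(const Graph &G, int Entry, int Exit, int StartBB,
                      int EndBB, int QueryBB) {
  assert(QueryBB >= 0 && static_cast<std::size_t>(QueryBB) < G.size());
  const DomSets Dom = dominatorSets(G, Entry);
  const DomSets PDom = dominatorSets(reverseGraph(G), Exit);
  return Dom[QueryBB][StartBB] && PDom[QueryBB][EndBB];
}
```

On a diamond-shaped CFG (0 branches to 1 and 2, both rejoin at 3), a start in block 0 with an end in block 3 covers block 1, while a start in block 1 does not cover block 2 because block 1 does not dominate it.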

Event Timeline

lvoufo updated this revision to Diff 41549. Dec 1 2015, 12:14 PM

lvoufo retitled this revision from to Use @llvm.invariant.start/end intrinsics to extend the meaning of basic AA's pointsToConstantMemory(), for GVN-based load elimination purposes.

lvoufo updated this object.

lvoufo added reviewers: reames, nlewycky, chandlerc.

lvoufo added a subscriber: llvm-commits.

lvoufo updated this revision to Diff 41552. Dec 1 2015, 12:23 PM

lvoufo retitled this revision from Use @llvm.invariant.start/end intrinsics to extend the meaning of basic AA's pointsToConstantMemory(), for GVN-based load elimination purposes to Use @llvm.invariant.start/end intrinsics to extend the meaning of basic AA's pointsToConstantMemory(), for GVN-based load elimination purposes [Local objects only]. Dec 1 2015, 1:57 PM

jevinskie added a subscriber: jevinskie. Dec 1 2015, 3:41 PM

lvoufo added a child revision: D15135: Extend the use of @llvm.invariant.start/end intrinsics for GVN-based load elimination purposes to also handle global variables. Dec 1 2015, 5:00 PM

dnovillo added a reviewer: dnovillo. Dec 4 2015, 9:42 AM

I've just started looking at this, so these first few comments are mostly for my benefit to make sure I'm understanding this.

Would it make sense to have tests that are purely around the InvariantInfo analysis? This patch adds a user in AA and the test is on AA. But if the analysis were to print its findings, we could have a test that simply checks that the analysis figures out what we are expecting without involving AA.

include/llvm/Analysis/InvariantInfo.h
29	Documentation does not match the class name. I'd just get rid of the class name in the doc string.
33	Actually, this'd be used by any pass that needs to understand write-once objects, right?
49	Likewise here. No need to duplicate the class name in the docstring.
lib/Analysis/BasicAliasAnalysis.cpp
710	Should this be defined in lib/Analysis/InvariantInfo.cpp?
lib/Analysis/InvariantInfo.cpp
61	Would it be better to have a remove() function in the interface, instead of making nullptr a magic value?

In D15124#302677, @dnovillo wrote:

I've just started looking at this, so these first few comments are mostly for my benefit to make sure I'm understanding this.

Would it make sense to have tests that are purely around the InvariantInfo analysis? This patch adds a user in AA and the test is on AA. But if the analysis were to print its findings, we could have a test that simply checks that the analysis figures out what we are expecting without involving AA.

The problem with this approach is that the mapping does not get updated until instructions are traversed in -gvn; and by that time, -gvn already requires basic AA. Moreover, within the same pass, the mapping can change as invariant_start and invariant_end calls are encountered, adding or removing entries. The mappings are not fixed once updated like assumptions are for -assumption-cache-tracker.

With an updated (or rather non-empty) mapping, one could either print out analysis content to show that loads will be removed as expected (or not), or one could just go ahead and show that the said loads are removed. It is also not very clear to me what kind of information would be useful to print in this case, though I welcome any idea... Perhaps we could work on this in a separate patch/extension?

For the time being, it seems more effective to take the current approach. But perhaps I am missing something about LLVM's print analyses... I am also going from this notion that test cases are about confidence more than completeness. So, it is perfectly okay to center them around specific use cases showing the end results of the print messages (i.e. load removal) rather than the print messages themselves(?).

include/llvm/Analysis/InvariantInfo.h
33	Yes. This comment was specific to this particular design. But I could generalize it.
lib/Analysis/BasicAliasAnalysis.cpp
710	Perhaps later, but for now, it's only used in this file?
lib/Analysis/InvariantInfo.cpp
61	Yes.

pushing in revised patch soon...

Update patch based on feedback.

I read through this, and tried to check my understanding of it with Larisse. Here's how I see things working:

GVN processes instructions in reverse post order. This means dominating blocks come first, i.e. defs before uses.
GVN's processInstruction calls llvm::processInvariantIntrinsic() on invariant intrinsics
This updates InvariantMap to hold values passed to invariant.start. A call to invariant.end will remove the pointer from the map, and a new invariant.start will overwrite it.
BasicAA has been modified to return MRI_NoModRef for addresses that are present in InvariantMap, as well as other things. This is what enables GVN to do better.
MemDep has been modified so that when it scans backwards across an invariant start, it won't still consider the memory to be constant and load something bad like undef.
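
The side-table behaviour summarised in the list above can be sketched in isolation. This is a minimal model, not the patch's `InvariantInfo`: `Address` and `StartCall` are hypothetical stand-ins for `llvm::Value *` and the intrinsic call, and an explicit `remove()` is used instead of storing a null marker, as suggested earlier in the thread.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-ins for an llvm::Value* and an invariant.start call.
using Address = std::string;
using StartCall = int;

class InvariantTable {
  std::unordered_map<Address, StartCall> Markers;

public:
  // invariant.start: record the call; a later start on the same address
  // overwrites the earlier one.
  void setStart(const Address &Addr, StartCall Start) { Markers[Addr] = Start; }

  // invariant.end: explicit removal. Returns true if an entry was erased.
  bool remove(const Address &Addr) { return Markers.erase(Addr) != 0; }

  // Whether the address is currently marked as invariant.
  bool isInvariant(const Address &Addr) const {
    return Markers.count(Addr) != 0;
  }

  // The recorded start call; the address must currently be marked.
  StartCall startFor(const Address &Addr) const {
    auto It = Markers.find(Addr);
    assert(It != Markers.end() && "address is not marked invariant");
    return It->second;
  }
};
```

Because entries come and go as instructions are visited, the answer to `isInvariant` depends on where the visitor currently is, which is the order dependence the reviewer is uneasy about.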

Fundamentally, the approach of having a side table populated by GVN's instruction visitation order makes me nervous, but I'm not able to construct a solid counterexample as to why.

In D15124#302904, @lvoufo wrote:

For the time being, it seems more effective to take the current approach. But perhaps I am missing something about LLVM's print analyses... I am also going from this notion that test cases are about confidence more than completeness. So, it is perfectly okay to center them around specific use cases showing the end results of the print messages (i.e. load removal) rather than the print messages themselves(?).

What I'm proposing is to make this analysis pass runnable via opt. Kind of like the other analyses in lib/Analysis. So, we could run, for instance, 'opt -invariant-info-marker file.ll' and it would print the invariant ranges in file.ll. From this we can then create tests specifically for the pass, without having to go through another pass.

lib/Analysis/InvariantInfo.cpp
111	I'm confused by this comment. When we find an invariant_start, shouldn't the address be considered as pointing to constant memory?
126	What does the return value indicate?
139	This function is supposedly doing a scan backwards, but all it really seems to do is decide whether II is an invariant_start or an invariant_end. Does this need to be renamed? I'm not really sure what this is doing.
140	Formatting is odd here. Could you run clang-format over the file?

lvoufo added inline comments. Dec 8 2015, 2:47 PM

lib/Analysis/InvariantInfo.cpp
111	Oops. Typo. That's exactly what the comment is supposed to be saying.
126	That an invariant intrinsic was processed. Will add a comment.
139	This is to be called while scanning instructions backward to temporarily reset the mapping of invariant info so that function calls outside invariant regions can continue to clobber loads inside invariant regions. Yes, it should probably be renamed. But I'm thinking of getting rid of it altogether, in favor of a separate function or pass that processes invariant intrinsics independently of the order in which GVN processes all instructions.
140	Yes, I will.

In D15124#305326, @dnovillo wrote:

In D15124#302904, @lvoufo wrote:

For the time being, it seems more effective to take the current approach. But perhaps I am missing something about LLVM's print analyses... I am also going from this notion that test cases are about confidence more than completeness. So, it is perfectly okay to center them around specific use cases showing the end results of the print messages (i.e. load removal) rather than the print messages themselves(?).

What I'm proposing is to make this analysis pass runnable via opt. Kind of like the other analyses in lib/Analysis. So, we could run, for instance, 'opt -invariant-info-marker file.ll' and it would print the invariant ranges in file.ll. From this we can then create tests specifically for the pass, without having to go through another pass.

Making this analysis pass more substantial and printable is very doable. I don't see a problem with printing out appropriate invariant info for each instruction. The only difficulty I find with this is showing load removal using this pass alone, without -gvn, which in turn also performs additional analysis to remove loads. Anyhow, let's see how you like the next patch update...

lvoufo added inline comments. Dec 8 2015, 3:14 PM

lib/Analysis/InvariantInfo.cpp
127	Actually the comment follows right below here (?).

nlewycky added inline comments. Dec 8 2015, 8:55 PM

include/llvm/Analysis/InvariantInfo.h
41–42	In the comment you say Address but the argument is named Addr. Either you meant to say "address" or "Addr" in the comment, or change "Addr" to "Address" in the function declaration.
58–59	Odd line wrapping?
82	Typo, "intructions" -> "instructions".
82–84	The comment fails to explain what it does. Looking at the code, it appears to update InvInfo by unsetting the start instruction when II is a start. It also has an assert to check that things are sane when II is an end, but since assertions can be disabled, that's not really part of its functionality. It also returns true iff II is either start or end intrinsic. It wasn't clear the first time through how closely tied this is to the operation of PreserveInvariantInfoRAII.
lib/Analysis/InvariantInfo.cpp
119	No need for llvm:: here, we're inside a .cpp with "using namespace llvm;".
123	Extra llvm::
146	Extra llvm::
150	Need "(void)IStart;" here.
lib/Analysis/MemoryDependenceAnalysis.cpp
478–480	Walking ScanIt across an invariant intrinsic that doesn't appertain to QueryInst will not cause InvInfo to be updated. At that point, BasicAAResult::pointsToConstantMemory will produce wrong answers for those pointers.
test/Transforms/LoadElim/invariant.ll
1 ↗	(On Diff #41942)	Why are you creating a new test/Transforms subdirectory? The directories there are named for the passes, this should go into test/Transforms/GVN. Also, this should be -basicaa -gvn to test your changes to -basicaa, right? Or do you want a separate testcase in test/Analysis/BasicAA for that?

nlewycky added inline comments. Dec 8 2015, 8:55 PM

include/llvm/Analysis/InvariantInfo.h
23–25	Please alphabetize.
30	I don't understand the connection between "writeonce" (which doesn't seem to be defined anywhere? link?) and invariant.start/end. I can have as many invariant intrinsics over the same memory as I like, there's nothing 'once' about it. There's also no guarantee that the memory ever gets written to at any point in particular.
31	Typo, "intrisic_start" should be "intrinsic_start".
45	I looked up the "certain preconditions" and they appear to be "if Addr is a global or an alloca". Why those preconditions? Why not llvm::isIdentifiedObject, for example? Why aren't these preconditions mentioned on SetStartInstruction? True if removed, false if preconditions are not met ... what if the preconditions are met and it's not removed (because it's not a member)?
lib/Analysis/BasicAliasAnalysis.cpp
503	"associated to" --> "associated with".
503–504	I don't understand this comment. Globals, for instance, aren't local to the function.
507	You assert on invariant.start with a constant global variable, but that isn't a verifier failure. You would assert if queried on this well-defined code, right?

declare {}* @llvm.invariant.start(i64 %size, i8* nocapture %ptr)
@gv = internal constant i8 0
define void @test() {
  call {}* @llvm.invariant.start(i64 1, i8* @gv)
  ret void
}

I agree the construct is not useful, either we should make it fail the verifier (this would mean auditing all parts of llvm which mark globals constant to ensure they eliminate all possible invariant intrinsic calls that use the global), or we should delete them in some optimization pass (instcombine).
508–509	Please add a space after "or". As it is, the text from the two lines get concatenated producing "...instructions ornon-constant global...".
788–789	This ...
814–816	... and this look like solid changes on their own. Can you factor them out into their own review and write testcases for them?
lib/Analysis/InvariantInfo.cpp
30	Capitalize.
36–38	Does this code simplify to: return InvariantMarkers.lookup(const_cast<Value *>(Addr)); ?
55–57	Er, the code doesn't match the comment here, right? As written, this means that if it is set, then trying to set again will unset it. That doesn't ensure the new value is inserted.
60	Extra space between words in "alloca instruction".
69–70	You need to add a (void)Inserted; for people building with assertions off and -Wunused-variable on.
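
Two of the points above, the `lookup()` simplification and the `(void)Inserted;` idiom, can be shown in a standalone sketch. It uses `std::unordered_map` rather than LLVM's `DenseMap`, so the free function `lookup` below is only a hypothetical model of the documented contract (mapped value, or a default constructed `T()` when the key is absent). For a pointer type, `T()` is value-initialization and therefore always yields a null pointer.

```cpp
#include <cassert>
#include <unordered_map>

struct Instruction {}; // hypothetical stand-in for the intrinsic call type

using MarkerMap = std::unordered_map<const void *, Instruction *>;

// Models a lookup(): the mapped value, or T() if the key is absent.
// With T = Instruction *, T() is a null pointer.
Instruction *lookup(const MarkerMap &Map, const void *Addr) {
  using T = Instruction *;
  auto It = Map.find(Addr);
  return It == Map.end() ? T() : It->second;
}

// Inserts a marker that must not already be present.
void addMarker(MarkerMap &Map, const void *Addr, Instruction *Start) {
  bool Inserted = Map.emplace(Addr, Start).second;
  assert(Inserted && "marker already present");
  // With NDEBUG the assert expands to nothing; this cast keeps
  // -Wunused-variable quiet in that configuration.
  (void)Inserted;
}
```

The explicit `find()` plus `nullptr` version and the `lookup()` version return the same thing for pointer values, which is why the shorter form was suggested.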

lvoufo marked 22 inline comments as done. Dec 9 2015, 9:10 AM

lvoufo added inline comments.

include/llvm/Analysis/InvariantInfo.h
31	The design doc stated that this was all part of simulating 'writeonce' behavior in LLVM (+Clang)... Ideally, invariant.start instructions are generated right after first write (e.g. construction) and LLVM would reject any write after an invariant_start and before a corresponding invariant_end, but we are not reinforcing this in LLVM. Documentation like this is meant to specify (and clarify) the intended use of invariant.start/end (for simulating writeonce semantics). If it makes things even clearer, I could update the LangRef to this effect later (?).
42–43	meant "address", not "Address".
45	Yeah. I think this is a case of poorly-named functions after code restructuring... I'll reorganize everything accordingly.
82–84	Right. This could be clearer. I mentioned in earlier comments that I may be getting rid of this function altogether (in the next revision of this patch), if I can't rename it appropriately, as part of separating the extended logic from the order in which GVN processes instructions.
lib/Analysis/BasicAliasAnalysis.cpp
503–504	I think the operative word is "considered". This is merely an attempt to follow the current documentation structure of the function pointsToConstantMemory(). Its documentation states:

/// Returns whether the given pointer value points to memory that is local to
/// the function, with global constants being considered local to all
/// functions.
bool BasicAAResult::pointsToConstantMemory(const MemoryLocation &Loc, bool OrLocal)
507	I have removed the assertion. We just want to localize this extended logic to non-constant globals, and this should be checked in InvariantInfo::SetStartInstruction().
lib/Analysis/InvariantInfo.cpp
36–38	In theory, yes. But I like to err on the side of safety with find() over lookup() when it comes to the default initialization of pointers (or other non-user-defined objects). The documentation of lookup says: "lookup - Return the entry for the specified key, or a default constructed value if no such entry exists". This explicitly returns nullptr "if no such entry exists", in place of a "default constructed value", the specification of which could in practice change at any time.
55–57	Removed.
119	Oops. Residuals from multiple restructurings...
lib/Analysis/MemoryDependenceAnalysis.cpp
478–480	How so? I think it will do exactly what we want. By manipulating invariant info here based on QueryInst, we are selectively choosing when to apply the extension in BasicAA. A load from %i within an invariant range for %i temporarily unsets the marking for %i. A load from %j within an invariant range for %i does nothing to the marking for %i. A load from %j within an invariant range for %i should preserve the value of BasicAAResult::pointsToConstantMemory() for %j, whether it is nested within an invariant range for %j or not. Meanwhile, a load from %i within an invariant range for %i always leads to BasicAAResult::pointsToConstantMemory for %i == true. What is missing?
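
The "temporarily unsets the marking" behaviour described in the last reply can be pictured as a scope guard, in the spirit of the `PreserveInvariantInfoRAII` helper mentioned earlier in the review. The sketch below is a hypothetical model, not the patch's code: addresses are strings, the map holds an arbitrary start id, and `ScopedUnmark` is an invented name.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>

// address -> id of the invariant.start call that marked it
using InvariantMap = std::unordered_map<std::string, int>;

// While the guard is alive, the queried address is treated as unmarked.
// Markings for other addresses are left alone. The original marking is
// restored when the guard goes out of scope.
class ScopedUnmark {
  InvariantMap &Map;
  std::string Addr;
  bool HadEntry = false;
  int Saved = 0;

public:
  ScopedUnmark(InvariantMap &M, std::string A) : Map(M), Addr(std::move(A)) {
    auto It = Map.find(Addr);
    if (It != Map.end()) {
      HadEntry = true;
      Saved = It->second;
      Map.erase(It);
    }
  }

  ~ScopedUnmark() {
    if (!HadEntry)
      return;
    assert(Map.count(Addr) == 0 && "marking was re-added while suspended");
    Map[Addr] = Saved;
  }

  ScopedUnmark(const ScopedUnmark &) = delete;
  ScopedUnmark &operator=(const ScopedUnmark &) = delete;
};
```

In this model a backward scan for a load from %i would hold a guard for %i, so calls seen during the scan are not filtered by %i's marking, while a marking for %j stays visible throughout.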

lvoufo marked 5 inline comments as done. Dec 9 2015, 9:33 AM

lvoufo added inline comments.

test/Transforms/LoadElim/invariant.ll
1 ↗	(On Diff #41942)	(1) This test case is eventually merged with other passes besides -gvn, like -instcombine. See for example D15136. You can also take a look at earlier patches linked from the design doc where we explore even more complex pass combinations. Keeping things in their respective pass directories, like GVN/ or InstCombine/, essentially duplicates the same test case several times. As we progress through the next incremental patches, there will be more test cases that will require the same kind of duplicate tests. To avoid all this duplication, it looks fitting to just keep all the test cases in one new folder, LoadElim. I should probably rename it to InvariantInfo since these test cases are really more about the effects of manipulating InvariantInfo than they are about -gvn, -instcombine, or -globalopt. (2) -gvn already requires -basicaa. So, -basicaa would be redundant in '-basicaa -gvn'. I could add it for clarity if you prefer.

"(2) -gvn already requires -basicaa. So, -basicaa would be redundant in
'-basicaa -gvn'. I could add it for clarity if you prefer.
"
-gvn does not require -basicaa.
You've stated this a few times, and i'm not sure why you think this.

<from GVN>:
void getAnalysisUsage(AnalysisUsage &AU) const override {
  AU.addRequired<AssumptionCacheTracker>();
  AU.addRequired<DominatorTreeWrapperPass>();
  AU.addRequired<TargetLibraryInfoWrapperPass>();
  if (!NoLoads)
    AU.addRequired<MemoryDependenceAnalysis>();
  AU.addRequired<AAResultsWrapperPass>();
}

GVN requires AC, DominatorTree, and TLI all the time.
It uses AA results, which is "whatever aa you have enabled".
It requires memorydependence if you want to optimize loads.
Memory dependence, in turn, does not require basicaa either:

void MemoryDependenceAnalysis::getAnalysisUsage(AnalysisUsage &AU) const {
  AU.setPreservesAll();
  AU.addRequired<AssumptionCacheTracker>();
  AU.addRequiredTransitive<AAResultsWrapperPass>();
  AU.addRequiredTransitive<TargetLibraryInfoWrapperPass>();
}

No passes *require* basicaa, because basicaa is an optional pass providing
AA results.
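
To make the "whatever aa you have enabled" point concrete, here is a deliberately simplified, standalone model of an aggregate that forwards a query to zero or more optional providers. `AggregatedAA` and `ConstantMemoryQuery` are hypothetical names, and the any-provider-says-yes rule is an assumption of this sketch rather than a quote of LLVM's implementation.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Each enabled alias analysis is modelled as a predicate answering
// "does this pointer point to constant memory?".
using ConstantMemoryQuery = std::function<bool(const std::string &)>;

class AggregatedAA {
  std::vector<ConstantMemoryQuery> Providers;

public:
  // Registering a provider is optional; clients only see the aggregate.
  void addProvider(ConstantMemoryQuery Q) {
    assert(Q && "provider must be callable");
    Providers.push_back(std::move(Q));
  }

  // True if any registered provider can prove it. With no providers the
  // answer is the conservative "false".
  bool pointsToConstantMemory(const std::string &Ptr) const {
    for (const auto &Q : Providers)
      if (Q(Ptr))
        return true;
    return false;
  }
};
```

A client written against the aggregate still works when no provider is registered; it just gets conservative answers, which is the sense in which no pass requires a particular provider.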

In D15124#307279, @dberlin wrote:
"(2) -gvn already requires -basicaa. So, -basicaa would be redundant in
'-basicaa -gvn'. I could add it for clarity if you prefer.
"
-gvn does not require -basicaa.
You've stated this a few times, and i'm not sure why you think this.

<from GVN>:
void getAnalysisUsage(AnalysisUsage &AU) const override {
AU.addRequired<AssumptionCacheTracker>();
AU.addRequired<DominatorTreeWrapperPass>();
AU.addRequired<TargetLibraryInfoWrapperPass>();
if (!NoLoads)
  AU.addRequired<MemoryDependenceAnalysis>();
AU.addRequired<AAResultsWrapperPass>();
}

GVN requires AC, DominatorTree, and TLI all the time.
It uses AA results, which is "whatever aa you have enabled".
It requires memorydependence if you want to optimize loads.
Memory dependence, in turn, does not require basicaa either:

void MemoryDependenceAnalysis::getAnalysisUsage(AnalysisUsage &AU) const {
AU.setPreservesAll();
AU.addRequired<AssumptionCacheTracker>();
AU.addRequiredTransitive<AAResultsWrapperPass>();
AU.addRequiredTransitive<TargetLibraryInfoWrapperPass>();
}

No passes *require* basicaa, because basicaa is an optional pass providing
AA results.

Point taken. I may be misusing "require" here, but I sure understand what is going on better now thanks to this.
To clarify my statements and for added clarity from your points above, I have 2 questions:

1 - What is one to make of the dependence on BasicAAWrapperPass in the following?

INITIALIZE_PASS_BEGIN(AAResultsWrapperPass, "aa",
                      "Function Alias Analysis Results", false, true)
INITIALIZE_PASS_DEPENDENCY(BasicAAWrapperPass)
INITIALIZE_PASS_DEPENDENCY(CFLAAWrapperPass)
INITIALIZE_PASS_DEPENDENCY(ExternalAAWrapperPass)
INITIALIZE_PASS_DEPENDENCY(GlobalsAAWrapperPass)
INITIALIZE_PASS_DEPENDENCY(ObjCARCAAWrapperPass)
INITIALIZE_PASS_DEPENDENCY(SCEVAAWrapperPass)
INITIALIZE_PASS_DEPENDENCY(ScopedNoAliasAAWrapperPass)
INITIALIZE_PASS_DEPENDENCY(TypeBasedAAWrapperPass)
INITIALIZE_PASS_END(AAResultsWrapperPass, "aa",
                    "Function Alias Analysis Results", false, true)

So far as I understand, this *will* initialize (and register) a BasicAAWrapperPass pass which, when run (either via run() or runOnFunction()), will build a BasicAAResult object.

2 - There does not seem to be a difference between opt -gvn and opt -basicaa -gvn. In fact, running opt -gvn -debug-pass=Arguments and opt -basicaa -gvn -debug-pass=Arguments both return the following, where -basicaa is automatically added.

Pass Arguments:  -targetlibinfo -tti -assumption-cache-tracker -basicaa -domtree -aa -memdep -gvn -verify

I will be running benchmarks over the next week and will keep you posted. For now, I just want to see if this is headed in the right direction... Please let me know what you think.

Note: This patch does not fully implement the PostDominatorTree pass for use with the new pass manager. If this patch is viable, then one will have to fully migrate the PostDominatorTree from the legacy to the new pass manager.

Later patches will also adapt the logic of InvariantInfo::pointsToReadOnlyMemory() to other (applicable) uses of basic AA's pointsToConstantMemory(), e.g., in FunctionAttrs, LICM, for other kinds of instructions, etc. (See included FIXMEs/TODOs.)

Ensure invariant info is properly shared between multiple passes, e.g., GVN and basic AA, or -On.

Note that uses of invariant_start instructions can be other than invariant_end.

Fix typo (resulting in warning) in PostDominators.h

First pass of comments. Also, see my reply to Hal on the earlier thread as pertains to post-dominance information.

While many of these issues are small, there are a couple of pretty significant design comments below. Happy to discuss those to try to make sure you have the right design before you work on another iteration.

include/llvm/Analysis/InvariantInfo.h
72	Using an init function is not terribly common in LLVM. Why not use a constructor?
82	The style of this patch seems mixed. Sometimes methods start with a lower case letter, sometimes with an upper case one. Please try to consistently follow the LLVM coding standards.
86–97	Why are these public methods? They only appear to be called from within the implementation or from diagnostic utilities.
196–219	Why use a pass at all? You don't seem to be getting much value from the pass infrastructure currently. It looks like currently, you could just have a function like "populateInfo" which returns an InvariantRangeInfo object after analyzing that function, and the InvariantRangeInfo object could have a method called "pointsToReadOnlyMemory"? Do you have specific future changes that would require the pass infrastructure to work? It's entirely possible I'm missing something, but if so, it would be very helpful to explain this kind of detail in the comments.
include/llvm/Analysis/PostDominators.h
118–135	Please split the port of postdom to the new pass manager into a separate patch. Even if it is truly necessary, the port shouldn't be part of the patch that introduces the invariant stuff.
lib/Analysis/BasicAliasAnalysis.cpp
710	It isn't clear what value this helper function is providing. Is it valid to pass a null pointer here? I find that surprising in and of itself. Either way, I think it would be more clear to either teach the routine that is already provided by InvariantInfo.h to accept ImmutableCallSite objects, or to inline the extraction of the instruction into the places this is called below.
788–789	This formatting seems inconsistent. Run clang-format?
814–816	You haven't addressed Nick's comment that teaching BasicAA to ignore the invariant intrinsic calls themselves is a separable patch that should be split out and submitted independently. Please do so.
lib/Analysis/InvariantInfo.cpp
37–39	I don't really understand your concern here. The comments are using imprecise language in that they're not using the standard's precise terminology, but the intent seems clear: it returns whatever "T()" produces. That always produces a null pointer when T is a pointer type. All of LLVM's code (and much of Clang's) relies on the lookup method being reasonable to call when the result is a pointer. Please follow this convention as suggested by Nick originally.
90–96	You have two different API comments, one in the header, and one in the source. Please only have one comment. For public interfaces, it should be in the header. For private functions, you can put it near the implementation if you want.
lib/Analysis/PostDominators.cpp
53–68	This doesn't seem like it belongs in the postdom pass. Instead if anything it should live somewhere with the invariant handling code.
lib/Transforms/Scalar/GVN.cpp
2165–2171	This does not seem like the right design and seems over-engineered. Here is an alternative that seems much simpler: pass your InvariantRangeAnalysis or InvariantRangeInfo in as an optional parameter to MemoryDependenceAnalysis::getSimplePointerDependencyFrom and the few APIs that call it. Then GVN can opt into building the invariant information structures and passing them into the query and other passes can skip that. Then, within memdep, where we query AA, also query the invariant analysis. Perhaps there are other designs as well that would be simpler and more direct? I'm curious what others think.
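
The alternative proposed in the last comment, passing the invariant information as an optional argument that callers may omit, could look roughly like the following standalone sketch. All names here (`InvariantRangeInfo` as a plain struct, `queryCallDependency`, `DepKind`) are hypothetical, and the alias-analysis answer is reduced to a single boolean. The real `getSimplePointerDependencyFrom` has a different signature.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical result object built once per function by a client that
// opts in (for example GVN).
struct InvariantRangeInfo {
  std::set<std::string> ReadOnly;
  bool pointsToReadOnlyMemory(const std::string &Ptr) const {
    return ReadOnly.count(Ptr) != 0;
  }
};

enum class DepKind { Clobber, NoClobber };

// Toy dependency query for a call instruction. AASaysMayModify stands in
// for the alias-analysis answer. IRI is optional: callers that pass
// nothing get exactly the pre-existing behaviour.
DepKind queryCallDependency(const std::string &Ptr, bool AASaysMayModify,
                            const InvariantRangeInfo *IRI = nullptr) {
  assert(!Ptr.empty() && "query needs a pointer name");
  if (!AASaysMayModify)
    return DepKind::NoClobber;
  if (IRI && IRI->pointsToReadOnlyMemory(Ptr))
    return DepKind::NoClobber;
  return DepKind::Clobber;
}
```

The appeal of this shape is that the opt-in is visible at the call site and no global enable flag is needed. The cost is threading the extra parameter through every API between the client and the query.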

Alas, I wanted to address all these comments today, but I ended up being consumed with a bit of data crunching, while building clang with different versions of clang (with and without this patch).
For what it's worth---and my intention was for this to help this ongoing discussion, here are the results:
https://docs.google.com/spreadsheets/d/19W1167l9QFrXMXccX4-cValmC5EtlFfUbT7Tr3xaoJs/edit#gid=1063925865usp=sharing

More load instructions are deleted, as expected (an increase of a little over 9%), and there are other gains and losses (highlighted in the spreadsheet) that I haven't gotten around to analyzing yet.
In general, compilation time either decreases or increases slightly (by less than 1%).
There is a runtime improvement of 4% with clang binaries built with GCC.
Unexpectedly, there is a small runtime hit, below 0.5%, with clang binaries built with clang.

The times here were collected while other programs may have been running on my machine. I am going to rerun the scripts overnight (with all other programs turned off) and see if we get different outcomes tomorrow...

In the meantime, I welcome any suggestion for better data crunching...

I will also be addressing all the above comments tomorrow.

I'm sorry, what is this data comparing? Specifically, what is extended in
clang? Is it just this patch applied?

I hope I addressed all the comments. Let me know if I missed anything.

include/llvm/Analysis/InvariantInfo.h
72	Perhaps `set` makes more sense here than `init`. `InvariantInfo` serves two purposes. First, it tracks all invariant intrinsics in the input program. Second, it performs invariant region analysis on demand, from basic AA and GVN, based on tracked intrinsics. The region analysis requires dominance info, but tracking intrinsics, e.g. to print out or verify invariant info (via the passes `InvariantInfoPrinterPass` or `InvariantInfoVerifierPass`), does not need dominance info. So, it does not make much sense to require dominance info in the constructor of `InvariantInfo`.

Besides, requiring dominance info in the constructor does not work very well with the pass manager system(s). For example, look at the uses of `InvariantInfo` in the passes `InvariantInfoAnalysis` and `InvariantRangeAnalysisPass`. There is no guarantee that `DominatorTree` or `PostDominatorTreeT` instances will be created (and present) when these passes are instantiated. Even if the instances are present, there is no guarantee that they would be the instances that the range analysis needs to be run with.

Ideally, instead of introducing `init()` here, one would make both `InvariantRangeAnalysisPass` and `PostDominatorTree` dependencies of `BasicAAWrapperPass` (along with `DominatorTreeWrapperPass`), and pass the arguments of `init()` herein as additional arguments to `pointsToReadOnlyMemory()`. But this either stands in the way of or is redundant with two of our objectives, which are as follows.

1. Keep as much of our changes, and their effect, as possible local to GVN, and only for purposes of load elimination (for now), at least until we are okay with using post-dominance info more pervasively throughout LLVM.
2. Ensure that both dominator and post-dominator passes are instantiated whenever invariant range analysis is enabled. (Note that we can do the analysis without both passes.)

To achieve Objective #1, we rely on calls to `EnableInvariantRangeAnalysis()` from `GVN::runOnFunction()` which appropriately enable and disable the analysis. This allows us to avoid adding a new field in `BasicAAResults`, and, more generally speaking, to avoid creating situations where implementation details are unnecessarily exposed at function call sites, thereby hindering future-proofing. (In other words, the `pointsToReadOnlyMemory()` call from basic AA does not have to express that its implementation uses dominance info. That kind of detail should be left to `InvariantInfo` if possible.)

To achieve Objective #2, we rely on `InitDependencies()`, which throws a runtime exception if both passes are not available when invariant range analysis is enabled, and directly sets pointers to the passes in InvariantInfo. This is okay because we have made sure to add `PostDominatorTree` as a dependency of `GVN` that is explicitly required in `getAnalysisUsage()`. We cannot directly make the passes dependencies of `InvariantRangeAnalysisPass` because `InvariantRangeAnalysisPass` is an `ImmutablePass`...

Instead of relying on `InitDependencies()`, could we just not do range analysis if the passes are not available? Sure. But we just would not know if we've erroneously missed opportunities for load elimination. As with `init`, `InitDependencies` should probably be renamed to `SetDependencies`.
82	My mistake... I'll fix that.
86–97	Right. I'll fix these too. (I was preoccupied with the functional aspect of the implementation.)
196–219	Ok. This was addressed in at least one previous thread. In principle, the `-invariant-range-analysis` pass is similar to the `-assumption-tracker` pass, except that it enables selective data analysis in addition to data tracking. (`-assumption-tracker` only does data tracking, enabling analysis by other passes.) Like `-assumption-tracker`, `-invariant-range-analysis` populates and uses tracked data across different passes, in this case GVN and basic AA. GVN selectively enables the analysis while basic AA requests the analysis to be performed... cf. Objective #1 in earlier comment. The selective process in GVN could be (incrementally) adapted in other passes beyond GVN where it makes sense to extend basic AA with invariant range analysis, e.g., places where `pointsToConstantMemory()` is called with `OrLocal == false`.
include/llvm/Analysis/PostDominators.h
118–135	Yes. You probably missed this, but when this patch was submitted, the description indicated that the port will have to be done separately... Indeed, this is a very minimal port, only intended for getting this patch working. The actual port will have to be a bit more substantial than that... (I am ready to do it as soon as the invariant portion of this patch is agreed upon.)
lib/Analysis/BasicAliasAnalysis.cpp
503–504	I am marking these off because this extension has been moved out of here and into `pointsToReadOnlyMemory()`...
710	Ok.
788–789	Ok. Thought I did, but will try again, and pay closer attention...
788–789	This is necessary for the test cases below, where load instructions are merged into other load instructions across intrinsic calls. Consider calling `getModRef(Inst, Loc)` from memdep where `Inst` is an intrinsic_start call. Without this, `pointsToReadOnlyMemory()` does not know that `Inst` does not modify `Loc.Ptr` if `Inst` itself is not already in another invariant range over `Loc.Ptr`.
814–816	I just addressed Nick's comment above. Sorry I missed it earlier. I am not sure that all of the test cases below would still work without this... I think that keeping these around is actually an essential part of the extended logic for load elimination. I'll take another look and see...
lib/Analysis/InvariantInfo.cpp
37–39	Right. This comment is outdated by the way... The implementation was very different at the time of Nick's comment.
90–96	Ok. Will do.
140	Marking these off because they are outdated...
lib/Analysis/PostDominators.cpp
53–68	Yes. This part should remain with this pass, while the rest of the postdom migration to the new pass manager should go in the postdom migration patch.
lib/Transforms/Scalar/GVN.cpp
2165–2171	Hmm... See my earlier reply about Objective #1. This sounds like it'd basically shift the selective process from GVN into memdep, and increase opportunities for errors along the way. Note that GVN does not call `getSimplePointerDependencyFrom` directly. It starts with `getDependency` which eventually calls `getSimplePointerDependencyFrom` through multiple hoops in the call chain. I prefer to keep things as simple as possible, i.e., enable the analysis in one place and not worry about it again until the analysis is requested in basic AA... Open for alternative suggestions too.
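
The "enable the analysis in one place" approach discussed in these replies could be sketched roughly as follows. This is only a minimal, self-contained illustration and not the patch's code: `InvariantInfoSketch`, `ScopedRangeAnalysis`, and `runGVNLike` are hypothetical names, and the query body is a stub standing in for the real dominance-based analysis.

```cpp
#include <cassert>

// Hypothetical stand-in for the patch's InvariantInfo; only the enable flag
// and a stubbed query are modeled here.
struct InvariantInfoSketch {
  bool RangeAnalysisIsEnabled = false;
  unsigned NumQueriesAnswered = 0;

  void enableInvariantRangeAnalysis(bool DoEnable = true) {
    RangeAnalysisIsEnabled = DoEnable;
  }

  // Stub query: the alias-analysis side asks without knowing who enabled the
  // analysis. It answers conservatively (false) whenever it is disabled.
  bool pointsToReadOnlyMemory(bool InInvariantRange) {
    if (!RangeAnalysisIsEnabled)
      return false;
    ++NumQueriesAnswered;
    return InInvariantRange;
  }
};

// RAII guard (an assumption of this sketch): enables the analysis on entry
// and restores the previous state on exit, so the effect stays local.
class ScopedRangeAnalysis {
  InvariantInfoSketch &Info;
  bool Previous;

public:
  explicit ScopedRangeAnalysis(InvariantInfoSketch &I)
      : Info(I), Previous(I.RangeAnalysisIsEnabled) {
    Info.enableInvariantRangeAnalysis(true);
  }
  ~ScopedRangeAnalysis() { Info.enableInvariantRangeAnalysis(Previous); }
  ScopedRangeAnalysis(const ScopedRangeAnalysis &) = delete;
  ScopedRangeAnalysis &operator=(const ScopedRangeAnalysis &) = delete;
};

// Hypothetical GVN-like driver: the only place that turns the analysis on.
bool runGVNLike(InvariantInfoSketch &Info, bool InInvariantRange) {
  ScopedRangeAnalysis Guard(Info);
  return Info.pointsToReadOnlyMemory(InInvariantRange);
}
```

With this shape, a query issued outside the GVN-like driver gets the conservative answer, while the same query issued inside it is actually analyzed, and the flag is back to its previous value once the driver returns.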

In D15124#326601, @chandlerc wrote:

I'm sorry, what is this data comparing? Specifically, what is extended in
clang? Is it just this patch applied?

[Updating my earlier reply in email here -- to make sure it doesn't get lost...]
Yes, what is extended in clang is just this patch applied (i.e., anything named "opt*").
This is comparing build times, execution times, and statistical data from "-mllvm -stats" between the version of clang with this patch applied and a version of clang without the patch (i.e., anything named "cur*").

These comparison data are respectively in the sheets named "Build Clang", "Run Clang" and "Stats [*]".
Just in case, I also generated opt*- and cur*- clang binaries using both clang and gcc.
The sheets named "Stats [Clang]" and "Stats [GCC]" represent statistical data for clang built with clang and gcc, respectively.

For run times, I simply ran regression tests. There may be other ways to get better runtime measurements (?). But I thought regression tests could give us something to start with.

I also put some descriptions under the "Legend" sheet. I recognize that this could be clearer. So, please don't hesitate to ask me for any clarification.

[Update] The second pass of the benchmark script is still running, amid a few hiccups that are now resolved. With the results that I currently have, I think we have a new observation, which is as follows:
For a little over 2000 invariant ranges generated, and an average of 433 invariant range analyses computed per range (also approximately the number of instructions within each range), it looks like we get:

- a 9.26% improvement in the number of load instructions deleted (15 more loads / range),
- a 3.51% improvement in the number of instructions deleted (21 more instrs / range), and
- an 8.46% (or 2.85%) runtime improvement with clang-built (or gcc-built) binaries.

I do get slightly different stats numbers than with the first run, but with the same percentages. So, it is possible that I may have missed something while fidgeting with the hiccups. So, I will re-run all these again over the week-end and see if the numbers get more stable. I do not anticipate any more hiccups. So, if everything goes well, the new numbers should be in on Sunday evening.

I'm open to any comment. But if you prefer to wait until after the 3rd run, that's fine too. Will keep you posted.

Why post-dominance analysis? I'm glad you asked. :)
Check out this doc: https://docs.google.com/document/d/1R-gINRdpxzLy82EZK_ymnVNyOPO-tzW5ksNcHuRvxBs/edit?usp=sharing.
Comments welcome.

On the benchmark results: The previous runs have led to results that are inconsistent enough that I am revising the benchmarking environment in use. Will keep you posted once I have a more reliable environment and more consistent data.

Sorry for not mentioning it here sooner, but I've put some comments in the
document. Let me know where you'd like to conclude that discussion. I think
it will be hard to dig much further into the code without resolving that
fundamental question.

Hello All,

A quick note with a summary of my meeting with Nick (below).

Stats at:
https://docs.google.com/spreadsheets/d/19W1167l9QFrXMXccX4-cValmC5EtlFfUbT7Tr3xaoJs/edit?usp=sharing#gid=1063925865

Summary/Highlights:

The number of generated intrinsics is significant enough to validate the analysis.
Also helps to know that the stats are the same regardless of whether the host compiler for the LLVM binaries is GCC or Clang.
Expected wins from GVN:
- Number of equalities propagated: 416 : 1.24%
- Number of instructions deleted: 43905 : 3.51%
- Number of instructions simplified: 678 : 0.25%
- Number of loads deleted: 30762 : 9.26%
- Number of loads PRE’d: 9826 : 10.77%
Undesired losses from GVN:
- Number of blocks merged: -301 : -6.51%
- Number of instructions PRE’d: -1023 : -2.94%
Losses could be due to the fact that other parts of the compilation pipeline do not know about invariant intrinsics yet.
- Need to teach other passes about the intrinsics,
  - in particular, -sroa and -early-cse,
  - then -functionattrs, -inline, etc…
- The current patch does not have to wait for other passes to learn about the intrinsics, but
- The corresponding Clang patch (generating the intrinsics) will have to wait until the passes are properly taught.
A positive effect in -instcombine:
- Number of constant folds, instructions combined, and instructions sunk.
- GVN may have set this up well for -instcombine, even though -instcombine does not know about invariant intrinsics yet!

Have a good weekend! -- Larisse.
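
As a rough sanity check on the figures quoted in this thread, the totals in the summary can be compared against the per-range rates mentioned earlier (15 more loads and 21 more instructions per range). The sketch below is only arithmetic on those quoted numbers. It assumes the totals and the per-range rates come from comparable runs and that each percentage is a gain relative to the unpatched count; neither assumption is confirmed in the thread. The helper names are hypothetical.

```cpp
#include <cassert>

// Implied number of invariant ranges if a total delta is spread evenly at
// the given per-range rate.
double impliedRangeCount(double TotalDelta, double PerRange) {
  assert(PerRange > 0 && "per-range rate must be positive");
  return TotalDelta / PerRange;
}

// Implied unpatched count if Delta is PercentGain percent of it
// (an interpretation assumed by this sketch).
double impliedBaseline(double Delta, double PercentGain) {
  assert(PercentGain > 0 && "percentage must be positive");
  return Delta / (PercentGain / 100.0);
}
```

For example, `impliedRangeCount(30762, 15)` and `impliedRangeCount(43905, 21)` give about 2,051 and 2,091 ranges, which is in line with "a little over 2000 invariant ranges".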

In an effort to keep review comments within the code review system, I'm uploading here a PDF snapshot of the document arguing for post-dominance analysis (https://docs.google.com/document/d/1R-gINRdpxzLy82EZK_ymnVNyOPO-tzW5ksNcHuRvxBs/edit?usp=sharing).

Here's the PDF snapshot:

InvariantintrinsicsforloadeliminationinLLVM.pdf (392 KB)

Some thoughts going through my head... I'm reading through Keith D. Cooper and Linda Torczon's book, "Engineering a Compiler", 10.3.1 [Eliminating Useless Code]; and here are some excerpts:

"The algorithm [...] consists of two passes. The first [...] discovers the set of useful operations. The second [...] removes useless operations. [The first] relies on reverse dominance frontiers [...]"
"The treatment of operations other than branches or jumps is straightforward. The marking phase determines whether an operation is useful. The sweep phase removes operations that have not been marked as useful."
"The treatment of control-flow operations is more complex. [...] As the marking phase discovers useful operations, it also marks the appropriate branches as useful. To map from a marked operation to the branches that it makes useful, the algorithm relies on the notion of control dependence."
"The definition of control dependence relies on postdominance. [...]. In a CFG, node j is control-dependent on node i if and only if (1) [...] j postdominates every node on the path after i. [...] and (2) j does not strictly postdominate i. [...]"
"The notion of control dependence is captured precisely by the reverse dominance frontier of j [...]. Reverse dominance frontiers are simply dominance frontiers computed on the reverse CFG."

My thoughts are:

The main difference between the algorithm in the book and what we are doing is twofold:
1. through getModRef() in -memdep, we identify useless operations instead of useful ones, and
2. we do so using whether an invariant_end is control-dependent on a load-clobber, instead of whether a branch is control-dependent on a useful operation.
Control dependence is said to "require postdominance" and to be "captured precisely by the reverse dominance frontier", which is "simply dominance frontier computed on the reverse CFG"; this is exactly what our current implementation of postdominator as a dual of dominator gives us.
The book does not seem to reference doing control-dependence any other way. So I highly doubt that we can get to a complete and sound solution for this range analysis without postdominance.

Thoughts? Did I miss anything? If so, what did I miss?

Thanks,

Larisse.
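
To make the quoted definition of control dependence concrete, here is a minimal sketch that computes postdominator sets with a simple iterative dataflow and then tests control dependence. It is not the patch's implementation and does not use LLVM's `PostDominatorTree`. The graph representation, `computePostDominators`, and `isControlDependent` are hypothetical names, the CFG is assumed to have a single exit that every node reaches, and the test uses the common formulation "J postdominates some successor of I but does not strictly postdominate I".

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CFG as successor lists; nodes are 0..N-1 and Exit is the unique exit node.
using Graph = std::vector<std::vector<int>>;

// Result[n][m] == true means m postdominates n, i.e. every path from n to
// Exit passes through m. Assumes every node can reach Exit.
std::vector<std::vector<bool>> computePostDominators(const Graph &Succ,
                                                     int Exit) {
  const std::size_t N = Succ.size();
  assert(Exit >= 0 && static_cast<std::size_t>(Exit) < N);
  // Start from "everything postdominates everything" and shrink.
  std::vector<std::vector<bool>> PostDom(N, std::vector<bool>(N, true));
  PostDom[Exit].assign(N, false);
  PostDom[Exit][Exit] = true;

  bool Changed = true;
  while (Changed) {
    Changed = false;
    for (std::size_t n = 0; n < N; ++n) {
      if (static_cast<int>(n) == Exit)
        continue;
      // Intersect the sets of all successors, then add the node itself.
      std::vector<bool> New(N, true);
      for (int S : Succ[n])
        for (std::size_t m = 0; m < N; ++m)
          New[m] = New[m] && PostDom[S][m];
      New[n] = true;
      if (New != PostDom[n]) {
        PostDom[n] = New;
        Changed = true;
      }
    }
  }
  return PostDom;
}

// J is control-dependent on I if J postdominates some successor of I but
// does not strictly postdominate I itself.
bool isControlDependent(const Graph &Succ,
                        const std::vector<std::vector<bool>> &PostDom, int J,
                        int I) {
  if (J != I && PostDom[I][J])
    return false; // J strictly postdominates I.
  for (int S : Succ[I])
    if (PostDom[S][J])
      return true;
  return false;
}
```

On a diamond-shaped CFG (0 branches to 1 and 2, both of which fall through to the exit 3), nodes 1 and 2 are control-dependent on node 0, while node 3 is not because it postdominates node 0.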

Update the patch with respect to API changes in LLVM since the last version. Of particular interest here:

The PostDominators API has been restructured as suggested from the last version of this patch, which simplifies this patch.
The count of invariant range analysis requests has been restricted to non-trivial requests.

Update:
Over the past few weeks, I have been researching how to best benchmark this work, and am finalizing an infrastructure that will help with this and other similar extensions (where existing benchmarks do not cover enough of some given extensions' use case).
I am hoping to use that infrastructure next week or so to crunch some useful data for this discussion.

Benchmarking put aside, could someone please remind me what the fundamental concern to pushing this in right now is?
The only thing that comes to mind is that this makes use of undefined behavior that is currently not handled by UBSan; which is thus likely to cause regressions on some systems. To get around this, until UBSan checks for const objects, we could disable the pass by default.

After this patch goes in, note that there is still a lot of work that must be done, incrementally, to:
1 - teach other LLVM passes about the invariant intrinsics,
2 - broaden the application of the intrinsics to more complex const objects like const member objects, and
3 - generate the intrinsics from the frontend.

So, the sooner this goes in, the quicker the progress in actually using "const" (and similar features) for optimization purposes.
The later this goes in, the more outdated my gazillion previous patches (which are waiting in the background) will be. :)

Please advise.

Thanks,

Larisse.

Syntax change -- in the way that unused variable warnings are disabled.

reames resigned from this revision.Apr 18 2016, 6:07 PM

reames removed a reviewer: reames.

Benchmarking put aside, could someone please remind me what the fundamental concern to pushing this in right now is?

It's been a while, so I apologize if I'm off base on this...

I recall being concerned about the use of single-invariant-end postdominance here, because it is not, fundamentally, the correct property. You don't care that a single invariant-end call postdominates the access, but that the set of all such calls does. As you allude to in the PDT change, once exceptions are enabled, there are multiple paths on which a destructor is called. If I assume that, in general, you want an invariant-start intrinsic after the constructor call, and an invariant-end call before the destructor call, then you need to deal with this property. Even if Clang always generates one cleanup block, and even when exceptions are disabled, nothing prevents other passes from "tail merging" the cleanup block. Given that, with exceptions enabled, we already have the more-general case of multiple ends per start, we should design this around a scheme that can handle that situation.
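
The "set of all invariant-end calls" property described above can be illustrated with a small reachability check. This is a sketch at basic-block granularity and not a proposal for the patch's API: `endsCollectivelyPostDominate` and the graph representation are hypothetical, and a single exit node is assumed. The idea is that a set of end blocks collectively postdominates an access exactly when the exit is unreachable from the access once those blocks are treated as barriers.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// CFG as successor lists; nodes are 0..N-1.
using Graph = std::vector<std::vector<int>>;

// Returns true if every path from Start to Exit passes through at least one
// block in Ends. Implemented as a search from Start that never enters an
// End block: if that search still reaches Exit, some path avoids all ends.
bool endsCollectivelyPostDominate(const Graph &Succ, int Start, int Exit,
                                  const std::set<int> &Ends) {
  assert(Start >= 0 && static_cast<std::size_t>(Start) < Succ.size());
  if (Ends.count(Start))
    return true;
  std::vector<bool> Seen(Succ.size(), false);
  std::vector<int> Worklist{Start};
  Seen[Start] = true;
  while (!Worklist.empty()) {
    int N = Worklist.back();
    Worklist.pop_back();
    if (N == Exit)
      return false; // Found a path to the exit that avoids every end block.
    for (int S : Succ[N]) {
      if (Seen[S] || Ends.count(S))
        continue;
      Seen[S] = true;
      Worklist.push_back(S);
    }
  }
  return true;
}
```

For an access in block 0 with a normal cleanup block 1 and an unwind cleanup block 2, both leading to exit 3, neither `{1}` nor `{2}` alone satisfies the check, but `{1, 2}` does, which is the multiple-ends-per-start situation raised in the comment.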

lib/Analysis/BasicAliasAnalysis.cpp
797	By "all call sites" what do you mean? Invokes? Please either make the FIXME more explanatory, or remove it.
lib/Analysis/PostDominators.cpp
53–68	I don't understand your response. Modifying the definition of post-dominance to add a special case for exception unwinding desired by the invariant analysis seems unwise - i.e. likely to lead to bugs in other users of this code.
lib/Transforms/InstCombine/InstructionCombining.cpp
1982	Why is Intrinsic::objectsize handling in this patch?

george.burgess.iv added a subscriber: george.burgess.iv.Apr 26 2016, 9:21 PM

Revision Contents

Path: Size

include/llvm/ (file names not preserved): 52 lines, 231 lines, 10 lines, 1 line
include/llvm/Transforms/Scalar/GVN.h: 3 lines
lib/Analysis/BasicAliasAnalysis.cpp: 38 lines
lib/Analysis/CMakeLists.txt: 1 line
lib/Analysis/InvariantInfo.cpp: 406 lines
lib/Analysis/MemoryDependenceAnalysis.cpp: 7 lines
lib/Analysis/PostDominators.cpp: 54 lines
lib/Passes/PassBuilder.cpp: 1 line
lib/Passes/PassRegistry.def: 3 lines
lib/Transforms/InstCombine/InstructionCombining.cpp: 10 lines
lib/Transforms/Scalar/GVN.cpp: 33 lines
test/Analysis/InvariantInfo/ (file names not preserved): 128 lines, 158 lines, 42 lines, 100 lines, 100 lines

Diff 52373

include/llvm/Analysis/BasicAliasAnalysis.h

[23 lines not shown]
 #include "llvm/IR/LLVMContext.h"
 #include "llvm/IR/Module.h"
 #include "llvm/IR/PassManager.h"
 #include "llvm/Support/ErrorHandling.h"

 namespace llvm {
 class AssumptionCache;
 class DominatorTree;
+struct InvariantInfo;
 class LoopInfo;

 /// This is the AA result object for the basic, local, and stateless alias
 /// analysis. It implements the AA query interface in an entirely stateless
 /// manner. As one consequence, it is never invalidated. While it does retain
 /// some storage, that is used as an optimization and not to preserve
 /// information from query to query.
 class BasicAAResult : public AAResultBase<BasicAAResult> {
   friend AAResultBase<BasicAAResult>;

   const DataLayout &DL;
   const TargetLibraryInfo &TLI;
   AssumptionCache &AC;
   DominatorTree *DT;
+  InvariantInfo *InvRA;
   LoopInfo *LI;

 public:
   BasicAAResult(const DataLayout &DL, const TargetLibraryInfo &TLI,
                 AssumptionCache &AC, DominatorTree *DT = nullptr,
-                LoopInfo *LI = nullptr)
-      : AAResultBase(), DL(DL), TLI(TLI), AC(AC), DT(DT), LI(LI) {}
+                InvariantInfo *InvInfo = nullptr, LoopInfo *LI = nullptr)
+      : AAResultBase(), DL(DL), TLI(TLI), AC(AC), DT(DT), InvRA(InvInfo), LI(LI) {}

   BasicAAResult(const BasicAAResult &Arg)
-      : AAResultBase(Arg), DL(Arg.DL), TLI(Arg.TLI), AC(Arg.AC), DT(Arg.DT),
-        LI(Arg.LI) {}
+      : AAResultBase(), DL(Arg.DL), TLI(Arg.TLI), AC(Arg.AC), DT(Arg.DT),
+        InvRA(Arg.InvRA), LI(Arg.LI) {}
   BasicAAResult(BasicAAResult &&Arg)
       : AAResultBase(std::move(Arg)), DL(Arg.DL), TLI(Arg.TLI), AC(Arg.AC),
-        DT(Arg.DT), LI(Arg.LI) {}
+        DT(Arg.DT), InvRA(Arg.InvRA), LI(Arg.LI) {}

   /// Handle invalidation events from the new pass manager.
   ///
   /// By definition, this result is stateless and so remains valid.
   bool invalidate(Function &, const PreservedAnalyses &) { return false; }

   AliasResult alias(const MemoryLocation &LocA, const MemoryLocation &LocB);

   ModRefInfo getModRefInfo(ImmutableCallSite CS, const MemoryLocation &Loc);

   ModRefInfo getModRefInfo(ImmutableCallSite CS1, ImmutableCallSite CS2);

   /// Chases pointers until we find a (constant global) or not.
   bool pointsToConstantMemory(const MemoryLocation &Loc, bool OrLocal);

   /// Get the location associated with a pointer argument of a callsite.
   ModRefInfo getArgModRefInfo(ImmutableCallSite CS, unsigned ArgIdx);

   /// Returns the behavior when calling the given call site.
   FunctionModRefBehavior getModRefBehavior(ImmutableCallSite CS);

   /// Returns the behavior when calling the given function. For use when the
   /// call site is not known.
   FunctionModRefBehavior getModRefBehavior(const Function *F);

 private:
   // A linear transformation of a Value; this class represents ZExt(SExt(V,
   // SExtBits), ZExtBits) * Scale + Offset.
   struct VariableGEPIndex {

     // An opaque Value - we can't decompose this further.
     const Value *V;
[128 lines not shown]

include/llvm/Analysis/InvariantInfo.h

This file was added.

				//===-------- llvm/InvariantInfo.h - invariant_start/end info ---- C++ --===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				//
				// This file defines properties for handling invariant_start/end instructions
				// for purposes of load elimination.
				//
				//===----------------------------------------------------------------------===//

				#ifndef LLVM_ANALYSIS_INVARIANTINFO_H
				#define LLVM_ANALYSIS_INVARIANTINFO_H

				#include "llvm/ADT/DenseMap.h"
				#include "llvm/ADT/SmallSet.h"
				#include "llvm/ADT/SmallVector.h"
				#include "llvm/ADT/iterator_range.h"
				#include "llvm/Analysis/MemoryLocation.h"
				#include "llvm/Pass.h"
				#include "llvm/IR/PassManager.h"

				nlewycky (Done): Please alphabetize.

				namespace llvm {
				template <typename IRUnitT>
				class AnalysisManager;
				dnovillo (Done): Documentation does not match the class name. I'd just get rid of the class name in the doc string.
				class DataLayout;
				nlewycky (Not Done): I don't understand the connection between "writeonce" (which doesn't seem to be defined anywhere? link?) and invariant.start/end. I can have as many invariant intrinsics over the same memory as I like, there's nothing 'once' about it. There's also no guarantee that the memory ever gets written to at any point in particular.
				class DominatorTree;
				nlewycky (Done): Typo, "intrisic_start" should be "intrinsic_start".
				lvoufo (Author, Not Done): The design doc stated that this was all part of simulating 'writeonce' behavior in LLVM (+Clang)... Ideally, invariant.start instructions are generated right after first write (e.g. construction) and LLVM would reject any write after an invariant_start and before a corresponding invariant_end, but we are not enforcing this in LLVM. Documentation like this is meant to specify (and clarify) the intended use of invariant.start/end (for simulating writeonce semantics). If it makes things even clearer, I could update the LangRef to this effect later (?).
				class FunctionPass;
				class Instruction;
				dnovillo (Done): Actually, this'd be used by any pass that needs to understand write-once objects, right?
				lvoufo (Author, Done): Yes. This comment was specific to this particular design. But I could generalize it.
				class IntrinsicInst;
				class LoadInst;
				class PreservedAnalyses;
				struct PostDominatorTree;
				class Value;

				/// A data structure to (lazily) collect and process invariant intrinsic calls;
				/// necessary for invariant range analysis via \c InvariantRangeAnalysisPass or
				/// \c InvariantInfoAnalysis.
				nlewycky (Done): In the comment you say Address but the argument is named Addr. Either you meant to say "address" or "Addr" in the comment, or change "Addr" to "Address" in the function declaration.
				struct InvariantInfo {
				lvoufo (Author, Not Done): meant "address", not "Address".
				typedef SmallVector<const IntrinsicInst *, 8> InvListT;
				typedef SmallDenseMap<const Value *, InvListT> InvListMapT;
				nlewycky (Not Done): I looked up the "certain preconditions" and they appear to be "if Addr is a global or an alloca". Why those preconditions? Why not llvm::isIdentifiedObject, for example? Why aren't these preconditions mentioned on SetStartInstruction? True if removed, false if preconditions are not met ... what if the preconditions are met and it's not removed (because it's not a member)?
				lvoufo (Author, Not Done): Yeah. I think this is a case of poorly-named functions after code restructuring... I'll reorganize everything accordingly.
				typedef SmallDenseMap<const Function *, InvListMapT> InvListMapMapT;

				private:
				/// \brief A mapping of allocated memory pointer values to corresponding
				dnovillo (Done): Likewise here. No need to duplicate the class name in the docstring.
				/// invariant_start/end ranges; Keeps track of the functions that the
				/// invariant intrinsics are called from, as well as the addresses that the
				/// intrinsics are associated with.
				InvListMapMapT AllInvariantsMaps;

				/// \brief Keeps track of functions for which invariant intrinsic calls have
				/// been collected into \c AllInvariantsMaps.
				SmallSet<const Function *, 8> InvariantsComputed;

				/// \brief The dominator and post-dominator trees, the analyses of which are
				nlewycky (Done): Odd line wrapping?
				/// required by invariant range analysis.
				DominatorTree *DT;
				PostDominatorTree *PDT;

				/// \brief Determines whether range analysis is enabled. If not, it will not
				/// be computed (cf. \c queriedFromInvariantRange()). This helps keep the
				/// effect of the analysis local to specific passes like GVN.
				bool RangeAnalysisIsEnabled;

				public:
				InvariantInfo() : DT(nullptr), PDT(nullptr), RangeAnalysisIsEnabled(false) {}

				/// \brief Sets the dominator and post-dominator trees.
				chandlerc (Not Done): Using an init function is not terribly common in LLVM. Why not use a constructor?
				lvoufo (Author, Not Done): Perhaps `set` makes more sense here than `init`. `InvariantInfo` serves two purposes. First, it tracks all invariant intrinsics in the input program. Second, it performs invariant region analysis on demand, from basic AA and GVN, based on tracked intrinsics. The region analysis requires dominance info, but tracking intrinsics, e.g. to print out or verify invariant info (via the passes `InvariantInfoPrinterPass` or `InvariantInfoVerifierPass`), does not need dominance info. So, it does not make much sense to require dominance info in the constructor of `InvariantInfo`. Besides, requiring dominance info in the constructor does not work very well with the pass manager system(s). For example, look at the uses of `InvariantInfo` in the passes `InvariantInfoAnalysis` and `InvariantRangeAnalysisPass`. There is no guarantee that `DominatorTree` or `PostDominatorTreeT` instances will be created (and present) when these passes are instantiated. Even if the instances are present, there is no guarantee that they would be the instances that the range analysis needs to be run with. Ideally, instead of introducing `init()` here, one would make both `InvariantRangeAnalysisPass` and `PostDominatorTree` dependencies of `BasicAAWrapperPass` (along with `DominatorTreeWrapperPass`), and pass the arguments of `init()` herein as additional arguments to `pointsToReadOnlyMemory()`. But this either stands in the way of or is redundant with two of our objectives, which are as follows.
				1. Keep as much of our changes -- and their effect -- as possible local to GVN, and only for purposes of load elimination (for now)---at least until we are okay with using post-dominance info more pervasively throughout LLVM.
				2. Ensure that both dominator and post-dominator passes are instantiated whenever invariant range analysis is enabled. (Note that we can do the analysis without both passes.)
				To achieve Objective #1, we rely on calls to `EnableInvariantRangeAnalysis()` from `GVN::runOnFunction()` which appropriately enable and disable the analysis. This allows us to avoid adding a new field in `BasicAAResults`, and, more generally speaking, to avoid creating situations where implementation details are unnecessarily exposed at function call sites thereby hindering future-proof-ness. (In other words, the `pointsToReadOnlyMemory()` call from basic AA does not have to express that its implementation uses dominance info. That kind of detail should be left to `InvariantInfo` if possible.)
				To achieve Objective #2, we rely on `InitDependencies()`, which throws a runtime exception if both passes are not available when invariant range analysis is enabled, and directly sets pointers to the passes in InvariantInfo. This is okay because we have made sure to add `PostDominatorTree` as a dependency of `GVN` that is explicitly required in `getAnalysisUsage()`. We cannot directly make the passes dependencies of `InvariantRangeAnalysisPass` because `InvariantRangeAnalysisPass` is an `ImmutablePass`... Instead of relying on `InitDependencies()`, could we just not do range analysis if the passes are not available? Sure. But we just would not know if we've erroneously missed opportunities for load elimination. As per `init`, `InitDependencies` should probably be renamed to `SetDependencies`.
				void init(DominatorTree *DomTree, PostDominatorTree *PostDomTree) {
				// If the dominator and postdominator trees have changed, then
				// repeat the analysis.
				if (DT != DomTree || PDT != PostDomTree)
				InstructionsInInvariantRange.shrink_and_clear();
				DT = DomTree;
				PDT = PostDomTree;
				}

				/// \brief Enables invariant range analysis.
				nlewycky (Done): Typo, "intructions" -> "instructions".
				chandlerc (Not Done): The style of this patch seems mixed. Sometimes methods start with a lower case, sometimes an upper case. Please try to consistently follow the LLVM coding standards.
				lvoufo (Author, Not Done): My mistake... I'll fix that.
				void EnableInvariantRangeAnalysis(bool DoEnable = true) {
				RangeAnalysisIsEnabled = DoEnable;
				nlewycky (Done): The comment fails to explain what it does. Looking at the code, it appears to update InvInfo by unsetting the start instruction when II is a start. It also has an assert to check that things are sane when II is an end, but since assertions can be disabled, that's not really part of its functionality. It also returns true iff II is either start or end intrinsic. It wasn't clear the first time through how closely tied this is to the operation of PreserveInvariantInfoRAII.
				lvoufo (Author, Not Done): Right. This could be clearer. I mentioned in earlier comments that I may be getting rid of this function altogether (in the next revision of this patch), if I can't rename it appropriately, as part of separating the extended logic from the order in which GVN processes instructions.
				}

				/// \brief Retrieves the part of the invariants mapping that is associated
				/// with the given function.
				InvListMapT &getInvariantsMap(const Function &F);

				/// \brief Access the invariants mapping to extract invariant_start/end
				/// instructions that are associated with the given address.
				InvListT &getInvariants(const Function &F, const Value *Addr);

				/// \brief Adds an entry to the invariants mapping.
				void addInvariant(const Function &F, const Value *Addr,
				const IntrinsicInst *IStart);
				chandlerc: Why are these public methods? They only appear to be called from within the implementation or from diagnostic utilities.
				lvoufo: Right. I'll fix these too. (I was preoccupied with the functional aspect of the implementation.)

				/// \brief Populates the invariants mapping for the given function.
				void populateInfo(const Function &F);

				/// \brief Determines if the given memory location can be modified by the
				/// given instruction, based on invariant range analysis.
				bool pointsToReadOnlyMemory(const Instruction *I, const MemoryLocation &Loc,
				const DataLayout &DL);

				private:
				/// \brief Performs invariant range analysis for some given instruction,
				/// based on some allocated memory address:
				// If the instruction accesses the given address, then this checks whether
				/// there is an invariant range (over the address) that the instruction
				/// belongs in.
				/// NOTE: Range analysis must be enabled for this computation to go through.
				bool queriedFromInvariantRange(const Instruction *I, const Value *Addr);

				typedef SmallSet<const Instruction *, 8> InstListT;
				SmallDenseMap<const Value *, InstListT> InstructionsInInvariantRange;

				/// \brief Determines if the given instruction was previously found to
				/// be in an invariant range over the given address; avoids recomputing
				/// \c queriedFromInvariantRange().
				/// FIXME: Try to optimize this by keeping track of computed ranges as well,
				/// and then checking the relative positions of the instructions...
				bool isInInvariantRange(const Value *Addr, const Instruction *I) {
				return InstructionsInInvariantRange[Addr].count(I);
				}

				/// \brief Indicates that the given instruction has been found to
				/// be in an invariant range over the given address.
				void markInInvariantRange(const Value *Addr, const Instruction *I) {
				InstructionsInInvariantRange[Addr].insert(I);
				}

				public:
				/// \brief Various helper functionalities for related analysis passes, e.g.,
				/// \c InvariantInfoAnalysis, \c InvariantInfoPrinterPass,
				/// \c InvariantInfoVerifierPass and \c InvariantRangeAnalysisPass (below).
				/// @{
				void clear();
				void verify() const;
				void verify(const Function &F) const;
				void print(raw_ostream &OS) const;
				void print(raw_ostream &OS, const Function &F) const;
				#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
				void dump() const;
				#endif
				/// @}
				};

				/// \brief A function analysis which provides an \c InvariantInfo.
				///
				/// This analysis is intended for use with the new pass manager and will vend
				/// invariant info for a given function.
				class InvariantInfoAnalysis
				: public AnalysisInfoMixin<InvariantInfoAnalysis> {
				friend AnalysisInfoMixin<InvariantInfoAnalysis>;
				static char PassID;

				public:
				typedef InvariantInfo Result;

				/// \brief Opaque, unique identifier for this analysis pass.
				static void *ID() { return (void *)&PassID; }

				/// \brief Provide a name for the analysis for debugging and logging.
				static StringRef name() { return "InvariantInfoAnalysis"; }

				InvariantInfoAnalysis() {}
				InvariantInfoAnalysis(const InvariantInfoAnalysis &Arg) {}
				InvariantInfoAnalysis(InvariantInfoAnalysis &&Arg) {}
				InvariantInfoAnalysis &operator=(const InvariantInfoAnalysis &RHS) {
				return *this;
				}
				InvariantInfoAnalysis &operator=(InvariantInfoAnalysis &&RHS) {
				return *this;
				}

				InvariantInfo run(Function &F, AnalysisManager<Function> &AM) {
				return InvariantInfo();
				}
				};

				/// \brief Printer pass for the \c InvariantInfoAnalysis results.
				class InvariantInfoPrinterPass
				: public AnalysisInfoMixin<InvariantInfoPrinterPass> {
				raw_ostream &OS;

				public:
				explicit InvariantInfoPrinterPass(raw_ostream &OS) : OS(OS) {}
				PreservedAnalyses run(Function &F, AnalysisManager<Function> &AM);
				static StringRef name() { return "InvariantInfoPrinterPass"; }
				};

				/// \brief Verifier pass for the \c InvariantInfoAnalysis results.
				struct InvariantInfoVerifierPass
				: public AnalysisInfoMixin<InvariantInfoVerifierPass> {
				PreservedAnalyses run(Function &F, AnalysisManager<Function> &AM);
				static StringRef name() { return "InvariantInfoVerifierPass"; }
				};

				/// \brief The -invariant-range-analysis pass.
				class InvariantRangeAnalysisPass : public ImmutablePass {
				InvariantInfo InvInfo;

				public:
				static char ID; // Pass identification
				explicit InvariantRangeAnalysisPass();
				~InvariantRangeAnalysisPass() override;

				InvariantInfo &getInvariantInfo() { return InvInfo; }
				const InvariantInfo &getInvariantInfo() const { return InvInfo; }

				void InitDependencies(FunctionPass &P);

				void releaseMemory() override { InvInfo.clear(); }
				void verifyAnalysis() const override;
				void getAnalysisUsage(AnalysisUsage &AU) const override;
				void print(raw_ostream &OS, const Module *) const override;
				void dump() const;
				chandlerc: Why use a pass at all? You don't seem to be getting much value from the pass infrastructure currently. It looks like currently, you could just have a function like "populateInfo" which returns an InvariantRangeInfo object after analyzing that function, and the InvariantRangeInfo object could have a method called "pointsToReadOnlyMemory"? Do you have specific future changes that would require the pass infrastructure to work? It's entirely possible I'm missing something, but if so, it would be very helpful to explain this kind of detail in the comments.
				lvoufo: Ok. This was addressed in at least one previous thread. In principle, the `-invariant-range-analysis` pass is similar to the `-assumption-tracker` pass, except that it enables selective data analysis in addition to data tracking. (`-assumption-tracker` only does data tracking, enabling analysis by other passes.) Like `-assumption-tracker`, `-invariant-range-analysis` populates and uses tracked data across different passes, in this case GVN and basic AA. GVN selectively enables the analysis while basic AA requests the analysis to be performed (cf. Objective #1 in the earlier comment). The selective process in GVN could be (incrementally) adapted in other passes beyond GVN where it makes sense to extend basic AA with invariant range analysis, e.g., places where `pointsToConstantMemory()` is called with `OrLocal == false`.
				bool doFinalization(Module &) override {
				verifyAnalysis();
				return false;
				}
				};

				/// \brief Check if the given instruction is an invariant_start/end.
				bool IsInvariantIntrinsic(const IntrinsicInst *const II);

				} // End llvm namespace

				#endif

include/llvm/Analysis/PostDominators.h

[... 28 lines not shown ...]
struct PostDominatorTree : public DominatorTreeBase<BasicBlock> {

PostDominatorTree(PostDominatorTree &&Arg)		PostDominatorTree(PostDominatorTree &&Arg)
: Base(std::move(static_cast<Base &>(Arg))) {}		: Base(std::move(static_cast<Base &>(Arg))) {}

PostDominatorTree &operator=(PostDominatorTree &&RHS) {		PostDominatorTree &operator=(PostDominatorTree &&RHS) {
Base::operator=(std::move(static_cast<Base &>(RHS)));		Base::operator=(std::move(static_cast<Base &>(RHS)));
return *this;		return *this;
}		}

		bool dominates(const Instruction *Def, const Instruction *User) const;

		inline bool dominates(DomTreeNode *A, DomTreeNode *B) const {
		return Base::dominates(A, B);
		}

		inline bool dominates(const BasicBlock *A, const BasicBlock *B) const {
		return Base::dominates(A, B);
		}
};		};

/// \brief Analysis pass which computes a \c PostDominatorTree.		/// \brief Analysis pass which computes a \c PostDominatorTree.
class PostDominatorTreeAnalysis		class PostDominatorTreeAnalysis
: public AnalysisInfoMixin<PostDominatorTreeAnalysis> {		: public AnalysisInfoMixin<PostDominatorTreeAnalysis> {
friend AnalysisInfoMixin<PostDominatorTreeAnalysis>;		friend AnalysisInfoMixin<PostDominatorTreeAnalysis>;
static char PassID;		static char PassID;

[... 55 lines not shown ...]
else
return df_end(getEntryNode(N));		return df_end(getEntryNode(N));
}		}

static nodes_iterator nodes_end(PostDominatorTree *N) {		static nodes_iterator nodes_end(PostDominatorTree *N) {
return df_end(getEntryNode(N));		return df_end(getEntryNode(N));
}		}
};		};

} // End llvm namespace		} // End llvm namespace

#endif		#endif

include/llvm/InitializePasses.h

[... 149 lines not shown ...]
	void initializeIndVarSimplifyPass(PassRegistry&);			void initializeIndVarSimplifyPass(PassRegistry&);
	void initializeInferFunctionAttrsLegacyPassPass(PassRegistry&);			void initializeInferFunctionAttrsLegacyPassPass(PassRegistry&);
	void initializeInlineCostAnalysisPass(PassRegistry&);			void initializeInlineCostAnalysisPass(PassRegistry&);
	void initializeInstructionCombiningPassPass(PassRegistry&);			void initializeInstructionCombiningPassPass(PassRegistry&);
	void initializeInstCountPass(PassRegistry&);			void initializeInstCountPass(PassRegistry&);
	void initializeInstNamerPass(PassRegistry&);			void initializeInstNamerPass(PassRegistry&);
	void initializeInternalizePassPass(PassRegistry&);			void initializeInternalizePassPass(PassRegistry&);
	void initializeIntervalPartitionPass(PassRegistry&);			void initializeIntervalPartitionPass(PassRegistry&);
				void initializeInvariantRangeAnalysisPassPass(PassRegistry &);
	void initializeIRTranslatorPass(PassRegistry &);			void initializeIRTranslatorPass(PassRegistry &);
	void initializeJumpThreadingPass(PassRegistry&);			void initializeJumpThreadingPass(PassRegistry&);
	void initializeLCSSAPass(PassRegistry&);			void initializeLCSSAPass(PassRegistry&);
	void initializeLICMPass(PassRegistry&);			void initializeLICMPass(PassRegistry&);
	void initializeLazyValueInfoPass(PassRegistry&);			void initializeLazyValueInfoPass(PassRegistry&);
	void initializeLintPass(PassRegistry&);			void initializeLintPass(PassRegistry&);
	void initializeLiveDebugVariablesPass(PassRegistry&);			void initializeLiveDebugVariablesPass(PassRegistry&);
	void initializeLiveIntervalsPass(PassRegistry&);			void initializeLiveIntervalsPass(PassRegistry&);
[... 165 lines not shown ...]

include/llvm/Transforms/Scalar/GVN.h

[... 16 lines not shown ...]
#define LLVM_TRANSFORMS_SCALAR_GVN_H		#define LLVM_TRANSFORMS_SCALAR_GVN_H

#include "llvm/ADT/DenseMap.h"		#include "llvm/ADT/DenseMap.h"
#include "llvm/ADT/MapVector.h"		#include "llvm/ADT/MapVector.h"
#include "llvm/ADT/SetVector.h"		#include "llvm/ADT/SetVector.h"
#include "llvm/ADT/SmallPtrSet.h"		#include "llvm/ADT/SmallPtrSet.h"
#include "llvm/Analysis/AliasAnalysis.h"		#include "llvm/Analysis/AliasAnalysis.h"
#include "llvm/Analysis/AssumptionCache.h"		#include "llvm/Analysis/AssumptionCache.h"
		#include "llvm/Analysis/InvariantInfo.h"
#include "llvm/Analysis/MemoryDependenceAnalysis.h"		#include "llvm/Analysis/MemoryDependenceAnalysis.h"
#include "llvm/IR/Dominators.h"		#include "llvm/IR/Dominators.h"
#include "llvm/IR/IntrinsicInst.h"		#include "llvm/IR/IntrinsicInst.h"
#include "llvm/IR/PassManager.h"		#include "llvm/IR/PassManager.h"

namespace llvm {		namespace llvm {

/// A private "module" namespace for types and utilities used by GVN. These		/// A private "module" namespace for types and utilities used by GVN. These
[... 97 lines not shown ...]
private:
SmallVector<Instruction *, 8> InstrsToErase;		SmallVector<Instruction *, 8> InstrsToErase;

typedef SmallVector<NonLocalDepResult, 64> LoadDepVect;		typedef SmallVector<NonLocalDepResult, 64> LoadDepVect;
typedef SmallVector<gvn::AvailableValueInBlock, 64> AvailValInBlkVect;		typedef SmallVector<gvn::AvailableValueInBlock, 64> AvailValInBlkVect;
typedef SmallVector<BasicBlock *, 64> UnavailBlkVect;		typedef SmallVector<BasicBlock *, 64> UnavailBlkVect;

bool runImpl(Function &F, AssumptionCache &RunAC, DominatorTree &RunDT,		bool runImpl(Function &F, AssumptionCache &RunAC, DominatorTree &RunDT,
const TargetLibraryInfo &RunTLI, AAResults &RunAA,		const TargetLibraryInfo &RunTLI, AAResults &RunAA,
MemoryDependenceResults *RunMD);		MemoryDependenceResults *RunMD, InvariantInfo *InvInfo);

/// Push a new Value to the LeaderTable onto the list for its value number.		/// Push a new Value to the LeaderTable onto the list for its value number.
void addToLeaderTable(uint32_t N, Value *V, const BasicBlock *BB) {		void addToLeaderTable(uint32_t N, Value *V, const BasicBlock *BB) {
LeaderTableEntry &Curr = LeaderTable[N];		LeaderTableEntry &Curr = LeaderTable[N];
if (!Curr.Val) {		if (!Curr.Val) {
Curr.Val = V;		Curr.Val = V;
Curr.BB = BB;		Curr.BB = BB;
return;		return;
[... 88 lines not shown ...]

lib/Analysis/BasicAliasAnalysis.cpp

[... 14 lines not shown ...]

#include "llvm/Analysis/BasicAliasAnalysis.h"		#include "llvm/Analysis/BasicAliasAnalysis.h"
#include "llvm/ADT/SmallVector.h"		#include "llvm/ADT/SmallVector.h"
#include "llvm/ADT/Statistic.h"		#include "llvm/ADT/Statistic.h"
#include "llvm/Analysis/AliasAnalysis.h"		#include "llvm/Analysis/AliasAnalysis.h"
#include "llvm/Analysis/CFG.h"		#include "llvm/Analysis/CFG.h"
#include "llvm/Analysis/CaptureTracking.h"		#include "llvm/Analysis/CaptureTracking.h"
#include "llvm/Analysis/InstructionSimplify.h"		#include "llvm/Analysis/InstructionSimplify.h"
		#include "llvm/Analysis/InvariantInfo.h"
#include "llvm/Analysis/LoopInfo.h"		#include "llvm/Analysis/LoopInfo.h"
#include "llvm/Analysis/MemoryBuiltins.h"		#include "llvm/Analysis/MemoryBuiltins.h"
#include "llvm/Analysis/ValueTracking.h"		#include "llvm/Analysis/ValueTracking.h"
#include "llvm/Analysis/AssumptionCache.h"		#include "llvm/Analysis/AssumptionCache.h"
#include "llvm/IR/Constants.h"		#include "llvm/IR/Constants.h"
#include "llvm/IR/DataLayout.h"		#include "llvm/IR/DataLayout.h"
#include "llvm/IR/DerivedTypes.h"		#include "llvm/IR/DerivedTypes.h"
#include "llvm/IR/Dominators.h"		#include "llvm/IR/Dominators.h"
[... 463 lines not shown ...]
if (!Visited.insert(V).second) {
Visited.clear();		Visited.clear();
return AAResultBase::pointsToConstantMemory(Loc, OrLocal);		return AAResultBase::pointsToConstantMemory(Loc, OrLocal);
}		}

// An alloca instruction defines local memory.		// An alloca instruction defines local memory.
if (OrLocal && isa<AllocaInst>(V))		if (OrLocal && isa<AllocaInst>(V))
continue;		continue;

// A global constant counts as local memory for our purposes.		// A global constant counts as local memory for our purposes.
		nlewycky: "associated to" --> "associated with".
if (const GlobalVariable *GV = dyn_cast<GlobalVariable>(V)) {		if (const GlobalVariable *GV = dyn_cast<GlobalVariable>(V)) {
		nlewycky: I don't understand this comment. Globals, for instance, aren't local to the function.
		lvoufo: I think the operative word is "considered". This is merely an attempt to follow the current documentation structure of the function pointsToConstantMemory(). Its documentation states:
		/// Returns whether the given pointer value points to memory that is local to
		/// the function, with global constants being considered local to all
		/// functions.
		bool BasicAAResult::pointsToConstantMemory(const MemoryLocation &Loc, bool OrLocal)
		lvoufo: I am marking these off because this extension has been moved out of here and into `pointsToReadOnlyMemory()`...
// Note: this doesn't require GV to be "ODR" because it isn't legal for a		// Note: this doesn't require GV to be "ODR" because it isn't legal for a
// global to be marked constant in some modules and non-constant in		// global to be marked constant in some modules and non-constant in
// others. GV may even be a declaration, not a definition.		// others. GV may even be a declaration, not a definition.
		nlewycky: You assert on invariant.start with a constant global variable, but that isn't a verifier failure. You would assert if queried on this well-defined code, right?
		declare {}* @llvm.invariant.start(i64 %size, i8* nocapture %ptr)
		@gv = internal constant i8 0
		define void @test() {
		call {}* @llvm.invariant.start(i64 1, i8* @gv)
		ret void
		}
		I agree the construct is not useful; either we should make it fail the verifier (this would mean auditing all parts of llvm which mark globals constant to ensure they eliminate all possible invariant intrinsic calls that use the global), or we should delete them in some optimization pass (instcombine).
		lvoufo: I have removed the assertion. We just want to localize this extended logic to non-constant globals, and this should be checked in InvariantInfo::SetStartInstruction().
if (!GV->isConstant()) {		if (!GV->isConstant()) {
Visited.clear();		Visited.clear();
		nlewycky: Please add a space after "or". As it is, the text from the two lines gets concatenated, producing "...instructions ornon-constant global...".
return AAResultBase::pointsToConstantMemory(Loc, OrLocal);		return AAResultBase::pointsToConstantMemory(Loc, OrLocal);
}		}
continue;		continue;
}		}

// If both select values point to local memory, then so does the select.		// If both select values point to local memory, then so does the select.
if (const SelectInst *SI = dyn_cast<SelectInst>(V)) {		if (const SelectInst *SI = dyn_cast<SelectInst>(V)) {
Worklist.push_back(SI->getTrueValue());		Worklist.push_back(SI->getTrueValue());
[... 183 lines not shown ...]
AliasResult BasicAAResult::alias(const MemoryLocation &LocA,
// shrink_and_clear so it quickly returns to the inline capacity of the		// shrink_and_clear so it quickly returns to the inline capacity of the
// SmallDenseMap if it ever grows larger.		// SmallDenseMap if it ever grows larger.
// FIXME: This should really be shrink_to_inline_capacity_and_clear().		// FIXME: This should really be shrink_to_inline_capacity_and_clear().
AliasCache.shrink_and_clear();		AliasCache.shrink_and_clear();
VisitedPhiBBs.clear();		VisitedPhiBBs.clear();
return Alias;		return Alias;
}		}

		static bool isInvariantIntrinsic(ImmutableCallSite CS) {
		return IsInvariantIntrinsic(dyn_cast<IntrinsicInst>(CS.getInstruction()));
		dnovillo: Should this be defined in lib/Analysis/InvariantInfo.cpp?
		lvoufo: Perhaps later, but for now, it's only used in this file?
		chandlerc: It isn't clear what value this helper function is providing. Is it valid to pass a null pointer here? I find that surprising in and of itself. Either way, I think it would be clearer to either teach the routine that is already provided by InvariantInfo.h to accept ImmutableCallSite objects, or to inline the extraction of the instruction into the places this is called below.
		lvoufo: Ok.
		}

/// Checks to see if the specified callsite can clobber the specified memory		/// Checks to see if the specified callsite can clobber the specified memory
/// object.		/// object.
///		///
/// Since we only look at local properties of this function, we really can't		/// Since we only look at local properties of this function, we really can't
/// say much about this query. We do, however, use simple "address taken"		/// say much about this query. We do, however, use simple "address taken"
/// analysis on local objects.		/// analysis on local objects.
ModRefInfo BasicAAResult::getModRefInfo(ImmutableCallSite CS,		ModRefInfo BasicAAResult::getModRefInfo(ImmutableCallSite CS,
const MemoryLocation &Loc) {		const MemoryLocation &Loc) {
[... 59 lines not shown ...]
ModRefInfo BasicAAResult::getModRefInfo(ImmutableCallSite CS,
}		}

// While the assume intrinsic is marked as arbitrarily writing so that		// While the assume intrinsic is marked as arbitrarily writing so that
// proper control dependencies will be maintained, it never aliases any		// proper control dependencies will be maintained, it never aliases any
// particular memory location.		// particular memory location.
if (isAssumeIntrinsic(CS))		if (isAssumeIntrinsic(CS))
return MRI_NoModRef;		return MRI_NoModRef;

		// Invariant intrinsics follow the same pattern as assume intrinsic.
		if (isInvariantIntrinsic(CS)) return MRI_NoModRef;
		nlewycky: This ...
		lvoufo: This is necessary for the test cases below, where load instructions are merged into other load instructions across intrinsic calls. Consider calling `getModRef(Inst, Loc)` from memdep where `Inst` is an `invariant_start` call. Without this, `pointsToReadOnlyMemory()` does not know that `Inst` does not modify `Loc.Ptr` if `Inst` itself is not already in another invariant range over `Loc.Ptr`.
		chandlerc: This formatting seems inconsistent. Run clang-format?
		lvoufo: Ok. Thought I did, but will try again, and pay closer attention...

// The AAResultBase base class has some smarts, lets use them.		// The AAResultBase base class has some smarts, lets use them.
return AAResultBase::getModRefInfo(CS, Loc);		ModRefInfo Mask = AAResultBase::getModRefInfo(CS, Loc);

		// If available, use invariant range analysis to determine if this call could
		// modify the memory location.
		// NOTE: Only call instructions for now.
		// FIXME: Allow all call sites.
		hfinkel: By "all call sites" what do you mean? Invokes? Please either make the FIXME more explanatory, or remove it.
		if (const CallInst *CI = dyn_cast<CallInst>(CS.getInstruction())) {
		if ((Mask & MRI_Mod) && InvRA && InvRA->pointsToReadOnlyMemory(CI, Loc, DL))
		Mask = ModRefInfo(Mask & ~MRI_Mod);
		}

		return Mask;
}		}
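The masking step in the new `getModRefInfo()` body can be read in isolation. The sketch below is a standalone illustration, not the patch's code: the enum values are assumed to mirror the bit layout of LLVM's `ModRefInfo` at the time (`Ref` = 1, `Mod` = 2), and `dropModIfReadOnly` is a hypothetical helper name standing in for the `InvRA->pointsToReadOnlyMemory(...)` check.

```cpp
#include <cassert>

// Assumed bit layout, mirroring LLVM's ModRefInfo of that era.
enum ModRefInfo {
  MRI_NoModRef = 0,
  MRI_Ref = 1,
  MRI_Mod = 2,
  MRI_ModRef = MRI_Ref | MRI_Mod
};

// Hypothetical helper: if the call may modify the location but the location
// is known to be read-only at this point, clear only the Mod bit and keep
// whatever Ref information the base result carried.
ModRefInfo dropModIfReadOnly(ModRefInfo Mask, bool PointsToReadOnly) {
  if ((Mask & MRI_Mod) && PointsToReadOnly)
    Mask = ModRefInfo(Mask & ~MRI_Mod);
  return Mask;
}
```

Clearing a single bit rather than returning `MRI_NoModRef` outright keeps the result conservative: a call that may still read the location continues to be reported as a reader.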

ModRefInfo BasicAAResult::getModRefInfo(ImmutableCallSite CS1,		ModRefInfo BasicAAResult::getModRefInfo(ImmutableCallSite CS1,
ImmutableCallSite CS2) {		ImmutableCallSite CS2) {
// While the assume intrinsic is marked as arbitrarily writing so that		// While the assume intrinsic is marked as arbitrarily writing so that
// proper control dependencies will be maintained, it never aliases any		// proper control dependencies will be maintained, it never aliases any
// particular memory location.		// particular memory location.
if (isAssumeIntrinsic(CS1) || isAssumeIntrinsic(CS2))		if (isAssumeIntrinsic(CS1) || isAssumeIntrinsic(CS2))
return MRI_NoModRef;		return MRI_NoModRef;

		// Invariant intrinsics follow the same pattern as assume intrinsic.
		if (isInvariantIntrinsic(CS1) || isInvariantIntrinsic(CS2))
		return MRI_NoModRef;
		nlewycky: ... and this look like solid changes on their own. Can you factor them out into their own review and write testcases for them?
		chandlerc: You haven't addressed Nick's comment that teaching BasicAA to ignore the invariant intrinsic calls themselves is a separable patch that should be split out and submitted independently. Please do so.
		lvoufo: I just addressed Nick's comment above. Sorry I missed it earlier. I am not sure that all of the test cases below would still work without this... I think that keeping these around is actually an essential part of the extended logic for load elimination. I'll take another look and see...

// The AAResultBase base class has some smarts, lets use them.		// The AAResultBase base class has some smarts, lets use them.
return AAResultBase::getModRefInfo(CS1, CS2);		return AAResultBase::getModRefInfo(CS1, CS2);
}		}

/// Provide ad-hoc rules to disambiguate accesses through two GEP operators,		/// Provide ad-hoc rules to disambiguate accesses through two GEP operators,
/// both having the exact same pointer operand.		/// both having the exact same pointer operand.
static AliasResult aliasSameBasePointerGEPs(const GEPOperator *GEP1,		static AliasResult aliasSameBasePointerGEPs(const GEPOperator *GEP1,
uint64_t V1Size,		uint64_t V1Size,
[... 805 lines not shown ...]

char BasicAA::PassID;		char BasicAA::PassID;

BasicAAResult BasicAA::run(Function &F, AnalysisManager<Function> &AM) {		BasicAAResult BasicAA::run(Function &F, AnalysisManager<Function> &AM) {
return BasicAAResult(F.getParent()->getDataLayout(),		return BasicAAResult(F.getParent()->getDataLayout(),
AM.getResult<TargetLibraryAnalysis>(F),		AM.getResult<TargetLibraryAnalysis>(F),
AM.getResult<AssumptionAnalysis>(F),		AM.getResult<AssumptionAnalysis>(F),
&AM.getResult<DominatorTreeAnalysis>(F),		&AM.getResult<DominatorTreeAnalysis>(F),
		AM.getCachedResult<InvariantInfoAnalysis>(F),
AM.getCachedResult<LoopAnalysis>(F));		AM.getCachedResult<LoopAnalysis>(F));
}		}

BasicAAWrapperPass::BasicAAWrapperPass() : FunctionPass(ID) {		BasicAAWrapperPass::BasicAAWrapperPass() : FunctionPass(ID) {
initializeBasicAAWrapperPassPass(*PassRegistry::getPassRegistry());		initializeBasicAAWrapperPassPass(*PassRegistry::getPassRegistry());
}		}

char BasicAAWrapperPass::ID = 0;		char BasicAAWrapperPass::ID = 0;
[... 10 lines not shown ...]
FunctionPass *llvm::createBasicAAWrapperPass() {		FunctionPass *llvm::createBasicAAWrapperPass() {
return new BasicAAWrapperPass();		return new BasicAAWrapperPass();
}		}

bool BasicAAWrapperPass::runOnFunction(Function &F) {		bool BasicAAWrapperPass::runOnFunction(Function &F) {
auto &ACT = getAnalysis<AssumptionCacheTracker>();		auto &ACT = getAnalysis<AssumptionCacheTracker>();
auto &TLIWP = getAnalysis<TargetLibraryInfoWrapperPass>();		auto &TLIWP = getAnalysis<TargetLibraryInfoWrapperPass>();
auto &DTWP = getAnalysis<DominatorTreeWrapperPass>();		auto &DTWP = getAnalysis<DominatorTreeWrapperPass>();
		auto *IRAP = getAnalysisIfAvailable<InvariantRangeAnalysisPass>();
auto *LIWP = getAnalysisIfAvailable<LoopInfoWrapperPass>();		auto *LIWP = getAnalysisIfAvailable<LoopInfoWrapperPass>();

Result.reset(new BasicAAResult(F.getParent()->getDataLayout(), TLIWP.getTLI(),		Result.reset(new BasicAAResult(F.getParent()->getDataLayout(), TLIWP.getTLI(),
ACT.getAssumptionCache(F), &DTWP.getDomTree(),		ACT.getAssumptionCache(F),
		&DTWP.getDomTree(),
		IRAP ? &IRAP->getInvariantInfo() : nullptr,
LIWP ? &LIWP->getLoopInfo() : nullptr));		LIWP ? &LIWP->getLoopInfo() : nullptr));

return false;		return false;
}		}

void BasicAAWrapperPass::getAnalysisUsage(AnalysisUsage &AU) const {		void BasicAAWrapperPass::getAnalysisUsage(AnalysisUsage &AU) const {
AU.setPreservesAll();		AU.setPreservesAll();
AU.addRequired<AssumptionCacheTracker>();		AU.addRequired<AssumptionCacheTracker>();
AU.addRequired<DominatorTreeWrapperPass>();		AU.addRequired<DominatorTreeWrapperPass>();
AU.addRequired<TargetLibraryInfoWrapperPass>();		AU.addRequired<TargetLibraryInfoWrapperPass>();
}		}

BasicAAResult llvm::createLegacyPMBasicAAResult(Pass &P, Function &F) {		BasicAAResult llvm::createLegacyPMBasicAAResult(Pass &P, Function &F) {
		// TODO: Add and use createInvariantRangeAnalysisPass()?
		auto *InvRA = P.getAnalysisIfAvailable<InvariantRangeAnalysisPass>();
return BasicAAResult(		return BasicAAResult(
F.getParent()->getDataLayout(),		F.getParent()->getDataLayout(),
P.getAnalysis<TargetLibraryInfoWrapperPass>().getTLI(),		P.getAnalysis<TargetLibraryInfoWrapperPass>().getTLI(),
P.getAnalysis<AssumptionCacheTracker>().getAssumptionCache(F));		P.getAnalysis<AssumptionCacheTracker>().getAssumptionCache(F),
		/*DT = */ nullptr,
		InvRA ? &InvRA->getInvariantInfo() : nullptr);
}		}

lib/Analysis/CMakeLists.txt

[... 27 lines not shown ...]
add_llvm_library(LLVMAnalysis
EHPersonalities.cpp		EHPersonalities.cpp
GlobalsModRef.cpp		GlobalsModRef.cpp
IVUsers.cpp		IVUsers.cpp
InlineCost.cpp		InlineCost.cpp
InstCount.cpp		InstCount.cpp
InstructionSimplify.cpp		InstructionSimplify.cpp
Interval.cpp		Interval.cpp
IntervalPartition.cpp		IntervalPartition.cpp
		InvariantInfo.cpp
IteratedDominanceFrontier.cpp		IteratedDominanceFrontier.cpp
LazyCallGraph.cpp		LazyCallGraph.cpp
LazyValueInfo.cpp		LazyValueInfo.cpp
Lint.cpp		Lint.cpp
Loads.cpp		Loads.cpp
LoopAccessAnalysis.cpp		LoopAccessAnalysis.cpp
LoopUnrollAnalyzer.cpp		LoopUnrollAnalyzer.cpp
LoopInfo.cpp		LoopInfo.cpp
[... 36 lines not shown ...]

lib/Analysis/InvariantInfo.cpp

This file was added.

				//===---------- InvariantInfo.cpp - invariant_start/end info --------------===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//
				//
				// This file defines properties for handling invariant_start/end instructions
				// for purposes of load elimination.
				//
				//===----------------------------------------------------------------------===//

				#include "llvm/Analysis/InvariantInfo.h"
				#include "llvm/ADT/Statistic.h"
				#include "llvm/Analysis/PostDominators.h"
				#include "llvm/Analysis/ValueTracking.h"
				#include "llvm/IR/DataLayout.h"
				#include "llvm/IR/Dominators.h"
				#include "llvm/IR/GlobalVariable.h"
				#include "llvm/IR/Instructions.h"
				#include "llvm/IR/IntrinsicInst.h"
				#include "llvm/IR/PassManager.h"
				using namespace llvm;

				// -stats data:
				#define DEBUG_TYPE "invariant_info"
				STATISTIC(NumInvStart, "Number of invariant_start instructions");
				STATISTIC(NumInvEnd, "Number of invariant_end instructions");
				nlewycky [Done]: Capitalize.
				STATISTIC(NumInvRangeReq, "Number of invariant range analysis requested");
				STATISTIC(NumInvRangeComp, "Number of invariant range analysis computed");

				InvariantInfo::InvListMapT &InvariantInfo::getInvariantsMap(const Function &F) {
				if (!InvariantsComputed.count(&F)) populateInfo(F);
				return AllInvariantsMaps[&F];
				}

				nlewycky [Done]: Does this code simplify to: return InvariantMarkers.lookup(const_cast<Value *>(Addr)); ?
				lvoufo (author) [Not Done]: In theory, yes. But I like to err on the side of safety with find() over lookup() when it comes to the default initialization of pointers (or other non-user-defined objects). The documentation of lookup says: "lookup - Return the entry for the specified key, or a default constructed value if no such entry exists". The find()-based code explicitly returns nullptr "if no such entry exists", in place of a "default constructed value", the specification of which could in practice change at any time.
				InvariantInfo::InvListT &InvariantInfo::getInvariants(const Function &F,
				chandlerc [Done]: I don't really understand your concern here. The comments are using imprecise language in that they're not using the standard's precise terminology, but the intent seems clear: it returns whatever "T()" produces. That always produces a null pointer when T is a pointer type. All of LLVM's code (and much of Clang's) relies on the lookup method being reasonable to call when the result is a pointer. Please follow this convention as suggested by Nick originally.
				lvoufo (author) [Not Done]: Right. This comment is outdated by the way... The implementation was very different at the time of Nick's comment.
				const Value *Addr) {
				assert(Addr && "Address must be nonnull.");

				// Retrieve the value from the markers map.
				return getInvariantsMap(F)[Addr];
				}
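
The find() versus lookup() exchange above turns on what a map returns for a missing key. The following is a minimal standalone sketch of chandlerc's point, using std::unordered_map rather than LLVM's DenseMap. The helper names lookupOrDefault and findOrNull are hypothetical and are not part of the patch or of LLVM; they only illustrate that a value-initialized pointer, T(), is always a null pointer.

```cpp
#include <cassert> // used by the usage snippet in the note below
#include <unordered_map>

// Hypothetical stand-in for a lookup()-style accessor: returns the mapped
// value, or a value-initialized V() when the key is absent. It never inserts.
template <typename K, typename V>
V lookupOrDefault(const std::unordered_map<K, V> &Map, const K &Key) {
  auto It = Map.find(Key);
  return It == Map.end() ? V() : It->second;
}

// The find()-based spelling with an explicit nullptr, as the author preferred.
template <typename K, typename V>
V *findOrNull(const std::unordered_map<K, V *> &Map, const K &Key) {
  auto It = Map.find(Key);
  return It == Map.end() ? nullptr : It->second;
}
```

For a pointer-valued map the two helpers should agree on every key: with `std::unordered_map<int, int *> M`, both `lookupOrDefault(M, 2)` and `findOrNull(M, 2)` yield a null pointer when key 2 is absent, and neither grows the map (unlike `operator[]`, which the patch's getInvariants() uses and which inserts an empty entry).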

				void InvariantInfo::addInvariant(const Function &F, const Value *Addr,
				const IntrinsicInst *II) {
				assert(Addr && II && "Value must be nonnull.");

				// Only mark either non-constant global variables or alloca instructions.
				if (const GlobalVariable *GV = dyn_cast<GlobalVariable>(Addr)) {
				if (GV->isConstant()) return;
				} else if (!isa<AllocaInst>(Addr))
				return;

				InvListT &IIs = getInvariantsMap(F)[Addr];
				nlewycky [Done]: Er, the code doesn't match the comment here, right? As written, this means that if it is set, then trying to set again will unset it. That doesn't ensure the new value is inserted.
				lvoufo (author) [Done]: Removed.
				IIs.push_back(II);

				// Update -stats data
				nlewycky [Done]: Extra space between words in "alloca instruction".
				++NumInvStart;
				dnovillo [Done]: Would it be better to have a remove() function in the interface, instead of making nullptr a magic value?
				lvoufo (author) [Done]: Yes.
				NumInvEnd += II->getNumUses();
				}

				void InvariantInfo::populateInfo(const Function &F) {
				if (InvariantsComputed.count(&F)) return;

				// Mark the map entry as computed.
				InvariantsComputed.insert(&F);

				nlewycky [Done]: You need to add a (void)Inserted; for people building with assertions off and -Wunused-variable on.
				assert(getInvariantsMap(F).empty() &&
				"Already have invariant info when scanning!");

				// Go through all instructions in all blocks, add all invariant_start calls
				// to this map. There is no need to explicitly collect invariant_end calls
				// because they are uses of the collected invariant_start calls.
				for (const BasicBlock &B : F) {
				for (const Instruction &I : B) {
				if (const IntrinsicInst *II = dyn_cast<IntrinsicInst>(&I)) {
				if (II->getIntrinsicID() == Intrinsic::invariant_start) {
				Value *Addr = II->getArgOperand(1)->stripPointerCasts();
				addInvariant(F, Addr, II);
				}
				}
				}
				}
				}

				/// \brief An instruction cannot modify memory if the instruction is in an
				/// invariant range over the memory's pointer value.
				/// This implementation is modeled after \c pointsToConstantMemory(), with
				/// OrLocal == false, and uses \c queriedFromInvariantRange() to check if
				/// the given instruction modifies the given memory.
				/// In practice, this function should be called immediately after
				/// \c pointsToConstantMemory() (with OrLocal == false) returns false.
				bool InvariantInfo::pointsToReadOnlyMemory(const Instruction *I,
				chandlerc [Not Done]: You have two different API comments, one in the header, and one in the source. Please only have one comment. For public interfaces, it should be in the header. For private functions, you can put it near the implementation if you want.
				lvoufo (author) [Not Done]: Ok. Will do.
				const MemoryLocation &Loc,
				const DataLayout &DL) {
				// Proceed only if invariant range analysis is enabled. Otherwise,
				// this just repeats the same process from pointsToConstantMemory().
				if (!RangeAnalysisIsEnabled) return false;

				SmallPtrSet<const Value *, 16> Visited;
				unsigned MaxLookup = 8;
				SmallVector<const Value *, 16> Worklist;
				Worklist.push_back(Loc.Ptr);
				do {
				const Value *V = GetUnderlyingObject(Worklist.pop_back_val(), DL);
				if (!Visited.insert(V).second) return false;

				if (queriedFromInvariantRange(I, V)) continue;
				dnovillo [Done]: I'm confused by this comment. When we find an invariant_start, shouldn't the address be considered as pointing to constant memory?
				lvoufo (author) [Done]: Oops. Typo. That's exactly what the comment is supposed to be saying.

				if (const GlobalVariable *GV = dyn_cast<GlobalVariable>(V)) {
				if (GV->isConstant()) continue;
				return false;
				}

				// If both select values point to readonly memory, then so does the select.
				if (const SelectInst *SI = dyn_cast<SelectInst>(V)) {
				nlewycky [Done]: No need for llvm:: here, we're inside a .cpp with "using namespace llvm;".
				lvoufo (author) [Done]: Oops. Residuals from multiple restructurings...
				Worklist.push_back(SI->getTrueValue());
				Worklist.push_back(SI->getFalseValue());
				continue;
				}
				nlewycky [Done]: Extra llvm::

				// If all values incoming to a phi node point to readonly memory, then so
				// does the phi.
				dnovillo [Done]: What does the return value indicate?
				lvoufo (author) [Done]: That an invariant intrinsic was processed. Will add a comment.
				if (const PHINode *PN = dyn_cast<PHINode>(V)) {
				lvoufo (author) [Done]: Actually the comment follows right below here (?).
				// Don't bother inspecting phi nodes with many operands.
				if (PN->getNumIncomingValues() > MaxLookup) return false;
				for (Value *IncValue : PN->incoming_values())
				Worklist.push_back(IncValue);
				continue;
				}

				// Otherwise be conservative.
				return false;

				} while (!Worklist.empty() && --MaxLookup);

				dnovillo [Done]: This function is supposedly doing a scan backwards, but all it really seems to do is decide whether II is an invariant_start or an invariant_end. Does this need to be renamed? I'm not really sure what this is doing.
				lvoufo (author) [Done]: This is to be called while scanning instructions backward to temporarily reset the mapping of invariant info so that function calls outside invariant regions can continue to clobber loads inside invariant regions. Yes, it should probably be renamed. But I'm thinking of getting rid of it altogether, in favor of a separate function or pass that processes invariant intrinsics independently of the order in which GVN processes all instructions.
				return Worklist.empty();
				dnovillo [Done]: Formatting is odd here. Could you run clang-format over the file?
				lvoufo (author) [Done]: Yes, I will.
				lvoufo (author) [Not Done]: Marking these off because they are outdated...
				}
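
The worklist loop in pointsToReadOnlyMemory() above can be hard to follow in the flattened diff, so here is a standalone sketch of the same control flow on a toy graph. The Node type and allSourcesReadOnly() are hypothetical illustrations, not LLVM API: a "merge" node stands in for a select or phi, and a read-only leaf stands in for a value for which queriedFromInvariantRange() or isConstant() holds. As in the patch, revisiting a node, hitting an unknown value, or exhausting the lookup budget all give the conservative answer false.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Hypothetical miniature of an IR value: a leaf that is either read-only or
// writable, or a select/phi-like node that merges several incoming values.
struct Node {
  enum Kind { ReadOnlyLeaf, WritableLeaf, Merge } K;
  std::vector<const Node *> Incoming; // used only when K == Merge
};

// Returns true only if every value reachable through merge nodes is read-only
// and the walk finishes within the lookup budget.
bool allSourcesReadOnly(const Node *Root) {
  assert(Root && "Root must be nonnull.");
  unsigned MaxLookup = 8; // same budget as the patch
  std::unordered_set<const Node *> Visited;
  std::vector<const Node *> Worklist{Root};
  do {
    const Node *N = Worklist.back();
    Worklist.pop_back();
    // Seeing a node twice is treated conservatively, as in the patch.
    if (!Visited.insert(N).second) return false;
    if (N->K == Node::ReadOnlyLeaf) continue;
    if (N->K == Node::Merge) {
      // Don't bother with merges that have too many operands.
      if (N->Incoming.size() > MaxLookup) return false;
      for (const Node *In : N->Incoming) Worklist.push_back(In);
      continue;
    }
    return false; // anything else: be conservative
  } while (!Worklist.empty() && --MaxLookup);
  // A non-empty worklist here means the budget ran out first.
  return Worklist.empty();
}
```

Note that `continue` inside a do/while jumps to the loop condition, so each processed node still costs one unit of the MaxLookup budget.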

				/// \brief Match the instruction accessing the given address against
				/// invariant_start calls in the map. Then, traverse the calls to determine
				/// whether the instruction is dominated by an invariant_start such that either:
				/// * the invariant_start has no associated invariant_end, or
				nlewycky [Done]: Extra llvm::
				/// * an associated invariant_end post-dominates the instruction.
				/// Returns true if such an invariant_start is found.
				/// NOTE: Invariant_start/end calls nested within other invariant ranges
				/// are considered redundant in this analysis.
				nlewycky [Done]: Need "(void)IStart;" here.
				/// TODO: For global variables, also consider invariant info from global
				/// constructors.
				/// NOTE: The invariants mapping only processes non-constant global variables
				/// or alloca instructions. So, this function returns false for those
				/// uncovered kinds of memory pointer values.
				bool InvariantInfo::queriedFromInvariantRange(const Instruction *I,
				const Value *Addr) {
				assert(RangeAnalysisIsEnabled && "Invariant range analysis must be enabled.");
				assert(I && Addr && "No analysis without an instruction or an address.");
				assert(DT && PDT &&
				"Both the dominator and the post-dominator tree "
				"analyses are needed for this analysis.");

				// Not designed for constant globals.
				if (const GlobalVariable *GV = dyn_cast<GlobalVariable>(Addr)) {
				if (GV->isConstant()) return false;
				}

				// Also not designed for non-alloca instructions.
				if (!isa<AllocaInst>(Addr)) return false;

				// Update -stats data
				++NumInvRangeReq;

				// Avoid recomputing this range analysis.
				if (isInInvariantRange(Addr, I)) return true;

				const Function *F = I->getFunction();
				InvListT &IIs = getInvariants(*F, Addr);

				// Update -stats data
				// If there are no invariants to process, then this is a trivial request
				// that should not be accounted for in the stats.
				if (IIs.empty())
				--NumInvRangeReq;
				else
				++NumInvRangeComp;

				for (auto InvsIter = IIs.begin(), InvsEnd = IIs.end(); InvsIter != InvsEnd;
				++InvsIter) {
				const IntrinsicInst *IStart = *InvsIter;

				// If this invariant_start dominates the instruction, then check if any
				// of its associated invariant_ends post-dominates the instruction.
				// Otherwise, skip to the next invariant_start.
				if (!DT->dominates(IStart, I)) continue;

				for (auto *U : IStart->users()) {
				if (const IntrinsicInst *IEnd = dyn_cast<IntrinsicInst>(U)) {
				assert(IEnd->getIntrinsicID() == Intrinsic::invariant_end &&
				"Intrinsic user of invariant_start must be intrinsic_end.");
				if (PDT->dominates(IEnd, I)) {
				markInInvariantRange(Addr, I);
				return true;
				}
				}

				// Skip non-intrinsic uses of invariant_start because they can be either
				// select instructions or phi-nodes, which would require any intrinsic_end
				// to be in scope.
				}

				// If this invariant_start has no user, then this range is invariant
				// throughout the end of the lifetime of the underlying object.
				if (!IStart->getNumUses()) {
				markInInvariantRange(Addr, I);
				return true;
				}
				}

				// Otherwise, the instruction does not belong to an invariant range.
				return false;
				}
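
The rule that queriedFromInvariantRange() documents above (dominated by an invariant_start that either has no invariant_end or has one that post-dominates the instruction) can be sketched on straight-line code, where instruction positions are plain indices. This is a simplifying assumption made only for illustration: within a single block, "A dominates B" reduces to "A comes before B" and "A post-dominates B" to "A comes after B". The InvariantRange type and inInvariantRange() are hypothetical names, not part of the patch.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model: positions are indices into one straight-line block.
struct InvariantRange {
  int Start;             // position of the invariant_start call
  std::vector<int> Ends; // positions of its invariant_end users (may be empty)
};

// True if the instruction at Pos lies inside at least one invariant range.
bool inInvariantRange(int Pos, const std::vector<InvariantRange> &Ranges) {
  assert(Pos >= 0 && "Positions are nonnegative indices.");
  for (const InvariantRange &R : Ranges) {
    // The start must come strictly before the instruction ("dominate" it).
    if (R.Start >= Pos) continue;
    // No invariant_end: invariant for the rest of the object's lifetime.
    if (R.Ends.empty()) return true;
    // Otherwise some invariant_end must come after the instruction.
    for (int End : R.Ends)
      if (End > Pos) return true;
  }
  return false;
}
```

In a real control-flow graph the two index comparisons would be replaced by dominator-tree and post-dominator-tree queries, which is what the patch does with DT and PDT.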

				void InvariantInfo::clear() {
				AllInvariantsMaps.shrink_and_clear();
				InvariantsComputed.clear();
				DT = nullptr;
				PDT = nullptr;
				RangeAnalysisIsEnabled = false;
				InstructionsInInvariantRange.shrink_and_clear();
				}

				void InvariantInfo::verify() const {
				for (auto InvsMapsIter = AllInvariantsMaps.begin(),
				InvsMapsEnd = AllInvariantsMaps.end();
				InvsMapsIter != InvsMapsEnd; ++InvsMapsIter) {
				const Function &F = *(InvsMapsIter->first);
				verify(F);
				}
				}

				void InvariantInfo::verify(const Function &F) const {
				InvListMapT &InvsMap =
				const_cast<InvariantInfo *>(this)->AllInvariantsMaps[&F];

				for (auto InvsMapIter = InvsMap.begin(), InvsMapEnd = InvsMap.end();
				InvsMapIter != InvsMapEnd; ++InvsMapIter) {
				auto *Addr = InvsMapIter->first;
				InvListT &IIs = InvsMapIter->second;

				for (auto InvsIter = IIs.begin(), InvsEnd = IIs.end(); InvsIter != InvsEnd;
				++InvsIter) {
				const IntrinsicInst *II = *InvsIter;

				assert(II->getIntrinsicID() == Intrinsic::invariant_start &&
				"Only invariant_start calls are collected.");
				assert(Addr == II->getArgOperand(1)->stripPointerCasts() &&
				"Address mismatch in invariant_start.");

				const GlobalVariable *GV = dyn_cast<GlobalVariable>(Addr);
				GV = GV; // Avoid unused warnings.
				assert((isa<AllocaInst>(Addr) || (GV && !GV->isConstant())) &&
				"A matching entry in the map must correspond to "
				"a non-constant global variable or an alloca instruction.");

				if (isa<AllocaInst>(Addr))
				assert(II->getFunction() == cast<AllocaInst>(Addr)->getFunction() &&
				"Invariant_start calls on a given alloca instruction must have "
				"the same parent function as the alloca instruction.");

				for (auto *U : II->users()) {
				if (const IntrinsicInst *IEnd = dyn_cast<IntrinsicInst>(U)) {
				IEnd = IEnd; // Avoid unused warnings.
				assert(IEnd->getIntrinsicID() == Intrinsic::invariant_end &&
				"Only invariant_end calls are intrinsic uses of "
				"invariant_start calls.");
				assert(II == IEnd->getArgOperand(0)->stripPointerCasts() &&
				"invariant_start/end.");
				assert(IEnd->getArgOperand(2)->stripPointerCasts() ==
				II->getArgOperand(1)->stripPointerCasts() &&
				"Address mismatch in invariant_end.");

				if (isa<AllocaInst>(Addr))
				assert(II->getFunction() == IEnd->getFunction() &&
				"Invariant_end calls on a given alloca instruction must have "
				"the same parent function as its invariant_start.");
				} else {
				assert((isa<SelectInst>(U) || isa<PHINode>(U)) &&
				"Non-intrinsic uses of invariant_start must be either "
				"select instructions or phi-nodes.");
				}
				}
				}
				}
				}

				void InvariantInfo::print(raw_ostream &OS) const {
				for (auto InvsMapsIter = AllInvariantsMaps.begin(),
				InvsMapsEnd = AllInvariantsMaps.end();
				InvsMapsIter != InvsMapsEnd; ++InvsMapsIter) {
				const Function &F = *(InvsMapsIter->first);
				print(OS, F);
				}
				}

				void InvariantInfo::print(raw_ostream &OS, const Function &F) const {
				OS << "Invariant Info for function: " << F.getName() << "\n";
				InvListMapT &InvsMap = const_cast<InvariantInfo *>(this)->getInvariantsMap(F);

				for (auto InvsMapIter = InvsMap.begin(), InvsMapEnd = InvsMap.end();
				InvsMapIter != InvsMapEnd; ++InvsMapIter) {
				InvListT &IIs = InvsMapIter->second;
				OS << *(InvsMapIter->first) << " : \n";

				for (auto InvsIter = IIs.begin(), InvsEnd = IIs.end(); InvsIter != InvsEnd;
				++InvsIter) {
				const IntrinsicInst *II = *InvsIter;
				OS << " invariant_start in block: " << II->getParent()->getName()
				<< "\n";

				for (auto *U : II->users()) {
				if (const IntrinsicInst *IEnd = dyn_cast<IntrinsicInst>(U)) {
				OS << " invariant_end in block: " << IEnd->getParent()->getName()
				<< "\n";
				}
				}
				}
				}

				OS << "\n";
				}

				#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
				void InvariantInfo::dump() const { print(dbgs()); }
				#endif

				char InvariantInfoAnalysis::PassID;

				PreservedAnalyses InvariantInfoPrinterPass::run(Function &F,
				FunctionAnalysisManager &AM) {
				InvariantInfo &InvInfo = AM.getResult<InvariantInfoAnalysis>(F);
				InvInfo.print(OS, F);
				return PreservedAnalyses::all();
				}

				PreservedAnalyses InvariantInfoVerifierPass::run(Function &F,
				FunctionAnalysisManager &AM) {
				InvariantInfo &InvInfo = AM.getResult<InvariantInfoAnalysis>(F);
				InvInfo.verify(F);
				return PreservedAnalyses::all();
				}

				InvariantRangeAnalysisPass::InvariantRangeAnalysisPass() : ImmutablePass(ID) {
				initializeInvariantRangeAnalysisPassPass(*PassRegistry::getPassRegistry());
				}

				InvariantRangeAnalysisPass::~InvariantRangeAnalysisPass() { }

				/// This invariant range analysis is expected to run over some given function, based
				/// on both dominator and post-dominator analysis information.
				/// This function initializes both analysis passes for this analysis pass.
				void InvariantRangeAnalysisPass::InitDependencies(FunctionPass &P) {
				auto &DT = P.getAnalysis<DominatorTreeWrapperPass>();
				auto &PDT = P.getAnalysis<PostDominatorTreeWrapperPass>();
				InvInfo.init(&DT.getDomTree(), &PDT.getPostDomTree());
				}

				void InvariantRangeAnalysisPass::verifyAnalysis() const { InvInfo.verify(); }

				void InvariantRangeAnalysisPass::getAnalysisUsage(AnalysisUsage &AU) const {
				AU.setPreservesAll();
				AU.addRequired<DominatorTreeWrapperPass>();
				AU.addRequired<PostDominatorTreeWrapperPass>();
				}

				/// \brief If a Module object is provided, then print the current content of the
				/// invariants map. Otherwise, print the mappings specific to the current
				/// function, recomputing the mappings as necessary.
				void InvariantRangeAnalysisPass::print(raw_ostream &OS, const Module *M) const {
				if (!M)
				InvInfo.print(OS);
				else {
				// If a module is explicitly provided, e.g. with -analyze, then dump all
				// the functions in the module (that are not declarations), populating the
				// invariant maps as necessary.
				for (const Function &F : M->functions()) {
				if (!F.isDeclaration())
				InvInfo.print(OS, F);
				}
				}
				}

				#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
				void InvariantRangeAnalysisPass::dump() const { InvInfo.dump(); }
				#endif

				char InvariantRangeAnalysisPass::ID = 0;

				INITIALIZE_PASS(InvariantRangeAnalysisPass, "invariant-range-analysis",
				"Invariant_start/end analysis", false, true)

				bool llvm::IsInvariantIntrinsic(const IntrinsicInst *II) {
				return II && (II->getIntrinsicID() == Intrinsic::invariant_start ||
				II->getIntrinsicID() == Intrinsic::invariant_end);
				}

lib/Analysis/MemoryDependenceAnalysis.cpp

Show All 14 Lines
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "llvm/Analysis/MemoryDependenceAnalysis.h"		#include "llvm/Analysis/MemoryDependenceAnalysis.h"
#include "llvm/ADT/STLExtras.h"		#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/Statistic.h"		#include "llvm/ADT/Statistic.h"
#include "llvm/Analysis/AliasAnalysis.h"		#include "llvm/Analysis/AliasAnalysis.h"
#include "llvm/Analysis/AssumptionCache.h"		#include "llvm/Analysis/AssumptionCache.h"
#include "llvm/Analysis/InstructionSimplify.h"		#include "llvm/Analysis/InstructionSimplify.h"
		#include "llvm/Analysis/InvariantInfo.h"
#include "llvm/Analysis/MemoryBuiltins.h"		#include "llvm/Analysis/MemoryBuiltins.h"
#include "llvm/Analysis/PHITransAddr.h"		#include "llvm/Analysis/PHITransAddr.h"
#include "llvm/Analysis/OrderedBasicBlock.h"		#include "llvm/Analysis/OrderedBasicBlock.h"
#include "llvm/Analysis/ValueTracking.h"		#include "llvm/Analysis/ValueTracking.h"
#include "llvm/Analysis/TargetLibraryInfo.h"		#include "llvm/Analysis/TargetLibraryInfo.h"
#include "llvm/IR/DataLayout.h"		#include "llvm/IR/DataLayout.h"
#include "llvm/IR/Dominators.h"		#include "llvm/IR/Dominators.h"
#include "llvm/IR/Function.h"		#include "llvm/IR/Function.h"
▲ Show 20 Lines • Show All 433 Lines • ▼ Show 20 Lines	MemDepResult MemoryDependenceResults::getSimplePointerDependencyFrom(
auto isOtherMemAccess = [](Instruction *I) -> bool {		auto isOtherMemAccess = [](Instruction *I) -> bool {
return !isa<LoadInst>(I) && !isa<StoreInst>(I) && I->mayReadOrWriteMemory();		return !isa<LoadInst>(I) && !isa<StoreInst>(I) && I->mayReadOrWriteMemory();
};		};

// Walk backwards through the basic block, looking for dependencies.		// Walk backwards through the basic block, looking for dependencies.
while (ScanIt != BB->begin()) {		while (ScanIt != BB->begin()) {
Instruction *Inst = &*--ScanIt;		Instruction *Inst = &*--ScanIt;

if (IntrinsicInst *II = dyn_cast<IntrinsicInst>(Inst))		if (IntrinsicInst *II = dyn_cast<IntrinsicInst>(Inst)) {
// Debug intrinsics don't (and can't) cause dependencies.		// Debug intrinsics don't (and can't) cause dependencies.
if (isa<DbgInfoIntrinsic>(II))		if (isa<DbgInfoIntrinsic>(II))
continue;		continue;

		// Same for invariant intrinsics
		if (IsInvariantIntrinsic(II)) continue;
		}
		nlewycky [Done]: Walking ScanIt across an invariant intrinsic that doesn't appertain to QueryInst will not cause InvInfo to be updated. At that point, BasicAAResult::pointsToConstantMemory will produce wrong answers for those pointers.
		lvoufo (author) [Done]: How so? I think it will do exactly what we want. By manipulating invariant info here based on QueryInst, we are selectively choosing when to apply the extension in BasicAA. A load from %i within an invariant range for %i temporarily unsets the marking for %i. A load from %j within an invariant range for %i does nothing to the marking for %i. A load from %j within an invariant range for %i should preserve the value of BasicAAResult::pointsToConstantMemory() for %j, whether it is nested within an invariant range for %j or not. Meanwhile, a load from %i within an invariant range for %i always leads to BasicAAResult::pointsToConstantMemory for %i == true. What is missing?

// Limit the amount of scanning we do so we don't end up with quadratic		// Limit the amount of scanning we do so we don't end up with quadratic
// running time on extreme testcases.		// running time on extreme testcases.
--Limit;		--Limit;
if (!Limit)		if (!Limit)
return MemDepResult::getUnknown();		return MemDepResult::getUnknown();

if (IntrinsicInst *II = dyn_cast<IntrinsicInst>(Inst)) {		if (IntrinsicInst *II = dyn_cast<IntrinsicInst>(Inst)) {
// If we reach a lifetime begin or end marker, then the query ends here		// If we reach a lifetime begin or end marker, then the query ends here
▲ Show 20 Lines • Show All 1,227 Lines • Show Last 20 Lines

lib/Analysis/PostDominators.cpp

	Show All 21 Lines
	using namespace llvm;			using namespace llvm;

	#define DEBUG_TYPE "postdomtree"			#define DEBUG_TYPE "postdomtree"

	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	// PostDominatorTree Implementation			// PostDominatorTree Implementation
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

				// dominates - Return true if Def dominates a use in User. This performs
				// the special checks necessary if Def and User are in the same basic block.
				// Note that Def doesn't dominate a use in Def itself!
				bool PostDominatorTree::dominates(const Instruction *Def,
				const Instruction *User) const {
				const BasicBlock *UseBB = User->getParent();
				const BasicBlock *DefBB = Def->getParent();

				// Any unreachable use is post-dominated, even if Def == User.
				// NOTE: isReachableFromEntry() here means "can reach exit from ...".
				if (!isReachableFromEntry(getNode(const_cast<BasicBlock *>(UseBB))))
				return true;

				// Unreachable definitions don't post-dominate anything.
				// NOTE: isReachableFromEntry() here means "can reach exit from ...".
				if (!isReachableFromEntry(getNode(const_cast<BasicBlock *>(DefBB))))
				return false;

				// An instruction doesn't dominate a use in itself.
				if (Def == User) return false;

				// A PHI node post-dominates an instruction if every possible use in DefBB
				// post-dominates the instruction.
				// TODO: Special case for invoke instructions:
				// Allow invoke instructions to be post-dominated by instructions in
				// their normal destination path to allow further optimizations from
				// invariant range analysis such as the elimination of the second load
				// instruction in
				//
				// <code>
				// try { ...
				// @llvm.invariant.start(..., bitcast i32* %i to i8*)
				// load %i
				// invoke(...)
				// load %i
				// ...
				// @llvm.invariant.end(..., bitcast i32* %i to i8*)
				// } catch (...) { ... }
				// </code>
				chandlerc [Not Done]: This doesn't seem like it belongs in the postdom pass. Instead if anything it should live somewhere with the invariant handling code.
				lvoufo (author) [Not Done]: Yes. This part should remain with this pass, while the rest of the postdom migration to the new pass manager should go in the postdom migration patch.
				hfinkel [Not Done]: I don't understand your response. Modifying the definition of post-dominance to add a special case for exception unwinding desired by the invariant analysis seems unwise - i.e. likely to lead to bugs in other users of this code.
				//
				if (isa<PHINode>(Def)) {
				if (DefBB == UseBB) return false;
				return dominates(DefBB, UseBB);
				}

				if (DefBB != UseBB) return dominates(DefBB, UseBB);

				// Loop through the basic block, in reverse order, until we find Def or User.
				BasicBlock::const_reverse_iterator I = DefBB->rbegin();
				for (; &*I != Def && &*I != User; ++I) /*empty*/;

				return &*I == Def;
				}
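
The same-block case at the end of PostDominatorTree::dominates() above walks the block backwards until it meets either instruction. The following standalone sketch shows that idea with integers standing in for instructions; postDominatesInBlock() is a hypothetical name, and unlike the patch it also stops safely when neither value is in the block.

```cpp
#include <cassert> // used by the usage snippet in the note below
#include <vector>

// Hypothetical model: a block is a list of distinct instruction ids in
// program order. Def post-dominates User within the block if, scanning from
// the end, Def is met before User. An instruction does not post-dominate itself.
bool postDominatesInBlock(const std::vector<int> &Block, int Def, int User) {
  if (Def == User) return false;
  for (auto It = Block.rbegin(); It != Block.rend(); ++It) {
    if (*It == Def) return true;   // Def is later in program order
    if (*It == User) return false; // User is later, so Def cannot follow it
  }
  return false; // neither id is in the block: answer conservatively
}
```

For a block `{1, 2, 3}`, `postDominatesInBlock(Block, 3, 1)` holds while `postDominatesInBlock(Block, 1, 3)` does not, mirroring how the forward scan in DominatorTree::dominates() is reversed here.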

	char PostDominatorTreeWrapperPass::ID = 0;			char PostDominatorTreeWrapperPass::ID = 0;
	INITIALIZE_PASS(PostDominatorTreeWrapperPass, "postdomtree",			INITIALIZE_PASS(PostDominatorTreeWrapperPass, "postdomtree",
	"Post-Dominator Tree Construction", true, true)			"Post-Dominator Tree Construction", true, true)

	bool PostDominatorTreeWrapperPass::runOnFunction(Function &F) {			bool PostDominatorTreeWrapperPass::runOnFunction(Function &F) {
	DT.recalculate(F);			DT.recalculate(F);
	return false;			return false;
	}			}
	Show All 27 Lines

lib/Passes/PassBuilder.cpp

	Show All 20 Lines
	#include "llvm/Analysis/AliasAnalysisEvaluator.h"			#include "llvm/Analysis/AliasAnalysisEvaluator.h"
	#include "llvm/Analysis/AssumptionCache.h"			#include "llvm/Analysis/AssumptionCache.h"
	#include "llvm/Analysis/BasicAliasAnalysis.h"			#include "llvm/Analysis/BasicAliasAnalysis.h"
	#include "llvm/Analysis/CFLAliasAnalysis.h"			#include "llvm/Analysis/CFLAliasAnalysis.h"
	#include "llvm/Analysis/CGSCCPassManager.h"			#include "llvm/Analysis/CGSCCPassManager.h"
	#include "llvm/Analysis/CallGraph.h"			#include "llvm/Analysis/CallGraph.h"
	#include "llvm/Analysis/DominanceFrontier.h"			#include "llvm/Analysis/DominanceFrontier.h"
	#include "llvm/Analysis/GlobalsModRef.h"			#include "llvm/Analysis/GlobalsModRef.h"
				#include "llvm/Analysis/InvariantInfo.h"
	#include "llvm/Analysis/LazyCallGraph.h"			#include "llvm/Analysis/LazyCallGraph.h"
	#include "llvm/Analysis/LoopInfo.h"			#include "llvm/Analysis/LoopInfo.h"
	#include "llvm/Analysis/MemoryDependenceAnalysis.h"			#include "llvm/Analysis/MemoryDependenceAnalysis.h"
	#include "llvm/Analysis/PostDominators.h"			#include "llvm/Analysis/PostDominators.h"
	#include "llvm/Analysis/RegionInfo.h"			#include "llvm/Analysis/RegionInfo.h"
	#include "llvm/Analysis/ScalarEvolution.h"			#include "llvm/Analysis/ScalarEvolution.h"
	#include "llvm/Analysis/ScalarEvolutionAliasAnalysis.h"			#include "llvm/Analysis/ScalarEvolutionAliasAnalysis.h"
	#include "llvm/Analysis/ScopedNoAliasAA.h"			#include "llvm/Analysis/ScopedNoAliasAA.h"
	▲ Show 20 Lines • Show All 624 Lines • Show Last 20 Lines

lib/Passes/PassRegistry.def

	Show First 20 Lines • Show All 61 Lines • ▼ Show 20 Lines

	#ifndef FUNCTION_ANALYSIS			#ifndef FUNCTION_ANALYSIS
	#define FUNCTION_ANALYSIS(NAME, CREATE_PASS)			#define FUNCTION_ANALYSIS(NAME, CREATE_PASS)
	#endif			#endif
	FUNCTION_ANALYSIS("aa", AAManager())			FUNCTION_ANALYSIS("aa", AAManager())
	FUNCTION_ANALYSIS("assumptions", AssumptionAnalysis())			FUNCTION_ANALYSIS("assumptions", AssumptionAnalysis())
	FUNCTION_ANALYSIS("domtree", DominatorTreeAnalysis())			FUNCTION_ANALYSIS("domtree", DominatorTreeAnalysis())
	FUNCTION_ANALYSIS("postdomtree", PostDominatorTreeAnalysis())			FUNCTION_ANALYSIS("postdomtree", PostDominatorTreeAnalysis())
				FUNCTION_ANALYSIS("invariant-info", InvariantInfoAnalysis())
	FUNCTION_ANALYSIS("domfrontier", DominanceFrontierAnalysis())			FUNCTION_ANALYSIS("domfrontier", DominanceFrontierAnalysis())
	FUNCTION_ANALYSIS("loops", LoopAnalysis())			FUNCTION_ANALYSIS("loops", LoopAnalysis())
	FUNCTION_ANALYSIS("memdep", MemoryDependenceAnalysis())			FUNCTION_ANALYSIS("memdep", MemoryDependenceAnalysis())
	FUNCTION_ANALYSIS("regions", RegionInfoAnalysis())			FUNCTION_ANALYSIS("regions", RegionInfoAnalysis())
	FUNCTION_ANALYSIS("no-op-function", NoOpFunctionAnalysis())			FUNCTION_ANALYSIS("no-op-function", NoOpFunctionAnalysis())
	FUNCTION_ANALYSIS("scalar-evolution", ScalarEvolutionAnalysis())			FUNCTION_ANALYSIS("scalar-evolution", ScalarEvolutionAnalysis())
	FUNCTION_ANALYSIS("targetlibinfo", TargetLibraryAnalysis())			FUNCTION_ANALYSIS("targetlibinfo", TargetLibraryAnalysis())
	FUNCTION_ANALYSIS("targetir",			FUNCTION_ANALYSIS("targetir",
	[... 21 lines not shown ...]
	FUNCTION_PASS("invalidate<all>", InvalidateAllAnalysesPass())			FUNCTION_PASS("invalidate<all>", InvalidateAllAnalysesPass())
	FUNCTION_PASS("no-op-function", NoOpFunctionPass())			FUNCTION_PASS("no-op-function", NoOpFunctionPass())
	FUNCTION_PASS("lower-expect", LowerExpectIntrinsicPass())			FUNCTION_PASS("lower-expect", LowerExpectIntrinsicPass())
	FUNCTION_PASS("gvn", GVN())			FUNCTION_PASS("gvn", GVN())
	FUNCTION_PASS("print", PrintFunctionPass(dbgs()))			FUNCTION_PASS("print", PrintFunctionPass(dbgs()))
	FUNCTION_PASS("print<assumptions>", AssumptionPrinterPass(dbgs()))			FUNCTION_PASS("print<assumptions>", AssumptionPrinterPass(dbgs()))
	FUNCTION_PASS("print<domtree>", DominatorTreePrinterPass(dbgs()))			FUNCTION_PASS("print<domtree>", DominatorTreePrinterPass(dbgs()))
	FUNCTION_PASS("print<postdomtree>", PostDominatorTreePrinterPass(dbgs()))			FUNCTION_PASS("print<postdomtree>", PostDominatorTreePrinterPass(dbgs()))
				FUNCTION_PASS("print<invariant-info>", InvariantInfoPrinterPass(dbgs()))
	FUNCTION_PASS("print<domfrontier>", DominanceFrontierPrinterPass(dbgs()))			FUNCTION_PASS("print<domfrontier>", DominanceFrontierPrinterPass(dbgs()))
	FUNCTION_PASS("print<loops>", LoopPrinterPass(dbgs()))			FUNCTION_PASS("print<loops>", LoopPrinterPass(dbgs()))
	FUNCTION_PASS("print<regions>", RegionInfoPrinterPass(dbgs()))			FUNCTION_PASS("print<regions>", RegionInfoPrinterPass(dbgs()))
	FUNCTION_PASS("print<scalar-evolution>", ScalarEvolutionPrinterPass(dbgs()))			FUNCTION_PASS("print<scalar-evolution>", ScalarEvolutionPrinterPass(dbgs()))
	FUNCTION_PASS("simplify-cfg", SimplifyCFGPass())			FUNCTION_PASS("simplify-cfg", SimplifyCFGPass())
	FUNCTION_PASS("sroa", SROA())			FUNCTION_PASS("sroa", SROA())
	FUNCTION_PASS("verify", VerifierPass())			FUNCTION_PASS("verify", VerifierPass())
	FUNCTION_PASS("verify<domtree>", DominatorTreeVerifierPass())			FUNCTION_PASS("verify<domtree>", DominatorTreeVerifierPass())
	FUNCTION_PASS("verify<regions>", RegionInfoVerifierPass())			FUNCTION_PASS("verify<regions>", RegionInfoVerifierPass())
				FUNCTION_PASS("verify<invariant-info>", InvariantInfoVerifierPass())
	#undef FUNCTION_PASS			#undef FUNCTION_PASS

	#ifndef LOOP_ANALYSIS			#ifndef LOOP_ANALYSIS
	#define LOOP_ANALYSIS(NAME, CREATE_PASS)			#define LOOP_ANALYSIS(NAME, CREATE_PASS)
	#endif			#endif
	LOOP_ANALYSIS("no-op-loop", NoOpLoopAnalysis())			LOOP_ANALYSIS("no-op-loop", NoOpLoopAnalysis())
	#undef LOOP_ANALYSIS			#undef LOOP_ANALYSIS

	#ifndef LOOP_PASS			#ifndef LOOP_PASS
	#define LOOP_PASS(NAME, CREATE_PASS)			#define LOOP_PASS(NAME, CREATE_PASS)
	#endif			#endif
	LOOP_PASS("invalidate<all>", InvalidateAllAnalysesPass())			LOOP_PASS("invalidate<all>", InvalidateAllAnalysesPass())
	LOOP_PASS("no-op-loop", NoOpLoopPass())			LOOP_PASS("no-op-loop", NoOpLoopPass())
	LOOP_PASS("print", PrintLoopPass(dbgs()))			LOOP_PASS("print", PrintLoopPass(dbgs()))
	#undef LOOP_PASS			#undef LOOP_PASS

lib/Transforms/InstCombine/InstructionCombining.cpp

[... 1,972 lines not shown ...]	for (unsigned i = 0, e = Users.size(); i != e; ++i) {
Instruction *I = cast<Instruction>(&*Users[i]);		Instruction *I = cast<Instruction>(&*Users[i]);

if (ICmpInst *C = dyn_cast<ICmpInst>(I)) {		if (ICmpInst *C = dyn_cast<ICmpInst>(I)) {
replaceInstUsesWith(*C,		replaceInstUsesWith(*C,
ConstantInt::get(Type::getInt1Ty(C->getContext()),		ConstantInt::get(Type::getInt1Ty(C->getContext()),
C->isFalseWhenEqual()));		C->isFalseWhenEqual()));
} else if (isa<BitCastInst>(I) || isa<GetElementPtrInst>(I)) {		} else if (isa<BitCastInst>(I) || isa<GetElementPtrInst>(I)) {
replaceInstUsesWith(*I, UndefValue::get(I->getType()));		replaceInstUsesWith(*I, UndefValue::get(I->getType()));
		} else if (IntrinsicInst *II = dyn_cast<IntrinsicInst>(I)) {
		if (II->getIntrinsicID() == Intrinsic::objectsize) {
		hfinkel (inline comment): Why is Intrinsic::objectsize handling in this patch?
		ConstantInt *CI = cast<ConstantInt>(II->getArgOperand(1));
		uint64_t DontKnow = CI->isZero() ? -1ULL : 0;
		replaceInstUsesWith(*I, ConstantInt::get(I->getType(), DontKnow));
		}

		// If this is an invariant start, mark it as unused before erasing it.
		if (II->getIntrinsicID() == Intrinsic::invariant_start)
		replaceInstUsesWith(*I, UndefValue::get(I->getType()));
}		}
eraseInstFromFunction(*I);		eraseInstFromFunction(*I);
}		}

if (InvokeInst *II = dyn_cast<InvokeInst>(&MI)) {		if (InvokeInst *II = dyn_cast<InvokeInst>(&MI)) {
// Replace invoke with a NOP intrinsic to maintain the original CFG		// Replace invoke with a NOP intrinsic to maintain the original CFG
Module *M = II->getModule();		Module *M = II->getModule();
Function *F = Intrinsic::getDeclaration(M, Intrinsic::donothing);		Function *F = Intrinsic::getDeclaration(M, Intrinsic::donothing);
[... 1,170 lines not shown ...]

lib/Transforms/Scalar/GVN.cpp

[... 28 lines not shown ...]
#include "llvm/Analysis/CFG.h"		#include "llvm/Analysis/CFG.h"
#include "llvm/Analysis/ConstantFolding.h"		#include "llvm/Analysis/ConstantFolding.h"
#include "llvm/Analysis/GlobalsModRef.h"		#include "llvm/Analysis/GlobalsModRef.h"
#include "llvm/Analysis/InstructionSimplify.h"		#include "llvm/Analysis/InstructionSimplify.h"
#include "llvm/Analysis/Loads.h"		#include "llvm/Analysis/Loads.h"
#include "llvm/Analysis/MemoryBuiltins.h"		#include "llvm/Analysis/MemoryBuiltins.h"
#include "llvm/Analysis/MemoryDependenceAnalysis.h"		#include "llvm/Analysis/MemoryDependenceAnalysis.h"
#include "llvm/Analysis/PHITransAddr.h"		#include "llvm/Analysis/PHITransAddr.h"
		#include "llvm/Analysis/PostDominators.h"
#include "llvm/Analysis/TargetLibraryInfo.h"		#include "llvm/Analysis/TargetLibraryInfo.h"
#include "llvm/Analysis/ValueTracking.h"		#include "llvm/Analysis/ValueTracking.h"
#include "llvm/IR/DataLayout.h"		#include "llvm/IR/DataLayout.h"
#include "llvm/IR/Dominators.h"		#include "llvm/IR/Dominators.h"
#include "llvm/IR/GlobalVariable.h"		#include "llvm/IR/GlobalVariable.h"
#include "llvm/IR/IRBuilder.h"		#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/IntrinsicInst.h"		#include "llvm/IR/IntrinsicInst.h"
#include "llvm/IR/LLVMContext.h"		#include "llvm/IR/LLVMContext.h"
[... 544 lines not shown ...]	PreservedAnalyses GVN::run(Function &F, AnalysisManager<Function> &AM) {
// significant! Re-ordering these variables will cause GVN when run alone to		// significant! Re-ordering these variables will cause GVN when run alone to
// be less effective! We should fix memdep and basic-aa to not exhibit this		// be less effective! We should fix memdep and basic-aa to not exhibit this
// behavior, but until then don't change the order here.		// behavior, but until then don't change the order here.
auto &AC = AM.getResult<AssumptionAnalysis>(F);		auto &AC = AM.getResult<AssumptionAnalysis>(F);
auto &DT = AM.getResult<DominatorTreeAnalysis>(F);		auto &DT = AM.getResult<DominatorTreeAnalysis>(F);
auto &TLI = AM.getResult<TargetLibraryAnalysis>(F);		auto &TLI = AM.getResult<TargetLibraryAnalysis>(F);
auto &AA = AM.getResult<AAManager>(F);		auto &AA = AM.getResult<AAManager>(F);
auto &MemDep = AM.getResult<MemoryDependenceAnalysis>(F);		auto &MemDep = AM.getResult<MemoryDependenceAnalysis>(F);
bool Changed = runImpl(F, AC, DT, TLI, AA, &MemDep);		auto *InvInfo = AM.getCachedResult<InvariantInfoAnalysis>(F);
		bool Changed = runImpl(F, AC, DT, TLI, AA, &MemDep, InvInfo);
return Changed ? PreservedAnalyses::none() : PreservedAnalyses::all();		return Changed ? PreservedAnalyses::none() : PreservedAnalyses::all();
}		}

#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)		#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
void GVN::dump(DenseMap<uint32_t, Value*>& d) {		void GVN::dump(DenseMap<uint32_t, Value*>& d) {
errs() << "{\n";		errs() << "{\n";
for (DenseMap<uint32_t, Value*>::iterator I = d.begin(),		for (DenseMap<uint32_t, Value*>::iterator I = d.begin(),
E = d.end(); I != E; ++I) {		E = d.end(); I != E; ++I) {
[... 1,549 lines not shown ...]	bool GVN::processInstruction(Instruction *I) {
if (MD && Repl->getType()->getScalarType()->isPointerTy())		if (MD && Repl->getType()->getScalarType()->isPointerTy())
MD->invalidateCachedPointerInfo(Repl);		MD->invalidateCachedPointerInfo(Repl);
markInstructionForDeletion(I);		markInstructionForDeletion(I);
return true;		return true;
}		}

/// runOnFunction - This is the main transformation entry point for a function.		/// runOnFunction - This is the main transformation entry point for a function.
bool GVN::runImpl(Function &F, AssumptionCache &RunAC, DominatorTree &RunDT,		bool GVN::runImpl(Function &F, AssumptionCache &RunAC, DominatorTree &RunDT,
const TargetLibraryInfo &RunTLI, AAResults &RunAA,		const TargetLibraryInfo &RunTLI, AAResults &RunAA,
MemoryDependenceResults *RunMD) {		MemoryDependenceResults *RunMD, InvariantInfo *InvInfo) {

		// Enable invariant range analysis here to keep its effect local to GVN.
		if (InvInfo)
		InvInfo->EnableInvariantRangeAnalysis();

		chandlerc (inline comment): This does not seem like the right design and seems over-engineered. Here is an alternative that seems much simpler: pass your InvariantRangeAnalysis or InvariantRangeInfo in as an optional parameter to MemoryDependenceAnalysis::getSimplePointerDependencyFrom and the few APIs that call it. Then GVN can opt into building the invariant information structures and passing them into the query, and other passes can skip that. Then, within memdep, where we query AA, also query the invariant analysis. Perhaps there are other designs as well that would be simpler and more direct? I'm curious what others think.
		lvoufo (author, inline reply): Hmm... See my earlier reply about Objective #1. This sounds like it'd basically shift the selective process from GVN into memdep, and increase opportunities for errors along the way. Note that GVN does not call `getSimplePointerDependencyFrom` directly. It starts with `getDependency`, which eventually calls `getSimplePointerDependencyFrom` through multiple hoops in the call chain. I prefer to keep things as simple as possible, i.e., enable the analysis in one place and not worry about it again until the analysis is requested in basic AA... Open to alternative suggestions too.
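For illustration only, the alternative chandlerc describes (handing the invariant information to the dependency query as an optional argument, rather than toggling a flag on a shared analysis object) might look roughly like the sketch below. `ToyInvariantInfo`, `Dep`, and `getPointerDependency` are hypothetical stand-ins and not LLVM APIs; the rule inside `isInvariant` is an assumption made only so the example is self-contained.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the patch's InvariantInfo (not the real class).
struct ToyInvariantInfo {
  // Toy rule (assumption): only the object named "i" is inside an
  // invariant range.
  bool isInvariant(const std::string &Obj) const { return Obj == "i"; }
};

enum class Dep { Clobber, NoClobber };

// Hypothetical memdep-style query. The invariant information is optional:
// passing nullptr means the caller did not opt in, so the query behaves as
// it would without the analysis.
Dep getPointerDependency(const std::string &Obj, bool CallMayWrite,
                         const ToyInvariantInfo *InvInfo = nullptr) {
  if (InvInfo && InvInfo->isInvariant(Obj))
    return Dep::NoClobber; // Writes cannot happen inside an invariant range.
  return CallMayWrite ? Dep::Clobber : Dep::NoClobber;
}
```

In this shape a caller such as GVN would pass a non-null pointer and other passes would pass nothing, so no enable/disable state has to be kept in sync. The author's objection above (that the pointer must then be threaded through every API between `getDependency` and `getSimplePointerDependencyFrom`) still applies.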
AC = &RunAC;		AC = &RunAC;
DT = &RunDT;		DT = &RunDT;
VN.setDomTree(DT);		VN.setDomTree(DT);
TLI = &RunTLI;		TLI = &RunTLI;
VN.setAliasAnalysis(&RunAA);		VN.setAliasAnalysis(&RunAA);
MD = RunMD;		MD = RunMD;
VN.setMemDep(MD);		VN.setMemDep(MD);

[... 36 lines not shown ...]	bool GVN::runImpl(Function &F, AssumptionCache &RunAC, DominatorTree &RunDT,
// we can't do this until PRE's critical edge splitting updates memdep.		// we can't do this until PRE's critical edge splitting updates memdep.
// Actually, when this happens, we should just fully integrate PRE into GVN.		// Actually, when this happens, we should just fully integrate PRE into GVN.

cleanupGlobalSets();		cleanupGlobalSets();
// Do not cleanup DeadBlocks in cleanupGlobalSets() as it's called for each		// Do not cleanup DeadBlocks in cleanupGlobalSets() as it's called for each
// iteration.		// iteration.
DeadBlocks.clear();		DeadBlocks.clear();

		// Disable invariant range analysis when finished with this pass.
		if (InvInfo)
		InvInfo->EnableInvariantRangeAnalysis(false);

return Changed;		return Changed;
}		}
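As a side note on the enable/disable pairing in `runImpl`, one way to make sure the flag is always cleared on exit is a small scope guard. The sketch below is a minimal, self-contained illustration; `ToyInvariantInfo` and `ScopedInvariantRangeEnable` are hypothetical names, and the default argument on `EnableInvariantRangeAnalysis` is assumed from how the patch calls it.

```cpp
#include <cassert>

// Hypothetical stand-in for the patch's InvariantInfo.
struct ToyInvariantInfo {
  bool Enabled = false;
  // Assumed signature: called with no argument to enable, with false to
  // disable, as in the diff above.
  void EnableInvariantRangeAnalysis(bool On = true) { Enabled = On; }
};

// Enables the analysis on construction and disables it on destruction, so
// the flag is reset on every exit path. A null pointer is tolerated
// because the patch passes nullptr when loads are not processed.
class ScopedInvariantRangeEnable {
  ToyInvariantInfo *Info;

public:
  explicit ScopedInvariantRangeEnable(ToyInvariantInfo *I) : Info(I) {
    if (Info)
      Info->EnableInvariantRangeAnalysis();
  }
  ~ScopedInvariantRangeEnable() {
    if (Info)
      Info->EnableInvariantRangeAnalysis(false);
  }
  ScopedInvariantRangeEnable(const ScopedInvariantRangeEnable &) = delete;
  ScopedInvariantRangeEnable &
  operator=(const ScopedInvariantRangeEnable &) = delete;
};
```

This does not address the design question raised in the inline comments; it only removes the need to remember the matching disable call.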

bool GVN::processBlock(BasicBlock *BB) {		bool GVN::processBlock(BasicBlock *BB) {
// FIXME: Kill off InstrsToErase by doing erasing eagerly in a helper function		// FIXME: Kill off InstrsToErase by doing erasing eagerly in a helper function
// (and incrementing BI before processing an instruction).		// (and incrementing BI before processing an instruction).
assert(InstrsToErase.empty() &&		assert(InstrsToErase.empty() &&
"We expect InstrsToErase to be empty across iterations");		"We expect InstrsToErase to be empty across iterations");
[... 447 lines not shown ...]	explicit GVNLegacyPass(bool NoLoads = false)
: FunctionPass(ID), NoLoads(NoLoads) {		: FunctionPass(ID), NoLoads(NoLoads) {
initializeGVNLegacyPassPass(*PassRegistry::getPassRegistry());		initializeGVNLegacyPassPass(*PassRegistry::getPassRegistry());
}		}

bool runOnFunction(Function &F) override {		bool runOnFunction(Function &F) override {
if (skipOptnoneFunction(F))		if (skipOptnoneFunction(F))
return false;		return false;

		if (!NoLoads) {
		// Once enabled, initialize the analysis' dependencies.
		auto &InvRA = getAnalysis<InvariantRangeAnalysisPass>();
		InvRA.InitDependencies(*this);
		}

return Impl.runImpl(		return Impl.runImpl(
F, getAnalysis<AssumptionCacheTracker>().getAssumptionCache(F),		F, getAnalysis<AssumptionCacheTracker>().getAssumptionCache(F),
getAnalysis<DominatorTreeWrapperPass>().getDomTree(),		getAnalysis<DominatorTreeWrapperPass>().getDomTree(),
getAnalysis<TargetLibraryInfoWrapperPass>().getTLI(),		getAnalysis<TargetLibraryInfoWrapperPass>().getTLI(),
getAnalysis<AAResultsWrapperPass>().getAAResults(),		getAnalysis<AAResultsWrapperPass>().getAAResults(),
NoLoads ? nullptr		NoLoads ? nullptr
: &getAnalysis<MemoryDependenceWrapperPass>().getMemDep());		: &getAnalysis<MemoryDependenceWrapperPass>().getMemDep(),
		NoLoads ? nullptr
		: &getAnalysis<InvariantRangeAnalysisPass>().getInvariantInfo());
}		}

void getAnalysisUsage(AnalysisUsage &AU) const override {		void getAnalysisUsage(AnalysisUsage &AU) const override {
AU.addRequired<AssumptionCacheTracker>();		AU.addRequired<AssumptionCacheTracker>();
AU.addRequired<DominatorTreeWrapperPass>();		AU.addRequired<DominatorTreeWrapperPass>();
AU.addRequired<TargetLibraryInfoWrapperPass>();		AU.addRequired<TargetLibraryInfoWrapperPass>();
if (!NoLoads)		if (!NoLoads) {
AU.addRequired<MemoryDependenceWrapperPass>();		AU.addRequired<MemoryDependenceWrapperPass>();
		AU.addRequired<InvariantRangeAnalysisPass>();
		AU.addRequired<PostDominatorTreeWrapperPass>();
		AU.addPreserved<PostDominatorTreeWrapperPass>();
		}
AU.addRequired<AAResultsWrapperPass>();		AU.addRequired<AAResultsWrapperPass>();

AU.addPreserved<DominatorTreeWrapperPass>();		AU.addPreserved<DominatorTreeWrapperPass>();
AU.addPreserved<GlobalsAAWrapperPass>();		AU.addPreserved<GlobalsAAWrapperPass>();
}		}

private:		private:
bool NoLoads;		bool NoLoads;
GVN Impl;		GVN Impl;
};		};

char GVNLegacyPass::ID = 0;		char GVNLegacyPass::ID = 0;

// The public interface to this file...		// The public interface to this file...
FunctionPass *llvm::createGVNPass(bool NoLoads) {		FunctionPass *llvm::createGVNPass(bool NoLoads) {
return new GVNLegacyPass(NoLoads);		return new GVNLegacyPass(NoLoads);
}		}

INITIALIZE_PASS_BEGIN(GVNLegacyPass, "gvn", "Global Value Numbering", false, false)		INITIALIZE_PASS_BEGIN(GVNLegacyPass, "gvn", "Global Value Numbering", false, false)
INITIALIZE_PASS_DEPENDENCY(AssumptionCacheTracker)		INITIALIZE_PASS_DEPENDENCY(AssumptionCacheTracker)
INITIALIZE_PASS_DEPENDENCY(MemoryDependenceWrapperPass)		INITIALIZE_PASS_DEPENDENCY(MemoryDependenceWrapperPass)
		INITIALIZE_PASS_DEPENDENCY(InvariantRangeAnalysisPass)
INITIALIZE_PASS_DEPENDENCY(DominatorTreeWrapperPass)		INITIALIZE_PASS_DEPENDENCY(DominatorTreeWrapperPass)
		INITIALIZE_PASS_DEPENDENCY(PostDominatorTreeWrapperPass)
INITIALIZE_PASS_DEPENDENCY(TargetLibraryInfoWrapperPass)		INITIALIZE_PASS_DEPENDENCY(TargetLibraryInfoWrapperPass)
INITIALIZE_PASS_DEPENDENCY(AAResultsWrapperPass)		INITIALIZE_PASS_DEPENDENCY(AAResultsWrapperPass)
INITIALIZE_PASS_DEPENDENCY(GlobalsAAWrapperPass)		INITIALIZE_PASS_DEPENDENCY(GlobalsAAWrapperPass)
INITIALIZE_PASS_END(GVNLegacyPass, "gvn", "Global Value Numbering", false, false)		INITIALIZE_PASS_END(GVNLegacyPass, "gvn", "Global Value Numbering", false, false)

test/Analysis/InvariantInfo/gvn-basic.ll

This file was added.

				; RUN: opt < %s -S -gvn 2>&1 | FileCheck %s
				; RUN: opt < %s -gvn -stats 2>&1 | FileCheck %s -check-prefix=STATS

				; Tests elementary load elimination using call instructions as potential
				; clobbers. Also checks statistical data generation.
				; NOTE: "-gvn" expands to
				; "-targetlibinfo -tti -assumption-cache-tracker -domtree "
				; "-postdomtree -invariant-range-analysis -basicaa -aa -memdep "
				; "-gvn -verify"

				; STATS: 2 gvn - Number of blocks merged
				; STATS: 2 gvn - Number of equalities propagated
				; STATS: 6 gvn - Number of instructions deleted
				; STATS: 6 gvn - Number of loads deleted
				; STATS: 18 invariant_info - Number of invariant range analysis computed
				; STATS: 21 invariant_info - Number of invariant range analysis requested
				; STATS: 4 invariant_info - Number of invariant_end instructions
				; STATS: 4 invariant_info - Number of invariant_start instructions
				; STATS: 1 memdep - Number of block queries that were completely cached
				; STATS: 1 memdep - Number of uncached non-local ptr responses

				; A simple single range.
				define void @simple_range() {
				; CHECK: define void @simple_range()
				entry:
				%i = alloca i32
				call void @foo(i32* %i) ; #1

				%bi = bitcast i32* %i to i8*
				%inv1 = call {}* (i64, i8*) @llvm.invariant.start(i64 4, i8* %bi)
				; CHECK: call {{.*}}@llvm.invariant.start(i64 {{[0-9]+}}, i8*
				%ld1 = load i32, i32* %i ; Clobbered by #1; Unchanged.
				; CHECK: load i32, i32*
				call void @bar(i32 %ld1)
				call void @foo(i32* %i) ; #2
				%ld2 = load i32, i32* %i ; Not clobbered by #2; Merged into %ld1.
				; CHECK-NOT: load i32, i32*
				call void @bar(i32 %ld2)
				call void @foo(i32* %i) ; #3

				call void @llvm.invariant.end({}* %inv1, i64 4, i8* %bi)
				; CHECK: call {{.*}}@llvm.invariant.end({{.*}}, i64 {{[0-9]+}}, i8*
				%ld3 = load i32, i32* %i ; Not clobbered by #3, nor #2; Merged into %ld1.
				; CHECK-NOT: load i32, i32*
				call void @bar(i32 %ld3)
				call void @foo(i32* %i) ; #4
				%ld4 = load i32, i32* %i ; Clobbered by #4; Unchanged.
				; CHECK: load i32, i32*
				call void @bar(i32 %ld4)
				ret void
				}
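The expectations annotated in `@simple_range` can be reproduced with a very small model, sketched here under stated assumptions: a single local object, calls to `@foo` treated as clobbers only when they occur outside an invariant range, and every other instruction ignored. `Ev` and `mergedLoads` are hypothetical names; this is not how the patch itself is implemented.

```cpp
#include <cassert>
#include <vector>

// Hypothetical event kinds for one local object.
enum class Ev { Call, InvStart, InvEnd, Load };

// For each Load, in order, report whether it can be merged into an
// earlier load (true) or must stay (false).
std::vector<bool> mergedLoads(const std::vector<Ev> &Events) {
  std::vector<bool> Merged;
  bool InRange = false;   // Between invariant.start and invariant.end.
  bool HaveValue = false; // An earlier load's value is still valid.
  for (Ev E : Events) {
    switch (E) {
    case Ev::InvStart:
      InRange = true;
      break;
    case Ev::InvEnd:
      InRange = false;
      break;
    case Ev::Call:
      if (!InRange)
        HaveValue = false; // A call outside the range may write the object.
      break;
    case Ev::Load:
      Merged.push_back(HaveValue);
      HaveValue = true;
      break;
    }
  }
  return Merged;
}
```

Feeding it the event order of `@simple_range` (call #1, start, `%ld1`, call #2, `%ld2`, call #3, end, `%ld3`, call #4, `%ld4`) yields kept, merged, merged, kept, which matches the CHECK / CHECK-NOT pattern of the test.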

				; Multiple ranges, spanning different branches.
				define void @multi_branch_ranges() {
				; CHECK: define void @multi_branch_ranges()
				entry:
				%i = alloca i32
				call void @foo(i32* %i) ; #1

				%bi = bitcast i32* %i to i8*
				%inv1 = call {}* (i64, i8*) @llvm.invariant.start(i64 4, i8* %bi)
				; CHECK: call {{.*}}@llvm.invariant.start(i64 {{[0-9]+}}, i8*
				call void @foo(i32* %i) ; #2
				br label %next

				next:
				%ld1 = load i32, i32* %i ; Not clobbered by #2; Clobbered by #1; Unchanged.
				; CHECK: load i32, i32*
				call void @bar(i32 %ld1)

				call void @llvm.invariant.end({}* %inv1, i64 4, i8* %bi)
				; CHECK: call {{.*}}@llvm.invariant.end({{.*}}, i64 {{[0-9]+}}, i8*
				%ld2 = load i32, i32* %i ; Merged into %ld1.
				; CHECK-NOT: load i32, i32*
				call void @bar(i32 %ld2)

				%inv2 = call {}* (i64, i8*) @llvm.invariant.start(i64 4, i8* %bi)
				; CHECK: call {{.*}}@llvm.invariant.start(i64 {{[0-9]+}}, i8*
				call void @foo(i32* %i) ; #3
				%ld3 = load i32, i32* %i ; Clobbered by #3; Unchanged.
				; CHECK: load i32, i32*
				call void @bar(i32 %ld3)
				%cond = icmp eq i32 %ld1, %ld3
				br i1 %cond, label %same, label %end

				same:
				call void @foo(i32* %i) ; #4
				%ld4 = load i32, i32* %i ; Not clobbered by #4; Merged into %ld3 == %ld1.
				; CHECK-NOT: load i32, i32*
				call void @bar(i32 %ld4)

				call void @llvm.invariant.end({}* %inv2, i64 4, i8* %bi) ; Redundant -- since nested within another invariant range.
				; CHECK: call {{.*}}@llvm.invariant.end({{.*}}, i64 {{[0-9]+}}, i8*
				call void @foo(i32* %i) ; #5
				%ld5 = load i32, i32* %i ; Not clobbered by #5; Merged into %ld3 == %ld1.
				; CHECK-NOT: load i32, i32*
				call void @bar(i32 %ld5)
				br label %end_same

				end_same:
				call void @llvm.invariant.end({}* %inv2, i64 4, i8* %bi) ; invariant_start %inv2 ends here.
				; CHECK: call {{.*}}@llvm.invariant.end({{.*}}, i64 {{[0-9]+}}, i8*
				call void @foo(i32* %i) ; #6
				%ld6 = load i32, i32* %i ; Clobbered by #6; Unchanged.
				; CHECK: load i32, i32*
				call void @bar(i32 %ld6)

				%inv3 = call {}* (i64, i8*) @llvm.invariant.start(i64 4, i8* %bi)
				; CHECK: call {{.*}}@llvm.invariant.start(i64 {{[0-9]+}}, i8*
				call void @foo(i32* %i) ; #7
				%ld7 = load i32, i32* %i ; Not clobbered by #7; Merged into %ld6.
				; CHECK-NOT: load i32, i32*
				call void @bar(i32 %ld7)
				br label %end

				end:
				call void @foo(i32* %i) ; #8
				%ld8 = load i32, i32* %i ; Clobbered by #8; Unchanged.
				; CHECK: load i32, i32*
				call void @bar(i32 %ld8)
				ret void
				}

				declare void @bar(i32) readonly
				declare void @foo(i32*)
				declare {}* @llvm.invariant.start(i64, i8* nocapture)
				declare void @llvm.invariant.end({}*, i64, i8* nocapture)

test/Analysis/InvariantInfo/gvn.ll

This file was added.

				; RUN: opt < %s -gvn -S | FileCheck %s

				; Tests elementary load elimination using instructions other than call
				; instructions as potential clobbers.

				; select instructions:
				; Loading from a select instruction is clobbered by any instruction that
				; clobbers at least one of the select values. It is not clobbered by
				; instructions that do not clobber any of the select values.
				define void @select() {
				; CHECK: define void @select()
				entry:
				%i = alloca i32
				%j = alloca i32
				%cond = call i1 @condition();
				%ij = select i1 %cond, i32* %i, i32* %j
				call void @foo(i32* %i, i32* %j) ; #1

				%bi = bitcast i32* %i to i8*
				%invi = call {}* (i64, i8*) @llvm.invariant.start(i64 4, i8* %bi)
				; CHECK: call {{.*}}@llvm.invariant.start(i64 {{[0-9]+}}, i8*
				%ld1 = load i32, i32* %ij ; Clobbered by #1; Unchanged.
				; CHECK: load i32, i32*
				call void @bar(i32 %ld1)
				call void @foo(i32* %i, i32* %j) ; #2
				%ld2 = load i32, i32* %ij ; Clobbered by #2; Unchanged.
				; CHECK: load i32, i32*
				call void @bar(i32 %ld2)
				call void @foo(i32* %i, i32* %j) ; #3

				%bj = bitcast i32* %j to i8*
				%invj = call {}* (i64, i8*) @llvm.invariant.start(i64 4, i8* %bj)
				; CHECK: call {{.*}}@llvm.invariant.start(i64 {{[0-9]+}}, i8*
				%ld3 = load i32, i32* %ij ; Clobbered by #3; Unchanged.
				; CHECK: load i32, i32*
				call void @bar(i32 %ld3)
				call void @foo(i32* %i, i32* %j) ; #4
				%ld4 = load i32, i32* %ij ; Not clobbered by #4; Merged into %ld3.
				; CHECK-NOT: load i32, i32*
				call void @bar(i32 %ld4)
				call void @foo(i32* %i, i32* %j) ; #5

				call void @llvm.invariant.end({}* %invj, i64 4, i8* %bj)
				; CHECK: call {{.*}}@llvm.invariant.end({{.*}}, i64 {{[0-9]+}}, i8*
				%ld5 = load i32, i32* %ij ; Not clobbered by #5, nor #4; Merged into %ld3.
				; CHECK-NOT: load i32, i32*
				call void @bar(i32 %ld5)
				call void @foo(i32* %i, i32* %j) ; #6
				%ld6 = load i32, i32* %ij ; Clobbered by #6; Unchanged.
				; CHECK: load i32, i32*
				call void @bar(i32 %ld6)
				call void @foo(i32* %i, i32* %j) ; #7

				call void @llvm.invariant.end({}* %invi, i64 4, i8* %bi)
				; CHECK: call {{.*}}@llvm.invariant.end({{.*}}, i64 {{[0-9]+}}, i8*
				%ld7 = load i32, i32* %ij ; Clobbered by #7; Unchanged.
				; CHECK: load i32, i32*
				call void @bar(i32 %ld7)
				call void @foo(i32* %i, i32* %j) ; #8
				%ld8 = load i32, i32* %ij ; Clobbered by #8; Unchanged.
				; CHECK: load i32, i32*
				call void @bar(i32 %ld8)

				ret void
				}
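The rule stated at the top of this test (a load through a select is clobbered if the instruction clobbers at least one of the select values) can be written as a one-line predicate. The sketch below is a toy model, assuming each candidate object is identified by name and that an object is protected exactly when it is currently inside an invariant range; `callClobbersLoad` is a hypothetical helper, not part of the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Returns true if a call that may write any candidate object clobbers a
// load through a select (or phi) over those candidates. The load is safe
// only when every candidate is currently inside an invariant range.
bool callClobbersLoad(const std::vector<std::string> &Candidates,
                      const std::set<std::string> &InvariantNow) {
  return std::any_of(Candidates.begin(), Candidates.end(),
                     [&](const std::string &Obj) {
                       return InvariantNow.count(Obj) == 0;
                     });
}
```

With candidates `i` and `j`, this matches the annotations in `@select`: call #2 (only `%i` invariant) clobbers, call #4 (both invariant) does not, and call #6 (the range on `%j` has ended) clobbers again.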

				; PHI nodes:
				; Loading from a phi node is clobbered by any instruction that clobbers
				; at least one of the incoming values. It is not clobbered by instructions
				; that do not clobber any of the incoming values.
				define void @phi_node() {
				; CHECK: define void @phi_node()
				entry:
				%i = alloca i32
				%j = alloca i32
				%cond = call i1 @condition();
				br i1 %cond, label %iside, label %jside

				iside: ; Some fillers...
				call void @foo(i32* %i, i32* %j)
				br label %both

				jside: ; Some fillers...
				call void @foo(i32* %i, i32* %j)
				%ldj = load i32, i32* %j
				call void @bar(i32 %ldj)
				call void @foo(i32* %i, i32* %j)
				br label %both

				both: ; The rest is similar to select() above.
				%ij = phi i32* [ %i, %iside ], [ %j, %jside ]
				call void @foo(i32* %i, i32* %j) ; #1

				%bi = bitcast i32* %i to i8*
				%invi = call {}* (i64, i8*) @llvm.invariant.start(i64 4, i8* %bi)
				; CHECK: call {{.*}}@llvm.invariant.start(i64 {{[0-9]+}}, i8*
				%ld1 = load i32, i32* %ij ; Clobbered by #1; Unchanged.
				; CHECK: load i32, i32*
				call void @bar(i32 %ld1)
				call void @foo(i32* %i, i32* %j) ; #2
				%ld2 = load i32, i32* %ij ; Clobbered by #2; Unchanged.
				; CHECK: load i32, i32*
				call void @bar(i32 %ld2)
				call void @foo(i32* %i, i32* %j) ; #3

				%bj = bitcast i32* %j to i8*
				%invj = call {}* (i64, i8*) @llvm.invariant.start(i64 4, i8* %bj)
				; CHECK: call {{.*}}@llvm.invariant.start(i64 {{[0-9]+}}, i8*
				%ld3 = load i32, i32* %ij ; Clobbered by #3; Unchanged.
				; CHECK: load i32, i32*
				call void @bar(i32 %ld3)
				call void @foo(i32* %i, i32* %j) ; #4
				%ld4 = load i32, i32* %ij ; Not clobbered by #4; Merged into %ld3.
				; CHECK-NOT: load i32, i32*
				call void @bar(i32 %ld4)
				call void @foo(i32* %i, i32* %j) ; #5

				call void @llvm.invariant.end({}* %invj, i64 4, i8* %bj)
				; CHECK: call {{.*}}@llvm.invariant.end({{.*}}, i64 {{[0-9]+}}, i8*
				%ld5 = load i32, i32* %ij ; Not clobbered by #5, nor #4; Merged into %ld3.
				; CHECK-NOT: load i32, i32*
				call void @bar(i32 %ld5)
				call void @foo(i32* %i, i32* %j) ; #6
				%ld6 = load i32, i32* %ij ; Clobbered by #6; Unchanged.
				; CHECK: load i32, i32*
				call void @bar(i32 %ld6)
				call void @foo(i32* %i, i32* %j) ; #7

				call void @llvm.invariant.end({}* %invi, i64 4, i8* %bi)
				; CHECK: call {{.*}}@llvm.invariant.end({{.*}}, i64 {{[0-9]+}}, i8*
				%ld7 = load i32, i32* %ij ; Clobbered by #7; Unchanged.
				; CHECK: load i32, i32*
				call void @bar(i32 %ld7)
				call void @foo(i32* %i, i32* %j) ; #8
				%ld8 = load i32, i32* %ij ; Clobbered by #8; Unchanged.
				; CHECK: load i32, i32*
				call void @bar(i32 %ld8)

				ret void
				}

				; TODO: Test other instructions that can modify memory.
				; Examples:
				; define void @invoke() { ... }
				; define void @store() { ... }
				; define void @vaarg() { ... }
				; define void @catchpad() { ... }
				; define void @catchreturn() { ... }
				; ...
				; define void @globalvar() { ... }
				; ...

				declare i1 @condition()
				declare void @bar(i32) readonly
				declare void @foo(i32*, i32*)
				declare {}* @llvm.invariant.start(i64, i8* nocapture)
				declare void @llvm.invariant.end({}*, i64, i8* nocapture)

test/Analysis/InvariantInfo/instcombine.ll

This file was added.

				; RUN: opt < %s -instcombine -S | FileCheck %s

				; Do not remove intrinsic calls after merging uses of defined values.
				define void @merged_def() {
				; CHECK: define void @merged_def
				entry:
				%i = alloca i32
				call void @foo(i32* %i)
				%0 = bitcast i32* %i to i8*
				%inv0 = call {}* (i64, i8*) @llvm.invariant.start(i64 4, i8* %0)
				; CHECK: call {{.*}}@llvm.invariant.start(i64 {{[0-9]+}}, i8*
				%inv1 = call {}* (i64, i8*) @llvm.invariant.start(i64 4, i8* %0)
				; CHECK: call {{.*}}@llvm.invariant.start(i64 {{[0-9]+}}, i8*
				%ld1 = load i32, i32* %i
				call void @bar(i32 %ld1)
				call void @bar(i32 %ld1)
				call void @llvm.invariant.end({}* %inv1, i64 4, i8* %0)
				; CHECK: call {{.*}}@llvm.invariant.end({{.*}}, i64 {{[0-9]+}}, i8*
				ret void
				}

				; Remove intrinsic calls after merging uses of undefined values.
				define void @merged_undef() {
				; CHECK: define void @merged_undef
				entry:
				%i = alloca i32
				%0 = bitcast i32* %i to i8*
				%inv0 = call {}* (i64, i8*) @llvm.invariant.start(i64 4, i8* %0)
				; CHECK-NOT: call {{.*}}@llvm.invariant.start(i64 {{[0-9]+}}, i8*
				%inv1 = call {}* (i64, i8*) @llvm.invariant.start(i64 4, i8* %0)
				; CHECK-NOT: call {{.*}}@llvm.invariant.start(i64 {{[0-9]+}}, i8*
				call void @bar(i32 undef)
				call void @bar(i32 undef)
				call void @llvm.invariant.end({}* %inv1, i64 4, i8* %0)
				; CHECK-NOT: call {{.*}}@llvm.invariant.end({{.*}}, i64 {{[0-9]+}}, i8*
				ret void
				}

				declare void @bar(i32) readonly
				declare void @foo(i32*)
				declare {}* @llvm.invariant.start(i64, i8* nocapture)
				declare void @llvm.invariant.end({}*, i64, i8* nocapture)

test/Analysis/InvariantInfo/print.ll

This file was added.

				; RUN: opt < %s -invariant-range-analysis -analyze 2>&1 | FileCheck %s
				; RUN: opt < %s -invariant-range-analysis -disable-output -passes="print<invariant-info>" 2>&1 | FileCheck %s
				; RUN: opt < %s -invariant-range-analysis -disable-output -passes="print<invariant-info>,verify<invariant-info>" 2>&1 | FileCheck %s
				; RUN: opt < %s -invariant-range-analysis -disable-output -passes="print<invariant-info>,verify<invariant-info>" -stats 2>&1 | FileCheck %s -check-prefix=STATS

				; Tests printing, verification, and statistics functionality for invariant
				; range analysis.

				; CHECK: Invariant Info for function: simple_range
				; CHECK-NEXT: %i = alloca i32 :
				; CHECK-NEXT: invariant_start in block: entry
				; CHECK-NEXT: invariant_end in block: entry

				; CHECK: Invariant Info for function: multi_branch_ranges
				; CHECK-NEXT: %i = alloca i32 :
				; CHECK-NEXT: invariant_start in block: entry
				; CHECK-NEXT: invariant_end in block: next
				; CHECK-NEXT: invariant_start in block: next
				; CHECK-NEXT: invariant_end in block: end_same
				; CHECK-NEXT: invariant_end in block: same
				; CHECK-NEXT: invariant_start in block: end_same
				; CHECK-NOT: invariant_end

				; STATS: 4 invariant_info - Number of invariant_end instructions
				; STATS: 4 invariant_info - Number of invariant_start instructions

				; A simple single range.
				define void @simple_range() {
				entry:
				%i = alloca i32
				call void @foo(i32* %i)
				%0 = bitcast i32* %i to i8*
				%inv1 = call {}* (i64, i8*) @llvm.invariant.start(i64 4, i8* %0)
				call void @foo(i32* %i)
				%ld1 = load i32, i32* %i
				call void @bar(i32 %ld1)
				call void @foo(i32* %i)
				%ld2 = load i32, i32* %i
				call void @bar(i32 %ld2)
				call void @llvm.invariant.end({}* %inv1, i64 4, i8* %0)
				%ld3 = load i32, i32* %i
				call void @bar(i32 %ld3)
				call void @foo(i32* %i)
				%ld4 = load i32, i32* %i
				call void @bar(i32 %ld4)
				ret void
				}

				; Multiple ranges, spanning different branches.
				define void @multi_branch_ranges() {
				entry:
				%i = alloca i32
				call void @foo(i32* %i)
				%0 = bitcast i32* %i to i8*
				%inv1 = call {}* (i64, i8*) @llvm.invariant.start(i64 4, i8* %0)
				call void @foo(i32* %i)
				br label %next

				next:
				%ld1 = load i32, i32* %i
				call void @bar(i32 %ld1)
				call void @llvm.invariant.end({}* %inv1, i64 4, i8* %0)
				%ld2 = load i32, i32* %i
				call void @bar(i32 %ld2)
				%inv2 = call {}* (i64, i8*) @llvm.invariant.start(i64 4, i8* %0)
				call void @foo(i32* %i)
				%ld3 = load i32, i32* %i
				call void @bar(i32 %ld3)
				%cond = icmp eq i32 %ld2, %ld3
				br i1 %cond, label %same, label %end

				same:
				%ld4 = load i32, i32* %i
				call void @bar(i32 %ld4)
				call void @llvm.invariant.end({}* %inv2, i64 4, i8* %0)
				call void @foo(i32* %i)
				%ld5 = load i32, i32* %i
				call void @bar(i32 %ld5)
				br label %end_same

				end_same:
				call void @llvm.invariant.end({}* %inv2, i64 4, i8* %0)
				call void @foo(i32* %i)
				%ld6 = load i32, i32* %i
				call void @bar(i32 %ld6)
				%inv3 = call {}* (i64, i8*) @llvm.invariant.start(i64 4, i8* %0)
				call void @foo(i32* %i)
				%ld7 = load i32, i32* %i
				call void @bar(i32 %ld7)
				br label %end

				end:
				ret void
				}

				declare void @bar(i32) readonly
				declare void @foo(i32*)
				declare {}* @llvm.invariant.start(i64, i8* nocapture)
				declare void @llvm.invariant.end({}*, i64, i8* nocapture)
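
The STATS checks in print.ll expect four invariant_start and four invariant_end instructions: one pair in @simple_range plus three of each in @multi_branch_ranges. As a rough cross-check only, the tally can be sketched on the textual IR as below. The helper countIntrinsicCalls is hypothetical and is not part of this patch; the pass itself presumably counts IR instructions rather than text lines, so this is a sketch under that stated assumption.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper (not part of the patch): counts the lines of textual IR
// that contain a call to the named intrinsic. Comment lines and `declare`
// lines are skipped so that only call sites are tallied.
std::size_t countIntrinsicCalls(const std::string &ir, const std::string &name) {
  const std::string needle = "@" + name;
  std::istringstream in(ir);
  std::string line;
  std::size_t count = 0;
  while (std::getline(in, line)) {
    const std::size_t first = line.find_first_not_of(" \t");
    if (first == std::string::npos || line[first] == ';')
      continue; // blank line or IR comment (including RUN/CHECK lines)
    if (line.compare(first, 7, "declare") == 0)
      continue; // declaration, not a call site
    if (line.find("call ") == std::string::npos)
      continue; // not a call instruction
    const std::size_t pos = line.find(needle);
    if (pos == std::string::npos)
      continue;
    // Require '(' right after the name so that a longer intrinsic name
    // sharing the same prefix is not counted.
    const std::size_t after = pos + needle.size();
    if (after < line.size() && line[after] == '(')
      ++count;
  }
  return count;
}
```

Run over the body of print.ll, countIntrinsicCalls(ir, "llvm.invariant.start") and countIntrinsicCalls(ir, "llvm.invariant.end") should each come to four, matching the STATS lines.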

test/Analysis/InvariantInfo/verify.ll

This file was added.

				; Check the validity of our uses of invariant_start/end pairs,
				; based on the legacy pass manager.
				; RUN: opt < %s -invariant-range-analysis -disable-output -verify

				; Also check validity with optimization levels enabled
				; (each of which implicitly enables -invariant-range-analysis).
				; RUN: opt < %s -O1 -disable-output -verify
				; RUN: opt < %s -O2 -disable-output -verify
				; RUN: opt < %s -O3 -disable-output -verify

				; A simple single range.
				define void @simple_range() {
				entry:
				%i = alloca i32
				call void @foo(i32* %i)
				%0 = bitcast i32* %i to i8*
				%inv1 = call {}* (i64, i8*) @llvm.invariant.start(i64 4, i8* %0)
				call void @foo(i32* %i)
				%ld1 = load i32, i32* %i
				call void @bar(i32 %ld1)
				call void @foo(i32* %i)
				%ld2 = load i32, i32* %i
				call void @bar(i32 %ld2)
				call void @llvm.invariant.end({}* %inv1, i64 4, i8* %0)
				%ld3 = load i32, i32* %i
				call void @bar(i32 %ld3)
				call void @foo(i32* %i)
				%ld4 = load i32, i32* %i
				call void @bar(i32 %ld4)
				ret void
				}

				; Multiple ranges, spanning different branches.
				define void @multi_branch_ranges() {
				entry:
				%i = alloca i32
				call void @foo(i32* %i)
				%0 = bitcast i32* %i to i8*
				%inv1 = call {}* (i64, i8*) @llvm.invariant.start(i64 4, i8* %0)
				call void @foo(i32* %i)
				br label %next

				next:
				%ld1 = load i32, i32* %i
				call void @bar(i32 %ld1)
				call void @llvm.invariant.end({}* %inv1, i64 4, i8* %0)
				%ld2 = load i32, i32* %i
				call void @bar(i32 %ld2)
				%inv2 = call {}* (i64, i8*) @llvm.invariant.start(i64 4, i8* %0)
				call void @foo(i32* %i)
				%ld3 = load i32, i32* %i
				call void @bar(i32 %ld3)
				%cond = icmp eq i32 %ld2, %ld3
				br i1 %cond, label %same, label %end

				same:
				%ld4 = load i32, i32* %i
				call void @bar(i32 %ld4)
				call void @llvm.invariant.end({}* %inv2, i64 4, i8* %0)
				call void @foo(i32* %i)
				%ld5 = load i32, i32* %i
				call void @bar(i32 %ld5)
				br label %end_same

				end_same:
				call void @llvm.invariant.end({}* %inv2, i64 4, i8* %0)
				call void @foo(i32* %i)
				%ld6 = load i32, i32* %i
				call void @bar(i32 %ld6)
				%inv3 = call {}* (i64, i8*) @llvm.invariant.start(i64 4, i8* %0)
				call void @foo(i32* %i)
				%ld7 = load i32, i32* %i
				call void @bar(i32 %ld7)
				br label %end

				end:
				ret void
				}

				; Remove intrinsic calls after substituting undefined values.
				define void @undef() {
				entry:
				%i = alloca i32
				%0 = bitcast i32* %i to i8*
				%inv0 = call {}* (i64, i8*) @llvm.invariant.start(i64 4, i8* %0)
				%1 = bitcast i32* %i to i8*
				%inv1 = call {}* (i64, i8*) @llvm.invariant.start(i64 4, i8* %1)
				%ld1 = load i32, i32* %i
				call void @bar(i32 %ld1)
				%ld2 = load i32, i32* %i
				call void @bar(i32 %ld2)
				call void @llvm.invariant.end({}* %inv1, i64 4, i8* %1)
				ret void
				}

				declare void @bar(i32) readonly
				declare void @foo(i32*)
				declare {}* @llvm.invariant.start(i64, i8* nocapture)
				declare void @llvm.invariant.end({}*, i64, i8* nocapture)

Revision Contents

Diff 52373

include/llvm/Analysis/BasicAliasAnalysis.h

include/llvm/Analysis/InvariantInfo.h

include/llvm/Analysis/PostDominators.h

include/llvm/InitializePasses.h

include/llvm/Transforms/Scalar/GVN.h

lib/Analysis/BasicAliasAnalysis.cpp

lib/Analysis/CMakeLists.txt

lib/Analysis/InvariantInfo.cpp

lib/Analysis/MemoryDependenceAnalysis.cpp

lib/Analysis/PostDominators.cpp

lib/Passes/PassBuilder.cpp

lib/Passes/PassRegistry.def

lib/Transforms/InstCombine/InstructionCombining.cpp

lib/Transforms/Scalar/GVN.cpp

test/Analysis/InvariantInfo/gvn-basic.ll

test/Analysis/InvariantInfo/gvn.ll

test/Analysis/InvariantInfo/instcombine.ll

test/Analysis/InvariantInfo/print.ll

test/Analysis/InvariantInfo/verify.ll
