- User Since: Jul 13 2012, 1:07 PM (292 w, 5 d)
Thanks, that is very helpful. It's much more likely that dominator info is up to date than loop info, but it's not worth getting into here. The only thing I request is that you update the comments to suggest this probably only makes sense to use if LoopInfo is already computed, and please add your description of how it works, with the example, to the comments.
Hey, can you help me understand the algorithm?
I don't have a good feel for the invariants you expect going in.
It just says "it works on anything given loop info".
I.e., what info do you expect the loop info to provide vs. the topo ordering?
Sun, Feb 18
To be clear, i think the lifetime system has enough issues that trying to cover it with MemorySSA as it currently exists wouldn't end up well.
Things that want to care about and try to optimize around lifetimes can do that.
Intermixing object lifetime and aliasing results in the same system will, IMHO, give you a bad time.
Fri, Feb 16
Interesting. NewGVN can already figure this out if it's important.
How often does stuff like this occur?
Thu, Feb 15
Oh god i've been away from LLVM too long.
So, i don't know what we *want* the semantic to be.
Can you print the memoryssa dumps for this for me?
Add hal, etc, get his thoughts
Wed, Feb 14
Personally, i totally think this is worthwhile and worth spending some time on to prevent the slow proliferation of 10+ argument functions taking analysis results. Because that's the path we are heading down, and that's definitely hard to refactor, change, etc
But i'm also not the person who would do the work (though happy to review).
Tue, Feb 13
This adds another basic block walk per InstBB, because llvm::insertDebugValuesForPHIs walks the entire bb :(
An example of that is getBestSimplifyQuery, which takes various analysis managers, etc., and extracts the analyses it cares about from them.
If we did similar, we could simplify what gets passed through and easily add updating of a new thing.
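The bundling idea can be sketched std-only; the struct and analysis names here are hypothetical stand-ins, not the real getBestSimplifyQuery machinery:

```cpp
#include <cassert>

// Hypothetical stand-ins for analysis results; names are illustrative only.
struct DominatorTree {};
struct LoopInfo {};

// One bundle of optional analysis pointers, in the spirit of
// getBestSimplifyQuery: callers pass this single struct instead of N
// separate arguments, and adding a new analysis only changes the struct.
struct AnalysisBundle {
  const DominatorTree *DT = nullptr;
  const LoopInfo *LI = nullptr;

  // Return a copy with one analysis filled in, leaving the rest as-is.
  AnalysisBundle withDT(const DominatorTree *NewDT) const {
    AnalysisBundle Copy = *this;
    Copy.DT = NewDT;
    return Copy;
  }
};

// A utility only touches the analyses it actually has.
int countAvailableAnalyses(const AnalysisBundle &Q) {
  int N = 0;
  if (Q.DT) ++N;
  if (Q.LI) ++N;
  return N;
}
```

The point is that each consumer checks for null rather than every caller threading every analysis through its signature.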
So this looks right to me.
I'm a little worried about how we are having to modify tons and tons of functions to push these through.
When we get to postdominator updates, we'll have to do the same thing.
Mon, Feb 12
Please use getAnalysisIfAvailable for the old pass manager
Please use getCachedResult for the new pass manager.
Wed, Feb 7
LVI functions fine without a DT, and it's going to be too expensive to ever use a lazy analysis that is constantly recomputing info, like LVI, because it may require hundreds or thousands of dominator tree rebuilds. I would hand LVI a fake DT pointer class: when the dominator tree is up to date, it works, and when it's not up to date, it reads as null. That would work with how LVI currently checks, I think. Regardless, I think the right answer here is to make it not use the DT when it's not up to date. That will give better answers than it used to get, and you get gradually better answers as you preserve more or become less lazy.
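A minimal sketch of that fake-DT-pointer idea, with hypothetical names (this is not an in-tree class):

```cpp
#include <cassert>

struct DominatorTree {}; // stand-in for llvm::DominatorTree

// Hypothetical wrapper: hand LVI this instead of a raw DominatorTree*.
// While the tree is known up to date it behaves like the real pointer;
// once invalidated it reads as null, so existing "if (DT)" checks in the
// consumer simply stop using stale dominance info.
class MaybeUpToDateDT {
  DominatorTree *DT = nullptr;
  bool UpToDate = false;

public:
  void set(DominatorTree *Tree) {
    DT = Tree;
    UpToDate = true;
  }
  void invalidate() { UpToDate = false; }

  // Reads as null whenever the tree may be stale.
  operator DominatorTree *() const { return UpToDate ? DT : nullptr; }
};
```

The consumer's code is unchanged; only the updater flips the up-to-date bit as it preserves or rebuilds the tree.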
Fri, Feb 2
These were actually deliberately split apart in the past year, so not sure what makes sense here.
Tue, Jan 30
(also, the DT was not updated at all during JT prior to all of these patches, so i guess it all worked because LVI didn't use DT when called from JumpThreading or something?)
Mon, Jan 29
Err, looking at that bug, it looks like it occurs in release versions before the dominance update patch was ever submitted :)
Jan 22 2018
Is it a thing that is even different for each callgraph implementation? (If not, why is it a trait?)
Even if it is, if you look, you'll see we don't try to encode random computed node property differences in GraphTraits. Only the things that *every* graph algorithm needs (i.e., it caters to every algorithm, not any particular one). Random node property differences are handled through other trait classes or filtering. This does not, at a glance, seem like a thing that literally every algorithm that is going to use a callgraph needs. It looks like one that some may need. You even prove this yourself.
I don't understand why you need has_function_def. It looks very much like a thing you should just be using a filter_iterator for or something. It does not look like it belongs in the graphtraits at all.
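The filtering alternative can be sketched std-only; `llvm::make_filter_range`/`filter_iterator` play this role in-tree, and the node type here is an illustrative stand-in:

```cpp
#include <cassert>
#include <vector>

// Illustrative node type: some call-graph nodes have a definition, some
// (external declarations) do not.
struct Node {
  bool HasDef;
};

// Instead of baking has_function_def into the graph traits, the one
// algorithm that cares filters at the point of use. This is a std-only
// stand-in for iterating make_filter_range over GraphTraits nodes.
template <typename Pred>
std::vector<const Node *> filterNodes(const std::vector<Node> &Graph,
                                      Pred P) {
  std::vector<const Node *> Out;
  for (const Node &N : Graph)
    if (P(N)) // keep only nodes the caller's predicate accepts
      Out.push_back(&N);
  return Out;
}
```

Generic graph algorithms then see the unfiltered traits, and only the pass that needs definitions pays for the predicate.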
Jan 19 2018
I like the idea here a lot, but i don't understand why you need to make a new type of graph traits with things called call_edge_begin/end.
Can you explain the necessity here?
(i.e., why can you not just implement a GraphTraits; why do you need a CallGraphTraits?)
Jan 17 2018
To try to help explain this (and feel free to turn it into a comment in NewGVN), for a phi of ops, let me try to cover the cases.
Yes, that is true, but that case should only occur if we have not created a phi of ops, and need to mark what will happen later.
Note also that we mark all operands that are part of the expression as Deps. If the leader of those operands changed, they would already be marked as changed as Users through the addAdditionalUsers part we do.
This is definitely not right, unfortunately.
At a glance, i don't believe this is correct.
The ClassChanged part above should have already done this.
If not, can you detail me how it is happening?
Jan 5 2018
First, it's definitely the case that you can't compute the difference for all SCEVs. It's equivalent to being able to symbolically execute the program with only static information :)
Jan 3 2018
Dec 28 2017
Dec 27 2017
Reading the old code you are replacing I don't feel like I understand the graph theory behind the algorithm.
Nov 29 2017
Nov 27 2017
So, what Sanjoy has pointed out is that canonicalization like this to select, breaks the ability to remove redundant loads and stores and do PRE, compared to the equivalent control flow version, in general.
Nov 26 2017
First, to bikeshed this horribly:
Nov 14 2017
Nov 10 2017
Yeah, I filed a bug about this when we started the first review; it's good to have a real testcase.
Nov 9 2017
Errr, outside of an irrelevant store, it generates literally the same output and jump threading result as: https://reviews.llvm.org/rL317768
Nov 8 2017
That's fine too.
Can you please fix LoadPRE as well?
ConstructSSALoadSet needs to be patching instructions it materializes if they come from an existing value.
In load PRE, it is replacing the value of a load with a phi of other values.
Your patch has convinced me that GVN is probably generally broken here, but just happens to not value number or create new phi nodes except for PRE, so it doesn't generate a lot of instances of this issue.
So yeah, i'm fine with the patch in general
Nov 6 2017
I think it would be a lot easier to understand your example if you could produce IR that, when run through -gvn -enable-pre -jump-threading, produces the wrong result without this patch.
TL;DR: I'm... confused.
Oct 31 2017
I'll repeat what i said on the bug
FWIW: Given the choice, i would much rather fix the underlying problem than patch various passes to work around some small subset of it (here, loads of selected addresses).
This is underway right now, actually.
Oct 29 2017
Oct 27 2017
This seems very similar to what n-ary reassociate wants to do, I'd like to understand why we shouldn't do it there?
(where it seems to fit perfectly)
Oct 26 2017
Oct 25 2017
Oct 23 2017
This seems reasonable to me,
I'd just add a comment to explain why it's a small set vector (so nobody breaks this in the future).
Oct 22 2017
So, in the original test and code it's hard to tell, but yes, if it's trying to undo the propagation of constants into phi nodes, it's already going to be fighting, because everything else is trying to do that transform specifically :)
GVN/NewGVN/et al will undo this transform, and propagate constants. In fact, i expect pretty much any pass that can, will.
Oct 20 2017
I'll add a unit test.
- Update for review comments
Oct 18 2017
I haven't heard any better approaches yet, but i'll give it a week before committing this.
Oct 17 2017
"Even if we mark for deletion, we should update the map immediately to be able to hoist across the marked instruction. Marking alone does not solve the problem, it will still need the same code that also updates the map. If marking is only needed for uniformity of GVN code, this can be done separately from fixing the actual bug."
The instructions will be deleted and the map updated as soon as we return from this function up the call stack.
It will not prevent anything further, because
- the hoisting has already happened (and all the functions in this stack are going to return and then the next thing that will happen is we will delete the dead instructions)
- The marked instruction could not have blocked hoisting anyway, or else it would not have been safe to PRE.
Both of the deletions are in PRE, after it has been successful. If either the inserted or the deleted instructions were implicit control flow, PRE should have been unsafe. It's probably worth noting this in a comment.
Oct 16 2017
Oct 13 2017
I attached a patch for this to PR 34937.
I don't have time ATM to perform testing and shepherding of it through review, however.
This is a use-after-free.
When the first implicit control flow instruction for a bb gets erased, we now have wrong info.
Oct 12 2017
Oct 11 2017
I do not plan to address these at the moment, because the order is consistent in the pass and doesn't affect correctness or what gets optimized.
If someone else wants to try to make everything reverse order clean, they are welcome to, it's not a priority for me.
Yes, this is known. The original "fix" this reverts is clearly broken. There is a discussion on the original revert thread about this and ways to fix it properly.
Oct 10 2017
This actually sounds like a great idea, IMHO.
Let's try it and see how well it works in practice.
It certainly would address my current concerns.
Okay if you address the remaining comments.
Suggesting a slightly different fix.
I suspect at least one of your bugs is that the param state lattice doesn't get updated when the value state lattice changes.
Oct 9 2017
I'd reuse the type we already have to express alias results.
While you don't care about partial alias, it makes all code have the same meaning for the same thing :)
(I think this needs a little thought)
So, i can't find what the suggestion was, but these are fairly specific to value lattices and shouldn't be used by most other things.
There are lattices where they are correct, and lattices where they make no sense.
It's also kinda conservative.
thanks for working through this :)
Sounds reasonable to start with.
Also, if you used sets, you could just use set.swap in the constructor (or move constructors) to avoid the extra copying.
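The swap/move trick can be sketched std-only; the holder class and element type here are hypothetical:

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <utility>

// Hypothetical holder class: take the incoming set by value and swap (or
// move) it into the member, instead of copying element by element.
class DepSet {
  std::set<int> Elems;

public:
  explicit DepSet(std::set<int> Incoming) {
    Elems.swap(Incoming); // steals the node storage; no per-element copy
    // equivalently: Elems = std::move(Incoming);
  }
  bool contains(int V) const { return Elems.count(V) != 0; }
  std::size_t size() const { return Elems.size(); }
};
```

The caller writes `DepSet D(std::move(TheirSet));` and the elements are never duplicated.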
I'm a bit confused why you use a vector instead of std::set or DenseSet. It looks like either would be better in every way?
Oct 7 2017
Preds will in fact sometimes number in the thousands with switch statements. If you don't want to fix it, let me know and I'll fix the sort on Monday.
Oct 6 2017
Oct 5 2017
Also this needs a test
We sort siblings by RPO in newgvn. If you want to guarantee parent before child, however, just sort by the dominator tree dfs in and out numbers.
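A std-only sketch of that sort, with hypothetical names: a preorder DFS "in" number alone already guarantees parent-before-child (the in/out pair is what you'd keep for O(1) ancestor queries). The tree is `Children[i]` = children of node `i`, rooted at 0.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Assign DFS entry ("in") numbers to every node of a tree rooted at 0.
std::vector<int> dfsInNumbers(const std::vector<std::vector<int>> &Children) {
  std::vector<int> In(Children.size(), -1);
  int Counter = 0;
  std::vector<int> Stack = {0};
  while (!Stack.empty()) {
    int N = Stack.back();
    Stack.pop_back();
    In[N] = Counter++; // numbered on entry, so parents get smaller numbers
    for (int C : Children[N])
      Stack.push_back(C);
  }
  return In;
}

// Sort node ids so each dominator precedes the nodes it dominates.
void sortParentBeforeChild(std::vector<int> &Nodes,
                           const std::vector<int> &In) {
  std::sort(Nodes.begin(), Nodes.end(),
            [&](int A, int B) { return In[A] < In[B]; });
}
```

In-tree, DominatorTree nodes carry these DFS numbers already, so the comparator can read them directly instead of recomputing.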
Oct 4 2017
(Gonna let jakub comment on what should work/not work here)
This looks right to me, since people were trying to do it anyway :)
Oct 3 2017
Ah, the joy of not having users to break.
Yes, i expect you will have to move a bunch if not all of these to the header.
Oct 2 2017
FYI: I'm about to rename getOrInitValueState to getValueState.
That is what SCCP calls it, and IMHO, we should be consistent.
It's worth noting: you can make propagation much faster by keeping a separate worklist for overdefined values, and prioritizing it over the normal one.
This will move things to overdefined as quickly as possible.
(Because anything that meets overdefined is just about always overdefined.)
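The two-worklist idea can be sketched std-only; the lattice, propagation rule, and names are toy stand-ins, not SCCP's real solver:

```cpp
#include <cassert>
#include <map>
#include <vector>

// Toy three-point lattice: Undef < Constant < Overdefined.
enum State { Undef, Constant, Overdefined };

struct Solver {
  std::map<int, State> States;               // value -> lattice state
  std::map<int, std::vector<int>> Uses;      // value -> dependent values
  std::vector<int> OverdefinedWL, NormalWL;  // two worklists

  void markOverdefined(int V) {
    if (States[V] == Overdefined)
      return;
    States[V] = Overdefined;
    OverdefinedWL.push_back(V);
  }
  void markConstant(int V) {
    if (States[V] != Undef)
      return;
    States[V] = Constant;
    NormalWL.push_back(V);
  }
  void solve() {
    while (!OverdefinedWL.empty() || !NormalWL.empty()) {
      // Always drain the overdefined worklist first, so overdefined-ness
      // races through the graph before we refine values that will end up
      // overdefined anyway.
      std::vector<int> &WL =
          OverdefinedWL.empty() ? NormalWL : OverdefinedWL;
      int V = WL.back();
      WL.pop_back();
      for (int U : Uses[V]) {
        // Toy meet rule: anything meeting overdefined is overdefined.
        if (States[V] == Overdefined)
          markOverdefined(U);
        else
          markConstant(U);
      }
    }
  }
};
```

The only change from a single-worklist solver is the `WL` selection line; the marking and meet logic are untouched.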
I see this review has been hanging around a while, so i'll take a stab at reviewing it.