Details
- Reviewers
qcolombet atrick hfinkel - Commits
- rGa49559befb66: [SCEV] Try to reuse existing value during SCEV expansion
rGed133978a0ee: [SCEV] Try to reuse existing value during SCEV expansion
rL259736: [SCEV] Try to reuse existing value during SCEV expansion
rL259662: [SCEV] Try to reuse existing value during SCEV expansion
Diff Detail
- Repository
- rL LLVM
Event Timeline
Hi Wei,
I leave the actual review to Andy and Hal as I do not know the internals of this API at all.
That being said, the changes itself look strange to me and if we are just looking at making CSE and such, I would just calling the related passes at the right places.
Anyway, here are a couple of high level comments.
Thanks,
-Quentin
lib/Analysis/ScalarEvolutionExpander.cpp | ||
---|---|---|
1649 ↗ | (On Diff #32339) | This seems for the users of the SCEV API to know that when all those conditions apply, it needs to create a new value. |
lib/Transforms/Vectorize/LoopVectorize.cpp | ||
2602 ↗ | (On Diff #32339) | The formatting of the comment looks strange and the comment itself is hard to digest. |
Quentin, thanks for the review.
That being said, the changes itself look strange to me and if we are just looking at making CSE and such, I would just calling the related passes at the right places.
Yes, if CSE can fully clean up the redundencies, that way looks simpler and cleaner. I am not sure which is the best way to fix the problem. The patch is proposed just as an alternative I can think of to fully clean up such redundencies. I expect people having better understanding of SCEV to provide some suggestions here.
lib/Analysis/ScalarEvolutionExpander.cpp | ||
---|---|---|
1649 ↗ | (On Diff #32339) | Yes, that is a precise description about the use of the code above. |
lib/Transforms/Vectorize/LoopVectorize.cpp | ||
2602 ↗ | (On Diff #32339) | Sorry, I will fix the comment. The comment is trying to say the SCEV expansion may query the DT at the same time when the func createEmptyLoop generates new bypass blocks. This is before InnerLoopVectorizer::updateAnalysis update the whole DT so we need to maintain the DT incrementally. I don't know the common way to do that so I just move the code originally in InnerLoopVectorizer::updateAnalysis forward. |
It seems that CSE will only handle the more-trivial cases. If constant folding, or reassociation, etc. has changed the form of the expressions, then CSE won't help. I think handling at least some of this in SCEV does make sense.
That having been said, adding a Value* and the enum to every SCEV adds an overhead to every SCEV created, even though most of them are never expanded. Would it be better to keep this information on the side (in a DenseMap or similar)?
Move Value * and enum from SCEV class to maps in ScalarEvolution class.
performance test result of llvm testsuite is neutral on a x86-64 sandybridge.
include/llvm/Analysis/ScalarEvolution.h | ||
---|---|---|
104 ↗ | (On Diff #34043) | This now looks only like an unnecessary formatting change. |
lib/Analysis/ScalarEvolution.cpp | ||
3374 ↗ | (On Diff #34043) | This makes me a bit uncomfortable; you're relying on the fact that, if a Value* is removed, then no new Value* will be created in the same location, or if that does happen, the new Value* won't be used with getSCEV(). Nothing really guarantees this, however. One option is to hold WeakVH as the values in your map. Another option is to enhance the SCEVCallbackVH implementation to update the ExprValueMap directly. Given that you seem to have already done the SCEVCallbackVH update below, maybe the additional check is just unnecessary now. |
This makes me a bit uncomfortable; you're relying on the fact that, if a Value* is removed, then no new Value* will be created in the same location, or if that does happen, the new Value* won't be used with getSCEV(). Nothing really guarantees this, however.
One option is to hold WeakVH as the values in your map. Another option is to enhance the SCEVCallbackVH implementation to update the ExprValueMap directly. Given that you seem to have already done the SCEVCallbackVH update below, maybe the additional check is just unnecessary now.
Thanks! It is a problem indeed. I took the first option and used WeakVH in the map.
lib/Analysis/ScalarEvolutionExpander.cpp | ||
---|---|---|
1646 ↗ | (On Diff #35675) | Why are we reusing an existing value only for AddRecs? I see in the summary that you say:
But I don't see the downside to always reusing an available value from that. Another potentially-problematic issue is that SCEVs may not be unique, and I'm a bit concerned about always taking only the first or last such value encountered, because it imposes an indirect constraint on the users of ScalarEvolution and the expander to ensure that they always visit all such values in some deterministic order. This is not currently the case. Moreover, it can be problematic if, for example, you symbolically compute X-Y first, and then call getSCEV on a value that happens to be X-Y, will expand differently than calling getSCEV on that value and doing the symbolic calculation later. Also, if we are going to restrict these to a subclass of SCEVs, why wouldn't you only store the values/SCEV pair in the map if the SCEV satisfies hasAnyRec? |
lib/Analysis/ScalarEvolutionExpander.cpp | ||
---|---|---|
1646 ↗ | (On Diff #35675) |
No, we are reusing an existing value only for scevs which are not scAddRecExpr. The downside to use existing value for scAddRecExpr is that it may nullify some optimization done by LSR. (I think only scAddRecExpr type scev is substantially involved in LSR optimization)
I guess your point is: multiple values can be mapped to the same SCEV. Only one of those values will be recorded in ExprValueMap and will be used in expansion. To get the maximum optimization opportunity, the values must be encountered/expanded in a certain order. However, I think the same problem exists either even without the patch because the fact that mulitple values can be mapped to the same SCEV is true w/wo the patch (This is determined by ScalarEvolution::createSCEV and we didn't change it in the patch). value = X-Y Suppose X-Y and X'-Y' will both be mapped to S1. Whether S1 will be expanded to X-Y or X'-Y' depends on which one is first encountered by getSCEV. This is true even without the patch.
I think you mean not to store the value/scAddRecExpr pair in the map. It can make the map smaller. I will change it. |
lib/Analysis/ScalarEvolutionExpander.cpp | ||
---|---|---|
1646 ↗ | (On Diff #35675) |
I'm very afraid here of creating an quirky interface that, while appearing to offer a general set of facilities, contains a set of unexpected behaviors tailored to a specific consumer (LSR, in this case). Regardless, ScalarEvolutionExpander already contains a special 'LSRMode', and we should key and LSR-specific customizations off of that. In the general case, we should have a consistent behavior. |
lib/Analysis/ScalarEvolutionExpander.cpp | ||
---|---|---|
1646 ↗ | (On Diff #35675) |
A case is like this: BBi: %i_1 = PHI (%i_0, %i_next) %i_next = %i_1 + 1; %cmp = icmp slt %i_next, %z br i1 %cmp, label %for.body, label %for.end for.end: If we expand the SCEV when we try to get the backedge count, without the patch it will be "%z - (%x - %y) - 1", with the patch it may be "%z - %sub1 - 1" if %sub1 is recorded in SCEV of "%x - %y", or it may be "%z - %sub2 - 1" if %sub2 is recorded in SCEV of "%x - %y". It is also possible that BBi cannot reach BBj, so SCEV also may expand to "%z - (%x - %y) - 1" with the patch. This can be improved. I can record the set of all possible Values mapped to the same SCEV in ExprValueMap. In SCEVExpander::expand, I will select one value from the set which will dominate the insert point, so in the case above, SCEV will only be expanded to "%z - %sub1 - 1" if BBi cannot reach BBj, so reuse will be realized every time and there will be much less uncertainty in the expansion result.
|
Update the patch (Sorry for not updating it for a long time).
This can be improved. I can record the set of all possible Values mapped to the same SCEV in ExprValueMap. In SCEVExpander::expand, I will select one value from the set which will dominate the insert point.
- The improvement is implemented. Because there can be multiple Value mapping to the same SCEV, record the mapping from SCEV to vector<WeakVH> in ExprValueMap. During SCEV expansion, choose one Value from the vector which can dominate the insertPt. The update to the unittest test/Transforms/IndVarSimplify/udiv.ll reflects the usage of the change. IndVars doesn't emit a udiv in for.body.preheader BB after the change. %div1 will be reused there.
I'm very afraid here of creating an quirky interface that, while appearing to offer a general set of facilities, contains a set of unexpected behaviors tailored to a specific consumer (LSR, in this case). Regardless, ScalarEvolutionExpander already contains a special 'LSRMode', and we should key and LSR-specific customizations off of that. In the general case, we should have a consistent behavior.
- This concern has not been addressed very well. I still don't have a good solution for it right now. What I have done for it is to add a comment describing the status before func SCEVExpander::expand.
Some minor nits inline.
Overall, I agree with Hal's judgement that any LSR specific behavior should be guarded on LSRMode; both so that the reason for the limitation is obvious, and also to not unnecessarily (and, to the end user, inexplicably) do a worse job than we could have done.
lib/Analysis/ScalarEvolution.cpp | ||
---|---|---|
3314 ↗ | (On Diff #39695) | I'd rename this to containsAddRecurrence. |
3317 ↗ | (On Diff #39695) | Why not return I->second;? |
3319 ↗ | (On Diff #39695) | Can't you use a SCEVTraversal here? |
lib/Analysis/ScalarEvolutionExpander.cpp | ||
1650 ↗ | (On Diff #39695) | Nit: LLVM naming style is auto const &Ent : *Vec. |
lib/Transforms/Vectorize/LoopVectorize.cpp | ||
2598 ↗ | (On Diff #39695) | Nit: wrapping |
Overall, I agree with Hal's judgement that any LSR specific behavior should be guarded on LSRMode;
Thanks for the explaination. I misunderstood Hal's comment and thought existing use of LSRMode is already bad so I shouldn't add another use of LSRMode (My bad English). That is why I say I cannot figure out a better way to make the interface clean.
Now the behavior of SCEVExpander::expand is defined clearer in its function header comment:
The expansion of SCEV will either reuse a previous Value in ExprValueMap,
or expand the SCEV literally. Specifically, if the expansion is in LSRMode,
and the SCEV contains any sub scAddRecExpr type SCEV, it will be expanded
literally, to prevent LSR transformed SCEV from being reverted. Otherwise,
the expansion will try to reuse Value from ExprValueMap, and only when it
fails, expand the SCEV literally.
lib/Analysis/ScalarEvolution.cpp | ||
---|---|---|
3319 ↗ | (On Diff #39695) | That is much better. Done. |
Addressed Hal and Sanjoy's comments.
Other changes:
- Change vector to set in ExprValueMap.
- Add test scev-expander-existing-value.ll.
include/llvm/Analysis/ScalarEvolution.h | ||
---|---|---|
253 ↗ | (On Diff #42488) | store the analysis result about -> record |
259 ↗ | (On Diff #42488) | As I note later, you probably want a SetVector here, not a std::set. Also, std::set is generally much slower than DenseSet, so we should use the latter if possible (SetVector uses a DenseSet). |
lib/Analysis/ScalarEvolutionExpander.cpp | ||
1608 ↗ | (On Diff #42488) | LSR -> LSR's |
1654 ↗ | (On Diff #42488) | LSR -> LSR's |
1659 ↗ | (On Diff #42488) | You're iterating over the elements of a set here, and those have WeakVH (i.e. pointer-valued) keys. That seems unlikely to be deterministic. SetVector seems like a better choice. |
lib/Transforms/Vectorize/LoopVectorize.cpp | ||
2600 ↗ | (On Diff #42488) | func -> function (no need to abbreviate here) |
Addressed Hal's comments. Changed std::set<WeakVH> to SetVector<WeakVH, std::vector<WeakVH>, DenseSet<WeakVH>>.
include/llvm/Analysis/ScalarEvolution.h | ||
---|---|---|
211 ↗ | (On Diff #42502) | SetVector is defined as: template <typename T, typename Vector = std::vector<T>, typename Set = DenseSet<T>> class SetVector { ... and so ,std::vector<WeakVH>, DenseSet<WeakVH> should be implied by the first template argument. If this can be simplified to: typedef SetVector<WeakVH> WeakVHSetType; then please do. (but, you also need to change the WeakVH type, see below) |
include/llvm/IR/ValueHandle.h | ||
177 ↗ | (On Diff #42502) | I apologize, because I believe I was the one who implied this would work. But that fact that you had to add this here reminded me that it won't. The problem is that a WeakVH's value changes when the underlying Value is removed (it changes from the pointer value to nullptr). Thus, we can't use these as keys in a set (or map) because the key needs to remain fixed (otherwise it will be in the wrong bucket, or in the wrong order for a sorted set, after the change). We need instead to use a different kind of ValueHandle that can remove itself from its parent map once the underlying value goes away. The good news is that we already have implementations of this: We have a ValueMap class (include/llvm/IR/ValueMap.h), and we have SCEV's ValueExprMapType type SCEVCallbackVH. All things considered, I think that just using raw pointers in the SetVectors is probably your best option, and enhance SCEVCallbackVH to also remove outdated Values (for every value in one of those vectors, we must already have an entry in ValueExprMap which should have the same lifetime). Thus, in SCEVCallbackVH's callback, you can use the associated SCEV* to lookup the correct SetVector<Value*> and remove the necessary entry (SetVector has a convenient 'remove' member function for this purpose). |
Thanks for detecting the potential error, and your suggestion to use SCEVCallbackVH's callback instead of WeakVH looks feasible.
I just looked at ScalarEvolution::getSCEV again and believed ExprValueMap[S].insert(WeakVH(V)) will be called only once for the same Value (It will not happen that two instances of the same Value are inserted to the set). So can we simply use a std::vector<WeakVH> instead of SetVector, which may be cheaper because SetVector uses std::vector inside of it.
I try std::vector and find it is possible for ExprValueMap to have duplicate Values in the vector because createSCEV will be called multiple times for the same PHI Value. Another weakness is there may be multiple WeakVHs with nullptr Values staying in the vector.
So I still follow Hal's suggestion to use SetVector<Value *>.
ScalarEvolution::eraseValueFromMap is created to ensure whenever V->S is removed from ValueExprMap, V is also removed from the set of ExprValueMap[S] . In this way, entry in ValueExprMap will always have equal or longer life time than corresponding entry in ExprValueMap. So when V is deleted and V is in a SetVector of ExprValueMap, ValueExprMap[V] can always be used as the Key of ExprValueMap.
lib/Analysis/ScalarEvolution.cpp | ||
---|---|---|
3319 ↗ | (On Diff #42927) | I'd name this FoundOne (instead of FindOne), because it indicates whether or not an AddRec was found, not a directive for the future search). |
3370 ↗ | (On Diff #42927) | I appreciate this idea, but please don't do this by default. In a build with asserts, this check makes Value removal O(N^2). If you'd like to have this check, you'll need a separate flag. This reminds me of EnableExpensiveChecks in lib/CodeGen/SelectionDAG/LegalizeTypes.cpp. |
lib/Transforms/Vectorize/LoopVectorize.cpp | ||
2598 ↗ | (On Diff #42927) | dominate -> dominator |
lib/Analysis/ScalarEvolution.cpp | ||
---|---|---|
3370 ↗ | (On Diff #42927) | I put the check under VerifySCEVMap option (similar as VerifySCEV option). And I moved the check to ScalarEvolution::getSCEVValues from ScalarEvolution::eraseValueFromMap, so it can ensure every Value set returned by getSCEVValues don't have dangling value inside of it. |
Thank you for your patience and thank you for providing many helpful
suggestions!
Wei.