This is an archive of the discontinued LLVM Phabricator instance.

Try to reuse existing value during SCEV expansion
ClosedPublic

Authored by wmi on Aug 17 2015, 2:28 PM.

Diff Detail

Repository
rL LLVM

Event Timeline

wmi updated this revision to Diff 32339.Aug 17 2015, 2:28 PM
wmi retitled this revision from to Try to reuse existing value during SCEV expansion.
wmi updated this object.
wmi added reviewers: qcolombet, hfinkel, atrick.
wmi set the repository for this revision to rL LLVM.
wmi added a subscriber: davidxl.
qcolombet edited edge metadata.Aug 18 2015, 1:45 PM

Hi Wei,

I leave the actual review to Andy and Hal as I do not know the internals of this API at all.

That being said, the changes itself look strange to me and if we are just looking at making CSE and such, I would just calling the related passes at the right places.

Anyway, here are a couple of high level comments.

Thanks,
-Quentin

lib/Analysis/ScalarEvolutionExpander.cpp
1649 ↗(On Diff #32339)

This seems for the users of the SCEV API to know that when all those conditions apply, it needs to create a new value.

lib/Transforms/Vectorize/LoopVectorize.cpp
2602 ↗(On Diff #32339)

The formatting of the comment looks strange and the comment itself is hard to digest.
Could you rephrase?
Also, don’t we have a split method directly on the DT?
Having to explicitly add the block to DT seems error prone to me.

wmi added a comment.Aug 18 2015, 2:13 PM

Quentin, thanks for the review.

That being said, the changes itself look strange to me and if we are just looking at making CSE and such, I would just calling the related passes at the right places.

Yes, if CSE can fully clean up the redundencies, that way looks simpler and cleaner. I am not sure which is the best way to fix the problem. The patch is proposed just as an alternative I can think of to fully clean up such redundencies. I expect people having better understanding of SCEV to provide some suggestions here.

lib/Analysis/ScalarEvolutionExpander.cpp
1649 ↗(On Diff #32339)

Yes, that is a precise description about the use of the code above.

lib/Transforms/Vectorize/LoopVectorize.cpp
2602 ↗(On Diff #32339)

Sorry, I will fix the comment.

The comment is trying to say the SCEV expansion may query the DT at the same time when the func createEmptyLoop generates new bypass blocks. This is before InnerLoopVectorizer::updateAnalysis update the whole DT so we need to maintain the DT incrementally. I don't know the common way to do that so I just move the code originally in InnerLoopVectorizer::updateAnalysis forward.

hfinkel edited edge metadata.Aug 27 2015, 6:37 PM
In D12090#226980, @wmi wrote:

Quentin, thanks for the review.

That being said, the changes itself look strange to me and if we are just looking at making CSE and such, I would just calling the related passes at the right places.

Yes, if CSE can fully clean up the redundencies, that way looks simpler and cleaner. I am not sure which is the best way to fix the problem. The patch is proposed just as an alternative I can think of to fully clean up such redundencies. I expect people having better understanding of SCEV to provide some suggestions here.

It seems that CSE will only handle the more-trivial cases. If constant folding, or reassociation, etc. has changed the form of the expressions, then CSE won't help. I think handling at least some of this in SCEV does make sense.

That having been said, adding a Value* and the enum to every SCEV adds an overhead to every SCEV created, even though most of them are never expanded. Would it be better to keep this information on the side (in a DenseMap or similar)?

wmi added a comment.Aug 27 2015, 9:42 PM
In D12090#226980, @wmi wrote:

Quentin, thanks for the review.

That being said, the changes itself look strange to me and if we are just looking at making CSE and such, I would just calling the related passes at the right places.

Yes, if CSE can fully clean up the redundencies, that way looks simpler and cleaner. I am not sure which is the best way to fix the problem. The patch is proposed just as an alternative I can think of to fully clean up such redundencies. I expect people having better understanding of SCEV to provide some suggestions here.

It seems that CSE will only handle the more-trivial cases. If constant folding, or reassociation, etc. has changed the form of the expressions, then CSE won't help. I think handling at least some of this in SCEV does make sense.

That having been said, adding a Value* and the enum to every SCEV adds an overhead to every SCEV created, even though most of them are never expanded. Would it be better to keep this information on the side (in a DenseMap or similar)?

Yes, keeping the information in a map looks better. I will change it.

wmi updated this revision to Diff 34043.Sep 4 2015, 10:35 AM
wmi edited edge metadata.

Move Value * and enum from SCEV class to maps in ScalarEvolution class.
performance test result of llvm testsuite is neutral on a x86-64 sandybridge.

hfinkel added inline comments.Sep 24 2015, 2:35 AM
include/llvm/Analysis/ScalarEvolution.h
104 ↗(On Diff #34043)

This now looks only like an unnecessary formatting change.

lib/Analysis/ScalarEvolution.cpp
3374 ↗(On Diff #34043)

This makes me a bit uncomfortable; you're relying on the fact that, if a Value* is removed, then no new Value* will be created in the same location, or if that does happen, the new Value* won't be used with getSCEV(). Nothing really guarantees this, however.

One option is to hold WeakVH as the values in your map. Another option is to enhance the SCEVCallbackVH implementation to update the ExprValueMap directly. Given that you seem to have already done the SCEVCallbackVH update below, maybe the additional check is just unnecessary now.

wmi updated this revision to Diff 35675.Sep 24 2015, 2:44 PM

This makes me a bit uncomfortable; you're relying on the fact that, if a Value* is removed, then no new Value* will be created in the same location, or if that does happen, the new Value* won't be used with getSCEV(). Nothing really guarantees this, however.
One option is to hold WeakVH as the values in your map. Another option is to enhance the SCEVCallbackVH implementation to update the ExprValueMap directly. Given that you seem to have already done the SCEVCallbackVH update below, maybe the additional check is just unnecessary now.

Thanks! It is a problem indeed. I took the first option and used WeakVH in the map.

hfinkel added inline comments.Sep 25 2015, 3:27 PM
lib/Analysis/ScalarEvolutionExpander.cpp
1646 ↗(On Diff #35675)

Why are we reusing an existing value only for AddRecs?

I see in the summary that you say:

The intuition is, if only SCEV doesn't contain scAddRecExpr, using the original value to expand will not nullify valid loop transformations.

But I don't see the downside to always reusing an available value from that. Another potentially-problematic issue is that SCEVs may not be unique, and I'm a bit concerned about always taking only the first or last such value encountered, because it imposes an indirect constraint on the users of ScalarEvolution and the expander to ensure that they always visit all such values in some deterministic order. This is not currently the case. Moreover, it can be problematic if, for example, you symbolically compute X-Y first, and then call getSCEV on a value that happens to be X-Y, will expand differently than calling getSCEV on that value and doing the symbolic calculation later.

Also, if we are going to restrict these to a subclass of SCEVs, why wouldn't you only store the values/SCEV pair in the map if the SCEV satisfies hasAnyRec?

wmi added inline comments.Sep 25 2015, 11:12 PM
lib/Analysis/ScalarEvolutionExpander.cpp
1646 ↗(On Diff #35675)

Why are we reusing an existing value only for AddRecs?

No, we are reusing an existing value only for scevs which are not scAddRecExpr. The downside to use existing value for scAddRecExpr is that it may nullify some optimization done by LSR. (I think only scAddRecExpr type scev is substantially involved in LSR optimization)

Another potentially-problematic issue is that SCEVs may not be unique, and I'm a bit concerned about always taking only the first or last such value encountered, because it imposes an indirect constraint on the users of ScalarEvolution and the expander to ensure that they always visit all such values in some deterministic order. This is not currently the case. Moreover, it can be problematic if, for example, you symbolically compute X-Y first, and then call getSCEV on a value that happens to be X-Y, will expand differently than calling getSCEV on that value and doing the symbolic calculation later.

I guess your point is: multiple values can be mapped to the same SCEV. Only one of those values will be recorded in ExprValueMap and will be used in expansion. To get the maximum optimization opportunity, the values must be encountered/expanded in a certain order.

However, I think the same problem exists either even without the patch because the fact that mulitple values can be mapped to the same SCEV is true w/wo the patch (This is determined by ScalarEvolution::createSCEV and we didn't change it in the patch).

value = X-Y
...
To expand S1.

Suppose X-Y and X'-Y' will both be mapped to S1. Whether S1 will be expanded to X-Y or X'-Y' depends on which one is first encountered by getSCEV. This is true even without the patch.

Also, if we are going to restrict these to a subclass of SCEVs, why wouldn't you only store the values/SCEV pair in the map if the SCEV satisfies hasAnyRec?

I think you mean not to store the value/scAddRecExpr pair in the map. It can make the map smaller. I will change it.

hfinkel added inline comments.Sep 28 2015, 1:57 PM
lib/Analysis/ScalarEvolutionExpander.cpp
1646 ↗(On Diff #35675)

No, we are reusing an existing value only for scevs which are not scAddRecExpr. The downside to use existing value for scAddRecExpr is that it may nullify some optimization done by LSR. (I think only scAddRecExpr type scev is substantially involved in LSR optimization)

I'm very afraid here of creating an quirky interface that, while appearing to offer a general set of facilities, contains a set of unexpected behaviors tailored to a specific consumer (LSR, in this case). Regardless, ScalarEvolutionExpander already contains a special 'LSRMode', and we should key and LSR-specific customizations off of that. In the general case, we should have a consistent behavior.

wmi added inline comments.Sep 28 2015, 11:13 PM
lib/Analysis/ScalarEvolutionExpander.cpp
1646 ↗(On Diff #35675)
  1. I rethought your previous comments about uncertainty of the expansion result and found something I can improve. Thanks for those comments.

A case is like this:

BBi:
%sub2 = %x - %y;
...
BBj:
%x = load %a;
%y = load %b;
%sub1 = %x - %y;
...
%i_0 = %sub1;
for.body:

%i_1 = PHI (%i_0, %i_next)
%i_next = %i_1 + 1;
%cmp = icmp slt %i_next, %z
br i1 %cmp, label %for.body, label %for.end

for.end:

If we expand the SCEV when we try to get the backedge count, without the patch it will be "%z - (%x - %y) - 1", with the patch it may be "%z - %sub1 - 1" if %sub1 is recorded in SCEV of "%x - %y", or it may be "%z - %sub2 - 1" if %sub2 is recorded in SCEV of "%x - %y". It is also possible that BBi cannot reach BBj, so SCEV also may expand to "%z - (%x - %y) - 1" with the patch.

This can be improved. I can record the set of all possible Values mapped to the same SCEV in ExprValueMap. In SCEVExpander::expand, I will select one value from the set which will dominate the insert point, so in the case above, SCEV will only be expanded to "%z - %sub1 - 1" if BBi cannot reach BBj, so reuse will be realized every time and there will be much less uncertainty in the expansion result.

  1. For another concern you raised, although existing behavior of SCEVExpander::expand is more consistent -- it always literally generates all the computations SCEV represents, it sacrifices the opportunity to reuse existing values. The patch introduces some inconsistency. The inconsistency is that we keep the expansion behavior of scAddRecExpr the same as without the patch, but may reuse existing values when expanding other kinds of SCEV (So the inconsistency here is actually an improvement when expanding non-scAddRecExpr SCEVs). I think about the generaility of the interface, besides LSR, other components can still use the interface the same as before and believe the interface will generate equal value -- without affecting correctness, but possibly with less cost. And I think I can describe the inconsistency more clearly in comments to remove potential confusion from the users of the interface.
wmi updated this revision to Diff 39695.Nov 9 2015, 9:11 AM
wmi updated this object.

Update the patch (Sorry for not updating it for a long time).

This can be improved. I can record the set of all possible Values mapped to the same SCEV in ExprValueMap. In SCEVExpander::expand, I will select one value from the set which will dominate the insert point.

  1. The improvement is implemented. Because there can be multiple Value mapping to the same SCEV, record the mapping from SCEV to vector<WeakVH> in ExprValueMap. During SCEV expansion, choose one Value from the vector which can dominate the insertPt. The update to the unittest test/Transforms/IndVarSimplify/udiv.ll reflects the usage of the change. IndVars doesn't emit a udiv in for.body.preheader BB after the change. %div1 will be reused there.

I'm very afraid here of creating an quirky interface that, while appearing to offer a general set of facilities, contains a set of unexpected behaviors tailored to a specific consumer (LSR, in this case). Regardless, ScalarEvolutionExpander already contains a special 'LSRMode', and we should key and LSR-specific customizations off of that. In the general case, we should have a consistent behavior.

  1. This concern has not been addressed very well. I still don't have a good solution for it right now. What I have done for it is to add a comment describing the status before func SCEVExpander::expand.
wmi added a comment.Dec 7 2015, 2:46 PM

Ping.

Thanks,
Wei.

sanjoy added a comment.Dec 8 2015, 1:23 AM

Some minor nits inline.

Overall, I agree with Hal's judgement that any LSR specific behavior should be guarded on LSRMode; both so that the reason for the limitation is obvious, and also to not unnecessarily (and, to the end user, inexplicably) do a worse job than we could have done.

lib/Analysis/ScalarEvolution.cpp
3314 ↗(On Diff #39695)

I'd rename this to containsAddRecurrence.

3317 ↗(On Diff #39695)

Why not return I->second;?

3319 ↗(On Diff #39695)

Can't you use a SCEVTraversal here?

lib/Analysis/ScalarEvolutionExpander.cpp
1650 ↗(On Diff #39695)

Nit: LLVM naming style is auto const &Ent : *Vec.

lib/Transforms/Vectorize/LoopVectorize.cpp
2598 ↗(On Diff #39695)

Nit: wrapping

wmi marked 4 inline comments as done.Dec 10 2015, 5:26 PM

Overall, I agree with Hal's judgement that any LSR specific behavior should be guarded on LSRMode;

Thanks for the explaination. I misunderstood Hal's comment and thought existing use of LSRMode is already bad so I shouldn't add another use of LSRMode (My bad English). That is why I say I cannot figure out a better way to make the interface clean.

Now the behavior of SCEVExpander::expand is defined clearer in its function header comment:

The expansion of SCEV will either reuse a previous Value in ExprValueMap,
or expand the SCEV literally. Specifically, if the expansion is in LSRMode,
and the SCEV contains any sub scAddRecExpr type SCEV, it will be expanded
literally, to prevent LSR transformed SCEV from being reverted. Otherwise,
the expansion will try to reuse Value from ExprValueMap, and only when it
fails, expand the SCEV literally.

lib/Analysis/ScalarEvolution.cpp
3319 ↗(On Diff #39695)

That is much better. Done.

wmi updated this revision to Diff 42488.Dec 10 2015, 5:37 PM

Addressed Hal and Sanjoy's comments.

Other changes:

  1. Change vector to set in ExprValueMap.
  2. Add test scev-expander-existing-value.ll.
hfinkel added inline comments.Dec 10 2015, 6:00 PM
include/llvm/Analysis/ScalarEvolution.h
253 ↗(On Diff #42488)

store the analysis result about -> record

259 ↗(On Diff #42488)

As I note later, you probably want a SetVector here, not a std::set. Also, std::set is generally much slower than DenseSet, so we should use the latter if possible (SetVector uses a DenseSet).

lib/Analysis/ScalarEvolutionExpander.cpp
1608 ↗(On Diff #42488)

LSR -> LSR's

1654 ↗(On Diff #42488)

LSR -> LSR's

1659 ↗(On Diff #42488)

You're iterating over the elements of a set here, and those have WeakVH (i.e. pointer-valued) keys. That seems unlikely to be deterministic.

SetVector seems like a better choice.

lib/Transforms/Vectorize/LoopVectorize.cpp
2600 ↗(On Diff #42488)

func -> function (no need to abbreviate here)

wmi marked 6 inline comments as done and 4 inline comments as done.Dec 10 2015, 11:07 PM
wmi updated this revision to Diff 42502.Dec 10 2015, 11:08 PM

Addressed Hal's comments. Changed std::set<WeakVH> to SetVector<WeakVH, std::vector<WeakVH>, DenseSet<WeakVH>>.

hfinkel added inline comments.Dec 11 2015, 2:13 AM
include/llvm/Analysis/ScalarEvolution.h
211 ↗(On Diff #42502)

SetVector is defined as:

template <typename T, typename Vector = std::vector<T>,
          typename Set = DenseSet<T>>
class SetVector {
...

and so

,std::vector<WeakVH>, DenseSet<WeakVH>

should be implied by the first template argument. If this can be simplified to:

typedef SetVector<WeakVH> WeakVHSetType;

then please do.

(but, you also need to change the WeakVH type, see below)

include/llvm/IR/ValueHandle.h
177 ↗(On Diff #42502)

I apologize, because I believe I was the one who implied this would work. But that fact that you had to add this here reminded me that it won't. The problem is that a WeakVH's value changes when the underlying Value is removed (it changes from the pointer value to nullptr). Thus, we can't use these as keys in a set (or map) because the key needs to remain fixed (otherwise it will be in the wrong bucket, or in the wrong order for a sorted set, after the change).

We need instead to use a different kind of ValueHandle that can remove itself from its parent map once the underlying value goes away. The good news is that we already have implementations of this: We have a ValueMap class (include/llvm/IR/ValueMap.h), and we have SCEV's ValueExprMapType type SCEVCallbackVH.

All things considered, I think that just using raw pointers in the SetVectors is probably your best option, and enhance SCEVCallbackVH to also remove outdated Values (for every value in one of those vectors, we must already have an entry in ValueExprMap which should have the same lifetime). Thus, in SCEVCallbackVH's callback, you can use the associated SCEV* to lookup the correct SetVector<Value*> and remove the necessary entry (SetVector has a convenient 'remove' member function for this purpose).

wmi added a comment.Dec 11 2015, 9:41 AM

Thanks for detecting the potential error, and your suggestion to use SCEVCallbackVH's callback instead of WeakVH looks feasible.

I just looked at ScalarEvolution::getSCEV again and believed ExprValueMap[S].insert(WeakVH(V)) will be called only once for the same Value (It will not happen that two instances of the same Value are inserted to the set). So can we simply use a std::vector<WeakVH> instead of SetVector, which may be cheaper because SetVector uses std::vector inside of it.

wmi updated this revision to Diff 42927.Dec 15 2015, 3:57 PM

I try std::vector and find it is possible for ExprValueMap to have duplicate Values in the vector because createSCEV will be called multiple times for the same PHI Value. Another weakness is there may be multiple WeakVHs with nullptr Values staying in the vector.

So I still follow Hal's suggestion to use SetVector<Value *>.

ScalarEvolution::eraseValueFromMap is created to ensure whenever V->S is removed from ValueExprMap, V is also removed from the set of ExprValueMap[S] . In this way, entry in ValueExprMap will always have equal or longer life time than corresponding entry in ExprValueMap. So when V is deleted and V is in a SetVector of ExprValueMap, ValueExprMap[V] can always be used as the Key of ExprValueMap.

hfinkel added inline comments.Feb 2 2016, 2:11 PM
lib/Analysis/ScalarEvolution.cpp
3319 ↗(On Diff #42927)

I'd name this FoundOne (instead of FindOne), because it indicates whether or not an AddRec was found, not a directive for the future search).

3370 ↗(On Diff #42927)

I appreciate this idea, but please don't do this by default. In a build with asserts, this check makes Value removal O(N^2). If you'd like to have this check, you'll need a separate flag. This reminds me of EnableExpensiveChecks in lib/CodeGen/SelectionDAG/LegalizeTypes.cpp.

lib/Transforms/Vectorize/LoopVectorize.cpp
2598 ↗(On Diff #42927)

dominate -> dominator

wmi marked 2 inline comments as done.Feb 2 2016, 5:52 PM
wmi added inline comments.
lib/Analysis/ScalarEvolution.cpp
3370 ↗(On Diff #42927)

I put the check under VerifySCEVMap option (similar as VerifySCEV option).

And I moved the check to ScalarEvolution::getSCEVValues from ScalarEvolution::eraseValueFromMap, so it can ensure every Value set returned by getSCEVValues don't have dangling value inside of it.

wmi updated this revision to Diff 46730.Feb 2 2016, 5:53 PM
hfinkel accepted this revision.Feb 2 2016, 6:01 PM
hfinkel edited edge metadata.

LGTM.

This revision is now accepted and ready to land.Feb 2 2016, 6:01 PM
wmi added a comment.Feb 2 2016, 8:41 PM

Thank you for your patience and thank you for providing many helpful
suggestions!

Wei.

This revision was automatically updated to reflect the committed changes.