This is an archive of the discontinued LLVM Phabricator instance.

[AssumptionCache] Avoid dangling llvm.assume calls in the cache
ClosedPublic

Authored by jdoerfert on Feb 5 2021, 11:47 AM.

Details

Summary

PR49043 exposed a problem when it comes to RAUW llvm.assumes. While
D96106 would fix it for GVNSink, it seems a more general concern. To
avoid future problems this patch moves away from the vector of weak
reference model used in the assumption cache. Instead, we track the
llvm.assume calls with a callback handle which will remove itself from
the cache if the call is deleted.

Fixes PR49043.

Diff Detail

Event Timeline

jdoerfert created this revision.Feb 5 2021, 11:47 AM
jdoerfert requested review of this revision.Feb 5 2021, 11:47 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 5 2021, 11:47 AM

@nikic, @lebedev.ri Once the tests are back, could either of you start a run of this to track compile time?

I realized I could simply remove the problematic assertion and we allow duplication in the AssumeHandles vector, though that doesn't sound too great either.
I hope this might actually be the better solution in the long run, assuming (pun intended) the compile time is not impacted negatively.

nikic added a comment.Feb 6 2021, 2:19 AM

Compile-time is in the noise: https://llvm-compile-time-tracker.com/compare.php?from=6e1afd858757256afdb619665befb790c76418bb&to=9f7b74f200bdee05bda1fbd9447ffd6fbd7c6262&stat=instructions I don't think CTMark generates many assumes though. In any case, I don't see anything here that should have a negative compile-time effect.

The patch looks fine to me, but I think I still haven't really understood at which point the problem is introduced in GVNSink. Could you point me to where the duplicate would get inserted?

llvm/include/llvm/Analysis/AssumptionCache.h
51

Missing final.

290

Is it not sufficient to pass DenseMapInfo<Value *> as the second parameter to DenseSet?

jdoerfert updated this revision to Diff 321947.Feb 6 2021, 9:13 AM
jdoerfert marked 2 inline comments as done.

Address comments

Compile-time is in the noise: https://llvm-compile-time-tracker.com/compare.php?from=6e1afd858757256afdb619665befb790c76418bb&to=9f7b74f200bdee05bda1fbd9447ffd6fbd7c6262&stat=instructions I don't think CTMark generates many assumes though. In any case, I don't see anything here that should have a negative compile-time effect.

The patch looks fine to me, but I think I still haven't really understood at which point the problem is introduced in GVNSink. Could you point me to where the duplicate would get inserted?

here: https://reviews.llvm.org/D96106#2546918

This also avoids ever having dangling references in the cache, which is kinda nice I guess.

nikic accepted this revision.Feb 6 2021, 10:04 AM

LGTM

This revision is now accepted and ready to land.Feb 6 2021, 10:04 AM
This revision was landed with ongoing or failed builds.Feb 6 2021, 10:18 AM
This revision was automatically updated to reflect the committed changes.
Meinersbur added inline comments.
llvm/include/llvm/Analysis/AssumptionCache.h
81

Changing this list to a DenseSet introduces indeterminism in Polly's test case ScopInfo/user_provided_assumptions.ll and even in basic.ll that is changed in this patch. ScalarEvolution is known to give different result depending on the order in which elements are processed. PredicateInfo might emit different IR depending on the order as well.

I think we should play it save and changed it back to a SmallVector. Unfortunately, a SetVector instead would be of no use since the removing an element still searches the element in the vector.

llvm/lib/Analysis/AssumptionCache.cpp
171

Call base method CallbackVH::deleted() here?

jdoerfert added inline comments.Feb 9 2021, 7:10 PM
llvm/include/llvm/Analysis/AssumptionCache.h
81

I'm not sure I follow. If the user would benefit from more assumptions that are now "later" in the traversal, they should scan on. basic.ll is a print method which can be non-determistic, IMHO. Long story short, I'd say SCEV or Polly should find the best information among all assumptions.

Meinersbur added inline comments.Feb 9 2021, 8:19 PM
llvm/include/llvm/Analysis/AssumptionCache.h
81

In the case of Polly, the added assumption are simplified using previous assumptions. This gives indeterministic results such as

remark: <unknown>:0:0: Use user assumption: [Debug, M] -> {  : Debug = 0 and 0 < M <= 100 }
remark: <unknown>:0:0: Use user assumption: [Debug, M, N] -> {  : Debug = 0 and 0 < M <= 100 and N >= -2147483648 - M }
remark: <unknown>:0:0: Use user assumption: [Debug, M, N] -> {  : Debug = 0 and 0 < M <= 100 and N > 0 }
remark: <unknown>:0:0: Use user assumption: [Debug, M, N] -> {  : Debug = 0 and 0 < M <= 100 and 0 < N <= 2147483647 - M }
remark: <unknown>:0:0: Use user assumption: [N, M] -> {  : M >= -2147483648 - N }
remark: <unknown>:0:0: Use user assumption: [N, M] -> {  : N > 0 and M >= -2147483648 - N }
remark: <unknown>:0:0: Use user assumption: [N, M] -> {  : N > 0 and -2147483648 - N <= M <= 2147483647 - N }
remark: <unknown>:0:0: Use user assumption: [N, M, Debug] -> {  : Debug = 0 and N > 0 and 0 < M <= 2147483647 - N and M <= 100 }
remark: <unknown>:0:0: Use user assumption: [M, N] -> {  : N <= 2147483647 - M }
remark: <unknown>:0:0: Use user assumption: [M, N] -> {  : -2147483648 - M <= N <= 2147483647 - M }
remark: <unknown>:0:0: Use user assumption: [M, N, Debug] -> {  : Debug = 0 and 0 < M <= 100 and -2147483648 - M <= N <= 2147483647 - M }
remark: <unknown>:0:0: Use user assumption: [M, N, Debug] -> {  : Debug = 0 and 0 < M <= 100 and 0 < N <= 2147483647 - M }

i.e. it's not just the order in which they are printed.

Even if that can be fixed in Polly, as already mentioned the result ScalarEvolution returns depends on cached SCEVs (e.g. to resolve recursion), i.e. its result depends on previous queries, as much we dislike that. See for instance D74810.

jdoerfert added inline comments.Feb 9 2021, 9:19 PM
llvm/include/llvm/Analysis/AssumptionCache.h
81

I think I miss the point about ScalarEvolution here, is it impacted by the order of assumptions?

For polly, just to check, the set's are equivalent, right?

Meinersbur added inline comments.Feb 9 2021, 10:04 PM
llvm/include/llvm/Analysis/AssumptionCache.h
81

I think I miss the point about ScalarEvolution here, is it impacted by the order of assumptions?

ScalarEvolution::applyLoopGuards, while iterating over all assumptions calls getSCEV in order of the assumptions. A different call order, such as

getSCEV(A); 
getSCEV(B);

potentially gives a different result than

getSCEV(B); 
getSCEV(A);

and any query of getSCEV after that.

For polly, just to check, the set's are equivalent, right?

Nope, this is a FileCheck test for -Rpass remarks. No isl_set_is_equal within FileCheck.

nikic added inline comments.Feb 10 2021, 12:50 AM
llvm/include/llvm/Analysis/AssumptionCache.h
81

Yeah, we have quite a few places where order of assumptions can matter. This can also affect LVI because range intersections don't commute.

I think we should revert this change, as the original issue should already be addressed by the WeakTrackingVH -> WeakVH change.

jdoerfert added inline comments.Feb 10 2021, 8:40 AM
llvm/include/llvm/Analysis/AssumptionCache.h
81

Yeah, we have quite a few places where order of assumptions can matter. This can also affect LVI because range intersections don't commute.

We can fix the order but keep the self-removing property, llvm.assumes should not be deleted often.

That said, I think if assumption order matters it is also not great for users:

__builtin_assume(exp1);
__builtin_assume(exp2);

is blazing fast and

__builtin_assume(exp2);
__builtin_assume(exp1);

is slow. How do we explain that?


Nope, this is a FileCheck test for -Rpass remarks. No isl_set_is_equal within FileCheck.

That was not the question.