This is an archive of the discontinued LLVM Phabricator instance.

[SCEV] More precise trip multiples
ClosedPublic

Authored by caojoshua on Jan 16 2023, 1:19 AM.

Details

Summary

We currently have getMinTrailingZeros(), from which we can get a SCEV's
multiple by computing 1 << MinTrailingZeroes. However, this only gets us
multiples that are a power of 2. This patch introduces a way to get max
constant multiples that are not just a power of 2. The logic is similar
to that of getMinTrailingZeros. getMinTrailingZerosImpl is replaced by
computing the max constant multiple, and counting the number of trailing
bits.

I have so far found this useful in two places:

  1. Computing unsigned constant ranges. For example, if we have i8 {10,+,10}<nuw>, we know the max constant it can be is 250.
  1. My original intent was to use this in getSmallConstantTripMultiples, but it has no effect right now due to change from D110587. For example, if we have backedge count (6 * %N) - 1, the trip count becomes 1 + zext((6 * %N) - 1), and we cannot say that 6 is a multiple of the SCEV. I plan to look further into this separately.

The implementation assumes the value is unsigned. It can probably be
extended to handle signed values as well.

If the code sees that a SCEV does not have <nuw>, it will fall back to
finding the max multiple that is a power of 2. Multiples that are a
power of 2 will still be a multiple even after the SCEV overflows. This
does not apply to other values.

Diff Detail

Event Timeline

caojoshua created this revision.Jan 16 2023, 1:19 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 16 2023, 1:19 AM
caojoshua updated this revision to Diff 489463.Jan 16 2023, 1:39 AM

Small code updates

caojoshua updated this revision to Diff 489465.Jan 16 2023, 1:47 AM

Move a code comment

caojoshua published this revision for review.Jan 16 2023, 1:48 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 16 2023, 1:48 AM
nikic added a comment.Jan 16 2023, 3:39 AM

Precommit tests please.

llvm/lib/Analysis/ScalarEvolution.cpp
6304

I don't think this is correct. trunc(X nuw* C) is not, in general, the same as trunc(X) nuw* trunc(C).

6304

APInt::getOneBitSet

6365

minimum

Precommit tests please.

What is precommit test? I've been running llvm-lit llvm/test

llvm/lib/Analysis/ScalarEvolution.cpp
6304

Thats a good point. Need to think about this. I think we can fall back to count min trailing zeros.

Precommit tests please.

What is precommit test? I've been running llvm-lit llvm/test

Commit the new tests with baseline checks (without your patch), and then rebase on top, so it only shows diffs.

  • Fix typos
  • Don't assume multiples hold through truncation, unless they are a power of 2
  • only show diffs for updated test outputs

Precommit tests please.

What is precommit test? I've been running llvm-lit llvm/test

Commit the new tests with baseline checks (without your patch), and then rebase on top, so it only shows diffs.

Thanks. I've done this for the newest revision. I see why this is preferable, but unless I think I'm missing something, this should be included somewhere in LLVM docs i.e. https://llvm.org/docs/CodeReview.html or https://llvm.org/docs/TestingGuide.html.

I have another question. When patches get approved, should owners squash the commits before pushing? Or push two separate commits.

nikic added a comment.Jan 24 2023, 2:01 AM

Precommit tests please.

What is precommit test? I've been running llvm-lit llvm/test

Commit the new tests with baseline checks (without your patch), and then rebase on top, so it only shows diffs.

Thanks. I've done this for the newest revision. I see why this is preferable, but unless I think I'm missing something, this should be included somewhere in LLVM docs i.e. https://llvm.org/docs/CodeReview.html or https://llvm.org/docs/TestingGuide.html.

I've put up https://reviews.llvm.org/D142441 to update the TestingGuide.

I have another question. When patches get approved, should owners squash the commits before pushing? Or push two separate commits.

It should be two separate commits. In fact, you can just land the test commit now, without waiting for the patch to be approved.

I believe this patch needs a rebase, because of some recent refactorings.

My original intent was to use this in getSmallConstantTripMultiples, but it has no effect right now due to change from D110587. For example, if we have backedge count (6 * %N) - 1, the trip count becomes 1 + zext((6 * %N) - 1), and we cannot say that 6 is a multiple of the SCEV. I plan to look further into this separately.

I wonder if it might make sense to address this first? I'm a bit worried that the only test coverage for this functionality we have right now is very indirect, by the effect the multiple has on ranges. It would be great if we could test this functionality directly based on the trip multiple.

nikic added a comment.Jan 24 2023, 2:17 AM

Logic looks about right to me.

When I tested the original patch, there was a significant impact on compile-time: http://llvm-compile-time-tracker.com/compare.php?from=68a534e9bf69e7e5f081a515e05f1d3cb4c21761&to=8f3c56e720e64e569f930190b246e4af61be2323&stat=instructions:u But I'm not sure if it's avoidable :(

llvm/include/llvm/Analysis/ScalarEvolution.h
966

than -> then

970

Missing doc comment.

llvm/lib/Analysis/ScalarEvolution.cpp
6294

Unnecessary reference

6329–6331
6330

Unnecessary braces

6358
6367

Or use the same operands().drop_front() style as above.

8280

So, is getRawData() here supposed to be an implicit truncate? Let's not do that...

I wonder if it might make sense to address this first? I'm a bit worried that the only test coverage for this functionality we have right now is very indirect, by the effect the multiple has on ranges. It would be great if we could test this functionality directly based on the trip multiple.

This does make sense. Let me try to get this working.

When I tested the original patch, there was a significant impact on compile-time: http://llvm-compile-time-tracker.com/compare.php?from=68a534e9bf69e7e5f081a515e05f1d3cb4c21761&to=8f3c56e720e64e569f930190b246e4af61be2323&stat=instructions:u But I'm not sure if it's avoidable :(

I was not expecting those numbers. Computing multiples is very similar to GetMinTrailingZeros. The main difference I can think of is that multiples uses GreatestCommonDivisor, but I don't think it should be too expensive, and in most cases it probably just returns 1. I'll look into this as well.

I think this needs stronger test coverage. At least I want tests for all operations (either IR tests or unittests in CPP, whatever is easier) exercising corner case scenarios, such as bit width overflow with mul.

llvm/include/llvm/Analysis/ScalarEvolution.h
965

Separate NFC?

llvm/lib/Analysis/ScalarEvolution.cpp
6296

Why zero and not APInt::getOneBitSet(BitWidth, BitWidth - 1)? Zero is not even a power of 2, how to interpret that?

6299

/*param name/* nullptr

6360

Early bail if GCD has become 1? It won't get any better anyways.

I think this needs stronger test coverage. At least I want tests for all operations (either IR tests or unittests in CPP, whatever is easier) exercising corner case scenarios, such as bit width overflow with mul.

Agree that more test coverage is needed here. I'd like to go with nikic's suggestion to figure out the issues with trip multiples first. I think that will make testing this much easier.

llvm/include/llvm/Analysis/ScalarEvolution.h
965

You're right. I'll take this out.

mkazantsev requested changes to this revision.Feb 5 2023, 11:26 PM

Marking as "request changes" to shrink my review list, will take a look once comments are addressed.

This revision now requires changes to proceed.Feb 5 2023, 11:26 PM
caojoshua updated this revision to Diff 513487.Apr 14 2023, 1:49 AM
  • Big rebasing
  • rename getMaxConstantMultiple -> getConstantMultiple
  • Apply getConstantMultiple to getSmallConstantTripMultiple() with a lot of tests
  • early bail loop if GCD of add SCEV is 1
  • stylistic changes based on feedback

When I tested the original patch, there was a significant impact on compile-time: http://llvm-compile-time-tracker.com/compare.php?from=68a534e9bf69e7e5f081a515e05f1d3cb4c21761&to=8f3c56e720e64e569f930190b246e4af61be2323&stat=instructions:u But I'm not sure if it's avoidable :(

@nikic The patch is quite a bit different now. I followed the test suite guide and compared compilation of SingleSource, MultiSource, Bitcode, and MicroBenchmarks tests and saw no compile time differences. I think the source files might be too small to see significant difference. I'm also not sure how much these tests will stress SCEV, but I do see the MicroBenchmarks directory has vectorization and other tests, so at least those tests should test SCEV.

Any recommendations to further benchmark compile time? If you feel appropriate, could you run this patch through your compile time tracker?

caojoshua updated this revision to Diff 513493.Apr 14 2023, 2:05 AM

Rename GetPowerOfTwo -> GetShiftedByZeros

caojoshua marked 2 inline comments as done.Apr 14 2023, 2:10 AM
caojoshua added inline comments.
llvm/lib/Analysis/ScalarEvolution.cpp
6296

I renamed it to GetShiftByZeros(). Happy for better suggestions.

getConstantMultiple needs to be able to return zero sometimes. For example, if we have SCEV 0 + 10<nuw>, the true constant multiple is 10. Since its an addition, we want to compute GCD(0, 10), instead of something like GCD(1<<31, 10). I use a constant in this example, but the logic that we need to return 0 applies to all cases.

mkazantsev added a comment.EditedApr 21 2023, 1:03 AM

Whenever a new cache is introduced, it is highly recommended to add logic to ScalarEvoltion::verify for it to make sure it is sane. Can you please add that?

Yeah I see that it has existed before, but having a cache we don't verify is a potential point for bugs.

llvm/lib/Analysis/ScalarEvolution.cpp
6328

hasNoUnsignedWrap doesn't make sense for min/max. How about:

case scAddExpr:
case scAddRecExpr:
case scUMaxExpr:
  <handling for nuw>
  // fallthrough
case scUMaxExpr:
case scSMaxExpr:
case scUMinExpr:
case scSMinExpr:
case scSequentialUMinExpr:
  <common part>

?

6332

This might be overly conservative for mul. You can just take constant multiple of any operand, or even their product from all operands. I'm OK if it's not in this patch, but maybe you should consider this.

mkazantsev requested changes to this revision.Apr 21 2023, 1:10 AM

Some style comments & request to see if smth is doable in verifier with this cache.

This revision now requires changes to proceed.Apr 21 2023, 1:10 AM
caojoshua marked 2 inline comments as done.Apr 21 2023, 1:18 AM
caojoshua added inline comments.
llvm/lib/Analysis/ScalarEvolution.cpp
6332

Maybe you missed this. There is a case scMulExpr above that takes the product if there is no unsigned wrap.

caojoshua marked an inline comment as done.
  • Don't check for wrap flags for min/max SCEVs
  • add validation to ScalarEvolution::verify()

Whenever a new cache is introduced, it is highly recommended to add logic to ScalarEvoltion::verify for it to make sure it is sane. Can you please add that?

I'm not sure what it means to be sane in this case. It makes sense for certain caches. For example, I see that verify() checks that Values that exist in ExprValueMap also exist in the reversed ValueExprMap. From what I can see, the current validations all expect some certain values to currently be in a cache, but this does not apply to ConstantMultipleCache.

For now, I added verification that the cached value is aligned with getConstantMultipleImpl(). It looks like the first verification of this kind. I'm not sure how useful it is, because getConstantMultipleImpl() makes calls to getConstantMultiple(), which retrieves from the cache. I also added -passes=verify<scalar-evolution> to two tests, since there are actually very few tests that do run verification.

mkazantsev accepted this revision.Apr 23 2023, 10:11 PM

LG, thanks! It's better to have more verification than needed than less than needed. :) Bugs with corrupted caches are so annoying.

This revision is now accepted and ready to land.Apr 23 2023, 10:11 PM
This revision was landed with ongoing or failed builds.Apr 24 2023, 12:32 AM
This revision was automatically updated to reflect the committed changes.
nikic added inline comments.Apr 24 2023, 12:41 AM
llvm/lib/Analysis/ScalarEvolution.cpp
6328

min/max are implicitly nuw/nsw. With the new code structure we will use only the trailing zeros for min/max, while using the GCD would be legal.

6358–6360

This should probably return const APInt &?

8281

Should be getZExtValue() instead of getRawData().

14346

Hm, can this end up modifying the map we're iterating?

caojoshua added inline comments.Apr 24 2023, 1:41 AM
llvm/lib/Analysis/ScalarEvolution.cpp
14346

I think its possible. I'm going to revert this change and investigate later. Looks like buildbot is failing 027a4c8b96c7f97df8e98b1dac069b956810ab94.

Bad link in previous comment. Buildbot failing link is https://lab.llvm.org/buildbot/#/builders/16/builds/47038

mkazantsev added inline comments.Apr 24 2023, 1:46 AM
llvm/lib/Analysis/ScalarEvolution.cpp
14346

SE2 is a different entity, so how could it?

nikic added inline comments.Apr 24 2023, 1:48 AM
llvm/lib/Analysis/ScalarEvolution.cpp
14346

Oh yeah, good point. I missed that this is on SE2.

caojoshua marked 3 inline comments as done.Apr 29 2023, 2:15 PM
caojoshua added inline comments.
llvm/lib/Analysis/ScalarEvolution.cpp
6358–6360

The APInts are local variables. Can't return reference.