This is an archive of the discontinued LLVM Phabricator instance.

[SCEV] Preserve divisibility and min/max information in applyLoopGuards
ClosedPublic

Authored by alonkom on Jan 16 2023, 7:01 AM.

Details

Summary

applyLoopGuards doesn't always preserve information when there are multiple assumes.
This patch tries to deal with multiple assumes regarding a SCEV's divisibility and min/max values, and rewrite it into a SCEV that still preserves all of the information.
For example, let the trip count of the loop be TC. Consider the 3 following assumes:

  1. __builtin_assume(TC % 8 == 0);
  2. __builtin_assume(TC > 0);
  3. __builtin_assume(TC < 100);

Before this patch, depending on the assume processing order applyLoopGuards could create the following SCEV:
max(min((8 * (TC / 8)) , 99), 1)

Looking at this SCEV, it doesn't preserve the divisibility by 8 information.

After this patch, depending on the assume processing order applyLoopGuards could create the following SCEV:
max(min((8 * (TC / 8)) , 96), 8)

By aligning up 1 to 8, and aligning down 99 to 96, the new SCEV still preserves all of the original assumes.

Diff Detail

Event Timeline

alonkom created this revision.Jan 16 2023, 7:01 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 16 2023, 7:01 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
alonkom requested review of this revision.Jan 16 2023, 7:01 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 16 2023, 7:01 AM

What is the relation to https://reviews.llvm.org/D128701? Seems maybe you should close the other one.

Can you make the commit title more specific, and add more info in the commit message? In particular, I would like to see something like:

after applyLoopGuards, turns <A> into <B>

where A and B are SCEV expressions.

Other information like, what the problem was before, and how you address it, is also helpful. I would suggest looking at the git log of ScalarEvolution.cpp for examples of descriptive commit messages. This will save the reviewers time, and will be helpful when people look at the git log/blame.

alonkom retitled this revision from [SCEV] Preserve information in applyLoopGuards to [SCEV] Preserve divisibility and min/max information in applyLoopGuards.Jan 17 2023, 4:24 AM
alonkom edited the summary of this revision. (Show Details)

What is the relation to https://reviews.llvm.org/D128701? Seems maybe you should close the other one.

Can you make the commit title more specific, and add more info in the commit message? In particular, I would like to see something like:

after applyLoopGuards, turns <A> into <B>

where A and B are SCEV expressions.

Other information like, what the problem was before, and how you address it, is also helpful. I would suggest looking at the git log of ScalarEvolution.cpp for examples of descriptive commit messages. This will save the reviewers time, and will be helpful when people look at the git log/blame.

Thanks, updated the commit message.

alonkom updated this revision to Diff 490073.Jan 18 2023, 1:33 AM
alonkom added a reviewer: fhahn.

The improved trip multiples from the test results look good. Ordering in applyLoopGuards is an issue. However, I think we can simplify this down a bit. What if we always applied min/max first, before we apply divisibility guards? For example, given:

__builtin_assume(TC % 8 == 0);
__builtin_assume(TC > 0);
__builtin_assume(TC < 100);
  1. apply max: umax(1, TC)
  2. apply min: umin(100, umax(1, TC))
  3. apply divisibility info: 8 * (umin(100, umax(1, TC))) / 8

This makes divisibility info obvious. And traversing the SCEV, we can still see TC > 0 and TC < 100. I believe that if we always apply max/min first, we will never lose this info. This approach seems much simpler and easier to understand.

FYI. I haven't been around that long, and this change is non-trivial enough where I prefer a longstanding developer to give the final approval. I will still be around to give my thoughts.

llvm/lib/Analysis/ScalarEvolution.cpp
15038

SequentialMinMax SCEVs can be applied here as well.

15039

unused var

15043

Constant should always be on the left side. Lets not change that.

15058

Have not thought about it too deeply yet, but I'm concerned that we may need to take a look at when this is legal given Expr's NoWrapFlags. Same for the equivalent getMinusSCEV below

15172

Shouldn't swap the constant. Assume constant is on left.

The improved trip multiples from the test results look good. Ordering in applyLoopGuards is an issue. However, I think we can simplify this down a bit. What if we always applied min/max first, before we apply divisibility guards? For example, given:

__builtin_assume(TC % 8 == 0);
__builtin_assume(TC > 0);
__builtin_assume(TC < 100);
  1. apply max: umax(1, TC)
  2. apply min: umin(100, umax(1, TC))
  3. apply divisibility info: 8 * (umin(100, umax(1, TC))) / 8

This makes divisibility info obvious. And traversing the SCEV, we can still see TC > 0 and TC < 100. I believe that if we always apply max/min first, we will never lose this info. This approach seems much simpler and easier to understand.

FYI. I haven't been around that long, and this change is non-trivial enough where I prefer a longstanding developer to give the final approval. I will still be around to give my thoughts.

Thanks for the reply.
This works in this example, but what would happen if we had only these assumes:

__builtin_assume(TC % 8 == 0);
__builtin_assume(TC > 0);

In that case, we would have created the following SCEV:

8 * (umax(1, TC) / 8)

since umax(1, TC) may still be between [1,7], when we divide it by 8, and multiply by 8, we get 0. So this doesn't preserve the fact that TC > 0.
In order to do that we must align up 1 to 8, and then umax(8, TC) / 8 is always > 0.

The improved trip multiples from the test results look good. Ordering in applyLoopGuards is an issue. However, I think we can simplify this down a bit. What if we always applied min/max first, before we apply divisibility guards? For example, given:

__builtin_assume(TC % 8 == 0);
__builtin_assume(TC > 0);
__builtin_assume(TC < 100);
  1. apply max: umax(1, TC)
  2. apply min: umin(100, umax(1, TC))
  3. apply divisibility info: 8 * (umin(100, umax(1, TC))) / 8

This makes divisibility info obvious. And traversing the SCEV, we can still see TC > 0 and TC < 100. I believe that if we always apply max/min first, we will never lose this info. This approach seems much simpler and easier to understand.

FYI. I haven't been around that long, and this change is non-trivial enough where I prefer a longstanding developer to give the final approval. I will still be around to give my thoughts.

Thanks for the reply.
This works in this example, but what would happen if we had only these assumes:

__builtin_assume(TC % 8 == 0);
__builtin_assume(TC > 0);

In that case, we would have created the following SCEV:

8 * (umax(1, TC) / 8)

since umax(1, TC) may still be between [1,7], when we divide it by 8, and multiply by 8, we get 0. So this doesn't preserve the fact that TC > 0.
In order to do that we must align up 1 to 8, and then umax(8, TC) / 8 is always > 0.

umax(1, TC) implies TC <= 1, which holds in this case.

I think what you meant is in the case that, TC > 0, we get: 8 * (umin(1, TC)) / 8. If TC is in [0, 7], the expression evaluates to 0, and TC > 0 is lost.

Your point comes across that we can't just apply guards in a certain order. I have some thoughts, but I'll think about it and write it down later.

After this patch, depending on the assume processing order applyLoopGuards could create the following SCEV:
max(min((8 * (TC / 8)) , 96), 8)

This example looks wrong. I think min/max should be switched. Should be

min(max((8 * (TC / 8)), 96), 8)

Please update description.


In terms of overall approach, I'm not sure. Feels a bit hacky to have custom logic to check that an expressions is a min/max of mul/div. I'll let others chime in here.

llvm/lib/Analysis/ScalarEvolution.cpp
15164

typo: wether

I'm not sure what the divisor B in \p DividesBy means.

I think this paragraph needs to be more clear. What does it mean to be composed on Min/Max SCEVs?

mkazantsev requested changes to this revision.Jan 24 2023, 1:54 AM

Some nits, and potentially a bug in formula.

llvm/lib/Analysis/ScalarEvolution.cpp
15038

I'd prefer it to be a separate patch, if it's legal at all. Need to check carefully how poison flows in these formulae.

15041

MinMax can have more than 2 operands. Check that it is exactly 2 of them?

15050

greater or equal

15053

If I'm reading this correctly, this is a guarantee of no-overflow for computations you are doing. Maybe add this comment explicitly?

15058

This computation is inconsistent. If Rem is constant 0, you'll return Expr. But if Rem is effectively zero, but not a constant (e.g. some complex expression which is always zero), you return Expr + Divisor. Bug?

15064

less or equal

15072

I think you can safely drop check for Rem->isZero() here as it will be trivially simplified away in getMinusSCEV

15080

auto

15166

auto

15180

if (auto *MinMax = dyn_cast<SCEVMinMaxExpr>(Expr))

15189

auto

This revision now requires changes to proceed.Jan 24 2023, 1:54 AM
alonkom marked 14 inline comments as done.Jan 31 2023, 12:27 AM
After this patch, depending on the assume processing order applyLoopGuards could create the following SCEV:
max(min((8 * (TC / 8)) , 96), 8)

This example looks wrong. I think min/max should be switched. Should be

min(max((8 * (TC / 8)), 96), 8)

Please update description.


In terms of overall approach, I'm not sure. Feels a bit hacky to have custom logic to check that an expressions is a min/max of mul/div. I'll let others chime in here.

I think the example is correct.
Even before this patch:
TC > 0 is translated to max (TC, 1)
TC < 99 is translated to min (TC, 98)

llvm/lib/Analysis/ScalarEvolution.cpp
15038

They are not created in this function, so I prefer only handling MinMax at this point.

15038

can you explain what isn't legal here?
just checking if this is a min/max of 2 operands, when the 2nd one is constant.

15039

it is used as an output of the function

15053

I will only handle constants for now.

15058

I will only support constants for now

15058

I will only support constants for now

15080

can't use auto since there's a recursive call here

15164

rephrased it a bit.

15166

can't use auto since there's a recursive call here

15189

can't use auto, due to recursive call.

alonkom updated this revision to Diff 493504.Jan 31 2023, 12:28 AM
alonkom marked 5 inline comments as done.
mkazantsev added inline comments.Jan 31 2023, 8:46 AM
llvm/lib/Analysis/ScalarEvolution.cpp
15038

I mean "I would prefer adding sequential min/max in a separate patch". Meaning, so far so good here.

mkazantsev added inline comments.Jan 31 2023, 8:50 AM
llvm/lib/Analysis/ScalarEvolution.cpp
15166

auto HasDivisibiltyInfo = [&](const SCEV *Expr, const SCEV *&DividesBy) -> bool {...
?

I think the example is correct.
Even before this patch:
TC > 0 is translated to max (TC, 1)
TC < 99 is translated to min (TC, 98)

My mistake. You're correct.

alonkom marked an inline comment as done.Jan 31 2023, 11:45 PM
alonkom added inline comments.
llvm/lib/Analysis/ScalarEvolution.cpp
15166

I'm still getting this build error:
use of ‘HasDivisibiltyInfo’ before deduction of ‘auto’

mkazantsev added inline comments.Feb 2 2023, 9:59 PM
llvm/lib/Analysis/ScalarEvolution.cpp
15225

How do you account for RHS being SINT_MIN and similar cases here?

mkazantsev added inline comments.Feb 2 2023, 10:05 PM
llvm/test/Analysis/ScalarEvolution/trip-multiple-guard-info.ll
397

Please precommit tests as is, I want to see what exactly this patch changes.

mkazantsev added inline comments.Feb 2 2023, 10:10 PM
llvm/lib/Analysis/ScalarEvolution.cpp
15217

auto *

alonkom marked an inline comment as done.Feb 5 2023, 3:58 AM
alonkom added inline comments.
llvm/test/Analysis/ScalarEvolution/trip-multiple-guard-info.ll
397
mkazantsev requested changes to this revision.Feb 5 2023, 11:12 PM

I think supporting smin/smax is a bug, this code is only written for unsigned values.

llvm/lib/Analysis/ScalarEvolution.cpp
15038

nit: auto *

15060

Do you check bit width anywhere? What if it doesn't fit into unsigned?

Better use APInt unless you have a reason not to.

15096

Init with nullptr, makes it easier to debug.

15101

Do you really support smin here? Your code is only correct for unsigned values. It interprets all values as non-negative.

This revision now requires changes to proceed.Feb 5 2023, 11:12 PM
alonkom marked an inline comment as done.Feb 6 2023, 5:08 AM
alonkom added inline comments.
llvm/lib/Analysis/ScalarEvolution.cpp
15225

I wonder how this worked before my patch.
if we have assume (N < SINT_MIN) then the following SCEV was generated:
smin(N, SINT_MIN - 1) which would overflow.

alonkom updated this revision to Diff 496081.Feb 9 2023, 4:07 AM
alonkom marked 3 inline comments as done.
alonkom added inline comments.
llvm/lib/Analysis/ScalarEvolution.cpp
15101

I check if the operands are non-negative inside GetPreviousSCEV/GetNextSCEVD

mkazantsev requested changes to this revision.Feb 10 2023, 1:04 AM

Mostly looks good, but I think I found a bug in IsMinMaxSCEVWithConstant . Also please rebase on top of your tests.

llvm/lib/Analysis/ScalarEvolution.cpp
15044
  1. If I remember correctly, SCEV operands are always sorted by kind. It means that constants always go first. RHS can't be a constant because then LHS is also a constant and the whole thing should be folded away. So it should be trivially false, and I guess this code part isn't covered by tests if it wasn't caught.
  2. Usually matcher methods aren't supposed to change operands if they return false. In your case it doesn't matter, but generally better to follow the standard practice and do all checks before modifying the operands.
15101

Ah ok, then it makes sense. Can we assert on that here?

15197

if (auto *MinMax = dyn_cast<SCEVMinMaxExpr>(Expr)) { ...

15225

Well, if there is a check somewhere that RHS is non-negative, then it makes sense. Otherwise it's possibly broken. :)

llvm/test/Analysis/ScalarEvolution/trip-multiple-guard-info.ll
397

Rebase patch on top of it?

This revision now requires changes to proceed.Feb 10 2023, 1:04 AM
alonkom updated this revision to Diff 496757.Feb 12 2023, 5:49 AM
alonkom marked 3 inline comments as done.
alonkom marked 4 inline comments as done.Feb 20 2023, 6:00 AM

@mkazantsev Let me know if you have any other comments

llvm/lib/Analysis/ScalarEvolution.cpp
15044
  1. Nice catch! It didn't hurt the divisibility info, but some min/max info was lost. I Fixed this and added a unit-test since the lit only checks the divisibility.
  2. Done.
15225

Anyway, the change I've made only applies to non-negative values

mkazantsev accepted this revision.Feb 20 2023, 9:52 PM

Ok, thanks, let's give it a try.

llvm/lib/Analysis/ScalarEvolution.cpp
15065

nit: {} not needed

15077–15084

This code is copy-paste-ish (see lambda above), maybe factor out?

15091

If you see any problems with compile time from this patch, this is a potential place where things can be limited. Not sure if it's worth it in practice.

15104

nit: auto*

15136

Is it possible that the map contains both LHSUnknown -> RewrittenLHS and RewrittenLHS -> SomeOtherLHS? Should this be a loop?

I'm ok if it's a separate patch.

This revision is now accepted and ready to land.Feb 20 2023, 9:52 PM
alonkom updated this revision to Diff 499115.Feb 21 2023, 4:59 AM
alonkom marked 3 inline comments as done.
alonkom updated this revision to Diff 499375.Feb 21 2023, 11:07 PM
alonkom added inline comments.Feb 23 2023, 12:02 AM
llvm/lib/Analysis/ScalarEvolution.cpp
15091

I don't expect more than 3 assumes per value in the real-world, so I don't believe it'll affect compilation time drastically, but let's see.

15136

The entire function assumes 1 level of nesting

Why was this reverted?

Why was this reverted?

Failed on an assertion. Uploading a fix now.
(Also some buildbots failed on time-outs, but I'm not sure this is due to my change)

alonkom updated this revision to Diff 500525.Feb 26 2023, 1:38 AM