This is an archive of the discontinued LLVM Phabricator instance.

[SCEV] Preserve divisibility and min/max information in applyLoopGuards
ClosedPublic

Authored by alonkom on Jan 16 2023, 7:01 AM.

Details

Summary

applyLoopGuards doesn't always preserve information when there are multiple assumes.
This patch tries to deal with multiple assumes regarding a SCEV's divisibility and min/max values, and rewrite it into a SCEV that still preserves all of the information.
For example, let the trip count of the loop be TC. Consider the 3 following assumes:

  1. __builtin_assume(TC % 8 == 0);
  2. __builtin_assume(TC > 0);
  3. __builtin_assume(TC < 100);

Before this patch, depending on the assume processing order applyLoopGuards could create the following SCEV:
max(min((8 * (TC / 8)) , 99), 1)

Looking at this SCEV, it doesn't preserve the divisibility by 8 information.

After this patch, depending on the assume processing order applyLoopGuards could create the following SCEV:
max(min((8 * (TC / 8)) , 96), 8)

By aligning up 1 to 8, and aligning down 99 to 96, the new SCEV still preserves all of the original assumes.

Diff Detail

Event Timeline

alonkom created this revision.Jan 16 2023, 7:01 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 16 2023, 7:01 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
alonkom requested review of this revision.Jan 16 2023, 7:01 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 16 2023, 7:01 AM

What is the relation to https://reviews.llvm.org/D128701? Seems maybe you should close the other one.

Can you make the commit title more specific, and add more info in the commit message? In particular, I would like to see something like:

after applyLoopGuards, turns <A> into <B>

where A and B are SCEV expressions.

Other information like, what the problem was before, and how you address it, is also helpful. I would suggest looking at the git log of ScalarEvolution.cpp for examples of descriptive commit messages. This will save the reviewers time, and will be helpful when people look at the git log/blame.

alonkom retitled this revision from [SCEV] Preserve information in applyLoopGuards to [SCEV] Preserve divisibility and min/max information in applyLoopGuards.Jan 17 2023, 4:24 AM
alonkom edited the summary of this revision. (Show Details)

What is the relation to https://reviews.llvm.org/D128701? Seems maybe you should close the other one.

Can you make the commit title more specific, and add more info in the commit message? In particular, I would like to see something like:

after applyLoopGuards, turns <A> into <B>

where A and B are SCEV expressions.

Other information like, what the problem was before, and how you address it, is also helpful. I would suggest looking at the git log of ScalarEvolution.cpp for examples of descriptive commit messages. This will save the reviewers time, and will be helpful when people look at the git log/blame.

Thanks, updated the commit message.

alonkom updated this revision to Diff 490073.Jan 18 2023, 1:33 AM
alonkom added a reviewer: fhahn.

The improved trip multiples from the test results look good. Ordering in applyLoopGuards is an issue. However, I think we can simplify this down a bit. What if we always applied min/max first, before we apply divisibility guards? For example, given:

__builtin_assume(TC % 8 == 0);
__builtin_assume(TC > 0);
__builtin_assume(TC < 100);
  1. apply max: umax(1, TC)
  2. apply min: umin(100, umax(1, TC))
  3. apply divisibility info: 8 * (umin(100, umax(1, TC))) / 8

This makes divisibility info obvious. And traversing the SCEV, we can still see TC > 0 and TC < 100. I believe that if we always apply max/min first, we will never lose this info. This approach seems much simpler and easier to understand.

FYI. I haven't been around that long, and this change is non-trivial enough where I prefer a longstanding developer to give the final approval. I will still be around to give my thoughts.

llvm/lib/Analysis/ScalarEvolution.cpp
15042

SequentialMinMax SCEVs can be applied here as well.

15043

unused var

15047

Constant should always be on the left side. Lets not change that.

15062

Have not thought about it too deeply yet, but I'm concerned that we may need to take a look at when this is legal given Expr's NoWrapFlags. Same for the equivalent getMinusSCEV below

15171

Shouldn't swap the constant. Assume constant is on left.

The improved trip multiples from the test results look good. Ordering in applyLoopGuards is an issue. However, I think we can simplify this down a bit. What if we always applied min/max first, before we apply divisibility guards? For example, given:

__builtin_assume(TC % 8 == 0);
__builtin_assume(TC > 0);
__builtin_assume(TC < 100);
  1. apply max: umax(1, TC)
  2. apply min: umin(100, umax(1, TC))
  3. apply divisibility info: 8 * (umin(100, umax(1, TC))) / 8

This makes divisibility info obvious. And traversing the SCEV, we can still see TC > 0 and TC < 100. I believe that if we always apply max/min first, we will never lose this info. This approach seems much simpler and easier to understand.

FYI. I haven't been around that long, and this change is non-trivial enough where I prefer a longstanding developer to give the final approval. I will still be around to give my thoughts.

Thanks for the reply.
This works in this example, but what would happen if we had only these assumes:

__builtin_assume(TC % 8 == 0);
__builtin_assume(TC > 0);

In that case, we would have created the following SCEV:

8 * (umax(1, TC) / 8)

since umax(1, TC) may still be between [1,7], when we divide it by 8, and multiply by 8, we get 0. So this doesn't preserve the fact that TC > 0.
In order to do that we must align up 1 to 8, and then umax(8, TC) / 8 is always > 0.

The improved trip multiples from the test results look good. Ordering in applyLoopGuards is an issue. However, I think we can simplify this down a bit. What if we always applied min/max first, before we apply divisibility guards? For example, given:

__builtin_assume(TC % 8 == 0);
__builtin_assume(TC > 0);
__builtin_assume(TC < 100);
  1. apply max: umax(1, TC)
  2. apply min: umin(100, umax(1, TC))
  3. apply divisibility info: 8 * (umin(100, umax(1, TC))) / 8

This makes divisibility info obvious. And traversing the SCEV, we can still see TC > 0 and TC < 100. I believe that if we always apply max/min first, we will never lose this info. This approach seems much simpler and easier to understand.

FYI. I haven't been around that long, and this change is non-trivial enough where I prefer a longstanding developer to give the final approval. I will still be around to give my thoughts.

Thanks for the reply.
This works in this example, but what would happen if we had only these assumes:

__builtin_assume(TC % 8 == 0);
__builtin_assume(TC > 0);

In that case, we would have created the following SCEV:

8 * (umax(1, TC) / 8)

since umax(1, TC) may still be between [1,7], when we divide it by 8, and multiply by 8, we get 0. So this doesn't preserve the fact that TC > 0.
In order to do that we must align up 1 to 8, and then umax(8, TC) / 8 is always > 0.

umax(1, TC) implies TC <= 1, which holds in this case.

I think what you meant is in the case that, TC > 0, we get: 8 * (umin(1, TC)) / 8. If TC is in [0, 7], the expression evaluates to 0, and TC > 0 is lost.

Your point comes across that we can't just apply guards in a certain order. I have some thoughts, but I'll think about it and write it down later.

After this patch, depending on the assume processing order applyLoopGuards could create the following SCEV:
max(min((8 * (TC / 8)) , 96), 8)

This example looks wrong. I think min/max should be switched. Should be

min(max((8 * (TC / 8)), 96), 8)

Please update description.


In terms of overall approach, I'm not sure. Feels a bit hacky to have custom logic to check that an expressions is a min/max of mul/div. I'll let others chime in here.

llvm/lib/Analysis/ScalarEvolution.cpp
15163

typo: wether

I'm not sure what the divisor B in \p DividesBy means.

I think this paragraph needs to be more clear. What does it mean to be composed on Min/Max SCEVs?

mkazantsev requested changes to this revision.Jan 24 2023, 1:54 AM

Some nits, and potentially a bug in formula.

llvm/lib/Analysis/ScalarEvolution.cpp
15042

I'd prefer it to be a separate patch, if it's legal at all. Need to check carefully how poison flows in these formulae.

15045

MinMax can have more than 2 operands. Check that it is exactly 2 of them?

15054

greater or equal

15057

If I'm reading this correctly, this is a guarantee of no-overflow for computations you are doing. Maybe add this comment explicitly?

15062

This computation is inconsistent. If Rem is constant 0, you'll return Expr. But if Rem is effectively zero, but not a constant (e.g. some complex expression which is always zero), you return Expr + Divisor. Bug?

15068

less or equal

15076

I think you can safely drop check for Rem->isZero() here as it will be trivially simplified away in getMinusSCEV

15084

auto

15165

auto

15179

if (auto *MinMax = dyn_cast<SCEVMinMaxExpr>(Expr))

15188

auto

This revision now requires changes to proceed.Jan 24 2023, 1:54 AM
alonkom marked 14 inline comments as done.Jan 31 2023, 12:27 AM
After this patch, depending on the assume processing order applyLoopGuards could create the following SCEV:
max(min((8 * (TC / 8)) , 96), 8)

This example looks wrong. I think min/max should be switched. Should be

min(max((8 * (TC / 8)), 96), 8)

Please update description.


In terms of overall approach, I'm not sure. Feels a bit hacky to have custom logic to check that an expressions is a min/max of mul/div. I'll let others chime in here.

I think the example is correct.
Even before this patch:
TC > 0 is translated to max (TC, 1)
TC < 99 is translated to min (TC, 98)

llvm/lib/Analysis/ScalarEvolution.cpp
15042

They are not created in this function, so I prefer only handling MinMax at this point.

15042

can you explain what isn't legal here?
just checking if this is a min/max of 2 operands, when the 2nd one is constant.

15043

it is used as an output of the function

15057

I will only handle constants for now.

15062

I will only support constants for now

15062

I will only support constants for now

15084

can't use auto since there's a recursive call here

15163

rephrased it a bit.

15165

can't use auto since there's a recursive call here

15188

can't use auto, due to recursive call.

alonkom updated this revision to Diff 493504.Jan 31 2023, 12:28 AM
alonkom marked 5 inline comments as done.
mkazantsev added inline comments.Jan 31 2023, 8:46 AM
llvm/lib/Analysis/ScalarEvolution.cpp
15042

I mean "I would prefer adding sequential min/max in a separate patch". Meaning, so far so good here.

mkazantsev added inline comments.Jan 31 2023, 8:50 AM
llvm/lib/Analysis/ScalarEvolution.cpp
15165

auto HasDivisibiltyInfo = [&](const SCEV *Expr, const SCEV *&DividesBy) -> bool {...
?

I think the example is correct.
Even before this patch:
TC > 0 is translated to max (TC, 1)
TC < 99 is translated to min (TC, 98)

My mistake. You're correct.

alonkom marked an inline comment as done.Jan 31 2023, 11:45 PM
alonkom added inline comments.
llvm/lib/Analysis/ScalarEvolution.cpp
15165

I'm still getting this build error:
use of ‘HasDivisibiltyInfo’ before deduction of ‘auto’

mkazantsev added inline comments.Feb 2 2023, 9:59 PM
llvm/lib/Analysis/ScalarEvolution.cpp
15224

How do you account for RHS being SINT_MIN and similar cases here?

mkazantsev added inline comments.Feb 2 2023, 10:05 PM
llvm/test/Analysis/ScalarEvolution/trip-multiple-guard-info.ll
397

Please precommit tests as is, I want to see what exactly this patch changes.

mkazantsev added inline comments.Feb 2 2023, 10:10 PM
llvm/lib/Analysis/ScalarEvolution.cpp
15216

auto *

alonkom marked an inline comment as done.Feb 5 2023, 3:58 AM
alonkom added inline comments.
llvm/test/Analysis/ScalarEvolution/trip-multiple-guard-info.ll
397
mkazantsev requested changes to this revision.Feb 5 2023, 11:12 PM

I think supporting smin/smax is a bug, this code is only written for unsigned values.

llvm/lib/Analysis/ScalarEvolution.cpp
15042

nit: auto *

15064

Do you check bit width anywhere? What if it doesn't fit into unsigned?

Better use APInt unless you have a reason not to.

15100

Init with nullptr, makes it easier to debug.

15105

Do you really support smin here? Your code is only correct for unsigned values. It interprets all values as non-negative.

This revision now requires changes to proceed.Feb 5 2023, 11:12 PM
alonkom marked an inline comment as done.Feb 6 2023, 5:08 AM
alonkom added inline comments.
llvm/lib/Analysis/ScalarEvolution.cpp
15224

I wonder how this worked before my patch.
if we have assume (N < SINT_MIN) then the following SCEV was generated:
smin(N, SINT_MIN - 1) which would overflow.

alonkom updated this revision to Diff 496081.Feb 9 2023, 4:07 AM
alonkom marked 3 inline comments as done.
alonkom added inline comments.
llvm/lib/Analysis/ScalarEvolution.cpp
15105

I check if the operands are non-negative inside GetPreviousSCEV/GetNextSCEVD

mkazantsev requested changes to this revision.Feb 10 2023, 1:04 AM

Mostly looks good, but I think I found a bug in IsMinMaxSCEVWithConstant . Also please rebase on top of your tests.

llvm/lib/Analysis/ScalarEvolution.cpp
15048
  1. If I remember correctly, SCEV operands are always sorted by kind. It means that constants always go first. RHS can't be a constant because then LHS is also a constant and the whole thing should be folded away. So it should be trivially false, and I guess this code part isn't covered by tests if it wasn't caught.
  2. Usually matcher methods aren't supposed to change operands if they return false. In your case it doesn't matter, but generally better to follow the standard practice and do all checks before modifying the operands.
15105

Ah ok, then it makes sense. Can we assert on that here?

15196

if (auto *MinMax = dyn_cast<SCEVMinMaxExpr>(Expr)) { ...

15224

Well, if there is a check somewhere that RHS is non-negative, then it makes sense. Otherwise it's possibly broken. :)

llvm/test/Analysis/ScalarEvolution/trip-multiple-guard-info.ll
397

Rebase patch on top of it?

This revision now requires changes to proceed.Feb 10 2023, 1:04 AM
alonkom updated this revision to Diff 496757.Feb 12 2023, 5:49 AM
alonkom marked 3 inline comments as done.
alonkom marked 4 inline comments as done.Feb 20 2023, 6:00 AM

@mkazantsev Let me know if you have any other comments

llvm/lib/Analysis/ScalarEvolution.cpp
15048
  1. Nice catch! It didn't hurt the divisibility info, but some min/max info was lost. I Fixed this and added a unit-test since the lit only checks the divisibility.
  2. Done.
15224

Anyway, the change I've made only applies to non-negative values

mkazantsev accepted this revision.Feb 20 2023, 9:52 PM

Ok, thanks, let's give it a try.

llvm/lib/Analysis/ScalarEvolution.cpp
15069

nit: {} not needed

15081–15088

This code is copy-paste-ish (see lambda above), maybe factor out?

15095

If you see any problems with compile time from this patch, this is a potential place where things can be limited. Not sure if it's worth it in practice.

15108

nit: auto*

15135

Is it possible that the map contains both LHSUnknown -> RewrittenLHS and RewrittenLHS -> SomeOtherLHS? Should this be a loop?

I'm ok if it's a separate patch.

This revision is now accepted and ready to land.Feb 20 2023, 9:52 PM
alonkom updated this revision to Diff 499115.Feb 21 2023, 4:59 AM
alonkom marked 3 inline comments as done.
alonkom updated this revision to Diff 499375.Feb 21 2023, 11:07 PM
alonkom added inline comments.Feb 23 2023, 12:02 AM
llvm/lib/Analysis/ScalarEvolution.cpp
15095

I don't expect more than 3 assumes per value in the real-world, so I don't believe it'll affect compilation time drastically, but let's see.

15135

The entire function assumes 1 level of nesting

Why was this reverted?

Why was this reverted?

Failed on an assertion. Uploading a fix now.
(Also some buildbots failed on time-outs, but I'm not sure this is due to my change)

alonkom updated this revision to Diff 500525.Feb 26 2023, 1:38 AM