This is an archive of the discontinued LLVM Phabricator instance.

[LoopFlatten] Fix missed LoopFlatten opportunity
ClosedPublic

Authored by RosieSumpter on Jul 12 2021, 2:12 AM.

Details

Summary

When the limit of the inner loop is a known integer, the InstCombine
pass now causes the transformation e.g. imcp ult i32 %inc, tripcount ->
icmp ult %j, tripcount-step (where %j is the inner loop induction
variable and %inc is add %j, step), which is now accounted for when
identifying the trip count of the loop. This is also an acceptable use
of %j (provided the step is 1) so is ignored as long as the compare
that it's used in is also the condition of the inner branch.

Diff Detail

Event Timeline

RosieSumpter created this revision.Jul 12 2021, 2:12 AM
RosieSumpter requested review of this revision.Jul 12 2021, 2:12 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 12 2021, 2:12 AM
SjoerdMeijer added inline comments.Jul 12 2021, 5:18 AM
llvm/lib/Transforms/Scalar/LoopFlatten.cpp
156

Make this an else if?

165

I am wondering what happens if the RHS is not a constant, so instead of this:

%cmp = icmp ult i32 %j, 18

we have something like:

%cmp = icmp ult i32 %j, %N

Do we reject that earlier, and/or do we not have these test cases yet?

168

And I was also wondering if we need to worry about RHS being INT_MAX, in which case the + 1 will overflow.

403

Perhaps here we can have:

// The use is in the compare which is also the condition of the inner
// branch. In this case the compare has been altered by another
// transformation (e.g icmp ult %inc, limit -> icmp ult %j, limit-1).
// Ignore this use as the compare gets removed later anyway.
if (U == FI.InnerBranch->getCondition())
   continue;

To make things a bit simpler.

llvm/test/Transforms/LoopFlatten/loop-flatten-negative.ll
348

You need to CHECK: something here.

379

Same here.

llvm/test/Transforms/LoopFlatten/loop-flatten.ll
589

It's more common to put the CHECK-LABEL just before

; CHECK: entry:
fhahn added inline comments.Jul 12 2021, 1:26 PM
llvm/lib/Transforms/Scalar/LoopFlatten.cpp
162

Those kinds of patterns are very fragile in general. Would it be possible to use SCEV instead, which makes it easy to analyse the induction variables & trip counts in a more robust way?

SjoerdMeijer added inline comments.Jul 13 2021, 1:09 AM
llvm/lib/Transforms/Scalar/LoopFlatten.cpp
162

You're absolutely right. In fact, there's a quite some pattern matching going on for things like getLoopTest, getLoopIncrement, etc., that you'd probably expect to be present in a loop util helper function or something like that, but they aren't. The pattern matching makes some things easier though (as we are looking for some specific patterns). I feel that moving some things to helpers and using SCEV is major surgery, and if we promise not to add more pattern matching after this, I was hoping to postpone this surgery. I.e., we are on a little mission to get LoopFlatten enabled by default (we have this enabled for years now downstream), and before we do that we wanted to fix 2 minor things:

  • this minor extension for a case which you expect to trigger,
  • and we need to strengthen an overflow check.

After this, I think a nice follow up project is indeed to see if we can move code to helpers and use SCEV.

RosieSumpter marked 3 inline comments as done.
  • Changed else to else if
  • Amended comments
  • Moved 'CHECK-LABEL' statement
  • Added test for when incoming phi node value for preheader is a variable (not zero)
  • Fixed debug output for when incoming phi node value can't be cast to ConstantInt (gives null which causes a crash when dump() is used)
llvm/lib/Transforms/Scalar/LoopFlatten.cpp
165

It shouldn't get here unless it is constant since this only catches the case where the compare which is the condition of the back edge has been changed from e.g.

%cmp = icmp ult i32 %inc, 19

to

%cmp = icmp ult i32 %j, 18

If the RHS is unknown e.g.

%cmp = icmp ult i32 %inc, %N

it won't have been changed and so will be caught by one of the previous pattern matches.

I've altered the comments so that it's clearer that this match is only for the case when limit is a constant, and also added an assert to ensure that the LHS of the compare is the induction phi.

168

So here the limit is RHS+1 because the another transformation changed the compare from

%cmp = icmp ult i32 %inc, limit

to

%cmp = icmp ult i32 %j, limit-1

(where limit is a known constant) so the only way RHS=limit-1 would be INT_MAX is if limit had already overflowed, so I think we don't need to check for it here?

llvm/test/Transforms/LoopFlatten/loop-flatten-negative.ll
348

The only CHECK in this file is at the beginning:

CHECK-NOT: Checks all passed, doing the transformation

which checks that all these tests fail doesn't it? Or should I being doing more specific checks here?

fhahn added inline comments.Jul 14 2021, 4:57 AM
llvm/lib/Transforms/Scalar/LoopFlatten.cpp
162

I was hoping to postpone this surgery. I.e., we are on a little mission to get LoopFlatten enabled by default

I understand the motivation, but I am not sure it is the best way forward. If SCEV is the the right tool to use, I think it would be great to make the switch before enabling it by default; I am not too familiar with the details, but it looks like the pattern matching of induction variables and compares to get the loop trip counts seems like something that SCEV already provides.

After it is enabled by default, the motivation for a substantial refactoring might be smaller than before enabling it. When enabling it by default, it is also beneficial for the pass to be as general as possible, so fundamental problems can be flushed out early and the compile-time/code-size impact is also representative on a large set of inputs.

Also, not using SCEV makes it harder to reason about the pass, because we can't rely on trusting SCEV to correctly handle all the difficult cases with respect to overflows and such when it comes to trip counts and inductions. We instead need to verify the manual patterns and their interaction with the dependent code. For example, it seems like using SCEV would allow us to be more confident when reasoning about some of the issues in the comments below.

SjoerdMeijer added inline comments.Jul 14 2021, 6:42 AM
llvm/lib/Transforms/Scalar/LoopFlatten.cpp
162

Alright, fair enough, let's then first see what using SCEV exactly means and would involve.

I am not too familiar with the details, but it looks like the pattern matching of induction variables and compares to get the loop trip counts seems like something that SCEV already provides.

Yeah, the trip count is the easy part, but it's also the other loop components like the increment, test, etc., which are not provided by existing APIs. Thus, with major surgery, I probably also meant getting rid of findLoopComponents and adding something similar to Loop or LoopUtils, possibly using SCEV more.

This little pattern match extension here might still be innocent and the way to go, but like I said, let's review SCEV usage first then.

RosieSumpter edited the summary of this revision. (Show Details)

Other patches have refactored parts of LoopFlatten's code to use SCEV and Loop instead of pattern matching (see https://reviews.llvm.org/D106580 and https://reviews.llvm.org/D106256) so this diff has been updated to be on top of these changes.

SjoerdMeijer added inline comments.Jul 28 2021, 1:32 AM
llvm/lib/Transforms/Scalar/LoopFlatten.cpp
177–178

Nit: every time I need to read this if a couple of times, but I think it makes sense. Just for readability, to reduce indentation inside the if a little bit, I was wondering if this would help:

if (SE->getSCEV(TripCount) != SCEVTripCount) && !IsWidened) {
..
} else if (SE->getSCEV(TripCount) != SCEVTripCount)) {
..
}
206

Nit: what was this true argument again? I would guess thought that we don't need...

llvm/test/Transforms/LoopFlatten/loop-flatten-negative.ll
348

Ah, sorry, I had missed that. It's indeed fine like this.

RosieSumpter marked 2 inline comments as done.
  • Replaced checks on RHS of compare with asserts
  • Slight rewriting to improve readability
SjoerdMeijer accepted this revision.Jul 28 2021, 3:40 AM

Thanks, much easier to read now, LGTM!

This revision is now accepted and ready to land.Jul 28 2021, 3:40 AM
This revision was automatically updated to reflect the committed changes.
RosieSumpter reopened this revision.Aug 2 2021, 1:21 AM
This revision is now accepted and ready to land.Aug 2 2021, 1:21 AM
  • Added check and test for trip count
SjoerdMeijer accepted this revision.EditedAug 2 2021, 2:02 AM

I've ran downstream testing for this patch, and that came back okay. LGTM.

This revision was automatically updated to reflect the committed changes.