Page MenuHomePhabricator

[SCEV] No-wrap flags are not propagated when folding "{S,+,X}+T ==> {S+T,+,X}"
ClosedPublic

Authored by iid_iunknown on May 8 2016, 10:22 AM.

Details

Summary

Description

This makes WidenIV::widenIVUse (IndVarSimplify.cpp) fail to widen narrow IV uses in some cases. The latter affects IndVarSimplify which may not eliminate narrow IV's when there actually exists such a possibility, thereby producing ineffective code.

When WidenIV::widenIVUse gets a NarrowUse such as {(-2 + %inc.lcssa),+,1}<nsw><%for.body3>, it first tries to get a wide recurrence for it via the getWideRecurrence call.
getWideRecurrence returns recurrence like this: {(sext i32 (-2 + %inc.lcssa) to i64),+,1}<nsw><%for.body3>.

Then a wide use operation is generated by cloneIVUser. The generated wide use is evaluated to {(-2 + (sext i32 %inc.lcssa to i64))<nsw>,+,1}<nsw><%for.body3>, which is different from the getWideRecurrence result. cloneIVUser sees the difference and returns nullptr.

This patch also fixes the broken LLVM tests by adding missing <nsw> entries introduced by the correction.

Minimal reproducer:

int foo(int a, int b, int c);
int baz();

void bar()
{
   int arr[20];
   int i = 0;

   for (i = 0; i < 4; ++i)
     arr[i] = baz();

   for (; i < 20; ++i)
     arr[i] = foo(arr[i - 4], arr[i - 3], arr[i - 2]);
}

Clang command line:

clang++ -mllvm -debug -S -emit-llvm -O3 --target=aarch64-linux-elf test.cpp -o test.ir

Expected result:
The -mllvm -debug log shows that all the IV's for the second for loop have been eliminated.

Diff Detail

Repository
rL LLVM

Event Timeline

iid_iunknown retitled this revision from to [SCEV] No-wrap flags are not propagated when folding "{S,+,X}+T ==> {S+T,+,X}".
iid_iunknown updated this object.
iid_iunknown added a reviewer: sanjoy.
iid_iunknown set the repository for this revision to rL LLVM.
iid_iunknown added a subscriber: llvm-commits.
sanjoy requested changes to this revision.May 8 2016, 11:36 PM
sanjoy edited edge metadata.

This needs at least one test case (preferably more than one) demonstrating the motivation. A simplified form of the "Minimal reproducer:" will do.

lib/Analysis/ScalarEvolution.cpp
2283–2286

Can you please add a comment here on why this is correct:

// This follows from the fact that the no-wrap flags on the outer add
// expression is applicable on the 0th iteration, when the add recurrence
// will be equal to its start value.
This revision now requires changes to proceed.May 8 2016, 11:36 PM
iid_iunknown edited edge metadata.

Addressing the review comments.

This needs at least one test case (preferably more than one) demonstrating the motivation. A simplified form of the "Minimal reproducer:" will do.

Thanks for your comments!

I added the test case and reduced it a bit but its IR is still quite big. The problem is that any attempt to shorten it further seems to hide the problem - IV elimination starts working even without our fix. Could you check if such a test is acceptable, please?

The test case checks if the optimized IR has no IV and its unnecessary sext. Another way to check this is to use opt -debug and look for "INDVARS: eliminating ..." in the output. Please, let me know if this way is more preferable.

Thanks!

iid_iunknown marked an inline comment as done.May 13 2016, 3:17 PM
asl added a subscriber: asl.May 16 2016, 7:08 AM
sanjoy requested changes to this revision.May 16 2016, 2:54 PM
sanjoy edited edge metadata.
sanjoy added inline comments.
test/Analysis/ScalarEvolution/iv_elimination.ll
1 ↗(On Diff #57262)

We don't run -On in unit tests unless unavoidable. What I had in was something like:

; RUN: opt -analyze -scalar-evolution < %s | FileCheck %s


< IR that -indvars sees, you can get this by breakpointing at IndVarSimplify::runOnLoop , and dumping out the Module >

with CHECK: lines verifying that the sext instructions don't get mapped to sext SCEV expressions.

This should also live in nsw.ll.

This revision now requires changes to proceed.May 16 2016, 2:54 PM
atrick added a subscriber: atrick.May 16 2016, 6:53 PM

Could you elaborate on this part of your comment, please?

with `CHECK:` lines verifying that the `sext` instructions don't get mapped to `sext` SCEV expressions.

A piece from my modified test that exposes the problem:

for.body3:
  %i.16 = phi i32 [ %inc5, %for.body3 ], [ %i.0.lcssa, %for.cond1.preheader ]
  %sub = add nsw i32 %i.16, -2
  %idxprom = sext i32 %sub to i64
  %arrayidx = getelementptr inbounds [10 x i32], [10 x i32]* %arr, i64 0, i64 %idxprom

The patch eliminates %idxprom and uses an introduced %indvars.iv for array indexing. I am not sure this is something that can be well tested with opt -analyze -scalar-evolution. Opt shows different outputs with and w/o the patch, but these differences mainly relate to moving the constants outside of 'sext'. E.g.:

Patched:

%idxprom = sext i32 %sub to i64
  -->  {(-2 + (sext i32 %i.0.lcssa to i64))<nsw>,+,1}<nsw><%for.body3> U: [-2147483650,4294967304) S: [-2147483650,4294967304)		Exits: (-2 + (zext i32 (9 + (-1 * %i.0.lcssa)) to i64) + (sext i32 %i.0.lcssa to i64))

No patch:

%idxprom = sext i32 %sub to i64
  -->  {(sext i32 (-2 + %i.0.lcssa) to i64),+,1}<nsw><%for.body3> U: [-2147483648,4294967306) S: [-2147483648,4294967306)		Exits: ((zext i32 (9 + (-1 * %i.0.lcssa)) to i64) + (sext i32 (-2 + %i.0.lcssa) to i64))

(Patch gives "(-2 + (sext i32 %i.0.lcssa to i64))<nsw>" instead of "(sext i32 (-2 + %i.0.lcssa) to i64)").

My idea was to check that %idxprom gets eliminated and the array is indexed by an expression w/o 'sext'. This can be done by opt -indvars (-analyze is not useful for -indvars as IndVarsSimplify::print() is not defined).

Is this is what the test is expected to do or you have a different idea?

Thanks.

Could you elaborate on this part of your comment, please?

with `CHECK:` lines verifying that the `sext` instructions don't get mapped to `sext` SCEV expressions.

A piece from my modified test that exposes the problem:

for.body3:
  %i.16 = phi i32 [ %inc5, %for.body3 ], [ %i.0.lcssa, %for.cond1.preheader ]
  %sub = add nsw i32 %i.16, -2
  %idxprom = sext i32 %sub to i64
  %arrayidx = getelementptr inbounds [10 x i32], [10 x i32]* %arr, i64 0, i64 %idxprom

The patch eliminates %idxprom and uses an introduced %indvars.iv for
array indexing. I am not sure this is something that can be well
tested with opt -analyze -scalar-evolution. Opt shows different
outputs with and w/o the patch, but these differences mainly relate to
moving the constants outside of 'sext'. E.g.:

Patched:

%idxprom = sext i32 %sub to i64
  -->  {(-2 + (sext i32 %i.0.lcssa to i64))<nsw>,+,1}<nsw><%for.body3> U: [-2147483650,4294967304) S: [-2147483650,4294967304)		Exits: (-2 + (zext i32 (9 + (-1 * %i.0.lcssa)) to i64) + (sext i32 %i.0.lcssa to i64))

No patch:

%idxprom = sext i32 %sub to i64
  -->  {(sext i32 (-2 + %i.0.lcssa) to i64),+,1}<nsw><%for.body3> U: [-2147483648,4294967306) S: [-2147483648,4294967306)		Exits: ((zext i32 (9 + (-1 * %i.0.lcssa)) to i64) + (sext i32 (-2 + %i.0.lcssa) to i64))

This _is_ the difference I'm suggesting we should test for. In the
unpatched case, SCEV has not been able to prove that `(-2 +
%i.0.lcssa)` won't sign overflow, so it cannot commute a sext into the
addition to transform sext(-2 + %i.0.lcssa) into `-2 +
sext(%i.0.lcssa). With your change SCEV is able to prove that -2 +
%i.0.lcssa` does not sign overflow, so the sext can be commuted into
the addition, simplifying the expression to `(-2 + (sext i32
%i.0.lcssa to i64))` as you can see.

My idea was to check that %idxprom gets eliminated and the array is
indexed by an expression w/o 'sext'. This can be done by `opt
-indvars` (-analyze is not useful for -indvars as
IndVarsSimplify::print() is not defined).

That's a useful test too, and if you're more comfortable with that I
won't be opposed to it. But checking SCEV directly is more targeted.

iid_iunknown edited edge metadata.

Addressing the review comments from Sanjoy.

The test case was reduced and changed to check that the sext instructions don't get mapped to sext SCEV expressions.

sanjoy accepted this revision.May 24 2016, 9:35 AM
sanjoy edited edge metadata.

lgtm with nits

Let me know if you need me to check this in for you.

test/Analysis/ScalarEvolution/nsw.ll
181

Nit: you don't need to specify D20058 here. Also use a CHECK-LABEL: <function name> like in the previous tests.

201

Add newline at end of file?

This revision is now accepted and ready to land.May 24 2016, 9:35 AM
iid_iunknown edited edge metadata.

Corrections according to the Sanjoy's comments

iid_iunknown marked 2 inline comments as done.May 24 2016, 4:10 PM

Let me know if you need me to check this in for you.

I have commit access and can commit it on my own.
Thanks!

test/Analysis/ScalarEvolution/nsw.ll
201

The tool I used for diff creation removed the trailing eol for some reason.
Fixed.

iid_iunknown closed this revision.May 25 2016, 6:07 AM
iid_iunknown marked an inline comment as done.