This is an archive of the discontinued LLVM Phabricator instance.

[IndVars] Change the order to compute WidenAddRec in widenIVUse
ClosedPublic

Authored by wmi on Oct 27 2016, 6:29 PM.

Details

Summary

The patch is to solve the case like following which current indvar cannot widen because of limitation caused by SCEV folding.

void foo(int nsteps, unsigned char *maxarray, int startx, int V1, int V2, unsigned *lined) {

int j, k;
for (j = 0; j < nsteps; j++)
startx = V1 + j * V2;
for (k = 1; k < size - 1; k++)
((unsigned char *)lined[startx + k]) = ...


We have a two level nested loops. There is a array store lined[startx + k] inside the inner loop. Its index is a result of add expr with nsw flag. The operand0 startx can be represented as a SCEV: {V1 ,+, V2}<outerloop> (Note: no nsw here). The operand1 k can be represented as a SCEV: {1 ,+, 1}<nsw><innerloop>.

Because of folding in ScalarEvolution::getAddExpr (folding loop invariant into add recurrence), the SCEV of the add expr is: {{(1 + V1)<nsw> ,+, V2}<outerloop> ,+, 1}<nsw><innerloop>. As a result of the folding, IndVar cannot prove sext{{(1 + V1)<nsw> ,+, V2}<outerloop> ,+, 1}<nsw><innerloop> == sext( {V1 ,+, V2}<outerloop>) +nsw sext( {1 ,+, 1}<nsw><innerloop>) ("The major reason is there is no nsw for {V1 ,+, V2}<outerloop>"), so it cannot do widening for the add expr and cannot remove the sext of the index.

However, since we know from IR that the add expr "startx + k" has nsw flag, we know it is legal to do widening for the expr. We already utilize the nsw flag from IR to enhance the widening by using getExtendedOperandRecurrence. Why it doesn't work here?

To answer the question by myself, the problem is that in this case both WidenIV::getWideRecurrence and WidenIV::getExtendedOperandRecurrence return non-null but different WideAddRec. Because getWideRecurrence is called before getExtendedOperandRecurrence, we won't bother to use the result of getExtendedOperandRecurrence. However, if WidenIV::getExtendedOperandRecurrence returns non-null WideAddRec, we know for sure that it is legal to do widening for current instruction, so we should actually put getExtendedOperandRecurrence before getWideRecurrence. This is what the patch is doing.

Diff Detail

Repository
rL LLVM

Event Timeline

wmi updated this revision to Diff 76154.Oct 27 2016, 6:29 PM
wmi retitled this revision from to [IndVars] Change the order to compute WidenAddRec in widenIVUse.
wmi updated this object.
wmi added reviewers: sanjoy, atrick.
wmi set the repository for this revision to rL LLVM.
wmi added subscribers: llvm-commits, davidxl.
sanjoy accepted this revision.Nov 8 2016, 6:33 PM
sanjoy edited edge metadata.

lgtm, thanks!

If it isn't too much trouble, can you add a zext equivalent of the test case too, just as a sanity check?

This revision is now accepted and ready to land.Nov 8 2016, 6:33 PM
wmi updated this revision to Diff 77546.Nov 10 2016, 1:53 PM
wmi edited edge metadata.

Add zext to the iv-widen.ll test.

I still need to update iv-widen-elim-ext.ll. Now SE->isKnownPredicate return false for "%add = add nsw i32 %i.02, 2" in foo because %add feeds into zext and zext doesn't propagate full poison, so the nsw flag cannot be copied from IR to SCEV. As a result, NarrowIVDefUse::NeverNegative for %add is false and my change here will generate an extra trunc before zext. The fact that existing compiler doesn't generate the extra trunc is by luck to some extent.

I update the test to add another use for %add: "udiv 5, %add", which is to ensure %add is not poison otherwise there will be undefine behavior, and ensure that nsw flag on %add instruction can be copied to its SCEV, so that NarrowIVDefUse::NeverNegative for %add will be true.

After the test update, w/wo the change here, indvars will not generate extra trunc for sext/zext for foo in the testcase.

This revision was automatically updated to reflect the committed changes.