This is an archive of the discontinued LLVM Phabricator instance.

[LoopOptimizer][Fix]PR32859, PR24738
ClosedPublic

Authored by dtemirbulatov on May 10 2017, 10:36 AM.

Details

Summary

The Loop vectorizer pass introduced undef value while it is fixing output of LCSSA form.
Here it is:

before: %e.0.ph = phi i32 [ 0, %for.inc.2.i ]
after: %e.0.ph = phi i32 [ 0, %for.inc.2.i ], [ undef, %middle.block ]

and after this change we have:

%e.0.ph = phi i32 [ 0, %for.inc.2.i ]
%e.0.ph = phi i32 [ 0, %for.inc.2.i ], [ 0, %middle.block ]

Ok to commit?

Diff Detail

Event Timeline

dtemirbulatov created this revision.May 10 2017, 10:36 AM
davide edited edge metadata.May 10 2017, 10:42 AM

Can you explain your fix? Why replacing undef with 0 just works?

test/Transforms/LoopVectorize/pr32859.ll
2–3

This is, unfortunately, really really brittle.
You may want to rewrite this to only run -loop-vectorize.

144–155

This test can be greatly reduced using bugpoint and or delta. For example, I think all these lines (attributes/metadata) aren't really needed).

Well, Out of the LCSSA for we could have "phi i32 [ 0, %for.inc.2.i ]" or "phi i32 [ undef, %for.inc.2.i ]" like in PR14725, but the IR Verifier requires for PHI one entry for each predecessor of its parent basic block. The original PR14725 just added 'undef' for an predecessor BB and it is not correct. We copy the real value for another predecessor instead of bringing 'undef'.

test/Transforms/LoopVectorize/pr32859.ll
144–155

yes, correct.

dberlin edited edge metadata.May 10 2017, 11:12 AM

Well, Out of the LCSSA for we could have "phi i32 [ 0, %for.inc.2.i ]" or "phi i32 [ undef, %for.inc.2.i ]" like in PR14725, but the IR Verifier requires for PHI one entry for each predecessor of its parent basic block. The original PR14725 just added 'undef' for an predecessor BB and it is not correct. We copy the real value for another predecessor instead of bringing 'undef'.

Not all LCSSA phis are single variable phis.

Any loop exit with multiple predecessors will have multiple incoming edges.
Why is "0" the right predecessor to pick there?
ISTM the right predecessor would be the one that we would have gone through before we split the loop.

IE if the loop is split at block B, the right predecessor is whichever would have been reached by B.

I think in the particular case we are limited here to just one predecessor, there is "LCSSAPhi->getNumIncomingValues() == 1" condition above.

I think in the particular case we are limited here to just one predecessor, there is "LCSSAPhi->getNumIncomingValues() == 1" condition above.

Yeah, i'm not sure why (LCSSA.cpp definitely has code to create LCSSA phis that have multiple incoming values, which surprised me, since normally LCSSA is single-valued phis, but ...)
This whole thing seems like a brutal hack (i'm not sure how it's not breaking LCSSA in some cases if we really allow multi-value phis).
But your change should at least be correct for what it does :)

Also, can you please make sure we don't miscompile any of the dups of the original bug with this patch?

Also, can you please make sure we don't miscompile any of the dups of the original bug with this patch?

you mean PRPR14725's testcase? yes this one looks correct to me after this change.

Also, can you please make sure we don't miscompile any of the dups of the original bug with this patch?

you mean PRPR14725's testcase? yes this one looks correct to me after this change.

Yes, you can just run the examples in:
https://bugs.llvm.org/show_bug.cgi?id=32859
and
https://bugs.llvm.org/show_bug.cgi?id=32935

and make sure they return the wrong result after NewGVN.

Thanks!

yes, both examples from PR32935, PR32859 are fixed by this change

update after review.

This patch seems at least to plug a sane value rather than undef for the phi arg.
I don't think this should go in without a vectorizer expert signing it off (i.e. @mkuper or @mssimpso).

test/Transforms/LoopVectorize/pr32859.ll
1–2

And add a comment explaining what this test does/fixes.

26

I think you can probably get rid of some attributes.

28–30

Do you really need this?

mssimpso added inline comments.May 11 2017, 11:41 AM
lib/Transforms/Vectorize/LoopVectorize.cpp
4349

This makes sense to me. My only suggestion might would be to add an assert that LCSSAPhi->getIncomingValue(0) is loop-invariant in the original loop. We allow external uses of inductions and reductions, but we fix those phis differently before calling fixLCSSAPHIs. So if a phi doesn't yet have an entry for the middle block by the time we get to fixLCSSAPHIs, we know the incoming value should be loop-invariant.

test/Transforms/LoopVectorize/pr32859.ll
1

You don't need the "2>&1"

26–30

You don't need the attributes and metadata.

another update after review. Erased attributes for the test added test's explanation. Added an assert that LCSSAPhi->getIncomingValue(0) is loop-invariant in the original loop

Fixed some typos.

mssimpso accepted this revision.May 12 2017, 7:49 AM
This revision is now accepted and ready to land.May 12 2017, 7:49 AM
This revision was automatically updated to reflect the committed changes.