This is an archive of the discontinued LLVM Phabricator instance.

[LV] Don't allow outside uses of IVs if the SCEV is predicated on loop conditions
ClosedPublic

Authored by mkuper on Jul 10 2017, 4:26 PM.

Details

Summary

This fixes PR33706.

I'm still not 100% sure the PSCEV actually makes sense here. Sanjoy, if you could take a look (even post-commit) at the original PR, that would be great.

Diff Detail

Repository
rL LLVM

Event Timeline

mkuper created this revision.Jul 10 2017, 4:26 PM
mssimpso edited edge metadata.Jul 11 2017, 8:06 AM

Hi Michael,

I'm probably missing something obvious here. For external IV uses, we compute the end IV value assuming we're coming from the vector loop. But if we executed the vector loop, shouldn't the SCEV predicate have been true, which would mean that the PSEV was a safe assumption?

Hi Michael,

I'm probably missing something obvious here. For external IV uses, we compute the end IV value assuming we're coming from the vector loop. But if we executed the vector loop, shouldn't the SCEV predicate have been true, which would mean that the PSEV was a safe assumption?

You're not necessarily missing something obvious, I may be misunderstanding what PSCEV does here. If I understand correctly, PSCEV adds the assumption the IV doesn't overflow inside the loop. In this case, what I see is that it overflows on loop exit. So, either:
a) I misunderstand what PSCEV is doing here. :-)
b) PSCEV is wrong inside the loop.
c) The assumption PSCEV makes is correct, except on exit. If this is the case, then we can't use the SCEV we get to compute the value after the last iteration.

mssimpso accepted this revision.Jul 11 2017, 2:15 PM

Hi Michael,

I'm probably missing something obvious here. For external IV uses, we compute the end IV value assuming we're coming from the vector loop. But if we executed the vector loop, shouldn't the SCEV predicate have been true, which would mean that the PSEV was a safe assumption?

You're not necessarily missing something obvious, I may be misunderstanding what PSCEV does here. If I understand correctly, PSCEV adds the assumption the IV doesn't overflow inside the loop. In this case, what I see is that it overflows on loop exit. So, either:
a) I misunderstand what PSCEV is doing here. :-)
b) PSCEV is wrong inside the loop.
c) The assumption PSCEV makes is correct, except on exit. If this is the case, then we can't use the SCEV we get to compute the value after the last iteration.

I see. Yes, that makes sense to me, then.

This revision is now accepted and ready to land.Jul 11 2017, 2:15 PM

I'm probably missing something obvious here. For external IV uses, we compute the end IV value assuming we're coming from the vector loop. But if we executed the vector loop, shouldn't the SCEV predicate have been true, which would mean that the PSEV was a safe assumption?

You're not necessarily missing something obvious, I may be misunderstanding what PSCEV does here. If I understand correctly, PSCEV adds the assumption the IV doesn't overflow inside the loop. In this case, what I see is that it overflows on loop exit. So, either:
a) I misunderstand what PSCEV is doing here. :-)
b) PSCEV is wrong inside the loop.
c) The assumption PSCEV makes is correct, except on exit. If this is the case, then we can't use the SCEV we get to compute the value after the last iteration.

I can't be sure without understanding what exactly LV is trying to do here, but I suspect it is (c). This is true not only of PSCEV but also of SCEV -- "{A,+,B} is nsw" means "the increment *operation* will not overflow if the backedge is taken" or "the increment *operation* may overflow on the last iteration".

Concretely, this is the predicate PSCEV uses to make {A,+,B} nsw:

//   Step < 0, Start - |Step| * Backedge <= Start
//   Step >= 0, Start + |Step| * Backedge > Start

So, e.g. if Backedge is 1, Start is INT_SMAX - 1 and Step is also 1, then the predicate will be true but on the last iteration the IV will be INT_SMAX and the increment *operation* will overflow (even though the overflowed value will not flow back into the IV's PHI node).

I have some more detail here: https://www.playingwithpointers.com/scev-integer-overflow.html

This revision was automatically updated to reflect the committed changes.
gilr added inline comments.Jul 16 2017, 4:18 AM
llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp
5321

Can we narrow the check as documented above to only check if the phi's SCEV relies on predicates?

5322

The phi node is presumably guaranteed not to overflow also on the last iteration, so should be correct as a live out, right?

llvm/trunk/test/Transforms/LoopVectorize/pr33706.ll
8

It would be good to document the specific problem in vectorizing this loop, which is the live-out value %tmp20. When %arg2 == 1, the trip count is 65536 and %tmp20 coming out of the loop should be 0 (65536 & 65535), but currently LV pre-computes the live-out in the pre-header as 65536.
Right?

Can the test be minimized?

Ayal added a subscriber: Ayal.Jul 18 2017, 12:30 AM

Thanks, Gil!

llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp
5321

I don't think so - IIUC, once we've added the predicate to PSCEV, all further SCEV expressions we get from it can rely the same predicate, and I'm not sure there's a way to query whether a specific one did.

5322

Oh, yes, I think you're right - and this meshes well with what sanjoy told me!
Feel free to refine it.

llvm/trunk/test/Transforms/LoopVectorize/pr33706.ll
8

Right. I thought PR33706 has sufficient details, and pointing to it was fine. I can add the explanation here if you want.

I actually spent a while trying to minimize the test, but it's pretty annoying - bugpoint minimizes to the wrong thing, because we can also vectorize similar things using first-order recurrences.

Ayal added a comment.Jul 27 2017, 12:58 AM

Instead of refraining to vectorize a loop which has an externally used phi (or rather the bump thereof) and any predicate, can a predicate be added (or an existing one be extended) to also cover the last iteration? Pity to bail out on such corner cases.

llvm/trunk/lib/Transforms/Vectorize/LoopVectorize.cpp
5321

Yeah; we could try to note if addInductionPhi() is being called from the first attempt to recognize isInductionPHI() which adds no new predicates, or otherwise from the second attempt which does. But in both cases isInductionPHI() calls PSE.getSCEV(Phi) which may use existing predicates. Probably needs restructuring to first check the result of SE.getSCEV(Phi).

Best clarify the documentation above to match the code.