This is an archive of the discontinued LLVM Phabricator instance.

[LV] Don't bail to MiddleBlock if a runtime check fails, bail to ScalarPH instead
ClosedPublic

Authored by jmolloy on Aug 30 2015, 4:19 AM.

Details

Summary

We were bailing to two places if our runtime checks failed. If the initial overflow check failed, we'd go to ScalarPH. If any other check failed, we'd go to MiddleBlock. This caused us to have to have an extra PHI per induction and reduction as the vector loop's exit block was not dominated by its latch.

There's no need to have this behavior - if we just always go to ScalarPH we can get rid of a bunch of complexity.

Diff Detail

Repository
rL LLVM

Event Timeline

jmolloy updated this revision to Diff 33531.Aug 30 2015, 4:19 AM
jmolloy retitled this revision from to [LV] Don't bail to MiddleBlock if a runtime check fails, bail to ScalarPH instead.
jmolloy updated this object.
jmolloy added reviewers: hfinkel, anemet, mzolotukhin.
jmolloy set the repository for this revision to rL LLVM.
jmolloy added a subscriber: llvm-commits.
mzolotukhin edited edge metadata.Aug 30 2015, 2:04 PM

Hi James,

Won't we execute one iteration of scalar loop when trip-count is 0? IIUC, before the change we checked (0%VF==0) and went to MiddleBlock, where we checked it against 0, and then skipped the scalar loop. After the change we check (0%VF==0) and go straight to scalar pre-header. Thus we'll execute one iteration for this loop, which is incorrect. Do I miss something?

Hi Michael,

Won't we execute one iteration of scalar loop when trip-count is 0? IIUC, before the change we checked (0%VF==0) and went to MiddleBlock, where we checked it against 0, and then skipped the scalar loop. After the change we check (0%VF==0) and go straight to scalar pre-header. Thus we'll execute one iteration for this loop, which is incorrect. Do I miss something?

You're right, but I think this behavior is correct. The trip count can never be zero, because the original loop is in LoopSimplify form. That means it has a trip count of at least one (because the loop test is in the latch). Therefore we never need to check if the trip count is 0 before the vector loop - just after the vector loop (because N - (N % VF) could be zero).

James

Hi James,

Yeah, I think you're right. Then this is a nice simplification indeed!

Thanks,
Michael

test/Transforms/LoopVectorize/reduction.ll
179

Why did it disappear?

With the diff reverted, I committed this in r246637.

test/Transforms/LoopVectorize/reduction.ll
179

It shouldn't. Perhaps it needed to in a previous incarnation of this patch, but it certainly doesn't need to now. I've reverted this diff.

anemet accepted this revision.Apr 8 2016, 5:12 PM
anemet edited edge metadata.

MichaelZ accepted this and then it was committed.

This revision is now accepted and ready to land.Apr 8 2016, 5:12 PM
anemet closed this revision.Apr 8 2016, 5:13 PM