This is an archive of the discontinued LLVM Phabricator instance.

[LV] Unconditionally branch from middle to scalar preheader if the scalar loop must execute
ClosedPublic

Authored by reames on Jan 17 2021, 8:18 PM.

Details

Summary

If we know that the scalar epilogue is required to run, modify the CFG to end the middle block with an unconditional branch to scalar preheader. This is instead of a conditional branch to either the preheader or the exit block.

The motivation to do this is to support multiple exit blocks. Specifically, the current structure forces us to identify immediate dominators and *which* exit block to branch from in the middle terminator. For the multiple exit case - where we know require scalar will hold - these questions are ill formed.

This is the last change needed to support multiple exit loops, but since the diffs are already large enough, I'm going to land this, and then enable separately. You can think of this as being NFCIish prep work, but the changes are a bit too involved for me to feel comfortable tagging the review that way.

Diff Detail

Event Timeline

reames created this revision.Jan 17 2021, 8:18 PM
reames requested review of this revision.Jan 17 2021, 8:18 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 17 2021, 8:18 PM
reames updated this revision to Diff 317261.Jan 17 2021, 8:25 PM

Update ascii art to reflect change and clarify a confusing bit in the original.

reames updated this revision to Diff 317262.Jan 17 2021, 8:35 PM

rebase over a landed set of tests

fhahn added a comment.Jan 18 2021, 1:06 PM

This makes sense, but the fact that we need to scatter tests throughout various parts of the code seems unfortunate. I'm planning to take a closer look over the next few days to see if I can provide any ideas on how this could be improved.

Florian, have you had a chance to give thought to alternatives? Given it's been two weeks, unless you have an actionable suggestion, I'd like to move forward with the current patch. I'll emphasize that I'm happy to iterate on the design here, either during review, or after submission if you think of something cleaner.

fhahn accepted this revision.Feb 4 2021, 2:24 PM

Florian, have you had a chance to give thought to alternatives? Given it's been two weeks, unless you have an actionable suggestion, I'd like to move forward with the current patch. I'll emphasize that I'm happy to iterate on the design here, either during review, or after submission if you think of something cleaner.

Unfortunately I did not really have time to think much about better alternatives so far (and it's unlikely I'll get time until end of next week). But I agree, it's not a really big deal and it works well enough for now. Let's tweak it post-commit, should a better alternative presents itself. So for now I just have a few small suggestions, mostly wording related.

LGTM, thanks!

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
3153

This applies for all loops that require scalar epilogues, not just ones with multiple exits, right? If so, it might be better word it in terms of requiring scalar epiloge, rather than multi exits. (perhaps something like An edge from the middle block to the exit block is only added if the scalar epilogue may not be executed. Thus only update the immediate dominators if the scalar epilogue is not required.)

3402

nit: set up a conditional branch from the middle block to the scalar loop pre-header?

3410

Do we need this assert? There's a similar assert just a few lines above. If it's not needed, getting rid of the lambda would make things a bit easier to read IMO (perhaps it could be just BranchInst *BrInst = Cost->requiresScalarEpilogue() ? BranchInst::Create(LoopScalarPreHeader) : BranchInst::Create(LoopExitBlock, LoopScalarPreHeader, Builder.getTrue()))

3524

nit: if we require *a* scalar epilogue .... as we unconditionally branch to *the* scalar preheader?

3667

nit: unrelated whitespace change

This revision is now accepted and ready to land.Feb 4 2021, 2:24 PM
Ayal added inline comments.May 24 2021, 7:27 AM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
5919

Not sure if this may help with the reported failures, but if vectorizing a loop having multiple exits is not (yet) intended to work for epilog vectorization, perhaps return false here if OrigLoop has multiple exit blocks.
(Otherwise the CFG of https://llvm.org/docs/Vectorizers.html#epilogue-vectorization may be updated).

Another thought to try and reduce the effect of the patch temporarily, is to check
if (!LoopExitBlock)
instead of
if (!Cost->requiresScalarEpilogue())
leaving the suboptimal but currently working code for single exit loops that require scalar epilogue.

reames added inline comments.Jun 7 2021, 9:31 AM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
5919

Not sure if this may help with the reported failures, but if vectorizing a loop having multiple exits is not (yet) intended to work for epilog vectorization, perhaps return false here if OrigLoop has multiple exit blocks.
(Otherwise the CFG of https://llvm.org/docs/Vectorizers.html#epilogue-vectorization may be updated).

This seems to be a comment which applies to the future patch which enables multiple exit vectorization, not this one. Unless I'm missing something?

Also, I would really expect a flag called *requires* scalar epilogue to override the epilogue vectorization setting, but I haven't stared at the code enough to know if that's really true.