Page MenuHomePhabricator

[LAA] Relax restrictions on early exits in loop structure
ClosedPublic

Authored by reames on Nov 24 2020, 6:23 PM.

Details

Summary

This is a preparation patch for supporting multiple exits in the loop vectorizer, by itself it should be purely NFC. This patch moves the loop structure checks from LAA to their respective consumers (where duplicates don't already exist).

Why do this? Well, after auditing the code, I can't actually find anything in LAA itself which relies on having all instructions within a loop execute an equal number of times. This patch simply makes this explicit so that if one consumer - say LV in the near future (hopefully) - wants to handle a broader class of loops, it can do so.

Diff Detail

Event Timeline

reames created this revision.Nov 24 2020, 6:23 PM
Herald added a project: Restricted Project. · View Herald TranscriptNov 24 2020, 6:23 PM
reames requested review of this revision.Nov 24 2020, 6:23 PM
Ayal accepted this revision.Dec 12 2020, 3:23 PM

This looks good to me, adding some minor comments. Thanks!

I can't actually find anything in LAA itself which relies on having all instructions within a loop execute an equal number of times.

Well, LAA's RuntimePointerChecking::insert() relies on getBackedgeTakenCount() to provide "the" trip-count, which may vary for distinct instructions if the loop is not bottom-tested. But it does provide an upper-bound for all instructions, i.e., a safe, conservative value. In other words, a slightly tighter runtime check could potentially be provided for pointers accessed one less iteration, appearing 'below' the exiting block.

(where duplicates don't already exist)

LV indeed checks for single-exit-at-bottom by itself, in LVL.canVectorize*(). Curious to see this restriction being lifted! It is somewhat related to "massaging" inner loops when vectorizing an outer loop.

llvm/lib/Analysis/LoopAccessAnalysis.cpp
1789

Note: these DEBUG messages get lost.

llvm/lib/Transforms/Scalar/LoopDistribute.cpp
674

This existing !getExitBlock() check indeed saves us from introducing an additional !getExitingBlock() check; worth a comment, here and/or in LoopInfo?
Note that L may have a single exit block B and multiple exiting blocks - all jumping to B. But getExitBlock() returns false for such an L, in contrast to getUniqueExitBlock() which returns true. I.e., getExitBlock() does imply getExitingBlock().

680

May be worth noting in the patch summary that some ORE and LAA debug messages are added and dropped, respectively.

683–684

Hoist or remove above comment?

llvm/lib/Transforms/Scalar/LoopLoadElimination.cpp
635

Does this (TODO?) comment imply that in present/future this limitation can/should be dropped?

This revision is now accepted and ready to land.Dec 12 2020, 3:23 PM
fhahn accepted this revision.Dec 13 2020, 7:33 AM

LGTM, thanks! It looks like there are some formatting issues, please re-format before landing.

I can't actually find anything in LAA itself which relies on having all instructions within a loop execute an equal number of times.

Well, LAA's RuntimePointerChecking::insert() relies on getBackedgeTakenCount() to provide "the" trip-count, which may vary for distinct instructions if the loop is not bottom-tested. But it does provide an upper-bound for all instructions, i.e., a safe, conservative value. In other words, a slightly tighter runtime check could potentially be provided for pointers accessed one less iteration, appearing 'below' the exiting block.

That matches my understanding as well. The current approach should be conservatively correct, with potential for making LAA smarter when it comes to reasoning about dependences only along certain exit paths in the future.

reames added inline comments.Dec 14 2020, 12:29 PM
llvm/lib/Transforms/Scalar/LoopLoadElimination.cpp
635

I was thinking of it more as explaining why a random check exists. Someone interested could certainly relax this, I have no intention to pursue.

This revision was landed with ongoing or failed builds.Dec 14 2020, 12:44 PM
This revision was automatically updated to reflect the committed changes.