This is an archive of the discontinued LLVM Phabricator instance.

[LV] Avoid vectorization if wrap predicates are always false.
Needs Review · Public

Authored by fhahn on Jun 9 2023, 12:27 PM.

Details

Reviewers
Ayal
gilr
rengolin
Summary

Add a new helper to check if a wrap predicate is always false. If we can
prove a predicate is always false, avoid vectorizing altogether
instead of creating a dead vector loop.

Diff Detail

Event Timeline

fhahn created this revision. · Jun 9 2023, 12:27 PM
Herald added a project: Restricted Project. · Jun 9 2023, 12:27 PM
fhahn requested review of this revision. · Jun 9 2023, 12:27 PM
Herald added a project: Restricted Project. · Jun 9 2023, 12:27 PM
Ayal added inline comments. · Jun 14 2023, 1:03 PM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
1921

\p's should appear under ///, and refer to actual parameter names?

(Following offline discussion:)
The idea is to check whether any WrapPredicate fires across all iterations of the vector loop, using its trip count if known, and otherwise using VFxUF as a lower bound on the trip count needed to reach the vector loop. It suffices to check once: for the trip count if it is constant, or else for VFxUF.

Instead of building a SCEV for double the type size, evaluating both SCEVs at the last iteration, and comparing them to prove that wrapping occurred, would it suffice to deduce the first iteration at which the wrap occurs, given a constant step, a constant (or a lower bound of) start, and the size of the type? That iteration could then be compared with the trip count if it is constant, or with the VFxUF lower bound if not. This could also allow vectorizing a subset of iterations until the first wrap, followed by a scalar remainder (or strip-mining the loop).
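
The first-wrap computation suggested here can be sketched in a few lines. This is a minimal illustration, not code from the patch: `firstUnsignedWrapIteration` and `wrapsWithin` are hypothetical names, the IV is assumed to be unsigned with a positive constant step, and the bit width is capped at 32 so that the arithmetic fits in `uint64_t`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an LLVM API): for an unsigned IV {Start,+,Step}
// that is BitWidth bits wide, return the first 0-based iteration index I
// for which Start + I*Step no longer fits in BitWidth bits.
uint64_t firstUnsignedWrapIteration(uint64_t Start, uint64_t Step,
                                    unsigned BitWidth) {
  assert(BitWidth >= 1 && BitWidth <= 32 && "sketch limited to narrow types");
  assert(Step > 0 && "sketch assumes a positive constant step");
  const uint64_t Range = uint64_t(1) << BitWidth; // 2^BitWidth
  assert(Start < Range && "start must be representable");
  const uint64_t Headroom = Range - Start; // distance from Start to 2^BitWidth
  return (Headroom + Step - 1) / Step;     // ceil(Headroom / Step)
}

// Iterations is either the constant trip count or, if the trip count is
// unknown, VF*UF as a lower bound for reaching the vector loop.
// Returns true if the IV wraps within iterations [0, Iterations).
bool wrapsWithin(uint64_t Start, uint64_t Step, unsigned BitWidth,
                 uint64_t Iterations) {
  return firstUnsignedWrapIteration(Start, Step, BitWidth) < Iterations;
}
```

Under these assumptions, an i8 IV<0,+,1> first wraps at iteration 256, so a constant trip count of 10,000 is flagged, while a VFxUF bound of 16 is not.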

Wrapping may be tolerated if it occurs on vector boundaries, considering vector loads, stores, and interleave groups. This requires alignment analysis. Unaligned accesses could tolerate wrapping by vectorizing into gathers or scatters.
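
In iteration space, the vector-boundary condition can be approximated as follows. This is a simplified sketch under stated assumptions: `wrapStraddlesVector` is a hypothetical name, vector lanes are assumed to map to consecutive iterations starting at iteration 0, and the alignment analysis of the accessed addresses mentioned above is not modeled.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an LLVM API). WrapIter is the first iteration
// whose IV value wraps. A vector of VF consecutive lanes covers iterations
// [K*VF, (K+1)*VF) for some K, so the wrap falls strictly inside a vector
// exactly when WrapIter is not a multiple of VF.
bool wrapStraddlesVector(uint64_t WrapIter, uint64_t VF) {
  assert(VF > 0 && "vectorization factor must be positive");
  return WrapIter % VF != 0;
}
```

When this returns false, every vector access lies entirely before or entirely after the wrap, which is the case the comment describes as tolerable.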

7630

nit: unrelated new line.

llvm/test/Transforms/LoopVectorize/AArch64/epilog-vectorization-widen-inductions.ll
161

nit: these changes from OFFSET_IDX to INDEX are unneeded?

399

An i8 IV<0,+,1> will surely wrap across 10,000 iterations.
But this seems like an infinite loop: how can %iv.next.ext ever be equal to 10,000?
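
The range argument behind this question can be checked exhaustively. The sketch below assumes, since the test IR is not shown here, that %iv.next.ext is a zero- or sign-extension of an i8 value; `extendedI8CanEqual` is a hypothetical name used only for illustration.

```cpp
#include <cstdint>

// Hypothetical helper: returns true if some 8-bit value, either zero- or
// sign-extended to 64 bits, equals Bound.
bool extendedI8CanEqual(int64_t Bound) {
  for (int V = 0; V < 256; ++V) {
    const uint8_t Narrow = static_cast<uint8_t>(V);
    const int64_t ZExt = Narrow;                      // range [0, 255]
    const int64_t SExt = static_cast<int8_t>(Narrow); // range [-128, 127]
    if (ZExt == Bound || SExt == Bound)
      return true;
  }
  return false;
}
```

Under that assumption no extended i8 value can reach 10,000, so an equality exit test against 10,000 would never be satisfied.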

llvm/test/Transforms/LoopVectorize/runtime-check-small-clamped-bounds.ll
8

Fix comment.

Worth also adding tests where wrapping does not occur within VF*UF or constant trip count, and vectorization is not aborted?

19

Must this IV<0,+,1> % 4 wrap for VF=4 and unknown trip-count N? The first vector iteration would still work?

llvm/test/Transforms/LoopVectorize/scev-predicate-reasoning.ll
99–100

IV <30,+,1> wraps (as unsigned?) but exits as soon as it reaches 0, so it effectively iterates w/o wrapping?
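
A small simulation illustrates the point. The IV width is not stated in this comment, so the sketch assumes an 8-bit unsigned IV purely for illustration; `simulateCountUpToZero` and `LoopTrace` are hypothetical names, and the loop shape (body, increment, exit when the IV equals 0) is an assumption about the test.

```cpp
#include <cstdint>

struct LoopTrace {
  uint64_t TripCount;       // number of times the body ran
  bool BodySawWrappedValue; // did the body observe an IV below Start?
};

// Hypothetical model: an 8-bit IV starts at Start, is incremented by 1 each
// iteration, and the loop exits as soon as the incremented value is 0.
LoopTrace simulateCountUpToZero(uint8_t Start) {
  LoopTrace T{0, false};
  uint8_t IV = Start;
  do {
    if (IV < Start) // a value below Start can only appear after a wrap
      T.BodySawWrappedValue = true;
    ++T.TripCount;
    ++IV; // wraps from 255 to 0 on the final increment
  } while (IV != 0);
  return T;
}
```

With Start = 30, the body runs for IV values 30 through 255 and the only wrapping increment is the one that triggers the exit, so no iteration of the body observes a wrapped value.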