Add a new helper to check if a wrap predicate is always false. If we can
prove a predicate is always false, avoid vectorizing altogether
instead of creating a dead vector loop.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp | ||
---|---|---|
1921 | \p's should appear under ///, and refer to actual parameter names? (Following offline discussion:) Instead of building a SCEV for double the type size, evaluating both SCEVs at the last iteration, and comparing them to prove wrapping occurred, would it suffice to deduce the first iteration at which a wrap will occur, given a constant step, a constant (or lower bound of) start, and the size of the type? Then compare this iteration with the trip count if constant, or with the VFxUF lower bound if not. This could also allow vectorizing a subset of iterations until the first wrap, followed by a scalar remainder (or strip-mining the loop). Wrapping may be tolerated if it occurs on vector boundaries, considering vector loads, stores, and interleave groups. This requires alignment analysis. Unaligned accesses could tolerate wrapping by vectorizing into gathers or scatters. | |
7630 | nit: unrelated new line. | |
llvm/test/Transforms/LoopVectorize/AArch64/epilog-vectorization-widen-inductions.ll | ||
161 | nit: these changes from OFFSET_IDX to INDEX are unneeded? | |
399 | An i8 IV<0,+,1> will surely wrap across 10,000 iterations. | |
llvm/test/Transforms/LoopVectorize/runtime-check-small-clamped-bounds.ll | ||
8 | Fix comment. Worth also adding tests where wrapping does not occur within VF*UF or constant trip count, and vectorization is not aborted? | |
19 | Must this IV<0,+,1> % 4 wrap for VF=4 and unknown trip-count N? The first vector iteration would still work? | |
llvm/test/Transforms/LoopVectorize/scev-predicate-reasoning.ll | ||
99–100 | IV <30,+,1> wraps (as unsigned?) but exits immediately as soon as it reaches 0, so effectively iterates w/o wrapping? |
\p's should appear under ///, and refer to actual parameter names?
(Following offline discussion:)
The idea is to check whether any WrapPredicate fires across all iterations of the vector loop, using its trip count if known, otherwise using VFxUF as a lower bound on the trip count for reaching the vector loop. It suffices to check once: for the trip count if constant, or else for VFxUF.
Instead of building a SCEV for double the type size, evaluating both SCEVs at the last iteration, and comparing them to prove wrapping occurred, would it suffice to deduce the first iteration at which a wrap will occur, given a constant step, a constant (or lower bound of) start, and the size of the type? Then compare this iteration with the trip count if constant, or with the VFxUF lower bound if not. This could also allow vectorizing a subset of iterations until the first wrap, followed by a scalar remainder (or strip-mining the loop).
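To make the suggestion concrete, here is a rough standalone sketch of the arithmetic, not the patch's code and not LLVM API. All names (`firstUnsignedWrapIteration`, `wrapsWithin`, `NeverWraps`) are hypothetical, and it assumes an unsigned IV with a constant non-negative step in a type of at most 32 bits so that everything fits in `uint64_t`.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical names throughout; none of this is LLVM API.
// Sentinel meaning "the IV never leaves the range of its type".
const uint64_t NeverWraps = std::numeric_limits<uint64_t>::max();

// First 0-based iteration at which the unsigned IV {Start,+,Step}, held in a
// BitWidth-bit type, would need a value >= 2^BitWidth.
// Assumes 1 <= BitWidth <= 32 and Start, Step < 2^BitWidth, so that all
// arithmetic below fits in uint64_t without overflow.
uint64_t firstUnsignedWrapIteration(uint64_t Start, uint64_t Step,
                                    unsigned BitWidth) {
  assert(BitWidth >= 1 && BitWidth <= 32 && "sketch only handles narrow types");
  const uint64_t Limit = uint64_t(1) << BitWidth; // 2^BitWidth
  assert(Start < Limit && Step < Limit && "operands must fit the IV type");
  if (Step == 0)
    return NeverWraps; // The IV is loop-invariant.
  // Smallest i with Start + i * Step >= Limit: ceil((Limit - Start) / Step).
  const uint64_t Distance = Limit - Start;
  return (Distance + Step - 1) / Step;
}

// True if the IV wraps in one of the iterations 0 .. Iterations-1. The caller
// passes the constant trip count when known, else VF * UF as a lower bound.
bool wrapsWithin(uint64_t Start, uint64_t Step, unsigned BitWidth,
                 uint64_t Iterations) {
  return firstUnsignedWrapIteration(Start, Step, BitWidth) < Iterations;
}
```

Under these assumptions the i8 IV<0,+,1> from the test first wraps at iteration 256, so `wrapsWithin(0, 1, 8, 10000)` is true, while a bound of VFxUF = 8 gives false.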
Wrapping may be tolerated if it occurs on vector boundaries, considering vector loads, stores, and interleave groups. This requires alignment analysis. Unaligned accesses could tolerate wrapping by vectorizing into gathers or scatters.
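As a simplified illustration of the vector-boundary point, the check below only looks at iteration numbers, not at memory addresses, so it is no substitute for the alignment analysis mentioned above. The name `wrapSplitsAVector` is hypothetical, and the sketch assumes the vector loop starts at scalar iteration 0 with each wide access covering VF consecutive iterations.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not LLVM API.
// Vector k covers scalar iterations [k * VF, (k + 1) * VF). The wrap is
// harmless at this level if the first wrapping iteration starts a new vector,
// and problematic if some vector holds both pre-wrap and post-wrap lanes.
bool wrapSplitsAVector(uint64_t FirstWrapIteration, uint64_t VF) {
  assert(VF > 0 && "vectorization factor must be positive");
  return FirstWrapIteration % VF != 0;
}
```

For example, with VF = 4 a first wrap at iteration 256 (i8 IV<0,+,1>) falls on a vector boundary, whereas a first wrap at iteration 226 (i8 IV<30,+,1>) would land in the middle of a vector.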