The vectorizer currently does not attempt to create interleave-groups that contain predicated loads/stores; predicated strided accesses can currently be vectorized only using masked gather/scatter or scalarization. This patch makes predicated loads/stores candidates for forming interleave-groups during the Loop-Vectorizer's analysis, and adds support for masked-interleave-groups to the Loop-Vectorizer's planning and transformation stages. The patch also extends the TTI API to allow querying the cost of masked interleave groups (which each target can control). Targets that support masked vector loads/stores may choose to enable this feature and allow vectorizing predicated strided loads/stores using masked wide loads/stores and shuffles.
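To illustrate the idea of "masked wide loads/stores and shuffles", here is a minimal scalar emulation in plain C++. It is only a sketch and not the patch's code: the names replicateMask and maskedStridedLoad, and the PassThru parameter, are hypothetical and chosen for illustration. It assumes an interleave group whose members are laid out contiguously (for example, pairs of fields with stride 2) and a per-iteration block mask that is widened so each iteration's bit guards all members of its tuple.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: replicate each lane of the per-iteration mask
// Factor times, e.g. {m0, m1} with Factor = 2 becomes {m0, m0, m1, m1}.
std::vector<bool> replicateMask(const std::vector<bool> &BlockMask,
                                unsigned Factor) {
  std::vector<bool> Wide;
  Wide.reserve(BlockMask.size() * Factor);
  for (bool M : BlockMask)
    for (unsigned I = 0; I < Factor; ++I)
      Wide.push_back(M);
  return Wide;
}

// Hypothetical emulation of one masked wide load followed by a
// de-interleaving shuffle. Member selects which field of each tuple to
// extract; masked-off lanes are never read and yield PassThru instead.
std::vector<int> maskedStridedLoad(const int *Base,
                                   const std::vector<bool> &BlockMask,
                                   unsigned Factor, unsigned Member,
                                   int PassThru) {
  std::vector<bool> WideMask = replicateMask(BlockMask, Factor);

  // "Masked wide load": touch memory only where the wide mask is set.
  std::vector<int> WideVec(WideMask.size(), PassThru);
  for (std::size_t I = 0; I < WideMask.size(); ++I)
    if (WideMask[I])
      WideVec[I] = Base[I];

  // "Shuffle": pick every Factor-th element, starting at Member.
  std::vector<int> Result(BlockMask.size());
  for (std::size_t Lane = 0; Lane < BlockMask.size(); ++Lane)
    Result[Lane] = WideVec[Lane * Factor + Member];
  return Result;
}
```

With data {10, 11, 20, 21, 30, 31, 40, 41}, a block mask of {true, false, true, true}, and Factor = 2, extracting member 0 reads only the unmasked tuples and leaves PassThru in lane 1. A gather would instead issue one masked element access per lane, which is the cost the interleave-group approach tries to avoid.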
|131 ↗||(On Diff #168749)|
"each of the elements in a vector of \p VF elements" >> "each element in a vector of \p VF elements", or "each of the \p VF elements in a vector".
Indicate “Interleaving” in the name of the method, given the InterleaveFactor argument?
|822 ↗||(On Diff #168749)|
Below might look simpler using BlockA = A->getParent(); BlockB = B->getParent();
|2598 ↗||(On Diff #168749)|
(Unrelated clang-format change)
|416 ↗||(On Diff #168749)|
" optionally" >> "optionally"
|1985 ↗||(On Diff #168749)|
May be slightly better to check (!BlockInMask || !Group->isReverse()); or fold under the "if (IsMaskRequired)" below.
|2041 ↗||(On Diff #168749)|
Above may look simpler by first setting
auto *Undefs = UndefValue::get(Mask->getType());
auto *TransposedMask = createTransposedMask(Builder, InterleaveFactor, VF);
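For readers unfamiliar with the shuffle involved, the following standalone C++ sketch shows the index pattern that widens a VF-element block mask to InterleaveFactor * VF elements. It is an assumption about what a helper such as createTransposedMask computes, not a copy of it, and the function name replicatedMaskIndices is hypothetical. It uses plain std::vector rather than any LLVM API.

```cpp
#include <vector>

// Hypothetical sketch: build shuffle indices that repeat each of the VF
// source lanes Factor times in a row, i.e. <0,0,...,1,1,...,VF-1,...>.
// Feeding these indices to a shuffle of the block mask yields a mask in
// which every member of an interleaved tuple shares its iteration's bit.
std::vector<int> replicatedMaskIndices(unsigned Factor, unsigned VF) {
  std::vector<int> Indices;
  Indices.reserve(static_cast<std::size_t>(Factor) * VF);
  for (unsigned Lane = 0; Lane < VF; ++Lane)
    for (unsigned Copy = 0; Copy < Factor; ++Copy)
      Indices.push_back(static_cast<int>(Lane));
  return Indices;
}
```

Because the indices depend only on the interleave factor and VF, they can be computed once and reused across unroll parts.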
|2043 ↗||(On Diff #168749)|
"wide.masked.load" >> "wide.masked.vec", to follow "wide.vec" more closely?
|2120 ↗||(On Diff #168749)|
See above simplification comment.
|4304 ↗||(On Diff #168749)|
Better to check only isLegalMaskedLoad or isLegalMaskedStore, depending on whether the access is a load or a store. The purpose of optimizing interleaved accesses is to improve upon the use of Gather/Scatter.
|7108 ↗||(On Diff #168749)|
Simpler to fold into
|2038 ↗||(On Diff #168784)|
One could set Undefs and RepMask once outside the "for Part" loop, using Mask->getType(), and also reuse them for interleaved store group. But doing so conditional on IsMaskRequired is a bit less appealing.
|2042 ↗||(On Diff #168784)|
Use UndefVec instead of UndefValue::get(VecTy)