Under Opt for Size, the vectorizer does not vectorize interleave-groups that have gaps at the end of the group (such as a loop that reads only the even elements: a[2*i]), because such groups would require a scalar epilogue, which is not allowed under Opt for Size. This patch extends the support for masked-interleave-groups (introduced by D53011 for conditional accesses) to also cover the case of gaps in a group of loads. Targets that enable the masked-interleave-group feature no longer have to invalidate interleave-groups of loads with gaps; they can now use masked wide-loads and shuffles (if that's what the cost model selects).
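To make the idea of a gap mask concrete, here is a minimal standalone sketch of what such a mask could look like for a wide load. It does not use LLVM APIs and is not the patch's implementation; buildGapMask and memberPresent are hypothetical names chosen for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not the patch's code): builds a per-lane mask for a
// wide load that covers `vf` consecutive iterations of an interleave group.
// memberPresent[j] is true if slot j of the group is actually accessed; a
// false entry marks a gap whose lanes must be masked off.
std::vector<bool> buildGapMask(std::size_t vf,
                               const std::vector<bool>& memberPresent) {
    const std::size_t factor = memberPresent.size();
    std::vector<bool> mask;
    mask.reserve(vf * factor);
    for (std::size_t i = 0; i < vf; ++i)          // one group per iteration
        for (std::size_t j = 0; j < factor; ++j)  // one lane per group slot
            mask.push_back(memberPresent[j]);
    return mask;
}
```

For the a[2*i] example with vf = 4, the group has two slots of which only the first is read, so the mask alternates true/false across eight lanes. The last lane is masked off, which is why the wide load does not touch memory past the final accessed element and no scalar epilogue is needed for that reason.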
Looking at tests next.
Maybe rename IsMasked to UseMaskForCond or IsConditional, to distinguish between the two Is/Use masks.
"masks away gaps" >> "filters the members"
createBitMaskForGaps or createBinaryMaskForGaps ?
Check if !EnabledMaskedInterleave before calling invalidateGroups...()?
Could first "peel" to build a mask for all members, then replicate it VF-1 times. Not sure it's any better.
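As a rough illustration of the "peel, then replicate" alternative, the sketch below builds the mask for a single group first and then appends VF-1 further copies. It is a standalone approximation, not LLVM code; replicateGroupMask and groupMask are hypothetical names, and vf is assumed to be at least 1.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sketch (not the patch's code): groupMask holds one entry per
// slot of a single interleave group (false marks a gap). The result repeats
// that pattern vf times. Assumes vf >= 1.
std::vector<bool> replicateGroupMask(const std::vector<bool>& groupMask,
                                     std::size_t vf) {
    std::vector<bool> wide(groupMask);  // the "peeled" first copy
    wide.reserve(groupMask.size() * vf);
    for (std::size_t i = 1; i < vf; ++i)  // append the remaining vf-1 copies
        wide.insert(wide.end(), groupMask.begin(), groupMask.end());
    return wide;
}
```

The resulting mask is the same as one built lane by lane in a nested loop, so the choice between the two forms is mostly a matter of readability.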
More logical to reverse the condition? Admittedly this is only being moved here.
Also added a test with stride 3.
Thanks, one was indeed needed.
LGTM, with a few minor, optional suggestions.
The "if there is no other means..." is now part of the "This can happen when". I.e.,
This scaling works just as well for gap-masked loads, right?
Comment here that UseMaskForGaps alone does not add to Cost, because its mask is uniform, unlike below, where it adds the cost of And-ing the two masks.
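The cost distinction can be pictured with a small standalone sketch, reading "uniform" as loop-invariant: a gap mask on its own is a fixed pattern, whereas a conditional access also carries a per-iteration mask, and the two then have to be combined lane by lane. The function combineMasks below is a hypothetical name and not part of the patch or of LLVM.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sketch (not LLVM code): when both a condition mask and a gap
// mask apply to the same wide load, a lane is loaded only if both allow it.
// This lane-wise AND is the extra operation that the cost comment refers to.
// Assumes both masks have the same number of lanes.
std::vector<bool> combineMasks(const std::vector<bool>& condMask,
                               const std::vector<bool>& gapMask) {
    std::vector<bool> combined(condMask.size());
    for (std::size_t i = 0; i < condMask.size(); ++i)
        combined[i] = condMask[i] && gapMask[i];
    return combined;
}
```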
"or masking ..." >> "and cannot be masked (not enabled)."
"Under optsize" and when the trip count is very small "we don't ..."
Rename IsMaskRequired to [Is]MaskForCondRequired?
Rename ShuffledMask to MaskForCond?
"costModel" >> "cost model", or "CostModel"
It's indeed good to record in CostModel the constraint forbidding a scalar epilogue, instead of passing an overloaded and abused OptForSize parameter around. To be consistent, it should be recorded at the outset, say in the constructor of CostModel, and isScalarEpilogueAllowed() should be used inside the CostModel instead of OptForSize throughout. This NFC refactoring can be done separately, before/after this patch.
Add checks for this scalarizing when enable-masked-interleaved-access is disabled, or comment that it is already checked above?