As can be seen in InnerLoopVectorizer::vectorizeInterleaveGroup(),
in some cases (indicated by UseMaskForGaps) the gaps in an interleaved load/store group
are masked away by another constant mask, so there is no need to
account for the cost of replicating the mask for those gaps.
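
To illustrate the idea, here is a minimal standalone sketch of how the demanded lanes of the replicated mask could be derived from the group members that are actually present. It is not the code from this patch: `EltMask`, `demandedResultElts`, and `replicatedMaskLanesToCost` are hypothetical names, `std::vector<bool>` stands in for `APInt`, and the lane layout (member `Index` occupying lanes `Index + Elt * Factor`) is an assumption of the sketch.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a demanded-elements bitmask (not llvm::APInt).
using EltMask = std::vector<bool>;

// Mark the lanes of the wide interleaved vector that belong to the group
// members listed in Indices. Lanes of missing members (the gaps) stay false.
// Assumed layout: member Index occupies lanes Index + Elt * Factor.
EltMask demandedResultElts(unsigned Factor, unsigned NumSubElts,
                           const std::vector<unsigned> &Indices) {
  EltMask Demanded(Factor * NumSubElts, false);
  for (unsigned Index : Indices) {
    assert(Index < Factor && "member index must be below the interleave factor");
    for (unsigned Elt = 0; Elt < NumSubElts; ++Elt)
      Demanded[Index + Elt * Factor] = true;
  }
  return Demanded;
}

// Number of lanes of the replicated mask that a cost model would need to
// account for. Without a separate gap mask every lane is demanded; with one,
// only the lanes of the members that are present are demanded.
unsigned replicatedMaskLanesToCost(unsigned Factor, unsigned NumSubElts,
                                   const std::vector<unsigned> &Indices,
                                   bool UseMaskForGaps) {
  if (!UseMaskForGaps)
    return Factor * NumSubElts;
  unsigned Count = 0;
  for (bool IsDemanded : demandedResultElts(Factor, NumSubElts, Indices))
    Count += IsDemanded ? 1u : 0u;
  return Count;
}
```

For a group with factor 3, four elements per member, and only members 0 and 2 present, this sketch counts 8 of the 12 lanes when `UseMaskForGaps` is set and all 12 otherwise.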
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
LGTM (by inspection) - I tend to prefer to use the getScalarizationOverhead wrapper when we're using all the elements of the type, but I think it's better for consistency here to always specify the demanded elts.
Ideally we'd have at least some test coverage for this, but I understand mask gaps codegen can be tricky.
Inline comment on llvm/include/llvm/CodeGen/BasicTTIImpl.h, lines 1241–1244:
Worth pulling these out? const APInt DemandedAllSubElts = APInt::getAllOnes(NumSubElts);
@RKSimon thank you for the review!
Applied nit suggestion.
Yeah, it would indeed be really great to have test coverage for this.
Worth pulling these out?
const APInt DemandedAllSubElts = APInt::getAllOnes(NumSubElts);
const APInt DemandedAllResultElts = APInt::getAllOnes(NumElts);