This is an archive of the discontinued LLVM Phabricator instance.

[BasicTTI] getInterleavedMemoryOpCost(): discount unused members of mask if mask for gap will be used
ClosedPublic

Authored by lebedev.ri on Oct 30 2021, 4:08 PM.

Details

Summary

As can be seen in InnerLoopVectorizer::vectorizeInterleaveGroup(),
in some cases (reported by UseMaskForGaps) the gaps in the interleaved load/store group
will be masked away by another constant mask, so there is no need to
account for the cost of replicating the mask for those unused members.
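
To illustrate the discount, here is a minimal standalone sketch. It is not the code in BasicTTIImpl.h: the helpers demandedLoadStoreElts and maskInsertCost are hypothetical names, and the cost model is a toy that charges one unit per mask lane inserted into the wide mask. The only layout fact it relies on is that member Index of an interleave group with a given Factor occupies lanes Index + Elt * Factor of the wide vector.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not an LLVM API): marks the lanes of the wide
// (NumSubElts * Factor)-element vector that belong to a group member
// listed in Indices. Lanes of missing members (the gaps) stay false.
std::vector<bool> demandedLoadStoreElts(unsigned Factor, unsigned NumSubElts,
                                        const std::vector<unsigned> &Indices) {
  std::vector<bool> Demanded(static_cast<std::size_t>(Factor) * NumSubElts,
                             false);
  for (unsigned Index : Indices) {
    assert(Index < Factor && "member index out of range");
    for (unsigned Elt = 0; Elt < NumSubElts; ++Elt)
      Demanded[Index + Elt * Factor] = true;
  }
  return Demanded;
}

// Toy cost (assumption: one unit per inserted mask lane). Without a gap
// mask every lane of the replicated mask is paid for; with a gap mask
// only the lanes of present members are.
unsigned maskInsertCost(unsigned Factor, unsigned NumSubElts,
                        const std::vector<unsigned> &Indices,
                        bool UseMaskForGaps) {
  if (!UseMaskForGaps)
    return Factor * NumSubElts;
  unsigned Cost = 0;
  for (bool IsDemanded : demandedLoadStoreElts(Factor, NumSubElts, Indices))
    Cost += IsDemanded ? 1 : 0;
  return Cost;
}
```

For a group with Factor 3, 4 sub-elements, and members {0, 2} present, this toy model charges 12 insertions without the gap mask and 8 with it, since the 4 lanes of the missing member 1 are no longer counted.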

Event Timeline

lebedev.ri requested review of this revision. Oct 30 2021, 4:08 PM
lebedev.ri created this revision.

LGTM (by inspection) - I tend to prefer to use the getScalarizationOverhead wrapper when we're using all the elements of the type, but I think it's better for consistency here to always specify the demanded elts.

Ideally we'd have at least some test coverage for this, but I understand mask gaps codegen can be tricky.

llvm/include/llvm/CodeGen/BasicTTIImpl.h, line 1241

Worth pulling these out?

const APInt DemandedAllSubElts = APInt::getAllOnes(NumSubElts);
const APInt DemandedAllResultElts = APInt::getAllOnes(NumElts);

lebedev.ri updated this revision to Diff 384420. Nov 3 2021, 6:34 AM
lebedev.ri marked an inline comment as done.

@RKSimon thank you for the review!
Applied nit suggestion.

Yeah, it would indeed be really great to have test coverage for this.

lebedev.ri updated this revision to Diff 384428. Nov 3 2021, 7:32 AM

And we have a winner!
Test coverage added, will land immediately.

This revision was not accepted when it landed; it landed in state Needs Review. Nov 3 2021, 7:33 AM
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.