This is an archive of the discontinued LLVM Phabricator instance.

[X86] `X86TTIImpl::getInterleavedMemoryOpCost()`: scale interleaving cost by the fraction of live members
ClosedPublic

Authored by lebedev.ri on Oct 22 2021, 4:33 AM.

Details

Summary

By definition, interleaving load of stride N means:
load N*VF elements, and shuffle them into N VF-sized vectors,
with 0'th vector containing elements [0, VF)*stride + 0,
and 1'th vector containing elements [0, VF)*stride + 1.
Example: https://godbolt.org/z/df561Me5E (i64 stride 4 vf 2 => cost 6)

Now, not fully interleaved load, is when not all of these vectors is demanded.
So at worst, we could just pretend that everything is demanded,
and discard the non-demanded vectors. What this means is that the cost
for not-fully-interleaved group should be not greater than the cost
for the same fully-interleaved group, but perhaps somewhat less.
Examples:
https://godbolt.org/z/a78dK5Geq (i64 stride 4 (indices 012u) vf 2 => cost 4)
https://godbolt.org/z/G91ceo8dM (i64 stride 4 (indices 01uu) vf 2 => cost 2)
https://godbolt.org/z/5joYob9rx (i64 stride 4 (indices 0uuu) vf 2 => cost 1)

Right now, for such not-fully-interleaved loads we just use the costs
for fully-interleaved loads. But at least in general,
that is obviously overly pessimistic, because in general,
not all the shuffles needed to perform the full interleaving
will end up being live.

So what i propose, is to naively scale the interleaving cost
by the fraction of the live members. I believe this should still result
in the right ballpark cost estimate.

Diff Detail

Event Timeline

lebedev.ri created this revision.Oct 22 2021, 4:33 AM
lebedev.ri requested review of this revision.Oct 22 2021, 4:33 AM

Rebased, NFC.

RKSimon accepted this revision.Oct 22 2021, 5:57 AM

SGTM - cheers

llvm/lib/Target/X86/X86TargetTransformInfo.cpp
5428–5434

Add a comment explaining this is just an approximation

This revision is now accepted and ready to land.Oct 22 2021, 5:57 AM

SGTM - cheers

To check - was it clear from my explanation that this is a *rough* approximation,
that can deviate from reality in either direction, i.e. it may both be
higher cost than in reality, and for some cases it might be lower cost than in reality?

SGTM - cheers

To check - was it clear from my explanation that this is a *rough* approximation,
that can deviate from reality in either direction, i.e. it may both be
higher cost than in reality, and for some cases it might be lower cost than in reality?

Yes - which is why I asked you to include that as a comment to GetDiscountedCost. Cheers.

lebedev.ri marked an inline comment as done.

SGTM - cheers

To check - was it clear from my explanation that this is a *rough* approximation,
that can deviate from reality in either direction, i.e. it may both be
higher cost than in reality, and for some cases it might be lower cost than in reality?

Yes - which is why I asked you to include that as a comment to GetDiscountedCost. Cheers.

Awesome, thank you for the review!

This revision was landed with ongoing or failed builds.Oct 22 2021, 6:34 AM
This revision was automatically updated to reflect the committed changes.