[X86][Costmodel] Now that `getReplicationShuffleCost()` is good, update `getInterleavedMemoryOpCostAVX512()`
Closed, Public

Authored by lebedev.ri on Nov 20 2021, 4:22 AM.

Details

Summary

... to actually ask about i1-elt-wide mask, since that is what will probably be used on AVX512.
This unblocks D111460.
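
For illustration only, here is a minimal standalone sketch of what replicating an i1 mask means, assuming the replication shuffle repeats each per-lane mask bit once per member of the interleave group. `replicateMask` and its parameters are hypothetical names for this sketch, not LLVM API, and the sketch says nothing about actual costs.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of LLVM): models replicating an i1 mask.
// Each input bit is repeated `factor` times, in order, so a mask over VF
// lanes becomes a mask over VF * factor interleaved elements.
std::vector<bool> replicateMask(const std::vector<bool> &mask,
                                std::size_t factor) {
  std::vector<bool> out;
  out.reserve(mask.size() * factor);
  for (bool bit : mask)
    for (std::size_t j = 0; j < factor; ++j)
      out.push_back(bit);
  return out;
}
```

With a two-lane mask `{1, 0}` and a factor of 3, the result is `{1, 1, 1, 0, 0, 0}`: every element of a group is enabled or disabled together.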

Event Timeline

lebedev.ri created this revision. Nov 20 2021, 4:22 AM

Rebased, NFC.
As far as I currently know, this is the last prerequisite for D114316.

RKSimon accepted this revision. Nov 29 2021, 2:49 AM

LGTM

This revision is now accepted and ready to land. Nov 29 2021, 2:49 AM

LGTM

Thank you for the review!

lebedev.ri added inline comments. Nov 29 2021, 4:04 AM
llvm/lib/Target/X86/X86TargetTransformInfo.cpp
5351–5354

Hmm, but we can't fold masked load into shuffle, can we?
The mask on the shuffle is for the output, not the input.
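
To make the output-versus-input distinction concrete, here is a toy scalar model, not the real AVX512 semantics and not code from this patch. It assumes zero-masking and a fixed 4-lane vector; all names (`maskedLoad`, `shuffle`, `maskedShuffle`) are hypothetical.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t N = 4;
using Vec = std::array<int, N>;
using Mask = std::array<bool, N>;
using Idx = std::array<std::size_t, N>;

// Toy masked load: the mask selects which *input* (memory) lanes are read;
// disabled lanes produce 0.
Vec maskedLoad(const Vec &mem, const Mask &m) {
  Vec r{};
  for (std::size_t i = 0; i < N; ++i)
    r[i] = m[i] ? mem[i] : 0;
  return r;
}

// Toy unmasked shuffle: output lane i takes source lane idx[i].
Vec shuffle(const Vec &src, const Idx &idx) {
  Vec r{};
  for (std::size_t i = 0; i < N; ++i)
    r[i] = src[idx[i]];
  return r;
}

// Toy zero-masked shuffle: the mask applies to *output* lane i,
// regardless of which source lane idx[i] it reads from.
Vec maskedShuffle(const Vec &src, const Idx &idx, const Mask &m) {
  Vec r{};
  for (std::size_t i = 0; i < N; ++i)
    r[i] = m[i] ? src[idx[i]] : 0;
  return r;
}
```

In this model, with memory `{10, 20, 30, 40}`, mask `{1, 0, 1, 1}` and a reversing index `{3, 2, 1, 0}`, a masked load followed by a shuffle gives `{40, 30, 0, 10}`, while a masked shuffle of the unmasked data gives `{40, 0, 20, 10}`. The two differ whenever the shuffle moves lanes, which is the concern raised in the comment above.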