This is an archive of the discontinued LLVM Phabricator instance.

[X86][TTI] `getReplicationShuffleCost()`: account for deduplication
Changes Planned · Public

Authored by lebedev.ri on Mar 5 2022, 2:43 AM.

Details

Reviewers
RKSimon
Summary

Let's take a look at:
https://godbolt.org/z/4f6bv69hc

Even though it would seem that we need 4 shuffles there,
we only need two: because the replication factor is 2x the vector size,
half of the result vectors are duplicates of the other half,
so they can be materialized via a move instead of another shuffle.

Effectively, this means that there is a hard upper limit
on the replication cost as the replication factor grows.
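
To make the deduplication argument concrete, here is a minimal standalone sketch. It is not the code from this revision and does not use any LLVM API: `buildReplicationMask`, `countDestVectors` and `EltsPerLegalVector` are hypothetical names, and it assumes the replicated result is simply split into consecutive legal-width vectors. It builds the replication mask, splits it, and counts how many of the resulting sub-masks are actually distinct.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical helper: build the replication mask, in which source element
// Elt appears ReplicationFactor times in a row (e.g. 0,0,0,1,1,1 for RF=3, VF=2).
std::vector<int> buildReplicationMask(int ReplicationFactor, int VF) {
  std::vector<int> Mask;
  for (int Elt = 0; Elt < VF; ++Elt)
    for (int Rep = 0; Rep < ReplicationFactor; ++Rep)
      Mask.push_back(Elt);
  return Mask;
}

struct DestVectorCounts {
  int Total;  // destination vectors after splitting into legal-width pieces
  int Unique; // destination vectors whose sub-mask is distinct
};

// Hypothetical helper: split the replication mask into chunks of
// EltsPerLegalVector elements and count total vs. distinct chunks.
// Assumption: the mask length is a multiple of the legal vector width.
DestVectorCounts countDestVectors(int ReplicationFactor, int VF,
                                  int EltsPerLegalVector) {
  assert(ReplicationFactor > 0 && VF > 0 && EltsPerLegalVector > 0);
  const std::vector<int> Mask = buildReplicationMask(ReplicationFactor, VF);
  const std::size_t Width = static_cast<std::size_t>(EltsPerLegalVector);
  assert(Mask.size() % Width == 0 && "sketch only handles exact splits");

  std::set<std::vector<int>> UniqueSubMasks;
  int Total = 0;
  for (std::size_t Begin = 0; Begin < Mask.size(); Begin += Width) {
    UniqueSubMasks.insert(
        std::vector<int>(Mask.begin() + Begin, Mask.begin() + Begin + Width));
    ++Total;
  }
  return {Total, static_cast<int>(UniqueSubMasks.size())};
}
```

For example, with 2 source elements, 4 elements per legal vector and a replication factor of 8, `countDestVectors(8, 2, 4)` reports 4 destination vectors but only 2 distinct ones. Doubling the replication factor to 16 doubles the total to 8 while the distinct count stays at 2, which is the saturation the summary describes.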

Event Timeline

lebedev.ri created this revision. Mar 5 2022, 2:43 AM
Herald added a project: Restricted Project. Mar 5 2022, 2:43 AM
lebedev.ri requested review of this revision. Mar 5 2022, 2:43 AM
lebedev.ri edited the summary of this revision.
lebedev.ri planned changes to this revision. Mar 5 2022, 10:17 AM

Hm, this isn't quite right, but then I'm not sure how often
such large replication factors happen in practice,
so I'm not sure if I should bother.