This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Add worst case shuffle costs
ClosedPublic

Authored by dmgreen on Jul 18 2021, 9:51 AM.

Details

Summary

This adds some missing single-source shuffle costs for AArch64, for i16 and i8 vectors. v4i16 is costed the same as v4i32, with a worst-case cost of 3 coming from the perfect shuffle tables. The larger vector sizes expand into a constant pool entry, plus a load (and adrp) and a tbl. I arbitrarily chose 8 for the cost, to be expensive but not too expensive.
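As a rough illustration of the costing described above, a table-driven lookup might look like the following. This is only a sketch: the `VecTy` enum, the `ShuffleCostEntry` struct, and `lookupSingleSrcShuffleCost` are hypothetical names and do not mirror the LLVM sources. The costs of 3 and 8 come from the summary; which exact types count as "the larger vector sizes" (here v8i16 and v16i8) is an assumption.

```cpp
// Illustrative sketch only. All names here are hypothetical and are not
// taken from the LLVM code base.
#include <cstddef>

// A small stand-in for the machine value types mentioned in the summary.
// `Other` represents any type without a table entry.
enum class VecTy { v4i32, v4i16, v8i16, v16i8, Other };

struct ShuffleCostEntry {
  VecTy Ty;
  int Cost;
};

// Worst-case costs for a single-source permute shuffle.
// v4i16 matches v4i32 (3, from the perfect shuffle tables); the larger
// i16/i8 vectors get the arbitrary "expensive but not too expensive" 8,
// standing for a constant pool load (adrp + load) plus a tbl.
// Assumption: v8i16 and v16i8 are the types treated as "larger".
constexpr ShuffleCostEntry SingleSrcShuffleCosts[] = {
    {VecTy::v4i32, 3},
    {VecTy::v4i16, 3},
    {VecTy::v8i16, 8},
    {VecTy::v16i8, 8},
};

// Returns the table cost for Ty, or Fallback when Ty has no entry.
constexpr int lookupSingleSrcShuffleCost(VecTy Ty, int Fallback) {
  constexpr std::size_t N =
      sizeof(SingleSrcShuffleCosts) / sizeof(SingleSrcShuffleCosts[0]);
  for (std::size_t I = 0; I < N; ++I)
    if (SingleSrcShuffleCosts[I].Ty == Ty)
      return SingleSrcShuffleCosts[I].Cost;
  return Fallback;
}
```

A caller would pass whatever generic estimate it already has as `Fallback`, so types without a table entry keep their previous cost.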

Event Timeline

dmgreen created this revision. Jul 18 2021, 9:51 AM
dmgreen requested review of this revision. Jul 18 2021, 9:51 AM
Herald added a project: Restricted Project. Jul 18 2021, 9:51 AM
Matt added a subscriber: Matt. Jul 20 2021, 7:01 AM
david-arm accepted this revision. Jul 21 2021, 3:51 AM

LGTM! The new costs look a lot more sensible. Not for this patch, but I do wonder why the v4i1, etc. costs are so high for reduce-xor.ll compared to reduce-or.ll?

This revision is now accepted and ready to land. Jul 21 2021, 3:51 AM
This revision was landed with ongoing or failed builds. Jul 23 2021, 1:02 AM
This revision was automatically updated to reflect the committed changes.