This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Only mark cost 1 perfect shuffles as legal
Closed · Public

Authored by dmgreen on Apr 8 2022, 3:39 AM.

Details

Summary

The perfect shuffle tables encode a cost of either 0 (a nop copy) or 1 (a single instruction) as a cost encoding of 0 in the upper 2 bits. However, all perfect shuffles of any cost are currently marked as legal shuffles (the maximum encoded cost is 3), which can confuse the DAG combiner into thinking the shuffles are cheaper than they should be.

Limiting legal shuffles to single instructions seems to do better in most cases, producing fewer instructions for complex shuffles. Some cases now become tbl, which may be better or worse depending on whether the instruction is in a loop and the tbl load can be hoisted out.
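For readers unfamiliar with the table layout, the legality check described in the summary can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the patch: the helper names (`encodedCost`, `isLegalBefore`, `isLegalAfter`) are hypothetical, and the sketch assumes a 32-bit table entry whose upper 2 bits hold the encoded cost, as the summary states.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the patch): read the 2-bit cost field
// that the summary says lives in the upper 2 bits of a table entry.
// Assumes 32-bit entries.
constexpr unsigned encodedCost(uint32_t PFEntry) { return PFEntry >> 30; }

// Behaviour before the change, per the summary: every encoded cost
// (0 through 3, the maximum) was treated as a legal shuffle.
constexpr bool isLegalBefore(uint32_t PFEntry) {
  return encodedCost(PFEntry) <= 3;
}

// Behaviour after the change, per the summary: only an encoded cost of 0,
// which covers a nop copy or a single instruction, is treated as legal.
constexpr bool isLegalAfter(uint32_t PFEntry) {
  return encodedCost(PFEntry) == 0;
}

// Small illustration using made-up entry values.
inline void demoEncodedCosts() {
  const uint32_t Cheap = 0x00001234u;     // upper 2 bits are 00
  const uint32_t Expensive = 0xC0000000u; // upper 2 bits are 11
  assert(isLegalBefore(Cheap) && isLegalAfter(Cheap));
  assert(isLegalBefore(Expensive) && !isLegalAfter(Expensive));
}
```

Under these assumptions, an entry with a non-zero encoded cost is no longer reported as legal, so the DAG combiner stops treating a multi-instruction shuffle as if it were cheap.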

Diff Detail

Event Timeline

dmgreen created this revision. Apr 8 2022, 3:39 AM
Herald added a project: Restricted Project. Apr 8 2022, 3:39 AM
dmgreen requested review of this revision. Apr 8 2022, 3:39 AM
dmgreen updated this revision to Diff 421481. Apr 8 2022, 3:41 AM

Remove NFC change in LowerVECTOR_SHUFFLE.

SjoerdMeijer accepted this revision. Apr 13 2022, 12:52 AM

That's quite a lot of change in the test cases. It's easy to see that the smaller ones are improvements. For the bigger changes that isn't as obvious. But I trust you have run numbers and this is overall better. So LGTM, let's give this a try.

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
11492–11494

This comment is a bit cryptic to me. The description of this ticket is clear though; perhaps you could adopt some of that rationale here.

This revision is now accepted and ready to land. Apr 13 2022, 12:52 AM
This revision was landed with ongoing or failed builds. Apr 19 2022, 4:59 AM
This revision was automatically updated to reflect the committed changes.