This is an archive of the discontinued LLVM Phabricator instance.

[X86] Enable fast variable per-lane shuffle tuning on all Ryzen targets
Closed · Public

Authored by RKSimon on Apr 7 2022, 6:17 AM.

Details

Summary

rGa3b8695bf592 enabled this for znver3, but the AMD SoG, Agner and uops.info all agree that even znver1 has a fast variable per-lane shuffle op (VPSHUFB), whereas cross-lane shuffles (PERMPS etc.) seem to be slow.

Fixes PR44795
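
For readers unfamiliar with the per-lane / cross-lane distinction the summary relies on, here is a minimal scalar sketch in C++. It models only the data movement of a 256-bit VPSHUFB (each 128-bit lane shuffled independently) and of a PERMPS-style permute (any element may come from either lane). The function names `per_lane_byte_shuffle` and `cross_lane_dword_permute` are hypothetical and are not part of LLVM or of this patch; the code says nothing about instruction latency or throughput.

```cpp
#include <array>
#include <cstdint>

// Scalar model of 256-bit VPSHUFB: each 128-bit lane is shuffled on its own,
// so an output byte can only be taken from the same lane of the source.
// (Hypothetical helper for illustration only.)
std::array<uint8_t, 32> per_lane_byte_shuffle(const std::array<uint8_t, 32>& src,
                                              const std::array<uint8_t, 32>& ctrl) {
    std::array<uint8_t, 32> dst{};
    for (int i = 0; i < 32; ++i) {
        const int lane_base = (i / 16) * 16;  // 0 for the low lane, 16 for the high lane
        if (ctrl[i] & 0x80) {
            dst[i] = 0;                       // control high bit set: zero the byte
        } else {
            dst[i] = src[lane_base + (ctrl[i] & 0x0F)];  // low 4 bits index within the lane
        }
    }
    return dst;
}

// Scalar model of a PERMPS/PERMD-style permute: each 32-bit element may be
// selected from anywhere in the 256-bit source, i.e. it can cross lanes.
// (Hypothetical helper for illustration only.)
std::array<uint32_t, 8> cross_lane_dword_permute(const std::array<uint32_t, 8>& src,
                                                 const std::array<uint32_t, 8>& idx) {
    std::array<uint32_t, 8> dst{};
    for (int i = 0; i < 8; ++i) {
        dst[i] = src[idx[i] & 7];             // only the low 3 index bits are used
    }
    return dst;
}
```

With an all-zero control vector, the per-lane model broadcasts byte 0 of the low lane across the low half and byte 16 across the high half; it can never move byte 16 into the low half, which is what the cross-lane form allows.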

Event Timeline

RKSimon created this revision. Apr 7 2022, 6:17 AM
Herald added a project: Restricted Project. Apr 7 2022, 6:17 AM
RKSimon requested review of this revision. Apr 7 2022, 6:17 AM
This revision is now accepted and ready to land. Apr 7 2022, 6:58 AM
This revision was landed with ongoing or failed builds. Apr 7 2022, 8:20 AM
This revision was automatically updated to reflect the committed changes.