unpckps has the same performance as unpckpd (port 5 only), whereas
unpckdq can run on p15 on some newer architectures.
unpckdq is in the integer domain, so only do the transform if the
target has no bypass delay on shuffles (SKL+).
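
The condition described above can be sketched as a small decision function. This is only an illustration of the reasoning, not the code from the patch: `TargetInfo`, `NoBypassDelayOnShuffles`, and `tuneUnpckps` are hypothetical names invented for this example, and the mnemonics are the shorthand used in the summary.

```cpp
#include <string>

// Hypothetical, simplified stand-in for a subtarget description.
// The real pass queries the X86 subtarget; this struct is an assumption.
struct TargetInfo {
  // True when moving a shuffle between the FP and integer domains costs
  // no extra latency (per the summary, SKL and newer).
  bool NoBypassDelayOnShuffles;
};

// Returns the mnemonic to emit where `unpckps` was requested.
std::string tuneUnpckps(const TargetInfo &TI) {
  // The integer-domain form is only worthwhile when the domain crossing
  // is free; otherwise keep the original FP-domain instruction.
  return TI.NoBypassDelayOnShuffles ? "unpckdq" : "unpckps";
}
```

With this sketch, a target that sets the flag gets the integer-domain instruction, and every other target keeps `unpckps` unchanged.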
Differential D147729
[X86] Add inst fixup for `unpckps` -> `unpckdq`.
Status: Closed, Public
Authored by goldstein.w.n on Apr 6 2023, 11:12 AM.
Event Timeline
Apr 6 2023, 11:22 AM: goldstein.w.n retitled this revision from [X86] Add inst fixup for `unpckps` -> `unpckdq`/`shufps`. to [X86] Add inst fixup for `unpckps` -> `unpckdq`.
Apr 6 2023, 11:22 AM: goldstein.w.n added a parent revision: D147728: [X86] Add inst fixup for `unpckpd` -> `unpckqdq`.
Apr 6 2023, 1:21 PM: This revision is now accepted and ready to land.
Apr 9 2023, 10:17 PM: This revision was landed with ongoing or failed builds.
Closed by commit rGd65720652dd6: [X86] Add inst fixup for `unpckps` -> `unpckdq`. (authored by goldstein.w.n).
This revision was automatically updated to reflect the committed changes.
Revision Contents
Diff 512071
llvm/lib/Target/X86/X86FixupInstTuning.cpp
llvm/test/CodeGen/X86/tuning-shuffle-unpckps-avx512.ll
llvm/test/CodeGen/X86/tuning-shuffle-unpckps.ll