Use this to handle a new transform: {v}unpck{l|h}pd -> {v}shufps. We
need the scheduling information here because {v}shufps is 1 byte larger
in code size, so we only want to make this transformation if {v}shufps
is actually faster.
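The lane equivalence behind this transform can be checked with a small scalar model. The sketch below is not code from the patch: `V4F`, `shufps`, `unpcklpd`, `unpckhpd`, and `transformIsLaneExact` are hypothetical helpers that emulate the instructions on four 32-bit lanes. It assumes the usual SHUFPS immediate encoding (two bits per output lane, low two lanes from the first source, high two from the second), under which unpcklpd corresponds to immediate 0x44 and unpckhpd to 0xee.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Four 32-bit lanes of a 128-bit register (hypothetical model type).
using V4F = std::array<float, 4>;

// Scalar model of SHUFPS: output lanes 0-1 are picked from A, lanes 2-3
// from B, each by a 2-bit selector taken from Imm.
V4F shufps(const V4F &A, const V4F &B, std::uint8_t Imm) {
  return {A[Imm & 3], A[(Imm >> 2) & 3], B[(Imm >> 4) & 3], B[(Imm >> 6) & 3]};
}

// Scalar models of UNPCKLPD / UNPCKHPD: each 64-bit element is viewed as
// two adjacent 32-bit lanes, so the result is {A.lo64, B.lo64} or
// {A.hi64, B.hi64}.
V4F unpcklpd(const V4F &A, const V4F &B) { return {A[0], A[1], B[0], B[1]}; }
V4F unpckhpd(const V4F &A, const V4F &B) { return {A[2], A[3], B[2], B[3]}; }

// Selectors {0,1,0,1} and {2,3,2,3} packed two bits per lane.
constexpr std::uint8_t UnpckLImm = 0x44;
constexpr std::uint8_t UnpckHImm = 0xEE;

// Returns true if the shufps form reproduces both unpck variants lane by lane.
bool transformIsLaneExact() {
  const V4F A{1.0f, 2.0f, 3.0f, 4.0f};
  const V4F B{5.0f, 6.0f, 7.0f, 8.0f};
  return shufps(A, B, UnpckLImm) == unpcklpd(A, B) &&
         shufps(A, B, UnpckHImm) == unpckhpd(A, B);
}

// Convenience self-check for readers who paste this into a program.
void checkTransform() { assert(transformIsLaneExact()); }
```

The model only shows that the two forms move the same lanes; whether the replacement is worthwhile still depends on the scheduling data, since the shufps encoding carries an extra immediate byte.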
Repository: rG LLVM Github Monorepo
Event Timeline
llvm/test/CodeGen/X86/tuning-shuffle-unpckpd.ll | ||
---|---|---|
7–16 | This seems to be a bad check. Same below. |
llvm/test/CodeGen/X86/tuning-shuffle-unpckpd.ll | ||
---|---|---|
7–16 | Fixed, sorry about that. |
llvm/lib/Target/X86/X86FixupInstTuning.cpp | ||
---|---|---|
73 | I don't see much value in using optional; wouldn't it be better to use bool directly? template <typename T> static bool CmpOptionals(T NewVal, T CurVal) { if (NewVal && CurVal) return *NewVal < *CurVal; return false; } | |
89 | Would it be better to hoist the check into NewOpcPreferable, ProcessVPERMILPSmi, etc.? | |
92–93 | Would it be better to sink them into GetInstTput and GetInstLat? | |
178 | In which case will vmovlhps be transformed into vshufps r, r, 0xee? |
llvm/lib/Target/X86/X86FixupInstTuning.cpp | ||
---|---|---|
73 |
The thing is we need three states:
<true> -> make change | |
178 |
See the define <4 x float> @transform_VUNPCKLPDrr test case. |
Maybe use has_value() instead of nullopt comparisons?
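To make the optional-versus-bool discussion concrete, here is a minimal sketch of a three-state comparison written with has_value(). It is not the code under review. The reply above only spells out the `<true>` state, so the other two states here are assumptions: false means "keep the current opcode" and an empty optional means "no usable information, fall through to the next criterion". The `Cost` struct, its fields, and the throughput, latency, size ordering are likewise hypothetical.

```cpp
#include <optional>

// Three-state comparison (assumed semantics):
//   true    -> the new value is strictly better (smaller)
//   false   -> the new value is strictly worse
//   nullopt -> values are missing or equal, so no decision can be made here
template <typename T>
std::optional<bool> cmpOptionals(std::optional<T> NewVal,
                                 std::optional<T> CurVal) {
  if (NewVal.has_value() && CurVal.has_value() && *NewVal != *CurVal)
    return *NewVal < *CurVal;
  return std::nullopt;
}

// Hypothetical per-opcode cost record; lower is better for every field.
struct Cost {
  std::optional<double> RThroughput; // reciprocal throughput, if known
  std::optional<double> Latency;     // latency in cycles, if known
  unsigned Size;                     // encoded size in bytes
};

// Decide whether the new opcode should replace the current one, consulting
// each criterion only when the previous one could not decide.
bool newOpcPreferable(const Cost &New, const Cost &Cur) {
  if (auto R = cmpOptionals(New.RThroughput, Cur.RThroughput); R.has_value())
    return *R;
  if (auto R = cmpOptionals(New.Latency, Cur.Latency); R.has_value())
    return *R;
  // No scheduling data favours either side: only a smaller encoding wins.
  return New.Size < Cur.Size;
}
```

With a plain bool, the "worse" and "unknown" outcomes would collapse into the same false value, and the fall-through to the next criterion could not be expressed.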
llvm/lib/Target/X86/X86FixupInstTuning.cpp | ||
---|---|---|
77 | Maybe? |
Could you add ICX test runs to tuning-shuffle-unpckpd.ll so we have test coverage?
llvm/lib/Target/X86/X86FixupInstTuning.cpp | ||
---|---|---|
101 | if (unsigned Size = TII->get(Opcode).getSize()) return Size; |
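The suggested form declares the variable inside the condition, so the value is tested and scoped in a single statement. A minimal sketch of the idiom follows, using a hypothetical size table in place of the real instruction info; `lookupSize`, `getInstSizeOr`, and the caller-supplied fallback are assumptions made for illustration, not names from the patch.

```cpp
#include <unordered_map>

// Hypothetical stand-in for an instruction-size query: returns 0 when the
// size of Opcode is unknown.
unsigned lookupSize(const std::unordered_map<unsigned, unsigned> &Sizes,
                    unsigned Opcode) {
  auto It = Sizes.find(Opcode);
  return It == Sizes.end() ? 0u : It->second;
}

// Size exists only inside the if statement and is returned only when it is
// non-zero; otherwise the caller-supplied fallback is used.
unsigned getInstSizeOr(const std::unordered_map<unsigned, unsigned> &Sizes,
                       unsigned Opcode, unsigned Fallback) {
  if (unsigned Size = lookupSize(Sizes, Opcode))
    return Size;
  return Fallback;
}
```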
LGTM with one minor comment.
llvm/test/CodeGen/X86/tuning-shuffle-unpckpd.ll | ||
---|---|---|
2–3 | SKX/v3 can probably share a CHECK-AVX2 common prefix: --check-prefixes=CHECK,CHECK-AVX2,CHECK-SKL --check-prefixes=CHECK,CHECK-AVX2,CHECK-V3 |