This is an archive of the discontinued LLVM Phabricator instance.

[X86] Add support for using Sched/Codesize information to `X86FixupInstTuning` Pass.
ClosedPublic

Authored by goldstein.w.n on Feb 22 2023, 9:14 AM.

Details

Summary

Use this to handle new transform: {v}unpck{l|h}pd -> {v}shufps. We
need the sched information here as {v}shufps is 1 more byte of code
size, so we only want to make this transformation if {v}shufps is
actually faster.

Diff Detail

Event Timeline

goldstein.w.n created this revision.Feb 22 2023, 9:14 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 22 2023, 9:14 AM
goldstein.w.n requested review of this revision.Feb 22 2023, 9:14 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 22 2023, 9:14 AM
Matt added a subscriber: Matt.Feb 22 2023, 9:16 AM
pengfei added inline comments.Feb 22 2023, 5:01 PM
llvm/test/CodeGen/X86/tuning-shuffle-unpckpd.ll
7–16

This seems a bad check. Same below.

goldstein.w.n marked an inline comment as done.Feb 22 2023, 5:42 PM
goldstein.w.n added inline comments.
llvm/test/CodeGen/X86/tuning-shuffle-unpckpd.ll
7–16

Fixed, sorry about that.

goldstein.w.n marked an inline comment as done.

Fix broken checks

Fixup comment

pengfei added inline comments.Feb 25 2023, 6:13 AM
llvm/lib/Target/X86/X86FixupInstTuning.cpp
73

Don't see much value to use optinal, should be better to use bool directly?

template <typename T>
static bool CmpOptionals(T NewVal, T CurVal) {
  if (NewVal && CurVal)
    return *NewVal < *CurVal;

  return false;
}
89

Should be better hoist the check into NewOpcPreferable or ProcessVPERMILPSmi etc?

92–93

Should be better to sink them into GetInstTput and GetInstLat?

178

In which case will vmovlhps be transformed into vshufps r, r, 0xee?

goldstein.w.n marked 2 inline comments as done.Feb 25 2023, 5:00 PM
goldstein.w.n added inline comments.
llvm/lib/Target/X86/X86FixupInstTuning.cpp
73

Don't see much value to use optinal, should be better to use bool directly?

template <typename T>
static bool CmpOptionals(T NewVal, T CurVal) {
  if (NewVal && CurVal)
    return *NewVal < *CurVal;

  return false;
}

The thing is we need three states:

  1. <true> New Val better
  2. <false> Old Val better
  3. <nullopt> No difference (either unable to get information i.e no schedmodel) or the value are equal.

<true> -> make change
<false> -> don't make change
<nullopt> -> check next metric.

178

In which case will vmovlhps be transformed into vshufps r, r, 0xee?

see the define <4 x float> @transform_VUNPCKLPDrr test case.

Make lat/tput cleaner

Rebase (after D144832, no dep on D143786)

Maybe use has_value() instead of nullopt comparisons?

llvm/lib/Target/X86/X86FixupInstTuning.cpp
77

Maybe?
if (NewVal.has_value() && CurVal.has_value() && *NewVal != *CurVal)

goldstein.w.n marked an inline comment as done.Mar 6 2023, 10:44 AM

Maybe use has_value() instead of nullopt comparisons?

Done.

Use .has_value() instead of nullopt comparison

Add ICX test runs to tuning-shuffle-unpckpd.ll so we have test coverage ?

llvm/lib/Target/X86/X86FixupInstTuning.cpp
101
if (unsigned Size = TII->get(Opcode).getSize())
  return Size;
goldstein.w.n marked an inline comment as done.Mar 7 2023, 4:25 PM

Add ICX test runs to tuning-shuffle-unpckpd.ll so we have test coverage ?

Done. See D145531

RKSimon accepted this revision.Mar 7 2023, 5:24 PM

LGTM with one minor

llvm/test/CodeGen/X86/tuning-shuffle-unpckpd.ll
2–3

SKX/v3 can probably share a CHECK-AVX2 common prefix:

--check-prefixes=CHECK,CHECK-AVX2,CHECK-SKL
--check-prefixes=CHECK,CHECK-AVX2,CHECK-V3
This revision is now accepted and ready to land.Mar 7 2023, 5:24 PM
goldstein.w.n marked an inline comment as done.Mar 7 2023, 7:04 PM

Add common check for AVX2

This revision was landed with ongoing or failed builds.Mar 8 2023, 9:58 AM
This revision was automatically updated to reflect the committed changes.