Page MenuHomePhabricator

[AMDGPU][GlobalISel] Support mad/fma_mix selection
ClosedPublic

Authored by Pierre-vh on Sep 21 2022, 5:41 AM.

Details

Summary

Adds support for selecting the following instructions using GlobalISel:

  • v_mad_mix/v_fma_mix
  • v_mad_mixhi/v_fma_mixhi
  • v_mad_mixlo/v_fma_mixlo

To select those instructions properly, some additional changes were
needed which impacted other tests as well.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
Pierre-vh requested review of this revision.Oct 4 2022, 5:00 AM

Needs another quick re-review because of the shuffle/shift combine removal - there's some small regressions I was unable to fix for now but they only affect VI/CI

arsenm accepted this revision.Oct 12 2022, 8:43 AM
This revision is now accepted and ready to land.Oct 12 2022, 8:43 AM
foad added inline comments.Oct 17 2022, 3:59 AM
llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
534–535

Personally I prefer to write this as (IsFMA ? Subtarget->hasMadMixInsts() : Subtarget->hasFmaMixInsts()).

Also I don't understand what the (!Subtarget->hasMadMixInsts() && !Subtarget->hasFmaMixInsts()) part is for. Why can't the whole thing just be: (IsFMA ? Subtarget->!hasFmaMixInsts() : !Subtarget->hasMadMixInsts())?

748

Don't need the braces.

754

Don't need the braces.

llvm/lib/Target/AMDGPU/SIInstructions.td
2752 ↗(On Diff #467050)

This looks weird. Why would we every want to generate V_LSHLREV_B32 with an sgpr operand?

Pierre-vh marked 4 inline comments as done.

Comments

llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
534–535

That condition was imported from selectFMAD_FMA in the DAGISel and yeah, after looking at it more carefully it's indeed redundant, I rewrote it.

llvm/lib/Target/AMDGPU/SIInstructions.td
2752 ↗(On Diff #467050)

I just changed the pattern matching to also allow VGPR operands (which, IIRC, will be copied to the right regbank in any case). Not sure why the pattern uses a SGPR output though.

I did this change because it helped codegen in one test go from:

v_mad_mix_f32 v0, v0, v1, v2 op_sel_hi:[1,1,1] clamp 
v_cvt_f16_f32_e32 v0, v0 
v_and_b32_e32 v1, 0xffff, v0 
v_lshl_or_b32 v0, v0, 16, v1

to

v_mad_mix_f32 v0, v0, v1, v2 op_sel_hi:[1,1,1] clamp
v_cvt_f16_f32_sdwa v0, v0 dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:DWORD

Perhaps it'd be more adequate to duplicate the pattern and have another one that uses VReg input/output?

foad added inline comments.Oct 19 2022, 1:44 AM
llvm/lib/Target/AMDGPU/AMDGPUGISel.td
156

Is this required? I am confused about whether you are now matching VOP3PMadMixMods in TableGen patterns, or only doing it from the C++ code in selectG_FMA.

llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
525

Call it selectG_FMA_FMAD.

534

The usual convention seems to be to return false (with no braces) here, and have the caller do:

if (selectG_FMA(I))
  return true;
return selectImpl(I, *CoverageInfo);

Would that work? Or is there something special about the return false on line 575?

558

Return false? No braces.

574

No braces.

746

Is this related to the current patch? Could it be split out?

llvm/lib/Target/AMDGPU/SIInstructions.td
2752 ↗(On Diff #467050)

Maybe split this into a separate patch and we can discuss it there? It doesn't seem to be directly related to selecting the mad_mix instruction.

Pierre-vh updated this revision to Diff 468822.Oct 19 2022, 2:09 AM
Pierre-vh marked 7 inline comments as done.

Comments, splitting up stuff in other diffs

Pierre-vh added inline comments.
llvm/lib/Target/AMDGPU/AMDGPUGISel.td
156

I'm doing it exclusively in C++.
This is needed to tell the GlobalISelEmitter that, when it sees "VOP3PMadMixMods" it needs to use the "selectVOP3PMadMixMods" function from AMDGPUInstructionSelector.
Without that, it wouldn't import patterns that use it.

llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
746

It's required. D136235

llvm/lib/Target/AMDGPU/SIInstructions.td
2752 ↗(On Diff #467050)

v_mad_mixhi_f16_f16lo_f16lo_f16lo_undeflo_clamp_precvt regressed due to splitting the change into D136236

foad added inline comments.Oct 19 2022, 2:18 AM
llvm/lib/Target/AMDGPU/AMDGPUGISel.td
156

I'm still confused. If we are now importing patterns that use VOP3PMadMixMods, why do you need to do the selection in C++ code?

Pierre-vh marked an inline comment as done.Oct 19 2022, 2:21 AM
Pierre-vh added inline comments.
llvm/lib/Target/AMDGPU/AMDGPUGISel.td
156

It's a ComplexPattern

def VOP3PMadMixMods  : ComplexPattern<untyped, 2, "SelectVOP3PMadMixMods">;

Those functions work with DAG nodes and must be rewritten for GISel so the patterns can be imported
https://llvm.org/docs/GlobalISel/InstructionSelect.html#complexpatterns

foad added inline comments.Oct 19 2022, 2:26 AM
llvm/lib/Target/AMDGPU/AMDGPUGISel.td
156

Right, but why do you need to write code in selectG_FMA_FMAD to manually select MAD_MIX/FMA_MIX? Why can't the imported patterns do this automatically?

Pierre-vh marked 2 inline comments as done.Oct 19 2022, 2:31 AM
Pierre-vh added inline comments.
llvm/lib/Target/AMDGPU/AMDGPUGISel.td
156

Because there are no patterns for MIX, only for MIXHI/MIXLO. MIX selection is still manual in the DAG as well (see SelectFMAD_FMA)

foad added inline comments.Oct 19 2022, 2:38 AM
llvm/lib/Target/AMDGPU/AMDGPUGISel.td
156

OK, thanks for explaining. Out of curiosity, is there a good reason why mix selection cannot be done with patterns?

llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
3396

I've also done this as part of D136238.

foad added inline comments.Oct 19 2022, 3:15 AM
llvm/lib/Target/AMDGPU/AMDGPUGISel.td
157

Why is this "s64"?

Pierre-vh updated this revision to Diff 468842.Oct 19 2022, 3:16 AM
Pierre-vh marked an inline comment as done.

Rebase

Pierre-vh added inline comments.Oct 19 2022, 3:18 AM
llvm/lib/Target/AMDGPU/AMDGPUGISel.td
157

I'm not sure about this actually, I just followed the examples above. It's supposed to be "the expected type at the root of the match" but IIRC it doesn't look like it matter much in this case at least

arsenm accepted this revision.Oct 19 2022, 8:27 AM
arsenm added inline comments.
llvm/lib/Target/AMDGPU/AMDGPUGISel.td
157

I'm surprised this doesn't do anything but should probably be untyped

Pierre-vh marked an inline comment as done.Oct 20 2022, 3:01 AM
Pierre-vh added inline comments.
llvm/lib/Target/AMDGPU/AMDGPUGISel.td
157

untyped doesn't work

[build] /home/pvanhout/work/trunk/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUGISel.td:157:5: error: Value specified for template argument 'GIComplexOperandMatcher:type' (#0) is of type ValueType; expected type LLT: untyped
[build]     GIComplexOperandMatcher<untyped, "selectVOP3PMadMixMods">,

There's a TODO for it

  // The expected type of the root of the match.
  //
  // TODO: We should probably support, any-type, any-scalar, and multiple types
  //       in the future.
  LLT Type = type;
`
Pierre-vh marked an inline comment as done.

Rebase

Pierre-vh updated this revision to Diff 470142.Oct 24 2022, 7:27 AM

DAG -> SDAG for FileCheck prefix
Still blocked by parent diff

Pierre-vh updated this revision to Diff 471432.Oct 28 2022, 1:24 AM

Rebase

There are some codegen changes (due to D136922) but it looks like a net positive everywhere

arsenm accepted this revision.Nov 1 2022, 2:41 PM

LGTM

Pierre-vh updated this revision to Diff 473175.Fri, Nov 4, 2:28 AM

Rebase without D136922, some regressions are back but it's relatively small.

Pierre-vh updated this revision to Diff 473579.Sun, Nov 6, 11:48 PM

Abandoning D135145 as we can't really decide whether it's a good or bad thing.
There's a couple of regressions then but nothing blocking landing afaik.
@arsenm can you confirm this is good to land? For the remainingg cases I'll add them to my notes and try to come up with something when time allows

Pierre-vh requested review of this revision.Sun, Nov 6, 11:48 PM
arsenm accepted this revision.Mon, Nov 7, 9:33 AM

Should work to fix these regressions vs. the DAG but they certainly aren't directly related to this

This revision is now accepted and ready to land.Mon, Nov 7, 9:33 AM
This revision was landed with ongoing or failed builds.Tue, Nov 8, 12:02 AM
This revision was automatically updated to reflect the committed changes.