Page MenuHomePhabricator

[AMDGPU][GlobalISel] Select v_fma_mix_f32 and v_mad_mix_f32
Needs ReviewPublic

Authored by mbrkusanin on Oct 1 2021, 7:50 AM.

Diff Detail

Event Timeline

mbrkusanin created this revision.Oct 1 2021, 7:50 AM
mbrkusanin requested review of this revision.Oct 1 2021, 7:50 AM

This can land before D98050 (or any other fma combine) but for a review it might be easier to see src modifiers in .ll tests than .mir.
If this is accepted before, we can commit it without .ll tests.

llvm/include/llvm/CodeGen/GlobalISel/MIPatternMatch.h
595–609

Does this need to be a separate patch with a unit test?

mbrkusanin marked an inline comment as not done.Oct 1 2021, 7:52 AM
foad added a comment.Oct 6 2021, 8:30 AM

Looks like the logic is mostly copied from selectiondag, so I guess it's OK.

llvm/include/llvm/CodeGen/GlobalISel/MIPatternMatch.h
595

Does this match exactly one copy? Would it be more useful to skip zero or more copies?

llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
2322–2324

I think these three lines can be replaced with IsFMA ? !Subtarget->hasFmaMixInsts() : !Subtarget->hasMadMixInsts()

mbrkusanin updated this revision to Diff 377567.Oct 6 2021, 9:08 AM
mbrkusanin marked an inline comment as done.
llvm/include/llvm/CodeGen/GlobalISel/MIPatternMatch.h
595

m_Copy matches exactly one copy. This matches zero or one copy. Renamed to m_IgnoreCopy, StripCopy was a little misleading.

It could be easily changed to zero or more, but I did not run into a case where that was necessary.
In this patch we start from G_FMA/FMAD src operand which is vpgr and match until a constant (most likely a sgpr), so sooner or later a copy will appear.

mbrkusanin added inline comments.
llvm/include/llvm/CodeGen/GlobalISel/MIPatternMatch.h
595–609

@aemerson, @paquette Any objections to m_IgnoreCopy matcher?

mbrkusanin marked an inline comment as not done.Oct 7 2021, 3:16 AM
paquette added inline comments.Oct 7 2021, 9:09 AM
llvm/include/llvm/CodeGen/GlobalISel/MIPatternMatch.h
595–609

I feel like it would be better to match multiple copies, kind of like getDefIgnoringCopies. You'll probably run into multiple copies one day, and chances are you'll want to walk past them too.

Updated IgnoreCopy matcher to skip through multiple copies.

IR has matchers like m_TruncOrSelf and m_ZExtOrSelf which match one or zero of those nodes. So I guess it's not unprecedented to have a matcher like this.
Maybe rename to m_CopyOrSelf?

arsenm added inline comments.Oct 19 2021, 1:34 PM
llvm/include/llvm/CodeGen/GlobalISel/MIPatternMatch.h
595

The practical case where you would want to use this is when you know you can ignore cross bank copies, which I think should only ever see 1 copy

llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
3512

Really this copy shouldn't be here, but regbankselect and the post-regbank combiner don't optimally handle constants. For an inline constant we should always have the constant appear directly with the result regbank. Can you add a fix to remove this?