This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Add patterns for mad/mac legacy f32 instructions
ClosedPublic

Authored by foad on Oct 6 2020, 5:36 AM.

Details

Summary

Note that all subtargets up to GFX10.1 have v_mad_legacy_f32, but GFX8/9
lack v_mac_legacy_f32. GFX10.3 has no mad/mac f32 instructions at all.

Diff Detail

Event Timeline

foad created this revision.Oct 6 2020, 5:36 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 6 2020, 5:36 AM
foad requested review of this revision.Oct 6 2020, 5:36 AM
arsenm added inline comments.Oct 6 2020, 7:09 AM
llvm/lib/Target/AMDGPU/SIInstructions.td
887–888

But these do have modifiers?

foad added inline comments.Oct 6 2020, 7:30 AM
llvm/lib/Target/AMDGPU/SIInstructions.td
887–888

Sure, but that's covered by the TODO at line 869. I could repeat the comment here. I assume there was some reason it wasn't totally straightforward (maybe something to do with the tied operands?) else it would've been done already.

arsenm added inline comments.Oct 6 2020, 7:33 AM
llvm/lib/Target/AMDGPU/SIInstructions.td
887–888

I think it is straightforward, and just n/nobody bothered to do it

foad added inline comments.Oct 6 2020, 9:12 AM
llvm/lib/Target/AMDGPU/SIInstructions.td
887–888

Having looked into this some more, I think there is no point selecting v_mac with modifiers, because that would force VOP3 encoding, at which point there is no downside to using v_mad instead:

  • it's always available (there are subtargets with only v_mad but none with only v_mac)
  • it's no bigger
  • it lets you use different registers for dst and src2
  • it lets you ave modifiers on all three inputs

If this sounds reasonable I'll update the comments but leave the implementation as-is.

foad updated this revision to Diff 296503.Oct 6 2020, 10:49 AM

Update comments for NoMods patterns.

arsenm accepted this revision.Oct 7 2020, 6:52 AM
This revision is now accepted and ready to land.Oct 7 2020, 6:52 AM
foad updated this revision to Diff 296690.Oct 7 2020, 8:28 AM

Fix a couple more problems shown up by running the MC tests:

  • Update asm/dis special cases for V_MAC_LEGACY_F32 now that it uses the VOP_MAC_F32 profile.
  • Fix use of OtherPredicates in VOP2 pseudos.
foad added a comment.Oct 8 2020, 2:13 AM

I'll rebase this after D89000 lands, since it touches some of the same code in VOP2Instructions.td and gfx1030_err.s.

foad updated this revision to Diff 296923.Oct 8 2020, 4:32 AM

Rebase after D89000.

foad added a reviewer: dp.Oct 8 2020, 4:33 AM
arsenm accepted this revision.Oct 8 2020, 7:00 AM
This revision was landed with ongoing or failed builds.Oct 8 2020, 7:24 AM
This revision was automatically updated to reflect the committed changes.

Looks reasonable to me.