This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Add subtarget feature for MAD_U64/I64 bug on GFX11
ClosedPublic

Authored by mbrkusanin on Aug 31 2022, 4:46 AM.

Diff Detail

Event Timeline

mbrkusanin created this revision.Aug 31 2022, 4:46 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 31 2022, 4:46 AM
mbrkusanin requested review of this revision.Aug 31 2022, 4:46 AM

Following an offline discussion I've looked into adding a subtarget feature for bug for V_MAD_U64_U32 and V_MAD_I64_I32. Since VOPReal is tied with VOPPseudo I don't see a nice way of having both options for gfx11 without also having 2 real versions.
If we don't care about having an option to disable this bug on gfx11 then we can just eliminate the other Real.

foad added a comment.Sep 1 2022, 10:08 AM

Looks reasonable to me.

llvm/lib/Target/AMDGPU/VOP3Instructions.td
921

You shouldn't need to set OtherPredicates here. Real instructions usually copy these predicates from the Pseudo instruction.

Is this still relevant? I thought there was a discussion that this wasn't actually broken?

  • By overriding opName (or PseudoInstr of tablegen record) for _strict pseudo we can have both pseudos select into same real instruction, so now we don't need two.
  • Updated use of features/predicate so now it works properly with -mattr=-mad-intra-fwd-bug which was the point of this change.

Is this still relevant? I thought there was a discussion that this wasn't actually broken?

Nothing was really broken. It was just if we want a feature for this bug or not.

  • By overriding opName (or PseudoInstr of tablegen record) for _strict pseudo we can have both pseudos select into same real instruction, so now we don't need two.

This is a creative use of the opName, but I think it probably hinders readability. No other instructions are doing this, so when I look at definition of the Real instruction it seems to refer to the non-strict pseudo.

  • By overriding opName (or PseudoInstr of tablegen record) for _strict pseudo we can have both pseudos select into same real instruction, so now we don't need two.

This is a creative use of the opName, but I think it probably hinders readability. No other instructions are doing this, so when I look at definition of the Real instruction it seems to refer to the non-strict pseudo.

Well, it refers to both in a way. I changed the one it actually uses for initialization to be _strict pseudo which in turns uses same name as non strict one.

Joe_Nash accepted this revision.Nov 17 2022, 7:30 AM

Thanks, LGTM.

llvm/lib/Target/AMDGPU/VOP3Instructions.td
296

For other types of instructions (VOPC, VOP2) opName is used as a key into mapping tables, so having duplicate entries with the same name can cause collisions. As long as there are no functional issues, I'm fine with it as is. It will be a tablegen failure if someone tries to add a mapping table and use that as the key in the future.

This revision is now accepted and ready to land.Nov 17 2022, 7:30 AM
mbrkusanin added inline comments.Nov 17 2022, 7:47 AM
llvm/lib/Target/AMDGPU/VOP3Instructions.td
296

Is current solutions preferable to having two versions for Real instruction? Problem with having two Reals is we would need to disable one for decoding because of conflict and then manually adjust it, which does not look nice. Encoding is easily resolved with just some extra predicates.

foad added a comment.Nov 17 2022, 7:53 AM

I have often wanted a standard way of mapping two different pseudos to the same real instruction.

I have often wanted a standard way of mapping two different pseudos to the same real instruction.

I guess that is useful, but the behavior here is not always available. We are basically having multiple maps overloaded onto the same keys.
Here we are using only one map, from pseudo to real using opName (actually PseudoInstr, which is derived from opName) as the key. Other maps such as getDPPOp32 also use opName as the key. So if we always wanted to conveniently map two pseudos to one real, we would likely want to introduce several new keys that don't interfere with each other.

llvm/lib/Target/AMDGPU/VOP3Instructions.td
296

I don't quite understand what you are asking. Currently we have 2 pseudos and one real for each subtarget. We should not need multiple reals on a single subtarget. The way to resolve my comment fully is to go to the following. Which is NFC currently.

defm V_MAD_U64_U32_strict : VOP3Inst <"v_mad_u64_u32_strict", VOP3b_I64_I1_I32_I32_I64>;
defm V_MAD_I64_I32_strict : VOP3Inst <"v_mad_i64_i32_strict", VOP3b_I64_I1_I32_I32_I64>;

mbrkusanin added inline comments.Nov 18 2022, 7:00 AM
llvm/lib/Target/AMDGPU/VOP3Instructions.td
296

Actually it would not be NFC. Mapping non-strict pseudo to real would fail for: llc -mcpu=gfx1100 -mattr=-mad-intra-fwd-bug.
The point of the patch is to have an option to disable the workaround for the bug.

Maybe I misunderstood what the first comment meant. Patch as-is matches both Pseudos into same Real. Is the potential issue for a new mapping table in the future if that table wants to use opName as key? Or are you talking about mapping Reals back into Pseudos?
Would ,,let PseudoInstr = " for pseudo make a difference (opName would be different then)?

arsenm accepted this revision.Nov 18 2022, 8:52 AM

LGTM but I don't understand the name change

llvm/lib/Target/AMDGPU/VOP3Instructions.td
295–296

I don't understand the name change from _gfx11 to _strict

mbrkusanin added inline comments.Nov 18 2022, 8:59 AM
llvm/lib/Target/AMDGPU/VOP3Instructions.td
295–296

It's strange to have _gfx11 on a pseudo and then a real _gfx11_e64_gfx11. Should I restore it?

arsenm added inline comments.Nov 18 2022, 9:01 AM
llvm/lib/Target/AMDGPU/VOP3Instructions.td
295–296

That's consistent with other _gfx* behavior changing instruction variants.

I do think we have a sustainability problem with all the semantic changes of the same opcodes. Over time I've started to think it would be better to codegen to concrete opcodes and swap out the instruction tables per-sub target or something

  • reverted name change
  • rebase
This revision was landed with ongoing or failed builds.Nov 18 2022, 9:21 AM
This revision was automatically updated to reflect the committed changes.
Joe_Nash added inline comments.Nov 18 2022, 9:48 AM
llvm/lib/Target/AMDGPU/VOP3Instructions.td
296

Actually it would not be NFC. Mapping non-strict pseudo to real would fail for: llc -mcpu=gfx1100 -mattr=-mad-intra-fwd-bug.
The point of the patch is to have an option to disable the workaround for the bug.

Ah, I missed this part where you wanted to turn off the feature on gfx11. Perhaps there should be a test for that.

Maybe I misunderstood what the first comment meant. Patch as-is matches both Pseudos into same Real. Is the potential issue for a new mapping table in the future if that table wants to use opName as key? Or are you talking about mapping Reals back into Pseudos?
Would ,,let PseudoInstr = " for pseudo make a difference (opName would be different then)?

"a new mapping table in the future if that table wants to use opName as key" is the part I was worried about. It's OK to leave it for now and address it later if it becomes a problem.

Last I checked you can't actually turn off a subtarget feature that's attached to the cpu definition. It has some insane behavior where it concludes you didn't really mean that processor and turns off all the features