This is an archive of the discontinued LLVM Phabricator instance.

[TwoAddressInstruction] Tweak constraining of tied operands w/undef source
Needs ReviewPublic

Authored by reames on Jul 27 2023, 9:01 AM.

Details

Summary

This is a rework - and arguably a partial revert - of dff3454bda097723799935e8ea7f026ff0626940. That change modified the register constraint on an undef tied operand to use the regclass of the source operand. Before that change, we'd use the regclass of the operand from MCDesc.

From what I can tell from that patch, the intention was to handle generic target opcodes which don't have meaningful MCDesc. Unfortunately, the original test case appears to have been somewhat fragile. I can fully revert the original patch, and nothing changes today. So, I'm not 100% sure I actually understand it's intent.

This change uses the MCDesc regclass if one is available, and then falls back to the source operand if not. The motivation for this is that we can CSE together two instructions with distinct def regclasses, and end up with the source operand being a subset of the two. The result of this is an overly constrained source. Because the source is undef, we can relax it back to what the instruction definition required, and prevent redundant copies being inserted.

Note that in practice, the only case we appear to merge two instructions with distinct destination register classes are generic target opcodes. I hit this while working on a change in IMPLICIT_DEF handling, and have not been able to exercise it on ToT through anything but a carefully crafted MIR test.

Diff Detail

Event Timeline

reames created this revision.Jul 27 2023, 9:01 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 27 2023, 9:01 AM
reames requested review of this revision.Jul 27 2023, 9:01 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 27 2023, 9:01 AM
foad added a comment.Aug 1 2023, 3:31 AM

This is a rework - and arguably a partial revert - of dff3454bda097723799935e8ea7f026ff0626940. That change modified the register constraint on an undef tied operand to use the regclass of the source operand. Before that change, we'd use the regclass of the operand from MCDesc.

From what I can tell from that patch, the intention was to handle generic target opcodes which don't have meaningful MCDesc.

Quoting the description of D110944:

before:

%16:sgpr_96 = INSERT_SUBREG undef %15:sgpr_96_with_sub0_sub1(tied-def 0), killed %11:sreg_64_xexec, %subreg.sub0_sub1

After, without this patch:

undef %16.sub0_sub1:sgpr_96 = COPY killed %11:sreg_64_xexec

The problem was that sgpr_96 does not fully support sub0_sub1. Something should have constrained that regclass to sgpr_96_with_sub0_sub1 somehow, e.g. by calling getSubClassWithSubReg. My approach was to use the class of the tied src operand to constrain it.

Unfortunately, the original test case appears to have been somewhat fragile. I can fully revert the original patch, and nothing changes today. So, I'm not 100% sure I actually understand it's intent.

I suspect the original repro no longer applies since D151463 changed the way AMDGPU's sgpr_96 register classes are defined.

llvm/test/CodeGen/RISCV/twoaddress-undef-regclass-constraint.mir
1

Can you precommit this or split it out to a separate patch, so we can see the diff?

foad added inline comments.Aug 1 2023, 3:46 AM
llvm/test/CodeGen/RISCV/twoaddress-undef-regclass-constraint.mir
100

Do you have a realistic example of where the :vrnov0 class might have come from in the first place? That might help explain why it's OK to ignore it.

foad added a comment.Aug 2 2023, 2:49 AM

Just trying to clarify my own thoughts:

  • I have no objection to your patch, though I do wonder if we can simplify this code even more.
  • The specific AMDGPU case that prompted D110944 no longer exists, and if it or anything like it comes back, we can re-examine it afresh.
  • If I was trying to fix that AMDGPU case today I think I would do it differently anyway: the part of TwoAddressInstructionPass::runOnMachineFunction that rewrites INSERT_SUBREG as COPY should probably just call getSubClassWithSubReg to constrain the dst operand when it sets the subreg.