In collectTiedOperands, when handling an undef use that is tied to a
def, constrain the dst reg with the actual register class of the src
reg, instead of with the register class from the instructions's
MCInstrDesc. This makes a difference in some AMDGPU test cases like
this, before:
%16:sgpr_96 = INSERT_SUBREG undef %15:sgpr_96_with_sub0_sub1(tied-def 0), killed %11:sreg_64_xexec, %subreg.sub0_sub1
After, without this patch:
undef %16.sub0_sub1:sgpr_96 = COPY killed %11:sreg_64_xexec
This fails machine verification if you force it to run after
TwoAddressInstruction (currently it is disabled) with:
- Bad machine code: Invalid register class for subregister index ***
- function: s_load_constant_v3i32_align4
- basic block: %bb.0 (0xa011a88)
- instruction: undef %16.sub0_sub1:sgpr_96 = COPY killed %11:sreg_64_xexec
- operand 0: undef %16.sub0_sub1:sgpr_96
Register class SGPR_96 does not fully support subreg index 4
After, with this patch:
undef %16.sub0_sub1:sgpr_96_with_sub0_sub1 = COPY killed %11:sreg_64_xexec
See also svn r159120 which introduced the code to handle tied undef
uses.