This is an archive of the discontinued LLVM Phabricator instance.

[RegCoalescer] Replace undef subreg copies used in early-clobber def insts with IMPLICIT_DEFs
AbandonedPublic

Authored by abinavpp on Sep 19 2022, 7:38 AM.

Details

Summary

This change gets the RegisterCoalescer to replace undef subreg copies
used in instructions having early-clobber defs with IMPLICIT_DEFs to
avoid overlapping register assignments in regalloc.

This change, along with D134343, fixes the overlapping register
assignment of gfx11's v_mad_u64_u32 (See SWDEV-352540).

Diff Detail

Event Timeline

abinavpp created this revision.Sep 19 2022, 7:38 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 19 2022, 7:38 AM
abinavpp requested review of this revision.Sep 19 2022, 7:38 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 19 2022, 7:38 AM

Haven't looked into all the test changes, especially the ones in Thumb2; Not sure if there's a better way to handle this

This should be two separate patches and needs MIR tests for each

llvm/lib/CodeGen/RegisterCoalescer.cpp
1683

My initial thought is you shouldn't have to scan over all use instructions to avoid this and you can query the range and check for EC slots

abinavpp updated this revision to Diff 461521.Sep 20 2022, 2:27 AM
  • Replaced the subreg copy with an implicit_def; This eliminates the new copies in the Thumb2 tests
  • Reword a comment
abinavpp added inline comments.Sep 20 2022, 3:45 AM
llvm/lib/CodeGen/RegisterCoalescer.cpp
1683

We're looking for the EC defs that are using DstReg. There are no EC slots in DstReg's LiveRange. I don't know if there is any quick way to get the other EC slots that are overlapping with DstReg's LiveRange.

This still should be split into two separate patches, one for the coalescer and one for two address instructions, and needs MIR tests for each

foad added a comment.Sep 21 2022, 2:59 AM

Well, what is the exact definition of earlyclobber in MIR? Is it OK for an earlyclobber output to overwrite an undef input? Apparently it is OK, but you are changing the definition to say it is not OK. I think this deserves wider discussion.

llvm/test/CodeGen/AMDGPU/mad_u64_u32.ll
18

Please rebase on f09e3ad88a19.

abinavpp updated this revision to Diff 461836.Sep 21 2022, 3:03 AM
abinavpp retitled this revision from [WIP][RegCoalescer][TwoAddrInst] Keep undef subreg copies that are used in early-clobber insts to [RegCoalescer] Replace undef subreg copies used in early-clobber def insts with IMPLICIT_DEFs.
abinavpp edited the summary of this revision. (Show Details)

Rebase; Move TwoAddrInst changes to D134343

abinavpp abandoned this revision.Oct 11 2022, 11:28 PM

Sorry, but we're not fixing anything in gfx11 here. AMDGPUAsmParser::validateEarlyClobberLimitations() was incorrectly asserting a violation with overlapping vdst and src2 registers in v_mad_u64_u32. D134272 took care of that.