This is an archive of the discontinued LLVM Phabricator instance.

[RISCV][NFC] Add test for register coalescing x0
AcceptedPublic

Authored by luke on Jan 26 2023, 3:01 AM.

Details

Reviewers
asb
craig.topper
Summary

Ideally all the COPY $x0s would be coalesced here, but register
allocator doesn't seem to like the fact that %1 lives across multiple
basic blocks
Part of https://github.com/llvm/llvm-project/issues/56872

Diff Detail

Event Timeline

luke created this revision.Jan 26 2023, 3:01 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 26 2023, 3:01 AM
luke requested review of this revision.Jan 26 2023, 3:01 AM
luke updated this revision to Diff 492414.Jan 26 2023, 5:42 AM

Fix llc invocation

This revision is now accepted and ready to land.Jan 26 2023, 10:04 PM
luke updated this revision to Diff 492701.Jan 27 2023, 4:16 AM

Simplify test

luke updated this revision to Diff 492806.Jan 27 2023, 9:30 AM

Add another test that should be fixed

luke added a subscriber: qcolombet.Jan 27 2023, 9:45 AM

@qcolombet Apologies if I'm pinging the wrong person here, I'm looking for some advice on some register coalescer behaviour, let me know if this question is better directed towards someone else.
On RISC-V we're seeing some unexpected extra COPYs being left around with these $x0 registers, which is both a reserved and constant physical register.

In both test cases, %0 is unable to be coalesced due to the fact that it has a "complex interval" with multiple values, i.e. in f the intervals look like:

********** INTERVALS **********
%0 [16r,48B:1)[64r,80B:0)[80B,96r:2) 0@64r 1@16r 2@80B-phi  weight:0.000000e+00
RegMasks:
********** MACHINEINSTRS **********
# Machine code for function f: NoPHIs

0B	bb.0:
	  successors: %bb.2(0x40000000), %bb.1(0x40000000); %bb.2(50.00%), %bb.1(50.00%)

16B	  %0:gpr = COPY $x0
32B	  BEQ $x1, $x2, %bb.2

48B	bb.1:
	; predecessors: %bb.0
	  successors: %bb.2(0x80000000); %bb.2(100.00%)

64B	  %0:gpr = COPY $x0

80B	bb.2:
	; predecessors: %bb.0, %bb.1

96B	  PseudoRET implicit %0:gpr

So it can't perform joinReservedPhysReg on %0 because its interval has multiple values.

(Note that test case requires the conditional branch, otherwise the second def of %0 is marked as killed and it gets split into another virtual register, which coincidentally allows the copies to be coalesced since they will all have simple intervals)

Ideally in test case f we'd like to see:

bb.0:
  ; COPY removed
  BEQ $x1, $x2, %bb.2

bb.1:
  ; COPY removed

bb.2:
  PseudoRET implicit $x0

and in test case g:

bb.0:
  %0 = COPY $x0
  BEQ $x1, $x2, %bb.2

bb.1:
  ; COPY removed
  %0 = ADDI $x0, 1

bb.2:
  PseudoRET implicit %0

Should these copy eliminations be possible to do in the register coalescer, or is it something that should be taken care of by another pass?
I'm mainly asking because in g we can't just remove all the copies and coalesce the $0s and %1s together, we can only remove some of them. And from what I understand from reading RegisterCoalescer.cpp, it can currently only join intervals completely.

Hi @luke,

In both test cases, %0 is unable to be coalesced due to the fact that it has a "complex interval" with multiple values,

That's by design because for complex intervals, in theory, we wouldn't save any copies.
If you consider the general case:

   if
1:   %0 = copy $reserved
   else
2:   %0 = ... something else
3: ... = %0

If we coalesce #1, because the interval is complex we know that at least part of the interval won't be able to use $reserved so we'll have to leave a virtReg = copy $reserved somewhere.

In your f case this is actually possible to completely eliminate all copies because all the definitions are a copy of that same register. We might want to support that in the coalescer, but that looks a bit too specific to be worth it but I could be wrong.
An alternative would be to check why CSE is not picking it up.

For your second case (g), what is putting the copy and the add in bb.1 on the same live-interval?
If they were still separate, the copy you want to remove should be removed directly by the coalescer.

Assuming the issue is not created by the coalescer itself, I would suggest to fix that or consider running something similar to RenameIndependentSubregs to break the live-intervals that have been merge for no good reason.
Now if the issue is created by the coalescer itself, we'll need to dig more to see why we end up in this situation.

Cheers,
-Quentin