Page MenuHomePhabricator

[RegisterCoalescer] Fix IMPLICIT_DEF init removal for a register on joining
ClosedPublic

Authored by vpykhtin on Jun 20 2020, 7:15 AM.

Details

Summary

This is actually a revert of

commit 9d7bc0874cf20f44cd331c77f5a003b4c4b262bd
Author: Matthias Braun <matze@braunis.de>
Date: Thu Jan 8 00:21:23 2015 +0000

RegisterCoalescer: Do not remove IMPLICIT_DEFS if they are required for subranges.

The register coalescer used to remove implicit_defs when they are
covered by the main range anyway. With subreg liveness tracking we can't
do that anymore in places where the IMPLICIT_DEF is required as begin of
a subregister liverange.

llvm-svn: 225416

Without this patch the bb2 of the test looks like:

bb.2:
; predecessors: %bb.0
  successors: %bb.3(0x80000000); %bb.3(100.00%)

  %0.sub1:sgpr_64 = IMPLICIT_DEF

Since there is no undef flag %0 is considered uninitialized in bb2, leading to an assert on mir validation. The debug dump (manually enhanced) shows what happend:

0B	bb.0:
	  successors: %bb.1(0x40000000), %bb.2(0x40000000); %bb.1(50.00%), %bb.2(50.00%)

16B	  S_CBRANCH_SCC0 %bb.2, implicit undef $scc

32B	bb.1:
	; predecessors: %bb.0
	  successors: %bb.3(0x80000000); %bb.3(100.00%)

48B	  undef %0.sub0:sgpr_64 = S_MOV_B32 1
64B	  %0.sub1:sgpr_64 = S_MOV_B32 2
80B	  %1:sgpr_32 = COPY %0.sub0:sgpr_64
96B	  S_BRANCH %bb.3

112B	bb.2:
	; predecessors: %bb.0
	  successors: %bb.3(0x80000000); %bb.3(100.00%)

128B	  %1:sgpr_32 = IMPLICIT_DEF
144B	  undef %0.sub0:sgpr_64 = IMPLICIT_DEF
160B	  %0.sub1:sgpr_64 = IMPLICIT_DEF

176B	bb.3:
	; predecessors: %bb.1, %bb.2

192B	  S_NOP 0, implicit %1:sgpr_32
208B	  S_NOP 0, implicit %0:sgpr_64

# End machine code for function coalescing_makes_lane_undefined.

********** SIMPLE REGISTER COALESCING **********
********** Function: coalescing_makes_lane_undefined
********** JOINING INTERVALS ***********
:
:
80B	%1:sgpr_32 = COPY %0.sub0:sgpr_64
	Considering merging to SGPR_64 with %1 in %0:sub0
		RHS = %1 	[80r,112B:1) 1@80r
				[128r,176B:0) 0@128r
				[176B,192r:2) 2@176B-phi weight:0.000000e+00

		LHS = %0	[48r,64r:1) 1@48r
				[64r,112B:3) 3@64r
				[144r,160r:0) 0@144r
				[160r,176B:2) 2@160r
				[176B,208r:4) 4@176B-phi 

			L0000000000000003 
				[48r,112B:1) 1@48r
				[144r,176B:0) 0@144r
				[176B,208r:2) 2@176B-phi 

			L000000000000000C 
				[64r,112B:1) 1@64r
				[160r,176B:0) 0@160r
				[176B,208r:2) 2@176B-phi weight:0.000000e+00

		merge %0:0@144r into %1:0@128r --> @128r
		merge %1:1@80r into %0:3@64r --> @64r
		merge %1:2@176B into %0:4@176B --> @176B

		RHSVals %1:sub0: 
			0@128r     Write:0000000000000003 Valid:0000000000000000 Keep ImpDef Pruned -> 0@128r, 
			1@80r      Write:0000000000000003 Valid:0000000000000003 Erase Other:3@64r -> 3@64r, 
			2@176B-phi Write:0000000000000003 Valid:0000000000000003 Merge Other:4@176B-phi -> 4@176B-phi

		LHSVals %0: 
			0@144r     Write:0000000000000003 Valid:0000000000000003 Erase Other:0@128r ImpDef -> 0@128r, 
			1@48r      Write:0000000000000003 Valid:0000000000000003 Keep -> 1@48r, 
			2@160r     Write:000000000000000C Valid:000000000000000F Replace Other:0@128r Redef:0@144r ImpDef -> 2@160r, 
			3@64r      Write:000000000000000C Valid:000000000000000F Keep Redef:1@48r -> 3@64r, 
			4@176B-phi Write:FFFFFFFFFFFFFFFF Valid:FFFFFFFFFFFFFFFF Keep Other:2@176B-phi -> 4@176B-phi

		LHST = %0 %0 [48r,64r:1)[64r,112B:3)[144r,160r:0)[160r,176B:2)[176B,208r:4)  0@144r 1@48r 2@160r 3@64r 4@176B-phi L0000000000000003 [48r,112B:1)[144r,176B:0)[176B,208r:2)  0@144r 1@48r 2@176B-phi L000000000000000C [64r,112B:1)[160r,176B:0)[176B,208r:2)  0@160r 1@64r 2@176B-phi weight:0.000000e+00

		merge %0:0@144r into %1:0@128r --> @128r
		merge %1:1@80r into %0:1@48r --> @48r
		merge %1:2@176B into %0:2@176B --> @176B

		RHSVals %1:sub0:0000000000000003: 
			0@128r     Write:0000000000000001 Valid:0000000000000000 Keep ImpDef -> 0@128r, 
			1@80r      Write:0000000000000001 Valid:0000000000000001 Erase Other:1@48r -> 1@48r, 
			2@176B-phi Write:0000000000000001 Valid:0000000000000001 Merge Other:2@176B-phi -> 2@176B-phi

		LHSVals %0:0000000000000003: 
			0@144r     Write:0000000000000001 Valid:0000000000000000 Erase Other:0@128r ImpDef -> 0@128r, 
			1@48r      Write:0000000000000001 Valid:0000000000000001 Keep -> 1@48r, 
			2@176B-phi Write:0000000000000001 Valid:0000000000000001 Keep Other:2@176B-phi -> 2@176B-phi

		joined lanes: 0000000000000003 
			[48r,112B:1) 1@48r
			[128r,176B:0) 0@128r  
			[176B,208r:2) 2@176B-phi

	Joined SubRanges %0 [48r,64r:1)[64r,112B:3)[144r,160r:0)[160r,176B:2)[176B,208r:4)  0@144r 1@48r 2@160r 3@64r 4@176B-phi L0000000000000003 [48r,112B:1)[128r,176B:0)[176B,208r:2)  0@128r 1@48r 2@176B-phi L000000000000000C [64r,112B:1)[160r,176B:0)[176B,208r:2)  0@160r 1@64r 2@176B-phi weight:0.000000e+00
		Expecting instruction removal at 144r
		Expecting instruction removal at 128r
		Prune sublane 0000000000000003 at 128r
		Expecting instruction removal at 80r
		pruned all of %0 at 144r: [48r,64r:1)[64r,112B:3)[160r,176B:2)[176B,208r:4)  0@144r 1@48r 2@160r 3@64r 4@176B-phi
		pruned %1 at 160r: [80r,112B:1)[128r,160r:0)[176B,192r:2)  0@128r 1@80r 2@176B-phi
		erased:	144r	undef %0.sub0:sgpr_64 = IMPLICIT_DEF
		removed 0@128r: [80r,112B:1)[176B,192r:2)  0@x 1@80r 2@176B-phi
		erased:	128r	%1:sgpr_32 = IMPLICIT_DEF
		erased:	80r	%1:sgpr_32 = COPY %0.sub0:sgpr_64
		restoring liveness to 2 points: 160r,176B:  %0 [48r,64r:1)[64r,112B:3)[160r,176B:2)[176B,208r:4)  0@x 1@48r 2@160r 3@64r 4@176B-phi L0000000000000003 [48r,112B:1)[176B,208r:2)  0@x 1@48r 2@176B-phi L000000000000000C [64r,112B:1)[160r,176B:0)[176B,208r:2)  0@160r 1@64r 2@176B-phi weight:0.000000e+00

# Machine code for function coalescing_makes_lane_undefined: NoPHIs, TracksLiveness

bb.0:
  successors: %bb.1(0x40000000), %bb.2(0x40000000); %bb.1(50.00%), %bb.2(50.00%)

  S_CBRANCH_SCC0 %bb.2, implicit undef $scc

bb.1:
; predecessors: %bb.0
  successors: %bb.3(0x80000000); %bb.3(100.00%)

  undef %0.sub0:sgpr_64 = S_MOV_B32 1
  %0.sub1:sgpr_64 = S_MOV_B32 2
  S_BRANCH %bb.3

bb.2:
; predecessors: %bb.0
  successors: %bb.3(0x80000000); %bb.3(100.00%)

  %0.sub1:sgpr_64 = IMPLICIT_DEF

bb.3:
; predecessors: %bb.1, %bb.2

  S_NOP 0, implicit %1:sgpr_32
  S_NOP 0, implicit %0:sgpr_64

# End machine code for function coalescing_makes_lane_undefined.

*** Bad machine code: Reading virtual register without a def ***
- function:    coalescing_makes_lane_undefined
- basic block: %bb.3  (0x66758e8)
- instruction: S_NOP 0, implicit %1:sgpr_32
- operand 1:   implicit %1:sgpr_32

*** Bad machine code: Virtual register defs don't dominate all uses. ***
- function:    coalescing_makes_lane_undefined
- v. register: %0
LLVM ERROR: Found 2 machine code errors.
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
Stack dump:
0.	Program arguments: C:\work\git\llvm-project\build\Debug\bin\llc.exe -debug-only=regalloc -march=amdgcn -mcpu=gfx803 -run-pass simple-register-coalescing -verify-machineinstrs coalescing_makes_lanes_undef.mir 
1.	Running pass 'Function Pass Manager' on module 'coalescing_makes_lanes_undef.mir'.
2.	Running pass 'Simple Register Coalescing' on function '@coalescing_makes_lane_undefined'

It erases 144 and 128 leaving 160. This happens because 160r replaces 128r, 128r is marked pruned due to the replace and since 128r is impdef it is erased.

I think its sufficient to erase IMPLICIT_DEF on any other incoming other value - in any case the reg would be initialized, no matter of subregs involved.

With the patch the dump looks like:

0B	bb.0:
	  successors: %bb.1(0x40000000), %bb.2(0x40000000); %bb.1(50.00%), %bb.2(50.00%)

16B	  S_CBRANCH_SCC0 %bb.2, implicit undef $scc

32B	bb.1:
	; predecessors: %bb.0
	  successors: %bb.3(0x80000000); %bb.3(100.00%)

48B	  undef %0.sub0:sgpr_64 = S_MOV_B32 1
64B	  %0.sub1:sgpr_64 = S_MOV_B32 2
80B	  %1:sgpr_32 = COPY %0.sub0:sgpr_64
96B	  S_BRANCH %bb.3

112B	bb.2:
	; predecessors: %bb.0
	  successors: %bb.3(0x80000000); %bb.3(100.00%)

128B	  %1:sgpr_32 = IMPLICIT_DEF
144B	  undef %0.sub0:sgpr_64 = IMPLICIT_DEF
160B	  %0.sub1:sgpr_64 = IMPLICIT_DEF

176B	bb.3:
	; predecessors: %bb.1, %bb.2

192B	  S_NOP 0, implicit %1:sgpr_32
208B	  S_NOP 0, implicit %0:sgpr_64

# End machine code for function coalescing_makes_lane_undefined.

********** SIMPLE REGISTER COALESCING **********
********** Function: coalescing_makes_lane_undefined
********** JOINING INTERVALS ***********
:
:
80B	%1:sgpr_32 = COPY %0.sub0:sgpr_64
	Considering merging to SGPR_64 with %1 in %0:sub0
		RHS = %1 [80r,112B:1)[128r,176B:0)[176B,192r:2)  0@128r 1@80r 2@176B-phi weight:0.000000e+00
		LHS = %0 [48r,64r:1)[64r,112B:3)[144r,160r:0)[160r,176B:2)[176B,208r:4)  0@144r 1@48r 2@160r 3@64r 4@176B-phi L0000000000000003 [48r,112B:1)[144r,176B:0)[176B,208r:2)  0@144r 1@48r 2@176B-phi L000000000000000C [64r,112B:1)[160r,176B:0)[176B,208r:2)  0@160r 1@64r 2@176B-phi weight:0.000000e+00
		merge %0:0@144r into %1:0@128r --> @128r
		merge %0:2@160r into %1:0@128r --> @128r
		merge %1:1@80r into %0:3@64r --> @64r
		merge %1:2@176B into %0:4@176B --> @176B
		RHSVals %1:sub0: 
			0@128r Write:0000000000000003 Valid:0000000000000000 Keep ImpDef -> 0@128r, 
			1@80r Write:0000000000000003 Valid:0000000000000003 Erase Other:3@64r -> 3@64r, 
			2@176B-phi Write:0000000000000003 Valid:0000000000000003 Merge Other:4@176B-phi -> 4@176B-phi
		LHSVals %0: 
			0@144r Write:0000000000000003 Valid:0000000000000003 Erase Other:0@128r ImpDef -> 0@128r, 
			1@48r Write:0000000000000003 Valid:0000000000000003 Keep -> 1@48r, 
			2@160r Write:000000000000000C Valid:000000000000000F Erase Other:0@128r Redef:0@144r ImpDef -> 0@128r, 
			3@64r Write:000000000000000C Valid:000000000000000F Keep Redef:1@48r -> 3@64r, 
			4@176B-phi Write:FFFFFFFFFFFFFFFF Valid:FFFFFFFFFFFFFFFF Keep Other:2@176B-phi -> 4@176B-phi

		LHST = %0 %0 [48r,64r:1)[64r,112B:3)[144r,160r:0)[160r,176B:2)[176B,208r:4)  0@144r 1@48r 2@160r 3@64r 4@176B-phi L0000000000000003 [48r,112B:1)[144r,176B:0)[176B,208r:2)  0@144r 1@48r 2@176B-phi L000000000000000C [64r,112B:1)[160r,176B:0)[176B,208r:2)  0@160r 1@64r 2@176B-phi weight:0.000000e+00
		merge %0:0@144r into %1:0@128r --> @128r
		merge %1:1@80r into %0:1@48r --> @48r
		merge %1:2@176B into %0:2@176B --> @176B
		RHSVals %1:sub0:0000000000000003: 
                        0@128r Write:0000000000000001 Valid:0000000000000000 Keep ImpDef -> 0@128r, 
                        1@80r Write:0000000000000001 Valid:0000000000000001 Erase Other:1@48r -> 1@48r,
                        2@176B-phi Write:0000000000000001 Valid:0000000000000001 Merge Other:2@176B-phi -> 2@176B-phi
		LHSVals %0:0000000000000003: 
                        0@144r Write:0000000000000001 Valid:0000000000000000 Erase Other:0@128r ImpDef -> 0@128r,
                        1@48r Write:0000000000000001 Valid:0000000000000001 Keep -> 1@48r,
                        2@176B-phi Write:0000000000000001 Valid:0000000000000001 Keep Other:2@176B-phi -> 2@176B-phi
		joined lanes: 0000000000000003 [48r,112B:1)[128r,176B:0)[176B,208r:2)  0@128r 1@48r 2@176B-phi
	Joined SubRanges %0 [48r,64r:1)[64r,112B:3)[144r,160r:0)[160r,176B:2)[176B,208r:4)  0@144r 1@48r 2@160r 3@64r 4@176B-phi L0000000000000003 [48r,112B:1)[128r,176B:0)[176B,208r:2)  0@128r 1@48r 2@176B-phi L000000000000000C [64r,112B:1)[160r,176B:0)[176B,208r:2)  0@160r 1@64r 2@176B-phi weight:0.000000e+00
		Expecting instruction removal at 144r
		Expecting instruction removal at 160r
		Prune sublane 000000000000000C at 160r
		Expecting instruction removal at 80r
		erased:	144r	undef %0.sub0:sgpr_64 = IMPLICIT_DEF
		erased:	160r	%0.sub1:sgpr_64 = IMPLICIT_DEF
		erased:	80r	%1:sgpr_32 = COPY %0.sub0:sgpr_64
AllocationOrder(SGPR_64) = [ $sgpr0_sgpr1 $sgpr2_sgpr3 $sgpr4_sgpr5 $sgpr6_sgpr7 $sgpr8_sgpr9 $sgpr10_sgpr11 $sgpr12_sgpr13 $sgpr14_sgpr15 $sgpr16_sgpr17 $sgpr18_sgpr19 $sgpr20_sgpr21 $sgpr22_sgpr23 $sgpr24_sgpr25 $sgpr26_sgpr27 $sgpr28_sgpr29 $sgpr30_sgpr31 $sgpr32_sgpr33 $sgpr34_sgpr35 $sgpr36_sgpr37 $sgpr38_sgpr39 $sgpr40_sgpr41 $sgpr42_sgpr43 $sgpr44_sgpr45 $sgpr46_sgpr47 $sgpr48_sgpr49 $sgpr50_sgpr51 $sgpr52_sgpr53 $sgpr54_sgpr55 $sgpr56_sgpr57 $sgpr58_sgpr59 $sgpr60_sgpr61 $sgpr62_sgpr63 $sgpr64_sgpr65 $sgpr66_sgpr67 $sgpr68_sgpr69 $sgpr70_sgpr71 $sgpr72_sgpr73 $sgpr74_sgpr75 $sgpr76_sgpr77 $sgpr78_sgpr79 $sgpr80_sgpr81 $sgpr82_sgpr83 $sgpr84_sgpr85 $sgpr86_sgpr87 $sgpr88_sgpr89 $sgpr90_sgpr91 $sgpr92_sgpr93 $sgpr94_sgpr95 $sgpr96_sgpr97 $sgpr98_sgpr99 $sgpr100_sgpr101 ]
		updated: 128B	undef %0.sub0:sgpr_64 = IMPLICIT_DEF
		updated: 192B	S_NOP 0, implicit %0.sub0:sgpr_64
	Success: %1:sub0 -> %0
	Result = %0 [48r,64r:1)[64r,112B:2)[128r,176B:0)[176B,208r:3)  0@128r 1@48r 2@64r 3@176B-phi L0000000000000003 [48r,112B:1)[128r,176B:0)[176B,208r:2)  0@128r 1@48r 2@176B-phi L000000000000000C [64r,112B:1)[176B,208r:2)  0@x 1@64r 2@176B-phi weight:0.000000e+00
:
:
Trying to inflate 0 regs.
********** INTERVALS **********
%0 [48r,64r:1)[64r,112B:2)[128r,176B:0)[176B,208r:3)  0@128r 1@48r 2@64r 3@176B-phi L0000000000000003 [48r,112B:1)[128r,176B:0)[176B,208r:2)  0@128r 1@48r 2@176B-phi L000000000000000C [64r,112B:1)[176B,208r:2)  0@x 1@64r 2@176B-phi weight:0.000000e+00
RegMasks:
********** MACHINEINSTRS **********
# Machine code for function coalescing_makes_lane_undefined: NoPHIs, TracksLiveness

0B	bb.0:
	  successors: %bb.1(0x40000000), %bb.2(0x40000000); %bb.1(50.00%), %bb.2(50.00%)

16B	  S_CBRANCH_SCC0 %bb.2, implicit undef $scc

32B	bb.1:
	; predecessors: %bb.0
	  successors: %bb.3(0x80000000); %bb.3(100.00%)

48B	  undef %0.sub0:sgpr_64 = S_MOV_B32 1
64B	  %0.sub1:sgpr_64 = S_MOV_B32 2
96B	  S_BRANCH %bb.3

112B	bb.2:
	; predecessors: %bb.0
	  successors: %bb.3(0x80000000); %bb.3(100.00%)

128B	  undef %0.sub0:sgpr_64 = IMPLICIT_DEF

176B	bb.3:
	; predecessors: %bb.1, %bb.2

192B	  S_NOP 0, implicit %0.sub0:sgpr_64
208B	  S_NOP 0, implicit %0:sgpr_64

Sent JoinVals dumper used in this dump for review: https://reviews.llvm.org/D82580

Diff Detail

Event Timeline

vpykhtin created this revision.Jun 20 2020, 7:15 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 20 2020, 7:15 AM
vpykhtin edited the summary of this revision. (Show Details)Jun 26 2020, 1:14 AM

It would help if 9d7bc0874cf20f44cd331c77f5a003b4c4b262bd had a test...

I'm somewhat suspicious, since I have seen cases where the implicit def is needed in cases where the full register is never fully covered. However, it's possible this is still a leftover from before DetectDeadLanes was added

vpykhtin added a comment.EditedJul 14 2020, 10:28 PM

It would help if 9d7bc0874cf20f44cd331c77f5a003b4c4b262bd had a test...

I'm somewhat suspicious, since I have seen cases where the implicit def is needed in cases where the full register is never fully covered. However, it's possible this is still a leftover from before DetectDeadLanes was added

Do you mean there might be a situation when some lane becomes uninitialized after RenameIndpendentSubregs? I checked RenameIndpendentSubregs pass - it has dedicated code in computeMainRangesFixFlags (line 310 ..) that creates IMPLICIT_DEF initialization for indpendent lanes.

It would help if 9d7bc0874cf20f44cd331c77f5a003b4c4b262bd had a test...

I'm somewhat suspicious, since I have seen cases where the implicit def is needed in cases where the full register is never fully covered. However, it's possible this is still a leftover from before DetectDeadLanes was added

Do you mean there might be a situation when some lane becomes uninitialized after RenameIndpendentSubregs? I checked RenameIndpendentSubregs pass - it has dedicated code in computeMainRangesFixFlags (line 310 ..) that creates IMPLICIT_DEF initialization for indpendent lanes.

The scheduler runs after RenameIndpendentSubregs, which makes it seem like anything can happen

Can you remove the commit message from the previous commit from this message? I found it confusing to read it here

llvm/test/CodeGen/AMDGPU/coalescing_makes_lanes_undef.mir
3

Should have comment explaining the test

9–10

Probably should generate full checks

Can you remove the commit message from the previous commit from this message? I found it confusing to read it here

Sorry Matt, I don't understand, where is the commit message from the previous commit?

Honestly, I have no idea one way or another. I'm leaning toward @arsenm's theory that the implicit def may be needed in case the full register is never fully covered.
But I would need to dig deeper into this.

vpykhtin edited the summary of this revision. (Show Details)Aug 31 2020, 3:20 AM
vpykhtin updated this revision to Diff 288917.Aug 31 2020, 3:48 AM

Rebased, updated per review comments.

Honestly, I have no idea one way or another. I'm leaning toward @arsenm's theory that the implicit def may be needed in case the full register is never fully covered.
But I would need to dig deeper into this.

I'm trying to imagine the situation when a subreg needs IMPLICIT_DEF when other not-intersecting subreg of the same whole register already contain some value. If RenameIndpendentSubregs would split the whole reg into separate registers the unitialized subreg will have its own IMPLICIT_DEF. Machine scheduler doens't touch IMPLICIT_DEFs but it erases undef flags before scheduling and reset it after.

vpykhtin marked 2 inline comments as done.Sep 14 2020, 8:04 AM

Hi,

I had a (slightly) closer look and I have a question:

I think its sufficient to erase IMPLICIT_DEF on any other incoming other value - in any case the reg would be initialized, no matter of subregs involved.

If we were to remove the implicit_def, wouldn't we get another verifier error of the sort "definition doesn't dominate all of its uses" because of the lack of any (implicit) definition on the 0 -> 2 -> 3 path for %0?

Basically, I am wondering if we set ErasableImplicitDef correctly for subranges.

Cheers,
-Quentin

HI Quentin,

Let A and B registers are about to be joined and at some point we met:

undef A.sub0 = <some value>
...
B.sub1 = IMPLICIT_DEF

Why would we need to keep this impdef?

Hi,

Why would we need to keep this impdef?

In your last example we wouldn't.

I was referring to the test case in your patch.
We have:

if (<cond>)
  undef A.sub0 = <some value>
else
  undef B.sub0 = IMPLICIT_DEF
C = phi(A, B)

I was wondering if we would get a machine verifier error if we were to eliminate the implicit def for B.

Now looking at the before and after of the dump of the test case in your patch, I see that yes, we remove an implicit def, but there is still another one after (and correct this time.)

Cheers,
-Quentin

vpykhtin added a comment.EditedSep 23 2020, 4:52 AM

Sorry, I had to explain the context around the modified code.

The IMPLICIT_DEF is erased only if there some "incoming" value from "other reg".

JoinVals::ConflictResolution
JoinVals::analyzeValue(unsigned ValNo, JoinVals &Other) {
...
Line 2612:
  // Find the value in Other that overlaps VNI->def, if any.
  LiveQueryResult OtherLRQ = Other.LR.Query(VNI->def);
...

Line 2648:
  // No simultaneous def. Is Other live at the def?
  V.OtherVNI = OtherLRQ.valueIn();
  if (!V.OtherVNI)
    // No overlap, no conflict.
    return CR_Keep;

Upon dealing with IMPLICIT_DEF this code checks if there was any other value from the joining counterpart, and if there isn't - we just keep the IMPLICIT_DEF.

In this case we have (%1 is about to be joined with %0.sub0):

112B	bb.2:
	; predecessors: %bb.0
	  successors: %bb.3(0x80000000); %bb.3(100.00%)

128B	  %1:sgpr_32 = IMPLICIT_DEF
144B	  undef %0.sub0:sgpr_64 = IMPLICIT_DEF
160B	  %0.sub1:sgpr_64 = IMPLICIT_DEF

RHSVals %1:sub0:  // this strange notation means %1 is going to be joined into sub0 of the joined full reg
			0@128r Write:0000000000000003 Valid:0000000000000000 Keep ImpDef -> 0@128r, 
			1@80r Write:0000000000000003 Valid:0000000000000003 Erase Other:3@64r -> 3@64r, 
			2@176B-phi Write:0000000000000003 Valid:0000000000000003 Merge Other:4@176B-phi -> 4@176B-phi
LHSVals %0: 
			0@144r Write:0000000000000003 Valid:0000000000000003 Erase Other:0@128r ImpDef -> 0@128r, 
			1@48r Write:0000000000000003 Valid:0000000000000003 Keep -> 1@48r, 
			2@160r Write:000000000000000C Valid:000000000000000F Erase Other:0@128r Redef:0@144r ImpDef -> 0@128r, 
			3@64r Write:000000000000000C Valid:000000000000000F Keep Redef:1@48r -> 3@64r, 
			4@176B-phi Write:FFFFFFFFFFFFFFFF Valid:FFFFFFFFFFFFFFFF Keep Other:2@176B-phi -> 4@176B-phi

128B %1:sgpr_32 = IMPLICIT_DEF
This is keeped, as there is no other "incoming" value:
RHS: 0@128r Write:0000000000000003 Valid:0000000000000000 Keep ImpDef -> 0@128r,

144B undef %0.sub0:sgpr_64 = IMPLICIT_DEF
This one is erased as %1 (joining counterpart) already have some value, that is the RHS:0@128r (IMPLICIT_DEF) from the instruction above:
LHS: 0@144r Write:0000000000000003 Valid:0000000000000003 Erase Other:0@128r ImpDef -> 0@128r,

160B %0.sub1:sgpr_64 = IMPLICIT_DEF
This is also erased becasue of the incoming RHS:0@128r:
LHS: 2@160r Write:000000000000000C Valid:000000000000000F Erase Other:0@128r Redef:0@144r ImpDef -> 0@128r,

So after the processing the only instruction left is:
128B %1:sgpr_32 = IMPLICIT_DEF

When the join is finalized %1 becomes %0.sub0 and the initialization gets <undef> flag as the first assignment to %0 on that path:

112B	bb.2:
	; predecessors: %bb.0
	  successors: %bb.3(0x80000000); %bb.3(100.00%)

128B	  undef %0.sub0:sgpr_64 = IMPLICIT_DEF
qcolombet accepted this revision.Sep 23 2020, 10:12 AM

The IMPLICIT_DEF is erased only if there some "incoming" value from "other reg".

That makes sense.

Thanks for the explanation.

LGTM and If that piece of code turns out to be mandatory, we'll find out soon enough and we would get a test case :).

This revision is now accepted and ready to land.Sep 23 2020, 10:12 AM

If that piece of code turns out to be mandatory, we'll find out soon enough and we would get a test case :).

Thank you Quentin, lets try :)