This is actually a revert of
commit 9d7bc0874cf20f44cd331c77f5a003b4c4b262bd
Author: Matthias Braun <matze@braunis.de>
Date: Thu Jan 8 00:21:23 2015 +0000

    RegisterCoalescer: Do not remove IMPLICIT_DEFS if they are required for subranges.

    The register coalescer used to remove implicit_defs when they are covered by
    the main range anyway. With subreg liveness tracking we can't do that anymore
    in places where the IMPLICIT_DEF is required as begin of a subregister
    liverange.

    llvm-svn: 225416

Without this patch the bb2 of the test looks like:
bb.2:
; predecessors: %bb.0
  successors: %bb.3(0x80000000); %bb.3(100.00%)
  %0.sub1:sgpr_64 = IMPLICIT_DEF

Since the remaining def has no undef flag, %0 is considered uninitialized in bb.2, which leads to a MIR verification failure. The debug dump (manually enhanced) shows what happened:
0B    bb.0:
        successors: %bb.1(0x40000000), %bb.2(0x40000000); %bb.1(50.00%), %bb.2(50.00%)
16B     S_CBRANCH_SCC0 %bb.2, implicit undef $scc
32B   bb.1:
      ; predecessors: %bb.0
        successors: %bb.3(0x80000000); %bb.3(100.00%)
48B     undef %0.sub0:sgpr_64 = S_MOV_B32 1
64B     %0.sub1:sgpr_64 = S_MOV_B32 2
80B     %1:sgpr_32 = COPY %0.sub0:sgpr_64
96B     S_BRANCH %bb.3
112B  bb.2:
      ; predecessors: %bb.0
        successors: %bb.3(0x80000000); %bb.3(100.00%)
128B    %1:sgpr_32 = IMPLICIT_DEF
144B    undef %0.sub0:sgpr_64 = IMPLICIT_DEF
160B    %0.sub1:sgpr_64 = IMPLICIT_DEF
176B  bb.3:
      ; predecessors: %bb.1, %bb.2
192B    S_NOP 0, implicit %1:sgpr_32
208B    S_NOP 0, implicit %0:sgpr_64
# End machine code for function coalescing_makes_lane_undefined.

********** SIMPLE REGISTER COALESCING **********
********** Function: coalescing_makes_lane_undefined
********** JOINING INTERVALS ***********
:
:
80B %1:sgpr_32 = COPY %0.sub0:sgpr_64
  Considering merging to SGPR_64 with %1 in %0:sub0
  RHS = %1 [80r,112B:1) 1@80r [128r,176B:0) 0@128r [176B,192r:2) 2@176B-phi weight:0.000000e+00
  LHS = %0 [48r,64r:1) 1@48r [64r,112B:3) 3@64r [144r,160r:0) 0@144r [160r,176B:2) 2@160r [176B,208r:4) 4@176B-phi
    L0000000000000003 [48r,112B:1) 1@48r [144r,176B:0) 0@144r [176B,208r:2) 2@176B-phi
    L000000000000000C [64r,112B:1) 1@64r [160r,176B:0) 0@160r [176B,208r:2) 2@176B-phi
    weight:0.000000e+00
  merge %0:0@144r into %1:0@128r --> @128r
  merge %1:1@80r into %0:3@64r --> @64r
  merge %1:2@176B into %0:4@176B --> @176B
  RHSVals %1:sub0:
    0@128r Write:0000000000000003 Valid:0000000000000000 Keep ImpDef Pruned -> 0@128r,
    1@80r Write:0000000000000003 Valid:0000000000000003 Erase Other:3@64r -> 3@64r,
    2@176B-phi Write:0000000000000003 Valid:0000000000000003 Merge Other:4@176B-phi -> 4@176B-phi
  LHSVals %0:
    0@144r Write:0000000000000003 Valid:0000000000000003 Erase Other:0@128r ImpDef -> 0@128r,
    1@48r Write:0000000000000003 Valid:0000000000000003 Keep -> 1@48r,
    2@160r Write:000000000000000C Valid:000000000000000F Replace Other:0@128r Redef:0@144r ImpDef -> 2@160r,
    3@64r Write:000000000000000C Valid:000000000000000F Keep Redef:1@48r -> 3@64r,
    4@176B-phi Write:FFFFFFFFFFFFFFFF Valid:FFFFFFFFFFFFFFFF Keep Other:2@176B-phi -> 4@176B-phi
  LHST = %0 %0 [48r,64r:1)[64r,112B:3)[144r,160r:0)[160r,176B:2)[176B,208r:4) 0@144r 1@48r 2@160r 3@64r 4@176B-phi
    L0000000000000003 [48r,112B:1)[144r,176B:0)[176B,208r:2) 0@144r 1@48r 2@176B-phi
    L000000000000000C [64r,112B:1)[160r,176B:0)[176B,208r:2) 0@160r 1@64r 2@176B-phi
    weight:0.000000e+00
  merge %0:0@144r into %1:0@128r --> @128r
  merge %1:1@80r into %0:1@48r --> @48r
  merge %1:2@176B into %0:2@176B --> @176B
  RHSVals %1:sub0:0000000000000003:
    0@128r Write:0000000000000001 Valid:0000000000000000 Keep ImpDef -> 0@128r,
    1@80r Write:0000000000000001 Valid:0000000000000001 Erase Other:1@48r -> 1@48r,
    2@176B-phi Write:0000000000000001 Valid:0000000000000001 Merge Other:2@176B-phi -> 2@176B-phi
  LHSVals %0:0000000000000003:
    0@144r Write:0000000000000001 Valid:0000000000000000 Erase Other:0@128r ImpDef -> 0@128r,
    1@48r Write:0000000000000001 Valid:0000000000000001 Keep -> 1@48r,
    2@176B-phi Write:0000000000000001 Valid:0000000000000001 Keep Other:2@176B-phi -> 2@176B-phi
  joined lanes: 0000000000000003 [48r,112B:1) 1@48r [128r,176B:0) 0@128r [176B,208r:2) 2@176B-phi
  Joined SubRanges %0 [48r,64r:1)[64r,112B:3)[144r,160r:0)[160r,176B:2)[176B,208r:4) 0@144r 1@48r 2@160r 3@64r 4@176B-phi
    L0000000000000003 [48r,112B:1)[128r,176B:0)[176B,208r:2) 0@128r 1@48r 2@176B-phi
    L000000000000000C [64r,112B:1)[160r,176B:0)[176B,208r:2) 0@160r 1@64r 2@176B-phi
    weight:0.000000e+00
  Expecting instruction removal at 144r
  Expecting instruction removal at 128r
  Prune sublane 0000000000000003 at 128r
  Expecting instruction removal at 80r
  pruned all of %0 at 144r: [48r,64r:1)[64r,112B:3)[160r,176B:2)[176B,208r:4) 0@144r 1@48r 2@160r 3@64r 4@176B-phi
  pruned %1 at 160r: [80r,112B:1)[128r,160r:0)[176B,192r:2) 0@128r 1@80r 2@176B-phi
  erased: 144r undef %0.sub0:sgpr_64 = IMPLICIT_DEF
  removed 0@128r: [80r,112B:1)[176B,192r:2) 0@x 1@80r 2@176B-phi
  erased: 128r %1:sgpr_32 = IMPLICIT_DEF
  erased: 80r %1:sgpr_32 = COPY %0.sub0:sgpr_64
  restoring liveness to 2 points: 160r,176B: %0 [48r,64r:1)[64r,112B:3)[160r,176B:2)[176B,208r:4) 0@x 1@48r 2@160r 3@64r 4@176B-phi
    L0000000000000003 [48r,112B:1)[176B,208r:2) 0@x 1@48r 2@176B-phi
    L000000000000000C [64r,112B:1)[160r,176B:0)[176B,208r:2) 0@160r 1@64r 2@176B-phi
    weight:0.000000e+00

# Machine code for function coalescing_makes_lane_undefined: NoPHIs, TracksLiveness
bb.0:
  successors: %bb.1(0x40000000), %bb.2(0x40000000); %bb.1(50.00%), %bb.2(50.00%)
  S_CBRANCH_SCC0 %bb.2, implicit undef $scc
bb.1:
; predecessors: %bb.0
  successors: %bb.3(0x80000000); %bb.3(100.00%)
  undef %0.sub0:sgpr_64 = S_MOV_B32 1
  %0.sub1:sgpr_64 = S_MOV_B32 2
  S_BRANCH %bb.3
bb.2:
; predecessors: %bb.0
  successors: %bb.3(0x80000000); %bb.3(100.00%)
  %0.sub1:sgpr_64 = IMPLICIT_DEF
bb.3:
; predecessors: %bb.1, %bb.2
  S_NOP 0, implicit %1:sgpr_32
  S_NOP 0, implicit %0:sgpr_64
# End machine code for function coalescing_makes_lane_undefined.

*** Bad machine code: Reading virtual register without a def ***
- function: coalescing_makes_lane_undefined
- basic block: %bb.3 (0x66758e8)
- instruction: S_NOP 0, implicit %1:sgpr_32
- operand 1: implicit %1:sgpr_32

*** Bad machine code: Virtual register defs don't dominate all uses. ***
- function: coalescing_makes_lane_undefined
- v. register: %0
LLVM ERROR: Found 2 machine code errors.
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
Stack dump:
0. Program arguments: C:\work\git\llvm-project\build\Debug\bin\llc.exe -debug-only=regalloc -march=amdgcn -mcpu=gfx803 -run-pass simple-register-coalescing -verify-machineinstrs coalescing_makes_lanes_undef.mir
1. Running pass 'Function Pass Manager' on module 'coalescing_makes_lanes_undef.mir'.
2. Running pass 'Simple Register Coalescing' on function '@coalescing_makes_lane_undefined'

The coalescer erases the instructions at 144r and 128r and leaves only 160r. This happens because 160r is resolved as Replace against 128r: the replace marks 128r as pruned, and since 128r is an IMPLICIT_DEF it is then erased.
I think it's sufficient to erase an IMPLICIT_DEF whenever there is any incoming value from the other register: in that case the register is initialized regardless of which subregisters are involved.
With the patch the dump looks like:
0B    bb.0:
        successors: %bb.1(0x40000000), %bb.2(0x40000000); %bb.1(50.00%), %bb.2(50.00%)
16B     S_CBRANCH_SCC0 %bb.2, implicit undef $scc
32B   bb.1:
      ; predecessors: %bb.0
        successors: %bb.3(0x80000000); %bb.3(100.00%)
48B     undef %0.sub0:sgpr_64 = S_MOV_B32 1
64B     %0.sub1:sgpr_64 = S_MOV_B32 2
80B     %1:sgpr_32 = COPY %0.sub0:sgpr_64
96B     S_BRANCH %bb.3
112B  bb.2:
      ; predecessors: %bb.0
        successors: %bb.3(0x80000000); %bb.3(100.00%)
128B    %1:sgpr_32 = IMPLICIT_DEF
144B    undef %0.sub0:sgpr_64 = IMPLICIT_DEF
160B    %0.sub1:sgpr_64 = IMPLICIT_DEF
176B  bb.3:
      ; predecessors: %bb.1, %bb.2
192B    S_NOP 0, implicit %1:sgpr_32
208B    S_NOP 0, implicit %0:sgpr_64
# End machine code for function coalescing_makes_lane_undefined.

********** SIMPLE REGISTER COALESCING **********
********** Function: coalescing_makes_lane_undefined
********** JOINING INTERVALS ***********
:
:
80B %1:sgpr_32 = COPY %0.sub0:sgpr_64
  Considering merging to SGPR_64 with %1 in %0:sub0
  RHS = %1 [80r,112B:1)[128r,176B:0)[176B,192r:2) 0@128r 1@80r 2@176B-phi weight:0.000000e+00
  LHS = %0 [48r,64r:1)[64r,112B:3)[144r,160r:0)[160r,176B:2)[176B,208r:4) 0@144r 1@48r 2@160r 3@64r 4@176B-phi
    L0000000000000003 [48r,112B:1)[144r,176B:0)[176B,208r:2) 0@144r 1@48r 2@176B-phi
    L000000000000000C [64r,112B:1)[160r,176B:0)[176B,208r:2) 0@160r 1@64r 2@176B-phi
    weight:0.000000e+00
  merge %0:0@144r into %1:0@128r --> @128r
  merge %0:2@160r into %1:0@128r --> @128r
  merge %1:1@80r into %0:3@64r --> @64r
  merge %1:2@176B into %0:4@176B --> @176B
  RHSVals %1:sub0:
    0@128r Write:0000000000000003 Valid:0000000000000000 Keep ImpDef -> 0@128r,
    1@80r Write:0000000000000003 Valid:0000000000000003 Erase Other:3@64r -> 3@64r,
    2@176B-phi Write:0000000000000003 Valid:0000000000000003 Merge Other:4@176B-phi -> 4@176B-phi
  LHSVals %0:
    0@144r Write:0000000000000003 Valid:0000000000000003 Erase Other:0@128r ImpDef -> 0@128r,
    1@48r Write:0000000000000003 Valid:0000000000000003 Keep -> 1@48r,
    2@160r Write:000000000000000C Valid:000000000000000F Erase Other:0@128r Redef:0@144r ImpDef -> 0@128r,
    3@64r Write:000000000000000C Valid:000000000000000F Keep Redef:1@48r -> 3@64r,
    4@176B-phi Write:FFFFFFFFFFFFFFFF Valid:FFFFFFFFFFFFFFFF Keep Other:2@176B-phi -> 4@176B-phi
  LHST = %0 %0 [48r,64r:1)[64r,112B:3)[144r,160r:0)[160r,176B:2)[176B,208r:4) 0@144r 1@48r 2@160r 3@64r 4@176B-phi
    L0000000000000003 [48r,112B:1)[144r,176B:0)[176B,208r:2) 0@144r 1@48r 2@176B-phi
    L000000000000000C [64r,112B:1)[160r,176B:0)[176B,208r:2) 0@160r 1@64r 2@176B-phi
    weight:0.000000e+00
  merge %0:0@144r into %1:0@128r --> @128r
  merge %1:1@80r into %0:1@48r --> @48r
  merge %1:2@176B into %0:2@176B --> @176B
  RHSVals %1:sub0:0000000000000003:
    0@128r Write:0000000000000001 Valid:0000000000000000 Keep ImpDef -> 0@128r,
    1@80r Write:0000000000000001 Valid:0000000000000001 Erase Other:1@48r -> 1@48r,
    2@176B-phi Write:0000000000000001 Valid:0000000000000001 Merge Other:2@176B-phi -> 2@176B-phi
  LHSVals %0:0000000000000003:
    0@144r Write:0000000000000001 Valid:0000000000000000 Erase Other:0@128r ImpDef -> 0@128r,
    1@48r Write:0000000000000001 Valid:0000000000000001 Keep -> 1@48r,
    2@176B-phi Write:0000000000000001 Valid:0000000000000001 Keep Other:2@176B-phi -> 2@176B-phi
  joined lanes: 0000000000000003 [48r,112B:1)[128r,176B:0)[176B,208r:2) 0@128r 1@48r 2@176B-phi
  Joined SubRanges %0 [48r,64r:1)[64r,112B:3)[144r,160r:0)[160r,176B:2)[176B,208r:4) 0@144r 1@48r 2@160r 3@64r 4@176B-phi
    L0000000000000003 [48r,112B:1)[128r,176B:0)[176B,208r:2) 0@128r 1@48r 2@176B-phi
    L000000000000000C [64r,112B:1)[160r,176B:0)[176B,208r:2) 0@160r 1@64r 2@176B-phi
    weight:0.000000e+00
  Expecting instruction removal at 144r
  Expecting instruction removal at 160r
  Prune sublane 000000000000000C at 160r
  Expecting instruction removal at 80r
  erased: 144r undef %0.sub0:sgpr_64 = IMPLICIT_DEF
  erased: 160r %0.sub1:sgpr_64 = IMPLICIT_DEF
  erased: 80r %1:sgpr_32 = COPY %0.sub0:sgpr_64
AllocationOrder(SGPR_64) = [ $sgpr0_sgpr1 $sgpr2_sgpr3 $sgpr4_sgpr5 $sgpr6_sgpr7 $sgpr8_sgpr9 $sgpr10_sgpr11
    $sgpr12_sgpr13 $sgpr14_sgpr15 $sgpr16_sgpr17 $sgpr18_sgpr19 $sgpr20_sgpr21 $sgpr22_sgpr23
    $sgpr24_sgpr25 $sgpr26_sgpr27 $sgpr28_sgpr29 $sgpr30_sgpr31 $sgpr32_sgpr33 $sgpr34_sgpr35
    $sgpr36_sgpr37 $sgpr38_sgpr39 $sgpr40_sgpr41 $sgpr42_sgpr43 $sgpr44_sgpr45 $sgpr46_sgpr47
    $sgpr48_sgpr49 $sgpr50_sgpr51 $sgpr52_sgpr53 $sgpr54_sgpr55 $sgpr56_sgpr57 $sgpr58_sgpr59
    $sgpr60_sgpr61 $sgpr62_sgpr63 $sgpr64_sgpr65 $sgpr66_sgpr67 $sgpr68_sgpr69 $sgpr70_sgpr71
    $sgpr72_sgpr73 $sgpr74_sgpr75 $sgpr76_sgpr77 $sgpr78_sgpr79 $sgpr80_sgpr81 $sgpr82_sgpr83
    $sgpr84_sgpr85 $sgpr86_sgpr87 $sgpr88_sgpr89 $sgpr90_sgpr91 $sgpr92_sgpr93 $sgpr94_sgpr95
    $sgpr96_sgpr97 $sgpr98_sgpr99 $sgpr100_sgpr101 ]
  updated: 128B undef %0.sub0:sgpr_64 = IMPLICIT_DEF
  updated: 192B S_NOP 0, implicit %0.sub0:sgpr_64
  Success: %1:sub0 -> %0
  Result = %0 [48r,64r:1)[64r,112B:2)[128r,176B:0)[176B,208r:3) 0@128r 1@48r 2@64r 3@176B-phi
    L0000000000000003 [48r,112B:1)[128r,176B:0)[176B,208r:2) 0@128r 1@48r 2@176B-phi
    L000000000000000C [64r,112B:1)[176B,208r:2) 0@x 1@64r 2@176B-phi
    weight:0.000000e+00
:
:
Trying to inflate 0 regs.
********** INTERVALS **********
%0 [48r,64r:1)[64r,112B:2)[128r,176B:0)[176B,208r:3) 0@128r 1@48r 2@64r 3@176B-phi
  L0000000000000003 [48r,112B:1)[128r,176B:0)[176B,208r:2) 0@128r 1@48r 2@176B-phi
  L000000000000000C [64r,112B:1)[176B,208r:2) 0@x 1@64r 2@176B-phi
  weight:0.000000e+00
RegMasks:
********** MACHINEINSTRS **********
# Machine code for function coalescing_makes_lane_undefined: NoPHIs, TracksLiveness

0B    bb.0:
        successors: %bb.1(0x40000000), %bb.2(0x40000000); %bb.1(50.00%), %bb.2(50.00%)
16B     S_CBRANCH_SCC0 %bb.2, implicit undef $scc
32B   bb.1:
      ; predecessors: %bb.0
        successors: %bb.3(0x80000000); %bb.3(100.00%)
48B     undef %0.sub0:sgpr_64 = S_MOV_B32 1
64B     %0.sub1:sgpr_64 = S_MOV_B32 2
96B     S_BRANCH %bb.3
112B  bb.2:
      ; predecessors: %bb.0
        successors: %bb.3(0x80000000); %bb.3(100.00%)
128B    undef %0.sub0:sgpr_64 = IMPLICIT_DEF
176B  bb.3:
      ; predecessors: %bb.1, %bb.2
192B    S_NOP 0, implicit %0.sub0:sgpr_64
208B    S_NOP 0, implicit %0:sgpr_64

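To make the difference between the two results for bb.2 concrete, here is a small toy model in Python. It is only a sketch of the rule at play (a subregister def without the undef flag also reads the remaining lanes of the register, so it needs an earlier def); it is not the MachineVerifier's actual logic, and the names `Def` and `uninitialized_reads` are made up for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Def:
    """One defining operand in a block (hypothetical helper, not an LLVM type)."""
    reg: str             # virtual register, e.g. "%0"
    subreg: str          # "" for a full-register def, otherwise e.g. "sub1"
    undef: bool = False  # the `undef` flag on a subregister def


def uninitialized_reads(block, live_in=frozenset()):
    """Return the defs that implicitly read a register with no earlier def.

    A partial def without `undef` is a read-modify-write of the whole
    register, so the register must be live-in or already defined.
    """
    defined = set(live_in)
    bad = []
    for d in block:
        if d.subreg and not d.undef and d.reg not in defined:
            bad.append(d)
        defined.add(d.reg)
    return bad


# bb.2 as left behind without the patch: only the sub1 def survives.
bb2_without_patch = [Def("%0", "sub1")]
# bb.2 with the patch: the surviving def carries the undef flag.
bb2_with_patch = [Def("%0", "sub0", undef=True)]
# bb.2 before coalescing, for comparison.
bb2_before = [Def("%1", ""), Def("%0", "sub0", undef=True), Def("%0", "sub1")]

for name, block in [("before", bb2_before),
                    ("without patch", bb2_without_patch),
                    ("with patch", bb2_with_patch)]:
    print(name, uninitialized_reads(block))
```

In this model only the block produced without the patch is flagged, which matches the "Virtual register defs don't dominate all uses" error reported for %0 in the first dump.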
The JoinVals dumper used in these dumps has been sent for review separately: https://reviews.llvm.org/D82580
The test should have a comment explaining what it checks.