Diff 292120

llvm/lib/CodeGen/RegisterCoalescer.cpp

  for (LiveInterval::SubRange &SR : IntB.subranges()) {
    for (unsigned I = 0; I != EndPoints.size(); ) {
      if (SlotIndex::isSameInstr(EndPoints[I], CopyIdx)) {
        EndPoints[I] = EndPoints.back();
        EndPoints.pop_back();
        continue;
      }
      ++I;
    }
-   LIS->extendToIndices(SR, EndPoints);
+   SmallVector<SlotIndex, 8> Undefs;
+   IntB.computeSubRangeUndefs(Undefs, SR.LaneMask, *MRI,
+                              *LIS->getSlotIndexes());
+   LIS->extendToIndices(SR, EndPoints, Undefs);
  }
  // If any dead defs were extended, truncate them.
  shrinkToUses(&IntB);

  // Finally, update the live-range of IntA.
  shrinkToUses(&IntA);
  return true;
}
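
The context loop in this hunk removes entries from EndPoints by overwriting the current element with the last one and popping the back, without advancing the index. A minimal standalone sketch of that idiom follows. It assumes a hypothetical Index struct in place of SlotIndex and a hypothetical helper name dropEndPointsAt; neither is part of LLVM.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for SlotIndex: an instruction number plus a slot.
struct Index {
  int Instr;
  int Slot;
};

// Plays the role of SlotIndex::isSameInstr in this sketch.
inline bool isSameInstr(Index A, Index B) { return A.Instr == B.Instr; }

// Remove every end point that refers to the same instruction as CopyIdx.
// Element order is not preserved: the last element is moved into the
// vacated position, and I is not advanced so the moved element is
// examined on the next iteration.
inline void dropEndPointsAt(std::vector<Index> &EndPoints, Index CopyIdx) {
  for (std::size_t I = 0; I != EndPoints.size();) {
    if (isSameInstr(EndPoints[I], CopyIdx)) {
      EndPoints[I] = EndPoints.back();
      EndPoints.pop_back();
      continue;
    }
    ++I;
  }
}
```

Each removal is constant time, which is acceptable here because the order of the end points does not matter to the loop shown in the hunk.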

llvm/test/CodeGen/AMDGPU/coalescer-removepartial-extend-undef-subrange.mir

This file was added.

				# RUN: llc -march=amdgcn -run-pass simple-register-coalescing -verify-machineinstrs -o - %s | FileCheck %s
				#
				arsenm (inline comment, Not Done): Add a -mcpu=gfx900 or some other target.
				# CHECK-LABEL: bb.4:
				# CHECK-NOT: COPY
				# CHECK-LABEL: bb.5:
				#
				arsenm (inline comment, Not Done): These checks are really weak; I would rather just generate these.
				# After some coalescing work, the copies in bb.4
				# change from:
				# %12:vreg_512 = COPY killed %244
				# %242:vreg_512 = COPY killed %12
				# to:
				# %90:vreg_512 = COPY %38:vreg_512
				#
				# The failure occurs when processing this COPY in removePartialRedundancy(). The
				# coalescer tries to prune and extend one of the subranges of %90 that is undef
				# from bb.0. findReachingDefs() searches all the way through one of the
				# predecessor chains, 4->27->1->0, to find the reaching def. Because we did not
				# provide the Undefs, it failed to find the reaching def in that predecessor
				# chain and asserted:
				# "Use of $noreg does not have a corresponding definition on every path
				arsenm (inline comment, Not Done): $noreg doesn't have liveness?
				ruiling (author, inline comment, Done): I am not sure what you are really asking here. I know the assert message is a little misleading. It prints $noreg because we pass 0 as the 'PhysReg' argument of LiveRangeCalc::extend() in LiveIntervals::extendToIndices(). When the assert fires, we are trying to extend a subrange of register %90. The offending subrange has a def point in one of bb.4's predecessor chains, so it is live on that path. In the other predecessor chain, 4->27->1->0, it has only a single undef point, in bb.0.
				# LLVM ERROR: Use not jointly dominated by defs"

				---
				name: _amdgpu_ps_main
				alignment: 1
				tracksRegLiveness: true
				body: |
				bb.0:
				successors: %bb.1, %bb.2
				arsenm (inline comment, Not Done): Can you reduce this any further? -run-pass=none will also compact the register numbers.
				ruiling (author, inline comment, Done): I tried removing, one at a time, most of the blocks that contain instructions in the MIR, and the crash no longer triggered after each removal. If you have a good idea for simplifying it further, I can try it. Some more notes: the test case is already much simplified from the original failing IR, which had 100+ basic blocks. I tried removing each basic block in the IR earlier, and that made the problem disappear. The problem is hard to trigger and also depends on the order of the coalescing work. For example, if I reorder the blocks, the problem may disappear because the registers are then coalesced differently.
				arsenm (inline comment, Not Done): I've reduced coalescer cases by dumping the MIR at intermediate points during the coalescer. Presumably not every step is relevant to get to the problem.
				liveins: $sgpr2, $sgpr3, $vgpr3

				%53:vgpr_32 = COPY killed $vgpr3
				%50:sgpr_32 = COPY killed $sgpr3
				%49:sgpr_32 = COPY killed $sgpr2
				%57:sreg_64 = S_GETPC_B64
				%58:sreg_64 = COPY killed %57
				%58.sub0:sreg_64 = COPY killed %49
				%59:sgpr_128 = S_LOAD_DWORDX4_IMM killed %58, 0, 0, 0
				%60:sgpr_32 = S_BUFFER_LOAD_DWORD_IMM killed %59, 1, 0, 0 :: (dereferenceable invariant load 4)
				%63:vgpr_32 = V_MOV_B32_e32 1092616192, implicit $exec
				%62:vgpr_32 = nnan nsz arcp contract afn reassoc nofpexcept V_MUL_F32_e32 %60, killed %63, implicit $mode, implicit $exec
				%183:vgpr_32 = V_MOV_B32_e32 1065353216, implicit $exec
				undef %182.sub0:vreg_512 = COPY killed %183
				%182.sub2:vreg_512 = COPY %62
				%182.sub3:vreg_512 = COPY %62
				%88:sreg_64 = nofpexcept V_CMP_GT_F32_e64 0, 1065353216, 0, %53, 0, implicit $mode, implicit $exec
				%89:sreg_64 = nofpexcept V_CMP_NGT_F32_e64 0, 1065353216, 0, %53, 0, implicit $mode, implicit $exec
				%5:sreg_64_xexec = nofpexcept V_CMP_LT_F32_e64 0, 1065353216, 0, killed %50, 0, implicit $mode, implicit $exec
				%92:sreg_64 = nofpexcept V_CMP_GT_F32_e64 0, 0, 0, %53, 0, implicit $mode, implicit $exec
				%93:sreg_64 = nofpexcept V_CMP_NGT_F32_e64 0, 0, 0, killed %53, 0, implicit $mode, implicit $exec
				%241:sreg_64 = COPY %88
				%242:vreg_512 = COPY %182
				%8:sreg_64 = COPY $exec, implicit-def $exec
				%262:sreg_64 = S_AND_B64 %8, %89, implicit-def dead $scc
				$exec = S_MOV_B64_term killed %262
				S_CBRANCH_EXECZ %bb.2, implicit $exec
				S_BRANCH %bb.1

				bb.1:
				%96:sreg_64 = nofpexcept V_CMP_NGT_F32_e64 0, 1065353216, 0, killed %60, 0, implicit $mode, implicit $exec
				%94:sreg_64 = S_MOV_B64 -1
				%98:sreg_64 = S_AND_B64 $exec, killed %96, implicit-def dead $scc
				$vcc = COPY killed %98
				S_CBRANCH_VCCNZ %bb.3, implicit killed $vcc
				S_BRANCH %bb.27

				bb.2:
				successors: %bb.13, %bb.17

				$exec = S_OR_B64 $exec, killed %8, implicit-def $scc
				%9:vreg_512 = COPY killed %242
				%10:sreg_64 = COPY killed %241
				%11:sreg_64 = COPY $exec, implicit-def $exec
				%263:sreg_64 = S_AND_B64 %11, killed %10, implicit-def dead $scc
				$exec = S_MOV_B64_term killed %263
				S_CBRANCH_EXECZ %bb.17, implicit $exec
				S_BRANCH %bb.13

				bb.3:
				%99:sreg_64 = S_MOV_B64 0
				%245:sreg_64 = IMPLICIT_DEF
				%246:sreg_64 = IMPLICIT_DEF
				%247:sreg_64 = COPY killed %99
				%248:vreg_512 = COPY killed %182
				S_BRANCH %bb.5

				bb.4:
				%13:sreg_64 = COPY killed %243
				%215:sreg_64 = S_ANDN2_B64 killed %88, $exec, implicit-def dead $scc
				%216:sreg_64 = S_AND_B64 killed %13, $exec, implicit-def dead $scc
				%214:sreg_64 = S_OR_B64 killed %215, killed %216, implicit-def dead $scc
				%241:sreg_64 = COPY killed %214
				%12:vreg_512 = COPY killed %244
				%242:vreg_512 = COPY killed %12
				S_BRANCH %bb.2

				bb.5:
				successors: %bb.6, %bb.7

				%15:vreg_512 = COPY killed %248
				%14:sreg_64 = COPY killed %247
				%225:sreg_64 = COPY killed %246
				%238:sreg_64 = COPY killed %245
				%100:sreg_64 = S_MOV_B64 -1
				%101:vgpr_32 = V_CNDMASK_B32_e64 0, 0, 0, 1, %5, implicit $exec
				V_CMP_NE_U32_e32 1, killed %101, implicit-def $vcc, implicit $exec
				$vcc = S_AND_B64 $exec, killed $vcc, implicit-def dead $scc
				%249:sreg_64 = COPY %100
				%250:vreg_512 = COPY %15
				S_CBRANCH_VCCNZ %bb.7, implicit killed $vcc
				S_BRANCH %bb.6

				bb.6:
				successors: %bb.8, %bb.9

				%104:sreg_64 = S_MOV_B64 0
				%251:sreg_64 = COPY killed %104
				%252:vreg_512 = COPY %15
				%16:sreg_64 = COPY $exec, implicit-def $exec
				%264:sreg_64 = S_AND_B64 %16, %93, implicit-def dead $scc
				$exec = S_MOV_B64_term killed %264
				S_CBRANCH_EXECZ %bb.9, implicit $exec
				S_BRANCH %bb.8

				bb.7:
				successors: %bb.10, %bb.11

				%17:vreg_512 = COPY killed %250
				%18:sreg_64 = COPY killed %249
				%223:sreg_64 = S_OR_B64 killed %225, $exec, implicit-def dead $scc
				%253:sreg_64 = COPY killed %100
				%254:sreg_64 = COPY %223
				%19:sreg_64 = COPY $exec, implicit-def $exec
				%265:sreg_64 = S_AND_B64 %19, killed %18, implicit-def dead $scc
				$exec = S_MOV_B64_term killed %265
				S_CBRANCH_EXECZ %bb.11, implicit $exec
				S_BRANCH %bb.10

				bb.8:
				undef %175.sub0:vreg_512 = COPY %62
				%175.sub2:vreg_512 = COPY %15.sub2
				%175.sub3:vreg_512 = COPY %15.sub3
				%20:vreg_512 = COPY killed %175
				%220:sreg_64 = COPY $exec
				%251:sreg_64 = COPY killed %220
				%252:vreg_512 = COPY killed %20

				bb.9:
				$exec = S_OR_B64 $exec, killed %16, implicit-def $scc
				%21:vreg_512 = COPY killed %252
				%22:sreg_64 = COPY killed %251
				%249:sreg_64 = COPY killed %22
				%250:vreg_512 = COPY killed %21
				S_BRANCH %bb.7

				bb.10:
				successors: %bb.12, %bb.25

				%131:sreg_64 = S_MOV_B64 -1
				%261:sreg_64 = COPY killed %131
				%23:sreg_64 = COPY $exec, implicit-def $exec
				%266:sreg_64 = S_AND_B64 %23, %93, implicit-def dead $scc
				$exec = S_MOV_B64_term killed %266
				S_CBRANCH_EXECZ %bb.25, implicit $exec
				S_BRANCH %bb.12

				bb.11:
				successors: %bb.26(0x04000000), %bb.5(0x7c000000)

				$exec = S_OR_B64 $exec, killed %19, implicit-def $scc
				%24:sreg_64 = COPY killed %254
				%25:sreg_64 = COPY killed %253
				%271:sreg_64 = S_AND_B64 $exec, killed %25, implicit-def $scc
				%26:sreg_64 = S_OR_B64 %271, killed %14, implicit-def $scc
				%239:sreg_64 = S_ANDN2_B64 killed %238, $exec, implicit-def dead $scc
				%240:sreg_64 = S_AND_B64 %24, $exec, implicit-def dead $scc
				%44:sreg_64 = S_OR_B64 killed %239, killed %240, implicit-def dead $scc
				%245:sreg_64 = COPY %44
				%246:sreg_64 = COPY killed %24
				%247:sreg_64 = COPY %26
				%248:vreg_512 = COPY killed %17
				$exec = S_ANDN2_B64_term $exec, %26, implicit-def $scc
				S_CBRANCH_EXECNZ %bb.5, implicit $exec
				S_BRANCH %bb.26

				bb.12:
				%234:sreg_64 = S_XOR_B64 $exec, -1, implicit-def dead $scc
				%261:sreg_64 = COPY killed %234
				S_BRANCH %bb.25

				bb.13:
				successors: %bb.14, %bb.18

				%137:sreg_64 = S_MOV_B64 -1
				%255:sreg_64 = COPY killed %137
				%256:vreg_512 = COPY %9
				%27:sreg_64 = COPY $exec, implicit-def $exec
				%267:sreg_64 = S_AND_B64 %27, %89, implicit-def dead $scc
				$exec = S_MOV_B64_term killed %267
				S_CBRANCH_EXECZ %bb.18, implicit $exec
				S_BRANCH %bb.14

				bb.14:
				successors: %bb.19, %bb.20

				%139:sreg_64 = S_MOV_B64 0
				%257:sreg_64 = COPY killed %139
				%258:vreg_512 = COPY %9
				%28:sreg_64 = COPY $exec, implicit-def $exec
				%268:sreg_64 = S_AND_B64 %28, killed %93, implicit-def dead $scc
				$exec = S_MOV_B64_term killed %268
				S_CBRANCH_EXECZ %bb.20, implicit $exec
				S_BRANCH %bb.19

				bb.15:
				S_BRANCH %bb.23

				bb.16:

				bb.17:
				$exec = S_OR_B64 $exec, killed %11, implicit-def $scc
				%168:vgpr_32 = V_MOV_B32_e32 0, implicit $exec
				EXP_DONE 0, killed %168, undef %170:vgpr_32, undef %172:vgpr_32, undef %174:vgpr_32, -1, 0, 1, implicit $exec
				S_ENDPGM 0

				bb.18:
				successors: %bb.21, %bb.16

				$exec = S_OR_B64 $exec, killed %27, implicit-def $scc
				%30:vreg_512 = COPY killed %256
				%31:sreg_64 = COPY killed %255
				%32:sreg_64 = COPY $exec, implicit-def $exec
				%269:sreg_64 = S_AND_B64 %32, killed %31, implicit-def dead $scc
				$exec = S_MOV_B64_term killed %269
				S_CBRANCH_EXECZ %bb.16, implicit $exec
				S_BRANCH %bb.21

				bb.19:
				undef %203.sub0:vreg_512 = COPY %9.sub2
				%203.sub3:vreg_512 = COPY %9.sub3
				%33:vreg_512 = COPY killed %203
				%232:sreg_64 = COPY $exec
				%257:sreg_64 = COPY killed %232
				%258:vreg_512 = COPY killed %33

				bb.20:
				$exec = S_OR_B64 $exec, killed %28, implicit-def $scc
				%34:vreg_512 = COPY killed %258
				%35:sreg_64 = COPY killed %257
				%230:sreg_64 = S_ORN2_B64 killed %35, $exec, implicit-def dead $scc
				%255:sreg_64 = COPY killed %230
				%256:vreg_512 = COPY killed %34
				S_BRANCH %bb.18

				bb.21:
				successors: %bb.22, %bb.23

				%36:sreg_64 = COPY $exec, implicit-def $exec
				%270:sreg_64 = S_AND_B64 %36, killed %89, implicit-def dead $scc
				$exec = S_MOV_B64_term killed %270
				S_CBRANCH_EXECZ %bb.23, implicit $exec
				S_BRANCH %bb.22

				bb.22:
				%38:vgpr_32 = COPY %30.sub0
				%166:sreg_64 = S_MOV_B64 0
				%259:sreg_64 = COPY killed %166
				%260:vgpr_32 = COPY killed %38
				S_BRANCH %bb.24

				bb.23:
				S_BRANCH %bb.16

				bb.24:
				successors: %bb.15(0x04000000), %bb.24(0x7c000000)

				%39:sreg_64 = COPY killed %259
				%272:sreg_64 = S_AND_B64 $exec, %92, implicit-def $scc
				%41:sreg_64 = S_OR_B64 %272, killed %39, implicit-def $scc
				%40:vgpr_32 = COPY killed %260
				%42:vgpr_32 = V_CNDMASK_B32_e64 0, killed %40, 0, %30.sub3, %5, implicit $exec
				%259:sreg_64 = COPY %41
				%260:vgpr_32 = COPY killed %42
				$exec = S_ANDN2_B64_term $exec, %41, implicit-def $scc
				S_CBRANCH_EXECNZ %bb.24, implicit $exec
				S_BRANCH %bb.15

				bb.25:
				$exec = S_OR_B64 $exec, killed %23, implicit-def $scc
				%43:sreg_64 = COPY killed %261
				%227:sreg_64 = S_ANDN2_B64 killed %223, $exec, implicit-def dead $scc
				%224:sreg_64 = COPY killed %227
				%228:sreg_64 = S_ORN2_B64 killed %43, $exec, implicit-def dead $scc
				%253:sreg_64 = COPY killed %228
				%254:sreg_64 = COPY killed %224
				S_BRANCH %bb.11

				bb.26:
				$exec = S_OR_B64 $exec, killed %26, implicit-def $scc
				%243:sreg_64 = COPY killed %44
				%244:vreg_512 = COPY killed %15
				S_BRANCH %bb.4

				bb.27:
				%243:sreg_64 = COPY killed %94
				%244:vreg_512 = COPY killed %182
				S_BRANCH %bb.4

				...
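
The header comment of the test above describes a backward search from bb.4 that fails on the chain 4->27->1->0 because no undef points were supplied. The sketch below is a toy model of that idea only, not LLVM's LiveRangeCalc: it walks predecessors from a use block and checks that every path ends at a def, or at an undef point when undefs are honored. The names ToyCFG, Mark and everyPathResolved are hypothetical, and placing the def in bb.26 is an assumption made purely for illustration.

```cpp
#include <map>
#include <set>
#include <vector>

// Per-block marker for a single lane in this toy model.
enum class Mark { None, Def, Undef };

struct ToyCFG {
  std::map<int, std::vector<int>> Preds; // block -> predecessor blocks
  std::map<int, Mark> Marks;             // block -> marker (absent == None)
};

inline Mark markOf(const ToyCFG &G, int B) {
  auto It = G.Marks.find(B);
  return It == G.Marks.end() ? Mark::None : It->second;
}

// Returns true when every backward path from the entry of UseBlock stops at
// a block marked Def, or at a block marked Undef if HonorUndefs is set.
// Returns false when some path runs into a block without predecessors
// before finding such a marker.
inline bool everyPathResolved(const ToyCFG &G, int UseBlock,
                              bool HonorUndefs) {
  std::set<int> Seen;
  std::vector<int> Work{UseBlock};
  while (!Work.empty()) {
    int B = Work.back();
    Work.pop_back();
    auto PI = G.Preds.find(B);
    if (PI == G.Preds.end() || PI->second.empty())
      return false; // Reached the function entry without a def.
    for (int P : PI->second) {
      if (!Seen.insert(P).second)
        continue; // Already visited; also guards against loops.
      Mark M = markOf(G, P);
      if (M == Mark::Def || (M == Mark::Undef && HonorUndefs))
        continue; // This path is resolved.
      Work.push_back(P);
    }
  }
  return true;
}
```

With predecessors 4 <- {26, 27}, 27 <- {1}, 1 <- {0}, a def in bb.26 and an undef in bb.0, the search succeeds when undefs are honored and fails when they are ignored, which mirrors the difference between calling extendToIndices() with and without the Undefs argument.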

This is an archive of the discontinued LLVM Phabricator instance.

[RegisterCoalescer] pass Undefs to extendToIndices()
Closed, Public