Details

Reviewers

paquette
arsenm
foad
Petar.Avramovic

Commits

rG7091a7f781c9: [GlobalISel][Legalizer] Don't use eraseFromParentAndMarkDBGValuesForRemoval()…

Summary

For artifacts excluding G_TRUNC/G_SEXT, which have IR counterparts, we don't
seem to have debug users of defs. However, in the legalizer we're always calling
MachineInstr::eraseFromParentAndMarkDBGValuesForRemoval() which is expensive.
In some rare cases, this contributes significantly to unreasonably long compile
times when we have lots of artifact combiner activity.

To verify this, I added asserts to that function when it actually replaced a debug
use operand with undef for these artifacts. On CTMark with both -O0 and -Os and
debug info enabled, I didn't see a single case where it triggered.

In my measurements I saw around a 0.5% geomean compile-time improvement on -g -O0
for AArch64 with this change.
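
As a rough illustration of the trade-off described above, the toy model below contrasts a plain erase with an erase that first rewrites debug users of the def. `ToyInstr` and `ToyFunction` are hypothetical stand-ins, not LLVM classes, and the linear scan with its `OperandsScanned` counter is only a crude proxy for the extra bookkeeping; it is not how MachineInstr::eraseFromParentAndMarkDBGValuesForRemoval() is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Toy model only: hypothetical stand-ins, not LLVM's MachineInstr/MachineFunction.
struct ToyInstr {
  bool IsDebugValue = false;
  std::optional<unsigned> Def;                // virtual register defined, if any
  std::vector<std::optional<unsigned>> Uses;  // std::nullopt models $noreg
  bool Erased = false;
};

struct ToyFunction {
  std::vector<ToyInstr> Instrs;
  std::size_t OperandsScanned = 0;  // crude proxy for the extra work

  // Cheap path: unlink the instruction and nothing else.
  void erase(std::size_t Idx) {
    assert(Idx < Instrs.size() && "index out of range");
    Instrs[Idx].Erased = true;
  }

  // Costly path: first set every debug use of the def to undef, then erase.
  void eraseAndMarkDebugUsers(std::size_t Idx) {
    assert(Idx < Instrs.size() && "index out of range");
    const std::optional<unsigned> Def = Instrs[Idx].Def;
    if (Def) {
      for (ToyInstr &I : Instrs) {
        if (I.Erased)
          continue;
        for (std::optional<unsigned> &Use : I.Uses) {
          ++OperandsScanned;
          if (I.IsDebugValue && Use == Def)
            Use.reset();
        }
      }
    }
    erase(Idx);
  }
};
```

In this model the plain `erase` leaves any debug use of the erased def untouched, which is harmless only if such uses never exist for the opcode in question.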

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

aemerson created this revision. Sep 14 2021, 2:45 AM

Herald added subscribers: kerbowa, hiraditya, kristof.beyls and 3 others. Sep 14 2021, 2:45 AM

aemerson requested review of this revision. Sep 14 2021, 2:45 AM

Herald added a subscriber: wdng. Sep 14 2021, 2:45 AM

Fix test.

Harbormaster completed remote builds in B123820: Diff 372446. Sep 14 2021, 3:27 AM

arsenm added inline comments. Sep 14 2021, 6:00 AM

llvm/lib/CodeGen/GlobalISel/Legalizer.cpp
175	(On Diff #372446)	Don't we already have an isArtifact check somewhere?

arsenm added inline comments. Sep 14 2021, 6:02 AM

llvm/lib/CodeGen/GlobalISel/Legalizer.cpp
175	(On Diff #372446)	Oh, I just reread and this is different. Can you add a comment justifying why these operations are skipped?

Maybe should add a verifier check that these never have dbg users

foad added a reviewer: Petar.Avramovic. Sep 14 2021, 6:45 AM

Petar.Avramovic added inline comments. Sep 14 2021, 7:55 AM

llvm/test/CodeGen/AMDGPU/GlobalISel/bug-legalization-artifact-combiner-dead-def.mir
87	This is now broken, dbg_use has no def. Maybe delete this test since it was artificial to begin with. Relevant tests are above.

In D109750#2999516, @arsenm wrote:

Maybe should add a verifier check that these never have dbg users

I don't think we should go that far. If we do have a debug user then not marking the use as undef AFAICT isn't strictly incorrect.

aemerson updated this revision to Diff 372817. Sep 15 2021, 3:23 PM

Add comment.

arsenm added inline comments. Sep 15 2021, 3:28 PM

llvm/lib/CodeGen/GlobalISel/Legalizer.cpp
185	(On Diff #372817)	Should still get a comment explaining why these operations

aemerson added inline comments. Sep 15 2021, 3:30 PM

llvm/lib/CodeGen/GlobalISel/Legalizer.cpp
185	(On Diff #372817)	I updated very shortly before your reply. Is this comment ok?

Harbormaster completed remote builds in B124102: Diff 372819. Sep 15 2021, 4:05 PM

Needs rebasing (maybe re-benchmarking too?) after D109154.

In D109750#3009146, @foad wrote:

Needs rebasing (maybe re-benchmarking too?) after D109154.

I think this is orthogonal to D109154 so don't need to rebenchmark.

Rebase

Harbormaster completed remote builds in B124768: Diff 373724. Sep 20 2021, 3:46 PM

paquette added inline comments. Sep 20 2021, 4:56 PM

llvm/lib/CodeGen/GlobalISel/Utils.cpp
1060	might as well pull the `return false;` into the default case?

aemerson updated this revision to Diff 373750. Sep 20 2021, 5:02 PM

arsenm accepted this revision. Sep 20 2021, 5:02 PM

arsenm added inline comments.

llvm/lib/CodeGen/GlobalISel/Utils.cpp
1060	I think the only combination that stops warnings on all compilers is return false in the default case with an llvm_unreachable after the switch
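
The combination suggested here can be sketched in isolation as follows. The `Opcode` enum and `unreachableAfterSwitch` helper are hypothetical stand-ins (for TargetOpcode and llvm_unreachable respectively) so that the snippet compiles without LLVM; it only illustrates the shape of the pattern, not the code that landed.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical stand-in for TargetOpcode; only the opcode names come from the patch.
enum class Opcode { G_MERGE_VALUES, G_UNMERGE_VALUES, G_ADD };

// Stand-in for llvm_unreachable: abort if control ever reaches this call.
[[noreturn]] inline void unreachableAfterSwitch() { std::abort(); }

bool shouldSkip(Opcode Op) {
  switch (Op) {
  case Opcode::G_MERGE_VALUES:
  case Opcode::G_UNMERGE_VALUES:
    return true;
  default:
    return false;  // every other opcode takes the default path
  }
  // Never executed; present so that compilers which do not see that all
  // paths above return still find a terminating statement.
  unreachableAfterSwitch();
}
```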

This revision is now accepted and ready to land. Sep 20 2021, 5:02 PM

Harbormaster completed remote builds in B124786: Diff 373750. Sep 20 2021, 5:32 PM

This revision was landed with ongoing or failed builds. Sep 20 2021, 11:34 PM

Closed by commit rG7091a7f781c9: [GlobalISel][Legalizer] Don't use eraseFromParentAndMarkDBGValuesForRemoval()… (authored by aemerson).

This revision was automatically updated to reflect the committed changes.

aemerson added a commit: rG7091a7f781c9: [GlobalISel][Legalizer] Don't use eraseFromParentAndMarkDBGValuesForRemoval()….

dsanders added a subscriber: dsanders. Oct 4 2021, 2:10 PM

dsanders added inline comments.

llvm/lib/CodeGen/GlobalISel/Utils.cpp
1054–1058	I don't think this assumption holds beyond the first time the legalizer processes each instruction and is dubious earlier than that if you have pre-legalization passes too. For the standard pipeline of passes, suppose you have an expansion in the legalizer that replaces some operation with a DBG_VALUE with something ending in G_MERGE_VALUES (or one of the others). On the next legalization step you have a G_MERGE_VALUES with a DBG_VALUE. That already breaks the assumption it doesn't have one but it takes a bit more to go wrong. If anything causes this G_MERGE_VALUES to be deleted without replacement you are left with a use-without-def as the DBG_VALUE is not erased. For example, if this is one lane of a bigger G_UNMERGE_VALUES/G_MERGE_VALUES pair and something proves the lane isn't needed it would erase all the instructions for the lane but leave the DBG_VALUE behind. I haven't dug into it myself but I'm told we've run into this scenario in our downstream target and we've had to revert this change.
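
The use-without-def situation described in this comment can be sketched with a small check. `ToyDbgValue` and `findDanglingDebugUses` are hypothetical names for illustration only; real code would inspect MachineInstr operands and virtual register definitions instead of plain integers.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical toy type: a debug value that reads one virtual register.
struct ToyDbgValue {
  unsigned UsedReg;
};

// Returns the registers that some debug value still reads although no
// remaining instruction defines them (the "use-without-def" case).
std::vector<unsigned>
findDanglingDebugUses(const std::set<unsigned> &LiveDefs,
                      const std::vector<ToyDbgValue> &DbgValues) {
  std::vector<unsigned> Dangling;
  for (const ToyDbgValue &DV : DbgValues)
    if (LiveDefs.count(DV.UsedReg) == 0)
      Dangling.push_back(DV.UsedReg);
  return Dangling;
}
```

Erasing a def from `LiveDefs` without touching the matching entry in `DbgValues` models the plain eraseFromParent() path; the check then reports that register as dangling.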

jackoalan added a subscriber: jackoalan. Oct 17 2021, 12:57 PM

jackoalan added inline comments.

llvm/lib/CodeGen/GlobalISel/Utils.cpp
1054–1058	I'd like to echo this claim. It is far too presumptuous to assume artifacts like `G_MERGE_VALUES` are never created pre-legalizer for all targets. If anything, this decision should be made per-target in `LegalizerInfo`.

jackoalan added inline comments. Oct 17 2021, 1:56 PM

llvm/lib/CodeGen/GlobalISel/Utils.cpp
1054–1058	I found a specific case where GlobalISel's IRTranslator emits `G_MERGE_VALUES`: Lowering a call with arguments that are wider than native machine types. I think `G_UNMERGE_VALUES`, `G_MERGE_VALUES`, `G_CONCAT_VECTORS`, `G_BUILD_VECTOR` should be excluded from this test because they are all reachable in CallLowering and therefore originate from IR.
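
One reading of this suggestion is a predicate narrowed to the two remaining opcodes, sketched below. `ToyOpcode` and `shouldSkipDbgValueForNarrowed` are hypothetical names so the snippet compiles on its own; this is not necessarily what the follow-up revision implements.

```cpp
#include <cassert>

// Hypothetical stand-in for TargetOpcode; the opcode names are from the patch.
enum class ToyOpcode {
  G_UNMERGE_VALUES,
  G_MERGE_VALUES,
  G_CONCAT_VECTORS,
  G_BUILD_VECTOR,
  G_EXTRACT,
  G_INSERT,
  G_ADD
};

// Narrowed predicate: the four opcodes named as reachable from CallLowering
// fall through to the default case, so they would go back to the path that
// marks debug users before erasing.
bool shouldSkipDbgValueForNarrowed(ToyOpcode Op) {
  switch (Op) {
  case ToyOpcode::G_EXTRACT:
  case ToyOpcode::G_INSERT:
    return true;
  default:
    return false;
  }
}
```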

jackoalan mentioned this in D111970: [GlobalISel][Legalizer] Restore eraseFromParentAndMarkDBGValuesForRemoval() for CallLowering artifacts. Oct 17 2021, 5:16 PM

jackoalan mentioned this in D112852: [GlobalISel] Allow DBG_VALUE to use undefined vregs before LiveDebugValues. Oct 29 2021, 3:53 PM

jackoalan mentioned this in rGf108c7f59dfa: [GlobalISel] Allow DBG_VALUE to use undefined vregs before LiveDebugValues. Dec 5 2021, 12:56 PM

Diff 373787

llvm/lib/CodeGen/GlobalISel/Utils.cpp

[lines elided]

 bool llvm::shouldOptForSize(const MachineBasicBlock &MBB,
                             ProfileSummaryInfo *PSI, BlockFrequencyInfo *BFI) {
   const auto &F = MBB.getParent()->getFunction();
   return F.hasOptSize() || F.hasMinSize() ||
          llvm::shouldOptimizeForSize(MBB.getBasicBlock(), PSI, BFI);
 }

+/// These artifacts generally don't have any debug users because they don't
+/// directly originate from IR instructions, but instead usually from
+/// legalization. Avoiding checking for debug users improves compile time.
+/// Note that truncates or extends aren't included because they have IR
+/// counterparts which can have debug users after translation.
+static bool shouldSkipDbgValueFor(MachineInstr &MI) {
+  switch (MI.getOpcode()) {
+  case TargetOpcode::G_UNMERGE_VALUES:
+  case TargetOpcode::G_MERGE_VALUES:
+  case TargetOpcode::G_CONCAT_VECTORS:
+  case TargetOpcode::G_BUILD_VECTOR:
+  case TargetOpcode::G_EXTRACT:
+  case TargetOpcode::G_INSERT:
+    return true;
+  default:
+    return false;
+  }
+}
+
 void llvm::saveUsesAndErase(MachineInstr &MI, MachineRegisterInfo &MRI,
                             LostDebugLocObserver *LocObserver,
                             SmallInstListTy &DeadInstChain) {
   for (MachineOperand &Op : MI.uses()) {
     if (Op.isReg() && Op.getReg().isVirtual())
       DeadInstChain.insert(MRI.getVRegDef(Op.getReg()));
   }
   LLVM_DEBUG(dbgs() << MI << "Is dead; erasing.\n");
   DeadInstChain.remove(&MI);
+  if (shouldSkipDbgValueFor(MI))
+    MI.eraseFromParent();
+  else
-  MI.eraseFromParentAndMarkDBGValuesForRemoval();
+    MI.eraseFromParentAndMarkDBGValuesForRemoval();
   if (LocObserver)
     LocObserver->checkpoint(false);
 }

 void llvm::eraseInstrs(ArrayRef<MachineInstr *> DeadInstrs,
                        MachineRegisterInfo &MRI,
                        LostDebugLocObserver *LocObserver) {
   SmallInstListTy DeadInstChain;

[lines elided]

llvm/test/CodeGen/AMDGPU/GlobalISel/bug-legalization-artifact-combiner-dead-def.mir

 # NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
 # RUN: llc -mtriple=amdgcn-mesa-mesa3d -mcpu=gfx1010 -O0 -run-pass=legalizer %s -o - | FileCheck %s --check-prefix=GFX10

 --- |

   define void @value_finder_bug() { ret void }
   define void @value_finder_bug_before_artifact_combine() { ret void }
-  define void @value_finder_bug_before_artifact_combine_dbg_use() { ret void }

   !0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1, producer: "llvm", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, enums: !2)
   !1 = !DIFile(filename: "bug-legalization-artifact-combiner-dead-def", directory: "/tmp")
   !2 = !{}
   !3 = !{i32 2, !"Dwarf Version", i32 4}
   !4 = !{i32 2, !"Debug Info Version", i32 3}
   !5 = distinct !DISubprogram(name: "value_finder_bug_before_artifact_combine_dbg_use", scope: !1, file: !1, line: 1, type: !6, isLocal: false, isDefinition: true, scopeLine: 1, flags: DIFlagPrototyped, isOptimized: false, unit: !0, retainedNodes: !2)
   !6 = !DISubroutineType(types: !2)

[lines elided]

   bb.0:
     %8:_(<2 x s32>) = G_BUILD_VECTOR %6(s32), %7(s32)
     %9:_(<2 x s32>) = G_INSERT %8, %5(s32), 32
     %deaf_def:_(s32), %11:_(s32) = G_UNMERGE_VALUES %9(<2 x s32>)
     G_STORE %6(s32), %0(p5) :: (store (s32), align 8, addrspace 5)
     %12:_(s32) = G_CONSTANT i32 4
     %13:_(p5) = G_PTR_ADD %0, %12(s32)
     G_STORE %11(s32), %13(p5) :: (store (s32) into unknown-address + 4, addrspace 5)

 ...
-
----
-name: value_finder_bug_before_artifact_combine_dbg_use
-body: |
-  bb.0:
-    liveins: $vgpr0, $vgpr1, $vgpr2
-
-    ; GFX10-LABEL: name: value_finder_bug_before_artifact_combine_dbg_use
-    ; GFX10: [[COPY:%[0-9]+]]:_(p5) = COPY $vgpr0
-    ; GFX10: [[COPY1:%[0-9]+]]:_(s32) = COPY $vgpr1
-    ; GFX10: [[COPY2:%[0-9]+]]:_(s32) = COPY $vgpr2
-    ; GFX10: [[MV:%[0-9]+]]:_(p4) = G_MERGE_VALUES [[COPY1]](s32), [[COPY2]](s32)
-    ; GFX10: [[LOAD:%[0-9]+]]:_(<4 x s32>) = G_LOAD [[MV]](p4) :: (load (<4 x s32>), align 4, addrspace 4)
-    ; GFX10: [[EXTRACT:%[0-9]+]]:_(s32) = G_EXTRACT [[LOAD]](<4 x s32>), 96
-    ; GFX10: [[EXTRACT1:%[0-9]+]]:_(s32) = G_EXTRACT [[LOAD]](<4 x s32>), 64
-    ; GFX10: DBG_VALUE $noreg, $noreg, {{.*}}, !DIExpression(), debug-location !DILocation(line: 1, column: 1
-    ; GFX10: G_STORE [[EXTRACT1]](s32), [[COPY]](p5) :: (store (s32), align 8, addrspace 5)
-    ; GFX10: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 4
-    ; GFX10: [[PTR_ADD:%[0-9]+]]:_(p5) = G_PTR_ADD [[COPY]], [[C]](s32)
-    ; GFX10: G_STORE [[EXTRACT]](s32), [[PTR_ADD]](p5) :: (store (s32) into unknown-address + 4, addrspace 5)
-    %0:_(p5) = COPY $vgpr0
-    %1:_(s32) = COPY $vgpr1
-    %2:_(s32) = COPY $vgpr2
-    %3:_(p4) = G_MERGE_VALUES %1(s32), %2(s32)
-    %4:_(<4 x s32>) = G_LOAD %3(p4) :: (load (<4 x s32>), align 4, addrspace 4)
-    %5:_(s32) = G_EXTRACT %4(<4 x s32>), 96
-    %6:_(s32) = G_EXTRACT %4(<4 x s32>), 64
-    %7:_(s32) = G_IMPLICIT_DEF
-    %8:_(<2 x s32>) = G_BUILD_VECTOR %6(s32), %7(s32)
-    %9:_(<2 x s32>) = G_INSERT %8, %5(s32), 32
-    %dbg_use:_(s32), %11:_(s32) = G_UNMERGE_VALUES %9(<2 x s32>)
-    DBG_VALUE %dbg_use(s32), $noreg, !7, !DIExpression(), debug-location !9
-    G_STORE %6(s32), %0(p5) :: (store (s32), align 8, addrspace 5)
-    %12:_(s32) = G_CONSTANT i32 4
-    %13:_(p5) = G_PTR_ADD %0, %12(s32)
-    G_STORE %11(s32), %13(p5) :: (store (s32) into unknown-address + 4, addrspace 5)
-...

This is an archive of the discontinued LLVM Phabricator instance.

[GlobalISel][Legalizer] Don't use eraseFromParentAndMarkDBGValuesForRemoval() for some artifacts.
ClosedPublic
