
Details

Reviewers

aemerson
arsenm
aprantl
paquette
probinson
jmorse
dblaikie
qcolombet

Commits

rGf32cafedf053: [GlobalISel][DebugInfo] Propagate debug location for localized constants

Summary

After the IRTranslator pass, constants are deduplicated and translated into instructions in the entry block, losing their debug locations.
Localization of constants may cause extra line-0 entries to be emitted in the debug_line section, as in this example: https://godbolt.org/z/ecvsxxfKn. There, a constant gets placed as the first instruction in the entry block, and although it has no debug location, AsmPrinter emits a zero line for it (line 45).

If a localized constant has only one user, we can assume that it has the same debug location as that user, since the two are placed consecutively.
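The rule in the summary can be sketched outside of LLVM as follows. This is only an illustrative model: `Loc`, `Inst`, `lacksLine` and `propagateFromSingleUser` are hypothetical names, and a debug location is reduced to an optional line number rather than a real `DebugLoc`.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical stand-in for a debug location: only a source line number.
// std::nullopt models "no location"; 0 models an explicit line-0 location.
using Loc = std::optional<unsigned>;

// Hypothetical stand-in for a machine instruction.
struct Inst {
  Loc DL;
};

// True when the location carries no usable line (absent, or line 0).
inline bool lacksLine(const Loc &DL) { return !DL || *DL == 0; }

// If the localized constant has exactly one user and no usable line of its
// own, copy the user's location, provided that location has a non-zero line.
// Returns true when the constant's location was changed.
inline bool propagateFromSingleUser(Inst &Def,
                                    const std::vector<const Inst *> &Users) {
  if (Users.size() != 1)
    return false; // Several users: no single obvious location to borrow.
  const Loc &UserDL = Users.front()->DL;
  if (!lacksLine(Def.DL) || lacksLine(UserDL))
    return false; // Keep a real location; never copy an unusable one.
  Def.DL = UserDL;
  return true;
}
```

A constant without a location and a single user on line 17 ends up on line 17, while a constant with two users, or one that already has a non-zero line, is left untouched.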

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

dzhidzhoev created this revision. Jun 20 2022, 5:55 AM

Herald added a project: Restricted Project. Jun 20 2022, 5:55 AM

Herald added subscribers: hiraditya, rovka.

dzhidzhoev requested review of this revision. Jun 20 2022, 5:55 AM

Herald added a project: Restricted Project. Jun 20 2022, 5:55 AM

Herald added subscribers: llvm-commits, wdng.

dzhidzhoev edited the summary of this revision. Jun 20 2022, 5:56 AM

Harbormaster completed remote builds in B170838: Diff 438360. Jun 20 2022, 5:56 AM

dzhidzhoev edited the summary of this revision. Jun 20 2022, 5:56 AM

dzhidzhoev added a project: debug-info. Jun 20 2022, 6:11 AM

Ping

arsenm added inline comments. Jul 5 2022, 8:42 AM

llvm/lib/CodeGen/GlobalISel/Localizer.cpp
182	Isn't NewMI the same as MI?
191	demorgan this?
llvm/test/CodeGen/AArch64/GlobalISel/localizer-propagate-debug-loc.mir
14–37	Don't really need the function body if you remove the block names
70	Don't need registers section
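The "demorgan this?" remark presumably refers to the guard on the propagation. One possible reading, sketched here with hypothetical names (`Loc`, `hasLine`, `asWritten`, `deMorganed`) and a location reduced to an optional line number, is to fold the `!A || B` half of the condition into a single negated helper:

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-in for a debug location: std::nullopt means "no
// location", 0 means an explicit line-0 location.
using Loc = std::optional<unsigned>;

// Shape of the condition as written in the patch.
inline bool asWritten(const Loc &Def, const Loc &User) {
  return (!Def || *Def == 0) && User && *User != 0;
}

// Hypothetical helper: the location exists and names a real source line.
inline bool hasLine(const Loc &DL) { return DL && *DL != 0; }

// By De Morgan's law, (!Def || line == 0) equals !(Def && line != 0),
// so the whole guard reads "the def has no line, but the user has one".
inline bool deMorganed(const Loc &Def, const Loc &User) {
  return !hasLine(Def) && hasLine(User);
}
```

Both forms agree on every combination of missing, zero and non-zero locations; which one reads better is a matter of taste, and the reviewer may have had a different rewrite in mind.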

dzhidzhoev mentioned this in D127488: [GlobalISel][DebugInfo] Remove debug info with zero line from constants inserted at entry block. Jul 8 2022, 7:41 AM

dzhidzhoev added inline comments. Jul 8 2022, 8:30 AM

llvm/lib/CodeGen/GlobalISel/Localizer.cpp
182	It seems that MI is copied on insert call.

arsenm added inline comments. Jul 8 2022, 9:02 AM

llvm/lib/CodeGen/GlobalISel/Localizer.cpp
182	Insert should just insert, the instruction pointer should still be the same
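The point about `insert` can be illustrated with a toy intrusive list. This is an analogy only, not LLVM's `ilist` implementation: `Node` and `IntrusiveList` are hypothetical, and the sketch merely shows why unlinking a node and linking it elsewhere leaves its address unchanged when the links live inside the node itself.

```cpp
#include <cassert>

// Hypothetical minimal intrusive doubly linked list: each node carries its
// own links, so the list never allocates or copies nodes.
struct Node {
  int Value = 0;
  Node *Prev = nullptr;
  Node *Next = nullptr;
};

struct IntrusiveList {
  Node *Head = nullptr;
  Node *Tail = nullptr;

  // Unlink N from the list without destroying it.
  void remove(Node *N) {
    (N->Prev ? N->Prev->Next : Head) = N->Next;
    (N->Next ? N->Next->Prev : Tail) = N->Prev;
    N->Prev = N->Next = nullptr;
  }

  // Link N in front of Pos (or at the end when Pos is null) and return the
  // very same pointer that was passed in.
  Node *insert(Node *Pos, Node *N) {
    N->Next = Pos;
    N->Prev = Pos ? Pos->Prev : Tail;
    (N->Prev ? N->Prev->Next : Head) = N;
    (Pos ? Pos->Prev : Tail) = N;
    return N;
  }
};
```

Under this model, a remove followed by an insert moves the node but keeps the pointer valid, which is why a separate variable for the "new" instruction is redundant.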

Fix

Harbormaster completed remote builds in B174412: Diff 443283. Jul 8 2022, 11:09 AM

Get rid of redundant variable

dzhidzhoev marked an inline comment as done. Jul 8 2022, 3:05 PM

dzhidzhoev added inline comments.

llvm/test/CodeGen/AArch64/GlobalISel/localizer-propagate-debug-loc.mir
14–37	volatile seems to be ignored without it

dzhidzhoev marked an inline comment as not done. Jul 8 2022, 3:06 PM

Harbormaster completed remote builds in B174462: Diff 443365. Jul 8 2022, 3:48 PM

Ping

@aprantl @probinson I'm really not sure, but should we consider backpropagating the first non-zero location to cover zero-location instructions at the start of a basic block more generally than this patch is proposing?

In D128192#3658500, @dblaikie wrote:

@aprantl @probinson I'm really not sure, but should we consider backpropagating the first non-zero location to cover zero-location instructions at the start of a basic block more generally than this patch is proposing?

BTW, a discussion about backpropagation for zero-location instructions at the start of a basic block took place on lldb-dev (https://lists.llvm.org/pipermail/lldb-dev/2018-October/014263.html). There was an argument against backpropagating locations onto arbitrary instructions.

dzhidzhoev added reviewers: probinson, jmorse. Jul 20 2022, 5:59 AM

Ping

In D128192#3659407, @dzhidzhoev wrote:

BTW, a discussion about backpropagation for zero-location instructions at the start of a basic block took place on lldb-dev (https://lists.llvm.org/pipermail/lldb-dev/2018-October/014263.html). There was an argument against backpropagating locations onto arbitrary instructions.

The objection was that it is incorrect to backpropagate the debug location of the nearest following instruction onto arbitrary instructions, since nothing is known about their semantics at the AsmPrinter stage. In contrast, in this commit we know that the instructions being marked are generated from constants and are used by the nearest following instruction.

In D128192#3677707, @dzhidzhoev wrote:

The objection was that it is incorrect to backpropagate the debug location of the nearest following instruction onto arbitrary instructions, since nothing is known about their semantics at the AsmPrinter stage. In contrast, in this commit we know that the instructions being marked are generated from constants and are used by the nearest following instruction.

We forward propagate such locations, though, right? So I'm not sure back propagating is especially worse/more problematic.

In D128192#3677771, @dblaikie wrote:

We forward propagate such locations, though, right? So I'm not sure back propagating is especially worse/more problematic.

I'm not sure I have enough experience to answer that question.

But "backpropagating the first non-zero location to cover zero-location instructions at the start of a basic block" is not a generalization of this commit, since here debug locations are propagated not only for instructions at the beginning of a block. During localization, instructions may also be placed elsewhere in the block.

dzhidzhoev added reviewers: dblaikie, qcolombet. Jul 26 2022, 9:34 AM

In D128192#3678150, @dzhidzhoev wrote:

I'm not sure I have enough experience to answer that question.

But "backpropagating the first non-zero location to cover zero-location instructions at the start of a basic block" is not a generalization of this commit, since here debug locations are propagated not only for instructions at the beginning of a block. During localization, instructions may also be placed elsewhere in the block.

We could potentially do it at other places too. It wouldn't hurt profile accuracy if we're propagating a location in the same basic block. It /might/ confuse interactive/human users if we ever did this over a call or possibly over a store instruction... eh, mixed feelings. I think if the instruction has no location (as opposed to zero location) and so we're willing/already letting previous locations cover the non-location instruction, then we should be willing to do the same thing backwards too.

In this case we're talking about dropping the location entirely (that's the current behavior, yes?) - so the only time an instruction like that causes more zero locations, is when it appears at the start of a block - so I think backpropagating at the start of a block would address the issue being discussed in this patch. Other zeros would not appear if the constant is put not at the beginning of a block - those would already be getting flow-on locations from the previous locations in a block.

It's not clear that singular constants getting the location of their original constant will address the location issues described here - they might still move up over some other location and cause just the same line table size problems, but without zero locations. So that's partly why I'm not sure about this path as a way to address the problem description.

In D128192#3680110, @dblaikie wrote:

In this case we're talking about dropping the location entirely (that's the current behavior, yes?) - so the only time an instruction like that causes more zero locations, is when it appears at the start of a block

I'm not sure, but it seems to be true.

In D128192#3680110, @dblaikie wrote:

so I think backpropagating at the start of a block would address the issue being discussed in this patch.

I can implement that, collect some metrics, but I can't make a decision on this assumption yet. On which stage/pass should such backpropagation happen?

In D128192#3680110, @dblaikie wrote:

Other zeros would not appear if the constant is put not at the beginning of a block - those would already be getting flow-on locations from the previous locations in a block.

We can consider this patch as changing a propagation direction for constants. From forward propagation, the default debugger behavior, to backward propagation. I think it affects not only instructions at the beginning of the basic block, since changing the propagation direction may make boundaries between source code lines more precise, and thus, improve stepping behavior (https://github.com/llvm/llvm-project/issues/56370 addressing such cases too).

It would be great to hear others' opinions.

In D128192#3680218, @dzhidzhoev wrote:

On which stage/pass should such backpropagation happen?

The reason I chose the Localizer pass is that we know the context there, namely which instructions are getting propagated locations.

In D128192#3680110, @dblaikie wrote:

so I think backpropagating at the start of a block would address the issue being discussed in this patch.

I can implement that, collect some metrics, but I can't make a decision on this assumption yet. On which stage/pass should such backpropagation happen?

I'm not sure - wherever @probinson implemented the "use zero rather than unspecified location at the start of basic blocks" is probably the place to revisit that and consider backpropagating from the first specified location and see what that looks like.

In D128192#3680110, @dblaikie wrote:

Other zeros would not appear if the constant is put not at the beginning of a block - those would already be getting flow-on locations from the previous locations in a block.

We can consider this patch as changing a propagation direction for constants. From forward propagation, the default debugger behavior, to backward propagation.

Slight pedantry: This isn't a debugger behavior thing, it's a DWARF format thing (so a bit more robust/explicitly guaranteed/required than something some debuggers choose to do) - a source line in the line table is valid from the address it's specified at until the next point in the line table that changes the line/address.
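The "valid until the next row" rule can be sketched as a lookup over a sorted table. This is a deliberately simplified model: `Row` and `lineFor` are hypothetical, and real DWARF line programs also carry files, columns, flags and end-of-sequence markers that are ignored here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>
#include <optional>
#include <vector>

// Hypothetical simplified line-table row: from Address onward, code maps to
// Line, until a later row says otherwise.
struct Row {
  std::uint64_t Address;
  unsigned Line;
};

// Table must be sorted by Address. Returns the line of the last row at or
// before PC, or std::nullopt when PC precedes the first row.
inline std::optional<unsigned> lineFor(const std::vector<Row> &Table,
                                       std::uint64_t PC) {
  // Find the first row that starts strictly after PC.
  auto It = std::upper_bound(
      Table.begin(), Table.end(), PC,
      [](std::uint64_t Addr, const Row &R) { return Addr < R.Address; });
  if (It == Table.begin())
    return std::nullopt;
  // The row just before it is the one covering PC.
  return std::prev(It)->Line;
}
```

With rows at 0x10 (line 5), 0x18 (line 0) and 0x20 (line 9), an address such as 0x14 that has no row of its own still resolves to line 5. That is the forward propagation being discussed: an instruction emitted without a row of its own inherits whatever row precedes it.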

I think it affects not only instructions at the beginning of the basic block, since changing the propagation direction may make boundaries between source code lines more precise, and thus, improve stepping behavior (https://github.com/llvm/llvm-project/issues/56370 addressing such cases too).

Yeah, I'm not entirely understanding that issue and/or the connection with this one - some complications/lots of nuanced details in this area for sure.

The case I was having a flashback to was FastISel, which _used to_ put all the constants at the top of the current block, with no debug info. This was reworked multiple times over the years, and at this point it no longer behaves that way--constants are materialized per IR instruction.

This patch (now that I look at it in detail) is doing something else entirely. The pass was already moving constant-materializing instructions to be adjacent to the user; the change here is, having done that motion, if there's only one use, set the debug info of the moved instruction. That seems entirely reasonable.

It just happens that, as a special case, sometimes the moved instruction would end up at the top of a block, and IIRC AsmPrinter will deliberately emit line 0 for a first-in-block instruction that has no source location. (This protects the first-in-block instructions from incorrectly inheriting the source location of the physically preceding block.) With this patch, that instruction will have the same source location as its user, which seems like a clear improvement.
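The effect described above can be modeled roughly as follows. This is a sketch of the behavior as recalled in the comment, not of the actual AsmPrinter/DwarfDebug logic, which has more conditions; `Loc`, `Block` and `effectiveLines` are hypothetical names.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical stand-in for a debug location: std::nullopt means the
// instruction has no location at all.
using Loc = std::optional<unsigned>;
using Block = std::vector<Loc>;

// Line each instruction effectively reports under a simplified model:
// within a block an unlocated instruction inherits the preceding line, and
// at the start of a block it reports line 0 instead of inheriting from the
// physically preceding block.
inline std::vector<unsigned> effectiveLines(const std::vector<Block> &Blocks) {
  std::vector<unsigned> Lines;
  for (const Block &B : Blocks) {
    unsigned Current = 0; // Nothing carries over across a block boundary.
    for (const Loc &DL : B) {
      if (DL)
        Current = *DL;
      Lines.push_back(Current);
    }
  }
  return Lines;
}
```

In this model, a second block that begins with an unlocated constant followed by its user on line 14 reports lines 0 and 14. Once the constant carries its user's location, the same block reports 14 and 14, and the line-0 entry disappears.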

I apologize for being triggered on all this stuff, and not recognizing what was going on sooner. FWIW, LGTM.

Ping

aemerson accepted this revision. Sep 6 2022, 4:18 AM

This revision is now accepted and ready to land. Sep 6 2022, 4:18 AM

Rebased.

This revision was landed with ongoing or failed builds. Dec 5 2022, 5:39 AM

Closed by commit rGf32cafedf053: [GlobalISel][DebugInfo] Propagate debug location for localized constants (authored by dzhidzhoev).

This revision was automatically updated to reflect the committed changes.

dzhidzhoev added a commit: rGf32cafedf053: [GlobalISel][DebugInfo] Propagate debug location for localized constants.

Harbormaster completed remote builds in B201091: Diff 480074. Dec 5 2022, 8:14 AM

Diff 480077

llvm/lib/CodeGen/GlobalISel/Localizer.cpp

[... 173 lines not shown ...]
    for (MachineInstr *MI : LocalizedInstrs) {
[... 20 lines not shown ...]
      while (II != MBB.end() && !Users.count(&*II))
        ++II;

      assert(II != MBB.end() && "Didn't find the user in the MBB");
      LLVM_DEBUG(dbgs() << "Intra-block: moving " << *MI << " before " << *II
                        << '\n');

      MI->removeFromParent();
      MBB.insert(II, MI);
      Changed = true;
+
+     // If the instruction (constant) being localized has single user, we can
+     // propagate debug location from user.
+     if (Users.size() == 1) {
+       const auto &DefDL = MI->getDebugLoc();
+       const auto &UserDL = (*Users.begin())->getDebugLoc();
+
+       if ((!DefDL || DefDL.getLine() == 0) && UserDL && UserDL.getLine() != 0) {
+         MI->setDebugLoc(UserDL);
+       }
+     }
    }
    return Changed;
  }

  bool Localizer::runOnMachineFunction(MachineFunction &MF) {
    // If the ISel pipeline failed, do not bother running that pass.
    if (MF.getProperties().hasProperty(
            MachineFunctionProperties::Property::FailedISel))
[... 18 lines not shown ...]

llvm/test/CodeGen/AArch64/GlobalISel/localizer-propagate-debug-loc.mir

This file was added.

				# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
				# RUN: llc -O0 %s -global-isel -start-before localizer \
				# RUN: -stop-after localizer -o - | FileCheck --check-prefix=CHECK %s
				--- |
				target datalayout = "e-m:o-i64:64-i128:128-n32:64-S128"
				target triple = "arm64-apple-macosx12.0.0"

				@A = global i32 1234, align 4
				@B = global i32 5678, align 4
				@C = global i32 9012, align 4

				define noundef i32 @foo() !dbg !5 {
				%1 = alloca i32, align 4
				br i1 false, label %2, label %4

				2: ; preds = %0
				%3 = load i32, ptr @A, align 4, !dbg !10
				store volatile i32 %3, ptr %1, align 4
				br label %9

				4: ; preds = %0
				br i1 false, label %5, label %8

				5: ; preds = %4
				%6 = load i32, ptr @B, align 4, !dbg !13
				store volatile i32 %6, ptr %1, align 4
				%7 = load i32, ptr @B, align 4, !dbg !16
				store volatile i32 %7, ptr %1, align 4
				br label %9

				8: ; preds = %4
				store i32 3, ptr @C, align 4, !dbg !17
				br label %9

				9: ; preds = %8, %5, %2
				ret i32 0
				}

				!llvm.dbg.cu = !{!0}
				!llvm.module.flags = !{!2, !3, !4}

				!0 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus_14, file: !1, isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, splitDebugInlining: false, nameTableKind: None)
				!1 = !DIFile(filename: "tmp.ll", directory: "/")
				!2 = !{i32 7, !"Dwarf Version", i32 4}
				!3 = !{i32 2, !"Debug Info Version", i32 3}
				!4 = !{i32 1, !"wchar_size", i32 4}
				!5 = distinct !DISubprogram(name: "foo", scope: !1, file: !1, line: 5, type: !6, scopeLine: 5, flags: DIFlagPrototyped, spFlags: DISPFlagDefinition, unit: !0, retainedNodes: !9)
				!6 = !DISubroutineType(types: !7)
				!7 = !{!8}
				!8 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
				!9 = !{}
				!10 = !DILocation(line: 9, column: 9, scope: !11)
				!11 = distinct !DILexicalBlock(scope: !12, file: !1, line: 8, column: 15)
				!12 = distinct !DILexicalBlock(scope: !5, file: !1, line: 8, column: 7)
				!13 = !DILocation(line: 11, column: 9, scope: !14)
				!14 = distinct !DILexicalBlock(scope: !15, file: !1, line: 10, column: 22)
				!15 = distinct !DILexicalBlock(scope: !12, file: !1, line: 10, column: 14)
				!16 = !DILocation(line: 12, column: 13, scope: !14)
				!17 = !DILocation(line: 14, column: 7, scope: !18)
				!18 = distinct !DILexicalBlock(scope: !15, file: !1, line: 13, column: 10)

				...
				---
				name: foo
				alignment: 4
				legalized: true
				regBankSelected: true
				tracksRegLiveness: true
				stack:
				- { id: 0, name: '', type: default, offset: 0, size: 4, alignment: 4,
				stack-id: default, callee-saved-register: '', callee-saved-restored: true,
				debug-info-variable: '', debug-info-expression: '', debug-info-location: '' }
				body: |
				; CHECK: [[ADRP3:%[0-9]+]]:gpr64(p0) = ADRP target-flags(aarch64-page) @A, debug-location !10
				; CHECK-NEXT: [[ADD_LOW3:%[0-9]+]]:gpr(p0) = G_ADD_LOW [[ADRP3]](p0), target-flags(aarch64-pageoff, aarch64-nc) @A, debug-location !10
				; CHECK-NEXT: [[LOAD:%[0-9]+]]:gpr(s32) = G_LOAD [[ADD_LOW3]](p0), debug-location !10 :: (dereferenceable load (s32))

				; CHECK: [[ADRP4:%[0-9]+]]:gpr64(p0) = ADRP target-flags(aarch64-page) @B, debug-location !DILocation(line: 0, scope: !14)
				; CHECK-NEXT: [[ADD_LOW4:%[0-9]+]]:gpr(p0) = G_ADD_LOW [[ADRP4]](p0), target-flags(aarch64-pageoff, aarch64-nc) @B, debug-location !DILocation(line: 0, scope: !14)
				; CHECK-NEXT: [[LOAD1:%[0-9]+]]:gpr(s32) = G_LOAD [[ADD_LOW4]](p0), debug-location !13 :: (dereferenceable load (s32))

				; CHECK: [[ADRP5:%[0-9]+]]:gpr64(p0) = ADRP target-flags(aarch64-page) @C, debug-location !17
				; CHECK-NEXT: [[ADD_LOW5:%[0-9]+]]:gpr(p0) = G_ADD_LOW [[ADRP5]](p0), target-flags(aarch64-pageoff, aarch64-nc) @C, debug-location !17
				; CHECK-NEXT: [[C5:%[0-9]+]]:gpr(s32) = G_CONSTANT i32 3, debug-location !17
				; CHECK-NEXT: G_STORE [[C5]](s32), [[ADD_LOW5]](p0), debug-location !17 :: (store (s32) into @C)
				bb.1:
				successors: %bb.2(0x40000000), %bb.3(0x40000000)

				%2:gpr(s32) = G_CONSTANT i32 3
				%24:gpr64(p0) = ADRP target-flags(aarch64-page) @C, debug-location !DILocation(line: 0, scope: !18)
				%3:gpr(p0) = G_ADD_LOW %24(p0), target-flags(aarch64-pageoff, aarch64-nc) @C, debug-location !DILocation(line: 0, scope: !18)
				%23:gpr64(p0) = ADRP target-flags(aarch64-page) @B, debug-location !DILocation(line: 0, scope: !14)
				%5:gpr(p0) = G_ADD_LOW %23(p0), target-flags(aarch64-pageoff, aarch64-nc) @B, debug-location !DILocation(line: 0, scope: !14)
				%22:gpr64(p0) = ADRP target-flags(aarch64-page) @A, debug-location !DILocation(line: 0, scope: !11)
				%8:gpr(p0) = G_ADD_LOW %22(p0), target-flags(aarch64-pageoff, aarch64-nc) @A, debug-location !DILocation(line: 0, scope: !11)
				%9:gpr(s32) = G_CONSTANT i32 0
				%0:gpr(p0) = G_FRAME_INDEX %stack.0
				%18:gpr(s32) = COPY %9(s32)
				%19:gpr(s32) = G_CONSTANT i32 1
				%20:gpr(s32) = G_XOR %18, %19
				%11:gpr(s1) = G_TRUNC %20(s32)
				G_BRCOND %11(s1), %bb.3
				G_BR %bb.2

				bb.2:
				successors: %bb.6(0x80000000)

				%7:gpr(s32) = G_LOAD %8(p0), debug-location !10 :: (dereferenceable load (s32))
				G_STORE %7(s32), %0(p0) :: (volatile store (s32) into %ir.1)
				G_BR %bb.6

				bb.3:
				successors: %bb.4(0x40000000), %bb.5(0x40000000)

				%14:gpr(s32) = G_CONSTANT i32 0
				%15:gpr(s32) = G_CONSTANT i32 1
				%16:gpr(s32) = G_XOR %14, %15
				%13:gpr(s1) = G_TRUNC %16(s32)
				G_BRCOND %13(s1), %bb.5
				G_BR %bb.4

				bb.4:
				successors: %bb.6(0x80000000)

				%4:gpr(s32) = G_LOAD %5(p0), debug-location !13 :: (dereferenceable load (s32))
				G_STORE %4(s32), %0(p0) :: (volatile store (s32) into %ir.1)
				%6:gpr(s32) = G_LOAD %5(p0), debug-location !16 :: (dereferenceable load (s32))
				G_STORE %6(s32), %0(p0) :: (volatile store (s32) into %ir.1)
				G_BR %bb.6

				bb.5:
				successors: %bb.6(0x80000000)

				G_STORE %2(s32), %3(p0), debug-location !17 :: (store (s32) into @C)
				G_BR %bb.6

				bb.6:
				$w0 = COPY %9(s32)
				RET_ReallyLR implicit $w0

				...

This is an archive of the discontinued LLVM Phabricator instance.

[GlobalISel][DebugInfo] Propagate debug location for localized constants
ClosedPublic

