This is an archive of the discontinued LLVM Phabricator instance.

Differential D67393

[DebugInfo] LiveDebugValues: Defer all DBG_VALUE creation during analysis
ClosedPublic

Authored by jmorse on Sep 10 2019, 4:08 AM.

Details

Reviewers

aprantl
vsk
wolfgangp

Commits

rG0ca48de26c46: [DebugInfo] LiveDebugValues: defer DBG_VALUE creation during analysis
rL373720: [DebugInfo] LiveDebugValues: defer DBG_VALUE creation during analysis

Summary

As stated in the docs here [0], the meaning of a DBG_VALUE instruction changes after LiveDebugValues runs, from corresponding to a source-level assignment, to being a per-block variable location record [1]. Unfortunately, LiveDebugValues seems to mix these up and there's a feedback effect. The problem is this:

Currently all register/stack transfers have per-block DBG_VALUE instructions created and inserted during the dataflow analysis [2],
Sometimes variable locations are invalidated during the dataflow analysis [3],
But the register/stack transfer DBG_VALUEs have already been created by then, and are interpreted as source-level assignments.

This means that a variable location that's propagated once (such as on the first iteration, where backedges are ignored) but invalidated on later iterations can "latch" if it experiences a transfer in that time. Observe the test case in this patch, where in the loop block a value is shifted through four different registers (esi -> edi -> ecx -> eax). Currently the incoming location in ecx is transferred to ebx in the loop block, then invalidated on the second LiveDebugValues iteration through the loop because ecx is clobbered. However, a DBG_VALUE for ebx has already been created by that point, and it gets propagated into the exit block.

I've mentally divided this into two issues: ensuring that block-only DBG_VALUEs aren't interpreted as source-level assignments, and deleting transfers of locations that are later invalidated. The former is easy to fix by moving the code in [2] out of the location propagation loop, which is what this patch does. Deleting such transfers is handled in follow-up patches.
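As a rough illustration of the latching effect, and of why deferring insertion avoids it, here is a toy C++ model. It is not LLVM code: the names Loc, Result and analyse are invented for this sketch, there is a single variable, and the control flow (entry, a loop block with a backedge, exit) is hard-coded to mirror the shape of the test case. The join is assumed to keep a location only when all visited predecessors agree on it.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Register currently holding the variable, if any (toy stand-in for a VarLoc).
using Loc = std::optional<std::string>;

struct Result {
  Loc exitIn;           // location propagated into the exit block
  bool loopHasDbgValue; // does the loop block end up with a DBG_VALUE $ebx?
};

// A location is live-in only if both predecessors agree on it.
static Loc join(const Loc &a, const Loc &b) {
  if (a && b && *a == *b)
    return a;
  return std::nullopt;
}

Result analyse(bool deferInsertion) {
  const Loc entryOut = "ecx";   // DBG_VALUE $ecx in the entry block
  Loc loopOut;                  // OutLocs of the loop block
  bool loopVisited = false;
  bool dbgValueInLoop = false;  // DBG_VALUE $ebx physically present in the block
  bool pendingTransfer = false; // transfer recorded but not yet inserted

  for (bool changed = true; changed;) {
    // The backedge is ignored until the loop block has been visited once.
    Loc cur = loopVisited ? join(entryOut, loopOut) : entryOut;

    // "$ebx = COPY killed $ecx": a variable living in ecx moves to ebx.
    if (cur && *cur == "ecx") {
      cur = "ebx";
      if (deferInsertion)
        pendingTransfer = true;
      else
        dbgValueInLoop = true; // eager insertion during the analysis
    }
    // A DBG_VALUE already in the block is re-read as an assignment.
    if (dbgValueInLoop)
      cur = "ebx";

    changed = !loopVisited || cur != loopOut;
    loopOut = cur;
    loopVisited = true;
  }

  // Deferred transfers are inserted once, after the fixed point is reached.
  if (pendingTransfer)
    dbgValueInLoop = true;

  // The exit block's only predecessor is the loop block.
  return {loopOut, dbgValueInLoop};
}
```

In this model analyse(false) leaves the exit block with the stale ebx location, whereas analyse(true) propagates nothing into the exit block. The deferred variant still inserts the DBG_VALUE into the loop block at the end, which corresponds to the FIXME in the test: the original transfer is not yet eliminated.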

This patch also removes a condition in transferTerminator that skips updating OutLocs if there are no locations at the end of processing a block. I think this made sense when we were only taking the union of OutLocs after propagation, but now that we're deleting locations too, an empty set of OutLocs is something that needs to be handled. The added test covers this as well.
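The effect of dropping that early return can be sketched with plain sets. This is a simplified stand-in, not the real function: VarLocSet here is just std::set<unsigned>, the two function names are hypothetical, and the unconditional copy into OutLocs is an assumption about the part of transferTerminator that the diff does not show.

```cpp
#include <cassert>
#include <set>

using VarLocSet = std::set<unsigned>; // toy stand-in for LLVM's VarLocSet

// Old behaviour: bail out when nothing is open, leaving OutLocs untouched.
bool transferTerminatorOld(const VarLocSet &openRanges, VarLocSet &outLocs) {
  if (openRanges.empty())
    return false;
  bool changed = outLocs != openRanges;
  outLocs = openRanges;
  return changed;
}

// New behaviour: always compare and overwrite, so an empty set of open
// ranges can replace locations recorded on an earlier iteration.
bool transferTerminatorNew(const VarLocSet &openRanges, VarLocSet &outLocs) {
  bool changed = outLocs != openRanges;
  outLocs = openRanges;
  return changed;
}
```

With a block whose OutLocs held one location from an earlier iteration and whose open ranges are now empty, the old variant reports no change and keeps the stale entry, while the new variant clears it and reports a change so that successors are revisited.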

[0] https://llvm.org/docs/SourceLevelDebugging.html#livedebugvalues-expansion-of-variable-locations
[1] I'm pretty confident of this, but as I wrote that paragraph of the docs this might be a circular argument.
[2] https://github.com/llvm/llvm-project/blob/5d9cd3b4ca457e55c3c21094cfea5e49dddef36c/llvm/lib/CodeGen/LiveDebugValues.cpp#L1365
[3] https://reviews.llvm.org/D66599

Diff Detail

Repository: rL LLVM

Event Timeline

jmorse created this revision. Sep 10 2019, 4:08 AM

Herald added a project: Restricted Project. Sep 10 2019, 4:08 AM

Herald added a subscriber: llvm-commits.

(Ninja-edit: this moves transfer-insertion to before flushPendingLocs. flushPendingLocs may manipulate VarLocs that refer to not-yet-inserted DBG_VALUEs as their sources).

jmorse mentioned this in D67398: [DebugInfo] LiveDebugValues: Move DBG_VALUE creation into VarLoc class. Sep 10 2019, 7:17 AM

jmorse added a child revision: D67398: [DebugInfo] LiveDebugValues: Move DBG_VALUE creation into VarLoc class.

As stated in the docs here [0], the meaning of a DBG_VALUE instruction changes after LiveDebugValues runs

Do you think we should avoid this confusion by calling them DBG_LOC or something else after LiveDebugValues to embrace the semantic difference?

In D67393#1665380, @aprantl wrote:

Do you think we should avoid this confusion by calling them DBG_LOC or something else after LiveDebugValues to embrace the semantic difference?

That'd probably be best -- although preferably something without "location" in its name, as it's already a highly overloaded term :(. Something like DBG_SPAN, DBG_BLOCKVAL, or some other term that indicates the limited range of the instruction's effect?

Okay, we should make sure that we do get the design right. I don't quite get the argument made in SourceLevelDebugging.rst for why the semantics are different:

After this pass the DBG_VALUE instruction changes meaning: rather than corresponding to a source-level assignment where the variable may change value, it asserts the location of a variable in a block, and loses effect outside the block.

First, DBG_VALUEs aren't necessarily source-level assignments *before* LiveDebugValues either; they update the SSA value that (a fragment of) a source-level variable can be found in, but that SSA value could have been created by the compiler and does not necessarily have any relation to a source-level assignment (think about salvageDebugInfo, for example).
The fact that the DBG_VALUE has no effect outside of the current basic block just falls out of DbgEntityHistoryCalculator not doing a LiveDebugVariable-style data flow analysis, but IMO that isn't a change in semantics, it would be *legal* for it to perform one, it just would be pointless after LiveDebugValues has propagated DBG_VALUEs across basic blocks and reached a fixed point.

So I'm suspecting that I'm missing something and it isn't mentioned in the text. Can you perhaps fill me in?

In D67393#1666272, @aprantl wrote:

First, DBG_VALUEs aren't necessarily source-level assignments *before* LiveDebugValues either; they update the SSA value that (a fragment of) a source-level variable can be found in, but that SSA value could have been created by the compiler and does not necessarily have any relation to a source-level assignment (think about salvageDebugInfo, for example).

(Unfortunately I'm not known for operating the English language effectively). My meaning in the wording was about the placement of a dbg.value/DBG_VALUE within a block, ignoring its operand. AFAIUI, for any LLVM-IR instruction i in a function, and a (fragment of a) variable, to determine the variable's location at i one has to do an entire dominance-frontier analysis to work out which dbg.value/DBG_VALUE dominates i (potentially none of them). This method of recording the _position_ where variable locations _change_ closely matches the original source program: if you had five assignments to a variable in a program, you'd have five dbg.values, regardless of their operands.

The fact that the DBG_VALUE has no effect outside of the current basic block just falls out of DbgEntityHistoryCalculator not doing a LiveDebugVariable-style data flow analysis, but IMO that isn't a change in semantics, it would be *legal* for it to perform one, it just would be pointless after LiveDebugValues has propagated DBG_VALUEs across basic blocks and reached a fixed point.

All true; this is where the question of what the design is de jure versus de facto comes in. Because DbgEntityHistoryCalculator currently relies on LiveDebugValues having run its analysis, does that not _make_ it a semantic change? At the very least, because that's what I saw when trying to document these things, that's what I wrote.

Alternatively, we could document the change in interpretation as being an optimisation that DbgEntityHistoryCalculator performs/relies on, rather than being a change in semantics. (I think these are two sides of the same coin when it comes to explaining internal state).

jmorse mentioned this in D67500: [DebugInfo] LiveDebugValues: don't create transfer records for potentially invalid locations. Sep 12 2019, 7:43 AM

Does the location calculator rely on LiveDebugValues for correctness? IIUC it should be possible to run the location calculator without doing LiveDebugValues, and that should produce correct (if incomplete) information.

If that's right, then we can edit the docs to 1) clarify that, 2) describe LiveDebugValues as a pass that lets the location calculator "get away with" taking a narrow, one-block-at-a-time view of the program, and 3) state that the semantics of DBG_VALUE never change.

Now, for this patch, I think the change looks good, and the test case is really neat :). Please let @aprantl chime in as well, though.

This revision is now accepted and ready to land. Sep 20 2019, 1:05 PM

Does the location calculator rely on LiveDebugValues for correctness? IIUC it should be possible to run the location calculator without doing LiveDebugValues, and that should produce correct (if incomplete) information.

Hmmm, not for correctness, no -- you're right, the location calculator wouldn't produce any incorrect locations if LiveDebugValues didn't run.

If that's right, then we can edit the docs to 1) clarify that, 2) describe LiveDebugValues as a pass that lets the location calculator "get away with" taking a narrow, one-block-at-a-time view of the program, and 3) state that the semantics of DBG_VALUE never change.

Sure; I'm going to let this soak into my mind a little so that I'm confident it's not really a change in semantics.

Now, for this patch, I think the change looks good, and the test case is really neat :). Please let @aprantl chime in as well, though.

jmorse mentioned this in D68209: [LiveDebugValues] Introduce entry values of unmodified params. Oct 3 2019, 3:07 AM

I wrote:

Sure; I'm going to let this soak into my mind a little so that I'm confident it's not really a change in semantics.

I think I've got it now: the location calculator's purpose isn't to "produce perfect and complete variable locations", it's to calculate the locations that the debuginfo in the instruction stream describes. If the information described by the DBG_VALUEs is incomplete, that's not an error; LiveDebugValues just propagates locations to create more information rather than fixing a correctness problem. I'll cough up a documentation fix probably tomorrow; assuming @aprantl is happy with all of this, I'll commit this patch tomorrow-ish.
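A toy sketch of that reading, under heavy simplification: the Inst and Block types and the locationsPerInst function are hypothetical, register clobbers are ignored, and the "calculator" only ever looks at DBG_VALUEs earlier in the same block. Without propagation it produces fewer locations, but none of them are wrong.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical instruction: either a DBG_VALUE naming a register, or
// anything else (reg is ignored in that case).
struct Inst {
  bool isDbgValue;
  std::string reg;
};
using Block = std::vector<Inst>;

// For each instruction, report the variable's location as described by the
// DBG_VALUEs seen so far in the same block only.
std::vector<std::optional<std::string>>
locationsPerInst(const std::vector<Block> &blocks) {
  std::vector<std::optional<std::string>> out;
  for (const Block &block : blocks) {
    std::optional<std::string> cur; // forgotten at every block boundary
    for (const Inst &inst : block) {
      if (inst.isDbgValue)
        cur = inst.reg;
      out.push_back(cur);
    }
  }
  return out;
}
```

Running this on two blocks where only the first contains a DBG_VALUE leaves the second block without a location. Adding a propagated DBG_VALUE at the top of the second block, as LiveDebugValues would, fills in the gap without changing how the calculator interprets the instructions.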

aprantl accepted this revision. Oct 3 2019, 12:51 PM

Closed by commit rL373720: [DebugInfo] LiveDebugValues: defer DBG_VALUE creation during analysis (authored by jmorse). Oct 4 2019, 2:37 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path                                                                     Size
llvm/trunk/lib/CodeGen/LiveDebugValues.cpp                               15 lines
llvm/trunk/test/DebugInfo/MIR/X86/live-debug-values-bad-transfer.mir     103 lines

Diff 223177

llvm/trunk/lib/CodeGen/LiveDebugValues.cpp

[980 lines not shown]

 /// Terminate all open ranges at the end of the current basic block.
 bool LiveDebugValues::transferTerminator(MachineBasicBlock *CurMBB,
                                          OpenRangesSet &OpenRanges,
                                          VarLocInMBB &OutLocs,
                                          const VarLocMap &VarLocIDs) {
   bool Changed = false;

-  if (OpenRanges.empty())
-    return false;
-
   LLVM_DEBUG(for (unsigned ID
                   : OpenRanges.getVarLocs()) {
     // Copy OpenRanges to OutLocs, if not already present.
     dbgs() << "Add to OutLocs in MBB #" << CurMBB->getNumber() << ": ";
     VarLocIDs[ID].dump();
   });
   VarLocSet &VLS = OutLocs[CurMBB];
   Changed = VLS != OpenRanges.getVarLocs();

[357 lines not shown; the next hunk is inside: while (!Worklist.empty()) {]

         // correspond to user variables.
         // First load any pending inlocs.
         OpenRanges.insertFromLocSet(PendingInLocs[MBB], VarLocIDs);
         for (auto &MI : *MBB)
           process(MI, OpenRanges, OutLocs, VarLocIDs, Transfers,
                   DebugEntryVals, OverlapFragments, SeenFragments);
         OLChanged |= transferTerminator(MBB, OpenRanges, OutLocs, VarLocIDs);

-        // Add any DBG_VALUE instructions necessitated by spills.
-        for (auto &TR : Transfers)
-          MBB->insertAfterBundle(TR.TransferInst->getIterator(), TR.DebugInst);
-        Transfers.clear();
-
         LLVM_DEBUG(printVarLocInMBB(MF, OutLocs, VarLocIDs,
                                     "OutLocs after propagating", dbgs()));
         LLVM_DEBUG(printVarLocInMBB(MF, InLocs, VarLocIDs,
                                     "InLocs after propagating", dbgs()));

         if (OLChanged) {
           OLChanged = false;
           for (auto s : MBB->successors())
             if (OnPending.insert(s).second) {
               Pending.push(BBToOrder[s]);
             }
         }
       }
     }
     Worklist.swap(Pending);
     // At this point, pending must be empty, since it was just the empty
     // worklist
     assert(Pending.empty() && "Pending should be empty");
   }

+  // Add any DBG_VALUE instructions created by location transfers.
+  for (auto &TR : Transfers) {
+    auto *MBB = TR.TransferInst->getParent();
+    MBB->insertAfterBundle(TR.TransferInst->getIterator(), TR.DebugInst);
+  }
+  Transfers.clear();
+
   // Deferred inlocs will not have had any DBG_VALUE insts created; do
   // that now.
   flushPendingLocs(PendingInLocs, VarLocIDs);

   LLVM_DEBUG(printVarLocInMBB(MF, OutLocs, VarLocIDs, "Final OutLocs", dbgs()));
   LLVM_DEBUG(printVarLocInMBB(MF, InLocs, VarLocIDs, "Final InLocs", dbgs()));
   return Changed;
 }

[21 lines not shown]

llvm/trunk/test/DebugInfo/MIR/X86/live-debug-values-bad-transfer.mir

# RUN: llc %s -mtriple=x86_64-unknown-unknown -o - -run-pass=livedebugvalues | FileCheck %s --implicit-check-not=DBG_VALUE
#
# Test that the DBG_VALUE of ecx below does not get propagated. It is considered
# live-in on LiveDebugValues' first pass through the loop, but on the second it
# should be removed from the InLocs set because it gets clobbered inside the
# loop. There should be no transfer from ecx to ebx -- this is ensured by the
# FileCheck implicit-check-not option.
#
# FIXME: we successfully prevent the false location (ebx) from being
# propagated into block 2, but the original transfer isn't yet eliminated.
# Thus we get no DBG_VALUe in block 2, but an invalid one in block 1.
#
# CHECK-LABEL: name: foo
# CHECK-LABEL: bb.0.entry:
# CHECK: $ecx = MOV32ri 0
# CHECK-NEXT: DBG_VALUE
# CHECK-LABEL: bb.1.loop:
# CHECK: $ebx = COPY killed $ecx
# CHECK-NEXT: DBG_VALUE

--- |
  source_filename = "live-debug-values-remove-range.ll"
  target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"

  declare void @llvm.dbg.value(metadata, metadata, metadata)

  define i32 @foo(i32 %bar) !dbg !4 {
  entry:
    br label %loop
  loop:
    br label %loop
  exit:
    ret i32 %bar
  }

  !llvm.module.flags = !{!0, !1}
  !llvm.dbg.cu = !{!2}

  !0 = !{i32 2, !"Debug Info Version", i32 3}
  !1 = !{i32 2, !"Dwarf Version", i32 4}
  !2 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus, file: !3, producer: "beards", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug)
  !3 = !DIFile(filename: "bees.cpp", directory: ".")
  !4 = distinct !DISubprogram(name: "nope", scope: !3, file: !3, line: 1, type: !5, spFlags: DISPFlagDefinition, unit: !2, retainedNodes: !8)
  !5 = !DISubroutineType(types: !6)
  !6 = !{!7}
  !7 = !DIBasicType(name: "int", size: 32, align: 32, encoding: DW_ATE_signed)
  !8 = !{!9}
  !9 = !DILocalVariable(name: "thin", scope: !4, file: !3, line: 1, type: !7)
  !10 = !DILocation(line: 1, scope: !4)

...
---
name: foo
alignment: 4
tracksRegLiveness: true
liveins:
  - { reg: '$edi' }
frameInfo:
  stackSize: 8
  offsetAdjustment: -8
  maxAlignment: 1
  adjustsStack: true
  hasCalls: true
  maxCallFrameSize: 0
  cvBytesOfCalleeSavedRegisters: 8
fixedStack:
  - { id: 0, type: spill-slot, offset: -16, size: 8, alignment: 16, callee-saved-register: '$rbx' }
machineFunctionInfo: {}
body: |
  bb.0.entry:
    liveins: $edi, $rbx

    frame-setup PUSH64r killed $rbx, implicit-def $rsp, implicit $rsp
    CFI_INSTRUCTION def_cfa_offset 16
    CFI_INSTRUCTION offset $rbx, -16
    $ebx = MOV32rr $edi
    $eax = MOV32ri 0
    $ecx = MOV32ri 0
    DBG_VALUE $ecx, $noreg, !9, !DIExpression(), debug-location !10
    $edi = MOV32ri 0
    $esi = MOV32ri 0

  bb.1.loop:
    successors: %bb.1, %bb.2
    liveins: $ebx, $eax, $ecx, $edi, $esi

    $eax = COPY $ecx
    $ebx = COPY killed $ecx
    $ecx = COPY killed $edi
    $edi = COPY killed $esi
    $esi = MOV32ri 1
    TEST8ri killed renamable $al, 1, implicit-def $eflags
    JCC_1 %bb.1, 5, implicit killed $eflags

  bb.2.exit:
    liveins: $ebx

    $eax = MOV32rr killed $ebx
    $rbx = frame-destroy POP64r implicit-def $rsp, implicit $rsp
    CFI_INSTRUCTION def_cfa_offset 8
    RETQ $eax

...