This is an archive of the discontinued LLVM Phabricator instance.

[RFC][DebugInfo] Do not use the DBG_VALUE to calculate debug info of spill location
Needs Review · Public

Authored by dongAxis1944 on Mar 21 2021, 8:27 PM.

Details

Reviewers

mkuper
StephenTozer
jmorse

Summary

Consider the following C++ code:

enum class TestEnum ...
#define CHECK_GDB_TEST(s) ...

__attribute__((noinline)) uint64_t
TestBasicTypeNormal(uint8_t a, int8_t b, uint16_t c, int16_t d, uint32_t e,
                    int32_t f, uint64_t g, int64_t h, char i, const char *j,
                    const char k[], float l, double m, TestEnum n) {
  CHECK_GDB_TEST("coro_variable_test");
  uint64_t sum = a + b + c + d + e + f + g + h;
  double sum_f = l + m;
  CHECK_GDB_TEST("coro_variable_test");
  printf("%lu %lf %f %lf %c %s %s %s\n", sum, sum_f, l, m, i, j, k, n == TestEnum::TYPE_A ? "TYPE_A" : "TYPE_B");
  return sum;
}

When we debug this ELF binary with gdb or lldb, we get the following output:

Breakpoint 1, TestBasicTypeNormal (a=<optimized out>, a@entry=1 '\001', b=b@entry=10 '\n',
    c=c@entry=100, d=d@entry=1000, e=e@entry=10000, f=-854629579, f@entry=100000, g=100000000000,
    h=1000000000000, i=99 'c', j=0x4009b2 "hello", k=0x4009b8 "world", l=<optimized out>,
    m=0.87654321000000002, n=TestEnum::TYPE_B)
22        printf("%lu %lf %f %lf %c %s %s %s\n", sum, sum_f, l, m, i, j, k, n == TestEnum::TYPE_A ? "TYPE_A" : "TYPE_B");
(gdb)

We can see that the variable f has become -854629579. This is weird, because we never changed the value of f.
Dumping the DWARF info gives:

0x00001e3a:     DW_TAG_formal_parameter
                  DW_AT_location        (0x0000015e:
                     [0x0000000000400740, 0x000000000040077c): DW_OP_reg9 R9
                     [0x000000000040077c, 0x000000000040082e): DW_OP_breg7 RSP+8)
                  DW_AT_name    ("f")
                  DW_AT_decl_line       (16)
                  DW_AT_type    (0x00001254 "int32_t")

It shows that f should be in [rsp + 8] between 0x000000000040077c and 0x000000000040082e.
But the assembly code shows that this is not right:

40075a:       44 89 4c 24 08          mov    %r9d,0x8(%rsp)    ----> r9d holds f, and it is saved to rsp+8
....
4007bc:       f2 0f 11 44 24 08       movsd  %xmm0,0x8(%rsp)   ----> xmm0 is saved to rsp+8 without the DWARF info being updated

So the problem is clear: LLVM fails to calculate the correct DWARF address range for the spill location.
This patch is my attempt to fix the problem, but I do not know whether it is the right approach.
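
To make the failure mode concrete, here is a minimal sketch of what a correct range for f would look like. It is not LLVM code: `LocRange`, `SlotStore` and `clipAtClobber` are hypothetical names for illustration, and the model assumes RSP does not move and that every store to the slot is known. The idea is only that a stack-slot range must end right after the first store that overwrites the slot.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical model of one location-list entry: for code addresses in
// [begin, end) the variable is said to live at RSP + slotOffset.
struct LocRange {
  uint64_t begin;
  uint64_t end;
  int64_t slotOffset;
};

// Hypothetical model of an instruction that stores to a stack slot.
struct SlotStore {
  uint64_t address;   // address of the store instruction
  uint64_t length;    // encoded length of the instruction in bytes
  int64_t slotOffset; // RSP-relative slot that is written
};

// End the range right after the earliest store that overwrites the slot
// while the range is open. Stores located before range.begin, or stores to
// other slots, are ignored.
LocRange clipAtClobber(LocRange range, const std::vector<SlotStore> &stores) {
  for (const SlotStore &s : stores) {
    if (s.slotOffset != range.slotOffset || s.address < range.begin)
      continue;
    range.end = std::min(range.end, s.address + s.length);
  }
  return range;
}
```

Fed with the two stores from the listing above (the 5-byte mov at 0x40075a and the 6-byte movsd at 0x4007bc) and the range [0x40077c, 0x40082e), this yields a range ending at 0x4007c2, the address right after the movsd.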

Diff Detail

Unit Tests: Failed

	Time	Test
	120 ms	x64 windows > LLVM.CodeGen/AArch64::spillfill-sve.mir
	160 ms	x64 windows > LLVM.CodeGen/AArch64::wineh-try-catch-vla.ll
	140 ms	x64 windows > LLVM.CodeGen/AMDGPU::greedy-broken-ssa-verifier-error.mir
	140 ms	x64 windows > LLVM.CodeGen/AMDGPU::sgpr-spill-wrong-stack-id.mir
	170 ms	x64 windows > LLVM.CodeGen/AMDGPU::spill-before-exec.mir
		View Full Test Results (47 Failed)

Event Timeline

dongAxis1944 created this revision. Mar 21 2021, 8:27 PM

Herald added subscribers: hiraditya, qcolombet. Mar 21 2021, 8:27 PM

dongAxis1944 requested review of this revision. Mar 21 2021, 8:27 PM

Herald added a project: Restricted Project. Mar 21 2021, 8:27 PM

Herald added a subscriber: llvm-commits.

dongAxis1944 added reviewers: mkuper, StephenTozer, jmorse. Mar 21 2021, 8:31 PM

ChuanqiXu added a subscriber: ChuanqiXu. Mar 21 2021, 8:43 PM

Harbormaster completed remote builds in B94928: Diff 332191. Mar 22 2021, 2:31 AM

Since this is an RFC, I have not yet fixed the failing LLVM unit tests.

Hmmmm, do you have a reduced reproducer in llvm-ir that could go in a bug report? There are a number of things that could be going on here, and we can't be sure which without an example.

Given that the assembly you're using features a stack spill slot being shared by two values, I'd bet on stack slot colouring merging two slots and not modifying debug-info. Alternately, there are certain DBG_VALUEs that LiveDebugVariables produces which are hard for LiveDebugValues to interpret.

For the actual modification in this patch:

I'm not sure what to make of the inliner codegen changes,
The VarLocBasedImpl LiveDebugValues change will drop a lot of other variable locations, which is undesirable.

In D99048#2647988, @jmorse wrote:

Hmmmm, do you have a reduced reproducer in llvm-ir that could go in a bug report? There are a number of things that could be going on here, and we can't be sure which without an example.

Given that the assembly you're using features a stack spill slot being shared by two values, I'd bet on stack slot colouring merging two slots and not modifying debug-info. Alternately, there are certain DBG_VALUEs that LiveDebugVariables produces which are hard for LiveDebugValues to interpret.

For the actual modification in this patch:

I'm not sure what to make of the inliner codegen changes,

The VarLocBasedImpl LiveDebugValues change will drop a lot of other variable locations, which is undesirable.

Thanks for reviewing. I will upload the IR later.

dongAxis1944 added inline comments. Mar 24 2021, 6:36 PM

llvm/lib/CodeGen/LiveDebugValues/VarLocBasedImpl.cpp
1313	@jmorse If the MI is an indirect DBG_VALUE, does that mean the variable's location is on the stack?

dongAxis1944 added inline comments. Mar 24 2021, 6:40 PM

llvm/lib/CodeGen/LiveDebugValues/VarLocBasedImpl.cpp
1313	I just want to skip the DBG_VALUEs related to spill locations, because I find that the function "VarLocBasedLDV::transferSpillOrRestoreInst" already handles spill locations well.

I have just uploaded the C++ file for testing.

test.cpp (1 KB)

Thanks for the reproducer; I think I see the same as you, when LiveDebugValues runs:

MOV32mr $rsp, 1, $noreg, 8, $noreg, $r9d :: (store 4 into %stack.3)
[...]
DBG_VALUE $rsp, 0, !"f", !DIExpression(DW_OP_plus_uconst, 8), debug-location !983; test1.cpp:0 line no:16 indirect

Followed by, later on:

MOVSDmr $rsp, 1, $noreg, 8, $noreg, killed renamable $xmm0 :: (store 8 into %stack.3)

Where the MOVSDmr writes to the stack slot the DBG_VALUE refers to. Unfortunately, there currently isn't a way for LiveDebugValues to connect the DBG_VALUE to the write into %stack.3 at this time. Doing so would involve parsing quite complicated DIExpressions, mapping back to a stack offset, finding the corresponding frame index and then scanning the function for stores to that stack slot. Doing so would be fragile, and we're trying to get away from heavily interpreting DIExpressions. Ultimately, this is because a lot of information is lost when regalloc / LiveDebugVariables runs.
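
The chain of steps described above can be sketched on a toy model, assuming a straight-line instruction list and a known offset-to-frame-index table. Everything here is hypothetical (`ToyInst`, `offsetFromExpr`, `firstClobber` are not LLVM APIs); only the opcode value of DW_OP_plus_uconst is taken from the DWARF standard. The sketch also hints at the fragility: any expression shape other than the simplest one is rejected.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <optional>
#include <vector>

constexpr uint64_t DW_OP_plus_uconst = 0x23; // opcode value from the DWARF standard

// Hypothetical instruction model: only "is it a store, and to which slot".
struct ToyInst {
  bool isStore;
  int frameIndex; // -1 when the instruction does not touch a stack slot
};

// Step 1: recover a stack offset from the expression. Only the shapes {} and
// {DW_OP_plus_uconst, N} are understood; anything else is given up on.
std::optional<int64_t> offsetFromExpr(const std::vector<uint64_t> &expr) {
  if (expr.empty())
    return 0;
  if (expr.size() == 2 && expr[0] == DW_OP_plus_uconst)
    return static_cast<int64_t>(expr[1]);
  return std::nullopt;
}

// Steps 2 and 3: map the offset to a frame index, then scan forward from the
// DBG_VALUE position for the first store to that slot.
std::optional<std::size_t>
firstClobber(const std::vector<ToyInst> &insts, std::size_t dbgValuePos,
             const std::vector<uint64_t> &expr,
             const std::map<int64_t, int> &offsetToFrameIndex) {
  std::optional<int64_t> offset = offsetFromExpr(expr);
  if (!offset)
    return std::nullopt;
  auto slot = offsetToFrameIndex.find(*offset);
  if (slot == offsetToFrameIndex.end())
    return std::nullopt;
  for (std::size_t i = dbgValuePos + 1; i < insts.size(); ++i)
    if (insts[i].isStore && insts[i].frameIndex == slot->second)
      return i;
  return std::nullopt;
}
```

A real pass would additionally have to follow control flow across basic blocks and cope with fragments and more complex expressions, which this sketch deliberately ignores.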

This isn't fixed in the other LiveDebugValues implementation due to the complexity; instead it's fixed in a series of patches that haven't landed yet, by trying to avoid dropping information during regalloc. Those patches *might* be ready for the next release of LLVM, definitely behind an opt-in flag though.

llvm/lib/CodeGen/LiveDebugValues/VarLocBasedImpl.cpp
1313	(The "isIndirect" flag is quite a pain, and hopefully it'll be eliminated when everything becomes DBG_VALUE_LIST instructions in the coming few months.) Right now isIndirect does indeed mean the variable is on the stack -- as opposed to the variable being the stack _pointer_. Alas, we can't just ignore these DBG_VALUEs as we'll drop numerous variable locations. Pre-regalloc DBG_VALUEs of a vreg translate into indirect DBG_VALUEs when that vreg is placed on the stack at the position of the DBG_VALUE. This is different to following a variable that's in a register onto and off-of the stack via transferSpillOrRestoreInst; it's a variable that is assigned a value that is on the stack at that time.
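
As a rough illustration of why skipping indirect DBG_VALUEs loses information, consider this toy tracker. It is a sketch under simplifying assumptions, not the LiveDebugValues algorithm: `ToyDbgValue` and `trackLocations` are hypothetical, and locations are plain strings. The `skipIndirect` flag mimics the guard proposed in this patch.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a DBG_VALUE: variable name, location text, and
// whether the location is indirect (the value lives in memory).
struct ToyDbgValue {
  std::string var;
  std::string loc;
  bool indirect;
};

// Walk the DBG_VALUEs in order and record the last location seen for each
// variable. With skipIndirect set, indirect DBG_VALUEs are ignored entirely,
// so they neither end the previous location nor start a new one.
std::map<std::string, std::string>
trackLocations(const std::vector<ToyDbgValue> &dbgValues, bool skipIndirect) {
  std::map<std::string, std::string> open;
  for (const ToyDbgValue &dv : dbgValues) {
    if (skipIndirect && dv.indirect)
      continue;
    open[dv.var] = dv.loc; // replaces any earlier location of the variable
  }
  return open;
}
```

With the guard enabled, a variable whose only DBG_VALUE is indirect ends up with no location at all, and a variable that moved from a register to the stack keeps its old register location.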

dongAxis1944 added inline comments. Mar 30 2021, 1:51 AM

llvm/lib/CodeGen/InlineSpiller.cpp
979	@jmorse In addition, when regalloc spills a vreg to the stack, LLVM forgets to mark kill flags on the copy instruction. I think they might be useful for DWARF construction. What is your opinion?
llvm/lib/CodeGen/LiveDebugValues/VarLocBasedImpl.cpp
1313	Thank you very much.

jmorse added inline comments. Mar 31 2021, 7:40 AM

llvm/lib/CodeGen/InlineSpiller.cpp
979	Interesting -- I'm not very familiar with the kill flags, I've always relied on analysis passes to determine liveness information. If it's legitimate to place a kill flag here, and it improves variable location tracking like you suggested in D41226, then it's probably a worthwhile improvement. However, you should first check whether this causes LLVM to ever generate different code. I think most of the community doesn't want to change the code generated simply to improve debug-info (unless it's a trivial alteration).

probinson added a subscriber: probinson. Apr 1 2021, 5:45 PM

probinson added inline comments.

llvm/lib/CodeGen/InlineSpiller.cpp
979	What we don't like is when adding -g changes codegen at all. Making slightly different choices (at -O0) that gives better debug experience without particularly affecting performance is okay, as long as that choice doesn't depend on the presence of debug info.
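
The invariant can be phrased as a simple check, sketched here on a toy model where each instruction is a line of text. The helper names `stripDebugInstrs` and `sameCodegen` are hypothetical, and treating every line that starts with "DBG_" as debug-only is an assumption made for illustration.

```cpp
#include <string>
#include <vector>

// Drop debug-only pseudo instructions, identified in this toy model by a
// "DBG_" prefix (covers DBG_VALUE and DBG_VALUE_LIST).
std::vector<std::string>
stripDebugInstrs(const std::vector<std::string> &insts) {
  std::vector<std::string> out;
  for (const std::string &inst : insts)
    if (inst.rfind("DBG_", 0) != 0) // true when inst does not start with "DBG_"
      out.push_back(inst);
  return out;
}

// The build with -g and the build without it should agree on every real
// instruction once debug instructions are removed.
bool sameCodegen(const std::vector<std::string> &withG,
                 const std::vector<std::string> &withoutG) {
  return stripDebugInstrs(withG) == stripDebugInstrs(withoutG);
}
```

The same comparison, applied to the output before and after this patch, is one way to carry out the check requested in the comment above.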

dongAxis1944 added inline comments. Apr 1 2021, 6:53 PM

llvm/lib/CodeGen/InlineSpiller.cpp
979	@probinson thanks, I will check whether the change will affect the performance.
979	@jmorse Thanks, let me check the generated code when applying this patch first.

Revision Contents

Path	Size
llvm/lib/CodeGen/InlineSpiller.cpp	33 lines
llvm/lib/CodeGen/LiveDebugValues/VarLocBasedImpl.cpp	28 lines

Diff 332191

llvm/lib/CodeGen/InlineSpiller.cpp

@@ (821 lines not shown) @@ foldMemoryOperand(ArrayRef<std::pair<MachineInstr *, unsigned>> Ops,
                   MachineInstr *LoadMI) {
   if (Ops.empty())
     return false;
   // Don't attempt folding in bundles.
   MachineInstr *MI = Ops.front().first;
   if (Ops.back().first != MI || MI->isBundled())
     return false;

-  bool WasCopy = MI->isCopy();
   Register ImpReg;
+  bool WasCopy = MI->isCopy();
+  Register CopyReg = MCRegister::NoRegister;
+  if (WasCopy && MI->getNumOperands() == 2) {
+    assert(MI->getOperand(0).isDef() && "operator 0 should be a def");
+    assert(MI->getOperand(1).isUse() && "operator 1 should be a use");
+
+    MachineOperand &MO = MI->getOperand(1);
+    if (MO.isReg() && !MO.isImplicit())
+      CopyReg = MO.getReg();
+  }

   // TII::foldMemoryOperand will do what we need here for statepoint
   // (fold load into use and remove corresponding def). We will replace
   // uses of removed def with loads (spillAroundUses).
   // For that to work we need to untie def and use to pass it through
   // foldMemoryOperand and signal foldPatchpoint that it is allowed to
   // fold them.
   bool UntieRegs = MI->getOpcode() == TargetOpcode::STATEPOINT;
@@ (102 lines not shown) @@ if (ImpReg)
     for (unsigned i = FoldMI->getNumOperands(); i; --i) {
       MachineOperand &MO = FoldMI->getOperand(i - 1);
       if (!MO.isReg() || !MO.isImplicit())
         break;
       if (MO.getReg() == ImpReg)
         FoldMI->RemoveOperand(i - 1);
     }

-  LLVM_DEBUG(dumpMachineInstrRangeWithSlotIndex(MIS.begin(), MIS.end(), LIS,
-                                                "folded"));
+  auto TryKillReg = [&]() {
+    assert(WasCopy && "old machine instruction must be a copy instruction");
+    SmallVector<std::pair<MachineInstr*, unsigned>, 8> SpillMIOps;
+    VirtRegInfo RI = AnalyzeVirtRegInBundle(*FoldMI, CopyReg, &SpillMIOps);
+    for (const auto &OpPair : SpillMIOps) {
+      MachineOperand &MO = OpPair.first->getOperand(OpPair.second);
+      if (MO.isUse())
+        if (!OpPair.first->isRegTiedToDefOperand(OpPair.second))
+          MO.setIsKill();
+    }
+  };
   if (!WasCopy)
     ++NumFolded;
   else if (Ops.front().second == 0) {
     ++NumSpills;
     // If there is only 1 store instruction is required for spill, add it
     // to mergeable list. In X86 AMX, 2 intructions are required to store.
     // We disable the merge for this case.
-    if (std::distance(MIS.begin(), MIS.end()) <= 1)
+    if (std::distance(MIS.begin(), MIS.end()) <= 1) {
+      if (CopyReg.isValid())
+        TryKillReg();
       HSpiller.addToMergeableSpills(*FoldMI, StackSlot, Original);
+    }
   } else
     ++NumReloads;

+  LLVM_DEBUG(dumpMachineInstrRangeWithSlotIndex(MIS.begin(), MIS.end(), LIS,
+                                                "folded"));
   return true;
 }

 void InlineSpiller::insertReload(Register NewVReg,
                                  SlotIndex Idx,
                                  MachineBasicBlock::iterator MI) {
   MachineBasicBlock &MBB = *MI->getParent();

@@ (641 lines not shown) @@

llvm/lib/CodeGen/LiveDebugValues/VarLocBasedImpl.cpp

@@ (1,304 lines not shown) @@ if (removeEntryValue(MI, OpenRanges, VarLocIDs, EntryVL)) {
       OpenRanges.erase(EntryVL);
     }
   }

   if (all_of(MI.debug_operands(), [](const MachineOperand &MO) {
         return (MO.isReg() && MO.getReg()) || MO.isImm() || MO.isFPImm() ||
                MO.isCImm();
       })) {
+    if (!MI.isIndirectDebugValue()) {
     // Use normal VarLoc constructor for registers and immediates.
     VarLoc VL(MI, LS);
     // End all previous ranges of VL.Var.
     OpenRanges.erase(VL);

     LocIndices IDs = VarLocIDs.insert(VL);
     // Add the VarLoc to OpenRanges from this DBG_VALUE.
     OpenRanges.insert(IDs, VL);
+    }
   } else if (MI.memoperands().size() > 0) {
     llvm_unreachable("DBG_VALUE with mem operand encountered after regalloc?");
   } else {
     // This must be an undefined location. If it has an open range, erase it.
     assert(MI.isUndefDebugValue() &&
            "Unexpected non-undef DBG_VALUE encountered");
     VarLoc VL(MI, LS);
     OpenRanges.erase(VL);
@@ (841 lines not shown) @@ while (!Worklist.empty()) {
       if (MBBJoined) {
         MBBJoined = false;
         Changed = true;
         // Now that we have started to extend ranges across BBs we need to
         // examine spill, copy and restore instructions to see whether they
         // operate with registers that correspond to user variables.
         // First load any pending inlocs.
         OpenRanges.insertFromLocSet(getVarLocsInMBB(MBB, InLocs), VarLocIDs);
-        for (auto &MI : *MBB)
+        for (auto &MI : *MBB) {
+#if !defined(NDEBUG)
+          MI.dump();
+          for (uint64_t ID : OpenRanges.getSpillVarLocs()) {
+            LocIndex Idx = LocIndex::fromRawInteger(ID);
+            const VarLoc &VL = VarLocIDs[Idx];
+            assert(VL.containsSpillLocs() && "Broken VarLocSet?");
+            VL.dump(TRI);
+          }
+#endif
           process(MI, OpenRanges, VarLocIDs, Transfers);
+        }
         OLChanged |= transferTerminator(MBB, OpenRanges, OutLocs, VarLocIDs);

         LLVM_DEBUG(printVarLocInMBB(MF, OutLocs, VarLocIDs,
                                     "OutLocs after propagating", dbgs()));
         LLVM_DEBUG(printVarLocInMBB(MF, InLocs, VarLocIDs,
                                     "InLocs after propagating", dbgs()));

         if (OLChanged) {
@@ (39 lines not shown) @@