Page MenuHomePhabricator

[RFC][DebugInfo] Do not use the DBG_VALUE to calculate debug info of spill location
Needs ReviewPublic

Authored by dongAxis1944 on Mar 21 2021, 8:27 PM.

Details

Summary

Considering the following c++ code:

enum class TestEnum ...
#define CHECK_GDB_TEST(s) ...

__attribute__((noinline)) uint64_t
TestBasicTypeNormal(uint8_t a, int8_t b, uint16_t c, int16_t d, uint32_t e,
                    int32_t f, uint64_t g, int64_t h, char i, const char *j,
                    const char k[], float l, double m, TestEnum n) {
  CHECK_GDB_TEST("coro_variable_test");
  uint64_t sum = a + b + c + d + e + f + g + h;
  double sum_f = l + m;
  CHECK_GDB_TEST("coro_variable_test");
  printf("%lu %lf %f %lf %c %s %s %s\n", sum, sum_f, l, m, i, j, k, n == TestEnum::TYPE_A ? "TYPE_A" : "TYPE_B");
  return sum;
}

When we use gdb or lldb to debug this elf, we get the following ouput:

Breakpoint 1, TestBasicTypeNormal (a=<optimized out>, a@entry=1 '\001', b=b@entry=10 '\n',
    c=c@entry=100, d=d@entry=1000, e=e@entry=10000, f=-854629579, f@entry=100000, g=100000000000,
    h=1000000000000, i=99 'c', j=0x4009b2 "hello", k=0x4009b8 "world", l=<optimized out>,
    m=0.87654321000000002, n=TestEnum::TYPE_B)
22        printf("%lu %lf %f %lf %c %s %s %s\n", sum, sum_f, l, m, i, j, k, n == TestEnum::TYPE_A ? "TYPE_A" : "TYPE_B");
(gdb)

We can find the variable l become -854629579. It is wried, because we did not change the value of l.
And after downloading the dwarf info:

0x00001e3a:     DW_TAG_formal_parameter
                  DW_AT_location        (0x0000015e:
                     [0x0000000000400740, 0x000000000040077c): DW_OP_reg9 R9
                     [0x000000000040077c, 0x000000000040082e): DW_OP_breg7 RSP+8)
                  DW_AT_name    ("f")
                  DW_AT_decl_line       (16)
                  DW_AT_type    (0x00001254 "int32_t")

It shows f should be in [rsp + 8] between 0x000000000040077c and 0x000000000040082e.
But it is not right after checking the assembly code:

40075a:       44 89 4c 24 08          mov    %r9d,0x8(%rsp)    ----> r9d is l, and it save to the rsp+8
....
4007bc:       f2 0f 11 44 24 08       movsd  %xmm0,0x8(%rsp)   ---->  save xmm0 to the rsp+8 without notify the dwarf

So the problem is clear, llvm failed to calculate dwarf interval for spill location.
I use this patch to fix the problems, but I do not know whether it is right.

Diff Detail

Unit TestsFailed

TimeTest
120 msx64 windows > LLVM.CodeGen/AArch64::spillfill-sve.mir
Script: -- : 'RUN: at line 1'; c:\ws\w16c2-1\llvm-project\premerge-checks\build\bin\llc.exe -mtriple=aarch64-linux-gnu -run-pass=greedy C:\ws\w16c2-1\llvm-project\premerge-checks\llvm\test\CodeGen\AArch64\spillfill-sve.mir -o - | c:\ws\w16c2-1\llvm-project\premerge-checks\build\bin\filecheck.exe C:\ws\w16c2-1\llvm-project\premerge-checks\llvm\test\CodeGen\AArch64\spillfill-sve.mir
160 msx64 windows > LLVM.CodeGen/AArch64::wineh-try-catch-vla.ll
Script: -- : 'RUN: at line 1'; c:\ws\w16c2-1\llvm-project\premerge-checks\build\bin\llc.exe -o - C:\ws\w16c2-1\llvm-project\premerge-checks\llvm\test\CodeGen\AArch64\wineh-try-catch-vla.ll -mtriple=aarch64-windows -verify-machineinstrs | c:\ws\w16c2-1\llvm-project\premerge-checks\build\bin\filecheck.exe C:\ws\w16c2-1\llvm-project\premerge-checks\llvm\test\CodeGen\AArch64\wineh-try-catch-vla.ll
140 msx64 windows > LLVM.CodeGen/AMDGPU::greedy-broken-ssa-verifier-error.mir
Script: -- : 'RUN: at line 1'; c:\ws\w16c2-1\llvm-project\premerge-checks\build\bin\llc.exe -march=amdgcn -mcpu=gfx1030 -verify-machineinstrs -run-pass=greedy -stress-regalloc=2 C:\ws\w16c2-1\llvm-project\premerge-checks\llvm\test\CodeGen\AMDGPU\greedy-broken-ssa-verifier-error.mir -o - | c:\ws\w16c2-1\llvm-project\premerge-checks\build\bin\filecheck.exe -check-prefix=GCN C:\ws\w16c2-1\llvm-project\premerge-checks\llvm\test\CodeGen\AMDGPU\greedy-broken-ssa-verifier-error.mir
140 msx64 windows > LLVM.CodeGen/AMDGPU::sgpr-spill-wrong-stack-id.mir
Script: -- : 'RUN: at line 1'; c:\ws\w16c2-1\llvm-project\premerge-checks\build\bin\llc.exe -mtriple=amdgcn-amd-amdhsa -verify-machineinstrs -stress-regalloc=3 -start-before=greedy -stop-after=stack-slot-coloring -o - C:\ws\w16c2-1\llvm-project\premerge-checks\llvm\test\CodeGen\AMDGPU\sgpr-spill-wrong-stack-id.mir | c:\ws\w16c2-1\llvm-project\premerge-checks\build\bin\filecheck.exe -check-prefixes=SHARE,GCN C:\ws\w16c2-1\llvm-project\premerge-checks\llvm\test\CodeGen\AMDGPU\sgpr-spill-wrong-stack-id.mir
170 msx64 windows > LLVM.CodeGen/AMDGPU::spill-before-exec.mir
Script: -- : 'RUN: at line 2'; c:\ws\w16c2-1\llvm-project\premerge-checks\build\bin\llc.exe -mtriple=amdgcn-- -verify-machineinstrs -debug-only=regalloc -run-pass=greedy -o /dev/null C:\ws\w16c2-1\llvm-project\premerge-checks\llvm\test\CodeGen\AMDGPU\spill-before-exec.mir 2>&1 | c:\ws\w16c2-1\llvm-project\premerge-checks\build\bin\filecheck.exe C:\ws\w16c2-1\llvm-project\premerge-checks\llvm\test\CodeGen\AMDGPU\spill-before-exec.mir
View Full Test Results (47 Failed)

Event Timeline

dongAxis1944 created this revision.Mar 21 2021, 8:27 PM
dongAxis1944 requested review of this revision.Mar 21 2021, 8:27 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 21 2021, 8:27 PM

Since this is RFC, so I do not fix ut of LLVM.

Hmmmm, do you have a reduced reproducer in llvm-ir that could go in a bug report? There are a number of things that could be going on here, and we can't be sure which without an example.

Given that the assembly you're using features a stack spill slot being shared by two values, I'd bet on stack slot colouring merging two slots and not modifying debug-info. Alternately, there are certain DBG_VALUEs that LiveDebugVariables produces which are hard for LiveDebugValues to interpret.

For the actual modification in this patch:

  • I'm not sure what to make of the inliner codegen changes,
  • The VarLocBasedImpl LiveDebugValues change will drop a lot of other variable locations, which is undesirable.

Hmmmm, do you have a reduced reproducer in llvm-ir that could go in a bug report? There are a number of things that could be going on here, and we can't be sure which without an example.

Given that the assembly you're using features a stack spill slot being shared by two values, I'd bet on stack slot colouring merging two slots and not modifying debug-info. Alternately, there are certain DBG_VALUEs that LiveDebugVariables produces which are hard for LiveDebugValues to interpret.

For the actual modification in this patch:

  • I'm not sure what to make of the inliner codegen changes,
  • The VarLocBasedImpl LiveDebugValues change will drop a lot of other variable locations, which is undesirable.

Thanks for reviewing. I will upload the IR later.

dongAxis1944 added inline comments.Mar 24 2021, 6:36 PM
llvm/lib/CodeGen/LiveDebugValues/VarLocBasedImpl.cpp
1313

@jmorse If the MI is indirect DBG_VALUE, does it mean the positions of variable is in the stack?

dongAxis1944 added inline comments.Mar 24 2021, 6:40 PM
llvm/lib/CodeGen/LiveDebugValues/VarLocBasedImpl.cpp
1313

I just want to skip the DBG_VALUE related to the spill location. Because I find the function "VarLocBasedLDV::transferSpillOrRestoreInst" can handle spill location well

dongAxis1944 added a comment.EditedMar 24 2021, 7:50 PM

I just upload the c++ file for testing.

Thanks for the reproducer; I think I see the same as you, when LiveDebugValues runs:

MOV32mr $rsp, 1, $noreg, 8, $noreg, $r9d :: (store 4 into %stack.3)
[...]
DBG_VALUE $rsp, 0, !"f", !DIExpression(DW_OP_plus_uconst, 8), debug-location !983; test1.cpp:0 line no:16 indirect

Followed by, later on:

MOVSDmr $rsp, 1, $noreg, 8, $noreg, killed renamable $xmm0 :: (store 8 into %stack.3)

Where the MOVSDmr writes to the stack slot the DBG_VALUE refers to. Unfortunately, there currently isn't a way for LiveDebugValues to connect the DBG_VALUE to the write into %stack.3 at this time. Doing so would involve parsing quite complicated DIExpressions, mapping back to a stack offset, finding the corresponding frame index and then scanning the function for stores to that stack slot. Doing so would be fragile, and we're trying to get away from heavily interpreting DIExpressions. Ultimately, this is because a lot of information is lost when regalloc / LiveDebugVariables runs.

This isn't fixed in the other LiveDebugValues implementation due to the complexity; instead it's fixed in a series of patches that haven't landed yet, by trying to avoid dropping information during regalloc. Those patches *might* be ready for the next release of LLVM, definitely behind an opt-in flag though.

llvm/lib/CodeGen/LiveDebugValues/VarLocBasedImpl.cpp
1313

(The "isIndirect" flag is quite a pain, and hopefully it'll be eliminated when everything becomes DBG_VALUE_LIST instructions in the coming few months,)

Right now isIndirect does indeed mean the variable is on the stack -- as opposed to the variable being the stack _pointer_. Alas, we can't just ignore these DBG_VALUEs as we'll drop numerous variable locations. Pre-regalloc DBG_VALUEs of a vreg translate into indirect DBG_VALUEs when that vreg is placed on the stack at the position of the DBG_VALUE.

This is different to following a variable that's in a register onto and off-of the stack via transferSpillOrRestoreInst; it's a variable that is assigned a value that is on the stack at that time.

dongAxis1944 added inline comments.Mar 30 2021, 1:51 AM
llvm/lib/CodeGen/InlineSpiller.cpp
979

@jmorse In addition, when regalloc tries to spill vreg to stack. llvm forget to mark kill flags for copy instruction. I think it might be useful for dwarf construction. What's your opinions?

llvm/lib/CodeGen/LiveDebugValues/VarLocBasedImpl.cpp
1313

thank you very much

jmorse added inline comments.Mar 31 2021, 7:40 AM
llvm/lib/CodeGen/InlineSpiller.cpp
979

Interesting -- I'm not very familiar with the kill flags, I've always relied on analysis passes to determine liveness information. If it's legitimate to place a a kill flag here, and it improves variable location tracking like you suggested in D41226, then it's probably a worthwhile improvement.

However, you should first check whether this causes LLVM to ever generate different code. I think most of the community doesn't want to change the code generated simply to improve debug-info (unless it's a trivial alteration).

probinson added inline comments.
llvm/lib/CodeGen/InlineSpiller.cpp
979

What we don't like is when adding -g changes codegen at all. Making slightly different choices (at -O0) that gives better debug experience without particularly affecting performance is okay, as long as that choice doesn't depend on the presence of debug info.

dongAxis1944 added inline comments.Apr 1 2021, 6:53 PM
llvm/lib/CodeGen/InlineSpiller.cpp
979

@probinson thanks, I will check whether the change will affect the performance.

979

@jmorse Thanks, let me check the generated code when applying this patch first.