This is an archive of the discontinued LLVM Phabricator instance.

[DebugInfo] Stop changing labels for register-described parameter DBG_VALUEs
ClosedPublic

Authored by dstenb on Jan 31 2019, 6:52 AM.

Details

Summary

This is a follow-up to D57510. This patch removes the code from
DebugHandlerBase::beginFunction() which changes the starting label for
the first non-overlapping register-described DBG_VALUEs of parameters
to the beginning of the function. That code does not consider what defines the
registers, which can result in the ranges for the debug values starting before
their defining instructions. The code also violates the DbgValueHistoryMap's
ranges, which I think can make it quite confusing when troubleshooting.
We currently do not emit debug values for constant values directly at the
start of the function, so this code is still useful for such values, but my
intention is to remove the code from beginFunction() completely when we
get there.

In D57510, PrologEpilogInserter was amended so that parameter
DBG_VALUEs are kept at the start of the entry block, even after emission of
prologue code. That was done to reduce the degradation of debug
completeness from this patch. PR40638 is another example, where the
lexical-scope trimming that LDV does results in instructions after the prologue
being left without locations. There might be other cases where the
DBG_VALUEs are pushed further down, for which the DebugHandlerBase code
may be helpful, but as it now quite often result in incorrect locations,
even after the prologue, it seems better to remove that code, and try to
work our way up with accurate locations.

In the long run we should maybe not aim to provide accurate locations
inside the prologue. Some single location descriptions, at least those
referring to stack values, generate inaccurate values inside the
epilogue, so we maybe should not aim to achieve accuracy for location
lists. However, it seems that we now emit line number programs that can
result in GDB and LLDB stopping inside the prologue when doing line
number stepping into functions. See PR40188 for more information.

A summary of some of the changed test cases is available in PR40188#c2.

Diff Detail

Repository
rL LLVM

Event Timeline

dstenb created this revision.Jan 31 2019, 6:52 AM

LGTM, although I'll wait for someone else to approve.

Does this have a noticeable effect on the debug info size?
Do we know about any remaining bugs where the first DBG_VALUE comes after the prologue_end and we would have a valid location at prologue_end? With constants perhaps? If yes, this would be a noticeable usability regression. Otherwise this just improves accuracy, which is always great.

Herald added a project: Restricted Project. · View Herald TranscriptJan 31 2019, 4:12 PM
dstenb added a comment.Feb 1 2019, 7:34 AM

Does this have a noticeable effect on the debug info size?

I will check this.

Do we know about any remaining bugs where the first DBG_VALUE comes after the prologue_end and we would have a valid location at prologue_end? With constants perhaps? If yes, this would be a noticeable usability regression. Otherwise this just improves accuracy, which is always great.

I reverted this patch (but kept D57510), and added the following patch (https://reviews.llvm.org/D57587) which prints whether or not the debug values that beginFunction() move are inside the prologue (using the same check as DwarfDebug's findPrologueEndLoc()).

I built a RelWithDebInfo-profiled Clang trunk binary with X86 as target using the above.

This resulted in the following numbers:

$ grep prologue a.log | sort | uniq -c
   4301   $noreg outside prologue
 220636   already inside prologue
     37   other type of value outside prologue
    813   reg value outside prologue

As seen, we have some $noreg debug values for parameters that are placed outside the prologue. Not moving those to the top of the function shouldn't affect the value in the debugger (it should be "unavailable" either way), but I guess that it may result in more location lists (I haven't verified this yet). For other values ("other type of value" in the table above), e.g. constants, this situation seems rare, only happening 37 times in a Clang build.

I'm slightly worried that if we merge this patch too early we will get lots of complaints that variables are no longer visible in the debugger (because as an end-user you don't know whether the variable would become visible after the first single-step). The statistics will help us make that call, thanks!

dstenb added a comment.Feb 4 2019, 6:56 AM

I have divided this into three measurements:

  • "before": Without this patch
  • "regs": Without this patch, but stop changing the label for values that are described by registers.
  • "after": With this patch

As before, I built a clang binary from trunk using -DCMAKE_BUILD_TYPE=RelWithDebInfo -DLLVM_TARGETS_TO_BUILD=X86.

Here are the diffs from llvm-dwarfdump -statistics when stepping between the variants:

$ diff -u before.txt regs.txt
--- before.txt  2019-02-04 15:17:06.985994932 +0100
+++ regs.txt    2019-02-04 15:14:27.799763830 +0100
@@ -1,7 +1,7 @@
 {
 "call site entries": 0,
-"scope bytes covered": 365024733,
-"scope bytes total": 703739122,
+"scope bytes covered": 364668389,
+"scope bytes total": 703739115,
 "format": "ELF64-x86-64",
 "source functions": 330578,
 "total function size": 43819231,
@@ -9,7 +9,7 @@
 "source variables": 7617628,
 "variables with location": 5251165,
 "version": 1,
-"file": "clang-before",
+"file": "clang-regs",
 "inlined functions": 3846999,
 "unique source variables": 1041245
 }
$ diff -u regs.txt after.txt
--- regs.txt    2019-02-04 15:14:27.799763830 +0100
+++ after.txt   2019-02-04 15:19:18.815674181 +0100
@@ -1,7 +1,7 @@
 {
 "call site entries": 0,
-"scope bytes covered": 364668389,
-"scope bytes total": 703739115,
+"scope bytes covered": 364655966,
+"scope bytes total": 703739101,
 "format": "ELF64-x86-64",
 "source functions": 330578,
 "total function size": 43819231,
@@ -9,7 +9,7 @@
 "source variables": 7617628,
 "variables with location": 5251165,
 "version": 1,
-"file": "clang-regs",
+"file": "clang-after",
 "inlined functions": 3846999,
 "unique source variables": 1041245
 }
$ ls clang-* | xargs -n1 stat --format='%s %n' | sort -h
1313541136 clang-before
1313541344 clang-regs
1313541592 clang-after

I'm not very familiar with interpreting llvm-dwarfdump's statistics dump, but I'd thought I would at least post it while I try to understand it.

Are there any more measurements I should do?

I guess that a first step could be to only drop beginFunction's behavior for register-described debug values, as it is for those it misbehaves, and we actively try to place such debug values at the start of the function e.g. in SelectionDAG.

aprantl accepted this revision.Feb 4 2019, 9:49 AM

Thanks. As expected, it drives the number of variables that are not available at function entry down a bit. But my understanding is that we are pretty confident that most of them were incorrect before and I suppose we'll be more incentivized to get correct locations at the function entry with this patch applied than without it; even if it involves more disappointed end-users. It's a tough call, but I highly value correctness and reliability in the debugger, so let's do it! It will be the right thing in the long run.

test/DebugInfo/COFF/fp-stack.ll
13 ↗(On Diff #184495)

this definitely looks more correct.

This revision is now accepted and ready to land.Feb 4 2019, 9:49 AM
dstenb updated this revision to Diff 185780.Feb 7 2019, 8:48 AM
dstenb retitled this revision from [DebugInfo] Stop changing labels for parameter DBG_VALUEs to [DebugInfo] Stop changing labels for register-described parameter DBG_VALUEs.
dstenb edited the summary of this revision. (Show Details)

I have changed the patch so that it still changes the labels for non-register debug values. This is done to reduce the regressions. I have not been able to find a C program where moving a constant-described value results in false positives. I will write a PR about getting SelectionDAG to emit debug values for constant arguments directly at the start of the function, which hopefully should make it possible to remove the rest of beginFunction() without as many regressions.

I have spent some time looking at regressions on 32- and 64-bits x86 for register-described values with this patch. The one type of regression I have found yet is due to https://bugs.llvm.org/show_bug.cgi?id=40638. The lexical-scope trimming that LiveDebugVariables does move debug values down, and we can then land in situations where we schedule up an instruction with a debug location above the moved debug value, effectively ending the prologue before the parameter is described. I started looking into stopping (or at least limiting) the trimming for non-inlined parameters, so that we keep the debug values directly at the start of the function, but I don't have a patch for that yet. However, in a majority of the cases where we were left with uncovered instructions after the prologue, it would have been incorrect to move the label, so it seems to me that this patch is worth getting in, even without a patch for PR40638.

Most of the cases I looked at where we moved the debug value above the defining instructions seems to be for constant parameters that we currently can't emit as constant debug values, e.g. GlobalValues, and various constant expressions, which we instead rely on lowered values somewhere down in the function.

aprantl added inline comments.Feb 7 2019, 1:16 PM
lib/CodeGen/AsmPrinter/DebugHandlerBase.cpp
240 ↗(On Diff #185780)

Does this become more readable by changing it to

 if (!std::all_of(...))
  break;
if (!IsDescribedByRef(...))

?

dstenb updated this revision to Diff 185956.Feb 8 2019, 5:51 AM

Update after comment.

dstenb marked an inline comment as done.Feb 8 2019, 5:52 AM
dstenb added inline comments.
lib/CodeGen/AsmPrinter/DebugHandlerBase.cpp
240 ↗(On Diff #185780)

Yes; thanks! I inverted it to an any_of() also.

nhaehnle removed a subscriber: nhaehnle.Feb 12 2019, 1:19 AM

Thanks for the reviews! Unless there are any more comments, I'm planning on merging this tomorrow or Thursday.

This revision was automatically updated to reflect the committed changes.