Page MenuHomePhabricator

[DWARF] Inlined variables with no location should not have a DW_TAG_variable
ClosedPublic

Authored by jmorse on Jan 28 2021, 7:17 AM.

Details

Summary

Discussed in this thread:

https://lists.llvm.org/pipermail/llvm-dev/2021-January/148139.html

DwarfDebug::collectEntityInfo accidentally distinguishes between variable locations that never have a location specified, and variable locations that have an empty location specified. The latter leads to the creation of an empty variable referring to the abstract origin:

DW_TAG_formal_parameter
    DW_AT_abstract_origin       (0x0000005a "bar")

Which is needless and potentially misleading. This patch makes collectEntityInfo behave the same way for both ways an empty variable can be represented, by only emitting a concrete variable if it actually has a location. The added test exercises this, and checks variables are created for "normal" functions too.

The NVPTX tests fail with this; this is because they were compiled to LLVM-IR before retainedNodes was correctly tracked, apparently. I've made the following changes to reflect changes in output:

  • debug-addr-class.ll: Added the list of retained nodes to the input. This causes two types to shift position in the output for reasons I don't understand.
  • debug-info.ll: The size of the unit changes because we cease emitting two parameters with abstract-origins as a result of this patch
  • debug-loc-offset.ll: adding retainedNodes to this makes the abbrev numbering change, and one new variable to be emitted.

Diff Detail

Event Timeline

jmorse created this revision.Jan 28 2021, 7:17 AM
jmorse requested review of this revision.Jan 28 2021, 7:17 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 28 2021, 7:17 AM
djtodoro added inline comments.Feb 1 2021, 2:46 AM
llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
1789–1800

Can we refactor this into a predicate function (according to the CodingStandards)?

1803

Are we covering inlined variables only with this?

jmorse updated this revision to Diff 320797.Feb 2 2021, 8:17 AM

Shift "does this list have a location?" predicate into a function

jmorse marked an inline comment as done.Feb 2 2021, 8:20 AM
jmorse added inline comments.
llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
1803

This filter covers everything -- otherwise we risk having some other distinction between inlined and non-inlined variables with no location IMO.

There's a loop at the bottom of this function to create concrete entities for non-inlined variables that weren't given a location, and back in endFunctionImpl there's a loop over all abstract scopes to ensure they get abstract entities created.

djtodoro accepted this revision.Feb 3 2021, 1:05 AM

lgtm, thanks!

NOTE: I will try to address this in llvm-dwarfdump --stats/locstats as well (asap).
This revision is now accepted and ready to land.Feb 3 2021, 1:05 AM
jmorse updated this revision to Diff 321379.Feb 4 2021, 4:34 AM

I reverted this as a builtbot objected:

http://lab.llvm.org:8011/#builders/105/builds/5486

The failing test, missing-abstract-variable.ll, is a known bad behaver on certain platforms. In the past it's been XFailed on those platforms because certain inlined parameters get optimised out. That xfail was deleted after these two commits:

https://reviews.llvm.org/rGff2073a51fe283bde05f48c300e3a9ee807b1b6d
https://reviews.llvm.org/rG93faeecd8fa1fc148f2ee0d0cb64f11c3d41ce49

I believe the latter was allowing more DBG_VALUEs to pass through isel, which are then optimised out in this test. The behaviour that lets this test pass on (mips, powerpc64, s390x, sparc) is thus exactly the behaviour this patch is trying to disable, and it's fine to re-apply the XFail.

I'll leave this as re-opened but accepted for a ~day, I'm confident about the fix but wanted somewhere to write the explanation, plus leave a little time in case there's a wider view.

jmorse reopened this revision.Feb 4 2021, 4:34 AM
This revision is now accepted and ready to land.Feb 4 2021, 4:34 AM

I reverted this as a builtbot objected:

http://lab.llvm.org:8011/#builders/105/builds/5486

The failing test, missing-abstract-variable.ll, is a known bad behaver on certain platforms. In the past it's been XFailed on those platforms because certain inlined parameters get optimised out. That xfail was deleted after these two commits:

https://reviews.llvm.org/rGff2073a51fe283bde05f48c300e3a9ee807b1b6d
https://reviews.llvm.org/rG93faeecd8fa1fc148f2ee0d0cb64f11c3d41ce49

I believe the latter was allowing more DBG_VALUEs to pass through isel, which are then optimised out in this test. The behaviour that lets this test pass on (mips, powerpc64, s390x, sparc) is thus exactly the behaviour this patch is trying to disable, and it's fine to re-apply the XFail.

I'll leave this as re-opened but accepted for a ~day, I'm confident about the fix but wanted somewhere to write the explanation, plus leave a little time in case there's a wider view.

In a separate/preparatory commit, perhaps you could add checks that these inlined formal parameters have DW_AT_location. (& adding the xfail for the architectures that don't have locations on these variables/parameters)

& then this change won't have to touch that test, I think.

MaskRay added a subscriber: MaskRay.EditedFeb 6 2021, 2:06 PM

Reproduce: https://gist.github.com/MaskRay/4c67b29bb038d5a016260437c9e25a7a

# git revert -n 56fa34ae3570a34fd0f4c2cf1bfaf095da01a959 # revert the workaround
% cat run.sh 
#!/bin/sh -e
llc -split-dwarf-file x.dwo -filetype=obj $1 -o x.o
if llvm-dwarfdump -debug-gnu-pubnames x.o 2> log; then
  exit 1
fi
grep 'name lookup table at' log
% ./run.sh a.ll
error: name lookup table at offset 0x12cc has a terminator at offset 0x12da before the expected end at 0x12fd
jmorse added a comment.Feb 8 2021, 2:23 AM

In a separate/preparatory commit, perhaps you could add checks that these inlined formal parameters have DW_AT_location. (& adding the xfail for the architectures that don't have locations on these variables/parameters)

& then this change won't have to touch that test, I think.

Added in e05c10380ce7, I'll leave it to soak for a bit, then put this patch back in.

[...]

error: name lookup table at offset 0x12cc has a terminator at offset 0x12da before the expected end at 0x12fd

I think this was destined for D94976 (the _other_ dwarf patch of mine that's going in and out), I'll copy it there and start investigating, thanks for the reproducer.

Added in e05c10380ce7, I'll leave it to soak for a bit, then put this patch back in.

That change makes the test fail on arm (or at least on mac/arm): http://45.33.8.238/macm1/3175/step_10.txt Please take a look.

jmorse added a comment.Feb 9 2021, 2:10 AM

Hi Nico,

That change makes the test fail on arm (or at least on mac/arm): http://45.33.8.238/macm1/3175/step_10.txt Please take a look.

I've rolled back the change; this is curious though, because it was passing on a 32-bit arm bot [0], and I added aarch64 to the list of XFails so would imagine that arm-macs would be caught by that. What triple does that bot target?

[0] http://lab.llvm.org:8011/#/builders/111/builds/1545 <--- note this has unrelated test failures

Hi Nico,

That change makes the test fail on arm (or at least on mac/arm): http://45.33.8.238/macm1/3175/step_10.txt Please take a look.

I've rolled back the change; this is curious though, because it was passing on a 32-bit arm bot [0], and I added aarch64 to the list of XFails so would imagine that arm-macs would be caught by that. What triple does that bot target?

[0] http://lab.llvm.org:8011/#/builders/111/builds/1545 <--- note this has unrelated test failures

I won't be near my arm mac for a few more hours, but de1bca4b36deb2a405bed85f7a7c71c09ab3c586 suggests that the triple is arm64-apple.

I won't be near my arm mac for a few more hours, but de1bca4b36deb2a405bed85f7a7c71c09ab3c586 suggests that the triple is arm64-apple.

Here's what cmake prints when running it on an m1 mbp:

-- LLVM host triple: arm64-apple-darwin20.2.0
-- LLVM default target triple: arm64-apple-darwin20.2.0
jmorse added a comment.Feb 9 2021, 1:23 PM
  • LLVM host triple: arm64-apple-darwin20.2.0
  • LLVM default target triple: arm64-apple-darwin20.2.0

Hmmmmm. I'll stick "arm64" into the XFail list when re-applying that patch, there's very little chance of the term "arm64" ever referring to another architecture. To state the obvious, it's not ideal that there are two names for the same architecture, but I'll move that discussion to the mailing list rather than code review.

Oh, interesting note I just realized in some other code I was investigating (for https://bugs.llvm.org/show_bug.cgi?id=49769 ) - this sort of change should be made for non-inline functions too. WEll, specifically non-inline functions that have an abstract origin.

eg:

void do_stuff();
__attribute__((always_inline)) void f1() {
  int i;
  do_stuff();
}
void f2() { f1(); }
$ clang++-tot -O1 test.cpp -g -c && llvm-dwarfdump-tot test.o
...
0x0000002a:   DW_TAG_subprogram
                DW_AT_low_pc    (0x0000000000000000)
                DW_AT_high_pc   (0x0000000000000008)
                DW_AT_frame_base        (DW_OP_reg7 RSP)
                DW_AT_GNU_all_call_sites        (true)
                DW_AT_abstract_origin   (0x0000005b "_Z2f1v")

0x0000003d:     DW_TAG_variable
                  DW_AT_abstract_origin (0x00000067 "i")

0x00000042:     DW_TAG_GNU_call_site
                  DW_AT_abstract_origin (0x00000050 "_Z8do_stuffv")
                  DW_AT_low_pc  (0x0000000000000006)

0x0000004f:     NULL