This is an archive of the discontinued LLVM Phabricator instance.

[DebugInfo] Support more than 2 operands in DWARF operations
ClosedPublic

Authored by scott.linder on Mar 30 2023, 1:52 PM.

Details

Summary

Update DWARFExpression::Operation and LVOperation to support more than
2 operands.

Take the opportunity to use a SmallVector, which will handle at least 2
operands without allocation anyway, and removes the static limit
completely.

As there is no longer the concept of an "unused operand", remove
Operation::Encoding::SizeNA. Any use of it is now replaced with explicit
checks for how many operands an operation has.

There are still places where the limit remains 2, namely in the
DWARFLinker and in DIExpressions, but these can be updated in later
patches as-needed.

There are no explicit tests as this is nearly NFC: no new operation is
added which makes use of the additional operand capacity yet. A future
patch adding a new DWARF extension point will include operations which
require the support.

Diff Detail

Event Timeline

scott.linder created this revision.Mar 30 2023, 1:52 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 30 2023, 1:52 PM
Herald added a subscriber: hiraditya. · View Herald Transcript
scott.linder requested review of this revision.Mar 30 2023, 1:52 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 30 2023, 1:52 PM
Orlando accepted this revision.Mar 31 2023, 8:05 AM
Orlando added a subscriber: Orlando.

LGTM

llvm/include/llvm/DebugInfo/DWARF/DWARFExpression.h
73

Possibly worth adding a short comment along the lines of "Fill Op with Ops then Is.size() number of SizeNA entries". YMMV, maybe it's obvious, but it took me a minute to work out what was going on here.

This revision is now accepted and ready to land.Mar 31 2023, 8:05 AM

Got any measurements of memory usage? (seems like this'd increase memory usage? Maybe try a self-host release build (optimizations causing more interesting locations and so more location operations) & see how peak memory usage in various compilation changes?)

Got any measurements of memory usage? (seems like this'd increase memory usage? Maybe try a self-host release build (optimizations causing more interesting locations and so more location operations) & see how peak memory usage in various compilation changes?)

I will run the tests you suggest and report back; I didn't initially because there are already 3-operand operations, and hence for completeness we will have to change at some point. The only remaining question in my mind is the approach to support it: either statically sized and compile-time evaluated (i.e. DWARFExpression::Operation) or dynamically sized (i.e. LVOperation). I will try to measure which approach better balances LLVM compile time, artifact size, compile time, etc.

scott.linder edited the summary of this revision. (Show Details)

Now that we are no longer contrained by constexpr, just use
SmallVector for everything.

Running 3 trials comparing release/llvm-dwarfdump rel-with-deb-info/llc
between this patch and trunk:

<stat>: <control> <patched> (<patched>/<control>)
instructions:u: 2053361427267 2053074326681 (0.999860)
Maximum resident set size: 5385686 5384477 (0.999776)
branches: 207788977127 208106839029 (1.001530)
branch-misses: 2480119693 2785843401 (1.123270)
User time: 254.363333 265.583333 (1.044110)

I haven't benchmarked LogicalView, but I suspect it should be similar.

CarlosAlbertoEnciso added a comment.EditedApr 28 2023, 3:28 AM

@scott.linder: Sorry for my delay. As pointed out by @dblaikie, it would be interesting to see the memory usage, specially for the logical view. Thanks.

@scott.linder: Sorry for my delay. As pointed out by @dblaikie, it would be interesting to see the memory usage, specially for the logical view. Thanks.

Thank you for taking a look!

Is there a good target you have in mind for the benchmark? I seem to hit an assertion when dogfooding RelWithDebInfo binaries:

$ git log --oneline | head -1
0dec49cb3466 [mlir][vector] Remove unused variable `srcShape`.

$ build/bin/llvm-debuginfo-analyzer --print=symbols build-relwithdebinfo/bin/llvm-tblgen
llvm-debuginfo-analyzer: /home/slinder1/llvm-project/main/llvm/include/llvm/DebugInfo/LogicalView/Core/LVSymbol.h:102: virtual v
oid llvm::logicalview::LVSymbol::setReference(llvm::logicalview::LVElement *): Assertion `(!Element || isa<LVSymbol>(Element)) &
& "Invalid element"' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.      Program arguments: build/bin/llvm-debuginfo-analyzer --print=symbols build-relwithdebinfo/bin/llvm-tblgen
 #0 0x0000000001ab6017 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (build/bin/llvm-debuginfo-analyzer+0x1ab6017)
 #1 0x0000000001ab3dee llvm::sys::RunSignalHandlers() (build/bin/llvm-debuginfo-analyzer+0x1ab3dee)
 #2 0x0000000001ab668f SignalHandler(int) Signals.cpp:0:0
 #3 0x00007f9a91add420 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x14420)
 #4 0x00007f9a9136e00b raise /build/glibc-SzIz7B/glibc-2.31/signal/../sysdeps/unix/sysv/linux/raise.c:51:1
 #5 0x00007f9a9134d859 abort /build/glibc-SzIz7B/glibc-2.31/stdlib/abort.c:81:7
 #6 0x00007f9a9134d729 get_sysdep_segment_value /build/glibc-SzIz7B/glibc-2.31/intl/loadmsgcat.c:509:8
 #7 0x00007f9a9134d729 _nl_load_domain /build/glibc-SzIz7B/glibc-2.31/intl/loadmsgcat.c:970:34
 #8 0x00007f9a9135efd6 (/lib/x86_64-linux-gnu/libc.so.6+0x33fd6)
 #9 0x000000000189be81 llvm::logicalview::LVSymbol::setReference(llvm::logicalview::LVElement*) LVSymbol.cpp:0:0
#10 0x000000000191c9ee llvm::logicalview::LVELFReader::processOneDie(llvm::DWARFDie const&, llvm::logicalview::LVScope*, llvm::D
WARFDie&) (build/bin/llvm-debuginfo-analyzer+0x191c9ee)
#11 0x000000000191d6c9 llvm::logicalview::LVELFReader::traverseDieAndChildren(llvm::DWARFDie&, llvm::logicalview::LVScope*, llvm
::DWARFDie&) (build/bin/llvm-debuginfo-analyzer+0x191d6c9)
#12 0x000000000191d741 llvm::logicalview::LVELFReader::traverseDieAndChildren(llvm::DWARFDie&, llvm::logicalview::LVScope*, llvm
::DWARFDie&) (build/bin/llvm-debuginfo-analyzer+0x191d741)
#13 0x000000000191ef75 llvm::logicalview::LVELFReader::createScopes() (build/bin/llvm-debuginfo-analyzer+0x191ef75)
#14 0x000000000186507d llvm::logicalview::LVReader::doLoad() (build/bin/llvm-debuginfo-analyzer+0x186507d)
#15 0x00000000018a2768 llvm::logicalview::LVReaderHandler::createReader(llvm::StringRef, std::vector<std::unique_ptr<llvm::logic
alview::LVReader, std::default_delete<llvm::logicalview::LVReader>>, std::allocator<std::unique_ptr<llvm::logicalview::LVReader,
 std::default_delete<llvm::logicalview::LVReader>>>>&, llvm::PointerUnion<llvm::object::ObjectFile*, llvm::pdb::PDBFile*>&, llvm
::StringRef, llvm::StringRef) (build/bin/llvm-debuginfo-analyzer+0x18a2768)
#16 0x00000000018a505d llvm::logicalview::LVReaderHandler::handleObject(std::vector<std::unique_ptr<llvm::logicalview::LVReader,
 std::default_delete<llvm::logicalview::LVReader>>, std::allocator<std::unique_ptr<llvm::logicalview::LVReader, std::default_del
ete<llvm::logicalview::LVReader>>>>&, llvm::StringRef, llvm::object::Binary&) (build/bin/llvm-debuginfo-analyzer+0x18a505d)
#17 0x00000000018a4046 llvm::logicalview::LVReaderHandler::handleBuffer(std::vector<std::unique_ptr<llvm::logicalview::LVReader,
 std::default_delete<llvm::logicalview::LVReader>>, std::allocator<std::unique_ptr<llvm::logicalview::LVReader, std::default_del
ete<llvm::logicalview::LVReader>>>>&, llvm::StringRef, llvm::MemoryBufferRef, llvm::StringRef) (build/bin/llvm-debuginfo-analyze
r+0x18a4046)
#18 0x00000000018a4de5 llvm::logicalview::LVReaderHandler::handleFile(std::vector<std::unique_ptr<llvm::logicalview::LVReader, s
td::default_delete<llvm::logicalview::LVReader>>, std::allocator<std::unique_ptr<llvm::logicalview::LVReader, std::default_delet
e<llvm::logicalview::LVReader>>>>&, llvm::StringRef, llvm::StringRef) (build/bin/llvm-debuginfo-analyzer+0x18a4de5)
#19 0x00000000018a1fa7 llvm::logicalview::LVReaderHandler::createReaders() (build/bin/llvm-debuginfo-analyzer+0x18a1fa7)
#20 0x00000000018a1e2f llvm::logicalview::LVReaderHandler::process() (build/bin/llvm-debuginfo-analyzer+0x18a1e2f)
#21 0x0000000001441730 main (build/bin/llvm-debuginfo-analyzer+0x1441730)
#22 0x00007f9a9134f083 __libc_start_main /build/glibc-SzIz7B/glibc-2.31/csu/../csu/libc-start.c:342:3
#23 0x0000000001440bce _start (build/bin/llvm-debuginfo-analyzer+0x1440bce)
Aborted (core dumped)

I've also tried llc, clang, and llvm-debuginfo-analyzer itself, all hit the same assert. Is there something I might be doing wrong on my end?

Is there a good target you have in mind for the benchmark? I seem to hit an assertion when dogfooding RelWithDebInfo binaries:

$ build/bin/llvm-debuginfo-analyzer --print=symbols build-relwithdebinfo/bin/llvm-tblgen
llvm-debuginfo-analyzer: /home/slinder1/llvm-project/main/llvm/include/llvm/DebugInfo/LogicalView/Core/LVSymbol.h:102: virtual v
oid llvm::logicalview::LVSymbol::setReference(llvm::logicalview::LVElement *): Assertion `(!Element || isa<LVSymbol>(Element)) &
& "Invalid element"' failed.

I've also tried `llc`, `clang`, and `llvm-debuginfo-analyzer` itself, all hit the same assert. Is there something I might be doing wrong on my end?

Sorry for my delay in answering but I was at the LLVM Euro 2023. Your command line is correct.

At Sony, we have managed to reproduce the same issue with a debug/checking build on a private project.

scott.linder added a comment.EditedMay 16 2023, 8:43 AM

Is there a good target you have in mind for the benchmark? I seem to hit an assertion when dogfooding RelWithDebInfo binaries:

$ build/bin/llvm-debuginfo-analyzer --print=symbols build-relwithdebinfo/bin/llvm-tblgen
llvm-debuginfo-analyzer: /home/slinder1/llvm-project/main/llvm/include/llvm/DebugInfo/LogicalView/Core/LVSymbol.h:102: virtual v
oid llvm::logicalview::LVSymbol::setReference(llvm::logicalview::LVElement *): Assertion `(!Element || isa<LVSymbol>(Element)) &
& "Invalid element"' failed.

I've also tried `llc`, `clang`, and `llvm-debuginfo-analyzer` itself, all hit the same assert. Is there something I might be doing wrong on my end?

Sorry for my delay in answering but I was at the LLVM Euro 2023. Your command line is correct.

At Sony, we have managed to reproduce the same issue with a debug/checking build on a private project.

Thank you for confirming!

I did a little digging and it seems like there is a faulty assumption in LVELFReader:

// We are assuming that DW_AT_specification, DW_AT_abstract_origin,
// DW_AT_type and DW_AT_extension do not appear at the same time
// in the same DIE.

This seems true for GCC, but not for Clang, at least for the simplest reproducer I could create:

$ cat a.cpp
struct S {
    static const int Arr[];
};
const int S::Arr[] = {
    0, 1, 2
};
$ gcc -g -c a.cpp -o a.o
$ build/bin/llvm-dwarfdump --debug-info a.o | grep -B1 -A2 DW_AT_specification
0x00000053:   DW_TAG_variable
                DW_AT_specification     (0x00000028 "Arr")
                DW_AT_decl_line (4)
                DW_AT_decl_column       (0x0b)
$ clang++ -g -c a.cpp -o a.o
$ build/bin/llvm-dwarfdump --debug-info a.o | grep -B1 -A2 DW_AT_specification
0x0000001e:   DW_TAG_variable
                DW_AT_specification     (0x0000003e "Arr")
                DW_AT_type      (0x00000068 "const int[3]")
                DW_AT_location  (DW_OP_addr 0x0)

It seems like the code could be adapted to track these independently, rather than assume only one is present? I can prepare a patch if that sounds reasonable to you!

Edit: This may also be a Clang bug. I haven't read the relevant DWARF sections with enough detail to say whether the DWARF itself is valid or not, but either way it seems nice to support it in llvm-debuginfo-analyzer, especially considering how much flexibility the tool already affords in terms of input

I took a stab at a patch to address the assertion at D150713

Likely there is room to generalize it even further, but I can at least run the tool over e.g. llc in RelWithDebInfo mode, so I can do the same benchmark as for llvm-dwarfdump:

<stat>: <control> <patched> (<patched>/<control>)
instructions:u: 1896657656701 1896827456117 (1.000090)
Maximum resident set size: 51082468 51082618 (1.000003)
branches: 205322657743 205394850877 (1.000352)
branch-misses: 3028996914 2954952928 (0.975555)
User time: 451.593333 453.516667 (1.004259)

It seems the two implementations are indistinguishable in this case (llvm-debuginfo-analyzer --print=symbols build-relwithdebinfo/bin/llc over three trials)

Let me know what you think!

I did a little digging and it seems like there is a faulty assumption in LVELFReader:

// We are assuming that DW_AT_specification, DW_AT_abstract_origin,
// DW_AT_type and DW_AT_extension do not appear at the same time
// in the same DIE.

Thanks very much for your analysis and for creating a small reproducible test.

This seems true for GCC, but not for Clang, at least for the simplest reproducer I could create:

$ cat a.cpp
struct S {
    static const int Arr[];
};
const int S::Arr[] = {
    0, 1, 2
};
$ gcc -g -c a.cpp -o a.o
$ build/bin/llvm-dwarfdump --debug-info a.o | grep -B1 -A2 DW_AT_specification
0x00000053:   DW_TAG_variable
                DW_AT_specification     (0x00000028 "Arr")
                DW_AT_decl_line (4)
                DW_AT_decl_column       (0x0b)
$ clang++ -g -c a.cpp -o a.o
$ build/bin/llvm-dwarfdump --debug-info a.o | grep -B1 -A2 DW_AT_specification
0x0000001e:   DW_TAG_variable
                DW_AT_specification     (0x0000003e "Arr")
                DW_AT_type      (0x00000068 "const int[3]")
                DW_AT_location  (DW_OP_addr 0x0)

It seems like the code could be adapted to track these independently, rather than assume only one is present? I can prepare a patch if that sounds reasonable to you!

I am happy with your proposal and a patch.

Edit: This may also be a Clang bug. I haven't read the relevant DWARF sections with enough detail to say whether the DWARF itself is valid or not, but either way it seems nice to support it in llvm-debuginfo-analyzer, especially considering how much flexibility the tool already affords in terms of input

I compiled your test case with the latest Clang:

0x0000001e:   DW_TAG_variable
                     DW_AT_specification	(0x00000031 "Arr")
                     DW_AT_type	(0x00000052 "const int[3]")
                     DW_AT_location	(DW_OP_addrx 0x0)
                     DW_AT_linkage_name	("_ZN1S3ArrE")

0x00000031:     DW_TAG_member
                       DW_AT_name	("Arr")
                       DW_AT_type	(0x0000003a "const int[]")
                       DW_AT_decl_file	("/data/projects/sandbox/pr-62716.cpp")
                       DW_AT_decl_line	(2)
                       DW_AT_external	(true)
                       DW_AT_declaration	(true)

From the DWARF section for DW_AT_specification:

A debugging information entry with a DW_AT_specification attribute does not need to duplicate information provided by the debugging information entry referenced by that specification attribute.

Clang:

  • Added an extra entry for DW_AT_type which seems redundant in DIE (0x1e) as by following the DW_AT_specification we can get the type
  • The referenced types are pointing to different DIEs (0x52 and 0x3a).

I took a stab at a patch to address the assertion at D150713

Many thanks.

Likely there is room to generalize it even further, but I can at least run the tool over e.g. llc in RelWithDebInfo mode, so I can do the same benchmark as for llvm-dwarfdump:

<stat>: <control> <patched> (<patched>/<control>)
instructions:u: 1896657656701 1896827456117 (1.000090)
Maximum resident set size: 51082468 51082618 (1.000003)
branches: 205322657743 205394850877 (1.000352)
branch-misses: 3028996914 2954952928 (0.975555)
User time: 451.593333 453.516667 (1.004259)

It seems the two implementations are indistinguishable in this case (llvm-debuginfo-analyzer --print=symbols build-relwithdebinfo/bin/llc over three trials)

Let me know what you think!

LGTM

This revision was landed with ongoing or failed builds.Jun 19 2023, 12:39 PM
This revision was automatically updated to reflect the committed changes.