This is an archive of the discontinued LLVM Phabricator instance.

[llvm-symbolizer] Support debug file lookup using build ID
ClosedPublic

Authored by phosek on Nov 26 2019, 7:11 PM.

Details

Summary

Build ID is a protocol for looking up debug files that's already
supported by various tools including debuggers. For example, when
locating debug files, gdb would check the following directories:

  • /usr/lib/debug/.build-id/ab/cdef1234.debug
  • /usr/bin/ls.debug
  • /usr/bin/.debug/ls.debug
  • /usr/lib/debug/usr/bin/ls.debug

llvm-symbolizer currently consults all of these except for build ID
based one. This patch implements support for build ID lookup. The
set of debug directories to search is specified by the new option:
--debug-file-directory, whose name matches the debug-file-directory
variable used by gdb for the same purpose.

Diff Detail

Event Timeline

phosek created this revision.Nov 26 2019, 7:11 PM

@rupprecht added the --fallback-debug-path option in rL353730 which is AFAIK used for similar purpose as --debug-file-directory, except that it supports only a single directory, whereas in our use case we'd like to support searching through multiple directories.

Test coverage? (since the directory can be specified, I imagine a small checked in binary example (as with other llvm-symbolizer tests) would be feasible?)

llvm/lib/DebugInfo/Symbolize/Symbolize.cpp
308–315
phosek updated this revision to Diff 232031.Dec 3 2019, 8:41 PM
phosek marked an inline comment as done.

Test added.

MaskRay added a subscriber: MaskRay.Dec 3 2019, 9:06 PM
MaskRay added inline comments.
llvm/test/DebugInfo/symbolize-build-id.test
7

Can the binary artifact Inputs/dwarfdump-test.elf-x86-64.build-id be replaced with a YAML file that specifies .note.gnu.build-id?

phosek updated this revision to Diff 232178.Dec 4 2019, 10:30 AM
dblaikie accepted this revision.Dec 4 2019, 11:21 AM
dblaikie added inline comments.
llvm/test/DebugInfo/symbolize-build-id.test
4

Perhaps this should have a --debug-file-directory that's specifically non-existent rather than (if I'm reading the code correctly) falling back to the system directories and potentially interacting with other paths on the system that are outside the test's control?

This revision is now accepted and ready to land.Dec 4 2019, 11:21 AM
MaskRay added inline comments.Dec 4 2019, 11:25 AM
llvm/lib/DebugInfo/Symbolize/Symbolize.cpp
328

Super nit: /*LowerCase=*/ may be more common. At least clang-format recognizes this form and deletes the space before true.

llvm/test/DebugInfo/symbolize-build-id.test
2

In binary utilities tests, we use -o to avoid shell redirection.

16

Machine: is the longest key. The indentation of values can be reduced to the minimum.

Machine: EM_X86_64

24

ditto. Content: 040000000800000003000000474e5500abb50d82b6bdc861

MaskRay accepted this revision.Dec 4 2019, 11:25 AM
phosek updated this revision to Diff 232213.Dec 4 2019, 2:24 PM
phosek marked 5 inline comments as done.
This revision was automatically updated to reflect the committed changes.

Hey @phosek, @delcypher and I noticed that a Swift+ASan test is failing due to llvm-symbolizer crashing like this:

(gdb) bt
#0  0x00007f6c3adbae97 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007f6c3adbc801 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00000000004228d7 in llvm::Error::fatalUncheckedError() const ()
#3  0x000000000046d515 in llvm::symbolize::LLVMSymbolizer::lookUpBuildIDObject(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, llvm::object::ELFObjectFileBase const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) ()
#4  0x000000000046dffc in llvm::symbolize::LLVMSymbolizer::getOrCreateObjectPair(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) ()
#5  0x000000000046a473 in llvm::symbolize::LLVMSymbolizer::getOrCreateModuleInfo(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) ()
#6  0x000000000046a9c7 in llvm::symbolize::LLVMSymbolizer::symbolizeInlinedCode(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, llvm::object::SectionedAddress) ()
#7  0x000000000040dcd6 in symbolizeInput(llvm::StringRef, llvm::symbolize::LLVMSymbolizer&, llvm::symbolize::DIPrinter&) ()
#8  0x000000000040cff2 in main ()

Based on the history I thought I'd ping this patch (although I'm not sure I spot an unchecked error). Any help appreciated.

In D70759#1839133, @vsk wrote:

Hey @phosek, @delcypher and I noticed that a Swift+ASan test is failing due to llvm-symbolizer crashing like this:

@phosek @vsk

To add more context without gdb attached the error message is Program aborted due to an unhandled Error: ELF note overflows container

vsk added inline comments.Jan 24 2020, 9:46 AM
llvm/lib/DebugInfo/Symbolize/Symbolize.cpp
303

Does Err need to be checked here, after the loop?

From notes_begin:

/// \param Err [out] an error to support fallible iteration, which should
///  be checked after iteration ends.
dblaikie added inline comments.
llvm/lib/DebugInfo/Symbolize/Symbolize.cpp
303

Looks like it to me.

A test case would be good/necessary, though - @lhames - you wouldn't happen to know a good way to produce an object file with this problem, by chance?

dblaikie added inline comments.Jan 24 2020, 11:33 PM
llvm/lib/DebugInfo/Symbolize/Symbolize.cpp
303

Also is there anything to be done about this idiom to make it more robust like other uses of llvm::Error? So that this usage would fail even in Success-only executions.

lhames added inline comments.Jan 25 2020, 4:24 PM
llvm/lib/DebugInfo/Symbolize/Symbolize.cpp
303

I think it should already fail if you don't check Err, even on success. The caveat is that you would have to exit the bottom of the loop: If you always leave via that return statement (which you might have if that's what you were testing) then you aren't forced to check the error, because returns from inside the loop body are guaranteed to be safe. Making safe returns not require checking was actually one of the main reasons we developed the fallible iterator wrapper in the first place, if memory serves. :)

dblaikie added inline comments.Jan 25 2020, 7:59 PM
llvm/lib/DebugInfo/Symbolize/Symbolize.cpp
303

That does sound vaguely familiar - though whichever case it was (that we designed it this way intentionally or accidentally) - reckon it might be worth revisiting given how easy it is to write code like this that sort of defeats the "must always check" kind of design of Error?

Hmm - I was going to say that I think I'd be OK saying you can't return early from a fallible iteration - but given range-based-for loops make it hard to add new conditions into your loop to exit early when you've found the thing you want, maybe that's not ideal either. Then I end up back at the functor based iteration schemes, rather than exposing a full iterator/range?

BUt yeah, I guess for this code, a test case that doesn't have a BUILD_ID should suffice... - any ideas if yaml2obj is the right tool for that? Or other ways to construct such objects for testing this sort of thing?

Sorry about the breakage, D73438 should address this issue and also includes a test.