Page MenuHomePhabricator

[RFC] Use llvm-symbolizer to symbolize LLVM/Clang crash dumps
ClosedPublic

Authored by samsonov on Oct 3 2014, 2:37 PM.

Details

Summary

This change allows to use llvm-symbolizer binary in
signal handler to provide a better stacktrace. Instead of printing
raw addresses and optional symbol names, stack trace will now contain
file/line information (if binary has debug info). See example outputs below.

This is an RFC patch. IMO it can already be applied now with minor changes
(add a check that dl_iterate_phdr and <link.h> are available on given platform).
However, I see the following areas for improvement:

  1. Thanks to powerful LLVMSupport, printSymbolizedStackTrace() is short and

mostly portable code (the only Linux-specific part is dl_iterate_phdr). We can
make this function generally available by putting it to Support headers. By doing
this we'll also be able to unit-test both this function and llvm-symbolizer tool
(to some extent).

  1. Currently llvm-symbolizer input/output format is hardcoded in the tool

implementation *and* in printSymbolizedStackTrace(). We can instead move the
description of this format to Support headers/library: move DILineInfo (format-neutral
container for representing file/line information) and DIInliningInfo to Support
and provide methods to serialize/deserialize it. These methods can then be used
in both tools/llvm-symbolizer and printSymbolizedStackTraces() and unit-tested
appropriately.

Let me know what you think about this patch (in case I'm doing something stupid),
and what pieces of 1)-2) I should work on before landing this (or I can procced
with this patch and work on that parts later).

  • before:

0 clang 0x00000000021f260f llvm::sys::PrintStackTrace(_IO_FILE*) + 38
1 clang 0x00000000021f288c
2 clang 0x00000000021f14ea
3 libpthread.so.0 0x00002b753e753cb0
4 libc.so.6 0x00002b753f3ca0d5 gsignal + 53
5 libc.so.6 0x00002b753f3cd83b abort + 379
6 libc.so.6 0x00002b753f3c2d9e
7 libc.so.6 0x00002b753f3c2e42
8 clang 0x0000000002891db7 clang::CodeGen::CodeGenFunction::EmitFunctionEpilog(clang::CodeGen::CGFunctionInfo const&, bool, clang::SourceLocation) + 2163
9 clang 0x00000000027844de clang::CodeGen::CodeGenFunction::FinishFunction(clang::SourceLocation) + 518
10 clang 0x0000000002787a72 clang::CodeGen::CodeGenFunction::GenerateCode(clang::GlobalDecl, llvm::Function*, clang::CodeGen::CGFunctionInfo const&) + 2304
11 clang 0x000000000279b63a clang::CodeGen::CodeGenModule::EmitGlobalFunctionDefinition(clang::GlobalDecl, llvm::GlobalValue*) + 1134
12 clang 0x00000000027989c9 clang::CodeGen::CodeGenModule::EmitGlobalDefinition(clang::GlobalDecl, llvm::GlobalValue*) + 487
13 clang 0x0000000002798299 clang::CodeGen::CodeGenModule::EmitGlobal(clang::GlobalDecl) + 801
14 clang 0x000000000279eace clang::CodeGen::CodeGenModule::EmitTopLevelDecl(clang::Decl*) + 292
15 clang 0x000000000271acf5
16 clang 0x0000000002704c8e
17 clang 0x00000000030dd545 clang::ParseAST(clang::Sema&, bool, bool) + 550
18 clang 0x00000000023da536 clang::ASTFrontendAction::ExecuteAction() + 322
19 clang 0x000000000270749c clang::CodeGenAction::ExecuteAction() + 1306
20 clang 0x00000000023d9fed clang::FrontendAction::Execute() + 139
21 clang 0x00000000023a3c1c clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) + 770
22 clang 0x00000000024e8117 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) + 1102
23 clang 0x000000000113f53f cc1_main(llvm::ArrayRef<char const*>, char const*, void*) + 689
24 clang 0x000000000113840d
25 clang 0x0000000001138994 main + 979
26 libc.so.6 0x00002b753f3b576d __libc_start_main + 237
27 clang 0x0000000001135609
<...>

  • after:

#0 0x21f260f llvm::sys::PrintStackTrace(_IO_FILE*) llvm/lib/Support/Unix/Signals.inc:399:0
#1 0x21f28aa PrintStackTraceSignalHandler(void*) llvm/lib/Support/Unix/Signals.inc:457:0
#2 0x21f14ea SignalHandler(int) llvm/lib/Support/Unix/Signals.inc:196:0
#3 0x2b3c83ca6cb0 restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0xfcb0)
#4 0x2b3c8491d0d5 gsignal /build/buildd/eglibc-2.15/signal/../nptl/sysdeps/unix/sysv/linux/raise.c:64:0
#5 0x2b3c8492083b abort /build/buildd/eglibc-2.15/stdlib/abort.c:93:0
#6 0x2b3c84915d9e
assert_fail_base /build/buildd/eglibc-2.15/assert/assert.c:55:0
#7 0x2b3c84915e42 (/lib/x86_64-linux-gnu/libc.so.6+0x2ee42)
#8 0x2891dd7 clang::CodeGen::CodeGenFunction::EmitFunctionEpilog(clang::CodeGen::CGFunctionInfo const&, bool, clang::SourceLocation) llvm/tools/clang/lib/CodeGen/CGCall.cpp:2219:0
#9 0x27844fe clang::CodeGen::CodeGenFunction::FinishFunction(clang::SourceLocation) llvm/tools/clang/lib/CodeGen/CodeGenFunction.cpp:261:0
#10 0x2787a92 clang::CodeGen::CodeGenFunction::GenerateCode(clang::GlobalDecl, llvm::Function*, clang::CodeGen::CGFunctionInfo const&) llvm/tools/clang/lib/CodeGen/CodeGenFunction.cpp:910:0
#11 0x279b65a clang::CodeGen::CodeGenModule::EmitGlobalFunctionDefinition(clang::GlobalDecl, llvm::GlobalValue*) llvm/tools/clang/lib/CodeGen/CodeGenModule.cpp:2292:0
#12 0x27989e9 clang::CodeGen::CodeGenModule::EmitGlobalDefinition(clang::GlobalDecl, llvm::GlobalValue*) llvm/tools/clang/lib/CodeGen/CodeGenModule.cpp:1433:0
#13 0x27982b9 clang::CodeGen::CodeGenModule::EmitGlobal(clang::GlobalDecl) llvm/tools/clang/lib/CodeGen/CodeGenModule.cpp:1293:0
#14 0x279eaee clang::CodeGen::CodeGenModule::EmitTopLevelDecl(clang::Decl*) llvm/tools/clang/lib/CodeGen/CodeGenModule.cpp:3034:0
#15 0x271ad15 (anonymous namespace)::CodeGeneratorImpl::HandleTopLevelDecl(clang::DeclGroupRef) llvm/tools/clang/lib/CodeGen/ModuleBuilder.cpp:113:0
#16 0x2704cae clang::BackendConsumer::HandleTopLevelDecl(clang::DeclGroupRef) llvm/tools/clang/lib/CodeGen/CodeGenAction.cpp:106:0
#17 0x30dd565 clang::ParseAST(clang::Sema&, bool, bool) llvm/tools/clang/lib/Parse/ParseAST.cpp:143:0
#18 0x23da556 clang::ASTFrontendAction::ExecuteAction() llvm/tools/clang/lib/Frontend/FrontendAction.cpp:523:0
#19 0x27074bc clang::CodeGenAction::ExecuteAction() llvm/tools/clang/lib/CodeGen/CodeGenAction.cpp:718:0
#20 0x23da00d clang::FrontendAction::Execute() llvm/tools/clang/lib/Frontend/FrontendAction.cpp:427:0
#21 0x23a3c3c clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) llvm/tools/clang/lib/Frontend/CompilerInstance.cpp:812:0
#22 0x24e8137 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) llvm/tools/clang/lib/FrontendTool/ExecuteCompilerInvocation.cpp:222:0
#23 0x113f53f cc1_main(llvm::ArrayRef<char const*>, char const*, void*) llvm/tools/clang/tools/driver/cc1_main.cpp:110:0
#24 0x113840d ExecuteCC1Tool(llvm::ArrayRef<char const*>, llvm::StringRef) llvm/tools/clang/tools/driver/driver.cpp:368:0
#25 0x1138994 main llvm/tools/clang/tools/driver/driver.cpp:411:0
#26 0x2b3c8490876d __libc_start_main /build/buildd/eglibc-2.15/csu/libc-start.c:258:0
#27 0x1135609 _start (llvm_cmake_debug/bin/clang-3.6+0x1135609)
<...>

Diff Detail

Event Timeline

samsonov updated this revision to Diff 14408.Oct 3 2014, 2:37 PM
samsonov retitled this revision from to [RFC] Use llvm-symbolizer to symbolize LLVM/Clang crash dumps.
samsonov updated this object.
samsonov edited the test plan for this revision. (Show Details)
samsonov added reviewers: dblaikie, chandlerc.
samsonov added a subscriber: Unknown Object (MLST).
samsonov updated this revision to Diff 14409.Oct 3 2014, 2:42 PM

Cleanup some unnecessary code/comments.

Haven't reviewed the code yet, but thumbs up to the idea. :)

dblaikie edited edge metadata.Oct 3 2014, 3:37 PM

Some random thoughts. Nothing terribly important.

lib/Support/Unix/Signals.inc
286

parens around data->first seem unnecessary

289

Well that's some syntax I haven't seen before... I assume it's valid/compiles?

I would've expected to see that written as:

const EelfW (*phdr)(Phdr) = ...

but if what you've got there works... great... just surprising.

290

could be an early continue instead

296

could be an early continue, but is of questionable benefit (same number of lines of code, only two lines to unindent)

303

The return value seems to be always zero and unused. Remove it

328

hmm - should we not be just reading the output stream straight from the process via a pipe, rather than writing it to disk and reading it back from there?

Maybe we don't have portable primitives for that. (maybe there's no such common/portable primitive)

333

Could these just be straight StringRefs, rather than unique_ptr<StringRef>?

350

just an aside: but why is symbolication out-of-process in this way? (I mean I can think of the obvious "Because the process is already in a bad state, we don't want to do lots of work in that process for fear the bad state would somehow corrupt all that work" - but I'm not sure if that's the case & how much it applies when we're still dynamically allocating several structures here so we can call llvm-symbolizer and print the result, etc)

366

So this loop is where inlining is handled? (multiple results from llvm-symbolizer for a single query, terminated by a blank line?)

371

usually skip the braces on single line blocks

377

More braces to skip (if and else)

383

This could be an early continue and reduce indentation

samsonov updated this revision to Diff 14416.Oct 3 2014, 4:10 PM
samsonov edited edge metadata.

Address comments.

lib/Support/Unix/Signals.inc
286

Done

289

ElfW() is a macro, which here expands ElfW(Phdr) to Elf32_Phdr, or Elf64_Phdr.

290

Done

296

Left as is (IMO negative condition for "lies in the interval" is less clear).

303

This function is passed as a callback into dl_iterate_phdr, so we must have "int" as a return value here.

328

Yeah, unfortunately sys::ExecuteAndWait and friends only accepts files, not file descriptors for redirects. I can work on adding an overload (though, using files proved to be convenient while debugging this code).

333

sys::ExecuteAndWait accepts (const StringRef ** as its arguments). That's sort of... ugly. It also distinguishes between nullptr (= redirect is not set up) and empty StringRef (= redirect to /dev/null).

350

If we do an in-process symbolization, we'd need to link in both LLVMDebugInfo and LLVMObject into every LLVM tool/program that uses this signal handler (ew). I believe that Clang, for instance, doesn't (and shouldn't) depend on DWARF parser at the moment. This was the main reason.

I agree that using vectors/strings and a bunch of heavy methods from LLVMSupport in signal handler can potentially cause some problems, but, oh, not doing so would make the code so much worse. Still, I think that the amout of work we're doing here is significantly smaller than mapping large binaries and maintaining lots of internal structures for representing DWARF that's done in DebugInfo.

366

Yes.

371

Done

377

Done

383

Done

dblaikie added inline comments.Oct 3 2014, 4:45 PM
lib/Support/Unix/Signals.inc
333

Not sure I follow... What I mean is to change this code to:

std::vector<const StringRef *> Redirects(3, nullptr);
Redirects[0] = &InputFile;
Redirects[1] = &OutputFile;
StringRef StderrFile;
Redirects[2] = &StderrFile;

Would that be wrong somehow?

samsonov updated this revision to Diff 14417.Oct 3 2014, 5:09 PM

Get rid of unnecessary std::unique_ptr.

lib/Support/Unix/Signals.inc
333

Ah, sure. I'd still need local variables for StringRef's, as InputFile/OutputFile are SmallStrings, but unique_ptr are not needed here.

dblaikie accepted this revision.Oct 3 2014, 5:14 PM
dblaikie edited edge metadata.

Looks good to me. I don't know the particulars of how best to work with llvm-symbolizer (if doing this out-of-process is the right thing, perhaps a convenient wrapper library for doing the inter-process stuff might be nice), so I don't have particular opinions on the broad strokes, but they look reasonable.

This revision is now accepted and ready to land.Oct 3 2014, 5:14 PM

FWIW, it's possible to use LLVMSymbolizer in process - you can just instantiate an LLVMSymbolizer class (defined in tools/llvm-symbolizer/LLVMSymbolize.h), and issue a necessary queries. It just hasn't been moved to DebugInfo library (which is probably worth doing). My main reason for using out-of-process communication here was dependency on Object/DebugInfo libraries, which we don't want to enforce.

Thanks for the review!
I will submit this patch early next week and proceed with refactorings outlined in 1-2.

samsonov updated this revision to Diff 14607.Oct 8 2014, 4:02 PM
samsonov edited edge metadata.

Check for <link.h> header presence.

echristo accepted this revision.Oct 8 2014, 4:13 PM
echristo added a reviewer: echristo.

Seems reasonable.

emaste requested changes to this revision.Oct 8 2014, 4:20 PM
emaste added a reviewer: emaste.
emaste added a subscriber: emaste.

It appears this breaks the build on FreeBSD

In file included from ../lib/Support/Signals.cpp:30:
../lib/Support/Unix/Signals.inc:290:11: error: unknown type name 'ElfW'
    const ElfW(Phdr) *phdr = &info->dlpi_phdr[i];
          ^
../lib/Support/Unix/Signals.inc:290:21: error: expected ';' at end of declaration
    const ElfW(Phdr) *phdr = &info->dlpi_phdr[i];
                    ^
                    ;
../lib/Support/Unix/Signals.inc:291:9: error: use of undeclared identifier 'phdr'; did you mean 'Phdr'?
    if (phdr->p_type != PT_LOAD)
        ^~~~
        Phdr
../lib/Support/Unix/Signals.inc:290:16: note: 'Phdr' declared here
    const ElfW(Phdr) *phdr = &info->dlpi_phdr[i];
               ^
../lib/Support/Unix/Signals.inc:293:38: error: use of undeclared identifier 'phdr'; did you mean 'Phdr'?
    intptr_t beg = info->dlpi_addr + phdr->p_vaddr;
                                     ^~~~
                                     Phdr
../lib/Support/Unix/Signals.inc:290:16: note: 'Phdr' declared here
    const ElfW(Phdr) *phdr = &info->dlpi_phdr[i];
               ^
../lib/Support/Unix/Signals.inc:294:26: error: use of undeclared identifier 'phdr'; did you mean 'Phdr'?
    intptr_t end = beg + phdr->p_memsz;
                         ^~~~
                         Phdr
../lib/Support/Unix/Signals.inc:290:16: note: 'Phdr' declared here
    const ElfW(Phdr) *phdr = &info->dlpi_phdr[i];
               ^
../lib/Support/Unix/Signals.inc:372:23: warning: unused variable 'Err' [-Wunused-variable]
  if (std::error_code Err = OutputBuf.getError())
                      ^
1 warning and 5 errors generated.
This revision now requires changes to proceed.Oct 8 2014, 4:20 PM
emaste added a comment.Oct 8 2014, 4:26 PM

the only Linux-specific part is dl_iterate_phdr

FYI, FreeBSD also has dl_iterate_phdr

Yes, I've already reverted this in r219355.

Ed, do I understand correctly that LLVM is supported for 32-bit FreeBSD 9.2? If that's the case, I'll have to introduce old-FreeBSD-specific defines for dl_iterate_phdr here, or disable dl_iterate_phdr for it entirely.

samsonov updated this revision to Diff 14617.Oct 8 2014, 5:23 PM
samsonov edited edge metadata.

Fix unused-variable warning. Use dl_iterate_phdr only on Linux: this
is fine for now, as Linux is the only platform which has dl_iterate_phdr
and can fetch the executable name w/o argc/argv at the same time.

In D5610#30, @samsonov wrote:

Ed, do I understand correctly that LLVM is supported for 32-bit FreeBSD 9.2? If that's the case, I'll have to introduce old-FreeBSD-specific defines for dl_iterate_phdr here, or disable dl_iterate_phdr for it entirely.

Please disregard this. This code won't be run on FreeBSD anyway, because sys::fs::getMainExecutable(nullptr, nullptr) won't work there.

emaste added a comment.Oct 8 2014, 5:56 PM

Linux is the only platform which has dl_iterate_phdr
and can fetch the executable name w/o argc/argv at the same time.

Is this the only missing functionality required by the symbolizer change? It's possible to implement for FreeBSD.

emaste added a comment.Oct 8 2014, 6:11 PM

/proc/self/exe equivalent for FreeBSD:

#include <sys/param.h>
#include <sys/sysctl.h>
#include <err.h>
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

int
main(int argc, char *argv[])
{
        char buf[MAXPATHLEN];
        int error, name[4] = { CTL_KERN, KERN_PROC, KERN_PROC_PATHNAME };
        size_t len = sizeof(buf);

        name[3] = getpid();
        error = sysctl(name, 4, buf, &len, NULL, 0);
        if (error != 0 && errno != ESRCH)
                warn("sysctl: kern.proc.pathname: %d", getpid());
        if (len == 0)
                buf[0] = '\0';

        printf("kern.proc.pathname = %s\n", buf);
}

I'll clean this up and implement it for lib/Support/Unix/Path.inc

emaste added a comment.Oct 8 2014, 8:02 PM

getMainExecutable update for FreeBSD in review D5693

joerg added a subscriber: joerg.Oct 9 2014, 2:24 AM

NetBSD certainly has dl_iterate_phdr and provides the name of the main binary...

emaste added inline comments.Oct 9 2014, 12:47 PM
lib/Support/Unix/Signals.inc
275

This builds on FreeBSD (with the linux removed again) after your ElfW(Phdr)->auto *phdr change

287

This conditional is not be needed on FreeBSD, since on FreeBSD dl_iterate_phdr returns the executable with full path name in dlpi_name.

emaste added a comment.Oct 9 2014, 1:08 PM

@joerg, it looks like when we implemented dl_iterate_phdr in FreeBSD we returned the l_name if available rather than the path - i.e., libc.so.7 instead of /usr/lib/libc.so.7, and NetBSD copied that implementation.

However, the Linux manpage claims:

The dlpi_name field is a null-terminated string giving the pathname from which the shared object was loaded.

I'm looking at trying to make the following change or something similar to our rtld:

-       phdr_info->dlpi_name = STAILQ_FIRST(&obj->names) ?
-           STAILQ_FIRST(&obj->names)->name : obj->path;
+       phdr_info->dlpi_name = obj->path;
emaste resigned from this revision.Oct 9 2014, 1:17 PM
emaste removed a reviewer: emaste.
This revision is now accepted and ready to land.Oct 9 2014, 1:17 PM
samsonov added inline comments.Oct 10 2014, 3:00 PM
lib/Support/Unix/Signals.inc
275

I've included FreeBSD and NetBSD here.

287

I'd prefer to leave this condition as is, and don't special-case on it. We already have main executable at this point, why not use it?

samsonov updated this revision to Diff 14751.Oct 10 2014, 3:00 PM

Enable symbolization on FreeBSD and NetBSD.

emaste added inline comments.Oct 10 2014, 3:22 PM
lib/Support/Unix/Signals.inc
287

On FreeBSD the main executable path is returned by the first callback from dl_iterate_phdr, rather than a blank string as on Linux. This is true in practice at least, that it's first is not explicitly guaranteed.

On FreeBSD the path passed to the callback may be more reliable, as well; the path returned by my change (what we'll pass in data->main_exec_name) on FreeBSD comes through vn_fullpath in the kernel, which has the following note in the man page:

The vn_fullpath() function makes a ``best effort'' attempt to generate a
string pathname for the passed vnode; the resulting path, if any, will be
relative to the root directory of the process associated with the passed
thread pointer.  The vn_fullpath() function is implemented by inspecting
the VFS name cache, and attempting to reconstruct a path from the process
root to the object.

whereas the executable's dlpi_name is stored by rtld at the time the process starts.

samsonov closed this revision.Oct 27 2014, 3:32 PM