This is an archive of the discontinued LLVM Phabricator instance.

asan: handle SIGABRT
Needs ReviewPublic

Authored by tlipcon on Jul 10 2013, 3:46 PM.

Details

Reviewers
samsonov
Summary

commit 0eb8067bfe9ee4848df03469ccd1a8f9364df4a5
Author: Todd Lipcon <todd@cloudera.com>
Date: Wed Jul 10 15:34:15 2013 -0700

asan: handle SIGABRT

On some kernel versions, if /proc/sys/vm/core_pattern is set to
a pipe, then the setrlimit(RLIMIT_CORE) call to disable core
dumps has no effect. In that case, if the user application calls
abort(), then the uncaught SIGABRT triggers a core dump, resulting
in a multi-TB pipe into the core dump handler (blocking the process
for many hours).

This patch adds a handler for SIGABRT which logs the error, then
unmaps the shadow memory before propagating the signal. This allows
a core dump to be processed reasonably even on 64-bit systems.

Manually tested with an application which calls "abort()". On systems
set up as described above, this used to hang for hours as it tried
to send the large core dump through the pipe. With the patch, it
outputs:

$ ./coredump
ASAN:SIGABRT
==28640==ERROR: AddressSanitizer: ABRT (pc 0x7f8632666425 sp 0x7fff0cae8f08 bp 0x7fff0cae91d0 T0)
    #0 0x7f8632666424 (/lib/x86_64-linux-gnu/libc-2.15.so+0x36424)
    #1 0x7f8632669b8a (/lib/x86_64-linux-gnu/libc-2.15.so+0x39b8a)
    #2 0x42b11e (/tmp/coredump+0x42b11e)
    #3 0x7f863265176c (/lib/x86_64-linux-gnu/libc-2.15.so+0x2176c)
    #4 0x42ae7c (/tmp/coredump+0x42ae7c)
Aborted (core dumped)

Diff Detail

Event Timeline

glider added inline comments.Jul 11 2013, 3:30 AM
lib/asan/asan_posix.cc
90

Once you unmap the shadow mappings, it's unsafe to invoke the default signal handler, because the instrumented code will try to access the shadow memory.

I'd rather avoid dealing with SIGABRT in ASan. Is there really no way to limit the size of the core dump on the systems you mention?

lib/asan/asan_report.cc
526

No, generally ASan should fail the program immediately after it prints the error report - that's why all "ReportFoo" functions are noreturn and we have ScopedInErrorReport object. Among the other things, it makes sure that no two error reports are printed simultaneously, otherwise bad things may happen - the code in "ReportFoo" functions is thread-unsafe.

lib/asan/asan_rtl.cc
151

Why? I don't think SIGABRT handler should be on by default.

tlipcon added inline comments.Jul 12 2013, 11:55 AM
lib/asan/asan_posix.cc
90

The default signal handler, though, is in the kernel -- not user code. So, it won't run instrumented code, it'll just generate a core dump and exit with the right return code. Right?

I'm explicitly setting to SIG_DFL, not restoring whatever signal handler the user might have installed.

lib/asan/asan_report.cc
526

The issue is that if you die with "Die()" then you don't get the core dump at all. I'd like to actually get core dumps on SEGV/ABRT so I can debug things, but I want the core dumps to be _without_ the shadow mapping.

With this patch, you can get this behavior by turning on abort_on_error -- the flow is:

  • user code accesses bad memory
    • ReportSEGV generates the report
    • Die() calls abort(), generating a SIGABRT
      • HandleSIGABRT gets called, reports SIGABRT, unmaps memory
      • HandleSIGABRT re-kills self with SIGABRT after setting to SIG_DFL
        • kernel generates core dump (without shadow mappings)
  • user happily debugs their application from the corefile

As for ensuring thread-safety, I put that check in the signal handler -- it uses an atomic swap to make sure that only one thread will proceed into this code

lib/asan/asan_rtl.cc
151

Why not? Unmapping the shadow memory on abort makes corefiles possible on 64-bit (and could enable us to then get rid of the other code which sets RLIMIT_CORE->0 on those systems)

My general use case is that I have an integration test build which runs all of my tests with ASAN enabled. Sometimes my tests fail due to a race which is difficult to reproduce, but a corefile would allow us to debug it the next morning. Currently it's impossible to get corefiles out of this build, but with this patch we're able to.

Sorry for the delayed response.

lib/asan/asan_posix.cc
62

Looks like you don't "invoke" default handler, you just install it here.

90

Ok... So you mean that after ASAN_OnSIGABRT exits, default signal handler will be called, and no more user code will be executed?

lib/asan/asan_report.cc
526

Hm... but Die() already unmaps shadow at exit, doesn't it? Why do you need to do this in SIGABRT handler? Will the following strategy work?

  • make SIGABRT and SIGSEGV handlers identical: print an error report, optionally set the handler to SIG_DFL, and call Die()
  • Die() optionally unmaps the shadow memory
  • Die() optionally calls abort(), which goes to kernel and generates a core dump.

Your atomic swap doesn't prevent two threads from entering ReportSIGABRT and ReportFreeNotMalloced simultaneously. The code in these functions is not thread-safe, for example they print symbolized stack traces, and symbolization is single-threaded.

lib/asan/asan_rtl.cc
151

Your use case is legitimate, but we generally introduce new options under a flag. We need to make sure it won't break the existing users, especially if they were able to live w/o this capability.

See my comment above about generating core dumps.