This is an archive of the discontinued LLVM Phabricator instance.

[asan] Fix the deadlocks introduced by "On OS X, log reports to syslog and os_trace" commit
ClosedPublic

Authored by zaks.anna on Nov 6 2015, 5:02 PM.

Details

Summary

[asan] On OS X, log reports to syslog and os_trace, has been reverted in r252076 due to deadlocks on earlier versions of OS X. Alexey has also noticed deadlocks in some corner cases on Linux. This patch, if applied on top of the logging patch (http://reviews.llvm.org/D13452), addresses the known deadlock issues.

(This also proactively removes the color escape sequences from the error report buffer since we have to copy the buffer anyway.)

More background:

Alexey Samsonov <vonosmas@gmail.com>:
While I'm not exactly sure what caused these issues on Mac buildbot, I've also observed several self-deadlocks caused by this change.
Essentially, they are caused by AppendToErrorMessageBuffer function, which now takes a lock, but can be re-entrant:

  • AppendToErrorMessageBuffer is called and takes a lock
  • AppendToErrorMessageBuffer tries to allocate error message buffer and calls MmapOrDie()
  • MmapOrDie() fails, because memory limit is exceeded, and calls Report() to report that
  • Report calls AppendToErrorMessageBuffer, which attempts to take a lock, and hangs.

The same will happen if CHECK() call in AppendToErrorMessageBuffer fails. Sadly, I can't suggest a good solution for that: probably
you just need to try harder to make sure the function will never be called recursively: use RAW_CHECK instead of CHECK and
call (some variation of) MmapOrDieQuietly(), which will just call RawWrite()+Die() on failure.

Anna:
I suspect the issue with the earlier versions of OS X is that logging the error report results in ASan calling Printf under high verbosity level (and Printf calls AppendToErrorMessageBuffer). We hold the lock that is guarding error_message_buf when reporting inside ScopedInErrorReport. We can work around that issue by copying the buffer under the lock but releasing it before calling the logging routines.

Diff Detail

Repository
rL LLVM

Event Timeline

zaks.anna updated this revision to Diff 39611.Nov 6 2015, 5:02 PM
zaks.anna retitled this revision from to [asan] Fix the deadlocks introduced by "On OS X, log reports to syslog and os_trace" commit.
zaks.anna updated this object.
zaks.anna added reviewers: samsonov, kubamracek.
zaks.anna added a subscriber: llvm-commits.
samsonov added inline comments.Nov 6 2015, 5:24 PM
lib/asan/asan_report.cc
679 ↗(On Diff #39611)

Hm, I would actually prefer InternalScopedString here to avoid using internal allocator that late (and use the same mmap we use for error_message_buffer).

lib/sanitizer_common/sanitizer_common.h
80 ↗(On Diff #39611)

Looks like it's not used.

lib/sanitizer_common/sanitizer_mac.cc
384 ↗(On Diff #39611)

Wouldn't we die earlier if it's null? If it can be null, we probably don't need to invoke error_report_callback either.

zaks.anna updated this revision to Diff 39856.Nov 10 2015, 2:56 PM

Addressing Alexey's review. Switching to use InternalScopedBuffer lead to a bit more refactoring. For example, I've also switched to InternalScopedBuffer in WriteToSyslog since that copy happens late as well.

Alexey,

Could you look at this?

Thanks!
Anna.

zaks.anna updated this revision to Diff 40563.Nov 18 2015, 2:23 PM

Change WriteToSyslog to make it's own copy of the message since it destroys the buffer.

kubamracek edited edge metadata.Nov 19 2015, 7:53 AM

LGTM with a few nits.

lib/asan/asan_report.cc
692 ↗(On Diff #40563)

Use InternalScopedString instead of InternalScopedBuffer<char>.

lib/sanitizer_common/sanitizer_common.h
52 ↗(On Diff #40563)

Rename to "kErrorMessageBufferSize".

661 ↗(On Diff #40563)

Do we need to pass InternalScopedBuffer/InternalScopedString? Can we keep it as const char *?

samsonov edited edge metadata.Nov 19 2015, 9:19 AM

Sorry for the delay in review! =/

lib/asan/asan_report.cc
52 ↗(On Diff #40563)

Do you actually need this variable at all, now that you have a constant?

lib/sanitizer_common/sanitizer_posix.cc
125 ↗(On Diff #40563)

This function is now called in several more places. You might want to just add extra argument (raw_report) to ReportMmapFailureAndDie to reduce code duplication.

zaks.anna added inline comments.Nov 19 2015, 4:04 PM
lib/asan/asan_report.cc
692 ↗(On Diff #40563)

It's a buffer because we treat it as a buffer later on. For example, RemoveANSIEscapeSequencesFromString modifies it in place and modifies the raw data. (If it were a string, its length would get invalidated.)

zaks.anna updated this revision to Diff 40723.Nov 19 2015, 5:10 PM
zaks.anna edited edge metadata.

Addressed all outstanding suggestions.

samsonov accepted this revision.Nov 20 2015, 9:51 AM
samsonov edited edge metadata.

LGTM with a nit. Thanks!

lib/sanitizer_common/sanitizer_common.h
320 ↗(On Diff #40723)

You can remove this function from the header, and make it static in .cc file (if not inline it).

This revision is now accepted and ready to land.Nov 20 2015, 9:51 AM
This revision was automatically updated to reflect the committed changes.