This is an archive of the discontinued LLVM Phabricator instance.

[ASan] Stop blocking child thread progress from parent thread in `pthread_create` interceptor.
ClosedPublic

Authored by delcypher on Jan 21 2021, 4:19 PM.

Details

Summary

Previously in ASan's pthread_create interceptor we would block in the
pthread_create interceptor waiting for the child thread to start.

Unfortunately this has bad performance characteristics because the OS
scheduler doesn't know the relationship between the parent and child
thread (i.e. the parent thread cannot make progress until the child
thread makes progress) and may make the wrong scheduling decision which
stalls progress.

It turns out that ASan didn't use to block in this interceptor but was
changed to do so to try to address
http://llvm.org/bugs/show_bug.cgi?id=21621/.

In that bug the problem being addressed was a LeakSanitizer false
positive. That bug concerns a heap object being passed
as arg to pthread_create. If:

  • The calling thread loses a live reference to the object (e.g. pthread_create finishes and the thread no longer has a live reference to the object).
  • Leak checking is triggered.
  • The child thread has not yet started (once it starts it will have a live reference).

then the heap object will incorrectly appear to be leaked.

This bug is covered by the lsan/TestCases/leak_check_before_thread_started.cpp test case.

In b029c5101fb49b3577a1c322f42ef9fc616f25bf ASan was changed to block
in pthread_create() until the child thread starts so that arg is
kept alive for the purposes of leaking check.

While this change "works" its problematic due to the performance
problems it causes. The change is also completely unnecessary if leak
checking is disabled (via detect_leaks runtime option or
CAN_SANITIZE_LEAKS compile time config).

This patch does two things:

  1. Takes a different approach to solving the leak false positive by making LSan's leak checking mechanism treat the arg pointer of created but not started threads as reachable. This is done by implementing the ForEachRegisteredThreadContextCb callback for ASan.
  1. Removes the blocking behaviour in the ASan pthread_create interceptor.

rdar://problem/63537240

Diff Detail

Event Timeline

delcypher created this revision.Jan 21 2021, 4:19 PM
delcypher requested review of this revision.Jan 21 2021, 4:19 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 21 2021, 4:19 PM
Herald added a subscriber: Restricted Project. · View Herald Transcript

@vitalybuka This patch implements the "update the frontier" approach you outlined in the https://reviews.llvm.org/D94210 . The patch depends on https://reviews.llvm.org/D95183 .

delcypher updated this revision to Diff 318353.Jan 21 2021, 4:36 PM

Tweak condition to make it easier to read.

vitalybuka added inline comments.Jan 21 2021, 8:55 PM
compiler-rt/lib/asan/asan_allocator.cpp
1139

if we at asan_thread.cpp:260 arg is still invisible to lsan, and we can report false leak. So ThreadStatusRunning should be covered as well.

1150–1165

I would prefer we keep this logic in lsan

compiler-rt/lib/asan/asan_interceptors.cpp
230

if (result != 0) {

// destroy t;

}

delcypher added inline comments.Jan 22 2021, 12:57 PM
compiler-rt/lib/asan/asan_allocator.cpp
1139

Good catch.

1150–1165

Yep. Your suggestion in https://reviews.llvm.org/D95183 should resolve this. One downside though is when producing log output is that we won't be able to print the thread ID. That's a pretty small downside so I'm okay with that.

1153

Hmm. It just occurred to me we aren't checking m.allocated() is true before changing the tag and adding it to the frontier. This could lead to bad behaviour if the chunk gets freed at some point (e.g. the created thread frees it). Adding a freed chunk to the frontier could be especially bad because later on when doing the flood fill in LSan we'll scan the freed memory for pointers.

I'll add a check here to prevent this.

compiler-rt/lib/asan/asan_interceptors.cpp
230

Good catch. Isn't this a separate bug though? It looks like the old code didn't handle this either. I'm happy to fix it but I think it should be in a separate patch.

delcypher added inline comments.Jan 22 2021, 5:35 PM
compiler-rt/lib/asan/asan_interceptors.cpp
230

Wait.. no the bug isn't the old code because I've reserved the creation order so I've introduced a new bug. I'll fix this in this patch.

delcypher updated this revision to Diff 318713.Jan 22 2021, 6:12 PM

Address first round of feedback.

delcypher marked 3 inline comments as done.Jan 22 2021, 6:13 PM
delcypher added inline comments.Jan 22 2021, 6:29 PM
compiler-rt/lib/asan/asan_interceptors.cpp
228

@vitalybuka Is AsanThread::Destroy() safe to call from this thread and in this context where the corresponding thread was never started.? I wasn't really sure.

I see that calling this will call DTLS_Destroy() and I'm not sure if that needs to be called from a different thread (i.e. called from the thread before its destroyed). It looks like this function is a no-op on macOS which is where I'm testing.

vitalybuka accepted this revision.Jan 22 2021, 7:31 PM
vitalybuka added inline comments.
compiler-rt/lib/asan/asan_allocator.cpp
1197

historically sanitizers use snake case for local variable, from Google coding style

1202–1204
if (atctx->status != ThreadStatusCreated && atctx->status != ThreadStatusRunning)
  return;
compiler-rt/lib/asan/asan_interceptors.cpp
228

I think DTLS_Destroy should be fine here. I assume this thread will never showup in suspended_threads so lsan will not try to scan it.

As I understand Ctx is not leaked but pushed into quarantine by ThreadRegistry::FinishThread

This revision is now accepted and ready to land.Jan 22 2021, 7:31 PM
delcypher updated this revision to Diff 318721.Jan 22 2021, 7:43 PM

Address second round of feedback.

delcypher marked an inline comment as done.Jan 22 2021, 7:43 PM
delcypher added inline comments.
compiler-rt/lib/asan/asan_allocator.cpp
1197

Okay. I'll change the style.

1202–1204

Sure. I'll rewrite it that way.

compiler-rt/lib/asan/asan_interceptors.cpp
228

Sorry "leaks" is a poor choice of words. What I meant is we've wasted an AsanThreadContext because created it but then never used it. Because in ASan we never re-use AsanThreadContexts its wasted.

I'll update the comment.

delcypher marked 3 inline comments as done.Jan 22 2021, 7:44 PM
This revision was landed with ongoing or failed builds.Jan 22 2021, 11:35 PM
This revision was automatically updated to reflect the committed changes.

You should also change compiler-rt/lib/asan/asan_win.cpp like this:

diff --git a/compiler-rt/lib/asan/asan_win.cpp b/compiler-rt/lib/asan/asan_win.cpp
index 8044ae16ff9b..1577c83cf994 100644
--- a/compiler-rt/lib/asan/asan_win.cpp
+++ b/compiler-rt/lib/asan/asan_win.cpp
@@ -134,7 +134,7 @@ INTERCEPTOR(int, _except_handler4, void *a, void *b, void *c, void *d) {
 static thread_return_t THREAD_CALLING_CONV asan_thread_start(void *arg) {
   AsanThread *t = (AsanThread *)arg;
   SetCurrentThread(t);
-  return t->ThreadStart(GetTid(), /* signal_thread_is_registered */ nullptr);
+  return t->ThreadStart(GetTid());
 }

 INTERCEPTOR_WINAPI(HANDLE, CreateThread, LPSECURITY_ATTRIBUTES security,

You should also change compiler-rt/lib/asan/asan_win.cpp like this:

diff --git a/compiler-rt/lib/asan/asan_win.cpp b/compiler-rt/lib/asan/asan_win.cpp
index 8044ae16ff9b..1577c83cf994 100644
--- a/compiler-rt/lib/asan/asan_win.cpp
+++ b/compiler-rt/lib/asan/asan_win.cpp
@@ -134,7 +134,7 @@ INTERCEPTOR(int, _except_handler4, void *a, void *b, void *c, void *d) {
 static thread_return_t THREAD_CALLING_CONV asan_thread_start(void *arg) {
   AsanThread *t = (AsanThread *)arg;
   SetCurrentThread(t);
-  return t->ThreadStart(GetTid(), /* signal_thread_is_registered */ nullptr);
+  return t->ThreadStart(GetTid());
 }

 INTERCEPTOR_WINAPI(HANDLE, CreateThread, LPSECURITY_ATTRIBUTES security,

Thanks. Sorry for breaking the Windows build. This should be resolved by 757b93bb7b384038a8dec35433f78f5c7c2ef8b0