This is an archive of the discontinued LLVM Phabricator instance.

[ASan] Stop blocking child thread progress from parent thread in `pthread_create` interceptor.
AbandonedPublic

Authored by delcypher on Jan 6 2021, 5:44 PM.

Details

Summary

Previously in ASan's pthread_create interceptor we would block in the
pthread_create interceptor waiting for the child thread to start.

Unfortunately this has bad performance characteristics because the OS
scheduler doesn't know the relationship between the parent and child
thread (i.e. the parent thread cannot make progress until the child
thread makes progress) and may make the wrong scheduling decision which
stalls progress.

It turns out that ASan didn't use to block in this interceptor but was
changed to do so to try to address
http://llvm.org/bugs/show_bug.cgi?id=21621/.

In that bug the problem being addressed was a LeakSanitizer false
positive. That bug concerns a heap object being passed
as arg to pthread_create. If:

  • The calling thread loses a live reference to the object (e.g. pthread_create finishes and the thread no longer has a live reference to the object).
  • Leak checking is triggered.
  • The child thread has not yet started (once it starts it will have a live reference).

then the heap object will incorrectly appear to be leaked.

This bug is covered by the lsan/TestCases/leak_check_before_thread_started.cpp test case.

In b029c5101fb49b3577a1c322f42ef9fc616f25bf ASan was changed to block
in pthread_create() until the child thread starts so that arg is
kept alive for the purposes of leaking check.

While this change "works" its problematic due to the performance
problems it causes. The change is also completely unnecessary if leak
checking is disabled (via detect_leaks runtime option or
CAN_SANITIZE_LEAKS compile time config).

This patch takes a different approach to solving the leak false positive
telling LSan to ignore the arg object in between the time the child
thread is created and started. Because the object is being ignored its
no longer necessary to wait for the child thread to start and so the
blocking behaviour can be removed.

In more detail:

  1. Change pthread_create interceptor and asan_thread_start to no longer have a blocking relationship. This is similar to reverting b029c5101fb49b3577a1c322f42ef9fc616f25bf except we
    • Keep the test case added.
    • Leave the __lsan_thread_start_func changes in place because we haven't changed the blocking behaviour in standalone LSan. We could but I think its best we start with making this change to ASan first.
    • Adapt to source code changes made since the patch was landed.
  1. In the pthread_create interceptor call __lsan::IgnoreObjectLocked(arg) to cause the object to be ignored for the purposes of leak detection. This is only done if leak checking is enabled. The interceptor then calls a new method (set_unignore_arg_on_start) to tell the AsanThread that it should unignore the object when it starts.
  1. In AsanThread::ThreadStart call __lsan::UnIgnoredObjectLocked(arg) if unignore_arg_on_start_ is true.

rdar://problem/63537240

Diff Detail

Event Timeline

delcypher created this revision.Jan 6 2021, 5:44 PM
delcypher requested review of this revision.Jan 6 2021, 5:44 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 6 2021, 5:44 PM
Herald added a subscriber: Restricted Project. · View Herald Transcript
delcypher added inline comments.Jan 6 2021, 5:49 PM
compiler-rt/lib/asan/asan_thread.cpp
264

The lint warning here can be ignored. The function is introduced by https://reviews.llvm.org/D94209

vitalybuka added inline comments.Jan 6 2021, 6:12 PM
compiler-rt/lib/asan/asan_interceptors.cpp
220

probably too artificial but looks like will be broken

p = new char;
pthread_create(... p)
delete p;  // we know that thread is not going to use it
... 

p = new char; // unrelated allocation
Ingore(p)

could be simple and more reliable to change leak sanitizer to walk through all alive threads and include thread->arg_ into frontier?

vitalybuka added inline comments.Jan 6 2021, 6:29 PM
compiler-rt/lib/asan/asan_interceptors.cpp
220

more practical

shared = new char;
pthread_create(... shared)
pthread_create(... shared)

if the first thread unignores and exists and the second thread are not yet started we may see false leak.
should be fine with proposed lsan change.

delcypher added inline comments.Jan 7 2021, 12:00 PM
compiler-rt/lib/asan/asan_interceptors.cpp
220

For the benefit of readers, here's what can go wrong in the "For the "we know that thread is not going to use it" example.

Note we are assuming that the second new char gets the same address as the first new char. This wouldn't happen with the quarantine in place which is not the normal runtime configuration so this example is a bit artificial but it is still worth considering.

If the call to Ingore(p) (i.e. __lsan_ignore_object()) happens in between the call to __lsan::IgnoreObjectLocked() in the pthread_create interceptor and the call to __lsan::UnIgnoreObjectLocked() in the child thread then it will be as if that call never happened because the child thread will unignore the object regardless of the Ingore(p) call.

220

Thanks for the examples. They illustrate a significant problem with my approach that I hadn't considered.

Your proposal to have "leak sanitizer to walk through all alive threads and include thread->arg_ into frontier" would work if we could "walk through all alive threads". Unfortunately we can't do that with the removal of the blocking behaviour that this patch tries to achieve.

This is because although we create the AsanThread object in the parent thread (in the pthread_create interceptor) we don't actually register this thread with the thread registry until the child thread starts (registration happens in the call to t->ThreadStart(GetTid()); in the child thread). Because the thread isn't registered with the thread registry in between the exiting of the pthread_create() interceptor and the start of the child thread LSan won't be able to find this unstarted child thread in the thread registry and thus can't ignore its arg.

However, I have two alternative ideas that might work...

Solution 1. Introduce internal APIs that can be used to add/remove pointers from the frontier. Then this patch would call __lsan::AddPtrToFrontier() and __lsan::RemoveAddedPtrFromFrontier() in the pthread_create interceptor (parent thread) and in asan_thread_start() (child thread). The idea here being if leak detection is triggered in between __lsan::AddPtrToFrontier() and __lsan::RemoveAddedPtrFromFrontier() the arg of the not started child thread will be added to the frontier when the frontier is created and thus reporting the arg leak will be suppressed.

We need to do __lsan::RemoveAddedPtrFromFrontier() because that pointer might not be valid forever and the address could get reused for something which would mess up reporting leaks.
It looks like the Frontier object stores LsanMetadata chunks so these new APIs would need to ignore pointers that are not allocated chunks.

Solution 2.

Change the code to basically not do the blocking behaviour when leak checking is off. This technically gets me to my end goal because the situation I care about is where leak checking is off. I don't really like this solution though because it

  • Make the code complicated.
  • Creates a divergence in behaviour between leaks on vs leaks off which we probably don't have good test coverage for today (most tests probably run with leak checking on).
  • Doesn't fix the performance problem when leak checking is on.

What do you think? Do you like either of these solutions or do you have another you'd like to propose?

vitalybuka added inline comments.Jan 7 2021, 4:46 PM
compiler-rt/lib/asan/asan_interceptors.cpp
211

previously we kept AsanThread only if pthread_create succeed.
now if pthread_create fails we still keep it. We probably should destroy it.

220

Your proposal to have "leak sanitizer to walk through all alive threads and include thread->arg_ into frontier" would work if we could "walk through all alive threads". Unfortunately we can't do that with the removal of the blocking behaviour that this patch tries to achieve.

Your patch still calls AsanThread::Create -> asanThreadRegistry().CreateThread -> adds args into threads_
We don't rely care if it's alive or not. Leak sanitizer can add all args from there into frontier. When thread exists we can set args to null, just in case.

Solution 1.

LGTM, if the one above is not good enough.
However lets call it something like lsan::IgnorePermanently or AddToIgnoreList avoid exposing "Frontier" which is internal lsan term and to show somehow that it's different from IgnoreObjectLocked that ignore does not goes away when object is deallocated.

delcypher added inline comments.Jan 7 2021, 5:03 PM
compiler-rt/lib/asan/asan_interceptors.cpp
220

Your patch still calls AsanThread::Create -> asanThreadRegistry().CreateThread -> adds args into threads_
We don't rely care if it's alive or not. Leak sanitizer can add all args from there into frontier. When thread exists we can set args to null, just in case.

Oops. You're right, I misread the code. It might be possible to do your original suggestion. I'll give it a try and report back.