This is an archive of the discontinued LLVM Phabricator instance.

[asan] Fix stack-use-after-free checks on non-main thread on Fuchsia
ClosedPublic

Authored by zarvox on Oct 16 2020, 5:15 PM.

Details

Summary

While some platforms call AsanThread::Init() from the context of the
thread being started, others (like Fuchsia) call AsanThread::Init()
from the context of the thread spawning a child. Since
AsyncSignalSafeLazyInitFakeStack writes to a thread-local, we need to
avoid calling it from the spawning thread on Fuchsia. Skipping the call
here on Fuchsia is fine; it'll get called from the new thread lazily on first
attempted access.

Diff Detail

Event Timeline

zarvox created this revision.Oct 16 2020, 5:15 PM
Herald added a project: Restricted Project. · View Herald TranscriptOct 16 2020, 5:15 PM
Herald added a subscriber: Restricted Project. · View Herald Transcript
zarvox requested review of this revision.Oct 16 2020, 5:15 PM
charco added a subscriber: charco.Oct 16 2020, 5:27 PM

is possible to test this case, even if it's only Fuchsia?

I don't really understand why the whole "lazy" logic exists. I can't tell what case it would ever be used in as the code stands. Every thread will pass through ASanThread::Init before it would ever just call fake_stack() and hit the "lazy" case. Is it intended that one could set __asan_option_detect_stack_use_after_return to false (either from startup or explicitly clear it), then create a thread, then set the flag explicitly on later?

What actually seems preferable here is that we always do the eager allocation, which can happen from any thread, but defer the setting of the TLS cache, which is the only thing that has to happen in the target thread.
So how about if AsyncSignalSafeLazyInitFakeStack calls SetTLSFakeStack either never or only if (tid() == CurrentTid), and then the GetFakeStack() slow path can call SetTLSFakeStack itself?

As to testing, note that we don't have a mechanism to run sanitizer runtime tests on Fuchsia at all so far. The best I can really think of is that just adding a DCHECK_EQ(GetCurrentThread(), this); before calling SetTLSFakeStack would trigger the assertion failure for a spawned thread on Fuchsia before this fix.

I don't really understand why the whole "lazy" logic exists. I can't tell what case it would ever be used in as the code stands. Every thread will pass through ASanThread::Init before it would ever just call fake_stack() and hit the "lazy" case. Is it intended that one could set __asan_option_detect_stack_use_after_return to false (either from startup or explicitly clear it), then create a thread, then set the flag explicitly on later?

What actually seems preferable here is that we always do the eager allocation, which can happen from any thread, but defer the setting of the TLS cache, which is the only thing that has to happen in the target thread.
So how about if AsyncSignalSafeLazyInitFakeStack calls SetTLSFakeStack either never or only if (tid() == CurrentTid), and then the GetFakeStack() slow path can call SetTLSFakeStack itself?

As to testing, note that we don't have a mechanism to run sanitizer runtime tests on Fuchsia at all so far. The best I can really think of is that just adding a DCHECK_EQ(GetCurrentThread(), this); before calling SetTLSFakeStack would trigger the assertion failure for a spawned thread on Fuchsia before this fix.

Maybe https://reviews.llvm.org/D2553

compiler-rt/lib/asan/asan_thread.cpp
237

This looks like very inconvenient difference, it can be easily missed by some making changes for other platforms.
How hard to make fuchsia to run AsanThread::Init on new thread?

As to testing, note that we don't have a mechanism to run sanitizer runtime tests on Fuchsia at all so far. The best I can really think of is that just adding a DCHECK_EQ(GetCurrentThread(), this); before calling SetTLSFakeStack would trigger the assertion failure for a spawned thread on Fuchsia before this fix.

Why is that? Can you run lit on linux and push tests binaries like we do for android?

vitalybuka added inline comments.Oct 23 2020, 12:39 PM
compiler-rt/lib/asan/asan_thread.cpp
234

can you add comment that the Fuchsia is probably the only known example of platform which needs that?

vitalybuka accepted this revision.Oct 23 2020, 12:39 PM
This revision is now accepted and ready to land.Oct 23 2020, 12:39 PM
zarvox updated this revision to Diff 300419.Oct 23 2020, 2:57 PM
zarvox marked an inline comment as done.

Updated comment and added DCHECK

compiler-rt/lib/asan/asan_thread.cpp
234

I suspect RTEMS might too, based on how the #if just below notes that it also doesn't use ThreadStart, but I'm not certain.

I updated the comment to make it clear that Fuchsia is one of the users of this approach.

237

My understanding is that that would be complicated for reasons that Roland can probably speak to better than I can.