This is an archive of the discontinued LLVM Phabricator instance.

[tsan] Support thread sanitizer on Android.
AbandonedPublic

Authored by yabinc on Nov 19 2015, 7:13 PM.

Details

Summary

This patch contains three parts of work:

  1. As Android doesn't support __thread, use a tls slot to store the

pointer to ThreadState for current thread. Add check for thr->is_inited
as bionic libc can call intercepted functions before ThreadStart().
Add dead_thread_state as bionic libc can call intercepted functions
after DestroyThreadState().

  1. Add special considerations to intercept bionic libc functions, like

removing unsupported functions.

  1. Add address space support for android aarch64 devices. I have tested

on N9 and N5X, but I can't guarantee it is suitable for all android
aarch64 devices.

Diff Detail

Event Timeline

yabinc updated this revision to Diff 40734.Nov 19 2015, 7:13 PM
yabinc retitled this revision from to [tsan] Support thread sanitizer on Android..
yabinc updated this object.
danalbert added a subscriber: llvm-commits.
danalbert added inline comments.Nov 19 2015, 9:43 PM
lib/tsan/rtl/tsan_interceptors.h
35

I can't say much for the rest of the change, but the is_inited checks can easily be a separate, smaller patch, and I can pretty much guarantee that someone else would say so if I didn't :)

dvyukov edited edge metadata.Nov 20 2015, 1:32 PM

This needs to be split into smaller changes. But otherwise I think we can get something like this in.

lib/tsan/rtl/tsan_interceptors.cc
253

fix internal_sigfillset to work on android instead
the existing layer of indirection is exactly to hide such differences

269–273

ditto

271

drop {} for if with single-statement/single-line body
whole tsan codebase does it that way

284

ditto

lib/tsan/rtl/tsan_platform.h
181

drop {}

181

drop {}

lib/tsan/rtl/tsan_platform_linux.cc
371

please use a const here, it does not need to be a macro

const int kTsanTlsSlot = 8;  // allocated by bionic for tsan
372

add a comment explaining what it is for

378

this is still not completely async-signal-safe
consider that a thread is preempted by a signal handler on this line: now signal handler will allocate thread context and work for some time with it, then thread will allocate another context and lead the first one
see how mac does this, a similar change for mac recently went it

383

use MmapOrDie, it does not have zillion of uninteresting arguments and does not return nullptr

392

I wonder if there will be some races on dead_thread_state since it is shared between threads. I think it is a good idea to mprotect(PROT_READ) it after initialization. It can save us lots of debugging in future.

lib/tsan/rtl/tsan_platform_posix.cc
125–130

can't we keep these ProtectRange's on android?

applications do weird things like mmaps on strange fixed addresses, or mmaps at completely random addresses (a-la home-grown ASLR), such things cause unpleasant debugging

lib/tsan/rtl/tsan_rtl.cc
292 ↗(On Diff #40734)

why is this change made?

yabinc updated this revision to Diff 41003.Nov 23 2015, 5:40 PM
yabinc marked 7 inline comments as done.
yabinc edited edge metadata.

Update code based on review, will split the patch after finishing all comments.

lib/tsan/rtl/tsan_interceptors.cc
253

I use internal_sigfillset() for android because sigfillset() is not intercepted on android (maybe because of asan). And I think we can't use internal_sigfillset() for mac or windows.

lib/tsan/rtl/tsan_interceptors.h
35

I will split the patch into 3 parts as in the summary, if no one complains that it should be smaller.

lib/tsan/rtl/tsan_platform_linux.cc
378

why it is not signal-safe? There is another check of thr in L365. And I think L364-L365 will not be moved above L363 because L364 is inlined assemble language.

392

yes, it's a good idea.

lib/tsan/rtl/tsan_platform_posix.cc
125–130

surely we can. I commented this out just because the address map for android aarch64 doesn't conform the assumption here. I found that by commenting these out, the core dump becomes reasonable small, which helps debugging.

lib/tsan/rtl/tsan_rtl.cc
292 ↗(On Diff #40734)

Because previously I made kHeapMemBeg == kHeapMemEnd on android/aarch64. I am not sure how I should set kHeapMemBeg and kHeapMemEnd. The brk() space goes after main binary. But android uses jemalloc which uses common mmap() to allocate heap space. This makes real heap space mixed with kHiAppMem. As the correctness of kHeapMem is not very important, I changed it to follow linux/aarch64.
By the way, I am a little curious whether kxxxEnd should be accessible. From tsan_platform.h, it gives me a sense that kxxEnd should not be accessible, but kHiAppMemEnd should be accessible when it is 0x7fffffffffull.

+Kuba for sigfillset Mac question.

lib/tsan/rtl/tsan_interceptors.cc
253

This file calls REAL(sigfillset) just because it was written before internal_sigfillset appeared. The right thing to do is to replace all calls of REAL(sigfillset) to internal_sigfillset in this file, and then make internal_sigfillset work on all platforms. Windows is not an issue, because tsan does not work on windows yet. Mac calls just sigfillset inside of internal_sigfillset, not sure whether it can lead to infinite recursion or not... +Kuba for this. Anyway we need to fix internal_sigfillset.

lib/tsan/rtl/tsan_platform_linux.cc
378

I missed the second check. Yes, this looks signal-safe. Sorry.

lib/tsan/rtl/tsan_platform_posix.cc
125–130

I think we need let each platform define what regions need to be protected (see GetUserRegion function for similar functionality). Then you can support Android mapping.

I don't understand how it helps with core dumps. These regions should not be in core dump as they are all zeros. And anyway you still protect huge regions of memory. And we have huge shadow that we need to map anyway.

lib/tsan/rtl/tsan_rtl.cc
292 ↗(On Diff #40734)

There is a check below:

if (p < beg || p >= end)

which ensures that we don't actually end address. So correct constant for kHiAppMemEnd is 0x8000000000 rather than 0x7fffffffff.

All sanitizers replace default malloc, so jemalloc behavior is irrelevant here.
Tsan allocator is defined in tsan_rtl.h, search for "SizeClassAllocator". For mips64 and arm64 it actually does not use kHeapMemBeg/kHeapMemEnd, it just uses whatever mmap returns (this again makes it important to mprotect all unused holes in address space so that mmap does not return that addresses). So it should be irrelevant how you define it for arm64 (not sure why it defined these constants at all).

kubamracek added inline comments.Dec 3 2015, 5:50 AM
lib/tsan/rtl/tsan_interceptors.cc
253

Calling just sigfillset on OS X will call REAL(sigfillset), so it won't infinitely recurse. (Actually, sigfillset is a macro that sets the value to ~0, so nothing is called at all.)

dvyukov added inline comments.Dec 3 2015, 6:17 AM
lib/tsan/rtl/tsan_interceptors.cc
253

Great, thanks.
Then we just need to call internal_sigfillset in this file.

yabinc updated this revision to Diff 41927.Dec 4 2015, 12:33 PM
yabinc marked an inline comment as done.

update based on review comments and rebase code.

yabinc added inline comments.Dec 4 2015, 12:57 PM
lib/tsan/rtl/tsan_interceptors.cc
253

Inorder to use internal_sigfillset() in all cases, I have to move internal_sigfillset() from sanitizer_linux.cc to sanitizer_posix.cc. I included "sanitizer_platform_limits_posix.h" in sanitizer_posix.h, I am not very sure that
it is fine.

lib/tsan/rtl/tsan_platform_posix.cc
125–130

If we provide a GetProtectRegion() function, it is much longer (about 50~60 lines) than code here. And personally I think that is less clear. Sadly I can't use address range array. So I improved the code here
a little by using #if, which may make it clearer. But if you really think we should use GetProtectRegion(),
feel free to let me know.
I don't know the detail of core dump, maybe dive into this question later.

It generally looks good to me, please start splitting this into smaller patches. For example, disable of interceptors is a separate patch; sigfillset is a separate patch; android cur_thread() is a separate patch, etc.

yabinc abandoned this revision.Dec 9 2015, 4:26 PM

Abandon this, as it has been split.