This patch allows us to move away from using __thread on darwin,
which is requiring for building lsan for darwin on ios version 7
and on iossim i386.
Details
- Reviewers
- kcc - kubamracek 
- Commits
- rGdc13921fbc19: Factor lsan allocator cache accesses into a function
 rGb91a5eabb3cc: Use pthreads for thread-local lsan allocator cache on darwin
 rG9971b76d20df: Use pthreads for thread-local lsan allocator cache on darwin
 rG923a32044199: Use pthreads for thread-local lsan allocator cache on darwin
 rG01159f5c5868: Use pthreads for thread-local lsan allocator cache on darwin
 rCRT298537: Factor lsan allocator cache accesses into a function
 rCRT298274: Use pthreads for thread-local lsan allocator cache on darwin
 rCRT298214: Use pthreads for thread-local lsan allocator cache on darwin
 rCRT296707: Use pthreads for thread-local lsan allocator cache on darwin
 rCRT295413: Use pthreads for thread-local lsan allocator cache on darwin
 rL298537: Factor lsan allocator cache accesses into a function
 rL298274: Use pthreads for thread-local lsan allocator cache on darwin
 rL298214: Use pthreads for thread-local lsan allocator cache on darwin
 rL296707: Use pthreads for thread-local lsan allocator cache on darwin
 rL295413: Use pthreads for thread-local lsan allocator cache on darwin
Diff Detail
- Build Status
- Buildable 4974 - Build 4974: arc lint + arc unit 
Event Timeline
| lib/lsan/lsan_allocator.cc | ||
|---|---|---|
| 43 | Again, why do we need this change? We need to make sure not to break these platforms (mips, aarch64). | |
| 147 | Looks too weird to me. sizeof(AllocatorCache) should do. | |
| lib/lsan/lsan_allocator.h | ||
| 58 ↗ | (On Diff #88566) | Do we need to change this code here? It's not obvious to me that this doesn't change the parameters for some platforms. | 
| lib/sanitizer_common/sanitizer_allocator_primary64.h | ||
| 49–50 ↗ | (On Diff #88566) | Why is this necessary? I mean, the change looks reasonable, but is it related to this change? If not, please submit it as a separate patch. | 
| lib/lsan/lsan_allocator.cc | ||
|---|---|---|
| 43 | This is exactly equivalent to the existing #if condition (#if (mips || aarch || i386) && !i386), but simplified because the data has been split out. | |
| 147 | sizeof(TYPENAME) fails check-sanitizer style checks. | |
| lib/lsan/lsan_allocator.h | ||
| 58 ↗ | (On Diff #88566) | It seemed cleaner to do this way, now that kMaxAllowedMallocSize has its own if block. However, I think you're right, this could potentially miss some edge cases for architecture. | 
| lib/sanitizer_common/sanitizer_allocator_primary64.h | ||
| 49–50 ↗ | (On Diff #88566) | It's necessary, at least on Darwin, because you'll overflow the type if you use a uptr to store a ULL. I'll move it to a separate patch. | 
| lib/lsan/lsan_allocator.cc | ||
|---|---|---|
| 43 | I'd still prefer not to make refactoring together with functional changes. (This particular change is okay, but it's hard to immediately see that the change is NFC.) | |
| 147 | Does it really? Is the linter smart enough to figure out that AllocatorCache is a type and not a variable? Any other suggestions? I really don't like to have a function call inside a sizeof() (I know it's not really calling the function, but still.) | |
| lib/sanitizer_common/sanitizer_allocator_primary64.h | ||
| 49–50 ↗ | (On Diff #88566) | This whole class should be ifdef'd to only compile on 64-bit platforms. Isn't that so? | 
| lib/sanitizer_common/sanitizer_allocator_primary64.h | ||
|---|---|---|
| 49–50 ↗ | (On Diff #88566) | Right, and unsigned long is 64-bit on (supported) 64-bit platforms, isn't it? | 
| lib/sanitizer_common/sanitizer_allocator_primary64.h | ||
|---|---|---|
| 49–50 ↗ | (On Diff #88566) | Turns out that __arm__ was creeping in due to a gap in the if condition. I'll fix that. | 
| lib/lsan/lsan_common.h | ||
|---|---|---|
| 18 ↗ | (On Diff #88841) | Hm... Could we have it the other way around? lsan_allocator.h would include lsan_common.h and GetAllocatorCache would be declared in lsan_allocator.h only. lsan_allocator.cc would include lsan_allocator.h. The _linux.cc and _mac.cc files would also include lsan_allocator.h. Is there a dependency problem with that? | 
| lib/lsan/lsan_common.h | ||
|---|---|---|
| 18 ↗ | (On Diff #88841) | Ahh, that's a very good idea, didn't think to move just the declaration. | 
| lib/lsan/lsan_allocator.h | ||
|---|---|---|
| 55 ↗ | (On Diff #88846) | Does the issue with __arm__ still exist now? | 
Thanks for taking the time to review! This should fix the last of the build failures, currently working on a patch to enable builds by default but disable actually running it.
Okay. It should be enough to exclude "check-lsan" from "check-all", but still keep "check-lsan" so that we can run the tests manually. Look for add_lit_testsuite and EXCLUDE_FROM_ALL.
We'll also need to change the default in asan, since asan enables lsan by default currently.
This patch causes a build failure on one test on linux, and I'm having trouble figuring out why. It appears that the test error is fixed by simply moving the definition of GetAllocatorCache() from lsan_common_linux.cc to lsan_allocator.cc. Not sure why this would change the behavior of the test, but investigating now. In the meantime, I reverted the patch.
The failing test is Linux-x86_64 stack-use-after-return.cc, failing with: AddressSanitizer: nested bug in the same thread, aborting.
FAIL: AddressSanitizer-x86_64-linux :: TestCases/Posix/stack-use-after-return.cc (1102 of 1258)
- TEST 'AddressSanitizer-x86_64-linux :: TestCases/Posix/stack-use-after-return.cc' FAILED ****
Script:
/mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm_build_ninja/./bin/clang --driver-mode=g++ -fsanitize=address -mno-omit-leaf-frame-pointer -fno-omit-frame-pointer -fno-optimize-sibling-calls -gline-tables-only -m64  -O0 /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/projects/compiler-rt/test/asan/TestCases/Posix/stack-use-after-return.cc -pthread -o /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm_build_ninja/projects/compiler-rt/test/asan/X86_64LinuxConfig/TestCases/Posix/Output/stack-use-after-return.cc.tmp && env ASAN_OPTIONS=detect_stack_use_after_return=1 not  /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm_build_ninja/projects/compiler-rt/test/asan/X86_64LinuxConfig/TestCases/Posix/Output/stack-use-after-return.cc.tmp 2>&1 | FileCheck /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/projects/compiler-rt/test/asan/TestCases/Posix/stack-use-after-return.cc
/mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm_build_ninja/./bin/clang --driver-mode=g++ -fsanitize=address -mno-omit-leaf-frame-pointer -fno-omit-frame-pointer -fno-optimize-sibling-calls -gline-tables-only -m64  -O1 /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/projects/compiler-rt/test/asan/TestCases/Posix/stack-use-after-return.cc -pthread -o /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm_build_ninja/projects/compiler-rt/test/asan/X86_64LinuxConfig/TestCases/Posix/Output/stack-use-after-return.cc.tmp && env ASAN_OPTIONS=detect_stack_use_after_return=1 not  /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm_build_ninja/projects/compiler-rt/test/asan/X86_64LinuxConfig/TestCases/Posix/Output/stack-use-after-return.cc.tmp 2>&1 | FileCheck /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/projects/compiler-rt/test/asan/TestCases/Posix/stack-use-after-return.cc
/mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm_build_ninja/./bin/clang --driver-mode=g++ -fsanitize=address -mno-omit-leaf-frame-pointer -fno-omit-frame-pointer -fno-optimize-sibling-calls -gline-tables-only -m64  -O2 /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/projects/compiler-rt/test/asan/TestCases/Posix/stack-use-after-return.cc -pthread -o /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm_build_ninja/projects/compiler-rt/test/asan/X86_64LinuxConfig/TestCases/Posix/Output/stack-use-after-return.cc.tmp && env ASAN_OPTIONS=detect_stack_use_after_return=1 not  /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm_build_ninja/projects/compiler-rt/test/asan/X86_64LinuxConfig/TestCases/Posix/Output/stack-use-after-return.cc.tmp 2>&1 | FileCheck /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/projects/compiler-rt/test/asan/TestCases/Posix/stack-use-after-return.cc
/mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm_build_ninja/./bin/clang --driver-mode=g++ -fsanitize=address -mno-omit-leaf-frame-pointer -fno-omit-frame-pointer -fno-optimize-sibling-calls -gline-tables-only -m64  -O3 /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/projects/compiler-rt/test/asan/TestCases/Posix/stack-use-after-return.cc -pthread -o /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm_build_ninja/projects/compiler-rt/test/asan/X86_64LinuxConfig/TestCases/Posix/Output/stack-use-after-return.cc.tmp && env ASAN_OPTIONS=detect_stack_use_after_return=1 not  /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm_build_ninja/projects/compiler-rt/test/asan/X86_64LinuxConfig/TestCases/Posix/Output/stack-use-after-return.cc.tmp 2>&1 | FileCheck /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/projects/compiler-rt/test/asan/TestCases/Posix/stack-use-after-return.cc
env ASAN_OPTIONS=detect_stack_use_after_return=0  /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm_build_ninja/projects/compiler-rt/test/asan/X86_64LinuxConfig/TestCases/Posix/Output/stack-use-after-return.cc.tmp
/mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm_build_ninja/./bin/clang --driver-mode=g++ -fsanitize=address -mno-omit-leaf-frame-pointer -fno-omit-frame-pointer -fno-optimize-sibling-calls -gline-tables-only -m64  -O3 /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/projects/compiler-rt/test/asan/TestCases/Posix/stack-use-after-return.cc -pthread -o /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm_build_ninja/projects/compiler-rt/test/asan/X86_64LinuxConfig/TestCases/Posix/Output/stack-use-after-return.cc.tmp -DkSize=10000 -DUseThread -DkStackSize=65536 && env ASAN_OPTIONS=detect_stack_use_after_return=1 not  /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm_build_ninja/projects/compiler-rt/test/asan/X86_64LinuxConfig/TestCases/Posix/Output/stack-use-after-return.cc.tmp 2>&1 | FileCheck --check-prefix=THREAD /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/projects/compiler-rt/test/asan/TestCases/Posix/stack-use-after-return.cc
/mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm_build_ninja/./bin/clang --driver-mode=g++ -fsanitize=address -mno-omit-leaf-frame-pointer -fno-omit-frame-pointer -fno-optimize-sibling-calls -gline-tables-only -m64  -DUseThread -O2 /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/projects/compiler-rt/test/asan/TestCases/Posix/stack-use-after-return.cc -pthread -o /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm_build_ninja/projects/compiler-rt/test/asan/X86_64LinuxConfig/TestCases/Posix/Output/stack-use-after-return.cc.tmp && env ASAN_OPTIONS=detect_stack_use_after_return=1 not  /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm_build_ninja/projects/compiler-rt/test/asan/X86_64LinuxConfig/TestCases/Posix/Output/stack-use-after-return.cc.tmp 2>&1 | FileCheck --check-prefix=THREAD /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/projects/compiler-rt/test/asan/TestCases/Posix/stack-use-after-return.cc
env ASAN_OPTIONS=detect_stack_use_after_return=1:max_uar_stack_size_log=20:verbosity=1 not  /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm_build_ninja/projects/compiler-rt/test/asan/X86_64LinuxConfig/TestCases/Posix/Output/stack-use-after-return.cc.tmp 2>&1 | FileCheck --check-prefix=CHECK-20 /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/projects/compiler-rt/test/asan/TestCases/Posix/stack-use-after-return.cc
env ASAN_OPTIONS=detect_stack_use_after_return=1:min_uar_stack_size_log=24:max_uar_stack_size_log=24:verbosity=1 not /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm_build_ninja/projects/compiler-rt/test/asan/X86_64LinuxConfig/TestCases/Posix/Output/stack-use-after-return.cc.tmp 2>&1 | FileCheck --check-prefix=CHECK-24 /mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/projects/compiler-rt/test/asan/TestCases/Posix/stack-use-after-return.cc
Exit Code: 1
Command Output (stderr):
1: 0x7fff258045e0
2: 0x7fff258045e0
/mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/projects/compiler-rt/test/asan/TestCases/Posix/stack-use-after-return.cc:52:13: error: expected string not found in input
 // THREAD: WRITE of size 1 {{.*}} thread T{{[1-9]}}
^
<stdin>:1:1: note: scanning from here
1: 0x7f922b0cf820
^
<stdin>:5:24: note: possible intended match here
AddressSanitizer: nested bug in the same thread, aborting.
^
--
I believe the issue is that we're using too small of a stack in that test, and perhaps we're just pushed over the limit by this new AllocatorCache. If I change -DkStackSize=65536 to be a number >=75392, the test passes. Is this an acceptable solution? If so, what would be a good number to use?
Ok, that's even weirder. Why should this change the local stack overhead? Do we overflow the stack? Can you actually post the backtrace from the debugger? Why is there a nested ASan failure?
The stack trace from the debugger isn't particularly helpful:
 thread #2, name = 'stack-use-after', stop reason = signal SIGSEGV: address access protected (fault address: 0x7ffff7e72020)
  frame #0: stack-use-after-return.cc.tmp`Func1() at stack-use-after-return.cc:39
   36  	}
   37  	
   38  	__attribute__((noinline))
-> 39  	char *Func1() {
   40  	  char local[kSize];
   41  	  return Ident(local);
   42  	}
(lldb) bt all
  thread #1, name = 'stack-use-after'
    frame #0: libpthread.so.0`pthread_join(threadid=140737352574720, thread_return=0x0000000000000000) at pthread_join.c:92
    frame #1: stack-use-after-return.cc.tmp`main at stack-use-after-return.cc:74
    frame #2: libc.so.6`__libc_start_main(main=(stack-use-after-return.cc.tmp`main), argc=1, argv=0x00007fffffffdb78, init=<unavailable>, fini=<unavailable>, rtld_fini=<unavailable>, stack_end=0x00007fffffffdb68) at libc-start.c:287
    frame #3: 0x000000000041ba7c stack-use-after-return.cc.tmp`_start + 41
* thread #2, name = 'stack-use-after', stop reason = signal SIGSEGV: address access protected (fault address: 0x7ffff7e72020)
  * frame #0: stack-use-after-return.cc.tmp`Func1() at stack-use-after-return.cc:39
    frame #1: stack-use-after-return.cc.tmp`Thread(void*) at stack-use-after-return.cc:61
    frame #2: libpthread.so.0`start_thread(arg=0x00007ffff7e82700) at pthread_create.c:312
    frame #3: libc.so.6`__clone at clone.S:111The crash error also goes away if I add a CHECK(0); to GetAllocatorCache(), which is similarly strange.
I found something very odd and very relevant. I added a line in test-use-after-return.cc to call pthread_attr_getstacksize immediately after pthread_attr_setstacksize. On master, the returned stack size is 8388608. With my change, it's 65536. I assume this means that we're intercepting something inside call to pthread_attr_setstacksize. But regardless, this means that 65536 was not large enough to begin with, unless we expect the sanitizers to intercept that call and increase the stack size. I assume we don't, since there's no direct interceptor for pthread_attr_setstacksize.
My diff is:
int main(int argc, char **argv) {
   pthread_attr_init(&attr);
   if (kStackSize > 0)
     pthread_attr_setstacksize(&attr, kStackSize);
+  size_t stacksize = 0;
+  pthread_attr_getstacksize(&attr, &stacksize);
+  printf("stacksize: %lu\n", stacksize);
   pthread_t t;
   pthread_create(&t, &attr, Thread, 0);
   pthread_attr_destroy(&attr);Ok, then the test is just flaky. Let's simply double the stack size value in the test (as a separate commit).
Okay, this is getting ridiculous. I've disabled the test on AArch64 in r298267. The test was not testing the right thing on AArch64 anyway (it wasn't setting the stack size).
This patch is having a real rough time. @vitalybuka thinks it probably caused this failure, and I wouldn't be too surprised: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/1117/steps/test%20standalone%20compiler-rt/logs/stdio
It doesn't repro on my CentOS or my Ubuntu machine though...
Actually, looks like the buildbot isn't failing currently. I won't revert the patch for now.
I've tried everything I can think of in terms of a repro here - different gcc/clang versions, etc. I also looked up the zorg config and ran exactly the way it was run on the buildbot (standalone build + i386 test), and still no failure. Perhaps I can split this into two changes, to try to see where the failure is springing up.
My guess is that this is still caused simply by moving the GetAllocatorCache() to a different file, so I can try to commit the whole change minus that bit first, and make sure tests pass. If just moving the function definition to a different file causes a failure, it's very unlikely to me that the failure is legitimate and not a flake.
Given that the code review on this is fine, I'll go ahead with merging the partial commit (the bit that moves the definitions to the header and adds a function for GetAllocatorCache() in the same file), provided I don't hear any objection. I'll then upload another review for the second half.
Again, why do we need this change? We need to make sure not to break these platforms (mips, aarch64).