This patch adds functions for managing fibers:
- __tsan_get_current_fiber()
- __tsan_create_fiber()
- __tsan_destroy_fiber()
- __tsan_switch_to_fiber()
- __tsan_set_fiber_name()
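As a rough illustration of how these five entry points might fit together, here is a minimal sketch built on POSIX `ucontext`. The signatures are assumed to match the declarations in compiler-rt's `sanitizer/tsan_interface.h` and may differ from the revision under review. `SKETCH_TSAN`, `run_fiber_once`, `fiber_body` and the no-op stand-ins are hypothetical names introduced only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <ucontext.h>

/* Detect ThreadSanitizer under Clang (__has_feature) and GCC (__SANITIZE_THREAD__). */
#if defined(__has_feature)
#  if __has_feature(thread_sanitizer)
#    define SKETCH_TSAN 1
#  endif
#endif
#if defined(__SANITIZE_THREAD__) && !defined(SKETCH_TSAN)
#  define SKETCH_TSAN 1
#endif

#ifdef SKETCH_TSAN
/* Assumed signatures; check sanitizer/tsan_interface.h in your toolchain. */
void *__tsan_get_current_fiber(void);
void *__tsan_create_fiber(unsigned flags);
void __tsan_destroy_fiber(void *fiber);
void __tsan_switch_to_fiber(void *fiber, unsigned flags);
void __tsan_set_fiber_name(void *fiber, const char *name);
#else
/* No-op stand-ins so the sketch also builds without -fsanitize=thread. */
static void *__tsan_get_current_fiber(void) { return NULL; }
static void *__tsan_create_fiber(unsigned flags) { (void)flags; return NULL; }
static void __tsan_destroy_fiber(void *fiber) { (void)fiber; }
static void __tsan_switch_to_fiber(void *fiber, unsigned flags) { (void)fiber; (void)flags; }
static void __tsan_set_fiber_name(void *fiber, const char *name) { (void)fiber; (void)name; }
#endif

static ucontext_t main_ctx, fiber_ctx;
static void *main_fiber, *worker_fiber; /* opaque TSan fiber handles */
static char fiber_stack[64 * 1024];
static int counter;

static void fiber_body(void) {
  counter += 1;
  /* Announce the switch to TSan immediately before the real context switch. */
  __tsan_switch_to_fiber(main_fiber, 0);
  swapcontext(&fiber_ctx, &main_ctx);
}

/* Runs fiber_body once on its own stack and returns the counter value. */
int run_fiber_once(void) {
  counter = 0;
  main_fiber = __tsan_get_current_fiber();
  worker_fiber = __tsan_create_fiber(0);
  __tsan_set_fiber_name(worker_fiber, "worker");

  int rc = getcontext(&fiber_ctx);
  assert(rc == 0);
  fiber_ctx.uc_stack.ss_sp = fiber_stack;
  fiber_ctx.uc_stack.ss_size = sizeof fiber_stack;
  fiber_ctx.uc_link = &main_ctx;
  makecontext(&fiber_ctx, fiber_body, 0);

  __tsan_switch_to_fiber(worker_fiber, 0);
  swapcontext(&main_ctx, &fiber_ctx);

  /* Back on the original context; the worker is never resumed again. */
  __tsan_destroy_fiber(worker_fiber);
  return counter;
}
```

Calling `run_fiber_once()` from `main` performs one round trip into the fiber and back. The important detail is that each `__tsan_switch_to_fiber()` call sits directly in front of the matching `swapcontext()`.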
Differential D54889
Fiber support for thread sanitizer. Authored by yuri on Nov 26 2018, 2:07 AM.
Makes sense. If somebody wants to use longjmp to switch fibers across threads, we will need some additional support in tsan for that. Let's stick with Yuri's and your use cases for now.
I added default synchronization and a flag to opt out. It would be great if someone could check this version with QEMU.

The new version should be as fast as the original code.
IMHO it is hardly possible: function entry/exit uses the shadow stack, and the memory access functions use the clock. Both structures are too big to be copied on each fiber switch.
Can't agree with this. Only the fiber is visible to the user. Of course, it is the same as the thread until the user switches it.

In a standalone thread we create a temporary context and switch into it. Then the original thread context is posted to the event loop thread, where it runs for some time together with other fibers. At some point the code decides that it should migrate back to the original thread. The context is removed from the list of fibers and the original thread (running the temporary context) is signalled. The thread switches into its own context and destroys the temporary context. This trick significantly simplifies the implementation of bindings for a library that uses fibers internally, and it improves debuggability because stack traces are preserved after the switch.
Currently there is no interceptor for pthread_exit(). Do you suggest adding one?
Isn't it enough to place __tsan_ignore_thread_begin()/__tsan_ignore_thread_end() around the setjmp/longjmp calls?
Whatever we call them, Processor is for different things. It must not hold anything related to user state.
Oh, I see. I guess it can have some rough edges if used incorrectly, but also probably can work if used carefully.
Yes.
These only ignore memory accesses; they don't affect interceptor operation as far as I can see.

I don't completely follow the logic behind cur_thread/cur_thread_fast/cur_thread1 and how it does not introduce a slowdown.
Can you switch to it in your codebase? I would expect that you need all (or almost all) of this synchronization anyway?
Please add such a test (or we will break it in the future).
This is a good question.

FTR, here is current code:

00000000004b2c50 <__tsan_read2>:
  4b2c50: 48 b8 f8 ff ff ff ff    movabs $0xffff87fffffffff8,%rax
  4b2c57: 87 ff ff
  4b2c5a: 48 ba 00 00 00 00 00    movabs $0x40000000000,%rdx
  4b2c61: 04 00 00
  4b2c64: 53                      push %rbx
  4b2c65: 48 21 f8                and %rdi,%rax
  4b2c68: 48 8b 74 24 08          mov 0x8(%rsp),%rsi
  4b2c6d: 48 31 d0                xor %rdx,%rax
  4b2c70: 48 83 3c 85 00 00 00    cmpq $0xffffffffffffffff,0x0(,%rax,4)
  4b2c77: 00 ff
  4b2c79: 0f 84 9d 00 00 00       je 4b2d1c <__tsan_read2+0xcc>
  4b2c7f: 49 c7 c0 c0 04 fc ff    mov $0xfffffffffffc04c0,%r8
  4b2c86: 64 49 8b 08             mov %fs:(%r8),%rcx
  4b2c8a: 48 85 c9                test %rcx,%rcx
  4b2c8d: 0f 88 89 00 00 00       js 4b2d1c <__tsan_read2+0xcc>
  ...
  4b2cf6: 64 f3 41 0f 7e 50 08    movq %fs:0x8(%r8),%xmm2
  ...
  4b2d36: 64 49 89 10             mov %rdx,%fs:(%r8)

Here is new code:

00000000004b8460 <__tsan_read2>:
  4b8460: 48 b8 f8 ff ff ff ff    movabs $0xffff87fffffffff8,%rax
  4b8467: 87 ff ff
  4b846a: 48 ba 00 00 00 00 00    movabs $0x40000000000,%rdx
  4b8471: 04 00 00
  4b8474: 53                      push %rbx
  4b8475: 48 21 f8                and %rdi,%rax
  4b8478: 48 8b 74 24 08          mov 0x8(%rsp),%rsi
  4b847d: 48 31 d0                xor %rdx,%rax
  4b8480: 48 83 3c 85 00 00 00    cmpq $0xffffffffffffffff,0x0(,%rax,4)
  4b8487: 00 ff
  4b8489: 0f 84 9f 00 00 00       je 4b852e <__tsan_read2+0xce>
  4b848f: 48 c7 c2 f8 ff ff ff    mov $0xfffffffffffffff8,%rdx
  4b8496: 64 4c 8b 02             mov %fs:(%rdx),%r8
  4b849a: 49 8b 08                mov (%r8),%rcx
  4b849d: 48 85 c9                test %rcx,%rcx
  4b84a0: 0f 88 88 00 00 00       js 4b852e <__tsan_read2+0xce>
  ...
  4b8509: f3 41 0f 7e 50 08       movq 0x8(%r8),%xmm2
  ...
  4b8546: 49 89 10                mov %rdx,(%r8)

The additional indirection is "mov (%r8),%rcx".

https://android-review.googlesource.com/c/platform/external/qemu/+/844675
Latest version of patch doesn't work with QEMU anymore, at least with those annotations.
Error log:
qemu_coroutine_new:181:0x7f02938cf700:0x7f029388f7c0 Start new coroutine
#0 __tsan::TsanCheckFailed(char const*, int, char const*, unsigned long long, unsigned long long) /usr/local/google/home/lfy/aosp-llvm-toolchain/toolchain/llvm/projects/compiler-rt/lib/tsan/rtl/tsan_rtl_report.cc:48:25 (qemu-system-x86_64+0x536168)
#1 __sanitizer::CheckFailed(char const*, int, char const*, unsigned long long, unsigned long long) /usr/local/google/home/lfy/aosp-llvm-toolchain/toolchain/llvm/projects/compiler-rt/lib/sanitizer_common/sanitizer_termination.cc:79:5 (qemu-system-x86_64+0x4bcf4f)
#2 LongJmp(__tsan::ThreadState*, unsigned long*) /usr/local/google/home/lfy/aosp-llvm-toolchain/toolchain/llvm/projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:531:7 (qemu-system-x86_64+0x4d2311)
#3 siglongjmp /usr/local/google/home/lfy/aosp-llvm-toolchain/toolchain/llvm/projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:647:3 (qemu-system-x86_64+0x4d23ce)
#4 coroutine_trampoline /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../util/coroutine-ucontext.c:165:9 (qemu-system-x86_64+0xb72000)
#5 <null> <null> (libc.so.6+0x43fcf)

Do you know what I'm doing wrong here?
That worked, thanks! Sorry, in the meantime I had updated the patch with a refactoring that probably caused the break.

QEMU status: fewer false positives now. Many of the remaining warnings seem more real, mostly about non-atomic reads of variables that are updated atomically. There might be other issues as well.

WARNING: ThreadSanitizer: data race (pid=3742)
Atomic write of size 1 at 0x7b0c00051620 by thread T14 (mutexes: write M1097):
#0 __tsan_atomic8_exchange /usr/local/google/home/lfy/aosp-llvm-toolchain/toolchain/llvm/projects/compiler-rt/lib/tsan/rtl/tsan_interface_atomic.cc:579:3 (qemu-system-x86_64+0x51d0a8)
#1 qemu_bh_schedule /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../util/async.c:167:9 (qemu-system-x86_64+0xb4675e)
#2 worker_thread /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../util/thread-pool.c:114:9 (qemu-system-x86_64+0xb4838c)
#3 qemu_thread_trampoline /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../util/qemu-thread-posix.c:551:17 (qemu-system-x86_64+0xb4fe36)
Previous read of size 1 at 0x7b0c00051620 by thread T10 (mutexes: write M985): #0 aio_compute_timeout /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../util/async.c:198:17 (qemu-system-x86_64+0xb46868) #1 aio_poll /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../util/aio-posix.c:617:26 (qemu-system-x86_64+0xb4c38c) #2 blk_prw /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../block/block-backend.c:1247:9 (qemu-system-x86_64+0x98bbbd) #3 blk_pread /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../block/block-backend.c:1409:15 (qemu-system-x86_64+0x98b91a) #4 find_image_format /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../block.c:701:11 (qemu-system-x86_64+0x958068) #5 bdrv_open_inherit /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../block.c:2690 (qemu-system-x86_64+0x958068) #6 bdrv_open /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../block.c:2802:12 (qemu-system-x86_64+0x958d1f) #7 blk_new_open /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../block/block-backend.c:375:10 (qemu-system-x86_64+0x98927a) #8 blockdev_init /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../blockdev.c:598:15 (qemu-system-x86_64+0x9c5700) #9 drive_new /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../blockdev.c:1092 (qemu-system-x86_64+0x9c5700) #10 drive_init(void*, QemuOpts*, Error**) /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../android-qemu2-glue/drive-share.cpp:366:28 (qemu-system-x86_64+0xb1e9a2) #11 qemu_opts_foreach /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../util/qemu-option.c:1106:14 (qemu-system-x86_64+0xb696e5) #12 android_drive_share_init /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../android-qemu2-glue/drive-share.cpp:530:9 (qemu-system-x86_64+0xb1e1e0) #13 main_impl /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../vl.c:5244:13 (qemu-system-x86_64+0x54f811) #14 run_qemu_main 
/usr/local/google/home/lfy/emu2/master/external/qemu/objs/../vl.c:3349:21 (qemu-system-x86_64+0x547a02) #15 enter_qemu_main_loop(int, char**) /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../android-qemu2-glue/main.cpp:606:5 (qemu-system-x86_64+0x54427c) #16 MainLoopThread::run() /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../android/android-emu/android/skin/qt/emulator-qt-window.h:73:13 (qemu-system-x86_64+0xc40b41) #17 QThreadPrivate::start(void*) /usr/local/google/home/joshuaduong/qt-build/src/qt-everywhere-src-5.11.1/qtbase/src/corelib/thread/qthread_unix.cpp:367:14 (libQt5Core.so.5+0xa7d35) Location is heap block of size 40 at 0x7b0c00051600 allocated by thread T13: #0 malloc /usr/local/google/home/lfy/aosp-llvm-toolchain/toolchain/llvm/projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:667:5 (qemu-system-x86_64+0x4d246c) #1 g_malloc /tmp/jansene-build-temp-193494/src/glib-2.38.2/glib/gmem.c:104 (qemu-system-x86_64+0xecfdc0) #2 thread_pool_init_one /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../util/thread-pool.c:307:27 (qemu-system-x86_64+0xb47a63) #3 thread_pool_new /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../util/thread-pool.c:327 (qemu-system-x86_64+0xb47a63) #4 aio_get_thread_pool /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../util/async.c:320:28 (qemu-system-x86_64+0xb46934) #5 paio_submit_co /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../block/file-posix.c:1565:12 (qemu-system-x86_64+0xa15d23) #6 raw_co_prw /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../block/file-posix.c:1620 (qemu-system-x86_64+0xa15d23) #7 raw_co_preadv /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../block/file-posix.c:1627 (qemu-system-x86_64+0xa15d23) #8 bdrv_driver_preadv /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../block/io.c:924:16 (qemu-system-x86_64+0x8d91f7) #9 bdrv_aligned_preadv 
/usr/local/google/home/lfy/emu2/master/external/qemu/objs/../block/io.c:1228 (qemu-system-x86_64+0x8d91f7) #10 bdrv_co_preadv /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../block/io.c:1324:11 (qemu-system-x86_64+0x8d8dd7) #11 blk_co_preadv /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../block/block-backend.c:1158:11 (qemu-system-x86_64+0x98b0da) #12 blk_read_entry /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../block/block-backend.c:1206:17 (qemu-system-x86_64+0x98c1e3) #13 coroutine_trampoline /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../util/coroutine-ucontext.c:173:9 (qemu-system-x86_64+0xb71c8a) #14 <null> <null> (libc.so.6+0x43fcf) Mutex M1097 (0x7b3800009bd0) created at: #0 pthread_mutex_init /usr/local/google/home/lfy/aosp-llvm-toolchain/toolchain/llvm/projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:1187:3 (qemu-system-x86_64+0x4d4ebc) #1 qemu_mutex_init /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../util/qemu-thread-posix.c:61:11 (qemu-system-x86_64+0xb4f2a7) #2 thread_pool_init_one /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../util/thread-pool.c:308:5 (qemu-system-x86_64+0xb47a7b) #3 thread_pool_new /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../util/thread-pool.c:327 (qemu-system-x86_64+0xb47a7b) #4 aio_get_thread_pool /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../util/async.c:320:28 (qemu-system-x86_64+0xb46934) #5 paio_submit_co /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../block/file-posix.c:1565:12 (qemu-system-x86_64+0xa15d23) #6 raw_co_prw /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../block/file-posix.c:1620 (qemu-system-x86_64+0xa15d23) #7 raw_co_preadv /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../block/file-posix.c:1627 (qemu-system-x86_64+0xa15d23) #8 bdrv_driver_preadv /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../block/io.c:924:16 (qemu-system-x86_64+0x8d91f7) #9 
bdrv_aligned_preadv /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../block/io.c:1228 (qemu-system-x86_64+0x8d91f7) #10 bdrv_co_preadv /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../block/io.c:1324:11 (qemu-system-x86_64+0x8d8dd7) #11 blk_co_preadv /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../block/block-backend.c:1158:11 (qemu-system-x86_64+0x98b0da) #12 blk_read_entry /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../block/block-backend.c:1206:17 (qemu-system-x86_64+0x98c1e3) #13 coroutine_trampoline /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../util/coroutine-ucontext.c:173:9 (qemu-system-x86_64+0xb71c8a) #14 <null> <null> (libc.so.6+0x43fcf) Mutex M985 (0x000003bb0fb8) created at: #0 pthread_mutex_init /usr/local/google/home/lfy/aosp-llvm-toolchain/toolchain/llvm/projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:1187:3 (qemu-system-x86_64+0x4d4ebc) #1 qemu_mutex_init /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../util/qemu-thread-posix.c:61:11 (qemu-system-x86_64+0xb4f2a7) #2 qemu_init_cpu_loop /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../cpus.c:1123:5 (qemu-system-x86_64+0x5ddb0c) #3 main_impl /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../vl.c:3437:5 (qemu-system-x86_64+0x547ae0) #4 run_qemu_main /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../vl.c:3349:21 (qemu-system-x86_64+0x547a02) #5 enter_qemu_main_loop(int, char**) /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../android-qemu2-glue/main.cpp:606:5 (qemu-system-x86_64+0x54427c) #6 MainLoopThread::run() /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../android/android-emu/android/skin/qt/emulator-qt-window.h:73:13 (qemu-system-x86_64+0xc40b41) #7 QThreadPrivate::start(void*) /usr/local/google/home/joshuaduong/qt-build/src/qt-everywhere-src-5.11.1/qtbase/src/corelib/thread/qthread_unix.cpp:367:14 (libQt5Core.so.5+0xa7d35) Thread T14 (tid=3886, running) 
created by thread T10 at: #0 pthread_create /usr/local/google/home/lfy/aosp-llvm-toolchain/toolchain/llvm/projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:968:3 (qemu-system-x86_64+0x4d3cda) #1 qemu_thread_create /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../util/qemu-thread-posix.c:591:11 (qemu-system-x86_64+0xb4fcef) #2 do_spawn_thread /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../util/thread-pool.c:135:5 (qemu-system-x86_64+0xb48072) #3 spawn_thread_bh_fn /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../util/thread-pool.c:143 (qemu-system-x86_64+0xb48072) #4 aio_bh_call /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../util/async.c:90:5 (qemu-system-x86_64+0xb46616) #5 aio_bh_poll /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../util/async.c:118 (qemu-system-x86_64+0xb46616) #6 aio_poll /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../util/aio-posix.c:706:17 (qemu-system-x86_64+0xb4cf45) #7 blk_prw /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../block/block-backend.c:1247:9 (qemu-system-x86_64+0x98bbbd) #8 blk_pread /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../block/block-backend.c:1409:15 (qemu-system-x86_64+0x98b91a) #9 find_image_format /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../block.c:701:11 (qemu-system-x86_64+0x958068) #10 bdrv_open_inherit /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../block.c:2690 (qemu-system-x86_64+0x958068) #11 bdrv_open /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../block.c:2802:12 (qemu-system-x86_64+0x958d1f) #12 blk_new_open /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../block/block-backend.c:375:10 (qemu-system-x86_64+0x98927a) #13 blockdev_init /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../blockdev.c:598:15 (qemu-system-x86_64+0x9c5700) #14 drive_new /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../blockdev.c:1092 
(qemu-system-x86_64+0x9c5700) #15 drive_init(void*, QemuOpts*, Error**) /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../android-qemu2-glue/drive-share.cpp:366:28 (qemu-system-x86_64+0xb1e9a2) #16 qemu_opts_foreach /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../util/qemu-option.c:1106:14 (qemu-system-x86_64+0xb696e5) #17 android_drive_share_init /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../android-qemu2-glue/drive-share.cpp:530:9 (qemu-system-x86_64+0xb1e1e0) #18 main_impl /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../vl.c:5244:13 (qemu-system-x86_64+0x54f811) #19 run_qemu_main /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../vl.c:3349:21 (qemu-system-x86_64+0x547a02) #20 enter_qemu_main_loop(int, char**) /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../android-qemu2-glue/main.cpp:606:5 (qemu-system-x86_64+0x54427c) #21 MainLoopThread::run() /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../android/android-emu/android/skin/qt/emulator-qt-window.h:73:13 (qemu-system-x86_64+0xc40b41) #22 QThreadPrivate::start(void*) /usr/local/google/home/joshuaduong/qt-build/src/qt-everywhere-src-5.11.1/qtbase/src/corelib/thread/qthread_unix.cpp:367:14 (libQt5Core.so.5+0xa7d35) Thread T10 'MainLoopThread' (tid=3881, running) created by main thread at: #0 pthread_create /usr/local/google/home/lfy/aosp-llvm-toolchain/toolchain/llvm/projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:968:3 (qemu-system-x86_64+0x4d3cda) #1 QThread::start(QThread::Priority) /usr/local/google/home/joshuaduong/qt-build/src/qt-everywhere-src-5.11.1/qtbase/src/corelib/thread/qthread_unix.cpp:726:16 (libQt5Core.so.5+0xa84fb) #2 skin_winsys_spawn_thread /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../android/android-emu/android/skin/qt/winsys-qt.cpp:519:17 (qemu-system-x86_64+0xbcf68b) #3 main /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../android-qemu2-glue/main.cpp:1624:5 
(qemu-system-x86_64+0x543f9c) Thread T13 (tid=0, running) created by thread T10 at: #0 on_new_fiber /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../util/coroutine-ucontext.c:90:25 (qemu-system-x86_64+0xb71b48) #1 qemu_coroutine_new /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../util/coroutine-ucontext.c:217 (qemu-system-x86_64+0xb71b48) #2 qemu_coroutine_create /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../util/qemu-coroutine.c:88:14 (qemu-system-x86_64+0xb70349) #3 blk_prw /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../block/block-backend.c:1245:25 (qemu-system-x86_64+0x98baca) #4 blk_pread /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../block/block-backend.c:1409:15 (qemu-system-x86_64+0x98b91a) #5 find_image_format /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../block.c:701:11 (qemu-system-x86_64+0x958068) #6 bdrv_open_inherit /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../block.c:2690 (qemu-system-x86_64+0x958068) #7 bdrv_open /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../block.c:2802:12 (qemu-system-x86_64+0x958d1f) #8 blk_new_open /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../block/block-backend.c:375:10 (qemu-system-x86_64+0x98927a) #9 blockdev_init /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../blockdev.c:598:15 (qemu-system-x86_64+0x9c5700) #10 drive_new /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../blockdev.c:1092 (qemu-system-x86_64+0x9c5700) #11 drive_init(void*, QemuOpts*, Error**) /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../android-qemu2-glue/drive-share.cpp:366:28 (qemu-system-x86_64+0xb1e9a2) #12 qemu_opts_foreach /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../util/qemu-option.c:1106:14 (qemu-system-x86_64+0xb696e5) #13 android_drive_share_init /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../android-qemu2-glue/drive-share.cpp:530:9 
(qemu-system-x86_64+0xb1e1e0)
#14 main_impl /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../vl.c:5244:13 (qemu-system-x86_64+0x54f811)
#15 run_qemu_main /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../vl.c:3349:21 (qemu-system-x86_64+0x547a02)
#16 enter_qemu_main_loop(int, char**) /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../android-qemu2-glue/main.cpp:606:5 (qemu-system-x86_64+0x54427c)
#17 MainLoopThread::run() /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../android/android-emu/android/skin/qt/emulator-qt-window.h:73:13 (qemu-system-x86_64+0xc40b41)
#18 QThreadPrivate::start(void*) /usr/local/google/home/joshuaduong/qt-build/src/qt-everywhere-src-5.11.1/qtbase/src/corelib/thread/qthread_unix.cpp:367:14 (libQt5Core.so.5+0xa7d35)
SUMMARY: ThreadSanitizer: data race /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../util/async.c:167:9 in qemu_bh_schedule

WARNING: ThreadSanitizer: data race (pid=3742)
Read of size 4 at 0x7b4400025758 by thread T14 (mutexes: write M1097):
#0 aio_notify /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../util/async.c:342:14 (qemu-system-x86_64+0xb46778)
#1 qemu_bh_schedule /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../util/async.c:168 (qemu-system-x86_64+0xb46778)
#2 worker_thread /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../util/thread-pool.c:114:9 (qemu-system-x86_64+0xb4838c)
#3 qemu_thread_trampoline /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../util/qemu-thread-posix.c:551:17 (qemu-system-x86_64+0xb4fe36)
Previous atomic write of size 4 at 0x7b4400025758 by thread T10 (mutexes: write M985): #0 __tsan_atomic32_fetch_add /usr/local/google/home/lfy/aosp-llvm-toolchain/toolchain/llvm/projects/compiler-rt/lib/tsan/rtl/tsan_interface_atomic.cc:616:3 (qemu-system-x86_64+0x51df18) #1 aio_poll /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../util/aio-posix.c:608:9 (qemu-system-x86_64+0xb4c342) #2 blk_prw /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../block/block-backend.c:1247:9 (qemu-system-x86_64+0x98bbbd) #3 blk_pread /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../block/block-backend.c:1409:15 (qemu-system-x86_64+0x98b91a) #4 find_image_format /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../block.c:701:11 (qemu-system-x86_64+0x958068) #5 bdrv_open_inherit /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../block.c:2690 (qemu-system-x86_64+0x958068) #6 bdrv_open /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../block.c:2802:12 (qemu-system-x86_64+0x958d1f) #7 blk_new_open /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../block/block-backend.c:375:10 (qemu-system-x86_64+0x98927a) #8 blockdev_init /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../blockdev.c:598:15 (qemu-system-x86_64+0x9c5700) #9 drive_new /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../blockdev.c:1092 (qemu-system-x86_64+0x9c5700) #10 drive_init(void*, QemuOpts*, Error**) /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../android-qemu2-glue/drive-share.cpp:366:28 (qemu-system-x86_64+0xb1e9a2) #11 qemu_opts_foreach /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../util/qemu-option.c:1106:14 (qemu-system-x86_64+0xb696e5) #12 android_drive_share_init /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../android-qemu2-glue/drive-share.cpp:530:9 (qemu-system-x86_64+0xb1e1e0) #13 main_impl /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../vl.c:5244:13 
(qemu-system-x86_64+0x54f811) #14 run_qemu_main /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../vl.c:3349:21 (qemu-system-x86_64+0x547a02) #15 enter_qemu_main_loop(int, char**) /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../android-qemu2-glue/main.cpp:606:5 (qemu-system-x86_64+0x54427c) #16 MainLoopThread::run() /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../android/android-emu/android/skin/qt/emulator-qt-window.h:73:13 (qemu-system-x86_64+0xc40b41) #17 QThreadPrivate::start(void*) /usr/local/google/home/joshuaduong/qt-build/src/qt-everywhere-src-5.11.1/qtbase/src/corelib/thread/qthread_unix.cpp:367:14 (libQt5Core.so.5+0xa7d35) Location is heap block of size 296 at 0x7b44000256c0 allocated by thread T10: #0 calloc /usr/local/google/home/lfy/aosp-llvm-toolchain/toolchain/llvm/projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:684:5 (qemu-system-x86_64+0x4d26ef) #1 g_malloc0 /tmp/jansene-build-temp-193494/src/glib-2.38.2/glib/gmem.c:134 (qemu-system-x86_64+0xecfe18) #2 qemu_init_main_loop /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../util/main-loop.c:165:24 (qemu-system-x86_64+0xb4abac) #3 main_impl /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../vl.c:4605:9 (qemu-system-x86_64+0x54b937) #4 run_qemu_main /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../vl.c:3349:21 (qemu-system-x86_64+0x547a02) #5 enter_qemu_main_loop(int, char**) /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../android-qemu2-glue/main.cpp:606:5 (qemu-system-x86_64+0x54427c) #6 MainLoopThread::run() /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../android/android-emu/android/skin/qt/emulator-qt-window.h:73:13 (qemu-system-x86_64+0xc40b41) #7 QThreadPrivate::start(void*) /usr/local/google/home/joshuaduong/qt-build/src/qt-everywhere-src-5.11.1/qtbase/src/corelib/thread/qthread_unix.cpp:367:14 (libQt5Core.so.5+0xa7d35) Mutex M1097 (0x7b3800009bd0) created at: #0 pthread_mutex_init 
/usr/local/google/home/lfy/aosp-llvm-toolchain/toolchain/llvm/projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:1187:3 (qemu-system-x86_64+0x4d4ebc) #1 qemu_mutex_init /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../util/qemu-thread-posix.c:61:11 (qemu-system-x86_64+0xb4f2a7) #2 thread_pool_init_one /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../util/thread-pool.c:308:5 (qemu-system-x86_64+0xb47a7b) #3 thread_pool_new /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../util/thread-pool.c:327 (qemu-system-x86_64+0xb47a7b) #4 aio_get_thread_pool /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../util/async.c:320:28 (qemu-system-x86_64+0xb46934) #5 paio_submit_co /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../block/file-posix.c:1565:12 (qemu-system-x86_64+0xa15d23) #6 raw_co_prw /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../block/file-posix.c:1620 (qemu-system-x86_64+0xa15d23) #7 raw_co_preadv /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../block/file-posix.c:1627 (qemu-system-x86_64+0xa15d23) #8 bdrv_driver_preadv /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../block/io.c:924:16 (qemu-system-x86_64+0x8d91f7) #9 bdrv_aligned_preadv /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../block/io.c:1228 (qemu-system-x86_64+0x8d91f7) #10 bdrv_co_preadv /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../block/io.c:1324:11 (qemu-system-x86_64+0x8d8dd7) #11 blk_co_preadv /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../block/block-backend.c:1158:11 (qemu-system-x86_64+0x98b0da) #12 blk_read_entry /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../block/block-backend.c:1206:17 (qemu-system-x86_64+0x98c1e3) #13 coroutine_trampoline /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../util/coroutine-ucontext.c:173:9 (qemu-system-x86_64+0xb71c8a) #14 <null> <null> (libc.so.6+0x43fcf) Mutex M985 (0x000003bb0fb8) created at: #0 
pthread_mutex_init /usr/local/google/home/lfy/aosp-llvm-toolchain/toolchain/llvm/projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:1187:3 (qemu-system-x86_64+0x4d4ebc)
    #1 qemu_mutex_init /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../util/qemu-thread-posix.c:61:11 (qemu-system-x86_64+0xb4f2a7)
    #2 qemu_init_cpu_loop /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../cpus.c:1123:5 (qemu-system-x86_64+0x5ddb0c)
    #3 main_impl /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../vl.c:3437:5 (qemu-system-x86_64+0x547ae0)
    #4 run_qemu_main /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../vl.c:3349:21 (qemu-system-x86_64+0x547a02)
    #5 enter_qemu_main_loop(int, char**) /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../android-qemu2-glue/main.cpp:606:5 (qemu-system-x86_64+0x54427c)
    #6 MainLoopThread::run() /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../android/android-emu/android/skin/qt/emulator-qt-window.h:73:13 (qemu-system-x86_64+0xc40b41)
    #7 QThreadPrivate::start(void*) /usr/local/google/home/joshuaduong/qt-build/src/qt-everywhere-src-5.11.1/qtbase/src/corelib/thread/qthread_unix.cpp:367:14 (libQt5Core.so.5+0xa7d35)

  Thread T14 (tid=3886, running) created by thread T10 at:
    #0 pthread_create /usr/local/google/home/lfy/aosp-llvm-toolchain/toolchain/llvm/projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:968:3 (qemu-system-x86_64+0x4d3cda)
    #1 qemu_thread_create /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../util/qemu-thread-posix.c:591:11 (qemu-system-x86_64+0xb4fcef)
    #2 do_spawn_thread /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../util/thread-pool.c:135:5 (qemu-system-x86_64+0xb48072)
    #3 spawn_thread_bh_fn /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../util/thread-pool.c:143 (qemu-system-x86_64+0xb48072)
    #4 aio_bh_call /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../util/async.c:90:5 (qemu-system-x86_64+0xb46616)
    #5 aio_bh_poll /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../util/async.c:118 (qemu-system-x86_64+0xb46616)
    #6 aio_poll /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../util/aio-posix.c:706:17 (qemu-system-x86_64+0xb4cf45)
    #7 blk_prw /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../block/block-backend.c:1247:9 (qemu-system-x86_64+0x98bbbd)
    #8 blk_pread /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../block/block-backend.c:1409:15 (qemu-system-x86_64+0x98b91a)
    #9 find_image_format /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../block.c:701:11 (qemu-system-x86_64+0x958068)
    #10 bdrv_open_inherit /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../block.c:2690 (qemu-system-x86_64+0x958068)
    #11 bdrv_open /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../block.c:2802:12 (qemu-system-x86_64+0x958d1f)
    #12 blk_new_open /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../block/block-backend.c:375:10 (qemu-system-x86_64+0x98927a)
    #13 blockdev_init /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../blockdev.c:598:15 (qemu-system-x86_64+0x9c5700)
    #14 drive_new /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../blockdev.c:1092 (qemu-system-x86_64+0x9c5700)
    #15 drive_init(void*, QemuOpts*, Error**) /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../android-qemu2-glue/drive-share.cpp:366:28 (qemu-system-x86_64+0xb1e9a2)
    #16 qemu_opts_foreach /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../util/qemu-option.c:1106:14 (qemu-system-x86_64+0xb696e5)
    #17 android_drive_share_init /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../android-qemu2-glue/drive-share.cpp:530:9 (qemu-system-x86_64+0xb1e1e0)
    #18 main_impl /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../vl.c:5244:13 (qemu-system-x86_64+0x54f811)
    #19 run_qemu_main /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../vl.c:3349:21 (qemu-system-x86_64+0x547a02)
    #20 enter_qemu_main_loop(int, char**) /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../android-qemu2-glue/main.cpp:606:5 (qemu-system-x86_64+0x54427c)
    #21 MainLoopThread::run() /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../android/android-emu/android/skin/qt/emulator-qt-window.h:73:13 (qemu-system-x86_64+0xc40b41)
    #22 QThreadPrivate::start(void*) /usr/local/google/home/joshuaduong/qt-build/src/qt-everywhere-src-5.11.1/qtbase/src/corelib/thread/qthread_unix.cpp:367:14 (libQt5Core.so.5+0xa7d35)

  Thread T10 'MainLoopThread' (tid=3881, running) created by main thread at:
    #0 pthread_create /usr/local/google/home/lfy/aosp-llvm-toolchain/toolchain/llvm/projects/compiler-rt/lib/tsan/rtl/tsan_interceptors.cc:968:3 (qemu-system-x86_64+0x4d3cda)
    #1 QThread::start(QThread::Priority) /usr/local/google/home/joshuaduong/qt-build/src/qt-everywhere-src-5.11.1/qtbase/src/corelib/thread/qthread_unix.cpp:726:16 (libQt5Core.so.5+0xa84fb)
    #2 skin_winsys_spawn_thread /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../android/android-emu/android/skin/qt/winsys-qt.cpp:519:17 (qemu-system-x86_64+0xbcf68b)
    #3 main /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../android-qemu2-glue/main.cpp:1624:5 (qemu-system-x86_64+0x543f9c)

SUMMARY: ThreadSanitizer: data race /usr/local/google/home/lfy/emu2/master/external/qemu/objs/../util/async.c:342:14 in aio_notify
Comment Actions I assume that after a program or a new thread starts, the sanitizer will first see a few (or maybe zero) interceptors, then _func_entry(), and then everything else. Following this logic, I use cur_thread_fast() in all performance-critical entry points except _func_entry(). I ran the tsan benchmarks, and it looks like the memory indirection by itself does not affect performance, because the variable is in the CPU cache in most cases. The conditional check, on the other hand, has a visible effect on performance.
Comment Actions Unfortunately, I can't check it. The current implementation of our codebase works fine with synchronization on fiber switch. I plan to implement another mode in which fibers are not synchronized by default, but only when they call special synchronization APIs (events, mutexes, etc.). That way I can catch more errors in code where fibers run in the same thread only by chance. To implement it, I need some way to opt out of synchronization in tsan.
Comment Actions Sorry, I meant switching to the built-in synchronization to validate that it works in real projects and either removes the need for manual synchronization annotations or at least significantly reduces it.
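To make the two modes discussed here concrete, the following is a minimal sketch of how a fiber switch might be annotated with and without synchronization. It assumes the fiber interface as declared in `sanitizer/tsan_interface.h` (`__tsan_create_fiber`, `__tsan_switch_to_fiber`, and the `__tsan_switch_to_fiber_no_sync` flag). The `Annot*` wrappers, `RunFiberOnce`, the `SKETCH_TSAN` macro, and the ucontext-based switch are hypothetical scaffolding, not code from this patch; outside a tsan build the annotations compile to no-ops.

```cpp
#include <cassert>
#include <ucontext.h>
#include <vector>

// Detect a ThreadSanitizer build (gcc defines a macro, clang uses __has_feature).
#if defined(__SANITIZE_THREAD__)
#define SKETCH_TSAN 1
#elif defined(__has_feature)
#if __has_feature(thread_sanitizer)
#define SKETCH_TSAN 1
#endif
#endif

#ifdef SKETCH_TSAN
#include <sanitizer/tsan_interface.h>
#endif

// Hypothetical wrappers: real annotations under tsan, no-ops otherwise.
static void *AnnotCurrentFiber() {
#ifdef SKETCH_TSAN
  return __tsan_get_current_fiber();
#else
  return nullptr;
#endif
}

static void *AnnotCreateFiber() {
#ifdef SKETCH_TSAN
  return __tsan_create_fiber(0);
#else
  return nullptr;
#endif
}

static void AnnotDestroyFiber(void *fiber) {
#ifdef SKETCH_TSAN
  __tsan_destroy_fiber(fiber);
#else
  (void)fiber;
#endif
}

// sync == true: the switch establishes a happens-before edge between fibers.
// sync == false: opt out, so only explicit synchronization orders the fibers.
static void AnnotSwitch(void *fiber, bool sync) {
#ifdef SKETCH_TSAN
  __tsan_switch_to_fiber(fiber, sync ? 0 : __tsan_switch_to_fiber_no_sync);
#else
  (void)fiber;
  (void)sync;
#endif
}

static ucontext_t g_main_ctx;  // context of the code that started the fiber
static void *g_main_fiber;     // tsan handle of that context
static bool g_sync;
static int g_steps;

static void FiberBody() {
  ++g_steps;
  // Annotate immediately before control goes back to the main context.
  AnnotSwitch(g_main_fiber, g_sync);
  // Returning resumes g_main_ctx through uc_link.
}

// Runs FiberBody once on its own stack and returns how many steps it executed.
int RunFiberOnce(bool sync) {
  std::vector<char> stack(256 * 1024);
  ucontext_t fiber_ctx;
  getcontext(&fiber_ctx);
  fiber_ctx.uc_stack.ss_sp = stack.data();
  fiber_ctx.uc_stack.ss_size = stack.size();
  fiber_ctx.uc_link = &g_main_ctx;
  makecontext(&fiber_ctx, FiberBody, 0);

  g_sync = sync;
  g_steps = 0;
  g_main_fiber = AnnotCurrentFiber();
  void *fiber = AnnotCreateFiber();

  AnnotSwitch(fiber, sync);  // annotate first, then perform the real switch
  swapcontext(&g_main_ctx, &fiber_ctx);

  AnnotDestroyFiber(fiber);
  return g_steps;
}
```

Calling `RunFiberOnce(true)` models the default behaviour discussed above, while `RunFiberOnce(false)` models the opt-out mode, where a race between fibers that merely happen to share a thread would still be reportable.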
Comment Actions Re performance, we have this open performance regression in clang codegen which makes it harder to do analysis:
Comment Actions I've benchmarked on 350140 with host gcc version 7.3.0 (Debian 7.3.0-5), alternating runs of the old and new binaries:
int main() {
  const int kSize = 2<<10;
  const int kRepeat = 1<<19;
  volatile long data[kSize];
  for (int i = 0; i < kRepeat; i++) {
    for (int j = 0; j < kSize; j++)
      data[j] = 1;
    __atomic_load_n(&data[0], __ATOMIC_ACQUIRE);
    __atomic_store_n(&data[0], 1, __ATOMIC_RELEASE);
  }
}
compiler-rt$ TIME="%e" nice -20 time taskset -c 0 ./current.test
compiler-rt$ TIME="%e" nice -20 time taskset -c 0 ./fiber.test
This looks like a 14% degradation.
Comment Actions I improved test execution time. On my system I got the following execution times (compared to the original version of the code):
Comment Actions "current" is compiler-rt HEAD with no local changes, or with the previous version of this change? This makes clang-compiled runtime 12% faster?
Comment Actions What exactly has changed wrt performance? I see some small tweaks, but I am not sure if they are the main reason behind the speedup or I am missing something important.
Comment Actions "current" is compiler-rt HEAD without any changes. In the case of the clang-compiled library, even the previous version of the patch was faster than HEAD. It looks like the additional indirection of ThreadState by itself introduces minimal overhead, but affects code generation in unpredictable ways, especially for clang.
Yes, I can see a slowdown of around 3.5% with the previous version of the patch and a gcc-compiled library.
You may consider this as "cheating" because the changes are general and not related to fibers. Most of the speedup comes from this change in tsan_rtl.cc:
- if (!SANITIZER_GO && *shadow_mem == kShadowRodata) {
+ if (!SANITIZER_GO && !kAccessIsWrite && *shadow_mem == kShadowRodata) {
Placing LIKELY/UNLIKELY in the code gives an additional 1-2%.
Comment Actions Benchmark results with clang: write8: read8: read4: fibers(new) is the current version of the change. fibers(old) is the one I used with gcc (no spot optimizations). So this change indeed makes it faster for clang, but the fastest clang version is still slower than gcc, which suggests that there is something to improve in clang codegen. If we improve clang codegen, will this change again lead to slower code or not? It's bad that we have that open clang regression...
Comment Actions I have found out which optimization is applied by gcc but not by clang and did it manually. Execution times for the new version:
Comment Actions Hi Dmitry, Comment Actions
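As an illustration of the spot optimization quoted a few comments above (adding `!kAccessIsWrite` in front of the `kShadowRodata` comparison, plus branch hints), here is a minimal standalone sketch. It is not the runtime's code: `NeedsRaceCheck`, `kSketchShadowRodata`, `SKETCH_UNLIKELY`, and the two sample cells are hypothetical names, and the marker value is an arbitrary assumption. The point is only that a compile-time-constant operand lets the compiler drop the shadow load and compare entirely for the write instantiation.

```cpp
#include <cassert>
#include <cstdint>

// Branch hint in the spirit of the LIKELY/UNLIKELY macros mentioned in the thread.
#define SKETCH_UNLIKELY(x) __builtin_expect(!!(x), 0)

// Hypothetical stand-in for the shadow value that marks read-only data.
constexpr uint64_t kSketchShadowRodata = ~0ull;

// Sample shadow cells used to exercise the function.
static const uint64_t g_rodata_cell = kSketchShadowRodata;
static const uint64_t g_plain_cell = 0;

// Returns false when the access can be skipped because it reads .rodata.
// For kAccessIsWrite == true the left operand of && is a compile-time false,
// so neither the load of *shadow_mem nor the comparison needs to be emitted.
template <bool kAccessIsWrite>
bool NeedsRaceCheck(const uint64_t *shadow_mem) {
  if (!kAccessIsWrite && SKETCH_UNLIKELY(*shadow_mem == kSketchShadowRodata))
    return false;
  return true;
}
```

For example, `NeedsRaceCheck<false>(&g_rodata_cell)` takes the early exit, while `NeedsRaceCheck<true>(&g_rodata_cell)` always proceeds to the full check, mirroring the behaviour of the one-line change for reads and writes respectively.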
Just me finding time to review it.
Comment Actions Spent a while benchmarking this. I used 4 variations of a test program: I used 3 variations of the runtime: current tsan runtime without modifications, this change, this change with cur_thread/cur_thread_fast returning reinterpret_cast<ThreadState *>(cur_thread_placeholder). In all cases I used a self-hosted clang build on revision 353000. Fibers seem to incur ~4% slowdown on average.
Comment Actions Since fiber support incurs a slowdown for all current tsan users who don't use fibers, this is a hard decision. I've prototyped a change which leaves fast_state and fast_synch_epoch in TLS (the only state accessed on the fast path besides the clock):
struct FastThreadState {
  FastState fast_state;
  u64 fast_synch_epoch;
};
__attribute__((tls_model("initial-exec")))
extern THREADLOCAL char cur_thread_faststate1[];
INLINE FastThreadState& cur_thread_faststate() {
  return *reinterpret_cast<FastThreadState *>(cur_thread_faststate1);
}
But this seems to be even slower than just using the single pointer indirection.
Comment Actions Also benchmarked function entry/exit using the following benchmark:
// foo1.c
volatile int kRepeat = 1 << 30;
const int repeat = kRepeat;
for (int i = 0; i < repeat; i++)
  foo(false);
}
// foo2.c
if (x)
  bar();
}
The program spends ~75% of its time in __tsan_func_entry/exit. The rest of the conditions are the same as in the previous benchmark. Current code runs in 7.16s. Using the pointer indirection seems to positively affect func entry/exit codegen.
Comment Actions But if I do:
INLINE ThreadState *cur_thread_fast() {
  ThreadState* thr;
  __asm__("": "=a"(thr): "a"(&cur_thread_placeholder[0]));
  return thr;
}
(which is a dirty trick to force the compiler to cache the address of the tls object in a register) then the program runs in 5.94s -- faster than any other option, as it takes advantage of both no indirection and faster instructions. But this is not beneficial for __tsan_read/write functions because caching the address takes a register and these functions are already severely short on registers.
Comment Actions Going forward I think we should get in all unrelated/preparatory changes first: thread type (creates lots of diffs), pthread_exit interceptor/test and spot optimizations to memory access functions.
Comment Actions This looked like an interesting optimization, but it turns out to be too sensitive to unrelated code changes.
Comment Actions Did another round of benchmarking of this change on the current HEAD using these 2 benchmarks: Here fibers is this change, and fibers* is this change with 2 additional changes:
Now ~2% slowdown on highly synthetic benchmarks looks like something we can tolerate (2 cases are actually faster). The cur_thread/cur_thread_fast separation still looks confusing to me. It's a convoluted way to do lazy initialization. If one adds any new calls to these functions, which one to choose is non-obvious. I think we should do lazy initialization explicitly. Namely, leave cur_thread alone, don't introduce cur_thread_fast, don't change any call sites. Instead, add an init_cur_thread call that does lazy initialization to the interceptor entry point and any other points that we expect can be the first call into the tsan runtime overall or within a new thread. I think interceptors and tsan_init should be enough (no tsan_func_entry). We call __tsan_init from .preinit_array, instrumented code can't be executed before .preinit_array, only interceptors from the dynamic loader can precede .preinit_array callbacks. With these 3 changes, it looks good to me and I am ready to merge it.
Comment Actions To clarify the graph: it's the difference in execution time in percent as compared to the current HEAD. I.e. -4 means that fibers are 4% slower than the current HEAD.
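The explicit lazy-initialization scheme proposed above can be sketched as follows. This is a simplified model rather than the runtime's implementation: `SketchThreadState`, `tls_storage`, `tls_cur`, `SketchInterceptor`, and `SketchMemoryAccess` are hypothetical, and `cur_thread`/`cur_thread_init` here are toy stand-ins that only borrow the names used in the discussion. The hot path stays a plain pointer load with no branch, and only entry points that may be the first call into the runtime on a thread pay for the check.

```cpp
#include <cassert>

// Hypothetical, heavily simplified per-thread state.
struct SketchThreadState {
  int events = 0;
};

static thread_local SketchThreadState tls_storage;         // the thread's own state
static thread_local SketchThreadState *tls_cur = nullptr;  // pointer a fiber switch would retarget

// Hot path: a single load, no "is it initialized yet?" branch.
inline SketchThreadState *cur_thread() { return tls_cur; }

// Cold path: called only from entry points that can be the first call
// into the runtime on this thread (interceptors, runtime init).
inline void cur_thread_init() {
  if (!tls_cur)
    tls_cur = &tls_storage;
}

// Models an interceptor: it may run before anything else on a thread,
// so it initializes the pointer before touching the state.
int SketchInterceptor(int x) {
  cur_thread_init();
  cur_thread()->events += x;
  return cur_thread()->events;
}

// Models a memory-access callback: it relies on an earlier entry point
// having run cur_thread_init() and performs no check of its own.
int SketchMemoryAccess() { return cur_thread()->events; }
```

The design trade-off is the one raised in the review: call sites keep using one accessor, and the set of places that must call the init function is small and explicit, at the cost of having to identify every possible first entry point correctly.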
Comment Actions For now I added calls to cur_thread_init() in 3 places. It was enough to pass all tests on my system. I am not sure if it will work with different versions of glibc. What do you think about it?
Comment Actions The change is now in very good shape.
I've tested on 2 more distributions and it worked for me.
Comment Actions There are a lot of interceptors that do if (cur_thread()->in_symbolizer) before SCOPED_INTERCEPTOR_RAW. What to do with them?
Comment Actions Yikes! Good question! If we are in the symbolizer, we've already initialized cur_thread, since we are coming recursively from the runtime. But this does not help, because if we are not in the symbolizer, we can have cur_thread not initialized... We have it in malloc, atexit and similar fundamental functions that can well be called during process or thread start. All of the in_symbolizer checks call cur_thread in the same expression rather than use some local variable, i.e. they are of the form: if (cur_thread()->in_symbolizer), which suggests that we should introduce a helper in_symbolizer(void) function that will encapsulate cur_thread_init and the check (probably should go into tsan_interceptors.h).
Comment Actions Committed in: Thanks for bearing with me, this touched very sensitive parts of the runtime so I did not want to rush. This adds a useful capability to ThreadSanitizer. The performance improvements that resulted from this work are much appreciated too.
Comment Actions FTR updated check_analyze.sh in http://llvm.org/viewvc/llvm-project?view=revision&revision=353820
Comment Actions @lei, please help to investigate what happened in http://lab.llvm.org:8011/builders/sanitizer-ppc64be-linux/builds/11413 Is it possible to get a stack trace of the crash?
Comment Actions Hi Yuri, I think this might be breaking our aarch64-full buildbot [1]. I ran compiler-rt/test/sanitizer_common/tsan-aarch64-Linux/Linux/Output/clock_gettime.c.tmp and got this stack trace [2]: Any ideas? The bot has been red for quite a while now, so I think I will have to revert this while we investigate.
[1] http://lab.llvm.org:8011/builders/clang-cmake-aarch64-full/builds/6556
From: Yi-Hong Lyu via Phabricator <reviews@reviews.llvm.org>
Yi-Hong.Lyu added a comment.
Comment Actions I see a similar failure was already reported:
Comment Actions Looks like it is already fixed. See http://lab.llvm.org:8011/builders/clang-ppc64le-linux-multistage/builds/9157
Comment Actions I am working on integrating this for OpenMP tasks into openmp/tools/archer/ompt-tsan.cpp. Initial experiments with the integration into ompt-tsan show two issues:
==274531==FATAL: ThreadSanitizer: internal allocator is out of memory trying to allocate 0x3fb58 bytes
or
==274830==ERROR: ThreadSanitizer failed to deallocate 0x41000 (266240) bytes at address 0x7f8813fd2000
==274830==ERROR: ThreadSanitizer failed to deallocate 0x43000 (274432) bytes at address 0x0e2050cf1000
FATAL: ThreadSanitizer CHECK failed: /home/pj416018/TSAN/llvm-project/compiler-rt/lib/sanitizer_common/sanitizer_posix.cc:61 "(("unable to unmap" && 0)) != (0)" (0x0, 0x0)