See the added test and https://github.com/google/sanitizers/issues/1520
for the description of the problem.
The standard says that failing CAS is a memory load only,
model it as such to avoid false positives.
Details
- Reviewers
melver vitalybuka - Commits
- rG2fec52a40261: tsan: model atomic read for failing CAS
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
This is tricky, I am not 100% sure (1) it's the right thing to do, (2) that destruction safety problem (see the comment) will hit us or not (maybe there are some other cases I don't see).
also (3) do I do correct reading of the standard?
(4) is it correct to extend this guarantee to the __atomic builtins?
The question I wonder is: do we need 100% correctness at cost of complexity and/or other issues?
Several cons:
- The fact we haven't encountered this on any realistic large codebase yet tells me that the example in the Github issue is a contrived example. Are there any real codebases where this false positive is a problem?
- Mixing atomic and non-atomic ops should be discouraged.
- If someone deliberately does mix plain and atomic ops and runs into this case, I'm assuming it's some one-off complex algorithm and the author could also add no_sanitize("thread").
- In most algorithms, the CAS will eventually also do a write, so the cases where it will always be a false positive is extremely small.
- (2) might be a more real problem.
One pro:
- More correct interpretation of the standard.
Assuming the implementation can't be simpler and it causes other issues, pragmatically I would therefore avoid supporting this.
compiler-rt/lib/tsan/rtl/tsan_interface_atomic.cpp | ||
---|---|---|
436–441 | This leads me to think this is a real problem, but because of the memory ordering requirements it's not (AFAIK, and as just discussed in chat). |
I've merged this. In few weeks it will pass more extensive testing, if there will be unforeseen issues, I will revert this. But at least we will know on what code it fails and add a test.
sanitizer-x86_64-linux-autoconf failed:
https://lab.llvm.org/buildbot/#/builders/70/builds/21206
[30/31] Running ThreadSanitizer tests
- Testing: 442 tests, 80 workers --
command timed out: 1200 seconds without output running [b'python', b'../sanitizer_buildbot/sanitizers/zorg/buildbot/builders/sanitizers/buildbot_selector.py'], attempting to kill
process killed by signal 9
Not sure how this new test could have caused the timeout...
Yes, debug versions of compare_exchange.cpp and atomic_norace2.cpp hang with:
ThreadSanitizer: internal deadlock: can't lock Slots under SyncVar mutex
at:
(gdb) bt
#0 sanitizer::FutexWait (p=p@entry=0x7ffff7fa80a8, cmp=cmp@entry=0) at sanitizer_common/sanitizer_linux.cpp:678
#1 0x000000000042d9ea in sanitizer::Semaphore::Wait (this=0x7ffff7fa80a8) at sanitizer_common/sanitizer_mutex.cpp:35
#2 0x000000000042be7c in sanitizer::Mutex::Lock (this=this@entry=0x7ffff7fa8098) at sanitizer_common/sanitizer_mutex.h:196
#3 0x000000000043d6e0 in sanitizer::GenericScopedLock<sanitizer::Mutex>::GenericScopedLock (mu=0x7ffff7fa8098, this=<optimized out>) at sanitizer_common/sanitizer_mutex.h:383
#4 sanitizer::Symbolizer::SymbolizePC (this=0x7ffff7fa8028, addr=4815253) at sanitizer_common/sanitizer_symbolizer_libcdep.cpp:86
#5 0x000000000043be57 in sanitizer::(anonymous namespace)::StackTraceTextPrinter::ProcessAddressFrames (this=this@entry=0x7ffff3efbe88, pc=4815253) at sanitizer_common/sanitizer_stacktrace_libcdep.cpp:36
#6 0x000000000043bd79 in sanitizer::StackTrace::PrintTo (this=this@entry=0x7ffff3efbf18, output=output@entry=0x7ffff3efbee8) at sanitizer_common/sanitizer_stacktrace_libcdep.cpp:109
#7 0x000000000043c101 in sanitizer::StackTrace::Print (this=0x7ffff3efbf18) at sanitizer_common/sanitizer_stacktrace_libcdep.cpp:132
#8 0x00000000004a28f3 in sanitizer::PrintMutexPC (pc=4815254) at tsan/rtl/tsan_rtl.cpp:1064
#9 0x000000000042df9b in sanitizer::InternalDeadlockDetector::Lock (this=0x7ffff3efccc0, type=type@entry=12, pc=pc@entry=4498873) at sanitizer_common/sanitizer_mutex.cpp:177
#10 0x000000000042de05 in sanitizer::CheckedMutex::LockImpl (this=<optimized out>, pc=0) at sanitizer_common/sanitizer_mutex.cpp:218
#11 0x000000000042bdb2 in sanitizer::CheckedMutex::Lock (this=0x1868b80 <tsan::ctx_placeholder+10518912>) at sanitizer_common/sanitizer_mutex.h:129
#12 sanitizer::Mutex::Lock (this=this@entry=0x1868b80 <tsan::ctx_placeholder+10518912>) at sanitizer_common/sanitizer_mutex.h:167
#13 0x000000000044a5b9 in sanitizer::GenericScopedLock<sanitizer::Mutex>::GenericScopedLock (mu=0x1868b80 <tsan::ctx_placeholder+10518912>, this=<optimized out>) at tsan/rtl/../../sanitizer_common/sanitizer_mutex.h:383
#14 tsan::TraceAcquire<tsan::EventFunc> (thr=0x7ffff3efcf00, ev=<optimized out>) at tsan/rtl/tsan_rtl.h:659
#15 tsan::TryTraceFunc (thr=0x7ffff3efcf00, pc=4411429) at tsan/rtl/tsan_rtl.h:702
#16 tsan::FuncEntry (thr=0x7ffff3efcf00, pc=4411429) at tsan/rtl/tsan_rtl.h:733
#17 tsan::ScopedInterceptor::ScopedInterceptor (this=this@entry=0x7ffff3efc010, thr=thr@entry=0x7ffff3efcf00, fname=<optimized out>, pc=4411429) at tsan/rtl/tsan_interceptors_posix.cpp:251
#18 0x0000000000452cf9 in interceptor_dl_iterate_phdr (cb=0x435030 <sanitizer::dl_iterate_phdr_cb(dl_phdr_info*, unsigned long, void*)>, data=0x7ffff3efc040) at tsan/rtl/tsan_interceptors_posix.cpp:2300
#19 0x0000000000435025 in sanitizer::ListOfModules::init (this=0x7ffff7fa8050) at sanitizer_common/sanitizer_linux_libcdep.cpp:691
#20 0x000000000043d7b3 in sanitizer::Symbolizer::RefreshModules (this=0x7ffff7fa8028) at sanitizer_common/sanitizer_symbolizer_libcdep.cpp:185
#21 sanitizer::Symbolizer::FindModuleForAddress (this=this@entry=0x7ffff7fa8028, address=address@entry=4815253) at sanitizer_common/sanitizer_symbolizer_libcdep.cpp:204
#22 0x000000000043d6fb in sanitizer::Symbolizer::SymbolizePC (this=0x7ffff7fa8028, addr=4815253) at sanitizer_common/sanitizer_symbolizer_libcdep.cpp:88
#23 0x000000000043be57 in sanitizer::(anonymous namespace)::StackTraceTextPrinter::ProcessAddressFrames (this=this@entry=0x7ffff3efc158, pc=4815253) at sanitizer_common/sanitizer_stacktrace_libcdep.cpp:36
#24 0x000000000043bd79 in sanitizer::StackTrace::PrintTo (this=this@entry=0x7ffff3efc1e8, output=output@entry=0x7ffff3efc1b8) at sanitizer_common/sanitizer_stacktrace_libcdep.cpp:109
#25 0x000000000043c101 in sanitizer::StackTrace::Print (this=0x7ffff3efc1e8) at sanitizer_common/sanitizer_stacktrace_libcdep.cpp:132
#26 0x00000000004a28f3 in sanitizer::PrintMutexPC (pc=4815254) at tsan/rtl/tsan_rtl.cpp:1064
#27 0x000000000042df9b in sanitizer::InternalDeadlockDetector::Lock (this=0x7ffff3efccc0, type=type@entry=12, pc=pc@entry=4876568) at sanitizer_common/sanitizer_mutex.cpp:177
#28 0x000000000042de05 in sanitizer::CheckedMutex::LockImpl (this=<optimized out>, pc=0) at sanitizer_common/sanitizer_mutex.cpp:218
#29 0x000000000042bdb2 in sanitizer::CheckedMutex::Lock (this=0x1868b80 <tsan::ctx_placeholder+10518912>) at sanitizer_common/sanitizer_mutex.h:129
#30 sanitizer::Mutex::Lock (this=this@entry=0x1868b80 <tsan::ctx_placeholder+10518912>) at sanitizer_common/sanitizer_mutex.h:167
#31 0x00000000004a6918 in sanitizer::GenericScopedLock<sanitizer::Mutex>::GenericScopedLock (mu=0x1868b80 <tsan::ctx_placeholder+10518912>, this=<optimized out>) at tsan/rtl/../../sanitizer_common/sanitizer_mutex.h:383
#32 tsan::TraceAcquire<tsan::EventAccess> (thr=0x7ffff3efcf00, ev=<optimized out>) at tsan/rtl/tsan_rtl.h:659
#33 tsan::TryTraceMemoryAccess (thr=0x7ffff3efcf00, pc=5011256, addr=26177776, size=8, typ=3) at tsan/rtl/tsan_rtl_access.cpp:25
#34 tsan::MemoryAccess (thr=thr@entry=0x7ffff3efcf00, pc=pc@entry=5011256, addr=addr@entry=26177776, size=size@entry=8, typ=typ@entry=3) at tsan/rtl/tsan_rtl_access.cpp:449
#35 0x00000000004979ce in AtomicCAS<unsigned long long> (thr=thr@entry=0x7ffff3efcf00, pc=5011256, a=a@entry=0x18f70f0 <_nodes>, c=c@entry=0x7ffff3efc390, v=v@entry=0, mo=tsan::mo_relaxed, fmo=tsan::mo_relaxed) at tsan/rtl/tsan_interface_atomic.cpp:462
#36 0x0000000000498913 in AtomicCAS<unsigned long long> (thr=0x7ffff3efcf00, pc=128, a=0x18f70f0 <_nodes>, c=135257110085632, v=0, mo=tsan::mo_relaxed, fmo=tsan::mo_relaxed) at tsan/rtl/tsan_interface_atomic.cpp:479
#37 tsan_atomic64_compare_exchange_val (a=0x18f70f0 <_nodes>, c=0, v=0, mo=tsan::mo_relaxed, fmo=tsan::mo_relaxed) at tsan/rtl/tsan_interface_atomic.cpp:853
#38 0x00000000004c7738 in std::atomic_base<node*>::compare_exchange_strong (m1=<optimized out>, this=<optimized out>, p1=<optimized out>, p2=<optimized out>, m2=<optimized out>) at /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/atomic_base.h:880
#39 std::atomic<node*>::compare_exchange_weak (this=this@entry=0x18f70f0 <_nodes>, p1=@0x7ffff3efc408: 0x0, p2=p2@entry=0x0, m1=m1@entry=std::memory_order_release, m2=m2@entry=std::memory_order_relaxed) at /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/atomic:589
#40 0x00000000004c7263 in f2<0> (i=i@entry=0, mo=mo@entry=std::memory_order_release, fmo=std::memory_order_relaxed) at ../../../test/tsan/compare_exchange.cpp:25
#41 0x00000000004c785b in std::invoke_impl<void, void (*)(int, std::memory_order, std::memory_order), unsigned int, std::memory_order, std::memory_order> (f=<optimized out>, args=@0x7b0800000048: std::memory_order_relaxed, args=@0x7b0800000048: std::memory_order_relaxed, args=@0x7b0800000048: std::memory_order_relaxed) at /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:61
#42 std::invoke<void (*)(int, std::memory_order, std::memory_order), unsigned int, std::memory_order, std::memory_order> (fn=<optimized out>, args=@0x7b0800000048: std::memory_order_relaxed, args=@0x7b0800000048: std::memory_order_relaxed, args=@0x7b0800000048: std::memory_order_relaxed) at /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:96
#43 std::thread::_Invoker<std::tuple<void (*)(int, std::memory_order, std::memory_order), unsigned int, std::memory_order, std::memory_order> >::_M_invoke<0ul, 1ul, 2ul, 3ul> (this=0x7b0800000048) at /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:253
#44 std::thread::_Invoker<std::tuple<void (*)(int, std::memory_order, std::memory_order), unsigned int, std::memory_order, std::memory_order> >::operator() (this=0x7b0800000048) at /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:260
#45 std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (*)(int, std::memory_order, std::memory_order), unsigned int, std::memory_order, std::memory_order> > >::_M_run (this=0x7b0800000040) at /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:211
#46 0x00007ffff7e6c954 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#47 0x000000000044d29c in __tsan_thread_start_func (arg=<optimized out>) at tsan/rtl/tsan_interceptors_posix.cpp:1012
#48 0x00007ffff7c3dd80 in start_thread (arg=0x7ffff3efd640) at pthread_create.c:481
#49 0x00007ffff7b2b76f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
And modelling the memory access after the CAS and after releasing the mutex it subject to use-after-frees even without standalone fences.
So so far this looks infeasible.
This leads me to think this is a real problem, but because of the memory ordering requirements it's not (AFAIK, and as just discussed in chat).