This is an archive of the discontinued LLVM Phabricator instance.

tsan: model atomic read for failing CAS
AcceptedPublic

Authored by dvyukov on Apr 27 2022, 1:34 AM.

Details

Summary

See the added test and https://github.com/google/sanitizers/issues/1520
for the description of the problem.
The standard says that failing CAS is a memory load only,
model it as such to avoid false positives.

Diff Detail

Event Timeline

dvyukov created this revision.Apr 27 2022, 1:34 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 27 2022, 1:34 AM
dvyukov requested review of this revision.Apr 27 2022, 1:34 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 27 2022, 1:34 AM
Herald added a subscriber: Restricted Project. · View Herald Transcript

This is tricky, I am not 100% sure (1) it's the right thing to do, (2) that destruction safety problem (see the comment) will hit us or not (maybe there are some other cases I don't see).

This is tricky, I am not 100% sure (1) it's the right thing to do, (2) that destruction safety problem (see the comment) will hit us or not (maybe there are some other cases I don't see).

also (3) do I do correct reading of the standard?
(4) is it correct to extend this guarantee to the __atomic builtins?

This is tricky, I am not 100% sure (1) it's the right thing to do, (2) that destruction safety problem (see the comment) will hit us or not (maybe there are some other cases I don't see).

also (3) do I do correct reading of the standard?
(4) is it correct to extend this guarantee to the __atomic builtins?

The question I wonder is: do we need 100% correctness at cost of complexity and/or other issues?

Several cons:

  • The fact we haven't encountered this on any realistic large codebase yet tells me that the example in the Github issue is a contrived example. Are there any real codebases where this false positive is a problem?
  • Mixing atomic and non-atomic ops should be discouraged.
  • If someone deliberately does mix plain and atomic ops and runs into this case, I'm assuming it's some one-off complex algorithm and the author could also add no_sanitize("thread").
  • In most algorithms, the CAS will eventually also do a write, so the cases where it will always be a false positive is extremely small.
  • (2) might be a more real problem.

One pro:

  • More correct interpretation of the standard.

Assuming the implementation can't be simpler and it causes other issues, pragmatically I would therefore avoid supporting this.

melver accepted this revision.Apr 28 2022, 3:58 AM
melver added inline comments.
compiler-rt/lib/tsan/rtl/tsan_interface_atomic.cpp
436–441

This leads me to think this is a real problem, but because of the memory ordering requirements it's not (AFAIK, and as just discussed in chat).

This revision is now accepted and ready to land.Apr 28 2022, 3:58 AM
This revision was automatically updated to reflect the committed changes.

I've merged this. In few weeks it will pass more extensive testing, if there will be unforeseen issues, I will revert this. But at least we will know on what code it fails and add a test.

sanitizer-x86_64-linux-autoconf failed:

https://lab.llvm.org/buildbot/#/builders/70/builds/21206

[30/31] Running ThreadSanitizer tests

  • Testing: 442 tests, 80 workers --

command timed out: 1200 seconds without output running [b'python', b'../sanitizer_buildbot/sanitizers/zorg/buildbot/builders/sanitizers/buildbot_selector.py'], attempting to kill
process killed by signal 9

Not sure how this new test could have caused the timeout...

vitalybuka reopened this revision.May 2 2022, 10:29 PM
This revision is now accepted and ready to land.May 2 2022, 10:29 PM

sanitizer-x86_64-linux-autoconf failed:

https://lab.llvm.org/buildbot/#/builders/70/builds/21206

[30/31] Running ThreadSanitizer tests

  • Testing: 442 tests, 80 workers --

command timed out: 1200 seconds without output running [b'python', b'../sanitizer_buildbot/sanitizers/zorg/buildbot/builders/sanitizers/buildbot_selector.py'], attempting to kill
process killed by signal 9

Not sure how this new test could have caused the timeout...

It's likely deadlock or other hang.

Yes, debug versions of compare_exchange.cpp and atomic_norace2.cpp hang with:

ThreadSanitizer: internal deadlock: can't lock Slots under SyncVar mutex

at:

(gdb) bt
#0 sanitizer::FutexWait (p=p@entry=0x7ffff7fa80a8, cmp=cmp@entry=0) at sanitizer_common/sanitizer_linux.cpp:678
#1 0x000000000042d9ea in
sanitizer::Semaphore::Wait (this=0x7ffff7fa80a8) at sanitizer_common/sanitizer_mutex.cpp:35
#2 0x000000000042be7c in sanitizer::Mutex::Lock (this=this@entry=0x7ffff7fa8098) at sanitizer_common/sanitizer_mutex.h:196
#3 0x000000000043d6e0 in
sanitizer::GenericScopedLock<sanitizer::Mutex>::GenericScopedLock (mu=0x7ffff7fa8098, this=<optimized out>) at sanitizer_common/sanitizer_mutex.h:383
#4
sanitizer::Symbolizer::SymbolizePC (this=0x7ffff7fa8028, addr=4815253) at sanitizer_common/sanitizer_symbolizer_libcdep.cpp:86
#5 0x000000000043be57 in sanitizer::(anonymous namespace)::StackTraceTextPrinter::ProcessAddressFrames (this=this@entry=0x7ffff3efbe88, pc=4815253) at sanitizer_common/sanitizer_stacktrace_libcdep.cpp:36
#6 0x000000000043bd79 in
sanitizer::StackTrace::PrintTo (this=this@entry=0x7ffff3efbf18, output=output@entry=0x7ffff3efbee8) at sanitizer_common/sanitizer_stacktrace_libcdep.cpp:109
#7 0x000000000043c101 in sanitizer::StackTrace::Print (this=0x7ffff3efbf18) at sanitizer_common/sanitizer_stacktrace_libcdep.cpp:132
#8 0x00000000004a28f3 in
sanitizer::PrintMutexPC (pc=4815254) at tsan/rtl/tsan_rtl.cpp:1064
#9 0x000000000042df9b in sanitizer::InternalDeadlockDetector::Lock (this=0x7ffff3efccc0, type=type@entry=12, pc=pc@entry=4498873) at sanitizer_common/sanitizer_mutex.cpp:177
#10 0x000000000042de05 in
sanitizer::CheckedMutex::LockImpl (this=<optimized out>, pc=0) at sanitizer_common/sanitizer_mutex.cpp:218
#11 0x000000000042bdb2 in sanitizer::CheckedMutex::Lock (this=0x1868b80 <tsan::ctx_placeholder+10518912>) at sanitizer_common/sanitizer_mutex.h:129
#12 sanitizer::Mutex::Lock (this=this@entry=0x1868b80 <tsan::ctx_placeholder+10518912>) at sanitizer_common/sanitizer_mutex.h:167
#13 0x000000000044a5b9 in sanitizer::GenericScopedLock<sanitizer::Mutex>::GenericScopedLock (mu=0x1868b80 <tsan::ctx_placeholder+10518912>, this=<optimized out>) at tsan/rtl/../../sanitizer_common/sanitizer_mutex.h:383
#14
tsan::TraceAcquire<tsan::EventFunc> (thr=0x7ffff3efcf00, ev=<optimized out>) at tsan/rtl/tsan_rtl.h:659
#15
tsan::TryTraceFunc (thr=0x7ffff3efcf00, pc=4411429) at tsan/rtl/tsan_rtl.h:702
#16 tsan::FuncEntry (thr=0x7ffff3efcf00, pc=4411429) at tsan/rtl/tsan_rtl.h:733
#17
tsan::ScopedInterceptor::ScopedInterceptor (this=this@entry=0x7ffff3efc010, thr=thr@entry=0x7ffff3efcf00, fname=<optimized out>, pc=4411429) at tsan/rtl/tsan_interceptors_posix.cpp:251
#18 0x0000000000452cf9 in interceptor_dl_iterate_phdr (cb=0x435030 <sanitizer::dl_iterate_phdr_cb(dl_phdr_info*, unsigned long, void*)>, data=0x7ffff3efc040) at tsan/rtl/tsan_interceptors_posix.cpp:2300
#19 0x0000000000435025 in sanitizer::ListOfModules::init (this=0x7ffff7fa8050) at sanitizer_common/sanitizer_linux_libcdep.cpp:691
#20 0x000000000043d7b3 in
sanitizer::Symbolizer::RefreshModules (this=0x7ffff7fa8028) at sanitizer_common/sanitizer_symbolizer_libcdep.cpp:185
#21 sanitizer::Symbolizer::FindModuleForAddress (this=this@entry=0x7ffff7fa8028, address=address@entry=4815253) at sanitizer_common/sanitizer_symbolizer_libcdep.cpp:204
#22 0x000000000043d6fb in
sanitizer::Symbolizer::SymbolizePC (this=0x7ffff7fa8028, addr=4815253) at sanitizer_common/sanitizer_symbolizer_libcdep.cpp:88
#23 0x000000000043be57 in sanitizer::(anonymous namespace)::StackTraceTextPrinter::ProcessAddressFrames (this=this@entry=0x7ffff3efc158, pc=4815253) at sanitizer_common/sanitizer_stacktrace_libcdep.cpp:36
#24 0x000000000043bd79 in
sanitizer::StackTrace::PrintTo (this=this@entry=0x7ffff3efc1e8, output=output@entry=0x7ffff3efc1b8) at sanitizer_common/sanitizer_stacktrace_libcdep.cpp:109
#25 0x000000000043c101 in sanitizer::StackTrace::Print (this=0x7ffff3efc1e8) at sanitizer_common/sanitizer_stacktrace_libcdep.cpp:132
#26 0x00000000004a28f3 in
sanitizer::PrintMutexPC (pc=4815254) at tsan/rtl/tsan_rtl.cpp:1064
#27 0x000000000042df9b in sanitizer::InternalDeadlockDetector::Lock (this=0x7ffff3efccc0, type=type@entry=12, pc=pc@entry=4876568) at sanitizer_common/sanitizer_mutex.cpp:177
#28 0x000000000042de05 in
sanitizer::CheckedMutex::LockImpl (this=<optimized out>, pc=0) at sanitizer_common/sanitizer_mutex.cpp:218
#29 0x000000000042bdb2 in sanitizer::CheckedMutex::Lock (this=0x1868b80 <tsan::ctx_placeholder+10518912>) at sanitizer_common/sanitizer_mutex.h:129
#30 sanitizer::Mutex::Lock (this=this@entry=0x1868b80 <tsan::ctx_placeholder+10518912>) at sanitizer_common/sanitizer_mutex.h:167
#31 0x00000000004a6918 in sanitizer::GenericScopedLock<sanitizer::Mutex>::GenericScopedLock (mu=0x1868b80 <tsan::ctx_placeholder+10518912>, this=<optimized out>) at tsan/rtl/../../sanitizer_common/sanitizer_mutex.h:383
#32
tsan::TraceAcquire<tsan::EventAccess> (thr=0x7ffff3efcf00, ev=<optimized out>) at tsan/rtl/tsan_rtl.h:659
#33
tsan::TryTraceMemoryAccess (thr=0x7ffff3efcf00, pc=5011256, addr=26177776, size=8, typ=3) at tsan/rtl/tsan_rtl_access.cpp:25
#34 tsan::MemoryAccess (thr=thr@entry=0x7ffff3efcf00, pc=pc@entry=5011256, addr=addr@entry=26177776, size=size@entry=8, typ=typ@entry=3) at tsan/rtl/tsan_rtl_access.cpp:449
#35 0x00000000004979ce in AtomicCAS<unsigned long long> (thr=thr@entry=0x7ffff3efcf00, pc=5011256, a=a@entry=0x18f70f0 <_nodes>, c=c@entry=0x7ffff3efc390, v=v@entry=0, mo=
tsan::mo_relaxed, fmo=tsan::mo_relaxed) at tsan/rtl/tsan_interface_atomic.cpp:462
#36 0x0000000000498913 in AtomicCAS<unsigned long long> (thr=0x7ffff3efcf00, pc=128, a=0x18f70f0 <_nodes>, c=135257110085632, v=0, mo=
tsan::mo_relaxed, fmo=tsan::mo_relaxed) at tsan/rtl/tsan_interface_atomic.cpp:479
#37
tsan_atomic64_compare_exchange_val (a=0x18f70f0 <_nodes>, c=0, v=0, mo=tsan::mo_relaxed, fmo=tsan::mo_relaxed) at tsan/rtl/tsan_interface_atomic.cpp:853
#38 0x00000000004c7738 in std::atomic_base<node*>::compare_exchange_strong (m1=<optimized out>, this=<optimized out>, p1=<optimized out>, p2=<optimized out>, m2=<optimized out>) at /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/atomic_base.h:880
#39 std::atomic<node*>::compare_exchange_weak (this=this@entry=0x18f70f0 <_nodes>,
p1=@0x7ffff3efc408: 0x0, p2=p2@entry=0x0, m1=m1@entry=std::memory_order_release, m2=m2@entry=std::memory_order_relaxed) at /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/atomic:589
#40 0x00000000004c7263 in f2<0> (i=i@entry=0, mo=mo@entry=std::memory_order_release, fmo=std::memory_order_relaxed) at ../../../test/tsan/compare_exchange.cpp:25
#41 0x00000000004c785b in std::invoke_impl<void, void (*)(int, std::memory_order, std::memory_order), unsigned int, std::memory_order, std::memory_order> (f=<optimized out>, args=@0x7b0800000048: std::memory_order_relaxed, args=@0x7b0800000048: std::memory_order_relaxed, args=@0x7b0800000048: std::memory_order_relaxed) at /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:61
#42 std::
invoke<void (*)(int, std::memory_order, std::memory_order), unsigned int, std::memory_order, std::memory_order> (fn=<optimized out>, args=@0x7b0800000048: std::memory_order_relaxed, args=@0x7b0800000048: std::memory_order_relaxed, args=@0x7b0800000048: std::memory_order_relaxed) at /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/invoke.h:96
#43 std::thread::_Invoker<std::tuple<void (*)(int, std::memory_order, std::memory_order), unsigned int, std::memory_order, std::memory_order> >::_M_invoke<0ul, 1ul, 2ul, 3ul> (this=0x7b0800000048) at /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:253
#44 std::thread::_Invoker<std::tuple<void (*)(int, std::memory_order, std::memory_order), unsigned int, std::memory_order, std::memory_order> >::operator() (this=0x7b0800000048) at /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:260
#45 std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (*)(int, std::memory_order, std::memory_order), unsigned int, std::memory_order, std::memory_order> > >::_M_run (this=0x7b0800000040) at /usr/lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_thread.h:211
#46 0x00007ffff7e6c954 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#47 0x000000000044d29c in __tsan_thread_start_func (arg=<optimized out>) at tsan/rtl/tsan_interceptors_posix.cpp:1012
#48 0x00007ffff7c3dd80 in start_thread (arg=0x7ffff3efd640) at pthread_create.c:481
#49 0x00007ffff7b2b76f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

And modelling the memory access after the CAS and after releasing the mutex it subject to use-after-frees even without standalone fences.
So so far this looks infeasible.