After switching tsan from the old mutex to the new sanitizer_common mutex,
we've observed a significant degradation of performance on a test.
The test effectively stresses a lock-free stack with 4 threads
with a mix of atomic_compare_exchange and atomic_load operations.
The former takes write lock, while the latter takes read lock.
It turned out the new mutex performs worse because readers don't
use active spinning, which results in significant amount of thread
blocking/unblocking. The old tsan mutex used active spinning
for both writers and readers.
Add active spinning for readers.
Don't hand off the mutex to readers, and instread make them
compete for the mutex after wake up again.
This makes readers and writers almost symmetric.
I think we need a 'pause' hint in active spinning. Do we have arch-independent pause-hint wrapper already?