Another attempt at fixing tsan_invisible_barrier.
Current implementation causes:
https://llvm.org/bugs/show_bug.cgi?id=25643
There were several unsuccessful iterations for this functionality:
- Initially it was implemented in user code using REAL(pthread_barrier_wait). But pthread_barrier_wait is not supported on MacOS. Futexes are linux-specific for this matter.
- Then we switched to atomics+usleep(10). But usleep produced parasitic "as-if synchronized via sleep" messages in reports which failed some output tests.
- Then we switched to atomics+sched_yield. But this produced tons of tsan- visible events, which lead to "failed to restore stack trace" failures.
Move implementation into runtime and use internal_sched_yield in the wait loop.
This way tsan should see no events from the barrier, so not trace overflows and
no "as-if synchronized via sleep" messages.
If these are only used in tests, can we name this __tsan_testonly_invisible_barrier_init?