This is an archive of the discontinued LLVM Phabricator instance.

[Support] Use atomics in DebugCounter to silence TSAN
AbandonedPublic

Authored by ilya-biryukov on Jul 31 2018, 9:57 AM.

Details

Summary

Note that the change does not aim to define semantics of DebugCounter
for multi-threaded code.

Diff Detail

Event Timeline

ilya-biryukov created this revision.Jul 31 2018, 9:57 AM

This change is a bit ugly since it has to account for the value of LLVM_ENABLE_THREADS...
Happy to accept suggestions/alternative ways to fix this :-)

Thanks for this!

I totally agree that DebugCounters, in their current state, make no sense to really use with threads. So, I'm not the biggest fan of silencing the tools that're correctly telling us "hey, you're holding these DebugCounters wrong."

Might it be reasonable to just have an enabled bit inside of the DebugCounters instance()? If true, we'll do all of this counting. If false, we'll just skip it all? (Logically, it would be set to "were we passed any -debug-counter args, or do we plan to run the pass from D50031?")

Also, if this is still breaking you (or anyone else), please let me know. I'm happy to revert the problematic patch while we work this out if it makes anyone's life easier. :)

ilya-biryukov abandoned this revision.Aug 1 2018, 2:28 AM

Abandoning in favor of George's fix.

Might it be reasonable to just have an enabled bit inside of the DebugCounters instance()? If true, we'll do all of this counting. If false, we'll just skip it all? (Logically, it would be set to "were we passed any -debug-counter args, or do we plan to run the pass from D50031?")

LG. Anything that keeps TSAN happy is good with our use-case, there's does not seem to be a way to make debug counters useful in the multi-threaded environment anyway.

Also, if this is still breaking you (or anyone else), please let me know. I'm happy to revert the problematic patch while we work this out if it makes anyone's life easier. :)

It's tensorflow under TSAN that's broken, they do run LLVM code that uses debug counters in parallel and get those errors from TSAN.