This is an archive of the discontinued LLVM Phabricator instance.

Vector clock algorithm improvement in case of blocking release sequence
Needs ReviewPublic

Authored by YorovSobir on May 19 2018, 10:59 AM.

Details

Reviewers
dvyukov
kcc
Summary

In tsan, release sequences are never blocked, and all of them will continue indefinitely. It leads several data races to be missed. On the other hand, tsan does not recognise fence semantics and their role in synchronization, causing tsan to produce false positives.
See more in https://www.doc.ic.ac.uk/~afd/homepages/papers/pdfs/2017/POPL.pdf
Now vector clock algorithm is modified according to the article. Correctness is proved by several unit tests.

Event Timeline

YorovSobir created this revision.May 19 2018, 10:59 AM
Herald added subscribers: Restricted Project, llvm-commits, delcypher and 2 others. · View Herald TranscriptMay 19 2018, 10:59 AM
YorovSobir retitled this revision from release sequence now can be blocked to Vector clock algorithm improvement in case of blocking release sequence.May 19 2018, 1:20 PM
YorovSobir edited the summary of this revision. (Show Details)
YorovSobir added reviewers: dvyukov, kcc.
jfb added a subscriber: jfb.May 23 2018, 8:38 AM

Hi Yorov,

Sorry for delays, I was on a vacation and now have a backlog of work.

Thanks for working on this.

This needs some additional work before we can merge this. High-level notes:

  1. Fences and release sequences need to be separated into individual patches. I would suggest to start with fences as it looks simpler.
  1. Tsan is used on a very large programs with thousands of threads and tens of millions of sync objects. Performance and especially memory consumption of synchronization handling is crucial. Since tsan is already widely deployed any non-constant overheads need to be controlled by flags. For example, any fence handling need to be completely skipped if the flag is not set. Default value for these flags it's a hard question, but we can decide later.
  1. Vector's should not be used inside of sync objects. There are 2 problems with Vector's:
    • they can use memory suboptimally, and this is unacceptable for sync objects since they can consume tens of gigs
    • our memory allocator does reuse memory between different size classes, this leads to O(N^2) memory consumption as these Vector's slowly grow during program startup and eat memory in all consecutive size classes; that's the reason why current clocks are built as chunked data structures with constant-size blocks
  1. For fences check out what we did in KTSAN to reduce their overheads (acquire_active and release_active):

https://github.com/google/ktsan/blob/tsan/mm/ktsan/sync_atomic.c#L6
I think we need something similar here. The actual threshold for discarding fence effects should be controllable with a flag.

  1. For release fences it's possible to get their overhead to almost zero in common case. The idea is as follows. In common case there are no acquire operations in between release fence and the corresponding relaxed store, in such case we don't need to memorize the whole thread vector clock. Instead we can memorize only the current thread time. Then when we need to turn a relaxed store into release operation, we use the current thread vector clock (it did not change because we did not do any acquire operations) except for the current thread time, for the current thread time we use the previously saved value. This makes release fence cost O(1) instead of O(N).

But if there is an acquire operation in between, then we need to save the whole vector clock at that point. However, if we also do (4), then usually there won't be an acquire fence while a release fence is active.

lib/tsan/rtl/tsan_clock.cc
185

This comment and "Check if we need to resize dst" does not seem to add much value. I would rather see more extensive comments for more complex parts of the change.

186

This code base does not use {} around if/for blocks if they contain a single statement. Please fix here and below.

274

The TODOs need to be resolved before commit. Either by fixing code, or by removing TODO if we are not going to fix code.

349

revert

383

Same here: either resolve it, or remove.

396

We use 2 styles for comments:

  1. comment that take the full line are full English sentences: start with a capital letter and end with dot
  2. short comments on the same line with code can be either full English sentences or start with lower letter and end with no dot.

So if you do this as a full line this needs to be:

// Unshare before changing dst.
443

Empty line between functions please.

lib/tsan/rtl/tsan_clock.h
180

I think we can use kMaxTid here.
kMaxTidInClock is kMaxTid*2, that's required to detect accesses to freed memory at low cost. But we should not access the second half of the clock here.

lib/tsan/rtl/tsan_interface_atomic.cc
227

No dead code please.

test/CMakeLists.txt
18

I don't think this needs to be here as no other tests are listed here.
If you want to build a single test, just invoke the compiler manually.