This is an archive of the discontinued LLVM Phabricator instance.

[TSan] Add option for emitting compound read-write instrumentation
ClosedPublic

Authored by melver on Jul 15 2020, 5:59 AM.

Details

Summary

This adds option -tsan-compound-read-before-write to emit different
instrumentation for the write if the read before that write is omitted
from instrumentation. The default TSan runtime currently does not
support the different instrumentation, and the option is disable by
default.

Alternative runtimes, such as the Kernel Concurrency Sanitizer (KCSAN)
can make use of the feature. Indeed, the initial motivation is for use
in KCSAN as it was determined that due to the Linux kernel having a
large number of unaddressed data races, it makes sense to improve
performance and reporting by distinguishing compounded operations. E.g.
the compounded instrumentation is typically emitted for compound
operations such as ++, +=, |=, etc. By emitting different reports, such
data races can automatically be bucketed differently (currently the plan
is to prioritize them).

Diff Detail

Event Timeline

melver created this revision.Jul 15 2020, 5:59 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 15 2020, 5:59 AM

It seems the linter/clang-format did a number of non-functional changes here. I'm inclined to keep them, because they more closely adhere to LLVM's style guide.

KCSAN change (draft) is here: https://github.com/melver/linux/commit/a04e683f457ea94b4bbba80cf05b5bcb50857fa7 -- tested and working as intended.

dvyukov added inline comments.Jul 16 2020, 2:41 AM
llvm/test/Instrumentation/ThreadSanitizer/read_before_write.ll
21

I think tsan_read expectation will be matched against tsan_read_write call. Is there a single way to refine it? Add an opening bracket or something?

What happens if the load and the store are separated by a barrier atomic load or store? Will they also be combined into a single operation?

llvm/test/Instrumentation/ThreadSanitizer/read_before_write.ll
78

Don't we want to treat pairs of volatile loads and stores as separate accesses? E.g. a volatile load may be racing with a completely different store somewhere else.

melver updated this revision to Diff 278455.Jul 16 2020, 6:47 AM
melver marked 4 inline comments as done.

Address comments.

What happens if the load and the store are separated by a barrier atomic load or store? Will they also be combined into a single operation?

Hmm, good question. My intent wasn't to change the existing behaviour of how it compounds reads and writes, so I think it shouldn't be in this change.

I added a test and it does compound them. This also affects normal TSAN actually, which by default compounds reads and writes to same addr to a single __tsan_write. Let me do this as a separate change after this: https://reviews.llvm.org/D83949

llvm/test/Instrumentation/ThreadSanitizer/read_before_write.ll
21

I can just add the size.

78

Yes. Note the difference between CHECK-COMPOUND and CHECK-COMPOUND-VOLATILE.

For the kernel we'll have the CHECK-COMPOUND-VOLATILE behaviour, i.e. -tsan-distinguish-volatile -tsan-compound-read-before-write is set. And it will treat them separately and not compound them.

If volatiles are not to be distinguished, it'll just compound them, i.e. revert to the standard behaviour (arguably also perfectly in line with the non-kernel standard, because racing volatiles are still a data race).

dvyukov accepted this revision.Jul 16 2020, 7:21 AM

LGTM on my side

This revision is now accepted and ready to land.Jul 16 2020, 7:21 AM
glider accepted this revision.Jul 16 2020, 10:11 AM
glider added inline comments.
llvm/test/Instrumentation/ThreadSanitizer/read_before_write.ll
78

This is probably unrelated to this patch, but I suppose that even if -tsan-distinguish-volatile is disabled (i.e. we don't use special functions to instrument volatile accesses), we still must not combine them.

This revision was automatically updated to reflect the committed changes.