This is an archive of the discontinued LLVM Phabricator instance.

[TSan] Add read/write range interface functions with PC
ClosedPublic

Authored by protze.joachim on Aug 28 2019, 8:05 AM.

Details

Summary

For most memory access functions like __tsan_write8 there is a variant to provide a pc: __tsan_write8_pc.
This is not the case for __tsan_write_range, which allows to annotate memory access for a range of memory. The main advantage we see in the range version is, that it only ticks once and therefore saves entries in the history ring buffer. Therefore, we would prefer to call __tsan_write_range_pc over looping on __tsan_write8_pc.

This patch adds the two new interface functions __tsan_write_range_pc and __tsan_read_range_pc, which take the additional PC argument.

Diff Detail

Repository
rL LLVM

Event Timeline

protze.joachim created this revision.Aug 28 2019, 8:05 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptAug 28 2019, 8:05 AM
Herald added subscribers: Restricted Project, llvm-commits. · View Herald Transcript

Who are "we"? And you would prefer to call them where? ;)
Also, tests.

protze.joachim edited the summary of this revision. (Show Details)Aug 28 2019, 9:35 AM

Who are "we"? And you would prefer to call them where? ;)

I'm working with some students in my group on the MPI and OpenMP runtime correctness checking tools called MUST and Archer.
For MPI communication calls we are working on annotation for the memory access semantics of the buffer accesses. Since these memory accesses are typically done by the NIC, TSan could never observe such accesses directly. In multi-threaded MPI applications this annotation allows us to detect data race between communication and other threads accessing the buffer.

A simplified example of such annotation would look like:

int MPI_Send(const void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm){
  int size;
  MPI_Type_size(datatype, &size);
  __tsan_read_range_pc(buf, (uptr)size*count, __builtin_return_address(0));
  return PMPI_Send(buf, count, datatype, dest, tag, comm);
}

Is there a better interface for doing such annotations?
In our specific use case, also asan and msan might be interesting. I didn't look into interfaces of those, but a generic interface for all those sanitizers might be interesting.

Also, tests.

Ok, I'll look into tests for this. But will take some time.

Thanks for the background. Good to know.

I think we could actually do common cross-sanitizer annotation for this in common_interface_defs.h. We already have e.g. sanitizer_unaligned_load/storeN there that are implemented by all sanitizers.
Access annotations are historically separate, but it makes sense to have a common one. Asan also has
asan_address_is_poisoned/__asan_region_is_poisoned.
Kostya, Eugenii, Vitaly, any objections to adding something like __sanitizer_read/write_range_pc?
There is a question of what to do in msan versions. Should read to the checking? Should write unpoison? But we could choose what we expect to be more common case and add other special versions for msan if necessary later.

Re tests. Check test/tsan/ dir, e.g. simple_race.c, stack_race.cc, test/tsan/unaligned_norace.cc. There are not too hard to write, we just need to check it builds, detects a race on the annotation and the PC is used in the stack.

I derived a test from java_race_pc.cc.

In this new test, I see the same issue related to race on stack variables as reported here:
https://github.com/google/sanitizers/issues/1134

In this new test, I see the same issue related to race on stack variables as reported here:
https://github.com/google/sanitizers/issues/1134

Does it make the test fail? Or just produces a warning? We could switch to a heap-allocated block or a global.

In this new test, I see the same issue related to race on stack variables as reported here:
https://github.com/google/sanitizers/issues/1134

Does it make the test fail? Or just produces a warning? We could switch to a heap-allocated block or a global.

The test succeeds, it just produces the additional messages from llvm-symbolizer.

vitalybuka resigned from this revision.Sep 12 2019, 11:01 AM
dvyukov accepted this revision.Sep 16 2019, 6:26 AM

Generally looks good to me.

Joachim, do you have commit access? Or you want us to land it?

Vitaly, please take a look too and land if necessary.

This revision is now accepted and ready to land.Sep 16 2019, 6:26 AM
This revision was automatically updated to reflect the committed changes.