This is an archive of the discontinued LLVM Phabricator instance.

[tsan] Detect races on modifying accesses in Swift code
ClosedPublic

Authored by kubamracek on Apr 3 2017, 4:22 PM.

Details

Summary

This patch allows the Swift compiler to emit calls to __tsan_external_write before starting any modifying access, which will cause TSan to detect races on arrays, dictionaries and other classes defined in non-instrumented modules. Races on collections from the Swift standard library and user-defined structs and a frequent cause of subtle bugs and it's important that TSan detects those on top of existing LLVM IR instrumentation, which already detects races in direct memory accesses.

Diff Detail

Repository
rL LLVM

Event Timeline

kubamracek created this revision.Apr 3 2017, 4:22 PM
kubamracek updated this revision to Diff 94144.Apr 4 2017, 5:28 PM
kubamracek retitled this revision from [tsan] WIP: Detect races on modifying accesses in Swift code to [tsan] Detect races on modifying accesses in Swift code.
kubamracek edited the summary of this revision. (Show Details)

Updating the patch with a solution to address the problem described at https://groups.google.com/forum/#!topic/thread-sanitizer/r7HmBkKQiIw:

There is a problem is that we often detect the race on the "regular" access (called from LLVM IR instrumentation), and we can't tell that this is a Swift-type race, because we don't store the information that the previous access way a sizeless, external memory access. Currently, the __tsan_external_write API simply does MemoryWrite(addr, kSizeLog8), so we track those accesses as 8-byte long, which is not really true. We'd like to instead treat them as "sizeless".

This patch stores the external tag information in a fake stack frame in thr->shadow_stack, which is later retrieved during issue reporting. I'm open to better suggestions.

dvyukov edited edge metadata.Apr 7 2017, 1:54 PM

Hi Kuba,

I've seen a bunch of changes from you and they all are marked in my inbox. The past 1.5 weeks I was travelling and next week I am OOO so won't be able to review them. Just wanted to let you know.

Hi Kuba,

I've seen a bunch of changes from you and they all are marked in my inbox. The past 1.5 weeks I was travelling and next week I am OOO so won't be able to review them. Just wanted to let you know.

I am back. Slowly working on backlog.

I don't like that we sprinkle swift-specific bits throughout the code when the external API was meant to be general enough to cover all such cases.
We have the special tag for swift, so it really seems to me that it should be the only swift-specific bit here and the rest must be handled by the external API on general basis. We have 3 potential users for the external API (Go, Java, races on fd, potentially more in future). So I would be really interested in making the external API general and flexible enough to support all of them, rather than later sprinkle Go/Java/fd/etc-specific bits throughout the code again later.

Why do we say "Mutating" for external and "Modifying" for swift? Please settle on 1 form.
Why is the tag for switch includes "modifying"? Modifying/non-modifying is orthogonal to the object type and is controlled by __tsan_external_read/write. It really should be kExternalTagSwiftAccess.
If we want to say "swift" in header line, why don't we want the same level of support for other external objects?

I propose we say for usual data races:

ThreadSanitizer: data race

Write of size X at P by S
Previous write of size X at P by S

and for external accesses:

ThreadSanitizer: race on TAG_NAME

Mutating access at P by S
Previous mutating access at P by S

(or Modifying, I don't care too much)

This will allow us to have:

ThreadSanitizer: race on file descriptor
ThreadSanitizer: race on curl handle
ThreadSanitizer: race on Swift array
ThreadSanitizer: race on Swift map
ThreadSanitizer: race on Go map

Also, could swift runtime register a tag at startup? Or performance overhead is a concern?

test/tsan/Darwin/external-swift.cc
10 ↗(On Diff #94144)

Declare them in test.h. They will be used by other tests in future.

We have 3 potential users for the external API (Go, Java, races on fd, potentially more in future).

Also add races on vptr to the list.

lib/tsan/rtl/tsan_rtl.h
651 ↗(On Diff #94144)

Why is it 128? It seems it should be 2.

Hi Kuba,

I've seen a bunch of changes from you and they all are marked in my inbox. The past 1.5 weeks I was travelling and next week I am OOO so won't be able to review them. Just wanted to let you know.

I am back. Slowly working on backlog.

I now think that I replied to all pending reviews. If there is anything left, please ping them.

lib/tsan/rtl/tsan_report.cc
215 ↗(On Diff #94144)

It's better to drop "object" here to make it more general. A user can always add "object" to tag name if necessary.

lib/tsan/rtl/tsan_suppressions.cc
79 ↗(On Diff #94144)

It seems to me that we now won't see ReportTypeExternalRace/ReportTypeSwiftAccessRace here.

First, let me split out the conversion of tag tracking in ThreadState into tracking in thread event logs: https://reviews.llvm.org/D32382.

kubamracek added inline comments.Apr 21 2017, 3:48 PM
lib/tsan/rtl/tsan_report.cc
215 ↗(On Diff #94144)
lib/tsan/rtl/tsan_rtl.h
651 ↗(On Diff #94144)
test/tsan/Darwin/external-swift.cc
10 ↗(On Diff #94144)
kubamracek updated this revision to Diff 96261.Apr 21 2017, 3:56 PM

Uploading with without the extracted standalone parts.

kubamracek updated this revision to Diff 96268.Apr 21 2017, 5:44 PM

Addressing review comments.

ThreadSanitizer: race on Swift array
ThreadSanitizer: race on Swift map

Unfortunately, we won't be able to tell the type of the variable (at least not now). I've updated the patch to say "race on Swift variable", but that's not really exact and it's not describing the problem well. Swift is getting exact threading rules that are very different from C/C++ and I'm trying to focus on making the reports readable and understandable by Swift users.

Is there a way we could still special-case the Swift reports? I understand the desire to keep the API generic, but I think we need to make sure the users understand the report. E.g. "Swift access race" much better describes what the problem is, and users words that define the Swift threading model.

Also, could swift runtime register a tag at startup? Or performance overhead is a concern?

It's quite hard to make the runtime call the registration at process startup, because it would need to dynamically figure out whether we're running with TSan or not, plus we'd need to add an indirection to each access (to read the value of the tag). I'd really prefer to keep the value as a well-defined constant in TSan.

Also, could swift runtime register a tag at startup? Or performance overhead is a concern?

It's quite hard to make the runtime call the registration at process startup, because it would need to dynamically figure out whether we're running with TSan or not, plus we'd need to add an indirection to each access (to read the value of the tag). I'd really prefer to keep the value as a well-defined constant in TSan.

Agree, I think, a constant is the right way to do it.
Any other uses that I think of will also be much simpler/faster with a constant.

ThreadSanitizer: race on Swift array
ThreadSanitizer: race on Swift map

Unfortunately, we won't be able to tell the type of the variable (at least not now). I've updated the patch to say "race on Swift variable", but that's not really exact and it's not describing the problem well. Swift is getting exact threading rules that are very different from C/C++ and I'm trying to focus on making the reports readable and understandable by Swift users.

Is there a way we could still special-case the Swift reports? I understand the desire to keep the API generic, but I think we need to make sure the users understand the report. E.g. "Swift access race" much better describes what the problem is, and users words that define the Swift threading model.

Humm... what is so special about Swift? Isn't it that concurrent accesses to some objects are a bug? I would expect that it is, and then it's no different from C/C++ variables, C++ containers, Go variables and containers, Java containers, file descriptors and everything else.

I don't understand why saying race or data race on something is confusing or not understand-able. What's ambiguous/obscure/wrong about it?
On the other hand I have problems understanding meaning of "Swift access race". Does it mean "race during access to a Swift object"? If so it's semantically equivalent to "race on Swift object".

ThreadSanitizer: race on Swift array
ThreadSanitizer: race on Swift map

Unfortunately, we won't be able to tell the type of the variable (at least not now). I've updated the patch to say "race on Swift variable", but that's not really exact and it's not describing the problem well. Swift is getting exact threading rules that are very different from C/C++ and I'm trying to focus on making the reports readable and understandable by Swift users.

Is there a way we could still special-case the Swift reports? I understand the desire to keep the API generic, but I think we need to make sure the users understand the report. E.g. "Swift access race" much better describes what the problem is, and users words that define the Swift threading model.

Humm... what is so special about Swift? Isn't it that concurrent accesses to some objects are a bug? I would expect that it is, and then it's no different from C/C++ variables, C++ containers, Go variables and containers, Java containers, file descriptors and everything else.

I don't understand why saying race or data race on something is confusing or not understand-able. What's ambiguous/obscure/wrong about it?
On the other hand I have problems understanding meaning of "Swift access race". Does it mean "race during access to a Swift object"? If so it's semantically equivalent to "race on Swift object".

The memory/threading model of Swift defines races in terms of accesses: Starting a new "modifying" access requires that there are no outstanding accesses. And "access" here means the formal duration over which the variable is accessed, e.g. the whole function call for inout arguments. This is all pretty new and still being worked on: https://github.com/apple/swift/blob/master/docs/OwnershipManifesto.md.

ThreadSanitizer: race on Swift array
ThreadSanitizer: race on Swift map

Unfortunately, we won't be able to tell the type of the variable (at least not now). I've updated the patch to say "race on Swift variable", but that's not really exact and it's not describing the problem well. Swift is getting exact threading rules that are very different from C/C++ and I'm trying to focus on making the reports readable and understandable by Swift users.

Is there a way we could still special-case the Swift reports? I understand the desire to keep the API generic, but I think we need to make sure the users understand the report. E.g. "Swift access race" much better describes what the problem is, and users words that define the Swift threading model.

Humm... what is so special about Swift? Isn't it that concurrent accesses to some objects are a bug? I would expect that it is, and then it's no different from C/C++ variables, C++ containers, Go variables and containers, Java containers, file descriptors and everything else.

I don't understand why saying race or data race on something is confusing or not understand-able. What's ambiguous/obscure/wrong about it?
On the other hand I have problems understanding meaning of "Swift access race". Does it mean "race during access to a Swift object"? If so it's semantically equivalent to "race on Swift object".

The memory/threading model of Swift defines races in terms of accesses: Starting a new "modifying" access requires that there are no outstanding accesses. And "access" here means the formal duration over which the variable is accessed, e.g. the whole function call for inout arguments. This is all pretty new and still being worked on: https://github.com/apple/swift/blob/master/docs/OwnershipManifesto.md.

Then let's register report header for external tags.

The memory/threading model of Swift defines races in terms of accesses: Starting a new "modifying" access requires that there are no outstanding accesses. And "access" here means the formal duration over which the variable is accessed, e.g. the whole function call for inout arguments. This is all pretty new and still being worked on: https://github.com/apple/swift/blob/master/docs/OwnershipManifesto.md.

Then let's register report header for external tags.

What do you mean by that? Can you explain?

The memory/threading model of Swift defines races in terms of accesses: Starting a new "modifying" access requires that there are no outstanding accesses. And "access" here means the formal duration over which the variable is accessed, e.g. the whole function call for inout arguments. This is all pretty new and still being worked on: https://github.com/apple/swift/blob/master/docs/OwnershipManifesto.md.

Then let's register report header for external tags.

What do you mean by that? Can you explain?

Currently we have:

void *__tsan_external_register_tag(const char *object_type);

I am proposing:

void *__tsan_external_register_tag(const char *report_header, const char *object_type);

Then we can have customized report header for external tags.
For the dedicated swift tag, you will need to pre-populate the global tag descriptor with necessary report header and object name. Then we don't need any swift-specific changes in ReportRace/PrintReport.
To make this work we will also need to associate tag with the whole report (rather than with memory access descriptors), but it looks like the right thing to do regardless.
I am not sure if we will still need object_type then.

kubamracek updated this revision to Diff 97485.May 2 2017, 12:46 PM

Updating the patch, addressing review comments. I added a separate API to register report header strings to avoid breaking existing users of the API.

dvyukov accepted this revision.May 3 2017, 4:17 AM

Thanks!
I think this now provides a solid foundation for future extensions.

This revision is now accepted and ready to land.May 3 2017, 4:17 AM
This revision was automatically updated to reflect the committed changes.