This is an archive of the discontinued LLVM Phabricator instance.

[tsan] Don't abort when a deadlock detector finds a mutex cycle longer than 10
ClosedPublic

Authored by kubamracek on May 19 2016, 5:58 AM.

Details

Summary

In one of the already existing apps that I'm testing TSan on, I really see a mutex path that is longer than 10 (but not by much, something like 11-13 actually). Let's weaken the assertion and simply not report this deadlock.

Diff Detail

Repository
rL LLVM

Event Timeline

kubamracek updated this revision to Diff 57773.May 19 2016, 5:58 AM
kubamracek retitled this revision from to [tsan] Don't abort when a deadlock detector finds a mutex cycle longer than 10.
kubamracek updated this object.
kubamracek added reviewers: dvyukov, kcc, glider, samsonov.
kubamracek added a project: Restricted Project.
dvyukov added inline comments.May 19 2016, 7:18 AM
lib/sanitizer_common/sanitizer_deadlock_detector1.cc
125 ↗(On Diff #57773)

So, can we see it? :)
Maybe there is just some bug in logic, or a false deadlock report? Is it actually a legit report?
If it is a true report, then we should consider increasing the limit (and then we don't need to remove the CHECK). If it is a false report, then we need to fix something (and again no need to remove the CHECK, let it catch more bugs).

kubamracek added inline comments.May 19 2016, 7:37 AM
lib/sanitizer_common/sanitizer_deadlock_detector1.cc
125 ↗(On Diff #57773)

It looks to me the report is correct (and it's really a long cycle). In this case it's a collection of Obj-C objects using @synchronized(self) while also referring to other objects. The same issue gets reported multiple times with different cycle lengths, so the report with extremely long cycle isn't really useful (it's already reported with simpler cases). Anyway, increasing the limit works for me as well.

dvyukov added inline comments.May 19 2016, 8:25 AM
lib/sanitizer_common/sanitizer_deadlock_detector1.cc
125 ↗(On Diff #57773)

Let's increase the number to, say, 20 then.
As for the CHECK, we can leave it as is, or maybe replace with just Printf. If you get 13, I guess somebody can get 21. But I think it's better to have just a Printf, so that users see at least something.

kubamracek updated this revision to Diff 57800.May 19 2016, 8:29 AM

Updating patch.

dvyukov edited edge metadata.May 19 2016, 8:46 AM

LGTM, but wait for Kostya to sign off.

kcc edited edge metadata.May 19 2016, 4:18 PM

A test maybe?

kubamracek updated this revision to Diff 57916.May 20 2016, 2:34 AM
kubamracek edited edge metadata.

Adding a test case. Fixing another place where a hard limit for mutex cycle was.

kcc accepted this revision.May 20 2016, 4:54 PM
kcc edited edge metadata.

LGTM with a nit

lib/sanitizer_common/sanitizer_deadlock_detector1.cc
122 ↗(On Diff #57916)

can you use kMaxLoopSize here?

This revision is now accepted and ready to land.May 20 2016, 4:54 PM
This revision was automatically updated to reflect the committed changes.