This is an archive of the discontinued LLVM Phabricator instance.

[tsan] Handle pthread_mutex_timedlock() with PTHREAD_MUTEX_ROBUST
Needs ReviewPublic

Authored by jkt on Sep 3 2021, 3:25 AM.

Details

Reviewers
dvyukov
Summary

The interceptor for pthread_mutex_timedlock() was treating the EOWNERDEAD return value properly, and it instead assumed that that value is an actual error. TSAN therefore thought that the mutex was "not taken", and in the trace reports this appeared as if that robust mutex was not held at all for the recovery-after-a-death branch.

The test case is adopted from mutex_robust.cpp. In the thr() function, though, it is necessary to lock, write, unlock, and then lock again. If there was no unlock, then TSAN would report a race (which is was mutex_robust2.cpp tests for, so that's no change).

Discovered when trying to debug an issue in sysrepo; thank for Michal Vaško for a walkthrough through the original user code.

Bug: https://github.com/sysrepo/sysrepo/issues/2560

A git commit is available at https://github.com/jktjkt/llvm-project/commit/70e198ac563a2dfc20dabf082bcfcc28f4c5a52e .

Diff Detail

Event Timeline

jkt requested review of this revision.Sep 3 2021, 3:25 AM
jkt created this revision.
Herald added a project: Restricted Project. · View Herald TranscriptSep 3 2021, 3:26 AM
Herald added a subscriber: Restricted Project. · View Herald Transcript
jkt added a comment.Sep 15 2021, 7:03 AM

Hi @dvyukov , just a gentle reminder. Can you please take a look at this patch?

Just back from vacation and getting through backlock.
The fix itself looks good to me, but there are some things in the test to improve.
Thanks

compiler-rt/test/tsan/Linux/mutex_robust3.cpp
9

Please make it long. Unit tests may have issues with variables less than 8 bytes due to tsan internal details.

26

Sleep-based coordination proved to be flaky on CI bots. 1 second was frequently not enough, and even 3 sec delays fired sometimes.
Please use a barrier instead (see other tests). I think it will be enough to ensure that thr has acquire m second time and then we can proceed with pthread_mutex_timedlock.

27

This looks like an uninit variable used in pthread_mutex_timedlock.

31

I think it may be useful to access i here to ensure that EOWNERDEAD was treated as successful acquisition by tsan and that it understands synchronization.