This is an archive of the discontinued LLVM Phabricator instance.

tsan: Add pthread_tryjoin_np and pthread_timedjoin_np interceptors
ClosedPublic

Authored by yuri on Nov 14 2018, 3:55 AM.

Details

Reviewers
dvyukov
kcc
Summary

Add pthread_tryjoin_np() and pthread_timedjoin_np() interceptors on Linux, so that Thread sanitizer can handle programs using these functions.

Diff Detail

Event Timeline

yuri created this revision.Nov 14 2018, 3:55 AM
Herald added subscribers: Restricted Project, llvm-commits, jfb, kubamracek. · View Herald TranscriptNov 14 2018, 3:55 AM

We need tests for these functions in compiler-rt/test/tsan/

lib/tsan/rtl/tsan_interceptors.cc
1054

Please remove {}
tsan codebase does not use {} around single-statement if/for.

1066

Please remove {}
tsan codebase does not use {} around single-statement if/for.

dvyukov added inline comments.Nov 15 2018, 9:44 AM
lib/tsan/rtl/tsan_interceptors.cc
1054

OK, I see somebody did this for pthread_join too. Then you can leave this is as for consistency.

yuri updated this revision to Diff 174765.Nov 20 2018, 6:41 AM

Added tests and fixed code, now it really works

dvyukov added inline comments.Nov 20 2018, 8:39 PM
lib/tsan/rtl/tsan_interceptors.cc
1057

Why is this necessary?
We've already set it in CreateThread.
Also this looks racy and can CHECK-fail. Consider that this pthread_tryjoin fails but another thread concurrently also executes pthread_tryjoin and it succeeds and thread is destroyed.

test/tsan/Linux/thread_timedjoin.c
16

Why do we need a barrier in this test? It looks like it's deterministic without the barrier.

23

Let's also check that pthread_timedjoin_np provides necessary synchronization. It's easy to add to this test and that's an important guarantee of joining a thread.
I.e. create a global var, write to it in the thread function and write in main after join. It must not race.
That will also check that pthread_timedjoin_np did not return some other error.

test/tsan/Linux/thread_tryjoin.c
16

The same applies here.

yuri added inline comments.Nov 21 2018, 12:01 AM
lib/tsan/rtl/tsan_interceptors.cc
1057

This is needed because ThreadTid()/FindThreadByUid() clears user_id. This behaviour can be traced back to commit https://github.com/llvm-mirror/compiler-rt/commit/411b2c9d6787b64939fc15fdeec65e9d65ba1a51.

Concurrent call of pthread_tryjoin() or any other join functions is racy because you can end up with already joined thread and access freed memory. Problem of current thread sanitizer code that in case of such racy application it fails CHECK instead of reporting clear error.

I personally do no like such implementation. Instead of clearing user_id, it would be better to set flag "thread being joined" and store stack context of calling thread. If later someone attempts to join the same thread – report an error. In such case ThreadNotJoined() would clear this flag.

test/tsan/Linux/thread_timedjoin.c
16

I just copied it from pthread_deatch test. Actually brarier can be useful in this test if it is moved farther . I will update tests.

23

OK

yuri updated this revision to Diff 174882.Nov 21 2018, 1:12 AM
yuri set the repository for this revision to rCRT Compiler Runtime.

Improved tests

dvyukov added inline comments.Nov 21 2018, 1:17 AM
lib/tsan/rtl/tsan_interceptors.cc
1057

I see. Yes, it's a bit messy because when pthread_join returns the pthread_t can be already reused for another thread.

You are right that concurrent pthread_tryjoin's are racy and bad (unless program know that they both won't succeed, but that would be pretty strange code).

Instead of clearing user_id, it would be better to set flag...

Isn't this also racy if pthread_tryjoin succeeds? When pthread_tryjoin succeeds user_id (pthread_t) becomes stale and can be reused. So another thread can already be joining a different thread, but it finds the same user_id and the "thread being joined" flag set and decides that it does a racy pthread_join and reports a bug, but it reality it is joining a completely unrelated thread that happened to have the same user_id.

test/tsan/Linux/thread_timedjoin.c
16

With a barrier one could test that we join the thread on at least second iteration (coverage for ThreadNotJoined), or that pthread_timedjoin_np fails and then a race on shared var with the thread is detected (failed pthread_timedjoin_np should not synchronize threads).

dvyukov accepted this revision.Nov 21 2018, 1:21 AM

Do you want me to merge this or you have commit access?

This revision is now accepted and ready to land.Nov 21 2018, 1:21 AM
yuri added a comment.Nov 21 2018, 1:22 AM

Do you want me to merge this or you have commit access?

Merge it, please