This is an archive of the discontinued LLVM Phabricator instance.

tsan: Add pthread_tryjoin_np and pthread_timedjoin_np interceptors
Closed, Public

Authored by yuri on Nov 14 2018, 3:55 AM.

Details

Reviewers

dvyukov
kcc

Summary

Add pthread_tryjoin_np() and pthread_timedjoin_np() interceptors on Linux so that ThreadSanitizer can handle programs that use these functions.
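For readers unfamiliar with the two glibc extensions, the following is a minimal sketch of how a program might call them. It is not part of this revision: the helper names (join_by_polling, join_with_deadline, demo_join) are made up for illustration, and the code assumes Linux with glibc and linking with -pthread.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>
#include <time.h>

static int var; /* written by the worker, then by the joiner after the join */

static void *worker(void *arg) {
  (void)arg;
  var = 1;
  return NULL;
}

/* Hypothetical helper: polls with pthread_tryjoin_np until the thread has
   exited. Returns 0 on success or the unexpected error code. */
static int join_by_polling(pthread_t t) {
  for (;;) {
    int res = pthread_tryjoin_np(t, NULL);
    if (res == 0)
      return 0;
    if (res != EBUSY) /* EBUSY means "still running"; anything else is an error */
      return res;
    sched_yield();
  }
}

/* Hypothetical helper: joins with an absolute CLOCK_REALTIME deadline that
   lies `seconds` in the future. Returns ETIMEDOUT if the deadline passes. */
static int join_with_deadline(pthread_t t, int seconds) {
  struct timespec ts;
  clock_gettime(CLOCK_REALTIME, &ts);
  ts.tv_sec += seconds;
  return pthread_timedjoin_np(t, NULL, &ts);
}

/* Joins one thread by polling and one with a deadline. Each write to `var`
   in this function happens after a successful join, so it is ordered after
   the worker's write. Returns the final value of `var`, or -1 on error. */
int demo_join(void) {
  pthread_t t1, t2;
  if (pthread_create(&t1, NULL, worker, NULL) != 0)
    return -1;
  if (join_by_polling(t1) != 0)
    return -1;
  var = 2;
  if (pthread_create(&t2, NULL, worker, NULL) != 0)
    return -1;
  if (join_with_deadline(t2, 10) != 0)
    return -1;
  var = 3;
  return var;
}
```

Without interceptors for these two calls, ThreadSanitizer would have no way to learn that the join happened, which is the gap the summary describes.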

Diff Detail

Repository: rCRT Compiler Runtime

Event Timeline

yuri created this revision. Nov 14 2018, 3:55 AM

Herald added subscribers: Restricted Project, llvm-commits, jfb, kubamracek. Nov 14 2018, 3:55 AM

We need tests for these functions in compiler-rt/test/tsan/

dvyukov added inline comments.

lib/tsan/rtl/tsan_interceptors.cc
1054	Please remove the {}; the tsan codebase does not use {} around single-statement if/for.
1066	Please remove the {}; the tsan codebase does not use {} around single-statement if/for.

dvyukov added inline comments. Nov 15 2018, 9:44 AM

lib/tsan/rtl/tsan_interceptors.cc
1054	OK, I see somebody did this for pthread_join too. Then you can leave this as is for consistency.

Added tests and fixed the code; now it really works.

dvyukov added inline comments. Nov 20 2018, 8:39 PM

lib/tsan/rtl/tsan_interceptors.cc
1057	Why is this necessary? We've already set it in CreateThread. Also, this looks racy and can CHECK-fail. Consider the case where this pthread_tryjoin fails, but another thread concurrently executes pthread_tryjoin, succeeds, and the thread is destroyed.
test/tsan/Linux/thread_timedjoin.c
16	Why do we need a barrier in this test? It looks like it's deterministic without the barrier.
23	Let's also check that pthread_timedjoin_np provides necessary synchronization. It's easy to add to this test and that's an important guarantee of joining a thread. I.e. create a global var, write to it in the thread function and write in main after join. It must not race. That will also check that pthread_timedjoin_np did not return some other error.
test/tsan/Linux/thread_tryjoin.c
16	The same applies here.

yuri added inline comments. Nov 21 2018, 12:01 AM

lib/tsan/rtl/tsan_interceptors.cc
1057	This is needed because ThreadTid()/FindThreadByUid() clears user_id. This behaviour can be traced back to commit https://github.com/llvm-mirror/compiler-rt/commit/411b2c9d6787b64939fc15fdeec65e9d65ba1a51. Concurrent calls of pthread_tryjoin() or any other join function are racy, because you can end up with an already joined thread and access freed memory. The problem with the current ThreadSanitizer code is that, for such a racy application, it fails a CHECK instead of reporting a clear error. I personally do not like this implementation. Instead of clearing user_id, it would be better to set a "thread being joined" flag and store the stack context of the calling thread. If someone later attempts to join the same thread, report an error. In that case ThreadNotJoined() would clear this flag.
test/tsan/Linux/thread_timedjoin.c
16	I just copied it from the pthread_detach test. Actually, the barrier can be useful in this test if it is moved further down. I will update the tests.
23	OK
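The lookup-then-restore behaviour described in the comment on line 1057 can be illustrated with a toy model. This is only a sketch under stated assumptions: ToyRegistry and the toy_* functions are hypothetical names, not the sanitizer ThreadRegistry API, and the model ignores locking and thread states other than alive/dead.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model only: hypothetical names, not the sanitizer API. */
enum { kToyMaxThreads = 8 };

typedef struct {
  uintptr_t user_id[kToyMaxThreads]; /* 0 means "no user id attached" */
  int alive[kToyMaxThreads];
} ToyRegistry;

/* Registers a thread and attaches its pthread_t-like user id. */
static void toy_create(ToyRegistry *r, int tid, uintptr_t uid) {
  assert(tid > 0 && tid < kToyMaxThreads);
  r->user_id[tid] = uid;
  r->alive[tid] = 1;
}

/* Looks a thread up by user id and clears the id as a side effect, so a
   later thread that reuses the same value is not confused with this one.
   Returns -1 if no live thread carries that id. */
static int toy_tid_by_uid(ToyRegistry *r, uintptr_t uid) {
  for (int tid = 1; tid < kToyMaxThreads; tid++) {
    if (r->alive[tid] && r->user_id[tid] == uid) {
      r->user_id[tid] = 0;
      return tid;
    }
  }
  return -1;
}

/* Failed join: re-attach the user id that the lookup cleared. This is the
   role ThreadNotJoined() plays in the discussion above. */
static void toy_not_joined(ToyRegistry *r, int tid, uintptr_t uid) {
  assert(tid > 0 && tid < kToyMaxThreads);
  assert(r->alive[tid]);
  assert(r->user_id[tid] == 0);
  r->user_id[tid] = uid;
}

/* Successful join: the thread is gone. */
static void toy_joined(ToyRegistry *r, int tid) {
  assert(tid > 0 && tid < kToyMaxThreads);
  r->alive[tid] = 0;
}

/* Simulates one join attempt whose real result is `real_result`
   (0 means joined, nonzero means e.g. EBUSY or ETIMEDOUT).
   Returns the tid that was found, or -1 if the lookup failed. */
static int toy_try_join(ToyRegistry *r, uintptr_t uid, int real_result) {
  int tid = toy_tid_by_uid(r, uid);
  if (tid < 0)
    return -1;
  if (real_result == 0)
    toy_joined(r, tid);
  else
    toy_not_joined(r, tid, uid);
  return tid;
}
```

In this model a failed attempt followed by a retry finds the same tid again only because toy_not_joined() put the id back; calling toy_tid_by_uid() twice in a row without the restore returns -1 the second time.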

Improved tests

dvyukov added inline comments. Nov 21 2018, 1:17 AM

lib/tsan/rtl/tsan_interceptors.cc
1057	I see. Yes, it's a bit messy because when pthread_join returns, the pthread_t can already be reused for another thread. You are right that concurrent pthread_tryjoin calls are racy and bad (unless the program knows that they won't both succeed, but that would be pretty strange code). "Instead of clearing user_id, it would be better to set flag..." Isn't this also racy if pthread_tryjoin succeeds? When pthread_tryjoin succeeds, user_id (pthread_t) becomes stale and can be reused. So another thread can already be joining a different thread, but it finds the same user_id with the "thread being joined" flag set, decides that it is doing a racy pthread_join, and reports a bug, while in reality it is joining a completely unrelated thread that happened to have the same user_id.
test/tsan/Linux/thread_timedjoin.c
16	With a barrier one could test that we join the thread on at least the second iteration (coverage for ThreadNotJoined), or that pthread_timedjoin_np fails and then a race on a shared var with the thread is detected (a failed pthread_timedjoin_np should not synchronize threads).
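The false positive raised in the comment on line 1057 can be replayed in a small single-threaded toy model. This is a sketch of the objection only, not of any tsan code: ToyThread, toy_begin_join, toy_finish_join and toy_replay_reuse are hypothetical names, and the "being joined" flag is the proposal under discussion, not something this revision implements.

```c
#include <stdint.h>

/* Toy model only: hypothetical names, not the sanitizer API. */
typedef struct {
  uintptr_t user_id; /* pthread_t-like value */
  int being_joined;  /* the proposed "thread being joined" flag */
  int alive;
} ToyThread;

enum { kToyOk = 0, kToyConcurrentJoin = 1, kToyUnknown = 2 };

/* Proposed lookup: keep user_id, set the flag, and report an error if the
   flag is already set. Contexts are scanned in creation order, so an older
   context with the same id shadows a newer one. */
static int toy_begin_join(ToyThread *threads, int n, uintptr_t uid,
                          int *tid_out) {
  for (int i = 0; i < n; i++) {
    if (threads[i].alive && threads[i].user_id == uid) {
      if (threads[i].being_joined)
        return kToyConcurrentJoin;
      threads[i].being_joined = 1;
      *tid_out = i;
      return kToyOk;
    }
  }
  return kToyUnknown;
}

/* Bookkeeping after the real join has returned successfully. */
static void toy_finish_join(ToyThread *threads, int tid) {
  threads[tid].being_joined = 0;
  threads[tid].alive = 0;
}

/* Replays the interleaving from the review comment and returns what the
   second, legitimate joiner is told. */
int toy_replay_reuse(void) {
  ToyThread threads[2] = {{0x1000, 0, 1}, {0, 0, 0}};
  int tid_a = -1, tid_b = -1;

  /* Joiner A starts joining the first thread; the flag is now set. */
  if (toy_begin_join(threads, 2, 0x1000, &tid_a) != kToyOk)
    return -1;

  /* The real join has already succeeded, but A has not yet done its
     bookkeeping. Meanwhile the same pthread_t value is reused for a
     newly created, unrelated thread. */
  threads[1] = (ToyThread){0x1000, 0, 1};

  /* Joiner B joins the new thread, but the lookup hits the stale context. */
  int res_b = toy_begin_join(threads, 2, 0x1000, &tid_b);

  toy_finish_join(threads, tid_a);
  return res_b;
}
```

In this model toy_replay_reuse() returns kToyConcurrentJoin even though the two joiners target different threads, which is the bogus report the comment warns about.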

Do you want me to merge this, or do you have commit access?

This revision is now accepted and ready to land. Nov 21 2018, 1:21 AM

In D54521#1305057, @dvyukov wrote:

Do you want me to merge this, or do you have commit access?

Merge it, please

Committed in r347383
http://llvm.org/viewvc/llvm-project?view=revision&revision=347383

Thanks

Revision Contents

Path                                                 Size
lib/sanitizer_common/sanitizer_thread_registry.h     1 line
lib/sanitizer_common/sanitizer_thread_registry.cc    11 lines
lib/tsan/rtl/tsan_interceptors.cc                    33 lines
lib/tsan/rtl/tsan_rtl.h                              1 line
lib/tsan/rtl/tsan_rtl_thread.cc                      6 lines
test/tsan/Linux/thread_timedjoin.c                   39 lines
test/tsan/Linux/thread_tryjoin.c                     41 lines

Diff 174882

lib/sanitizer_common/sanitizer_thread_registry.h

public:
ThreadContextBase *FindThreadContextByOsIDLocked(tid_t os_id);		ThreadContextBase *FindThreadContextByOsIDLocked(tid_t os_id);

void SetThreadName(u32 tid, const char *name);		void SetThreadName(u32 tid, const char *name);
void SetThreadNameByUserId(uptr user_id, const char *name);		void SetThreadNameByUserId(uptr user_id, const char *name);
void DetachThread(u32 tid, void *arg);		void DetachThread(u32 tid, void *arg);
void JoinThread(u32 tid, void *arg);		void JoinThread(u32 tid, void *arg);
void FinishThread(u32 tid);		void FinishThread(u32 tid);
void StartThread(u32 tid, tid_t os_id, bool workerthread, void *arg);		void StartThread(u32 tid, tid_t os_id, bool workerthread, void *arg);
		void SetThreadUserId(u32 tid, uptr user_id);

private:		private:
const ThreadContextFactory context_factory_;		const ThreadContextFactory context_factory_;
const u32 max_threads_;		const u32 max_threads_;
const u32 thread_quarantine_size_;		const u32 thread_quarantine_size_;
const u32 max_reuse_;		const u32 max_reuse_;

BlockingMutex mtx_;		BlockingMutex mtx_;

lib/sanitizer_common/sanitizer_thread_registry.cc

	ThreadContextBase *ThreadRegistry::QuarantinePop() {			ThreadContextBase *ThreadRegistry::QuarantinePop() {
	if (invalid_threads_.size() == 0)			if (invalid_threads_.size() == 0)
	return 0;			return 0;
	ThreadContextBase *tctx = invalid_threads_.front();			ThreadContextBase *tctx = invalid_threads_.front();
	invalid_threads_.pop_front();			invalid_threads_.pop_front();
	return tctx;			return tctx;
	}			}

				void ThreadRegistry::SetThreadUserId(u32 tid, uptr user_id) {
				BlockingMutexLock l(&mtx_);
				CHECK_LT(tid, n_contexts_);
				ThreadContextBase *tctx = threads_[tid];
				CHECK_NE(tctx, 0);
				CHECK_NE(tctx->status, ThreadStatusInvalid);
				CHECK_NE(tctx->status, ThreadStatusDead);
				CHECK_EQ(tctx->user_id, 0);
				tctx->user_id = user_id;
				}

	} // namespace __sanitizer			} // namespace __sanitizer

lib/tsan/rtl/tsan_interceptors.cc

TSAN_INTERCEPTOR(int, pthread_detach, void *th) {
int tid = ThreadTid(thr, pc, (uptr)th);		int tid = ThreadTid(thr, pc, (uptr)th);
int res = REAL(pthread_detach)(th);		int res = REAL(pthread_detach)(th);
if (res == 0) {		if (res == 0) {
ThreadDetach(thr, pc, tid);		ThreadDetach(thr, pc, tid);
}		}
return res;		return res;
}		}

		#if SANITIZER_LINUX
		TSAN_INTERCEPTOR(int, pthread_tryjoin_np, void *th, void **ret) {
		SCOPED_TSAN_INTERCEPTOR(pthread_tryjoin_np, th, ret);
		int tid = ThreadTid(thr, pc, (uptr)th);
		ThreadIgnoreBegin(thr, pc);
		int res = REAL(pthread_tryjoin_np)(th, ret);
		ThreadIgnoreEnd(thr, pc);
		if (res == 0)
		ThreadJoin(thr, pc, tid);
		else
		ThreadNotJoined(thr, pc, tid, (uptr)th);
		return res;
		}

		TSAN_INTERCEPTOR(int, pthread_timedjoin_np, void *th, void **ret,
		const struct timespec *abstime) {
		SCOPED_TSAN_INTERCEPTOR(pthread_timedjoin_np, th, ret, abstime);
		int tid = ThreadTid(thr, pc, (uptr)th);
		ThreadIgnoreBegin(thr, pc);
		int res = BLOCK_REAL(pthread_timedjoin_np)(th, ret, abstime);
		ThreadIgnoreEnd(thr, pc);
		if (res == 0)
		ThreadJoin(thr, pc, tid);
		else
		ThreadNotJoined(thr, pc, tid, (uptr)th);
		return res;
		}
		#endif

// Problem:		// Problem:
// NPTL implementation of pthread_cond has 2 versions (2.2.5 and 2.3.2).		// NPTL implementation of pthread_cond has 2 versions (2.2.5 and 2.3.2).
// pthread_cond_t has different size in the different versions.		// pthread_cond_t has different size in the different versions.
// If call new REAL functions for old pthread_cond_t, they will corrupt memory		// If call new REAL functions for old pthread_cond_t, they will corrupt memory
// after pthread_cond_t (old cond is smaller).		// after pthread_cond_t (old cond is smaller).
// If we call old REAL functions for new pthread_cond_t, we will lose some		// If we call old REAL functions for new pthread_cond_t, we will lose some
// functionality (e.g. old functions do not support waiting against		// functionality (e.g. old functions do not support waiting against
// CLOCK_REALTIME).		// CLOCK_REALTIME).

#endif

TSAN_INTERCEPT(strcpy); // NOLINT		TSAN_INTERCEPT(strcpy); // NOLINT
TSAN_INTERCEPT(strncpy);		TSAN_INTERCEPT(strncpy);
TSAN_INTERCEPT(strdup);		TSAN_INTERCEPT(strdup);

TSAN_INTERCEPT(pthread_create);		TSAN_INTERCEPT(pthread_create);
TSAN_INTERCEPT(pthread_join);		TSAN_INTERCEPT(pthread_join);
TSAN_INTERCEPT(pthread_detach);		TSAN_INTERCEPT(pthread_detach);
		#if SANITIZER_LINUX
		TSAN_INTERCEPT(pthread_tryjoin_np);
		TSAN_INTERCEPT(pthread_timedjoin_np);
		#endif

TSAN_INTERCEPT_VER(pthread_cond_init, PTHREAD_ABI_BASE);		TSAN_INTERCEPT_VER(pthread_cond_init, PTHREAD_ABI_BASE);
TSAN_INTERCEPT_VER(pthread_cond_signal, PTHREAD_ABI_BASE);		TSAN_INTERCEPT_VER(pthread_cond_signal, PTHREAD_ABI_BASE);
TSAN_INTERCEPT_VER(pthread_cond_broadcast, PTHREAD_ABI_BASE);		TSAN_INTERCEPT_VER(pthread_cond_broadcast, PTHREAD_ABI_BASE);
TSAN_INTERCEPT_VER(pthread_cond_wait, PTHREAD_ABI_BASE);		TSAN_INTERCEPT_VER(pthread_cond_wait, PTHREAD_ABI_BASE);
TSAN_INTERCEPT_VER(pthread_cond_timedwait, PTHREAD_ABI_BASE);		TSAN_INTERCEPT_VER(pthread_cond_timedwait, PTHREAD_ABI_BASE);
TSAN_INTERCEPT_VER(pthread_cond_destroy, PTHREAD_ABI_BASE);		TSAN_INTERCEPT_VER(pthread_cond_destroy, PTHREAD_ABI_BASE);


lib/tsan/rtl/tsan_rtl.h

	void ThreadFinish(ThreadState *thr);			void ThreadFinish(ThreadState *thr);
	int ThreadTid(ThreadState *thr, uptr pc, uptr uid);			int ThreadTid(ThreadState *thr, uptr pc, uptr uid);
	void ThreadJoin(ThreadState *thr, uptr pc, int tid);			void ThreadJoin(ThreadState *thr, uptr pc, int tid);
	void ThreadDetach(ThreadState *thr, uptr pc, int tid);			void ThreadDetach(ThreadState *thr, uptr pc, int tid);
	void ThreadFinalize(ThreadState *thr);			void ThreadFinalize(ThreadState *thr);
	void ThreadSetName(ThreadState *thr, const char *name);			void ThreadSetName(ThreadState *thr, const char *name);
	int ThreadCount(ThreadState *thr);			int ThreadCount(ThreadState *thr);
	void ProcessPendingSignals(ThreadState *thr);			void ProcessPendingSignals(ThreadState *thr);
				void ThreadNotJoined(ThreadState *thr, uptr pc, int tid, uptr uid);

	Processor *ProcCreate();			Processor *ProcCreate();
	void ProcDestroy(Processor *proc);			void ProcDestroy(Processor *proc);
	void ProcWire(Processor *proc, ThreadState *thr);			void ProcWire(Processor *proc, ThreadState *thr);
	void ProcUnwire(Processor *proc, ThreadState *thr);			void ProcUnwire(Processor *proc, ThreadState *thr);

	// Note: the parameter is called flagz, because flags is already taken			// Note: the parameter is called flagz, because flags is already taken
	// by the global function that returns flags.			// by the global function that returns flags.

lib/tsan/rtl/tsan_rtl_thread.cc

	}			}

	void ThreadDetach(ThreadState *thr, uptr pc, int tid) {			void ThreadDetach(ThreadState *thr, uptr pc, int tid) {
	CHECK_GT(tid, 0);			CHECK_GT(tid, 0);
	CHECK_LT(tid, kMaxTid);			CHECK_LT(tid, kMaxTid);
	ctx->thread_registry->DetachThread(tid, thr);			ctx->thread_registry->DetachThread(tid, thr);
	}			}

				void ThreadNotJoined(ThreadState *thr, uptr pc, int tid, uptr uid) {
				CHECK_GT(tid, 0);
				CHECK_LT(tid, kMaxTid);
				ctx->thread_registry->SetThreadUserId(tid, uid);
				}

	void ThreadSetName(ThreadState *thr, const char *name) {			void ThreadSetName(ThreadState *thr, const char *name) {
	ctx->thread_registry->SetThreadName(thr->tid, name);			ctx->thread_registry->SetThreadName(thr->tid, name);
	}			}

	void MemoryAccessRange(ThreadState *thr, uptr pc, uptr addr,			void MemoryAccessRange(ThreadState *thr, uptr pc, uptr addr,
	uptr size, bool is_write) {			uptr size, bool is_write) {
	if (size == 0)			if (size == 0)
	return;			return;

test/tsan/Linux/thread_timedjoin.c

This file was added.

				// RUN: %clang_tsan -O1 %s -o %t && %run %t 2>&1 | FileCheck %s
				#define _GNU_SOURCE
				#include "../test.h"
				#include <errno.h>

				int var;

				void *Thread(void *x) {
				barrier_wait(&barrier);
				var = 1;
				return 0;
				}

				static void check(int res, int expect) {
				if (res != expect) {
				fprintf(stderr, "Unexpected result of pthread_timedjoin_np: %d\n", res);
				exit(1);
				}
				}

				int main() {
				barrier_init(&barrier, 2);
				pthread_t t;
				pthread_create(&t, 0, Thread, 0);
				struct timespec ts;
				clock_gettime(CLOCK_REALTIME, &ts);
				check(pthread_timedjoin_np(t, 0, &ts), ETIMEDOUT);
				barrier_wait(&barrier);
				clock_gettime(CLOCK_REALTIME, &ts);
				ts.tv_sec += 10000;
				check(pthread_timedjoin_np(t, 0, &ts), 0);
				var = 2;
				fprintf(stderr, "PASS\n");
				return 0;
				}

				// CHECK-NOT: WARNING: ThreadSanitizer: data race
				// CHECK-NOT: WARNING: ThreadSanitizer: thread leak
				// CHECK: PASS

test/tsan/Linux/thread_tryjoin.c

This file was added.

				// RUN: %clang_tsan -O1 %s -o %t && %run %t 2>&1 | FileCheck %s
				#define _GNU_SOURCE
				#include "../test.h"
				#include <errno.h>

				int var;

				void *Thread(void *x) {
				barrier_wait(&barrier);
				var = 1;
				return 0;
				}

				static void check(int res) {
				if (res != EBUSY) {
				fprintf(stderr, "Unexpected result of pthread_tryjoin_np: %d\n", res);
				exit(1);
				}
				}

				int main() {
				barrier_init(&barrier, 2);
				pthread_t t;
				pthread_create(&t, 0, Thread, 0);
				check(pthread_tryjoin_np(t, 0));
				barrier_wait(&barrier);
				for (;;) {
				int res = pthread_tryjoin_np(t, 0);
				if (!res)
				break;
				check(res);
				pthread_yield();
				}
				var = 2;
				fprintf(stderr, "PASS\n");
				return 0;
				}

				// CHECK-NOT: WARNING: ThreadSanitizer: data race
				// CHECK-NOT: WARNING: ThreadSanitizer: thread leak
				// CHECK: PASS