This is an archive of the discontinued LLVM Phabricator instance.

Differential D56238

[TSan] Support Objective-C @synchronized with tagged pointers
ClosedPublic

Authored by yln on Jan 2 2019, 5:17 PM.

Details

Reviewers

dcoughlin
kubamracek
dvyukov
delcypher

Commits

rGa6d29024edb0: [TSan] Support Objective-C @synchronized with tagged pointers
rCRT350556: [TSan] Support Objective-C @synchronized with tagged pointers
rL350556: [TSan] Support Objective-C @synchronized with tagged pointers

Summary

Objective-C employs tagged pointers: small objects/values may be encoded directly in the pointer bits. The resulting pointer is not backed by an allocation and does not point to valid memory. TSan infrastructure requires a valid address for Acquire/Release and Mutex{Lock/Unlock}.
This patch maps each encountered tagged pointer value to a valid address by creating a "dummy allocation" for it.
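
As a rough illustration of the mapping described in the summary, the sketch below checks the tag bits with the same mask the patch uses and hands out one stable dummy address per tagged value. `IsTaggedValue`, `SyncAddressTable`, and the use of `std::unordered_map` with heap-allocated bytes are assumptions made for this example only; the patch itself relies on the sanitizer runtime's `AddrHashMap` and `user_alloc`, and it deliberately never frees the allocation.

```cpp
#include <cstdint>
#include <memory>
#include <mutex>
#include <unordered_map>

// Same mask as the patch's IsTaggedObjCPointer: top bit or bottom bit set.
inline bool IsTaggedValue(std::uint64_t bits) {
  constexpr std::uint64_t kPossibleTaggedBits = 0x8000000000000001ull;
  return (bits & kPossibleTaggedBits) != 0;
}

// Hypothetical stand-in for the AddrHashMap + user_alloc pair in the patch.
class SyncAddressTable {
 public:
  // Returns an address that is backed by real memory. Untagged values are
  // returned unchanged; each tagged value gets its own 1-byte allocation.
  std::uint64_t AddressFor(std::uint64_t bits) {
    if (!IsTaggedValue(bits)) return bits;
    std::lock_guard<std::mutex> lock(mu_);
    std::unique_ptr<char>& slot = slots_[bits];
    if (!slot) slot.reset(new char(0));  // created once per tagged value
    // The byte lives on the heap, so its address stays stable even if the
    // hash map itself rehashes.
    return reinterpret_cast<std::uint64_t>(slot.get());
  }

 private:
  std::mutex mu_;
  std::unordered_map<std::uint64_t, std::unique_ptr<char>> slots_;
};
```

Calling `AddressFor` twice with the same tagged value yields the same address, which is the property the interceptors need in order to pair a lock with its unlock.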

Diff Detail

Repository: rL LLVM

Event Timeline

yln created this revision. Jan 2 2019, 5:17 PM

Herald added subscribers: Restricted Project, llvm-commits. Jan 2 2019, 5:17 PM

What are tagged pointers? When are they used? What is the actual value? How does synchronized handle them? Is there a man page or something with more info? It should be included in the comment for future generations.

Improve/extend comments.

dvyukov added inline comments. Jan 4 2019, 1:24 AM

lib/tsan/rtl/tsan_interceptors_mac.cc
302 ↗	(On Diff #180112)	I still wonder what synchronized itself does to lock them. Since the optimization is transparent, it suggests that these things still have reference identity rather than value identity. But I fail to see how we respect this reference identity. Consider two different objects that are small and converted to a tagged pointer with the same value (say, integer 1). Now we will use the same address for these 2 objects because they have the same value, so we think they are the same. Since we merge them we can get false deadlock reports and all kinds of bad stuff. But then I am confused about how synchronized distinguishes them.
319 ↗	(On Diff #180112)	It's safer/better to use a user allocation. Internal allocations may not be covered by shadow in future (not sure it has not in all configurations). And mutexes have limited functionality in non-app mem.

yln marked 2 inline comments as done. Jan 4 2019, 11:44 AM

yln added inline comments.

lib/tsan/rtl/tsan_interceptors_mac.cc
302 ↗	(On Diff #180112)	Ah, I think I now understand your question. Sorry for not communicating more clearly. The Obj-C runtime maintains a mapping: <pointer value -> lock>. So two pointers with the same value (pointer bits) will map to the same lock, regardless of whether they are backed by an allocation. So our implementation already replicates the behavior of the Obj-C runtime.
We considered reporting a blanket warning any time @synchronized is used with a tagged pointer, since it might have surprising behavior from the programmer's point of view. However, Devin made a convincing point that tagged pointers are only one special case of a larger issue here. In general, it is bad practice to lock on objects that you don't own/allocate/control the lifetime of. Think of a cache for small numbers, or interned string literals.
If you use @synchronized with, e.g., two numbers of the same value, with the expectation that they are different objects, then it doesn't matter if they are tagged pointers or two pointers pointing to the same object. In both cases your assumption is broken. So this would need a more general warning to make sure programmers explicitly manage locks.
Summary: @synchronized locks on pointer values. We already replicate this behavior (I hope ;)
319 ↗	(On Diff #180112)	With `user_alloc` TSan reports a race in the `objc-synchronize-tagged.mm` test. I do not completely understand what the cause is: the address/allocation is supposed to be racy by nature since it represents a mutex, which itself is used to establish synchronization. Do you have any insights here? Plain old `malloc` seems to work.
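
The pointer-value-to-lock mapping described in the reply to line 302 can be pictured with the sketch below. It is not the Obj-C runtime's implementation; `ValueKeyedLocks` and `LockFor` are hypothetical names, and `std::recursive_mutex` is chosen only because the interceptor models the lock as reentrant (`MutexFlagWriteReentrant`).

```cpp
#include <cstdint>
#include <map>
#include <mutex>

// Illustrative model of "lock on pointer value": the key is the raw pointer
// bits, so it does not matter whether those bits name a heap object or a
// tagged value.
class ValueKeyedLocks {
 public:
  std::recursive_mutex& LockFor(std::uint64_t pointer_bits) {
    std::lock_guard<std::mutex> guard(table_mu_);
    // std::map never relocates its elements, so the returned reference
    // remains valid while the table is alive.
    return locks_[pointer_bits];
  }

 private:
  std::mutex table_mu_;
  std::map<std::uint64_t, std::recursive_mutex> locks_;
};
```

Two values that the programmer thinks of as separate objects but that share the same bits (two tagged numbers equal to 7, or two references to one cached object) resolve to the same lock in this model, which is why merged deadlock reports reflect real behavior.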

dvyukov added inline comments. Jan 6 2019, 10:12 PM

lib/tsan/rtl/tsan_interceptors_mac.cc
302 ↗	(On Diff #180112)	I see. Thanks for the explanation. Is it that a user can't create new tagged pointers? But regardless, if this is what synchronized does, then I guess we can do the same. In some sense deadlock reports caused by merging will be true.
319 ↗	(On Diff #180112)	malloc and user_alloc should generally be the same. But I wonder if the malloc interceptor detects that it's called from a non-instrumented library and enables ignores so that it does not model access to the range. This can be checked by running with TSAN_OPTIONS=ignore_noninstrumented_modules=0.
Yes, I think a false report is possible if we use user memory without ignores. Consider a thread that creates a mutex, then passes it to 2 threads, and these 2 threads lock the mutex concurrently. In this case all proper synchronization is in place and the code is correct. However, what tsan sees is that the mutex is created by whichever of the 2 threads calls lock first. So it looks like one thread creates a mutex and another thread immediately tries to lock it without any synchronization.
Let's try to surround user_alloc with ThreadIgnoreBegin/End. Does it help?
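
The effect of surrounding the allocation with an ignore region can be modeled with a small toy, shown below. Everything in it is hypothetical: `ModelThread`, `ScopedIgnore`, `ModelAlloc`, and `AllocSyncByte` are invented for this sketch and only imitate the idea that accesses made while an ignore scope is active are not modeled. The real `ThreadIgnoreBegin`/`ThreadIgnoreEnd` take a `ThreadState *` and a `pc`, as the patch shows.

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical model of per-thread analysis state (not TSan's ThreadState).
struct ModelThread {
  int ignore_depth = 0;              // > 0 means "do not model accesses"
  std::vector<const void*> modeled;  // blocks the tool would reason about
};

// RAII wrapper so the begin/end pair cannot be left unbalanced.
class ScopedIgnore {
 public:
  explicit ScopedIgnore(ModelThread& t) : t_(t) { ++t_.ignore_depth; }
  ~ScopedIgnore() { --t_.ignore_depth; }
  ScopedIgnore(const ScopedIgnore&) = delete;
  ScopedIgnore& operator=(const ScopedIgnore&) = delete;

 private:
  ModelThread& t_;
};

// Stand-in for an intercepted allocation: the block is recorded as a
// modeled access unless the calling thread is inside an ignore scope.
inline void* ModelAlloc(ModelThread& t, std::size_t size) {
  void* p = std::malloc(size);
  if (p != nullptr && t.ignore_depth == 0) t.modeled.push_back(p);
  return p;
}

// Allocates the 1-byte sync address without attributing it to the thread
// that happened to lock first.
inline void* AllocSyncByte(ModelThread& t) {
  ScopedIgnore ignore(t);
  return ModelAlloc(t, 1);
}
```

In this model the byte returned by `AllocSyncByte` never appears in `modeled`, so a second thread that locks on it right away has nothing to race against.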

Use user_alloc surrounded by ThreadIgnoreBegin/End.

yln marked 6 inline comments as done. Jan 7 2019, 11:05 AM

yln added inline comments.

lib/tsan/rtl/tsan_interceptors_mac.cc
302 ↗	(On Diff #180112)	"Is it that a user can't create new tagged pointers?" Some APIs return tagged pointers and certain patterns create tagged pointers. So user code can create new tagged pointers, but should not depend on it (no guarantees), and ideally the programmer shouldn't even be aware of it (transparent optimization). We can say that @synchronized is a place where this optimization leaks its implementation details.
"But regardless, if this is what synchronized does, then I guess we can do the same. In some sense deadlock reports caused by merging will be true." Yes, that's our intention here.
319 ↗	(On Diff #180112)	Your analysis is correct. With `TSAN_OPTIONS=ignore_noninstrumented_modules=0`, `malloc` also reports a warning, and `ThreadIgnoreBegin/End` fixes the failing test. Good that we had that test. :) One last question: looking at the code I don't think so, but does the size of the dummy allocation matter, i.e., should it be something like `sizeof(pthread_mutex_t)`?

yln marked 2 inline comments as done. Jan 7 2019, 11:06 AM

dvyukov accepted this revision. Jan 7 2019, 11:10 AM

dvyukov added inline comments.

lib/tsan/rtl/tsan_interceptors_mac.cc
319 ↗	(On Diff #180112)	No, it doesn't have to be larger than 1 byte because a user mutex implementation can be just 1 byte (think of a simple spinlock). Tsan does not store any info inline (that would interfere with the mutex code itself); all info is stored on the side, so 1 byte is enough.
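
To make the "1 byte is enough" point concrete, here is a minimal sketch of a spinlock whose entire state is one atomic byte, together with a side table keyed by the lock's address. `OneByteSpinLock`, `SideInfo`, and `SideTable` are hypothetical names for this example; the side table only imitates the idea of keeping tool metadata outside the mutex and is not how TSan stores its state. The table itself is not thread-safe.

```cpp
#include <atomic>
#include <unordered_map>

// A lock whose only state is a single atomic byte. On mainstream
// implementations sizeof(OneByteSpinLock) is 1.
struct OneByteSpinLock {
  std::atomic<unsigned char> held{0};

  // Returns true if the lock was free and is now owned by the caller.
  bool TryLock() { return held.exchange(1, std::memory_order_acquire) == 0; }
  void Unlock() { held.store(0, std::memory_order_release); }
};

// Metadata a tool might keep about a mutex, stored outside the mutex.
struct SideInfo {
  int lock_count = 0;
};

// Side table keyed by mutex address, so the tool never writes into the
// user's mutex bytes.
class SideTable {
 public:
  SideInfo& For(const void* mutex_addr) { return info_[mutex_addr]; }

 private:
  std::unordered_map<const void*, SideInfo> info_;
};
```

Because the metadata is found by address alone, the size of the object at that address is irrelevant to the tool, which is why the dummy allocation does not need to match `sizeof(pthread_mutex_t)`.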

This revision is now accepted and ready to land. Jan 7 2019, 11:10 AM

yln marked 2 inline comments as done. Jan 7 2019, 11:15 AM

yln added inline comments.

lib/tsan/rtl/tsan_interceptors_mac.cc
319 ↗	(On Diff #180112)	Okay, got it. Thanks for the quick answers and great feedback, Dmitry!

Closed by commit rL350556: [TSan] Support Objective-C @synchronized with tagged pointers (authored by yln). Jan 7 2019, 11:23 AM

This revision was automatically updated to reflect the committed changes.

yln marked an inline comment as done.

Revision Contents

Path	Size
compiler-rt/trunk/lib/tsan/rtl/tsan_interceptors_mac.cc	52 lines
compiler-rt/trunk/test/tsan/Darwin/objc-synchronize-cycle-tagged.mm	3 lines

Diff 180531

compiler-rt/trunk/lib/tsan/rtl/tsan_interceptors_mac.cc

[13 unchanged lines not shown]

 #include "sanitizer_common/sanitizer_platform.h"
 #if SANITIZER_MAC

 #include "interception/interception.h"
 #include "tsan_interceptors.h"
 #include "tsan_interface.h"
 #include "tsan_interface_ann.h"
+#include "sanitizer_common/sanitizer_addrhashmap.h"

 #include <libkern/OSAtomic.h>
 #include <objc/objc-sync.h>

 #if defined(__has_include) && __has_include(<xpc/xpc.h>)
 #include <xpc/xpc.h>
 #endif // #if defined(__has_include) && __has_include(<xpc/xpc.h>)

[260 unchanged lines not shown]

 TSAN_INTERCEPTOR(void, xpc_connection_cancel, xpc_connection_t connection) {
   SCOPED_TSAN_INTERCEPTOR(xpc_connection_cancel, connection);
   Release(thr, pc, (uptr)connection);
   REAL(xpc_connection_cancel)(connection);
 }

 #endif // #if defined(__has_include) && __has_include(<xpc/xpc.h>)

-// Is the Obj-C object a tagged pointer (i.e. isn't really a valid pointer and
-// contains data in the pointers bits instead)?
-static bool IsTaggedObjCPointer(void *obj) {
+// Determines whether the Obj-C object pointer is a tagged pointer. Tagged
+// pointers encode the object data directly in their pointer bits and do not
+// have an associated memory allocation. The Obj-C runtime uses tagged pointers
+// to transparently optimize small objects.
+static bool IsTaggedObjCPointer(id obj) {
   const uptr kPossibleTaggedBits = 0x8000000000000001ull;
   return ((uptr)obj & kPossibleTaggedBits) != 0;
 }

-// Return an address on which we can synchronize (Acquire and Release) for a
-// Obj-C tagged pointer (which is not a valid pointer). Ideally should be a
-// derived address from 'obj', but for now just return the same global address.
-// TODO(kubamracek): Return different address for different pointers.
-static uptr SyncAddressForTaggedPointer(void *obj) {
-  (void)obj;
-  static u64 addr;
-  return (uptr)&addr;
-}
-
-// Address on which we can synchronize for an Objective-C object. Supports
-// tagged pointers.
-static uptr SyncAddressForObjCObject(void *obj) {
-  if (IsTaggedObjCPointer(obj)) return SyncAddressForTaggedPointer(obj);
+// Returns an address which can be used to inform TSan about synchronization
+// points (MutexLock/Unlock). The TSan infrastructure expects this to be a valid
+// address in the process space. We do a small allocation here to obtain a
+// stable address (the array backing the hash map can change). The memory is
+// never free'd (leaked) and allocation and locking are slow, but this code only
+// runs for @synchronized with tagged pointers, which is very rare.
+static uptr GetOrCreateSyncAddress(uptr addr, ThreadState *thr, uptr pc) {
+  typedef AddrHashMap<uptr, 5> Map;
+  static Map Addresses;
+  Map::Handle h(&Addresses, addr);
+  if (h.created()) {
+    ThreadIgnoreBegin(thr, pc);
+    *h = (uptr) user_alloc(thr, pc, /*size=*/1);
+    ThreadIgnoreEnd(thr, pc);
+  }
+  return *h;
+}
+
+// Returns an address on which we can synchronize given an Obj-C object pointer.
+// For normal object pointers, this is just the address of the object in memory.
+// Tagged pointers are not backed by an actual memory allocation, so we need to
+// synthesize a valid address.
+static uptr SyncAddressForObjCObject(id obj, ThreadState *thr, uptr pc) {
+  if (IsTaggedObjCPointer(obj))
+    return GetOrCreateSyncAddress((uptr)obj, thr, pc);
   return (uptr)obj;
 }

 TSAN_INTERCEPTOR(int, objc_sync_enter, id obj) {
   SCOPED_TSAN_INTERCEPTOR(objc_sync_enter, obj);
   if (!obj) return REAL(objc_sync_enter)(obj);
-  uptr addr = SyncAddressForObjCObject(obj);
+  uptr addr = SyncAddressForObjCObject(obj, thr, pc);
   MutexPreLock(thr, pc, addr, MutexFlagWriteReentrant);
   int result = REAL(objc_sync_enter)(obj);
   CHECK_EQ(result, OBJC_SYNC_SUCCESS);
   MutexPostLock(thr, pc, addr, MutexFlagWriteReentrant);
   return result;
 }

 TSAN_INTERCEPTOR(int, objc_sync_exit, id obj) {
   SCOPED_TSAN_INTERCEPTOR(objc_sync_exit, obj);
   if (!obj) return REAL(objc_sync_exit)(obj);
-  uptr addr = SyncAddressForObjCObject(obj);
+  uptr addr = SyncAddressForObjCObject(obj, thr, pc);
   MutexUnlock(thr, pc, addr);
   int result = REAL(objc_sync_exit)(obj);
   if (result != OBJC_SYNC_SUCCESS) MutexInvalidAccess(thr, pc, addr);
   return result;
 }

 // On macOS, libc++ is always linked dynamically, so intercepting works the
 // usual way.

[90 unchanged lines not shown]

compiler-rt/trunk/test/tsan/Darwin/objc-synchronize-cycle-tagged.mm

 // RUN: %clangxx_tsan %s -o %t -framework Foundation -fobjc-arc %darwin_min_target_with_full_runtime_arc_support
 // RUN: %run %t 6 2>&1 | FileCheck %s --check-prefix=SIX
 // RUN: not %run %t 7 2>&1 | FileCheck %s --check-prefix=SEVEN
-// XFAIL: *

 #import <Foundation/Foundation.h>

 static bool isTaggedPtr(id obj) {
   uintptr_t ptr = (uintptr_t) obj;
   return (ptr & 0x8000000000000001ull) != 0;
 }

 int main(int argc, char* argv[]) {
   assert(argc == 2);
-  int arg = atoi(argv[0]);
+  int arg = atoi(argv[1]);

   @autoreleasepool {
     NSObject* obj = [NSObject new];
     NSObject* num1 = @7;
     NSObject* num2 = [NSNumber numberWithInt:arg];

     assert(!isTaggedPtr(obj));
     assert(isTaggedPtr(num1) && isTaggedPtr(num2));

[20 unchanged lines not shown]
