This is an archive of the discontinued LLVM Phabricator instance.

[PATCH] [compiler-rt] [sanitizers] Add VMA size check at runtime
ClosedPublic

Authored by zatrazz on Sep 10 2015, 8:03 AM.

Details

Reviewers

kcc
rengolin
dvyukov
eugenis
pcc

Summary

This patch adds a runtime check to asan, dfsan, msan, and tsan for
architectures that support multiple VMA sizes (such as aarch64). Currently
the check only prints a warning that shows the VMA size the runtime was
built for next to the one detected at runtime.

Diff Detail

Event Timeline

zatrazz updated this revision to Diff 34447. Sep 10 2015, 8:03 AM

zatrazz retitled this revision to [PATCH] [compiler-rt] [sanitizers] Add VMA size check at runtime.

zatrazz updated this object.

zatrazz added reviewers: kcc, dvyukov, eugenis, pcc, rengolin.

zatrazz added a subscriber: llvm-commits.

Herald added a subscriber: aemerson. Sep 10 2015, 8:03 AM

rengolin added inline comments. Sep 10 2015, 8:57 AM

lib/sanitizer_common/sanitizer_posix.cc
336	I was expecting this to print things like 39/42, not the max size.

zatrazz added inline comments. Sep 10 2015, 9:57 AM

lib/sanitizer_common/sanitizer_posix.cc
336	I tried to be as unintrusive as possible, since only GetMaxVirtualAddress is really defined in a multiplatform way. However, if the idea is to show 39/42, I think we can create a symbol to return it as well.

rengolin added inline comments. Sep 10 2015, 10:02 AM

lib/sanitizer_common/sanitizer_posix.cc
336	I think the max size will be very uninformative. Maybe it would be good to have that CLZ function in the sanitizers after all.
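To illustrate why a bit count such as 39 or 42 reads better than a raw maximum address, here is a minimal sketch of the kind of CLZ-based helper under discussion. The name AddressBits is hypothetical, and __builtin_clzll is a GCC/Clang builtin rather than anything from the patch; the diff itself uses MostSignificantSetBitIndex.

```cpp
#include <cstdint>

// Hypothetical helper (not part of compiler-rt): the number of bits needed to
// represent `addr`, i.e. the index of the most significant set bit plus one.
// __builtin_clzll is undefined for 0, so zero is handled explicitly.
inline unsigned AddressBits(std::uint64_t addr) {
  if (addr == 0) return 0;
  return 64u - static_cast<unsigned>(__builtin_clzll(addr));
}
```

With this helper, a maximum address of 0x7fffffffff is reported as 39 bits and 0x3ffffffffff as 42 bits.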

Updated the patch to print bits instead of the total VMA range.

I would have assumed it would be relatively straightforward to support multiple virtual-memory address space sizes in a single compiler-rt library, by adding name mangling to the public functions in the library that need to behave differently for different VMA sizes. E.g. couldn't a name mangling scheme adding "_vma39/_vma42/_vma48" be used to get all the functions depending on VMA size into a single compiler-rt library?
Building different compiler-rt libraries just for this difference seems inconvenient for users. Furthermore, if for some reason another difference requires slightly different implementations in the sanitizers or in other functionality in compiler-rt, are we going to build the Cartesian product of all variants?

I haven't looked into the details of how the sanitizers are implemented, but I'm assuming there's a very good reason why the shadow regions need to be placed in an area depending on the VMA size?
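The suffix scheme proposed above could look roughly like the following sketch. Every name here (MemToShadow_vma39 and friends, SelectMemToShadow) and every offset is a made-up placeholder chosen for illustration; none of it is taken from compiler-rt, and 64-bit addresses are assumed.

```cpp
#include <cstdint>

using MemToShadowFn = std::uint64_t (*)(std::uint64_t);

// One variant per supported VMA size, distinguished by a name suffix.
// The offsets are illustrative placeholders, not real sanitizer constants.
inline std::uint64_t MemToShadow_vma39(std::uint64_t addr) {
  return (addr >> 3) + (std::uint64_t{1} << 36);
}
inline std::uint64_t MemToShadow_vma42(std::uint64_t addr) {
  return (addr >> 3) + (std::uint64_t{1} << 39);
}
inline std::uint64_t MemToShadow_vma48(std::uint64_t addr) {
  return (addr >> 3) + (std::uint64_t{1} << 45);
}

// Pick the variant that matches the VMA size detected at startup.
// Returns nullptr when the size is not one of the supported ones.
inline MemToShadowFn SelectMemToShadow(unsigned vma_bits) {
  switch (vma_bits) {
    case 39: return MemToShadow_vma39;
    case 42: return MemToShadow_vma42;
    case 48: return MemToShadow_vma48;
    default: return nullptr;
  }
}
```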

Otherwise LGTM

lib/sanitizer_common/sanitizer_posix.cc
335	Add tool name in the beginning. There is a function SanitizerToolName() or something like that. We generally prefix all output with tool name, because sometimes people run third_party programs or very large programs, and they may not know that this output comes from the tool.
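A minimal sketch of the requested prefix, assuming the tool name is available as a C string (the diff below shows SanitizerToolName being assigned as a variable in msan and tsan). FormatVMAWarning and its message wording are hypothetical; the real runtime prints through its own Printf rather than std::string.

```cpp
#include <string>

// Hypothetical formatter: puts the tool name in front of the warning so that
// users of large or third-party programs can tell where the output came from.
inline std::string FormatVMAWarning(const char *tool_name, unsigned built_bits,
                                    unsigned runtime_bits) {
  return std::string("WARNING: ") + tool_name + ": built VMA is " +
         std::to_string(built_bits) + " bits, runtime VMA is " +
         std::to_string(runtime_bits) + " bits";
}
```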

In D12763#244128, @kristof.beyls wrote:

I would have assumed it would be relatively straightforward to support multiple virtual-memory address space sizes in a single compiler-rt library, by adding name mangling to the public functions in the library that need to behave differently for different VMA sizes. E.g. couldn't a name mangling scheme adding "_vma39/_vma42/_vma48" be used to get all the functions depending on VMA size into a single compiler-rt library?

Hi Kristof,

The main issue here is the run-time impact of these decisions and how that shaped the development of the sanitizers, and that's a topic not suited to a small patch like this. :)

It should be possible to come up with a clean design to make run-time decisions cheaper and more elegant, and we're discussing this on IRC, on the mailing list, and privately. There will be a session at Connect to discuss this with the ARM GCC team, and if you guys are interested, I'll send you the invite.

But even so, none of that will give us any final direction; it'll be just an informal talk. The real discussion will happen on the list, and hopefully also at the US LLVM meeting, so we can come up with a final design that doesn't regress performance and gets us our run-time VMA choice.

The sanitizers were developed with the mindset that there is only one memory map per architecture and that's hard-coded into the pre-processor macros. We'll need a bunch of small refactorings before we can start thinking about name mangling. For now, we need some mechanism that works.

cheers,
--renato
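The difference between a memory map fixed by the preprocessor and one chosen at run time can be sketched as follows. The macro EXAMPLE_VMA_BITS, the shift, the offsets, and all function names are hypothetical and serve only to contrast the two styles; they are not the sanitizers' actual layout.

```cpp
#include <cstdint>

// Compile-time style: the layout is fixed by a preprocessor macro, so the
// compiler can fold the offset into a constant at every use.
#ifndef EXAMPLE_VMA_BITS
#define EXAMPLE_VMA_BITS 39
#endif
constexpr std::uint64_t kShadowOffset = std::uint64_t{1}
                                        << (EXAMPLE_VMA_BITS - 3);

constexpr std::uint64_t MemToShadowStatic(std::uint64_t addr) {
  return (addr >> 3) + kShadowOffset;
}

// Run-time style: the offset lives in a variable set once at startup, so each
// mapping has to load it before it can compute the shadow address.
static std::uint64_t g_shadow_offset = 0;

inline void InitShadowOffset(unsigned vma_bits) {
  g_shadow_offset = std::uint64_t{1} << (vma_bits - 3);
}

inline std::uint64_t MemToShadowDynamic(std::uint64_t addr) {
  return (addr >> 3) + g_shadow_offset;
}
```

Both versions compute the same mapping when the sizes agree; the run-time version trades the folded constant for a variable that a single binary can set per machine.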

With Dmitry's comments, LGTM, too.

This revision is now accepted and ready to land. Sep 11 2015, 1:18 AM

I don't see why name mangling would add run-time overhead - but it would avoid having to build different variants of compiler-rt for AArch64.
If name mangling does add run-time overhead, could you explain why?

In D12763#244146, @kristof.beyls wrote:

I don't see why name mangling would add run-time overhead - but it would avoid having to build different variants of compiler-rt for AArch64.
If name mangling does add run-time overhead, could you explain why?

I'd rather discuss this outside of this patch, as it is just a safety check, not a decision making change. Feel free to send an email to the list if you want to start the discussion now.

Fair enough - I hadn't noticed we're already requiring different compiler-rt builds for different variants of AArch64-linux.

zatrazz closed this revision. Sep 11 2015, 7:02 AM

Revision Contents

Path                                       Size
lib/asan/asan_rtl.cc                       1 line
lib/dfsan/dfsan.cc                         2 lines
lib/msan/msan.cc                           2 lines
lib/sanitizer_common/sanitizer_common.h    2 lines
lib/sanitizer_common/sanitizer_posix.cc    15 lines
lib/sanitizer_common/sanitizer_win.cc      4 lines
lib/tsan/rtl/tsan_rtl.cc                   3 lines

Diff 34484

lib/asan/asan_rtl.cc

 void NOINLINE __asan_set_death_callback(void (*callback)(void)) {
   SetUserDieCallback(callback);
 }

 // Initialize as requested from instrumented application code.
 // We use this call as a trigger to wake up ASan from deactivated state.
 void __asan_init() {
+  CheckVMASize();
   AsanActivate();
   AsanInitInternal();
 }

 void __asan_version_mismatch_check() {
   // Do nothing.
 }

lib/dfsan/dfsan.cc

   if (internal_strcmp(flags().dump_labels_at_exit, "") != 0) {
     Report("INFO: DataFlowSanitizer: dumping labels to %s\n",
            flags().dump_labels_at_exit);
     dfsan_dump_labels(fd);
     CloseFile(fd);
   }
 }

 static void dfsan_init(int argc, char **argv, char **envp) {
+  CheckVMASize();
+
   MmapFixedNoReserve(kShadowAddr, kUnusedAddr - kShadowAddr);

   // Protect the region of memory we don't use, to preserve the one-to-one
   // mapping from application to shadow memory. But if ASLR is disabled, Linux
   // will load our executable in the middle of our unused region. This mostly
   // works so long as the program doesn't use too much memory. We support this
   // case by disabling memory protection when ASLR is disabled.
   uptr init_addr = (uptr)&dfsan_init;

lib/msan/msan.cc

 }

 void __msan_init() {
   CHECK(!msan_init_is_running);
   if (msan_inited) return;
   msan_init_is_running = 1;
   SanitizerToolName = "MemorySanitizer";

+  CheckVMASize();
+
   InitTlsSize();

   CacheBinaryName();
   InitializeFlags();
   __sanitizer_set_report_path(common_flags()->log_path);

   InitializeInterceptors();
   InstallAtExitHandler(); // Needs __cxa_atexit interceptor.

lib/sanitizer_common/sanitizer_common.h

 // Used to check if we can map shadow memory to a fixed location.
 bool MemoryRangeIsAvailable(uptr range_start, uptr range_end);
 void FlushUnneededShadowMemory(uptr addr, uptr size);
 void IncreaseTotalMmap(uptr size);
 void DecreaseTotalMmap(uptr size);
 uptr GetRSS();
 void NoHugePagesInRegion(uptr addr, uptr length);
 void DontDumpShadowMemory(uptr addr, uptr length);
+// Check if the built VMA size matches the runtime one.
+void CheckVMASize();

 // InternalScopedBuffer can be used instead of large stack arrays to
 // keep frame size low.
 // FIXME: use InternalAlloc instead of MmapOrDie once
 // InternalAlloc is made libc-free.
 template<typename T>
 class InternalScopedBuffer {
  public:

lib/sanitizer_common/sanitizer_posix.cc

 SignalContext SignalContext::Create(void *siginfo, void *context) {
   uptr addr = (uptr)((siginfo_t*)siginfo)->si_addr;
   uptr pc, sp, bp;
   GetPcSpBp(context, &pc, &sp, &bp);
   return SignalContext(context, addr, pc, sp, bp);
 }

+// This function checks if the built VMA matches the runtime one for
+// architectures with multiple VMA sizes.
+void CheckVMASize() {
+#ifdef __aarch64__
+  static const unsigned kBuiltVMA = SANITIZER_AARCH64_VMA;
+  unsigned maxRuntimeVMA =
+    (MostSignificantSetBitIndex(GET_CURRENT_FRAME()) + 1);
+  if (kBuiltVMA != maxRuntimeVMA) {
+    Printf("WARNING: Runtime VMA is not the one built for.\n");
+    Printf("\tBuilt VMA: %u bits\n", kBuiltVMA);
+    Printf("\tRuntime VMA: %u bits\n", maxRuntimeVMA);
+  }
+#endif
+}
+
 } // namespace __sanitizer

 #endif // SANITIZER_POSIX
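The flow of the check above can be sketched in portable form as follows. This is an illustration under stated assumptions, not the patch's code: kExampleBuiltVMA stands in for SANITIZER_AARCH64_VMA, the sampled address is passed in by the caller instead of coming from GET_CURRENT_FRAME(), and both function names are hypothetical. Like the patch, it infers the VMA size from the highest set bit of one sampled address, so it only gives a meaningful answer when that address lies near the top of the user address range.

```cpp
#include <cstdint>
#include <cstdio>

// Hypothetical build-time setting, standing in for SANITIZER_AARCH64_VMA.
constexpr unsigned kExampleBuiltVMA = 39;

// Index of the most significant set bit plus one (0 when addr is 0).
inline unsigned BitsInAddress(std::uintptr_t addr) {
  unsigned bits = 0;
  while (addr != 0) {
    ++bits;
    addr >>= 1;
  }
  return bits;
}

// Returns true when the sampled address matches the VMA size the code was
// built for. On a mismatch it only prints a warning and returns false,
// mirroring the warn-only behaviour described in the summary.
inline bool CheckVMASizeSketch(std::uintptr_t sample,
                               unsigned built_bits = kExampleBuiltVMA) {
  const unsigned runtime_bits = BitsInAddress(sample);
  if (runtime_bits == built_bits) return true;
  std::fprintf(stderr, "WARNING: built VMA is %u bits, runtime VMA is %u bits\n",
               built_bits, runtime_bits);
  return false;
}
```

A caller would typically pass the address of a local variable as the sample, which is roughly what reading the current frame address achieves in the patch.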

lib/sanitizer_common/sanitizer_win.cc

 uptr ReadBinaryName(/*out*/char *buf, uptr buf_len) {
   buf[0] = 0;
   return 0;
 }

 uptr ReadLongProcessName(/*out*/char *buf, uptr buf_len) {
   return ReadBinaryName(buf, buf_len);
 }

+void CheckVMASize() {
+  // Do nothing.
+}
+
 } // namespace __sanitizer

 #endif // _WIN32

lib/tsan/rtl/tsan_rtl.cc

 }

 void Initialize(ThreadState *thr) {
   // Thread safe because done before all threads exist.
   static bool is_initialized = false;
   if (is_initialized)
     return;
   is_initialized = true;
+
+  CheckVMASize();
+
   // We are not ready to handle interceptors yet.
   ScopedIgnoreInterceptors ignore;
   SanitizerToolName = "ThreadSanitizer";
   // Install tool-specific callbacks in sanitizer_common.
   SetCheckFailedCallback(TsanCheckFailed);

   ctx = new(ctx_placeholder) Context;
   const char *options = GetEnv(kTsanOptionsEnv);
