This is an archive of the discontinued LLVM Phabricator instance.

hwasan: Move memory access checks into small outlined functions on aarch64.
ClosedPublic

Authored by pcc on Jan 18 2019, 7:01 PM.


Details

Reviewers

Commits

rG73078ecd381b: hwasan: Move memory access checks into small outlined functions on aarch64.
rL351920: hwasan: Move memory access checks into small outlined functions on aarch64.
rCRT351920: hwasan: Move memory access checks into small outlined functions on aarch64.

Summary

Each hwasan check requires emitting a small piece of code like this:
https://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html#memory-accesses

The problem with this is that these code blocks typically bloat code
size significantly.
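For readers who have not followed the link, the check in question can be sketched in C++ roughly as follows. This is a toy model, not the emitted instruction sequence: it assumes the layout described in the HWASan design document (an 8-bit tag in the pointer's top byte, one shadow byte per 16-byte granule), and the helper names and the vector used as shadow memory are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Assumptions from the HWASan design document: the tag lives in the top byte
// of a 64-bit pointer, and one shadow byte describes a 16-byte granule.
constexpr unsigned kTagShift = 56;
constexpr unsigned kGranuleShift = 4;

// Hypothetical helper: extracts the 8-bit tag from the pointer's top byte.
constexpr uint8_t PointerTag(uint64_t ptr) {
  return static_cast<uint8_t>(ptr >> kTagShift);
}

// Hypothetical helper: strips the tag, leaving the plain address.
constexpr uint64_t UntagAddress(uint64_t ptr) {
  return ptr & ((uint64_t{1} << kTagShift) - 1);
}

// Returns true when the pointer tag equals the shadow tag of its granule.
// `shadow` is a toy stand-in for real shadow memory, indexed from address 0.
inline bool TagsMatch(uint64_t ptr, const std::vector<uint8_t>& shadow) {
  const uint64_t granule = UntagAddress(ptr) >> kGranuleShift;
  assert(granule < shadow.size());
  return PointerTag(ptr) == shadow[granule];
}
```

Every instrumented load and store needs this shift, load and compare plus a branch to the failure path, which is why inlining it everywhere costs so much code size.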

An obvious solution is to outline these blocks of code. In fact, this
has already been implemented under the -hwasan-instrument-with-calls
flag. However, as currently implemented this has a number of problems:

- The functions use the same calling convention as regular C functions. This means that the backend must spill all temporary registers as required by the platform's C calling convention, even though the check only needs two registers on the hot path.
- The functions take the address to be checked in a fixed register, which increases register pressure.

Both of these factors can diminish the code size effect and increase
the performance hit of -hwasan-instrument-with-calls.

The solution that this patch implements is to involve the aarch64
backend in outlining the checks. An intrinsic and pseudo-instruction
are created to represent a hwasan check. The pseudo-instruction
is register allocated like any other instruction, and we allow the
register allocator to select almost any register for the address to
check. A particular combination of (register selection, type of check)
triggers the creation in the backend of a function to handle the check
for specifically that pair. The resulting functions are deduplicated by
the linker. The pseudo-instruction (really the function) is specified
to preserve all registers except for the registers that the AAPCS
specifies may be clobbered by a call.
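The bookkeeping described above, one outlined function per distinct (register, type of check) pair, can be modelled with a small table. This is only a sketch of the idea and not the backend code: the class, its methods and the symbol naming scheme are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Toy model: one outlined check function per distinct
// (address register, access info) pair. All names here are illustrative.
class CheckFunctionTable {
 public:
  // Returns the symbol for the pair, creating it on first use so that
  // repeated requests for the same pair share one function.
  const std::string& GetOrCreate(unsigned reg, unsigned access_info) {
    assert(reg <= 30 && "AArch64 general-purpose registers are x0..x30");
    const auto key = std::make_pair(reg, access_info);
    auto it = symbols_.find(key);
    if (it == symbols_.end()) {
      // Assumed naming scheme, chosen so identical pairs get identical names.
      std::string name = "__hwasan_check_x" + std::to_string(reg) + "_" +
                         std::to_string(access_info);
      it = symbols_.emplace(key, std::move(name)).first;
    }
    return it->second;
  }

  std::size_t size() const { return symbols_.size(); }

 private:
  std::map<std::pair<unsigned, unsigned>, std::string> symbols_;
};
```

Because the name is derived only from the pair, two translation units that need the same check produce the same symbol, which is what lets the linker keep a single copy.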

To measure the code size and performance effect of this change, I
took a number of measurements using Chromium for Android on aarch64,
comparing a browser with inlined checks (the baseline) against a
browser with outlined checks.

Code size: Size of .text decreases from 243897420 to 171619972 bytes,
a decrease of about 30%.
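The percentage follows directly from the two sizes quoted above. A minimal sketch of the arithmetic, with a hypothetical helper name:

```cpp
#include <cassert>
#include <cstdint>

// Relative decrease from `before` to `after`, expressed as a percentage.
inline double PercentDecrease(uint64_t before, uint64_t after) {
  assert(before > 0 && after <= before);
  return 100.0 * static_cast<double>(before - after) /
         static_cast<double>(before);
}
```

For the figures above, the difference is 72277448 bytes, so `PercentDecrease(243897420, 171619972)` comes out at roughly 29.6, which the summary rounds to 30%.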

Performance: Using Chromium's blink_perf.layout microbenchmarks I
measured a median performance regression of 6.24%.

The fact that a perf/size tradeoff is evident here suggests that
we might want to make the new behaviour conditional on -Os/-Oz.
But for now I've enabled it unconditionally, my reasoning being that
hwasan users typically expect a relatively large perf hit, and ~6%
isn't really adding much. We may want to revisit this decision in
the future, though.

I also tried experimenting with varying the number of registers
selectable by the hwasan check pseudo-instruction (which would result
in fewer variants being created), on the hypothesis that creating
fewer variants of the function would expose another perf/size tradeoff
by reducing icache pressure from the check functions at the cost of
register pressure. Although I did observe a code size increase with
fewer registers, I did not observe a strong correlation between the
number of registers and the performance of the resulting browser on the
microbenchmarks, so I conclude that we might as well use ~all registers
to get the maximum code size improvement. My results are below:

Regs | .text size | Perf hit
-----+------------+---------
~all | 171619972  | 6.24%
  16 | 171765192  | 7.03%
   8 | 172917788  | 5.82%
   4 | 177054016  | 6.89%
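To put the size column in perspective, the extra .text bytes of each restricted configuration relative to the ~all row can be computed as below. The struct and function names are hypothetical; the numbers are copied from the table.

```cpp
#include <cassert>
#include <cstdint>

// One row of the table above.
struct Measurement {
  const char* regs;
  uint64_t text_size;
  double perf_hit_percent;
};

// Rows in the same order as the table; the first row is "~all".
constexpr Measurement kResults[] = {
    {"~all", 171619972, 6.24},
    {"16", 171765192, 7.03},
    {"8", 172917788, 5.82},
    {"4", 177054016, 6.89},
};

// Extra .text bytes of a configuration compared with the "~all" row.
inline uint64_t ExtraBytes(const Measurement& m) {
  assert(m.text_size >= kResults[0].text_size);
  return m.text_size - kResults[0].text_size;
}
```

The growth is 145220 bytes with 16 registers, 1297816 with 8 and 5434044 with 4, while the performance column moves in no consistent direction.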

Diff Detail

Repository: rCRT Compiler Runtime

Event Timeline

pcc created this revision. Jan 18 2019, 7:01 PM

Herald added subscribers: hiraditya, kristof.beyls, javed.absar and 2 others. Jan 18 2019, 7:01 PM

Harbormaster completed remote builds in B27091: Diff 182661. Jan 18 2019, 7:02 PM

More than half of this changelist is switching shadow addressing to getelementptr. Could you do it in a separate change?

llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
281 ↗	(On Diff #182661)	why weak?

In D56954#1367060, @eugenis wrote:

> More than half of this changelist is switching shadow addressing to getelementptr. Could you do it in a separate change?

I think I could, but it would make the tests (more) painful to update. I guess I could drop the tests for abort in basic.ll in the first change and then bring them back in the second change, if that's ok?

llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
281 ↗	(On Diff #182661)	Because we're not guaranteed that the linker will be able to resolve comdats at this point. LLD for example will ignore comdats in post-LTO object files (see discussion in D56015). On the other hand, weak/strong resolution will work fine.

> I think I could, but it would make the tests (more) painful to update. I guess I could drop the tests for abort in basic.ll in the first change and then bring them back in the second change, if that's ok?

Let's keep it as one change then.

llvm/lib/Target/AArch64/AArch64RegisterBankInfo.cpp
254 ↗	(On Diff #182661)	These are weird. At least one of them goes away if GPR64noip is derived from GPR64common instead of GPR64.

llvm/lib/Transforms/Instrumentation/HWAddressSanitizer.cpp
534 ↗	(On Diff #182661)	Let's add an -mllvm flag to disable this feature, just in case.

This revision is now accepted and ready to land. Jan 22 2019, 5:10 PM

pcc marked an inline comment as done. Jan 22 2019, 5:16 PM

pcc added inline comments.

llvm/lib/Target/AArch64/AArch64RegisterBankInfo.cpp
254 ↗	(On Diff #182661)	Yeah. There's a todo at the top of this file to make it be generated by tblgen, which certainly makes sense at least for this function.

Closed by commit rCRT351920: hwasan: Move memory access checks into small outlined functions on aarch64. (authored by pcc). Jan 22 2019, 6:20 PM

This revision was automatically updated to reflect the committed changes.

pcc marked an inline comment as done.

Herald added a subscriber: Restricted Project. Jan 22 2019, 6:20 PM

Revision Contents

Path                       | Size
---------------------------+---------
lib/hwasan/hwasan_linux.cc | 34 lines

Diff 183012

lib/hwasan/hwasan_linux.cc


 #else
 # error Unsupported architecture
 #endif

   return AccessInfo{addr, size, is_store, !is_store, recover};
 }

-static bool HwasanOnSIGTRAP(int signo, siginfo_t *info, ucontext_t *uc) {
-  AccessInfo ai = GetAccessInfo(info, uc);
-  if (!ai.is_store && !ai.is_load)
-    return false;
-
+static void HandleTagMismatch(AccessInfo ai, uptr pc, uptr frame,
+                              ucontext_t *uc) {
   InternalMmapVector<BufferedStackTrace> stack_buffer(1);
   BufferedStackTrace *stack = stack_buffer.data();
   stack->Reset();
-  SignalContext sig{info, uc};
-  GetStackTrace(stack, kStackTraceMax, StackTrace::GetNextInstructionPc(sig.pc),
-                sig.bp, uc, common_flags()->fast_unwind_on_fatal);
+  GetStackTrace(stack, kStackTraceMax, pc, frame, uc,
+                common_flags()->fast_unwind_on_fatal);

   ++hwasan_report_count;

   bool fatal = flags()->halt_on_error || !ai.recover;
   ReportTagMismatch(stack, ai.addr, ai.size, ai.is_store, fatal);
+}
+
+static bool HwasanOnSIGTRAP(int signo, siginfo_t *info, ucontext_t *uc) {
+  AccessInfo ai = GetAccessInfo(info, uc);
+  if (!ai.is_store && !ai.is_load)
+    return false;
+
+  SignalContext sig{info, uc};
+  HandleTagMismatch(ai, StackTrace::GetNextInstructionPc(sig.pc), sig.bp, uc);

 #if defined(__aarch64__)
   uc->uc_mcontext.pc += 4;
 #elif defined(__x86_64__)
 #else
 # error Unsupported architecture
 #endif
   return true;
 }

+extern "C" SANITIZER_INTERFACE_ATTRIBUTE void __hwasan_tag_mismatch(
+    uptr addr, uptr access_info) {
+  AccessInfo ai;
+  ai.is_store = access_info & 0x10;
+  ai.recover = false;
+  ai.addr = addr;
+  ai.size = 1 << (access_info & 0xf);
+
+  HandleTagMismatch(ai, (uptr)__builtin_return_address(0),
+                    (uptr)__builtin_frame_address(0), nullptr);
+  __builtin_unreachable();
+}
+
 static void OnStackUnwind(const SignalContext &sig, const void *,
                           BufferedStackTrace *stack) {
   GetStackTrace(stack, kStackTraceMax, StackTrace::GetNextInstructionPc(sig.pc),
                 sig.bp, sig.context, common_flags()->fast_unwind_on_fatal);
 }

 void HwasanOnDeadlySignal(int signo, void *info, void *context) {
   // Probably a tag mismatch.
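The access_info argument of __hwasan_tag_mismatch packs two fields, as the decoding in the diff shows: bit 4 is the store flag and the low four bits hold log2 of the access size. A standalone sketch of that encoding follows; the struct and both function names are hypothetical, and the encoder is simply the inverse implied by the decoder rather than code from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Fields recovered from access_info, mirroring __hwasan_tag_mismatch.
struct DecodedAccess {
  bool is_store;
  uint64_t size;  // access size in bytes
};

// Bit 4 is the store flag; the low four bits are log2(size).
inline DecodedAccess DecodeAccessInfo(uint64_t access_info) {
  return {(access_info & 0x10) != 0, uint64_t{1} << (access_info & 0xf)};
}

// Hypothetical inverse of the decoder, for illustration only.
inline uint64_t EncodeAccessInfo(bool is_store, unsigned size_log2) {
  assert(size_log2 <= 0xf);
  return (is_store ? 0x10u : 0u) | size_log2;
}
```

For example, an 8-byte store encodes as 0x13 under this layout and a 4-byte load as 0x2.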