This is an archive of the discontinued LLVM Phabricator instance.

Differential D154043

[CodeGen] -fsanitize={function,kcfi}: ensure align 4 if +strict-align
Abandoned · Public

Authored by MaskRay on Jun 28 2023, 11:18 PM.

Details

Reviewers

efriedma
rjmccall
simon_tatham
samitolvanen

Summary

Fix https://github.com/llvm/llvm-project/issues/63579

% cat a.c
void foo() {}
% clang --target=arm-none-eabi -mthumb -mno-unaligned-access -fsanitize=kcfi a.c -S -o - | grep p2align
        .p2align        1
% clang --target=armv6m-none-eabi -fsanitize=function a.c -S -o - | grep p2align
        .p2align        1

With -mno-unaligned-access (possibly implicit), we should ensure that
-fsanitize={function,kcfi} instrumented functions are aligned to at least 4 bytes,
so that loading the type hash placed before the function label does not cause a
misaligned access, even if the backend does not call setMinFunctionAlignment with
4 or greater.

With this patch, the generated assembly for the examples above will contain .p2align 2.
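
To make the arithmetic behind the misaligned access concrete, here is a minimal sketch. It assumes the type hash is a 4-byte word stored immediately before the function's entry address; the helper names p2alignToBytes and hashLoadIsAligned and the example addresses are illustrative and do not come from LLVM.

```cpp
#include <cstdint>

// Converts a `.p2align N` operand to an alignment in bytes (2^N).
constexpr std::uint64_t p2alignToBytes(unsigned N) {
  return std::uint64_t{1} << N;
}

// Assumption for this sketch: the 4-byte type hash sits immediately before
// the function entry, so it is loaded from FnAddr - 4.
constexpr bool hashLoadIsAligned(std::uint64_t FnAddr) {
  return (FnAddr - 4) % 4 == 0;
}

// `.p2align 1` only guarantees 2-byte alignment; `.p2align 2` guarantees 4.
static_assert(p2alignToBytes(1) == 2, ".p2align 1 is 2-byte alignment");
static_assert(p2alignToBytes(2) == 4, ".p2align 2 is 4-byte alignment");

// Hypothetical entry addresses: 0x1002 is 2-byte aligned but not 4-byte
// aligned, so the hash word at 0x0FFE would be read with a misaligned load.
static_assert(!hashLoadIsAligned(0x1002), "hash word is misaligned");
static_assert(hashLoadIsAligned(0x1004), "hash word is aligned");
```

A 2-byte-aligned entry point can land on an address that is 2 modulo 4, which is exactly the case a strict-alignment target cannot tolerate for the 4-byte load.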

If -falign-functions= is also specified, take the maximum of the two alignments.

If __attribute__((aligned(2))) is specified, arbitrarily let the function
attribute win.
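
The precedence rules above can be sketched as a small pure function. This is a simplified model, not the Clang code: the AlignInputs struct, its field names, and chooseAlignment are hypothetical, and a return value of 0 stands for "no explicit alignment set, leave it to the backend".

```cpp
#include <cstdint>

// Hypothetical summary of the inputs the patch consults; not a Clang type.
struct AlignInputs {
  std::uint64_t AttrAlignBytes; // __attribute__((aligned(N))) in bytes, 0 if absent
  unsigned FunctionAlignLog2;   // log2 of the -falign-functions= value, 0 if unset
  bool Sanitized;               // -fsanitize=function or -fsanitize=kcfi is active
  bool StrictAlign;             // target feature +strict-align is enabled
};

// Returns the explicit alignment in bytes, or 0 to leave it to the backend.
constexpr std::uint64_t chooseAlignment(AlignInputs In) {
  if (In.AttrAlignBytes != 0)
    return In.AttrAlignBytes; // the function attribute wins, even if below 4
  std::uint64_t Align =
      In.FunctionAlignLog2 ? (std::uint64_t{1} << In.FunctionAlignLog2) : 0;
  if (Align < 4 && In.Sanitized && In.StrictAlign)
    Align = 4; // raise to the minimum needed for the type-hash load
  return Align;
}

// Sanitizer plus +strict-align, no other request: 4 bytes.
static_assert(chooseAlignment({0, 0, true, true}) == 4, "");
// A smaller -falign-functions= request (2 bytes) is raised to 4.
static_assert(chooseAlignment({0, 1, true, true}) == 4, "");
// A larger -falign-functions= request (32 bytes) is kept.
static_assert(chooseAlignment({0, 5, true, true}) == 32, "");
// __attribute__((aligned(2))) wins over everything else.
static_assert(chooseAlignment({2, 0, true, true}) == 2, "");
```

The four cases correspond to the expectations in the patch's ubsan-function.cpp test (align 4, align 4, align 32, and align 2 for the aligned(2) function).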

Since SanOpts is per-function, move the alignment setting code from
CodeGenModule::SetLLVMFunctionAttributesForDefinition to CodeGenFunction.
This move requires some attention.

Note: CodeGenModule::SetLLVMFunctionAttributesForDefinition is called from many
thunk codegen paths with a dummy GlobalDecl/FunctionDecl.
However, one call site, MicrosoftCXXABI::EmitVirtualMemPtrThunk, relies on the
"Some C++ ABIs require 2-byte alignment for member functions" code in
SetLLVMFunctionAttributesForDefinition, so that part stays in CodeGenModule.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

MaskRay created this revision on Jun 28 2023, 11:18 PM.

Herald added a project: Restricted Project, on Jun 28 2023, 11:18 PM.

Herald added a subscriber: kristof.beyls.

MaskRay requested review of this revision on Jun 28 2023, 11:18 PM.

Herald added a project: Restricted Project, on Jun 28 2023, 11:18 PM.

Herald added a subscriber: cfe-commits.

Harbormaster completed remote builds in B241974: Diff 535634, on Jun 28 2023, 11:56 PM.

simon_tatham added a comment.

The details of this approach look good to me, but is this the best place to solve it? Doing it in Clang means that every language front end that wants to use either of these sanitizers is responsible for doing the same work: tagging every IR function with align 4 if it also has !kcfi_type or !func_sanitize, and perhaps also checking the target features to decide whether to do that.

I'd imagined the problem being solved at a lower level, when converting the IR into actual function prologues, so that all front ends generating IR would benefit from the fix.

efriedma added a comment.

I also think it makes sense to fix the alignment when we lower the metadata, not in the frontend, unless I'm missing something.

It's not clear to me how "strict-align" is relevant; if sanitizer lowering is generating "align 4" loads, the relevant pointers need to be appropriately aligned regardless of the cost of unaligned loads. Misaligned loads are undefined behavior in LLVM IR on all targets. (32-bit ARM in particular has cases where 32-bit unaligned loads are supported, but certain load instruction variations enforce alignment.)

MaskRay added a comment.

In reply to @simon_tatham (D154043#4459446) and @efriedma (D154043#4460665):

OK. See D154125 for the MachineFunction.cpp approach. If we go that direction, I'll abandon this patch.

simon_tatham mentioned this in D154125: MachineFunction: -fsanitize={function,kcfi}: ensure 4-byte alignment, on Jun 30 2023, 1:24 AM.

Obsoleted by D154125

Revision Contents

Path                                     Size
clang/lib/CodeGen/CodeGenFunction.cpp    21 lines
clang/lib/CodeGen/CodeGenModule.cpp      8 lines
clang/test/CodeGen/kcfi.c                11 lines
clang/test/CodeGen/ubsan-function.cpp    16 lines

Diff 535634

clang/lib/CodeGen/CodeGenFunction.cpp

[812 earlier lines not shown]
 #undef SANITIZER
   // Ignore null checks in coroutine functions since the coroutines passes
   // are not aware of how to move the extra UBSan instructions across the split
   // coroutine boundaries.
   if (D && SanOpts.has(SanitizerKind::Null))
     if (FD && FD->getBody() &&
         FD->getBody()->getStmtClass() == Stmt::CoroutineBodyStmtClass)
       SanOpts.Mask &= ~SanitizerKind::Null;

+  if (FD && FD->hasAttr<AlignedAttr>()) {
+    if (unsigned alignment =
+            FD->getMaxAlignment() / getContext().getCharWidth())
+      Fn->setAlignment(llvm::Align(alignment));
+  } else if (FD) {
+    if (getLangOpts().FunctionAlignment)
+      Fn->setAlignment(llvm::Align(1ull << getLangOpts().FunctionAlignment));
+    // -fsanitize=function and -fsanitize=kcfi instrument indirect function
+    // calls to load a type hash before the function label. Ensure the function
+    // is aligned by at least 4 to avoid unaligned access for
+    // -mno-unaligned-access, even if the backend does not increase the
+    // alignment.
+    if (Fn->getAlignment() < 4 && (SanOpts.has(SanitizerKind::Function) ||
+                                   SanOpts.has(SanitizerKind::KCFI))) {
+      llvm::StringMap<bool> FeatureMap;
+      getContext().getFunctionFeatureMap(FeatureMap, GD);
+      if (FeatureMap.lookup("strict-align"))
+        Fn->setAlignment(llvm::Align(4));
+    }
+  }
+
   // Apply xray attributes to the function (as a string, for now)
   bool AlwaysXRayAttr = false;
   if (const auto *XRayAttr = D ? D->getAttr<XRayInstrumentAttr>() : nullptr) {
     if (CGM.getCodeGenOpts().XRayInstrumentationBundle.has(
             XRayInstrKind::FunctionEntry) ||
         CGM.getCodeGenOpts().XRayInstrumentationBundle.has(
             XRayInstrKind::FunctionExit)) {
       if (XRayAttr->alwaysXRayInstrument() && ShouldXRayInstrumentFunction()) {
[2,072 later lines not shown]

clang/lib/CodeGen/CodeGenModule.cpp

[2,367 earlier lines not shown]
   if (!D->hasAttr<OptimizeNoneAttr>()) {
     if (D->hasAttr<HotAttr>())
       B.addAttribute(llvm::Attribute::Hot);
     if (D->hasAttr<MinSizeAttr>())
       B.addAttribute(llvm::Attribute::MinSize);
   }

   F->addFnAttrs(B);

-  unsigned alignment = D->getMaxAlignment() / Context.getCharWidth();
-  if (alignment)
-    F->setAlignment(llvm::Align(alignment));
-
-  if (!D->hasAttr<AlignedAttr>())
-    if (LangOpts.FunctionAlignment)
-      F->setAlignment(llvm::Align(1ull << LangOpts.FunctionAlignment));
-
   // Some C++ ABIs require 2-byte alignment for member functions, in order to
   // reserve a bit for differentiating between virtual and non-virtual member
   // functions. If the current target's C++ ABI requires this and this is a
   // member function, set its alignment accordingly.
   if (getTarget().getCXXABI().areMemberFunctionsAligned()) {
     if (F->getAlignment() < 2 && isa<CXXMethodDecl>(D))
       F->setAlignment(llvm::Align(2));
   }
[5,097 later lines not shown]

clang/test/CodeGen/kcfi.c

-// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -emit-llvm -fsanitize=kcfi -o - %s | FileCheck %s
-// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -emit-llvm -fsanitize=kcfi -x c++ -o - %s | FileCheck %s
-// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -emit-llvm -fsanitize=kcfi -fpatchable-function-entry-offset=3 -o - %s | FileCheck %s --check-prefixes=CHECK,OFFSET
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -emit-llvm -fsanitize=kcfi -o - %s | FileCheck %s --check-prefixes=CHECK,BUNDLE
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -emit-llvm -fsanitize=kcfi -x c++ -o - %s | FileCheck %s --check-prefixes=CHECK,BUNDLE
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -emit-llvm -fsanitize=kcfi -fpatchable-function-entry-offset=3 -o - %s | FileCheck %s --check-prefixes=CHECK,BUNDLE,OFFSET
+// RUN: %clang_cc1 -triple arm-none-eabi -emit-llvm -fsanitize=kcfi -target-feature -strict-align -o - %s | FileCheck %s --check-prefix=CHECK
+// RUN: %clang_cc1 -triple arm-none-eabi -emit-llvm -fsanitize=kcfi -target-feature +strict-align -o - %s | FileCheck %s --check-prefix=STRICTALIGN
 #if !__has_feature(kcfi)
 #error Missing kcfi?
 #endif

 /// Must emit __kcfi_typeid symbols for address-taken function declarations
 // CHECK: module asm ".weak __kcfi_typeid_[[F4:[a-zA-Z0-9_]+]]"
 // CHECK: module asm ".set __kcfi_typeid_[[F4]], [[#%d,HASH:]]"
 /// Must not emit __kcfi_typeid symbols for non-address-taken declarations
 // CHECK-NOT: module asm ".weak __kcfi_typeid_{{f6|_Z2f6v}}"
 typedef int (*fn_t)(void);

 // CHECK: define dso_local{{.*}} i32 @{{f1|_Z2f1v}}(){{.*}} !kcfi_type ![[#TYPE:]]
+// STRICTALIGN: define{{.*}} i32 @f1() #[[#]] align 4 !kcfi_type ![[#TYPE:]]
 int f1(void) { return 0; }

 // CHECK: define dso_local{{.*}} i32 @{{f2|_Z2f2v}}(){{.*}} !kcfi_type ![[#TYPE2:]]
 unsigned int f2(void) { return 2; }

 // CHECK-LABEL: define dso_local{{.*}} i32 @{{__call|_Z6__callPFivE}}(ptr{{.*}} %f)
 int __call(fn_t f) __attribute__((__no_sanitize__("kcfi"))) {
   // CHECK-NOT: call{{.*}} i32 %{{.}}(){{.*}} [ "kcfi"
   return f();
 }

 // CHECK: define dso_local{{.*}} i32 @{{call|_Z4callPFivE}}(ptr{{.*}} %f){{.*}}
 int call(fn_t f) {
-  // CHECK: call{{.*}} i32 %{{.}}(){{.*}} [ "kcfi"(i32 [[#HASH]]) ]
+  // BUNDLE: call{{.*}} i32 %{{.}}(){{.*}} [ "kcfi"(i32 [[#HASH]]) ]
   return f();
 }

 // CHECK-DAG: define internal{{.*}} i32 @{{f3|_ZL2f3v}}(){{.*}} !kcfi_type ![[#TYPE]]
 static int f3(void) { return 1; }

 // CHECK-DAG: declare !kcfi_type ![[#TYPE]]{{.*}} i32 @[[F4]]()
 extern int f4(void);
[23 later lines not shown]

clang/test/CodeGen/ubsan-function.cpp

 // RUN: %clang_cc1 -triple x86_64-linux-gnu -emit-llvm -o - %s -fsanitize=function -fno-sanitize-recover=all | FileCheck %s --check-prefixes=CHECK,64
 // RUN: %clang_cc1 -triple aarch64-linux-gnu -emit-llvm -o - %s -fsanitize=function -fno-sanitize-recover=all | FileCheck %s --check-prefixes=CHECK,64
 // RUN: %clang_cc1 -triple aarch64_be-linux-gnu -emit-llvm -o - %s -fsanitize=function -fno-sanitize-recover=all | FileCheck %s --check-prefixes=CHECK,64
 // RUN: %clang_cc1 -triple arm-none-eabi -emit-llvm -o - %s -fsanitize=function -fno-sanitize-recover=all | FileCheck %s --check-prefixes=CHECK,ARM,32

+/// With -mno-unaligned-access, ensure that the alignment is at least 4.
+// RUN: %clang_cc1 -triple arm-none-eabi -emit-llvm -o - %s -fsanitize=function -fno-sanitize-recover=all -target-feature +strict-align | FileCheck %s --check-prefix=ALIGN4
+/// Smaller -falign-functions= is overridden while larger -falign-functions= wins.
+// RUN: %clang_cc1 -triple arm-none-eabi -emit-llvm -o - %s -fsanitize=function -fno-sanitize-recover=all -target-feature +strict-align -function-alignment 1 | FileCheck %s --check-prefix=ALIGN4
+// RUN: %clang_cc1 -triple arm-none-eabi -emit-llvm -o - %s -fsanitize=function -fno-sanitize-recover=all -target-feature +strict-align -function-alignment 5 | FileCheck %s --check-prefix=ALIGN32
+
 // CHECK: define{{.*}} void @_Z3funv() #0 !func_sanitize ![[FUNCSAN:.*]] {
+// ALIGN4: define{{.*}} void @_Z3funv() #0 align 4 !func_sanitize ![[FUNCSAN:.*]] {
+// ALIGN32: define{{.*}} void @_Z3funv() #0 align 32 !func_sanitize ![[FUNCSAN:.*]] {
 void fun() {}

+// CHECK: define{{.*}} void @_Z8aligned2v() #[[#]] align 2
+// ALIGN4: define{{.*}} void @_Z8aligned2v() #[[#]] align 2
+// ALIGN32: define{{.*}} void @_Z8aligned2v() #[[#]] align 2
+__attribute__((aligned(2)))
+void aligned2() {}
+
 // CHECK-LABEL: define{{.*}} void @_Z6callerPFvvE(ptr noundef %f)
 // ARM: ptrtoint ptr {{.*}} to i32, !nosanitize !5
 // ARM: and i32 {{.*}}, -2, !nosanitize !5
 // ARM: inttoptr i32 {{.*}} to ptr, !nosanitize !5
 // CHECK: getelementptr <{ i32, i32 }>, ptr {{.*}}, i32 -1, i32 0, !nosanitize
 // CHECK: load i32, ptr {{.*}}, align {{.*}}, !nosanitize
 // CHECK: icmp eq i32 {{.*}}, -1056584962, !nosanitize
 // CHECK: br i1 {{.*}}, label %[[LABEL1:.*]], label %[[LABEL4:.*]], !nosanitize
[19 later lines not shown]