This is an archive of the discontinued LLVM Phabricator instance.

Differential D81479

[BPF] introduce __builtin_bpf_load_u32_to_ptr() intrinsic
AbandonedPublic

Authored by yonghong-song on Jun 9 2020, 9:46 AM.

Details

Reviewers

ast

Summary

In the current Linux BPF UAPI header

https://github.com/torvalds/linux/blob/master/include/uapi/linux/bpf.h

struct __sk_buff and struct xdp_md have the fields

__u32 data;
__u32 data_end;
__u32 data_meta;

which actually represent pointers. Typically, in a BPF program, users
write the following:

void *data = (void *)(long)__sk_buff->data;

The kernel verifier magically copies the address into the target
64-bit register for __sk_buff->data, and the program then relies on
nothing disturbing that value before it reaches the variable "data".

In the past, we have seen a few issues with this. For example,
for the above C code, the IR looks like:

i32_v = load u32
i64_v = zext i32_v
...

The BPF backend tries to recognize the above pattern and remove the
zext, either through InstrInfo.td pattern matching or through an
optimization in the MachineInstr SSA analysis and transformation,
so that the real address is preserved. This is still fragile: we have
had to fix multiple bugs caused by other changes in the BPF backend,
and the optimization may not cover all possible cases. Some users even
resort to inline assembly to work around a zext that the compiler
failed to eliminate.
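To make the concern concrete, here is a minimal host-side C sketch (not part of the patch) of what happens if the load u32 / zext sequence above is applied literally to a value that is really a 64-bit address. The helper names `load_u32_then_zext` and `zext_demo` and the sample address are made up for illustration; the sketch only models the arithmetic, not the verifier's rewriting of context accesses.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: models "i32_v = load u32; i64_v = zext i32_v"
 * applied to a register that really holds a 64-bit address. */
static uint64_t load_u32_then_zext(uint64_t real_addr)
{
    uint32_t i32_v = (uint32_t)real_addr; /* a 32-bit load keeps only the low half */
    uint64_t i64_v = (uint64_t)i32_v;     /* zext: the upper 32 bits become zero */
    return i64_v;
}

/* Small self-check using a made-up kernel-style address. */
static void zext_demo(void)
{
    uint64_t addr = 0xffff888012345678ULL;

    assert(load_u32_then_zext(addr) == 0x12345678ULL);
    assert(load_u32_then_zext(addr) != addr);
}
```

In this simplified model, a zext that survives into the final code discards the upper half of the address, which is why the backend tries to remove it.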

This patch introduces the following builtin function for the BPF target:

void *ptr = __builtin_bpf_load_u32_to_ptr(void *base, int offset);

The builtin performs a 32-bit load from the address "base + offset"
and returns the zero-extended result as a pointer. This way, the user
is guaranteed a correct address.
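The intended semantics can be approximated on a host machine with ordinary C. The following is only a sketch under assumptions: `load_u32_to_ptr_emulated` and `struct fake_ctx` are hypothetical names, and the real builtin would be lowered by the BPF backend rather than implemented like this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical host-side stand-in for the proposed builtin:
 * load 32 bits from base + offset, zero-extend, return as a pointer. */
static void *load_u32_to_ptr_emulated(const void *base, long offset)
{
    uint32_t v32;

    assert(base != NULL);
    /* memcpy avoids alignment and aliasing problems for the 32-bit load. */
    memcpy(&v32, (const unsigned char *)base + offset, sizeof(v32));
    /* Converting an unsigned 32-bit value to uintptr_t zero-extends it. */
    return (void *)(uintptr_t)v32;
}

/* Made-up context layout, loosely shaped like the UAPI structs. */
struct fake_ctx {
    uint32_t len;
    uint32_t data;
    uint32_t data_end;
};
```

Passing `offsetof(struct fake_ctx, data)` as the offset mirrors the requirement in this patch that the second argument be a compile-time constant.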

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

yonghong-song created this revision. Jun 9 2020, 9:46 AM

Herald added projects: Restricted Project, Restricted Project. Jun 9 2020, 9:46 AM

Herald added a subscriber: llvm-commits.

The corresponding llvm change: https://reviews.llvm.org/D81480

ast added inline comments. Jun 9 2020, 10:33 AM

clang/test/CodeGen/builtin-bpf-load-u32-to-ptr.c
Line 6: Can it be expressed as: __builtin_load_u32_to_ptr(&arg->b)?

Harbormaster failed remote builds in B59651: Diff 269575! Jun 9 2020, 10:58 AM

yonghong-song marked an inline comment as done. Jun 9 2020, 11:06 AM

yonghong-song added inline comments.

clang/test/CodeGen/builtin-bpf-load-u32-to-ptr.c
Line 6: Good question. If we did this, we would get code like `tmp = ctx + 76` followed by `void *ptr = *(u32 *)(tmp + 0)`. IIRC, the verifier does not like this. The verifier prefers `ctx + offset`.

IntrinsicsBPF.td uses __builtin_bpf_load_u32_to_ptr, while everywhere it is just __builtin_load_u32_to_ptr. I think having bpf prefix there is good, given this is bpf-specific. But either way, probably should be consistent everywhere?

In D81479#2083510, @anakryiko wrote:

IntrinsicsBPF.td uses __builtin_bpf_load_u32_to_ptr, while everywhere it is just __builtin_load_u32_to_ptr. I think having bpf prefix there is good, given this is bpf-specific. But either way, probably should be consistent everywhere?

Sounds good. This is indeed specific to bpf, adding bpf prefix to the public builtin name sounds reasonable. Will make the change.

Changed the builtin name from __builtin_load_u32_to_ptr to __builtin_bpf_load_u32_to_ptr to reflect that it is BPF-specific.

It feels that the same thing can be represented as inline asm.
What advantage builtin has?

In D81479#2083933, @ast wrote:

It feels that the same thing can be represented as inline asm.
What advantage builtin has?

Yes, this can be represented as inline asm. I tend to dislike inline assembly code in BPF programs.
But maybe for this one, we could hide the inline asm in a header and provide users with the same signature as
__builtin_bpf_load_u32_to_ptr().

Harbormaster failed remote builds in B59727: Diff 269721! Jun 9 2020, 8:25 PM

I guess I will go with inline asm in the kernel for now, as LLVM already seems to do a pretty good job of parsing and understanding inline asm and integrating it into optimization passes. A few passes like SimplifyCFG, GVN, etc. may be affected, but that probably does not matter for the use case here.

Inline asm can do the work, so I am abandoning this patch.
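For reference, the inline asm alternative discussed above might look roughly like the following when hidden behind a small macro. This is a sketch, not the code that was eventually used: `CTX_LOAD_U32_TO_PTR` and `ctx_load_u32_to_ptr_host` are hypothetical names, the asm string assumes the BPF assembly syntax accepted by clang for the BPF target, and the non-BPF branch exists only so the snippet can be compiled and exercised on a host.

```c
#if defined(__bpf__)
/* BPF target: a single 32-bit load from ctx + constant offset whose result
 * stays in a 64-bit register, so no separate zext is generated by the
 * compiler. Uses a GNU statement expression; `off` must be a constant. */
#define CTX_LOAD_U32_TO_PTR(ctx, off)                      \
    ({                                                     \
        void *ptr_;                                        \
        __asm__ __volatile__("%0 = *(u32 *)(%1 + %2)"      \
                             : "=r"(ptr_)                  \
                             : "r"(ctx), "i"(off));        \
        ptr_;                                              \
    })
#else
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Host fallback (assumption, for experimentation only): emulate the
 * 32-bit load followed by zero-extension in plain C. */
static void *ctx_load_u32_to_ptr_host(const void *ctx, long off)
{
    uint32_t v32;

    assert(ctx != NULL);
    memcpy(&v32, (const unsigned char *)ctx + off, sizeof(v32));
    return (void *)(uintptr_t)v32;
}

#define CTX_LOAD_U32_TO_PTR(ctx, off) ctx_load_u32_to_ptr_host((ctx), (off))
#endif
```

A caller would pass the context pointer and a constant field offset, for example `CTX_LOAD_U32_TO_PTR(ctx, offsetof(struct __sk_buff, data))`, which keeps the access in the `ctx + offset` form mentioned earlier in the review.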

Revision Contents

Path                                              Size
clang/include/clang/Basic/BuiltinsBPF.def         3 lines
clang/include/clang/Basic/DiagnosticSemaKinds.td  2 lines
clang/lib/CodeGen/CGBuiltin.cpp                   12 lines
clang/lib/Sema/SemaChecking.cpp                   21 lines
clang/test/CodeGen/builtin-bpf-load-u32-to-ptr.c  8 lines
clang/test/Sema/builtin-bpf-load-u32-to-ptr.c     10 lines
llvm/include/llvm/IR/IntrinsicsBPF.td             2 lines

Diff 269721

clang/include/clang/Basic/BuiltinsBPF.def

[... 17 lines not shown ...]
 #endif

 // Get record field information.
 TARGET_BUILTIN(__builtin_preserve_field_info, "Ui.", "t", "")

 // Get BTF type id.
 TARGET_BUILTIN(__builtin_btf_type_id, "Ui.", "t", "")

+// Load a unsigned value and convert it to a pointer.
+TARGET_BUILTIN(__builtin_bpf_load_u32_to_ptr, "v*v*Li", "n", "")
+
 #undef BUILTIN
 #undef TARGET_BUILTIN

clang/include/clang/Basic/DiagnosticSemaKinds.td

[... 10,782 lines not shown ...] def err_builtin_matrix_arg: Error<
   "%select{first|second}0 argument must be a matrix">;

 def err_preserve_field_info_not_field : Error<
   "__builtin_preserve_field_info argument %0 not a field access">;
 def err_preserve_field_info_not_const: Error<
   "__builtin_preserve_field_info argument %0 not a constant">;
 def err_btf_type_id_not_const: Error<
   "__builtin_btf_type_id argument %0 not a constant">;
+def err_bpf_load_u32_to_ptr_not_const: Error<
+  "__builtin_bpf_load_u32_to_ptr argument %0 not a constant">;

 def err_bit_cast_non_trivially_copyable : Error<
   "__builtin_bit_cast %select{source|destination}0 type must be trivially copyable">;
 def err_bit_cast_type_size_mismatch : Error<
   "__builtin_bit_cast source size does not equal destination size (%0 vs %1)">;

 // SYCL-specific diagnostics
 def warn_sycl_kernel_num_of_template_params : Warning<
[... 17 lines not shown ...]

clang/lib/CodeGen/CGBuiltin.cpp

[... 10,679 lines not shown ...] case NEON::BI__builtin_neon_vuqaddq_v: {
     return EmitNeonCall(CGM.getIntrinsic(Int, Ty), Ops, "vuqadd");
   }
   }
 }

 Value *CodeGenFunction::EmitBPFBuiltinExpr(unsigned BuiltinID,
                                            const CallExpr *E) {
   assert((BuiltinID == BPF::BI__builtin_preserve_field_info ||
-          BuiltinID == BPF::BI__builtin_btf_type_id) &&
+          BuiltinID == BPF::BI__builtin_btf_type_id ||
+          BuiltinID == BPF::BI__builtin_bpf_load_u32_to_ptr) &&
          "unexpected BPF builtin");

   switch (BuiltinID) {
   default:
     llvm_unreachable("Unexpected BPF builtin");
   case BPF::BI__builtin_preserve_field_info: {
     const Expr *Arg = E->getArg(0);
     bool IsBitField = Arg->IgnoreParens()->getObjectKind() == OK_BitField;
[... 78 lines not shown ...] case BPF::BI__builtin_btf_type_id: {
     Constant *CV = ConstantInt::get(IntTy, IsLValue);
     llvm::Function *FnBtfTypeId = llvm::Intrinsic::getDeclaration(
         &CGM.getModule(), llvm::Intrinsic::bpf_btf_type_id,
         {FieldVal->getType(), CV->getType()});
     CallInst *Fn = Builder.CreateCall(FnBtfTypeId, {FieldVal, CV, FlagValue});
     Fn->setMetadata(LLVMContext::MD_preserve_access_index, DbgInfo);
     return Fn;
   }
+  case BPF::BI__builtin_bpf_load_u32_to_ptr: {
+    Value *BaseV = EmitScalarExpr(E->getArg(0));
+    Value *OffsetV = EmitScalarExpr(E->getArg(1));
+
+    // Built the IR for the bpf_load_u32_to_ptr intrinsic.
+    llvm::Function *FnLoadU32ToPtr = llvm::Intrinsic::getDeclaration(
+        &CGM.getModule(), llvm::Intrinsic::bpf_load_u32_to_ptr, {});
+    return Builder.CreateCall(FnLoadU32ToPtr, {BaseV, OffsetV});
+  }
   }
 }

 llvm::Value *CodeGenFunction::
 BuildVector(ArrayRef<llvm::Value*> Ops) {
   assert((Ops.size() & (Ops.size() - 1)) == 0 &&
          "Not a power-of-two sized vector!");
   bool AllConstants = true;
[... 5,758 lines not shown ...]

clang/lib/Sema/SemaChecking.cpp

[... 2,495 lines not shown ...] bool Sema::CheckAArch64BuiltinFunctionCall(const TargetInfo &TI,
   }

   return SemaBuiltinConstantArgRange(TheCall, i, l, u + l);
 }

 bool Sema::CheckBPFBuiltinFunctionCall(unsigned BuiltinID,
                                        CallExpr *TheCall) {
   assert((BuiltinID == BPF::BI__builtin_preserve_field_info ||
-          BuiltinID == BPF::BI__builtin_btf_type_id) &&
-         "unexpected ARM builtin");
+          BuiltinID == BPF::BI__builtin_btf_type_id ||
+          BuiltinID == BPF::BI__builtin_bpf_load_u32_to_ptr) &&
+         "unexpected BPF builtin");
+
+  // Generic checking has done basic checking against the
+  // signature, here only to ensure the second argument
+  // be a constant.
+  Expr *Arg;
+  if (BuiltinID == BPF::BI__builtin_bpf_load_u32_to_ptr) {
+    llvm::APSInt Value;
+    Arg = TheCall->getArg(1);
+    if (!Arg->isIntegerConstantExpr(Value, Context)) {
+      Diag(Arg->getBeginLoc(), diag::err_bpf_load_u32_to_ptr_not_const)
+          << 2 << Arg->getSourceRange();
+      return true;
+    }
+    return false;
+  }

   if (checkArgCount(*this, TheCall, 2))
     return true;

-  Expr *Arg;
   if (BuiltinID == BPF::BI__builtin_btf_type_id) {
     // The second argument needs to be a constant int
     llvm::APSInt Value;
     Arg = TheCall->getArg(1);
     if (!Arg->isIntegerConstantExpr(Value, Context)) {
       Diag(Arg->getBeginLoc(), diag::err_btf_type_id_not_const)
           << 2 << Arg->getSourceRange();
       return true;
[... 12,565 lines not shown ...]

clang/test/CodeGen/builtin-bpf-load-u32-to-ptr.c

This file was added.

+// REQUIRES: bpf-registered-target
+// RUN: %clang -target bpf -emit-llvm -S %s -o - | FileCheck %s
+
+struct t { int a; int b; };
+void *test(struct t *arg) { return __builtin_bpf_load_u32_to_ptr(arg, 4); }
+
+// CHECK: define dso_local i8* @test
+// CHECK: call i8* @llvm.bpf.load.u32.to.ptr(i8* %{{[0-9a-z.]+}}, i64 4)

clang/test/Sema/builtin-bpf-load-u32-to-ptr.c

This file was added.

+// RUN: %clang_cc1 -x c -triple bpf-pc-linux-gnu -dwarf-version=4 -fsyntax-only -verify %s
+
+struct t { int a; int b; };
+
+void *invalid1(struct t *arg) { return __builtin_bpf_load_u32_to_ptr(arg, arg->a); } // expected-error {{__builtin_bpf_load_u32_to_ptr argument 2 not a constant}}
+void *invalid2(struct t *arg) { return __builtin_bpf_load_u32_to_ptr(arg + 4); } // expected-error {{too few arguments to function call, expected 2, have 1}}
+void *invalid3(struct t *arg) { return __builtin_bpf_load_u32_to_ptr(arg, 4, 0); } // expected-error {{too many arguments to function call, expected 2, have 3}}
+unsigned invalid4(struct t *arg) { return __builtin_bpf_load_u32_to_ptr(arg, 4); } // expected-warning {{incompatible pointer to integer conversion returning 'void *' from a function with result type 'unsigned int'}}
+
+void *valid1(struct t *arg) { return __builtin_bpf_load_u32_to_ptr(arg, 4); }

llvm/include/llvm/IR/IntrinsicsBPF.td

[... 20 lines not shown ...] let TargetPrefix = "bpf" in { // All intrinsics start with "llvm.bpf."
   def int_bpf_pseudo : GCCBuiltin<"__builtin_bpf_pseudo">,
               Intrinsic<[llvm_i64_ty], [llvm_i64_ty, llvm_i64_ty]>;
   def int_bpf_preserve_field_info : GCCBuiltin<"__builtin_bpf_preserve_field_info">,
               Intrinsic<[llvm_i32_ty], [llvm_anyptr_ty, llvm_i64_ty],
               [IntrNoMem, ImmArg<ArgIndex<1>>]>;
   def int_bpf_btf_type_id : GCCBuiltin<"__builtin_bpf_btf_type_id">,
               Intrinsic<[llvm_i32_ty], [llvm_any_ty, llvm_any_ty, llvm_i64_ty],
               [IntrNoMem]>;
+  def int_bpf_load_u32_to_ptr : GCCBuiltin<"__builtin_bpf_load_u32_to_ptr">,
+              Intrinsic<[llvm_ptr_ty], [llvm_ptr_ty, llvm_i64_ty], [IntrReadMem]>;
 }