This is an archive of the discontinued LLVM Phabricator instance.

[BPF] introduce __builtin_bpf_load_u32_to_ptr() intrinsic
AbandonedPublic

Authored by yonghong-song on Jun 9 2020, 9:46 AM.

Details

Reviewers
ast
Summary

In the current Linux BPF UAPI

https://github.com/torvalds/linux/blob/master/include/uapi/linux/bpf.h

structs __sk_buff and xdp_md have the fields

__u32 data;
__u32 data_end;
__u32 data_meta;

which actually represent pointers. Typically, in a BPF program, users
write the following:

void *data = (void *)(long)__sk_buff->data;

The kernel verifier magically copies the full address into the target
64-bit register for __sk_buff->data and hopes that nothing gets messed
up along the way and the address survives into the variable "data".

In the past, we have seen a few issues with this. For example,
for the above C code, the IR looks like:

i32_v = load u32
i64_v = zext i32_v
...

The BPF backend has tried, through InstrInfo.td pattern matching and
optimization during MachineInstr SSA analysis and transformation,
to recognize the above pattern and remove the zext so the real address
is preserved. But this is still fragile; in the past we have had to
fix multiple bugs caused by other changes in the BPF backend, and the
optimization may not cover all possible cases. Some users even resort
to inline assembly to work around potentially missed compiler
zext elimination.

This patch introduces the following builtin function for the BPF target:

void *ptr = __builtin_bpf_load_u32_to_ptr(void *base, int offset);

The builtin performs a 32-bit load from address "base + offset" and
returns the zero-extended result. This way, the user is guaranteed
a correct address.

Diff Detail

Event Timeline

yonghong-song created this revision.Jun 9 2020, 9:46 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptJun 9 2020, 9:46 AM
ast added inline comments.Jun 9 2020, 10:33 AM
clang/test/CodeGen/builtin-bpf-load-u32-to-ptr.c
5

can it be expressed as:
__builtin_load_u32_to_ptr(&arg->b) ?

yonghong-song marked an inline comment as done.Jun 9 2020, 11:06 AM
yonghong-song added inline comments.
clang/test/CodeGen/builtin-bpf-load-u32-to-ptr.c
5

Good question. With that form, we would generate code like

tmp = ctx + 76
void *ptr = *(u32 *)(tmp + 0)

IIRC, the verifier does not like this. The verifier prefers ctx + offset.

IntrinsicsBPF.td uses __builtin_bpf_load_u32_to_ptr, while everywhere else it is just __builtin_load_u32_to_ptr. I think having the bpf prefix there is good, given this is bpf-specific. But either way, it should probably be consistent everywhere?

Sounds good. This is indeed specific to bpf; adding the bpf prefix to the public builtin name sounds reasonable. Will make the change.

yonghong-song retitled this revision from [BPF] introduce __builtin_load_u32_to_ptr() intrinsic to [BPF] introduce __builtin_bpf_load_u32_to_ptr() intrinsic.
yonghong-song edited the summary of this revision. (Show Details)

Change the builtin name from __builtin_load_u32_to_ptr to __builtin_bpf_load_u32_to_ptr to reflect that it is bpf-specific.

ast added a comment.Jun 9 2020, 7:28 PM

It feels that the same thing can be represented as inline asm.
What advantage does the builtin have?

In D81479#2083933, @ast wrote:

It feels that the same thing can be represented as inline asm.
What advantage does the builtin have?

Yes, this can be represented as inline asm. I tend not to like inline assembly code in bpf programs.
But maybe for this one, we could hide the inline asm in a header and expose the same signature, like
__builtin_bpf_load_u32_to_ptr(), to users.

I guess I will go with inline asm in the kernel for now, as LLVM already seems to do a pretty good job of parsing/understanding inline asm and integrating it into optimization passes. A few passes like SimplifyCFG, GVN, etc. may have some impact, but that probably does not really matter for the use case here.
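The inline-asm alternative could be sketched roughly as follows, as a hypothetical header helper for clang's BPF target; the helper name, asm syntax, and constraints here are assumptions for illustration, not taken from the patch or the kernel:

```c
/* Hypothetical header helper (clang -target bpf).  The asm statement
   loads 32 bits from ctx + off directly into the destination register;
   since the verifier rewrites this load itself, the compiler never
   emits a separate zero-extension that could clobber the address. */
static inline void *bpf_load_u32_to_ptr(void *ctx, const int off)
{
    void *ptr;
    asm volatile("%[ptr] = *(u32 *)(%[ctx] + %[off])"
                 : [ptr] "=r"(ptr)
                 : [ctx] "r"(ctx), [off] "i"(off));
    return ptr;
}
```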

yonghong-song abandoned this revision.Aug 6 2020, 5:52 PM

Inline asm can do the work, so abandoning this patch.