In the current Linux BPF uapi
  https://github.com/torvalds/linux/blob/master/include/uapi/linux/bpf.h
struct __sk_buff and struct xdp_md have the fields
  __u32 data; __u32 data_end; __u32 data_meta;
which actually represent pointers. In a BPF program, users typically
write the following (with skb being a struct __sk_buff pointer):
  void *data = (void *)(long)skb->data;
The kernel verifier magically copies the address into the target
64-bit register for skb->data, and we hope that nothing disturbs it
before it reaches the variable "data".
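For anyone who wants to try the idiom outside the kernel, here is a minimal user-space sketch. struct fake_ctx, ctx_data and ctx_data_demo are made-up names standing in for the uapi struct and the usual cast. The address value is arbitrary and is never dereferenced.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the uapi context; only the fields quoted above. */
struct fake_ctx {
    uint32_t data;
    uint32_t data_end;
    uint32_t data_meta;
};

/* The (long) cast in the idiom only makes sense when long is pointer-sized. */
_Static_assert(sizeof(long) == sizeof(void *), "sketch assumes an LP64 target");

/* The common idiom: widen the u32 field to long, then reinterpret as a pointer. */
static void *ctx_data(const struct fake_ctx *ctx)
{
    return (void *)(long)ctx->data;
}

/* Usage: the unsigned field is zero-extended, so bit 31 does not smear upward. */
static void ctx_data_demo(void)
{
    struct fake_ctx c = { .data = 0x80001000u, .data_end = 0x80002000u };
    assert((uintptr_t)ctx_data(&c) == 0x80001000u);
}
```

In user space this only shows what the C source asks for. In a real BPF program the field is expected to hold a full address after the verifier has rewritten the load, which is where the trouble described next comes from.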
In the past, we have seen a few issues with this. For example,
for the above C code, the IR looks like:
  i32_v = load u32
  i64_v = zext i32_v
  ...
The BPF backend has tried, through InstrInfo.td pattern matching or
optimization in MachineInstr SSA analysis and transformation,
to recognize the above pattern and remove the zext so that the real
"addr" is preserved. But this is still fragile: in the past, we had
to fix multiple bugs caused by other changes in the BPF backend. The
optimization may not cover all possible cases. Some users may even
use inline assembly to work around a zext that the compiler failed
to eliminate.
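A sketch of why a surviving zext is harmful, under the assumption (taken from the description above) that the register already holds a full 64-bit address by the time the zext runs. The shift pair is one common way to express zext on a 64-bit-only ALU. Whether the backend emits exactly this sequence is not claimed here, and zext32, zext32_demo and the address are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* zext from 32 to 64 bits, spelled as a shift-left / logical-shift-right pair. */
static uint64_t zext32(uint64_t reg)
{
    return (reg << 32) >> 32;
}

/* If reg is really a 64-bit address, the zext throws away its upper half. */
static void zext32_demo(void)
{
    uint64_t addr = 0xffff888012345678ull;    /* made-up kernel-style address */
    assert(zext32(addr) == 0x12345678ull);    /* upper 32 bits are gone */
    assert(zext32(addr) != addr);             /* so the "addr" did not survive */
}
```

For a value that really is 32 bits wide the operation is a no-op, which is why the pattern looks harmless in the IR.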
This patch introduces the following builtin function for the bpf target:
  void *ptr = __builtin_bpf_load_u32_to_ptr(void *base, int offset);
The builtin performs a 32-bit load from the address "base + offset"
and returns the result, with zext. This way, the user is
guaranteed a correct address.
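The described semantics can be modelled in plain C as below. This is only a user-space model: model_load_u32_to_ptr, struct model_ctx and model_demo are hypothetical names, and the real builtin is lowered by the compiler rather than written like this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model: 32-bit load at base + offset, zero-extended to a pointer. */
static void *model_load_u32_to_ptr(const void *base, int offset)
{
    uint32_t v;
    memcpy(&v, (const char *)base + offset, sizeof(v)); /* the 32-bit load */
    return (void *)(uintptr_t)v;                        /* zext, then use as address */
}

/* Made-up context layout; only the offsets matter for the model. */
struct model_ctx {
    uint32_t len;
    uint32_t data;
    uint32_t data_end;
};

/* Usage: pass the context pointer plus the byte offset of the field. */
static void model_demo(void)
{
    struct model_ctx c = { .len = 0, .data = 0x80001000u, .data_end = 0x80002000u };
    void *p = model_load_u32_to_ptr(&c, (int)offsetof(struct model_ctx, data));
    assert((uintptr_t)p == 0x80001000u);
}
```

The point of the builtin form is that the load and the widening become a single operation from the compiler's point of view, so there is no separate zext left for later passes to keep or drop.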
Can it be expressed as:
  __builtin_load_u32_to_ptr(&arg->b) ?