This is an archive of the discontinued LLVM Phabricator instance.

[Clang][BPF] implement __builtin_btf_type_id() builtin function
ClosedPublic

Authored by yonghong-song on Feb 15 2020, 8:41 AM.

Details

Summary

Such a builtin function is mostly useful for preserving the BTF type id
of non-global data. For example,

extern void foo(..., void *data, int size);
int test(...) {
  struct t { int a; int b; int c; } d;
  d.a = ...; d.b = ...; d.c = ...;
  foo(..., &d, sizeof(d));
}

The function "foo" above only sees raw data and does not
know the type of that data. In certain cases, e.g., logging,
the additional type information helps with pretty printing.

This patch implements a BPF-specific builtin

u32 btf_type_id = __builtin_btf_type_id(param, flag)

which returns a BTF type id for "param".
flag == 0 indicates a BTF local relocation,
which means the BTF type id is only adjusted when the BPF program's BTF changes.
flag == 1 indicates a BTF remote relocation,
which means the BTF type id is adjusted against the Linux kernel or
other future entities.
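
To illustrate why a type id helps a consumer such as "foo", here is a minimal host-side sketch in plain C. It does not call the builtin and does not use real BTF: the ID `TYPE_ID_T`, the `field_desc`/`type_desc` tables, and `pretty_print()` are all hypothetical stand-ins, and every field is assumed to be an `int`. In a BPF program the ID would presumably come from `__builtin_btf_type_id(d, 0)` instead of a hand-assigned constant.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

struct t { int a; int b; int c; };

/* Hypothetical, hand-assigned ID. In a BPF program this value would come
 * from __builtin_btf_type_id(d, 0) rather than a constant. */
#define TYPE_ID_T 1u

/* Hypothetical type descriptions; real BTF is encoded differently. */
struct field_desc { const char *name; size_t offset; };
struct type_desc {
    uint32_t id;
    const char *name;
    size_t size;
    size_t nfields;
    const struct field_desc *fields;
};

static const struct field_desc t_fields[] = {
    { "a", offsetof(struct t, a) },
    { "b", offsetof(struct t, b) },
    { "c", offsetof(struct t, c) },
};

static const struct type_desc type_table[] = {
    { TYPE_ID_T, "t", sizeof(struct t), 3, t_fields },
};

/* Format raw bytes into buf according to type_id. Every field is assumed
 * to be an int. Returns the number of characters written, or -1 if the
 * type is unknown, the size does not match, or buf is too small. */
static int pretty_print(char *buf, size_t buflen, uint32_t type_id,
                        const void *data, size_t size)
{
    assert(buf != NULL && data != NULL);

    for (size_t i = 0; i < sizeof(type_table) / sizeof(type_table[0]); i++) {
        const struct type_desc *td = &type_table[i];
        if (td->id != type_id || td->size != size)
            continue;

        int n = snprintf(buf, buflen, "struct %s {", td->name);
        if (n < 0 || (size_t)n >= buflen)
            return -1;
        size_t used = (size_t)n;

        for (size_t f = 0; f < td->nfields; f++) {
            int value;
            memcpy(&value, (const char *)data + td->fields[f].offset,
                   sizeof(value));
            n = snprintf(buf + used, buflen - used, " %s=%d;",
                         td->fields[f].name, value);
            if (n < 0 || (size_t)n >= buflen - used)
                return -1;
            used += (size_t)n;
        }

        n = snprintf(buf + used, buflen - used, " }");
        if (n < 0 || (size_t)n >= buflen - used)
            return -1;
        return (int)(used + (size_t)n);
    }
    return -1; /* without a known type id the data stays opaque */
}
```

Calling `pretty_print(out, sizeof(out), TYPE_ID_T, &d, sizeof(d))` with `d = {1, 2, 3}` fills `out` with a field-by-field rendering, while an unknown ID leaves the consumer with nothing but raw bytes.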

Event Timeline

yonghong-song created this revision. Feb 15 2020, 8:41 AM

The corresponding LLVM side of change is https://reviews.llvm.org/D74572

ast accepted this revision. Feb 19 2020, 6:28 PM

lgtm. Thanks for explaining the lvalue/value trick.

This revision is now accepted and ready to land. Feb 19 2020, 6:28 PM

rebase on top of master

Let's extend __builtin_btf_type_id() to accept a second argument specifying whether it's a local BTF ID (from the program's BTF) or a target BTF ID (from kernel/module BTF). We can probably make it an enum, just like with the preserve_access_index() built-in, for easy future extension. WDYT?

This is on my to-do list. I haven't done it yet since the builtin is not used yet. I will do this once my bpf_iter v2 is sent out.

yonghong-song edited the summary of this revision.

add second argument to __builtin_btf_type_id() to indicate whether a relocation should be generated or not.

What's the use case for flag==0 (no relocation)? Why use the built-in at all in that case? Also, does flag==1 mean relocating to the local BTF ID or the remote (kernel) BTF ID? Do you plan to add flag=2 as well to cover both cases? Or am I misunderstanding the meaning of this flag?

Originally, I thought of flag = 0 for the following use case:

e.g., they just want to know the type of a particular local structure for pretty-printing purposes.
Note that currently only types for global/extern variables and function parameters are recorded in BTF.
  int test() {
     struct { int a; int b; ...} ctx;
     btf_id = __builtin_btf_type_id(ctx, 0);
     bpf_seq_write(seq, &btf_id, sizeof(btf_id));
     bpf_seq_write(seq, &ctx, sizeof(ctx));
     ...
  }

But obviously, without relocation, this will not work with BTF deduplication, future static linking, etc.

flag 1: for relocation. My original thinking was for vmlinux relocation.

I think you brought up a good point about local relocation, so I will need
to change the flag to:

flag 0: local relocation
flag 1: vmlinux relocation

Two more relocation types will be generated:

BTF_TYPE_ID_LOCAL
BTF_TYPE_ID_REMOTE

yonghong-song edited the summary of this revision. May 5 2020, 9:29 AM

Right, that's what I was thinking. We should have relocation always, even if it's a noop for libbpf today.

Yep, that would be great.
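
The flag meanings settled on in this exchange could be captured in an enum, as suggested earlier in the thread for easy future extension. The following is only a sketch: the enumerator names `BTF_TYPE_ID_FLAG_LOCAL` and `BTF_TYPE_ID_FLAG_REMOTE`, the helper `reloc_kind_for_flag()`, and the numeric values of `enum reloc_kind` are hypothetical. Only the flag values 0 and 1 and the relocation names BTF_TYPE_ID_LOCAL and BTF_TYPE_ID_REMOTE come from the discussion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical enumerator names; the thread fixes only the values 0 and 1. */
enum btf_type_id_flag {
    BTF_TYPE_ID_FLAG_LOCAL  = 0, /* relocate against the program's own BTF */
    BTF_TYPE_ID_FLAG_REMOTE = 1, /* relocate against vmlinux (kernel) BTF */
};

/* Relocation kinds named in the thread; their numeric values here are
 * placeholders, not the values the compiler actually emits. */
enum reloc_kind {
    BTF_TYPE_ID_LOCAL,
    BTF_TYPE_ID_REMOTE,
};

/* Lock in the flag values agreed on in the discussion. */
static_assert(BTF_TYPE_ID_FLAG_LOCAL == 0, "flag 0 means local relocation");
static_assert(BTF_TYPE_ID_FLAG_REMOTE == 1, "flag 1 means remote relocation");

/* Map a flag value to a relocation kind.
 * Returns 0 and stores the kind in *out, or -1 for an unknown flag. */
static int reloc_kind_for_flag(uint64_t flag, enum reloc_kind *out)
{
    assert(out != NULL);

    switch (flag) {
    case BTF_TYPE_ID_FLAG_LOCAL:
        *out = BTF_TYPE_ID_LOCAL;
        return 0;
    case BTF_TYPE_ID_FLAG_REMOTE:
        *out = BTF_TYPE_ID_REMOTE;
        return 0;
    default:
        return -1; /* room for future flag values */
    }
}
```

Rejecting unknown values explicitly keeps the door open for further flags without silently treating them as one of the two existing cases.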

This revision was automatically updated to reflect the committed changes.
dblaikie added a subscriber: dblaikie.

(also mentioned on the LLVM side of this, D74572)

If this is in the interests of retaining certain types in the emitted debug info, it seems quite complicated compared to what I'd hope for. Would it be sufficient to support __attribute__((used)) on a type and have that force the type to be emitted into DWARF? (This could be implemented using DICompileUnit's "retainedTypes" list, and is similar to what LLVM does for references to enum constants where the enum isn't otherwise used as a type in a function signature, variable, etc.: attaching them to the DICompileUnit's "enums" list.) That seems like a more general-purpose and tidier way than adding/removing things, having extra transformation passes, etc. (And even if __attribute__((used)) isn't suitable for your needs, hopefully you could at least still use DICompileUnit's retainedTypes list to save your types detected through some other means, and not need the backend pass to strip out these extra intrinsic calls, etc.)