Page MenuHomePhabricator

[BPF] Preserve debuginfo array/union/struct type/access index
AbandonedPublic

Authored by yonghong-song on May 10 2019, 3:35 PM.

Details

Summary

For background of BPF CO-RE project, please refer to

http://vger.kernel.org/bpfconf2019.html

In summary, BPF CO-RE intends to compile bpf programs
adjustable on struct/union layout change so the same
program can run on multiple kernels with adjustment
before loading based on native kernel structures.

In order to do this, we need keep track of GEP(getelementptr)
instruction base and result debuginfo types, so we
can adjust on the host based on kernel BTF info.
Capturing such information as an IR optimization is hard
as various optimization may have tweaked GEP and also
union is replaced by structure it is impossible to track
fieldindex for union member accesses.

Three intrinsic functions, preserve_{array,union,struct}_access_index,
are introducted.

addr = preserve_array_access_index(base, index, dimension)
addr = preserve_union_access_index(base, di_index)
addr = preserve_struct_access_index(base, gep_index, di_index)

here,

base: the base pointer for the array/union/struct access.
index: the last access index for array, the same for IR/DebugInfo layout.
dimension: the array dimension.
gep_index: the access index based on IR layout.
di_index: the access index based on user/debuginfo types.

If using these intrinsics blindly, i.e., transforming all GEPs
to these intrinsics and later on reducing them to GEPs, we have
seen up to 7% more instructions generated. To avoid such an overhead,
a clang builtin is proposed:

base = __builtin_preserve_access_index(base)

such that user wraps to-be-relocated GEPs in this builtin
and preserve_*_access_index intrinsics only apply to
those GEPs. Such a buyin will prevent performance degradation
if people do not use CO-RE, even for programs which use
bpf_probe_read().

For example, for the following example,

$ cat test.c
struct sk_buff {
   int i;
   int b1:1;
   int b2:2;
   union {
     struct {
       int o1;
       int o2;
     } o;
     struct {
       char flags;
       char dev_id;
     } dev;
     int netid;
   } u[10];
};

static int (*bpf_probe_read)(void *dst, int size, const void *unsafe_ptr)
    = (void *) 4;

#define _(x) (__builtin_preserve_access_index(x))

int bpf_prog(struct sk_buff *ctx) {
  char dev_id;
  bpf_probe_read(&dev_id, sizeof(char), _(&ctx->u[5].dev.dev_id));
  return dev_id;
}
$ clang -target bpf -O2 -g -emit-llvm -S -mllvm -print-before-all \
  test.c >& log

The generated IR looks like below:

...
define dso_local i32 @bpf_prog(%struct.sk_buff*) #0 !dbg !15 {
  %2 = alloca %struct.sk_buff*, align 8
  %3 = alloca i8, align 1
  store %struct.sk_buff* %0, %struct.sk_buff** %2, align 8, !tbaa !45
  call void @llvm.dbg.declare(metadata %struct.sk_buff** %2, metadata !43, metadata !DIExpression()), !dbg !49
  call void @llvm.lifetime.start.p0i8(i64 1, i8* %3) #4, !dbg !50
  call void @llvm.dbg.declare(metadata i8* %3, metadata !44, metadata !DIExpression()), !dbg !51
  %4 = load i32 (i8*, i32, i8*)*, i32 (i8*, i32, i8*)** @bpf_probe_read, align 8, !dbg !52, !tbaa !45
  %5 = load %struct.sk_buff*, %struct.sk_buff** %2, align 8, !dbg !53, !tbaa !45
  %6 = call [10 x %union.anon]* @llvm.preserve.struct.access.index.p0a10s_union.anons.p0s_struct.sk_buffs(
       %struct.sk_buff* %5, i32 2, i32 3), !dbg !53, !llvm.preserve.access.index !19
  %7 = call %union.anon* @llvm.preserve.array.access.index.p0s_union.anons.p0a10s_union.anons(
       [10 x %union.anon]* %6, i32 1, i32 5), !dbg !53
  %8 = call %union.anon* @llvm.preserve.union.access.index.p0s_union.anons.p0s_union.anons(
       %union.anon* %7, i32 1), !dbg !53, !llvm.preserve.access.index !26
  %9 = bitcast %union.anon* %8 to %struct.anon.0*, !dbg !53
  %10 = call i8* @llvm.preserve.struct.access.index.p0i8.p0s_struct.anon.0s(
       %struct.anon.0* %9, i32 1, i32 1), !dbg !53, !llvm.preserve.access.index !34
  %11 = call i32 %4(i8* %3, i32 1, i8* %10), !dbg !52
  %12 = load i8, i8* %3, align 1, !dbg !54, !tbaa !55
  %13 = sext i8 %12 to i32, !dbg !54
  call void @llvm.lifetime.end.p0i8(i64 1, i8* %3) #4, !dbg !56
  ret i32 %13, !dbg !57
}

!19 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "sk_buff", file: !3, line: 1, size: 704, elements: !20)
!26 = distinct !DICompositeType(tag: DW_TAG_union_type, scope: !19, file: !3, line: 5, size: 64, elements: !27)
!34 = distinct !DICompositeType(tag: DW_TAG_structure_type, scope: !26, file: !3, line: 10, size: 16, elements: !35)

Note that @llvm.preserve.{struct,union}.access.index calls have metadata llvm.preserve.access.index
attached to instructions to provide struct/union debuginfo type information.

For &ctx->u[5].dev.dev_id,

. The "%6 = ..." represents struct member "u" with index 2 for IR layout and index 3 for DI layout.
. The "%7 = ..." represents array subscript "5".
. The "%8 = ..." represents union member "dev" with index 1 for DI layout.
. The "%10 = ..." represents struct member "dev_id" with index 1 for both IR and DI layout.

Basically, traversing the use-def chain recursively for the 3rd argument of bpf_probe_read() and
examining all preserve_*_access_index calls, the debuginfo struct/union/array access index
can be achieved.

The intrinsics also contain enough information to regenerate codes for IR layout.
For array and structure intrinsics, the proper GEP can be constructed.
For union intrinsics, replacing all uses of "addr" with "base" should be enough.

Diff Detail

Repository
rL LLVM

Event Timeline

yonghong-song created this revision.May 10 2019, 3:35 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 10 2019, 3:35 PM

Tests seems to be missing?

@lebedev.ri Thanks for the comment. This patch is not ready to land yet. Yes, tests are missing and I am going to add tests later.
More importantly, I want to get a sense whether what I am implementing here is the right direction or not.
The following two other patches are also related to describe our end goal:

https://reviews.llvm.org/D61810
https://reviews.llvm.org/D61524
rsmith added a subscriber: rsmith.May 10 2019, 4:26 PM

If I understand correctly, you want to be able to compile a program against some struct and union layouts, and then at load time "update" the program to cope with the actual layouts for those types being something else, but containing (at least) the set of members you know about. And your proposed approach is to add intrinsics into the IR that identify the GEPs that we emitted to model struct field accesses, but to otherwise not change the emitted IR. If so, I don't see how this approach to that problem can work. There seem to be a couple of problems:

%2 = alloca %struct.sk_buff*, align 8

The size you allocate here will presumably need to vary as the struct layout changes, and you have no way of knowing which allocas will need their sizes to be changed.

%6 = getelementptr inbounds %struct.sk_buff, %struct.sk_buff* %5, i32 0, i32 2, !dbg !54
%7 = call [10 x %union.anon]*
    @llvm.preserve.di.access.index.p0a10s_union.anons.p0a10s_union.anons.p0s_struct.sk_buffs(
    [10 x %union.anon]* %6, %struct.sk_buff* %5,
    i8* getelementptr inbounds ([8 x i8], [8 x i8]* @0, i32 0, i32 0), i32 3), !dbg !54

If this call is supposed to somehow represent that the %6 GEP is computed from %5 by way of a struct field access, this will not be robust against any kind of optimization: optimizations will happily rewrite uses of %6 to use %5 or some other form directly, and you will have lost track of the pointers you need to update.

It would seem better to me to represent a relocateable GEP directly: that is, instead of emitting a regular GEP and some fixup intrinsic, you could emit an intrinsic call instead of the GEP, and have the intrinsic compute the field offset. (And you'll need to do something about your allocas, such as using an intrinsic to get the struct size and emitting a dynamic alloca of that size.) For example, instead of the above, you could emit

%6 = call i8* @llvm.preserve.di.access.gep.2(i8* %5, i32 0, i32 2, !typeinfo)

(where !typeinfo is whatever information you need to identify struct sk_buff later on, when generating your relocations, and the indices you pass in are whichever ones you need to identify the subobject within that type information). Does that make sense?

The size you allocate here will presumably need to vary as the struct layout changes, and you have no way of knowing which allocas will need their sizes to be changed.

Your example is just a pointer; the size of a pointer won't change.

That said, yes, address computation isn't the only place C cares about the layout of a struct. In particular variables (local or global) of struct type, sizeof(), and offsetof() need different handling. But I'm not sure how much any of those matter for BPF.

If this call is supposed to somehow represent that the %6 GEP is computed from %5 by way of a struct field access, this will not be robust against any kind of optimization: optimizations will happily rewrite uses of %6 to use %5 or some other form directly, and you will have lost track of the pointers you need to update.

I think actually, the idea is that the intrinsic represents the computation of some additional adjustment, and so the following instructions use %7, the result of the call, not %6. So I think the existing formulation is sound.

That said, it's awkward to have both an instruction and an intrinsic representing parts of the same offset computation; it probably makes sense to emit just the intrinsic, instead of both the intrinsic and the GEP, like you suggest.

@rsmith @eli.friedman Thanks for your comments. I fully agree that it seems awkward that we have both GEP and intrinsic generation. I will try to do some experiments here to only have intrinsic generation. My only concern is possible performance degradation. I will post here once I got concrete implementation and numbers.

@rsmith regarding to your concerns about struct size change. Yes, the structure size may indeed change. Currently, we plan only to support field fields (non-struct/union type with 1/2/4/8 bytes, and strings). These are most common use cases. Let me explain a little bit more.

Here, we are targeting the bpf_probe_read (https://github.com/torvalds/linux/blob/master/include/uapi/linux/bpf.h#L532-L537)

int bpf_probe_read(void *dst, u32 size, const void *src)

which tries to copy some data from kernel (src pointer) to the stack of the bpf program.

Here, if "src" points to a structure, and user intends to copy the whole structure out of kernel, then "size" may need to be different for different kernels if different kernels indeed have different structure size.
But first, in all our use cases, users typically do not read the whole structure. Second, if "size" does change, we request users to use multiple versioning (using patchable global variables) with possibly different "size" values for different kernel versions.
Note that "size" must be constant for bpf verifier to succeed.
(https://github.com/torvalds/linux/blob/master/kernel/trace/bpf_trace.c#L138-L156)

Just gave another two examples here about reading kernel memories in iovisor/bcc. Note that
bcc will transform certain pointer access "b->c" to "bpf_probe_read(&dest, size, &b->c)" internally.

The bcc tool biosnoop.py contain several bpf probe read:

https://github.com/iovisor/bcc/blob/master/tools/biosnoop.py#L110
      data.len = req->__data_len;
https://github.com/iovisor/bcc/blob/master/tools/biosnoop.py#L117
      data.len = req->__data_len;
https://github.com/iovisor/bcc/blob/master/tools/biosnoop.py#L118
      data.sector = req->__sector;
https://github.com/iovisor/bcc/blob/master/tools/biosnoop.py#L120
      struct gendisk *rq_disk = req->rq_disk;
https://github.com/iovisor/bcc/blob/master/tools/biosnoop.py#L121-L122
     bpf_probe_read(&data.disk_name, sizeof(data.disk_name),
                     rq_disk->disk_name);

They are all primitive types and strings.

Another example for bcc tool tcpconnect.py

https://github.com/iovisor/bcc/blob/master/tools/tcpconnect.py#L128
  u16 dport = skp->__sk_common.skc_dport;
https://github.com/iovisor/bcc/blob/master/tools/tcpconnect.py#L136-L137
      data4.saddr = skp->__sk_common.skc_rcv_saddr;
      data4.daddr = skp->__sk_common.skc_daddr;

.....

In summary, right now, the to-be-read kernel memory size (through a specific bpf_probe_read() call)
won't change between different kernel versions. So we do not need to handle structure allocation size change here.

yonghong-song retitled this revision from [BPF] Preserve original struct/union type name/access index and array subscripts to [BPF] Preserve debuginfo array/union/struct type name/access index.
yonghong-song edited the summary of this revision. (Show Details)
yonghong-song added reviewers: rsmith, lebedev.ri.

do not generate gep any more, only intrinsics

@rsmith @eli.friedman Hi, I just implemented a new version which generates intrinsics only. Three intrinsics are implemented for array/union/structure separately as their handling and parameters are different. Could you take a look? Thanks!

efriedma added inline comments.May 16 2019, 12:35 PM
lib/CodeGen/CGExpr.cpp
3359 ↗(On Diff #199843)

I don't think there's any reason to expect the array subscript is a ConstantInt here?

yonghong-song marked an inline comment as done.May 16 2019, 1:25 PM
yonghong-song added inline comments.
lib/CodeGen/CGExpr.cpp
3359 ↗(On Diff #199843)

right.

yonghong-song edited the summary of this revision. (Show Details)

array index may not be constant and remove inst_name

yonghong-song retitled this revision from [BPF] Preserve debuginfo array/union/struct type name/access index to [BPF] Preserve debuginfo array/union/struct type/access index.
yonghong-song edited the summary of this revision. (Show Details)

add clang builtin __builtin_preserve_access_index

lebedev.ri resigned from this revision.May 23 2019, 10:26 AM

Don't think i will be of any help here.

test/CodeGen/bpf-offsetreloc.c
2–4 ↗(On Diff #200639)

This looks like a bad test.
Can't you use FileCheck?
I'd think there is more to test than a single tiny test..

yonghong-song marked an inline comment as done.May 23 2019, 5:29 PM
yonghong-song added inline comments.
test/CodeGen/bpf-offsetreloc.c
2–4 ↗(On Diff #200639)

Thanks. I will add more tests with FileCheck once we get an agreement for the general approach.

ast accepted this revision.May 23 2019, 6:06 PM

from kernel, libbpf, tooling and user experience points of view this approach looks the best to me.

This revision is now accepted and ready to land.May 23 2019, 6:06 PM

@rsmith @eli.friedman Do you have any comments on the clang intrinsic interface in this patch and the llvm intrinsics interface at https://reviews.llvm.org/D61810?

remove bpf offsetreloc option and use FileCheck in the test.

ast accepted this revision.May 28 2019, 5:22 PM

lgtm

@rsmith @eli.friedman Ping again. Do you have any comments on my proposed clang/IR intrinsics? Thanks!

The general idea of restricting the intrinsics to specific contexts makes sense. I'm not sure it makes sense to mark expressions, as opposed to types, though; can we really expect the user to know which expressions to apply this to?

I'd like to see an actual specification for this in docs/LanguageExtensions.rst at some point.

Other than that, generally seems okay.

lib/CodeGen/CGExpr.cpp
663 ↗(On Diff #201793)

I'm not sure you can use getParents like this safely... it's not really meant for use inside of clang semantic analysis/code generation, and I don't think we recompute it as the AST changes.

yonghong-song marked an inline comment as done.Jun 19 2019, 3:44 PM

@eli.friedman Sorry for replying late. I am outside US and currently in PTO. Will back to US soon to address your comments.

can we really expect the user to know which expressions to apply this to?

Yes, this is specifically targeting some bpf helper calls like bpf_probe_read. So users know which expressions to apply.

I'd like to see an actual specification for this in docs/LanguageExtensions.rst at some point.

I will find a place to put this into docs/LanguageExtensions.rst.

lib/CodeGen/CGExpr.cpp
663 ↗(On Diff #201793)

Good point. Let me check whether I can traverse AST instead.

do not use ctx.getParents() to check whether a GEP candidate inside a preserve_access_index. instead, mark the region upfront. Add the new clang intrinsic into the language doc.

yonghong-song marked an inline comment as not done.Jun 24 2019, 12:14 PM

@eli.friedman I removed the usage of astcontext getParents() stuff. Instead, I mark the region upfront. I also added the intrinsic __builtin_preserve_access_index() into clang
lang extention doc. Could you take a look at new patch? Thanks!

@eli.friedman Hi, Just ping. I have removed getParents() for ASTContext and added description of the new intrinsic in language doc. Could you take a look? Thanks!

The new approach to tracking expressions inside of __builtin_preserve_access_index seems okay.

Please let @rsmith comment since he looked at this before.

docs/LanguageExtensions.rst
1958 ↗(On Diff #206274)

"preserved with the IR intrinsics" isn't really useful; this is the user's manual, not a developer guide to LLVM internals. Probably better to say what it enables from the user's perspective: the CO-RE feature for BPF targets.

1960 ↗(On Diff #206274)

I would rather not have __builtin_preserve_access_index fail to do anything when debug info is disabled. If it's hard to fix, making it a hard error is probably okay.

yonghong-song marked 2 inline comments as done.Jun 27 2019, 10:12 PM

@rsmith I have proposed one clang intrinsic and three IR intrinsics for CO-RE. Could you take a look and share your opinions as well? Thanks!

docs/LanguageExtensions.rst
1958 ↗(On Diff #206274)

Will reword to be more towards users in the next revision.

1960 ↗(On Diff #206274)

The IR intrinsics needs to have debuginfo as the metadata so that the user-level access info can be reconstructed. If no debug info, just IR intrinsics without debuginfo is less useful. So let me make a hard error.

We can relax it later if IR intrinsics without deebuginfo metadata becomes workable/useful.

reword the lang doc for builtin_preserve_access_index() to be more user focused
treat using builtin_preserve_access_index() without -g or in nested way as compiler errors

yonghong-song edited the summary of this revision. (Show Details)

change void *__builtin_preserve_access_index(void *) to const void * _builtin_preserve_access_index(const void *).

handle unnamed bitfield properly. these struct/union members are not encoded in debuginfo, so skip them during preserve_*_access_index IR intrinsics generation.

@rsmith Just ping, could you take a look at the clang change? If possible, could you share your opinion? Thanks!

@rsmith Ping again, do you have any comments on this patch? Thanks!

This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptJul 8 2019, 9:22 PM
jdoerfert reopened this revision.EditedJul 10 2019, 6:26 PM
jdoerfert added a subscriber: jdoerfert.

The test coverage here is not acceptable:

  1. The command line of the tests is far from what it should be (see other tests).
  2. The check lines do little to prevent regressions in the future.
  3. There are no test for wrong usage, errors, ...

Also, it seems these patches were reviewed without the respective commits list as a subscriber. Please add these for any future code review.

This revision is now accepted and ready to land.Jul 10 2019, 6:26 PM

@jdoerfert Thanks for comments. I will submit another patch to add adequate tests for clang frontend as you suggested.

The followup patch for this change is at https://reviews.llvm.org/D64598.

yonghong-song abandoned this revision.Jul 24 2019, 7:28 PM

abandon this one. A revised version with most tests has been merged.