Page MenuHomePhabricator

[BPF] Add support for floats and doubles
ClosedPublic

Authored by iii on Jul 7 2020, 3:23 AM.

Details

Summary

Some BPF programs compiled on s390 fail to load, because s390
arch-specific linux headers contain float and double types. At the
moment there is no BTF_KIND for floats and doubles, so the release
version of LLVM ends up emitting type id 0 for them, which the
in-kernel verifier does not accept.

Introduce support for such types to libbpf by representing them using
the new BTF_KIND_FLOAT.

Diff Detail

Event Timeline

iii created this revision.Jul 7 2020, 3:23 AM

@iii Thanks for the patch. Looks like you also addressed the issue for larger vlen for struct/union/enum/func_proto. Currently, we mostly just ignored these types. We issued fatal error for unhandled types (mostly non-C types). Such a compiler sanitization should be fine. I am thinking whether we should issue a compiler warning for such unsupported type or not. If we did, we will need an option to silence them. Not sure whether it is worth it or not since we did not issue warning right now for these types.

For the code itself, is it possible to keep all visit function as void type and do unknown type generation whenever it hits (e.g., float/double, large vlen)?
For unknown types, I still like to keep it as fatal error. If it did happen, we will see how to deal with it. I just do not want to silently convert it.

iii updated this revision to Diff 276704.Jul 9 2020, 4:45 AM

Thanks for the review!

I've changed the return types back to void and added addUnknownType()
calls where needed. I've also restored llvm_unreachable() for unknown
DIType subclasses.

I don't have strong opinion on whether using int for float/double or byte array or completely skip emitting BTF for those members. I would slightly prefer the later (skip emitting BTF) for now until we figure out a real use case that the bpf prog needs to read them. I think the bpf program is not supposed to access the float/double anyway, right?.

I would prefer a warning. especially for >64k members which should not happen, right?

Thanks for the change. LGTM. Could you add some warnings for these unsupported types (using llvm::errs())? Without this patch, BTF will be rejected. This patch will make them pass
verifier. Warning will let user know the program has unsupported types or certain limitations and these types and limitations are worked around by the compiler. People may report
these types and limitations back to bpf community so we can evaluate whether we need to extend BTF to have proper support.

iii updated this revision to Diff 277007.Jul 10 2020, 5:53 AM

I've added the warning. On my s390 it looks like this when building
kselftests:

CLANG    kselftest/bpf/tools/build/bpftool/pid_iter.bpf.o

Type float is not supported by BTF, emitting unsigned char[4] instead
Type double is not supported by BTF, emitting unsigned char[8] instead

I don't think existing progs are using any of these fields, but it
might be beneficial to be able to at least memcpy their contents to
e.g. a perf event buffer for later inspection.

ast added a subscriber: ast.Jul 10 2020, 12:15 PM
ast added a subscriber: anakryiko.Jul 10 2020, 12:20 PM
ast added a comment.Jul 10 2020, 12:24 PM

does pahole convert float/double to int ? Is it really the case?
I think it's better to skip float/double when they are part of a struct and leave a hole in there.
I worry that representing them as 'char' array may cause backward compatibility issues later.
If pahole is doing such hack now it probably should be fixed too.

As far as structs with >64k members I think the hard error is better.

pahole does a sensible thing, it represents it as BTF_KIND_INT (which maps to DWARF's DW_TAG_base_type), but with no encoding (so it's effectively unsigned integer with proper size).

I've been advocating for adding one extra encoding bit to BTF_KIND_INT's encoding field to represent floating-point numbers, in addition to existing CHAR, SIGNED, and BOOL. This would make sense (float is indivisible set of bytes with particular interpretation, which is what BTF_KIND_INT is), it's easy to support in kernel, it's easy to sanitize in libbpf.

The only downside is a bit mismatched INT name (DWARF uses more generic BASE_TYPE name for this), which I think is very-very minor (and one can argue that all the primitive types are integers).

For structs with >64k members, currently, it is ignored (with type id 0), no compilation error. That is why I recommend to issue a warning. I worried maybe some codes there with such huge struct/enum (esp. enum), which is not used by bpf program directly, but present in the bpf code. The fatal error may break them. But I agree such cases should be really rare, so maybe fatal error is okay.

Using hole instead of representing float as another type may make it complicated to handle cases like

const float ...
float * ...
float a[100];

some may be struct members, some other may be pointee's, some may be function arguments, etc....
Or we only care about float member in a structure, which is the motivation for this patch, and all
other potential use of float should be ignored (as of today)?

I concur with @yonghong-song, making float disappear is much bigger pain.

But I also think we should go with encoding and actually also fix encoding interpretation in kernel. Right now it allows only one of SIGNED, CHAR, or BOOL to be specified, which makes it impossible to correctly represent signed char type, as you can see from this:

$ cat test.c
struct s {
        signed char wat;
};

int main() {
        static struct s s = { .wat = -2 };
        return s.wat;
}
$ cc -g test.c -o test

$ readelf -wi test | rg 'signed char'
    <48>   DW_AT_encoding    : 6        (signed char)
    <49>   DW_AT_name        : (indirect string, offset: 0x50): signed char

$ pahole -JV test | rg 'signed char'
[2] INT signed char size=1 bit_offset=0 nr_bits=8 encoding=(none)

$ clang -g -target bpf -c test.c -o test.bpf.o

$ bpftool btf dump file test.bpf.o | rg 'signed char'
[5] INT 'signed char' size=1 bits_offset=0 nr_bits=8 encoding=SIGNED

DWARF gets it right. pahole does the same thing as for float, just emits no encoding. clang emits it as signed, but not a char.

BTW, I couldn't make neither pahole nor clang to emit CHAR encoding for anything (char, unsigned char, signed char), what was it supposed to be used for originally?

So I'd say let's add BTF_INT_FLOATING and fix SIGNED CHAR and CHAR problem altogether? Libbpf can trivially sanitize BTF for older kernels.

Thoughts?

ast added a comment.Jul 10 2020, 3:26 PM

'float *var;' in the struct won't be trivial to handle with a skip indeed.
My concern with float->char[] substitute is that the kernel decides on calling convention based on these types.
If there is a function fn(int a, float b); and pahole emits BTF for it as fn(int a, char b[4])
the kernel will let bpf progs to attach to it with wrong register passing.
Currently array is not allowed in btf_distill_func_proto(), so it's a theoretical issue, but still dangerous long term.
I think pahole/clang should either skip generating BTF for anything with float or BTF should be extended to encode it.
I think extending BTF would be easier.
I don't like KIND_INT_FLOATING though. KIND_FLOAT is better. Just like single KIND_INT that represents char/int/long
the KIND_FLOAT should be able to represent float/double/long double.

yonghong-song added a comment.EditedJul 10 2020, 3:58 PM

I will experiment with proper float type support soon. new kind e.g., KIND_FLOAT proper cleanest based on current state of BTF.

iii abandoned this revision.Jul 13 2020, 2:40 AM

I'll close this change now. Thanks for the discussion; looking forward to the proper floating point support in BTF!

@iii I probably won't have time to work on this in the near future. Since you have a use case here, can you help do a design and implementation here? This may involve llvm, libbpf, kernel and pahole changes. Thanks!

iii added a comment.Jul 15 2020, 2:48 PM

@yonghong-song Sure, I will give it a try.

Sounds good. Thanks!

iii updated this revision to Diff 322155.Mon, Feb 8, 10:08 AM

Hi, it has been a long time, but I've finally implemented the support
for the floating-point types in BTF. Here are the other pieces:

Could you please have a quick look regarding whether this all makes
sense and whether we want to go forward with this? I can then split
the commits and send them to the mailing lists for the proper review.

In D83289#2549050, @iii wrote:

Hi, it has been a long time, but I've finally implemented the support
for the floating-point types in BTF. Here are the other pieces:

fyi, github.com/libbpf/libbpf is derived from kernel sources, so there is no need for separate Github parts

Could you please have a quick look regarding whether this all makes
sense and whether we want to go forward with this? I can then split
the commits and send them to the mailing lists for the proper review.

I still think float/double doesn't deserve its own BTF_KIND_FLOAT kind and all the extra code (and APIs). Just like DWARF is doing just fine with just a DW_ATE_float encoding in base_type kind, so can BTF have BTF_KIND_INT (unfortunate name, but it corresponds to DWARF's base_type) with special FLOAT encoding. Having 'float'/'double' as BTF_KIND_INT (without extra encoding bits) would make all this work with vmlinux.h generation with absolutely no changes in kernel, libbpf, pahole, today. We could also add FLOAT encoding for kernel to deal with calling conventions, though. That would still require less changes across existing tool chains and whichever other users of BTF are out there.

Either way, libbpf as it is right now is missing BTF sanitization bits. You'd probably want to implement BTF_KIND_FLOAT -> BTF_KIND_INT substitution for that in bpf_object__sanitize_btf().

iii updated this revision to Diff 323849.Mon, Feb 15, 5:02 PM

Removed an unnecessary trailing u32.

iii retitled this revision from [BPF] Emit unknown types as byte arrays to [BPF] Add support for floats and doubles.Mon, Feb 15, 5:21 PM
iii edited the summary of this revision. (Show Details)
yonghong-song accepted this revision.Wed, Feb 17, 10:24 PM

LGTM. But let us delay merging until kernel side patch is reasonable shape and ready to merge, then we can merge this patch to llvm. Thanks!

This revision is now accepted and ready to land.Wed, Feb 17, 10:24 PM

@iii if you have write permission, can you merge the patch by yourself? Otherwise, I can do that. Let me know. Thanks!

This revision was landed with ongoing or failed builds.Fri, Mar 5, 6:11 AM
This revision was automatically updated to reflect the committed changes.