This is an archive of the discontinued LLVM Phabricator instance.

[Clang][BPF] Support record argument with direct values
ClosedPublic

Authored by yonghong-song on Aug 18 2022, 9:51 AM.

Details

Summary

Currently, record arguments are always passed by reference by allocating
space for record values in the caller. This is less efficient for
small records which may take one or two registers. For example,
for x86_64 and aarch64, for a record size up to 16 bytes, the record
values can be passed by values directly on the registers.

This patch added BPF support of record argument with direct values
for up to 16 byte record size. If record size is 0, that record
will not take any register, which is the same behavior for x86_64
and aarch64. If the record size is greater than 16 bytes, the
record argument will be passed by reference.

Diff Detail

Event Timeline

yonghong-song created this revision.Aug 18 2022, 9:51 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 18 2022, 9:51 AM
yonghong-song requested review of this revision.Aug 18 2022, 9:51 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 18 2022, 9:51 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript

Are there any restrictions about number of struct arguments that can be passed in? Kernel changes were assuming at most 2, should we have a test that tests passing 3 structs that fit into 5 input registers and another test that passes 3 structs that do not fit in 5 input registers? And what should be the behavior in the latter case -- compilation error or switch to "pass by reference"?

yonghong-song added a comment.EditedAug 18 2022, 11:30 AM

Are there any restrictions about number of struct arguments that can be passed in? Kernel changes were assuming at most 2, should we have a test that tests passing 3 structs that fit into 5 input registers and another test that passes 3 structs that do not fit in 5 input registers? And what should be the behavior in the latter case -- compilation error or switch to "pass by reference"?

This is a good question. I actually tried to add such a test but unfortunately didn't find a good way to add a test for the failure case.
The following is an example,

[$ ~/tmp3] cat t2.c
struct t1 {
  long a;
};
struct t2 {
  long a;
  long b;
};

long foo1(long a1, long a2, long a3, long a4, long a5) {
  return a1 + a2 + a3 + a4 + a5;
}

long foo2(struct t1 a1, long a2, long a3, long a4, long a5) {
  return a1.a + a2 + a3 + a4 + a5;
}

long foo3(struct t2 a1, long a2, long a3, long a4, long a5) {
  return a1.a + a2 + a3 + a4 + a5;
}
[$ ~/tmp3] clang -target bpf -O2 -emit-llvm -S -Xclang -disable-llvm-passes t2.c
[$ ~/tmp3] clang -target bpf -O2 -emit-llvm -S t2.c
[$ ~/tmp3] clang -target bpf -O2 -S t2.c
t2.c:17:6: error: defined with too many args
long foo3(struct t2 a1, long a2, long a3, long a4, long a5) {
     ^
1 error generated.
[$~/tmp3]

You can see compiler does not flag the error during IR generation and IR optimization.
The error is actually flagged out during IR->Machine IR translation in file

Target/BPF/BPFISelLowering.cpp:      fail(DL, DAG, "defined with too many args");

Because the error is trigged by the BPF backend, which means that .s or .o file will
not get generated so we could not really test it. This is the case for all other
backend triggered fail or fatal_error.

But I think I will add a llvm test from IR to asm code which will include struct arguments (16 bytes)
and it will extend to use 5 registers for all arguments.

yonghong-song added a comment.EditedAug 18 2022, 11:34 AM

The compiler only counted the total number of registers won't exceed 5. You could have 5 struct arguments (each one register), or 2 struct arguments (with two registers each) and one integer register.
Whether to pass by value or by reference is a ABI issue, not related to the number of arguments. Checking whether arguments are enough or not is left to BPF backend lowering.
This is an example,

[$~/tmp3] cat t3.c
int foo(int a1, int a2, int a3, int a4, int a5, int a6) {
  return a1 + a2 + a3 + a4 + a5 + a6;
}
[$ ~/tmp3] clang -target bpf -O2 -S -emit-llvm t3.c
[$~/tmp3] clang -target bpf -O2 -S t3.c
t3.c:1:5: error: defined with too many args
int foo(int a1, int a2, int a3, int a4, int a5, int a6) {
    ^
1 error generated.
[$ ~/tmp3]
yonghong-song added a comment.EditedAug 18 2022, 11:37 AM

Kernel changes were assuming at most 2

My new version code allows all 5 parameters to be a structure argument. Previous limitation 2 is from vmlinux.h where I can at most 2 struct arguments. But 5 is certainly allowed.

ast added a comment.Aug 18 2022, 1:13 PM

clang -target bpf -O2 -S t3.c
t3.c:1:5: error: defined with too many args

this will be the case when backend needs to emit the code,
but if the function is always_inline it can have more than 5 arguments, right?
Would be good to have a test for that to make sure when BPF_KPROBE macro supports casting of 16-byte structs the clang side won't be an issue.

but if the function is always_inline it can have more than 5 arguments, right?

That is correct. Will add a test case for this.

add two llvm/test/CodeGen/BPF tests.

Herald added a project: Restricted Project. · View Herald TranscriptAug 18 2022, 4:47 PM
ast accepted this revision.Aug 18 2022, 4:50 PM
This revision is now accepted and ready to land.Aug 18 2022, 4:50 PM
This revision was landed with ongoing or failed builds.Aug 18 2022, 7:12 PM
This revision was automatically updated to reflect the committed changes.