This is a cheap pass so there's no need to limit to -O3.
This removes some differences between various pipelines.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
I don't think it really matters, not checking the opt level simplifies things a tiny bit, but if you feel strongly I can restrict this to not run at -O1
Not going to insist. I don't particularly care about the O1 pipeline, and I don't think we have a good understanding of what it is supposed to be anyway.
Hi, @aeubanks , this patch is hurting bpf backend.
The bpf backend calling convention is only allowing 5
parameters in registers and does not allow pass arguments
through stacks. But ArgumentPromotionPass could increase
the number of arguments for the function. If the eventual
number of arguments is more than 5, bpf backend will
have a fatal error.
The issue is discovered when I am compiling with latest clang
on bpf selftests, By default, for bpf community, it is recommended
to use -O2 as the compilation flag and that is why the issue is
not exposed earlier. This patch enabled the transformation at -O2,
and I hit the following errors with latest clang:
... progs/test_xdp_noinline.c:739:8: error: too many args to t8: i64 = GlobalAddress<ptr @encap_v4> 0, progs/test_xdp_noinline.c:739:8 if (!encap_v4(xdp, cval, &pckt, dst, pkt_bytes)) ^ ... progs/test_xdp_noinline.c:321:6: error: defined with too many args bool encap_v4(struct xdp_md *xdp, struct ctl_value *cval, ^ ...
The following is a further analysis why the above happens.
struct flow_key { union { __be32 src; __be32 srcv6[4]; }; union { __be32 dst; __be32 dstv6[4]; }; union { __u32 ports; __u16 port16[2]; }; __u8 proto; }; struct packet_description { struct flow_key flow; __u8 flags; }; static __attribute__ ((noinline)) bool encap_v4(struct xdp_md *xdp, struct ctl_value *cval, struct packet_description *pckt, struct real_definition *dst, __u32 pkt_bytes) { ... __u32 ip_suffix = bpf_ntohs(pckt->flow.port16[0]); ... ip_suffix ^= pckt->flow.src; ... } __attribute__ ((noinline)) static int process_packet(...) { struct packet_description pckt = { }; ... encap_v4(xdp, cval, &pckt, dst, pkt_bytes) ... }
In function encap_v4, for argument pckt, only pckt->flow_port16[0]
and pckt->flow_src are referenced.
Before ArgumentPromotionPass pass, encap_v4 signature (in IR):
i1 @encap_v4(ptr noundef %xdp, ptr noundef %cval, ptr noundef %pckt, ptr noundef %dst, i32 noundef %pkt_bytes)
After ArgumentPromotionPass pass, encap_v4 signature (in IR):
i1 @encap_v4(ptr noundef %xdp, ptr noundef %cval, i32 %pckt.0.val, i16 %pckt.32.val, ptr noundef %dst, i32 noundef %pkt_bytes)
Basically, the argument 'pckt' is replaced with two arguments corresponding two pckt dereferences
inside encap_v4. But this transformation increases the number of arguments to 6 and eventually
bpf backend complains and errors out.
What is the proper way to resolve this? Adding a TTI hook to get target maximum number of
arguments? E.g, if -1, means no constraints, otherwise, it should return a >= 0 number which
indicates the maximum number of allowed arguments? Is it possible ArgumentPromotionPass may
replace the old reference parameter with a new struct argument with struct size > 8? If this
is the case we will have problem too.
Or for simplicity, simply disable this transformation for bpf backend?
To clarify, is this a limitation of the BPF backend, or BPF itself? That is, is the backend just missing legalization support for converting too many arguments into pass by stack?
Generally I'd be fine with a TTI hook to control the maximum number of arguments though.
@nikic This is a BPF limitation. See the following kernel document:
https://www.kernel.org/doc/html/latest/bpf/bpf_design_QA.html#q-can-more-than-5-function-arguments-be-supported-in-the-future
Q: Can more than 5 function arguments be supported in the future? ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ A: NO. BPF calling convention only allows registers R1-R5 to be used as arguments. BPF is not a standalone instruction set. (unlike x64 ISA that allows msft, cdecl and other conventions)
I will propose a TTI hook then.
Hi, this causes test failures on chromium's crashpad_tests.
Note: Google Test filter = StartHandlerForSelfTestSuite/StartHandlerForSelfTest.StartHandlerInChild/94 [==========] Running 1 test from 1 test suite. [----------] Global test environment set-up. [----------] 1 test from StartHandlerForSelfTestSuite/StartHandlerForSelfTest [ RUN ] StartHandlerForSelfTestSuite/StartHandlerForSelfTest.StartHandlerInChild/94 ERROR crashpad_tests[4189468:4189468]: [file_io.cc(94)] ReadExactly: expected 1, observed 0 ../../third_party/crashpad/crashpad/client/crashpad_client_linux_test.cc:474: Failure Value of: LoggingReadFileExactly(ReadPipeHandle(), &c, sizeof(c)) Actual: false Expected: true Stack trace: #0 0x55ac0ffc6334 crashpad::test::(anonymous namespace)::StartHandlerForSelfInChildTest::MultiprocessParent() #1 0x55ac1033b844 crashpad::test::Multiprocess::RunParent() #2 0x55ac1033b322 crashpad::test::Multiprocess::Run() #3 0x55ac0ffc5cb7 crashpad::test::(anonymous namespace)::StartHandlerForSelfTest_StartHandlerInChild_Test::TestBody() ../../third_party/crashpad/crashpad/client/crashpad_client_linux_test.cc:494: Failure Expected equality of these values: reports.size() Which is: 0 report_expected ? 1u : 0u Which is: 1 Stack trace: #0 0x55ac0ffc666f crashpad::test::(anonymous namespace)::StartHandlerForSelfInChildTest::MultiprocessParent() #1 0x55ac1033b844 crashpad::test::Multiprocess::RunParent() #2 0x55ac1033b322 crashpad::test::Multiprocess::Run() #3 0x55ac0ffc5cb7 crashpad::test::(anonymous namespace)::StartHandlerForSelfTest_StartHandlerInChild_Test::TestBody() ../../third_party/crashpad/crashpad/test/multiprocess_posix.cc:139: Failure Failed Child exited with code 0, expected termination by signal 11 (Segmentation fault) Stack trace: #0 0x55ac1033b733 crashpad::test::Multiprocess::Run() #1 0x55ac0ffc5cb7 crashpad::test::(anonymous namespace)::StartHandlerForSelfTest_StartHandlerInChild_Test::TestBody() [ FAILED ] StartHandlerForSelfTestSuite/StartHandlerForSelfTest.StartHandlerInChild/94, where GetParam() = (true, false, true, true, true, 4-byte object <02-00 00-00>) (6 ms) [----------] 1 test from StartHandlerForSelfTestSuite/StartHandlerForSelfTest (6 ms total) [----------] Global test environment tear-down [==========] 1 test from 1 test suite ran. (6 ms total) [ PASSED ] 0 tests. [ FAILED ] 1 test, listed below: [ FAILED ] StartHandlerForSelfTestSuite/StartHandlerForSelfTest.StartHandlerInChild/94, where GetParam() = (true, false, true, true, true, 4-byte object <02-00 00-00>)
Hi, this causes test failures on chromium's crashpad_tests.
NVM, it's due to flaky tests.