Otherwise, we fail to compile calls to CUDA kernels that are static members.
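For illustration, here is a minimal sketch of the kind of code the summary refers to: a kernel declared as a static member function and launched from another member of the same class. The names `Scaler`, `scale`, `run`, and `scale_on_device` are hypothetical and do not come from the patch or its tests. The thread does not say whether this exact form goes through `BuildCallToMemberFunction`, so treat it as an assumption about the general shape of the problem rather than a reproducer.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical example type; not taken from the patch under review.
template <typename T>
struct Scaler {
  // A kernel that is a static member function of a class template.
  static __global__ void scale(T *data, T factor, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
      data[i] *= factor;
  }

  // Launch the static member kernel from a non-static member function.
  // The unqualified name `scale` is resolved as a member of this class.
  cudaError_t run(T *device_data, T factor, int n) const {
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;
    scale<<<blocks, threads>>>(device_data, factor, n);
    return cudaGetLastError();
  }
};

// Host helper (hypothetical): copies a non-empty vector to the device,
// scales it in place, and copies it back. Returns false on any CUDA error.
bool scale_on_device(std::vector<float> &values, float factor) {
  float *dev = nullptr;
  const size_t bytes = values.size() * sizeof(float);
  if (cudaMalloc(&dev, bytes) != cudaSuccess)
    return false;
  bool ok = cudaMemcpy(dev, values.data(), bytes,
                       cudaMemcpyHostToDevice) == cudaSuccess;
  Scaler<float> scaler;
  ok = ok && scaler.run(dev, factor,
                        static_cast<int>(values.size())) == cudaSuccess;
  // The blocking device-to-host copy also waits for the kernel to finish.
  ok = ok && cudaMemcpy(values.data(), dev, bytes,
                        cudaMemcpyDeviceToHost) == cudaSuccess;
  cudaFree(dev);
  return ok;
}
```

The launch configuration in `<<<blocks, threads>>>` is the "exec config" discussed in the comments below: it has to reach the point where the call is finally built, whichever resolution path the call takes.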
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Need a test for a non-template static member function used as a kernel. Also need codegen tests.
I've added more tests for different code paths leading to the kernel call. Interestingly enough, only the a0 case actually calls BuildCallToMemberFunction. The other variants go through different code paths that handle the call without it.
As for the codegen, all kernel calls that make it to clang::Sema::BuildResolvedCallExpr with launch config are handled the same way. I don't think codegen tests will buy us much.
I am concerned that there may be more places that need handling, and passing the exec config expr through function arguments may not scale. Is it possible to represent the kernel call expr by a class derived from the call expr and add the exec config expr to it as a member?
I don't think it's worth it.
This config pass-through code has been around from the very early days of attempting to implement CUDA and we're already passing it around during call resolution.
AFAICT, this particular place was a relatively new addition which didn't implement the pass-through of the config.
While there may be other places where a similar issue may happen in the future (or already exists as a corner case we haven't found yet), if/when we run into it, it will be diagnosed, as it was in this case.
It took us a few years until we ran into this one. I'm pretty sure that this particular code path is pretty rare and the patch is not going to have a measurable impact on compiler performance.