This is an archive of the discontinued LLVM Phabricator instance.

[clang][AST matchers] adding submatchers under cudaKernelCallExpr to match kernel launch config
Needs ReviewPublic

Authored by ajohnson-uoregon on Mar 3 2022, 6:11 PM.

Details

Summary

adding more AST matchers for all possible launch params to a CUDA kernel, e.g. cudaGridDim and cudaSharedMemPerBlock

Diff Detail

Event Timeline

Herald added a project: Restricted Project. · View Herald TranscriptMar 3 2022, 6:11 PM
Herald added a subscriber: yaxunl. · View Herald Transcript
ajohnson-uoregon requested review of this revision.Mar 3 2022, 6:11 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 3 2022, 6:11 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript
aaron.ballman added a subscriber: aaron.ballman.

Typically, we only add AST matchers when there's an obvious need for them (we do this because compiling this file takes a *long* time and generates quite a few symbols in the executable). I'm not certain that these CUDA matchers rise to that level of need -- are you planning to write a number of clang-tidy checks that use these new interfaces (or some other in-tree use)?

(Assuming you think we should continue to progress this patch, you should fix the clang-format issues, add test coverage, and regenerate documentation.)

fixing clang-format things and adding docs; also making docs a bit more clear

I still need to write tests but: I do have a use case for these over here: https://github.com/ajohnson-uoregon/llvm-project/blob/feature-ajohnson/clang-tools-extra/clang-rewrite/ConstructMatchers.cpp#L472
tl;dr, we'd like to match the kernel launch arguments (i.e., the arguments to __cudaPushCallConfiguration()) and these matchers made writing the code to generate those AST matchers much easier.

Without at least the hasKernelConfig() matcher, it's actually currently impossible to match the kernel launch args. (I wasn't able to find a way after quite a while poking at the AST, at least.) As for the others, it's not clear how to match the kernel launch args without exposing the fact that there's a second CallExpr inside the CUDAKernelCallExpr to the user and writing a pretty messy matcher, along the lines of cudaKernelCallExpr(hasKernelConfig(callExpr(hasArgument(0, expr())))) for the grid dim. cudaKernelCallExpr(cudaGridDim()) is a lot cleaner and easier to understand.

I still need to write tests but: I do have a use case for these over here: https://github.com/ajohnson-uoregon/llvm-project/blob/feature-ajohnson/clang-tools-extra/clang-rewrite/ConstructMatchers.cpp#L472
tl;dr, we'd like to match the kernel launch arguments (i.e., the arguments to __cudaPushCallConfiguration()) and these matchers made writing the code to generate those AST matchers much easier.

Thanks for the example use case, that helps! Do you expect to need it in more than one project though? We support defining local AST matchers so people can do one-off matching, but we typically only add AST matchers for things that we expect to be generally useful (multiple projects would benefit from it). For example, this project adds a number of local matchers: https://github.com/llvm/llvm-project/blob/main/clang-tools-extra/clang-tidy/misc/RedundantExpressionCheck.cpp#L435

Without at least the hasKernelConfig() matcher, it's actually currently impossible to match the kernel launch args. (I wasn't able to find a way after quite a while poking at the AST, at least.) As for the others, it's not clear how to match the kernel launch args without exposing the fact that there's a second CallExpr inside the CUDAKernelCallExpr to the user and writing a pretty messy matcher, along the lines of cudaKernelCallExpr(hasKernelConfig(callExpr(hasArgument(0, expr())))) for the grid dim. cudaKernelCallExpr(cudaGridDim()) is a lot cleaner and easier to understand.

Yup, I think you need these kind of matchers for what you want to do. What I'm less certain of is whether others will need them (we don't have any CUDA-specific clang-tidy modules, I don't think any of the existing coding standards we support checks for have anything to say about CUDA, etc).

For the specific argument matchers (cudaBlockDim(), cudaStream(), etc), I agree, I'm not sure if there's any other use cases in tree. I wasn't aware you could write local definitions like that, I could definitely do that within my project for those.

I would still like to upstream hasKernelConfig() though, so it's at least possible to match the kernel config without defining a local matcher.

For the specific argument matchers (cudaBlockDim(), cudaStream(), etc), I agree, I'm not sure if there's any other use cases in tree. I wasn't aware you could write local definitions like that, I could definitely do that within my project for those.

SGTM!

I would still like to upstream hasKernelConfig() though, so it's at least possible to match the kernel config without defining a local matcher.

I think that's reasonable. One thing to keep in mind for that is there are multiple language modes that all have the notion of a kernel (OpenCL and SYCL both do as well). We should make sure the matcher can be extended for those cases someday in the future, or be given a CUDA-specific name (e.g, hasCUDAKernelConfig()) if that's a better approach. (I suspect polymorphic matchers will work fine for this, but it's worth someone considering explicitly.)

Removing specific argument matchers and leaving just hasKernelConfig()

Thanks! This is missing test coverage for the new matcher, and you should also add a release note for it.

clang/include/clang/ASTMatchers/ASTMatchers.h
7859–7861