Value, type, and instantiation dependence were not being handled
correctly for CUDAKernelCallExpr AST nodes. As a result, if an undeclared
identifier was used in the triple-angle-bracket kernel call configuration,
there would be no error during parsing, and there would be a crash during code
gen. This patch makes sure that an error will be issued during parsing in this
case, just as there would be for any other use of an undeclared identifier in
C++.
Details
Diff Detail
Event Timeline
test/SemaCUDA/kernel-call.cu | ||
---|---|---|
27 | We set four things in setConfig -- does this test fail if any one of them is commented out? If not, is it hard to add additional tests that will cover the changes? |
- Correct dependence info in CUDA kernel call AST
This patch removes the propagation of type and value dependence and the propagation of information on unexpanded parameter packs from the CUDA kernel configuration function call expression to its parent CUDA kernel call expression AST node. It does, however, maintain the propagation of instantiation dependence between those nodes, as introduced in the previous revision of this patch.
The last patch should not have propagated value and type dependence from the CUDA kernel config function to the entire CUDA kernel call expression AST node. The reason is that the CUDA kernel call expression has a void value, so it's value cannot depend on template types or values, it is always simply void.
However, the CUDA kernel call expression node can contain template arguments, so it can be instantiation dependent. That means that the instantiation dependence should be propagated from the config call to the kernel call node. The instantiation dependence propagation is also sufficient to fix the crashing bug that results from using an undeclared identifier as a config argument.
As for tracking unexpanded parameter packs, it is not yet clear how the CUDA triple-angle-bracket syntax will interoperate with variadic templates, so I will leave that propagation out of this patch and it can be dealt with later.
Looks sane to me, although I have no idea what I'm doing here; you should probably get someone else's approval.
test/SemaCUDA/kernel-call.cu | ||
---|---|---|
27 | Thanks for bringing this up. While trying to find tests that dealt with each dependence individually, I came to realize that value and type dependence should not be set for the CUDAKernelCallExpr node because it's value is always void. So, I removed the propagation of those two dependencies. Then, while looking for a test that could handle the parameter pack information, I realized that it was opening up a whole new can of worms and that the triple-angle-bracket syntax does not currently support variadic templates. I decided that parameter packs should be handled as a separate bug, so I removed them from this patch. The instantiation dependence propagation is still valid, though, because it just represents whether a template parameter is present anywhere in the expression, so I left it in. Correctly tracking instantiation dependence in enough to fix the bug this patch was meant to fix, so I think it is the only change that should be made in this patch. |
include/clang/AST/ExprCXX.h | ||
---|---|---|
176 | Can you assert that the argument is only set once here? (If we set it to something instantiation-dependent and then to something that isn't instantiation-dependent, we'd compute the wrong instantiation-dependence flag.) This function should only be called by the normal constructor and by people who called the EmptyShell constructor. | |
test/SemaCUDA/kernel-call.cu | ||
27 | What happens if an unexpanded pack is used within the kernel arguments of a CUDA kernel call? Do we already reject that? Are there tests for that somewhere? |
test/SemaCUDA/kernel-call.cu | ||
---|---|---|
27 | There don't seem to be any tests currently that handle this case. The case I had in mind for an unexpanded parameter pack was something like the following: __global__ void kernel() {} template <int ...Dimensions> kernel_wrapper() { kernel<<<Dimensions...>>>(); } This currently leads to a warning at the time of parsing that says the closing ">>>" is not found. I believe the cause is that the argument list is parsed as a simple argument list, so it doesn't handle the ellipsis correctly. I experimented with using standard (non-simple) parsing for the argument list, but that led to failures in other unit tests where ">>" wasn't being warned correctly in C++98 mode. I'm planning to file a bug for this (at least to fix the warning if not to allow the construction) and deal with it in a later patch. Does that sound reasonable? |
include/clang/AST/ExprCXX.h | ||
---|---|---|
181 | Perhaps assert(!getPreArg(CONFIG)) instead of storing a separate flag? | |
test/SemaCUDA/kernel-call.cu | ||
27 | Your approach for that testcase seems fine, but it's not a test for the right thing as it doesn't have an unexpanded pack within the kernel call args. Here's a testcase for that scenario: template<int ...Dimensions> void kernel_wrapper() { void (*fs[])() = { []{ kernel<<<Dimensions>>>(); } ... }; } |
include/clang/AST/ExprCXX.h | ||
---|---|---|
181 | Yes, I think that is much nicer. But I am worried that the config pre-arg might not be initialized before I check it, so in my next revision of this patch I've also added code in the CallExpr constructors to make sure that pre-args are always initialized to zero. Unfortunately, the simple way to do this leads to the entire array of function argument pointers being initialized for every function call expression construction (not just CUDA calls). Do you think I should try to avoid this overhead by only initializing the pre-arguments (either in the CallExpr code for everyone to share, or just in the CUDA code)? | |
test/SemaCUDA/kernel-call.cu | ||
27 | Excellent! Thank you for that test case. I suspected that I wasn't really exercising the "unexpanded" parameter pack with my code. |
include/clang/AST/ExprCXX.h | ||
---|---|---|
181 | My preference would be to pass the CallExpr constructor an ArrayRef<Expr*> for the preargs and have it initialize them itself. |
include/clang/AST/ExprCXX.h | ||
---|---|---|
181 | Sounds good. The current version introduces a new constructor for CallExpr that takes an ArrayRef<Expr*> for the preargs, and the CUDAKernelCallExpr constructor now passes the Config argument explicitly to this constructor. This new constructor also replaces the old one that took the number of preargs. In the new constructor I designed the dependency handling for the preargs to mirror the way it is done for regular function arguments, and I extracted a little helper function so as not to repeat that code. I also introduced another new CallExpr constructor to handle the common case of zero preargs, and replaced calls to the old NumPreArgs constructor where zero NumPreArgs was passed explicitly with calls to this new constructor. |
- Handle unexpanded parameter packs
The changes for instantiation dependence also fix a bug with unexpanded parameter packs, so add a unit test for unexpanded parameter packs as well.
rsmith, I think the patch is ready to be committed. Please take a look if you have a moment. Thanks for your help.
LGTM, thanks!
test/SemaCUDA/cxx11-kernel-call.cu | ||
---|---|---|
8 ↗ | (On Diff #44538) | Perhaps also add another test: void (*g[])() = { [] { k1<<<Dimensions, Dimensions>>>(); } ... }; // ok |
Can you assert that the argument is only set once here? (If we set it to something instantiation-dependent and then to something that isn't instantiation-dependent, we'd compute the wrong instantiation-dependence flag.) This function should only be called by the normal constructor and by people who called the EmptyShell constructor.