Add cuda-unchecked-api-call check
Motivation
The cuda_runtime.h header is included by default in all CUDA files. The functions declared there provide an interface between regular C/C++ and the CUDA driver. Most of them report errors by returning a cudaError_t enum value. The specific errors returned by each call are described in the CUDA documentation. However, it is easy to ignore those errors and to forget that the calls can fail. A failure can have call-related causes, such as the address passed to the call being in the wrong address space (host vs. device), or external causes, such as problems with the CUDA device or with the driver. In either case, an ignored failure can and will affect execution: in the best case it leads to a crash, and in the worst case it leaves the program in an incorrect state.
Behavior
The cuda-unchecked-api-call check reports calls to the CUDA API whose return value is unused. It is similar to the bugprone-unused-return-value check, but it is more specific, which makes it more likely to be adopted by the people who need it. A function is treated as a CUDA API function if it:
- Returns a value of type cudaError_t
- Is included through a header whose name ends in cuda_runtime.h (the suffix match allows cuda_runtime.h to be replaced by, for example, _cuda_runtime.h if a Buck setup is configured that way)
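As a rough illustration of the definition above, the sketch below contrasts a call whose result is discarded with calls whose result is used. It relies on stand-in declarations so that it compiles without the CUDA toolkit; in real code the declarations come from cuda_runtime.h, and the check would only consider them when they do. The stub behavior and the helper names (resetUnchecked, resetChecked, clearBuffer) are assumptions made for this example, and the comments about what is reported follow the description above rather than verified output of the check.

```cpp
#include <cstddef>

// Stand-ins for declarations normally provided by cuda_runtime.h
// (assumption: simplified so the sketch builds without CUDA).
enum cudaError_t { cudaSuccess = 0, cudaErrorInvalidValue = 1 };

// Stub: always succeeds; the real function resets the current device.
inline cudaError_t cudaDeviceReset() { return cudaSuccess; }

// Stub: fails on a null pointer to imitate a call-related error.
inline cudaError_t cudaMemset(void* devPtr, int value, std::size_t count) {
  (void)value;
  (void)count;
  return devPtr == nullptr ? cudaErrorInvalidValue : cudaSuccess;
}

// Result discarded: the pattern the check is described as reporting.
void resetUnchecked() { cudaDeviceReset(); }

// Result compared against cudaSuccess: the value is used.
bool resetChecked() { return cudaDeviceReset() == cudaSuccess; }

// Result stored and propagated to the caller: the value is used.
cudaError_t clearBuffer(void* devPtr, std::size_t bytes) {
  cudaError_t err = cudaMemset(devPtr, 0, bytes);
  return err;
}
```

Only resetUnchecked drops the cudaError_t on the floor; the other two functions give the caller a chance to react to a failure.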
Automatic fixes
The check can be configured to produce a FixItHint that wraps the CUDA API call in a macro or a handler function. You can specify the error handler for your project by setting the HandlerName option of the cuda-unchecked-api-call check. For example, with HandlerName set to C10_CUDA_CHECK, the fix transforms unhandled code from:
void foo() { cudaDeviceReset(); }
to
void foo() { C10_CUDA_CHECK(cudaDeviceReset()); }
The specific handler used for this example is taken from PyTorch and its definition can be found here.
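For projects that do not already have such a handler, a minimal sketch of one is shown below. MY_CUDA_CHECK is a hypothetical macro written for this example and is not the PyTorch definition; the cudaError_t enum and the two cuda* functions are stand-in stubs so the sketch compiles without the CUDA toolkit.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Stand-ins for declarations normally provided by cuda_runtime.h
// (assumption: simplified so the sketch builds without CUDA).
enum cudaError_t { cudaSuccess = 0, cudaErrorInvalidValue = 1 };
inline cudaError_t cudaDeviceReset() { return cudaSuccess; }
inline cudaError_t cudaMemset(void* devPtr, int, std::size_t) {
  return devPtr == nullptr ? cudaErrorInvalidValue : cudaSuccess;
}

// Hypothetical handler: evaluates the call once and throws on failure.
#define MY_CUDA_CHECK(expr)                                          \
  do {                                                               \
    const cudaError_t my_cuda_check_err = (expr);                    \
    if (my_cuda_check_err != cudaSuccess) {                          \
      throw std::runtime_error(                                      \
          "CUDA call failed: " #expr ", error code " +               \
          std::to_string(static_cast<int>(my_cuda_check_err)));      \
    }                                                                \
  } while (0)

// The shape of code after the fix is applied with this handler.
void foo() { MY_CUDA_CHECK(cudaDeviceReset()); }

// A failing call now surfaces as an exception instead of being ignored.
void clearOrThrow(void* devPtr, std::size_t bytes) {
  MY_CUDA_CHECK(cudaMemset(devPtr, 0, bytes));
}
```

Throwing is only one possible policy; a project might instead log and abort, or return the error to the caller.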
Limiting the allowed handlers
Since a project may only have a limited set of handlers for the errors returned by CUDA, the AcceptedHandlers option restricts the allowed ways of handling the returned error value. Set it to a comma-separated list of the names (which can be scoped) of the allowed error handlers. If HandlerName is set, it is implicitly added to that list. This option also works with dummy macros that pass the error through without doing anything (these may be present in the code for performance reasons).
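To make the two kinds of handlers concrete, the sketch below defines a scoped function handler and a pass-through macro. Both names (myproject::checkCuda and CUDA_CHECK_DISABLED) are hypothetical; with them, AcceptedHandlers would be set to something like myproject::checkCuda,CUDA_CHECK_DISABLED. The cudaError_t enum and cudaDeviceReset are stand-in stubs so the sketch compiles without the CUDA toolkit.

```cpp
#include <stdexcept>

// Stand-ins for declarations normally provided by cuda_runtime.h
// (assumption: simplified so the sketch builds without CUDA).
enum cudaError_t { cudaSuccess = 0, cudaErrorInvalidValue = 1 };
inline cudaError_t cudaDeviceReset() { return cudaSuccess; }

namespace myproject {
// Hypothetical function handler, referred to by its scoped name.
inline void checkCuda(cudaError_t err) {
  if (err != cudaSuccess) {
    throw std::runtime_error("CUDA call failed");
  }
}
}  // namespace myproject

// Hypothetical dummy macro: forwards the error without inspecting it,
// for example on a hot path where the check is deliberately skipped.
#define CUDA_CHECK_DISABLED(expr) (expr)

// Handled through the scoped function handler.
void resetStrict() { myproject::checkCuda(cudaDeviceReset()); }

// Wrapped in the pass-through macro: no runtime cost, but the
// decision to skip the check is visible in the source.
void resetFast() { CUDA_CHECK_DISABLED(cudaDeviceReset()); }
```

Listing the pass-through macro explicitly keeps unchecked calls intentional and easy to find, rather than indistinguishable from calls that were simply forgotten.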
Parent diffs
This diff relies on D133801 and D133436 to run properly, so feel free to take a look at those as well.