This is an archive of the discontinued LLVM Phabricator instance.

Differential D133956

Cuda Check for ignored errors after calling a CUDA kernel
Needs Review · Public

Authored by barcisz on Sep 15 2022, 10:44 AM.

Details

Reviewers

tra
alexfh
• alexfh_
pcc
yaxunl
rnk
ivanmurashko
r-barnes
0x1eaf
kuganv
njames93
alex
aaron.ballman
LegalizeAdulthood

Summary

Add cuda-unchecked-kernel-call check

Motivation

Calls to CUDA kernels can yield errors after their invocation. These errors can be obtained by calling cudaGetLastError(), which also resets CUDA’s error state. There is a non-resetting version of this function called cudaPeekAtLastError(), but the lint check does not accept it (see below). A limited set of errors can block a kernel from launching, including driver malfunctions, trying to allocate too much shared memory, or using too many threads or blocks. Since those errors can cause unexpected behavior that blocks subsequent computation, they should be caught as close to the launch point as possible. The lint check enforces this by requiring that every kernel launch be immediately followed by an error check.

Behavior

The cuda-unchecked-kernel-call check verifies that there is a call to cudaGetLastError() directly after each kernel call. To be precise, there can be no side-effecting or branching code between the kernel call and the call to cudaGetLastError(), such as branching due to the ?: operator or due to a function call. This is because more complicated behavior is likely to be harder for humans to read and would be significantly slower to check automatically. We want to encourage well-designed, multi-line macros that check for errors, so we explicitly allow macros whose content is do { /* error check */ } while(false), since this is the recommended way of writing multi-line macros.
The check also accepts the handler it was configured with as a valid way to handle the error, even if the handler does not comply with the rule above (or is a function that cannot be checked easily and quickly). However, it is still encouraged to call cudaGetLastError() early in the handler’s code so that the code stays readable.
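
To illustrate why the do { ... } while(false) form is singled out, here is a minimal sketch of such a macro. It is not taken from the patch: MockError, mockGetLastError(), mockLaunch() and MOCK_KERNEL_LAUNCH_CHECK are hypothetical stand-ins so that the example builds without the CUDA toolkit. The point is only that the wrapped macro expands to a single statement and can therefore be used safely in an unbraced if/else.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for the CUDA runtime (assumption, not real CUDA).
enum MockError { kSuccess = 0, kLaunchFailure = 1 };
static MockError g_last_error = kSuccess;
static std::vector<std::string> g_log;

// Mimics cudaGetLastError(): returns the pending error and resets it.
inline MockError mockGetLastError() {
  const MockError e = g_last_error;
  g_last_error = kSuccess;
  return e;
}

// Pretend kernel launch that records a failure when asked to.
inline void mockLaunch(bool fail) {
  if (fail)
    g_last_error = kLaunchFailure;
}

// Multi-statement handler wrapped in do { ... } while (false): it expands to
// exactly one statement and needs the trailing semicolon at the call site.
#define MOCK_KERNEL_LAUNCH_CHECK()                                          \
  do {                                                                      \
    const MockError err_ = mockGetLastError();                              \
    if (err_ != kSuccess)                                                   \
      g_log.push_back("launch failed: " +                                   \
                      std::to_string(static_cast<int>(err_)));              \
  } while (false)

// The macro sits in an unbraced if/else without breaking the else branch.
inline std::size_t launchAndCheck(bool check, bool fail) {
  g_log.clear();
  mockLaunch(fail);
  if (check)
    MOCK_KERNEL_LAUNCH_CHECK();
  else
    g_log.push_back("check skipped");
  return g_log.size();
}
```

Without the do/while wrapper, the same macro body would expand to two statements and the else above would no longer compile.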

Automatic fixes

The lint check can be configured to automatically fix the issue by adding an error handling macro right after the kernel launch. You can specify the error handler for your project by setting the HandlerName option for the cuda-unchecked-kernel-call check. Here is an example of how this fix transforms unhandled code from:

void foo(bool b) {
  if (b)
    kernel<<<x, y>>>();
}

to:

void foo(bool b) {
  if(b)
    {kernel<<<x, y>>>(); C10_CUDA_KERNEL_LAUNCH_CHECK();}
}

The specific handler used for this example is taken from PyTorch and its definition can be found here.

Known Limitations

Using cudaPeekAtLastError()

cudaPeekAtLastError() can also be used to check for CUDA kernel launch errors. However, there are several reasons why it is not, and most likely will not be, accepted as a valid way to check for errors after kernel invocations. They all follow from the purpose of the function, which is to not reset the internal error variable:

Subsequent kernel calls, even if they don’t produce any errors, will appear to have produced an error because the error was not reset. This behavior is easy to overlook and may cause significant difficulty in debugging.
Our linter cannot easily check whether the error was reset before subsequent kernel calls. It might even be impossible to do so because the error can leak inter-procedurally from functions whose code we can’t access.
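
The first point can be illustrated with a toy model. The sketch below is an assumption-laden mock, not the CUDA runtime: MockRuntime and its methods are hypothetical names that only imitate the documented difference between the two calls (one resets the stored error, the other does not), and it models resettable errors only.

```cpp
#include <cassert>

// Toy model of a resettable error slot (assumption: NOT the CUDA runtime).
enum class MockError { Success, LaunchFailure };

class MockRuntime {
public:
  // A failing launch records an error; a successful one leaves the slot as is.
  void launch(bool fails) {
    if (fails)
      last_ = MockError::LaunchFailure;
  }
  // Imitates cudaPeekAtLastError(): reports the error but keeps it stored.
  MockError peekAtLastError() const { return last_; }
  // Imitates cudaGetLastError(): reports the error and resets the slot.
  MockError getLastError() {
    const MockError e = last_;
    last_ = MockError::Success;
    return e;
  }

private:
  MockError last_ = MockError::Success;
};

// Runs a failing launch followed by a healthy one and returns what a check
// placed after the second launch would observe.
inline MockError secondLaunchReport(bool usePeek) {
  MockRuntime rt;
  rt.launch(/*fails=*/true);
  if (usePeek)
    (void)rt.peekAtLastError();  // error stays pending
  else
    (void)rt.getLastError();     // error is cleared
  rt.launch(/*fails=*/false);
  return rt.peekAtLastError();
}
```

In this model the healthy second launch still appears to fail when the first check used the peeking call, which is the confusion described above.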

Checking for errors that occurred while a kernel was running

Our linter does not check whether errors occurred while a kernel was running; it only enforces a check that the kernel launched correctly. cudaDeviceSynchronize() and similar API calls can be used to verify that a kernel’s computation was successful, but these are blocking calls, so we are not able to suggest automatically where they should go.

Parent diffs

This diff relies on D133436, D133725, and D133942 to run properly, so feel free to take a look at those as well.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

barcisz created this revision. Sep 15 2022, 10:44 AM

Herald added a project: Restricted Project. Sep 15 2022, 10:44 AM

Herald added subscribers: mattd, carlosgalvezp, yaxunl, mgorny.

barcisz requested review of this revision. Sep 15 2022, 10:44 AM

Herald added a subscriber: cfe-commits.

Harbormaster completed remote builds in B186896: Diff 460455. Sep 15 2022, 10:44 AM

Our linter cannot easily check whether the error was reset

It cannot, in principle. Many CUDA errors are 'sticky' and can only be cleared by resetting the GPU or exiting the application, and the former is virtually never used beyond toy examples (resetting a GPU would clear a lot of state, including memory allocations, and restoring it is usually not feasible in practice).
E.g.:
https://stackoverflow.com/questions/43659314/how-can-i-reset-the-cuda-error-to-success-with-driver-api-after-a-trap-instructi
https://stackoverflow.com/questions/56329377/reset-cuda-context-after-exception/56330491

The checker has no way to tell whether the returned error is sticky, and the stickiness can start at any CUDA runtime call. So, generally speaking, all CUDA API calls must be checked, and any of them may be the one reporting a sticky error caused by preceding calls. At the very minimum, in addition to <<<...>>> kernel launches, users may also launch kernels via cudaLaunchKernel() and, I believe, the CUDA runtime itself may launch some helper kernels under the hood, so I would not be surprised to see other sources of errors.

I think ultimately the checker should be generalized to flag all unchecked CUDA runtime calls. The problem is that that is going to be exceedingly noisy in practice as a lot of real code does not bother to check for the errors consistently. Limiting the checks to kernel launches may be a reasonable starting point as it would give us the ability to zero in on the culprit kernel by running the app with "CUDA_LAUNCH_BLOCKING".

clang-tools-extra/test/clang-tidy/checkers/cuda/unsafe-kernel-call-macro-handler.cu
53	Just curious -- is it sufficient to just call `cudaGetLastError();` ? Or does the checker require using its result, too? I.e. in practice this particular check will not really do anything useful. The tests below look somewhat inconsistent.
75	WDYM by "is not considered safe" here? How is that different from calling `cudaGetLastError()` and checking its value?
80–82	What would happen with a single `;` as would be seen in the normal user code?
112	Why does this case produce no warning, while a very similar case above does? In both cases the result of `cudaGetLastError()` is assigned to an unused variable within the loop body.
b<<<1, 2>>>();
// CHECK-MESSAGES: :[[@LINE-1]]:{{[0-9]+}}: warning: Possible unchecked error after a kernel launch.
for(;;)
  auto err2 = cudaGetLastError(); // Brackets omitted purposefully, since they create an additional AST node

barcisz mentioned this in D133942: Clang tidy utility to generate a fix hint for a subsequent expression to the existing one. Sep 15 2022, 1:14 PM

barcisz mentioned this in D133725: Searching for tokens including comments.

In D133956#3793022, @tra wrote:

I think ultimately the checker should be generalized to flag all unchecked CUDA runtime calls. The problem is that that is going to be exceedingly noisy in practice as a lot of real code does not bother to check for the errors consistently. Limiting the checks to kernel launches may be a reasonable starting point as it would give us the ability to zero in on the culprit kernel by running the app with "CUDA_LAUNCH_BLOCKING".

By that do you mean that the way the check is now it is acceptable or that it should be improved to handle intra-procedural analysis? The intention with this check is to work a lot in tandem with the one in D133804, which therefore prevents most such cases. The practice that the check checks for is also commonly used in ML frameworks which heavily rely on CUDA, so not catching such cases might still be helpful for them just for the sake of preserving code consistency and catching such errors (since if there is an error then that means that some part of the code was broken anyways). Thus, the check is optimized for lowering false positives during static checking and for a practice lowering the number of false negatives within the CUDA code.

clang-tools-extra/test/clang-tidy/checkers/cuda/unsafe-kernel-call-macro-handler.cu
53	Technically it does not require the user to actually use the value of `cudaGetLastError()`, but if they are calling it then they most likely did not place this call there randomly and are using it to check for the error returned by the kernel. The check being introduced in D133804 can be used to verify that the return value has been used, so checking it here as well would be a duplication.
75	As in, the check does not do inter-procedural analysis short of adding the handler to AcceptedHandlers, so the check will flag such occurrences.
80–82	Nothing, it would work just fine; it's rather that all other kernel calls in this test use a single `;`, so I want to check this case here.
112	Because a macro will often wrap its error handling code in a do {...} while(0) loop, and that's why we check this case and similar ones with CFG analysis.
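
As a rough illustration of the CFG-based rule mentioned for line 112, the sketch below walks a toy control-flow graph and returns the next statement only if it is reached in a straight line. ToyBlock and nextStraightLineStmt are hypothetical names, and the rule is deliberately simplified compared with the patch: it ignores statements left in the same block as the kernel call, assumes unreachable edges (such as the back edge of do {...} while(0)) have already been dropped, and rejects any block with more than one predecessor.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy CFG block: statements in execution order plus reachable successors.
struct ToyBlock {
  std::vector<std::string> stmts;
  std::vector<std::size_t> succs;
};

// Returns the first statement reached from block `from` by following only
// unambiguous edges, or an empty string if control flow branches or joins.
inline std::string nextStraightLineStmt(const std::vector<ToyBlock> &cfg,
                                        std::size_t from) {
  // Count predecessors so that join points and loop headers can be rejected.
  std::vector<std::size_t> predCount(cfg.size(), 0);
  for (const ToyBlock &b : cfg)
    for (std::size_t s : b.succs)
      ++predCount[s];

  std::size_t cur = from;
  // Bound the number of hops so a cycle of empty blocks cannot loop forever.
  for (std::size_t hops = 0; hops < cfg.size(); ++hops) {
    if (cfg[cur].succs.size() != 1)
      return "";  // branch or function exit
    cur = cfg[cur].succs[0];
    if (predCount[cur] != 1)
      return "";  // reached via more than one path, e.g. a loop back edge
    if (!cfg[cur].stmts.empty())
      return cfg[cur].stmts.front();
  }
  return "";
}
```

With this model, a kernel call followed by an empty block and then a block starting with cudaGetLastError() is accepted, while a for(;;) body is rejected because its block is also reachable through the loop's back edge.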

barcisz retitled this revision from "git push Cuda Check for ignored errors after calling a CUDA kernel" to "Cuda Check for ignored errors after calling a CUDA kernel". Sep 15 2022, 3:01 PM

barcisz mentioned this in D133436: Ground work for cuda-related checks in clang-tidy.

barcisz edited the summary of this revision. Sep 15 2022, 3:04 PM

barcisz added reviewers: tra, alexfh, • alexfh_, pcc, yaxunl, rnk, ivanmurashko, r-barnes, 0x1eaf, kuganv, njames93. Sep 15 2022, 3:13 PM

The intention with this check is to work a lot in tandem with the one in D133804, which therefore prevents most such cases.
Thus, the check is optimized for lowering false positives during static checking and for a practice lowering the number of false negatives within the CUDA code.

SGTM. This patch + D133804 should have everything covered.

clang-tools-extra/test/clang-tidy/checkers/cuda/unsafe-kernel-call-macro-handler.cu
53	Re "If they are calling it then they most likely did not place this call there randomly and are using it to check for the error returned by the kernel": if that's the case, then why are the kernel launches on lines 45 and 51 reported as possibly unchecked? Both are followed by the `cudaGetLastError()` call and are, technically, checked if we're not analyzing the usage of the result of the call. What am I missing?
75	Hmm.. Using a helper function to check for cuda errors is a fairly common pattern. Is there a way to annotate such a helper function as `checks cudaGetLastError`?
80–82	I still do not understand how it all fits together. What do a kernel call, the extra `;`, the macro, and the checker code have to do with each other? Is the idea that the checker should see through the empty statement between the kernel call and the checker macro? If that's the case I'd make it a bit more prominent. E.g. something like this:
b<<<1, 2>>>();
; /* Make sure that we see through empty expressions in-between the call and the checker. */ ;
CUDA_CHECK_KERNEL();
112	The `do/while(0)` wrapping part I understand. I'm puzzled why the checker appears to work differently with different loop kinds. Why a `cudaGetLastError()` call inside `do {} while()` is detected and considered as a cuda result check, but the same call within `for() {}` is not?

Rebase

Harbormaster completed remote builds in B186974: Diff 460550. Sep 15 2022, 4:35 PM

barcisz added inline comments. Sep 15 2022, 4:51 PM

clang-tools-extra/test/clang-tidy/checkers/cuda/unsafe-kernel-call-macro-handler.cu
53	The idea is that the call should happen directly after the kernel launch without any branching, because branching makes the code much harder to understand and, with constructs such as a for loop, can mean that `cudaGetLastError()` is not actually called after every kernel call.
75	There would be an easy way to do that, but it's much more common for projects to have those helper functions project-wide (or at least sub-project wide), which means they can just be specified explicitly for the project in the options for the check (the official documentation for the check will be uploaded tomorrow).
80–82	The reason we're checking for multiple `;`s here is that, because macros are not present in the AST, they have to be located at the lexer stage, which makes it necessary to search for them based on tokens. The tokens used after the kernel call here (semicolons and a comment) are the only tokens allowed between the kernel call and the macro, since any other token would indicate that another statement is present.
112	This is because `for(<something>;<something>;<something>)` works differently in the CFG analysis. The precise definition for where the cudaGetLastError() is allowed is that it should be the first statement/expression tree/function call after the kernel call and should be in a straight line from it. For example, `for(;;) {cudaGetLastError()}` would not be simplifiable to a single control flow block, and `for(;false;) {cudaGetLastError()}` has an expression tree evaluated before the call to `cudaGetLastError()`. Technically, this definition currently only supports wrapping the statement with `cudaGetLastError()`, but it is made more general in case: the user uses gotos to achieve a similar pattern; such a CFG layout can be achieved with other C++ mechanisms; or means to achieve such a CFG layout with different mechanisms appear in future standards of C++.
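
The token-based search described for lines 80–82 can be sketched as follows. This is a simplified illustration with hypothetical types (TokKind, Tok, handlerFollows): it matches the handler by name in a plain token vector, whereas the patch compares the token's source location against macro expansion locations recorded by a preprocessor callback.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical, minimal token model (the real check uses clang's lexer).
enum class TokKind { Semi, Comment, Identifier, Other };

struct Tok {
  TokKind kind;
  std::string text;
};

// Starting at `start` (the first token after the kernel call), skips
// semicolons and comments and reports whether the next token names one of
// the accepted handlers. Any other token means another statement intervenes.
inline bool handlerFollows(const std::vector<Tok> &toks, std::size_t start,
                           const std::set<std::string> &handlers) {
  std::size_t i = start;
  while (i < toks.size() && (toks[i].kind == TokKind::Semi ||
                             toks[i].kind == TokKind::Comment))
    ++i;
  return i < toks.size() && toks[i].kind == TokKind::Identifier &&
         handlers.count(toks[i].text) > 0;
}
```

For the token sequence `; /* comment */ ; CUDA_CHECK_KERNEL` the function returns true when CUDA_CHECK_KERNEL is in the handler set, and false as soon as any other identifier appears first.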

barcisz added reviewers: alex, aaron.ballman, LegalizeAdulthood. Sep 16 2022, 6:46 AM

documentation for the check

Herald added a subscriber: aheejin. Sep 16 2022, 7:07 AM

Harbormaster completed remote builds in B187139: Diff 460743. Sep 16 2022, 7:07 AM

LegalizeAdulthood resigned from this revision. Mar 29 2023, 8:19 AM

Herald added a subscriber: PiotrZSL. Mar 29 2023, 8:19 AM

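Revision Contents

Path	Size
clang-tools-extra/clang-tidy/cuda/CMakeLists.txt	1 line
clang-tools-extra/clang-tidy/cuda/CudaTidyModule.cpp	3 lines
clang-tools-extra/clang-tidy/cuda/UnsafeKernelCallCheck.h	81 lines
clang-tools-extra/clang-tidy/cuda/UnsafeKernelCallCheck.cpp	357 lines
clang-tools-extra/clang-tidy/utils/FixItHintUtils.cpp	2 lines
clang-tools-extra/docs/ReleaseNotes.rst	6 lines
clang-tools-extra/docs/clang-tidy/checks/cuda/unsafe-kernel-call.rst	6 lines
clang-tools-extra/docs/clang-tidy/checks/list.rst	3 lines
clang-tools-extra/test/clang-tidy/checkers/Inputs/Headers/cuda/cuda.h	2 lines
clang-tools-extra/test/clang-tidy/checkers/Inputs/Headers/cuda/cuda_runtime.h	1 line
clang-tools-extra/test/clang-tidy/checkers/cuda/unsafe-kernel-call-function-handler.cu	150 lines
clang-tools-extra/test/clang-tidy/checkers/cuda/unsafe-kernel-call-macro-handler.cu	116 lines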

Diff 460455

clang-tools-extra/clang-tidy/cuda/CMakeLists.txt

 add_clang_library(clangTidyCudaModule
   CudaTidyModule.cpp
   UnsafeApiCallCheck.cpp
+  UnsafeKernelCallCheck.cpp
   LINK_LIBS
   clangTidy
   clangTidyUtils
   )

 clang_target_link_libraries(clangTidyAlteraModule
   PRIVATE
   clangAnalysis
   clangAST
   clangASTMatchers
   clangBasic
   clangLex
   )

clang-tools-extra/clang-tidy/cuda/CudaTidyModule.cpp

 //===--- CudaTidyModule.cpp - clang-tidy --------------------------------===//
 //
 // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
 // See https://llvm.org/LICENSE.txt for license information.
 // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
 //
 //===----------------------------------------------------------------------===//

 #include "../ClangTidy.h"
 #include "../ClangTidyModule.h"
 #include "../ClangTidyModuleRegistry.h"
 #include "UnsafeApiCallCheck.h"
+#include "UnsafeKernelCallCheck.h"

 using namespace clang::ast_matchers;

 namespace clang {
 namespace tidy {
 namespace cuda {

 class CudaModule : public ClangTidyModule {
 public:
   void addCheckFactories(ClangTidyCheckFactories &CheckFactories) override {
     CheckFactories.registerCheck<UnsafeApiCallCheck>("cuda-unsafe-api-call");
+    CheckFactories.registerCheck<UnsafeKernelCallCheck>(
+        "cuda-unsafe-kernel-call");
   }
 };

 // Register the CudaTidyModule using this statically initialized variable.
 static ClangTidyModuleRegistry::Add<CudaModule>
     X("cuda-module", "Adds Cuda-related lint checks.");

 } // namespace cuda

 // This anchor is used to force the linker to link in the generated object file
 // and thus register the CudaModule.
 volatile int CudaModuleAnchorSource = 0;

 } // namespace tidy
 } // namespace clang

clang-tools-extra/clang-tidy/cuda/UnsafeKernelCallCheck.h

This file was added.

				//===--- UnsafeKernelCallCheck.h - clang-tidy -------------------- C++ --===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//

				#ifndef LLVM_CLANG_TOOLS_EXTRA_CLANG_TIDY_CUDA_UNSAFEKERNELCALLCHECK_H
				#define LLVM_CLANG_TOOLS_EXTRA_CLANG_TIDY_CUDA_UNSAFEKERNELCALLCHECK_H

				#include "../ClangTidyCheck.h"
				#include "llvm/ADT/StringSet.h"
				#include <unordered_set>

				namespace clang {
				namespace tidy {
				namespace cuda {

				/// Checks for whether the possible errors with kernel launches are handled.
				///
				/// CUDA kernels do not always launch correctly. This may happen due to a driver
				/// malfunction, lack of permissions, lack of a GPU, or a multitude of other
				/// reasons. Such errors should be detected by calling the cudaGetLastError()
				/// function following the kernel invocation. The invocation of the error should
				/// be the the first side-effectful AST node after the invocation of the kernel
				/// call (traversing the AST post-order) and a part of the first non-expression
				/// statement after the kernel call. More precisely, it should be the first CFG
				/// statement produced in line after the kernel call using the default options
				/// for CFG building. This is because having the error checks closer to the
				/// kernel invocation makes it easier to debug the code.
				///
				/// The check provides the following options:
				/// - "HandlerName" (optional):
				/// specifies the name of the function or the macro to which the return
				/// value of the API call should be passed. This effectively automates the
				/// process of adding the error checks in question for projects that have
				/// such a mechanism implemented in them. The handler will also be accepted
				/// even if it does not actually call cudaGetLastError().
				/// - "AcceptedHandlers" (optional):
				/// a comma-separated list specifying the only accepted handling
				/// functions/macros that can alternatively handle the kernel error besides
				/// the handler specified in HandlerName. The handlers may have scope
				/// specifiers included in them, but if so then the full qualified name
				/// (with all namespaces explicitly stated) has to be provided (for the
				/// performance sake).
				class UnsafeKernelCallCheck : public ClangTidyCheck {
				class PPCallback;

				public:
				UnsafeKernelCallCheck(llvm::StringRef Name,
				clang::tidy::ClangTidyContext *Context);
				void registerPPCallbacks(const SourceManager &SM, Preprocessor *PP,
				Preprocessor *ModuleExpanderPP) override;
				void registerMatchers(clang::ast_matchers::MatchFinder *Finder) override;
				void
				check(const clang::ast_matchers::MatchFinder::MatchResult &Result) override;
				void storeOptions(ClangTidyOptions::OptionMap &Opts) override;

				private:
				const std::string HandlerName;
				void reportIssue(const Stmt &Stmt, ASTContext &Context);
				bool checkHandlerMacro(const Stmt &Stmt, ASTContext &Context);

				const std::string AcceptedHandlersList;
				const llvm::StringSet<llvm::MallocAllocator> AcceptedHandlersSet;
				bool isAcceptedHandler(const StringRef &Name);
				static llvm::StringSet<llvm::MallocAllocator>
				splitAcceptedHandlers(const llvm::StringRef &AcceptedHandlers,
				const llvm::StringRef &HandlerName);

				std::unordered_set<SourceLocation,
				std::function<unsigned(const SourceLocation &)>>
				HandlerMacroLocations;
				};

				} // namespace cuda
				} // namespace tidy
				} // namespace clang

				#endif // LLVM_CLANG_TOOLS_EXTRA_CLANG_TIDY_CUDA_UNSAFEKERNELCALLCHECK_H

clang-tools-extra/clang-tidy/cuda/UnsafeKernelCallCheck.cpp

This file was added.

				//===--- UnsafeKernelCallCheck.cpp - clang-tidy ---------------------------===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//

				#include "UnsafeKernelCallCheck.h"
				#include "../utils/FixItHintUtils.h"
				#include "../utils/LexerUtils.h"
				#include "clang/Analysis/CFG.h"
				#include "clang/Basic/SourceManagerInternals.h"
				#include "clang/Lex/PPCallbacks.h"
				#include "clang/Lex/Preprocessor.h"
				#include "clang/Tooling/FixIt.h"
				#include <cctype>

				using namespace clang::ast_matchers;

				namespace clang {
				namespace tidy {
				namespace cuda {

				namespace {

				constexpr auto HandlerNameOptionName = "HandlerName";
				constexpr auto AcceptedHandlersOptionName = "AcceptedHandlers";

				} // namespace

				UnsafeKernelCallCheck::UnsafeKernelCallCheck(
				llvm::StringRef Name, clang::tidy::ClangTidyContext *Context)
				: ClangTidyCheck(Name, Context),
				HandlerName(Options.get(HandlerNameOptionName, "")),
				AcceptedHandlersList(Options.get(AcceptedHandlersOptionName, "")),
				AcceptedHandlersSet(
				splitAcceptedHandlers(AcceptedHandlersList, HandlerName)),
				HandlerMacroLocations(
				8, [](const SourceLocation &sLoc) { return sLoc.getHashValue(); }) {
				if (AcceptedHandlersSet.find("") != AcceptedHandlersSet.end()) {
				configurationDiag(
				"Empty handler name found in the list of accepted handlers",
				DiagnosticIDs::Error);
				}
				}

				llvm::StringSet<llvm::MallocAllocator>
				UnsafeKernelCallCheck::splitAcceptedHandlers(
				const llvm::StringRef &AcceptedHandlers,
				const llvm::StringRef &HandlerName) {
				if (AcceptedHandlers.trim().empty()) {
				return HandlerName.empty()
				? llvm::StringSet<llvm::MallocAllocator>()
				: llvm::StringSet<llvm::MallocAllocator>{HandlerName};
				}
				llvm::SmallVector<llvm::StringRef> AcceptedHandlersVector;
				AcceptedHandlers.split(AcceptedHandlersVector, ',');

				llvm::StringSet<llvm::MallocAllocator> AcceptedHandlersSet;
				for (auto AcceptedHandler : AcceptedHandlersVector) {
				AcceptedHandlersSet.insert(AcceptedHandler.trim());
				}
				if (!AcceptedHandlersSet.empty() && !HandlerName.empty()) {
				AcceptedHandlersSet.insert(HandlerName);
				}

				return AcceptedHandlersSet;
				}

				void UnsafeKernelCallCheck::storeOptions(ClangTidyOptions::OptionMap &Opts) {
				Options.store(Opts, HandlerNameOptionName, HandlerName);
				Options.store(Opts, AcceptedHandlersOptionName, AcceptedHandlersList);
				}

				bool UnsafeKernelCallCheck::isAcceptedHandler(const StringRef &Name) {
				return AcceptedHandlersSet.contains(Name);
				}

				// Gathers the instances of the handler as a macro being used
				class UnsafeKernelCallCheck::PPCallback : public PPCallbacks {
				public:
				PPCallback(UnsafeKernelCallCheck &Check) : Check(Check) {}

				void MacroExpands(const Token &MacroNameTok, const MacroDefinition &MD,
				SourceRange Range, const MacroArgs *Args) override {
				if (Check.isAcceptedHandler(MacroNameTok.getIdentifierInfo()->getName())) {
				Check.HandlerMacroLocations.insert(MacroNameTok.getLocation());
				}
				}

				private:
				UnsafeKernelCallCheck &Check;
				};

				void UnsafeKernelCallCheck::registerPPCallbacks(
				const SourceManager &SM, Preprocessor *PP, Preprocessor *ModuleExpanderPP) {
				ModuleExpanderPP->addPPCallbacks(
				std::make_unique<UnsafeKernelCallCheck::PPCallback>(*this));
				}

				void UnsafeKernelCallCheck::registerMatchers(MatchFinder *Finder) {
				Finder->addMatcher(functionDecl(hasBody(hasDescendant(cudaKernelCallExpr())))
				.bind("function"),
				this);
				}

				namespace {

				// Fetches the first parent available. Should be used
				// for things that are common for the parents, like the location,
				// since the only way a node can have multiple parents is with templates
				template <typename Node, typename Parent = Node>
				inline const Parent *getParent(const Node &Stmt, ASTContext &Context) {
				auto parents = Context.getParents(Stmt);

				return parents.empty() ? nullptr : parents.begin()->template get<Parent>();
				}

				bool isKernelCall(const Stmt *Stmt) {
				return Stmt->getStmtClass() == Stmt::CUDAKernelCallExprClass;
				}

				bool isInCudaRuntimeHeader(SourceLocation Loc, const SourceManager &SM) {
				constexpr auto CudaHeaderNameSuffix = "cuda_runtime.h";
				while (Loc.isValid()) {
				if (SM.getFilename(Loc).endswith(CudaHeaderNameSuffix)) {
				return true;
				}
				Loc = SM.getIncludeLoc(SM.getFileID(Loc));
				}
				return false;
				}

				bool isCudaGetLastErrorCall(const Stmt *const Stmt, const SourceManager &SM) {
				constexpr auto GetLastErrorFunctionName = "cudaGetLastError";
				constexpr auto GetLastErrorFunctionScopedType = "::cudaError_t";
				constexpr auto GetLastErrorFunctionType = GetLastErrorFunctionScopedType + 2;
				if (Stmt->getStmtClass() != Stmt::CallExprClass) {
				return false;
				}
				auto CallExprNode = static_cast<const CallExpr *>(Stmt);

				if (!CallExprNode->getCalleeDecl() ||
				CallExprNode->getCalleeDecl()->getKind() != Decl::Function) {
				return false;
				}
				const auto FunctionDeclNode =
				static_cast<const FunctionDecl *>(CallExprNode->getCalleeDecl());

				const auto ReturnTypeName = FunctionDeclNode->getReturnType().getAsString();
				return FunctionDeclNode->getName() == GetLastErrorFunctionName &&
				(ReturnTypeName == GetLastErrorFunctionType ||
				StringRef(ReturnTypeName).endswith(GetLastErrorFunctionScopedType)) &&
				isInCudaRuntimeHeader(FunctionDeclNode->getLocation(), SM);
				}

				bool isHandlerCall(
				const Stmt *const Stmt,
				std::function<bool(const llvm::StringRef &)> HandlerNamePredicate) {
				if (Stmt->getStmtClass() != Stmt::CallExprClass) {
				return false;
				}
				auto CallExprNode = static_cast<const CallExpr *>(Stmt);

				if (!CallExprNode->getCalleeDecl() ||
				CallExprNode->getCalleeDecl()->getKind() != Decl::Function) {
				return false;
				}
				const auto FunctionDeclNode =
				static_cast<const FunctionDecl *>(CallExprNode->getCalleeDecl());

				return HandlerNamePredicate(FunctionDeclNode->getName()) ||
				HandlerNamePredicate(FunctionDeclNode->getQualifiedNameAsString());
				}

				/// Searches for the closest CFGElement that is an instance of CFGStmt. Does not
				/// increment the index if it already indexes a CFGStmt.
				const Stmt *findStmt(const CFGBlock *const Block, size_t &Idx) {
				while (Idx < Block->size() && !(*Block)[Idx].getAs<CFGStmt>().has_value()) {
				Idx++;
				}
				if (Idx < Block->size()) {
				return (*Block)[Idx].castAs<CFGStmt>().getStmt();
				}
				return nullptr;
				}

				inline bool isBlockReachable(const CFGBlock::AdjacentBlock &Block) {
				return Block && Block.isReachable();
				}

				template <typename Iter>
				inline size_t countReachableBlocks(llvm::iterator_range<Iter> Range) {
				return std::count_if(Range.begin(), Range.end(), isBlockReachable);
				}

				template <typename Iter>
				inline Iter findReachableBlock(llvm::iterator_range<Iter> Range) {
				return std::find_if(Range.begin(), Range.end(), isBlockReachable);
				}

				/// Searches for a next statement from this successor block as if all the empty
				/// blocks were removed and all blocks that could be merged were merged. For
				/// instance, in the following code the call to b() should be found assuming the
				/// `block` argument is set to the first CFG block after the first block:
				/// int foo() {
				/// a();
				/// do {
				/// do {
				/// b()
				/// } while(0);
				/// } while(0);
				/// }
				const Stmt *findNextStmtNonEmptyBlock(const CFGBlock *const Block) {
				// Enforce that the next block could be mergeable with the next block, i.e.
				// has no non-trivial predecesors. Trivial predecessors here are chains of
				// empty predecessors that have up to one predecessor that is itself a trivial
				// predecessor.
				int PrunedPredCount = 0;
				for (auto Pred : Block->preds()) {
				while (Pred && Pred.isReachable() && Pred->empty() &&
				countReachableBlocks(Pred->preds()) == 1) {
				Pred = *findReachableBlock(Pred->preds());
				}
				if (Pred && (!Pred->empty() || countReachableBlocks(Pred->preds()) > 1)) {
				++PrunedPredCount;
				}
				}
				if (PrunedPredCount > 1) {
				return nullptr;
				}

				// Check if there is any statement in this block that we could return
				size_t Idx = 0;
				if (const auto Stmt = findStmt(Block, Idx)) {
				return Stmt;
				}

				// If the block is empty then try our luck with the next block, provided there
				// is only one
				if (countReachableBlocks(Block->succs()) != 1) {
				return nullptr;
				}
				const auto NextBlock = *findReachableBlock(Block->succs());
				return findNextStmtNonEmptyBlock(NextBlock);
				}

				} // namespace

				void UnsafeKernelCallCheck::check(const MatchFinder::MatchResult &Result) {
				const auto FunctionDeclNode =
				Result.Nodes.getNodeAs<FunctionDecl>("function");
				const auto Cfg = CFG::buildCFG(FunctionDeclNode, FunctionDeclNode->getBody(),
				Result.Context, CFG::BuildOptions());

				for (const auto &block : *Cfg) {
				size_t Idx = 0;
				while (const auto Stmt = findStmt(block, Idx)) {
				++Idx;
				if (!isKernelCall(Stmt)) {
				continue;
				}
				if (checkHandlerMacro(*Stmt, *Result.Context)) {
				continue;
				}

				auto NextStmt = findStmt(block, Idx);
				// Workaround for the do {...} while(0) not being erased out during
				// pruning
				if (!NextStmt) {
				if (countReachableBlocks(block->succs()) != 1) {
				reportIssue(*Stmt, *Result.Context);
				continue;
				}
				const auto NextBlock = findReachableBlock(block->succs());
				NextStmt = findNextStmtNonEmptyBlock(*NextBlock);
				}

				if (NextStmt && isCudaGetLastErrorCall(NextStmt, *Result.SourceManager)) {
				continue;
				}
				if (NextStmt &&
				isHandlerCall(NextStmt, [this](const llvm::StringRef &Name) {
				return isAcceptedHandler(Name);
				})) {
				continue;
				}
				reportIssue(*Stmt, *Result.Context);
				}
				}
				}

				// Searches for a handler macro being used right after the kernel call
				bool UnsafeKernelCallCheck::checkHandlerMacro(const Stmt &Stmt,
				ASTContext &Context) {
				llvm::Optional<Token> Token = Lexer::findNextToken(
				Stmt.getEndLoc(), Context.getSourceManager(), Context.getLangOpts());
				if (!Token.has_value()) {
				return false;
				}
				while (Token->isOneOf(tok::semi, tok::comment)) {
				Token =
				Lexer::findNextToken(Token->getLocation(), Context.getSourceManager(),
				Context.getLangOpts());
				if (!Token.has_value()) {
				return false;
				}
				}
				return HandlerMacroLocations.find(Token->getLocation()) !=
				HandlerMacroLocations.end();
				}

				void UnsafeKernelCallCheck::reportIssue(const Stmt &Stmt, ASTContext &Context) {
				// Get the wrapping expression
				const clang::Stmt *ExprWithCleanups =
				getParent<clang::Stmt, clang::ExprWithCleanups>(Stmt, Context);

				// Under certain compilation options kernel calls may not be wrapped
				// in cleanups
				if (!ExprWithCleanups) {
				ExprWithCleanups = &Stmt;
				}

				const bool IsInMacro = ExprWithCleanups->getBeginLoc().isInvalid() ||
				ExprWithCleanups->getBeginLoc().isMacroID() ||
				ExprWithCleanups->getEndLoc().isInvalid() ||
				ExprWithCleanups->getEndLoc().isMacroID();

				if (!HandlerName.empty()) {
				const auto DiagnosticBuilder = diag(
				Stmt.getEndLoc(), (llvm::Twine("Possible unchecked error after a "
				"kernel launch. Try adding the `") +
				HandlerName + "()` macro after the kernel call:")
				.str());
				if (IsInMacro) {
				return;
				}
				const auto ExprTerminator = utils::lexer::findNextTerminator(
				ExprWithCleanups->getEndLoc(), Context.getSourceManager(),
				Context.getLangOpts());
				const auto ParentStmt = getParent<clang::Stmt>(*ExprWithCleanups, Context);
				assert(ParentStmt);
				DiagnosticBuilder << utils::fixit::addSubsequentStatement(
				SourceRange(ExprWithCleanups->getBeginLoc(), ExprTerminator),
				*ParentStmt, HandlerName + "()", Context);
				} else {
				diag(Stmt.getEndLoc(),
				"Possible unchecked error after a kernel launch. Try using "
				"`cudaGetLastError()` right after the kernel call to get the error or "
				"specify a project-wide kernel call error handler.");
				}
				}

				} // namespace cuda
				} // namespace tidy
				} // namespace clang
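
The straight-line search performed by `findNextStmtNonEmptyBlock` above can be illustrated on a toy graph. The following is a minimal sketch, not the patch's code: `ToyBlock`, `countPreds`, and `findNextStmtStraightLine` are hypothetical names, blocks are plain structs instead of `clang::CFGBlock`, and the predecessor test is simplified to "at most one reachable predecessor" rather than the trivial-predecessor pruning used in the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a CFG block (not clang::CFGBlock): a list of
// statements plus the indices of its reachable successor blocks.
struct ToyBlock {
  std::vector<std::string> Stmts;
  std::vector<std::size_t> Succs;
};

// Counts how many edges in the graph lead into block `Target`.
std::size_t countPreds(const std::vector<ToyBlock> &Cfg, std::size_t Target) {
  std::size_t Count = 0;
  for (const ToyBlock &B : Cfg)
    for (std::size_t S : B.Succs)
      if (S == Target)
        ++Count;
  return Count;
}

// Returns the first statement reached from block `Start` along a straight
// line, or nullptr if the path branches, joins, or ends before a statement.
const std::string *findNextStmtStraightLine(const std::vector<ToyBlock> &Cfg,
                                            std::size_t Start) {
  std::size_t Cur = Start;
  // Bound the walk by the number of blocks so that a cycle of empty blocks
  // cannot loop forever.
  for (std::size_t Step = 0; Step < Cfg.size(); ++Step) {
    assert(Cur < Cfg.size() && "successor index out of range");
    if (countPreds(Cfg, Cur) > 1)
      return nullptr; // Join point: another path can also reach this block.
    const ToyBlock &B = Cfg[Cur];
    if (!B.Stmts.empty())
      return &B.Stmts.front();
    if (B.Succs.size() != 1)
      return nullptr; // Branch or exit: there is no unique next statement.
    Cur = B.Succs.front();
  }
  return nullptr;
}
```

In this model, nested `do { ... } while(0)` wrappers correspond to a chain of empty blocks with one successor each, so the search walks through them to the first real statement. A `for` loop introduces a block with two successors or two predecessors, which ends the search with no result.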

clang-tools-extra/clang-tidy/utils/FixItHintUtils.cpp

	//===--- FixItHintUtils.cpp - clang-tidy-----------------------------------===//			//===--- FixItHintUtils.cpp - clang-tidy-----------------------------------===//
	//			//
	// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.			// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
	// See https://llvm.org/LICENSE.txt for license information.			// See https://llvm.org/LICENSE.txt for license information.
	// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception			// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
	//			//
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	#include "FixItHintUtils.h"			#include "FixItHintUtils.h"
	#include "LexerUtils.h"			#include "LexerUtils.h"
	#include "clang/AST/ASTContext.h"			#include "clang/AST/ASTContext.h"
	#include "clang/AST/Type.h"			#include "clang/AST/Type.h"

				#include <iostream>

	namespace clang {			namespace clang {
	namespace tidy {			namespace tidy {
	namespace utils {			namespace utils {
	namespace fixit {			namespace fixit {

	FixItHint changeVarDeclToReference(const VarDecl &Var, ASTContext &Context) {			FixItHint changeVarDeclToReference(const VarDecl &Var, ASTContext &Context) {
	SourceLocation AmpLocation = Var.getLocation();			SourceLocation AmpLocation = Var.getLocation();
	auto Token = utils::lexer::getPreviousToken(			auto Token = utils::lexer::getPreviousToken(

clang-tools-extra/docs/ReleaseNotes.rst

- New :doc:`cppcoreguidelines-avoid-const-or-ref-data-members

Warns when a struct or class uses const or reference (lvalue or rvalue) data members.		Warns when a struct or class uses const or reference (lvalue or rvalue) data members.

- New :doc:`cuda-unsafe-api-call <clang-tidy/checks/cuda/unsafe-api-call>` check.		- New :doc:`cuda-unsafe-api-call <clang-tidy/checks/cuda/unsafe-api-call>` check.

Warns whenever the error from CUDA API call is ignored/not handled with a set handler		Warns whenever the error from CUDA API call is ignored/not handled with a set handler
and provides fixes for it.		and provides fixes for it.

		- New :doc:`cuda-unsafe-kernel-call
		<clang-tidy/checks/cuda/unsafe-kernel-call>` check.

		Warns whenever the possible error after launching a CUDA kernel is not checked
		(with a `cudaGetLastError()` function).

New check aliases		New check aliases
^^^^^^^^^^^^^^^^^		^^^^^^^^^^^^^^^^^

- New alias :doc:`cert-msc54-cpp		- New alias :doc:`cert-msc54-cpp
<clang-tidy/checks/cert/msc54-cpp>` to		<clang-tidy/checks/cert/msc54-cpp>` to
:doc:`bugprone-signal-handler		:doc:`bugprone-signal-handler
<clang-tidy/checks/bugprone/signal-handler>` was added.		<clang-tidy/checks/bugprone/signal-handler>` was added.


clang-tools-extra/docs/clang-tidy/checks/cuda/unsafe-kernel-call.rst

This file was added.

				.. title:: clang-tidy - cuda-unsafe-kernel-call

				cuda-unsafe-kernel-call
				=======================

				FIXME: Describe what patterns does the check detect and why. Give examples.

clang-tools-extra/docs/clang-tidy/checks/list.rst

.. csv-table::
`cppcoreguidelines-pro-type-cstyle-cast <cppcoreguidelines/pro-type-cstyle-cast.html>`_, "Yes"		`cppcoreguidelines-pro-type-cstyle-cast <cppcoreguidelines/pro-type-cstyle-cast.html>`_, "Yes"
`cppcoreguidelines-pro-type-member-init <cppcoreguidelines/pro-type-member-init.html>`_, "Yes"		`cppcoreguidelines-pro-type-member-init <cppcoreguidelines/pro-type-member-init.html>`_, "Yes"
`cppcoreguidelines-pro-type-reinterpret-cast <cppcoreguidelines/pro-type-reinterpret-cast.html>`_,		`cppcoreguidelines-pro-type-reinterpret-cast <cppcoreguidelines/pro-type-reinterpret-cast.html>`_,
`cppcoreguidelines-pro-type-static-cast-downcast <cppcoreguidelines/pro-type-static-cast-downcast.html>`_, "Yes"		`cppcoreguidelines-pro-type-static-cast-downcast <cppcoreguidelines/pro-type-static-cast-downcast.html>`_, "Yes"
`cppcoreguidelines-pro-type-union-access <cppcoreguidelines/pro-type-union-access.html>`_,		`cppcoreguidelines-pro-type-union-access <cppcoreguidelines/pro-type-union-access.html>`_,
`cppcoreguidelines-pro-type-vararg <cppcoreguidelines/pro-type-vararg.html>`_,		`cppcoreguidelines-pro-type-vararg <cppcoreguidelines/pro-type-vararg.html>`_,
`cppcoreguidelines-slicing <cppcoreguidelines/slicing.html>`_,		`cppcoreguidelines-slicing <cppcoreguidelines/slicing.html>`_,
`cppcoreguidelines-special-member-functions <cppcoreguidelines/special-member-functions.html>`_,		`cppcoreguidelines-special-member-functions <cppcoreguidelines/special-member-functions.html>`_,
`cuda-unsafe-api-call <cuda/unsafe-api-call.html>`_, "Yes"
`cppcoreguidelines-virtual-class-destructor <cppcoreguidelines/virtual-class-destructor.html>`_, "Yes"		`cppcoreguidelines-virtual-class-destructor <cppcoreguidelines/virtual-class-destructor.html>`_, "Yes"
		`cuda-unsafe-api-call <cuda/unsafe-api-call.html>`_, "Yes"
		`cuda-unsafe-kernel-call <cuda/unsafe-kernel-call.html>`_, "Yes"
`darwin-avoid-spinlock <darwin/avoid-spinlock.html>`_,		`darwin-avoid-spinlock <darwin/avoid-spinlock.html>`_,
`darwin-dispatch-once-nonstatic <darwin/dispatch-once-nonstatic.html>`_, "Yes"		`darwin-dispatch-once-nonstatic <darwin/dispatch-once-nonstatic.html>`_, "Yes"
`fuchsia-default-arguments-calls <fuchsia/default-arguments-calls.html>`_,		`fuchsia-default-arguments-calls <fuchsia/default-arguments-calls.html>`_,
`fuchsia-default-arguments-declarations <fuchsia/default-arguments-declarations.html>`_, "Yes"		`fuchsia-default-arguments-declarations <fuchsia/default-arguments-declarations.html>`_, "Yes"
`fuchsia-multiple-inheritance <fuchsia/multiple-inheritance.html>`_,		`fuchsia-multiple-inheritance <fuchsia/multiple-inheritance.html>`_,
`fuchsia-overloaded-operator <fuchsia/overloaded-operator.html>`_,		`fuchsia-overloaded-operator <fuchsia/overloaded-operator.html>`_,
`fuchsia-statically-constructed-objects <fuchsia/statically-constructed-objects.html>`_,		`fuchsia-statically-constructed-objects <fuchsia/statically-constructed-objects.html>`_,
`fuchsia-trailing-return <fuchsia/trailing-return.html>`_,		`fuchsia-trailing-return <fuchsia/trailing-return.html>`_,

clang-tools-extra/test/clang-tidy/checkers/Inputs/Headers/cuda/cuda.h

	/* Minimal declarations for CUDA support. Testing purposes only. */			/* Minimal declarations for CUDA support. Testing purposes only. */

				using size_t = long long unsigned;

	#define __constant__ __attribute__((constant))			#define __constant__ __attribute__((constant))
	#define __device__ __attribute__((device))			#define __device__ __attribute__((device))
	#define __global__ __attribute__((global))			#define __global__ __attribute__((global))
	#define __host__ __attribute__((host))			#define __host__ __attribute__((host))
	#define __shared__ __attribute__((shared))			#define __shared__ __attribute__((shared))

	struct dim3 {			struct dim3 {
	unsigned x, y, z;			unsigned x, y, z;

clang-tools-extra/test/clang-tidy/checkers/Inputs/Headers/cuda/cuda_runtime.h

	#include "cuda.h"			#include "cuda.h"

	cudaError_t cudaDeviceReset();			cudaError_t cudaDeviceReset();
				cudaError_t cudaGetLastError();

clang-tools-extra/test/clang-tidy/checkers/cuda/unsafe-kernel-call-function-handler.cu

This file was added.

				// RUN: %check_clang_tidy %s cuda-unsafe-kernel-call %t -- \
				// RUN: -config="{CheckOptions: \
				// RUN: [{key: cuda-unsafe-kernel-call.HandlerName, \
				// RUN: value: 'errorCheck'}] \
				// RUN: }" \
				// RUN: -- -isystem %clang_tidy_headers

				#include <cuda/cuda_runtime.h>

				__global__
				void b();

				void general();

				void errorCheck() {
				auto err = cudaGetLastError();
				}

				void bad_next_line_stmt() {
				b<<<1, 2>>>();
				// CHECK-MESSAGES: :[[@LINE-1]]:{{[0-9]+}}: warning: Possible unchecked error after a kernel launch.
				// CHECK-FIXES: {{^}} errorCheck();{{$}}
				general();

				b<<<1, 2>>>(); /* some / / comments */ // present
				// CHECK-MESSAGES: :[[@LINE-1]]:{{[0-9]+}}: warning: Possible unchecked error after a kernel launch.
				// CHECK-FIXES: {{^}} errorCheck();{{$}}
				general();

				if (true) // Dummy comment
				b<<<1, 2>>>();
				// CHECK-MESSAGES: :[[@LINE-1]]:{{[0-9]+}}: warning: Possible unchecked error after a kernel launch.
				// CHECK-FIXES: {{^}} if (true) { // Dummy comment{{$}}
				// CHECK-FIXES: {{^}} b<<<1, 2>>>();{{$}}
				// CHECK-FIXES: {{^}} errorCheck();{{$}}
				else // Dummy comment
				b<<<1, 2>>>();
				// CHECK-MESSAGES: :[[@LINE-1]]:{{[0-9]+}}: warning: Possible unchecked error after a kernel launch.
				// CHECK-FIXES: {{^}} } else { // Dummy comment{{$}}
				// CHECK-FIXES: {{^}} b<<<1, 2>>>();{{$}}
				// CHECK-FIXES: {{^}} errorCheck();{{$}}
				// CHECK-FIXES: {{^}} }{{$}}
				general();

				while (true) b<<<1, 2>>>();
				// CHECK-MESSAGES: :[[@LINE-1]]:{{[0-9]+}}: warning: Possible unchecked error after a kernel launch.
				// CHECK-FIXES: {{^}} while (true) { b<<<1, 2>>>();{{$}}
				// CHECK-FIXES: {{^}} errorCheck();{{$}}
				// CHECK-FIXES: {{^}} }{{$}}
				general();

				for (;;) // Dummy comment
				b<<<1, 2>>>();
				// CHECK-MESSAGES: :[[@LINE-1]]:{{[0-9]+}}: warning: Possible unchecked error after a kernel launch.
				// CHECK-FIXES: {{^}} for (;;) { // Dummy comment{{$}}
				// CHECK-FIXES: {{^}} b<<<1, 2>>>();{{$}}
				// CHECK-FIXES: {{^}} errorCheck();{{$}}
				// CHECK-FIXES: {{^}} }{{$}}
				general();

				if (true) {
				b<<<1, 2>>>();
				// CHECK-MESSAGES: :[[@LINE-1]]:{{[0-9]+}}: warning: Possible unchecked error after a kernel launch.
				// CHECK-FIXES: {{^}} errorCheck();{{$}}
				general();
				} else {
				b<<<1, 2>>>();
				// CHECK-MESSAGES: :[[@LINE-1]]:{{[0-9]+}}: warning: Possible unchecked error after a kernel launch.
				// CHECK-FIXES: {{^}} errorCheck();{{$}}
				general();
				}

				while(true) {
				b<<<1, 2>>>();
				// CHECK-MESSAGES: :[[@LINE-1]]:{{[0-9]+}}: warning: Possible unchecked error after a kernel launch.
				// CHECK-FIXES: {{^}} errorCheck();{{$}}
				general();
				}

				for (;;) {
				b<<<1, 2>>>();
				// CHECK-MESSAGES: :[[@LINE-1]]:{{[0-9]+}}: warning: Possible unchecked error after a kernel launch.
				// CHECK-FIXES: {{^}} errorCheck();{{$}}
				general();
				}

				do {
				b<<<1, 2>>>();
				// CHECK-MESSAGES: :[[@LINE-1]]:{{[0-9]+}}: warning: Possible unchecked error after a kernel launch.
				// CHECK-FIXES: {{^}} errorCheck();{{$}}
				general();
				} while(true);
				}

				void bad_same_line_stmt() {
				b<<<1, 2>>>(); general();
				// CHECK-MESSAGES: :[[@LINE-1]]:{{[0-9]+}}: warning: Possible unchecked error after a kernel launch.
				// CHECK-FIXES: {{^}} b<<<1, 2>>>(); errorCheck(); general();{{$}}

				b<<<1, 2>>>(); /* hello / / there */ general(); // kenobi
				// CHECK-MESSAGES: :[[@LINE-1]]:{{[0-9]+}}: warning: Possible unchecked error after a kernel launch.
				// CHECK-FIXES: {{^}} b<<<1, 2>>>(); errorCheck(); /* hello / / there */ general(); // kenobi{{$}}

				if (true) // Dummy comment
				b<<<1, 2>>>(); /* comment */ general(); // comment
				// CHECK-MESSAGES: :[[@LINE-1]]:{{[0-9]+}}: warning: Possible unchecked error after a kernel launch.
				// CHECK-FIXES: {{^}} {b<<<1, 2>>>(); errorCheck();} /* comment */ general(); // comment{{$}}

				while (true) b<<<1, 2>>>(); general();
				// CHECK-MESSAGES: :[[@LINE-1]]:{{[0-9]+}}: warning: Possible unchecked error after a kernel launch.
				// CHECK-FIXES: {{^}} while (true) {b<<<1, 2>>>(); errorCheck();} general();{{$}}

				for (;;) // Dummy comment
				b<<<1, 2>>>(); /* comment */ general();
				// CHECK-MESSAGES: :[[@LINE-1]]:{{[0-9]+}}: warning: Possible unchecked error after a kernel launch.
				// CHECK-FIXES: {{^}} {b<<<1, 2>>>(); errorCheck();} /* comment */ general();{{$}}

				if (true) {
				b<<<1, 2>>>(); general();
				// CHECK-MESSAGES: :[[@LINE-1]]:{{[0-9]+}}: warning: Possible unchecked error after a kernel launch.
				// CHECK-FIXES: {{^}} b<<<1, 2>>>(); errorCheck(); general();{{$}}
				} else {
				b<<<1, 2>>>(); /* comment */ general();
				// CHECK-MESSAGES: :[[@LINE-1]]:{{[0-9]+}}: warning: Possible unchecked error after a kernel launch.
				// CHECK-FIXES: {{^}} b<<<1, 2>>>(); errorCheck(); /* comment */ general();{{$}}
				}

				while(true) {
				b<<<1, 2>>>(); /* comment */ general();
				// CHECK-MESSAGES: :[[@LINE-1]]:{{[0-9]+}}: warning: Possible unchecked error after a kernel launch.
				// CHECK-FIXES: {{^}} b<<<1, 2>>>(); errorCheck(); /* comment */ general();{{$}}
				}

				for (;;) {
				b<<<1, 2>>>(); general();
				// CHECK-MESSAGES: :[[@LINE-1]]:{{[0-9]+}}: warning: Possible unchecked error after a kernel launch.
				// CHECK-FIXES: {{^}} b<<<1, 2>>>(); errorCheck(); general();{{$}}
				}

				do {
				b<<<1, 2>>>(); /* comment */ general();
				// CHECK-MESSAGES: :[[@LINE-1]]:{{[0-9]+}}: warning: Possible unchecked error after a kernel launch.
				// CHECK-FIXES: {{^}} b<<<1, 2>>>(); errorCheck(); /* comment */ general();{{$}}
				} while(true);
				}

				void good() {
				b<<<1, 2>>>();
				errorCheck(); // Here the function call works because the handler is set to its name
				}

clang-tools-extra/test/clang-tidy/checkers/cuda/unsafe-kernel-call-macro-handler.cu

This file was added.

				// RUN: %check_clang_tidy %s cuda-unsafe-kernel-call %t -- \
				// RUN: -config="{CheckOptions: \
				// RUN: [{key: cuda-unsafe-kernel-call.HandlerName, \
				// RUN: value: 'CUDA_CHECK_KERNEL'}, \
				// RUN: {key: cuda-unsafe-kernel-call.AcceptedHandlers, \
				// RUN: value: 'ALTERNATIVE_CUDA_CHECK_KERNEL, cudaCheckKernel, \
				// RUN: alternative::alternativeCudaCheckKernel, \
				// RUN: otherAlternativeCudaCheckKernel'}] \
				// RUN: }" \
				// RUN: -- -isystem %clang_tidy_headers

				#include <cuda/cuda_runtime.h>

				#define CUDA_CHECK_KERNEL() do {} while(0)

				#define ALTERNATIVE_CUDA_CHECK_KERNEL() CUDA_CHECK_KERNEL()

				void cudaCheckKernel();

				namespace alternative {

				void alternativeCudaCheckKernel();
				void otherAlternativeCudaCheckKernel();

				}

				__global__
				void b();

				#define KERNEL_CALL() do {b<<<1, 2>>>();} while(0)

				void errorCheck() {
				auto err = cudaGetLastError();
				}

				void bad() {
				b<<<1, 2>>>(); // sample comment
				// CHECK-MESSAGES: :[[@LINE-1]]:{{[0-9]+}}: warning: Possible unchecked error after a kernel launch.

				KERNEL_CALL(); // sample comment
				// CHECK-MESSAGES: :[[@LINE-1]]:{{[0-9]+}}: warning: Possible unchecked error after a kernel launch.
				// There isn't supposed to be a fix here since it's a macro call

				if(true)
				b<<<1, 2>>>() ; // Brackets omitted purposefully, since they create an additional AST node
				// CHECK-MESSAGES: :[[@LINE-1]]:{{[0-9]+}}: warning: Possible unchecked error after a kernel launch.
				else {
				b<<<1, 2>>>();
				// CHECK-MESSAGES: :[[@LINE-1]]:{{[0-9]+}}: warning: Possible unchecked error after a kernel launch.
				b<<<1, 2>>>();
				// CHECK-MESSAGES: :[[@LINE-1]]:{{[0-9]+}}: warning: Possible unchecked error after a kernel launch.
				}
				auto err = cudaGetLastError();
				tra: Just curious -- is it sufficient to just call `cudaGetLastError();`? Or does the checker require using its result, too? I.e. in practice this particular check will not really do anything useful. The tests below look somewhat inconsistent.
				barcisz: Technically it does not require the user to actually use the value of `cudaGetLastError()`, but
				1. If they are calling it, then they most likely did not place this call there randomly and are using it to check for the error returned by the kernel.
				2. The check being introduced in D133804 can be used to check whether the return value has been used, so checking it here as well would have been a duplication.
				tra:
				> 1. If they are calling it then they most likely did not place this call there randomly and are using it to check for the error returned by the kernel
				If that's the case, then why are the kernel launches on lines 45 and 51 reported as possibly unchecked? Both are followed by the `cudaGetLastError()` call and are, technically, checked, if we're not analyzing the usage of the result of the call. What am I missing?
				barcisz: The idea is that the call should happen directly after the kernel without any branching (because branching can often make things much harder to understand and, in cases such as a for loop, can mean that `cudaGetLastError()` is not actually called after every kernel call).

				b<<<1, 2>>>();
				// CHECK-MESSAGES: :[[@LINE-1]]:{{[0-9]+}}: warning: Possible unchecked error after a kernel launch.
				if (true)
				cudaGetLastError();

				b<<<1, 2>>>();
				// CHECK-MESSAGES: :[[@LINE-1]]:{{[0-9]+}}: warning: Possible unchecked error after a kernel launch.
				for(;;)
				auto err2 = cudaGetLastError(); // Brackets omitted purposefully, since they create an additional AST node

				b<<<1, 2>>>();
				// CHECK-MESSAGES: :[[@LINE-1]]:{{[0-9]+}}: warning: Possible unchecked error after a kernel launch.
				auto err3 = true ? 1 : cudaGetLastError();

				b<<<1, 2>>>();
				// CHECK-MESSAGES: :[[@LINE-1]]:{{[0-9]+}}: warning: Possible unchecked error after a kernel launch.
				auto err4 = cudaDeviceReset() + cudaGetLastError();

				b<<<1, 2>>>();
				// CHECK-MESSAGES: :[[@LINE-1]]:{{[0-9]+}}: warning: Possible unchecked error after a kernel launch.
				// Calling an error-checking function after a kernel is not considered safe.
				tra: WDYM by "is not considered safe" here? How is that different from calling `cudaGetLastError()` and checking its value?
				barcisz: As in, the check does not do inter-procedural analysis, so unless the handler is added to AcceptedHandlers, the check will flag such occurrences.
				tra: Hmm.. Using a helper function to check for cuda errors is a fairly common pattern. Is there a way to annotate such a helper function as `checks cudaGetLastError`?
				barcisz: There would be an easy way to do that, but it's much more common for projects to have those helper functions project-wide (or at least sub-project-wide), which means they can just be specified explicitly for the project in the options for the check (the official documentation for the check will be uploaded tomorrow).
				errorCheck();
				}

				void good() {
				b<<<1, 2>>>();; /* The semicolons are here because the
				detection of the macro is done with a lexer */ ;
				CUDA_CHECK_KERNEL();
				tra: What would happen with a single `;` as would be seen in the normal user code?
				barcisz: Nothing, it would work just fine; it's rather that all other kernel calls in this test use a single `;` so I want to check this case here.
				tra: I still do not understand how it all fits together. What do a kernel call, the extra `;`, the macro, and the checker code have to do with each other? Is the idea that the checker should see through the empty statement between the kernel call and the checker macro? If that's the case I'd make it a bit more prominent. E.g. something like this: `b<<<1, 2>>>(); ; /* Make sure that we see through empty expressions in-between the call and the checker. */ ; CUDA_CHECK_KERNEL();`
				barcisz: The reason we're checking for multiple `;`s here is that, because macros are not present in the AST, they have to be located at the lexer stage, which makes it necessary to search for them based on tokens. The tokens used after the kernel call here (semicolons and a comment) are the only tokens allowed between the kernel call and the macro, since any other token would indicate that another statement is present.

				b<<<1, 2>>>();
				ALTERNATIVE_CUDA_CHECK_KERNEL();

				b<<<1, 2>>>();
				alternative::alternativeCudaCheckKernel();

				b<<<1, 2>>>();
				alternative::otherAlternativeCudaCheckKernel();

				b<<<1, 2>>>();
				switch(1 + cudaGetLastError()) {
				default:;
				}

				b<<<1, 2>>>();
				if(3 < cudaGetLastError()) {
				1;
				} else {
				2;
				}

				b<<<1, 2>>>();
				for(int i = cudaGetLastError();;);

				b<<<1, 2>>>();
				do {
				do {
				do {
				auto err2 = cudaGetLastError();
				tra: Why does this case produce no warning, while a very similar case above does? In both cases the result of `cudaGetLastError()` is assigned to an unused variable within the loop body.
				> b<<<1, 2>>>();
				> // CHECK-MESSAGES: :[[@LINE-1]]:{{[0-9]+}}: warning: Possible unchecked error after a kernel launch.
				> for(;;)
				> auto err2 = cudaGetLastError(); // Brackets omitted purposefully, since they create an additional AST node
				barcisz: Because often a macro will wrap its error handling code in a do {...} while(0) loop, and that's why we check this case and similar ones with CFG analysis.
				tra: The `do/while(0)` wrapping part I understand. I'm puzzled why the checker appears to work differently with different loop kinds. Why is a `cudaGetLastError()` call inside `do {} while()` detected and considered a cuda result check, but the same call within `for() {}` is not?
				barcisz: This is because `for(<something>;<something>;<something>)` works differently in the CFG analysis. The precise definition of where `cudaGetLastError()` is allowed is that it should be the first statement/expression tree/function call after the kernel call and should be in a straight line from it. For example, `for(;;) {cudaGetLastError()}` would not be simplifiable to a single control flow block, and `for(;false;) {cudaGetLastError()}` has an expression tree evaluated before the call to `cudaGetLastError()`. Technically, this definition currently only supports wrapping the statement with `cudaGetLastError()`, but is made more general in case:
				- The user uses gotos to achieve a similar pattern
				- Such a CFG layout can be achieved with other C++ mechanisms
				- Means to achieve such a CFG layout with different mechanisms appear in future standards of C++
				} while(0);
				} while(0);
				} while(0);
				}
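
The token-based macro detection discussed in the inline comments above (only semicolons and comments may sit between the kernel call and the handler macro) can be sketched as follows. This is an illustrative model under stated assumptions, not the patch's implementation: `TokKind`, `ToyToken`, and `isFollowedByHandlerMacro` are hypothetical, and the sketch matches the macro by name, whereas `checkHandlerMacro` in the patch compares the token's location against the recorded `HandlerMacroLocations`.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical token model; the real check operates on clang::Token.
enum class TokKind { Semi, Comment, Identifier, Other };

struct ToyToken {
  TokKind Kind;
  std::string Text;
};

// Returns true if the first token after the kernel call that is neither a
// semicolon nor a comment names one of the accepted handler macros.
bool isFollowedByHandlerMacro(const std::vector<ToyToken> &TokensAfterCall,
                              const std::set<std::string> &HandlerMacros) {
  std::size_t I = 0;
  // Skip the only tokens allowed between the kernel call and the macro.
  while (I < TokensAfterCall.size() &&
         (TokensAfterCall[I].Kind == TokKind::Semi ||
          TokensAfterCall[I].Kind == TokKind::Comment))
    ++I;
  if (I == TokensAfterCall.size())
    return false; // Ran out of tokens without finding a candidate.
  const ToyToken &Next = TokensAfterCall[I];
  return Next.Kind == TokKind::Identifier &&
         HandlerMacros.count(Next.Text) > 0;
}
```

With the token stream from `good()` in the test (`;`, `;`, a block comment, `;`, then `CUDA_CHECK_KERNEL`), the function skips the first four tokens and accepts the macro. Any other intervening token, such as the identifier of another call, stops the scan and the kernel launch is treated as unchecked.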

Revision Contents

Diff 460455