This is an archive of the discontinued LLVM Phabricator instance.

[cuda] Start diagnosing variables with bad target.
Needs Review · Public

Authored by hliao on May 4 2020, 11:39 AM.

Details

Summary
  • Non-local variables on the host side are generally not accessible from the device side. Without proper diagnostics, compilation may succeed until the final linking stage, and the resulting link error may not be intuitive for developers, especially with relocatable code compilation. For certain cases, such as assembly-only output, it is even worse: the compilation simply passes.
  • This patch addresses the issue by checking uses of non-local variables and issuing errors on bad-target references. For references through default arguments, a warning is issued on the function declaration because, at that point, the variables are merely bound; no real code will be generated if that function is never used.
  • The opposite direction, i.e. accessing device variables from the host side, is NOT addressed in this patch, as host code is allowed to access those device variables through the runtime interface on their shadow variables. More support is needed to identify how such a variable is used on the host side, even for simple cases. Comprehensive diagnosis would be so expensive that alternative analysis tools, such as clang-tidy, should be used instead.

Diff Detail

Event Timeline

hliao created this revision.May 4 2020, 11:39 AM
Herald added a project: Restricted Project.May 4 2020, 11:39 AM
Herald added a subscriber: cfe-commits.
hliao added a comment.May 4 2020, 11:41 AM

The test code passes compilation on clang trunk when only assembly output is generated: https://godbolt.org/z/XYjRcT. NVCC, however, generates errors in all cases.

hliao updated this revision to Diff 261904.May 4 2020, 12:29 PM

Reformatting test code following pre-merge checks.

yaxunl added inline comments.May 4 2020, 12:31 PM
clang/lib/Sema/SemaCUDA.cpp
156

We may need to mark constexpr variables as host device too. In practice, such usage has existed for a long time.

clang/test/SemaCUDA/variable-target.cu
43

We need a test checking that a captured local host variable is allowed in a device lambda.

We also need tests for constexpr variables used in device functions.

tra added a comment.May 4 2020, 12:41 PM

This has a good chance of breaking existing code. It would be great to add an escape hatch option to revert to the old behavior if we run into problems. The change is relatively simple, so reverting it in case something goes wrong should work, too. Up to you.

clang/test/SemaCUDA/variable-target.cu
7

The current set of tests only verifies access of host variables from the device side. We need to check that things work in the other direction (i.e. a device variable is not accessible from the host). A bit of this is covered in function-overload.cu, but it would make sense to handle all variable-related checks here.

It would be great to add more test cases:

  • access of a device variable from various kinds of host functions.
  • test cases verifying that &var and sizeof(var) work for device variables in host functions.
hliao added a comment.May 4 2020, 12:54 PM
In D79344#2018561, @tra wrote:

This has a good chance of breaking existing code. It would be great to add an escape hatch option to revert to the old behavior if we run into problems. The change is relatively simple, so reverting it in case something goes wrong should work, too. Up to you.

Why? For the cases addressed in this patch, any existing code would not compile into a module file anyway, due to the missing symbol. Am I missing anything?

hliao marked an inline comment as done.May 4 2020, 12:55 PM
hliao added inline comments.
clang/test/SemaCUDA/variable-target.cu
43

This patch only addresses direct references to variables. Capture would be better handled in a separate patch.

hliao marked an inline comment as done.May 4 2020, 12:58 PM
hliao added inline comments.
clang/test/SemaCUDA/variable-target.cu
7

Yeah, as noted in both the commit message and the sources, diagnosing that direction is more complicated because host code can still access the shadow variables. We need to issue warnings on improper usage, such as direct reads or writes of the variable. I want to address that in another patch, as more changes are required to check how a variable is being used.

yaxunl added inline comments.May 4 2020, 1:15 PM
clang/test/SemaCUDA/variable-target.cu
43

But there is a chance that this patch breaks valid usage of captured variables in device lambdas. At the least, we should add a test to guard against that.

hliao marked an inline comment as done.May 4 2020, 1:16 PM
hliao added inline comments.
clang/lib/Sema/SemaCUDA.cpp
156

constexpr variables are a little bit tricky, as it is still possible for such a variable to end up emitted as a real variable. For example, if its address is taken, it won't be optimized away and still needs to be emitted somewhere. But, like other non-local variables, CUDA forbids their initializers. Any suggestions?

tra added a comment.May 4 2020, 1:22 PM
In D79344#2018561, @tra wrote:

This has a good chance of breaking existing code. It would be great to add an escape hatch option to revert to the old behavior if we run into problems. The change is relatively simple, so reverting it in case something goes wrong should work, too. Up to you.

Why? For the cases addressed in this patch, any existing code would not compile into a module file anyway, due to the missing symbol. Am I missing anything?

Logistics, mostly.

Overloading is a rather fragile area of CUDA. This is the area where clang and NVCC behave differently. Combined with the existing code that needs to work with both compilers, even minor changes in compiler behavior can result in unexpected issues. Stricter checks tend to expose existing code which happens to work (or to compile) when it should not have, but it's not always trivial to fix those quickly. Having an escape hatch allows us to deal with those issues. It allows the owner of the code to reproduce the problem while the rest of the world continues to work. Reverting is suboptimal as the end user is often not in a good position to build a compiler with your patch plumbed in and then plumb the patched compiler into their build system. Adding another compiler option to enable/disable the new behavior is much more manageable.

hliao added a comment.May 4 2020, 1:30 PM
In D79344#2018683, @tra wrote:
In D79344#2018561, @tra wrote:

This has a good chance of breaking existing code. It would be great to add an escape hatch option to revert to the old behavior if we run into problems. The change is relatively simple, so reverting it in case something goes wrong should work, too. Up to you.

Why? For the cases addressed in this patch, any existing code would not compile into a module file anyway, due to the missing symbol. Am I missing anything?

Logistics, mostly.

Overloading is a rather fragile area of CUDA. This is the area where clang and NVCC behave differently. Combined with the existing code that needs to work with both compilers, even minor changes in compiler behavior can result in unexpected issues. Stricter checks tend to expose existing code which happens to work (or to compile) when it should not have, but it's not always trivial to fix those quickly. Having an escape hatch allows us to deal with those issues. It allows the owner of the code to reproduce the problem while the rest of the world continues to work. Reverting is suboptimal as the end user is often not in a good position to build a compiler with your patch plumbed in and then plumb the patched compiler into their build system. Adding another compiler option to enable/disable the new behavior is much more manageable.

OK, I will add an option. But should it default to on or off?

tra added a comment.May 4 2020, 3:04 PM

OK, I will add one option, But, do we turn it on by default or off?

As a rule of thumb, if it's an experimental feature, then the default would be off. For a change which should be the default, but is risky, the default is on. This patch looks like the latter.

If you can wait, I can try patching this change into our clang tree and then see if it breaks anything obvious. If nothing falls apart, I'll be fine with the patch as is.

tra added a comment.May 7 2020, 1:51 PM
In D79344#2018915, @tra wrote:

If you can wait, I can try patching this change into our clang tree and then see if it breaks anything obvious. If nothing falls apart, I'll be fine with the patch as is.

The patch appears to break compilation of CUDA headers:

In file included from <built-in>:1:
In file included from llvm_unstable/toolchain/lib/clang/google3-trunk/include/__clang_cuda_runtime_wrapper.h:406:
llvm_unstable/toolchain/lib/clang/google3-trunk/include/__clang_cuda_complex_builtins.h:30:13: error: call to 'copysign' is ambiguous
      __a = std::copysign(std::isinf(__a) ? 1 : 0, __a);
            ^~~~~~~~~~~~~
llvm_unstable/toolchain/lib/clang/google3-trunk/include/__clang_cuda_math.h:76:19: note: candidate function
__DEVICE__ double copysign(double __a, double __b) {
                  ^
third_party/gpus/cuda_10_1/include/crt/math_functions.hpp:861:32: note: candidate function
__MATH_FUNCTIONS_DECL__ double copysign(float a, double b)
                               ^
1 error generated when compiling for sm_60.
tra added a comment.EditedMay 7 2020, 2:06 PM

We're calling copysign(int, double). The standard library provides copysign(double, double); CUDA provides only copysign(float, double). As far as C++ is concerned, both require one type conversion. I guess we previously gave the __device__ one provided by CUDA a higher preference, considering that the callee is a device function. Now both seem to have equal weight. I'm not sure how or why.

hliao added a comment.May 7 2020, 2:42 PM
In D79344#2026025, @tra wrote:

We're calling copysign(int, double). The standard library provides copysign(double, double); CUDA provides only copysign(float, double). As far as C++ is concerned, both require one type conversion. I guess we previously gave the __device__ one provided by CUDA a higher preference, considering that the callee is a device function. Now both seem to have equal weight. I'm not sure how or why.

@yaxunl, that may be related to the change in overload resolution. Back to this change: that error should not be related to the non-local variable checks.

tra added a comment.May 7 2020, 2:51 PM
In D79344#2026025, @tra wrote:

We're calling copysign(int, double). The standard library provides copysign(double, double); CUDA provides only copysign(float, double). As far as C++ is concerned, both require one type conversion. I guess we previously gave the __device__ one provided by CUDA a higher preference, considering that the callee is a device function. Now both seem to have equal weight. I'm not sure how or why.

@yaxunl, that may be related to the change in overload resolution. Back to this change: that error should not be related to the non-local variable checks.

The tree I've tested had Sam's changes reverted (bf6a26b066382e0f41bf023c781d84061c542307), so it appears to be triggered by this patch. Let me try reproducing it in the upstream HEAD.

tra added a comment.May 7 2020, 3:07 PM

The problem is reproducible in upstream clang. Let's see if I can reduce it to something simpler.

hliao added a comment.May 7 2020, 4:08 PM
In D79344#2026180, @tra wrote:

The problem is reproducible in upstream clang. Let's see if I can reduce it to something simpler.

I remember seeing similar errors when the math part was refactored out into the current header, but that seemed to be fixed later. I'm not sure whether it's relevant.

tra added a comment.May 7 2020, 5:00 PM
In D79344#2026180, @tra wrote:

The problem is reproducible in upstream clang. Let's see if I can reduce it to something simpler.

Reduced it down to this -- compiles with clang w/o the patch, but fails with it.

__attribute__((device)) double copysign(double, double);
__attribute__((device)) double copysign(float, double);
template <typename> struct a { static const bool b = true; };
template <bool, class> struct c;
template <class f> struct c<true, f> { typedef f g; };
template <typename d, typename h>
__attribute__((device)) typename c<a<h>::b, double>::g copysign(d, h) {
  double e = copysign(0, e);
}
tra added a comment.May 7 2020, 5:10 PM

Here's a slightly smaller variant which may be a good clue for tracking down the root cause. This one fails with:

var.cc:6:14: error: no matching function for call to 'copysign'
  double g = copysign(0, g);
             ^~~~~~~~
var.cc:5:56: note: candidate template ignored: substitution failure [with e = int, f = double]: reference to __host__ variable 'b' in __device__ function
__attribute__((device)) typename c<a<f>::b, double>::d copysign(e, f) {
                                         ~             ^
1 error generated when compiling for sm_60.

I suspect it's the handling of the non-type template parameter that may be breaking things in both cases.

template <typename> struct a { static const bool b = true; };
template <bool, class> struct c;
template <class h> struct c<true, h> { typedef h d; };
template <typename e, typename f>
__attribute__((device)) typename c<a<f>::b, double>::d copysign(e, f) {
  double g = copysign(0, g);
}
hliao added a comment.May 7 2020, 9:52 PM
In D79344#2026349, @tra wrote:

Here's a slightly smaller variant which may be a good clue for tracking down the root cause. This one fails with:

var.cc:6:14: error: no matching function for call to 'copysign'
  double g = copysign(0, g);
             ^~~~~~~~
var.cc:5:56: note: candidate template ignored: substitution failure [with e = int, f = double]: reference to __host__ variable 'b' in __device__ function
__attribute__((device)) typename c<a<f>::b, double>::d copysign(e, f) {
                                         ~             ^
1 error generated when compiling for sm_60.

I suspect it's the handling of the non-type template parameter that may be breaking things in both cases.

template <typename> struct a { static const bool b = true; };
template <bool, class> struct c;
template <class h> struct c<true, h> { typedef h d; };
template <typename e, typename f>
__attribute__((device)) typename c<a<f>::b, double>::d copysign(e, f) {
  double g = copysign(0, g);
}

My bad. We need similar logic in the call check to skip templates that are not yet instantiated, i.e.

diff --git a/clang/lib/Sema/SemaCUDA.cpp b/clang/lib/Sema/SemaCUDA.cpp
index 583e588e4bd..467136f4579 100644
--- a/clang/lib/Sema/SemaCUDA.cpp
+++ b/clang/lib/Sema/SemaCUDA.cpp
@@ -910,6 +910,10 @@ bool Sema::CheckCUDAAccess(SourceLocation Loc, FunctionDecl *Caller,
   assert(getLangOpts().CUDA && "Should only be called during CUDA compilation");
   assert(VD && isNonLocalVariable(VD) && "Variable must be a non-local one.");
 
+  auto &ExprEvalCtx = ExprEvalContexts.back();
+  if (ExprEvalCtx.isUnevaluated() || ExprEvalCtx.isConstantEvaluated())
+    return true;
+
   // FIXME: Is bailing out early correct here?  Should we instead assume that
   // the caller is a global initializer?
   if (!Caller)
tra added a comment.May 8 2020, 3:52 PM

This triggers an assertion:

clang: /usr/local/google/home/tra/work/llvm/repo/clang/lib/AST/Decl.cpp:2697: clang::Expr *clang::ParmVarDecl::getDefaultArg(): Assertion `!hasUninstantiatedDefaultArg() && "Default argument is not yet instantiated!"' failed.
#2  0x00007fffeb8ae40f in __assert_fail_base (fmt=0x7fffeba106e0 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n",
    assertion=0x7fffe7d2e909 "!hasUninstantiatedDefaultArg() && \"Default argument is not yet instantiated!\"",
    file=0x7fffe7d22e5c "/usr/local/google/home/tra/work/llvm/repo/clang/lib/AST/Decl.cpp", line=2697, function=<optimized out>) at assert.c:92
#3  0x00007fffeb8bbb92 in __GI___assert_fail (
    assertion=0x7fffe7d2e909 "!hasUninstantiatedDefaultArg() && \"Default argument is not yet instantiated!\"",
    file=0x7fffe7d22e5c "/usr/local/google/home/tra/work/llvm/repo/clang/lib/AST/Decl.cpp", line=2697,
    function=0x7fffe7dda0fb "clang::Expr *clang::ParmVarDecl::getDefaultArg()") at assert.c:101
#4  0x00007fffe8460aec in clang::ParmVarDecl::getDefaultArg (this=0x112f560) at /usr/local/google/home/tra/work/llvm/repo/clang/lib/AST/Decl.cpp:2696
#5  0x00007fffe618a5a6 in clang::Sema::checkCUDAParamWithInvalidDefaultArg (this=0x392450, Loc=..., FD=0x112f678, PVD=0x112f560)
    at /usr/local/google/home/tra/work/llvm/repo/clang/lib/Sema/SemaCUDA.cpp:729
#6  0x00007fffe62ed89a in clang::Sema::CheckCXXDefaultArguments (this=0x392450, FD=0x112f678)
    at /usr/local/google/home/tra/work/llvm/repo/clang/lib/Sema/SemaDeclCXX.cpp:1551
#7  0x00007fffe61c9443 in clang::Sema::CheckFunctionDeclaration (this=0x392450, S=0x0, NewFD=0x112f678, Previous=..., IsMemberSpecialization=false)
    at /usr/local/google/home/tra/work/llvm/repo/clang/lib/Sema/SemaDecl.cpp:10765
#8  0x00007fffe6d5f0b7 in clang::TemplateDeclInstantiator::VisitCXXMethodDecl (this=0x7ffffffe7f20, D=0x111b198, TemplateParams=0x0,
    ClassScopeSpecializationArgs=llvm::Optional is not initialized, FunctionRewriteKind=clang::TemplateDeclInstantiator::RewriteKind::None)
    at /usr/local/google/home/tra/work/llvm/repo/clang/lib/Sema/SemaTemplateInstantiateDecl.cpp:2424
#9  0x00007fffe6d62f10 in clang::TemplateDeclInstantiator::VisitCXXMethodDecl (this=0x7ffffffe7f20, D=0x111b198)
    at /usr/local/google/home/tra/work/llvm/repo/clang/lib/Sema/SemaTemplateInstantiateDecl.cpp:3410
#10 0x00007fffe6d62ead in clang::TemplateDeclInstantiator::VisitCXXConstructorDecl (this=0x7ffffffe7f20, D=0x111b198)
    at /usr/local/google/home/tra/work/llvm/repo/clang/lib/Sema/SemaTemplateInstantiateDecl.cpp:2498