This is an archive of the discontinued LLVM Phabricator instance.

Differential D18380

[CUDA] Make unattributed constexpr functions (usually) implicitly host+device.
ClosedPublic

Authored by jlebar on Mar 22 2016, 3:08 PM.

Details

Reviewers

tra
rnk
rsmith

Commits

rGba122ab42fe5: [CUDA] Make unattributed constexpr functions implicitly host+device.
rC264964: [CUDA] Make unattributed constexpr functions implicitly host+device.
rL264964: [CUDA] Make unattributed constexpr functions implicitly host+device.

Summary

[CUDA] Make unattributed constexpr functions implicitly host+device.

With this patch, by default a constexpr function is implicitly host+device
unless:

a) it's a variadic function (variadic functions are not allowed on the device side), or
b) it's preceded by a device overload in a system header.

The restriction on overloading host device functions on the
basis of their CUDA attributes remains in place, but we use (b) to allow
us to define device overloads for constexpr functions in cmath,
which would otherwise be host device and thus not overloadable.

You can disable this behavior with -fno-cuda-host-device-constexpr.
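
The rule above can be read as a small predicate. The sketch below is a standalone C++ model of it, not Clang's code: `FnInfo`, `Treatment`, and `classifyConstexpr` are hypothetical names invented for illustration, and the boolean fields stand in for what Sema would query on the real declaration. The `Error` outcome models the diagnostic this revision adds for a preceding device overload that is not in a system header.

```cpp
#include <cassert>

// Hypothetical, simplified description of a function declaration.
struct FnInfo {
  bool isConstexpr = false;
  bool isVariadic = false;
  // The user wrote __host__, __device__, or __global__ on the function.
  bool hasExplicitTargetAttr = false;
  // A __device__ declaration with the same signature was seen earlier.
  bool precededByDeviceOverload = false;
  // That earlier __device__ declaration lives in a system header.
  bool deviceOverloadInSystemHeader = false;
};

enum class Treatment { Unchanged, ImplicitHostDevice, Error };

// hostDeviceConstexpr is false when -fno-cuda-host-device-constexpr is passed.
inline Treatment classifyConstexpr(const FnInfo &F,
                                   bool hostDeviceConstexpr = true) {
  if (!hostDeviceConstexpr || !F.isConstexpr || F.isVariadic ||
      F.hasExplicitTargetAttr)
    return Treatment::Unchanged;
  if (F.precededByDeviceOverload)
    return F.deviceOverloadInSystemHeader ? Treatment::Unchanged
                                          : Treatment::Error;
  return Treatment::ImplicitHostDevice;
}
```

In this model a plain `constexpr int f()` comes out as `ImplicitHostDevice`, while the same function declared after a `__device__` overload from a system header stays `Unchanged`, which is what lets the cmath device overloads coexist with the host constexpr versions.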

Diff Detail

Repository: rL LLVM

Event Timeline

jlebar updated this revision to Diff 51352. Mar 22 2016, 3:08 PM

jlebar retitled this revision to [CUDA] Implement -fcuda-relaxed-constexpr, and enable it by default.

jlebar updated this object.

jlebar added a reviewer: tra.

jlebar added subscribers: rsmith, rnk, cfe-commits.

Actually run the tests, and fix the CUDA overloading test.

Now that H/D and HD can all be in the same overload set, we'll also need additional tests in CodeGenCUDA/function-overload.cu for cases that have now become legal.

In D18380#381025, @tra wrote:

Now that H/D and HD can all be in the same overload set, we'll also need additional tests in CodeGenCUDA/function-overload.cu for cases that have now become legal.

There are lots of tests that used to be compile errors and now aren't -- what do you think we're missing?

We need tests to demonstrate that we pick the correct function when we have a mix
of HD+H/D in the overload set.
Existing tests only cover resolution of {HD,HD}, {H,H}, {D,D}, and {H,D} sets.

Add tests checking host+device overloading.

In D18380#381031, @tra wrote:

We need tests to demonstrate that we pick the correct function when we have a mix
of HD+H/D in the overload set.
Existing tests only cover resolution of {HD,HD}, {H,H}, {D,D}, and {H,D} sets.

Aha, got it. I think adding this is simple given the existing framework -- lmk what you think.

Update test as discussed -- now we check that we're invoking the correct overloads.

tra accepted this revision. Mar 23 2016, 10:37 AM

tra edited edge metadata.

This revision is now accepted and ready to land. Mar 23 2016, 10:37 AM

rsmith added inline comments. Mar 23 2016, 10:58 AM

include/clang/Driver/CC1Options.td
702–703 ↗	(On Diff #51384)	Is there a better name we can use for this? I don't think this is "relaxed" in any obvious sense. `-fcuda-host-device-constexpr` or `-fcuda-constexpr-on-device` might be clearer?
lib/Driver/Tools.cpp
3597 ↗	(On Diff #51384)	For flags that are enabled by default, we usually have the -cc1 flag be a `-fno-*` flag. This allows people to use (for instance) `clang blah.cu -Xclang -fno-cuda-relaxed-constexpr` if necessary.
lib/Sema/SemaOverload.cpp
1132 ↗	(On Diff #51384)	No parens around `==` comparisons.

jlebar added inline comments. Mar 23 2016, 11:30 AM

include/clang/Driver/CC1Options.td
702–703 ↗	(On Diff #51384)	"relaxed constexpr" is nvidia's term -- do you think it might be helpful to use the same terminology? I understand there's some prior art here, with respect to clang accepting gcc's flags, although the situation here is of course different.
lib/Driver/Tools.cpp
3597 ↗	(On Diff #51384)	Yeah, Artem and I had a discussion about this yesterday. As you can see, there are two other flags above which are turned on by default -- these also lack -fno variants. I think it would be good to be consistent here. I'm tempted to add another patch below this one which makes the other two -fno, then we can make this one -fno as well. It seems that convention is to just get rid of the existing non-fno flags, rather than leave both positive and negative versions. Does that sound OK to you?

rsmith added inline comments. Mar 23 2016, 11:35 AM

include/clang/Driver/CC1Options.td
702–703 ↗	(On Diff #51384)	I think it's problematic to use that terminology, as "relaxed constexpr" is also used to describe the C++14 `constexpr` rules (see n3652).
lib/Driver/Tools.cpp
3597 ↗	(On Diff #51384)	Yes, that sounds fine.
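
For reference, the C++14 sense of "relaxed constexpr" that rsmith points to (N3652) concerns what a constexpr function body may contain, and has nothing to do with CUDA targets. A minimal sketch in plain C++14 follows; `sumTo` is a made-up name used only for illustration.

```cpp
#include <cassert>

// C++14 (N3652) "relaxed" constexpr: local variables, loops, and mutation
// are allowed in a constexpr function body. Needs -std=c++14 or later;
// under C++11 rules the body would have to be a single return statement.
constexpr int sumTo(int n) {
  int total = 0;
  for (int i = 1; i <= n; ++i)
    total += i;
  return total;
}

// Evaluated by the compiler: 1 + 2 + 3 + 4.
static_assert(sumTo(4) == 10, "sumTo is usable in constant expressions");
```

Because the same phrase already names this language feature, reusing it for a CUDA host/device flag would invite confusion, which is the objection raised above.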

jlebar added inline comments. Mar 23 2016, 1:24 PM

include/clang/Driver/CC1Options.td
702–703 ↗	(On Diff #51384)	Heh, I can't argue with that.
lib/Driver/Tools.cpp
3597 ↗	(On Diff #51384)	Okay, thank you. After talking to Artem, we're just going to remove those two flags entirely. So after we convert relaxed-constexpr to an fno flag, there should be no changes to this file in this patch.
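
The convention discussed in this thread (a feature that is on by default exposes only a `-fno-*` spelling at the cc1 level) can be sketched as follows. `ToyLangOpts` and `parseToyArgs` are hypothetical stand-ins and not Clang's option-parsing API; only the flag spelling is taken from this revision's summary.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for a language-options struct: the feature defaults to on.
struct ToyLangOpts {
  bool CUDAHostDeviceConstexpr = true;
};

// Toy stand-in for argument parsing: only the negative flag exists, and its
// presence switches the default-on feature off.
inline ToyLangOpts parseToyArgs(const std::vector<std::string> &args) {
  ToyLangOpts opts;
  if (std::find(args.begin(), args.end(),
                "-fno-cuda-host-device-constexpr") != args.end())
    opts.CUDAHostDeviceConstexpr = false;
  return opts;
}
```

With this shape there is a single spelling to document, and a user who needs the old behavior can pass the negative flag through `-Xclang`, as suggested in the inline comment above.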

Switch to -fno-cuda-host-device-constexpr. Only implicitly add the attributes
on functions which themselves lack host/device attributes. Add more tests.

Changed as discussed. Please have another look. Thank you for your continued patience here.

tra added inline comments. Mar 23 2016, 3:38 PM

lib/Sema/SemaDecl.cpp
8011–8013 ↗	(On Diff #51479)	Can we have constexpr `__global__` ?

Add check for global constexpr functions.

lib/Sema/SemaDecl.cpp
8011–8013 ↗	(On Diff #51479)	Yikes. We're saved (unless Richard has a tricky counterexample) because kernels must be void and constexpr must not be void. But I'll add a check here anyway.

Richard, are you happy here?

The change to allow __host__ __device__ functions to be overloaded with other combinations of target attributes appears to be separable from the constexpr change; please split it out and commit it first.

include/clang/Basic/LangOptions.def
175 ↗	(On Diff #51495)	This should be a noun phrase -- this string appears in contexts like "support for %0 is enabled" -- so this should be "treating unattributed [...]".
lib/Sema/SemaDecl.cpp
8015–8017 ↗	(On Diff #51495)	`constexpr` functions can return `void` in a couple of different ways (in C++11, if they're template specializations with dependent return types that instantiate to `void`, and in C++14 there is no restriction on `constexpr` functions returning `void`).
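
The C++14 half of this point can be checked with ordinary host C++. The sketch below is illustrative only; `bump` and `twoBumps` are made-up names. It shows a `constexpr` function that returns `void` and is still usable during constant evaluation, so "constexpr implies non-void" is not a safe assumption when reasoning about `__global__` functions.

```cpp
#include <cassert>

// C++14: a constexpr function may return void.
constexpr void bump(int &x) { x += 1; }

// Calls the void constexpr function during constant evaluation.
constexpr int twoBumps(int x) {
  bump(x);
  bump(x);
  return x;
}

static_assert(twoBumps(1) == 3, "constexpr void call evaluated at compile time");
```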

jlebar marked 2 inline comments as done. Mar 24 2016, 1:04 PM

jlebar added inline comments.

include/clang/Basic/LangOptions.def
175 ↗	(On Diff #51495)	Thanks. This is fixed in my patch queue, and I will push a change for the other ones as part of this patch queue.

Okay, just one more patch, D18458, then I think we're good here. (This is split up into two patches in my queue.)

Thanks for your help, Richard.

jlebar mentioned this in D18458: [CUDA] Mangle __host__ __device__ functions differently than __host__ or __device__ functions. Mar 28 2016, 6:49 PM

jlebar updated this object.

jlebar edited edge metadata.

jlebar updated this object.

jlebar added a reviewer: rnk.

jlebar updated this object.

jlebar retitled this revision from [CUDA] Implement -fcuda-relaxed-constexpr, and enable it by default. to [CUDA] Make unattributed constexpr functions (usually) implicitly host+device. Mar 28 2016, 6:52 PM

Update per changes to patch description. Now a constexpr becomes implicitly HD
unless there's a preceding device overload.

Updated as discussed -- please have a look.

I wonder if we can find a way to decide whether a particular constexpr function should be treated as HD without relying on the order in which the compiler sees the functions (or on whether they come from system headers).

Right now we check the overloads of a constexpr function's decl once and apply HD attributes based on the state of the overload set at that point in the TU. We then use those attributes during overload resolution.

What if instead of permanently sticking HD attributes on the constexpr function, we instead postpone decision to the point of overload resolution and figure out effective attributes or call preference based on contents of the whole overload set regardless of the order the decls were added to the set.

test/SemaCUDA/host-device-constexpr.cu
30–31 ↗	(On Diff #51868)	"should prevent this"

In D18380#385240, @tra wrote:

What if instead of permanently sticking HD attributes on the constexpr function, we instead postpone decision to the point of overload resolution and figure out effective attributes or call preference based on contents of the whole overload set regardless of the order the decls were added to the set.

The problem we were trying to prevent by requiring that the __device__ overload come first is:

constexpr int foo();
__device__ void bar() { foo(); }
__device__ int foo();
__device__ void baz() { foo(); }

In this example, we're forced to instantiate both versions of foo() on the device. Being lazy about making the first foo HD doesn't help, because at the time we see bar, it's the only option available.

(Instantiating both foos is a problem if they have the same mangling. And we want them to have the same mangling so we maintain ABI compatibility with nvcc.)
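
The order dependence described here can be made concrete with a toy model. This is a sketch under stated assumptions, not Clang's algorithm: `Decl`, `Target`, and `assignTargets` are hypothetical, every explicit `__device__` declaration is assumed to come from a system header (so the error path is not modeled), and only same-name matching is done rather than real signature comparison.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum class Target { Host, Device, HostDevice };

// Toy declaration: either an unattributed constexpr function or an explicit
// __device__ function with the given name.
struct Decl {
  std::string name;
  bool isUnattributedConstexpr;
};

// Assigns a target to each declaration in source order. An unattributed
// constexpr function becomes HostDevice unless a __device__ declaration with
// the same name has already been seen, in which case it stays host-only.
inline std::vector<Target> assignTargets(const std::vector<Decl> &decls) {
  std::vector<Target> out;
  for (std::size_t i = 0; i < decls.size(); ++i) {
    if (!decls[i].isUnattributedConstexpr) {
      out.push_back(Target::Device);
      continue;
    }
    bool deviceSeen = false;
    for (std::size_t j = 0; j < i; ++j)
      if (decls[j].name == decls[i].name && !decls[j].isUnattributedConstexpr)
        deviceSeen = true;
    out.push_back(deviceSeen ? Target::Host : Target::HostDevice);
  }
  return out;
}
```

Feeding the model `constexpr foo` followed by `__device__ foo` makes the first `foo` host+device, so device code between the two declarations can already call it, which is the situation in the example above. Reversing the order leaves the constexpr `foo` host-only. The model stops at attribute assignment; it does not capture the overloading restriction from the summary that the real compiler would presumably apply to the later declaration in the first ordering.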

jlebar mentioned this in rL264740: [CUDA] Make CUDA description strings in langopts into noun phrases. NFC. Mar 29 2016, 9:29 AM

LGTM.

(Just to be clear, I'm waiting on Richard's review here, even though he lg'ed a version of this patch.)

LGTM

Thank you all for your time here, Art, Reid, and Richard. Fingers crossed we don't have to worry about this again for a while...

Closed by commit rL264964: [CUDA] Make unattributed constexpr functions implicitly host+device. (authored by jlebar). Mar 30 2016, 4:35 PM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path                                                     Size
cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td     6 lines
cfe/trunk/include/clang/Basic/LangOptions.def            1 line
cfe/trunk/include/clang/Driver/CC1Options.td             2 lines
cfe/trunk/include/clang/Sema/Sema.h                      8 lines
cfe/trunk/lib/Frontend/CompilerInvocation.cpp            3 lines
cfe/trunk/lib/Sema/SemaCUDA.cpp                          51 lines
cfe/trunk/lib/Sema/SemaDecl.cpp                          3 lines
cfe/trunk/lib/Sema/SemaOverload.cpp                      4 lines
cfe/trunk/test/SemaCUDA/Inputs/overload.h                8 lines
cfe/trunk/test/SemaCUDA/host-device-constexpr.cu         69 lines
cfe/trunk/test/SemaCUDA/no-host-device-constexpr.cu      20 lines

Diff 52152

cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td

[... 6,485 lines not shown ...]
 def warn_kern_is_inline : Warning<
   "ignored 'inline' attribute on kernel function %0">,
   InGroup<CudaCompat>;
 def err_variadic_device_fn : Error<
   "CUDA device code does not support variadic functions">;
 def err_va_arg_in_device : Error<
   "CUDA device code does not support va_arg">;
 def err_alias_not_supported_on_nvptx : Error<"CUDA does not support aliases">;
+def err_cuda_unattributed_constexpr_cannot_overload_device : Error<
+  "constexpr function '%0' without __host__ or __device__ attributes cannot "
+  "overload __device__ function with same signature. Add a __host__ "
+  "attribute, or build with -fno-cuda-host-device-constexpr.">;
+def note_cuda_conflicting_device_function_declared_here : Note<
+  "conflicting __device__ function declared here">;
 def err_dynamic_var_init : Error<
   "dynamic initialization is not supported for "
   "__device__, __constant__, and __shared__ variables.">;
 def err_shared_var_init : Error<
   "initialization is not supported for __shared__ variables.">;

 def warn_non_pod_vararg_with_format_string : Warning<
   "cannot pass %select{non-POD|non-trivial}0 object of type %1 to variadic "
[... 1,892 lines not shown ...]

cfe/trunk/include/clang/Basic/LangOptions.def

[... 166 lines not shown ...]
 LANGOPT(HalfArgsAndReturns, 1, 0, "half args and returns")
 LANGOPT(CUDA , 1, 0, "CUDA")
 LANGOPT(OpenMP , 1, 0, "OpenMP support")
 LANGOPT(OpenMPUseTLS , 1, 0, "Use TLS for threadprivates or runtime calls")
 LANGOPT(OpenMPIsDevice , 1, 0, "Generate code only for OpenMP target device")

 LANGOPT(CUDAIsDevice , 1, 0, "compiling for CUDA device")
 LANGOPT(CUDAAllowVariadicFunctions, 1, 0, "allowing variadic functions in CUDA device code")
+LANGOPT(CUDAHostDeviceConstexpr, 1, 1, "treating unattributed constexpr functions as __host__ __device__")

 LANGOPT(AssumeSaneOperatorNew , 1, 1, "implicit __attribute__((malloc)) for C++'s new operators")
 LANGOPT(SizedDeallocation , 1, 0, "enable sized deallocation functions")
 LANGOPT(ConceptsTS , 1, 0, "enable C++ Extensions for Concepts")
 BENIGN_LANGOPT(ElideConstructors , 1, 1, "C++ copy constructor elision")
 BENIGN_LANGOPT(DumpRecordLayouts , 1, 0, "dumping the layout of IRgen'd records")
 BENIGN_LANGOPT(DumpRecordLayoutsSimple , 1, 0, "dumping the layout of IRgen'd records in a simple form")
 BENIGN_LANGOPT(DumpVTableLayouts , 1, 0, "dumping the layouts of emitted vtables")
[... 65 lines not shown ...]

cfe/trunk/include/clang/Driver/CC1Options.td

[... 685 lines not shown ...]
 //===----------------------------------------------------------------------===//

 def fcuda_is_device : Flag<["-"], "fcuda-is-device">,
   HelpText<"Generate code for CUDA device">;
 def fcuda_include_gpubinary : Separate<["-"], "fcuda-include-gpubinary">,
   HelpText<"Incorporate CUDA device-side binary into host object file.">;
 def fcuda_allow_variadic_functions : Flag<["-"], "fcuda-allow-variadic-functions">,
   HelpText<"Allow variadic functions in CUDA device code.">;
+def fno_cuda_host_device_constexpr : Flag<["-"], "fno-cuda-host-device-constexpr">,
+  HelpText<"Don't treat unattributed constexpr functions as __host__ __device__.">;

 //===----------------------------------------------------------------------===//
 // OpenMP Options
 //===----------------------------------------------------------------------===//

 def fopenmp_is_device : Flag<["-"], "fopenmp-is-device">,
   HelpText<"Generate code only for an OpenMP target device.">;
 def fomp_host_ir_file_path : Separate<["-"], "fomp-host-ir-file-path">,
[... 32 lines not shown ...]

cfe/trunk/include/clang/Sema/Sema.h

[... 2,186 lines not shown ...]
@@ enum OverloadKind {
     /// non-function.
     Ovl_NonFunction
   };
   OverloadKind CheckOverload(Scope *S,
                              FunctionDecl *New,
                              const LookupResult &OldDecls,
                              NamedDecl *&OldDecl,
                              bool IsForUsingDecl);
-  bool IsOverload(FunctionDecl *New, FunctionDecl *Old, bool IsForUsingDecl);
+  bool IsOverload(FunctionDecl *New, FunctionDecl *Old, bool IsForUsingDecl,
+                  bool ConsiderCudaAttrs = true);

   /// \brief Checks availability of the function depending on the current
   /// function context.Inside an unavailable function,unavailability is ignored.
   ///
   /// \returns true if \p FD is unavailable and current context is inside
   /// an available function, false otherwise.
   bool isFunctionConsideredUnavailable(FunctionDecl *FD);

[... 6,695 lines not shown ...]
@@ CUDAFunctionPreference IdentifyCUDAPreference(const FunctionDecl *Caller,
                                                 const FunctionDecl *Callee);

   /// Determines whether Caller may invoke Callee, based on their CUDA
   /// host/device attributes. Returns true if the call is not allowed.
   bool CheckCUDATarget(const FunctionDecl *Caller, const FunctionDecl *Callee) {
     return IdentifyCUDAPreference(Caller, Callee) == CFP_Never;
   }

+  /// May add implicit CUDAHostAttr and CUDADeviceAttr attributes to FD,
+  /// depending on FD and the current compilation settings.
+  void maybeAddCUDAHostDeviceAttrs(Scope *S, FunctionDecl *FD,
+                                   const LookupResult &Previous);
+
   /// Finds a function in \p Matches with highest calling priority
   /// from \p Caller context and erases all functions with lower
   /// calling priority.
   void EraseUnwantedCUDAMatches(const FunctionDecl *Caller,
                                 SmallVectorImpl<FunctionDecl *> &Matches);
   void EraseUnwantedCUDAMatches(const FunctionDecl *Caller,
                                 SmallVectorImpl<DeclAccessPair> &Matches);
   void EraseUnwantedCUDAMatches(
[... 498 lines not shown ...]

cfe/trunk/lib/Frontend/CompilerInvocation.cpp

[... 1,554 lines not shown ...]
@@ if (Args.hasArg(OPT_fno_operator_names))
     Opts.CXXOperatorNames = 0;

   if (Args.hasArg(OPT_fcuda_is_device))
     Opts.CUDAIsDevice = 1;

   if (Args.hasArg(OPT_fcuda_allow_variadic_functions))
     Opts.CUDAAllowVariadicFunctions = 1;

+  if (Args.hasArg(OPT_fno_cuda_host_device_constexpr))
+    Opts.CUDAHostDeviceConstexpr = 0;
+
   if (Opts.ObjC1) {
     if (Arg *arg = Args.getLastArg(OPT_fobjc_runtime_EQ)) {
       StringRef value = arg->getValue();
       if (Opts.ObjCRuntime.tryParse(value))
         Diags.Report(diag::err_drv_unknown_objc_runtime) << value;
     }

     if (Args.hasArg(OPT_fobjc_gc_only))
[... 810 lines not shown ...]

cfe/trunk/lib/Sema/SemaCUDA.cpp

 //===--- SemaCUDA.cpp - Semantic Analysis for CUDA constructs -------------===//
 //
 // The LLVM Compiler Infrastructure
 //
 // This file is distributed under the University of Illinois Open Source
 // License. See LICENSE.TXT for details.
 //
 //===----------------------------------------------------------------------===//
 /// \file
 /// \brief This file implements semantic analysis for CUDA constructs.
 ///
 //===----------------------------------------------------------------------===//

-#include "clang/Sema/Sema.h"
 #include "clang/AST/ASTContext.h"
 #include "clang/AST/Decl.h"
 #include "clang/AST/ExprCXX.h"
 #include "clang/Lex/Preprocessor.h"
+#include "clang/Sema/Lookup.h"
+#include "clang/Sema/Sema.h"
 #include "clang/Sema/SemaDiagnostic.h"
+#include "clang/Sema/Template.h"
 #include "llvm/ADT/Optional.h"
 #include "llvm/ADT/SmallVector.h"
 using namespace clang;

 ExprResult Sema::ActOnCUDAExecConfigExpr(Scope *S, SourceLocation LLLLoc,
                                          MultiExprArg ExecConfig,
                                          SourceLocation GGGLoc) {
   FunctionDecl *ConfigDecl = Context.getcudaConfigureCallDecl();
[... 348 lines not shown ...]
@@ if (!llvm::all_of(CD->inits(), [&](const CXXCtorInitializer *CI) {
                 dyn_cast<CXXConstructExpr>(CI->getInit()))
           return isEmptyCudaConstructor(Loc, CE->getConstructor());
         return false;
       }))
     return false;

   return true;
 }
+
+// With -fcuda-host-device-constexpr, an unattributed constexpr function is
+// treated as implicitly __host__ __device__, unless:
+//  * it is a variadic function (device-side variadic functions are not
+//    allowed), or
+//  * a __device__ function with this signature was already declared, in which
+//    case in which case we output an error, unless the __device__ decl is in a
+//    system header, in which case we leave the constexpr function unattributed.
+void Sema::maybeAddCUDAHostDeviceAttrs(Scope *S, FunctionDecl *NewD,
+                                       const LookupResult &Previous) {
+  assert(getLangOpts().CUDA && "May be called only for CUDA compilations.");
+  if (!getLangOpts().CUDAHostDeviceConstexpr || !NewD->isConstexpr() ||
+      NewD->isVariadic() || NewD->hasAttr<CUDAHostAttr>() ||
+      NewD->hasAttr<CUDADeviceAttr>() || NewD->hasAttr<CUDAGlobalAttr>())
+    return;
+
+  // Is D a __device__ function with the same signature as NewD, ignoring CUDA
+  // attributes?
+  auto IsMatchingDeviceFn = [&](NamedDecl *D) {
+    if (UsingShadowDecl *Using = dyn_cast<UsingShadowDecl>(D))
+      D = Using->getTargetDecl();
+    FunctionDecl *OldD = D->getAsFunction();
+    return OldD && OldD->hasAttr<CUDADeviceAttr>() &&
+           !OldD->hasAttr<CUDAHostAttr>() &&
+           !IsOverload(NewD, OldD, /* UseMemberUsingDeclRules = */ false,
+                       /* ConsiderCudaAttrs = */ false);
+  };
+  auto It = llvm::find_if(Previous, IsMatchingDeviceFn);
+  if (It != Previous.end()) {
+    // We found a __device__ function with the same name and signature as NewD
+    // (ignoring CUDA attrs). This is an error unless that function is defined
+    // in a system header, in which case we simply return without making NewD
+    // host+device.
+    NamedDecl *Match = *It;
+    if (!getSourceManager().isInSystemHeader(Match->getLocation())) {
+      Diag(NewD->getLocation(),
+           diag::err_cuda_unattributed_constexpr_cannot_overload_device)
+          << NewD->getName();
+      Diag(Match->getLocation(),
+           diag::note_cuda_conflicting_device_function_declared_here);
+    }
+    return;
+  }
+
+  NewD->addAttr(CUDAHostAttr::CreateImplicit(Context));
+  NewD->addAttr(CUDADeviceAttr::CreateImplicit(Context));
+}

cfe/trunk/lib/Sema/SemaDecl.cpp

[... 8,003 lines not shown ...]
@@ if (UnifySection(CodeSegStack.CurrentValue->getString(),
                      ASTContext::PSF_Read,
                      NewFD))
       NewFD->dropAttr<SectionAttr>();
   }

   // Handle attributes.
   ProcessDeclAttributes(S, NewFD, D);

+  if (getLangOpts().CUDA)
+    maybeAddCUDAHostDeviceAttrs(S, NewFD, Previous);
+
   if (getLangOpts().OpenCL) {
     // OpenCL v1.1 s6.5: Using an address space qualifier in a function return
     // type declaration will generate a compilation error.
     unsigned AddressSpace = NewFD->getReturnType().getAddressSpace();
     if (AddressSpace == LangAS::opencl_local ||
         AddressSpace == LangAS::opencl_global ||
         AddressSpace == LangAS::opencl_constant) {
       Diag(NewFD->getLocation(),
[... 7,034 lines not shown ...]

cfe/trunk/lib/Sema/SemaOverload.cpp

[... 986 lines not shown ...]
@@ if (FunctionDecl *OldF = OldD->getAsFunction()) {
       return Ovl_NonFunction;
     }
   }

   return Ovl_Overload;
 }

 bool Sema::IsOverload(FunctionDecl *New, FunctionDecl *Old,
-                      bool UseMemberUsingDeclRules) {
+                      bool UseMemberUsingDeclRules, bool ConsiderCudaAttrs) {
   // C++ [basic.start.main]p2: This function shall not be overloaded.
   if (New->isMain())
     return false;

   // MSVCRT user defined entry points cannot be overloaded.
   if (New->isMSVCRTEntryPoint())
     return false;

[... 116 lines not shown ...]
@@ if (NewI == NewE || OldI == OldE)
       return true;
     llvm::FoldingSetNodeID NewID, OldID;
     NewI->getCond()->Profile(NewID, Context, true);
     OldI->getCond()->Profile(OldID, Context, true);
     if (NewID != OldID)
       return true;
   }

-  if (getLangOpts().CUDA) {
+  if (getLangOpts().CUDA && ConsiderCudaAttrs) {
     CUDAFunctionTarget NewTarget = IdentifyCUDATarget(New),
                        OldTarget = IdentifyCUDATarget(Old);
     if (NewTarget == CFT_InvalidTarget || NewTarget == CFT_Global)
       return false;

     assert((OldTarget != CFT_InvalidTarget) && "Unexpected invalid target.");

     // Don't allow mixing of HD with other kinds. This guarantees that
[... 11,916 lines not shown ...]

cfe/trunk/test/SemaCUDA/Inputs/overload.h

+// This header is used by tests which are interested in __device__ functions
+// which appear in a system header.
+
+__device__ int OverloadMe();
+
+namespace ns {
+using ::OverloadMe;
+}

cfe/trunk/test/SemaCUDA/host-device-constexpr.cu

+// RUN: %clang_cc1 -std=c++11 -fsyntax-only -verify -isystem %S/Inputs %s
+// RUN: %clang_cc1 -std=c++11 -fsyntax-only -verify -isystem %S/Inputs %s -fcuda-is-device
+
+#include "Inputs/cuda.h"
+
+// Declares one function and pulls it into namespace ns:
+//
+//   __device__ int OverloadMe();
+//   namespace ns { using ::OverloadMe; }
+//
+// Clang cares that this is done in a system header.
+#include <overload.h>
+
+// Opaque type used to determine which overload we're invoking.
+struct HostReturnTy {};
+
+// These shouldn't become host+device because they already have attributes.
+__host__ constexpr int HostOnly() { return 0; }
+// expected-note@-1 0+ {{not viable}}
+__device__ constexpr int DeviceOnly() { return 0; }
+// expected-note@-1 0+ {{not viable}}
+
+constexpr int HostDevice() { return 0; }
+
+// This should be a host-only function, because there's a previous __device__
+// overload in <overload.h>.
+constexpr HostReturnTy OverloadMe() { return HostReturnTy(); }
+
+namespace ns {
+// The "using" statement in overload.h should prevent OverloadMe from being
+// implicitly host+device.
+constexpr HostReturnTy OverloadMe() { return HostReturnTy(); }
+}  // namespace ns
+
+// This is an error, because NonSysHdrOverload was not defined in a system
+// header.
+__device__ int NonSysHdrOverload() { return 0; }
+// expected-note@-1 {{conflicting __device__ function declared here}}
+constexpr int NonSysHdrOverload() { return 0; }
+// expected-error@-1 {{constexpr function 'NonSysHdrOverload' without __host__ or __device__ attributes}}
+
+// Variadic device functions are not allowed, so this is just treated as
+// host-only.
+constexpr void Variadic(const char*, ...);
+// expected-note@-1 {{call to __host__ function from __device__ function}}
+
+__host__ void HostFn() {
+  HostOnly();
+  DeviceOnly(); // expected-error {{no matching function}}
+  HostReturnTy x = OverloadMe();
+  HostReturnTy y = ns::OverloadMe();
+  Variadic("abc", 42);
+}
+
+__device__ void DeviceFn() {
+  HostOnly(); // expected-error {{no matching function}}
+  DeviceOnly();
+  int x = OverloadMe();
+  int y = ns::OverloadMe();
+  Variadic("abc", 42); // expected-error {{no matching function}}
+}
+
+__host__ __device__ void HostDeviceFn() {
+#ifdef __CUDA_ARCH__
+  int y = OverloadMe();
+#else
+  constexpr HostReturnTy y = OverloadMe();
+#endif
+}

cfe/trunk/test/SemaCUDA/no-host-device-constexpr.cu

+// RUN: %clang_cc1 -std=c++11 -fsyntax-only -fno-cuda-host-device-constexpr -verify %s
+// RUN: %clang_cc1 -std=c++11 -fsyntax-only -fno-cuda-host-device-constexpr -fcuda-is-device -verify %s
+
+#include "Inputs/cuda.h"
+
+// Check that, with -fno-cuda-host-device-constexpr, constexpr functions are
+// host-only, and __device__ constexpr functions are still device-only.
+
+constexpr int f() { return 0; } // expected-note {{not viable}}
+__device__ constexpr int g() { return 0; } // expected-note {{not viable}}
+
+void __device__ foo() {
+  f(); // expected-error {{no matching function}}
+  g();
+}
+
+void __host__ foo() {
+  f();
+  g(); // expected-error {{no matching function}}
+}
