This is an archive of the discontinued LLVM Phabricator instance.


Differential D18380

[CUDA] Make unattributed constexpr functions (usually) implicitly host+device.
Closed (Public)

Authored by jlebar on Mar 22 2016, 3:08 PM.

Details

Reviewers

tra
rnk
rsmith

Commits

rGba122ab42fe5: [CUDA] Make unattributed constexpr functions implicitly host+device.
rC264964: [CUDA] Make unattributed constexpr functions implicitly host+device.
rL264964: [CUDA] Make unattributed constexpr functions implicitly host+device.

Summary

[CUDA] Make unattributed constexpr functions implicitly host+device.

With this patch, by default a constexpr function is implicitly host+device
unless:

a) it's a variadic function (variadic functions are not allowed on the device side), or
b) it's preceded by a __device__ overload in a system header.

The restriction on overloading __host__ __device__ functions on the
basis of their CUDA attributes remains in place, but we use (b) to allow
us to define __device__ overloads for constexpr functions in cmath,
which would otherwise be __host__ __device__ and thus not overloadable.

You can disable this behavior with -fno-cuda-host-device-constexpr.
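
The conditions in the summary can be read as a single predicate. The following is only a sketch in plain C++, not clang's implementation: `FnInfo`, its fields, and `isImplicitlyHostDevice` are hypothetical names introduced here to label the conditions.

```cpp
#include <cassert>

// Hypothetical summary of the properties the rule looks at. None of these
// names come from clang; they only label the conditions in the summary.
struct FnInfo {
  bool isConstexpr = false;
  bool isVariadic = false;
  bool hasExplicitTargetAttr = false;             // __host__, __device__ or __global__
  bool deviceOverloadSeenInSystemHeader = false;  // condition (b)
};

// True if the function would be treated as implicitly __host__ __device__.
// `featureEnabled` is false when -fno-cuda-host-device-constexpr is passed.
inline bool isImplicitlyHostDevice(const FnInfo &fn, bool featureEnabled) {
  return featureEnabled && fn.isConstexpr && !fn.hasExplicitTargetAttr &&
         !fn.isVariadic && !fn.deviceOverloadSeenInSystemHeader;
}

// Walks through the cases named in the summary.
inline void checkImplicitHostDeviceRule() {
  FnInfo plain;
  plain.isConstexpr = true;
  assert(isImplicitlyHostDevice(plain, true));
  assert(!isImplicitlyHostDevice(plain, false));  // feature switched off

  FnInfo variadic = plain;
  variadic.isVariadic = true;  // condition (a)
  assert(!isImplicitlyHostDevice(variadic, true));

  FnInfo shadowed = plain;
  shadowed.deviceOverloadSeenInSystemHeader = true;  // condition (b)
  assert(!isImplicitlyHostDevice(shadowed, true));
}
```

Calling `checkImplicitHostDeviceRule()` exercises each branch once; a function that is not constexpr at all is never affected.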

Event Timeline

jlebar updated this revision to Diff 51352. Mar 22 2016, 3:08 PM

jlebar retitled this revision from to [CUDA] Implement -fcuda-relaxed-constexpr, and enable it by default..

jlebar updated this object.

jlebar added a reviewer: tra.

jlebar added subscribers: rsmith, rnk, cfe-commits.

Actually run the tests, and fix the CUDA overloading test.

Now that H/D and HD can all be in the same overload set, we'll also need additional tests in CodeGenCUDA/function-overload.cu for cases that have now become legal.

In D18380#381025, @tra wrote:

Now that H/D and HD can all be in the same overload set, we'll also need additional tests in CodeGenCUDA/function-overload.cu for cases that have now become legal.

There are lots of tests that used to be compile errors and now aren't -- what do you think we're missing?

We need tests to demonstrate that we pick the correct function when we have a mix
of HD+H/D in the overload set.
Existing tests only cover resolution of {HD,HD}, {H,H}, {D,D}, and {H,D} sets.

Add tests checking host+device overloading.

In D18380#381031, @tra wrote:

We need tests to demonstrate that we pick the correct function when we have a mix
of HD+H/D in the overload set.
Existing tests only cover resolution of {HD,HD}, {H,H}, {D,D}, and {H,D} sets.

Aha, got it. I think adding this is simple given the existing framework -- lmk what you think.

Update test as discussed -- now we check that we're invoking the correct overloads.
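
The test checks which overload is invoked by giving each overload a distinct return type. A minimal sketch of that technique in plain C++ follows. Standard C++ has no target attributes, so the hypothetical overloads of `pick` differ by parameter type instead, and the tag types are invented for this illustration.

```cpp
#include <type_traits>

// Distinct empty tag types, one per overload, in the spirit of the
// HostReturnTy / DeviceReturnTy types used by the test file.
struct IntOverloadTy {};
struct DoubleOverloadTy {};

// Two overloads of a hypothetical function. Each returns its own tag type.
inline IntOverloadTy pick(int) { return IntOverloadTy(); }
inline DoubleOverloadTy pick(double) { return DoubleOverloadTy(); }

// The type of the call expression reveals which overload was selected.
static_assert(std::is_same<decltype(pick(0)), IntOverloadTy>::value,
              "pick(int) expected");
static_assert(std::is_same<decltype(pick(0.0)), DoubleOverloadTy>::value,
              "pick(double) expected");

// Same check written the way the test does it: the initialization compiles
// only if the expected overload was chosen.
inline void checkPick() {
  IntOverloadTy a = pick(1);
  DoubleOverloadTy b = pick(2.5);
  (void)a;
  (void)b;
}
```

A wrong choice shows up as a compile error, which suits a test that the compiler itself evaluates.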

tra accepted this revision. Mar 23 2016, 10:37 AM

tra edited edge metadata.

This revision is now accepted and ready to land. Mar 23 2016, 10:37 AM

rsmith added inline comments. Mar 23 2016, 10:58 AM

include/clang/Driver/CC1Options.td
694–695	Is there a better name we can use for this? I don't think this is "relaxed" in any obvious sense. `-fcuda-host-device-constexpr` or `-fcuda-constexpr-on-device` might be clearer?
lib/Driver/Tools.cpp
3597 ↗	(On Diff #51384)	For flags that are enabled by default, we usually have the -cc1 flag be a `-fno-*` flag. This allows people to use (for instance) `clang blah.cu -Xclang -fno-cuda-relaxed-constexpr` if necessary.
lib/Sema/SemaOverload.cpp
1132	No parens around `==` comparisons.

jlebar added inline comments. Mar 23 2016, 11:30 AM

include/clang/Driver/CC1Options.td
694–695	"relaxed constexpr" is nvidia's term -- do you think it might be helpful to use the same terminology? I understand there's some prior art here, with respect to clang accepting gcc's flags, although the situation here is of course different.
lib/Driver/Tools.cpp
3597 ↗	(On Diff #51384)	Yeah, Artem and I had a discussion about this yesterday. As you can see, there are two other flags above which are turned on by default -- these also lack -fno variants. I think it would be good to be consistent here. I'm tempted to add another patch below this one which makes the other two -fno, then we can make this one -fno as well. It seems that convention is to just get rid of the existing non-fno flags, rather than leave both positive and negative versions. Does that sound OK to you?

rsmith added inline comments. Mar 23 2016, 11:35 AM

include/clang/Driver/CC1Options.td
694–695	I think it's problematic to use that terminology, as "relaxed constexpr" is also used to describe the C++14 `constexpr` rules (see n3652).
lib/Driver/Tools.cpp
3597 ↗	(On Diff #51384)	Yes, that sounds fine.

jlebar added inline comments. Mar 23 2016, 1:24 PM

include/clang/Driver/CC1Options.td
694–695	Heh, I can't argue with that.
lib/Driver/Tools.cpp
3597 ↗	(On Diff #51384)	Okay, thank you. After talking to Artem, we're just going to remove those two flags entirely. So after we convert relaxed-constexpr to an fno flag, there should be no changes to this file in this patch.
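
The convention discussed here is that a feature that is on by default is exposed to -cc1 only through a negative `-fno-*` flag. The sketch below illustrates that with standard C++ containers rather than clang's option-parsing classes. `MiniLangOpts` and `parseArgs` are hypothetical; only the flag spelling and the option name come from the patch.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical miniature of a language-options struct. The field name echoes
// the patch, but this is not clang's LangOptions.
struct MiniLangOpts {
  bool CUDAHostDeviceConstexpr = true;  // the feature is on by default
};

// Scan a cc1-style argument list. Only the negative spelling exists, so the
// flag's sole job is to switch the default-on feature off.
inline MiniLangOpts parseArgs(const std::vector<std::string> &args) {
  MiniLangOpts opts;
  for (const std::string &arg : args) {
    if (arg == "-fno-cuda-host-device-constexpr")
      opts.CUDAHostDeviceConstexpr = false;
  }
  return opts;
}

inline void checkParseArgs() {
  assert(parseArgs({}).CUDAHostDeviceConstexpr);
  assert(!parseArgs({"-fno-cuda-host-device-constexpr"}).CUDAHostDeviceConstexpr);
}
```

With no positive spelling there is nothing to reconcile when both forms appear, which is one reason to keep a single flag.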

Switch to -fno-cuda-host-device-constexpr. Only implicitly add the attributes
on functions which themselves lack host/device attributes. Add more tests.

Changed as discussed. Please have another look. Thank you for your continued patience here.

tra added inline comments. Mar 23 2016, 3:38 PM

lib/Sema/SemaDecl.cpp
8015–8017	Can we have constexpr `__global__` ?

Add check for global constexpr functions.

lib/Sema/SemaDecl.cpp
8015–8017	Yikes. We're saved (unless Richard has a tricky counterexample) because kernels must be void and constexpr must not be void. But I'll add a check here anyway.

Richard, are you happy here?

The change to allow __host__ __device__ functions to be overloaded with other combinations of target attributes appears to be separable from the constexpr change; please split it out and commit it first.

include/clang/Basic/LangOptions.def
175	This should be a noun phrase -- this string appears in contexts like "support for %0 is enabled" -- so this should be "treating unattributed [...]".
lib/Sema/SemaDecl.cpp
8015–8017	`constexpr` functions can return `void` in a couple of different ways (in C++11, if they're template specializations with dependent return types that instantiate to `void`, and in C++14 there is no restriction on `constexpr` functions returning `void`).
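
Both ways of getting a `void`-returning `constexpr` function can be shown in a few lines of standard C++. This is an illustrative sketch that assumes a C++14 or later compiler; the function names are invented for the example.

```cpp
// C++14 and later: a constexpr function may return void and modify its
// argument through a reference.
constexpr void bump(int &x) { ++x; }

constexpr int bumpedTwice(int x) {
  bump(x);
  bump(x);
  return x;
}
static_assert(bumpedTwice(40) == 42, "evaluated at compile time");

// Dependent return type: the template is declared constexpr, and one of its
// specializations happens to return void.
template <typename T>
constexpr T defaultValue() { return T(); }

static_assert(defaultValue<int>() == 0, "int specialization is usable in constant expressions");

inline void useVoidSpecialization() {
  defaultValue<void>();  // instantiates a constexpr function returning void
}
```

Either form could also carry `__global__` in CUDA code, which is why an explicit check for that attribute is safer than relying on the return type.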

jlebar marked 2 inline comments as done. Mar 24 2016, 1:04 PM

jlebar added inline comments.

include/clang/Basic/LangOptions.def
175	Thanks. This is fixed in my patch queue, and I will push a change for the other ones as part of this patch queue.

Okay, just one more patch, D18458, then I think we're good here. (This is split up into two patches in my queue.)

Thanks for your help, Richard.

jlebar mentioned this in D18458: [CUDA] Mangle __host__ __device__ functions differently than __host__ or __device__ functions. Mar 28 2016, 6:49 PM

jlebar updated this object.

jlebar edited edge metadata.

jlebar updated this object.

jlebar added a reviewer: rnk.

jlebar updated this object.

jlebar retitled this revision from [CUDA] Implement -fcuda-relaxed-constexpr, and enable it by default. to [CUDA] Make unattributed constexpr functions (usually) implicitly host+device. Mar 28 2016, 6:52 PM

Update per changes to patch description. Now a constexpr function becomes implicitly HD
unless there's a preceding device overload.

Updated as discussed -- please have a look.

I wonder if we can find a way to decide whether a particular constexpr function should be treated as HD without relying on the particular order in which the compiler sees the functions (or on whether they come from system headers).

Right now we check the overloads of a constexpr function's decl once and apply HD attributes based on the state of the overload set at that point in the TU. We then use those attributes during overload resolution.

What if, instead of permanently sticking HD attributes on the constexpr function, we postpone the decision to the point of overload resolution and figure out the effective attributes or call preference based on the contents of the whole overload set, regardless of the order in which the decls were added to the set?

test/SemaCUDA/host-device-constexpr.cu
31–32	"should prevent this"

In D18380#385240, @tra wrote:

What if, instead of permanently sticking HD attributes on the constexpr function, we postpone the decision to the point of overload resolution and figure out the effective attributes or call preference based on the contents of the whole overload set, regardless of the order in which the decls were added to the set?

The problem we were trying to prevent by requiring that the __device__ overload come first is:

constexpr int foo();
__device__ void bar() { foo(); }
__device__ int foo();
__device__ void baz() { foo(); }

In this example, we're forced to instantiate both versions of foo() on the device. Being lazy about making the first foo HD doesn't help, because at the time we see bar, it's the only option available.

(Instantiating both foos is a problem if they have the same mangling. And we want them to have the same mangling so we maintain ABI compatibility with nvcc.)
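
The ordering problem can be replayed with a small simulation. This is a sketch under stated assumptions, not clang's overload resolution: `Step`, `resolveOnDevice`, and `distinctDeviceTargets` are hypothetical, a `__device__` candidate is assumed to beat a `__host__ __device__` one for device-side callers, and only a single function name is tracked.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

enum class Target { Host, Device, HostDevice };
enum class Kind { Declare, CallFromDevice };

struct Step {
  Kind kind;
  std::string name;
  Target target;  // ignored for calls
};

// Index of the declaration a device-side caller would use among those seen so
// far, or -1 if none is callable. Host-only declarations are not viable.
inline int resolveOnDevice(const std::vector<Step> &seen, const std::string &name) {
  int best = -1;
  for (int i = 0; i < static_cast<int>(seen.size()); ++i) {
    if (seen[i].name != name || seen[i].target == Target::Host) continue;
    if (best == -1 || seen[i].target == Target::Device) best = i;
  }
  return best;
}

// Walk the translation unit in order and count the distinct declarations that
// device-side calls resolve to. A count above one for the same name means two
// bodies would need the same mangled name on the device.
inline int distinctDeviceTargets(const std::vector<Step> &tu) {
  std::vector<Step> seen;
  std::vector<int> used;
  for (const Step &s : tu) {
    if (s.kind == Kind::Declare) { seen.push_back(s); continue; }
    int idx = resolveOnDevice(seen, s.name);
    if (idx >= 0 && std::find(used.begin(), used.end(), idx) == used.end())
      used.push_back(idx);
  }
  return static_cast<int>(used.size());
}

// The example from the comment: the constexpr foo is seen first and is HD.
inline std::vector<Step> constexprFirst() {
  return {{Kind::Declare, "foo", Target::HostDevice},      // constexpr int foo();
          {Kind::CallFromDevice, "foo", Target::Device},   // bar() calls foo()
          {Kind::Declare, "foo", Target::Device},          // __device__ int foo();
          {Kind::CallFromDevice, "foo", Target::Device}};  // baz() calls foo()
}

// The order the patch relies on: the __device__ overload precedes the
// constexpr one, which therefore stays host-only.
inline std::vector<Step> deviceFirst() {
  return {{Kind::Declare, "foo", Target::Device},
          {Kind::Declare, "foo", Target::Host},
          {Kind::CallFromDevice, "foo", Target::Device},
          {Kind::CallFromDevice, "foo", Target::Device}};
}
```

In this model `constexprFirst()` needs two device-side bodies for `foo`, while `deviceFirst()` needs one, which is the reason for requiring the `__device__` overload to come first.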

jlebar mentioned this in rL264740: [CUDA] Make CUDA description strings in langopts into noun phrases. NFC. Mar 29 2016, 9:29 AM

LGTM.

(Just to be clear, I'm waiting on Richard's review here, even though he lg'ed an earlier version of this patch.)

LGTM

Thank you for all your time here, Art, Reid, and Richard. Fingers crossed we don't have to worry about this again for a while...

Closed by commit rL264964: [CUDA] Make unattributed constexpr functions implicitly host+device. (authored by jlebar). Mar 30 2016, 4:35 PM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path                                         Size
include/clang/Basic/LangOptions.def          1 line
include/clang/Driver/CC1Options.td           2 lines
lib/Frontend/CompilerInvocation.cpp          3 lines
lib/Sema/SemaDecl.cpp                        11 lines
lib/Sema/SemaOverload.cpp                    11 lines
test/SemaCUDA/function-overload.cu           126 lines
test/SemaCUDA/host-device-constexpr.cu       65 lines
test/SemaCUDA/no-host-device-constexpr.cu    20 lines

Diff 51495

include/clang/Basic/LangOptions.def

[...]
 LANGOPT(HalfArgsAndReturns, 1, 0, "half args and returns")
 LANGOPT(CUDA , 1, 0, "CUDA")
 LANGOPT(OpenMP , 1, 0, "OpenMP support")
 LANGOPT(OpenMPUseTLS , 1, 0, "Use TLS for threadprivates or runtime calls")
 LANGOPT(OpenMPIsDevice , 1, 0, "Generate code only for OpenMP target device")

 LANGOPT(CUDAIsDevice , 1, 0, "Compiling for CUDA device")
 LANGOPT(CUDAAllowVariadicFunctions, 1, 0, "Allow variadic functions in CUDA device code")
+LANGOPT(CUDAHostDeviceConstexpr, 1, 1, "Treat unattributed constexpr functions as __host__ __device__")

 LANGOPT(AssumeSaneOperatorNew , 1, 1, "implicit __attribute__((malloc)) for C++'s new operators")
 LANGOPT(SizedDeallocation , 1, 0, "enable sized deallocation functions")
 LANGOPT(ConceptsTS , 1, 0, "enable C++ Extensions for Concepts")
 BENIGN_LANGOPT(ElideConstructors , 1, 1, "C++ copy constructor elision")
 BENIGN_LANGOPT(DumpRecordLayouts , 1, 0, "dumping the layout of IRgen'd records")
 BENIGN_LANGOPT(DumpRecordLayoutsSimple , 1, 0, "dumping the layout of IRgen'd records in a simple form")
 BENIGN_LANGOPT(DumpVTableLayouts , 1, 0, "dumping the layouts of emitted vtables")
[...]

Inline comments on the added CUDAHostDeviceConstexpr line:
rsmith (Done): This should be a noun phrase -- this string appears in contexts like "support for %0 is enabled" -- so this should be "treating unattributed [...]".
jlebar (Not Done): Thanks. This is fixed in my patch queue, and I will push a change for the other ones as part of this patch queue.

include/clang/Driver/CC1Options.td

[...]
 //===----------------------------------------------------------------------===//

 def fcuda_is_device : Flag<["-"], "fcuda-is-device">,
   HelpText<"Generate code for CUDA device">;
 def fcuda_include_gpubinary : Separate<["-"], "fcuda-include-gpubinary">,
   HelpText<"Incorporate CUDA device-side binary into host object file.">;
 def fcuda_allow_variadic_functions : Flag<["-"], "fcuda-allow-variadic-functions">,
   HelpText<"Allow variadic functions in CUDA device code.">;
+def fno_cuda_host_device_constexpr : Flag<["-"], "fno-cuda-host-device-constexpr">,
+  HelpText<"Don't treat unattributed constexpr functions as __host__ __device__.">;

 //===----------------------------------------------------------------------===//
 // OpenMP Options
 //===----------------------------------------------------------------------===//

 def fopenmp_is_device : Flag<["-"], "fopenmp-is-device">,
   HelpText<"Generate code only for an OpenMP target device.">;
 def fomp_host_ir_file_path : Separate<["-"], "fomp-host-ir-file-path">,
[...]

Inline comments on the added flag (carried over from an earlier diff):
rsmith (Not Done): Is there a better name we can use for this? I don't think this is "relaxed" in any obvious sense. `-fcuda-host-device-constexpr` or `-fcuda-constexpr-on-device` might be clearer?
jlebar (Not Done): "relaxed constexpr" is nvidia's term -- do you think it might be helpful to use the same terminology? I understand there's some prior art here, with respect to clang accepting gcc's flags, although the situation here is of course different.
rsmith (Not Done): I think it's problematic to use that terminology, as "relaxed constexpr" is also used to describe the C++14 `constexpr` rules (see n3652).
jlebar (Not Done): Heh, I can't argue with that.

lib/Frontend/CompilerInvocation.cpp

[...]
   if (Args.hasArg(OPT_fno_operator_names))
     Opts.CXXOperatorNames = 0;

   if (Args.hasArg(OPT_fcuda_is_device))
     Opts.CUDAIsDevice = 1;

   if (Args.hasArg(OPT_fcuda_allow_variadic_functions))
     Opts.CUDAAllowVariadicFunctions = 1;

+  if (Args.hasArg(OPT_fno_cuda_host_device_constexpr))
+    Opts.CUDAHostDeviceConstexpr = 0;
+
   if (Opts.ObjC1) {
     if (Arg *arg = Args.getLastArg(OPT_fobjc_runtime_EQ)) {
       StringRef value = arg->getValue();
       if (Opts.ObjCRuntime.tryParse(value))
         Diags.Report(diag::err_drv_unknown_objc_runtime) << value;
     }

     if (Args.hasArg(OPT_fobjc_gc_only))
[...]

lib/Sema/SemaDecl.cpp

[...]
     if (UnifySection(CodeSegStack.CurrentValue->getString(),
                      ASTContext::PSF_Read,
                      NewFD))
       NewFD->dropAttr<SectionAttr>();
   }

   // Handle attributes.
   ProcessDeclAttributes(S, NewFD, D);

+  // With CUDAHostDeviceConstexpr, unattributed constexpr functions are treated
+  // as implicitly __host__ __device__. Device-side variadic functions are not
+  // allowed, so we just treat those as host-only.
+  if (getLangOpts().CUDA && getLangOpts().CUDAHostDeviceConstexpr &&
+      NewFD->isConstexpr() && !NewFD->isVariadic() &&
+      !NewFD->hasAttr<CUDAHostAttr>() && !NewFD->hasAttr<CUDADeviceAttr>() &&
+      !NewFD->hasAttr<CUDAGlobalAttr>()) {
+    NewFD->addAttr(CUDAHostAttr::CreateImplicit(Context));
+    NewFD->addAttr(CUDADeviceAttr::CreateImplicit(Context));
+  }
+
   if (getLangOpts().OpenCL) {
     // OpenCL v1.1 s6.5: Using an address space qualifier in a function return
     // type declaration will generate a compilation error.
     unsigned AddressSpace = NewFD->getReturnType().getAddressSpace();
     if (AddressSpace == LangAS::opencl_local ||
         AddressSpace == LangAS::opencl_global ||
         AddressSpace == LangAS::opencl_constant) {
       Diag(NewFD->getLocation(),
[...]

Inline comments on the added attribute checks:
tra (Not Done): Can we have constexpr `__global__` ?
jlebar (Not Done): Yikes. We're saved (unless Richard has a tricky counterexample) because kernels must be void and constexpr must not be void. But I'll add a check here anyway.
rsmith (Done): `constexpr` functions can return `void` in a couple of different ways (in C++11, if they're template specializations with dependent return types that instantiate to `void`, and in C++14 there is no restriction on `constexpr` functions returning `void`).

lib/Sema/SemaOverload.cpp

[...] bool Sema::IsOverload(FunctionDecl *New, FunctionDecl *Old,
   if (getLangOpts().CUDA) {
     CUDAFunctionTarget NewTarget = IdentifyCUDATarget(New),
                        OldTarget = IdentifyCUDATarget(Old);
     if (NewTarget == CFT_InvalidTarget || NewTarget == CFT_Global)
       return false;

     assert((OldTarget != CFT_InvalidTarget) && "Unexpected invalid target.");

-    // Don't allow mixing of HD with other kinds. This guarantees that
-    // we have only one viable function with this signature on any
-    // side of CUDA compilation .
-    // __global__ functions can't be overloaded based on attribute
-    // difference because, like HD, they also exist on both sides.
-    if ((NewTarget == CFT_HostDevice) || (OldTarget == CFT_HostDevice) ||
-        (NewTarget == CFT_Global) || (OldTarget == CFT_Global))
+    // Don't allow __global__ functions to be overloaded with other functions,
+    // based solely on their CUDA attributes. This guarantees that we have only
+    // one viable function with this signature on any side of CUDA compilation.
+    if (NewTarget == CFT_Global || OldTarget == CFT_Global)
       return false;

     // Allow overloading of functions with same signature, but
     // different CUDA target attributes.
     return NewTarget != OldTarget;
   }

   // The signatures match; this is not an overload.
[...]

Inline comment on the new condition:
rsmith (Not Done): No parens around `==` comparisons.
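
To make the effect of this hunk concrete, here is a standalone sketch of the target check before and after the change, as it appears in this diff. The enum and both functions are hypothetical stand-ins, not clang's types. Note also that the summary says the restriction on overloading `__host__ __device__` functions remains in place, so the committed code may differ from this hunk.

```cpp
#include <cassert>

// Hypothetical mirror of the target kinds named in the hunk.
enum class CudaTarget { Host, Device, HostDevice, Global, Invalid };

// Old rule in this diff: for matching signatures, neither HD nor __global__
// functions can be overloaded on target attributes alone.
inline bool targetsAllowOverloadBefore(CudaTarget newT, CudaTarget oldT) {
  if (newT == CudaTarget::Invalid || newT == CudaTarget::Global) return false;
  assert(oldT != CudaTarget::Invalid && "Unexpected invalid target.");
  if (newT == CudaTarget::HostDevice || oldT == CudaTarget::HostDevice ||
      oldT == CudaTarget::Global)
    return false;
  return newT != oldT;
}

// New rule in this diff: only __global__ is excluded, so HD may coexist with
// H or D overloads of the same signature.
inline bool targetsAllowOverload(CudaTarget newT, CudaTarget oldT) {
  if (newT == CudaTarget::Invalid || newT == CudaTarget::Global) return false;
  assert(oldT != CudaTarget::Invalid && "Unexpected invalid target.");
  if (oldT == CudaTarget::Global) return false;
  return newT != oldT;
}

inline void checkOverloadRules() {
  // H/D overloading is allowed under both rules.
  assert(targetsAllowOverloadBefore(CudaTarget::Host, CudaTarget::Device));
  assert(targetsAllowOverload(CudaTarget::Host, CudaTarget::Device));
  // HD/H is the case that changes.
  assert(!targetsAllowOverloadBefore(CudaTarget::HostDevice, CudaTarget::Host));
  assert(targetsAllowOverload(CudaTarget::HostDevice, CudaTarget::Host));
}
```

Identical targets still return false in both versions, which matches the "redefinition" errors kept in the test for `hh` and `chh`.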

test/SemaCUDA/function-overload.cu

[...]
 // Host and unattributed functions can't be overloaded.
 __host__ void hh() {} // expected-note {{previous definition is here}}
 void hh() {} // expected-error {{redefinition of 'hh'}}

 // H/D overloading is OK.
 __host__ HostReturnTy dh() { return HostReturnTy(); }
 __device__ DeviceReturnTy dh() { return DeviceReturnTy(); }

-// H/HD and D/HD are not allowed.
-__host__ __device__ int hdh() { return 0; } // expected-note {{previous definition is here}}
-__host__ int hdh() { return 0; } // expected-error {{redefinition of 'hdh'}}
+// H/HD and D/HD are also OK.
+__host__ __device__ HostDeviceReturnTy hdh() { return HostDeviceReturnTy(); }
+__host__ HostReturnTy hdh() { return HostReturnTy(); }

-__host__ int hhd() { return 0; } // expected-note {{previous definition is here}}
-__host__ __device__ int hhd() { return 0; } // expected-error {{redefinition of 'hhd'}}
-// expected-warning@-1 {{attribute declaration must precede definition}}
-// expected-note@-3 {{previous definition is here}}
+__host__ HostReturnTy hhd() { return HostReturnTy(); }
+__host__ __device__ HostDeviceReturnTy hhd() { return HostDeviceReturnTy(); }

-__host__ __device__ int hdd() { return 0; } // expected-note {{previous definition is here}}
-__device__ int hdd() { return 0; } // expected-error {{redefinition of 'hdd'}}
+__host__ __device__ HostDeviceReturnTy hdd() { return HostDeviceReturnTy(); }
+__device__ DeviceReturnTy hdd() { return DeviceReturnTy(); }

-__device__ int dhd() { return 0; } // expected-note {{previous definition is here}}
-__host__ __device__ int dhd() { return 0; } // expected-error {{redefinition of 'dhd'}}
-// expected-warning@-1 {{attribute declaration must precede definition}}
-// expected-note@-3 {{previous definition is here}}
+__device__ DeviceReturnTy dhd() { return DeviceReturnTy(); }
+__host__ __device__ HostDeviceReturnTy dhd() { return HostDeviceReturnTy(); }

 // Same tests for extern "C" functions.
 extern "C" __host__ int chh() { return 0; } // expected-note {{previous definition is here}}
 extern "C" int chh() { return 0; } // expected-error {{redefinition of 'chh'}}

 // H/D overloading is OK.
 extern "C" __device__ DeviceReturnTy cdh() { return DeviceReturnTy(); }
 extern "C" __host__ HostReturnTy cdh() { return HostReturnTy(); }

-// H/HD and D/HD overloading is not allowed.
-extern "C" __host__ __device__ int chhd1() { return 0; } // expected-note {{previous definition is here}}
-extern "C" __host__ int chhd1() { return 0; } // expected-error {{redefinition of 'chhd1'}}
+// H/HD and D/HD overloading is OK.
+extern "C" __host__ __device__ HostDeviceReturnTy chhd() { return HostDeviceReturnTy(); }
+extern "C" __host__ HostReturnTy chhd() { return HostReturnTy(); }

-extern "C" __host__ int chhd2() { return 0; } // expected-note {{previous definition is here}}
-extern "C" __host__ __device__ int chhd2() { return 0; } // expected-error {{redefinition of 'chhd2'}}
-// expected-warning@-1 {{attribute declaration must precede definition}}
-// expected-note@-3 {{previous definition is here}}
+extern "C" __host__ __device__ HostDeviceReturnTy chdd() { return HostDeviceReturnTy(); }
+extern "C" __device__ DeviceReturnTy chdd() { return DeviceReturnTy(); }

 // Helper functions to verify calling restrictions.
 __device__ DeviceReturnTy d() { return DeviceReturnTy(); }
 // expected-note@-1 1+ {{'d' declared here}}
 // expected-note@-2 1+ {{candidate function not viable: call to __device__ function from __host__ function}}
 // expected-note@-3 0+ {{candidate function not viable: call to __device__ function from __host__ __device__ function}}

 __host__ HostReturnTy h() { return HostReturnTy(); }
[...] __host__ void hostf() {
   HostFnPtr fp_ch = ch;
   HostReturnTy ret_ch = ch();

   HostFnPtr fp_dh = dh;
   HostReturnTy ret_dh = dh();
   HostFnPtr fp_cdh = cdh;
   HostReturnTy ret_cdh = cdh();

+  HostFnPtr fp_hdh = hdh;
+  HostReturnTy ret_hdh = hdh();
+  HostFnPtr fp_chhd = chhd;
+  HostReturnTy ret_chhd = chhd();
+
+  HostDeviceFnPtr fp_hdd = hdd;
+  HostDeviceReturnTy ret_hdd = hdd();
+  HostDeviceFnPtr fp_chdd = chdd;
+  HostDeviceReturnTy ret_chdd = chdd();
+
   GlobalFnPtr fp_g = g;
   g(); // expected-error {{call to global function g not configured}}
   g<<<0, 0>>>();
 }

 __device__ void devicef() {
   DeviceFnPtr fp_d = d;
   DeviceReturnTy ret_d = d();
   DeviceFnPtr fp_cd = cd;
   DeviceReturnTy ret_cd = cd();

   HostFnPtr fp_h = h; // expected-error {{reference to __host__ function 'h' in __device__ function}}
   HostReturnTy ret_h = h(); // expected-error {{no matching function for call to 'h'}}
   HostFnPtr fp_ch = ch; // expected-error {{reference to __host__ function 'ch' in __device__ function}}
   HostReturnTy ret_ch = ch(); // expected-error {{no matching function for call to 'ch'}}

   DeviceFnPtr fp_dh = dh;
   DeviceReturnTy ret_dh = dh();
   DeviceFnPtr fp_cdh = cdh;
   DeviceReturnTy ret_cdh = cdh();

+  HostDeviceFnPtr fp_hdh = hdh;
+  HostDeviceReturnTy ret_hdh = hdh();
+  HostDeviceFnPtr fp_chhd = chhd;
+  HostDeviceReturnTy ret_chhd = chhd();
+
+  DeviceFnPtr fp_hdd = hdd;
+  DeviceReturnTy ret_hdd = hdd();
+  DeviceFnPtr fp_chdd = chdd;
+  DeviceReturnTy ret_chdd = chdd();
+
   GlobalFnPtr fp_g = g; // expected-error {{reference to __global__ function 'g' in __device__ function}}
   g(); // expected-error {{no matching function for call to 'g'}}
   g<<<0,0>>>(); // expected-error {{reference to __global__ function 'g' in __device__ function}}
 }

 __global__ void globalf() {
   DeviceFnPtr fp_d = d;
   DeviceReturnTy ret_d = d();
   DeviceFnPtr fp_cd = cd;
   DeviceReturnTy ret_cd = cd();

   HostFnPtr fp_h = h; // expected-error {{reference to __host__ function 'h' in __global__ function}}
   HostReturnTy ret_h = h(); // expected-error {{no matching function for call to 'h'}}
   HostFnPtr fp_ch = ch; // expected-error {{reference to __host__ function 'ch' in __global__ function}}
   HostReturnTy ret_ch = ch(); // expected-error {{no matching function for call to 'ch'}}

   DeviceFnPtr fp_dh = dh;
   DeviceReturnTy ret_dh = dh();
   DeviceFnPtr fp_cdh = cdh;
   DeviceReturnTy ret_cdh = cdh();

+  HostDeviceFnPtr fp_hdh = hdh;
+  HostDeviceReturnTy ret_hdh = hdh();
+  HostDeviceFnPtr fp_chhd = chhd;
+  HostDeviceReturnTy ret_chhd = chhd();
+
+  DeviceFnPtr fp_hdd = hdd;
+  DeviceReturnTy ret_hdd = hdd();
+  DeviceFnPtr fp_chdd = chdd;
+  DeviceReturnTy ret_chdd = chdd();
+
   GlobalFnPtr fp_g = g; // expected-error {{reference to __global__ function 'g' in __global__ function}}
   g(); // expected-error {{no matching function for call to 'g'}}
   g<<<0,0>>>(); // expected-error {{reference to __global__ function 'g' in __global__ function}}
 }

__host__ __device__ void hostdevicef() {		__host__ __device__ void hostdevicef() {
DeviceFnPtr fp_d = d;		DeviceFnPtr fp_d = d;
DeviceReturnTy ret_d = d();		DeviceReturnTy ret_d = d();
DeviceFnPtr fp_cd = cd;		DeviceFnPtr fp_cd = cd;
DeviceReturnTy ret_cd = cd();		DeviceReturnTy ret_cd = cd();

HostFnPtr fp_h = h;		HostFnPtr fp_h = h;
HostReturnTy ret_h = h();		HostReturnTy ret_h = h();
HostFnPtr fp_ch = ch;		HostFnPtr fp_ch = ch;
HostReturnTy ret_ch = ch();		HostReturnTy ret_ch = ch();

CurrentFnPtr fp_dh = dh;		CurrentFnPtr fp_dh = dh;
CurrentReturnTy ret_dh = dh();		CurrentReturnTy ret_dh = dh();
CurrentFnPtr fp_cdh = cdh;		CurrentFnPtr fp_cdh = cdh;
CurrentReturnTy ret_cdh = cdh();		CurrentReturnTy ret_cdh = cdh();

		// HDOrHostFoo is HostFoo if we're doing host compilation, and HDFoo
		// otherwise.
		#ifdef __CUDA_ARCH__
		typedef HostDeviceReturnTy HDOrHostReturnTy;
		typedef HostDeviceFnPtr HDOrHostFnPtr;
		typedef DeviceReturnTy HDOrDeviceReturnTy;
		typedef DeviceFnPtr HDOrDeviceFnPtr;
		#else
		typedef HostReturnTy HDOrHostReturnTy;
		typedef HostFnPtr HDOrHostFnPtr;
		typedef HostDeviceReturnTy HDOrDeviceReturnTy;
		typedef HostDeviceFnPtr HDOrDeviceFnPtr;
		#endif

		HDOrHostFnPtr fp_hdh = hdh;
		HDOrHostReturnTy ret_hdh = hdh();
		HDOrHostFnPtr fp_chhd = chhd;
		HDOrHostReturnTy ret_chhd = chhd();

		HDOrDeviceFnPtr fp_hdd = hdd;
		HDOrDeviceReturnTy ret_hdd = hdd();
		HDOrDeviceFnPtr fp_chdd = chdd;
		HDOrDeviceReturnTy ret_chdd = chdd();

GlobalFnPtr fp_g = g;		GlobalFnPtr fp_g = g;
#if defined(__CUDA_ARCH__)		#if defined(__CUDA_ARCH__)
// expected-error@-2 {{reference to __global__ function 'g' in __host__ __device__ function}}		// expected-error@-2 {{reference to __global__ function 'g' in __host__ __device__ function}}
#endif		#endif
g();		g();
g<<<0,0>>>();		g<<<0,0>>>();
#if !defined(__CUDA_ARCH__)		#if !defined(__CUDA_ARCH__)
// expected-error@-3 {{call to global function g not configured}}		// expected-error@-3 {{call to global function g not configured}}
(24 lines not shown; enclosing context: struct d_dh {)
__host__ ~d_dh() {}		__host__ ~d_dh() {}
};		};

// HD is OK		// HD is OK
struct d_hd {		struct d_hd {
__host__ __device__ ~d_hd() {}		__host__ __device__ ~d_hd() {}
};		};

// Mixing H/D and HD is not allowed.		// Mixing H/D and HD is OK.
struct d_dhhd {		struct d_dhhd {
__device__ ~d_dhhd() {}		__device__ ~d_dhhd() {}
__host__ ~d_dhhd() {} // expected-note {{previous declaration is here}}		__host__ ~d_dhhd() {}
__host__ __device__ ~d_dhhd() {} // expected-error {{destructor cannot be redeclared}}		__host__ __device__ ~d_dhhd() {}
};		};

struct d_hhd {		struct d_hhd {
__host__ ~d_hhd() {} // expected-note {{previous declaration is here}}		__host__ ~d_hhd() {}
__host__ __device__ ~d_hhd() {} // expected-error {{destructor cannot be redeclared}}		__host__ __device__ ~d_hhd() {}
};		};

struct d_hdh {		struct d_hdh {
__host__ __device__ ~d_hdh() {} // expected-note {{previous declaration is here}}		__host__ __device__ ~d_hdh() {}
__host__ ~d_hdh() {} // expected-error {{destructor cannot be redeclared}}		__host__ ~d_hdh() {}
};		};

struct d_dhd {		struct d_dhd {
__device__ ~d_dhd() {} // expected-note {{previous declaration is here}}		__device__ ~d_dhd() {}
__host__ __device__ ~d_dhd() {} // expected-error {{destructor cannot be redeclared}}		__host__ __device__ ~d_dhd() {}
};		};

struct d_hdd {		struct d_hdd {
__host__ __device__ ~d_hdd() {} // expected-note {{previous declaration is here}}		__host__ __device__ ~d_hdd() {}
__device__ ~d_hdd() {} // expected-error {{destructor cannot be redeclared}}		__device__ ~d_hdd() {}
};		};

// Test overloading of member functions		// Test overloading of member functions
struct m_h {		struct m_h {
void operator delete(void *ptr); // expected-note {{previous declaration is here}}		void operator delete(void *ptr); // expected-note {{previous declaration is here}}
__host__ void operator delete(void *ptr); // expected-error {{class member cannot be redeclared}}		__host__ void operator delete(void *ptr); // expected-error {{class member cannot be redeclared}}
};		};

// D/H overloading is OK		// D/H overloading is OK
struct m_dh {		struct m_dh {
__device__ void operator delete(void *ptr);		__device__ void operator delete(void *ptr);
__host__ void operator delete(void *ptr);		__host__ void operator delete(void *ptr);
};		};

// HD by itself is OK		// HD by itself is OK
struct m_hd {		struct m_hd {
__device__ __host__ void operator delete(void *ptr);		__device__ __host__ void operator delete(void *ptr);
};		};

struct m_hhd {		struct m_hhd {
__host__ void operator delete(void *ptr) {} // expected-note {{previous declaration is here}}		__host__ void operator delete(void *ptr) {}
__host__ __device__ void operator delete(void *ptr) {} // expected-error {{class member cannot be redeclared}}		__host__ __device__ void operator delete(void *ptr) {}
};		};

struct m_hdh {		struct m_hdh {
__host__ __device__ void operator delete(void *ptr) {} // expected-note {{previous declaration is here}}		__host__ __device__ void operator delete(void *ptr) {}
__host__ void operator delete(void *ptr) {} // expected-error {{class member cannot be redeclared}}		__host__ void operator delete(void *ptr) {}
};		};

struct m_dhd {		struct m_dhd {
__device__ void operator delete(void *ptr) {} // expected-note {{previous declaration is here}}		__device__ void operator delete(void *ptr) {}
__host__ __device__ void operator delete(void *ptr) {} // expected-error {{class member cannot be redeclared}}		__host__ __device__ void operator delete(void *ptr) {}
};		};

struct m_hdd {		struct m_hdd {
__host__ __device__ void operator delete(void *ptr) {} // expected-note {{previous declaration is here}}		__host__ __device__ void operator delete(void *ptr) {}
__device__ void operator delete(void *ptr) {} // expected-error {{class member cannot be redeclared}}		__device__ void operator delete(void *ptr) {}
};		};

// __global__ functions can't be overloaded based on attribute		// __global__ functions can't be overloaded based on attribute
// difference.		// difference.
struct G {		struct G {
friend void friend_of_g(G &arg);		friend void friend_of_g(G &arg);
private:		private:
int x;		int x;
(88 lines not shown)

test/SemaCUDA/host-device-constexpr.cu

This file was added.

				// RUN: %clang_cc1 -std=c++11 -fsyntax-only -verify %s
				// RUN: %clang_cc1 -std=c++11 -fsyntax-only -verify %s -fcuda-is-device

				#include "Inputs/cuda.h"

				// Opaque types used to determine which overload we're invoking.
				struct HostReturnTy {};
				struct DeviceReturnTy {};
				struct HostDeviceReturnTy {};

				// These shouldn't become host+device because they already have attributes.
				__host__ constexpr int HostOnly() { return 0; }
				// expected-note@-1 0+ {{not viable}}
				__device__ constexpr int DeviceOnly() { return 0; }
				// expected-note@-1 0+ {{not viable}}

				__host__ HostReturnTy Overloaded1();
				constexpr HostDeviceReturnTy Overloaded1() { return HostDeviceReturnTy(); }

				__device__ DeviceReturnTy Overloaded2();
				constexpr HostDeviceReturnTy Overloaded2() { return HostDeviceReturnTy(); }

				__host__ void HostFn() {
				HostOnly();
				DeviceOnly(); // expected-error {{no matching function}}
				HostReturnTy x = Overloaded1();
				HostDeviceReturnTy y = Overloaded2();
				}

				__device__ void DeviceFn() {
				HostOnly(); // expected-error {{no matching function}}
				DeviceOnly();
				tra: "should prevent this"
				HostDeviceReturnTy x = Overloaded1();
				DeviceReturnTy y = Overloaded2();
				}

				__host__ __device__ void HostDeviceFn() {
				#ifdef __CUDA_ARCH__
				constexpr HostDeviceReturnTy x = Overloaded1();
				DeviceReturnTy y = Overloaded2();
				#else
				HostReturnTy x = Overloaded1();
				constexpr HostDeviceReturnTy y = Overloaded2();
				#endif
				}

				// Check that a constexpr function can overload a __device__ function, and
				// that, in particular, we don't get errors if one of them is static and the
				// other isn't.
				static __device__ void f1();
				constexpr void f1();

				__device__ void f2();
				static constexpr void f2();

				// Different potential error depending on the order of declaration.
				constexpr void f3();
				static __device__ void f3();

				static constexpr void f4();
				__device__ void f4();

				// Variadic device functions are not allowed, so this is just treated as
				// host-only.
				constexpr void variadic(const char*, ...);
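The test above relies on opaque return types to reveal which overload the compiler selected. The same idea can be sketched in plain host-only C++, with no CUDA attributes involved. All names here (`IntOverloadTy`, `DoubleOverloadTy`, `pick`, `picks_int_overload`) are hypothetical and do not come from the patch; this only illustrates the technique and is not how the patch itself is tested.

```cpp
#include <type_traits>
#include <utility>

// Opaque tag types: each overload returns a distinct type, so the static
// type of a call expression identifies the overload that was selected.
struct IntOverloadTy {};
struct DoubleOverloadTy {};

// Declarations only. The overloads are used solely in unevaluated contexts
// (decltype), so no definitions are required.
IntOverloadTy pick(int);
DoubleOverloadTy pick(double);

// Compile-time checks: a mismatch is a hard error, much like the
// expected-error annotations in a -verify test.
static_assert(std::is_same<decltype(pick(0)), IntOverloadTy>::value,
              "an int argument should select pick(int)");
static_assert(std::is_same<decltype(pick(0.0)), DoubleOverloadTy>::value,
              "a double argument should select pick(double)");

// Helper reporting whether an argument of type Arg resolves to pick(int).
template <typename Arg>
constexpr bool picks_int_overload() {
  return std::is_same<decltype(pick(std::declval<Arg>())),
                      IntOverloadTy>::value;
}
```

In the CUDA test the overloads differ by `__host__`/`__device__` attribute rather than by parameter type, so assigning the call result to a variable of the expected return type plays the role of the `static_assert` here.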

test/SemaCUDA/no-host-device-constexpr.cu

This file was added.

				// RUN: %clang_cc1 -std=c++11 -fsyntax-only -fno-cuda-host-device-constexpr -verify %s
				// RUN: %clang_cc1 -std=c++11 -fsyntax-only -fno-cuda-host-device-constexpr -fcuda-is-device -verify %s

				#include "Inputs/cuda.h"

				// Check that, with -fno-cuda-host-device-constexpr, constexpr functions are
				// host-only, and __device__ constexpr functions are still device-only.

				constexpr int f() { return 0; } // expected-note {{not viable}}
				__device__ constexpr int g() { return 0; } // expected-note {{not viable}}

				void __device__ foo() {
				f(); // expected-error {{no matching function}}
				g();
				}

				void __host__ foo() {
				f();
				g(); // expected-error {{no matching function}}
				}

Revision Contents

Diff 51495
