This is an archive of the discontinued LLVM Phabricator instance.

Differential D61458

[hip] Relax CUDA call restriction within `decltype` context.
Needs Review · Public

Authored by hliao on May 2 2019, 1:03 PM.

Details

Reviewers

tra
rjmccall
yaxunl
jlebar

Summary

Within decltype, expressions are only type-inspected. The restriction on CUDA calls should be relaxed.

Diff Detail

Repository

rG LLVM Github Monorepo

Build Status

Buildable 40830
Build 40964: arc lint + arc unit

Event Timeline

hliao created this revision. May 2 2019, 1:03 PM

Herald added a project: Restricted Project. May 2 2019, 1:03 PM

Herald added a subscriber: cfe-commits.

Harbormaster completed remote builds in B31306: Diff 197850. May 2 2019, 1:03 PM

Perhaps we should allow this in all unevaluated contexts?
I.e. int s = sizeof(foo(x)); should also work.
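
As background for this suggestion: in standard C++ the operands of `decltype` and `sizeof` are never executed, so a function that is only declared can still appear there. A minimal sketch in plain C++, with no CUDA attributes involved; the names `Big`, `make_big`, and `big_size` are hypothetical and exist only for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical type and function; make_big is declared but never defined.
struct Big { char bytes[64]; };
Big make_big(int);

// Neither operand is evaluated, so the missing definition is never needed.
static_assert(std::is_same<decltype(make_big(0)), Big>::value,
              "decltype only inspects the type of the call");
static_assert(sizeof(make_big(0)) == sizeof(Big),
              "sizeof only inspects the type of the call");

// Returns the size computed from the unevaluated call expression.
inline std::size_t big_size() { return sizeof(make_big(0)); }
```

Because `make_big` is never odr-used, the program links without a definition, which is the property the relaxed check would rely on.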

clang/include/clang/Sema/Sema.h
10972	I think you want `return llvm::any_of(ExprEvalContexts, ...)` here and you can fold it directly into `if()` below.

In D61458#1488523, @tra wrote:

Perhaps we should allow this in all unevaluated contexts?
I.e. int s = sizeof(foo(x)); should also work.

Good point. Do we have a dedicated context for sizeof? That would make the checking easier.

clang/include/clang/Sema/Sema.h
10972	yeah, that's much simpler, I will make the change.

simplify the logic using llvm::any_of.
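
The difference between checking only the innermost evaluation context and checking every enclosing one can be sketched in plain C++. `llvm::any_of` is a range-based wrapper around `std::any_of`, so this sketch uses the standard algorithm directly; `ContextRecord`, `ExprKind`, and both helper functions are hypothetical stand-ins, not Sema's real types:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-ins for Sema's expression evaluation context records.
enum class ExprKind { Other, Decltype };
struct ContextRecord { ExprKind Kind; };

// True if any enclosing context on the stack is a decltype operand.
inline bool underDecltype(const std::vector<ContextRecord> &Stack) {
  return std::any_of(Stack.begin(), Stack.end(),
                     [](const ContextRecord &C) {
                       return C.Kind == ExprKind::Decltype;
                     });
}

// True only if the innermost context is a decltype operand.
inline bool innermostIsDecltype(const std::vector<ContextRecord> &Stack) {
  return !Stack.empty() && Stack.back().Kind == ExprKind::Decltype;
}
```

The two helpers disagree when a `decltype` operand contains a nested context of a different kind, which is presumably why the patch scans the whole stack.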

Harbormaster completed remote builds in B31311: Diff 197860. May 2 2019, 1:57 PM

In D61458#1488550, @hliao wrote:

In D61458#1488523, @tra wrote:

Perhaps we should allow this in all unevaluated contexts?
I.e. int s = sizeof(foo(x)); should also work.

Good point. Do we have a dedicated context for sizeof? That would make the checking easier.

Sema::isUnevaluatedContext() may be able to do the job.

tra added inline comments. May 2 2019, 2:27 PM

clang/include/clang/Sema/Sema.h
10969–10975	One more thing. The idea of this function is that we're checking if the `Caller` is allowed to call the `Callee`. However here, you're checking the current context, which may not necessarily be the same as the caller's. I.e. someone could potentially call it way after the context is gone. Currently all uses of this function obtain the caller from `CurContext`, but if we start relying on other properties of the current context other than the caller function, then we may need to pass the context explicitly, or only pass the Callee and check if it's callable from the current context.

Here's one for you:

__host__ float bar();
__device__ int bar();
__host__ __device__ auto foo() -> decltype(bar()) {}

What is the return type of foo? :)

I don't believe the right answer is, "float when compiling for host, int when compiling for device."

I'd be happy if we said this was an error, so long as it's well-defined what exactly we're disallowing. But I bet @rsmith can come up with substantially more evil testcases than this.
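
The "float on host, int on device" outcome can be modelled in plain C++ by turning the target into an explicit tag. This is only an analogy: `HostSide` and `DeviceSide` are hypothetical tags standing in for the `__host__` and `__device__` attributes, and `foo` is a template here so that one source can be resolved both ways.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical tags standing in for the __host__ and __device__ attributes.
struct HostSide {};
struct DeviceSide {};

float bar(HostSide) { return 1.5f; }
int bar(DeviceSide) { return 2; }

// The trailing return type follows whichever overload resolution selects.
template <typename Side>
auto foo() -> decltype(bar(Side{})) {
  return bar(Side{});
}

static_assert(std::is_same<decltype(foo<HostSide>()), float>::value,
              "host-side resolution yields float");
static_assert(std::is_same<decltype(foo<DeviceSide>()), int>::value,
              "device-side resolution yields int");
```

In the sketch the two instantiations are distinct functions with distinct types; in the CUDA example a single declaration of `foo` would have to carry both.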

In D61458#1488970, @jlebar wrote:
Here's one for you:
__host__ float bar();
__device__ int bar();
__host__ __device__ auto foo() -> decltype(bar()) {}
What is the return type of foo? :)

I don't believe the right answer is, "float when compiling for host, int when compiling for device."

So, actually, I wonder if that's not the right answer. We generally allow different overloads to have different return types. What if, for example, the return type on the host is __float128 and on the device it's MyLongFloatTy?

I'd be happy if we said this was an error, so long as it's well-defined what exactly we're disallowing. But I bet @rsmith can come up with substantially more evil testcases than this.

In D61458#1488972, @hfinkel wrote:
In D61458#1488970, @jlebar wrote:
Here's one for you:
__host__ float bar();
__device__ int bar();
__host__ __device__ auto foo() -> decltype(bar()) {}
What is the return type of foo? :)

I don't believe the right answer is, "float when compiling for host, int when compiling for device."
So, actually, I wonder if that's not the right answer. We generally allow different overloads to have different return types.

Only if they also differ in some other way. C++ does not (generally) have return-type-based overloading. The two functions described would even mangle the same way if CUDA didn't include host/device in the mangling.

(Function templates can differ only by return type, but if both return types successfully instantiate for a given set of (possibly inferred) template arguments then the templates can only be distinguished when taking their address, not when calling.)

I think I've said before that adding this kind of overloading is not a good idea, but since it's apparently already there, you should consult the specification (or at least existing practice) to figure out what you're supposed to do.
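
The parenthetical about function templates can be illustrated with a small standard C++ sketch. The names `pick`, `call_int_version`, and `call_long_version` are hypothetical. A direct call `pick(0)` would be ambiguous, so the sketch selects a specialization only by converting the name to a specific function pointer type:

```cpp
#include <cassert>

// Two function templates whose signatures differ only in return type.
template <typename T> int pick(T) { return 1; }
template <typename T> long pick(T) { return 2; }

// A direct call such as pick(0) would be ambiguous, so it is left out.
// Converting to a specific function pointer type selects one template.
inline int call_int_version() {
  int (*fn)(int) = pick;
  return fn(0);
}

inline long call_long_version() {
  long (*fn)(int) = pick;
  return fn(0);
}
```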

Only if they also differ in some other way. C++ does not (generally) have return-type-based overloading. The two functions described would even mangle the same way if CUDA didn't include host/device in the mangling.

Certainly. I didn't mean to imply otherwise.

In D61458#1488970, @jlebar wrote:
Here's one for you:
__host__ float bar();
__device__ int bar();
__host__ __device__ auto foo() -> decltype(bar()) {}
What is the return type of foo? :)

I don't believe the right answer is, "float when compiling for host, int when compiling for device."

I'd be happy if we said this was an error, so long as it's well-defined what exactly we're disallowing. But I bet @rsmith can come up with substantially more evil testcases than this.

This patch is intended to allow functions and function templates from the standard library to be used with device functions. By allowing different-side candidates in a context that only inspects types, we introduce a new issue: there are extra candidates beyond those the regular C++ overload resolution rules would consider. We need an extra policy that uses the CUDA attributes to decide which candidate is best. For the case you proposed, we might choose an overload candidate in the following order, e.g.

SAME-SIDE (with the same CUDA attribute)
NATIVE (without any CUDA attribute)
WRONG-SIDE (with the opposite CUDA attribute)

or just

SAME-SIDE
NATIVE

Is that a reasonable change?
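
The proposed ordering can be sketched as a ranking function. This is a simplified model under stated assumptions, not clang's `IdentifyCUDAPreference`: `Target`, `Rank`, `rankCallee`, and `pickCandidate` are hypothetical names, and ties between equally ranked candidates are resolved by keeping the first one.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model: where a function may run, and how well a callee fits.
enum class Target { Host, Device, None };  // None: no CUDA attribute
enum class Rank { WrongSide = 0, Native = 1, SameSide = 2 };

inline Rank rankCallee(Target Caller, Target Callee) {
  if (Callee == Target::None) return Rank::Native;
  return Callee == Caller ? Rank::SameSide : Rank::WrongSide;
}

// Returns the index of the highest-ranked candidate, or -1 if none.
// With AllowWrongSide == false this models the shorter two-level order.
inline int pickCandidate(Target Caller, const std::vector<Target> &Callees,
                         bool AllowWrongSide) {
  int Best = -1;
  Rank BestRank = Rank::WrongSide;
  for (int I = 0; I < static_cast<int>(Callees.size()); ++I) {
    Rank R = rankCallee(Caller, Callees[I]);
    if (R == Rank::WrongSide && !AllowWrongSide) continue;
    if (Best < 0 || R > BestRank) { Best = I; BestRank = R; }
  }
  return Best;
}
```

Under the three-level order a wrong-side candidate is chosen only when nothing better exists; under the two-level order it is never chosen.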

In D61458#1488981, @rjmccall wrote:
In D61458#1488972, @hfinkel wrote:
In D61458#1488970, @jlebar wrote:
Here's one for you:
__host__ float bar();
__device__ int bar();
__host__ __device__ auto foo() -> decltype(bar()) {}
What is the return type of foo? :)

I don't believe the right answer is, "float when compiling for host, int when compiling for device."
So, actually, I wonder if that's not the right answer. We generally allow different overloads to have different return types.
Only if they also differ in some other way. C++ does not (generally) have return-type-based overloading. The two functions described would even mangle the same way if CUDA didn't include host/device in the mangling.

(Function templates can differ only by return type, but if both return types successfully instantiate for a given set of (possibly inferred) template arguments then the templates can only be distinguished when taking their address, not when calling.)

I think I've said before that adding this kind of overloading is not a good idea, but since it's apparently already there, you should consult the specification (or at least existing practice) to figure out what you're supposed to do.

BTW, I just checked similar code with nvcc. With a single candidate, it accepts the following code:

float bar(); // Adding `__host__` or `__device__` to this line is also accepted.
__host__ __device__ auto foo() -> decltype(bar()) {}

However, if there is more than one candidate and they differ in return type (with or without a difference in CUDA attributes), it raises the error

foo.cu(4): error: cannot overload functions distinguished by return type alone

it seems to me that that's also an acceptable policy to handle the issue after we allow different-side candidates in type-only context.

In D61458#1488970, @jlebar wrote:
Here's one for you:
__host__ float bar();
__device__ int bar();
__host__ __device__ auto foo() -> decltype(bar()) {}
What is the return type of foo? :)

I don't believe the right answer is, "float when compiling for host, int when compiling for device."

I'd be happy if we said this was an error, so long as it's well-defined what exactly we're disallowing. But I bet @rsmith can come up with substantially more evil testcases than this.

With nvcc from CUDA 10, that's not acceptable, as we are declaring two functions that differ only in return type. It seems CUDA attributes do not contribute to the function signature. clang is quite different here.

hliao marked an inline comment as done. May 3 2019, 5:50 AM

hliao added inline comments.

clang/include/clang/Sema/Sema.h
10969–10975	as the expression within `decltype` may be quite complicated, the idea here is to relax that rule within `decltype` context, not only for a particular pair of caller/callee.

With [nvcc] from CUDA 10, that's not acceptable, as we are declaring two functions that differ only in return type. It seems CUDA attributes do not contribute to the function signature. clang is quite different here.

Yes, this is an intentional and more relaxed semantics in clang. It's also sort of the linchpin of our mixed-mode compilation strategy, which is very different from nvcc's source-to-source splitting strategy.

Back in the day you could trick nvcc into allowing host/device overloading on same-signature functions by slapping a template on one or both of them. Checking just now it seems they fixed this, but I suspect there are still dark corners where nvcc relies on effectively the same behavior as we get in clang via true overloading.

tra added inline comments. May 3 2019, 9:23 AM

clang/include/clang/Sema/Sema.h
10969–10975	I understand the idea, but in this case the argument was more about the code style. Currently the contract is that the function's decision is derived from its arguments (and could, perhaps, be a static method). With this patch you start relying on the context, but it's not obvious from the function signature. Replacing Caller with context, or removing the caller altogether would bring the function signature closer to what the function does.

This patch is revived with more changes addressing the previous concerns.

Back to Justin's example:

__host__ float bar();
__device__ int bar();
__host__ __device__ auto foo() -> decltype(bar()) { return bar(); }

Even without this patch, that example already compiles without errors or warnings, e.g. with

clang -std=c++11 -x cuda -nocudainc -nocudalib --cuda-gpu-arch=sm_60 --cuda-device-only -S -emit-llvm -O3 foo.cu

In C++14, that example can be simplified further by dropping decltype, and it has the same ambiguity.

__host__ float bar();
__device__ int bar();
__host__ __device__ auto foo() { return bar(); }

Without any change, clang compiles this code as well and uses different return types for host-side and device-side compilation.[^1]

[^1]: The first example has the same return type between host-side and device-side but that seems incorrect or unreasonable to me.

The ambiguity issue is in fact not introduced by relaxing decltype. That's an inherent one as we allow overloading over target attributes. Issuing warnings instead of errors seems more reasonable to me for such cases.

Besides relaxing the CUDA call rule under decltype, this patch also generates a warning during overload resolution if there is more than one candidate and the candidates have different return types.
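
The warning condition described here can be sketched independently of clang. In this simplified model, return types are plain strings, and `Candidate` and `differingReturnTypes` are hypothetical names; the real patch compares `QualType` values inside `OverloadCandidateSet::BestViableFunction`.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified view of an overload candidate.
struct Candidate {
  std::string ReturnType;
  bool Viable;
};

// Collects viable candidates whose return type differs from the best one.
// A non-empty result corresponds to the case the patch warns about.
inline std::vector<const Candidate *>
differingReturnTypes(const Candidate &Best,
                     const std::vector<Candidate> &All) {
  std::vector<const Candidate *> Out;
  for (const Candidate &C : All)
    if (C.Viable && C.ReturnType != Best.ReturnType)
      Out.push_back(&C);
  return Out;
}
```

Candidates that are not viable are skipped, so only overloads that could actually have been chosen contribute to the warning.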

hliao marked an inline comment as done. Nov 12 2019, 11:26 AM

Harbormaster completed remote builds in B40830: Diff 228924. Nov 12 2019, 11:32 AM

Revision Contents

Path                                              Size
clang/include/clang/Basic/DiagnosticSemaKinds.td  6 lines
clang/include/clang/Sema/Sema.h                   22 lines
clang/lib/Sema/SemaOverload.cpp                   28 lines
clang/test/CodeGenCUDA/function-overload.cu       13 lines
clang/test/Misc/warning-flags.c                   3 lines
clang/test/SemaCUDA/function-overload.cu          29 lines

Diff 228924

clang/include/clang/Basic/DiagnosticSemaKinds.td

[... 7,490 lines not shown ...] def err_cuda_host_shared : Error<
   "__shared__ local variables not allowed in "
   "%select{__device__|__global__|__host__|__host__ __device__}0 functions">;
 def err_cuda_nonglobal_constant : Error<"__constant__ variables must be global">;
 def err_cuda_ovl_target : Error<
   "%select{__device__|__global__|__host__|__host__ __device__}0 function %1 "
   "cannot overload %select{__device__|__global__|__host__|__host__ __device__}2 function %3">;
 def note_cuda_ovl_candidate_target_mismatch : Note<
   "candidate template ignored: target attributes do not match">;
+def warn_decltype_ambiguous_return_type : Warning<
+  "return type of %0 in 'decltype' is ambiguous and may not be expected">;
+def note_decltype_ambiguous_function_chosen : Note<
+  "use this definition of %0">;
+def note_decltype_ambiguous_function_other : Note<
+  "other definition of %0">;
 
 def warn_non_pod_vararg_with_format_string : Warning<
   "cannot pass %select{non-POD|non-trivial}0 object of type %1 to variadic "
   "%select{function|block|method|constructor}2; expected type from format "
   "string was %3">, InGroup<NonPODVarargs>, DefaultError;
 // The arguments to this diagnostic should match the warning above.
 def err_cannot_pass_objc_interface_to_vararg_format : Error<
   "cannot pass object with interface type %1 by value to variadic "
[... 2,568 lines not shown ...]

clang/include/clang/Sema/Sema.h

[... 8,096 lines not shown ...]
   /// Determines whether we are currently in a context that
   /// is not evaluated as per C++ [expr] p5.
   bool isUnevaluatedContext() const {
     assert(!ExprEvalContexts.empty() &&
            "Must be in an expression evaluation context");
     return ExprEvalContexts.back().isUnevaluated();
   }
 
+  bool underDecltypeContext() const {
+    return llvm::any_of(ExprEvalContexts,
+                        [](const ExpressionEvaluationContextRecord &C) {
+                          return C.ExprContext ==
+                                 ExpressionEvaluationContextRecord::EK_Decltype;
+                        });
+  }
+
   /// RAII class used to determine whether SFINAE has
   /// trapped any errors that occur during template argument
   /// deduction.
   class SFINAETrap {
     Sema &SemaRef;
     unsigned PrevSFINAEErrors;
     bool PrevInNonInstantiationSFINAEContext;
     bool PrevAccessCheckingSFINAE;
[... 2,835 lines not shown ...] public:
   /// \param Caller function which needs address of \p Callee.
   /// nullptr in case of global context.
   /// \param Callee target function
   ///
   /// \returns preference value for particular Caller/Callee combination.
   CUDAFunctionPreference IdentifyCUDAPreference(const FunctionDecl *Caller,
                                                 const FunctionDecl *Callee);
 
-  /// Determines whether Caller may invoke Callee, based on their CUDA
-  /// host/device attributes. Returns false if the call is not allowed.
+  /// Determines, under the current context, whether Callee may be invokable,
+  /// based on their CUDA host/device attributes. Returns false if the call is
+  /// not allowed.
   ///
   /// Note: Will return true for CFP_WrongSide calls. These may appear in
   /// semantically correct CUDA programs, but only if they're never codegen'ed.
-  bool IsAllowedCUDACall(const FunctionDecl *Caller,
-                         const FunctionDecl *Callee) {
-    return IdentifyCUDAPreference(Caller, Callee) != CFP_Never;
+  bool isCUDACallAllowed(const FunctionDecl *Callee) {
+    // Under `decltype`, the rule is relaxed.
+    if (underDecltypeContext())
+      return true;
+    return IdentifyCUDAPreference(dyn_cast<FunctionDecl>(CurContext), Callee) !=
+           CFP_Never;
   }
 
   /// May add implicit CUDAHostAttr and CUDADeviceAttr attributes to FD,
   /// depending on FD and the current compilation settings.
   void maybeAddCUDAHostDeviceAttrs(FunctionDecl *FD,
                                    const LookupResult &Previous);
 
 public:
[... 742 lines not shown ...]

clang/lib/Sema/SemaOverload.cpp

[... 6,231 lines not shown ...] void Sema::AddOverloadCandidate(
 
   // (CUDA B.1): Check for invalid calls between targets.
   if (getLangOpts().CUDA)
     if (const FunctionDecl *Caller = dyn_cast<FunctionDecl>(CurContext))
       // Skip the check for callers that are implicit members, because in this
       // case we may not yet know what the member's target is; the target is
       // inferred for the member automatically, based on the bases and fields of
       // the class.
-      if (!Caller->isImplicit() && !IsAllowedCUDACall(Caller, Function)) {
+      if (!Caller->isImplicit() && !isCUDACallAllowed(Function)) {
         Candidate.Viable = false;
         Candidate.FailureKind = ovl_fail_bad_target;
         return;
       }
 
   // Determine the implicit conversion sequences for each of the
   // arguments.
   for (unsigned ArgIdx = 0; ArgIdx < Args.size(); ++ArgIdx) {
[... 499 lines not shown ...] if (Candidate.Conversions[ConvIdx].isBad()) {
       Candidate.FailureKind = ovl_fail_bad_conversion;
       return;
     }
   }
 
   // (CUDA B.1): Check for invalid calls between targets.
   if (getLangOpts().CUDA)
     if (const FunctionDecl *Caller = dyn_cast<FunctionDecl>(CurContext))
-      if (!IsAllowedCUDACall(Caller, Method)) {
+      if (!isCUDACallAllowed(Method)) {
         Candidate.Viable = false;
         Candidate.FailureKind = ovl_fail_bad_target;
         return;
       }
 
   // Determine the implicit conversion sequences for each of the
   // arguments.
   for (unsigned ArgIdx = 0; ArgIdx < Args.size(); ++ArgIdx) {
[... 2,903 lines not shown ...] OverloadCandidateSet::BestViableFunction(Sema &S, SourceLocation Loc,
   // If we found more than one best candidate, this is ambiguous.
   if (Best == end())
     return OR_Ambiguous;
 
   // Best is the best viable function.
   if (Best->Function && Best->Function->isDeleted())
     return OR_Deleted;
 
+  // Issue a warning of return type resolution under `decltype`.
+  if (S.getLangOpts().CUDA && Best->Function && S.underDecltypeContext()) {
+    SmallVector<const OverloadCandidate *, 16> AmbiSet;
+    QualType BestReturnType = Best->Function->getReturnType();
+    for (auto &Cand : this->Candidates) {
+      if (!Cand.Viable || !Cand.Function)
+        continue;
+      if (BestReturnType != Cand.Function->getReturnType())
+        AmbiSet.push_back(&Cand);
+    }
+    if (!AmbiSet.empty()) {
+      S.Diag(Loc, diag::warn_decltype_ambiguous_return_type) << Best->Function;
+      S.Diag(Best->Function->getLocation(),
+             diag::note_decltype_ambiguous_function_chosen)
+          << Best->Function;
+      for (auto C : AmbiSet)
+        S.Diag(C->Function->getLocation(),
+               diag::note_decltype_ambiguous_function_other)
+            << C->Function;
+    }
+  }
+
   if (!EquivalentCands.empty())
     S.diagnoseEquivalentInternalLinkageDeclarations(Loc, Best->Function,
                                                     EquivalentCands);
 
   return OR_Success;
 }
 
 namespace {
[... 1,802 lines not shown ...] if (CXXMethodDecl *Method = dyn_cast<CXXMethodDecl>(Fn)) {
         return false;
     }
     else if (TargetTypeIsNonStaticMemberFunction)
       return false;
 
     if (FunctionDecl *FunDecl = dyn_cast<FunctionDecl>(Fn)) {
       if (S.getLangOpts().CUDA)
         if (FunctionDecl *Caller = dyn_cast<FunctionDecl>(S.CurContext))
-          if (!Caller->isImplicit() && !S.IsAllowedCUDACall(Caller, FunDecl))
+          if (!Caller->isImplicit() && !S.isCUDACallAllowed(FunDecl))
             return false;
       if (FunDecl->isMultiVersion()) {
         const auto *TA = FunDecl->getAttr<TargetAttr>();
         if (TA && !TA->isDefaultVersion())
           return false;
       }
 
       // If any candidate has a placeholder return type, trigger its deduction
[... 2,814 lines not shown ...]

clang/test/CodeGenCUDA/function-overload.cu

 // REQUIRES: x86-registered-target
 // REQUIRES: nvptx-registered-target
 
 // Make sure we handle target overloads correctly. Most of this is checked in
 // sema, but special functions like constructors and destructors are here.
 //
 // RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -emit-llvm -o - %s \
 // RUN: | FileCheck -check-prefix=CHECK-BOTH -check-prefix=CHECK-HOST %s
 // RUN: %clang_cc1 -triple nvptx64-nvidia-cuda -fcuda-is-device -emit-llvm -o - %s \
 // RUN: | FileCheck -check-prefix=CHECK-BOTH -check-prefix=CHECK-DEVICE %s
+// RUN: %clang_cc1 -std=c++11 -DCHECK_DECLTYPE -triple amdgcn -fcuda-is-device -emit-llvm -o - %s \
+// RUN: | FileCheck -check-prefix=CHECK-DECLTYPE %s
 
 #include "Inputs/cuda.h"
 
 // Check constructors/destructors for D/H functions
 int x;
 struct s_cd_dh {
   __host__ s_cd_dh() { x = 11; }
   __device__ s_cd_dh() { x = 12; }
[... 29 lines not shown ...]
 
 // CHECK-BOTH: define linkonce_odr void @_ZN7s_cd_hdC2Ev(
 // CHECK-BOTH: store i32 31,
 // CHECK-BOTH: ret void
 
 // CHECK-BOTH: define linkonce_odr void @_ZN7s_cd_hdD2Ev(
 // CHECK-BOTH: store i32 32,
 // CHECK-BOTH: ret void
+
+#if defined(CHECK_DECLTYPE)
+int foo(float);
+// CHECK-DECLTYPE-LABEL: @_Z3barf
+// CHECK-DECLTYPE: fptosi
+// CHECK-DECLTYPE: sitofp
+__device__ float bar(float x) {
+  decltype(foo(x)) y = x;
+  return y + 3.f;
+}
+#endif
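
What the new CHECK-DECLTYPE lines look for can be reproduced in a host-only analogue. This sketch is plain C++ with no CUDA attributes. The function names mirror the test, but the `static_assert` is an addition for illustration only. Since `decltype(foo(x))` is `int`, the value of `x` goes through a float-to-int and then an int-to-float conversion.

```cpp
#include <cassert>
#include <type_traits>

int foo(float);  // declared only; decltype never calls it

// Mirrors the test body: y has type int, so x is truncated before the add.
inline float bar(float x) {
  decltype(foo(x)) y = x;  // float -> int conversion (fptosi in the IR)
  return y + 3.f;          // int -> float conversion (sitofp in the IR)
}

static_assert(std::is_same<decltype(foo(1.f)), int>::value,
              "the call expression has type int");
```

For example, `bar(2.5f)` truncates `2.5f` to `2` and returns `5.0f`.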

clang/test/Misc/warning-flags.c

[... 12 lines not shown ...] (2) It prevents us adding new warnings to Clang that have no -W flag. All
 new warnings should have -W flags.
 
 If you add a new warning without a flag, this test will fail. To fix
 this test, simply add a warning group to that warning.
 
 
 The list of warnings below should NEVER grow. It should gradually shrink to 0.
 
-CHECK: Warnings without flags (74):
+CHECK: Warnings without flags (75):
 CHECK-NEXT: ext_excess_initializers
 CHECK-NEXT: ext_excess_initializers_in_char_array_initializer
 CHECK-NEXT: ext_expected_semi_decl_list
 CHECK-NEXT: ext_explicit_specialization_storage_class
 CHECK-NEXT: ext_initializer_string_for_char_array_too_long
 CHECK-NEXT: ext_missing_declspec
 CHECK-NEXT: ext_missing_whitespace_after_macro_name
 CHECK-NEXT: ext_new_paren_array_nonconst
[... 12 lines not shown ...]
 CHECK-NEXT: warn_call_wrong_number_of_arguments
 CHECK-NEXT: warn_case_empty_range
 CHECK-NEXT: warn_char_constant_too_large
 CHECK-NEXT: warn_collection_expr_type
 CHECK-NEXT: warn_conflicting_variadic
 CHECK-NEXT: warn_conv_to_base_not_used
 CHECK-NEXT: warn_conv_to_self_not_used
 CHECK-NEXT: warn_conv_to_void_not_used
+CHECK-NEXT: warn_decltype_ambiguous_return_type
 CHECK-NEXT: warn_delete_array_type
 CHECK-NEXT: warn_double_const_requires_fp64
 CHECK-NEXT: warn_drv_assuming_mfloat_abi_is
 CHECK-NEXT: warn_drv_clang_unsupported
 CHECK-NEXT: warn_drv_pch_not_first_include
 CHECK-NEXT: warn_dup_category_def
 CHECK-NEXT: warn_enum_value_overflow
 CHECK-NEXT: warn_expected_qualified_after_typename
[... 42 lines not shown ...]

clang/test/SemaCUDA/function-overload.cu

 // REQUIRES: x86-registered-target
 // REQUIRES: nvptx-registered-target
 
 // RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -fsyntax-only -verify %s
 // RUN: %clang_cc1 -triple nvptx64-nvidia-cuda -fsyntax-only -fcuda-is-device -verify %s
+// RUN: %clang_cc1 -std=c++11 -DCHECK_DECLTYPE -triple x86_64-unknown-linux-gnu -fsyntax-only -verify %s
+// RUN: %clang_cc1 -std=c++11 -DCHECK_DECLTYPE -triple nvptx64-nvidia-cuda -fsyntax-only -fcuda-is-device -verify %s
 
 #include "Inputs/cuda.h"
 
 // Opaque return types used to check that we pick the right overloads.
 struct HostReturnTy {};
 struct HostReturnTy2 {};
 struct DeviceReturnTy {};
 struct DeviceReturnTy2 {};
[... 400 lines not shown ...]
 __host__ __device__ int constexpr_overload(const T &x, const T &y) {
   return x - y;
 }
 
 // Verify that function overloading doesn't prune candidate wrongly.
 int test_constexpr_overload(C2 &x, C2 &y) {
   return constexpr_overload(x, y);
 }
+
+#if defined(CHECK_DECLTYPE)
+#if defined(__CUDA_ARCH__)
+// expected-note@+6 {{other definition of 't0'}}
+// expected-note@+6 {{use this definition of 't0'}}
+#else
+// expected-note@+3 {{use this definition of 't0'}}
+// expected-note@+3 {{other definition of 't0'}}
+#endif
+__host__ float t0();
+__device__ int t0();
+
+__host__ __device__ void dt0() {
+  // expected-warning@+1 {{return type of 't0' in 'decltype' is ambiguous and may not be expected}}
+  decltype(t0()) ret;
+}
+
+__host__ float t1();
+
+__device__ void dt1() {
+  decltype(t1()) ret; // OK. `decltype` is relaxed.
+}
+
+__host__ __device__ void dt2() {
+  decltype(t1()) ret; // OK. `decltype` is relaxed.
+}
+#endif