This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contents

clang/include/clang/Sema/Sema.h (11/13)
clang/lib/Parse/ParseDecl.cpp (1/3)
clang/lib/Sema/SemaCUDA.cpp
clang/lib/Sema/SemaDeclCXX.cpp (1/1)
clang/lib/Sema/SemaExprCXX.cpp
clang/lib/Sema/SemaOverload.cpp
clang/test/SemaCUDA/function-overload.cu (1/1)
clang/test/SemaCUDA/global-initializers-host.cu

Differential D71227

[cuda][hip] Fix function overload resolution in the global initializer.
Needs Review · Public

Authored by hliao on Dec 9 2019, 1:37 PM.

Details

Reviewers

tra
jlebar
yaxunl
rsmith
rjmccall

Summary

As global initializers are not under any function body, overload resolution needs to look at the variable currently being initialized. The current CUDA/HIP overload resolution does not address this and skips target checking, which may result in the wrong candidate being considered, as illustrated in the newly added test case.
In this patch, a non-local variable stack is introduced to keep track of the non-local variable currently being initialized, so that the initialization function can be inspected for target preference.
Besides the newly added tests, existing tests are refined, because the current implementation adds extra checks on global initializers to ensure that no device functions are used. As target-match checking is enabled in this patch, such a check is only necessary for CUDA device global variables, which are not allowed to be non-trivially initialized. As HIP starts to support non-trivial initialization of device variables, such target-match checking must be enforced.
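
The stack described above can be sketched in isolation. This is a minimal toy model under stated assumptions, not the code from the patch: `Target`, `Var`, `InitTracker` and `InitScope` are hypothetical names, and the rule "device variable implies device caller" is a simplification of the target identification done in the real change.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins; none of these are Clang types.
enum class Target { Host, Device };

struct Var {
  const char *Name;
  bool IsDeviceVar; // models a __device__, __constant__ or __shared__ variable
};

// Records the non-local variable whose initializer is being processed.
class InitTracker {
  std::vector<const Var *> Stack;

public:
  void push(const Var *V) { Stack.push_back(V); }

  void pop(const Var *V) {
    assert(!Stack.empty() && Stack.back() == V && "unbalanced push/pop");
    (void)V; // silence unused-parameter warnings when asserts are disabled
    Stack.pop_back();
  }

  const Var *current() const { return Stack.empty() ? nullptr : Stack.back(); }

  // Outside any initializer, treat the caller as host code.
  Target callerTarget() const {
    const Var *V = current();
    return (V && V->IsDeviceVar) ? Target::Device : Target::Host;
  }
};

// RAII guard so that every push is paired with a pop, even on early return.
class InitScope {
  InitTracker &Tracker;
  const Var *Variable;

public:
  InitScope(InitTracker &T, const Var *V) : Tracker(T), Variable(V) {
    Tracker.push(Variable);
  }
  ~InitScope() { Tracker.pop(Variable); }
  InitScope(const InitScope &) = delete;
  InitScope &operator=(const InitScope &) = delete;
};
```

The guard is a design choice of this sketch only; the patch itself calls the push and pop functions explicitly around the initializer parsing.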

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

hliao created this revision. Dec 9 2019, 1:37 PM

Herald added a project: Restricted Project. Dec 9 2019, 1:37 PM

Herald added a subscriber: cfe-commits.

refine commit message

hliao edited the summary of this revision. Dec 9 2019, 1:39 PM

Harbormaster completed remote builds in B42157: Diff 232931. Dec 9 2019, 1:40 PM

Harbormaster completed remote builds in B42158: Diff 232933.

refine again

hliao edited the summary of this revision. Dec 9 2019, 1:42 PM

Harbormaster completed remote builds in B42160: Diff 232938. Dec 9 2019, 1:43 PM

File PR44266 to track that bug.

Looks good to me overall. I've pinged rsmith@ to double-check that we're covering all possibilities for non-local variable init.

clang/include/clang/Sema/Sema.h
11632	I'd add a comment describing that it's a wrapper which dispatches the call to one of the more specific variants above.
11655–11656	Comment needs updating.
11668	Nit: I'd add an empty line between declarations and the function. Jammed together they are hard to read.
11697–11699	if (const FunctionDecl *Caller = dyn_cast<FunctionDecl>(CurContext))
  if (Kind == SkipImplicitCaller && Caller->isImplicit())
    return true;
11736–11737	Now that we always use getCUDAContextDecl() as the first argument, perhaps we can just always retrieve the context inside the function.
clang/lib/Parse/ParseDecl.cpp
2345	@rsmith -- is this sufficient to catch all attempts to call an initializer for a global? I wonder if there are other sneaky ways to call an initializer.
clang/test/SemaCUDA/function-overload.cu
458	I'd add more details here. The problem is that here the overload set has both functions and the one with the integer argument wins, even though it's a device function which we can't execute. We do handle similar cases during overload resolution in other places where we would prefer a callable function over a non-callable function with a better signature match.
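
The preference described in the last comment, a callable function beating a non-callable one with a better signature match, can be illustrated with a toy model. This is only a sketch under assumptions: `Candidate`, `isCallable` and `pickBest` are hypothetical names, the conversion cost is a made-up integer, and the callable rule ignores the finer preference levels that Clang distinguishes.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class Target { Host, Device, HostDevice };

struct Candidate {
  std::string Name;
  int ConversionCost; // 0 = exact match; larger = worse signature match
  Target CalleeTarget;
};

// Toy rule: a callee is callable if it is host-device or on the caller's side.
inline bool isCallable(Target Caller, Target Callee) {
  assert(Caller != Target::HostDevice && "toy model: caller is host or device");
  return Callee == Target::HostDevice || Callee == Caller;
}

// Filters out non-callable candidates first, then picks the best signature
// match among the rest. Returns nullptr if nothing is callable.
inline const Candidate *pickBest(Target Caller,
                                 const std::vector<Candidate> &Set) {
  const Candidate *Best = nullptr;
  for (const Candidate &C : Set) {
    if (!isCallable(Caller, C.CalleeTarget))
      continue;
    if (!Best || C.ConversionCost < Best->ConversionCost)
      Best = &C;
  }
  return Best;
}
```

With a device `fn(int)` at cost 0 and a host `fn(float)` at cost 1, a host caller ends up with `fn(float)` in this model, even though `fn(int)` is the better signature match.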

I wonder if this patch will help with this case:

https://godbolt.org/z/X4KdsV

__device__ float fn(int) { return threadIdx.x; };
__host__ float fn(float);

float gvar1 = []()__device__ { return fn(1);} (); // This ends up calling fn(int) on *host*

We seem to happily let host code call device function from a lambda function used as an initializer.

It turns out that Sema::CheckCUDACall needs to consider global initializers as well. I will revise that part. Technically, though, that is unrelated to overload resolution, so it should be prepared in a separate patch that enhances CheckCUDACall to check global initializers.

refinements are made after comments from reviewers.

code refinement after reviewers' comments.

Harbormaster completed remote builds in B43182: Diff 235921. Jan 2 2020, 12:12 PM

rsmith added inline comments. Jan 6 2020, 11:02 AM

clang/include/clang/Sema/Sema.h
11656–11658	Please capitalize the first word of each of these parameter descriptions, to match the style used elsewhere in Clang.
11657	"Null" not "nullptr".
11664–11672	Does this really need to be CUDA-specific? This is (at least) the third time we've needed this. We currently have a `ManglingContextDecl` on `ExpressionEvaluationContextRecord` that tracks the non-local variable whose initializer we're parsing. In addition to using this as a lambda context declaration, we also (hackily) use it as the context declaration for `DiagRuntimeBehavior`. It would seem sensible to use that mechanism here too (and rename it to remove any suggestion that this is specific to lambdas or mangling). I think we only currently push `ExpressionEvaluationContext`s for variable initializers in C++. That's presumably fine for CUDA's purposes.
clang/lib/Parse/ParseDecl.cpp
2345	No, this is not sufficient; it's missing (at least) the template instantiation case. (The `ExpressionEvaluationContextRecord` mechanism does handle that case properly.) You should also consider what should happen in default arguments (which are sometimes parsed before we form a `FunctionDecl` for the function for which they are parameters) and default member initializers (which are parsed after we know whether the enclosing class has a user-declared default constructor, so you could in principle consider the CUDA function kind of the declared constructors, I suppose -- but the constructor bodies are not yet available, so you can't tell which constructors would actually use the initializers). Both of those cases are also tracked by the `ExpressionEvaluationContextRecord` mechanism, though you may need to track additional information to process default arguments in the same mode as the function for which they are supplied.

revise comment.
add tests requiring template instantiation.

Harbormaster completed remote builds in B44491: Diff 239328. Jan 21 2020, 8:12 AM

Sorry for the late reply. Really appreciate your feedback. Thanks!

clang/include/clang/Sema/Sema.h
11664–11672	I tried that before adding the new non-local variable stack. Using `ManglingContextDecl` on `ExpressionEvaluationContextRecord` could serve some cases, but it does not fit the case where the constructor needs resolving as well. When the constructor is resolved, the `ManglingContextDecl` scope is already closed and cannot be used to check the target of the global variable. Take the following code, for example:
struct EC { int ec; __device__ EC() {} };
__device__ EC d_ec;
I also tried enlarging the scope of `ManglingContextDecl`, but that triggers even more issues for generic C++ compilation. I'd appreciate any better solution, as I agree that adding CUDA-specific facilities should be minimized.
clang/lib/Parse/ParseDecl.cpp
2345	Could you elaborate? I added new test cases requiring template instantiation, and the current code handles them correctly. Are you referring to template variables?

@rsmith Have you had a chance to review the revised change?

hliao added a reviewer: rsmith. Jan 30 2020, 7:06 PM

@rsmith, have you had a chance to review the revised change again, as well as my answers to your comments?

+ @rjmccall

Rebase to the latest trunk code.

Harbormaster failed remote builds in B46828: Diff 245461! Feb 19 2020, 11:09 AM

rjmccall added inline comments. Feb 19 2020, 12:13 PM

clang/include/clang/Sema/Sema.h
11678	This is tricky because we could be in a nested context, not just the initializer, and that context just might not be a function. For example, there could be a local class in a lambda or something like that.

Skip contexts other than function or TU contexts for now, as more cases need considering.

hliao marked 2 inline comments as done. Feb 20 2020, 10:00 AM

hliao added inline comments.

clang/include/clang/Sema/Sema.h
11678	You are right. I have limited that to function and TU contexts for now; I need more effort to consider other cases. One case in mind is the default member initializer in a class. But local classes in a lambda should be in a function (lambda function body) context.

Harbormaster failed remote builds in B46929: Diff 245682! Feb 20 2020, 10:29 AM

rjmccall added inline comments. Feb 20 2020, 1:49 PM

clang/include/clang/Sema/Sema.h
11680	You really want this to match whenever we're in a local context, right? How about structuring the function like:
if (CurContext->isFunctionOrMethod())
  return cast<Decl>(CurContext);
if (!CurContext->isFileContext())
  return nullptr;
return getCUDACurrentNonLocalVariable();
As a more general solution, I think Sema funnels all changes to CurContext through a small number of places, and you could make those places save and restore the currently initialized variable as well.
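
The three-way dispatch suggested in this comment can be tried out on a self-contained model. The types below (`Decl`, `ContextKind`, `Context`) and the function `contextDecl` are hypothetical stand-ins for illustration, not Clang's classes; only the control flow follows the suggestion.

```cpp
#include <cassert>

// Hypothetical stand-ins for Clang's Decl and DeclContext.
struct Decl {
  const char *Name;
};

enum class ContextKind { File, Function, Record };

struct Context {
  ContextKind Kind;
  const Decl *Owner; // the enclosing function for Function contexts
};

// Picks the declaration that acts as the "caller" for target checking.
inline const Decl *contextDecl(const Context &Cur, const Decl *CurrentInitVar) {
  if (Cur.Kind == ContextKind::Function) {
    assert(Cur.Owner && "a function context must name its function");
    return Cur.Owner; // local context: the enclosing function decides
  }
  if (Cur.Kind != ContextKind::File)
    return nullptr; // e.g. a class body: not handled in this model
  return CurrentInitVar; // file scope: the variable being initialized, if any
}
```

At file scope with no variable being initialized, the function returns null, which the patch documents as the global context.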

Rebase the code to the latest trunk.

Revise following reviewer comments.

hliao marked an inline comment as done. Feb 20 2020, 2:26 PM

Harbormaster failed remote builds in B46947: Diff 245735! Feb 20 2020, 2:30 PM

Harbormaster failed remote builds in B46956: Diff 245747! Feb 20 2020, 2:58 PM

Rebase to the trunk.

Harbormaster failed remote builds in B47007: Diff 245863! Feb 21 2020, 8:32 AM

Rebase to the trunk.

Harbormaster failed remote builds in B47131: Diff 246204! Feb 24 2020, 8:21 AM

Rebase to the latest trunk.

Harbormaster failed remote builds in B47360: Diff 246826! Feb 26 2020, 2:29 PM

@rjmccall @rsmith @tra, could you review this revision?

Fix pre-merge checks.

Harbormaster failed remote builds in B47579: Diff 247248! Feb 28 2020, 7:05 AM

rjmccall added inline comments. Mar 1 2020, 11:46 PM

clang/include/clang/Sema/Sema.h
11680	Richard, I'd like your opinion about this. We have three separate patches right now that would all benefit from being able to track that they're currently within a variable/field initializer in Sema. And it's a general deficiency that it's hard to track declarations in initializers back to their initialized variable. Swift actually bit the bullet and introduced a kind of `DeclContext` that's a non-local initializer, and that links back to the variable. That would be hard to bring back to Clang with the current AST because Clang assumes that all `DeclContext`s are `Decl`s, and I don't think we can reasonably remove that assumption; and of course `VarDecl` and `FieldDecl` aren't `DeclContext`s. Now, we could try to change that latter point. Making all `VarDecl`s and `FieldDecl`s DCs would have prohibitive memory overhead, since the vast majority are local / uninitialized; however, we could introduce a `VarDecl` subclass for global variables (including static member variables, of course), and similarly we could have a `FieldDecl` subclass for fields with initializers, which would nicely move some of the other overhead out-of-line and optimize for the C-style/old-style case. (We always know whether a field has an in-class initializer at parse time, right?) Less invasively, we could forget about trying to track this in the AST and just also track a current initialized variable in Sema. Anything which tried to change the context would have to save and restore that as well. That might be annoying because of PushDeclContext/PopDeclContext, though, which assume that you can restore the old context by just looking at the current context.
clang/lib/Sema/SemaDeclCXX.cpp
16932	The declaration could become invalid while processing its initializer; I think you should drop that condition.
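
The less invasive option from the comment on line 11680, tracking the currently initialized variable next to the current context and saving and restoring both together, might look roughly like this. It is a sketch under assumptions: `SemaState` and `ContextSwitch` are hypothetical names, strings stand in for declarations, and clearing the tracked variable on entry to a new context is a choice of this sketch, not something the review settles.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for the relevant part of the semantic-analysis state.
struct SemaState {
  std::string CurContext = "translation unit";
  std::string CurInitVar; // empty: no variable is being initialized
};

// Saves both pieces of state on entry and restores both on exit, so that a
// context change can never leave a stale "current initialized variable".
class ContextSwitch {
  SemaState &State;
  std::string SavedContext;
  std::string SavedInitVar;

public:
  ContextSwitch(SemaState &S, std::string NewContext)
      : State(S), SavedContext(S.CurContext), SavedInitVar(S.CurInitVar) {
    assert(!NewContext.empty() && "a context needs a name in this toy model");
    State.CurContext = std::move(NewContext);
    State.CurInitVar.clear(); // assumption: a new context starts outside any
                              // initializer
  }
  ~ContextSwitch() {
    State.CurContext = std::move(SavedContext);
    State.CurInitVar = std::move(SavedInitVar);
  }
  ContextSwitch(const ContextSwitch &) = delete;
  ContextSwitch &operator=(const ContextSwitch &) = delete;
};
```

The difficulty raised in the comment remains outside this sketch: PushDeclContext/PopDeclContext restore the old context by looking only at the current one, so they have no place to stash the extra saved variable.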

Remove unnecessary condition checking.

hliao marked an inline comment as done. Mar 2 2020, 9:42 AM

Harbormaster failed remote builds in B47808: Diff 247675! Mar 2 2020, 10:21 AM

Rebase to the latest trunk.

Harbormaster failed remote builds in B49533: Diff 250965! Mar 17 2020, 8:00 PM

Fix warnings from clang-tidy.

Harbormaster failed remote builds in B49637: Diff 251171! Mar 18 2020, 3:13 PM

Fix more clang-tidy warnings.

Harbormaster completed remote builds in B49657: Diff 251212. Mar 18 2020, 6:28 PM

Rebase to the latest trunk.

Harbormaster failed remote builds in B51876: Diff 255219! Apr 5 2020, 11:03 PM

Rebase to trunk.

Harbormaster failed remote builds in B53144: Diff 257347! Apr 14 2020, 9:38 AM

yaxunl mentioned this in D78970: [CUDA][HIP] Fix ambiguity of new operator. Apr 27 2020, 3:55 PM

LGTM. Can we get this in? There are other fixes depending on this. Thanks.

yaxunl mentioned this in rG55bcb96f3154: recommit c77a4078e01033aa2206c31a579d217c8a07569b with fix. Apr 28 2020, 6:24 AM

Rebase to trunk and resolve the conflict.

Harbormaster completed remote builds in B55167: Diff 261003. Apr 29 2020, 2:33 PM

Is this patch still relevant?

Hi @rsmith, @rjmccall and @tra, what's your suggestion for making progress on this review?

In D71227#2167596, @tra wrote:

Is this patch still relevant?

I need to rebase this onto the latest trunk. I was interrupted by other heavy workloads.

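Revision Contents

Path                                             Size
clang/include/clang/Sema/Sema.h                  55 lines
clang/lib/Parse/ParseDecl.cpp                    4 lines
clang/lib/Sema/SemaCUDA.cpp                      45 lines
clang/lib/Sema/SemaDeclCXX.cpp                   14 lines
clang/lib/Sema/SemaExprCXX.cpp                   11 lines
clang/lib/Sema/SemaOverload.cpp                  77 lines
clang/test/SemaCUDA/function-overload.cu         28 lines
clang/test/SemaCUDA/global-initializers-host.cu  17 lines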

Diff 261003

clang/include/clang/Sema/Sema.h

@@ ... @@ enum CUDAFunctionTarget {
     CFT_InvalidTarget
   };

   /// Determines whether the given function is a CUDA device/host/kernel/etc.
   /// function.
   ///
   /// Use this rather than examining the function's attributes yourself -- you
   /// will get it wrong. Returns CFT_Host if D is null.
+  CUDAFunctionTarget IdentifyCUDATarget(const ParsedAttributesView &Attrs);
   CUDAFunctionTarget IdentifyCUDATarget(const FunctionDecl *D,
                                         bool IgnoreImplicitHDAttr = false);
-  CUDAFunctionTarget IdentifyCUDATarget(const ParsedAttributesView &Attrs);
+  CUDAFunctionTarget IdentifyCUDATarget(const VarDecl *D,
+                                        bool IgnoreImplicitHDAttr = false);
+  // This routine is the top level dispatcher to more specific variants above.
+  CUDAFunctionTarget IdentifyCUDATarget(const Decl *D,
+                                        bool IgnoreImplicitHDAttr = false);

   /// Gets the CUDA target for the current context.
   CUDAFunctionTarget CurrentCUDATarget() {
     return IdentifyCUDATarget(dyn_cast<FunctionDecl>(CurContext));
   }

   // CUDA function call preference. Must be ordered numerically from
   // worst to best.
   enum CUDAFunctionPreference {
     CFP_Never,      // Invalid caller/callee combination.
     CFP_WrongSide,  // Calls from host-device to host or device
                     // function that do not match current compilation
                     // mode.
     CFP_HostDevice, // Any calls to host/device functions.
     CFP_SameSide,   // Calls from host-device to host or device
                     // function matching current compilation mode.
     CFP_Native,     // host-to-host or device-to-device calls.
   };

-  /// Identifies relative preference of a given Caller/Callee
+  /// Identifies relative preference of a given callee and that call context
   /// combination, based on their host/device attributes.
-  /// \param Caller function which needs address of \p Callee.
-  ///               nullptr in case of global context.
-  /// \param Callee target function
+  /// \param CallContextDecl The context decl which needs address of \p Callee.
+  ///                        Null in case of the global context.
+  /// \param Callee Target function.
   ///
   /// \returns preference value for particular Caller/Callee combination.
-  CUDAFunctionPreference IdentifyCUDAPreference(const FunctionDecl *Caller,
+  CUDAFunctionPreference IdentifyCUDAPreference(const Decl *CallContextDecl,
                                                 const FunctionDecl *Callee);

+  SmallVector<const Decl *, 8> CUDANonLocalVariableStack;
+
+  void pushCUDANonLocalVariable(const Decl *D);
+  void popCUDANonLocalVariable(const Decl *D);
+
+  const Decl *getCUDACurrentNonLocalVariable() const {
+    return CUDANonLocalVariableStack.empty() ? nullptr
+                                             : CUDANonLocalVariableStack.back();
+  }
+
+  const Decl *getCUDAContextDecl() const {
+    if (CurContext->isFunctionOrMethod())
+      return cast<Decl>(CurContext);
+    if (!CurContext->isFileContext()) {
+      // TODO: There are cases where proper checking is required, such as the
+      // default member initializer.
+      return nullptr;
+    }
+    // Check the current variable being initialized in the global context.
+    return getCUDACurrentNonLocalVariable();
+  }
+
   /// Determines whether Caller may invoke Callee, based on their CUDA
   /// host/device attributes. Returns false if the call is not allowed.
   ///
   /// Note: Will return true for CFP_WrongSide calls. These may appear in
   /// semantically correct CUDA programs, but only if they're never codegen'ed.
-  bool IsAllowedCUDACall(const FunctionDecl *Caller,
-                         const FunctionDecl *Callee) {
-    return IdentifyCUDAPreference(Caller, Callee) != CFP_Never;
+  enum SkipCallerKind_t { SkipNoneCaller, SkipImplicitCaller };
+  bool isCUDACallAllowed(const FunctionDecl *Callee,
+                         SkipCallerKind_t Kind = SkipNoneCaller) {
+    // Skip contexts where no real call could be performed.
+    if (!CurContext->isFileContext() && !CurContext->isFunctionOrMethod())
+      return true;
+    if (const FunctionDecl *Caller = dyn_cast<FunctionDecl>(CurContext))
+      if (Kind == SkipImplicitCaller && Caller->isImplicit())
+        return true;
+    return IdentifyCUDAPreference(getCUDAContextDecl(), Callee) != CFP_Never;
   }

   /// May add implicit CUDAHostAttr and CUDADeviceAttr attributes to FD,
   /// depending on FD and the current compilation settings.
   void maybeAddCUDAHostDeviceAttrs(FunctionDecl *FD,
                                    const LookupResult &Previous);

 public:
@@ ... @@ public:
   /// operator() method.
   ///
   /// CUDA lambdas declared inside __device__ or __global__ functions inherit
   /// the __device__ attribute. Similarly, lambdas inside __host__ __device__
   /// functions become __host__ __device__ themselves.
   void CUDASetLambdaAttrs(CXXMethodDecl *Method);

   /// Finds a function in \p Matches with highest calling priority
-  /// from \p Caller context and erases all functions with lower
+  /// from the current context and erases all functions with lower
   /// calling priority.
   void EraseUnwantedCUDAMatches(
-      const FunctionDecl *Caller,
       SmallVectorImpl<std::pair<DeclAccessPair, FunctionDecl *>> &Matches);

   /// Given a implicit special member, infer its CUDA target from the
   /// calls it needs to make to underlying base/field special members.
   /// \param ClassDecl the class for which the member is being created.
   /// \param CSM the kind of special member.
   /// \param MemberDecl the special member itself.
   /// \param ConstRHS true if this is a copy operation with a const object on
   ///                 its RHS.
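
The comment on `CUDAFunctionPreference` says the enumerators must be ordered from worst to best. The reason is that pruning code such as `EraseUnwantedCUDAMatches` can then compare preferences numerically. A minimal sketch of that pruning, assuming the preference of each match has already been computed; `Preference`, `Match` and `eraseUnwanted` are hypothetical names and the strings stand in for function declarations:

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Ordered numerically from worst to best, mirroring the enum in the diff.
enum Preference { Never, WrongSide, HostDevice, SameSide, Native };

using Match = std::pair<std::string, Preference>;

// Keeps only the matches that share the highest preference in the set.
inline void eraseUnwanted(std::vector<Match> &Matches) {
  if (Matches.size() <= 1)
    return;

  // Because the enum is ordered, "best" is simply the maximum value.
  const Preference Best =
      std::max_element(Matches.begin(), Matches.end(),
                       [](const Match &A, const Match &B) {
                         return A.second < B.second;
                       })
          ->second;

  // Erase-remove idiom: drop every match that is worse than the best one.
  Matches.erase(std::remove_if(Matches.begin(), Matches.end(),
                               [Best](const Match &M) {
                                 return M.second < Best;
                               }),
                Matches.end());
  assert(!Matches.empty() && "the best match always survives");
}
```

Matches that tie for the best preference all survive, in their original order; choosing among them is left to ordinary overload resolution.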

clang/lib/Parse/ParseDecl.cpp

@@ ... @@ if (Tok.is(tok::semi)) {
           ThisDecl =
               Actions.ActOnTemplateDeclarator(getCurScope(), FakedParamLists, D);
         }
       }
       break;
     }
   }

+  Actions.pushCUDANonLocalVariable(ThisDecl);
+
   // Parse declarator '=' initializer.
   // If a '==' or '+=' is found, suggest a fixit to '='.
   if (isTokenEqualOrEqualTypo()) {
     SourceLocation EqualLoc = ConsumeToken();

     if (Tok.is(tok::kw_delete)) {
       if (D.isFunctionDeclarator())
         Diag(ConsumeToken(), diag::err_default_delete_in_multiple_declaration)
@@ ... @@ if (Init.isInvalid()) {
       Actions.ActOnInitializerError(ThisDecl);
     } else
       Actions.AddInitializerToDecl(ThisDecl, Init.get(), /*DirectInit=*/true);

   } else {
     Actions.ActOnUninitializedDecl(ThisDecl);
   }

+  Actions.popCUDANonLocalVariable(ThisDecl);
+
   Actions.FinalizeDeclaration(ThisDecl);

   return ThisDecl;
 }

 /// ParseSpecifierQualifierList
 ///        specifier-qualifier-list:
 ///          type-specifier specifier-qualifier-list[opt]

clang/lib/Sema/SemaCUDA.cpp

Show First 20 Lines • Show All 90 Lines • ▼ Show 20 Lines	Sema::IdentifyCUDATarget(const ParsedAttributesView &Attrs) {

if (HasDeviceAttr)		if (HasDeviceAttr)
return CFT_Device;		return CFT_Device;

return CFT_Host;		return CFT_Host;
}		}

template <typename A>		template <typename A>
static bool hasAttr(const FunctionDecl *D, bool IgnoreImplicitAttr) {		static bool hasAttr(const Decl *D, bool IgnoreImplicitAttr) {
return D->hasAttrs() && llvm::any_of(D->getAttrs(), [&](Attr *Attribute) {		return D->hasAttrs() && llvm::any_of(D->getAttrs(), [&](Attr *Attribute) {
return isa<A>(Attribute) &&		return isa<A>(Attribute) &&
!(IgnoreImplicitAttr && Attribute->isImplicit());		!(IgnoreImplicitAttr && Attribute->isImplicit());
});		});
}		}

/// IdentifyCUDATarget - Determine the CUDA compilation target for this function		/// IdentifyCUDATarget - Determine the CUDA compilation target for this function
Sema::CUDAFunctionTarget Sema::IdentifyCUDATarget(const FunctionDecl *D,		Sema::CUDAFunctionTarget Sema::IdentifyCUDATarget(const FunctionDecl *D,
[18 lines not shown]	if (hasAttr<CUDADeviceAttr>(D, IgnoreImplicitHDAttr)) {
// Some implicit declarations (like intrinsic functions) are not marked.		// Some implicit declarations (like intrinsic functions) are not marked.
// Set the most lenient target on them for maximal flexibility.		// Set the most lenient target on them for maximal flexibility.
return CFT_HostDevice;		return CFT_HostDevice;
}		}

return CFT_Host;		return CFT_Host;
}		}

		Sema::CUDAFunctionTarget Sema::IdentifyCUDATarget(const VarDecl *D,
		bool IgnoreImplicitHDAttr) {
		if (D == nullptr)
		return CFT_Host;

		assert(D->hasGlobalStorage() && "Only non-local variable needs identifying.");

		if (D->hasAttr<CUDAInvalidTargetAttr>())
		return CFT_InvalidTarget;

		if (hasAttr<CUDAConstantAttr>(D, IgnoreImplicitHDAttr) ||
		hasAttr<CUDADeviceAttr>(D, IgnoreImplicitHDAttr) ||
		hasAttr<CUDASharedAttr>(D, IgnoreImplicitHDAttr))
		return CFT_Device;

		return CFT_Host;
		}

		Sema::CUDAFunctionTarget Sema::IdentifyCUDATarget(const Decl *D,
		bool IgnoreImplicitHDAttr) {
		if (D == nullptr)
		return CFT_Host;

		if (const auto *FD = dyn_cast<FunctionDecl>(D))
		return IdentifyCUDATarget(FD, IgnoreImplicitHDAttr);

		if (const auto *VD = dyn_cast<VarDecl>(D))
		return IdentifyCUDATarget(VD, IgnoreImplicitHDAttr);

		llvm_unreachable("Unexpected decl for CUDA target identification.");
		}

// * CUDA Call preference table		// * CUDA Call preference table
//		//
// F - from,		// F - from,
// T - to		// T - to
// Ph - preference in host mode		// Ph - preference in host mode
// Pd - preference in device mode		// Pd - preference in device mode
// H - handled in (x)		// H - handled in (x)
// Preferences: N:native, SS:same side, HD:host-device, WS:wrong side, --:never.		// Preferences: N:native, SS:same side, HD:host-device, WS:wrong side, --:never.
[13 lines not shown]
// | h  | h  | N   | N   | (c) |		// | h  | h  | N   | N   | (c) |
// | h  | hd | HD  | HD  | (b) |		// | h  | hd | HD  | HD  | (b) |
// | hd | d  | WS  | SS  | (d) |		// | hd | d  | WS  | SS  | (d) |
// | hd | g  | SS  | --  |(d/a)|		// | hd | g  | SS  | --  |(d/a)|
// | hd | h  | SS  | WS  | (d) |		// | hd | h  | SS  | WS  | (d) |
// | hd | hd | HD  | HD  | (b) |		// | hd | hd | HD  | HD  | (b) |

Sema::CUDAFunctionPreference		Sema::CUDAFunctionPreference
Sema::IdentifyCUDAPreference(const FunctionDecl *Caller,		Sema::IdentifyCUDAPreference(const Decl *ContextDecl,
const FunctionDecl *Callee) {		const FunctionDecl *Callee) {
assert(Callee && "Callee must be valid.");		assert(Callee && "Callee must be valid.");
CUDAFunctionTarget CallerTarget = IdentifyCUDATarget(Caller);		CUDAFunctionTarget CallerTarget = IdentifyCUDATarget(ContextDecl);
CUDAFunctionTarget CalleeTarget = IdentifyCUDATarget(Callee);		CUDAFunctionTarget CalleeTarget = IdentifyCUDATarget(Callee);

// If one of the targets is invalid, the check always fails, no matter what		// If one of the targets is invalid, the check always fails, no matter what
// the other target is.		// the other target is.
if (CallerTarget == CFT_InvalidTarget || CalleeTarget == CFT_InvalidTarget)		if (CallerTarget == CFT_InvalidTarget || CalleeTarget == CFT_InvalidTarget)
return CFP_Never;		return CFP_Never;

// (a) Can't call global from some contexts until we support CUDA's		// (a) Can't call global from some contexts until we support CUDA's
[32 lines not shown]	if ((CallerTarget == CFT_Host && CalleeTarget == CFT_Device) ||
(CallerTarget == CFT_Device && CalleeTarget == CFT_Host) ||		(CallerTarget == CFT_Device && CalleeTarget == CFT_Host) ||
(CallerTarget == CFT_Global && CalleeTarget == CFT_Host))		(CallerTarget == CFT_Global && CalleeTarget == CFT_Host))
return CFP_Never;		return CFP_Never;

llvm_unreachable("All cases should've been handled by now.");		llvm_unreachable("All cases should've been handled by now.");
}		}

void Sema::EraseUnwantedCUDAMatches(		void Sema::EraseUnwantedCUDAMatches(
const FunctionDecl *Caller,
SmallVectorImpl<std::pair<DeclAccessPair, FunctionDecl *>> &Matches) {		SmallVectorImpl<std::pair<DeclAccessPair, FunctionDecl *>> &Matches) {
if (Matches.size() <= 1)		if (Matches.size() <= 1)
return;		return;

using Pair = std::pair<DeclAccessPair, FunctionDecl*>;		using Pair = std::pair<DeclAccessPair, FunctionDecl*>;

// Gets the CUDA function preference for a call from Caller to Match.		const Decl *ContextDecl = getCUDAContextDecl();

		// Gets the CUDA function preference for a call from call context to Match.
auto GetCFP = [&](const Pair &Match) {		auto GetCFP = [&](const Pair &Match) {
return IdentifyCUDAPreference(Caller, Match.second);		return IdentifyCUDAPreference(ContextDecl, Match.second);
};		};

// Find the best call preference among the functions in Matches.		// Find the best call preference among the functions in Matches.
CUDAFunctionPreference BestCFP = GetCFP(*std::max_element(		CUDAFunctionPreference BestCFP = GetCFP(*std::max_element(
Matches.begin(), Matches.end(),		Matches.begin(), Matches.end(),
[&](const Pair &M1, const Pair &M2) { return GetCFP(M1) < GetCFP(M2); }));		[&](const Pair &M1, const Pair &M2) { return GetCFP(M1) < GetCFP(M2); }));

// Erase all functions with lower priority.		// Erase all functions with lower priority.
[557 lines not shown]
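
The pruning step in `EraseUnwantedCUDAMatches` can be pictured in isolation. The following is a minimal standalone sketch, not the clang code: `Preference`, `Match`, and `eraseUnwantedMatches` are hypothetical stand-ins, and the enumerator ordering (larger is better) is an assumption inferred from comparisons elsewhere in this diff such as `<= CFP_WrongSide` and `>= CFP_HostDevice`.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for Sema::CUDAFunctionPreference; larger is better.
// The ordering is an assumption, not copied from the patch.
enum Preference { Never, WrongSide, HostDevice, SameSide, Native };

// A candidate: its name plus its preference as seen from the current context.
using Match = std::pair<std::string, Preference>;

// Keep only the candidates that share the best preference in the set.
void eraseUnwantedMatches(std::vector<Match> &Matches) {
  if (Matches.size() <= 1)
    return;

  auto ByPreference = [](const Match &A, const Match &B) {
    return A.second < B.second;
  };

  // Find the best preference among all candidates.
  Preference Best =
      std::max_element(Matches.begin(), Matches.end(), ByPreference)->second;

  // Erase every candidate with a strictly lower preference.
  Matches.erase(std::remove_if(Matches.begin(), Matches.end(),
                               [Best](const Match &M) {
                                 return M.second < Best;
                               }),
                Matches.end());
}
```

In the patch the preference is computed against `getCUDAContextDecl()` rather than an explicit caller argument, so the same pruning applies whether the context is a function body or a non-local variable initializer.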

clang/lib/Sema/SemaDeclCXX.cpp

	[16,915 lines not shown]
	/// static data member.			/// static data member.
	static bool isNonlocalVariable(const Decl *D) {			static bool isNonlocalVariable(const Decl *D) {
	if (const VarDecl *Var = dyn_cast_or_null<VarDecl>(D))			if (const VarDecl *Var = dyn_cast_or_null<VarDecl>(D))
	return Var->hasGlobalStorage();			return Var->hasGlobalStorage();

	return false;			return false;
	}			}

				void Sema::pushCUDANonLocalVariable(const Decl *D) {
				if (!D || !isNonlocalVariable(D))
				return;
				CUDANonLocalVariableStack.push_back(D);
				}

				void Sema::popCUDANonLocalVariable(const Decl *D) {
				if (!D || !isNonlocalVariable(D))
				return;
				rjmccall (inline comment, Done): The declaration could become invalid while processing its initializer; I think you should drop that condition.
				assert(!CUDANonLocalVariableStack.empty() &&
				CUDANonLocalVariableStack.back() == D);
				CUDANonLocalVariableStack.pop_back();
				}
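
One way to keep the push and pop balanced even if the declaration's state changes while its initializer is processed is to decide once, at push time, and remember that decision. The sketch below is a hypothetical RAII guard, not part of the patch: `Decl`, `ContextStack`, and `NonLocalVariableRAII` are illustrative stand-ins for the real clang types and for the stack this patch adds to `Sema`.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a declaration; only its storage kind matters here.
struct Decl {
  bool HasGlobalStorage;
};

// Hypothetical stand-in for the non-local variable stack kept in Sema.
struct ContextStack {
  std::vector<const Decl *> Stack;
  const Decl *current() const { return Stack.empty() ? nullptr : Stack.back(); }
};

// Pushes on construction and pops on destruction. The decision to push is
// recorded in Pushed, so the pop never re-inspects the declaration.
class NonLocalVariableRAII {
  ContextStack &S;
  const Decl *D;
  bool Pushed;

public:
  NonLocalVariableRAII(ContextStack &S, const Decl *D)
      : S(S), D(D), Pushed(D && D->HasGlobalStorage) {
    if (Pushed)
      S.Stack.push_back(D);
  }

  ~NonLocalVariableRAII() {
    if (!Pushed)
      return;
    assert(!S.Stack.empty() && S.Stack.back() == D);
    S.Stack.pop_back();
  }

  NonLocalVariableRAII(const NonLocalVariableRAII &) = delete;
  NonLocalVariableRAII &operator=(const NonLocalVariableRAII &) = delete;
};
```

A guard like this would be constructed where the parser currently calls the push function and would go out of scope where it calls `popCUDANonLocalVariable`; whether that fits the parser's control flow here is not something this sketch establishes.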

	/// Invoked when we are about to parse an initializer for the declaration			/// Invoked when we are about to parse an initializer for the declaration
	/// 'Dcl'.			/// 'Dcl'.
	///			///
	/// After this method is called, according to [C++ 3.4.1p13], if 'Dcl' is a			/// After this method is called, according to [C++ 3.4.1p13], if 'Dcl' is a
	/// static data member of class X, names should be looked up in the scope of			/// static data member of class X, names should be looked up in the scope of
	/// class X. If the declaration had a scope specifier, a scope will have			/// class X. If the declaration had a scope specifier, a scope will have
	/// been created and passed in for this purpose. Otherwise, S will be null.			/// been created and passed in for this purpose. Otherwise, S will be null.
	void Sema::ActOnCXXEnterDeclInitializer(Scope *S, Decl *D) {			void Sema::ActOnCXXEnterDeclInitializer(Scope *S, Decl *D) {
	[779 lines not shown]

clang/lib/Sema/SemaExprCXX.cpp

[1,502 lines not shown]	Result = CXXFunctionalCastExpr::Create(
Result.get(), /*Path=*/nullptr, Locs.getBegin(), Locs.getEnd());		Result.get(), /*Path=*/nullptr, Locs.getBegin(), Locs.getEnd());
}		}

return Result;		return Result;
}		}

bool Sema::isUsualDeallocationFunction(const CXXMethodDecl *Method) {		bool Sema::isUsualDeallocationFunction(const CXXMethodDecl *Method) {
// [CUDA] Ignore this function, if we can't call it.		// [CUDA] Ignore this function, if we can't call it.
const FunctionDecl *Caller = dyn_cast<FunctionDecl>(CurContext);		const Decl *ContextDecl = getCUDAContextDecl();
if (getLangOpts().CUDA &&		if (getLangOpts().CUDA &&
IdentifyCUDAPreference(Caller, Method) <= CFP_WrongSide)		IdentifyCUDAPreference(ContextDecl, Method) <= CFP_WrongSide)
return false;		return false;

SmallVector<const FunctionDecl*, 4> PreventedBy;		SmallVector<const FunctionDecl*, 4> PreventedBy;
bool Result = Method->isUsualDeallocationFunction(PreventedBy);		bool Result = Method->isUsualDeallocationFunction(PreventedBy);

if (Result || !getLangOpts().CUDA || PreventedBy.empty())		if (Result || !getLangOpts().CUDA || PreventedBy.empty())
return Result;		return Result;

// In case of CUDA, return true if none of the 1-argument deallocator		// In case of CUDA, return true if none of the 1-argument deallocator
// functions are actually callable.		// functions are actually callable.
return llvm::none_of(PreventedBy, [&](const FunctionDecl *FD) {		return llvm::none_of(PreventedBy, [&](const FunctionDecl *FD) {
assert(FD->getNumParams() == 1 &&		assert(FD->getNumParams() == 1 &&
"Only single-operand functions should be in PreventedBy");		"Only single-operand functions should be in PreventedBy");
return IdentifyCUDAPreference(Caller, FD) >= CFP_HostDevice;		return IdentifyCUDAPreference(ContextDecl, FD) >= CFP_HostDevice;
});		});
}		}

/// Determine whether the given function is a non-placement		/// Determine whether the given function is a non-placement
/// deallocation function.		/// deallocation function.
static bool isNonPlacementDeallocationFunction(Sema &S, FunctionDecl *FD) {		static bool isNonPlacementDeallocationFunction(Sema &S, FunctionDecl *FD) {
if (CXXMethodDecl *Method = dyn_cast<CXXMethodDecl>(FD))		if (CXXMethodDecl *Method = dyn_cast<CXXMethodDecl>(FD))
return S.isUsualDeallocationFunction(Method);		return S.isUsualDeallocationFunction(Method);
[46 lines not shown]	UsualDeallocFnInfo(Sema &S, DeclAccessPair Found)
if (NumBaseParams < FD->getNumParams() &&		if (NumBaseParams < FD->getNumParams() &&
FD->getParamDecl(NumBaseParams)->getType()->isAlignValT()) {		FD->getParamDecl(NumBaseParams)->getType()->isAlignValT()) {
++NumBaseParams;		++NumBaseParams;
HasAlignValT = true;		HasAlignValT = true;
}		}

// In CUDA, determine how much we'd like / dislike to call this.		// In CUDA, determine how much we'd like / dislike to call this.
if (S.getLangOpts().CUDA)		if (S.getLangOpts().CUDA)
if (auto *Caller = dyn_cast<FunctionDecl>(S.CurContext))		CUDAPref = S.IdentifyCUDAPreference(S.getCUDAContextDecl(), FD);
CUDAPref = S.IdentifyCUDAPreference(Caller, FD);
}		}

explicit operator bool() const { return FD; }		explicit operator bool() const { return FD; }

bool isBetterThan(const UsualDeallocFnInfo &Other, bool WantSize,		bool isBetterThan(const UsualDeallocFnInfo &Other, bool WantSize,
bool WantAlign) const {		bool WantAlign) const {
// C++ P0722:		// C++ P0722:
// A destroying operator delete is preferred over a non-destroying		// A destroying operator delete is preferred over a non-destroying
[1,086 lines not shown]	for (LookupResult::iterator D = FoundDelete.begin(),
if (Context.hasSameType(adjustCCAndNoReturn(Fn->getType(),		if (Context.hasSameType(adjustCCAndNoReturn(Fn->getType(),
ExpectedFunctionType,		ExpectedFunctionType,
/*AdjustExcpetionSpec*/true),		/*AdjustExcpetionSpec*/true),
ExpectedFunctionType))		ExpectedFunctionType))
Matches.push_back(std::make_pair(D.getPair(), Fn));		Matches.push_back(std::make_pair(D.getPair(), Fn));
}		}

if (getLangOpts().CUDA)		if (getLangOpts().CUDA)
EraseUnwantedCUDAMatches(dyn_cast<FunctionDecl>(CurContext), Matches);		EraseUnwantedCUDAMatches(Matches);
} else {		} else {
// C++1y [expr.new]p22:		// C++1y [expr.new]p22:
// For a non-placement allocation function, the normal deallocation		// For a non-placement allocation function, the normal deallocation
// function lookup is used		// function lookup is used
//		//
// Per [expr.delete]p10, this lookup prefers a member operator delete		// Per [expr.delete]p10, this lookup prefers a member operator delete
// without a size_t argument, but prefers a non-member operator delete		// without a size_t argument, but prefers a non-member operator delete
// with a size_t where possible (which it always is in this case).		// with a size_t where possible (which it always is in this case).
[5,993 lines not shown]

clang/lib/Sema/SemaOverload.cpp

[6,296 lines not shown]	void Sema::AddOverloadCandidate(
if (Args.size() < MinRequiredArgs && !PartialOverloading) {		if (Args.size() < MinRequiredArgs && !PartialOverloading) {
// Not enough arguments.		// Not enough arguments.
Candidate.Viable = false;		Candidate.Viable = false;
Candidate.FailureKind = ovl_fail_too_few_arguments;		Candidate.FailureKind = ovl_fail_too_few_arguments;
return;		return;
}		}

// (CUDA B.1): Check for invalid calls between targets.		// (CUDA B.1): Check for invalid calls between targets.
if (getLangOpts().CUDA)		if (getLangOpts().CUDA &&
if (const FunctionDecl *Caller = dyn_cast<FunctionDecl>(CurContext))		!isCUDACallAllowed(Function, Sema::SkipImplicitCaller)) {
// Skip the check for callers that are implicit members, because in this
// case we may not yet know what the member's target is; the target is
// inferred for the member automatically, based on the bases and fields of
// the class.
if (!Caller->isImplicit() && !IsAllowedCUDACall(Caller, Function)) {
Candidate.Viable = false;		Candidate.Viable = false;
Candidate.FailureKind = ovl_fail_bad_target;		Candidate.FailureKind = ovl_fail_bad_target;
return;		return;
}		}

if (Function->getTrailingRequiresClause()) {		if (Function->getTrailingRequiresClause()) {
ConstraintSatisfaction Satisfaction;		ConstraintSatisfaction Satisfaction;
if (CheckFunctionConstraints(Function, Satisfaction) ||		if (CheckFunctionConstraints(Function, Satisfaction) ||
!Satisfaction.IsSatisfied) {		!Satisfaction.IsSatisfied) {
Candidate.Viable = false;		Candidate.Viable = false;
Candidate.FailureKind = ovl_fail_constraints_not_satisfied;		Candidate.FailureKind = ovl_fail_constraints_not_satisfied;
return;		return;
[494 lines not shown]	else {
if (Candidate.Conversions[ConvIdx].isBad()) {		if (Candidate.Conversions[ConvIdx].isBad()) {
Candidate.Viable = false;		Candidate.Viable = false;
Candidate.FailureKind = ovl_fail_bad_conversion;		Candidate.FailureKind = ovl_fail_bad_conversion;
return;		return;
}		}
}		}

// (CUDA B.1): Check for invalid calls between targets.		// (CUDA B.1): Check for invalid calls between targets.
if (getLangOpts().CUDA)		if (getLangOpts().CUDA && !isCUDACallAllowed(Method)) {
if (const FunctionDecl *Caller = dyn_cast<FunctionDecl>(CurContext))
if (!IsAllowedCUDACall(Caller, Method)) {
Candidate.Viable = false;		Candidate.Viable = false;
Candidate.FailureKind = ovl_fail_bad_target;		Candidate.FailureKind = ovl_fail_bad_target;
return;		return;
}		}

if (Method->getTrailingRequiresClause()) {		if (Method->getTrailingRequiresClause()) {
ConstraintSatisfaction Satisfaction;		ConstraintSatisfaction Satisfaction;
if (CheckFunctionConstraints(Method, Satisfaction) ||		if (CheckFunctionConstraints(Method, Satisfaction) ||
!Satisfaction.IsSatisfied) {		!Satisfaction.IsSatisfied) {
Candidate.Viable = false;		Candidate.Viable = false;
Candidate.FailureKind = ovl_fail_constraints_not_satisfied;		Candidate.FailureKind = ovl_fail_constraints_not_satisfied;
return;		return;
[2,663 lines not shown]	bool clang::isBetterOverloadCandidate(
// viability of a function. If two functions are both viable, other factors		// viability of a function. If two functions are both viable, other factors
// should take precedence in preference, e.g. the standard-defined preferences		// should take precedence in preference, e.g. the standard-defined preferences
// like argument conversion ranks or enable_if partial-ordering. The		// like argument conversion ranks or enable_if partial-ordering. The
// preference for pass-object-size parameters is probably most similar to a		// preference for pass-object-size parameters is probably most similar to a
// type-based-overloading decision and so should take priority.		// type-based-overloading decision and so should take priority.
//		//
// If other rules cannot determine which is better, CUDA preference will be		// If other rules cannot determine which is better, CUDA preference will be
// used again to determine which is better.		// used again to determine which is better.
//
// TODO: Currently IdentifyCUDAPreference does not return correct values
// for functions called in global variable initializers due to missing
// correct context about device/host. Therefore we can only enforce this
// rule when there is a caller. We should enforce this rule for functions
// in global variable initializers once proper context is added.
if (S.getLangOpts().CUDA && Cand1.Function && Cand2.Function) {		if (S.getLangOpts().CUDA && Cand1.Function && Cand2.Function) {
if (FunctionDecl *Caller = dyn_cast<FunctionDecl>(S.CurContext)) {		const Decl *ContextDecl = S.getCUDAContextDecl();
auto P1 = S.IdentifyCUDAPreference(Caller, Cand1.Function);		auto P1 = S.IdentifyCUDAPreference(ContextDecl, Cand1.Function);
auto P2 = S.IdentifyCUDAPreference(Caller, Cand2.Function);		auto P2 = S.IdentifyCUDAPreference(ContextDecl, Cand2.Function);
assert(P1 != Sema::CFP_Never && P2 != Sema::CFP_Never);		assert(P1 != Sema::CFP_Never && P2 != Sema::CFP_Never);
auto Cand1Emittable = P1 > Sema::CFP_WrongSide;		auto Cand1Emittable = P1 > Sema::CFP_WrongSide;
auto Cand2Emittable = P2 > Sema::CFP_WrongSide;		auto Cand2Emittable = P2 > Sema::CFP_WrongSide;
if (Cand1Emittable && !Cand2Emittable)		if (Cand1Emittable && !Cand2Emittable)
return true;		return true;
if (!Cand1Emittable && Cand2Emittable)		if (!Cand1Emittable && Cand2Emittable)
return false;		return false;
}		}
}

// C++ [over.match.best]p1:		// C++ [over.match.best]p1:
//		//
// -- if F is a static member function, ICS1(F) is defined such		// -- if F is a static member function, ICS1(F) is defined such
// that ICS1(F) is neither better nor worse than ICS1(G) for		// that ICS1(F) is neither better nor worse than ICS1(G) for
// any function G, and, symmetrically, ICS1(G) is neither		// any function G, and, symmetrically, ICS1(G) is neither
// better nor worse than ICS1(F).		// better nor worse than ICS1(F).
unsigned StartArg = 0;		unsigned StartArg = 0;
[235 lines not shown]	bool clang::isBetterOverloadCandidate(
if (MV == Comparison::Better)		if (MV == Comparison::Better)
return true;		return true;
if (MV == Comparison::Worse)		if (MV == Comparison::Worse)
return false;		return false;

// If other rules cannot determine which is better, CUDA preference is used		// If other rules cannot determine which is better, CUDA preference is used
// to determine which is better.		// to determine which is better.
if (S.getLangOpts().CUDA && Cand1.Function && Cand2.Function) {		if (S.getLangOpts().CUDA && Cand1.Function && Cand2.Function) {
FunctionDecl *Caller = dyn_cast<FunctionDecl>(S.CurContext);		const Decl *ContextDecl = S.getCUDAContextDecl();
return S.IdentifyCUDAPreference(Caller, Cand1.Function) >		return S.IdentifyCUDAPreference(ContextDecl, Cand1.Function) >
S.IdentifyCUDAPreference(Caller, Cand2.Function);		S.IdentifyCUDAPreference(ContextDecl, Cand2.Function);
}		}

return false;		return false;
}		}
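
The two hunks of `isBetterOverloadCandidate` above apply CUDA preference at two different points: an emittable candidate beats a non-emittable one before any standard rule is consulted, and the full preference ordering is used only as a final tie-breaker. A minimal sketch of that ordering follows. It is a simplification under stated assumptions: `Candidate` and `isBetter` are hypothetical, the standard C++ ranking rules are collapsed into a single integer `ConversionRank` (smaller is better), and candidates with preference `Never` are assumed to have been rejected as non-viable already.

```cpp
// Hypothetical stand-in for Sema::CUDAFunctionPreference; larger is better.
enum Preference { Never, WrongSide, HostDevice, SameSide, Native };

// Hypothetical candidate: one number stands in for all standard ranking rules.
struct Candidate {
  int ConversionRank; // smaller is a better signature match
  Preference Pref;    // preference as seen from the current context
};

// Returns true if A is a better candidate than B.
bool isBetter(const Candidate &A, const Candidate &B) {
  // Step 1: a candidate that can be emitted on this side wins outright.
  bool AEmittable = A.Pref > WrongSide;
  bool BEmittable = B.Pref > WrongSide;
  if (AEmittable != BEmittable)
    return AEmittable;

  // Step 2: the standard rules, reduced here to one rank comparison.
  if (A.ConversionRank != B.ConversionRank)
    return A.ConversionRank < B.ConversionRank;

  // Step 3: CUDA preference breaks remaining ties.
  return A.Pref > B.Pref;
}
```

With this ordering, a wrong-side candidate that matches the argument types exactly still loses to a same-side candidate that needs a conversion, which is the behavior the new `fn(1)` test in `function-overload.cu` relies on for global initializers.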

/// Determine whether two declarations are "equivalent" for the purposes of		/// Determine whether two declarations are "equivalent" for the purposes of
/// name lookup and overload resolution. This applies when the same internal/no		/// name lookup and overload resolution. This applies when the same internal/no
/// linkage entity is defined by two modules (probably by textually including		/// linkage entity is defined by two modules (probably by textually including
[1,052 lines not shown]	if (CheckArityMismatch(S, Cand, NumArgs))
return;		return;
}		}
DiagnoseBadDeduction(S, Cand->FoundDecl, Cand->Function, // pattern		DiagnoseBadDeduction(S, Cand->FoundDecl, Cand->Function, // pattern
Cand->DeductionFailure, NumArgs, TakingCandidateAddress);		Cand->DeductionFailure, NumArgs, TakingCandidateAddress);
}		}

/// CUDA: diagnose an invalid call across targets.		/// CUDA: diagnose an invalid call across targets.
static void DiagnoseBadTarget(Sema &S, OverloadCandidate *Cand) {		static void DiagnoseBadTarget(Sema &S, OverloadCandidate *Cand) {
FunctionDecl *Caller = cast<FunctionDecl>(S.CurContext);		const Decl *ContextDecl = S.getCUDAContextDecl();
FunctionDecl *Callee = Cand->Function;		FunctionDecl *Callee = Cand->Function;

Sema::CUDAFunctionTarget CallerTarget = S.IdentifyCUDATarget(Caller),		Sema::CUDAFunctionTarget CallerTarget = S.IdentifyCUDATarget(ContextDecl),
CalleeTarget = S.IdentifyCUDATarget(Callee);		CalleeTarget = S.IdentifyCUDATarget(Callee);

std::string FnDesc;		std::string FnDesc;
std::pair<OverloadCandidateKind, OverloadCandidateSelect> FnKindPair =		std::pair<OverloadCandidateKind, OverloadCandidateSelect> FnKindPair =
ClassifyOverloadCandidate(S, Cand->FoundDecl, Callee,		ClassifyOverloadCandidate(S, Cand->FoundDecl, Callee,
Cand->getRewriteKind(), FnDesc);		Cand->getRewriteKind(), FnDesc);

S.Diag(Callee->getLocation(), diag::note_ovl_candidate_bad_target)		S.Diag(Callee->getLocation(), diag::note_ovl_candidate_bad_target)
[1,019 lines not shown]	if (CXXMethodDecl *Method = dyn_cast<CXXMethodDecl>(Fn)) {
// when converting to member pointer.		// when converting to member pointer.
if (Method->isStatic() == TargetTypeIsNonStaticMemberFunction)		if (Method->isStatic() == TargetTypeIsNonStaticMemberFunction)
return false;		return false;
}		}
else if (TargetTypeIsNonStaticMemberFunction)		else if (TargetTypeIsNonStaticMemberFunction)
return false;		return false;

if (FunctionDecl *FunDecl = dyn_cast<FunctionDecl>(Fn)) {		if (FunctionDecl *FunDecl = dyn_cast<FunctionDecl>(Fn)) {
if (S.getLangOpts().CUDA)		if (S.getLangOpts().CUDA &&
if (FunctionDecl *Caller = dyn_cast<FunctionDecl>(S.CurContext))		!S.isCUDACallAllowed(FunDecl, Sema::SkipImplicitCaller))
if (!Caller->isImplicit() && !S.IsAllowedCUDACall(Caller, FunDecl))
return false;		return false;
if (FunDecl->isMultiVersion()) {		if (FunDecl->isMultiVersion()) {
const auto *TA = FunDecl->getAttr<TargetAttr>();		const auto *TA = FunDecl->getAttr<TargetAttr>();
if (TA && !TA->isDefaultVersion())		if (TA && !TA->isDefaultVersion())
return false;		return false;
}		}

// If any candidate has a placeholder return type, trigger its deduction		// If any candidate has a placeholder return type, trigger its deduction
// now.		// now.
[97 lines not shown]	for (unsigned I = 0, N = Matches.size(); I != N; ) {
++I;		++I;
else {		else {
Matches[I] = Matches[--N];		Matches[I] = Matches[--N];
Matches.resize(N);		Matches.resize(N);
}		}
}		}
}		}

void EliminateSuboptimalCudaMatches() {		void EliminateSuboptimalCudaMatches() { S.EraseUnwantedCUDAMatches(Matches); }
S.EraseUnwantedCUDAMatches(dyn_cast<FunctionDecl>(S.CurContext), Matches);
}

public:		public:
void ComplainNoMatchesFound() const {		void ComplainNoMatchesFound() const {
assert(Matches.empty());		assert(Matches.empty());
S.Diag(OvlExpr->getBeginLoc(), diag::err_addr_ovl_no_viable)		S.Diag(OvlExpr->getBeginLoc(), diag::err_addr_ovl_no_viable)
<< OvlExpr->getName() << TargetFunctionType		<< OvlExpr->getName() << TargetFunctionType
<< OvlExpr->getSourceRange();		<< OvlExpr->getSourceRange();
if (FailedCandidates.empty())		if (FailedCandidates.empty())
[2,872 lines not shown]

clang/test/SemaCUDA/function-overload.cu

	[208 lines not shown]
	#if defined(__CUDA_ARCH__)			#if defined(__CUDA_ARCH__)
	// expected-error@-2 {{reference to __global__ function 'g' in __host__ __device__ function}}			// expected-error@-2 {{reference to __global__ function 'g' in __host__ __device__ function}}
	#endif			#endif
	}			}

	// Test for address of overloaded function resolution in the global context.			// Test for address of overloaded function resolution in the global context.
	HostFnPtr fp_h = h;			HostFnPtr fp_h = h;
	HostFnPtr fp_ch = ch;			HostFnPtr fp_ch = ch;
				#if !defined(__CUDA_ARCH__)
	CurrentFnPtr fp_dh = dh;			CurrentFnPtr fp_dh = dh;
	CurrentFnPtr fp_cdh = cdh;			CurrentFnPtr fp_cdh = cdh;
				#endif
	GlobalFnPtr fp_g = g;			GlobalFnPtr fp_g = g;


	// Test overloading of destructors			// Test overloading of destructors
	// Can't mix H and unattributed destructors			// Can't mix H and unattributed destructors
	struct d_h {			struct d_h {
	~d_h() {} // expected-note {{previous definition is here}}			~d_h() {} // expected-note {{previous definition is here}}
	__host__ ~d_h() {} // expected-error {{destructor cannot be redeclared}}			__host__ ~d_h() {} // expected-error {{destructor cannot be redeclared}}
	[221 lines not shown]
	// Verify that function overloading doesn't prune candidate wrongly.			// Verify that function overloading doesn't prune candidate wrongly.
	int test_constexpr_overload(C2 &x, C2 &y) {			int test_constexpr_overload(C2 &x, C2 &y) {
	return constexpr_overload(x, y);			return constexpr_overload(x, y);
	}			}

	// Verify no ambiguity for new operator.			// Verify no ambiguity for new operator.
	void *a = new int;			void *a = new int;
	__device__ void *b = new int;			__device__ void *b = new int;
	// expected-error@-1{{dynamic initialization is not supported for __device__, __constant__, and __shared__ variables.}}			// expected-error@-1{{dynamic initialization is not supported for __device__, __constant__, and __shared__ variables.}}
				tra (inline comment, Done): I'd add more details here. The problem is that the overload set here contains both functions, and the one with the integer argument wins even though it is a device function that we can't execute. We do handle similar cases during overload resolution in other places, where we prefer a callable function over a non-callable function with a better signature match.

	// Verify no ambiguity for new operator.			// Verify no ambiguity for new operator.
	template<typename _Tp> _Tp&& f();			template<typename _Tp> _Tp&& f();
	template<typename _Tp, typename = decltype(new _Tp(f<_Tp>()))>			template<typename _Tp, typename = decltype(new _Tp(f<_Tp>()))>
	void __test();			void __test();

	void foo() {			void foo() {
	__test<int>();			__test<int>();
	}			}

				// Overload resolution in a global initializer should follow the same rule
				// as elsewhere: a callable function is preferred over a non-callable
				// function with a better signature match. In this test case, the device
				// function matches the integer argument exactly, but it cannot be executed
				// from a host initializer.

				__device__ float fn(int);
				__host__ float fn(float);

				float gvar1 = fn(1);

				__device__ float dev_only_fn(int);
				// expected-note@-1 {{candidate function not viable: call to __device__ function from __host__ function}}

				float gvar2 = dev_only_fn(1); // expected-error {{no matching function for call to 'dev_only_fn'}}

				#ifdef __CUDA_ARCH__
				__device__ DeviceReturnTy gvar3 = template_vs_function(1.f);
				// expected-error@-1 {{dynamic initialization is not supported for __device__, __constant__, and __shared__ variables.}}
				__device__ int gvar4 = template_overload(1);
				// expected-error@-1 {{dynamic initialization is not supported for __device__, __constant__, and __shared__ variables.}}
				#else
				TemplateReturnTy gvar3 = template_vs_function(2.f);
				int gvar4 = template_overload(1);
				#endif

clang/test/SemaCUDA/global-initializers-host.cu

	// RUN: %clang_cc1 %s --std=c++11 -triple x86_64-linux-unknown -fsyntax-only -o - -verify			// RUN: %clang_cc1 %s --std=c++11 -triple x86_64-linux-unknown -fsyntax-only -o - -verify

	#include "Inputs/cuda.h"			#include "Inputs/cuda.h"

	// Check that we get an error if we try to call a __device__ function from a			// Check that we get an error if we try to call a __device__ function from a
	// module initializer.			// module initializer.

	struct S {			struct S {
				// expected-note@-1 {{candidate constructor (the implicit copy constructor) not viable: requires 1 argument, but 0 were provided}}
				// expected-note@-2 {{candidate constructor (the implicit move constructor) not viable: requires 1 argument, but 0 were provided}}
	__device__ S() {}			__device__ S() {}
	// expected-note@-1 {{'S' declared here}}			// expected-note@-1 {{candidate constructor not viable: call to __device__ function from __host__ function}}
	};			};

	S s;			S s;
	// expected-error@-1 {{reference to __device__ function 'S' in global initializer}}			// expected-error@-1 {{no matching constructor for initialization of 'S'}}

	struct T {			struct T {
	__host__ __device__ T() {}			__host__ __device__ T() {}
	};			};
	T t; // No error, this is OK.			T t; // No error, this is OK.

	struct U {			struct U {
				// expected-note@-1 {{candidate constructor (the implicit copy constructor) not viable: no known conversion from 'int' to 'const U' for 1st argument}}
				// expected-note@-2 {{candidate constructor (the implicit move constructor) not viable: no known conversion from 'int' to 'U' for 1st argument}}
	__host__ U() {}			__host__ U() {}
				// expected-note@-1 {{candidate constructor not viable: requires 0 arguments, but 1 was provided}}
	__device__ U(int) {}			__device__ U(int) {}
	// expected-note@-1 {{'U' declared here}}			// expected-note@-1 {{candidate constructor not viable: call to __device__ function from __host__ function}}
	};			};
	U u(42);			U u(42);
	// expected-error@-1 {{reference to __device__ function 'U' in global initializer}}			// expected-error@-1 {{no matching constructor for initialization of 'U'}}

	__device__ int device_fn() { return 42; }			__device__ int device_fn() { return 42; }
	// expected-note@-1 {{'device_fn' declared here}}			// expected-note@-1 {{candidate function not viable: call to __device__ function from __host__ function}}
	int n = device_fn();			int n = device_fn();
	// expected-error@-1 {{reference to __device__ function 'device_fn' in global initializer}}			// expected-error@-1 {{no matching function for call to 'device_fn'}}

Revision Contents

Diff 261003
