This is an archive of the discontinued LLVM Phabricator instance.

I've added more tests for different code paths leading to the kernel call. Interestingly enough, only the a0 variant actually calls BuildCallToMemberFunction. The other variants go through code paths that handle the call without it.

As for the codegen, all kernel calls that make it to clang::Sema::BuildResolvedCallExpr with launch config are handled the same way. I don't think codegen tests will buy us much.

Harbormaster completed remote builds in B121521: Diff 369140. Aug 27 2021, 11:24 AM

I am concerned that there may be more places which need handling, and passing exec config expr by function arguments may not scale. Is it possible to represent the kernel call expr by a derived class of call expr and add the exec config expr as member to it?
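For readers unfamiliar with the suggestion, here is a minimal toy sketch of a call node that owns its launch configuration as a member. Every name in it (ToyExpr, ToyCallExpr, ToyKernelCallExpr, ToyLaunchConfig, isKernelLaunch) is hypothetical and made up for illustration; none of this is clang code, and it does not model overload resolution.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Toy stand-ins for AST nodes; these are NOT clang's real classes.
struct ToyExpr {
  virtual ~ToyExpr() = default;
};

// Hypothetical model of the <<<grid, block>>> part of a kernel call.
struct ToyLaunchConfig : ToyExpr {
  int Grid, Block;
  ToyLaunchConfig(int G, int B) : Grid(G), Block(B) {}
};

struct ToyCallExpr : ToyExpr {
  std::string Callee;
  explicit ToyCallExpr(std::string C) : Callee(std::move(C)) {}
  // Ordinary calls carry no launch configuration.
  virtual const ToyLaunchConfig *getConfig() const { return nullptr; }
};

// Derived node that stores the launch configuration as a member, so code
// that receives the node does not need a separate config argument.
struct ToyKernelCallExpr : ToyCallExpr {
  std::unique_ptr<ToyLaunchConfig> Config;
  ToyKernelCallExpr(std::string C, std::unique_ptr<ToyLaunchConfig> Cfg)
      : ToyCallExpr(std::move(C)), Config(std::move(Cfg)) {
    assert(Config && "a kernel call always needs a launch configuration");
  }
  const ToyLaunchConfig *getConfig() const override { return Config.get(); }
};

// True if the node is a kernel launch, i.e. it carries a configuration.
inline bool isKernelLaunch(const ToyCallExpr &E) {
  return E.getConfig() != nullptr;
}
```

In this toy model the configuration travels with the node, so an intermediate function cannot drop it by omitting an argument. The trade-off is that the node type must be known before the call is resolved.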

In D108787#2999943, @yaxunl wrote:

I am concerned that there may be more places which need handling, and passing exec config expr by function arguments may not scale.
Is it possible to represent the kernel call expr by a derived class of call expr and add the exec config expr as member to it?

I don't think it's worth it.

This config pass-through code has been around from the very early days of attempting to implement CUDA and we're already passing it around during call resolution.
AFAICT, this particular place was a relatively new addition which didn't implement the pass-through of the config.

While there may be other places where a similar issue could arise in the future (or already exists as a corner case we haven't found yet), if/when we run into it, it will be diagnosed, as it was in this case.
It took us a few years to run into this one. I'm pretty sure that this particular code path is pretty rare and the patch is not going to have a measurable impact on compiler performance.
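The pass-through problem discussed here can be illustrated with a small toy model. This is a sketch under stated assumptions, not the clang implementation: ToyConfig, ToyCall, buildResolvedCall, buildMemberCallBefore and buildMemberCallAfter are hypothetical names. The only thing borrowed from the patch is the pattern, where a final step takes the configuration as a defaulted parameter and an intermediate step does or does not forward it.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a launch configuration (<<<grid, block>>>).
struct ToyConfig {
  int Grid;
  int Block;
};

// Hypothetical result of call resolution.
struct ToyCall {
  std::string Callee;
  const ToyConfig *Config; // nullptr means "ordinary call"
};

// Final step: records whatever configuration reached it. The parameter is
// defaulted, so a caller that forgets it still compiles.
inline ToyCall buildResolvedCall(const std::string &Callee,
                                 const ToyConfig *Config = nullptr) {
  return ToyCall{Callee, Config};
}

// Intermediate step before the fix: it has no Config parameter, so the
// final step falls back to its default and the configuration is lost.
inline ToyCall buildMemberCallBefore(const std::string &Callee) {
  return buildResolvedCall(Callee);
}

// Intermediate step after the fix: the configuration is forwarded.
inline ToyCall buildMemberCallAfter(const std::string &Callee,
                                    const ToyConfig *Config = nullptr) {
  return buildResolvedCall(Callee, Config);
}
```

Because the parameter is defaulted, the "before" variant compiles without complaint. In the toy model the loss only shows up when something downstream checks whether a kernel call actually carries a configuration.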

LGTM. Thanks.

This revision was landed with ongoing or failed builds. Sep 16 2021, 11:19 AM

Closed by commit rG6b20ea696356: [CUDA] Pass ExecConfig through BuildCallToMemberFunction (authored by tra).

This revision was automatically updated to reflect the committed changes.

tra added a commit: rG6b20ea696356: [CUDA] Pass ExecConfig through BuildCallToMemberFunction.

Revision Contents

Path	Size
clang/include/clang/Sema/Sema.h	2 lines
clang/lib/Sema/SemaExpr.cpp	6 lines
clang/lib/Sema/SemaOverload.cpp	5 lines
clang/test/SemaCUDA/kernel-call.cu	31 lines

Diff 369140

clang/include/clang/Sema/Sema.h

//===--- Sema.h - Semantic Analysis & AST Building --------------*- C++ -*-===//		//===--- Sema.h - Semantic Analysis & AST Building --------------*- C++ -*-===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
//		//
// This file defines the Sema class, which performs semantic analysis and		// This file defines the Sema class, which performs semantic analysis and
[...]
ExprResult BuildSynthesizedThreeWayComparison(SourceLocation OpLoc,		ExprResult BuildSynthesizedThreeWayComparison(SourceLocation OpLoc,
const UnresolvedSetImpl &Fns,		const UnresolvedSetImpl &Fns,
Expr *LHS, Expr *RHS,		Expr *LHS, Expr *RHS,
FunctionDecl *DefaultedFn);		FunctionDecl *DefaultedFn);

ExprResult CreateOverloadedArraySubscriptExpr(SourceLocation LLoc,		ExprResult CreateOverloadedArraySubscriptExpr(SourceLocation LLoc,
SourceLocation RLoc,		SourceLocation RLoc,
Expr *Base,Expr *Idx);		Expr *Base,Expr *Idx);

ExprResult BuildCallToMemberFunction(Scope *S, Expr *MemExpr,		ExprResult BuildCallToMemberFunction(Scope *S, Expr *MemExpr,
SourceLocation LParenLoc,		SourceLocation LParenLoc,
MultiExprArg Args,		MultiExprArg Args,
SourceLocation RParenLoc,		SourceLocation RParenLoc,
		Expr *ExecConfig = nullptr,
		bool IsExecConfig = false,
bool AllowRecovery = false);		bool AllowRecovery = false);
ExprResult		ExprResult
BuildCallToObjectOfClassType(Scope *S, Expr *Object, SourceLocation LParenLoc,		BuildCallToObjectOfClassType(Scope *S, Expr *Object, SourceLocation LParenLoc,
MultiExprArg Args,		MultiExprArg Args,
SourceLocation RParenLoc);		SourceLocation RParenLoc);

ExprResult BuildOverloadedArrowExpr(Scope *S, Expr *Base,		ExprResult BuildOverloadedArrowExpr(Scope *S, Expr *Base,
SourceLocation OpLoc,		SourceLocation OpLoc,

clang/lib/Sema/SemaExpr.cpp

[...]
if (getLangOpts().CPlusPlus) {		if (getLangOpts().CPlusPlus) {
if (Fn->getType() == Context.UnknownAnyTy) {		if (Fn->getType() == Context.UnknownAnyTy) {
ExprResult result = rebuildUnknownAnyFunction(*this, Fn);		ExprResult result = rebuildUnknownAnyFunction(*this, Fn);
if (result.isInvalid()) return ExprError();		if (result.isInvalid()) return ExprError();
Fn = result.get();		Fn = result.get();
}		}

if (Fn->getType() == Context.BoundMemberTy) {		if (Fn->getType() == Context.BoundMemberTy) {
return BuildCallToMemberFunction(Scope, Fn, LParenLoc, ArgExprs,		return BuildCallToMemberFunction(Scope, Fn, LParenLoc, ArgExprs,
RParenLoc, AllowRecovery);		RParenLoc, ExecConfig, IsExecConfig,
		AllowRecovery);
}		}
}		}

// Check for overloaded calls. This can happen even in C due to extensions.		// Check for overloaded calls. This can happen even in C due to extensions.
if (Fn->getType() == Context.OverloadTy) {		if (Fn->getType() == Context.OverloadTy) {
OverloadExpr::FindResult find = OverloadExpr::find(Fn);		OverloadExpr::FindResult find = OverloadExpr::find(Fn);

// We aren't supposed to apply this logic if there's an '&' involved.		// We aren't supposed to apply this logic if there's an '&' involved.
if (!find.HasFormOfMemberPointer) {		if (!find.HasFormOfMemberPointer) {
if (Expr::hasAnyTypeDependentArguments(ArgExprs))		if (Expr::hasAnyTypeDependentArguments(ArgExprs))
return CallExpr::Create(Context, Fn, ArgExprs, Context.DependentTy,		return CallExpr::Create(Context, Fn, ArgExprs, Context.DependentTy,
VK_PRValue, RParenLoc, CurFPFeatureOverrides());		VK_PRValue, RParenLoc, CurFPFeatureOverrides());
OverloadExpr *ovl = find.Expression;		OverloadExpr *ovl = find.Expression;
if (UnresolvedLookupExpr *ULE = dyn_cast<UnresolvedLookupExpr>(ovl))		if (UnresolvedLookupExpr *ULE = dyn_cast<UnresolvedLookupExpr>(ovl))
return BuildOverloadedCallExpr(		return BuildOverloadedCallExpr(
Scope, Fn, ULE, LParenLoc, ArgExprs, RParenLoc, ExecConfig,		Scope, Fn, ULE, LParenLoc, ArgExprs, RParenLoc, ExecConfig,
/*AllowTypoCorrection=*/true, find.IsAddressOfOperand);		/*AllowTypoCorrection=*/true, find.IsAddressOfOperand);
return BuildCallToMemberFunction(Scope, Fn, LParenLoc, ArgExprs,		return BuildCallToMemberFunction(Scope, Fn, LParenLoc, ArgExprs,
RParenLoc, AllowRecovery);		RParenLoc, ExecConfig, IsExecConfig,
		AllowRecovery);
}		}
}		}

// If we're directly calling a function, get the appropriate declaration.		// If we're directly calling a function, get the appropriate declaration.
if (Fn->getType() == Context.UnknownAnyTy) {		if (Fn->getType() == Context.UnknownAnyTy) {
ExprResult result = rebuildUnknownAnyFunction(*this, Fn);		ExprResult result = rebuildUnknownAnyFunction(*this, Fn);
if (result.isInvalid()) return ExprError();		if (result.isInvalid()) return ExprError();
Fn = result.get();		Fn = result.get();

clang/lib/Sema/SemaOverload.cpp

/// arguments to the function call (not including the object		/// arguments to the function call (not including the object
/// parameter). The caller needs to validate that the member		/// parameter). The caller needs to validate that the member
/// expression refers to a non-static member function or an overloaded		/// expression refers to a non-static member function or an overloaded
/// member function.		/// member function.
ExprResult Sema::BuildCallToMemberFunction(Scope *S, Expr *MemExprE,		ExprResult Sema::BuildCallToMemberFunction(Scope *S, Expr *MemExprE,
SourceLocation LParenLoc,		SourceLocation LParenLoc,
MultiExprArg Args,		MultiExprArg Args,
SourceLocation RParenLoc,		SourceLocation RParenLoc,
		Expr *ExecConfig, bool IsExecConfig,
bool AllowRecovery) {		bool AllowRecovery) {
assert(MemExprE->getType() == Context.BoundMemberTy ||		assert(MemExprE->getType() == Context.BoundMemberTy ||
MemExprE->getType() == Context.OverloadTy);		MemExprE->getType() == Context.OverloadTy);

// Dig out the member expression. This holds both the object		// Dig out the member expression. This holds both the object
// argument and the member function we're referring to.		// argument and the member function we're referring to.
Expr *NakedMemExpr = MemExprE->IgnoreParens();		Expr *NakedMemExpr = MemExprE->IgnoreParens();

[...]
if (isa<MemberExpr>(NakedMemExpr)) {		if (isa<MemberExpr>(NakedMemExpr)) {
if (!Succeeded)		if (!Succeeded)
return BuildRecoveryExpr(chooseRecoveryType(CandidateSet, &Best));		return BuildRecoveryExpr(chooseRecoveryType(CandidateSet, &Best));

MemExprE = FixOverloadedFunctionReference(MemExprE, FoundDecl, Method);		MemExprE = FixOverloadedFunctionReference(MemExprE, FoundDecl, Method);

// If overload resolution picked a static member, build a		// If overload resolution picked a static member, build a
// non-member call based on that function.		// non-member call based on that function.
if (Method->isStatic()) {		if (Method->isStatic()) {
return BuildResolvedCallExpr(MemExprE, Method, LParenLoc, Args,		return BuildResolvedCallExpr(MemExprE, Method, LParenLoc, Args, RParenLoc,
RParenLoc);		ExecConfig, IsExecConfig);
}		}

MemExpr = cast<MemberExpr>(MemExprE->IgnoreParens());		MemExpr = cast<MemberExpr>(MemExprE->IgnoreParens());
}		}

QualType ResultType = Method->getReturnType();		QualType ResultType = Method->getReturnType();
ExprValueKind VK = Expr::getValueKindForType(ResultType);		ExprValueKind VK = Expr::getValueKindForType(ResultType);
ResultType = ResultType.getNonLValueExprType(Context);		ResultType = ResultType.getNonLValueExprType(Context);

clang/test/SemaCUDA/kernel-call.cu

[...]
int main(void) {		int main(void) {

h1<<<1, 1>>>(42); // expected-error {{kernel call to non-global function 'h1'}}		h1<<<1, 1>>>(42); // expected-error {{kernel call to non-global function 'h1'}}

int (*fp)(int) = h2;		int (*fp)(int) = h2;
fp<<<1, 1>>>(42); // expected-error {{must have void return type}}		fp<<<1, 1>>>(42); // expected-error {{must have void return type}}

g1<<<undeclared, 1>>>(42); // expected-error {{use of undeclared identifier 'undeclared'}}		g1<<<undeclared, 1>>>(42); // expected-error {{use of undeclared identifier 'undeclared'}}
}		}

		// Make sure we can call static member kernels.
		template <typename > struct a0 {
		template <typename T> static __global__ void Call(T);
		};
		struct a1 {
		template <typename T> static __global__ void Call(T);
		};
		template <typename T> struct a2 {
		static __global__ void Call(T);
		};
		struct a3 {
		static __global__ void Call(int);
		static __global__ void Call(void*);
		};

		struct b {
		template <typename c> void d0(c arg) {
		a0<c>::Call<<<0, 0>>>(arg);
		a1::Call<<<0,0>>>(arg);
		a2<c>::Call<<<0,0>>>(arg);
		a3::Call<<<0, 0>>>(arg);
		}
		void d1(void* arg) {
		a0<void*>::Call<<<0, 0>>>(arg);
		a1::Call<<<0,0>>>(arg);
		a2<void*>::Call<<<0,0>>>(arg);
		a3::Call<<<0, 0>>>(arg);
		}
		void e() { d0(1); }
		};

[CUDA] Pass ExecConfig through BuildCallToMemberFunction (Closed, Public)