This is an archive of the discontinued LLVM Phabricator instance.

Differential D68818

[hip][cuda] Fix the extended lambda name mangling issue.
ClosedPublic

Authored by hliao on Oct 10 2019, 10:21 AM.

Details

Reviewers

tra
rsmith
yaxunl
martong
shafik
rjmccall

Commits

rG5392566a34f6: CP 243ebfba17da72566ba29a891193e4814cbc4ef3 and revert unnecessary changes.
rG243ebfba17da: [hip][cuda] Fix the extended lambda name mangling issue.
rL375309: [hip][cuda] Fix the extended lambda name mangling issue.
rC375309: [hip][cuda] Fix the extended lambda name mangling issue.

Summary

The HIP/CUDA host side needs to use the device kernel symbol name to match the device-side binaries. Without consistent naming between host- and device-side compilations, there is a risk that the wrong device binaries are executed. Consistent naming is usually not an issue until unnamed types are used, especially lambdas. In this patch, consistent name mangling is addressed for extended lambdas, i.e. lambdas annotated with __device__.
In the [Itanium C++ ABI][1], the mangling of lambdas is generally unspecified unless, in certain cases, the ODR requires consistent naming across TUs. The extended lambda is such a case, as its name may be part of a device kernel function name, e.g., when the extended lambda is used as a template argument. Thus, we need to force ODR naming for extended lambdas, as they are referenced in both device- and host-side TUs. Furthermore, if an extended lambda is nested in other (extended or not) lambdas, those lambdas are required to follow ODR naming as well. This patch revises the current lambda mangle numbering to force ODR naming from an extended lambda to all its parent lambdas.
On the other hand, the aforementioned ODR naming should not change those lambdas' original linkage, i.e., we cannot replace the original internal linkage with linkonce_odr; otherwise, we may violate the ODR in general. This patch introduces a new field, HasKnownInternalLinkage, in the lambda data to decouple the linkage calculation from the mangling number assigned.

[1]: https://itanium-cxx-abi.github.io/cxx-abi/abi.html
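
The decoupling described in the summary can be sketched in plain C++. This is a simplified model, not the Clang implementation: `LambdaInfo`, `LinkageKind`, and both functions are hypothetical names, and the real code derives the non-internal case from the enclosing context instead of returning a fixed value.

```cpp
// Hypothetical, simplified stand-in for the per-lambda data in the summary.
struct LambdaInfo {
  unsigned ManglingNumber;       // 0 means the lambda was never numbered
  bool HasKnownInternalLinkage;  // set when numbering was forced for naming only
};

// Simplified result: either internal, or decided by the enclosing context.
enum class LinkageKind { Internal, FromEnclosingContext };

// Old rule as described: a zero mangling number is the only sign of
// internal linkage, so numbering a lambda also changes its linkage.
LinkageKind linkageFromNumberOnly(const LambdaInfo &L) {
  return L.ManglingNumber == 0 ? LinkageKind::Internal
                               : LinkageKind::FromEnclosingContext;
}

// New rule as described: the flag keeps a numbered lambda internal, so the
// number can serve naming without affecting linkage.
LinkageKind linkageWithFlag(const LambdaInfo &L) {
  if (L.HasKnownInternalLinkage || L.ManglingNumber == 0)
    return LinkageKind::Internal;
  return LinkageKind::FromEnclosingContext;
}
```

Under this model, a lambda that is numbered only for host/device naming consistency would carry a non-zero number with the flag set, and it would still be reported as internal.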

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

hliao created this revision. Oct 10 2019, 10:21 AM

Herald added a reviewer: martong. Oct 10 2019, 10:21 AM

Herald added a reviewer: shafik.

Herald added a project: Restricted Project.

Herald added a subscriber: cfe-commits.

Harbormaster completed remote builds in B39343: Diff 224400. Oct 10 2019, 10:23 AM

Herald added a subscriber: rnkovacs. Oct 10 2019, 10:23 AM

This patch addresses the same issue that was previously proposed to be worked around in https://reviews.llvm.org/D63164

minor comment to help review.

BTW, as the AST serialization/deserialization is changed as well, I am not sure whether we have a compatibility issue and, if there is one, how to handle it. Please advise me on that concern. Thanks.

clang/include/clang/AST/DeclCXX.h
390–394	Try to avoid inflating the memory footprint by borrowing one bit from `NumExplicitCaptures`. It seems to me that we won't have that large a number (8192) of explicit captures.
clang/lib/Sema/SemaLambda.cpp
432	extracted from `startLambdaDefinition` so that we could have a single place to handle mangling numbers based on CUDA attributes.

PING for review, thanks

PING for review

@rsmith Richard, could you take a look, please? Lambdas, mangling, ODR rules & ABI scare me. :-)

In D68818#1709688, @tra wrote:

@rsmith Richard, could you take a look, please? Lambdas, mangling, ODR rules & ABI scare me. :-)

@tra thanks for promoting the review. This patch is quite critical to support extended lambda in clang. We have several workloads that have this mangle numbering issue.

PING for review

rsmith added a reviewer: rjmccall. Oct 16 2019, 12:23 PM

Broadly, I think it's reasonable to number additional lambda expressions in CUDA compilations. However:

This is (in theory) an ABI break on the host side, as it changes the lambda numbering in inline functions and function templates and the like. That could be mitigated by using a different numbering sequence for the lambdas that are only numbered for this reason.
Depending on whether the call operator is a device function is unstable. If I understand the CUDA rules correctly, then in practice, because constexpr functions are implicitly `__host__ __device__`, all lambdas will get numbered in CUDA on C++14 onwards but not in CUDA on C++11, and we generally want those modes to be ABI-compatible. I'd suggest you simplify and stabilize this by simply numbering all lambdas in CUDA mode.

clang/lib/Sema/SemaLambda.cpp
477	What happens if there are other enclosing constructs that would give the lambda internal linkage (eg, an anonymous namespace or static function that might collide with one in another translation unit, or a declaration involving a type with internal linkage)? Presumably you can still suffer from mangling collisions in those cases, at least if you link together multiple translation units containing device code. Do we need (something like) unique identifiers for device code TUs to use in manglings of ostensibly internal linkage entities?

In D68818#1711698, @rsmith wrote:

Broadly, I think it's reasonable to number additional lambda expressions in CUDA compilations. However:

This is (in theory) an ABI break on the host side, as it changes the lambda numbering in inline functions and function templates and the like. That could be mitigated by using a different numbering sequence for the lambdas that are only numbered for this reason.

Depending on whether the call operator is a device function is unstable. If I understand the CUDA rules correctly, then in practice, because constexpr functions are implicitly `__host__ __device__`, all lambdas will get numbered in CUDA on C++14 onwards but not in CUDA on C++11, and we generally want those modes to be ABI-compatible. I'd suggest you simplify and stabilize this by simply numbering all lambdas in CUDA mode.

I vote for numbering all lambdas in CUDA mode, as once addressed in D63164, as long as we decouple the numbering from the linkage. Once all lambdas are numbered, the first concern is also mitigated. I will prepare the changes forcing all lambdas to be numbered under CUDA/HIP mode.

clang/lib/Sema/SemaLambda.cpp
477	Those unnamed ones will be addressed in later patches as, currently, we only have lambda numbering issues from real workloads. I am preparing changes to address other unnamed types, namespaces, etc.

I agree with Richard's suggestion of just numbering all the lambdas in both modes if that's viable.

clang/include/clang/AST/DeclCXX.h
390–394	If this field is necessary, please borrow its space from `ManglingNumber` below rather than the number of explicit captures.
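
The bit-borrowing suggestion above can be illustrated with a small layout sketch. It is only a model: `LambdaBitsBefore`, `LambdaBitsAfter`, `SameFootprint`, `MaxManglingNumber`, and `setManglingNumber` are hypothetical names, only the bit-fields visible in the diff are reproduced, and bit-field layout is implementation-defined, so the size comparison assumes a 32-bit `unsigned` and a mainstream ABI.

```cpp
#include <cassert>

// Layout before the change: a full unsigned holds the mangling number.
struct LambdaBitsBefore {
  unsigned CaptureDefault : 2;
  unsigned NumCaptures : 15;
  unsigned NumExplicitCaptures : 13;
  unsigned ManglingNumber;
};

// Layout after the change: the new flag exists, NumExplicitCaptures keeps
// its 13 bits, and ManglingNumber is narrowed to 31 bits.
struct LambdaBitsAfter {
  unsigned CaptureDefault : 2;
  unsigned NumCaptures : 15;
  unsigned NumExplicitCaptures : 13;
  unsigned HasKnownInternalLinkage : 1;
  unsigned ManglingNumber : 31;
};

// True when the extra flag does not grow the struct (assumed, not guaranteed).
constexpr bool SameFootprint =
    sizeof(LambdaBitsAfter) == sizeof(LambdaBitsBefore);

// Largest value a 31-bit field can hold.
constexpr unsigned MaxManglingNumber = (1u << 31) - 1;

// Store a mangling number, rejecting values that would be truncated.
inline void setManglingNumber(LambdaBitsAfter &Bits, unsigned N) {
  assert(N <= MaxManglingNumber && "mangling number does not fit in 31 bits");
  Bits.ManglingNumber = N;
}
```

The trade-off in this sketch is that the mangling number loses one bit of range while the explicit-capture count keeps all 13 bits.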

Force numbering on all lambdas in CUDA/HIP.
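
The forced numbering this update introduces can be modeled as follows. This is a sketch under assumptions: `NumberingContext`, `NumberedLambda`, and `numberLambda` are hypothetical names, and keying the counter on a signature string is a simplification of how a real numbering context tells lambdas apart.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical numbering context: hands out 1, 2, 3, ... per signature key.
struct NumberingContext {
  std::map<std::string, unsigned> LastBySignature;
  unsigned next(const std::string &Signature) {
    return ++LastBySignature[Signature];
  }
};

// Result of numbering one lambda in this model.
struct NumberedLambda {
  unsigned ManglingNumber;       // 0 means the lambda was left unnumbered
  bool HasKnownInternalLinkage;  // true when numbering was forced
};

// AbiCtx is non-null when the language rules already require a number.
// In CUDA/HIP mode a missing context is replaced by FallbackCtx, and the
// lambda is marked as internal so that its linkage does not change.
inline NumberedLambda numberLambda(NumberingContext *AbiCtx,
                                   NumberingContext &FallbackCtx,
                                   bool CudaMode,
                                   const std::string &Signature) {
  NumberingContext *Ctx = AbiCtx;
  bool Forced = false;
  if (!Ctx && CudaMode) {
    Ctx = &FallbackCtx;
    assert(Ctx && "fallback numbering context must exist");
    Forced = true;
  }
  if (!Ctx)
    return {0, false};  // non-CUDA, no ABI requirement: stay unnumbered
  return {Ctx->next(Signature), Forced};
}
```

In this model every lambda in CUDA/HIP mode receives a number regardless of its call operator's attributes, which is the stability property requested in the review.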

hliao marked an inline comment as done. Oct 17 2019, 9:44 AM

Harbormaster completed remote builds in B39734: Diff 225452. Oct 17 2019, 9:46 AM

PING for review

minor test case revising.

Harbormaster completed remote builds in B39803: Diff 225678. Oct 18 2019, 12:40 PM

rsmith accepted this revision. Oct 18 2019, 12:54 PM

rsmith added inline comments.

clang/include/clang/AST/DeclCXX.h
1713	This function name should mention that it's only applicable to lambdas.

This revision is now accepted and ready to land. Oct 18 2019, 12:54 PM

Closed by commit rG243ebfba17da: [hip][cuda] Fix the extended lambda name mangling issue. (authored by hliao). Oct 18 2019, 5:19 PM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path	Size
clang/include/clang/AST/DeclCXX.h	29 lines
clang/include/clang/Sema/Sema.h	17 lines
clang/lib/AST/ASTImporter.cpp	3 lines
clang/lib/AST/Decl.cpp	6 lines
clang/lib/Sema/SemaLambda.cpp	73 lines
clang/lib/Sema/TreeTransform.h	12 lines
clang/lib/Serialization/ASTReaderDecl.cpp	1 line
clang/lib/Serialization/ASTWriter.cpp	1 line
clang/test/CodeGenCUDA/unnamed-types.cu	39 lines

Diff 225720

clang/include/clang/AST/DeclCXX.h

struct LambdaDefinitionData : public DefinitionData {

/// The Default Capture.		/// The Default Capture.
unsigned CaptureDefault : 2;		unsigned CaptureDefault : 2;

/// The number of captures in this lambda is limited 2^NumCaptures.		/// The number of captures in this lambda is limited 2^NumCaptures.
unsigned NumCaptures : 15;		unsigned NumCaptures : 15;

/// The number of explicit captures in this lambda.		/// The number of explicit captures in this lambda.
unsigned NumExplicitCaptures : 13;		unsigned NumExplicitCaptures : 13;

		/// Has known `internal` linkage.
		unsigned HasKnownInternalLinkage : 1;

		hliao (Author, Done): Try to avoid inflating the memory footprint by borrowing one bit from `NumExplicitCaptures`. It seems to me that we won't have that large a number (8192) of explicit captures.
		rjmccall (Done): If this field is necessary, please borrow its space from `ManglingNumber` below rather than the number of explicit captures.
/// The number used to indicate this lambda expression for name		/// The number used to indicate this lambda expression for name
/// mangling in the Itanium C++ ABI.		/// mangling in the Itanium C++ ABI.
unsigned ManglingNumber = 0;		unsigned ManglingNumber : 31;

/// The declaration that provides context for this lambda, if the		/// The declaration that provides context for this lambda, if the
/// actual DeclContext does not suffice. This is used for lambdas that		/// actual DeclContext does not suffice. This is used for lambdas that
/// occur within default arguments of function parameters within the class		/// occur within default arguments of function parameters within the class
/// or within a data member initializer.		/// or within a data member initializer.
LazyDeclPtr ContextDecl;		LazyDeclPtr ContextDecl;

/// The list of captures, both explicit and implicit, for this		/// The list of captures, both explicit and implicit, for this
/// lambda.		/// lambda.
Capture *Captures = nullptr;		Capture *Captures = nullptr;

/// The type of the call method.		/// The type of the call method.
TypeSourceInfo *MethodTyInfo;		TypeSourceInfo *MethodTyInfo;

LambdaDefinitionData(CXXRecordDecl *D, TypeSourceInfo *Info,		LambdaDefinitionData(CXXRecordDecl *D, TypeSourceInfo *Info, bool Dependent,
bool Dependent, bool IsGeneric,		bool IsGeneric, LambdaCaptureDefault CaptureDefault)
LambdaCaptureDefault CaptureDefault)
: DefinitionData(D), Dependent(Dependent), IsGenericLambda(IsGeneric),		: DefinitionData(D), Dependent(Dependent), IsGenericLambda(IsGeneric),
CaptureDefault(CaptureDefault), NumCaptures(0), NumExplicitCaptures(0),		CaptureDefault(CaptureDefault), NumCaptures(0),
		NumExplicitCaptures(0), HasKnownInternalLinkage(0), ManglingNumber(0),
MethodTyInfo(Info) {		MethodTyInfo(Info) {
IsLambda = true;		IsLambda = true;

// C++1z [expr.prim.lambda]p4:		// C++1z [expr.prim.lambda]p4:
// This class type is not an aggregate type.		// This class type is not an aggregate type.
Aggregate = false;		Aggregate = false;
PlainOldData = false;		PlainOldData = false;
}		}
};		};
▲ Show 20 Lines • Show All 1,277 Lines • ▼ Show 20 Lines
/// Zero indicates that this closure type has internal linkage, so the		/// Zero indicates that this closure type has internal linkage, so the
/// mangling number does not matter, while a non-zero value indicates which		/// mangling number does not matter, while a non-zero value indicates which
/// lambda expression this is in this particular context.		/// lambda expression this is in this particular context.
unsigned getLambdaManglingNumber() const {		unsigned getLambdaManglingNumber() const {
assert(isLambda() && "Not a lambda closure type!");		assert(isLambda() && "Not a lambda closure type!");
return getLambdaData().ManglingNumber;		return getLambdaData().ManglingNumber;
}		}

		/// The lambda is known to has internal linkage no matter whether it has name
		/// mangling number.
		bool hasKnownLambdaInternalLinkage() const {
		rsmith (Not Done): This function name should mention that it's only applicable to lambdas.
		assert(isLambda() && "Not a lambda closure type!");
		return getLambdaData().HasKnownInternalLinkage;
		}

/// Retrieve the declaration that provides additional context for a		/// Retrieve the declaration that provides additional context for a
/// lambda, when the normal declaration context is not specific enough.		/// lambda, when the normal declaration context is not specific enough.
///		///
/// Certain contexts (default arguments of in-class function parameters and		/// Certain contexts (default arguments of in-class function parameters and
/// the initializers of data members) have separate name mangling rules for		/// the initializers of data members) have separate name mangling rules for
/// lambdas within the Itanium C++ ABI. For these cases, this routine provides		/// lambdas within the Itanium C++ ABI. For these cases, this routine provides
/// the declaration in which the lambda occurs, e.g., the function parameter		/// the declaration in which the lambda occurs, e.g., the function parameter
/// or the non-static data member. Otherwise, it returns NULL to imply that		/// or the non-static data member. Otherwise, it returns NULL to imply that
/// the declaration context suffices.		/// the declaration context suffices.
Decl *getLambdaContextDecl() const;		Decl *getLambdaContextDecl() const;

/// Set the mangling number and context declaration for a lambda		/// Set the mangling number and context declaration for a lambda
/// class.		/// class.
void setLambdaMangling(unsigned ManglingNumber, Decl *ContextDecl) {		void setLambdaMangling(unsigned ManglingNumber, Decl *ContextDecl,
		bool HasKnownInternalLinkage = false) {
		assert(isLambda() && "Not a lambda closure type!");
getLambdaData().ManglingNumber = ManglingNumber;		getLambdaData().ManglingNumber = ManglingNumber;
getLambdaData().ContextDecl = ContextDecl;		getLambdaData().ContextDecl = ContextDecl;
		getLambdaData().HasKnownInternalLinkage = HasKnownInternalLinkage;
}		}

/// Returns the inheritance model used for this record.		/// Returns the inheritance model used for this record.
MSInheritanceAttr::Spelling getMSInheritanceModel() const;		MSInheritanceAttr::Spelling getMSInheritanceModel() const;

/// Calculate what the inheritance model would be for this class.		/// Calculate what the inheritance model would be for this class.
MSInheritanceAttr::Spelling calculateInheritanceModel() const;		MSInheritanceAttr::Spelling calculateInheritanceModel() const;

clang/include/clang/Sema/Sema.h

public:

/// Create a new lambda closure type.		/// Create a new lambda closure type.
CXXRecordDecl *createLambdaClosureType(SourceRange IntroducerRange,		CXXRecordDecl *createLambdaClosureType(SourceRange IntroducerRange,
TypeSourceInfo *Info,		TypeSourceInfo *Info,
bool KnownDependent,		bool KnownDependent,
LambdaCaptureDefault CaptureDefault);		LambdaCaptureDefault CaptureDefault);

/// Start the definition of a lambda expression.		/// Start the definition of a lambda expression.
CXXMethodDecl *		CXXMethodDecl *startLambdaDefinition(CXXRecordDecl *Class,
startLambdaDefinition(CXXRecordDecl *Class, SourceRange IntroducerRange,		SourceRange IntroducerRange,
TypeSourceInfo *MethodType, SourceLocation EndLoc,		TypeSourceInfo *MethodType,
		SourceLocation EndLoc,
ArrayRef<ParmVarDecl *> Params,		ArrayRef<ParmVarDecl *> Params,
ConstexprSpecKind ConstexprKind,		ConstexprSpecKind ConstexprKind);
Optional<std::pair<unsigned, Decl *>> Mangling = None);
		/// Number lambda for linkage purposes if necessary.
		void handleLambdaNumbering(
		CXXRecordDecl *Class, CXXMethodDecl *Method,
		Optional<std::tuple<unsigned, bool, Decl *>> Mangling = None);

/// Endow the lambda scope info with the relevant properties.		/// Endow the lambda scope info with the relevant properties.
void buildLambdaScope(sema::LambdaScopeInfo *LSI,		void buildLambdaScope(sema::LambdaScopeInfo *LSI,
CXXMethodDecl *CallOperator,		CXXMethodDecl *CallOperator,
SourceRange IntroducerRange,		SourceRange IntroducerRange,
LambdaCaptureDefault CaptureDefault,		LambdaCaptureDefault CaptureDefault,
SourceLocation CaptureDefaultLoc,		SourceLocation CaptureDefaultLoc,
bool ExplicitParams,		bool ExplicitParams,

clang/lib/AST/ASTImporter.cpp

if (DCXX->isLambda()) {
if (GetImportedOrCreateSpecialDecl(		if (GetImportedOrCreateSpecialDecl(
D2CXX, CXXRecordDecl::CreateLambda, D, Importer.getToContext(),		D2CXX, CXXRecordDecl::CreateLambda, D, Importer.getToContext(),
DC, *TInfoOrErr, Loc, DCXX->isDependentLambda(),		DC, *TInfoOrErr, Loc, DCXX->isDependentLambda(),
DCXX->isGenericLambda(), DCXX->getLambdaCaptureDefault()))		DCXX->isGenericLambda(), DCXX->getLambdaCaptureDefault()))
return D2CXX;		return D2CXX;
ExpectedDecl CDeclOrErr = import(DCXX->getLambdaContextDecl());		ExpectedDecl CDeclOrErr = import(DCXX->getLambdaContextDecl());
if (!CDeclOrErr)		if (!CDeclOrErr)
return CDeclOrErr.takeError();		return CDeclOrErr.takeError();
D2CXX->setLambdaMangling(DCXX->getLambdaManglingNumber(), *CDeclOrErr);		D2CXX->setLambdaMangling(DCXX->getLambdaManglingNumber(), *CDeclOrErr,
		DCXX->hasKnownLambdaInternalLinkage());
} else if (DCXX->isInjectedClassName()) {		} else if (DCXX->isInjectedClassName()) {
// We have to be careful to do a similar dance to the one in		// We have to be careful to do a similar dance to the one in
// Sema::ActOnStartCXXMemberDeclarations		// Sema::ActOnStartCXXMemberDeclarations
const bool DelayTypeCreation = true;		const bool DelayTypeCreation = true;
if (GetImportedOrCreateDecl(		if (GetImportedOrCreateDecl(
D2CXX, D, Importer.getToContext(), D->getTagKind(), DC,		D2CXX, D, Importer.getToContext(), D->getTagKind(), DC,
*BeginLocOrErr, Loc, Name.getAsIdentifierInfo(),		*BeginLocOrErr, Loc, Name.getAsIdentifierInfo(),
cast_or_null<CXXRecordDecl>(PrevDecl), DelayTypeCreation))		cast_or_null<CXXRecordDecl>(PrevDecl), DelayTypeCreation))

clang/lib/AST/Decl.cpp

switch (D->getKind()) {
case Decl::ObjCProperty:		case Decl::ObjCProperty:
case Decl::ObjCPropertyImpl:		case Decl::ObjCPropertyImpl:
case Decl::ObjCProtocol:		case Decl::ObjCProtocol:
return getExternalLinkageFor(D);		return getExternalLinkageFor(D);

case Decl::CXXRecord: {		case Decl::CXXRecord: {
const auto *Record = cast<CXXRecordDecl>(D);		const auto *Record = cast<CXXRecordDecl>(D);
if (Record->isLambda()) {		if (Record->isLambda()) {
if (!Record->getLambdaManglingNumber()) {		if (Record->hasKnownLambdaInternalLinkage() ||
		!Record->getLambdaManglingNumber()) {
// This lambda has no mangling number, so it's internal.		// This lambda has no mangling number, so it's internal.
return getInternalLinkageFor(D);		return getInternalLinkageFor(D);
}		}

// This lambda has its linkage/visibility determined:		// This lambda has its linkage/visibility determined:
// - either by the outermost lambda if that lambda has no mangling		// - either by the outermost lambda if that lambda has no mangling
// number.		// number.
// - or by the parent of the outer most lambda		// - or by the parent of the outer most lambda
// This prevents infinite recursion in settings such as nested lambdas		// This prevents infinite recursion in settings such as nested lambdas
// used in NSDMI's, for e.g.		// used in NSDMI's, for e.g.
// struct L {		// struct L {
// int t{};		// int t{};
// int t2 = ([](int a) { return [](int b) { return b; };})(t)(t);		// int t2 = ([](int a) { return [](int b) { return b; };})(t)(t);
// };		// };
const CXXRecordDecl *OuterMostLambda =		const CXXRecordDecl *OuterMostLambda =
getOutermostEnclosingLambda(Record);		getOutermostEnclosingLambda(Record);
if (!OuterMostLambda->getLambdaManglingNumber())		if (OuterMostLambda->hasKnownLambdaInternalLinkage() ||
		!OuterMostLambda->getLambdaManglingNumber())
return getInternalLinkageFor(D);		return getInternalLinkageFor(D);

return getLVForClosure(		return getLVForClosure(
OuterMostLambda->getDeclContext()->getRedeclContext(),		OuterMostLambda->getDeclContext()->getRedeclContext(),
OuterMostLambda->getLambdaContextDecl(), computation);		OuterMostLambda->getLambdaContextDecl(), computation);
}		}

break;		break;

clang/lib/Sema/SemaLambda.cpp

case Normal: {
}		}

return std::make_tuple(nullptr, nullptr);		return std::make_tuple(nullptr, nullptr);
}		}

case StaticDataMember:		case StaticDataMember:
// -- the initializers of nonspecialized static members of template classes		// -- the initializers of nonspecialized static members of template classes
if (!IsInNonspecializedTemplate)		if (!IsInNonspecializedTemplate)
return std::make_tuple(nullptr, nullptr);		return std::make_tuple(nullptr, ManglingContextDecl);
// Fall through to get the current context.		// Fall through to get the current context.
LLVM_FALLTHROUGH;		LLVM_FALLTHROUGH;

case DataMember:		case DataMember:
// -- the in-class initializers of class members		// -- the in-class initializers of class members
case DefaultArgument:		case DefaultArgument:
// -- default arguments appearing in class definitions		// -- default arguments appearing in class definitions
case InlineVariable:		case InlineVariable:
// -- the initializers of inline variables		// -- the initializers of inline variables
case VariableTemplate:		case VariableTemplate:
// -- the initializers of templated variables		// -- the initializers of templated variables
return std::make_tuple(		return std::make_tuple(
&Context.getManglingNumberContext(ASTContext::NeedExtraManglingDecl,		&Context.getManglingNumberContext(ASTContext::NeedExtraManglingDecl,
ManglingContextDecl),		ManglingContextDecl),
ManglingContextDecl);		ManglingContextDecl);
}		}

llvm_unreachable("unexpected context");		llvm_unreachable("unexpected context");
}		}

CXXMethodDecl *Sema::startLambdaDefinition(		CXXMethodDecl *Sema::startLambdaDefinition(CXXRecordDecl *Class,
CXXRecordDecl *Class, SourceRange IntroducerRange,		SourceRange IntroducerRange,
TypeSourceInfo *MethodTypeInfo, SourceLocation EndLoc,		TypeSourceInfo *MethodTypeInfo,
ArrayRef<ParmVarDecl *> Params, ConstexprSpecKind ConstexprKind,		SourceLocation EndLoc,
Optional<std::pair<unsigned, Decl *>> Mangling) {		ArrayRef<ParmVarDecl *> Params,
		ConstexprSpecKind ConstexprKind) {
QualType MethodType = MethodTypeInfo->getType();		QualType MethodType = MethodTypeInfo->getType();
TemplateParameterList *TemplateParams =		TemplateParameterList *TemplateParams =
getGenericLambdaTemplateParameterList(getCurLambda(), *this);		getGenericLambdaTemplateParameterList(getCurLambda(), *this);
// If a lambda appears in a dependent context or is a generic lambda (has		// If a lambda appears in a dependent context or is a generic lambda (has
// template parameters) and has an 'auto' return type, deduce it to a		// template parameters) and has an 'auto' return type, deduce it to a
// dependent type.		// dependent type.
if (Class->isDependentContext() || TemplateParams) {		if (Class->isDependentContext() || TemplateParams) {
const FunctionProtoType *FPT = MethodType->castAs<FunctionProtoType>();		const FunctionProtoType *FPT = MethodType->castAs<FunctionProtoType>();
QualType Result = FPT->getReturnType();		QualType Result = FPT->getReturnType();
if (Result->isUndeducedType()) {		if (Result->isUndeducedType()) {
Result = SubstAutoType(Result, Context.DependentTy);		Result = SubstAutoType(Result, Context.DependentTy);
if (!Params.empty()) {
Method->setParams(Params);		Method->setParams(Params);
CheckParmsForFunctionDef(Params,		CheckParmsForFunctionDef(Params,
/*CheckParameterNames=*/false);		/*CheckParameterNames=*/false);

for (auto P : Method->parameters())		for (auto P : Method->parameters())
P->setOwningFunction(Method);		P->setOwningFunction(Method);
}		}

		return Method;
		}

		void Sema::handleLambdaNumbering(
		hliao (Author, Done): extracted from `startLambdaDefinition` so that we could have a single place to handle mangling numbers based on CUDA attributes.
		CXXRecordDecl *Class, CXXMethodDecl *Method,
		Optional<std::tuple<unsigned, bool, Decl *>> Mangling) {
if (Mangling) {		if (Mangling) {
Class->setLambdaMangling(Mangling->first, Mangling->second);		unsigned ManglingNumber;
} else {		bool HasKnownInternalLinkage;
		Decl *ManglingContextDecl;
		std::tie(ManglingNumber, HasKnownInternalLinkage, ManglingContextDecl) =
		Mangling.getValue();
		Class->setLambdaMangling(ManglingNumber, ManglingContextDecl,
		HasKnownInternalLinkage);
		return;
		}

		auto getMangleNumberingContext =
		[this](CXXRecordDecl *Class, Decl *ManglingContextDecl) -> MangleNumberingContext * {
		// Get mangle numbering context if there's any extra decl context.
		if (ManglingContextDecl)
		return &Context.getManglingNumberContext(
		ASTContext::NeedExtraManglingDecl, ManglingContextDecl);
		// Otherwise, from that lambda's decl context.
		auto DC = Class->getDeclContext();
		while (auto *CD = dyn_cast<CapturedDecl>(DC))
		DC = CD->getParent();
		return &Context.getManglingNumberContext(DC);
		};

MangleNumberingContext *MCtx;		MangleNumberingContext *MCtx;
Decl *ManglingContextDecl;		Decl *ManglingContextDecl;
std::tie(MCtx, ManglingContextDecl) =		std::tie(MCtx, ManglingContextDecl) =
getCurrentMangleNumberContext(Class->getDeclContext());		getCurrentMangleNumberContext(Class->getDeclContext());
		bool HasKnownInternalLinkage = false;
		if (!MCtx && getLangOpts().CUDA) {
		// Force lambda numbering in CUDA/HIP as we need to name lambdas following
		// ODR. Both device- and host-compilation need to have a consistent naming
		// on kernel functions. As lambdas are potential part of these `__global__`
		// function names, they needs numbering following ODR.
		MCtx = getMangleNumberingContext(Class, ManglingContextDecl);
		assert(MCtx && "Retrieving mangle numbering context failed!");
		HasKnownInternalLinkage = true;
		}
if (MCtx) {		if (MCtx) {
unsigned ManglingNumber = MCtx->getManglingNumber(Method);		unsigned ManglingNumber = MCtx->getManglingNumber(Method);
Class->setLambdaMangling(ManglingNumber, ManglingContextDecl);		Class->setLambdaMangling(ManglingNumber, ManglingContextDecl,
		HasKnownInternalLinkage);
}		}
		rsmith (Not Done): What happens if there are other enclosing constructs that would give the lambda internal linkage (eg, an anonymous namespace or static function that might collide with one in another translation unit, or a declaration involving a type with internal linkage)? Presumably you can still suffer from mangling collisions in those cases, at least if you link together multiple translation units containing device code. Do we need (something like) unique identifiers for device code TUs to use in manglings of ostensibly internal linkage entities?
		hliao (Author, Done): Those unnamed ones will be addressed in later patches as, currently, we only have lambda numbering issues from real workloads. I am preparing changes to address other unnamed types, namespaces, etc.
}		}

return Method;
}

void Sema::buildLambdaScope(LambdaScopeInfo *LSI,		void Sema::buildLambdaScope(LambdaScopeInfo *LSI,
CXXMethodDecl *CallOperator,		CXXMethodDecl *CallOperator,
SourceRange IntroducerRange,		SourceRange IntroducerRange,
LambdaCaptureDefault CaptureDefault,		LambdaCaptureDefault CaptureDefault,
SourceLocation CaptureDefaultLoc,		SourceLocation CaptureDefaultLoc,
bool ExplicitParams,		bool ExplicitParams,
bool ExplicitResultType,		bool ExplicitResultType,
bool Mutable) {		bool Mutable) {
void Sema::ActOnStartOfLambdaDefinition(LambdaIntroducer &Intro,
// Attributes on the lambda apply to the method.		// Attributes on the lambda apply to the method.
ProcessDeclAttributes(CurScope, Method, ParamInfo);		ProcessDeclAttributes(CurScope, Method, ParamInfo);

// CUDA lambdas get implicit attributes based on the scope in which they're		// CUDA lambdas get implicit attributes based on the scope in which they're
// declared.		// declared.
if (getLangOpts().CUDA)		if (getLangOpts().CUDA)
CUDASetLambdaAttrs(Method);		CUDASetLambdaAttrs(Method);

		// Number the lambda for linkage purposes if necessary.
		handleLambdaNumbering(Class, Method);

// Introduce the function call operator as the current declaration context.		// Introduce the function call operator as the current declaration context.
PushDeclContext(CurScope, Method);		PushDeclContext(CurScope, Method);

// Build the lambda scope.		// Build the lambda scope.
buildLambdaScope(LSI, Method, Intro.Range, Intro.Default, Intro.DefaultLoc,		buildLambdaScope(LSI, Method, Intro.Range, Intro.Default, Intro.DefaultLoc,
ExplicitParams, ExplicitResultType, !Method->isConst());		ExplicitParams, ExplicitResultType, !Method->isConst());

// C++11 [expr.prim.lambda]p9:		// C++11 [expr.prim.lambda]p9:

clang/lib/Sema/TreeTransform.h

TreeTransform<Derived>::TransformLambdaExpr(LambdaExpr *E) {
CXXRecordDecl *OldClass = E->getLambdaClass();		CXXRecordDecl *OldClass = E->getLambdaClass();
CXXRecordDecl *Class		CXXRecordDecl *Class
= getSema().createLambdaClosureType(E->getIntroducerRange(),		= getSema().createLambdaClosureType(E->getIntroducerRange(),
NewCallOpTSI,		NewCallOpTSI,
/*KnownDependent=*/false,		/*KnownDependent=*/false,
E->getCaptureDefault());		E->getCaptureDefault());
getDerived().transformedLocalDecl(OldClass, {Class});		getDerived().transformedLocalDecl(OldClass, {Class});

Optional<std::pair<unsigned, Decl*>> Mangling;		Optional<std::tuple<unsigned, bool, Decl *>> Mangling;
if (getDerived().ReplacingOriginal())		if (getDerived().ReplacingOriginal())
Mangling = std::make_pair(OldClass->getLambdaManglingNumber(),		Mangling = std::make_tuple(OldClass->getLambdaManglingNumber(),
		OldClass->hasKnownLambdaInternalLinkage(),
OldClass->getLambdaContextDecl());		OldClass->getLambdaContextDecl());

// Build the call operator.		// Build the call operator.
CXXMethodDecl *NewCallOperator = getSema().startLambdaDefinition(		CXXMethodDecl *NewCallOperator = getSema().startLambdaDefinition(
Class, E->getIntroducerRange(), NewCallOpTSI,		Class, E->getIntroducerRange(), NewCallOpTSI,
E->getCallOperator()->getEndLoc(),		E->getCallOperator()->getEndLoc(),
NewCallOpTSI->getTypeLoc().castAs<FunctionProtoTypeLoc>().getParams(),		NewCallOpTSI->getTypeLoc().castAs<FunctionProtoTypeLoc>().getParams(),
E->getCallOperator()->getConstexprKind(), Mangling);		E->getCallOperator()->getConstexprKind());

LSI->CallOperator = NewCallOperator;		LSI->CallOperator = NewCallOperator;

for (unsigned I = 0, NumParams = NewCallOperator->getNumParams();		for (unsigned I = 0, NumParams = NewCallOperator->getNumParams();
I != NumParams; ++I) {		I != NumParams; ++I) {
auto *P = NewCallOperator->getParamDecl(I);		auto *P = NewCallOperator->getParamDecl(I);
if (P->hasUninstantiatedDefaultArg()) {		if (P->hasUninstantiatedDefaultArg()) {
EnterExpressionEvaluationContext Eval(		EnterExpressionEvaluationContext Eval(
getSema(),		getSema(),
Sema::ExpressionEvaluationContext::PotentiallyEvaluatedIfUsed, P);		Sema::ExpressionEvaluationContext::PotentiallyEvaluatedIfUsed, P);
ExprResult R = getDerived().TransformExpr(		ExprResult R = getDerived().TransformExpr(
E->getCallOperator()->getParamDecl(I)->getDefaultArg());		E->getCallOperator()->getParamDecl(I)->getDefaultArg());
P->setDefaultArg(R.get());		P->setDefaultArg(R.get());
}		}
}		}

getDerived().transformAttrs(E->getCallOperator(), NewCallOperator);		getDerived().transformAttrs(E->getCallOperator(), NewCallOperator);
getDerived().transformedLocalDecl(E->getCallOperator(), {NewCallOperator});		getDerived().transformedLocalDecl(E->getCallOperator(), {NewCallOperator});

		// Number the lambda for linkage purposes if necessary.
		getSema().handleLambdaNumbering(Class, NewCallOperator, Mangling);

// Introduce the context of the call operator.		// Introduce the context of the call operator.
Sema::ContextRAII SavedContext(getSema(), NewCallOperator,		Sema::ContextRAII SavedContext(getSema(), NewCallOperator,
/*NewThisContext*/false);		/*NewThisContext*/false);

// Enter the scope of the lambda.		// Enter the scope of the lambda.
getSema().buildLambdaScope(LSI, NewCallOperator,		getSema().buildLambdaScope(LSI, NewCallOperator,
E->getIntroducerRange(),		E->getIntroducerRange(),
E->getCaptureDefault(),		E->getCaptureDefault(),
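The hunk above widens `Mangling` from a pair to a tuple that also carries the internal-linkage flag, and moves its use from `startLambdaDefinition` to the new `handleLambdaNumbering` call. As a rough illustration of that data flow only, here is a toy model in standard C++17. `ToyClosure`, `ToyMangling`, `toyHandleLambdaNumbering`, and the counter-based fallback are all hypothetical names and behavior invented for this sketch; they are not clang's implementation, which is not shown in this part of the diff.

```cpp
#include <cassert>
#include <optional>
#include <tuple>

// Hypothetical stand-in for the lambda fields carried by the tuple.
struct ToyClosure {
  unsigned ManglingNumber = 0;
  bool HasKnownInternalLinkage = false;
  const void *ContextDecl = nullptr;
};

// Mirrors the shape of Optional<std::tuple<unsigned, bool, Decl *>>.
using ToyMangling = std::optional<std::tuple<unsigned, bool, const void *>>;

// If numbering was carried over from an original lambda, reuse it as-is.
// Otherwise fall back to a fresh number (a placeholder policy for this sketch).
void toyHandleLambdaNumbering(ToyClosure &Class, const ToyMangling &Mangling,
                              unsigned &NextNumber) {
  if (Mangling) {
    std::tie(Class.ManglingNumber, Class.HasKnownInternalLinkage,
             Class.ContextDecl) = *Mangling;
    return;
  }
  Class.ManglingNumber = NextNumber++;
}
```

The point of the sketch is that all three values travel together, so a transformed lambda that replaces its original keeps the same number, linkage flag, and context.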

clang/lib/Serialization/ASTReaderDecl.cpp

if (Data.IsLambda) {
using Capture = LambdaCapture;		using Capture = LambdaCapture;

auto &Lambda = static_cast<CXXRecordDecl::LambdaDefinitionData &>(Data);		auto &Lambda = static_cast<CXXRecordDecl::LambdaDefinitionData &>(Data);
Lambda.Dependent = Record.readInt();		Lambda.Dependent = Record.readInt();
Lambda.IsGenericLambda = Record.readInt();		Lambda.IsGenericLambda = Record.readInt();
Lambda.CaptureDefault = Record.readInt();		Lambda.CaptureDefault = Record.readInt();
Lambda.NumCaptures = Record.readInt();		Lambda.NumCaptures = Record.readInt();
Lambda.NumExplicitCaptures = Record.readInt();		Lambda.NumExplicitCaptures = Record.readInt();
		Lambda.HasKnownInternalLinkage = Record.readInt();
Lambda.ManglingNumber = Record.readInt();		Lambda.ManglingNumber = Record.readInt();
Lambda.ContextDecl = ReadDeclID();		Lambda.ContextDecl = ReadDeclID();
Lambda.Captures = (Capture *)Reader.getContext().Allocate(		Lambda.Captures = (Capture *)Reader.getContext().Allocate(
sizeof(Capture) * Lambda.NumCaptures);		sizeof(Capture) * Lambda.NumCaptures);
Capture *ToCapture = Lambda.Captures;		Capture *ToCapture = Lambda.Captures;
Lambda.MethodTyInfo = GetTypeSourceInfo();		Lambda.MethodTyInfo = GetTypeSourceInfo();
for (unsigned I = 0, N = Lambda.NumCaptures; I != N; ++I) {		for (unsigned I = 0, N = Lambda.NumCaptures; I != N; ++I) {
SourceLocation Loc = ReadSourceLocation();		SourceLocation Loc = ReadSourceLocation();

clang/lib/Serialization/ASTWriter.cpp

void ASTRecordWriter::AddCXXDefinitionData(const CXXRecordDecl *D) {
// Add lambda-specific data.		// Add lambda-specific data.
if (Data.IsLambda) {		if (Data.IsLambda) {
auto &Lambda = D->getLambdaData();		auto &Lambda = D->getLambdaData();
Record->push_back(Lambda.Dependent);		Record->push_back(Lambda.Dependent);
Record->push_back(Lambda.IsGenericLambda);		Record->push_back(Lambda.IsGenericLambda);
Record->push_back(Lambda.CaptureDefault);		Record->push_back(Lambda.CaptureDefault);
Record->push_back(Lambda.NumCaptures);		Record->push_back(Lambda.NumCaptures);
Record->push_back(Lambda.NumExplicitCaptures);		Record->push_back(Lambda.NumExplicitCaptures);
		Record->push_back(Lambda.HasKnownInternalLinkage);
Record->push_back(Lambda.ManglingNumber);		Record->push_back(Lambda.ManglingNumber);
AddDeclRef(D->getLambdaContextDecl());		AddDeclRef(D->getLambdaContextDecl());
AddTypeSourceInfo(Lambda.MethodTyInfo);		AddTypeSourceInfo(Lambda.MethodTyInfo);
for (unsigned I = 0, N = Lambda.NumCaptures; I != N; ++I) {		for (unsigned I = 0, N = Lambda.NumCaptures; I != N; ++I) {
const LambdaCapture &Capture = Lambda.Captures[I];		const LambdaCapture &Capture = Lambda.Captures[I];
AddSourceLocation(Capture.getLocation());		AddSourceLocation(Capture.getLocation());
Record->push_back(Capture.isImplicit());		Record->push_back(Capture.isImplicit());
Record->push_back(Capture.getCaptureKind());		Record->push_back(Capture.getCaptureKind());
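In both serialization hunks the new `HasKnownInternalLinkage` field is placed between `NumExplicitCaptures` and `ManglingNumber`, because the reader must consume values in the same order the writer emits them. The following is a minimal standalone sketch of that invariant. `LambdaRecord`, `writeLambda`, and `readLambda` are hypothetical names for illustration and do not correspond to clang's `ASTRecordWriter` or `ASTDeclReader` APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical subset of the lambda fields touched by this diff.
struct LambdaRecord {
  unsigned NumExplicitCaptures = 0;
  bool HasKnownInternalLinkage = false;
  unsigned ManglingNumber = 0;
};

// Writer: append the fields in a fixed order.
void writeLambda(const LambdaRecord &L, std::vector<std::uint64_t> &Record) {
  Record.push_back(L.NumExplicitCaptures);
  Record.push_back(L.HasKnownInternalLinkage);
  Record.push_back(L.ManglingNumber);
}

// Reader: consume the fields in exactly the order the writer used.
LambdaRecord readLambda(const std::vector<std::uint64_t> &Record,
                        std::size_t &Idx) {
  assert(Idx + 3 <= Record.size() && "record too short");
  LambdaRecord L;
  L.NumExplicitCaptures = static_cast<unsigned>(Record[Idx++]);
  L.HasKnownInternalLinkage = Record[Idx++] != 0;
  L.ManglingNumber = static_cast<unsigned>(Record[Idx++]);
  return L;
}
```

If only one side were updated, every field after the insertion point would be read from the wrong slot, which is why the reader and writer change together in this revision.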

clang/test/CodeGenCUDA/unnamed-types.cu

This file was added.

				// RUN: %clang_cc1 -std=c++11 -x hip -triple x86_64-linux-gnu -aux-triple amdgcn-amd-amdhsa -emit-llvm %s -o - | FileCheck %s --check-prefix=HOST
				// RUN: %clang_cc1 -std=c++11 -x hip -triple amdgcn-amd-amdhsa -fcuda-is-device -emit-llvm %s -o - | FileCheck %s --check-prefix=DEVICE

				#include "Inputs/cuda.h"

				// HOST: @0 = private unnamed_addr constant [43 x i8] c"_Z2k0IZZ2f1PfENKUlS0_E_clES0_EUlfE_EvS0_T_\00", align 1

				__device__ float d0(float x) {
				return [](float x) { return x + 2.f; }(x);
				}

				__device__ float d1(float x) {
				return [](float x) { return x * 2.f; }(x);
				}

				// DEVICE: amdgpu_kernel void @_Z2k0IZZ2f1PfENKUlS0_E_clES0_EUlfE_EvS0_T_(
				template <typename F>
				__global__ void k0(float *p, F f) {
				p[0] = f(p[0]) + d0(p[1]) + d1(p[2]);
				}

				void f0(float *p) {
				[](float *p) {
				*p = 1.f;
				}(p);
				}

				// The inner/outer lambdas are required to be mangled following ODR but their
				// linkages are still required to keep the original `internal` linkage.

				// HOST: define internal void @_ZZ2f1PfENKUlS_E_clES_(
				// DEVICE: define internal float @_ZZZ2f1PfENKUlS_E_clES_ENKUlfE_clEf(
				void f1(float *p) {
				[](float *p) {
				k0<<<1,1>>>(p, [] __device__ (float x) { return x + 1.f; });
				}(p);
				}
				// HOST: @__hip_register_globals
				// HOST: __hipRegisterFunction{{.*}}@_Z2k0IZZ2f1PfENKUlS0_E_clES0_EUlfE_EvS0_T_{{.*}}@0
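The HOST and DEVICE checks in the test above both look for the same kernel symbol, `_Z2k0IZZ2f1PfENKUlS0_E_clES0_EUlfE_EvS0_T_`. To see which lambdas that name encodes, one option is to run it through the Itanium demangler. The helper below is a small sketch that assumes a toolchain providing `<cxxabi.h>` (GCC's libstdc++ or libc++abi); `demangle` is a hypothetical wrapper name, and the exact spelling of the lambda components in the output differs between demangler implementations.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Returns the demangled form of an Itanium-mangled symbol, or an empty
// string if the demangler rejects the input.
std::string demangle(const char *Mangled) {
  int Status = 0;
  // With a null output buffer, __cxa_demangle allocates the result with
  // malloc; the caller releases it with free.
  char *Out = abi::__cxa_demangle(Mangled, nullptr, nullptr, &Status);
  std::string Result = (Status == 0 && Out) ? Out : "";
  std::free(Out);
  return Result;
}
```

Calling `demangle("_Z2k0IZZ2f1PfENKUlS0_E_clES0_EUlfE_EvS0_T_")` should yield an instantiation of `k0` whose template argument is a lambda taking `float`, nested inside the call operator of a lambda taking `float*`, itself local to `f1(float*)`. That nesting is the outer-to-inner chain the test comment says must follow ODR numbering.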

Diff 225720

clang/include/clang/AST/DeclCXX.h

clang/include/clang/Sema/Sema.h

clang/lib/AST/ASTImporter.cpp

clang/lib/AST/Decl.cpp

clang/lib/Sema/SemaLambda.cpp

clang/lib/Sema/TreeTransform.h

clang/lib/Serialization/ASTReaderDecl.cpp

clang/lib/Serialization/ASTWriter.cpp

clang/test/CodeGenCUDA/unnamed-types.cu

[hip][cuda] Fix the extended lambda name mangling issue.
ClosedPublic