Given this CUDA program,
template<typename Lambda> __global__ void run_this(Lambda lambda) { lambda(); }

// Minimal stand-ins for std::remove_reference and std::move.
template<typename T> struct remove_reference { using type = T; };
template<typename T> struct remove_reference<T&> { using type = T; };

template<typename T>
constexpr typename remove_reference<T>::type&& move(T&& t) {
  return static_cast<typename remove_reference<T>::type&&>(t);
}

int main() {
  // The moved no-op lambda before the kernel launch is what triggers the
  // name mismatch described below.
  auto foo = move([](){});
  run_this<<<1, 1, 1>>>([]() __device__ { printf("Hello World\n"); });
  return 0;
}
the assertion at the top of CGNVCUDARuntime::emitDeviceStub will fail. In release builds the effect is simply a cudaErrorInvalidDeviceFunction error at run time. The reason is that the mangled names of the device stub and the actual device-side function differ: the stub is called _Z8run_thisIZ4mainE3$_1EvT_, while the device function is _Z8run_thisIZ4mainE3$_0EvT_. The difference comes down to the anonymous struct ID that is maintained and assigned by the MangleContext. It appears that the MangleContext used for the latter never has getAnonymousStructId called for the moved no-op lambda, so the kernel's lambda ends up with an ID of 0.
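To illustrate the mechanism, here is a small standalone sketch (an approximation, not actual Clang code) of how such a per-context anonymous ID counter behaves: the context that produces the stub name has already numbered the moved no-op lambda, so the kernel lambda becomes $_1, while a second context that is only ever asked about the kernel lambda numbers it $_0.

#include <cstdio>
#include <map>

// Illustrative stand-in for the per-MangleContext counter behind
// getAnonymousStructId: each context numbers otherwise-unnamed entities
// in the order it first encounters them.
struct AnonIdSource {
  std::map<const void *, unsigned> Ids;
  unsigned getAnonymousId(const void *Entity) {
    // A previously unseen entity gets the next free ID; later queries
    // return the ID assigned on first contact.
    return Ids.emplace(Entity, static_cast<unsigned>(Ids.size())).first->second;
  }
};

int main() {
  int fooLambda, kernelLambda; // stand-ins for the two closure types in main()

  AnonIdSource stubContext;    // sees both lambdas in source order
  stubContext.getAnonymousId(&fooLambda);                              // -> 0 ($_0)
  unsigned stubId = stubContext.getAnonymousId(&kernelLambda);         // -> 1 ($_1)

  AnonIdSource deviceNameContext; // only ever asked about the kernel lambda
  unsigned deviceId = deviceNameContext.getAnonymousId(&kernelLambda); // -> 0 ($_0)

  std::printf("stub ID: %u, device-side ID: %u\n", stubId, deviceId);
  return 0;
}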
My proposed solution is to simply share the MangleContext used by the CGNVCUDARuntime and CGCXXABI code generators. For this I've added a new ASTContext::getSharedMangleContext function that memoizes created manglers for the given target ABI. From looking at MangleContext, it doesn't seem to me like sharing one instance could cause any issues, but then again, I really don't know much about Clang's internals.
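For concreteness, a rough sketch of what such an accessor could look like (the member name and caching strategy here are assumptions for illustration; the actual patch memoizes per target ABI rather than holding a single instance):

// Hypothetical sketch only, not the actual patch: every code generator that
// asks the ASTContext for a mangler gets the same MangleContext, so the
// anonymous struct / lambda numbering stays consistent across them.
MangleContext *ASTContext::getSharedMangleContext() {
  // SharedMangleContext would be a new std::unique_ptr<MangleContext> member
  // of ASTContext (member name assumed for illustration).
  if (!SharedMangleContext)
    SharedMangleContext.reset(createMangleContext());
  return SharedMangleContext.get();
}

CGNVCUDARuntime and the CGCXXABI would then both obtain their mangler through this accessor instead of each creating their own, so both lambdas are numbered by the same counter.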
Of course, an alternative solution would be to make sure that getAnonymousStructId is always called for both lambdas (and in the correct order), but again, I don't really know why that is not happening in the first place.