This is an archive of the discontinued LLVM Phabricator instance.

Differential D61274

[Sema][AST] Explicit visibility for OpenCL/CUDA kernels/variables
Needs Review · Public

Authored by scott.linder on Apr 29 2019, 12:18 PM.

Details

Reviewers

Anastasia
tra
yaxunl
rjmccall

Summary

For AMDGPU the visibility of these symbols (OpenCL kernels, CUDA __global__ functions, and CUDA __device__ variables) must not be hidden, as we rely on them being available in the dynamic symbol table in the final DSO.

This patch implements this by considering language attributes as a source of explicit visibility, but rather than attributing any one visibility to them they are simply coerced to be a non-hidden visibility. This allows for the optimization of using protected visibility when these symbols are known to be dso_local.

This patch also adds diagnostics for explicitly setting a hidden visibility on these symbols.
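The coercion described above can be sketched in isolation. This is a minimal model, not the patch itself: the `Visibility` enum is a toy stand-in for clang's, and `coerceToNonHidden` and `demoCoercion` are hypothetical names introduced only for illustration.

```cpp
#include <cassert>

// Toy stand-in for clang's Visibility enum, in the same order
// (most restrictive first).
enum class Visibility { Hidden, Protected, Default };

// Hypothetical helper: leave non-hidden visibilities untouched and turn
// hidden into protected when the target supports it, else into default.
constexpr Visibility coerceToNonHidden(Visibility V, bool TargetHasProtected) {
  return V != Visibility::Hidden ? V
         : TargetHasProtected    ? Visibility::Protected
                                 : Visibility::Default;
}

// Hypothetical driver exercising the three interesting cases.
inline void demoCoercion() {
  assert(coerceToNonHidden(Visibility::Hidden, true) == Visibility::Protected);
  assert(coerceToNonHidden(Visibility::Hidden, false) == Visibility::Default);
  assert(coerceToNonHidden(Visibility::Default, true) == Visibility::Default);
}
```

Only a hidden input is ever changed, which is why the rule reads as "non-hidden" rather than as any single fixed visibility.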

I imagine there are a number of issues with the patch in its current state, but I wanted to get something implemented before reaching out to OpenCL/CUDA maintainers to see if this is a reasonable change. @Anastasia and @tra I wasn't certain if you would be good candidates to discuss this change, so please let me know if I need to keep looking.

Event Timeline

scott.linder created this revision. Apr 29 2019, 12:18 PM

Herald added a project: Restricted Project. Apr 29 2019, 12:18 PM

Herald added subscribers: cfe-commits, tpr.

A kernel function in CUDA is actually two different functions: one is the real kernel we compile for the GPU, and the other is a host-side stub that launches the device-side kernel.

On the device side, both clang and nvcc currently silently ignore hidden visibility and force the kernel to always be visible:
https://godbolt.org/z/xrPMGc
This is needed because the kernel must be externally visible in the device-side executable for the host-side code to execute it.
The device-side executable and its symbols are isolated from the host DSO they are encapsulated in, so whether the hidden attribute is ignored on the device side is independent of the visibility the kernel symbol gets on the host side.

On the host side there's no particular reason to give the kernel (or, rather, its host-side stub) special treatment. Setting hidden on it should be fine, if someone needs it for whatever reason.
Users who apply hidden to a kernel most likely do so to avoid exposing kernel symbols outside of a DSO, and the attribute will do that job just fine.

I think the warning is not going to buy anything for CUDA. The hidden attribute effectively applies to the host side only, where it should work correctly and where it is potentially useful. I'd rather not impose restrictions that are not necessary, even if it's just a warning.
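To illustrate the host-side use case, here is a small sketch in plain C++ rather than CUDA, assuming a GCC- or Clang-compatible compiler on an ELF target. `launch_stub` stands in for the host-side stub and `run_kernel` for a public entry point; both names are hypothetical.

```cpp
// Hidden: callable from anywhere inside this DSO, but left out of its
// dynamic symbol table, so other DSOs cannot bind to it.
__attribute__((visibility("hidden"))) int launch_stub(int n) { return 2 * n; }

// Default visibility: the only entry point this DSO would export.
__attribute__((visibility("default"))) int run_kernel(int n) {
  return launch_stub(n) + 1;
}
```

Built into a shared library, only `run_kernel` should appear among the exported dynamic symbols, which is the effect a user hiding a kernel stub would presumably be after.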

Seems reasonable for OpenCL kernels. You might want to add an AST dump test to check that the visibility is set correctly, in case it is printed in the AST dump.

Okay. So it sounds like this should either be a device-only rule, with no warning in mixed-mode languages like CUDA, or we should take a different approach.
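One way to read the "device-only rule" option is sketched below. It is a hypothetical model and not code from the patch: `Lang`, `isMixedMode`, `forceNonHidden`, and `warnOnHidden` are invented names, and the real decision would live in clang's Sema/AST code and depend on its language options.

```cpp
#include <cassert>

enum class Lang { OpenCL, CUDA };

// CUDA compiles one source for both host and device; OpenCL C is device-only.
constexpr bool isMixedMode(Lang L) { return L == Lang::CUDA; }

// Coerce hidden to a non-hidden visibility only in a device compilation.
constexpr bool forceNonHidden(bool IsKernelOrDeviceVar, bool IsDeviceCompile) {
  return IsKernelOrDeviceVar && IsDeviceCompile;
}

// Warn about an ignored 'hidden' only when the coercion applies and the
// language is not mixed-mode, so CUDA sources never get the warning.
constexpr bool warnOnHidden(Lang L, bool IsKernelOrDeviceVar,
                            bool IsDeviceCompile) {
  return forceNonHidden(IsKernelOrDeviceVar, IsDeviceCompile) &&
         !isMixedMode(L);
}

static_assert(warnOnHidden(Lang::OpenCL, true, true), "OpenCL kernel warns");
static_assert(!warnOnHidden(Lang::CUDA, true, true), "CUDA stays silent");
```

Under this reading the host-side stub keeps whatever visibility the user asked for, while the device-side symbol is still forced to be non-hidden.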

yaxunl added inline comments. Apr 30 2019, 3:24 PM

lib/AST/Decl.cpp
Line 738: We also need this for `__constant__` variables.

Anastasia added inline comments. May 2 2019, 6:09 AM

lib/AST/Decl.cpp
Line 738: And what about `__global` and `__constant` program scope variables in OpenCL?

Revision Contents

Path                                          Size
include/clang/Basic/DiagnosticSemaKinds.td    3 lines
include/clang/Basic/Visibility.h              4 lines
lib/AST/Decl.cpp                              13 lines
lib/CodeGen/TargetInfo.cpp                    15 lines
lib/Sema/SemaDeclAttr.cpp                     10 lines
test/SemaCUDA/visibility-diagnostics.cu       13 lines
test/SemaOpenCL/visibility-diagnostics.cl     11 lines

Diff 197157

include/clang/Basic/DiagnosticSemaKinds.td

@@ 3,356 lines hidden above @@ def warn_transparent_union_attribute_zero_fields : Warning<
   "transparent union definition must contain at least one field; "
   "transparent_union attribute ignored">,
   InGroup<IgnoredAttributes>;
 def warn_attribute_type_not_supported : Warning<
   "%0 attribute argument not supported: %1">,
   InGroup<IgnoredAttributes>;
 def warn_attribute_unknown_visibility : Warning<"unknown visibility %0">,
   InGroup<IgnoredAttributes>;
+def warn_attribute_hidden_visibility :
+  Warning<"'hidden' visibility on %select{function|variable}0 with incompatible language attribute will be ignored">,
+  InGroup<IgnoredAttributes>;
 def warn_attribute_protected_visibility :
   Warning<"target does not support 'protected' visibility; using 'default'">,
   InGroup<DiagGroup<"unsupported-visibility">>;
 def err_mismatched_visibility: Error<"visibility does not match previous declaration">;
 def note_previous_attribute : Note<"previous attribute is here">;
 def note_conflicting_attribute : Note<"conflicting attribute is here">;
 def note_attribute : Note<"attribute is here">;
 def err_mismatched_ms_inheritance : Error<
@@ 6,254 lines hidden below @@

include/clang/Basic/Visibility.h

@@ 47 lines hidden above @@
 inline Visibility minVisibility(Visibility L, Visibility R) {
   return L < R ? L : R;
 }

 class LinkageInfo {
   uint8_t linkage_ : 3;
   uint8_t visibility_ : 2;
   uint8_t explicit_ : 1;
-
-  void setVisibility(Visibility V, bool E) { visibility_ = V; explicit_ = E; }
 public:
   LinkageInfo() : linkage_(ExternalLinkage), visibility_(DefaultVisibility),
                   explicit_(false) {}
   LinkageInfo(Linkage L, Visibility V, bool E)
     : linkage_(L), visibility_(V), explicit_(E) {
     assert(getLinkage() == L && getVisibility() == V &&
            isVisibilityExplicit() == E && "Enum truncated!");
   }
@@ 15 lines hidden @@ public:
   }

   Linkage getLinkage() const { return (Linkage)linkage_; }
   Visibility getVisibility() const { return (Visibility)visibility_; }
   bool isVisibilityExplicit() const { return explicit_; }

   void setLinkage(Linkage L) { linkage_ = L; }

+  void setVisibility(Visibility V, bool E) { visibility_ = V; explicit_ = E; }
+
   void mergeLinkage(Linkage L) {
     setLinkage(minLinkage(getLinkage(), L));
   }
   void mergeLinkage(LinkageInfo other) {
     mergeLinkage(other.getLinkage());
   }

   void mergeExternalVisibility(Linkage L) {
@@ 49 lines hidden below @@

lib/AST/Decl.cpp

@@ 725 lines hidden above @@ if (!LV.isVisibilityExplicit()) {

       // If we're paying attention to global visibility, apply
       // -finline-visibility-hidden if this is an inline method.
       if (useInlineVisibilityHidden(D))
         LV.mergeVisibility(HiddenVisibility, /*visibilityExplicit=*/false);
     }
   }

+  // We consider OpenCL kernels, __global__ Cuda functions, and __device__ Cuda
+  // variables to have explicit visibility of "non-hidden".
+  if (D->hasAttr<OpenCLKernelAttr>() ||
+      (isa<FunctionDecl>(D) && D->hasAttr<CUDAGlobalAttr>()) ||
+      (isa<VarDecl>(D) && D->hasAttr<CUDADeviceAttr>())) {
      [inline comment] yaxunl: we also need this for `__constant__` variables.
      [inline comment] Anastasia: And what about `__global` and `__constant` program scope variables in OpenCL?
+    Visibility Vis = LV.getVisibility();
+    if (LV.getVisibility() == HiddenVisibility)
+      Vis = Context.getTargetInfo().hasProtectedVisibility()
+                ? ProtectedVisibility
+                : DefaultVisibility;
+    LV.setVisibility(Vis, true);
+  }
+
   // C++ [basic.link]p4:

   // A name having namespace scope has external linkage if it is the
   // name of
   //
   // - an object or reference, unless it has internal linkage; or
   if (const auto *Var = dyn_cast<VarDecl>(D)) {
     // GCC applies the following optimization to variables and static
@@ 3,991 lines hidden below @@

lib/CodeGen/TargetInfo.cpp

@@ 7,834 lines hidden above @@ public:
   createEnqueuedBlockKernel(CodeGenFunction &CGF,
                             llvm::Function *BlockInvokeFunc,
                             llvm::Value *BlockLiteral) const override;
   bool shouldEmitStaticExternCAliases() const override;
   void setCUDAKernelCallingConvention(const FunctionType *&FT) const override;
 };
 }

-static bool requiresAMDGPUProtectedVisibility(const Decl *D,
-                                              llvm::GlobalValue *GV) {
-  if (GV->getVisibility() != llvm::GlobalValue::HiddenVisibility)
-    return false;
-
-  return D->hasAttr<OpenCLKernelAttr>() ||
-         (isa<FunctionDecl>(D) && D->hasAttr<CUDAGlobalAttr>()) ||
-         (isa<VarDecl>(D) && D->hasAttr<CUDADeviceAttr>());
-}
-
 void AMDGPUTargetCodeGenInfo::setTargetAttributes(
     const Decl *D, llvm::GlobalValue *GV, CodeGen::CodeGenModule &M) const {
-  if (requiresAMDGPUProtectedVisibility(D, GV)) {
-    GV->setVisibility(llvm::GlobalValue::ProtectedVisibility);
-    GV->setDSOLocal(true);
-  }
-
   if (GV->isDeclaration())
     return;
   const FunctionDecl *FD = dyn_cast_or_null<FunctionDecl>(D);
   if (!FD)
     return;

   llvm::Function *F = cast<llvm::Function>(GV);

@@ 1,759 lines hidden below @@

lib/Sema/SemaDeclAttr.cpp

@@ 7,369 lines hidden above @@ void Sema::ProcessDeclAttributeList(Scope *S, Decl *D,
   // family, and it can be applied after objc_designated_initializer. This is a
   // bit of a hack, but we need it to be compatible with versions of clang that
   // processed the attribute list in the wrong order.
   if (D->hasAttr<ObjCDesignatedInitializerAttr>() &&
       cast<ObjCMethodDecl>(D)->getMethodFamily() != OMF_init) {
     Diag(D->getLocation(), diag::err_designated_init_attr_non_init);
     D->dropAttr<ObjCDesignatedInitializerAttr>();
   }
+
+  if ((D->hasAttr<VisibilityAttr>() &&
+       D->getAttr<VisibilityAttr>()->getVisibility() ==
+           VisibilityAttr::Hidden) &&
+      (D->hasAttr<OpenCLKernelAttr>() ||
+       (isa<FunctionDecl>(D) && D->hasAttr<CUDAGlobalAttr>()) ||
+       (isa<VarDecl>(D) && D->hasAttr<CUDADeviceAttr>()))) {
+    Diag(D->getLocation(), diag::warn_attribute_hidden_visibility)
+        << (isa<FunctionDecl>(D) ? 0 : 1);
+  }
 }

 // Helper for delayed processing TransparentUnion attribute.
 void Sema::ProcessDeclAttributeDelayed(Decl *D,
                                        const ParsedAttributesView &AttrList) {
   for (const ParsedAttr &AL : AttrList)
     if (AL.getKind() == ParsedAttr::AT_TransparentUnion) {
       handleTransparentUnionAttr(*this, D, AL);
@@ 1,234 lines hidden below @@

test/SemaCUDA/visibility-diagnostics.cu

This file was added.

+// RUN: %clang_cc1 -std=c++11 -fsyntax-only -verify %s
+
+#include "Inputs/cuda.h"
+
+__attribute__((visibility("hidden"))) __global__ void global_func_hidden() {} // expected-warning {{'hidden' visibility on function with incompatible language attribute will be ignored}}
+__attribute__((visibility("protected"))) __global__ void global_func_protected() {}
+__attribute__((visibility("default"))) __global__ void global_func_default() {}
+__global__ void global_func() {}
+
+__attribute__((visibility("hidden"))) __device__ int device_var_hidden; // expected-warning {{'hidden' visibility on variable with incompatible language attribute will be ignored}}
+__attribute__((visibility("protected"))) __device__ int device_var_protected;
+__attribute__((visibility("default"))) __device__ int device_var_default;
+__device__ int device_var;

test/SemaOpenCL/visibility-diagnostics.cl

This file was added.

+// RUN: %clang_cc1 -std=cl2.0 -verify -pedantic -fsyntax-only %s
+
+__attribute__((visibility("hidden"))) kernel void kern_hidden() {} // expected-warning {{'hidden' visibility on function with incompatible language attribute will be ignored}}
+__attribute__((visibility("protected"))) kernel void kern_protected();
+__attribute__((visibility("default"))) kernel void kern_default();
+kernel void kern();
+
+__attribute__((visibility("hidden"))) extern kernel void ext_kern_hidden(); // expected-warning {{'hidden' visibility on function with incompatible language attribute will be ignored}}
+__attribute__((visibility("protected"))) extern kernel void ext_kern_protected();
+__attribute__((visibility("default"))) extern kernel void ext_kern_default();
+extern kernel void ext_kern();