This is an archive of the discontinued LLVM Phabricator instance.

[CUDA] Add appropriate host/device attribute to target-specific builtins.
Closed, Public

Authored by tra on Aug 18 2015, 3:36 PM.


Details

Reviewers

eliben
echristo

Commits

rG9674a64cd9f8: [CUDA] Add appropriate host/device attribute to builtins.
rG39259ffc6555: [CUDA] Add appropriate host/device attribute to builtins.
rC248296: [CUDA] Add appropriate host/device attribute to builtins.
rC245496: [CUDA] Add appropriate host/device attribute to builtins.
rL248296: [CUDA] Add appropriate host/device attribute to builtins.
rL245496: [CUDA] Add appropriate host/device attribute to builtins.

Summary

The patch adds the appropriate host or device attribute to target-specific builtins
so that we can properly check whether they may be called from a particular context.

Diff Detail

Repository: rL LLVM

Event Timeline

tra updated this revision to Diff 32463. Aug 18 2015, 3:36 PM

tra retitled this revision to [CUDA] Add appropriate host/device attribute to target-specific builtins.

tra updated this object.

tra added reviewers: eliben, echristo.

tra added a subscriber: cfe-commits.

eliben added inline comments. Aug 18 2015, 4:05 PM

include/clang/Basic/Builtins.h
85 ↗	(On Diff #32463)	You can also use it in SemaChecking now
lib/Sema/SemaDecl.cpp
11166 ↗	(On Diff #32463)	It is not immediately clear why you would mark target-specific builtins as host in host compilation mode. So for example __builtin_ptx_read_tid_x would be callable from a host function when compiling in host mode? Can you clarify this with a comment here, and also add a relevant test?

used isTSBuiltin in SemaChecking.cpp

Added a comment explaining reasoning behind attribute choice for target-specific builtins.

tra marked an inline comment as done. Aug 18 2015, 5:04 PM

tra added inline comments.

lib/Sema/SemaDecl.cpp
11166 ↗	(On Diff #32463)	The target triple used for a particular compilation pass is assumed to be valid for that compilation mode. Builtins provided by the target are therefore assumed to be intended for use in that compilation mode. builtins.cu already includes test cases for using target-specific builtins from different targets, and non-target-specific builtins, in host and device functions in both host and device compilation.

Your example scenario (__builtin_ptx_read_tid_x getting the __host__ attribute) is possible only if you compile in host mode and use --triple nvptx-unknown-cuda. IMO, __host__ would be an appropriate attribute to assign to builtins in this scenario, even though I don't think we can currently do anything useful with NVPTX as the host. If you attempt to use __builtin_ptx_read_tid_x() during host compilation with a non-NVPTX target (which is the typical scenario), compilation will fail because that particular builtin would not be available at all.

That said, another corner case would be compiling CUDA for a device with the same architecture as the host. Again, it's not a practical scenario at the moment. If/when we get to the point where we want to do that, it should be easy to check for it and treat builtins as __host__ __device__, which would reflect their intended use domain.
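The rule argued for in this comment can be modeled in a few lines of standalone C++. This is only a sketch, not Clang code: `Target`, `implicitBuiltinTarget`, and `callAllowed` are hypothetical names, the call check is simplified to plain host or device callers, and treating non-target-specific builtins as host+device is a modeling assumption based on the builtins.cu test description (they are usable from both sides).

```cpp
#include <cassert>

// Hypothetical model of the attribute choice; none of these names exist in Clang.
enum class Target { Host, Device, HostDevice };

// Target-specific builtins take the side of the current compilation mode.
// Non-target-specific builtins are modeled as usable from both sides (assumption).
constexpr Target implicitBuiltinTarget(bool IsTargetSpecific, bool CUDAIsDevice) {
  return !IsTargetSpecific ? Target::HostDevice
                           : (CUDAIsDevice ? Target::Device : Target::Host);
}

// Simplified call check: the caller is a plain host or plain device function.
constexpr bool callAllowed(Target Caller, Target Callee) {
  return Callee == Target::HostDevice || Caller == Callee;
}

// Device compilation: a target builtin such as __builtin_ptx_read_tid_x is device-only.
static_assert(!callAllowed(Target::Host, implicitBuiltinTarget(true, true)),
              "host function must not call a device builtin");
static_assert(callAllowed(Target::Device, implicitBuiltinTarget(true, true)),
              "device function may call a device builtin");

// Host compilation: a target builtin such as __builtin_ia32_rdtsc is host-only.
static_assert(!callAllowed(Target::Device, implicitBuiltinTarget(true, false)),
              "device function must not call a host builtin");

// A generic builtin such as __builtin_abs is callable from either side.
static_assert(callAllowed(Target::Host, implicitBuiltinTarget(false, true)) &&
              callAllowed(Target::Device, implicitBuiltinTarget(false, false)),
              "generic builtins are callable from host and device");
```

The same-architecture corner case mentioned above would amount to returning `Target::HostDevice` for target-specific builtins as well when host and device triples match.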

lgtm

This revision is now accepted and ready to land. Aug 19 2015, 8:09 AM

Closed by commit rL245496: [CUDA] Add appropriate host/device attribute to builtins. (authored by tra). Aug 19 2015, 1:49 PM

This revision was automatically updated to reflect the committed changes.

Reverted in r245592 due to breaking internal tests.

tra mentioned this in D12453: [CUDA] Allow function overloads based on host/device attributes. Sep 1 2015, 3:08 PM

Revision Contents

Path                                             Size
cfe/trunk/include/clang/Basic/Builtins.h         5 lines
cfe/trunk/lib/Sema/SemaChecking.cpp              2 lines
cfe/trunk/lib/Sema/SemaDecl.cpp                  11 lines
cfe/trunk/test/SemaCUDA/builtins.cu              35 lines
cfe/trunk/test/SemaCUDA/implicit-intrinsic.cu    6 lines

Diff 32607

cfe/trunk/include/clang/Basic/Builtins.h

   const char *getName(unsigned ID) const {
     return getRecord(ID).Name;
   }

   /// \brief Get the type descriptor string for the specified builtin.
   const char *getTypeString(unsigned ID) const {
     return getRecord(ID).Type;
   }

+  /// \brief Return true if this function is a target-specific builtin
+  bool isTSBuiltin(unsigned ID) const {
+    return ID >= Builtin::FirstTSBuiltin;
+  }
+
   /// \brief Return true if this function has no side effects and doesn't
   /// read memory.
   bool isConst(unsigned ID) const {
     return strchr(getRecord(ID).Attributes, 'c') != nullptr;
   }

   /// \brief Return true if we know this builtin never throws an exception.
   bool isNoThrow(unsigned ID) const {
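isTSBuiltin works because builtin IDs are laid out with all target-independent builtins first and every target's own builtins numbered from Builtin::FirstTSBuiltin upward. The standalone sketch below illustrates that layout with a tiny, abbreviated set of enumerators; it is a simplified assumption about the numbering scheme, not a copy of Clang's headers, and the free function `isTSBuiltin` stands in for the member added in this diff.

```cpp
#include <cassert>

namespace Builtin {
// Abbreviated, illustrative subset of target-independent builtin IDs.
enum ID : unsigned {
  NotBuiltin = 0,
  BI__builtin_abs,
  BI__builtin_memcpy,
  FirstTSBuiltin // first ID available to target-specific builtins
};
} // namespace Builtin

namespace NVPTX {
// A target continues numbering where the target-independent IDs stop.
enum : unsigned {
  LastTIBuiltin = Builtin::FirstTSBuiltin - 1,
  BI__builtin_ptx_read_tid_x,
  LastTSBuiltin
};
} // namespace NVPTX

// Free-function stand-in for the member function added by the patch.
constexpr bool isTSBuiltin(unsigned ID) {
  return ID >= Builtin::FirstTSBuiltin;
}

static_assert(!isTSBuiltin(Builtin::BI__builtin_abs),
              "generic builtins sit below FirstTSBuiltin");
static_assert(isTSBuiltin(NVPTX::BI__builtin_ptx_read_tid_x),
              "target builtins start at FirstTSBuiltin");
```

Because different targets reuse the same ID range, the check says only that a builtin is target-specific, not which target it belongs to; that is why SemaChecking.cpp additionally switches on the target architecture.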

cfe/trunk/lib/Sema/SemaChecking.cpp

   case Builtin::BI__GetExceptionInfo:

     TheCall->setType(Context.VoidPtrTy);
     break;

   }

   // Since the target specific builtins for each arch overlap, only check those
   // of the arch we are compiling for.
-  if (BuiltinID >= Builtin::FirstTSBuiltin) {
+  if (Context.BuiltinInfo.isTSBuiltin(BuiltinID)) {
     switch (Context.getTargetInfo().getTriple().getArch()) {
     case llvm::Triple::arm:
     case llvm::Triple::armeb:
     case llvm::Triple::thumb:
     case llvm::Triple::thumbeb:
       if (CheckARMBuiltinFunctionCall(BuiltinID, TheCall))
         return ExprError();
       break;

cfe/trunk/lib/Sema/SemaDecl.cpp

   if (unsigned BuiltinID = FD->getBuiltinID()) {
     if (Context.BuiltinInfo.isReturnsTwice(BuiltinID) &&
         !FD->hasAttr<ReturnsTwiceAttr>())
       FD->addAttr(ReturnsTwiceAttr::CreateImplicit(Context,
                                                    FD->getLocation()));
     if (Context.BuiltinInfo.isNoThrow(BuiltinID) && !FD->hasAttr<NoThrowAttr>())
       FD->addAttr(NoThrowAttr::CreateImplicit(Context, FD->getLocation()));
     if (Context.BuiltinInfo.isConst(BuiltinID) && !FD->hasAttr<ConstAttr>())
       FD->addAttr(ConstAttr::CreateImplicit(Context, FD->getLocation()));
+    if (getLangOpts().CUDA && Context.BuiltinInfo.isTSBuiltin(BuiltinID) &&
+        !FD->hasAttr<CUDADeviceAttr>() && !FD->hasAttr<CUDAHostAttr>()) {
+      // Target-specific builtins are assumed to be intended for use
+      // in this particular CUDA compilation mode and should have
+      // appropriate attribute set so we can enforce CUDA function
+      // call restrictions.
+      if (getLangOpts().CUDAIsDevice)
+        FD->addAttr(CUDADeviceAttr::CreateImplicit(Context, FD->getLocation()));
+      else
+        FD->addAttr(CUDAHostAttr::CreateImplicit(Context, FD->getLocation()));
+    }
   }

   IdentifierInfo *Name = FD->getIdentifier();
   if (!Name)
     return;
   if ((!getLangOpts().CPlusPlus &&
        FD->getDeclContext()->isTranslationUnit()) ||
       (isa<LinkageSpecDecl>(FD->getDeclContext()) &&

cfe/trunk/test/SemaCUDA/builtins.cu

+// Tests that target-specific builtins have appropriate host/device
+// attributes and that CUDA call restrictions are enforced. Also
+// verify that non-target builtins can be used from both host and
+// device functions.
+//
+// REQUIRES: x86-registered-target
+// REQUIRES: nvptx-registered-target
+// RUN: %clang_cc1 -triple x86_64-unknown-unknown -fsyntax-only -verify %s
+// RUN: %clang_cc1 -triple nvptx64-unknown-cuda -fcuda-is-device \
+// RUN:            -fsyntax-only -verify %s
+
+
+#ifdef __CUDA_ARCH__
+// Device-side builtins are not allowed to be called from host functions.
+void hf() {
+  int x = __builtin_ptx_read_tid_x(); // expected-note {{'__builtin_ptx_read_tid_x' declared here}}
+  // expected-error@-1 {{reference to __device__ function '__builtin_ptx_read_tid_x' in __host__ function}}
+  x = __builtin_abs(1);
+}
+__attribute__((device)) void df() {
+  int x = __builtin_ptx_read_tid_x();
+  x = __builtin_abs(1);
+}
+#else
+// Host-side builtins are not allowed to be called from device functions.
+__attribute__((device)) void df() {
+  int x = __builtin_ia32_rdtsc(); // expected-note {{'__builtin_ia32_rdtsc' declared here}}
+  // expected-error@-1 {{reference to __host__ function '__builtin_ia32_rdtsc' in __device__ function}}
+  x = __builtin_abs(1);
+}
+void hf() {
+  int x = __builtin_ia32_rdtsc();
+  x = __builtin_abs(1);
+}
+#endif

cfe/trunk/test/SemaCUDA/implicit-intrinsic.cu

-// RUN: %clang_cc1 -std=gnu++11 -triple nvptx64-unknown-unknown -fsyntax-only -verify %s
+// RUN: %clang_cc1 -triple nvptx64-unknown-unknown -fcuda-is-device -fsyntax-only -verify %s

 #include "Inputs/cuda.h"

 // expected-no-diagnostics
 __device__ void __threadfence_system() {
-  // This shouldn't produce an error, since __nvvm_membar_sys is inferred to
-  // be __host__ __device__ and thus callable from device code.
+  // This shouldn't produce an error, since __nvvm_membar_sys should be
+  // __device__ and thus callable from device code.
   __nvvm_membar_sys();
 }