This is an archive of the discontinued LLVM Phabricator instance.

[CUDA] renamed cuda_runtime.h wrapper to __clang_cuda_runtime_wrapper.h
ClosedPublic

Authored by tra on Dec 15 2015, 10:30 AM.

Details

Reviewers

chandlerc
echristo

Commits

rG7fda3c9ff30f: [CUDA] renamed cuda_runtime.h wrapper to __cuda_runtime.h
rC255802: [CUDA] renamed cuda_runtime.h wrapper to __cuda_runtime.h
rL255802: [CUDA] renamed cuda_runtime.h wrapper to __cuda_runtime.h

Summary

Currently it's easy to break CUDA compilation by passing
"-isystem /path/to/cuda/include" to the compiler, which causes
it to include the real cuda_runtime.h from that directory instead
of the wrapper we need.

Renaming the wrapper ensures that we can include the wrapper
regardless of user-specified include paths and files.

The file is only intended to be -include'd by clang and should never be included by users, hence '__'.
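
To illustrate the failure mode described above, here is a minimal sketch of an ordered header search. It is a toy model and not Clang's actual header search code: `FakeFileSystem`, `findHeader`, and the directory names used with them are hypothetical, and the model assumes that a user-specified -isystem directory is searched before the compiler's own header directory.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical model: each directory maps to the header names it contains.
using FakeFileSystem = std::map<std::string, std::set<std::string>>;

// Walk the search path in order and return the first directory that
// contains `header`. Returns an empty string if no directory has it.
std::string findHeader(const FakeFileSystem &fs,
                       const std::vector<std::string> &searchDirs,
                       const std::string &header) {
  for (const std::string &dir : searchDirs) {
    auto it = fs.find(dir);
    if (it != fs.end() && it->second.count(header) != 0)
      return dir;
  }
  return "";
}
```

If both the CUDA directory and the compiler's directory contain a file called cuda_runtime.h, whichever directory comes first wins, so a user-supplied path can shadow the wrapper. A uniquely named wrapper such as __clang_cuda_runtime_wrapper.h exists in only one directory and is found regardless of the order.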

Event Timeline

tra updated this revision to Diff 42875. Dec 15 2015, 10:30 AM

tra retitled this revision to [CUDA] renamed cuda_runtime.h wrapper to __cuda_runtime.h.

tra updated this object.

tra added a reviewer: echristo.

tra added a subscriber: cfe-commits.

Changed name to __clang_cuda_runtime_wrapper.h
Added comments in the header explaining intended use.

The substance of the patch LGTM. My nit picking is just on the wording of the comment. =] Submit whenever.

lib/Headers/__clang_cuda_runtime_wrapper.h
25–26	"by compiler" -> "by the compiler"
28–29	You say above that they'll be included by clang directly? I think instead of "impossible to use them by clang directly" you want to say something more along the lines of "impossible for user code to #include directly when compiling with clang".
30–34	I would consistently capitalize "CUDA" when not talking about a particular header like cuda_runtime.h, and "Clang" and "NVCC" unless talking about running a command. Some other nits:
	"included from" -> "included"
	"so we have to abuse preprocessor in order to" -> "so we use the preprocessor to"
	"shape CUDA headers into something" -> "force the headers into a form that"

This revision is now accepted and ready to land. Dec 15 2015, 4:55 PM

jhen added a subscriber: jhen. Dec 15 2015, 5:13 PM

jhen added inline comments.

lib/Headers/__clang_cuda_runtime_wrapper.h
95	Now that the name of this header has been changed, would it be appropriate to change this #include_next to a simple #include?

Incorporated Chandler's suggestions.
Fixed #include_next -> #include.

lib/Headers/__clang_cuda_runtime_wrapper.h
95	Fixed.
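
The #include_next question raised above can be sketched with a small model. This is an illustration only: `searchFrom`, `includePlain`, `includeNext`, and the directory layout are hypothetical, and the model assumes the wrapper's directory precedes the CUDA directory on the search path. It relies on the documented behavior of #include_next, which resumes the search in the directory after the one where the current file was found.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical model: each directory maps to the header names it contains.
using FakeFileSystem = std::map<std::string, std::set<std::string>>;

// Search `dirs` starting at index `first`; return the directory that holds
// `header`, or an empty string if none does.
std::string searchFrom(const FakeFileSystem &fs,
                       const std::vector<std::string> &dirs,
                       std::size_t first, const std::string &header) {
  for (std::size_t i = first; i < dirs.size(); ++i) {
    auto it = fs.find(dirs[i]);
    if (it != fs.end() && it->second.count(header) != 0)
      return dirs[i];
  }
  return "";
}

// Plain #include: search the whole path from the beginning.
std::string includePlain(const FakeFileSystem &fs,
                         const std::vector<std::string> &dirs,
                         const std::string &header) {
  return searchFrom(fs, dirs, 0, header);
}

// #include_next: resume after the directory of the including file.
// If that directory is not on the path, this model falls back to a
// plain search.
std::string includeNext(const FakeFileSystem &fs,
                        const std::vector<std::string> &dirs,
                        const std::string &currentDir,
                        const std::string &header) {
  for (std::size_t i = 0; i < dirs.size(); ++i)
    if (dirs[i] == currentDir)
      return searchFrom(fs, dirs, i + 1, header);
  return searchFrom(fs, dirs, 0, header);
}
```

While the wrapper was itself named cuda_runtime.h, a plain search for that name could resolve back to the wrapper, so skipping ahead was required. Once the wrapper has a different name, a plain search for cuda_runtime.h can only find the CUDA copy in this model.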

Closed by commit rL255802: [CUDA] renamed cuda_runtime.h wrapper to __cuda_runtime.h (authored by tra). Dec 16 2015, 10:55 AM

This revision was automatically updated to reflect the committed changes.

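Revision Contents

Path	Size
lib/Driver/ToolChains.cpp	2 lines
lib/Headers/CMakeLists.txt	2 lines
lib/Headers/__clang_cuda_runtime_wrapper.h (moved from lib/Headers/cuda_runtime.h)	34 lines
lib/Headers/cuda_runtime.h (moved to lib/Headers/__clang_cuda_runtime_wrapper.h)
test/Driver/cuda-detect.cu	6 lines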

Diff 43025

lib/Driver/ToolChains.cpp

	void Linux::AddCudaIncludeArgs(const ArgList &DriverArgs,			void Linux::AddCudaIncludeArgs(const ArgList &DriverArgs,
	ArgStringList &CC1Args) const {			ArgStringList &CC1Args) const {
	if (DriverArgs.hasArg(options::OPT_nocudainc))			if (DriverArgs.hasArg(options::OPT_nocudainc))
	return;			return;

	if (CudaInstallation.isValid()) {			if (CudaInstallation.isValid()) {
	addSystemInclude(DriverArgs, CC1Args, CudaInstallation.getIncludePath());			addSystemInclude(DriverArgs, CC1Args, CudaInstallation.getIncludePath());
	CC1Args.push_back("-include");			CC1Args.push_back("-include");
	CC1Args.push_back("cuda_runtime.h");			CC1Args.push_back("__clang_cuda_runtime_wrapper.h");
	}			}
	}			}

	bool Linux::isPIEDefault() const { return getSanitizerArgs().requiresPIE(); }			bool Linux::isPIEDefault() const { return getSanitizerArgs().requiresPIE(); }

	SanitizerMask Linux::getSupportedSanitizers() const {			SanitizerMask Linux::getSupportedSanitizers() const {
	const bool IsX86 = getTriple().getArch() == llvm::Triple::x86;			const bool IsX86 = getTriple().getArch() == llvm::Triple::x86;
	const bool IsX86_64 = getTriple().getArch() == llvm::Triple::x86_64;			const bool IsX86_64 = getTriple().getArch() == llvm::Triple::x86_64;
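
The effect of the ToolChains.cpp change can be sketched outside the driver as follows. This is a simplified stand-in, not the driver code: `FakeCudaInstallation` and `cudaIncludeArgs` are hypothetical names, and the "-internal-isystem" spelling is taken from the CHECK lines in test/Driver/cuda-detect.cu.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the driver's CUDA installation detector.
struct FakeCudaInstallation {
  bool valid;
  std::string includePath;
};

// Build the include-related cc1 arguments for a CUDA compilation.
// `noCudaInc` models the -nocudainc flag.
std::vector<std::string> cudaIncludeArgs(bool noCudaInc,
                                         const FakeCudaInstallation &cuda) {
  std::vector<std::string> args;
  if (noCudaInc)
    return args;
  if (cuda.valid) {
    // Make the CUDA headers visible as system headers.
    args.push_back("-internal-isystem");
    args.push_back(cuda.includePath);
    // Force-include the wrapper under its new, collision-free name.
    args.push_back("-include");
    args.push_back("__clang_cuda_runtime_wrapper.h");
  }
  return args;
}
```

With a valid installation the function yields four arguments ending in the wrapper name; with -nocudainc or without a valid installation it yields none, which mirrors the CUDAINC and NOCUDAINC expectations in the test.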

lib/Headers/CMakeLists.txt

set(files		set(files
avx512fintrin.h		avx512fintrin.h
avx512vlbwintrin.h		avx512vlbwintrin.h
avx512vlintrin.h		avx512vlintrin.h
avx512dqintrin.h		avx512dqintrin.h
avx512vldqintrin.h		avx512vldqintrin.h
avxintrin.h		avxintrin.h
bmi2intrin.h		bmi2intrin.h
bmiintrin.h		bmiintrin.h
		__clang_cuda_runtime_wrapper.h
cpuid.h		cpuid.h
cuda_builtin_vars.h		cuda_builtin_vars.h
cuda_runtime.h
emmintrin.h		emmintrin.h
f16cintrin.h		f16cintrin.h
float.h		float.h
fma4intrin.h		fma4intrin.h
fmaintrin.h		fmaintrin.h
fxsrintrin.h		fxsrintrin.h
htmintrin.h		htmintrin.h
htmxlintrin.h		htmxlintrin.h

lib/Headers/__clang_cuda_runtime_wrapper.h

This file was moved from lib/Headers/cuda_runtime.h.

	/*===---- cuda_runtime.h - CUDA runtime support ----------------------------===			/*===---- __clang_cuda_runtime_wrapper.h - CUDA runtime support -------------===
	*			*
	* Permission is hereby granted, free of charge, to any person obtaining a copy			* Permission is hereby granted, free of charge, to any person obtaining a copy
	* of this software and associated documentation files (the "Software"), to deal			* of this software and associated documentation files (the "Software"), to deal
	* in the Software without restriction, including without limitation the rights			* in the Software without restriction, including without limitation the rights
	* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell			* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
	* copies of the Software, and to permit persons to whom the Software is			* copies of the Software, and to permit persons to whom the Software is
	* furnished to do so, subject to the following conditions:			* furnished to do so, subject to the following conditions:
	*			*
	* The above copyright notice and this permission notice shall be included in			* The above copyright notice and this permission notice shall be included in
	* all copies or substantial portions of the Software.			* all copies or substantial portions of the Software.
	*			*
	* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR			* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
	* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,			* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
	* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE			* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
	* AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER			* AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
	* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,			* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
	* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN			* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
	* THE SOFTWARE.			* THE SOFTWARE.
	*			*
	*===-----------------------------------------------------------------------===			*===-----------------------------------------------------------------------===
	*/			*/

	#ifndef __CLANG_CUDA_RUNTIME_H__			/*
	#define __CLANG_CUDA_RUNTIME_H__			* WARNING: This header is intended to be directly -include'd by
				* the compiler and is not supposed to be included by users.
				*
				* CUDA headers are implemented in a way that currently makes it
				* impossible for user code to #include directly when compiling with
				* Clang. They present different view of CUDA-supplied functions
				* depending on where in NVCC's compilation pipeline the headers are
				* included. Neither of these modes provides function definitions with
				* correct attributes, so we use preprocessor to force the headers
				* into a form that Clang can use.
				*
				* Similarly to NVCC which -include's cuda_runtime.h, Clang -include's
				* this file during every CUDA compilation.
				*/

				#ifndef __CLANG_CUDA_RUNTIME_WRAPPER_H__
				#define __CLANG_CUDA_RUNTIME_WRAPPER_H__

	#if defined(__CUDA__) && defined(__clang__)			#if defined(__CUDA__) && defined(__clang__)

	// Include some standard headers to avoid CUDA headers including them			// Include some standard headers to avoid CUDA headers including them
	// while some required macros (like __THROW) are in a weird state.			// while some required macros (like __THROW) are in a weird state.
	#include <stdlib.h>			#include <stdlib.h>

	// Preserve common macros that will be changed below by us or by CUDA			// Preserve common macros that will be changed below by us or by CUDA
	// headers.			// headers.
	#pragma push_macro("__THROW")			#pragma push_macro("__THROW")
	#pragma push_macro("__CUDA_ARCH__")			#pragma push_macro("__CUDA_ARCH__")

	// WARNING: Preprocessor hacks below are based on specific of			// WARNING: Preprocessor hacks below are based on specific details of
	// implementation of CUDA-7.x headers and are expected to break with			// CUDA-7.x headers and are not expected to work with any other
	// any other version of CUDA headers.			// version of CUDA headers.
	#include "cuda.h"			#include "cuda.h"
	#if !defined(CUDA_VERSION)			#if !defined(CUDA_VERSION)
	#error "cuda.h did not define CUDA_VERSION"			#error "cuda.h did not define CUDA_VERSION"
	#elif CUDA_VERSION < 7000 || CUDA_VERSION > 7050			#elif CUDA_VERSION < 7000 || CUDA_VERSION > 7050
	#error "Unsupported CUDA version!"			#error "Unsupported CUDA version!"
	#endif			#endif

	// Make largest subset of device functions available during host			// Make largest subset of device functions available during host
	[22 lines not shown]
	#include "host_config.h"			#include "host_config.h"
	#include "host_defines.h"			#include "host_defines.h"
	#include "driver_types.h"			#include "driver_types.h"
	#include "common_functions.h"			#include "common_functions.h"
	#undef __CUDADEVRT_INTERNAL__			#undef __CUDADEVRT_INTERNAL__

	#undef __CUDABE__			#undef __CUDABE__
	#define __CUDACC__			#define __CUDACC__
	#include_next "cuda_runtime.h"			#include "cuda_runtime.h"

	#undef __CUDACC__			#undef __CUDACC__
	#define __CUDABE__			#define __CUDABE__

	// CUDA headers use __nvvm_memcpy and __nvvm_memset which clang does			// CUDA headers use __nvvm_memcpy and __nvvm_memset which Clang does
	// not have at the moment. Emulate them with a builtin memcpy/memset.			// not have at the moment. Emulate them with a builtin memcpy/memset.
	#define __nvvm_memcpy(s,d,n,a) __builtin_memcpy(s,d,n)			#define __nvvm_memcpy(s,d,n,a) __builtin_memcpy(s,d,n)
	#define __nvvm_memset(d,c,n,a) __builtin_memset(d,c,n)			#define __nvvm_memset(d,c,n,a) __builtin_memset(d,c,n)

	#include "crt/host_runtime.h"			#include "crt/host_runtime.h"
	#include "crt/device_runtime.h"			#include "crt/device_runtime.h"
	// device_runtime.h defines __cxa_* macros that will conflict with			// device_runtime.h defines __cxa_* macros that will conflict with
	// cxxabi.h.			// cxxabi.h.
	[78 lines not shown]
	// function which is implicitly assumed by NVVMReflect pass.			// function which is implicitly assumed by NVVMReflect pass.
	extern "C" __device__ __attribute__((const)) int __nvvm_reflect(const void *);			extern "C" __device__ __attribute__((const)) int __nvvm_reflect(const void *);
	static __device__ __attribute__((used)) int __nvvm_reflect_anchor() {			static __device__ __attribute__((used)) int __nvvm_reflect_anchor() {
	return __nvvm_reflect("NONE");			return __nvvm_reflect("NONE");
	}			}
	#endif			#endif

	#endif // __CUDA__			#endif // __CUDA__
	#endif // __CLANG_CUDA_RUNTIME_H__			#endif // __CLANG_CUDA_RUNTIME_WRAPPER_H__
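
Two preprocessor techniques visible in the wrapper can be tried in isolation: saving a macro with #pragma push_macro before redefining it, and mapping a four-argument copy intrinsic onto a three-argument builtin. The sketch below is an assumption-laden demo, not part of the patch: `DEMO_MODE`, `demo_nvvm_memcpy`, and the helper functions are hypothetical names, the restoring #pragma pop_macro is assumed because it does not appear in the lines shown, and `__builtin_memcpy` requires a GCC- or Clang-compatible compiler.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical macro standing in for __THROW or __CUDA_ARCH__.
#define DEMO_MODE 1

// Save the current definition, then change it for a limited region.
#pragma push_macro("DEMO_MODE")
#undef DEMO_MODE
#define DEMO_MODE 2
int modeInsideRegion() { return DEMO_MODE; }

// Restore the saved definition.
#pragma pop_macro("DEMO_MODE")
int modeAfterRegion() { return DEMO_MODE; }

// Same shape as the wrapper's __nvvm_memcpy emulation: the fourth
// (alignment) argument is accepted and dropped.
#define demo_nvvm_memcpy(d, s, n, a) __builtin_memcpy(d, s, n)

// Copy `n` bytes from `src` to `dst` through the emulation macro.
void copyBytes(char *dst, const char *src, std::size_t n) {
  demo_nvvm_memcpy(dst, src, n, 1);
}
```

Functions defined between the push and the pop see the temporary value, while code after the pop sees the original one again.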

lib/Headers/cuda_runtime.h

This file was moved to lib/Headers/__clang_cuda_runtime_wrapper.h.

test/Driver/cuda-detect.cu

	// RUN: %clang -### -v --target=i386-unknown-linux --cuda-gpu-arch=sm_30 \			// RUN: %clang -### -v --target=i386-unknown-linux --cuda-gpu-arch=sm_30 \
	// RUN: --cuda-path=%S/Inputs/CUDA/usr/local/cuda %s 2>&1 \			// RUN: --cuda-path=%S/Inputs/CUDA/usr/local/cuda %s 2>&1 \
	// RUN: | FileCheck %s -check-prefix COMMON -check-prefix NOLIBDEVICE			// RUN: | FileCheck %s -check-prefix COMMON -check-prefix NOLIBDEVICE
	// .. or if we explicitly passed -nocudalib			// .. or if we explicitly passed -nocudalib
	// RUN: %clang -### -v --target=i386-unknown-linux --cuda-gpu-arch=sm_35 \			// RUN: %clang -### -v --target=i386-unknown-linux --cuda-gpu-arch=sm_35 \
	// RUN: -nocudalib --cuda-path=%S/Inputs/CUDA/usr/local/cuda %s 2>&1 \			// RUN: -nocudalib --cuda-path=%S/Inputs/CUDA/usr/local/cuda %s 2>&1 \
	// RUN: | FileCheck %s -check-prefix COMMON -check-prefix NOLIBDEVICE			// RUN: | FileCheck %s -check-prefix COMMON -check-prefix NOLIBDEVICE
	// Verify that we don't add include paths, link with libdevice or			// Verify that we don't add include paths, link with libdevice or
	// -include cuda_runtime without valid CUDA installation.			// -include __clang_cuda_runtime_wrapper.h without valid CUDA installation.
	// RUN: %clang -### -v --target=i386-unknown-linux --cuda-gpu-arch=sm_35 \			// RUN: %clang -### -v --target=i386-unknown-linux --cuda-gpu-arch=sm_35 \
	// RUN: --cuda-path=%S/no-cuda-there %s 2>&1 \			// RUN: --cuda-path=%S/no-cuda-there %s 2>&1 \
	// RUN: | FileCheck %s -check-prefix COMMON \			// RUN: | FileCheck %s -check-prefix COMMON \
	// RUN: -check-prefix NOCUDAINC -check-prefix NOLIBDEVICE			// RUN: -check-prefix NOCUDAINC -check-prefix NOLIBDEVICE

	// CHECK: Found CUDA installation: {{.*}}/Inputs/CUDA/usr/local/cuda			// CHECK: Found CUDA installation: {{.*}}/Inputs/CUDA/usr/local/cuda
	// NOCUDA-NOT: Found CUDA installation:			// NOCUDA-NOT: Found CUDA installation:

	// COMMON: "-triple" "nvptx-nvidia-cuda"			// COMMON: "-triple" "nvptx-nvidia-cuda"
	// COMMON-SAME: "-fcuda-is-device"			// COMMON-SAME: "-fcuda-is-device"
	// LIBDEVICE-SAME: "-mlink-cuda-bitcode"			// LIBDEVICE-SAME: "-mlink-cuda-bitcode"
	// NOLIBDEVICE-NOT: "-mlink-cuda-bitcode"			// NOLIBDEVICE-NOT: "-mlink-cuda-bitcode"
	// LIBDEVICE21-SAME: libdevice.compute_20.10.bc			// LIBDEVICE21-SAME: libdevice.compute_20.10.bc
	// LIBDEVICE35-SAME: libdevice.compute_35.10.bc			// LIBDEVICE35-SAME: libdevice.compute_35.10.bc
	// NOLIBDEVICE-NOT: libdevice.compute_{{.*}}.bc			// NOLIBDEVICE-NOT: libdevice.compute_{{.*}}.bc
	// LIBDEVICE-SAME: "-target-feature" "+ptx42"			// LIBDEVICE-SAME: "-target-feature" "+ptx42"
	// NOLIBDEVICE-NOT: "-target-feature" "+ptx42"			// NOLIBDEVICE-NOT: "-target-feature" "+ptx42"
	// CUDAINC-SAME: "-internal-isystem" "{{.*}}/Inputs/CUDA/usr/local/cuda/include"			// CUDAINC-SAME: "-internal-isystem" "{{.*}}/Inputs/CUDA/usr/local/cuda/include"
	// NOCUDAINC-NOT: "-internal-isystem" "{{.*}}/cuda/include"			// NOCUDAINC-NOT: "-internal-isystem" "{{.*}}/cuda/include"
	// CUDAINC-SAME: "-include" "cuda_runtime.h"			// CUDAINC-SAME: "-include" "__clang_cuda_runtime_wrapper.h"
	// NOCUDAINC-NOT: "-include" "cuda_runtime.h"			// NOCUDAINC-NOT: "-include" "__clang_cuda_runtime_wrapper.h"
	// COMMON-SAME: "-x" "cuda"			// COMMON-SAME: "-x" "cuda"