This is an archive of the discontinued LLVM Phabricator instance.

[CUDA] Add utility functions for dealing with CUDA versions / architectures.
ClosedPublic

Authored by jlebar on Jun 29 2016, 3:58 PM.

Details

Reviewers

Commits

rG629076178a5e: [CUDA] Add utility functions for dealing with CUDA versions / architectures.
rC274681: [CUDA] Add utility functions for dealing with CUDA versions / architectures.
rL274681: [CUDA] Add utility functions for dealing with CUDA versions / architectures.

Summary

Currently our handling of CUDA architectures is scattered all around
clang. This patch centralizes it.

A key advantage of this centralization is that you can now write a C++
switch on e.g. CudaArch and get a compile error if you don't handle one
of the enum values.
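
To make the exhaustive-switch point concrete, here is a minimal standalone sketch. It uses a reduced stand-in for the CudaArch enum (three values only) and a hypothetical helper, ArchMajor, neither of which is part of this patch. Strictly speaking, Clang and GCC report a missing case as a -Wswitch warning; it is a hard error only when the build promotes warnings with -Werror, which is an assumption about the build configuration.

```cpp
#include <cassert>

// Reduced stand-in for clang::CudaArch; the real enum lists many more values.
enum class CudaArch { UNKNOWN, SM_20, SM_30 };

// Hypothetical helper: returns the major compute capability, or 0 if unknown.
// There is deliberately no `default:` label, so adding a new enumerator
// without a matching case makes -Wswitch flag this function.
int ArchMajor(CudaArch A) {
  switch (A) {
  case CudaArch::UNKNOWN:
    return 0;
  case CudaArch::SM_20:
    return 2;
  case CudaArch::SM_30:
    return 3;
  }
  // Only reachable if A holds a value outside the enumerator list.
  assert(false && "invalid CudaArch value");
  return 0;
}
```

Adding, say, SM_35 to the enum without touching ArchMajor would then be diagnosed at every such switch, which is the benefit the summary describes.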

Diff Detail

Repository: rL LLVM

Event Timeline

jlebar updated this revision to Diff 62299. Jun 29 2016, 3:58 PM

jlebar retitled this revision to [CUDA] Add utility functions for dealing with CUDA versions / architectures.

jlebar updated this object.

jlebar added a reviewer: tra.

jlebar added a subscriber: cfe-commits.

LGTM.

lib/Basic/Cuda.cpp
8–19 ↗	(On Diff #62299)	We seem to do a lot of enum->string and string->enum mapping in this file. Is there something comparable to Boost.bimap in standard c++ library or in LLVM?
lib/Driver/Driver.cpp
1026–1028 ↗	(On Diff #62299)	I think this could be collapsed to just CudaArchToString(CDA->getGpuArch()). "(multiple archs)" is as informative as (and indistinguishable from) "unknown" here.

This revision is now accepted and ready to land. Jun 30 2016, 9:50 AM

jlebar marked an inline comment as done. Jun 30 2016, 12:53 PM

jlebar added inline comments.

lib/Basic/Cuda.cpp
8–19 ↗	(On Diff #62299)	Not to my knowledge.
lib/Driver/Driver.cpp
1026–1028 ↗	(On Diff #62299)	I'm not crazy about "unknown", since it is actually known. How about we just not output anything?

tra added inline comments. Jun 30 2016, 1:27 PM

lib/Driver/Driver.cpp
1026–1028 ↗	(On Diff #62299)	It's a debugging output so it would be good to accurately reflect our internal state. In this case if we for some reason end up with CudaArch::UNKNOWN, I'd want to know that. If we really use UNKNOWN to represent multiple archs, perhaps it needs an enum for multiple-archs.

jlebar marked an inline comment as done. Jun 30 2016, 1:31 PM

jlebar added inline comments.

lib/Driver/Driver.cpp
1026–1028 ↗	(On Diff #62299)	We really do use UNKNOWN here to represent multiple architectures. It is used for the architecture of the Action corresponding to the call to fatbin. I think adding an enum value for multiple-archs is going to be more harmful than useful, because it means that everywhere that we switch() on arch, we're going to have to handle (and assert) MULTIPLE_ARCHs.

tra added inline comments. Jun 30 2016, 1:44 PM

lib/Driver/Driver.cpp
1026–1028 ↗	(On Diff #62299)	OK. No output is fine with me.
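
The outcome of this thread (print the architecture when it is known, print nothing when the action covers multiple architectures) can be sketched in isolation as follows. This is an illustrative stand-in, not the driver code: std::ostringstream replaces the driver's output stream, the enum is reduced to three values, and DescribeDeviceAction is a hypothetical name.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Reduced stand-in for clang::CudaArch.
enum class CudaArch { UNKNOWN, SM_20, SM_35 };

const char *CudaArchToString(CudaArch A) {
  switch (A) {
  case CudaArch::UNKNOWN:
    return "unknown";
  case CudaArch::SM_20:
    return "sm_20";
  case CudaArch::SM_35:
    return "sm_35";
  }
  assert(false && "invalid CudaArch value");
  return "unknown";
}

// Hypothetical helper: formats a device action for debug output.
// UNKNOWN stands for "multiple architectures" here, so the arch field is
// omitted entirely rather than printed as "unknown".
std::string DescribeDeviceAction(CudaArch Arch, const std::string &Inputs) {
  std::ostringstream OS;
  if (Arch != CudaArch::UNKNOWN)
    OS << "'" << CudaArchToString(Arch) << "', ";
  OS << "{" << Inputs << "}";
  return OS.str();
}
```

With this shape, a reader of the debug output can tell a single-architecture action from the fatbin action by whether the quoted arch field is present.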

Address Art's review.

Closed by commit rL274681: [CUDA] Add utility functions for dealing with CUDA versions / architectures. (authored by jlebar). Jul 6 2016, 2:29 PM

This revision was automatically updated to reflect the committed changes.

Revision Contents

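Path	Size
cfe/trunk/include/clang/Basic/Cuda.h	77 lines
cfe/trunk/include/clang/Driver/Action.h	19 lines
cfe/trunk/lib/Basic/CMakeLists.txt	1 line
cfe/trunk/lib/Basic/Cuda.cpp	165 lines
cfe/trunk/lib/Basic/Targets.cpp	68 lines
cfe/trunk/lib/Driver/Action.cpp	36 lines
cfe/trunk/lib/Driver/Driver.cpp	29 lines
cfe/trunk/lib/Driver/Tools.cpp	7 lines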

Diff 62975

cfe/trunk/include/clang/Basic/Cuda.h

				//===--- Cuda.h - Utilities for compiling CUDA code ------------*- C++ -*-===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//

				#ifndef LLVM_CLANG_BASIC_CUDA_H
				#define LLVM_CLANG_BASIC_CUDA_H

				namespace llvm {
				class StringRef;
				} // namespace llvm

				namespace clang {

				enum class CudaVersion {
				UNKNOWN,
				CUDA_70,
				CUDA_75,
				CUDA_80,
				};
				const char *CudaVersionToString(CudaVersion V);

				// No string -> CudaVersion conversion function because there's no canonical
				// spelling of the various CUDA versions.

				enum class CudaArch {
				UNKNOWN,
				SM_20,
				SM_21,
				SM_30,
				SM_32,
				SM_35,
				SM_37,
				SM_50,
				SM_52,
				SM_53,
				SM_60,
				SM_61,
				SM_62,
				};
				const char *CudaArchToString(CudaArch A);

				// The input should have the form "sm_20".
				CudaArch StringToCudaArch(llvm::StringRef S);

				enum class CudaVirtualArch {
				UNKNOWN,
				COMPUTE_20,
				COMPUTE_30,
				COMPUTE_32,
				COMPUTE_35,
				COMPUTE_37,
				COMPUTE_50,
				COMPUTE_52,
				COMPUTE_53,
				COMPUTE_60,
				COMPUTE_61,
				COMPUTE_62,
				};
				const char *CudaVirtualArchToString(CudaVirtualArch A);

				// The input should have the form "compute_20".
				CudaVirtualArch StringToCudaVirtualArch(llvm::StringRef S);

				/// Get the compute_xx corresponding to an sm_yy.
				CudaVirtualArch VirtualArchForCudaArch(CudaArch A);

				/// Get the earliest CudaVersion that supports the given CudaArch.
				CudaVersion MinVersionForCudaArch(CudaArch A);

				} // namespace clang

				#endif
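
As a sketch of how a caller might use MinVersionForCudaArch from the header above, the following standalone example checks whether an installed CUDA version can target a given architecture. The enums are reduced stand-ins, VersionSupportsArch is a hypothetical helper that is not part of the patch, and the `>=` comparison assumes the CudaVersion enumerators are declared in ascending release order (true of the header as shown, but not something the header promises).

```cpp
#include <cassert>

// Reduced stand-ins for the enums declared in Cuda.h.
enum class CudaVersion { UNKNOWN, CUDA_70, CUDA_75, CUDA_80 };
enum class CudaArch { UNKNOWN, SM_35, SM_60 };

// Same mapping as the patch, restricted to the reduced enum.
CudaVersion MinVersionForCudaArch(CudaArch A) {
  switch (A) {
  case CudaArch::UNKNOWN:
    return CudaVersion::UNKNOWN;
  case CudaArch::SM_35:
    return CudaVersion::CUDA_70;
  case CudaArch::SM_60:
    return CudaVersion::CUDA_80;
  }
  assert(false && "invalid CudaArch value");
  return CudaVersion::UNKNOWN;
}

// Hypothetical helper: true if Installed is new enough for A.
// Relies on enumerator order matching release order (an assumption).
bool VersionSupportsArch(CudaVersion Installed, CudaArch A) {
  if (Installed == CudaVersion::UNKNOWN || A == CudaArch::UNKNOWN)
    return false;
  return Installed >= MinVersionForCudaArch(A);
}
```

For example, VersionSupportsArch(CudaVersion::CUDA_75, CudaArch::SM_60) is false under this mapping, because sm_60 first appears with CUDA 8.0.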

cfe/trunk/include/clang/Driver/Action.h

//===--- Action.h - Abstract compilation steps ------------------*- C++ -*-===//		//===--- Action.h - Abstract compilation steps ------------------*- C++ -*-===//
//		//
// The LLVM Compiler Infrastructure		// The LLVM Compiler Infrastructure
//		//
// This file is distributed under the University of Illinois Open Source		// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.		// License. See LICENSE.TXT for details.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#ifndef LLVM_CLANG_DRIVER_ACTION_H		#ifndef LLVM_CLANG_DRIVER_ACTION_H
#define LLVM_CLANG_DRIVER_ACTION_H		#define LLVM_CLANG_DRIVER_ACTION_H

		#include "clang/Basic/Cuda.h"
#include "clang/Driver/Types.h"		#include "clang/Driver/Types.h"
#include "clang/Driver/Util.h"		#include "clang/Driver/Util.h"
#include "llvm/ADT/SmallVector.h"		#include "llvm/ADT/SmallVector.h"

namespace llvm {		namespace llvm {

class StringRef;		class StringRef;

▲ Show 20 Lines • Show All 131 Lines • ▼ Show 20 Lines	public:

static bool classof(const Action *A) {		static bool classof(const Action *A) {
return A->getKind() == BindArchClass;		return A->getKind() == BindArchClass;
}		}
};		};

class CudaDeviceAction : public Action {		class CudaDeviceAction : public Action {
virtual void anchor();		virtual void anchor();
/// GPU architecture to bind. Always of the form /sm_\d+/ or null (when the
/// action applies to multiple architectures).		const CudaArch GpuArch;
const char *GpuArchName;
/// True when action results are not consumed by the host action (e.g when		/// True when action results are not consumed by the host action (e.g when
/// -fsyntax-only or --cuda-device-only options are used).		/// -fsyntax-only or --cuda-device-only options are used).
bool AtTopLevel;		bool AtTopLevel;

public:		public:
CudaDeviceAction(Action *Input, const char *ArchName, bool AtTopLevel);		CudaDeviceAction(Action *Input, CudaArch Arch, bool AtTopLevel);

const char *getGpuArchName() const { return GpuArchName; }		/// Get the CUDA GPU architecture to which this Action corresponds. Returns
		/// UNKNOWN if this Action corresponds to multiple architectures.
/// Gets the compute_XX that corresponds to getGpuArchName(). Returns null		CudaArch getGpuArch() const { return GpuArch; }
/// when getGpuArchName() is null.
const char *getComputeArchName() const;

bool isAtTopLevel() const { return AtTopLevel; }		bool isAtTopLevel() const { return AtTopLevel; }

static bool IsValidGpuArchName(llvm::StringRef ArchName);

static bool classof(const Action *A) {		static bool classof(const Action *A) {
return A->getKind() == CudaDeviceClass;		return A->getKind() == CudaDeviceClass;
}		}
};		};

class CudaHostAction : public Action {		class CudaHostAction : public Action {
virtual void anchor();		virtual void anchor();
ActionList DeviceActions;		ActionList DeviceActions;
▲ Show 20 Lines • Show All 154 Lines • Show Last 20 Lines

cfe/trunk/lib/Basic/CMakeLists.txt

Show First 20 Lines • Show All 60 Lines • ▼ Show 20 Lines	set_source_files_properties(Version.cpp
PROPERTIES COMPILE_DEFINITIONS "SVN_REVISION=\"${SVN_REVISION}\"")		PROPERTIES COMPILE_DEFINITIONS "SVN_REVISION=\"${SVN_REVISION}\"")
endif()		endif()
endif()		endif()

add_clang_library(clangBasic		add_clang_library(clangBasic
Attributes.cpp		Attributes.cpp
Builtins.cpp		Builtins.cpp
CharInfo.cpp		CharInfo.cpp
		Cuda.cpp
Diagnostic.cpp		Diagnostic.cpp
DiagnosticIDs.cpp		DiagnosticIDs.cpp
DiagnosticOptions.cpp		DiagnosticOptions.cpp
FileManager.cpp		FileManager.cpp
FileSystemStatCache.cpp		FileSystemStatCache.cpp
IdentifierTable.cpp		IdentifierTable.cpp
LangOptions.cpp		LangOptions.cpp
Module.cpp		Module.cpp
Show All 17 Lines

cfe/trunk/lib/Basic/Cuda.cpp

				#include "clang/Basic/Cuda.h"

				#include "llvm/ADT/StringRef.h"
				#include "llvm/ADT/StringSwitch.h"

				namespace clang {

				const char *CudaVersionToString(CudaVersion V) {
				switch (V) {
				case CudaVersion::UNKNOWN:
				return "unknown";
				case CudaVersion::CUDA_70:
				return "7.0";
				case CudaVersion::CUDA_75:
				return "7.5";
				case CudaVersion::CUDA_80:
				return "8.0";
				}
				}

				const char *CudaArchToString(CudaArch A) {
				switch (A) {
				case CudaArch::UNKNOWN:
				return "unknown";
				case CudaArch::SM_20:
				return "sm_20";
				case CudaArch::SM_21:
				return "sm_21";
				case CudaArch::SM_30:
				return "sm_30";
				case CudaArch::SM_32:
				return "sm_32";
				case CudaArch::SM_35:
				return "sm_35";
				case CudaArch::SM_37:
				return "sm_37";
				case CudaArch::SM_50:
				return "sm_50";
				case CudaArch::SM_52:
				return "sm_52";
				case CudaArch::SM_53:
				return "sm_53";
				case CudaArch::SM_60:
				return "sm_60";
				case CudaArch::SM_61:
				return "sm_61";
				case CudaArch::SM_62:
				return "sm_62";
				}
				}

				CudaArch StringToCudaArch(llvm::StringRef S) {
				return llvm::StringSwitch<CudaArch>(S)
				.Case("sm_20", CudaArch::SM_20)
				.Case("sm_21", CudaArch::SM_21)
				.Case("sm_30", CudaArch::SM_30)
				.Case("sm_32", CudaArch::SM_32)
				.Case("sm_35", CudaArch::SM_35)
				.Case("sm_37", CudaArch::SM_37)
				.Case("sm_50", CudaArch::SM_50)
				.Case("sm_52", CudaArch::SM_52)
				.Case("sm_53", CudaArch::SM_53)
				.Case("sm_60", CudaArch::SM_60)
				.Case("sm_61", CudaArch::SM_61)
				.Case("sm_62", CudaArch::SM_62)
				.Default(CudaArch::UNKNOWN);
				}

				const char *CudaVirtualArchToString(CudaVirtualArch A) {
				switch (A) {
				case CudaVirtualArch::UNKNOWN:
				return "unknown";
				case CudaVirtualArch::COMPUTE_20:
				return "compute_20";
				case CudaVirtualArch::COMPUTE_30:
				return "compute_30";
				case CudaVirtualArch::COMPUTE_32:
				return "compute_32";
				case CudaVirtualArch::COMPUTE_35:
				return "compute_35";
				case CudaVirtualArch::COMPUTE_37:
				return "compute_37";
				case CudaVirtualArch::COMPUTE_50:
				return "compute_50";
				case CudaVirtualArch::COMPUTE_52:
				return "compute_52";
				case CudaVirtualArch::COMPUTE_53:
				return "compute_53";
				case CudaVirtualArch::COMPUTE_60:
				return "compute_60";
				case CudaVirtualArch::COMPUTE_61:
				return "compute_61";
				case CudaVirtualArch::COMPUTE_62:
				return "compute_62";
				}
				}

				CudaVirtualArch StringToCudaVirtualArch(llvm::StringRef S) {
				return llvm::StringSwitch<CudaVirtualArch>(S)
				.Case("compute_20", CudaVirtualArch::COMPUTE_20)
				.Case("compute_30", CudaVirtualArch::COMPUTE_30)
				.Case("compute_32", CudaVirtualArch::COMPUTE_32)
				.Case("compute_35", CudaVirtualArch::COMPUTE_35)
				.Case("compute_37", CudaVirtualArch::COMPUTE_37)
				.Case("compute_50", CudaVirtualArch::COMPUTE_50)
				.Case("compute_52", CudaVirtualArch::COMPUTE_52)
				.Case("compute_53", CudaVirtualArch::COMPUTE_53)
				.Case("compute_60", CudaVirtualArch::COMPUTE_60)
				.Case("compute_61", CudaVirtualArch::COMPUTE_61)
				.Case("compute_62", CudaVirtualArch::COMPUTE_62)
				.Default(CudaVirtualArch::UNKNOWN);
				}

				CudaVirtualArch VirtualArchForCudaArch(CudaArch A) {
				switch (A) {
				case CudaArch::UNKNOWN:
				return CudaVirtualArch::UNKNOWN;
				case CudaArch::SM_20:
				case CudaArch::SM_21:
				return CudaVirtualArch::COMPUTE_20;
				case CudaArch::SM_30:
				return CudaVirtualArch::COMPUTE_30;
				case CudaArch::SM_32:
				return CudaVirtualArch::COMPUTE_32;
				case CudaArch::SM_35:
				return CudaVirtualArch::COMPUTE_35;
				case CudaArch::SM_37:
				return CudaVirtualArch::COMPUTE_37;
				case CudaArch::SM_50:
				return CudaVirtualArch::COMPUTE_50;
				case CudaArch::SM_52:
				return CudaVirtualArch::COMPUTE_52;
				case CudaArch::SM_53:
				return CudaVirtualArch::COMPUTE_53;
				case CudaArch::SM_60:
				return CudaVirtualArch::COMPUTE_60;
				case CudaArch::SM_61:
				return CudaVirtualArch::COMPUTE_61;
				case CudaArch::SM_62:
				return CudaVirtualArch::COMPUTE_62;
				}
				}

				CudaVersion MinVersionForCudaArch(CudaArch A) {
				switch (A) {
				case CudaArch::UNKNOWN:
				return CudaVersion::UNKNOWN;
				case CudaArch::SM_20:
				case CudaArch::SM_21:
				case CudaArch::SM_30:
				case CudaArch::SM_32:
				case CudaArch::SM_35:
				case CudaArch::SM_37:
				case CudaArch::SM_50:
				case CudaArch::SM_52:
				case CudaArch::SM_53:
				return CudaVersion::CUDA_70;
				case CudaArch::SM_60:
				case CudaArch::SM_61:
				case CudaArch::SM_62:
				return CudaVersion::CUDA_80;
				}
				}

				} // namespace clang
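
During review, the question came up whether a bimap-style container could replace the paired enum-to-string and string-to-enum functions in this file. The sketch below shows one alternative the patch does not use: a single table consulted in both directions. The enum is a reduced stand-in, ArchName and ArchNames are hypothetical names, and plain C strings stand in for llvm::StringRef. The trade-off is that a table lookup gives up the -Wswitch exhaustiveness check that the switch-based functions above provide.

```cpp
#include <cassert>
#include <cstring>

// Reduced stand-in for clang::CudaArch.
enum class CudaArch { UNKNOWN, SM_20, SM_30, SM_35 };

// Hypothetical table entry pairing an enumerator with its spelling.
struct ArchName {
  CudaArch Arch;
  const char *Name;
};

// Single source of truth for both conversion directions.
static const ArchName ArchNames[] = {
    {CudaArch::SM_20, "sm_20"},
    {CudaArch::SM_30, "sm_30"},
    {CudaArch::SM_35, "sm_35"},
};

// Enum -> string; anything not in the table is reported as "unknown".
const char *CudaArchToString(CudaArch A) {
  for (const ArchName &E : ArchNames)
    if (E.Arch == A)
      return E.Name;
  return "unknown";
}

// String -> enum; unrecognized spellings map to UNKNOWN.
CudaArch StringToCudaArch(const char *S) {
  assert(S && "expected a non-null architecture name");
  for (const ArchName &E : ArchNames)
    if (std::strcmp(E.Name, S) == 0)
      return E.Arch;
  return CudaArch::UNKNOWN;
}
```

A new architecture then needs one table row instead of two case labels, at the cost of a linear scan and a silently incomplete table if a row is forgotten.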

cfe/trunk/lib/Basic/Targets.cpp

//===--- Targets.cpp - Implement target feature support -------------------===//		//===--- Targets.cpp - Implement target feature support -------------------===//
//		//
// The LLVM Compiler Infrastructure		// The LLVM Compiler Infrastructure
//		//
// This file is distributed under the University of Illinois Open Source		// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.		// License. See LICENSE.TXT for details.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
//		//
// This file implements construction of a TargetInfo object from a		// This file implements construction of a TargetInfo object from a
// target triple.		// target triple.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "clang/Basic/TargetInfo.h"
#include "clang/Basic/Builtins.h"		#include "clang/Basic/Builtins.h"
		#include "clang/Basic/Cuda.h"
#include "clang/Basic/Diagnostic.h"		#include "clang/Basic/Diagnostic.h"
#include "clang/Basic/LangOptions.h"		#include "clang/Basic/LangOptions.h"
#include "clang/Basic/MacroBuilder.h"		#include "clang/Basic/MacroBuilder.h"
#include "clang/Basic/TargetBuiltins.h"		#include "clang/Basic/TargetBuiltins.h"
		#include "clang/Basic/TargetInfo.h"
#include "clang/Basic/TargetOptions.h"		#include "clang/Basic/TargetOptions.h"
#include "clang/Basic/Version.h"		#include "clang/Basic/Version.h"
#include "llvm/ADT/APFloat.h"		#include "llvm/ADT/APFloat.h"
#include "llvm/ADT/STLExtras.h"		#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/StringExtras.h"		#include "llvm/ADT/StringExtras.h"
#include "llvm/ADT/StringRef.h"		#include "llvm/ADT/StringRef.h"
#include "llvm/ADT/StringSwitch.h"		#include "llvm/ADT/StringSwitch.h"
#include "llvm/ADT/Triple.h"		#include "llvm/ADT/Triple.h"
▲ Show 20 Lines • Show All 1,660 Lines • ▼ Show 20 Lines	static const unsigned NVPTXAddrSpaceMap[] = {
1, // cuda_device		1, // cuda_device
4, // cuda_constant		4, // cuda_constant
3, // cuda_shared		3, // cuda_shared
};		};

class NVPTXTargetInfo : public TargetInfo {		class NVPTXTargetInfo : public TargetInfo {
static const char *const GCCRegNames[];		static const char *const GCCRegNames[];
static const Builtin::Info BuiltinInfo[];		static const Builtin::Info BuiltinInfo[];
		CudaArch GPU;
// The GPU profiles supported by the NVPTX backend
enum GPUKind {
GK_NONE,
GK_SM20,
GK_SM21,
GK_SM30,
GK_SM32,
GK_SM35,
GK_SM37,
GK_SM50,
GK_SM52,
GK_SM53,
GK_SM60,
GK_SM61,
GK_SM62,
} GPU;

public:		public:
NVPTXTargetInfo(const llvm::Triple &Triple, const TargetOptions &Opts)		NVPTXTargetInfo(const llvm::Triple &Triple, const TargetOptions &Opts)
: TargetInfo(Triple) {		: TargetInfo(Triple) {
BigEndian = false;		BigEndian = false;
TLSSupported = false;		TLSSupported = false;
LongWidth = LongAlign = 64;		LongWidth = LongAlign = 64;
AddrSpaceMap = &NVPTXAddrSpaceMap;		AddrSpaceMap = &NVPTXAddrSpaceMap;
UseAddrSpaceMapMangling = true;		UseAddrSpaceMapMangling = true;
// Define available target features		// Define available target features
// These must be defined in sorted order!		// These must be defined in sorted order!
NoAsmVariants = true;		NoAsmVariants = true;
// Set the default GPU to sm20		GPU = CudaArch::SM_20;
GPU = GK_SM20;

// If possible, get a TargetInfo for our host triple, so we can match its		// If possible, get a TargetInfo for our host triple, so we can match its
// types.		// types.
llvm::Triple HostTriple(Opts.HostTriple);		llvm::Triple HostTriple(Opts.HostTriple);
if (HostTriple.isNVPTX())		if (HostTriple.isNVPTX())
return;		return;
std::unique_ptr<TargetInfo> HostTarget(		std::unique_ptr<TargetInfo> HostTarget(
AllocateTarget(llvm::Triple(Opts.HostTriple), Opts));		AllocateTarget(llvm::Triple(Opts.HostTriple), Opts));
▲ Show 20 Lines • Show All 52 Lines • ▼ Show 20 Lines	public:
void getTargetDefines(const LangOptions &Opts,		void getTargetDefines(const LangOptions &Opts,
MacroBuilder &Builder) const override {		MacroBuilder &Builder) const override {
Builder.defineMacro("__PTX__");		Builder.defineMacro("__PTX__");
Builder.defineMacro("__NVPTX__");		Builder.defineMacro("__NVPTX__");
if (Opts.CUDAIsDevice) {		if (Opts.CUDAIsDevice) {
// Set __CUDA_ARCH__ for the GPU specified.		// Set __CUDA_ARCH__ for the GPU specified.
std::string CUDAArchCode = [this] {		std::string CUDAArchCode = [this] {
switch (GPU) {		switch (GPU) {
case GK_NONE:		case CudaArch::UNKNOWN:
assert(false && "No GPU arch when compiling CUDA device code.");		assert(false && "No GPU arch when compiling CUDA device code.");
return "";		return "";
case GK_SM20:		case CudaArch::SM_20:
return "200";		return "200";
case GK_SM21:		case CudaArch::SM_21:
return "210";		return "210";
case GK_SM30:		case CudaArch::SM_30:
return "300";		return "300";
case GK_SM32:		case CudaArch::SM_32:
return "320";		return "320";
case GK_SM35:		case CudaArch::SM_35:
return "350";		return "350";
case GK_SM37:		case CudaArch::SM_37:
return "370";		return "370";
case GK_SM50:		case CudaArch::SM_50:
return "500";		return "500";
case GK_SM52:		case CudaArch::SM_52:
return "520";		return "520";
case GK_SM53:		case CudaArch::SM_53:
return "530";		return "530";
case GK_SM60:		case CudaArch::SM_60:
return "600";		return "600";
case GK_SM61:		case CudaArch::SM_61:
return "610";		return "610";
case GK_SM62:		case CudaArch::SM_62:
return "620";		return "620";
}		}
}();		}();
Builder.defineMacro("__CUDA_ARCH__", CUDAArchCode);		Builder.defineMacro("__CUDA_ARCH__", CUDAArchCode);
}		}
}		}
ArrayRef<Builtin::Info> getTargetBuiltins() const override {		ArrayRef<Builtin::Info> getTargetBuiltins() const override {
return llvm::makeArrayRef(BuiltinInfo,		return llvm::makeArrayRef(BuiltinInfo,
Show All 27 Lines	const char *getClobbers() const override {
// FIXME: Is this really right?		// FIXME: Is this really right?
return "";		return "";
}		}
BuiltinVaListKind getBuiltinVaListKind() const override {		BuiltinVaListKind getBuiltinVaListKind() const override {
// FIXME: implement		// FIXME: implement
return TargetInfo::CharPtrBuiltinVaList;		return TargetInfo::CharPtrBuiltinVaList;
}		}
bool setCPU(const std::string &Name) override {		bool setCPU(const std::string &Name) override {
GPU = llvm::StringSwitch<GPUKind>(Name)		GPU = StringToCudaArch(Name);
.Case("sm_20", GK_SM20)		return GPU != CudaArch::UNKNOWN;
.Case("sm_21", GK_SM21)
.Case("sm_30", GK_SM30)
.Case("sm_32", GK_SM32)
.Case("sm_35", GK_SM35)
.Case("sm_37", GK_SM37)
.Case("sm_50", GK_SM50)
.Case("sm_52", GK_SM52)
.Case("sm_53", GK_SM53)
.Case("sm_60", GK_SM60)
.Case("sm_61", GK_SM61)
.Case("sm_62", GK_SM62)
.Default(GK_NONE);

return GPU != GK_NONE;
}		}
void setSupportedOpenCLOpts() override {		void setSupportedOpenCLOpts() override {
auto &Opts = getSupportedOpenCLOpts();		auto &Opts = getSupportedOpenCLOpts();
Opts.cl_clang_storage_class_specifiers = 1;		Opts.cl_clang_storage_class_specifiers = 1;
Opts.cl_khr_gl_sharing = 1;		Opts.cl_khr_gl_sharing = 1;
Opts.cl_khr_icd = 1;		Opts.cl_khr_icd = 1;

Opts.cl_khr_fp64 = 1;		Opts.cl_khr_fp64 = 1;
▲ Show 20 Lines • Show All 6,746 Lines • Show Last 20 Lines

cfe/trunk/lib/Driver/Action.cpp

Show First 20 Lines • Show All 45 Lines • ▼ Show 20 Lines	InputAction::InputAction(const Arg &_Input, types::ID _Type)
: Action(InputClass, _Type), Input(_Input) {		: Action(InputClass, _Type), Input(_Input) {
}		}

void BindArchAction::anchor() {}		void BindArchAction::anchor() {}

BindArchAction::BindArchAction(Action *Input, const char *_ArchName)		BindArchAction::BindArchAction(Action *Input, const char *_ArchName)
: Action(BindArchClass, Input), ArchName(_ArchName) {}		: Action(BindArchClass, Input), ArchName(_ArchName) {}

// Converts CUDA GPU architecture, e.g. "sm_21", to its corresponding virtual
// compute arch, e.g. "compute_20". Returns null if the input arch is null or
// doesn't match an existing arch.
static const char* GpuArchToComputeName(const char *ArchName) {
if (!ArchName)
return nullptr;
return llvm::StringSwitch<const char *>(ArchName)
.Cases("sm_20", "sm_21", "compute_20")
.Case("sm_30", "compute_30")
.Case("sm_32", "compute_32")
.Case("sm_35", "compute_35")
.Case("sm_37", "compute_37")
.Case("sm_50", "compute_50")
.Case("sm_52", "compute_52")
.Case("sm_53", "compute_53")
.Case("sm_60", "compute_60")
.Case("sm_61", "compute_61")
.Case("sm_62", "compute_62")
.Default(nullptr);
}

void CudaDeviceAction::anchor() {}		void CudaDeviceAction::anchor() {}

CudaDeviceAction::CudaDeviceAction(Action *Input, const char *ArchName,		CudaDeviceAction::CudaDeviceAction(Action *Input, CudaArch Arch,
bool AtTopLevel)		bool AtTopLevel)
: Action(CudaDeviceClass, Input), GpuArchName(ArchName),		: Action(CudaDeviceClass, Input), GpuArch(Arch), AtTopLevel(AtTopLevel) {}
AtTopLevel(AtTopLevel) {
assert(!GpuArchName || IsValidGpuArchName(GpuArchName));
}

const char *CudaDeviceAction::getComputeArchName() const {
return GpuArchToComputeName(GpuArchName);
}

bool CudaDeviceAction::IsValidGpuArchName(llvm::StringRef ArchName) {
return GpuArchToComputeName(ArchName.data()) != nullptr;
}

void CudaHostAction::anchor() {}		void CudaHostAction::anchor() {}

CudaHostAction::CudaHostAction(Action *Input, const ActionList &DeviceActions)		CudaHostAction::CudaHostAction(Action *Input, const ActionList &DeviceActions)
: Action(CudaHostClass, Input), DeviceActions(DeviceActions) {}		: Action(CudaHostClass, Input), DeviceActions(DeviceActions) {}

void JobAction::anchor() {}		void JobAction::anchor() {}

▲ Show 20 Lines • Show All 79 Lines • Show Last 20 Lines

cfe/trunk/lib/Driver/Driver.cpp

Show All 17 Lines
#include "clang/Driver/DriverDiagnostic.h"		#include "clang/Driver/DriverDiagnostic.h"
#include "clang/Driver/Job.h"		#include "clang/Driver/Job.h"
#include "clang/Driver/Options.h"		#include "clang/Driver/Options.h"
#include "clang/Driver/SanitizerArgs.h"		#include "clang/Driver/SanitizerArgs.h"
#include "clang/Driver/Tool.h"		#include "clang/Driver/Tool.h"
#include "clang/Driver/ToolChain.h"		#include "clang/Driver/ToolChain.h"
#include "llvm/ADT/ArrayRef.h"		#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/STLExtras.h"		#include "llvm/ADT/STLExtras.h"
		#include "llvm/ADT/SmallSet.h"
#include "llvm/ADT/StringExtras.h"		#include "llvm/ADT/StringExtras.h"
#include "llvm/ADT/StringSet.h"		#include "llvm/ADT/StringSet.h"
#include "llvm/ADT/StringSwitch.h"		#include "llvm/ADT/StringSwitch.h"
#include "llvm/Option/Arg.h"		#include "llvm/Option/Arg.h"
#include "llvm/Option/ArgList.h"		#include "llvm/Option/ArgList.h"
#include "llvm/Option/OptSpecifier.h"		#include "llvm/Option/OptSpecifier.h"
#include "llvm/Option/OptTable.h"		#include "llvm/Option/OptTable.h"
#include "llvm/Option/Option.h"		#include "llvm/Option/Option.h"
▲ Show 20 Lines • Show All 983 Lines • ▼ Show 20 Lines	static unsigned PrintActions1(const Compilation &C, Action *A,

os << Action::getClassName(A->getKind()) << ", ";		os << Action::getClassName(A->getKind()) << ", ";
if (InputAction *IA = dyn_cast<InputAction>(A)) {		if (InputAction *IA = dyn_cast<InputAction>(A)) {
os << "\"" << IA->getInputArg().getValue() << "\"";		os << "\"" << IA->getInputArg().getValue() << "\"";
} else if (BindArchAction *BIA = dyn_cast<BindArchAction>(A)) {		} else if (BindArchAction *BIA = dyn_cast<BindArchAction>(A)) {
os << '"' << BIA->getArchName() << '"' << ", {"		os << '"' << BIA->getArchName() << '"' << ", {"
<< PrintActions1(C, *BIA->input_begin(), Ids) << "}";		<< PrintActions1(C, *BIA->input_begin(), Ids) << "}";
} else if (CudaDeviceAction *CDA = dyn_cast<CudaDeviceAction>(A)) {		} else if (CudaDeviceAction *CDA = dyn_cast<CudaDeviceAction>(A)) {
os << '"'		CudaArch Arch = CDA->getGpuArch();
<< (CDA->getGpuArchName() ? CDA->getGpuArchName() : "(multiple archs)")		if (Arch != CudaArch::UNKNOWN)
<< '"' << ", {" << PrintActions1(C, *CDA->input_begin(), Ids) << "}";		os << "'" << CudaArchToString(Arch) << "', ";
		os << "{" << PrintActions1(C, *CDA->input_begin(), Ids) << "}";
} else {		} else {
const ActionList *AL;		const ActionList *AL;
if (CudaHostAction *CHA = dyn_cast<CudaHostAction>(A)) {		if (CudaHostAction *CHA = dyn_cast<CudaHostAction>(A)) {
os << "{" << PrintActions1(C, *CHA->input_begin(), Ids) << "}"		os << "{" << PrintActions1(C, *CHA->input_begin(), Ids) << "}"
<< ", gpu binaries ";		<< ", gpu binaries ";
AL = &CHA->getDeviceActions();		AL = &CHA->getDeviceActions();
} else		} else
AL = &A->getInputs();		AL = &A->getInputs();
▲ Show 20 Lines • Show All 339 Lines • ▼ Show 20 Lines	static Action *buildCudaActions(Compilation &C, DerivedArgList &Args,
bool CompileDeviceOnly =		bool CompileDeviceOnly =
PartialCompilationArg &&		PartialCompilationArg &&
PartialCompilationArg->getOption().matches(options::OPT_cuda_device_only);		PartialCompilationArg->getOption().matches(options::OPT_cuda_device_only);

if (CompileHostOnly)		if (CompileHostOnly)
return C.MakeAction<CudaHostAction>(HostAction, ActionList());		return C.MakeAction<CudaHostAction>(HostAction, ActionList());

// Collect all cuda_gpu_arch parameters, removing duplicates.		// Collect all cuda_gpu_arch parameters, removing duplicates.
SmallVector<const char *, 4> GpuArchList;		SmallVector<CudaArch, 4> GpuArchList;
llvm::StringSet<> GpuArchNames;		llvm::SmallSet<CudaArch, 4> GpuArchs;
for (Arg *A : Args) {		for (Arg *A : Args) {
if (!A->getOption().matches(options::OPT_cuda_gpu_arch_EQ))		if (!A->getOption().matches(options::OPT_cuda_gpu_arch_EQ))
continue;		continue;
A->claim();		A->claim();

const auto& Arch = A->getValue();		const auto &ArchStr = A->getValue();
if (!CudaDeviceAction::IsValidGpuArchName(Arch))		CudaArch Arch = StringToCudaArch(ArchStr);
C.getDriver().Diag(clang::diag::err_drv_cuda_bad_gpu_arch) << Arch;		if (Arch == CudaArch::UNKNOWN)
else if (GpuArchNames.insert(Arch).second)		C.getDriver().Diag(clang::diag::err_drv_cuda_bad_gpu_arch) << ArchStr;
		else if (GpuArchs.insert(Arch).second)
GpuArchList.push_back(Arch);		GpuArchList.push_back(Arch);
}		}

// Default to sm_20 which is the lowest common denominator for supported GPUs.		// Default to sm_20 which is the lowest common denominator for supported GPUs.
// sm_20 code should work correctly, if suboptimally, on all newer GPUs.		// sm_20 code should work correctly, if suboptimally, on all newer GPUs.
if (GpuArchList.empty())		if (GpuArchList.empty())
GpuArchList.push_back("sm_20");		GpuArchList.push_back(CudaArch::SM_20);

// Replicate inputs for each GPU architecture.		// Replicate inputs for each GPU architecture.
Driver::InputList CudaDeviceInputs;		Driver::InputList CudaDeviceInputs;
for (unsigned I = 0, E = GpuArchList.size(); I != E; ++I)		for (unsigned I = 0, E = GpuArchList.size(); I != E; ++I)
CudaDeviceInputs.push_back(std::make_pair(types::TY_CUDA_DEVICE, InputArg));		CudaDeviceInputs.push_back(std::make_pair(types::TY_CUDA_DEVICE, InputArg));

// Build actions for all device inputs.		// Build actions for all device inputs.
assert(C.getSingleOffloadToolChain<Action::OFK_Cuda>() &&		assert(C.getSingleOffloadToolChain<Action::OFK_Cuda>() &&
▲ Show 20 Lines • Show All 49 Lines • ▼ Show 20 Lines	for (unsigned I = 0, E = GpuArchList.size(); I != E; ++I) {

for (const auto& A : {AssembleAction, BackendAction}) {		for (const auto& A : {AssembleAction, BackendAction}) {
DeviceActions.push_back(C.MakeAction<CudaDeviceAction>(		DeviceActions.push_back(C.MakeAction<CudaDeviceAction>(
A, GpuArchList[I], /* AtTopLevel */ false));		A, GpuArchList[I], /* AtTopLevel */ false));
}		}
}		}
auto FatbinAction = C.MakeAction<CudaDeviceAction>(		auto FatbinAction = C.MakeAction<CudaDeviceAction>(
C.MakeAction<LinkJobAction>(DeviceActions, types::TY_CUDA_FATBIN),		C.MakeAction<LinkJobAction>(DeviceActions, types::TY_CUDA_FATBIN),
/* GpuArchName = */ nullptr,		CudaArch::UNKNOWN,
/* AtTopLevel = */ false);		/* AtTopLevel = */ false);
// Return a new host action that incorporates original host action and all		// Return a new host action that incorporates original host action and all
// device actions.		// device actions.
return C.MakeAction<CudaHostAction>(std::move(HostAction),		return C.MakeAction<CudaHostAction>(std::move(HostAction),
ActionList({FatbinAction}));		ActionList({FatbinAction}));
}		}

void Driver::BuildActions(Compilation &C, DerivedArgList &Args,		void Driver::BuildActions(Compilation &C, DerivedArgList &Args,
▲ Show 20 Lines • Show All 567 Lines • ▼ Show 20 Lines	return BuildJobsForAction(C, *BAA->input_begin(), TC, ArchName, AtTopLevel,
MultipleArchs, LinkingOutput, CachedResults);		MultipleArchs, LinkingOutput, CachedResults);
}		}

if (const CudaDeviceAction *CDA = dyn_cast<CudaDeviceAction>(A)) {		if (const CudaDeviceAction *CDA = dyn_cast<CudaDeviceAction>(A)) {
// Initial processing of CudaDeviceAction carries host params.		// Initial processing of CudaDeviceAction carries host params.
// Call BuildJobsForAction() again, now with correct device parameters.		// Call BuildJobsForAction() again, now with correct device parameters.
InputInfo II = BuildJobsForAction(		InputInfo II = BuildJobsForAction(
C, *CDA->input_begin(), C.getSingleOffloadToolChain<Action::OFK_Cuda>(),		C, *CDA->input_begin(), C.getSingleOffloadToolChain<Action::OFK_Cuda>(),
CDA->getGpuArchName(), CDA->isAtTopLevel(), /*MultipleArchs=*/true,		CudaArchToString(CDA->getGpuArch()), CDA->isAtTopLevel(),
LinkingOutput, CachedResults);		/*MultipleArchs=*/true, LinkingOutput, CachedResults);
// Currently II's Action is *CDA->input_begin(). Set it to CDA instead, so		// Currently II's Action is *CDA->input_begin(). Set it to CDA instead, so
// that one can retrieve II's GPU arch.		// that one can retrieve II's GPU arch.
II.setAction(A);		II.setAction(A);
return II;		return II;
}		}

const ActionList *Inputs = &A->getInputs();		const ActionList *Inputs = &A->getInputs();

▲ Show 20 Lines • Show All 604 Lines • Show Last 20 Lines
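
The architecture-collection loop in buildCudaActions (parse each value, diagnose bad ones, drop duplicates while keeping first-seen order, fall back to sm_20) can be sketched outside the driver as follows. This is a simplified stand-in under stated assumptions: std::vector and std::set replace llvm::SmallVector and llvm::SmallSet, the enum is reduced, and CollectGpuArchs with its Invalid out-parameter is a hypothetical substitute for the driver's diagnostic call.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Reduced stand-in for clang::CudaArch.
enum class CudaArch { UNKNOWN, SM_20, SM_35, SM_50 };

CudaArch StringToCudaArch(const std::string &S) {
  if (S == "sm_20") return CudaArch::SM_20;
  if (S == "sm_35") return CudaArch::SM_35;
  if (S == "sm_50") return CudaArch::SM_50;
  return CudaArch::UNKNOWN;
}

// Hypothetical helper: returns the distinct valid archs in first-seen order.
// Unrecognized spellings are appended to Invalid; the real driver reports
// err_drv_cuda_bad_gpu_arch at that point instead.
std::vector<CudaArch> CollectGpuArchs(const std::vector<std::string> &Values,
                                      std::vector<std::string> &Invalid) {
  std::vector<CudaArch> List; // preserves command-line order
  std::set<CudaArch> Seen;    // used only for duplicate detection
  for (const std::string &V : Values) {
    CudaArch Arch = StringToCudaArch(V);
    if (Arch == CudaArch::UNKNOWN)
      Invalid.push_back(V);
    else if (Seen.insert(Arch).second)
      List.push_back(Arch);
  }
  // Default to sm_20 when no valid architecture was requested.
  if (List.empty())
    List.push_back(CudaArch::SM_20);
  assert(!List.empty() && "at least one architecture is always returned");
  return List;
}
```

Keying the set on the enum rather than on the raw string means deduplication happens after validation, which mirrors the switch from llvm::StringSet to llvm::SmallSet<CudaArch, 4> in the diff.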

cfe/trunk/lib/Driver/Tools.cpp

Show First 20 Lines • Show All 11,216 Lines • ▼ Show 20 Lines	void NVPTX::Linker::ConstructJob(Compilation &C, const JobAction &JA,
CmdArgs.push_back(TC.getTriple().isArch64Bit() ? "-64" : "-32");		CmdArgs.push_back(TC.getTriple().isArch64Bit() ? "-64" : "-32");
CmdArgs.push_back(Args.MakeArgString("--create"));		CmdArgs.push_back(Args.MakeArgString("--create"));
CmdArgs.push_back(Args.MakeArgString(Output.getFilename()));		CmdArgs.push_back(Args.MakeArgString(Output.getFilename()));

for (const auto& II : Inputs) {		for (const auto& II : Inputs) {
auto* A = cast<const CudaDeviceAction>(II.getAction());		auto* A = cast<const CudaDeviceAction>(II.getAction());
// We need to pass an Arch of the form "sm_XX" for cubin files and		// We need to pass an Arch of the form "sm_XX" for cubin files and
// "compute_XX" for ptx.		// "compute_XX" for ptx.
const char *Arch = (II.getType() == types::TY_PP_Asm)		const char *Arch =
? A->getComputeArchName()		(II.getType() == types::TY_PP_Asm)
: A->getGpuArchName();		? CudaVirtualArchToString(VirtualArchForCudaArch(A->getGpuArch()))
		: CudaArchToString(A->getGpuArch());
CmdArgs.push_back(Args.MakeArgString(llvm::Twine("--image=profile=") +		CmdArgs.push_back(Args.MakeArgString(llvm::Twine("--image=profile=") +
Arch + ",file=" + II.getFilename()));		Arch + ",file=" + II.getFilename()));
}		}

for (const auto& A : Args.getAllArgValues(options::OPT_Xcuda_fatbinary))		for (const auto& A : Args.getAllArgValues(options::OPT_Xcuda_fatbinary))
CmdArgs.push_back(Args.MakeArgString(A));		CmdArgs.push_back(Args.MakeArgString(A));

const char *Exec = Args.MakeArgString(TC.GetProgramPath("fatbinary"));		const char *Exec = Args.MakeArgString(TC.GetProgramPath("fatbinary"));
C.addCommand(llvm::make_unique<Command>(JA, *this, Exec, CmdArgs, Inputs));		C.addCommand(llvm::make_unique<Command>(JA, *this, Exec, CmdArgs, Inputs));
}		}
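
The profile selection in NVPTX::Linker::ConstructJob (compute_XX for PTX inputs, sm_XX for cubin inputs) can be illustrated with a small standalone sketch. The enums are reduced stand-ins, FatbinImageArg is a hypothetical helper, and a plain bool replaces the driver's check of the input type against types::TY_PP_Asm; only the `--image=profile=...,file=...` argument shape is taken from the diff.

```cpp
#include <cassert>
#include <string>

// Reduced stand-ins for the enums declared in Cuda.h.
enum class CudaArch { UNKNOWN, SM_21, SM_35 };
enum class CudaVirtualArch { UNKNOWN, COMPUTE_20, COMPUTE_35 };

const char *CudaArchToString(CudaArch A) {
  switch (A) {
  case CudaArch::UNKNOWN: return "unknown";
  case CudaArch::SM_21:   return "sm_21";
  case CudaArch::SM_35:   return "sm_35";
  }
  return "unknown";
}

const char *CudaVirtualArchToString(CudaVirtualArch A) {
  switch (A) {
  case CudaVirtualArch::UNKNOWN:    return "unknown";
  case CudaVirtualArch::COMPUTE_20: return "compute_20";
  case CudaVirtualArch::COMPUTE_35: return "compute_35";
  }
  return "unknown";
}

CudaVirtualArch VirtualArchForCudaArch(CudaArch A) {
  switch (A) {
  case CudaArch::UNKNOWN: return CudaVirtualArch::UNKNOWN;
  case CudaArch::SM_21:   return CudaVirtualArch::COMPUTE_20;
  case CudaArch::SM_35:   return CudaVirtualArch::COMPUTE_35;
  }
  return CudaVirtualArch::UNKNOWN;
}

// Hypothetical helper: builds one fatbinary image argument.
// IsPtx selects the virtual (compute_XX) profile; otherwise sm_XX is used.
std::string FatbinImageArg(CudaArch A, bool IsPtx, const std::string &File) {
  assert(A != CudaArch::UNKNOWN && "each image needs a concrete architecture");
  const char *Profile =
      IsPtx ? CudaVirtualArchToString(VirtualArchForCudaArch(A))
            : CudaArchToString(A);
  return std::string("--image=profile=") + Profile + ",file=" + File;
}
```

Note that sm_21 maps to compute_20, so a PTX image built for sm_21 is labelled with the compute_20 profile.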