This is an archive of the discontinued LLVM Phabricator instance.

[CUDA] Add utility functions for dealing with CUDA versions / architectures.
ClosedPublic

Authored by jlebar on Jun 29 2016, 3:58 PM.

Details

Reviewers

Commits

rG629076178a5e: [CUDA] Add utility functions for dealing with CUDA versions / architectures.
rC274681: [CUDA] Add utility functions for dealing with CUDA versions / architectures.
rL274681: [CUDA] Add utility functions for dealing with CUDA versions / architectures.

Summary

Currently our handling of CUDA architectures is scattered all around
clang. This patch centralizes it.

A key advantage of this centralization is that you can now write a C++
switch on e.g. CudaArch and get a compile error if you don't handle one
of the enum values.
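
To illustrate the exhaustive-switch point, here is a minimal, self-contained sketch. The names `Arch` and `archMacroValue` are hypothetical stand-ins, not part of the patch. With Clang or GCC the missing-case diagnostic is the `-Wswitch` warning, so it is a hard error only when the build uses `-Werror`; that build setting is an assumption here.

```cpp
#include <cassert>

// Cut-down stand-in for clang::CudaArch (hypothetical; the real enum has more values).
enum class Arch { UNKNOWN, SM_20, SM_30 };

// Deliberately no `default:` label. If an enumerator is later added to Arch
// and not handled here, -Wswitch reports the omission at compile time.
int archMacroValue(Arch A) {
  switch (A) {
  case Arch::UNKNOWN:
    return 0;
  case Arch::SM_20:
    return 200;
  case Arch::SM_30:
    return 300;
  }
  assert(false && "value outside the Arch enumerators");
  return -1;
}
```

Adding a `default:` label would silence the diagnostic, which is why the switches in this patch list every enumerator explicitly.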

Diff Detail

Event Timeline

jlebar updated this revision to Diff 62299. Jun 29 2016, 3:58 PM

jlebar retitled this revision to [CUDA] Add utility functions for dealing with CUDA versions / architectures.

jlebar updated this object.

jlebar added a reviewer: tra.

jlebar added a subscriber: cfe-commits.

LGTM.

lib/Basic/Cuda.cpp
9–20	We seem to do a lot of enum->string and string->enum mapping in this file. Is there something comparable to Boost.bimap in the standard C++ library or in LLVM?
lib/Driver/Driver.cpp
1025–1029	I think this could be collapsed to just CudaArchToString(CDA->getGpuArch()). "(multiple archs)" is as informative as (and indistinguishable from) "unknown" here.

This revision is now accepted and ready to land. Jun 30 2016, 9:50 AM

jlebar marked an inline comment as done. Jun 30 2016, 12:53 PM

jlebar added inline comments.

lib/Basic/Cuda.cpp
9–20	Not to my knowledge.
lib/Driver/Driver.cpp
1025–1029	I'm not crazy about "unknown", since it is actually known. How about we just not output anything?
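
On the enum->string question raised for lib/Basic/Cuda.cpp: one possible alternative to paired switch/StringSwitch functions is a single lookup table consulted in both directions. The following is only a sketch under that assumption. It is not what the patch does, and `Arch`, `ArchName`, `archToString` and `stringToArch` are hypothetical names.

```cpp
#include <cassert>
#include <string>

// Cut-down stand-in for clang::CudaArch (hypothetical).
enum class Arch { UNKNOWN, SM_20, SM_21, SM_30 };

struct ArchName {
  Arch A;
  const char *Name;
};

// Single table consulted for both directions, so the two mappings cannot
// drift apart.
static const ArchName ArchNames[] = {
    {Arch::SM_20, "sm_20"},
    {Arch::SM_21, "sm_21"},
    {Arch::SM_30, "sm_30"},
};

const char *archToString(Arch A) {
  for (const ArchName &E : ArchNames)
    if (E.A == A)
      return E.Name;
  // A table lookup gives no compile-time exhaustiveness check, so catch a
  // forgotten table entry at run time (in builds with assertions enabled).
  assert(A == Arch::UNKNOWN && "enumerator missing from ArchNames");
  return "unknown";
}

Arch stringToArch(const std::string &S) {
  for (const ArchName &E : ArchNames)
    if (S == E.Name)
      return E.A;
  return Arch::UNKNOWN;
}
```

The trade-off is that the table gives up the compile-time missing-case check that the summary names as a key advantage of the switch-based functions.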

tra added inline comments. Jun 30 2016, 1:27 PM

lib/Driver/Driver.cpp
1025–1029	It's a debugging output so it would be good to accurately reflect our internal state. In this case if we for some reason end up with CudaArch::UNKNOWN, I'd want to know that. If we really use UNKNOWN to represent multiple archs, perhaps it needs an enum for multiple-archs.

jlebar marked an inline comment as done. Jun 30 2016, 1:31 PM

jlebar added inline comments.

lib/Driver/Driver.cpp
1025–1029	We really do use UNKNOWN here to represent multiple architectures. It is used for the architecture of the Action corresponding to the call to fatbin. I think adding an enum value for multiple-archs is going to be more harmful than useful, because it means that everywhere that we switch() on arch, we're going to have to handle (and assert) MULTIPLE_ARCHs.

tra added inline comments. Jun 30 2016, 1:44 PM

lib/Driver/Driver.cpp
1025–1029	OK. No output is fine with me.
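
The outcome of this thread (print the architecture only when it is a single known one, and print nothing for the multi-arch case) can be sketched as follows. This is a standalone approximation that uses `std::ostringstream` in place of `llvm::raw_string_ostream`; `Arch`, `archToString` and `describeDeviceAction` are hypothetical names.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Cut-down stand-in for clang::CudaArch (hypothetical).
enum class Arch { UNKNOWN, SM_20, SM_35 };

const char *archToString(Arch A) {
  switch (A) {
  case Arch::UNKNOWN:
    return "unknown";
  case Arch::SM_20:
    return "sm_20";
  case Arch::SM_35:
    return "sm_35";
  }
  assert(false && "value outside the Arch enumerators");
  return "";
}

// The quoted arch and its trailing comma appear only for a single known arch.
// An action that spans several archs (represented as UNKNOWN) prints only
// the braces around its inputs.
std::string describeDeviceAction(Arch A, const std::string &Inputs) {
  std::ostringstream OS;
  if (A != Arch::UNKNOWN)
    OS << "'" << archToString(A) << "', ";
  OS << "{" << Inputs << "}";
  return OS.str();
}
```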

Address Art's review.

Closed by commit rL274681: [CUDA] Add utility functions for dealing with CUDA versions / architectures. (authored by jlebar). Jul 6 2016, 2:29 PM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path	Size
include/clang/Basic/Cuda.h	77 lines
include/clang/Driver/Action.h	19 lines
lib/Basic/CMakeLists.txt	1 line
lib/Basic/Cuda.cpp	165 lines
lib/Basic/Targets.cpp	68 lines
lib/Driver/Action.cpp	36 lines
lib/Driver/Driver.cpp	29 lines
lib/Driver/Tools.cpp	7 lines

Diff 62409

include/clang/Basic/Cuda.h

This file was added.

				//===--- Cuda.h - Utilities for compiling CUDA code ------------*- C++ -*-===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//

				#ifndef LLVM_CLANG_BASIC_CUDA_H
				#define LLVM_CLANG_BASIC_CUDA_H

				namespace llvm {
				class StringRef;
				} // namespace llvm

				namespace clang {

				enum class CudaVersion {
				UNKNOWN,
				CUDA_70,
				CUDA_75,
				CUDA_80,
				};
				const char *CudaVersionToString(CudaVersion V);

				// No string -> CudaVersion conversion function because there's no canonical
				// spelling of the various CUDA versions.

				enum class CudaArch {
				UNKNOWN,
				SM_20,
				SM_21,
				SM_30,
				SM_32,
				SM_35,
				SM_37,
				SM_50,
				SM_52,
				SM_53,
				SM_60,
				SM_61,
				SM_62,
				};
				const char *CudaArchToString(CudaArch A);

				// The input should have the form "sm_20".
				CudaArch StringToCudaArch(llvm::StringRef S);

				enum class CudaVirtualArch {
				UNKNOWN,
				COMPUTE_20,
				COMPUTE_30,
				COMPUTE_32,
				COMPUTE_35,
				COMPUTE_37,
				COMPUTE_50,
				COMPUTE_52,
				COMPUTE_53,
				COMPUTE_60,
				COMPUTE_61,
				COMPUTE_62,
				};
				const char *CudaVirtualArchToString(CudaVirtualArch A);

				// The input should have the form "compute_20".
				CudaVirtualArch StringToCudaVirtualArch(llvm::StringRef S);

				/// Get the compute_xx corresponding to an sm_yy.
				CudaVirtualArch VirtualArchForCudaArch(CudaArch A);

				/// Get the earliest CudaVersion that supports the given CudaArch.
				CudaVersion MinVersionForCudaArch(CudaArch A);

				} // namespace clang

				#endif
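
As a usage illustration for the declarations above, the following standalone sketch chains the three steps that callers need when they start from an `sm_XX` string and want the matching `compute_XX` string. It mirrors a small subset of the enums and uses `std::string` instead of `llvm::StringRef`; all names in it are hypothetical stand-ins for the real functions.

```cpp
#include <cassert>
#include <string>

// Cut-down stand-ins for clang::CudaArch and clang::CudaVirtualArch.
enum class Arch { UNKNOWN, SM_20, SM_21, SM_35 };
enum class VirtualArch { UNKNOWN, COMPUTE_20, COMPUTE_35 };

Arch stringToArch(const std::string &S) {
  if (S == "sm_20")
    return Arch::SM_20;
  if (S == "sm_21")
    return Arch::SM_21;
  if (S == "sm_35")
    return Arch::SM_35;
  return Arch::UNKNOWN;
}

VirtualArch virtualArchFor(Arch A) {
  switch (A) {
  case Arch::UNKNOWN:
    return VirtualArch::UNKNOWN;
  case Arch::SM_20:
  case Arch::SM_21: // The patch defines no COMPUTE_21; sm_21 maps to compute_20.
    return VirtualArch::COMPUTE_20;
  case Arch::SM_35:
    return VirtualArch::COMPUTE_35;
  }
  assert(false && "value outside the Arch enumerators");
  return VirtualArch::UNKNOWN;
}

const char *virtualArchToString(VirtualArch V) {
  switch (V) {
  case VirtualArch::UNKNOWN:
    return "unknown";
  case VirtualArch::COMPUTE_20:
    return "compute_20";
  case VirtualArch::COMPUTE_35:
    return "compute_35";
  }
  assert(false && "value outside the VirtualArch enumerators");
  return "";
}

// Parse, map to the virtual arch, and convert back to a string.
std::string computeNameFor(const std::string &SmName) {
  return virtualArchToString(virtualArchFor(stringToArch(SmName)));
}
```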

include/clang/Driver/Action.h

//===--- Action.h - Abstract compilation steps ------------------*- C++ -*-===//		//===--- Action.h - Abstract compilation steps ------------------*- C++ -*-===//
//		//
// The LLVM Compiler Infrastructure		// The LLVM Compiler Infrastructure
//		//
// This file is distributed under the University of Illinois Open Source		// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.		// License. See LICENSE.TXT for details.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#ifndef LLVM_CLANG_DRIVER_ACTION_H		#ifndef LLVM_CLANG_DRIVER_ACTION_H
#define LLVM_CLANG_DRIVER_ACTION_H		#define LLVM_CLANG_DRIVER_ACTION_H

		#include "clang/Basic/Cuda.h"
#include "clang/Driver/Types.h"		#include "clang/Driver/Types.h"
#include "clang/Driver/Util.h"		#include "clang/Driver/Util.h"
#include "llvm/ADT/SmallVector.h"		#include "llvm/ADT/SmallVector.h"

namespace llvm {		namespace llvm {

class StringRef;		class StringRef;

[131 lines not shown]	public:

static bool classof(const Action *A) {		static bool classof(const Action *A) {
return A->getKind() == BindArchClass;		return A->getKind() == BindArchClass;
}		}
};		};

class CudaDeviceAction : public Action {		class CudaDeviceAction : public Action {
virtual void anchor();		virtual void anchor();
/// GPU architecture to bind. Always of the form /sm_\d+/ or null (when the
/// action applies to multiple architectures).		const CudaArch GpuArch;
const char *GpuArchName;
/// True when action results are not consumed by the host action (e.g when		/// True when action results are not consumed by the host action (e.g when
/// -fsyntax-only or --cuda-device-only options are used).		/// -fsyntax-only or --cuda-device-only options are used).
bool AtTopLevel;		bool AtTopLevel;

public:		public:
CudaDeviceAction(Action *Input, const char *ArchName, bool AtTopLevel);		CudaDeviceAction(Action *Input, CudaArch Arch, bool AtTopLevel);

const char *getGpuArchName() const { return GpuArchName; }		/// Get the CUDA GPU architecture to which this Action corresponds. Returns
		/// UNKNOWN if this Action corresponds to multiple architectures.
/// Gets the compute_XX that corresponds to getGpuArchName(). Returns null		CudaArch getGpuArch() const { return GpuArch; }
/// when getGpuArchName() is null.
const char *getComputeArchName() const;

bool isAtTopLevel() const { return AtTopLevel; }		bool isAtTopLevel() const { return AtTopLevel; }

static bool IsValidGpuArchName(llvm::StringRef ArchName);

static bool classof(const Action *A) {		static bool classof(const Action *A) {
return A->getKind() == CudaDeviceClass;		return A->getKind() == CudaDeviceClass;
}		}
};		};

class CudaHostAction : public Action {		class CudaHostAction : public Action {
virtual void anchor();		virtual void anchor();
ActionList DeviceActions;		ActionList DeviceActions;
[154 lines not shown]

lib/Basic/CMakeLists.txt

[60 lines not shown]	set_source_files_properties(Version.cpp
PROPERTIES COMPILE_DEFINITIONS "SVN_REVISION=\"${SVN_REVISION}\"")		PROPERTIES COMPILE_DEFINITIONS "SVN_REVISION=\"${SVN_REVISION}\"")
endif()		endif()
endif()		endif()

add_clang_library(clangBasic		add_clang_library(clangBasic
Attributes.cpp		Attributes.cpp
Builtins.cpp		Builtins.cpp
CharInfo.cpp		CharInfo.cpp
		Cuda.cpp
Diagnostic.cpp		Diagnostic.cpp
DiagnosticIDs.cpp		DiagnosticIDs.cpp
DiagnosticOptions.cpp		DiagnosticOptions.cpp
FileManager.cpp		FileManager.cpp
FileSystemStatCache.cpp		FileSystemStatCache.cpp
IdentifierTable.cpp		IdentifierTable.cpp
LangOptions.cpp		LangOptions.cpp
Module.cpp		Module.cpp
[17 lines not shown]

lib/Basic/Cuda.cpp

This file was added.

				#include "clang/Basic/Cuda.h"

				#include "llvm/ADT/StringRef.h"
				#include "llvm/ADT/StringSwitch.h"

				namespace clang {

				const char *CudaVersionToString(CudaVersion V) {
				switch (V) {
				case CudaVersion::UNKNOWN:
				return "unknown";
				case CudaVersion::CUDA_70:
				return "7.0";
				case CudaVersion::CUDA_75:
				return "7.5";
				case CudaVersion::CUDA_80:
				return "8.0";
				}
				}

				const char *CudaArchToString(CudaArch A) {
				switch (A) {
				case CudaArch::UNKNOWN:
				return "unknown";
				case CudaArch::SM_20:
				return "sm_20";
				case CudaArch::SM_21:
				return "sm_21";
				case CudaArch::SM_30:
				return "sm_30";
				case CudaArch::SM_32:
				return "sm_32";
				case CudaArch::SM_35:
				return "sm_35";
				case CudaArch::SM_37:
				return "sm_37";
				case CudaArch::SM_50:
				return "sm_50";
				case CudaArch::SM_52:
				return "sm_52";
				case CudaArch::SM_53:
				return "sm_53";
				case CudaArch::SM_60:
				return "sm_60";
				case CudaArch::SM_61:
				return "sm_61";
				case CudaArch::SM_62:
				return "sm_62";
				}
				}

				CudaArch StringToCudaArch(llvm::StringRef S) {
				return llvm::StringSwitch<CudaArch>(S)
				.Case("sm_20", CudaArch::SM_20)
				.Case("sm_21", CudaArch::SM_21)
				.Case("sm_30", CudaArch::SM_30)
				.Case("sm_32", CudaArch::SM_32)
				.Case("sm_35", CudaArch::SM_35)
				.Case("sm_37", CudaArch::SM_37)
				.Case("sm_50", CudaArch::SM_50)
				.Case("sm_52", CudaArch::SM_52)
				.Case("sm_53", CudaArch::SM_53)
				.Case("sm_60", CudaArch::SM_60)
				.Case("sm_61", CudaArch::SM_61)
				.Case("sm_62", CudaArch::SM_62)
				.Default(CudaArch::UNKNOWN);
				}

				const char *CudaVirtualArchToString(CudaVirtualArch A) {
				switch (A) {
				case CudaVirtualArch::UNKNOWN:
				return "unknown";
				case CudaVirtualArch::COMPUTE_20:
				return "compute_20";
				case CudaVirtualArch::COMPUTE_30:
				return "compute_30";
				case CudaVirtualArch::COMPUTE_32:
				return "compute_32";
				case CudaVirtualArch::COMPUTE_35:
				return "compute_35";
				case CudaVirtualArch::COMPUTE_37:
				return "compute_37";
				case CudaVirtualArch::COMPUTE_50:
				return "compute_50";
				case CudaVirtualArch::COMPUTE_52:
				return "compute_52";
				case CudaVirtualArch::COMPUTE_53:
				return "compute_53";
				case CudaVirtualArch::COMPUTE_60:
				return "compute_60";
				case CudaVirtualArch::COMPUTE_61:
				return "compute_61";
				case CudaVirtualArch::COMPUTE_62:
				return "compute_62";
				}
				}

				CudaVirtualArch StringToCudaVirtualArch(llvm::StringRef S) {
				return llvm::StringSwitch<CudaVirtualArch>(S)
				.Case("compute_20", CudaVirtualArch::COMPUTE_20)
				.Case("compute_30", CudaVirtualArch::COMPUTE_30)
				.Case("compute_32", CudaVirtualArch::COMPUTE_32)
				.Case("compute_35", CudaVirtualArch::COMPUTE_35)
				.Case("compute_37", CudaVirtualArch::COMPUTE_37)
				.Case("compute_50", CudaVirtualArch::COMPUTE_50)
				.Case("compute_52", CudaVirtualArch::COMPUTE_52)
				.Case("compute_53", CudaVirtualArch::COMPUTE_53)
				.Case("compute_60", CudaVirtualArch::COMPUTE_60)
				.Case("compute_61", CudaVirtualArch::COMPUTE_61)
				.Case("compute_62", CudaVirtualArch::COMPUTE_62)
				.Default(CudaVirtualArch::UNKNOWN);
				}

				CudaVirtualArch VirtualArchForCudaArch(CudaArch A) {
				switch (A) {
				case CudaArch::UNKNOWN:
				return CudaVirtualArch::UNKNOWN;
				case CudaArch::SM_20:
				case CudaArch::SM_21:
				return CudaVirtualArch::COMPUTE_20;
				case CudaArch::SM_30:
				return CudaVirtualArch::COMPUTE_30;
				case CudaArch::SM_32:
				return CudaVirtualArch::COMPUTE_32;
				case CudaArch::SM_35:
				return CudaVirtualArch::COMPUTE_35;
				case CudaArch::SM_37:
				return CudaVirtualArch::COMPUTE_37;
				case CudaArch::SM_50:
				return CudaVirtualArch::COMPUTE_50;
				case CudaArch::SM_52:
				return CudaVirtualArch::COMPUTE_52;
				case CudaArch::SM_53:
				return CudaVirtualArch::COMPUTE_53;
				case CudaArch::SM_60:
				return CudaVirtualArch::COMPUTE_60;
				case CudaArch::SM_61:
				return CudaVirtualArch::COMPUTE_61;
				case CudaArch::SM_62:
				return CudaVirtualArch::COMPUTE_62;
				}
				}

				CudaVersion MinVersionForCudaArch(CudaArch A) {
				switch (A) {
				case CudaArch::UNKNOWN:
				return CudaVersion::UNKNOWN;
				case CudaArch::SM_20:
				case CudaArch::SM_21:
				case CudaArch::SM_30:
				case CudaArch::SM_32:
				case CudaArch::SM_35:
				case CudaArch::SM_37:
				case CudaArch::SM_50:
				case CudaArch::SM_52:
				case CudaArch::SM_53:
				return CudaVersion::CUDA_70;
				case CudaArch::SM_60:
				case CudaArch::SM_61:
				case CudaArch::SM_62:
				return CudaVersion::CUDA_80;
				}
				}

				} // namespace clang
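
One way a caller might use `MinVersionForCudaArch` is to check a requested architecture against an installed CUDA version. The sketch below is standalone and hypothetical: `isArchSupported` is not part of this patch, and it assumes the `CudaVersion` enumerators are declared oldest to newest so that `>=` compares releases.

```cpp
#include <cassert>

// Cut-down stand-ins for clang::CudaVersion and clang::CudaArch.
enum class Version { UNKNOWN, CUDA_70, CUDA_75, CUDA_80 };
enum class Arch { UNKNOWN, SM_20, SM_53, SM_60 };

Version minVersionFor(Arch A) {
  switch (A) {
  case Arch::UNKNOWN:
    return Version::UNKNOWN;
  case Arch::SM_20:
  case Arch::SM_53:
    return Version::CUDA_70;
  case Arch::SM_60:
    return Version::CUDA_80;
  }
  assert(false && "value outside the Arch enumerators");
  return Version::UNKNOWN;
}

// Assumption: enumerators are declared oldest to newest, so the built-in
// relational operators on the scoped enum order the releases correctly.
bool isArchSupported(Arch A, Version Installed) {
  if (A == Arch::UNKNOWN || Installed == Version::UNKNOWN)
    return false; // Nothing can be concluded from an unknown arch or version.
  return Installed >= minVersionFor(A);
}
```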

lib/Basic/Targets.cpp

//===--- Targets.cpp - Implement target feature support -------------------===//		//===--- Targets.cpp - Implement target feature support -------------------===//
//		//
// The LLVM Compiler Infrastructure		// The LLVM Compiler Infrastructure
//		//
// This file is distributed under the University of Illinois Open Source		// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.		// License. See LICENSE.TXT for details.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
//		//
// This file implements construction of a TargetInfo object from a		// This file implements construction of a TargetInfo object from a
// target triple.		// target triple.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "clang/Basic/TargetInfo.h"
#include "clang/Basic/Builtins.h"		#include "clang/Basic/Builtins.h"
		#include "clang/Basic/Cuda.h"
#include "clang/Basic/Diagnostic.h"		#include "clang/Basic/Diagnostic.h"
#include "clang/Basic/LangOptions.h"		#include "clang/Basic/LangOptions.h"
#include "clang/Basic/MacroBuilder.h"		#include "clang/Basic/MacroBuilder.h"
#include "clang/Basic/TargetBuiltins.h"		#include "clang/Basic/TargetBuiltins.h"
		#include "clang/Basic/TargetInfo.h"
#include "clang/Basic/TargetOptions.h"		#include "clang/Basic/TargetOptions.h"
#include "clang/Basic/Version.h"		#include "clang/Basic/Version.h"
#include "llvm/ADT/APFloat.h"		#include "llvm/ADT/APFloat.h"
#include "llvm/ADT/STLExtras.h"		#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/StringExtras.h"		#include "llvm/ADT/StringExtras.h"
#include "llvm/ADT/StringRef.h"		#include "llvm/ADT/StringRef.h"
#include "llvm/ADT/StringSwitch.h"		#include "llvm/ADT/StringSwitch.h"
#include "llvm/ADT/Triple.h"		#include "llvm/ADT/Triple.h"
[1,660 lines not shown]	static const unsigned NVPTXAddrSpaceMap[] = {
1, // cuda_device		1, // cuda_device
4, // cuda_constant		4, // cuda_constant
3, // cuda_shared		3, // cuda_shared
};		};

class NVPTXTargetInfo : public TargetInfo {		class NVPTXTargetInfo : public TargetInfo {
static const char *const GCCRegNames[];		static const char *const GCCRegNames[];
static const Builtin::Info BuiltinInfo[];		static const Builtin::Info BuiltinInfo[];
		CudaArch GPU;
// The GPU profiles supported by the NVPTX backend
enum GPUKind {
GK_NONE,
GK_SM20,
GK_SM21,
GK_SM30,
GK_SM32,
GK_SM35,
GK_SM37,
GK_SM50,
GK_SM52,
GK_SM53,
GK_SM60,
GK_SM61,
GK_SM62,
} GPU;

public:		public:
NVPTXTargetInfo(const llvm::Triple &Triple, const TargetOptions &Opts)		NVPTXTargetInfo(const llvm::Triple &Triple, const TargetOptions &Opts)
: TargetInfo(Triple) {		: TargetInfo(Triple) {
BigEndian = false;		BigEndian = false;
TLSSupported = false;		TLSSupported = false;
LongWidth = LongAlign = 64;		LongWidth = LongAlign = 64;
AddrSpaceMap = &NVPTXAddrSpaceMap;		AddrSpaceMap = &NVPTXAddrSpaceMap;
UseAddrSpaceMapMangling = true;		UseAddrSpaceMapMangling = true;
// Define available target features		// Define available target features
// These must be defined in sorted order!		// These must be defined in sorted order!
NoAsmVariants = true;		NoAsmVariants = true;
// Set the default GPU to sm20		GPU = CudaArch::SM_20;
GPU = GK_SM20;

// If possible, get a TargetInfo for our host triple, so we can match its		// If possible, get a TargetInfo for our host triple, so we can match its
// types.		// types.
llvm::Triple HostTriple(Opts.HostTriple);		llvm::Triple HostTriple(Opts.HostTriple);
if (HostTriple.isNVPTX())		if (HostTriple.isNVPTX())
return;		return;
std::unique_ptr<TargetInfo> HostTarget(		std::unique_ptr<TargetInfo> HostTarget(
AllocateTarget(llvm::Triple(Opts.HostTriple), Opts));		AllocateTarget(llvm::Triple(Opts.HostTriple), Opts));
[52 lines not shown]	public:
void getTargetDefines(const LangOptions &Opts,		void getTargetDefines(const LangOptions &Opts,
MacroBuilder &Builder) const override {		MacroBuilder &Builder) const override {
Builder.defineMacro("__PTX__");		Builder.defineMacro("__PTX__");
Builder.defineMacro("__NVPTX__");		Builder.defineMacro("__NVPTX__");
if (Opts.CUDAIsDevice) {		if (Opts.CUDAIsDevice) {
// Set __CUDA_ARCH__ for the GPU specified.		// Set __CUDA_ARCH__ for the GPU specified.
std::string CUDAArchCode = [this] {		std::string CUDAArchCode = [this] {
switch (GPU) {		switch (GPU) {
case GK_NONE:		case CudaArch::UNKNOWN:
assert(false && "No GPU arch when compiling CUDA device code.");		assert(false && "No GPU arch when compiling CUDA device code.");
return "";		return "";
case GK_SM20:		case CudaArch::SM_20:
return "200";		return "200";
case GK_SM21:		case CudaArch::SM_21:
return "210";		return "210";
case GK_SM30:		case CudaArch::SM_30:
return "300";		return "300";
case GK_SM32:		case CudaArch::SM_32:
return "320";		return "320";
case GK_SM35:		case CudaArch::SM_35:
return "350";		return "350";
case GK_SM37:		case CudaArch::SM_37:
return "370";		return "370";
case GK_SM50:		case CudaArch::SM_50:
return "500";		return "500";
case GK_SM52:		case CudaArch::SM_52:
return "520";		return "520";
case GK_SM53:		case CudaArch::SM_53:
return "530";		return "530";
case GK_SM60:		case CudaArch::SM_60:
return "600";		return "600";
case GK_SM61:		case CudaArch::SM_61:
return "610";		return "610";
case GK_SM62:		case CudaArch::SM_62:
return "620";		return "620";
}		}
}();		}();
Builder.defineMacro("__CUDA_ARCH__", CUDAArchCode);		Builder.defineMacro("__CUDA_ARCH__", CUDAArchCode);
}		}
}		}
ArrayRef<Builtin::Info> getTargetBuiltins() const override {		ArrayRef<Builtin::Info> getTargetBuiltins() const override {
return llvm::makeArrayRef(BuiltinInfo,		return llvm::makeArrayRef(BuiltinInfo,
[27 lines not shown]	const char *getClobbers() const override {
// FIXME: Is this really right?		// FIXME: Is this really right?
return "";		return "";
}		}
BuiltinVaListKind getBuiltinVaListKind() const override {		BuiltinVaListKind getBuiltinVaListKind() const override {
// FIXME: implement		// FIXME: implement
return TargetInfo::CharPtrBuiltinVaList;		return TargetInfo::CharPtrBuiltinVaList;
}		}
bool setCPU(const std::string &Name) override {		bool setCPU(const std::string &Name) override {
GPU = llvm::StringSwitch<GPUKind>(Name)		GPU = StringToCudaArch(Name);
.Case("sm_20", GK_SM20)		return GPU != CudaArch::UNKNOWN;
.Case("sm_21", GK_SM21)
.Case("sm_30", GK_SM30)
.Case("sm_32", GK_SM32)
.Case("sm_35", GK_SM35)
.Case("sm_37", GK_SM37)
.Case("sm_50", GK_SM50)
.Case("sm_52", GK_SM52)
.Case("sm_53", GK_SM53)
.Case("sm_60", GK_SM60)
.Case("sm_61", GK_SM61)
.Case("sm_62", GK_SM62)
.Default(GK_NONE);

return GPU != GK_NONE;
}		}
void setSupportedOpenCLOpts() override {		void setSupportedOpenCLOpts() override {
auto &Opts = getSupportedOpenCLOpts();		auto &Opts = getSupportedOpenCLOpts();
Opts.cl_clang_storage_class_specifiers = 1;		Opts.cl_clang_storage_class_specifiers = 1;
Opts.cl_khr_gl_sharing = 1;		Opts.cl_khr_gl_sharing = 1;
Opts.cl_khr_icd = 1;		Opts.cl_khr_icd = 1;

Opts.cl_khr_fp64 = 1;		Opts.cl_khr_fp64 = 1;
[6,681 lines not shown]
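
Every `__CUDA_ARCH__` value in the switch above is the two digits of the `sm_XY` name followed by a zero (for example `sm_35` gives `350`). As an illustration of that pattern only, the hypothetical helper below derives the value from the name. The patch itself keeps the explicit switch, and nothing here is part of clang.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Returns the __CUDA_ARCH__ text for a name of the form "sm_XY" (that is,
// "XY0"), or an empty string if the name does not have that form.
std::string cudaArchMacroValue(const std::string &Name) {
  const std::string Prefix = "sm_";
  if (Name.size() <= Prefix.size() ||
      Name.compare(0, Prefix.size(), Prefix) != 0)
    return "";
  const std::string Digits = Name.substr(Prefix.size());
  assert(!Digits.empty()); // Guaranteed by the size check above.
  for (char C : Digits)
    if (!std::isdigit(static_cast<unsigned char>(C)))
      return "";
  return Digits + "0";
}
```

Unlike the switch, this helper also accepts names such as `sm_99` that are not in `CudaArch`, and it gets no compile-time check when a new architecture is added.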

lib/Driver/Action.cpp

[45 lines not shown]	InputAction::InputAction(const Arg &_Input, types::ID _Type)
: Action(InputClass, _Type), Input(_Input) {		: Action(InputClass, _Type), Input(_Input) {
}		}

void BindArchAction::anchor() {}		void BindArchAction::anchor() {}

BindArchAction::BindArchAction(Action *Input, const char *_ArchName)		BindArchAction::BindArchAction(Action *Input, const char *_ArchName)
: Action(BindArchClass, Input), ArchName(_ArchName) {}		: Action(BindArchClass, Input), ArchName(_ArchName) {}

// Converts CUDA GPU architecture, e.g. "sm_21", to its corresponding virtual
// compute arch, e.g. "compute_20". Returns null if the input arch is null or
// doesn't match an existing arch.
static const char* GpuArchToComputeName(const char *ArchName) {
if (!ArchName)
return nullptr;
return llvm::StringSwitch<const char *>(ArchName)
.Cases("sm_20", "sm_21", "compute_20")
.Case("sm_30", "compute_30")
.Case("sm_32", "compute_32")
.Case("sm_35", "compute_35")
.Case("sm_37", "compute_37")
.Case("sm_50", "compute_50")
.Case("sm_52", "compute_52")
.Case("sm_53", "compute_53")
.Case("sm_60", "compute_60")
.Case("sm_61", "compute_61")
.Case("sm_62", "compute_62")
.Default(nullptr);
}

void CudaDeviceAction::anchor() {}		void CudaDeviceAction::anchor() {}

CudaDeviceAction::CudaDeviceAction(Action *Input, const char *ArchName,		CudaDeviceAction::CudaDeviceAction(Action *Input, CudaArch Arch,
bool AtTopLevel)		bool AtTopLevel)
: Action(CudaDeviceClass, Input), GpuArchName(ArchName),		: Action(CudaDeviceClass, Input), GpuArch(Arch), AtTopLevel(AtTopLevel) {}
AtTopLevel(AtTopLevel) {
assert(!GpuArchName || IsValidGpuArchName(GpuArchName));
}

const char *CudaDeviceAction::getComputeArchName() const {
return GpuArchToComputeName(GpuArchName);
}

bool CudaDeviceAction::IsValidGpuArchName(llvm::StringRef ArchName) {
return GpuArchToComputeName(ArchName.data()) != nullptr;
}

void CudaHostAction::anchor() {}		void CudaHostAction::anchor() {}

CudaHostAction::CudaHostAction(Action *Input, const ActionList &DeviceActions)		CudaHostAction::CudaHostAction(Action *Input, const ActionList &DeviceActions)
: Action(CudaHostClass, Input), DeviceActions(DeviceActions) {}		: Action(CudaHostClass, Input), DeviceActions(DeviceActions) {}

void JobAction::anchor() {}		void JobAction::anchor() {}

[79 lines not shown]

lib/Driver/Driver.cpp

[17 lines not shown]
#include "clang/Driver/DriverDiagnostic.h"		#include "clang/Driver/DriverDiagnostic.h"
#include "clang/Driver/Job.h"		#include "clang/Driver/Job.h"
#include "clang/Driver/Options.h"		#include "clang/Driver/Options.h"
#include "clang/Driver/SanitizerArgs.h"		#include "clang/Driver/SanitizerArgs.h"
#include "clang/Driver/Tool.h"		#include "clang/Driver/Tool.h"
#include "clang/Driver/ToolChain.h"		#include "clang/Driver/ToolChain.h"
#include "llvm/ADT/ArrayRef.h"		#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/STLExtras.h"		#include "llvm/ADT/STLExtras.h"
		#include "llvm/ADT/SmallSet.h"
#include "llvm/ADT/StringExtras.h"		#include "llvm/ADT/StringExtras.h"
#include "llvm/ADT/StringSet.h"		#include "llvm/ADT/StringSet.h"
#include "llvm/ADT/StringSwitch.h"		#include "llvm/ADT/StringSwitch.h"
#include "llvm/Option/Arg.h"		#include "llvm/Option/Arg.h"
#include "llvm/Option/ArgList.h"		#include "llvm/Option/ArgList.h"
#include "llvm/Option/OptSpecifier.h"		#include "llvm/Option/OptSpecifier.h"
#include "llvm/Option/OptTable.h"		#include "llvm/Option/OptTable.h"
#include "llvm/Option/Option.h"		#include "llvm/Option/Option.h"
[982 lines not shown]	static unsigned PrintActions1(const Compilation &C, Action *A,
llvm::raw_string_ostream os(str);		llvm::raw_string_ostream os(str);

os << Action::getClassName(A->getKind()) << ", ";		os << Action::getClassName(A->getKind()) << ", ";
if (InputAction *IA = dyn_cast<InputAction>(A)) {		if (InputAction *IA = dyn_cast<InputAction>(A)) {
os << "\"" << IA->getInputArg().getValue() << "\"";		os << "\"" << IA->getInputArg().getValue() << "\"";
} else if (BindArchAction *BIA = dyn_cast<BindArchAction>(A)) {		} else if (BindArchAction *BIA = dyn_cast<BindArchAction>(A)) {
os << '"' << BIA->getArchName() << '"' << ", {"		os << '"' << BIA->getArchName() << '"' << ", {"
<< PrintActions1(C, *BIA->input_begin(), Ids) << "}";		<< PrintActions1(C, *BIA->input_begin(), Ids) << "}";
} else if (CudaDeviceAction *CDA = dyn_cast<CudaDeviceAction>(A)) {		} else if (CudaDeviceAction *CDA = dyn_cast<CudaDeviceAction>(A)) {
os << '"'		CudaArch Arch = CDA->getGpuArch();
<< (CDA->getGpuArchName() ? CDA->getGpuArchName() : "(multiple archs)")		if (Arch != CudaArch::UNKNOWN)
<< '"' << ", {" << PrintActions1(C, *CDA->input_begin(), Ids) << "}";		os << "'" << CudaArchToString(Arch) << "', ";
		os << "{" << PrintActions1(C, *CDA->input_begin(), Ids) << "}";
} else {		} else {
const ActionList *AL;		const ActionList *AL;
if (CudaHostAction *CHA = dyn_cast<CudaHostAction>(A)) {		if (CudaHostAction *CHA = dyn_cast<CudaHostAction>(A)) {
os << "{" << PrintActions1(C, *CHA->input_begin(), Ids) << "}"		os << "{" << PrintActions1(C, *CHA->input_begin(), Ids) << "}"
<< ", gpu binaries ";		<< ", gpu binaries ";
AL = &CHA->getDeviceActions();		AL = &CHA->getDeviceActions();
} else		} else
AL = &A->getInputs();		AL = &A->getInputs();
[339 lines not shown]	static Action *buildCudaActions(Compilation &C, DerivedArgList &Args,
bool CompileDeviceOnly =		bool CompileDeviceOnly =
PartialCompilationArg &&		PartialCompilationArg &&
PartialCompilationArg->getOption().matches(options::OPT_cuda_device_only);		PartialCompilationArg->getOption().matches(options::OPT_cuda_device_only);

if (CompileHostOnly)		if (CompileHostOnly)
return C.MakeAction<CudaHostAction>(HostAction, ActionList());		return C.MakeAction<CudaHostAction>(HostAction, ActionList());

// Collect all cuda_gpu_arch parameters, removing duplicates.		// Collect all cuda_gpu_arch parameters, removing duplicates.
SmallVector<const char *, 4> GpuArchList;		SmallVector<CudaArch, 4> GpuArchList;
llvm::StringSet<> GpuArchNames;		llvm::SmallSet<CudaArch, 4> GpuArchs;
for (Arg *A : Args) {		for (Arg *A : Args) {
if (!A->getOption().matches(options::OPT_cuda_gpu_arch_EQ))		if (!A->getOption().matches(options::OPT_cuda_gpu_arch_EQ))
continue;		continue;
A->claim();		A->claim();

const auto& Arch = A->getValue();		const auto &ArchStr = A->getValue();
if (!CudaDeviceAction::IsValidGpuArchName(Arch))		CudaArch Arch = StringToCudaArch(ArchStr);
C.getDriver().Diag(clang::diag::err_drv_cuda_bad_gpu_arch) << Arch;		if (Arch == CudaArch::UNKNOWN)
else if (GpuArchNames.insert(Arch).second)		C.getDriver().Diag(clang::diag::err_drv_cuda_bad_gpu_arch) << ArchStr;
		else if (GpuArchs.insert(Arch).second)
GpuArchList.push_back(Arch);		GpuArchList.push_back(Arch);
}		}

// Default to sm_20 which is the lowest common denominator for supported GPUs.		// Default to sm_20 which is the lowest common denominator for supported GPUs.
// sm_20 code should work correctly, if suboptimally, on all newer GPUs.		// sm_20 code should work correctly, if suboptimally, on all newer GPUs.
if (GpuArchList.empty())		if (GpuArchList.empty())
GpuArchList.push_back("sm_20");		GpuArchList.push_back(CudaArch::SM_20);

// Replicate inputs for each GPU architecture.		// Replicate inputs for each GPU architecture.
Driver::InputList CudaDeviceInputs;		Driver::InputList CudaDeviceInputs;
for (unsigned I = 0, E = GpuArchList.size(); I != E; ++I)		for (unsigned I = 0, E = GpuArchList.size(); I != E; ++I)
CudaDeviceInputs.push_back(std::make_pair(types::TY_CUDA_DEVICE, InputArg));		CudaDeviceInputs.push_back(std::make_pair(types::TY_CUDA_DEVICE, InputArg));

// Build actions for all device inputs.		// Build actions for all device inputs.
assert(C.getSingleOffloadToolChain<Action::OFK_Cuda>() &&		assert(C.getSingleOffloadToolChain<Action::OFK_Cuda>() &&
[49 lines not shown]	for (unsigned I = 0, E = GpuArchList.size(); I != E; ++I) {

for (const auto& A : {AssembleAction, BackendAction}) {		for (const auto& A : {AssembleAction, BackendAction}) {
DeviceActions.push_back(C.MakeAction<CudaDeviceAction>(		DeviceActions.push_back(C.MakeAction<CudaDeviceAction>(
A, GpuArchList[I], /* AtTopLevel */ false));		A, GpuArchList[I], /* AtTopLevel */ false));
}		}
}		}
auto FatbinAction = C.MakeAction<CudaDeviceAction>(		auto FatbinAction = C.MakeAction<CudaDeviceAction>(
C.MakeAction<LinkJobAction>(DeviceActions, types::TY_CUDA_FATBIN),		C.MakeAction<LinkJobAction>(DeviceActions, types::TY_CUDA_FATBIN),
/* GpuArchName = */ nullptr,		CudaArch::UNKNOWN,
/* AtTopLevel = */ false);		/* AtTopLevel = */ false);
// Return a new host action that incorporates original host action and all		// Return a new host action that incorporates original host action and all
// device actions.		// device actions.
return C.MakeAction<CudaHostAction>(std::move(HostAction),		return C.MakeAction<CudaHostAction>(std::move(HostAction),
ActionList({FatbinAction}));		ActionList({FatbinAction}));
}		}

void Driver::BuildActions(Compilation &C, DerivedArgList &Args,		void Driver::BuildActions(Compilation &C, DerivedArgList &Args,
[567 lines not shown]	return BuildJobsForAction(C, *BAA->input_begin(), TC, ArchName, AtTopLevel,
MultipleArchs, LinkingOutput, CachedResults);		MultipleArchs, LinkingOutput, CachedResults);
}		}

if (const CudaDeviceAction *CDA = dyn_cast<CudaDeviceAction>(A)) {		if (const CudaDeviceAction *CDA = dyn_cast<CudaDeviceAction>(A)) {
// Initial processing of CudaDeviceAction carries host params.		// Initial processing of CudaDeviceAction carries host params.
// Call BuildJobsForAction() again, now with correct device parameters.		// Call BuildJobsForAction() again, now with correct device parameters.
InputInfo II = BuildJobsForAction(		InputInfo II = BuildJobsForAction(
C, *CDA->input_begin(), C.getSingleOffloadToolChain<Action::OFK_Cuda>(),		C, *CDA->input_begin(), C.getSingleOffloadToolChain<Action::OFK_Cuda>(),
CDA->getGpuArchName(), CDA->isAtTopLevel(), /*MultipleArchs=*/true,		CudaArchToString(CDA->getGpuArch()), CDA->isAtTopLevel(),
LinkingOutput, CachedResults);		/*MultipleArchs=*/true, LinkingOutput, CachedResults);
// Currently II's Action is *CDA->input_begin(). Set it to CDA instead, so		// Currently II's Action is *CDA->input_begin(). Set it to CDA instead, so
// that one can retrieve II's GPU arch.		// that one can retrieve II's GPU arch.
II.setAction(A);		II.setAction(A);
return II;		return II;
}		}

const ActionList *Inputs = &A->getInputs();		const ActionList *Inputs = &A->getInputs();

[604 lines not shown]
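
The arch-collection loop in `buildCudaActions` pairs an ordered list with a set so that duplicates are dropped while command-line order is kept. A standalone approximation of that idiom, using `std::vector` and `std::set` in place of `SmallVector` and `SmallSet`, might look like the sketch below. `collectArchs` and its `Rejected` parameter are hypothetical; the driver reports a diagnostic for bad names rather than collecting them.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Cut-down stand-in for clang::CudaArch (hypothetical).
enum class Arch { UNKNOWN, SM_20, SM_35, SM_60 };

Arch stringToArch(const std::string &S) {
  if (S == "sm_20")
    return Arch::SM_20;
  if (S == "sm_35")
    return Arch::SM_35;
  if (S == "sm_60")
    return Arch::SM_60;
  return Arch::UNKNOWN;
}

// Returns the distinct valid archs in first-seen order. Names that do not
// parse are appended to Rejected.
std::vector<Arch> collectArchs(const std::vector<std::string> &Requested,
                               std::vector<std::string> &Rejected) {
  std::vector<Arch> Ordered; // Preserves command-line order.
  std::set<Arch> Seen;       // Detects duplicates.
  for (const std::string &Name : Requested) {
    Arch A = stringToArch(Name);
    if (A == Arch::UNKNOWN)
      Rejected.push_back(Name);
    else if (Seen.insert(A).second) // .second is true only on first insertion.
      Ordered.push_back(A);
  }
  if (Ordered.empty())
    Ordered.push_back(Arch::SM_20); // Same default as the driver code above.
  assert(!Ordered.empty());
  return Ordered;
}
```

Keying the set on the enum rather than on the raw string is what lets the patch replace `llvm::StringSet<>` with `llvm::SmallSet<CudaArch, 4>`.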

lib/Driver/Tools.cpp

[11,207 lines not shown]	void NVPTX::Linker::ConstructJob(Compilation &C, const JobAction &JA,
CmdArgs.push_back(TC.getTriple().isArch64Bit() ? "-64" : "-32");		CmdArgs.push_back(TC.getTriple().isArch64Bit() ? "-64" : "-32");
CmdArgs.push_back(Args.MakeArgString("--create"));		CmdArgs.push_back(Args.MakeArgString("--create"));
CmdArgs.push_back(Args.MakeArgString(Output.getFilename()));		CmdArgs.push_back(Args.MakeArgString(Output.getFilename()));

for (const auto& II : Inputs) {		for (const auto& II : Inputs) {
auto* A = cast<const CudaDeviceAction>(II.getAction());		auto* A = cast<const CudaDeviceAction>(II.getAction());
// We need to pass an Arch of the form "sm_XX" for cubin files and		// We need to pass an Arch of the form "sm_XX" for cubin files and
// "compute_XX" for ptx.		// "compute_XX" for ptx.
const char *Arch = (II.getType() == types::TY_PP_Asm)		const char *Arch =
? A->getComputeArchName()		(II.getType() == types::TY_PP_Asm)
: A->getGpuArchName();		? CudaVirtualArchToString(VirtualArchForCudaArch(A->getGpuArch()))
		: CudaArchToString(A->getGpuArch());
CmdArgs.push_back(Args.MakeArgString(llvm::Twine("--image=profile=") +		CmdArgs.push_back(Args.MakeArgString(llvm::Twine("--image=profile=") +
Arch + ",file=" + II.getFilename()));		Arch + ",file=" + II.getFilename()));
}		}

for (const auto& A : Args.getAllArgValues(options::OPT_Xcuda_fatbinary))		for (const auto& A : Args.getAllArgValues(options::OPT_Xcuda_fatbinary))
CmdArgs.push_back(Args.MakeArgString(A));		CmdArgs.push_back(Args.MakeArgString(A));

const char *Exec = Args.MakeArgString(TC.GetProgramPath("fatbinary"));		const char *Exec = Args.MakeArgString(TC.GetProgramPath("fatbinary"));
C.addCommand(llvm::make_unique<Command>(JA, *this, Exec, CmdArgs, Inputs));		C.addCommand(llvm::make_unique<Command>(JA, *this, Exec, CmdArgs, Inputs));
}		}
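
The `--image` argument built in the loop above uses the virtual (`compute_XX`) name for PTX inputs and the real (`sm_XX`) name for cubin inputs. A standalone sketch of just that string construction follows. `fatbinImageArg` is a hypothetical helper that takes both names as plain strings rather than deriving them from `CudaArch`.

```cpp
#include <cassert>
#include <string>

// Builds one --image argument. PTX inputs are labelled with the virtual
// (compute_XX) arch, cubin inputs with the real (sm_XX) arch.
std::string fatbinImageArg(bool IsPtx, const std::string &SmName,
                           const std::string &ComputeName,
                           const std::string &File) {
  assert(!File.empty() && "an image needs a file name");
  const std::string &Profile = IsPtx ? ComputeName : SmName;
  return "--image=profile=" + Profile + ",file=" + File;
}
```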