This is an archive of the discontinued LLVM Phabricator instance.

[CUDA] Driver changes to support CUDA compilation on MacOS.
ClosedPublic

Authored by jlebar on Nov 16 2016, 4:21 PM.

Details

Reviewers

Commits

rG66c4fd7987d2: [CUDA] Driver changes to support CUDA compilation on MacOS.
rC287285: [CUDA] Driver changes to support CUDA compilation on MacOS.
rL287285: [CUDA] Driver changes to support CUDA compilation on MacOS.

Summary

Compiling CUDA device code requires us to know the host toolchain,
because CUDA device-side compiles pull in e.g. host headers.

When we only supported Linux compilation, this worked because
CudaToolChain, which is responsible for device-side CUDA compilation,
inherited from the Linux toolchain. But in order to support MacOS,
CudaToolChain needs to take a HostToolChain pointer.
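
As a rough illustration of that change, the sketch below swaps inheritance from one fixed host toolchain for a reference to whichever host toolchain is in use. Everything in it is hypothetical and heavily simplified: the `addSystemIncludes` method, the `LinuxHost`, `MacHost` and `CudaDevice` classes, and the placeholder directory names are assumptions for illustration, not clang's real interface. It holds the host by reference, as the `const ToolChain &HostTC` constructor parameter in the diff does.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified toolchain interface (not clang's real API).
class ToolChain {
public:
  virtual ~ToolChain() = default;
  virtual void addSystemIncludes(std::vector<std::string> &Args) const = 0;
};

class LinuxHost : public ToolChain {
public:
  void addSystemIncludes(std::vector<std::string> &Args) const override {
    Args.push_back("linux-host-include-dir"); // placeholder, not a real path
  }
};

class MacHost : public ToolChain {
public:
  void addSystemIncludes(std::vector<std::string> &Args) const override {
    Args.push_back("macos-host-include-dir"); // placeholder, not a real path
  }
};

// The device toolchain does not inherit from any particular host toolchain.
// It forwards host-dependent queries to the host it was constructed with.
class CudaDevice : public ToolChain {
  const ToolChain &HostTC;

public:
  explicit CudaDevice(const ToolChain &HostTC) : HostTC(HostTC) {}

  void addSystemIncludes(std::vector<std::string> &Args) const override {
    HostTC.addSystemIncludes(Args); // device compiles see the host's headers
  }
};
```

With this shape, the same device class works on top of either host; only the object passed to the constructor changes.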

Because a CUDA toolchain now requires a host TC, we no longer will
create a CUDA toolchain from Driver::getToolChain -- you have to go
through CreateOffloadingDeviceToolChains. I am *pretty* sure this is
correct, and that previously any attempt to create a CUDA toolchain
through getToolChain() would eventually have resulted in us throwing
"error: unsupported use of NVPTX for host compilation".

In any case hacking getToolChain to create a CUDA+host toolchain would
be wrong, because a Driver can be reused for multiple compilations,
potentially with different host TCs, and getToolChain will cache the
result, causing us to potentially use a stale host TC.
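
The stale-host problem can be sketched with a small cache. This is a minimal illustration, assuming hypothetical `HostToolChain`, `DeviceToolChain` and `ToolChainCache` types rather than the driver's real classes; the only part taken from the patch is the idea of keying the map on the device triple and the host triple joined with "/".

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-ins for the driver's toolchain types.
struct HostToolChain {
  std::string Triple;
};

struct DeviceToolChain {
  std::string Triple;
  const HostToolChain *Host; // the host TC this device TC was built against
};

class ToolChainCache {
  std::map<std::string, std::unique_ptr<DeviceToolChain>> Cache;

public:
  // Returns the device toolchain for this (device, host) pair, creating it on
  // first use. Because the key includes the host triple, a later compilation
  // with a different host never receives a device TC bound to an earlier host.
  DeviceToolChain &getOrCreate(const std::string &DeviceTriple,
                               const HostToolChain &Host) {
    assert(!DeviceTriple.empty() && "device triple must be set");
    std::unique_ptr<DeviceToolChain> &Slot =
        Cache[DeviceTriple + "/" + Host.Triple];
    if (!Slot)
      Slot.reset(new DeviceToolChain{DeviceTriple, &Host});
    return *Slot;
  }
};
```

Had the key been the device triple alone, the second lookup with a different host would have returned the entry created for the first host.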

So that's the main change in this patch.

In addition, we have to pull CudaInstallationDetector out of Generic_GCC
and into a top-level class. It's now used by the Generic_GCC and MachO
toolchains.
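
Sharing the detector with the MachO toolchains means it can no longer assume a lib64 directory exists: per the comments in the diff, Linux installs have both lib and lib64, while MacOS installs have only lib. A minimal sketch of that selection follows, assuming a hypothetical `pickCudaLibDir` helper and an injected `Exists` callback standing in for the driver's virtual file system.

```cpp
#include <functional>
#include <string>

// Chooses the CUDA library directory under InstallPath. Returns true and sets
// LibPath on success; returns false if no usable directory exists.
// `Exists` is a hypothetical stand-in for a file-system existence check.
bool pickCudaLibDir(const std::string &InstallPath, bool Is64Bit,
                    const std::function<bool(const std::string &)> &Exists,
                    std::string &LibPath) {
  // Prefer lib64 only when targeting 64-bit and the directory is present.
  if (Is64Bit && Exists(InstallPath + "/lib64")) {
    LibPath = InstallPath + "/lib64";
    return true;
  }
  // Otherwise fall back to lib, which is all a MacOS install provides.
  if (Exists(InstallPath + "/lib")) {
    LibPath = InstallPath + "/lib";
    return true;
  }
  return false; // caller skips this candidate installation
}
```

Injecting the existence check keeps the sketch testable without touching the real file system.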

Diff Detail

Build Status

Buildable 1344
Build 1344: arc lint + arc unit

Event Timeline

jlebar updated this revision to Diff 78286. Nov 16 2016, 4:21 PM

jlebar retitled this revision from to [CUDA] Driver changes to support CUDA compilation on MacOS..

jlebar updated this object.

jlebar added a reviewer: tra.

jlebar added subscribers: sfantao, hfinkel, rryan.

jlebar added a subscriber: cfe-commits. Nov 16 2016, 4:59 PM

sfantao added inline comments.

Hi Justin,

Thanks for the patch.

clang/lib/Driver/Driver.cpp
479	I am not sure I understand why we pair the host and device toolchains in the map. The driver can be used for several compilations, but how do these compilations use different host toolchains? Can you give an example of an invocation? Maybe add it to the regression tests below.

jlebar added inline comments. Nov 17 2016, 10:52 AM

clang/lib/Driver/Driver.cpp
479	> The driver can be used for several compilations, but how do these compilations use different host toolchains?

I don't know if it's possible to do so when compiling through the command line. But if using clang as a library, you can create a Driver and use it for multiple compilations with arbitrary targets. I am not certain we do this inside of the tree, although there are a few places where we create Driver objects, such as lib/Tooling/CompilationDatabase.cpp and lib/Tooling/Tooling.cpp. But also anyone downstream can presumably use clang this way.

tra added inline comments.

LGTM, with a couple of minor nits.

clang/lib/Driver/Driver.cpp
3650–3654	Should there be an assert() or llvm_unreachable() to ensure that? Right now we'll happily return the default toolchain.
clang/test/Driver/cuda-detect.cu
67	Should that be --target=i386-apple-macosx?

This revision is now accepted and ready to land. Nov 17 2016, 1:31 PM

jlebar marked 2 inline comments as done. Nov 17 2016, 4:45 PM

jlebar added inline comments.

clang/lib/Driver/Driver.cpp
3650–3654	Unfortunately no -- the way the code is structured now, we get the toolchain before we have a chance to raise an error. I agree that's pretty broken...
clang/test/Driver/cuda-detect.cu
67	Wow, good eye.

Closed by commit rL287285: [CUDA] Driver changes to support CUDA compilation on MacOS. (authored by jlebar). Nov 17 2016, 4:51 PM

This revision was automatically updated to reflect the committed changes.

jlebar marked 2 inline comments as done.

Revision Contents

Path	Size
clang/
  include/clang/Driver/
    ToolChain.h	1 line
  lib/Driver/
    (four files; names not preserved in this archive)	29 lines, 141 lines, 140 lines, 2 lines
  test/Driver/
    Inputs/CUDA-macosx/usr/local/cuda/
      bin/.keep
      include/.keep
      lib/.keep
      nvvm/libdevice/
        libdevice.compute_30.10.bc
        libdevice.compute_35.10.bc
    cuda-detect.cu	28 lines
    cuda-external-tools.cu	11 lines
    cuda-macosx.cu	8 lines

Diff 78286

clang/include/clang/Driver/ToolChain.h

[...]
 namespace clang {
 class ObjCRuntime;
 namespace vfs {
 class FileSystem;
 }

 namespace driver {
 class Compilation;
+class CudaInstallationDetector;
 class Driver;
 class JobAction;
 class RegisterEffectiveTriple;
 class SanitizerArgs;
 class Tool;

 /// ToolChain - Access to tools for a single platform.
 class ToolChain {
[...]

clang/lib/Driver/Driver.cpp

[...] void Driver::CreateOffloadingDeviceToolChains(Compilation &C,

   //
   // CUDA
   //
   // We need to generate a CUDA toolchain if any of the inputs has a CUDA type.
   if (llvm::any_of(Inputs, [](std::pair<types::ID, const llvm::opt::Arg *> &I) {
         return types::isCuda(I.first);
       })) {
-    const ToolChain &TC = getToolChain(
-        C.getInputArgs(),
-        llvm::Triple(C.getSingleOffloadToolChain<Action::OFK_Host>()
-                         ->getTriple()
-                         .isArch64Bit()
-                         ? "nvptx64-nvidia-cuda"
-                         : "nvptx-nvidia-cuda"));
-    C.addOffloadDeviceToolChain(&TC, Action::OFK_Cuda);
+    const ToolChain *HostTC = C.getSingleOffloadToolChain<Action::OFK_Host>();
+    const llvm::Triple &HostTriple = HostTC->getTriple();
+    llvm::Triple CudaTriple(HostTriple.isArch64Bit() ? "nvptx64-nvidia-cuda"
+                                                     : "nvptx-nvidia-cuda");
+    // Use the CUDA and host triples as the key into the ToolChains map, because
+    // the device toolchain we create depends on both.
+    ToolChain *&CudaTC = ToolChains[CudaTriple.str() + "/" + HostTriple.str()];
+    if (!CudaTC) {
+      CudaTC = new toolchains::CudaToolChain(*this, CudaTriple, *HostTC,
+                                             C.getInputArgs());
+    }
+    C.addOffloadDeviceToolChain(CudaTC, Action::OFK_Cuda);
   }

   //
   // OpenMP
   //
   // We need to generate an OpenMP toolchain if the user specified targets with
   // the -fopenmp-targets option.
   if (Arg *OpenMPTargets =
[...] case llvm::Triple::Win32:
         TC = new toolchains::CrossWindowsToolChain(*this, Target, Args);
         break;
       case llvm::Triple::MSVC:
       case llvm::Triple::UnknownEnvironment:
         TC = new toolchains::MSVCToolChain(*this, Target, Args);
         break;
       }
       break;
-    case llvm::Triple::CUDA:
-      TC = new toolchains::CudaToolChain(*this, Target, Args);
-      break;
     case llvm::Triple::PS4:
       TC = new toolchains::PS4CPU(*this, Target, Args);
       break;
     case llvm::Triple::Contiki:
       TC = new toolchains::Contiki(*this, Target, Args);
       break;
     default:
       // Of these targets, Hexagon is the only one that might have
[...] default:
           TC = new toolchains::Generic_ELF(*this, Target, Args);
         else if (Target.isOSBinFormatMachO())
           TC = new toolchains::MachO(*this, Target, Args);
         else
           TC = new toolchains::Generic_GCC(*this, Target, Args);
       }
     }
   }

+  // Intentionally omitted from the switch above: llvm::Triple::CUDA. CUDA
+  // compiles always need two toolchains, the CUDA toolchain and the host
+  // toolchain. So the only valid way to create a CUDA toolchain is via
+  // CreateOffloadingDeviceToolChains.

   return *TC;
 }

 bool Driver::ShouldUseClangCompiler(const JobAction &JA) const {
   // Say "no" if there is not exactly one input of a type clang understands.
   if (JA.size() != 1 ||
       !types::isAcceptedByClang((*JA.input_begin())->getType()))
     return false;
[...]

clang/lib/Driver/ToolChains.h

[...]
 #include "llvm/ADT/Optional.h"
 #include "llvm/ADT/SmallSet.h"
 #include "llvm/Support/Compiler.h"
 #include <set>
 #include <vector>

 namespace clang {
 namespace driver {

+/// A class to find a viable CUDA installation
+class CudaInstallationDetector {
+private:
+  const Driver &D;
+  bool IsValid = false;
+  CudaVersion Version = CudaVersion::UNKNOWN;
+  std::string InstallPath;
+  std::string BinPath;
+  std::string LibPath;
+  std::string LibDevicePath;
+  std::string IncludePath;
+  llvm::StringMap<std::string> LibDeviceMap;

+  // CUDA architectures for which we have raised an error in
+  // CheckCudaVersionSupportsArch.
+  mutable llvm::SmallSet<CudaArch, 4> ArchsWithVersionTooLowErrors;

+public:
+  CudaInstallationDetector(const Driver &D, const llvm::Triple &Triple,
+                           const llvm::opt::ArgList &Args);

+  void AddCudaIncludeArgs(const llvm::opt::ArgList &DriverArgs,
+                          llvm::opt::ArgStringList &CC1Args) const;

+  /// \brief Emit an error if Version does not support the given Arch.
+  ///
+  /// If either Version or Arch is unknown, does not emit an error. Emits at
+  /// most one error per Arch.
+  void CheckCudaVersionSupportsArch(CudaArch Arch) const;

+  /// \brief Check whether we detected a valid Cuda install.
+  bool isValid() const { return IsValid; }
+  /// \brief Print information about the detected CUDA installation.
+  void print(raw_ostream &OS) const;

+  /// \brief Get the detected Cuda install's version.
+  CudaVersion version() const { return Version; }
+  /// \brief Get the detected Cuda installation path.
+  StringRef getInstallPath() const { return InstallPath; }
+  /// \brief Get the detected path to Cuda's bin directory.
+  StringRef getBinPath() const { return BinPath; }
+  /// \brief Get the detected Cuda Include path.
+  StringRef getIncludePath() const { return IncludePath; }
+  /// \brief Get the detected Cuda library path.
+  StringRef getLibPath() const { return LibPath; }
+  /// \brief Get the detected Cuda device library path.
+  StringRef getLibDevicePath() const { return LibDevicePath; }
+  /// \brief Get libdevice file for given architecture
+  std::string getLibDeviceFile(StringRef Gpu) const {
+    return LibDeviceMap.lookup(Gpu);
+  }
+};

 namespace toolchains {

 /// Generic_GCC - A tool chain using the 'gcc' command to perform
 /// all subcommands; this relies on gcc translating the majority of
 /// command line options.
 class LLVM_LIBRARY_VISIBILITY Generic_GCC : public ToolChain {
 public:
   /// \brief Struct to store and manipulate GCC versions.
[...] void scanLibDirForGCCTripleSolaris(const llvm::Triple &TargetArch,
                                        const llvm::opt::ArgList &Args,
                                        const std::string &LibDir,
                                        StringRef CandidateTriple,
                                        bool NeedsBiarchSuffix = false);
   };

 protected:
   GCCInstallationDetector GCCInstallation;

-  // \brief A class to find a viable CUDA installation
-  class CudaInstallationDetector {
-  private:
-    const Driver &D;
-    bool IsValid = false;
-    CudaVersion Version = CudaVersion::UNKNOWN;
-    std::string InstallPath;
-    std::string BinPath;
-    std::string LibPath;
-    std::string LibDevicePath;
-    std::string IncludePath;
-    llvm::StringMap<std::string> LibDeviceMap;

-    // CUDA architectures for which we have raised an error in
-    // CheckCudaVersionSupportsArch.
-    mutable llvm::SmallSet<CudaArch, 4> ArchsWithVersionTooLowErrors;

-  public:
-    CudaInstallationDetector(const Driver &D) : D(D) {}
-    void init(const llvm::Triple &TargetTriple, const llvm::opt::ArgList &Args);

-    /// \brief Emit an error if Version does not support the given Arch.
-    ///
-    /// If either Version or Arch is unknown, does not emit an error. Emits at
-    /// most one error per Arch.
-    void CheckCudaVersionSupportsArch(CudaArch Arch) const;

-    /// \brief Check whether we detected a valid Cuda install.
-    bool isValid() const { return IsValid; }
-    /// \brief Print information about the detected CUDA installation.
-    void print(raw_ostream &OS) const;

-    /// \brief Get the detected Cuda install's version.
-    CudaVersion version() const { return Version; }
-    /// \brief Get the detected Cuda installation path.
-    StringRef getInstallPath() const { return InstallPath; }
-    /// \brief Get the detected path to Cuda's bin directory.
-    StringRef getBinPath() const { return BinPath; }
-    /// \brief Get the detected Cuda Include path.
-    StringRef getIncludePath() const { return IncludePath; }
-    /// \brief Get the detected Cuda library path.
-    StringRef getLibPath() const { return LibPath; }
-    /// \brief Get the detected Cuda device library path.
-    StringRef getLibDevicePath() const { return LibDevicePath; }
-    /// \brief Get libdevice file for given architecture
-    std::string getLibDeviceFile(StringRef Gpu) const {
-      return LibDeviceMap.lookup(Gpu);
-    }
-  };

   CudaInstallationDetector CudaInstallation;

 public:
   Generic_GCC(const Driver &D, const llvm::Triple &Triple,
               const llvm::opt::ArgList &Args);
   ~Generic_GCC() override;

   void printVerboseInfo(raw_ostream &OS) const override;
[...] enum DarwinPlatformKind {
     WatchOSSimulator
   };

   mutable DarwinPlatformKind TargetPlatform;

   /// The OS version we are targeting.
   mutable VersionTuple TargetVersion;

+  CudaInstallationDetector CudaInstallation;

 private:
   void AddDeploymentTarget(llvm::opt::DerivedArgList &Args) const;

 public:
   Darwin(const Driver &D, const llvm::Triple &Triple,
          const llvm::opt::ArgList &Args);
   ~Darwin() override;

[...] public:
   llvm::opt::DerivedArgList *
   TranslateArgs(const llvm::opt::DerivedArgList &Args, StringRef BoundArch,
                 Action::OffloadKind DeviceOffloadKind) const override;

   CXXStdlibType GetDefaultCXXStdlibType() const override;
   ObjCRuntime getDefaultObjCRuntime(bool isNonFragile) const override;
   bool hasBlocksRuntime() const override;

+  void AddCudaIncludeArgs(const llvm::opt::ArgList &DriverArgs,
+                          llvm::opt::ArgStringList &CC1Args) const override;

   bool UseObjCMixedDispatch() const override {
     // This is only used with the non-fragile ABI and non-legacy dispatch.

     // Mixed dispatch is used everywhere except OS X before 10.6.
     return !(isTargetMacOS() && isMacosxVersionLT(10, 6));
   }

   unsigned GetDefaultStackProtectorLevel(bool KernelOrKext) const override {
[...] public:

   void CheckObjCARC() const override;

   bool UseSjLjExceptions(const llvm::opt::ArgList &Args) const override;

   bool SupportsEmbeddedBitcode() const override;

   SanitizerMask getSupportedSanitizers() const override;

+  void printVerboseInfo(raw_ostream &OS) const override;
 };

 /// DarwinClang - The Darwin toolchain used by Clang.
 class LLVM_LIBRARY_VISIBILITY DarwinClang : public Darwin {
 public:
   DarwinClang(const Driver &D, const llvm::Triple &Triple,
               const llvm::opt::ArgList &Args);

[...] public:

   std::vector<std::string> ExtraOpts;

 protected:
   Tool *buildAssembler() const override;
   Tool *buildLinker() const override;
 };

-class LLVM_LIBRARY_VISIBILITY CudaToolChain : public Linux {
+class LLVM_LIBRARY_VISIBILITY CudaToolChain : public ToolChain {
 public:
   CudaToolChain(const Driver &D, const llvm::Triple &Triple,
-                const llvm::opt::ArgList &Args);
+                const ToolChain &HostTC, const llvm::opt::ArgList &Args);

   llvm::opt::DerivedArgList *
   TranslateArgs(const llvm::opt::DerivedArgList &Args, StringRef BoundArch,
                 Action::OffloadKind DeviceOffloadKind) const override;
   void addClangTargetOptions(const llvm::opt::ArgList &DriverArgs,
                              llvm::opt::ArgStringList &CC1Args) const override;

   // Never try to use the integrated assembler with CUDA; always fork out to
   // ptxas.
   bool useIntegratedAs() const override { return false; }
+  bool isCrossCompiling() const override { return true; }
+  bool isPICDefault() const override { return false; }
+  bool isPIEDefault() const override { return false; }
+  bool isPICDefaultForced() const override { return false; }
+  bool SupportsProfiling() const override { return false; }
+  bool SupportsObjCGC() const override { return false; }

   void AddCudaIncludeArgs(const llvm::opt::ArgList &DriverArgs,
                           llvm::opt::ArgStringList &CC1Args) const override;

-  const Generic_GCC::CudaInstallationDetector &cudaInstallation() const {
-    return CudaInstallation;
-  }
-  Generic_GCC::CudaInstallationDetector &cudaInstallation() {
-    return CudaInstallation;
-  }
+  void addClangWarningOptions(llvm::opt::ArgStringList &CC1Args) const override;
+  CXXStdlibType GetCXXStdlibType(const llvm::opt::ArgList &Args) const override;
+  void
+  AddClangSystemIncludeArgs(const llvm::opt::ArgList &DriverArgs,
+                            llvm::opt::ArgStringList &CC1Args) const override;
+  void AddClangCXXStdlibIncludeArgs(
+      const llvm::opt::ArgList &Args,
+      llvm::opt::ArgStringList &CC1Args) const override;
+  void AddIAMCUIncludeArgs(const llvm::opt::ArgList &DriverArgs,
+                           llvm::opt::ArgStringList &CC1Args) const override;

+  const ToolChain &HostTC;
+  CudaInstallationDetector CudaInstallation;

 protected:
   Tool *buildAssembler() const override; // ptxas
   Tool *buildLinker() const override;    // fatbinary (ok, not really a linker)
 };

 class LLVM_LIBRARY_VISIBILITY MipsLLVMToolChain : public Linux {
 protected:
[...]

clang/lib/Driver/ToolChains.cpp

[...] MachO::MachO(const Driver &D, const llvm::Triple &Triple, const ArgList &Args)
   // We expect 'as', 'ld', etc. to be adjacent to our install dir.
   getProgramPaths().push_back(getDriver().getInstalledDir());
   if (getDriver().getInstalledDir() != getDriver().Dir)
     getProgramPaths().push_back(getDriver().Dir);
 }

 /// Darwin - Darwin tool chain for i386 and x86_64.
 Darwin::Darwin(const Driver &D, const llvm::Triple &Triple, const ArgList &Args)
-    : MachO(D, Triple, Args), TargetInitialized(false) {}
+    : MachO(D, Triple, Args), TargetInitialized(false),
+      CudaInstallation(D, Triple, Args) {}

 types::ID MachO::LookupTypeForExtension(StringRef Ext) const {
   types::ID Ty = types::lookupTypeForExtension(Ext);

   // Darwin always preprocesses assembly files (unless -x is used explicitly).
   if (Ty == types::TY_PP_Asm)
     return types::TY_Asm;

[...] bool Darwin::hasBlocksRuntime() const {
   else if (isTargetIOSBased())
     return !isIPhoneOSVersionLT(3, 2);
   else {
     assert(isTargetMacOS() && "unexpected darwin target");
     return !isMacosxVersionLT(10, 6);
   }
 }

+void Darwin::AddCudaIncludeArgs(const ArgList &DriverArgs,
+                                ArgStringList &CC1Args) const {
+  CudaInstallation.AddCudaIncludeArgs(DriverArgs, CC1Args);
+}

 // This is just a MachO name translation routine and there's no
 // way to join this into ARMTargetParser without breaking all
 // other assumptions. Maybe MachO should consider standardising
 // their nomenclature.
 static const char *ArmMachOArchName(StringRef Arch) {
   return llvm::StringSwitch<const char *>(Arch)
       .Case("armv6k", "armv6")
       .Case("armv6m", "armv6m")
[...] if (IsX86_64)
       Res |= SanitizerKind::Thread;
   } else if (isTargetIOSSimulator() || isTargetTvOSSimulator()) {
     if (IsX86_64)
       Res |= SanitizerKind::Thread;
   }
   return Res;
 }

+void Darwin::printVerboseInfo(raw_ostream &OS) const {
+  CudaInstallation.print(OS);
+}

 /// Generic_GCC - A tool chain using the 'gcc' command to perform
 /// all subcommands; this relies on gcc translating the majority of
 /// command line options.

 /// \brief Parse a GCCVersion object out of a string of text.
 ///
 /// This is the primary means of forming GCCVersion objects.
 /*static*/
[...] static CudaVersion ParseCudaVersionFile(llvm::StringRef V) {
   }
   if (Major == 7 && Minor == 5)
     return CudaVersion::CUDA_75;
   if (Major == 8 && Minor == 0)
     return CudaVersion::CUDA_80;
   return CudaVersion::UNKNOWN;
 }

-// \brief -- try common CUDA installation paths looking for files we need for
-// CUDA compilation.
-void Generic_GCC::CudaInstallationDetector::init(
-    const llvm::Triple &TargetTriple, const llvm::opt::ArgList &Args) {
+CudaInstallationDetector::CudaInstallationDetector(
+    const Driver &D, const llvm::Triple &TargetTriple,
+    const llvm::opt::ArgList &Args)
+    : D(D) {
   SmallVector<std::string, 4> CudaPathCandidates;

   if (Args.hasArg(options::OPT_cuda_path_EQ))
     CudaPathCandidates.push_back(
         Args.getLastArgValue(options::OPT_cuda_path_EQ));
   else {
     CudaPathCandidates.push_back(D.SysRoot + "/usr/local/cuda");
     CudaPathCandidates.push_back(D.SysRoot + "/usr/local/cuda-8.0");
     CudaPathCandidates.push_back(D.SysRoot + "/usr/local/cuda-7.5");
     CudaPathCandidates.push_back(D.SysRoot + "/usr/local/cuda-7.0");
   }

   for (const auto &CudaPath : CudaPathCandidates) {
     if (CudaPath.empty() || !D.getVFS().exists(CudaPath))
       continue;

     InstallPath = CudaPath;
     BinPath = CudaPath + "/bin";
     IncludePath = InstallPath + "/include";
     LibDevicePath = InstallPath + "/nvvm/libdevice";
-    LibPath = InstallPath + (TargetTriple.isArch64Bit() ? "/lib64" : "/lib");

     auto &FS = D.getVFS();
-    if (!(FS.exists(IncludePath) && FS.exists(BinPath) && FS.exists(LibPath) &&
+    if (!(FS.exists(IncludePath) && FS.exists(BinPath) &&
           FS.exists(LibDevicePath)))
       continue;

+    // On Linux, we have both lib and lib64 directories, and we need to choose
+    // based on our triple. On MacOS, we have only a lib directory.
+    //
+    // It's sufficient for our purposes to be flexible: If both lib and lib64
+    // exist, we choose whichever one matches our triple. Otherwise, if only
+    // lib exists, we use it.
+    if (TargetTriple.isArch64Bit() && FS.exists(InstallPath + "/lib64"))
+      LibPath = InstallPath + "/lib64";
+    else if (FS.exists(InstallPath + "/lib"))
+      LibPath = InstallPath + "/lib";
+    else
+      continue;

     llvm::ErrorOr<std::unique_ptr<llvm::MemoryBuffer>> VersionFile =
         FS.getBufferForFile(InstallPath + "/version.txt");
     if (!VersionFile) {
       // CUDA 7.0 doesn't have a version.txt, so guess that's our version if
       // version.txt isn't present.
       Version = CudaVersion::CUDA_70;
     } else {
       Version = ParseCudaVersionFile((*VersionFile)->getBuffer());
[...] for (llvm::sys::fs::directory_iterator LI(LibDevicePath, EC), LE;
       }
     }

     IsValid = true;
     break;
   }
 }

-void Generic_GCC::CudaInstallationDetector::CheckCudaVersionSupportsArch(
+void CudaInstallationDetector::AddCudaIncludeArgs(
+    const ArgList &DriverArgs, ArgStringList &CC1Args) const {
+  if (!DriverArgs.hasArg(options::OPT_nobuiltininc)) {
+    // Add cuda_wrappers/* to our system include path. This lets us wrap
+    // standard library headers.
+    SmallString<128> P(D.ResourceDir);
+    llvm::sys::path::append(P, "include");
+    llvm::sys::path::append(P, "cuda_wrappers");
+    CC1Args.push_back("-internal-isystem");
+    CC1Args.push_back(DriverArgs.MakeArgString(P));
+  }
+
+  if (DriverArgs.hasArg(options::OPT_nocudainc))
+    return;
+
+  if (!isValid()) {
+    D.Diag(diag::err_drv_no_cuda_installation);
+    return;
+  }
+
+  CC1Args.push_back("-internal-isystem");
+  CC1Args.push_back(DriverArgs.MakeArgString(getIncludePath()));
+  CC1Args.push_back("-include");
+  CC1Args.push_back("__clang_cuda_runtime_wrapper.h");
+}
+
+void CudaInstallationDetector::CheckCudaVersionSupportsArch(
     CudaArch Arch) const {
   if (Arch == CudaArch::UNKNOWN || Version == CudaVersion::UNKNOWN ||
       ArchsWithVersionTooLowErrors.count(Arch) > 0)
     return;

   auto RequiredVersion = MinVersionForCudaArch(Arch);
   if (Version < RequiredVersion) {
     ArchsWithVersionTooLowErrors.insert(Arch);
     D.Diag(diag::err_drv_cuda_version_too_low)
         << InstallPath << CudaArchToString(Arch) << CudaVersionToString(Version)
         << CudaVersionToString(RequiredVersion);
   }
 }
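
Note (illustrative sketch, not part of the patch): `CheckCudaVersionSupportsArch` is a const method that still records which architectures it has already complained about, so each arch is diagnosed at most once. The class below is a hypothetical stand-in that shows the same idea with plain integers for versions (e.g. 70 for CUDA 7.0) and strings for architectures; it assumes the bookkeeping set is declared `mutable`, which a const method needs in order to insert into it.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the detector's version check. Names and the
// integer version encoding are assumptions made for this sketch.
class VersionCheckSketch {
public:
  explicit VersionCheckSketch(int InstalledVersion)
      : Version(InstalledVersion) {
    assert(InstalledVersion > 0 && "expected a known CUDA version");
  }

  // Emits a diagnostic the first time Arch turns out to need a newer CUDA
  // version than the installed one; later calls for that arch are silent.
  void check(const std::string &Arch, int RequiredVersion) const {
    if (Reported.count(Arch) > 0)
      return;
    if (Version < RequiredVersion) {
      Reported.insert(Arch);
      Diags.push_back(Arch + ": installed CUDA version is too low");
    }
  }

  const std::vector<std::string> &diagnostics() const { return Diags; }

private:
  int Version;
  mutable std::set<std::string> Reported; // Archs already diagnosed.
  mutable std::vector<std::string> Diags; // Stand-in for D.Diag(...).
};
```

This matters because the check is reached from more than one place (the include-args path and the ptxas job construction), so without the set the same error could be printed repeatedly for one compilation.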

-void Generic_GCC::CudaInstallationDetector::print(raw_ostream &OS) const {
+void CudaInstallationDetector::print(raw_ostream &OS) const {
   if (isValid())
     OS << "Found CUDA installation: " << InstallPath << ", version "
        << CudaVersionToString(Version) << "\n";
 }

 namespace {
 // Filter to remove Multilibs that don't exist as a suffix to Path
 class FilterNonExistent {
[826 lines not shown. Context: for (vfs::directory_iterator]
       GCCParentLibPath = GCCInstallPath + LibAndInstallSuffixes[i][1];
       IsValid = true;
     }
   }
 }

 Generic_GCC::Generic_GCC(const Driver &D, const llvm::Triple &Triple,
                          const ArgList &Args)
-    : ToolChain(D, Triple, Args), GCCInstallation(D), CudaInstallation(D) {
+    : ToolChain(D, Triple, Args), GCCInstallation(D),
+      CudaInstallation(D, Triple, Args) {
   getProgramPaths().push_back(getDriver().getInstalledDir());
   if (getDriver().getInstalledDir() != getDriver().Dir)
     getProgramPaths().push_back(getDriver().Dir);
 }

 Generic_GCC::~Generic_GCC() {}

 Tool *Generic_GCC::getTool(Action::ActionClass AC) const {
[1,389 lines not shown. Context: static void addMultilibsFilePaths(const Driver &D, const MultilibSet &Multilibs,]
   if (const auto &PathsCallback = Multilibs.filePathsCallback())
     for (const auto &Path : PathsCallback(Multilib))
       addPathIfExists(D, InstallPath + Path, Paths);
 }

 Linux::Linux(const Driver &D, const llvm::Triple &Triple, const ArgList &Args)
     : Generic_ELF(D, Triple, Args) {
   GCCInstallation.init(Triple, Args);
-  CudaInstallation.init(Triple, Args);
   Multilibs = GCCInstallation.getMultilibs();
   llvm::Triple::ArchType Arch = Triple.getArch();
   std::string SysRoot = computeSysRoot();

   // Cross-compiling binutils and GCC installations (vanilla and openSUSE at
   // least) put various tools in a triple-prefixed directory off of the parent
   // of the GCC installation. We use the GCC triple here to ensure that we end
   // up with tools that support the same amount of cross compiling as the
[588 lines not shown. Context: if (addLibStdCXXIncludePaths(IncludePath, /*Suffix*/ "", TripleStr,]
                                  /*TargetMultiarchTriple*/ "",
                                  Multilib.includeSuffix(), DriverArgs, CC1Args))
       break;
   }
 }

 void Linux::AddCudaIncludeArgs(const ArgList &DriverArgs,
                                ArgStringList &CC1Args) const {
-  if (!DriverArgs.hasArg(options::OPT_nobuiltininc)) {
-    // Add cuda_wrappers/* to our system include path. This lets us wrap
-    // standard library headers.
-    SmallString<128> P(getDriver().ResourceDir);
-    llvm::sys::path::append(P, "include");
-    llvm::sys::path::append(P, "cuda_wrappers");
-    addSystemInclude(DriverArgs, CC1Args, P);
-  }
-
-  if (DriverArgs.hasArg(options::OPT_nocudainc))
-    return;
-
-  if (!CudaInstallation.isValid()) {
-    getDriver().Diag(diag::err_drv_no_cuda_installation);
-    return;
-  }
-
-  addSystemInclude(DriverArgs, CC1Args, CudaInstallation.getIncludePath());
-  CC1Args.push_back("-include");
-  CC1Args.push_back("__clang_cuda_runtime_wrapper.h");
+  CudaInstallation.AddCudaIncludeArgs(DriverArgs, CC1Args);
 }
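
Note (illustrative sketch, not part of the patch): the order of checks in `CudaInstallationDetector::AddCudaIncludeArgs`, to which `Linux::AddCudaIncludeArgs` now forwards, determines which flags survive `-nobuiltininc`, `-nocudainc`, or a missing installation. The struct and function below are hypothetical and use plain strings instead of the driver's `ArgList` types; only the flag spellings and the wrapper header name are taken from the diff.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified inputs for the sketch.
struct CudaIncludeOptionsSketch {
  bool NoBuiltinInc;       // -nobuiltininc was passed.
  bool NoCudaInc;          // -nocudainc was passed.
  bool InstallValid;       // A CUDA installation was detected.
  std::string ResourceDir; // Compiler resource directory.
  std::string IncludePath; // <cuda install>/include.
  CudaIncludeOptionsSketch()
      : NoBuiltinInc(false), NoCudaInc(false), InstallValid(true) {}
};

// Returns false where the driver would emit err_drv_no_cuda_installation.
bool addCudaIncludeArgsSketch(const CudaIncludeOptionsSketch &O,
                              std::vector<std::string> &CC1Args) {
  assert(!O.ResourceDir.empty() && "resource dir is needed for the wrappers");
  if (!O.NoBuiltinInc) {
    // The wrapper directory is added before any CUDA-specific early exit.
    CC1Args.push_back("-internal-isystem");
    CC1Args.push_back(O.ResourceDir + "/include/cuda_wrappers");
  }
  if (O.NoCudaInc)
    return true; // User opted out of CUDA headers: nothing more to add.
  if (!O.InstallValid)
    return false; // No installation: report an error, add nothing else.
  CC1Args.push_back("-internal-isystem");
  CC1Args.push_back(O.IncludePath);
  CC1Args.push_back("-include");
  CC1Args.push_back("__clang_cuda_runtime_wrapper.h");
  return true;
}
```

Read this way, `-nocudainc` suppresses both the CUDA include path and the forced wrapper include, but leaves the `cuda_wrappers` directory in place unless `-nobuiltininc` is also given.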

 void Linux::AddIAMCUIncludeArgs(const ArgList &DriverArgs,
                                 ArgStringList &CC1Args) const {
   if (GCCInstallation.isValid()) {
     CC1Args.push_back("-isystem");
     CC1Args.push_back(DriverArgs.MakeArgString(
         GCCInstallation.getParentLibPath() + "/../" +
[165 lines not shown. Context: Tool *DragonFly::buildLinker() const {]
   return new tools::dragonfly::Linker(*this);
 }

 /// CUDA toolchain. Our assembler is ptxas, and our "linker" is fatbinary,
 /// which isn't properly a linker but nonetheless performs the step of stitching
 /// together object files from the assembler into a single blob.

 CudaToolChain::CudaToolChain(const Driver &D, const llvm::Triple &Triple,
-                             const ArgList &Args)
-    : Linux(D, Triple, Args) {
+                             const ToolChain &HostTC, const ArgList &Args)
+    : ToolChain(D, Triple, Args), HostTC(HostTC),
+      CudaInstallation(D, Triple, Args) {
   if (CudaInstallation.isValid())
     getProgramPaths().push_back(CudaInstallation.getBinPath());
 }

-void
-CudaToolChain::addClangTargetOptions(const llvm::opt::ArgList &DriverArgs,
-                                     llvm::opt::ArgStringList &CC1Args) const {
-  Linux::addClangTargetOptions(DriverArgs, CC1Args);
+void CudaToolChain::addClangTargetOptions(
+    const llvm::opt::ArgList &DriverArgs,
+    llvm::opt::ArgStringList &CC1Args) const {
+  HostTC.addClangTargetOptions(DriverArgs, CC1Args);

   CC1Args.push_back("-fcuda-is-device");

   if (DriverArgs.hasFlag(options::OPT_fcuda_flush_denormals_to_zero,
                          options::OPT_fno_cuda_flush_denormals_to_zero, false))
     CC1Args.push_back("-fcuda-flush-denormals-to-zero");

   if (DriverArgs.hasFlag(options::OPT_fcuda_approx_transcendentals,
                          options::OPT_fno_cuda_approx_transcendentals, false))
[25 lines not shown. Context: void CudaToolChain::AddCudaIncludeArgs(const ArgList &DriverArgs,]
                                        ArgStringList &CC1Args) const {
   // Check our CUDA version if we're going to include the CUDA headers.
   if (!DriverArgs.hasArg(options::OPT_nocudainc) &&
       !DriverArgs.hasArg(options::OPT_no_cuda_version_check)) {
     StringRef Arch = DriverArgs.getLastArgValue(options::OPT_march_EQ);
     assert(!Arch.empty() && "Must have an explicit GPU arch.");
     CudaInstallation.CheckCudaVersionSupportsArch(StringToCudaArch(Arch));
   }
-  Linux::AddCudaIncludeArgs(DriverArgs, CC1Args);
+  CudaInstallation.AddCudaIncludeArgs(DriverArgs, CC1Args);
 }

 llvm::opt::DerivedArgList *
 CudaToolChain::TranslateArgs(const llvm::opt::DerivedArgList &Args,
-                             StringRef BoundArch, Action::OffloadKind) const {
-  DerivedArgList *DAL = new DerivedArgList(Args.getBaseArgs());
+                             StringRef BoundArch,
+                             Action::OffloadKind DeviceOffloadKind) const {
+  DerivedArgList *DAL =
+      HostTC.TranslateArgs(Args, BoundArch, DeviceOffloadKind);
+  if (!DAL)
+    DAL = new DerivedArgList(Args.getBaseArgs());

   const OptTable &Opts = getDriver().getOpts();

   for (Arg *A : Args) {
     if (A->getOption().matches(options::OPT_Xarch__)) {
       // Skip this argument unless the architecture matches BoundArch
       if (BoundArch.empty() || A->getValue(0) != BoundArch)
         continue;

[35 lines not shown]
 Tool *CudaToolChain::buildAssembler() const {
   return new tools::NVPTX::Assembler(*this);
 }

 Tool *CudaToolChain::buildLinker() const {
   return new tools::NVPTX::Linker(*this);
 }

+void CudaToolChain::addClangWarningOptions(ArgStringList &CC1Args) const {
+  HostTC.addClangWarningOptions(CC1Args);
+}
+
+ToolChain::CXXStdlibType
+CudaToolChain::GetCXXStdlibType(const ArgList &Args) const {
+  return HostTC.GetCXXStdlibType(Args);
+}
+
+void CudaToolChain::AddClangSystemIncludeArgs(const ArgList &DriverArgs,
+                                              ArgStringList &CC1Args) const {
+  HostTC.AddClangSystemIncludeArgs(DriverArgs, CC1Args);
+}
+
+void CudaToolChain::AddClangCXXStdlibIncludeArgs(const ArgList &Args,
+                                                 ArgStringList &CC1Args) const {
+  HostTC.AddClangCXXStdlibIncludeArgs(Args, CC1Args);
+}
+
+void CudaToolChain::AddIAMCUIncludeArgs(const ArgList &Args,
+                                        ArgStringList &CC1Args) const {
+  HostTC.AddIAMCUIncludeArgs(Args, CC1Args);
+}
+
 /// XCore tool chain
 XCoreToolChain::XCoreToolChain(const Driver &D, const llvm::Triple &Triple,
                                const ArgList &Args)
     : ToolChain(D, Triple, Args) {
   // ProgramPaths are found via 'PATH' environment variable.
 }

 Tool *XCoreToolChain::buildAssembler() const {
[292 lines not shown]
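
Note (illustrative sketch, not part of the patch): the ToolChains.cpp changes replace inheritance from `Linux` with a reference to whatever host toolchain is in use, and forward host-dependent queries to it. The classes below are hypothetical, heavily simplified stand-ins for the driver's `ToolChain` hierarchy, and the include paths are placeholders; only the `-fcuda-is-device` and `-internal-isystem` spellings come from the diff.

```cpp
#include <cassert>
#include <string>
#include <vector>

typedef std::vector<std::string> ArgStrings;

// Hypothetical minimal toolchain interface.
struct ToolChainSketch {
  virtual ~ToolChainSketch() {}
  virtual void addSystemIncludeArgs(ArgStrings &CC1Args) const = 0;
  virtual void addTargetOptions(ArgStrings &) const {}
};

struct LinuxSketch : ToolChainSketch {
  void addSystemIncludeArgs(ArgStrings &CC1Args) const override {
    CC1Args.push_back("-internal-isystem");
    CC1Args.push_back("/placeholder/linux/include");
  }
};

struct DarwinSketch : ToolChainSketch {
  void addSystemIncludeArgs(ArgStrings &CC1Args) const override {
    CC1Args.push_back("-internal-isystem");
    CC1Args.push_back("/placeholder/darwin/include");
  }
};

// Device-side toolchain: holds a reference to the host toolchain instead of
// inheriting from one particular host, and forwards host-dependent queries.
struct CudaDeviceSketch : ToolChainSketch {
  explicit CudaDeviceSketch(const ToolChainSketch &Host) : HostTC(Host) {}
  void addSystemIncludeArgs(ArgStrings &CC1Args) const override {
    HostTC.addSystemIncludeArgs(CC1Args); // Device compiles see host headers.
  }
  void addTargetOptions(ArgStrings &CC1Args) const override {
    HostTC.addTargetOptions(CC1Args);     // Host options first...
    CC1Args.push_back("-fcuda-is-device"); // ...then device-only flags.
  }
  const ToolChainSketch &HostTC; // Not owned; must outlive this object.
};

// Builds the device-side cc1 arguments for a given host toolchain.
ArgStrings deviceArgsFor(const ToolChainSketch &Host) {
  CudaDeviceSketch Device(Host);
  ArgStrings Args;
  Device.addSystemIncludeArgs(Args);
  Device.addTargetOptions(Args);
  assert(Args.back() == "-fcuda-is-device" && "device flag is appended last");
  return Args;
}
```

Because the host is a constructor argument, the same device class can serve a Linux or a MacOS host, which is also why a device toolchain can no longer be created without first knowing the host one.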

clang/lib/Driver/Tools.cpp

[11,975 lines not shown. Context: void NVPTX::Assembler::ConstructJob(Compilation &C, const JobAction &JA,]

   // Obtain architecture from the action.
   CudaArch gpu_arch = StringToCudaArch(JA.getOffloadingArch());
   assert(gpu_arch != CudaArch::UNKNOWN &&
          "Device action expected to have an architecture.");

   // Check that our installation's ptxas supports gpu_arch.
   if (!Args.hasArg(options::OPT_no_cuda_version_check)) {
-    TC.cudaInstallation().CheckCudaVersionSupportsArch(gpu_arch);
+    TC.CudaInstallation.CheckCudaVersionSupportsArch(gpu_arch);
   }

   ArgStringList CmdArgs;
   CmdArgs.push_back(TC.getTriple().isArch64Bit() ? "-m64" : "-m32");
   if (Args.hasFlag(options::OPT_cuda_noopt_device_debug,
                    options::OPT_no_cuda_noopt_device_debug, false)) {
     // ptxas does not accept -g option if optimization is enabled, so
     // we ignore the compiler's -O* options if we want debug info.
[91 lines not shown]

clang/test/Driver/Inputs/CUDA-macosx/usr/local/cuda/bin/.keep

This file was added.

This is an empty file.

clang/test/Driver/Inputs/CUDA-macosx/usr/local/cuda/include/.keep

This file was added.

This is an empty file.

clang/test/Driver/Inputs/CUDA-macosx/usr/local/cuda/lib/.keep

This file was added.

This is an empty file.

clang/test/Driver/Inputs/CUDA-macosx/usr/local/cuda/nvvm/libdevice/libdevice.compute_30.10.bc

This file was added.

This is an empty file.

clang/test/Driver/Inputs/CUDA-macosx/usr/local/cuda/nvvm/libdevice/libdevice.compute_35.10.bc

This file was added.

This is an empty file.

clang/test/Driver/cuda-detect.cu

 // REQUIRES: clang-driver
 // REQUIRES: x86-registered-target
 // REQUIRES: nvptx-registered-target
 //
 // # Check that we properly detect CUDA installation.
 // RUN: %clang -v --target=i386-unknown-linux \
 // RUN: --sysroot=%S/no-cuda-there 2>&1 | FileCheck %s -check-prefix NOCUDA
+// RUN: %clang -v --target=i386-apple-macosx \
+// RUN: --sysroot=%S/no-cuda-there 2>&1 | FileCheck %s -check-prefix NOCUDA

 // RUN: %clang -v --target=i386-unknown-linux \
 // RUN: --sysroot=%S/Inputs/CUDA 2>&1 | FileCheck %s
+// RUN: %clang -v --target=i386-apple-macosx \
+// RUN: --sysroot=%S/Inputs/CUDA 2>&1 | FileCheck %s

 // RUN: %clang -v --target=i386-unknown-linux \
 // RUN: --cuda-path=%S/Inputs/CUDA/usr/local/cuda 2>&1 | FileCheck %s
+// RUN: %clang -v --target=i386-apple-macosx \
+// RUN: --cuda-path=%S/Inputs/CUDA/usr/local/cuda 2>&1 | FileCheck %s

 // Make sure we map libdevice bitcode files to proper GPUs. These
 // tests use Inputs/CUDA_80 which has full set of libdevice files.
 // However, libdevice mapping only matches CUDA-7.x at the moment.
 // sm_2x, sm_32 -> compute_20
 // RUN: %clang -### -v --target=i386-unknown-linux --cuda-gpu-arch=sm_21 \
 // RUN: --cuda-path=%S/Inputs/CUDA_80/usr/local/cuda %s 2>&1 \
 // RUN: | FileCheck %s -check-prefix COMMON \
[26 lines not shown]
 // RUN: | FileCheck %s -check-prefix COMMON -check-prefix CUDAINC \
 // RUN: -check-prefix LIBDEVICE -check-prefix LIBDEVICE35
 // sm_5x -> compute_50 for CUDA-8.0 and newer.
 // RUN: %clang -### -v --target=i386-unknown-linux --cuda-gpu-arch=sm_50 \
 // RUN: --cuda-path=%S/Inputs/CUDA_80/usr/local/cuda %s 2>&1 \
 // RUN: | FileCheck %s -check-prefix COMMON \
 // RUN: -check-prefix LIBDEVICE -check-prefix LIBDEVICE50


 // Verify that -nocudainc prevents adding include path to CUDA headers.
 // RUN: %clang -### -v --target=i386-unknown-linux --cuda-gpu-arch=sm_35 \
 // RUN: -nocudainc --cuda-path=%S/Inputs/CUDA/usr/local/cuda %s 2>&1 \
 // RUN: | FileCheck %s -check-prefix COMMON -check-prefix NOCUDAINC \
 // RUN: -check-prefix LIBDEVICE -check-prefix LIBDEVICE35
+// RUN: %clang -### -v --target=i386-unknown-linux --cuda-gpu-arch=sm_35 \
    [Inline comment] tra: Should that be --target=i386-apple-macosx ?
    [Inline comment] jlebar: Wow, good eye.
+// RUN: -nocudainc --cuda-path=%S/Inputs/CUDA/usr/local/cuda %s 2>&1 \
+// RUN: | FileCheck %s -check-prefix COMMON -check-prefix NOCUDAINC \
+// RUN: -check-prefix LIBDEVICE -check-prefix LIBDEVICE35

 // We should not add any CUDA include paths if there's no valid CUDA installation
 // RUN: %clang -### -v --target=i386-unknown-linux --cuda-gpu-arch=sm_35 \
 // RUN: --cuda-path=%S/no-cuda-there %s 2>&1 \
 // RUN: | FileCheck %s -check-prefix COMMON -check-prefix NOCUDAINC
+// RUN: %clang -### -v --target=i386-apple-macosx --cuda-gpu-arch=sm_35 \
+// RUN: --cuda-path=%S/no-cuda-there %s 2>&1 \
+// RUN: | FileCheck %s -check-prefix COMMON -check-prefix NOCUDAINC

 // Verify that we get an error if there's no libdevice library to link with.
 // NOTE: Inputs/CUDA deliberately does not have libdevice.compute_20 for this purpose.
 // RUN: %clang -### -v --target=i386-unknown-linux --cuda-gpu-arch=sm_20 \
 // RUN: --cuda-path=%S/Inputs/CUDA/usr/local/cuda %s 2>&1 \
 // RUN: | FileCheck %s -check-prefix COMMON -check-prefix MISSINGLIBDEVICE
+// RUN: %clang -### -v --target=i386-apple-macosx --cuda-gpu-arch=sm_20 \
+// RUN: --cuda-path=%S/Inputs/CUDA/usr/local/cuda %s 2>&1 \
+// RUN: | FileCheck %s -check-prefix COMMON -check-prefix MISSINGLIBDEVICE

 // Verify that -nocudalib prevents linking libdevice bitcode in.
 // RUN: %clang -### -v --target=i386-unknown-linux --cuda-gpu-arch=sm_35 \
 // RUN: -nocudalib --cuda-path=%S/Inputs/CUDA/usr/local/cuda %s 2>&1 \
 // RUN: | FileCheck %s -check-prefix COMMON -check-prefix NOLIBDEVICE
+// RUN: %clang -### -v --target=i386-apple-macosx --cuda-gpu-arch=sm_35 \
+// RUN: -nocudalib --cuda-path=%S/Inputs/CUDA/usr/local/cuda %s 2>&1 \
+// RUN: | FileCheck %s -check-prefix COMMON -check-prefix NOLIBDEVICE

 // Verify that we don't add include paths, link with libdevice or
 // -include __clang_cuda_runtime_wrapper.h without valid CUDA installation.
 // RUN: %clang -### -v --target=i386-unknown-linux --cuda-gpu-arch=sm_35 \
 // RUN: --cuda-path=%S/no-cuda-there %s 2>&1 \
 // RUN: | FileCheck %s -check-prefix COMMON \
 // RUN: -check-prefix NOCUDAINC -check-prefix NOLIBDEVICE
+// RUN: %clang -### -v --target=i386-apple-macosx --cuda-gpu-arch=sm_35 \
+// RUN: --cuda-path=%S/no-cuda-there %s 2>&1 \
+// RUN: | FileCheck %s -check-prefix COMMON \
+// RUN: -check-prefix NOCUDAINC -check-prefix NOLIBDEVICE

 // Verify that C++ include paths are passed for both host and device frontends.
 // RUN: %clang -### -no-canonical-prefixes -target x86_64-linux-gnu %s \
 // RUN: --stdlib=libstdc++ --sysroot=%S/Inputs/ubuntu_14.04_multiarch_tree2 \
 // RUN: --gcc-toolchain="" 2>&1 \
 // RUN: | FileCheck %s --check-prefix CHECK-CXXINCLUDE

 // CHECK: Found CUDA installation: {{.*}}/Inputs/CUDA/usr/local/cuda
[28 lines not shown]

clang/test/Driver/cuda-external-tools.cu

-// Tests that ptxas and fatbinary are correctly during CUDA compilation.
+// Tests that ptxas and fatbinary are invoked correctly during CUDA
+// compilation.
 //
 // REQUIRES: clang-driver
 // REQUIRES: x86-registered-target
 // REQUIRES: nvptx-registered-target

 // Regular compiles with -O{0,1,2,3,4,fast}. -O4 and -Ofast map to ptxas O3.
 // RUN: %clang -### -target x86_64-linux-gnu -O0 -c %s 2>&1 \
 // RUN: | FileCheck -check-prefix ARCH64 -check-prefix SM20 -check-prefix OPT0 %s
[41 lines not shown]
 // RUN: | FileCheck -check-prefix ARCH64 -check-prefix SM20 -check-prefix OPT0 %s

 // Check -Xcuda-ptxas and -Xcuda-fatbinary
 // RUN: %clang -### -target x86_64-linux-gnu -c -Xcuda-ptxas -foo1 \
 // RUN: -Xcuda-fatbinary -bar1 -Xcuda-ptxas -foo2 -Xcuda-fatbinary -bar2 %s 2>&1 \
 // RUN: | FileCheck -check-prefix SM20 -check-prefix PTXAS-EXTRA \
 // RUN: -check-prefix FATBINARY-EXTRA %s

+// MacOS spot-checks
+// RUN: %clang -### -target x86_64-apple-macosx -O0 -c %s 2>&1 \
+// RUN: | FileCheck -check-prefix ARCH64 -check-prefix SM20 -check-prefix OPT0 %s
+// RUN: %clang -### -target x86_64-apple-macosx --cuda-gpu-arch=sm_35 -c %s 2>&1 \
+// RUN: | FileCheck -check-prefix ARCH64 -check-prefix SM35 %s
+// RUN: %clang -### -target x86_32-apple-macosx -c %s 2>&1 \
+// RUN: | FileCheck -check-prefix ARCH32 -check-prefix SM20 %s

 // Match clang job that produces PTX assembly.
 // CHECK: "-cc1" "-triple" "nvptx64-nvidia-cuda"
 // SM20: "-target-cpu" "sm_20"
 // SM35: "-target-cpu" "sm_35"
 // SM20: "-o" "[[PTXFILE:[^"]*]]"
 // SM35: "-o" "[[PTXFILE:[^"]*]]"

 // Match the call to ptxas (which assembles PTX to SASS).
[37 lines not shown]

clang/test/Driver/cuda-macosx.cu

This file was added.

+// REQUIRES: clang-driver
+// REQUIRES: x86-registered-target
+// REQUIRES: nvptx-registered-target
+//
+// RUN: %clang -v --target=i386-apple-macosx \
+// RUN: --sysroot=%S/Inputs/CUDA-macosx 2>&1 | FileCheck %s
+
+// CHECK: Found CUDA installation: {{.*}}/Inputs/CUDA-macosx/usr/local/cuda
