This is an archive of the discontinued LLVM Phabricator instance.

LGTM, though I'm curious if it's particularly useful. Last time I checked NVIDIA didn't ship libcudart for FreeBSD and without it it's rather cumbersome to use CUDA in practice.
You can compile a kernel, but kernel loading, launching, and related data transfers will all need to be done via driver API. It should be possible to implement a functional replacement, but I'm not aware of any existing open-source implementations. I'm also not sure if clang will be able to deal with CUDA headers correctly on FreeBSD as CUDA headers do sometimes seem to rely on implementation specifics of Linux headers.

This revision is now accepted and ready to land. Nov 18 2019, 11:22 AM

In D69990#1750348, @tra wrote:

LGTM, though I'm curious if it's particularly useful. Last time I checked NVIDIA didn't ship libcudart for FreeBSD and without it it's rather cumbersome to use CUDA in practice.

After extracting the necessary CUDA stuff and enabling Linux emulation (for ptxas), at least a "hello world" sample program compiles to an object file:

$ ~/obj/llvm/llvmorg-10-init-10100-g9a5b7b785bf-freebsd13-amd64-ninja-rel-1/bin/clang --cuda-path=/share/dim/src/freebsd/cuda/cuda-10.1 --cuda-gpu-arch=sm_60 -c hello.cu -v
clang version 10.0.0 (https://github.com/llvm/llvm-project.git 014799db369c8e30c222c0e9d3ea143f349c3db9)
Target: x86_64-unknown-freebsd13.0
Thread model: posix
InstalledDir: /home/dim/obj/llvm/llvmorg-10-init-10100-g9a5b7b785bf-freebsd13-amd64-ninja-rel-1/bin
Found CUDA installation: /share/dim/src/freebsd/cuda/cuda-10.1, version 10.1
 "/home/dim/obj/llvm/llvmorg-10-init-10100-g9a5b7b785bf-freebsd13-amd64-ninja-rel-1/bin/clang" -cc1 -triple nvptx64-nvidia-cuda -aux-triple x86_64-unknown-freebsd13.0 -S -disable-free -main-file-name hello.cu -mrelocation-model static -mthread-model posix -mframe-pointer=all -fno-rounding-math -no-integrated-as -fuse-init-array -fcuda-is-device -mlink-builtin-bitcode /share/dim/src/freebsd/cuda/cuda-10.1/nvvm/libdevice/libdevice.10.bc -target-feature +ptx64 -target-sdk-version=10.1 -target-cpu sm_60 -dwarf-column-info -debugger-tuning=gdb -v -resource-dir /home/dim/obj/llvm/llvmorg-10-init-10100-g9a5b7b785bf-freebsd13-amd64-ninja-rel-1/lib/clang/10.0.0 -internal-isystem /home/dim/obj/llvm/llvmorg-10-init-10100-g9a5b7b785bf-freebsd13-amd64-ninja-rel-1/lib/clang/10.0.0/include/cuda_wrappers -internal-isystem /share/dim/src/freebsd/cuda/cuda-10.1/include -include __clang_cuda_runtime_wrapper.h -internal-isystem /usr/include/c++/v1 -internal-isystem /usr/include/c++/v1 -fdeprecated-macro -fno-dwarf-directory-asm -fno-autolink -fdebug-compilation-dir /tmp -ferror-limit 19 -fmessage-length 160 -fgnuc-version=4.2.1 -fobjc-runtime=gcc -fcxx-exceptions -fexceptions -fdiagnostics-show-option -fcolor-diagnostics -o /home/dim/tmp/hello-f032c8.s -x cuda hello.cu
clang -cc1 version 10.0.0 based upon LLVM 10.0.0git default target x86_64-unknown-freebsd13.0
ignoring duplicate directory "/usr/include/c++/v1"
#include "..." search starts here:
#include <...> search starts here:
 /home/dim/obj/llvm/llvmorg-10-init-10100-g9a5b7b785bf-freebsd13-amd64-ninja-rel-1/lib/clang/10.0.0/include/cuda_wrappers
 /share/dim/src/freebsd/cuda/cuda-10.1/include
 /usr/include/c++/v1
 /home/dim/obj/llvm/llvmorg-10-init-10100-g9a5b7b785bf-freebsd13-amd64-ninja-rel-1/lib/clang/10.0.0/include
 /usr/include
End of search list.
 "/share/dim/src/freebsd/cuda/cuda-10.1/bin/ptxas" -m64 -O0 -v --gpu-name sm_60 --output-file /home/dim/tmp/hello-54422a.o /home/dim/tmp/hello-f032c8.s
ptxas info    : 23 bytes gmem
ptxas info    : Compiling entry function '_Z10cuda_hellov' for 'sm_60'
ptxas info    : Function properties for _Z10cuda_hellov
    0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info    : Used 8 registers, 320 bytes cmem[0]
 "/share/dim/src/freebsd/cuda/cuda-10.1/bin/fatbinary" -64 --create /home/dim/tmp/hello-9cd109.fatbin --image=profile=sm_60,file=/home/dim/tmp/hello-54422a.o --image=profile=compute_60,file=/home/dim/tmp/hello-f032c8.s
 "/home/dim/obj/llvm/llvmorg-10-init-10100-g9a5b7b785bf-freebsd13-amd64-ninja-rel-1/bin/clang" -cc1 -triple x86_64-unknown-freebsd13.0 -target-sdk-version=10.1 -aux-triple nvptx64-nvidia-cuda -emit-obj -mrelax-all -disable-free -main-file-name hello.cu -mrelocation-model static -mthread-model posix -mframe-pointer=all -fno-rounding-math -masm-verbose -mconstructor-aliases -munwind-tables -fuse-init-array -target-cpu x86-64 -dwarf-column-info -debugger-tuning=gdb -v -resource-dir /home/dim/obj/llvm/llvmorg-10-init-10100-g9a5b7b785bf-freebsd13-amd64-ninja-rel-1/lib/clang/10.0.0 -internal-isystem /home/dim/obj/llvm/llvmorg-10-init-10100-g9a5b7b785bf-freebsd13-amd64-ninja-rel-1/lib/clang/10.0.0/include/cuda_wrappers -internal-isystem /share/dim/src/freebsd/cuda/cuda-10.1/include -include __clang_cuda_runtime_wrapper.h -internal-isystem /usr/include/c++/v1 -internal-isystem /usr/include/c++/v1 -fdeprecated-macro -fdebug-compilation-dir /tmp -ferror-limit 19 -fmessage-length 160 -fgnuc-version=4.2.1 -fobjc-runtime=gnustep -fcxx-exceptions -fexceptions -fdiagnostics-show-option -fcolor-diagnostics -fcuda-include-gpubinary /home/dim/tmp/hello-9cd109.fatbin -faddrsig -o hello.o -x cuda hello.cu
clang -cc1 version 10.0.0 based upon LLVM 10.0.0git default target x86_64-unknown-freebsd13.0
ignoring duplicate directory "/usr/include/c++/v1"
#include "..." search starts here:
#include <...> search starts here:
 /home/dim/obj/llvm/llvmorg-10-init-10100-g9a5b7b785bf-freebsd13-amd64-ninja-rel-1/lib/clang/10.0.0/include/cuda_wrappers
 /share/dim/src/freebsd/cuda/cuda-10.1/include
 /usr/include/c++/v1
 /home/dim/obj/llvm/llvmorg-10-init-10100-g9a5b7b785bf-freebsd13-amd64-ninja-rel-1/lib/clang/10.0.0/include
 /usr/include
End of search list.

I can't link it into an executable yet, though. That's probably going to need some added link flags.

You can compile a kernel, but kernel loading, launching, and related data transfers will all need to be done via driver API. It should be possible to implement a functional replacement, but I'm not aware of any existing open-source implementations. I'm also not sure if clang will be able to deal with CUDA headers correctly on FreeBSD as CUDA headers do sometimes seem to rely on implementation specifics of Linux headers.

I think @6yearold is at least experimenting with this. One step at a time... :)

Closed by commit rGee31adb7fa42: Populate CUDA flags on FreeBSD too, as many other toolchains do. (authored by dim). Nov 18 2019, 12:58 PM

This revision was automatically updated to reflect the committed changes.

... I'm curious if it's particularly useful. Last time I checked NVIDIA didn't ship libcudart for FreeBSD and without it it's rather cumbersome to use CUDA in practice.

FYI, I've just got our internal proof-of-concept runtime support library which may get CUDA apps running on FreeBSD:
https://github.com/google/gpu-runtime

It's somewhat old and is missing a few glue functions needed by CUDA-10, but it should work well enough for CUDA-9.

In D69990#1790047, @tra wrote:

... I'm curious if it's particularly useful. Last time I checked NVIDIA didn't ship libcudart for FreeBSD and without it it's rather cumbersome to use CUDA in practice.

FYI, I've just got our internal proof-of-concept runtime support library which may get CUDA apps running on FreeBSD:
https://github.com/google/gpu-runtime

It's somewhat old and is missing a few glue functions needed by CUDA-10, but it should work well enough for CUDA-9.

Interesting, thanks for sharing. However, at a quick look, it seems to require other CUDA libraries (libcuda, libcublas, etc.), which also aren't available for FreeBSD.

In D69990#1790154, @6yearold wrote:

It's somewhat old and is missing a few glue functions needed by CUDA-10, but it should work well enough for CUDA-9.

Interesting, thanks for sharing. However, at a quick look, it seems to require other CUDA libraries (libcuda, libcublas, etc.), which also aren't available for FreeBSD.

cuBLAS is only for testing. The runtime itself does not need it.

libcuda.so is normally part of the GPU *driver* not CUDA itself, at least it is on Linux. I didn't check if that's also the case on FreeBSD.
Looks like you're correct -- the driver archive only has NVIDIA-FreeBSD-x86_64-440.44/obj/linux/libcuda.so.440.44 in it.

:-(

Revision Contents

Path                                         Size
clang/lib/Driver/ToolChains/FreeBSD.h        2 lines
clang/lib/Driver/ToolChains/FreeBSD.cpp      5 lines
clang/test/Driver/cuda-options-freebsd.cu    289 lines

Diff 229905

clang/lib/Driver/ToolChains/FreeBSD.h

...
 public:
   bool IsObjCNonFragileABIDefault() const override { return true; }

   CXXStdlibType GetDefaultCXXStdlibType() const override;
   void addLibStdCxxIncludePaths(
       const llvm::opt::ArgList &DriverArgs,
       llvm::opt::ArgStringList &CC1Args) const override;
   void AddCXXStdlibLibArgs(const llvm::opt::ArgList &Args,
                            llvm::opt::ArgStringList &CmdArgs) const override;
+  void AddCudaIncludeArgs(const llvm::opt::ArgList &DriverArgs,
+                          llvm::opt::ArgStringList &CC1Args) const override;

   llvm::ExceptionHandling GetExceptionModel(
       const llvm::opt::ArgList &Args) const override;
   bool IsUnwindTablesDefault(const llvm::opt::ArgList &Args) const override;
   bool isPIEDefault() const override;
   SanitizerMask getSupportedSanitizers() const override;
   unsigned GetDefaultDwarfVersion() const override;
   // Until dtrace (via CTF) and LLDB can deal with distributed debug info,
...

clang/lib/Driver/ToolChains/FreeBSD.cpp

...
   case ToolChain::CST_Libcxx:
     break;

   case ToolChain::CST_Libstdcxx:
     CmdArgs.push_back(Profiling ? "-lstdc++_p" : "-lstdc++");
     break;
   }
 }

+void FreeBSD::AddCudaIncludeArgs(const ArgList &DriverArgs,
+                                 ArgStringList &CC1Args) const {
+  CudaInstallation.AddCudaIncludeArgs(DriverArgs, CC1Args);
+}
+
 Tool *FreeBSD::buildAssembler() const {
   return new tools::freebsd::Assembler(*this);
 }

 Tool *FreeBSD::buildLinker() const { return new tools::freebsd::Linker(*this); }

 llvm::ExceptionHandling FreeBSD::GetExceptionModel(const ArgList &Args) const {
   // FreeBSD uses SjLj exceptions on ARM oabi.
...
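
The FreeBSD.cpp change is a forwarding override: the toolchain hands the include-flag work to its CUDA installation object. The following is only a minimal, self-contained sketch of that pattern, not the real clang code. `CudaDetector`, `ToolChainBase`, `FreeBSDLike`, and `buildCC1Args` are hypothetical stand-ins, the `DriverArgs` parameter is dropped for brevity, and the flags pushed are assumed from the `-v` output earlier in the thread.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

using ArgStringList = std::vector<std::string>;

// Hypothetical stand-in for the driver's CUDA installation detector.
struct CudaDetector {
  std::string InstallPath;  // empty means "no CUDA installation found"

  // Appends the include-related cc1 flags seen in the -v output
  // (assumption: the real detector does more, e.g. validity checks).
  void AddCudaIncludeArgs(ArgStringList &CC1Args) const {
    if (InstallPath.empty())
      return;
    CC1Args.push_back("-internal-isystem");
    CC1Args.push_back(InstallPath + "/include");
    CC1Args.push_back("-include");
    CC1Args.push_back("__clang_cuda_runtime_wrapper.h");
  }
};

// Hypothetical base class: by default a toolchain adds no CUDA flags.
struct ToolChainBase {
  virtual ~ToolChainBase() = default;
  virtual void AddCudaIncludeArgs(ArgStringList &) const {}
};

// Hypothetical FreeBSD-like toolchain: overrides the hook and forwards it.
struct FreeBSDLike : ToolChainBase {
  CudaDetector CudaInstallation;

  explicit FreeBSDLike(std::string CudaPath)
      : CudaInstallation{std::move(CudaPath)} {}

  void AddCudaIncludeArgs(ArgStringList &CC1Args) const override {
    CudaInstallation.AddCudaIncludeArgs(CC1Args);
  }
};

// The driver only sees the base interface; virtual dispatch picks the override.
ArgStringList buildCC1Args(const ToolChainBase &TC) {
  ArgStringList Args;
  TC.AddCudaIncludeArgs(Args);
  assert(Args.size() % 2 == 0 && "flags are pushed as flag/value pairs");
  return Args;
}
```

In this sketch, `buildCC1Args(FreeBSDLike("/opt/cuda"))` yields the two flag/value pairs, while a plain `ToolChainBase` yields an empty list. The second case models a toolchain without the override, where no CUDA include flags are populated.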

clang/test/Driver/cuda-options-freebsd.cu

This file was added.

// Tests CUDA compilation pipeline construction in Driver.
// REQUIRES: clang-driver
// REQUIRES: x86-registered-target
// REQUIRES: nvptx-registered-target

// Simple compilation case. Compile device-side to PTX assembly and make sure
// we use it on the host side.
// RUN: %clang -### -target x86_64-unknown-freebsd -c %s 2>&1 \
// RUN:   | FileCheck -check-prefix DEVICE -check-prefix DEVICE-NOSAVE \
// RUN:   -check-prefix HOST -check-prefix INCLUDES-DEVICE \
// RUN:   -check-prefix NOLINK %s

// Typical compilation + link case.
// RUN: %clang -### -target x86_64-unknown-freebsd %s 2>&1 \
// RUN:   | FileCheck -check-prefix DEVICE -check-prefix DEVICE-NOSAVE \
// RUN:   -check-prefix HOST -check-prefix INCLUDES-DEVICE \
// RUN:   -check-prefix LINK %s

// Verify that --cuda-host-only disables device-side compilation, but doesn't
// disable host-side compilation/linking.
// RUN: %clang -### -target x86_64-unknown-freebsd --cuda-host-only %s 2>&1 \
// RUN:   | FileCheck -check-prefix NODEVICE -check-prefix HOST \
// RUN:   -check-prefix NOINCLUDES-DEVICE -check-prefix LINK %s

// Verify that --cuda-device-only disables host-side compilation and linking.
// RUN: %clang -### -target x86_64-unknown-freebsd --cuda-device-only %s 2>&1 \
// RUN:   | FileCheck -check-prefix DEVICE -check-prefix DEVICE-NOSAVE \
// RUN:   -check-prefix NOHOST -check-prefix NOLINK %s

// Check that the last of --cuda-compile-host-device, --cuda-host-only, and
// --cuda-device-only wins.

// RUN: %clang -### -target x86_64-unknown-freebsd --cuda-device-only \
// RUN:   --cuda-host-only %s 2>&1 \
// RUN:   | FileCheck -check-prefix NODEVICE -check-prefix HOST \
// RUN:   -check-prefix NOINCLUDES-DEVICE -check-prefix LINK %s

// RUN: %clang -### -target x86_64-unknown-freebsd --cuda-compile-host-device \
// RUN:   --cuda-host-only %s 2>&1 \
// RUN:   | FileCheck -check-prefix NODEVICE -check-prefix HOST \
// RUN:   -check-prefix NOINCLUDES-DEVICE -check-prefix LINK %s

// RUN: %clang -### -target x86_64-unknown-freebsd --cuda-host-only \
// RUN:   --cuda-device-only %s 2>&1 \
// RUN:   | FileCheck -check-prefix DEVICE -check-prefix DEVICE-NOSAVE \
// RUN:   -check-prefix NOHOST -check-prefix NOLINK %s

// RUN: %clang -### -target x86_64-unknown-freebsd --cuda-compile-host-device \
// RUN:   --cuda-device-only %s 2>&1 \
// RUN:   | FileCheck -check-prefix DEVICE -check-prefix DEVICE-NOSAVE \
// RUN:   -check-prefix NOHOST -check-prefix NOLINK %s

// RUN: %clang -### -target x86_64-unknown-freebsd --cuda-host-only \
// RUN:   --cuda-compile-host-device %s 2>&1 \
// RUN:   | FileCheck -check-prefix DEVICE -check-prefix DEVICE-NOSAVE \
// RUN:   -check-prefix HOST -check-prefix INCLUDES-DEVICE \
// RUN:   -check-prefix LINK %s

// RUN: %clang -### -target x86_64-unknown-freebsd --cuda-device-only \
// RUN:   --cuda-compile-host-device %s 2>&1 \
// RUN:   | FileCheck -check-prefix DEVICE -check-prefix DEVICE-NOSAVE \
// RUN:   -check-prefix HOST -check-prefix INCLUDES-DEVICE \
// RUN:   -check-prefix LINK %s

// Verify that --cuda-gpu-arch option passes the correct GPU architecture to
// device compilation.
// RUN: %clang -### -target x86_64-unknown-freebsd --cuda-gpu-arch=sm_30 -c %s 2>&1 \
// RUN:   | FileCheck -check-prefix DEVICE -check-prefix DEVICE-NOSAVE \
// RUN:   -check-prefix DEVICE-SM30 -check-prefix HOST \
// RUN:   -check-prefix INCLUDES-DEVICE -check-prefix NOLINK %s

// Verify that there is one device-side compilation per --cuda-gpu-arch args
// and that all results are included on the host side.
// RUN: %clang -### -target x86_64-unknown-freebsd \
// RUN:   --cuda-gpu-arch=sm_35 --cuda-gpu-arch=sm_30 -c %s 2>&1 \
// RUN:   | FileCheck -check-prefixes DEVICE,DEVICE-NOSAVE,DEVICE2 \
// RUN:   -check-prefixes DEVICE-SM30,DEVICE2-SM35 \
// RUN:   -check-prefixes INCLUDES-DEVICE,INCLUDES-DEVICE2 \
// RUN:   -check-prefixes HOST,HOST-NOSAVE,NOLINK %s

// Verify that device-side results are passed to the correct tool when
// -save-temps is used.
// RUN: %clang -### -target x86_64-unknown-freebsd -save-temps -c %s 2>&1 \
// RUN:   | FileCheck -check-prefix DEVICE -check-prefix DEVICE-SAVE \
// RUN:   -check-prefix HOST -check-prefix HOST-SAVE -check-prefix NOLINK %s

// Verify that device-side results are passed to the correct tool when
// -fno-integrated-as is used.
// RUN: %clang -### -target x86_64-unknown-freebsd -fno-integrated-as -c %s 2>&1 \
// RUN:   | FileCheck -check-prefix DEVICE -check-prefix DEVICE-NOSAVE \
// RUN:   -check-prefix HOST -check-prefix HOST-NOSAVE \
// RUN:   -check-prefix HOST-AS -check-prefix NOLINK %s

// Verify that --[no-]cuda-gpu-arch arguments are handled correctly.
// a) --no-cuda-gpu-arch=X negates preceding --cuda-gpu-arch=X
// RUN: %clang -### -target x86_64-unknown-freebsd --cuda-device-only \
// RUN:   --cuda-gpu-arch=sm_35 --cuda-gpu-arch=sm_30 \
// RUN:   --no-cuda-gpu-arch=sm_35 \
// RUN:   -c %s 2>&1 \
// RUN:   | FileCheck -check-prefixes NOARCH-SM20,ARCH-SM30,NOARCH-SM35 %s

// b) --no-cuda-gpu-arch=X negates more than one preceding --cuda-gpu-arch=X
// RUN: %clang -### -target x86_64-unknown-freebsd --cuda-device-only \
// RUN:   --cuda-gpu-arch=sm_35 --cuda-gpu-arch=sm_35 --cuda-gpu-arch=sm_30 \
// RUN:   --no-cuda-gpu-arch=sm_35 \
// RUN:   -c %s 2>&1 \
// RUN:   | FileCheck -check-prefixes NOARCH-SM20,ARCH-SM30,NOARCH-SM35 %s

// c) if --no-cuda-gpu-arch=X negates all preceding --cuda-gpu-arch=X
// we default to sm_20 -- same as if no --cuda-gpu-arch were passed.
// RUN: %clang -### -target x86_64-unknown-freebsd --cuda-device-only \
// RUN:   --cuda-gpu-arch=sm_35 --cuda-gpu-arch=sm_30 \
// RUN:   --no-cuda-gpu-arch=sm_35 --no-cuda-gpu-arch=sm_30 \
// RUN:   -c %s 2>&1 \
// RUN:   | FileCheck -check-prefixes ARCH-SM20,NOARCH-SM30,NOARCH-SM35 %s

// d) --no-cuda-gpu-arch=X is a no-op if there's no preceding --cuda-gpu-arch=X
// RUN: %clang -### -target x86_64-unknown-freebsd --cuda-device-only \
// RUN:   --cuda-gpu-arch=sm_35 --cuda-gpu-arch=sm_30\
// RUN:   --no-cuda-gpu-arch=sm_50 \
// RUN:   -c %s 2>&1 \
// RUN:   | FileCheck -check-prefixes NOARCH-SM20,ARCH-SM30,ARCH-SM35 %s

// e) --no-cuda-gpu-arch=X does not affect following --cuda-gpu-arch=X
// RUN: %clang -### -target x86_64-unknown-freebsd --cuda-device-only \
// RUN:   --no-cuda-gpu-arch=sm_35 --no-cuda-gpu-arch=sm_30 \
// RUN:   --cuda-gpu-arch=sm_35 --cuda-gpu-arch=sm_30 \
// RUN:   -c %s 2>&1 \
// RUN:   | FileCheck -check-prefixes NOARCH-SM20,ARCH-SM30,ARCH-SM35 %s

// f) --no-cuda-gpu-arch=all negates all preceding --cuda-gpu-arch=X
// RUN: %clang -### -target x86_64-unknown-freebsd --cuda-device-only \
// RUN:   --cuda-gpu-arch=sm_20 --cuda-gpu-arch=sm_30 \
// RUN:   --no-cuda-gpu-arch=all \
// RUN:   --cuda-gpu-arch=sm_35 \
// RUN:   -c %s 2>&1 \
// RUN:   | FileCheck -check-prefixes NOARCH-SM20,NOARCH-SM30,ARCH-SM35 %s

// g) There's no --cuda-gpu-arch=all
// RUN: %clang -### -target x86_64-unknown-freebsd --cuda-device-only \
// RUN:   --cuda-gpu-arch=all \
// RUN:   -c %s 2>&1 \
// RUN:   | FileCheck -check-prefix ARCHALLERROR %s


// Verify that --[no-]cuda-include-ptx arguments are handled correctly.
// a) by default we're including PTX for all GPUs.
// RUN: %clang -### -target x86_64-unknown-freebsd \
// RUN:   --cuda-gpu-arch=sm_35 --cuda-gpu-arch=sm_30 \
// RUN:   -c %s 2>&1 \
// RUN:   | FileCheck -check-prefixes FATBIN-COMMON,PTX-SM35,PTX-SM30 %s

// b) --no-cuda-include-ptx=all disables PTX inclusion for all GPUs
// RUN: %clang -### -target x86_64-unknown-freebsd \
// RUN:   --cuda-gpu-arch=sm_35 --cuda-gpu-arch=sm_30 \
// RUN:   --no-cuda-include-ptx=all \
// RUN:   -c %s 2>&1 \
// RUN:   | FileCheck -check-prefixes FATBIN-COMMON,NOPTX-SM35,NOPTX-SM30 %s

// c) --no-cuda-include-ptx=sm_XX disables PTX inclusion for that GPU only.
// RUN: %clang -### -target x86_64-unknown-freebsd \
// RUN:   --cuda-gpu-arch=sm_35 --cuda-gpu-arch=sm_30 \
// RUN:   --no-cuda-include-ptx=sm_35 \
// RUN:   -c %s 2>&1 \
// RUN:   | FileCheck -check-prefixes FATBIN-COMMON,NOPTX-SM35,PTX-SM30 %s
// RUN: %clang -### -target x86_64-unknown-freebsd \
// RUN:   --cuda-gpu-arch=sm_35 --cuda-gpu-arch=sm_30 \
// RUN:   --no-cuda-include-ptx=sm_30 \
// RUN:   -c %s 2>&1 \
// RUN:   | FileCheck -check-prefixes FATBIN-COMMON,PTX-SM35,NOPTX-SM30 %s

// d) --cuda-include-ptx=all overrides preceding --no-cuda-include-ptx=all
// RUN: %clang -### -target x86_64-unknown-freebsd \
// RUN:   --cuda-gpu-arch=sm_35 --cuda-gpu-arch=sm_30 \
// RUN:   --no-cuda-include-ptx=all --cuda-include-ptx=all \
// RUN:   -c %s 2>&1 \
// RUN:   | FileCheck -check-prefixes FATBIN-COMMON,PTX-SM35,PTX-SM30 %s

// e) --cuda-include-ptx=all overrides preceding --no-cuda-include-ptx=sm_XX
// RUN: %clang -### -target x86_64-unknown-freebsd \
// RUN:   --cuda-gpu-arch=sm_35 --cuda-gpu-arch=sm_30 \
// RUN:   --no-cuda-include-ptx=sm_30 --cuda-include-ptx=all \
// RUN:   -c %s 2>&1 \
// RUN:   | FileCheck -check-prefixes FATBIN-COMMON,PTX-SM35,PTX-SM30 %s


// ARCH-SM20: "-cc1"{{.*}}"-target-cpu" "sm_20"
// NOARCH-SM20-NOT: "-cc1"{{.*}}"-target-cpu" "sm_20"
// ARCH-SM30: "-cc1"{{.*}}"-target-cpu" "sm_30"
// NOARCH-SM30-NOT: "-cc1"{{.*}}"-target-cpu" "sm_30"
// ARCH-SM35: "-cc1"{{.*}}"-target-cpu" "sm_35"
// NOARCH-SM35-NOT: "-cc1"{{.*}}"-target-cpu" "sm_35"
// ARCHALLERROR: error: Unsupported CUDA gpu architecture: all

// Match device-side preprocessor and compiler phases with -save-temps.
// DEVICE-SAVE: "-cc1" "-triple" "nvptx64-nvidia-cuda"
// DEVICE-SAVE-SAME: "-aux-triple" "x86_64-unknown-freebsd"
// DEVICE-SAVE-SAME: "-fcuda-is-device"
// DEVICE-SAVE-SAME: "-x" "cuda"

// DEVICE-SAVE: "-cc1" "-triple" "nvptx64-nvidia-cuda"
// DEVICE-SAVE-SAME: "-aux-triple" "x86_64-unknown-freebsd"
// DEVICE-SAVE-SAME: "-fcuda-is-device"
// DEVICE-SAVE-SAME: "-x" "cuda-cpp-output"

// Match the job that produces PTX assembly.
// DEVICE: "-cc1" "-triple" "nvptx64-nvidia-cuda"
// DEVICE-NOSAVE-SAME: "-aux-triple" "x86_64-unknown-freebsd"
// DEVICE-SAME: "-fcuda-is-device"
// DEVICE-SM30-SAME: "-target-cpu" "sm_30"
// DEVICE-SAME: "-o" "[[PTXFILE:[^"]*]]"
// DEVICE-NOSAVE-SAME: "-x" "cuda"
// DEVICE-SAVE-SAME: "-x" "ir"

// Match the call to ptxas (which assembles PTX to SASS).
// DEVICE:ptxas
// DEVICE-SM30-DAG: "--gpu-name" "sm_30"
// DEVICE-DAG: "--output-file" "[[CUBINFILE:[^"]*]]"
// DEVICE-DAG: "[[PTXFILE]]"

// Match another device-side compilation.
// DEVICE2: "-cc1" "-triple" "nvptx64-nvidia-cuda"
// DEVICE2-SAME: "-aux-triple" "x86_64-unknown-freebsd"
// DEVICE2-SAME: "-fcuda-is-device"
// DEVICE2-SM35-SAME: "-target-cpu" "sm_35"
// DEVICE2-SAME: "-o" "[[PTXFILE2:[^"]*]]"
// DEVICE2-SAME: "-x" "cuda"

// Match another call to ptxas.
// DEVICE2: ptxas
// DEVICE2-SM35-DAG: "--gpu-name" "sm_35"
// DEVICE2-DAG: "--output-file" "[[CUBINFILE2:[^"]*]]"
// DEVICE2-DAG: "[[PTXFILE2]]"

// Match no device-side compilation.
// NODEVICE-NOT: "-cc1" "-triple" "nvptx64-nvidia-cuda"
// NODEVICE-NOT: "-fcuda-is-device"

// INCLUDES-DEVICE:fatbinary
// INCLUDES-DEVICE-DAG: "--create" "[[FATBINARY:[^"]*]]"
// INCLUDES-DEVICE-DAG: "--image=profile=sm_{{[0-9]+}},file=[[CUBINFILE]]"
// INCLUDES-DEVICE-DAG: "--image=profile=compute_{{[0-9]+}},file=[[PTXFILE]]"
// INCLUDES-DEVICE2-DAG: "--image=profile=sm_{{[0-9]+}},file=[[CUBINFILE2]]"
// INCLUDES-DEVICE2-DAG: "--image=profile=compute_{{[0-9]+}},file=[[PTXFILE2]]"

// Match host-side preprocessor job with -save-temps.
// HOST-SAVE: "-cc1" "-triple" "x86_64-unknown-freebsd"
// HOST-SAVE-SAME: "-aux-triple" "nvptx64-nvidia-cuda"
// HOST-SAVE-NOT: "-fcuda-is-device"
// HOST-SAVE-SAME: "-x" "cuda"

// Match host-side compilation.
// HOST: "-cc1" "-triple" "x86_64-unknown-freebsd"
// HOST-SAME: "-aux-triple" "nvptx64-nvidia-cuda"
// HOST-NOT: "-fcuda-is-device"
// There is only one GPU binary after combining it with fatbinary!
// INCLUDES-DEVICE2-NOT: "-fcuda-include-gpubinary"
// INCLUDES-DEVICE-SAME: "-fcuda-include-gpubinary" "[[FATBINARY]]"
// There is only one GPU binary after combining it with fatbinary.
// INCLUDES-DEVICE2-NOT: "-fcuda-include-gpubinary"
// HOST-SAME: "-o" "[[HOSTOUTPUT:[^"]*]]"
// HOST-NOSAVE-SAME: "-x" "cuda"
// HOST-SAVE-SAME: "-x" "cuda-cpp-output"

// Match external assembler that uses compilation output.
// HOST-AS: "-o" "{{.*}}.o" "[[HOSTOUTPUT]]"

// Match no GPU code inclusion.
// NOINCLUDES-DEVICE-NOT: "-fcuda-include-gpubinary"

// Match no host compilation.
// NOHOST-NOT: "-cc1" "-triple"
// NOHOST-NOT: "-x" "cuda"

// Match linker.
// LINK: "{{.*}}{{ld|link}}{{(.exe)?}}"
// LINK-SAME: "[[HOSTOUTPUT]]"

// Match no linker.
// NOLINK-NOT: "{{.*}}{{ld|link}}{{(.exe)?}}"

// FATBIN-COMMON:fatbinary
// FATBIN-COMMON: "--create" "[[FATBINARY:[^"]*]]"
// FATBIN-COMMON: "--image=profile=sm_30,file=
// PTX-SM30: "--image=profile=compute_30,file=
// NOPTX-SM30-NOT: "--image=profile=compute_30,file=
// FATBIN-COMMON: "--image=profile=sm_35,file=
// PTX-SM35: "--image=profile=compute_35,file=
// NOPTX-SM35-NOT: "--image=profile=compute_35,file=
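
The `--[no-]cuda-gpu-arch` cases a) through g) in the test comments describe a small left-to-right resolution rule. The sketch below models only that rule as stated in the comments; it is not the driver's implementation. `ArchFlag` and `resolveGpuArchs` are hypothetical names, and keeping the result in a lexically ordered `std::set` is a simplifying assumption.

```cpp
#include <cassert>
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical model of one command-line occurrence.
struct ArchFlag {
  bool Negated;      // true for --no-cuda-gpu-arch=X
  std::string Arch;  // e.g. "sm_35"; "all" is accepted only when negated
};

// Processes flags in command-line order, following test cases a) to g).
std::set<std::string> resolveGpuArchs(const std::vector<ArchFlag> &Flags) {
  std::set<std::string> Archs;
  for (const ArchFlag &F : Flags) {
    if (!F.Negated) {
      // g) there is no --cuda-gpu-arch=all
      if (F.Arch == "all")
        throw std::invalid_argument("Unsupported CUDA gpu architecture: all");
      Archs.insert(F.Arch);  // duplicates collapse, which covers case b)
    } else if (F.Arch == "all") {
      Archs.clear();         // f) negates everything seen so far
    } else {
      Archs.erase(F.Arch);   // a), b); a no-op if X was never added, d)
    }
  }
  // c) nothing left: fall back to the default named in the test comments
  if (Archs.empty())
    Archs.insert("sm_20");
  assert(!Archs.empty());
  return Archs;
}
```

Because each flag only affects the set built so far, a later `--cuda-gpu-arch=X` is untouched by an earlier `--no-cuda-gpu-arch=X`, which is case e).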
