This allows clang to be used to compile CUDA programs on FreeBSD. Tested by
compiling a simple helloworld.cu with this.
Repository: rG LLVM Github Monorepo
Event Timeline
Adding a few people who might know a bit more about CUDA-specific things. Please take a look and see whether this review makes sense, thanks. :)
LGTM, though I'm curious if it's particularly useful. Last time I checked NVIDIA didn't ship libcudart for FreeBSD and without it it's rather cumbersome to use CUDA in practice.
You can compile a kernel, but kernel loading, launching, and related data transfers will all need to be done via driver API. It should be possible to implement a functional replacement, but I'm not aware of any existing open-source implementations. I'm also not sure if clang will be able to deal with CUDA headers correctly on FreeBSD as CUDA headers do sometimes seem to rely on implementation specifics of Linux headers.
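As a rough illustration of what "done via driver API" involves, here is a minimal sketch of loading and launching an argument-less kernel. It resolves the driver entry points with `dlopen()` so that it builds without any CUDA headers. The function names `launch_via_driver_api` and `lookup` are hypothetical, the type aliases are local stand-ins assumed to mirror `cuda.h`, and `cuCtxCreate_v2`/`cuCtxDestroy_v2` are the versioned symbols that `cuda.h` maps `cuCtxCreate`/`cuCtxDestroy` to. Data transfers (`cuMemAlloc`, `cuMemcpyHtoD` and friends) are left out.

```cpp
#include <dlfcn.h>
#include <cassert>
#include <cstdio>

// Local stand-ins for driver API types (assumed to match cuda.h).
using CUresult = int;  // CUDA_SUCCESS is 0
using CUdevice = int;
using CUcontext = void *;
using CUmodule = void *;
using CUfunction = void *;
using CUstream = void *;

// Look up one symbol and cast it to a pointer to the given function type.
template <typename Fn>
static Fn *lookup(void *lib, const char *name) {
  return reinterpret_cast<Fn *>(dlsym(lib, name));
}

// Hypothetical helper: loads the device image at image_path, launches
// kernel_name with one thread, and returns true only if every step succeeded.
bool launch_via_driver_api(const char *driver_lib, const char *image_path,
                           const char *kernel_name) {
  assert(driver_lib && image_path && kernel_name);
  void *lib = dlopen(driver_lib, RTLD_NOW);
  if (!lib) {
    std::fprintf(stderr, "cannot load %s: %s\n", driver_lib, dlerror());
    return false;
  }

  auto init = lookup<CUresult(unsigned)>(lib, "cuInit");
  auto device_get = lookup<CUresult(CUdevice *, int)>(lib, "cuDeviceGet");
  auto ctx_create =
      lookup<CUresult(CUcontext *, unsigned, CUdevice)>(lib, "cuCtxCreate_v2");
  auto module_load =
      lookup<CUresult(CUmodule *, const char *)>(lib, "cuModuleLoad");
  auto get_function = lookup<CUresult(CUfunction *, CUmodule, const char *)>(
      lib, "cuModuleGetFunction");
  auto launch =
      lookup<CUresult(CUfunction, unsigned, unsigned, unsigned, unsigned,
                      unsigned, unsigned, unsigned, CUstream, void **,
                      void **)>(lib, "cuLaunchKernel");
  auto ctx_sync = lookup<CUresult()>(lib, "cuCtxSynchronize");
  auto ctx_destroy = lookup<CUresult(CUcontext)>(lib, "cuCtxDestroy_v2");

  bool ok = false;
  if (init && device_get && ctx_create && module_load && get_function &&
      launch && ctx_sync && ctx_destroy) {
    CUdevice dev = 0;
    CUcontext ctx = nullptr;
    CUmodule mod = nullptr;
    CUfunction fn = nullptr;
    if (init(0) == 0 && device_get(&dev, 0) == 0 &&
        ctx_create(&ctx, 0, dev) == 0) {
      // Launch a 1x1x1 grid of 1x1x1 blocks: no shared memory, default
      // stream, no kernel arguments. Then wait for the kernel to finish.
      ok = module_load(&mod, image_path) == 0 &&
           get_function(&fn, mod, kernel_name) == 0 &&
           launch(fn, 1, 1, 1, 1, 1, 1, 0, nullptr, nullptr, nullptr) == 0 &&
           ctx_sync() == 0;
      ctx_destroy(ctx);  // also releases the module loaded into this context
    }
  }
  dlclose(lib);
  return ok;
}
```

A call might look like `launch_via_driver_api("libcuda.so.1", "hello.ptx", "_Z10cuda_hellov")`, where the library name and the PTX file name are placeholders. On a system where the driver library cannot be loaded, the function reports the `dlopen()` error and returns false.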
After extracting the necessary CUDA stuff and enabling Linux emulation (for ptxas), at least a "hello world" sample program compiles to an object file:
$ ~/obj/llvm/llvmorg-10-init-10100-g9a5b7b785bf-freebsd13-amd64-ninja-rel-1/bin/clang --cuda-path=/share/dim/src/freebsd/cuda/cuda-10.1 --cuda-gpu-arch=sm_60 -c hello.cu -v
clang version 10.0.0 (https://github.com/llvm/llvm-project.git 014799db369c8e30c222c0e9d3ea143f349c3db9)
Target: x86_64-unknown-freebsd13.0
Thread model: posix
InstalledDir: /home/dim/obj/llvm/llvmorg-10-init-10100-g9a5b7b785bf-freebsd13-amd64-ninja-rel-1/bin
Found CUDA installation: /share/dim/src/freebsd/cuda/cuda-10.1, version 10.1
"/home/dim/obj/llvm/llvmorg-10-init-10100-g9a5b7b785bf-freebsd13-amd64-ninja-rel-1/bin/clang" -cc1 -triple nvptx64-nvidia-cuda -aux-triple x86_64-unknown-freebsd13.0 -S -disable-free -main-file-name hello.cu -mrelocation-model static -mthread-model posix -mframe-pointer=all -fno-rounding-math -no-integrated-as -fuse-init-array -fcuda-is-device -mlink-builtin-bitcode /share/dim/src/freebsd/cuda/cuda-10.1/nvvm/libdevice/libdevice.10.bc -target-feature +ptx64 -target-sdk-version=10.1 -target-cpu sm_60 -dwarf-column-info -debugger-tuning=gdb -v -resource-dir /home/dim/obj/llvm/llvmorg-10-init-10100-g9a5b7b785bf-freebsd13-amd64-ninja-rel-1/lib/clang/10.0.0 -internal-isystem /home/dim/obj/llvm/llvmorg-10-init-10100-g9a5b7b785bf-freebsd13-amd64-ninja-rel-1/lib/clang/10.0.0/include/cuda_wrappers -internal-isystem /share/dim/src/freebsd/cuda/cuda-10.1/include -include __clang_cuda_runtime_wrapper.h -internal-isystem /usr/include/c++/v1 -internal-isystem /usr/include/c++/v1 -fdeprecated-macro -fno-dwarf-directory-asm -fno-autolink -fdebug-compilation-dir /tmp -ferror-limit 19 -fmessage-length 160 -fgnuc-version=4.2.1 -fobjc-runtime=gcc -fcxx-exceptions -fexceptions -fdiagnostics-show-option -fcolor-diagnostics -o /home/dim/tmp/hello-f032c8.s -x cuda hello.cu
clang -cc1 version 10.0.0 based upon LLVM 10.0.0git default target x86_64-unknown-freebsd13.0
ignoring duplicate directory "/usr/include/c++/v1"
#include "..." search starts here:
#include <...> search starts here:
/home/dim/obj/llvm/llvmorg-10-init-10100-g9a5b7b785bf-freebsd13-amd64-ninja-rel-1/lib/clang/10.0.0/include/cuda_wrappers
/share/dim/src/freebsd/cuda/cuda-10.1/include
/usr/include/c++/v1
/home/dim/obj/llvm/llvmorg-10-init-10100-g9a5b7b785bf-freebsd13-amd64-ninja-rel-1/lib/clang/10.0.0/include
/usr/include
End of search list.
"/share/dim/src/freebsd/cuda/cuda-10.1/bin/ptxas" -m64 -O0 -v --gpu-name sm_60 --output-file /home/dim/tmp/hello-54422a.o /home/dim/tmp/hello-f032c8.s
ptxas info : 23 bytes gmem
ptxas info : Compiling entry function '_Z10cuda_hellov' for 'sm_60'
ptxas info : Function properties for _Z10cuda_hellov
0 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info : Used 8 registers, 320 bytes cmem[0]
"/share/dim/src/freebsd/cuda/cuda-10.1/bin/fatbinary" -64 --create /home/dim/tmp/hello-9cd109.fatbin --image=profile=sm_60,file=/home/dim/tmp/hello-54422a.o --image=profile=compute_60,file=/home/dim/tmp/hello-f032c8.s
"/home/dim/obj/llvm/llvmorg-10-init-10100-g9a5b7b785bf-freebsd13-amd64-ninja-rel-1/bin/clang" -cc1 -triple x86_64-unknown-freebsd13.0 -target-sdk-version=10.1 -aux-triple nvptx64-nvidia-cuda -emit-obj -mrelax-all -disable-free -main-file-name hello.cu -mrelocation-model static -mthread-model posix -mframe-pointer=all -fno-rounding-math -masm-verbose -mconstructor-aliases -munwind-tables -fuse-init-array -target-cpu x86-64 -dwarf-column-info -debugger-tuning=gdb -v -resource-dir /home/dim/obj/llvm/llvmorg-10-init-10100-g9a5b7b785bf-freebsd13-amd64-ninja-rel-1/lib/clang/10.0.0 -internal-isystem /home/dim/obj/llvm/llvmorg-10-init-10100-g9a5b7b785bf-freebsd13-amd64-ninja-rel-1/lib/clang/10.0.0/include/cuda_wrappers -internal-isystem /share/dim/src/freebsd/cuda/cuda-10.1/include -include __clang_cuda_runtime_wrapper.h -internal-isystem /usr/include/c++/v1 -internal-isystem /usr/include/c++/v1 -fdeprecated-macro -fdebug-compilation-dir /tmp -ferror-limit 19 -fmessage-length 160 -fgnuc-version=4.2.1 -fobjc-runtime=gnustep -fcxx-exceptions -fexceptions -fdiagnostics-show-option -fcolor-diagnostics -fcuda-include-gpubinary /home/dim/tmp/hello-9cd109.fatbin -faddrsig -o hello.o -x cuda hello.cu
clang -cc1 version 10.0.0 based upon LLVM 10.0.0git default target x86_64-unknown-freebsd13.0
ignoring duplicate directory "/usr/include/c++/v1"
#include "..." search starts here:
#include <...> search starts here:
/home/dim/obj/llvm/llvmorg-10-init-10100-g9a5b7b785bf-freebsd13-amd64-ninja-rel-1/lib/clang/10.0.0/include/cuda_wrappers
/share/dim/src/freebsd/cuda/cuda-10.1/include
/usr/include/c++/v1
/home/dim/obj/llvm/llvmorg-10-init-10100-g9a5b7b785bf-freebsd13-amd64-ninja-rel-1/lib/clang/10.0.0/include
/usr/include
End of search list.
I can't link it into an executable yet, though. That's probably going to need some added link flags.
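For readers puzzling over the entry function name in the ptxas output: `_Z10cuda_hellov` is an Itanium C++ ABI mangled name. A small sketch for decoding such names with `abi::__cxa_demangle` follows; the helper name `demangle` is hypothetical and not part of the patch.

```cpp
#include <cxxabi.h>
#include <cassert>
#include <cstdlib>
#include <string>

// Demangle an Itanium C++ ABI symbol name.
// Returns the input unchanged if it cannot be demangled.
std::string demangle(const char *mangled) {
  assert(mangled != nullptr);
  int status = 0;
  // With a null output buffer, __cxa_demangle allocates the result itself.
  char *buf = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
  std::string result = (status == 0 && buf) ? buf : mangled;
  std::free(buf);  // the buffer comes from malloc(); free(nullptr) is a no-op
  return result;
}
```

`demangle("_Z10cuda_hellov")` yields `cuda_hello()`, so the sample's kernel is a function named `cuda_hello` that takes no arguments.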
> You can compile a kernel, but kernel loading, launching, and related data transfers will all need to be done via driver API. It should be possible to implement a functional replacement, but I'm not aware of any existing open-source implementations. I'm also not sure if clang will be able to deal with CUDA headers correctly on FreeBSD as CUDA headers do sometimes seem to rely on implementation specifics of Linux headers.
I think @6yearold is at least experimenting with this. One step at a time... :)
> ... I'm curious if it's particularly useful. Last time I checked NVIDIA didn't ship libcudart for FreeBSD and without it it's rather cumbersome to use CUDA in practice.
FYI, I've just made our internal proof-of-concept runtime support library available; it may let CUDA apps run on FreeBSD:
https://github.com/google/gpu-runtime
It's somewhat old and is missing a few glue functions needed by CUDA-10, but it should work well enough for CUDA-9.
Interesting, thanks for sharing. However, at a quick look, it seems to require other CUDA libraries (libcuda, libcublas, etc.), which also aren't available for FreeBSD.
cuBLAS is only for testing. The runtime itself does not need it.
libcuda.so is normally part of the GPU *driver*, not CUDA itself; at least it is on Linux. I didn't check if that's also the case on FreeBSD.
Looks like you're correct -- the driver archive only has NVIDIA-FreeBSD-x86_64-440.44/obj/linux/libcuda.so.440.44 in it.
:-(