Add logic to allow users to specify the CUDA path at configuration time.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
Speaking of which, it turns out we can use CMake's standard FindCUDA module to do everything except find the driver library. :) So I switched to doing things that way. It also allows you to control where it looks for CUDA by using the CUDA_TOOLKIT_ROOT_DIR variable.
streamexecutor/lib/platforms/cuda/cmake/modules/FindLIBCUDA.cmake | ||
---|---|---|
9 ↗ | (On Diff #71420) | Do you still want this? |
Does this also need the CUDA library to be added to streamexecutor-config output for --libs?
At the moment (with this patch), if I try and link against SE using streamexecutor-config, I get undefined references to all of the CUDA driver functions. If I manually add -framework cuda (OS X), it all works. The CUDA SAXPY example works for me, but I can see from the verbose ninja output that the example is building with -framework cuda already.
Thanks for bringing this up! I added logic to streamexecutor-config to add in the CUDA library for the --libs output if CUDA is enabled. It seems to work on Ubuntu. I'd definitely be glad to know if it works on OS X as well.
streamexecutor/lib/platforms/cuda/cmake/modules/FindLibcuda.cmake | ||
---|---|---|
1 ↗ | (On Diff #71420) | I think this was a phabricator bug. This whole file is gone from my standpoint (and I'm glad it's gone :)). |
Alas, this still doesn't work for OS X. It seems that CUDA_DRIVER_LIBRARY is set to the full path to the framework (e.g. /Library/Frameworks/cuda.framework) whereas the flag that needs to be passed is -framework cuda. CMake automagically sorts this out when passing the framework path to target_link_libraries, but I'm not sure there's a simple way of getting it to spit out the correct flag for streamexecutor-config in a platform independent manner.
This doesn't have to hold up this patch BTW, I just thought I'd mention it here since I noticed it.
What do you think of the idea of doing a check in Python to see if the library base name is *.framework and then emitting the flag -lframework * instead of the raw path, if so? It's hacky, but it seems like it might work pretty much all the time.
This doesn't have to hold up this patch BTW, I just thought I'd mention it here since I noticed it.
I can't tell you how much I appreciate the work you are doing checking StreamExecutor on different platforms.
Sounds good to me. It sounds hacky, but I've just looked at the CMake source code for this and that's almost exactly what they do :-)
I'll take a positive view of that and say that great minds hack alike. :)
The hack is in now. Please let me know if it works for you. Thanks!