Different NVIDIA GPUs support different compute capabilities. To enable inlining of runtime functions and get the best performance on each generation of NVIDIA GPUs, a bitcode (bc) library must be compiled for each compute capability. The same compiler build can then be used with multiple generations of NVIDIA GPUs.
To differentiate between versions of the same bc lib, the output file name will contain the compute capability ID.
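As a rough illustration of the idea, the sketch below loops over a list of compute capabilities and derives one output name per capability. It is a minimal sketch, not the code from this patch: the comma-separated input format, the variable names `capabilities`, `sm_list`, and `bc_name`, and the `libomptarget-nvptx-sm_<ID>.bc` naming scheme are all assumptions made for the example. It can be run standalone with `cmake -P`.

```cmake
# Hypothetical sketch; run with: cmake -P sketch.cmake
# Assumption: capabilities are passed as a comma-separated string.
set(capabilities "35,60,70")

# Turn the comma-separated string into a CMake list (semicolon-separated).
string(REPLACE "," ";" sm_list "${capabilities}")

foreach(sm IN LISTS sm_list)
  # Assumed naming scheme: embed the compute capability ID in the file name
  # so that the bc libraries for different GPUs can coexist.
  set(bc_name "libomptarget-nvptx-sm_${sm}.bc")
  message(STATUS "Would build ${bc_name} with --cuda-gpu-arch=sm_${sm}")
endforeach()
```

Each iteration stands in for one compilation of the device runtime targeting a single architecture; a real build would replace the `message` call with the actual compile and link steps.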
Depends on D14254
Details
- Reviewers
  Hahnfeld, hfinkel, carlo.bertolli, caomhin, ABataev, grokos
- Commits
  rGd5ae4e65014f: [OpenMP][libomptarget] Enable the compilation of multiple bc libraries for…
  rL324904: [OpenMP][libomptarget] Enable the compilation of multiple bc libraries for…
  rOMP324904: [OpenMP][libomptarget] Enable the compilation of multiple bc libraries for…
Diff Detail
- Repository
  rOMP OpenMP
- Build Status
  Buildable 14859
  Build 14859: arc lint + arc unit
Event Timeline
Please rebase on top of the latest changes in D14254.
Do we want to rename LIBOMPTARGET_NVPTX_COMPUTE_CAPABILITY to LIBOMPTARGET_NVPTX_COMPUTE_CAPABILITIES (plural) to reflect that the user can pass in a list? In any case, please document in README.rst!
libomptarget/deviceRTLs/nvptx/CMakeLists.txt | ||
---|---|---|
203 | Actually I don't care which one we use. Depending on what we choose here I'll make clang agree with it :) |
Whichever name we decide on, please document in README.rst that the user can pass multiple values.
libomptarget/deviceRTLs/nvptx/CMakeLists.txt | ||
---|---|---|
63–64 | Maybe we can add some compatibility? Upstream has only had it for a few days, and others will need to change their scripts anyway: | |

```cmake
set(default_capabilities 35)
if (DEFINED LIBOMPTARGET_NVPTX_COMPUTE_CAPABILITY)
  set(default_capabilities ${LIBOMPTARGET_NVPTX_COMPUTE_CAPABILITY})
  libomptarget_warning_say("LIBOMPTARGET_NVPTX_COMPUTE_CAPABILITY is deprecated, please use LIBOMPTARGET_NVPTX_COMPUTE_CAPABILITIES")
endif()
set(LIBOMPTARGET_NVPTX_COMPUTE_CAPABILITIES ${default_capabilities} CACHE STRING
  "List of CUDA Compute Capabilities to be used to compile the NVPTX device RTL.")
```

libomptarget/deviceRTLs/nvptx/CMakeLists.txt | ||
---|---|---|
203 | I agree: The user will probably never see the name anyway. So let's just keep it compatible. |
README.rst | ||
---|---|---|
282–286 | Maybe add a comment here that the single-capability option is deprecated and the user should use the next option to define compute capabilities (even if they want to define only one)? |
README.rst | ||
---|---|---|
282–286 | I think we should just remove it: There was no (upstream) release with this variable and the only code checking for it is there for compatibility. |