The libdevice bitcode provided by NVIDIA is linked with clang/LLVM-generated IR,
which uses the nvptx*-nvidia-cuda triple. We need to mark the two triples as compatible.
Repository: rG LLVM Github Monorepo
Event Timeline
FWIW, with OpenMP we now see 3 warnings:
[ 90%] Building CXX object ...
warning: linking module '/usr/local/cuda-11.0/nvvm/libdevice/libdevice.10.bc': Linking two modules of different data layouts: '/usr/local/cuda-11.0/nvvm/libdevice/libdevice.10.bc' is '' whereas '...' is 'e-i64:64-i128:128-v16:16-v32:32-n16:32:64' [-Wlinker-warnings]
warning: linking module '/usr/local/cuda-11.0/nvvm/libdevice/libdevice.10.bc': Linking two modules of different target triples: '/usr/local/cuda-11.0/nvvm/libdevice/libdevice.10.bc' is 'nvptx64-nvidia-gpulibs' whereas '...' is 'nvptx64-nvidia-cuda' [-Wlinker-warnings]
warning: linking module '/soft/llvm/main-20210824/lib/libomptarget-nvptx-sm_80.bc': Linking two modules of different target triples: '/soft/llvm/main-20210824/lib/libomptarget-nvptx-sm_80.bc' is 'nvptx64' whereas '...' is 'nvptx64-nvidia-cuda' [-Wlinker-warnings]
This patch should fix the middle one. The last one we can fix ourselves by always embedding canonical/full triples. The first one I am not sure about yet.
Long story short, I think this makes sense. LGTM.
I've checked libdevice in different CUDA versions and the results are rather inconsistent:
For triples, CUDA-10+ uses nvptx64-nvidia-gpulibs, while older versions use nvptx-unknown-unknown.
The data layout is absent until CUDA-11.1.
Instead of chasing all the variations, I wonder if we should just disable the linker warnings for libdevice somewhere in clang.
I was initially thinking of having a flag to do so. With this patch I thought we could make it work without one. If this patch doesn't work, I agree: linking libdevice should set a "don't look too closely" flag.
The warnings we see are most likely caused by D108603, just to have a connection to the discussion.
How is libomptarget-nvptx-sm_80.bc built? Perhaps it should be changed to use a more detailed triple. Perhaps match libdevice's nvptx64-nvidia-gpulibs or have a distinct openmp-specific one?
Do we have to deal with the libomptarget-nvptx-sm_80.bc shipped with the current nvptx64 triple or is it shipped with clang and we don't need to worry about the older versions in the wild?
We do not have to care about older versions and we can change the triple there. I'm not sure changing it to gpulibs is the right thing though.
My plan was to normalize nvptx64 to nvptx-nvidia-cuda in the frontend such that users can use any subset of that normalized triple w/o warning.
Unsure what we gain by having "gpulibs" and special handling there. That said, we could certainly do it.
I'm unsure, however, how that helps us with libdevice; libomptarget-nvptx is under our control, so it is certainly less of a problem.
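To make the "any subset of the normalized triple" idea concrete, here is a minimal standalone sketch. It assumes "subset" means the leading components of the full triple, and it uses plain std::string rather than llvm::Triple. The function names are hypothetical and not part of LLVM or of this patch.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Split a triple such as "nvptx64-nvidia-cuda" into its '-'-separated parts.
std::vector<std::string> splitTriple(const std::string &triple) {
  std::vector<std::string> parts;
  std::stringstream stream(triple);
  std::string part;
  while (std::getline(stream, part, '-'))
    parts.push_back(part);
  return parts;
}

// Hypothetical check: true when every component spelled out in `partial`
// equals the component at the same position in `full`.
bool isLeadingSubsetOf(const std::string &partial, const std::string &full) {
  const std::vector<std::string> p = splitTriple(partial);
  const std::vector<std::string> f = splitTriple(full);
  if (p.empty() || p.size() > f.size())
    return false;
  for (std::size_t i = 0; i < p.size(); ++i)
    if (p[i] != f[i])
      return false;
  return true;
}

// Usage example with the triples from the warnings quoted in this thread.
void exampleTripleChecks() {
  // The short triple embedded in libomptarget-nvptx-sm_80.bc would be accepted.
  assert(isLeadingSubsetOf("nvptx64", "nvptx64-nvidia-cuda"));
  // libdevice's triple differs in the OS component, so it would not be.
  assert(!isLeadingSubsetOf("nvptx64-nvidia-gpulibs", "nvptx64-nvidia-cuda"));
}
```

Under this assumption the short nvptx64 triple passes while libdevice's gpulibs triple still fails, which matches the point that libdevice needs separate handling.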
My plan was to normalize nvptx64 to nvptx-nvidia-cuda in the frontend such that users can use any subset of that normalized triple w/o warning.
SGTM. So, for now I'll just deal with the libdevice-related warnings.
Yes, we need to somehow make libdevice work for all versions, including those with mismatching or missing target information.
Suppress warnings about triple and DataLayout mismatches related to CUDA's
libdevice. Tested on CUDA versions 8.0 through 11.3.
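The suppression described above can be sketched roughly as follows. This is a standalone illustration, not the code from the patch. ModuleInfo, LinkWarnings, looksLikeLibdevice, and computeWarnings are hypothetical names. The file-name test and the rule that only an empty data layout is excused are assumptions based on the triples and layouts reported in this thread.

```cpp
#include <cassert>
#include <string>

// Stand-in for the module properties compared at link time.
// The field names are illustrative and are not LLVM's API.
struct ModuleInfo {
  std::string fileName;   // e.g. "libdevice.10.bc"
  std::string triple;     // e.g. "nvptx64-nvidia-gpulibs"
  std::string dataLayout; // empty for libdevice before CUDA-11.1
};

struct LinkWarnings {
  bool tripleMismatch;
  bool dataLayoutMismatch;
};

bool startsWith(const std::string &text, const std::string &prefix) {
  return text.compare(0, prefix.size(), prefix) == 0;
}

// Assumption: treat a module as libdevice only when both its file name and
// its triple match what the CUDA releases discussed in this thread ship.
bool looksLikeLibdevice(const ModuleInfo &src) {
  const bool nameMatches = startsWith(src.fileName, "libdevice");
  const bool tripleMatches =
      src.triple == "nvptx64-nvidia-gpulibs" || // CUDA-10+
      src.triple == "nvptx-unknown-unknown";    // older CUDA
  return nameMatches && tripleMatches;
}

LinkWarnings computeWarnings(const ModuleInfo &src, const ModuleInfo &dst) {
  const bool dstIsNVPTX = startsWith(dst.triple, "nvptx");
  const bool libdeviceQuirk = dstIsNVPTX && looksLikeLibdevice(src);

  LinkWarnings warnings;
  warnings.tripleMismatch = src.triple != dst.triple && !libdeviceQuirk;
  // Assumption: only a missing layout is excused; a non-empty layout that
  // differs from the destination's still warns.
  warnings.dataLayoutMismatch =
      src.dataLayout != dst.dataLayout &&
      !(libdeviceQuirk && src.dataLayout.empty());
  return warnings;
}
```

Keeping both decisions behind a single libdeviceQuirk flag keeps the special case narrow: a module such as libomptarget-nvptx-sm_80.bc with the short nvptx64 triple would still produce the triple warning in this sketch.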
Inline comment on llvm/lib/Support/Triple.cpp, line 1642:
I think this could be incorporated into the SuppressDLWarning calculation above. This way we'll keep this libdevice quirk handling to one place. I'll update the patch tomorrow.
OK, this patch no longer produces warnings about Triple and DL when we're linking in libdevice from CUDA-8.0 through 11.4.
I've tried to make it as specific as I could.
We can actually test this, right?
The commit subject and message need to be updated too.