libdevice in recent CUDA versions relies on __nvvm_reflect() to select
GPU-specific bitcode. This patch addresses the requirement.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
Comment Actions
Just to check: the notion is that it's OK if I report an SM version lower than that of the GPU I end up running on?
llvm/lib/Target/NVPTX/NVVMReflect.cpp, line 55 (On Diff #158866): explicit
Comment Actions
Yes. We may lose some performance, but not correctness, since we are expected to be forward-compatible.
Until now, the __nvvm_reflect() call was being replaced with 0, so we were picking the variant suitable for the oldest GPU.
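
To illustrate the point above, here is a minimal, self-contained sketch of the selection logic. It is not the code from this patch. The names `foldReflect`, `pickVariant`, and `Variant` are hypothetical, and the sketch assumes that the queried key is "__CUDA_ARCH" and that it folds to the SM version times 10 (sm_35 becomes 350), with any other key folding to 0. The variant thresholds are invented for illustration only.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the constant that an __nvvm_reflect(key) call
// is folded to at compile time.
// Assumption: "__CUDA_ARCH" folds to smVersion * 10; any other key folds to 0.
int foldReflect(const std::string &key, unsigned smVersion) {
  if (key == "__CUDA_ARCH")
    return static_cast<int>(smVersion * 10);
  return 0;
}

// Hypothetical code variants that a library such as libdevice might provide.
enum class Variant { Generic, Sm30Plus, Sm70Plus };

// Hypothetical selection: take the newest variant whose minimum architecture
// does not exceed the reported one. The thresholds are made up for this sketch.
Variant pickVariant(int cudaArch) {
  if (cudaArch >= 700)
    return Variant::Sm70Plus;
  if (cudaArch >= 300)
    return Variant::Sm30Plus;
  return Variant::Generic;
}
```

Under these assumptions, compiling for sm_35 and then running on an sm_70 GPU selects `Sm30Plus`: the code is still valid on the newer GPU, but the `Sm70Plus` variant is not used. Folding the call to 0, as before, always selects `Generic`.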