CUDA-11 headers rely on these NVCC builtins.
Despite having the __nv prefix, they are *not* provided by libdevice.
Details
- Reviewers: jdoerfert
- Commits: rGf526ee5b8517: [CUDA] Provide address space conversion builtins.
Diff Detail
- Repository: rG LLVM Github Monorepo
Event Timeline
Tested by verifying that the generated code (both PTX and SASS) matches that of NVCC: https://godbolt.org/z/rWYYx63bT
Not loving the magic constants here, but I don't think we have an enum or similar right now.
I also have to question the people who chose size_t here... we will end up with inttoptr(ptrtoint(...)) IR everywhere if this is actually used (outside the asm uses in CUDA).
Anyway, LGTM.
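For illustration, the "magic constants" are presumably the numeric NVPTX address-space values. A minimal sketch of what a named enum could look like follows; the names `NVPTXAddrSpace`, `as_number`, and `as_name` are hypothetical and not part of clang or of this patch. The numeric values follow the NVPTX address-space numbering documented for the LLVM NVPTX backend.

```cpp
#include <cassert>

// Hypothetical enum (not in clang or this patch): names for the numeric
// NVPTX address spaces that would otherwise appear as bare constants.
enum class NVPTXAddrSpace : unsigned {
  Generic = 0,
  Global = 1,
  Shared = 3,
  Constant = 4,
  Local = 5,
};

// Convert the enum back to the raw number an address_space attribute expects.
constexpr unsigned as_number(NVPTXAddrSpace as) {
  return static_cast<unsigned>(as);
}

// Human-readable name, e.g. for diagnostics.
inline const char *as_name(NVPTXAddrSpace as) {
  switch (as) {
  case NVPTXAddrSpace::Generic:  return "generic";
  case NVPTXAddrSpace::Global:   return "global";
  case NVPTXAddrSpace::Shared:   return "shared";
  case NVPTXAddrSpace::Constant: return "constant";
  case NVPTXAddrSpace::Local:    return "local";
  }
  assert(false && "unknown address space");
  return "";
}

static_assert(as_number(NVPTXAddrSpace::Shared) == 3, "shared is AS 3");
```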
Yup.
I also have to question the people who chose size_t here... we will end up with inttoptr(ptrtoint(...)) IR everywhere if this is actually used (outside the asm uses in CUDA).
I guess size_t was 'good enough' to accommodate all pointer sizes (though it should've been uintptr_t).
I think this chain of conversions gets quickly instcombined away even at -O1:
E.g.: https://godbolt.org/z/4vd94cEsj
Well, it does get translated into sensible PTX, so, while not ideal, it's not too big of a deal.
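The round trip under discussion can be sketched on the host as follows. This is only a stand-in: `ptr_to_handle`, `handle_to_ptr`, and `load_through_handle` are hypothetical names, and plain host C++ has no address spaces, so the sketch shows only the pointer-to-integer-to-pointer shape that the optimizer is expected to fold away, not the actual builtins.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: size_t is wide enough to hold a pointer on this platform
// (uintptr_t would express that intent more directly).
static_assert(sizeof(std::size_t) >= sizeof(void *),
              "size_t must be able to hold a pointer");

// Hypothetical stand-in for a "generic to <space>" conversion: pointer -> integer.
inline std::size_t ptr_to_handle(const void *p) {
  return reinterpret_cast<std::size_t>(p);
}

// Hypothetical stand-in for a "<space> to generic" conversion: integer -> pointer.
inline void *handle_to_ptr(std::size_t h) {
  return reinterpret_cast<void *>(h);
}

// A load through the round trip; the two casts cancel out, so an optimizing
// compiler can reduce this to a plain load from p.
inline int load_through_handle(const int *p) {
  assert(p != nullptr);
  return *static_cast<const int *>(handle_to_ptr(ptr_to_handle(p)));
}
```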
Using an integer is a sensible approach to prevent accidental load/store using a wrong address space.
An alternative would be to make the conversion functions return a pointer with a specific address-space (AS) attribute, but that's clang-specific and would not work for something that needs to plug into CUDA headers that were written for NVCC.
So, yeah. It could be better, but it's tolerable. At least we didn't have to resort to using inline asm. :-)
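To make the "integer prevents accidental load/store" point concrete, here is a small host-side sketch. The names `to_shared_handle`, `from_shared_handle`, and `read_int` are hypothetical stand-ins, not the real builtins; the sketch only shows that an integer result cannot be dereferenced until it is explicitly converted back to a pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical stand-in for a conversion that yields an address in a specific
// space. Returning an integer means `*handle` is a compile error.
inline std::size_t to_shared_handle(const void *p) {
  return reinterpret_cast<std::size_t>(p);
}

// The only way back to something dereferenceable is an explicit conversion.
inline void *from_shared_handle(std::size_t h) {
  return reinterpret_cast<void *>(h);
}

// The handle type is an integer, not a pointer, so it cannot be used
// directly in a load or store.
static_assert(!std::is_pointer<decltype(to_shared_handle(nullptr))>::value,
              "handle must not be a pointer type");

// Reading through a handle requires spelling out the conversion.
inline int read_int(std::size_t h) {
  assert(h != 0);
  return *static_cast<const int *>(from_shared_handle(h));
}
```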