GPUs use several distinct address spaces. We generally work with
global memory, thread-private memory, and thread-shared memory. This
patch adds a few preliminary wrappers that map these concepts to
the numerical address space values the backend uses. Casts between
address spaces are not verified, so the user is responsible for checking them.
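To make the idea concrete, here is a minimal sketch of what such wrappers could look like. It is not the code from the patch: the namespace, constant names, and alias names are hypothetical, and the numbers (1 for global, 3 for shared/local, 5 for private) are taken from the AMDGPU backend's documented address space numbering. The `address_space` attribute is Clang-specific, so the sketch falls back to plain types on other compilers.

```cpp
#include <type_traits>

namespace as_numbered {

// Address space numbers as documented for the AMDGPU backend.
// All identifiers in this namespace are hypothetical names for this sketch.
constexpr unsigned GLOBAL_AS = 1;  // global memory
constexpr unsigned LOCAL_AS = 3;   // memory shared by the threads of a group
constexpr unsigned PRIVATE_AS = 5; // per-thread private memory

#if defined(__clang__)
// Clang attaches the numbered address space to the type as a qualifier.
template <typename T>
using Global = __attribute__((address_space(GLOBAL_AS))) T;
template <typename T>
using Local = __attribute__((address_space(LOCAL_AS))) T;
template <typename T>
using Private = __attribute__((address_space(PRIVATE_AS))) T;

// Types in different address spaces are distinct, so mixing them up
// without an explicit cast is a compile-time error.
static_assert(!std::is_same<Global<int>, Local<int>>::value,
              "global and local should be distinct types");
#else
// Assumption: other compilers lack the attribute, so the aliases
// degrade to plain T and provide no checking at all.
template <typename T> using Global = T;
template <typename T> using Local = T;
template <typename T> using Private = T;
#endif

} // namespace as_numbered
```

A declaration such as `as_numbered::Global<int> *ptr;` then names a pointer into global memory. Nothing in this sketch validates an explicit cast from one address space to another, which is the caveat the summary raises.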
Repository: rG LLVM Github Monorepo
libc/src/__support/GPU/amdgpu/utils.h | ||
---|---|---|
25–28 | I don't really condone using the numbered address spaces to access these; they don't quite work the same way. |
25–28 | They map the LLVM address spaces directly to the numbers documented in the backend. Since this is regular C++, we don't have any other option. |
25–28 | Doesn't declaring them with the OpenCL language address spaces (LangAS) work? I think those have attributes. |
25–28 | That doesn't seem to work for NVPTX: https://godbolt.org/z/7515T83nE |
25–28 | This looks correct to me when you account for the stack-is-flat hack PTX uses |
25–28 | Ah, you're right, it seems it understands constant and friends. |
It's not great, but it's more sound than the numbers. I think the other builtin OpenMP and HIP headers should do the same when interfacing with OCML.
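For comparison, the attribute-based alternative discussed above could be sketched as follows. This assumes Clang's `[[clang::opencl_global]]`, `[[clang::opencl_local]]`, `[[clang::opencl_constant]]`, and `[[clang::opencl_private]]` attribute spellings; the namespace and alias names are hypothetical, and the non-Clang fallback exists only so the sketch compiles elsewhere.

```cpp
#include <type_traits>

namespace as_opencl {

#if defined(__clang__)
// The compiler picks the target's address space for each OpenCL
// language address space, so no backend-specific numbers appear here.
template <typename T> using Global = [[clang::opencl_global]] T;
template <typename T> using Local = [[clang::opencl_local]] T;
template <typename T> using Constant = [[clang::opencl_constant]] T;
template <typename T> using Private = [[clang::opencl_private]] T;

// As with numbered address spaces, the qualified types stay distinct.
static_assert(!std::is_same<Global<int>, Local<int>>::value,
              "global and local should be distinct types");
#else
// Assumption: without Clang the attributes are unavailable, so the
// aliases degrade to plain T.
template <typename T> using Global = T;
template <typename T> using Local = T;
template <typename T> using Constant = T;
template <typename T> using Private = T;
#endif

} // namespace as_opencl
```

The design difference is that the same source can serve both AMDGPU and NVPTX, since the mapping to target address spaces is left to the compiler rather than hard-coded per backend.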