This patch adds an external interface, `llvm_omp_get_dynamic_shared`,
for accessing the dynamic shared memory buffer in the device runtime.
It includes a host-side definition that simply returns a null pointer,
so the function can be called when host fallback is enabled without
crashing. Support for dynamic shared memory was also ported to the old
device runtime.
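A minimal sketch of how calling code might cope with the host-fallback behaviour described in the summary. The stub below is a stand-in written for illustration, not the definition from the patch: the signature (no arguments, returning `void *`) is an assumption, and `scratch_buffer` is a hypothetical helper.

```cpp
#include <cstddef>
#include <vector>

// Stand-in for the host-side definition: it only returns a null pointer.
// ASSUMPTION: the real signature may differ; this one is for illustration.
extern "C" void *llvm_omp_get_dynamic_shared() { return nullptr; }

// Hypothetical helper: use the dynamic shared memory buffer when the runtime
// provides one, otherwise fall back to ordinary host storage.
unsigned char *scratch_buffer(std::vector<unsigned char> &fallback,
                              std::size_t bytes) {
  void *dyn = llvm_omp_get_dynamic_shared();
  if (dyn != nullptr)
    return static_cast<unsigned char *>(dyn); // device path
  fallback.resize(bytes);                     // host-fallback path
  return fallback.data();
}
```

With this stub the helper always takes the host-fallback path, so the caller never dereferences the null pointer.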
Repository: rG LLVM Github Monorepo
Event Timeline
Exciting! Will take a close look early next week. Surprised that no change to the GPU plugins is needed.
That was introduced in D110006. For CUDA it's easy, since it's just an argument to the kernel launch function. I haven't implemented it for AMD yet.
Inline comment on openmp/libomptarget/deviceRTLs/common/src/data_sharing.cu, line 24:
^ static
The plumbing here is all uncontroversial; it's just a wrapper over the OpenMP pragma.
This won't work on amdgpu as-is. We will need to pass the environment variable through to the HSA packet, see what code clang emits for the allocator construct, and, if that doesn't match what HIP uses, add lowering in the back end. There's nothing there that can't be done; just need to find the time.
NVPTX just sees anything with the extern shared x[] pattern in the PTX and hooks up the pointer to dynamic shared memory. I'm not sure if AMD uses a similar method, but if they do I think all that would need to be done is to add the argument to the config struct used in the AMD plugin.