This patch adds the arguments necessary to allocate dynamic shared
memory, with the size taken from the LIBOMPTARGET_SHARED_MEMORY_SIZE
environment variable. The patch only allocates the memory. AMDGPU has a
limitation that shared memory can only be accessed from the kernel
directly, so this currently works only when optimizations inline the
accessor function.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
openmp/libomptarget/plugins/amdgpu/src/rtl.cpp | ||
---|---|---|
1189 | + KernelInfoEntry.group_segment_size; |
Looks about as expected, modulo Johannes' comment above. Could we do the getenv call once, instead of per-launch? Somewhere in the massive class constructor perhaps.
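A minimal sketch of the "read it once" suggestion, assuming a hypothetical `LaunchConfig` struct in place of the plugin's real device-info class (the actual class and member names in rtl.cpp are not shown in this review). The environment variable is parsed in the constructor and the cached value is reused on every launch:

```cpp
#include <cstdlib>
#include <exception>
#include <string>

// Hypothetical stand-in for the plugin's long-lived state object;
// the name and layout are assumptions, not taken from rtl.cpp.
struct LaunchConfig {
  // Dynamic shared memory size in bytes; 0 means "none requested".
  int DynamicSharedBytes = 0;

  LaunchConfig() {
    // Query the environment once, at construction, rather than per launch.
    if (const char *Env = std::getenv("LIBOMPTARGET_SHARED_MEMORY_SIZE")) {
      try {
        DynamicSharedBytes = std::stoi(Env);
      } catch (const std::exception &) {
        // Unparseable or out-of-range input: fall back to no dynamic memory.
        DynamicSharedBytes = 0;
      }
    }
  }
};
```

Later changes to the environment do not affect an already constructed `LaunchConfig`, which is the intended behaviour for a per-process setting.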
openmp/libomptarget/plugins/amdgpu/src/rtl.cpp | ||
---|---|---|
1189 | + KernelInfoEntry.group_segment_size; |
Yep, need the sum. Possibly worth checking the stoi result is positive and < 64k or so.
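The check suggested here could look roughly like the sketch below. The helper names `parseSharedMemorySize` and `totalGroupSegmentSize` are hypothetical, and the 64 KiB bound simply mirrors the "< 64k or so" in the comment; the real limit depends on the hardware.

```cpp
#include <cstdint>
#include <exception>
#include <string>

// Assumed upper bound, taken from the review comment rather than from
// any hardware query.
constexpr uint32_t MaxDynamicSharedBytes = 64 * 1024;

// Hypothetical helper: returns the requested size in bytes, or 0 if the
// text is missing, not a number, non-positive, or not below the bound.
uint32_t parseSharedMemorySize(const char *Text) {
  if (!Text)
    return 0;
  try {
    int Value = std::stoi(Text);
    if (Value <= 0 || static_cast<uint32_t>(Value) >= MaxDynamicSharedBytes)
      return 0;
    return static_cast<uint32_t>(Value);
  } catch (const std::exception &) {
    // std::stoi throws on non-numeric or out-of-range input.
    return 0;
  }
}

// The launch needs the sum of the kernel's static group segment size
// and the validated dynamic request.
uint32_t totalGroupSegmentSize(uint32_t StaticBytes, const char *EnvText) {
  return StaticBytes + parseSharedMemorySize(EnvText);
}
```

Treating invalid input as zero keeps the launch on the static size alone; reporting an error instead would be an equally reasonable choice.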