This patch extends the AMDGPU plugin for OpenMP target offloading from a single HSA queue per device to multiple queues per device (four in this patch). This allows multiple threads to submit kernel launches to the same GPU concurrently.
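To illustrate the idea, here is a minimal sketch of a per-device queue pool, not the code from this patch. It assumes round-robin queue selection, which the summary does not specify. `FakeQueue`, `QueuePool`, and `runConcurrentSubmissions` are hypothetical names, and a counter stands in for the actual HSA queue so the example runs without a GPU.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for an HSA queue. A real plugin would hold a queue
// handle here; this sketch only counts submissions.
struct FakeQueue {
  std::mutex Lock;           // serializes submissions to this one queue
  std::size_t Submitted = 0; // launches recorded on this queue
};

// Hypothetical per-device pool of queues (assumption: round-robin selection).
template <std::size_t NumQueues> class QueuePool {
  std::array<FakeQueue, NumQueues> Queues;
  std::atomic<std::size_t> Next{0};

public:
  // Pick the next queue in round-robin order and record one submission on it.
  // Threads only contend when they land on the same queue.
  std::size_t submit() {
    std::size_t Idx = Next.fetch_add(1, std::memory_order_relaxed) % NumQueues;
    std::lock_guard<std::mutex> Guard(Queues[Idx].Lock);
    ++Queues[Idx].Submitted;
    return Idx;
  }

  // Number of submissions recorded on queue Idx so far.
  std::size_t submittedOn(std::size_t Idx) {
    assert(Idx < NumQueues && "queue index out of range");
    std::lock_guard<std::mutex> Guard(Queues[Idx].Lock);
    return Queues[Idx].Submitted;
  }
};

// Start NumThreads threads that each submit LaunchesPerThread times, wait for
// them, and return the total number of submissions across all four queues.
inline std::size_t runConcurrentSubmissions(QueuePool<4> &Pool,
                                            unsigned NumThreads,
                                            unsigned LaunchesPerThread) {
  std::vector<std::thread> Threads;
  for (unsigned T = 0; T < NumThreads; ++T)
    Threads.emplace_back([&Pool, LaunchesPerThread] {
      for (unsigned I = 0; I < LaunchesPerThread; ++I)
        Pool.submit();
    });
  for (std::thread &Th : Threads)
    Th.join();

  std::size_t Total = 0;
  for (std::size_t Q = 0; Q < 4; ++Q)
    Total += Pool.submittedOn(Q);
  return Total;
}
```

With a single queue, every thread would serialize on one lock. With four queues, a thread blocks only when another thread is submitting to the same queue at that moment.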
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo