Initializing many queues/streams/events can come at a cost even if we
never use them. This patch initializes them lazily; otherwise nothing
(major) is supposed to change. A minor difference is the handling of an
error in the initialization of an AMD queue (other than the first): we
now report an error but continue with the first queue. Unsure if this
will ever come up.
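
A minimal sketch of the behavior described above, not the plugin's actual code: only the first queue is set up eagerly, the others are initialized on first use, and a failed lazy initialization is reported before falling back to the first queue. `Queue`, `LazyQueuePool`, `acquire`, and the injected `InitFn` callback are hypothetical names chosen for illustration; the real implementation in amdgpu/src/rtl.cpp may be structured quite differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a device queue; not the plugin's real type.
struct Queue {
  bool Initialized = false;
};

// Pool that sets up queue 0 eagerly and every other queue on first use.
class LazyQueuePool {
public:
  // Returns true on success. Injected so that failures can be simulated.
  using InitFnTy = std::function<bool(std::size_t)>;

  LazyQueuePool(std::size_t NumQueues, InitFnTy InitFn)
      : Queues(NumQueues), InitFn(std::move(InitFn)) {}

  // Eagerly initialize queue 0 only. A failure here is treated as fatal.
  bool init() {
    if (Queues.empty() || !InitFn(0))
      return false;
    Queues[0].Initialized = true;
    return true;
  }

  // Return the index of a usable queue, initializing queue Idx on first use.
  // If that lazy initialization fails, report it and fall back to queue 0.
  std::size_t acquire(std::size_t Idx) {
    assert(Idx < Queues.size() && "queue index out of range");
    assert(Queues[0].Initialized && "init() must succeed first");
    if (Queues[Idx].Initialized)
      return Idx;
    if (!InitFn(Idx)) {
      std::fprintf(stderr, "queue %zu failed to initialize, using queue 0\n",
                   Idx);
      return 0;
    }
    Queues[Idx].Initialized = true;
    return Idx;
  }

  // Number of queues that have actually been initialized so far.
  std::size_t numInitialized() const {
    std::size_t N = 0;
    for (const Queue &Q : Queues)
      N += Q.Initialized ? 1 : 0;
    return N;
  }

private:
  std::vector<Queue> Queues;
  InitFnTy InitFn;
};
```

In this sketch a queue whose initialization failed is retried on the next `acquire` call; whether the actual patch retries or remembers the failure is not stated in the summary.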
Details
- Reviewers: jplehr, mhalk, jhuber6, tianshilei1992, ye-luo
Event Timeline
| openmp/docs/design/Runtimes.rst | |
|---|---|
| 1200 | FWIW, events are used more frequently than streams, even for single-threaded offloading. |
| openmp/docs/design/Runtimes.rst | |
|---|---|
| 1200 | Just wanted to point out (since it was not directly obvious to me): the patch would have to be adapted if LIBOMPTARGET_NUM_INITIAL_STREAMS is increased. |
| openmp/libomptarget/plugins-nextgen/amdgpu/src/rtl.cpp | |
|---|---|
| 585 | Is there any reason to maintain both init and initLazy instead of making init lazy by default? |
I am unsure we need this after all. @mhalk, can you check? If we don't need this, we can scrap it.
Just looked through this patch and D154523.
Don't have a clear "yes"/"no".
IMO merging the two makes sense as they achieve the same: "single HSA queue eager init".
The busy tracking will revert like half of this patch's changes in amdgpu/src/rtl.cpp.
If you lean toward scrapping it, I will have to incorporate ~30 LoC from this one.
That is, the docs and the setting of default/initial values (esp. outside of amdgpu/src/rtl.cpp).