This patch fixes several problems in the implementation of the NVPTX RTL:
- Detection of the last iteration for loops with static scheduling and no chunks.
- Reductions for serialized parallel constructs.
- Handling of barriers.
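To make the first item concrete, here is a minimal host-side sketch of how a statically scheduled loop without a chunk size can be split across threads, and how the thread owning the last iteration can be identified. It is an illustration under stated assumptions, not the code from this patch: the names `StaticSlice` and `static_no_chunk` are hypothetical, and the summary does not say which exact case the patch addresses.

```cpp
#include <cstdint>

// Hypothetical result type: the inclusive bounds assigned to one thread and
// whether that thread executes the final iteration of the loop.
struct StaticSlice {
  std::int64_t lb;
  std::int64_t ub;
  bool last;
};

// Sketch of static scheduling without a chunk size: every thread gets at most
// one contiguous slice, and the slices differ in size by at most one.
// Assumes lb <= ub, nthreads > 0 and 0 <= tid < nthreads.
constexpr StaticSlice static_no_chunk(std::int64_t lb, std::int64_t ub,
                                      std::int64_t tid, std::int64_t nthreads) {
  const std::int64_t loop_size = ub - lb + 1;
  std::int64_t chunk = loop_size / nthreads;
  const std::int64_t left_over = loop_size - chunk * nthreads;

  std::int64_t my_lb = 0;
  if (tid < left_over) {
    // The first left_over threads take one extra iteration each.
    ++chunk;
    my_lb = lb + tid * chunk;
  } else {
    my_lb = lb + tid * chunk + left_over;
  }
  const std::int64_t my_ub = my_lb + chunk - 1;

  // A thread is "last" only if its slice is non-empty and contains the
  // original upper bound. Comparing upper bounds alone is not enough: a
  // thread with an empty slice can end up with my_ub == ub as well.
  const bool last = my_lb <= ub && ub <= my_ub;
  return StaticSlice{my_lb, my_ub, last};
}

// 10 iterations on 4 threads: thread 3 owns iterations 8..9 and is last.
static_assert(static_no_chunk(0, 9, 3, 4).last, "thread 3 runs the last iteration");
// 2 iterations on 4 threads: thread 2 has an empty slice and must not be last.
static_assert(!static_no_chunk(0, 1, 2, 4).last, "idle thread is not last");
```

In the second case, threads 2 and 3 receive the empty range 2..1, whose upper bound happens to equal the loop's upper bound. That is why the sketch checks both ends of the slice rather than only the upper bound.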
Differential D48480
[OPENMP, NVPTX] Fixes for NVPTX RTL
Closed, Public. Authored by ABataev on Jun 22 2018, 6:31 AM.
Event Timeline
Hahnfeld added inline comments.
This revision is now accepted and ready to land. (Jun 25 2018, 5:43 AM)
Closed by commit rL335469: [OPENMP, NVPTX] Fixes for NVPTX RTL (authored by ABataev). (Jun 25 2018, 6:48 AM)
This revision was automatically updated to reflect the committed changes.
Revision Contents
Diff 152496
libomptarget/deviceRTLs/nvptx/src/loop.cu
libomptarget/deviceRTLs/nvptx/src/reduction.cu
libomptarget/deviceRTLs/nvptx/src/sync.cu
Are you moving these functions for a reason? I think they are defined as extern, so there's no need to have them before calling them.
(Moving code makes patches very difficult to read. I think the only change in this file is in __kmpc_barrier, right?)