When the default schedule is used for distribute, it must ensure that at most one iteration is assigned to each thread.
Repository: rOMP OpenMP
| libomptarget/deviceRTLs/nvptx/src/loop.cu | |
|---|---|
| 258 | It seems to me that you need to add plastiter and update it for the last iteration. Otherwise it might break lastprivates. |
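To make the lastprivate concern concrete: a static-init routine usually reports, through a last-iteration flag, whether the calling thread owns the final iteration of the loop, so that only that thread writes back lastprivate variables. The following is a minimal host-side C++ sketch of that idea, not the code in loop.cu. The type `StaticInitResult` and the function `staticInitChunkOne` are hypothetical names, and a chunk size of one iteration per thread is assumed.

```cpp
#include <cstdint>

// Hypothetical result type: what a static-init call hands back to one thread.
struct StaticInitResult {
  int64_t lb;       // first iteration this thread executes
  int64_t ub;       // inclusive upper bound of this thread's first chunk
  int64_t stride;   // distance between consecutive chunks of this thread
  int32_t lastIter; // nonzero if this thread executes the final iteration
};

// Sketch only: chunk size 1, so a thread gets at most one iteration per round.
// Assumes loopLb <= loopUb, numThreads > 0 and 0 <= tid < numThreads.
StaticInitResult staticInitChunkOne(int64_t loopLb, int64_t loopUb,
                                    int32_t tid, int32_t numThreads) {
  StaticInitResult r;
  r.lb = loopLb + tid;
  r.ub = r.lb;
  r.stride = numThreads;
  // Exactly one thread owns the final iteration (loopUb). If this flag is
  // never set, no thread knows it must copy out lastprivate values.
  int64_t tripCount = loopUb - loopLb + 1;
  r.lastIter = ((tripCount - 1) % numThreads == tid) ? 1 : 0;
  return r;
}
```

For a loop over iterations 0 through 9 with four threads, thread 1 executes iterations 1, 5 and 9, so it is the only thread whose flag is set.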
| libomptarget/deviceRTLs/nvptx/src/loop.cu | |
|---|---|
| 243 | You don't use the initial value of the stride variable, so you can declare it at its first use. |
Do we really need new entry points for this? I think we could avoid code duplication by letting the compiler-generated code pass the correct chunk to __kmpc_for_static_init_??. The chunk could either come from a (single) new query function (__kmpc_nvptx_distribute_default_chunk?) or be hard-coded to threadsPerBlock, because this is only relevant for SPMD.
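The suggestion above can be sketched on the host side. This is an illustration under stated assumptions, not runtime code: `distributeDefaultChunk` is a hypothetical stand-in for the query function floated in the comment, and `maxItersPerThreadPerChunk` is a made-up helper. It simulates a chunked static distribute schedule across teams, followed by an inner worksharing loop that hands out a chunk's iterations to the threads of a team one at a time.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the proposed query function; not a runtime API.
// Assumption: in SPMD mode the default chunk equals the threads per block.
int32_t distributeDefaultChunk(int32_t threadsPerBlock) {
  return threadsPerBlock;
}

// Simulates distributing iterations [0, tripCount) and returns the largest
// number of iterations any thread receives from a single distribute chunk.
// Assumes numTeams > 0, threadsPerBlock > 0 and chunk > 0.
int64_t maxItersPerThreadPerChunk(int64_t tripCount, int32_t numTeams,
                                  int32_t threadsPerBlock, int64_t chunk) {
  int64_t worst = 0;
  for (int32_t team = 0; team < numTeams; ++team) {
    // Static chunked distribute: a team owns chunks team, team + numTeams, ...
    for (int64_t lb = team * chunk; lb < tripCount; lb += numTeams * chunk) {
      int64_t ub = std::min(lb + chunk, tripCount); // exclusive upper bound
      std::vector<int64_t> perThread(threadsPerBlock, 0);
      // Inner loop: iteration i of the chunk goes to thread (i - lb) % threads.
      for (int64_t i = lb; i < ub; ++i)
        ++perThread[(i - lb) % threadsPerBlock];
      worst = std::max(worst,
                       *std::max_element(perThread.begin(), perThread.end()));
    }
  }
  return worst;
}
```

With the chunk set to the number of threads per block, no thread receives more than one iteration from any distribute chunk. A larger chunk, such as twice the block size, gives some threads two iterations. In this model the only thing that changes is the chunk argument supplied by the caller, which is why a separate entry point may not be needed.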
Due to the most recent changes proposed to Clang in D52434, changes to the runtime are no longer required.
The same applies to ub: its initial value is unused, so it can be declared at its first use.