The synchronization at the end of parallel region cannot make sure all threads
exit the scope. As a result, the assertions right after it might be hit, and
further the state::assumeInitialState(IsSPMD) in __kmpc_target_deinit may
not hold as well. We either add a synchronization right after the parallel region,
or remove the assertions and assuptions. Here we choose the first one as those
assertions and assumptions can help optimizations.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
openmp/libomptarget/DeviceRTL/src/Parallelism.cpp | ||
---|---|---|
129 | This explanation seems plausible. Are the synchronization requirement of ASSERT and why being documented somewhere? |
openmp/libomptarget/DeviceRTL/src/Parallelism.cpp | ||
---|---|---|
129 | Have no idea. |
openmp/libomptarget/DeviceRTL/src/Parallelism.cpp | ||
---|---|---|
129 | synchronize::threadsAligned() has been called at the end of above scope. |
openmp/libomptarget/DeviceRTL/src/Parallelism.cpp | ||
---|---|---|
129 | Making code slower by the addtional synchronization can also potentially skip triggering the hang. So it needs to be well understood and explained. |
openmp/libomptarget/DeviceRTL/src/Parallelism.cpp | ||
---|---|---|
129 | The key point here is the scope because state configuration is via RAII. The synchronization at the end of the scope can only make sure threads are aligned *before* exiting the scope, but since then, it is undetermined which threads exits first. |
openmp/libomptarget/DeviceRTL/src/Parallelism.cpp | ||
---|---|---|
129 | If the destructor sets a state which is supposed to be shared by all the threads, can synchronize::threadsAligned() be called inside the destructor? |
openmp/libomptarget/DeviceRTL/src/Parallelism.cpp | ||
---|---|---|
129 | OK. The destructor is only for a single value and having one alignment for multiple destructors is beneficial. |
This explanation seems plausible. Are the synchronization requirement of ASSERT and why being documented somewhere?