Page MenuHomePhabricator

[OpenMP][DeviceRTL] Fixed an issue that causes hang in SU3
ClosedPublic

Authored by tianshilei1992 on Oct 29 2021, 6:19 PM.

Details

Summary

The synchronization at the end of parallel region cannot make sure all threads
exit the scope. As a result, the assertions right after it might be hit, and
further the state::assumeInitialState(IsSPMD) in __kmpc_target_deinit may
not hold as well. We either add a synchronization right after the parallel region,
or remove the assertions and assuptions. Here we choose the first one as those
assertions and assumptions can help optimizations.

Diff Detail

Event Timeline

tianshilei1992 created this revision.Oct 29 2021, 6:19 PM
tianshilei1992 requested review of this revision.Oct 29 2021, 6:19 PM
Herald added a project: Restricted Project. · View Herald TranscriptOct 29 2021, 6:19 PM
This revision is now accepted and ready to land.Oct 30 2021, 11:38 AM
ye-luo added a subscriber: ye-luo.Oct 30 2021, 11:56 AM
ye-luo added inline comments.
openmp/libomptarget/DeviceRTL/src/Parallelism.cpp
129

This explanation seems plausible. Are the synchronization requirement of ASSERT and why being documented somewhere?

openmp/libomptarget/DeviceRTL/src/Parallelism.cpp
129

Have no idea.

ye-luo added inline comments.Oct 30 2021, 1:22 PM
openmp/libomptarget/DeviceRTL/src/Parallelism.cpp
129

synchronize::threadsAligned() has been called at the end of above scope.
What caused the out of sync and needs this additional call?
Why ASSERT requires synchronization.
All needs explanation or documentation.
"Synchronize all threads to make sure every thread exits the scope above;"
Just sounds plausible

ye-luo added inline comments.Oct 30 2021, 1:25 PM
openmp/libomptarget/DeviceRTL/src/Parallelism.cpp
129

Making code slower by the addtional synchronization can also potentially skip triggering the hang. So it needs to be well understood and explained.

tianshilei1992 added inline comments.Oct 30 2021, 1:47 PM
openmp/libomptarget/DeviceRTL/src/Parallelism.cpp
129

The key point here is the scope because state configuration is via RAII. The synchronization at the end of the scope can only make sure threads are aligned *before* exiting the scope, but since then, it is undetermined which threads exits first.

ye-luo added inline comments.Oct 30 2021, 5:41 PM
openmp/libomptarget/DeviceRTL/src/Parallelism.cpp
129

If the destructor sets a state which is supposed to be shared by all the threads, can synchronize::threadsAligned() be called inside the destructor?

ye-luo added inline comments.Oct 30 2021, 5:52 PM
openmp/libomptarget/DeviceRTL/src/Parallelism.cpp
129

OK. The destructor is only for a single value and having one alignment for multiple destructors is beneficial.