This is an archive of the discontinued LLVM Phabricator instance.

Differential D112861

[OpenMP][DeviceRTL] Fixed an issue that causes hang in SU3
ClosedPublic

Authored by tianshilei1992 on Oct 29 2021, 6:19 PM.

Details

Reviewers

jdoerfert
jhuber6

Commits

rG025f54924014: [OpenMP][DeviceRTL] Fixed an issue that causes hang in SU3

Summary

The synchronization at the end of a parallel region cannot ensure that all threads
have exited the scope. As a result, the assertions right after it might be hit, and
furthermore the state::assumeInitialState(IsSPMD) in __kmpc_target_deinit may
not hold either. We can either add a synchronization right after the parallel region
or remove the assertions and assumptions. Here we choose the former because those
assertions and assumptions can help optimizations.
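
The race described in this summary can be illustrated with a small single-threaded model. This is only a sketch under stated assumptions, not DeviceRTL code: the function countStaleChecks, the Step enum, and the adversarial schedule are hypothetical and exist purely for illustration. In particular, it assumes that thread 0 is the one whose RAII object restores the shared state and that thread 0 always runs last.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of what each thread does after its microtask returns.
enum class Step { Barrier, Restore, Check };

// Counts how many threads would see a stale team size in their check.
// Assumption: only thread 0 restores the shared value when leaving the scope.
int countStaleChecks(unsigned NumThreads, bool ExtraBarrier) {
  assert(NumThreads >= 1);
  unsigned TeamSize = NumThreads; // value in effect inside the parallel region

  // Every thread runs the same sequence of steps.
  std::vector<Step> Program = {Step::Barrier, Step::Restore};
  if (ExtraBarrier)
    Program.push_back(Step::Barrier); // the synchronization added after the scope
  Program.push_back(Step::Check);     // stands in for ASSERT(TeamSize == 1)

  std::vector<std::size_t> PC(NumThreads, 0); // next step of each thread
  int Stale = 0;
  bool Progress = true;
  while (Progress) {
    Progress = false;
    // Adversarial schedule: run thread 0 last, each thread as far as it can go.
    for (unsigned T = NumThreads; T-- > 0;) {
      while (PC[T] < Program.size()) {
        Step S = Program[PC[T]];
        if (S == Step::Barrier) {
          // A barrier releases only once every thread has reached it.
          bool AllArrived = true;
          for (unsigned U = 0; U < NumThreads; ++U)
            AllArrived = AllArrived && PC[U] >= PC[T];
          if (!AllArrived)
            break; // this thread waits; let the others run
        } else if (S == Step::Restore) {
          if (T == 0)
            TeamSize = 1; // the destructor resets the shared state
        } else if (TeamSize != 1) {
          ++Stale; // the assertion would have fired here
        }
        ++PC[T];
        Progress = true;
      }
    }
  }
  return Stale;
}
```

Under this schedule with four threads, countStaleChecks(4, false) reports three stale checks, because threads 1 to 3 reach the check before thread 0 has restored the value. countStaleChecks(4, true) reports none, because the extra barrier cannot release until thread 0 has left the scope.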

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

tianshilei1992 created this revision. Oct 29 2021, 6:19 PM

Herald added subscribers: guansong, yaxunl. Oct 29 2021, 6:19 PM

tianshilei1992 requested review of this revision. Oct 29 2021, 6:19 PM

Herald added a project: Restricted Project. Oct 29 2021, 6:19 PM

Herald added subscribers: openmp-commits, sstefan1.

Harbormaster completed remote builds in B131550: Diff 383546. Oct 29 2021, 6:24 PM

This revision is now accepted and ready to land. Oct 30 2021, 11:38 AM

Closed by commit rG025f54924014: [OpenMP][DeviceRTL] Fixed an issue that causes hang in SU3 (authored by tianshilei1992). Oct 30 2021, 11:44 AM

This revision was automatically updated to reflect the committed changes.

tianshilei1992 added a commit: rG025f54924014: [OpenMP][DeviceRTL] Fixed an issue that causes hang in SU3.

ye-luo added a subscriber: ye-luo. Oct 30 2021, 11:56 AM

ye-luo added inline comments.

openmp/libomptarget/DeviceRTL/src/Parallelism.cpp
129	This explanation seems plausible. Is the synchronization requirement of ASSERT, and the reason for it, documented somewhere?

tianshilei1992 added inline comments. Oct 30 2021, 12:10 PM

openmp/libomptarget/DeviceRTL/src/Parallelism.cpp
129	Have no idea.

ye-luo added inline comments. Oct 30 2021, 1:22 PM

openmp/libomptarget/DeviceRTL/src/Parallelism.cpp
129	synchronize::threadsAligned() has already been called at the end of the scope above. What caused the threads to get out of sync so that this additional call is needed? Why does ASSERT require synchronization? All of this needs explanation or documentation. "Synchronize all threads to make sure every thread exits the scope above;" just sounds plausible.

ye-luo added inline comments. Oct 30 2021, 1:25 PM

openmp/libomptarget/DeviceRTL/src/Parallelism.cpp
129	Making the code slower with the additional synchronization can also potentially avoid triggering the hang without fixing its cause. So it needs to be well understood and explained.

tianshilei1992 added inline comments. Oct 30 2021, 1:47 PM

openmp/libomptarget/DeviceRTL/src/Parallelism.cpp
129	The key point here is the scope, because state configuration is done via RAII. The synchronization at the end of the scope can only make sure the threads are aligned before they exit the scope; after that, it is undetermined which thread exits first.

ye-luo added inline comments. Oct 30 2021, 5:41 PM

openmp/libomptarget/DeviceRTL/src/Parallelism.cpp
129	If the destructor sets a state which is supposed to be shared by all the threads, can synchronize::threadsAligned() be called inside the destructor?

ye-luo added inline comments. Oct 30 2021, 5:52 PM

openmp/libomptarget/DeviceRTL/src/Parallelism.cpp
129	OK. The destructor is only for a single value and having one alignment for multiple destructors is beneficial.
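
The trade-off discussed in the last two comments can be sketched as follows. This is a hedged illustration, not the DeviceRTL implementation: ScopedValue, teamBarrier, and barriersToLeaveRegion are hypothetical names, the barrier is replaced by a counter, and the sketch assumes that each of the three asserted values is managed by its own guard.

```cpp
#include <cassert>

// Hypothetical stand-in for a team-wide barrier; here it only counts calls.
static int BarrierCalls = 0;
static void teamBarrier() { ++BarrierCalls; }

// Hypothetical RAII guard: sets a value on entry and restores it on exit.
template <typename T> class ScopedValue {
public:
  ScopedValue(T &Target, T NewValue, bool SyncOnExit)
      : Ref(Target), Old(Target), Sync(SyncOnExit) {
    Ref = NewValue;
  }
  ~ScopedValue() {
    Ref = Old;
    if (Sync)
      teamBarrier(); // alternative design: one barrier per destructor
  }
  ScopedValue(const ScopedValue &) = delete;
  ScopedValue &operator=(const ScopedValue &) = delete;

private:
  T &Ref;
  T Old;
  bool Sync;
};

// Returns how many barriers one thread executes to leave the region.
int barriersToLeaveRegion(bool SyncInDestructor) {
  BarrierCalls = 0;
  unsigned TeamSize = 1, ActiveLevel = 0, Level = 0;
  {
    ScopedValue<unsigned> A(TeamSize, 4u, SyncInDestructor);
    ScopedValue<unsigned> B(ActiveLevel, 1u, SyncInDestructor);
    ScopedValue<unsigned> C(Level, 1u, SyncInDestructor);
  }
  if (!SyncInDestructor)
    teamBarrier(); // a single barrier placed after the scope
  assert(TeamSize == 1u && ActiveLevel == 0u && Level == 0u);
  return BarrierCalls;
}
```

Under these assumptions, barriersToLeaveRegion(true) returns 3 and barriersToLeaveRegion(false) returns 1, which is why a single alignment after the scope is cheaper than one inside every destructor.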

Revision Contents

Path	Size
openmp/libomptarget/DeviceRTL/src/Parallelism.cpp	5 lines

Diff 383611

openmp/libomptarget/DeviceRTL/src/Parallelism.cpp

(117 lines not shown)
       synchronize::threadsAligned();
 
       if (TId < NumThreads)
         invokeMicrotask(TId, 0, fn, args, nargs);
 
       // Synchronize all threads at the end of a parallel region.
       synchronize::threadsAligned();
     }
 
+    // Synchronize all threads to make sure every thread exits the scope above;
+    // otherwise the following assertions and the assumption in
+    // __kmpc_target_deinit may not hold.
+    synchronize::threadsAligned();
+
     ASSERT(state::ParallelTeamSize == 1u);
     ASSERT(icv::ActiveLevel == 0u);
     ASSERT(icv::Level == 0u);
     return;
   }
 
   // We do not create a new data environment because all threads in the team
   // that are active are now running this parallel region. They share the
(92 lines not shown)