This is an archive of the discontinued LLVM Phabricator instance.

Differential D28377

Fix a race in shutdown when tasking is used
Closed, Public

Authored by tlwilmar on Jan 5 2017, 1:22 PM.

Details

Reviewers

hbae
Hahnfeld
AndreyChurbanov

Commits

rG581490e713a3: Fix a race in shutdown when tasking is used.
rOMP294214: Fix a race in shutdown when tasking is used.
rL294214: Fix a race in shutdown when tasking is used.

Summary

Jonas Hahnfeld reported a bug in shutdown in the presence of tasking.

This change fixes a race in the shutdown code when threads are being reaped. Threads spinning in the fork barrier and searching for tasks to steal may identify other threads as potential victims to steal from, but those threads may already have been reaped.

The fix creates a simple flag on the threads that lets them indicate that they are in a reapable state when shutdown is happening. The shutdown code then forces any threads out of the fork barrier and then waits until all the threads are reapable, before reaping any of them.
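
The handshake described above can be sketched in portable C++ as follows. This is only an illustration under stated assumptions: `Worker`, `run_and_reap`, the `done` flag, and the two constants are hypothetical stand-ins (for `th_reap_state`, `g_done`, and the `KMP_*_TO_REAP` values in the patch), and `std::atomic` / `std::thread` replace the runtime's own primitives.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-ins for KMP_NOT_SAFE_TO_REAP / KMP_SAFE_TO_REAP.
constexpr unsigned kNotSafeToReap = 0;
constexpr unsigned kSafeToReap = 1;

// Hypothetical, minimal model of a worker thread and its reap flag.
struct Worker {
  std::atomic<unsigned> reap_state{kNotSafeToReap};
  std::thread handle;
};

// Starts n workers that spin in a toy "fork barrier", then shuts down.
// Returns how many workers were observed reapable before reaping began.
std::size_t run_and_reap(std::size_t n) {
  std::atomic<bool> done{false};  // plays the role of the global "done" flag
  std::vector<Worker> workers(n);

  for (auto &w : workers) {
    w.handle = std::thread([&w, &done] {
      while (!done.load(std::memory_order_acquire)) {
        // While looking for tasks to steal, this worker may touch other
        // workers' data, so it marks itself as not reapable.
        w.reap_state.store(kNotSafeToReap, std::memory_order_relaxed);
        std::this_thread::yield();
      }
      // Out of the spin loop: this worker will not try to steal again.
      w.reap_state.store(kSafeToReap, std::memory_order_release);
    });
  }

  // Shutdown: force every worker out of the barrier spin.
  done.store(true, std::memory_order_release);

  // Phase 1: wait until ALL workers are reapable before reaping ANY.
  std::size_t reapable = 0;
  for (auto &w : workers) {
    while (w.reap_state.load(std::memory_order_acquire) != kSafeToReap)
      std::this_thread::yield();
    ++reapable;
  }

  // Phase 2: reap. The assertion mirrors the debug check added to the
  // reaping loop in the patch.
  for (auto &w : workers) {
    assert(w.reap_state.load() == kSafeToReap);
    w.handle.join();
  }
  return reapable;
}
```

Calling `run_and_reap(4)` starts four workers and returns only after each has reported itself reapable and been joined. The point of the two phases is that no worker is torn down while another might still be scanning it as a steal victim.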

Diff Detail

Repository: rL LLVM

Event Timeline

tlwilmar updated this revision to Diff 83290. Jan 5 2017, 1:22 PM

tlwilmar retitled this revision to Fix a race in shutdown when tasking is used.

tlwilmar updated this object.

tlwilmar added reviewers: Hahnfeld, AndreyChurbanov, hbae.

tlwilmar set the repository for this revision to rL LLVM.

tlwilmar added a subscriber: openmp-commits.

Thanks, this patch seems to solve the problem!

runtime/src/kmp_runtime.cpp
4007–4012 ↗	(On Diff #83290)	Do we need this here or would it be enough to have the flag completely handled in `__kmp_execute_tasks_template`? I don't know whether that would create a race on `th_reap_state`. If `__kmp_execute_tasks_template` is not guaranteed to be called at least once before a barrier finishes, why aren't there problems with multiple parallel regions? Each thread will have `th.th_reap_state = KMP_SAFE_TO_REAP` at the end of the first parallel region...
runtime/src/kmp_wait_release.h
215–224 ↗	(On Diff #83290)	Resetting to `th.th_reap_state = KMP_SAFE_TO_REAP` could then be done at the end of `__kmp_execute_tasks_template`

Hi Jonas,
There are probably numerous ways of doing this. I answered your comments with why I did it this way.
Thanks!
Terry

runtime/src/kmp_runtime.cpp
4007–4012 ↗	(On Diff #83290)	__kmp_execute_tasks_template may not be called by all threads, and it may be called multiple times by individual threads, so it's often premature to set the flag to SAFE inside it. __kmp_initialize_info will be called to reset th_reap_state for each thread.
runtime/src/kmp_wait_release.h
215–224 ↗	(On Diff #83290)	As mentioned above, we want to avoid prematurely setting the thread as safe to reap. Note that the cases in which we set the thread as safe to reap are when 1) no tasks have been encountered by any threads; 2) the task team is no longer active; 3) the current thread's task team is NULL. The case inside of __kmp_execute_tasks_template only amounts to "this thread couldn't find any more tasks after randomly searching for some".
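
The three cases listed above can be written down as a small decision function. This is a simplified sketch, not the runtime's code: `TaskTeamView`, its fields, and `reap_state_after_wait_pass` are hypothetical names that model only the conditions the wait loop checks.

```cpp
#include <cassert>  // for the checks a caller may want to write

// Hypothetical stand-ins for KMP_NOT_SAFE_TO_REAP / KMP_SAFE_TO_REAP.
constexpr unsigned kNotSafeToReap = 0;
constexpr unsigned kSafeToReap = 1;

// Hypothetical, simplified view of a task team: just the two facts the
// wait loop looks at in this sketch.
struct TaskTeamView {
  bool active;       // is the task team still active?
  bool found_tasks;  // has any thread encountered a task?
};

// Reap state a spinning thread ends up with after one pass through the
// wait loop, given its current task team (nullptr if it has none).
unsigned reap_state_after_wait_pass(const TaskTeamView *task_team) {
  if (task_team == nullptr)
    return kSafeToReap;  // case 3: the thread's task team is NULL
  if (!task_team->active)
    return kSafeToReap;  // case 2: the task team is no longer active
  if (!task_team->found_tasks)
    return kSafeToReap;  // case 1: no tasks encountered by any thread
  // Tasks may exist: the thread goes looking for work and may try to
  // steal from others, so it must not be treated as reapable.
  return kNotSafeToReap;
}
```

Only the last branch corresponds to entering the task-execution path, which is where the patch sets the flag back to not safe. Returning from that path without finding a task is deliberately not one of the safe cases.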

Hahnfeld added inline comments. Jan 7 2017, 8:02 AM

runtime/src/kmp_runtime.cpp
4007–4012 ↗	(On Diff #83290)	`__kmp_initialize_info` is not called if a hot team is reused with either the same or a lower number of threads.

    int i;
    for (i = 0; i < 2; i++) {
    #pragma omp parallel num_threads(2)
      {
    #pragma omp single nowait
    #pragma omp task
        {
          printf("Executed by thread #%d!\n", omp_get_thread_num());
        }
      }
    }

with `$ KMP_F_DEBUG=10 ./crash2 3>&1 1>&2 2>&3 | grep -E __kmp_initialize_info1` (sorry for the pipes!)

    __kmp_initialize_info1: T#0:0 this_thread=0x60abc0 curtask=(nil)
    __kmp_initialize_info1: T#1:1 this_thread=0x617f00 curtask=(nil)
    __kmp_initialize_info1: T#0:0 this_thread=0x60abc0 curtask=0x607980
    __kmp_initialize_info1: T#1:1 this_thread=0x617f00 curtask=0x610480
    Executed by thread #1!
    Executed by thread #1!
    Finished parallel regions!
4348 ↗	(On Diff #83290)	This function is called when a team is reused, maybe we have to add it here?
runtime/src/kmp_wait_release.h
215–224 ↗	(On Diff #83290)	Ah, all right, I forgot that returning from `__kmp_execute_tasks_template` does not mean that all tasks are finished!

tlwilmar added inline comments. Jan 12 2017, 12:45 PM

runtime/src/kmp_runtime.cpp
4007–4012 ↗	(On Diff #83290)	You're right... but... I think that we need to set to NOT SAFE whenever we come out of the spin loop in order to reset before both fork and join barriers. However, that impacts how we free and reap the threads. I'll have to tinker with this a bit.

Hahnfeld requested changes to this revision. Jan 31 2017, 4:06 AM

This revision now requires changes to proceed. Jan 31 2017, 4:06 AM

The reap state does not need to be reset after each barrier. If a thread attempts to execute tasks, its state is set to NOT SAFE to reap. The state only matters in the spin at the fork barrier after shutdown is triggered. The master thread now waits for ALL threads to reach the SAFE state before proceeding to clean anything up.

To write down how I think this works:

1) No worker thread can set th_reap_state = KMP_NOT_SAFE_TO_REAP after the master thread has passed the barrier.
2) So the master thread waits for all worker threads to finish before reaping.
3) If all threads have finished, none of them will try to steal, so all can be safely reaped.

If that's the case then LGTM!

runtime/src/kmp_runtime.cpp
5267 ↗	(On Diff #86519)	I think you can swap the loop and this `if` statement, since the condition does not depend on the loop iteration (see also line 5284).
5276 ↗	(On Diff #86519)	Please reindent this to make it clear
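
The code of Diff #86519 is not shown in this archive, so the following is only a guess at the shape of the suggestion: hoisting a loop-invariant condition out of the loop. The function names and the `reapable` vector are hypothetical; the committed version of `kmp_runtime.cpp` does test the tasking mode once, outside the wait loop.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Shape before the review comment (assumed): the tasking check does not
// depend on f, yet it is re-evaluated on every iteration.
std::size_t count_waits_check_inside(const std::vector<bool> &reapable,
                                     bool tasking) {
  std::size_t waits = 0;
  for (std::size_t f = 1; f < reapable.size(); ++f) {  // index 0 is the master
    if (tasking) {
      if (!reapable[f]) ++waits;  // this worker would have to be waited on
    }
  }
  return waits;
}

// Shape after swapping the loop and the `if`: the condition is tested
// once, and the loop is skipped entirely when tasking is off.
std::size_t count_waits_check_outside(const std::vector<bool> &reapable,
                                      bool tasking) {
  std::size_t waits = 0;
  if (tasking) {
    for (std::size_t f = 1; f < reapable.size(); ++f) {
      if (!reapable[f]) ++waits;
    }
  }
  return waits;
}
```

Both functions return the same count for any input; the second simply makes it obvious that nothing happens when tasking is disabled, which is also what the reindentation request is about.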

This revision is now accepted and ready to land. Jan 31 2017, 11:40 PM

Jonas -- made the changes you requested (and removed second 'if' checking tasking mode).

Closed by commit rL294214: Fix a race in shutdown when tasking is used. (authored by achurbanov). Feb 6 2017, 11:05 AM

This revision was automatically updated to reflect the committed changes.

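Revision Contents

Path	Size
openmp/trunk/runtime/src/kmp.h	5 lines
openmp/trunk/runtime/src/kmp_runtime.cpp	27 lines
openmp/trunk/runtime/src/kmp_tasking.cpp	1 line
openmp/trunk/runtime/src/kmp_wait_release.h	9 lines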

Diff 87267

openmp/trunk/runtime/src/kmp.h

	// Constants for release barrier wait state: currently, hierarchical only			// Constants for release barrier wait state: currently, hierarchical only
	#define KMP_BARRIER_NOT_WAITING 0 // Normal state; worker not in wait_sleep			#define KMP_BARRIER_NOT_WAITING 0 // Normal state; worker not in wait_sleep
	#define KMP_BARRIER_OWN_FLAG 1 // Normal state; worker waiting on own b_go flag in release			#define KMP_BARRIER_OWN_FLAG 1 // Normal state; worker waiting on own b_go flag in release
	#define KMP_BARRIER_PARENT_FLAG 2 // Special state; worker waiting on parent's b_go flag in release			#define KMP_BARRIER_PARENT_FLAG 2 // Special state; worker waiting on parent's b_go flag in release
	#define KMP_BARRIER_SWITCH_TO_OWN_FLAG 3 // Special state; tells worker to shift from parent to own b_go			#define KMP_BARRIER_SWITCH_TO_OWN_FLAG 3 // Special state; tells worker to shift from parent to own b_go
	#define KMP_BARRIER_SWITCHING 4 // Special state; worker resets appropriate flag on wake-up			#define KMP_BARRIER_SWITCHING 4 // Special state; worker resets appropriate flag on wake-up

				#define KMP_NOT_SAFE_TO_REAP 0 // Thread th_reap_state: not safe to reap (tasking)
				#define KMP_SAFE_TO_REAP 1 // Thread th_reap_state: safe to reap (not tasking)

	enum barrier_type {			enum barrier_type {
	bs_plain_barrier = 0, /* 0, All non-fork/join barriers (except reduction barriers if enabled) */			bs_plain_barrier = 0, /* 0, All non-fork/join barriers (except reduction barriers if enabled) */
	bs_forkjoin_barrier, /* 1, All fork/join (parallel region) barriers */			bs_forkjoin_barrier, /* 1, All fork/join (parallel region) barriers */
	#if KMP_FAST_REDUCTION_BARRIER			#if KMP_FAST_REDUCTION_BARRIER
	bs_reduction_barrier, /* 2, All barriers that are used in reduction */			bs_reduction_barrier, /* 2, All barriers that are used in reduction */
	#endif // KMP_FAST_REDUCTION_BARRIER			#endif // KMP_FAST_REDUCTION_BARRIER
	bs_last_barrier /* Just a placeholder to mark the end */			bs_last_barrier /* Just a placeholder to mark the end */
	};			};
	[... 660 lines not shown ...]
	* Tasking-related data for the thread			* Tasking-related data for the thread
	*/			*/
	kmp_task_team_t * th_task_team; // Task team struct			kmp_task_team_t * th_task_team; // Task team struct
	kmp_taskdata_t * th_current_task; // Innermost Task being executed			kmp_taskdata_t * th_current_task; // Innermost Task being executed
	kmp_uint8 th_task_state; // alternating 0/1 for task team identification			kmp_uint8 th_task_state; // alternating 0/1 for task team identification
	kmp_uint8 * th_task_state_memo_stack; // Stack holding memos of th_task_state at nested levels			kmp_uint8 * th_task_state_memo_stack; // Stack holding memos of th_task_state at nested levels
	kmp_uint32 th_task_state_top; // Top element of th_task_state_memo_stack			kmp_uint32 th_task_state_top; // Top element of th_task_state_memo_stack
	kmp_uint32 th_task_state_stack_sz; // Size of th_task_state_memo_stack			kmp_uint32 th_task_state_stack_sz; // Size of th_task_state_memo_stack
				kmp_uint32 th_reap_state; // Non-zero indicates thread is not
				// tasking, thus safe to reap

	/*			/*
	* More stuff for keeping track of active/sleeping threads			* More stuff for keeping track of active/sleeping threads
	* (this part is written by the worker thread)			* (this part is written by the worker thread)
	*/			*/
	kmp_uint8 th_active_in_pool; // included in count of			kmp_uint8 th_active_in_pool; // included in count of
	// #active threads in pool			// #active threads in pool
	int th_active; // ! sleeping			int th_active; // ! sleeping

openmp/trunk/runtime/src/kmp_runtime.cpp

[... 3,998 lines not shown ...]	__kmp_initialize_info( kmp_info_t *this_thr, kmp_team_t *team, int tid, int gtid )
KMP_DEBUG_ASSERT( master->th.th_root );		KMP_DEBUG_ASSERT( master->th.th_root );

KMP_MB();		KMP_MB();

TCW_SYNC_PTR(this_thr->th.th_team, team);		TCW_SYNC_PTR(this_thr->th.th_team, team);

this_thr->th.th_info.ds.ds_tid = tid;		this_thr->th.th_info.ds.ds_tid = tid;
this_thr->th.th_set_nproc = 0;		this_thr->th.th_set_nproc = 0;
		if (__kmp_tasking_mode != tskm_immediate_exec)
		// When tasking is possible, threads are not safe to reap until they are
		// done tasking; this will be set when tasking code is exited in wait
		this_thr->th.th_reap_state = KMP_NOT_SAFE_TO_REAP;
		else // no tasking --> always safe to reap
		this_thr->th.th_reap_state = KMP_SAFE_TO_REAP;
#if OMP_40_ENABLED		#if OMP_40_ENABLED
this_thr->th.th_set_proc_bind = proc_bind_default;		this_thr->th.th_set_proc_bind = proc_bind_default;
# if KMP_AFFINITY_SUPPORTED		# if KMP_AFFINITY_SUPPORTED
this_thr->th.th_new_place = this_thr->th.th_current_place;		this_thr->th.th_new_place = this_thr->th.th_current_place;
# endif		# endif
#endif		#endif
this_thr->th.th_root = master->th.th_root;		this_thr->th.th_root = master->th.th_root;

[... 1,235 lines not shown ...]	#endif // KMP_NESTED_HOT_TEAMS

/* team is done working */		/* team is done working */
TCW_SYNC_PTR(team->t.t_pkfn, NULL); // Important for Debugging Support Library.		TCW_SYNC_PTR(team->t.t_pkfn, NULL); // Important for Debugging Support Library.
team->t.t_copyin_counter = 0; // init counter for possible reuse		team->t.t_copyin_counter = 0; // init counter for possible reuse
// Do not reset pointer to parent team to NULL for hot teams.		// Do not reset pointer to parent team to NULL for hot teams.

/* if we are non-hot team, release our threads */		/* if we are non-hot team, release our threads */
if( ! use_hot_team ) {		if( ! use_hot_team ) {
if ( __kmp_tasking_mode != tskm_immediate_exec ) {		if (__kmp_tasking_mode != tskm_immediate_exec) {
		// Wait for threads to reach reapable state
		for (f = 1; f < team->t.t_nproc; ++f) {
		KMP_DEBUG_ASSERT(team->t.t_threads[f]);
		volatile kmp_uint32 *state = &team->t.t_threads[f]->th.th_reap_state;
		while (*state != KMP_SAFE_TO_REAP) {
		#if KMP_OS_WINDOWS
		// On Windows a thread can be killed at any time, check this
		DWORD ecode;
		if (__kmp_is_thread_alive(team->t.t_threads[f], &ecode))
		KMP_CPU_PAUSE();
		else
		*state = KMP_SAFE_TO_REAP; // reset the flag for dead thread
		#else
		KMP_CPU_PAUSE();
		#endif
		}
		}

// Delete task teams		// Delete task teams
int tt_idx;		int tt_idx;
for (tt_idx=0; tt_idx<2; ++tt_idx) {		for (tt_idx=0; tt_idx<2; ++tt_idx) {
kmp_task_team_t *task_team = team->t.t_task_team[tt_idx];		kmp_task_team_t *task_team = team->t.t_task_team[tt_idx];
if ( task_team != NULL ) {		if ( task_team != NULL ) {
for (f=0; f<team->t.t_nproc; ++f) { // Have all threads unref task teams		for (f=0; f<team->t.t_nproc; ++f) { // Have all threads unref task teams
team->t.t_threads[f]->th.th_task_team = NULL;		team->t.t_threads[f]->th.th_task_team = NULL;
}		}
[... 569 lines not shown ...]	// KMP_ASSERT( ! KMP_UBER_GTID( i ) ); // AC: there can be uber threads alive here

// Reap the worker threads.		// Reap the worker threads.
// This is valid for now, but be careful if threads are reaped sooner.		// This is valid for now, but be careful if threads are reaped sooner.
while ( __kmp_thread_pool != NULL ) { // Loop thru all the thread in the pool.		while ( __kmp_thread_pool != NULL ) { // Loop thru all the thread in the pool.
// Get the next thread from the pool.		// Get the next thread from the pool.
kmp_info_t * thread = (kmp_info_t *) __kmp_thread_pool;		kmp_info_t * thread = (kmp_info_t *) __kmp_thread_pool;
__kmp_thread_pool = thread->th.th_next_pool;		__kmp_thread_pool = thread->th.th_next_pool;
// Reap it.		// Reap it.
		KMP_DEBUG_ASSERT(thread->th.th_reap_state == KMP_SAFE_TO_REAP);
thread->th.th_next_pool = NULL;		thread->th.th_next_pool = NULL;
thread->th.th_in_pool = FALSE;		thread->th.th_in_pool = FALSE;
__kmp_reap_thread( thread, 0 );		__kmp_reap_thread( thread, 0 );
}; // while		}; // while
__kmp_thread_pool_insert_pt = NULL;		__kmp_thread_pool_insert_pt = NULL;

// Reap teams.		// Reap teams.
while ( __kmp_team_pool != NULL ) { // Loop thru all the teams in the pool.		while ( __kmp_team_pool != NULL ) { // Loop thru all the teams in the pool.

openmp/trunk/runtime/src/kmp_tasking.cpp

[... 1,878 lines not shown ...]	static inline int __kmp_execute_tasks_template(kmp_info_t *thread, kmp_int32 gtid, C *flag, int final_spin,
KMP_DEBUG_ASSERT( __kmp_tasking_mode != tskm_immediate_exec );		KMP_DEBUG_ASSERT( __kmp_tasking_mode != tskm_immediate_exec );
KMP_DEBUG_ASSERT( thread == __kmp_threads[ gtid ] );		KMP_DEBUG_ASSERT( thread == __kmp_threads[ gtid ] );

if (task_team == NULL) return FALSE;		if (task_team == NULL) return FALSE;

KA_TRACE(15, ("__kmp_execute_tasks_template(enter): T#%d final_spin=%d *thread_finished=%d\n",		KA_TRACE(15, ("__kmp_execute_tasks_template(enter): T#%d final_spin=%d *thread_finished=%d\n",
gtid, final_spin, *thread_finished) );		gtid, final_spin, *thread_finished) );

		thread->th.th_reap_state = KMP_NOT_SAFE_TO_REAP;
threads_data = (kmp_thread_data_t *)TCR_PTR(task_team -> tt.tt_threads_data);		threads_data = (kmp_thread_data_t *)TCR_PTR(task_team -> tt.tt_threads_data);
KMP_DEBUG_ASSERT( threads_data != NULL );		KMP_DEBUG_ASSERT( threads_data != NULL );

nthreads = task_team -> tt.tt_nproc;		nthreads = task_team -> tt.tt_nproc;
unfinished_threads = &(task_team -> tt.tt_unfinished_threads);		unfinished_threads = &(task_team -> tt.tt_unfinished_threads);
#if OMP_45_ENABLED		#if OMP_45_ENABLED
KMP_DEBUG_ASSERT( nthreads > 1 || task_team->tt.tt_found_proxy_tasks);		KMP_DEBUG_ASSERT( nthreads > 1 || task_team->tt.tt_found_proxy_tasks);
#else		#else

openmp/trunk/runtime/src/kmp_wait_release.h

[... 190 lines not shown ...]	while (flag->notdone_check()) {
2) All tasks have been executed to completion.		2) All tasks have been executed to completion.
3) Tasking is off for this region. This could be because we are in a serialized region		3) Tasking is off for this region. This could be because we are in a serialized region
(perhaps the outer one), or else tasking was manually disabled (KMP_TASKING=0). */		(perhaps the outer one), or else tasking was manually disabled (KMP_TASKING=0). */
if (task_team != NULL) {		if (task_team != NULL) {
if (TCR_SYNC_4(task_team->tt.tt_active)) {		if (TCR_SYNC_4(task_team->tt.tt_active)) {
if (KMP_TASKING_ENABLED(task_team))		if (KMP_TASKING_ENABLED(task_team))
flag->execute_tasks(this_thr, th_gtid, final_spin, &tasks_completed		flag->execute_tasks(this_thr, th_gtid, final_spin, &tasks_completed
USE_ITT_BUILD_ARG(itt_sync_obj), 0);		USE_ITT_BUILD_ARG(itt_sync_obj), 0);
		else
		this_thr->th.th_reap_state = KMP_SAFE_TO_REAP;
}		}
else {		else {
KMP_DEBUG_ASSERT(!KMP_MASTER_TID(this_thr->th.th_info.ds.ds_tid));		KMP_DEBUG_ASSERT(!KMP_MASTER_TID(this_thr->th.th_info.ds.ds_tid));
this_thr->th.th_task_team = NULL;		this_thr->th.th_task_team = NULL;
		this_thr->th.th_reap_state = KMP_SAFE_TO_REAP;
}		}
		} else {
		this_thr->th.th_reap_state = KMP_SAFE_TO_REAP;
} // if		} // if
} // if		} // if

KMP_FSYNC_SPIN_PREPARE(spin);		KMP_FSYNC_SPIN_PREPARE(spin);
if (TCR_4(__kmp_global.g.g_done)) {		if (TCR_4(__kmp_global.g.g_done)) {
if (__kmp_global.g.g_abort)		if (__kmp_global.g.g_abort)
__kmp_abort_thread();		__kmp_abort_thread();
break;		break;
[... 56 lines not shown ...]	#endif

flag->suspend(th_gtid);		flag->suspend(th_gtid);

if (TCR_4(__kmp_global.g.g_done)) {		if (TCR_4(__kmp_global.g.g_done)) {
if (__kmp_global.g.g_abort)		if (__kmp_global.g.g_abort)
__kmp_abort_thread();		__kmp_abort_thread();
break;		break;
}		}
		else if (__kmp_tasking_mode != tskm_immediate_exec
		&& this_thr->th.th_reap_state == KMP_SAFE_TO_REAP) {
		this_thr->th.th_reap_state = KMP_NOT_SAFE_TO_REAP;
		}
// TODO: If thread is done with work and times out, disband/free		// TODO: If thread is done with work and times out, disband/free
}		}

#if OMPT_SUPPORT && OMPT_BLAME		#if OMPT_SUPPORT && OMPT_BLAME
if (ompt_enabled &&		if (ompt_enabled &&
ompt_state != ompt_state_undefined) {		ompt_state != ompt_state_undefined) {
if (ompt_state == ompt_state_idle) {		if (ompt_state == ompt_state_idle) {
if (ompt_callbacks.ompt_callback(ompt_event_idle_end)) {		if (ompt_callbacks.ompt_callback(ompt_event_idle_end)) {