This is an archive of the discontinued LLVM Phabricator instance.


Differential D23115

__kmp_free_task: Fix for serial explicit tasks producing proxy tasks
ClosedPublic

Authored by Hahnfeld on Aug 3 2016, 4:13 AM.

Details

Reviewers

jlpeyton
AndreyChurbanov

Commits

rG69f8511f8f47: __kmp_free_task: Fix for serial explicit tasks producing proxy tasks
rOMP277991: __kmp_free_task: Fix for serial explicit tasks producing proxy tasks
rL277991: __kmp_free_task: Fix for serial explicit tasks producing proxy tasks

Summary

Consider the following code which may be executed by a serial team:

int dep;
#pragma omp target nowait depend(out: dep)
{
    sleep(1);
}
#pragma omp task depend(in: dep)
{
    #pragma omp target nowait
    {
        sleep(1);
    }
}

Here the explicit task must not be freed until the nested proxy task has
finished. The current code does not consider this and calls __kmp_free_task
anyway, which triggers an assertion because of the remaining incomplete children:

KMP_DEBUG_ASSERT( TCR_4(taskdata->td_incomplete_child_tasks) == 0 );

Diff Detail

Repository: rL LLVM

Event Timeline

Hahnfeld updated this revision to Diff 66641. Aug 3 2016, 4:13 AM

Hahnfeld retitled this revision from to __kmp_free_task: Fix for serial explicit tasks producing proxy tasks.

Hahnfeld updated this object.

Hahnfeld added reviewers: jlpeyton, AndreyChurbanov.

Hahnfeld added a subscriber: openmp-commits.

Please add your example as a regression test case.

In D23115#504638, @hfinkel wrote:

Please add your example as a regression test case.

Hmm, that will be tricky: Currently only the Intel Compiler will generate proxy tasks for target nowait...

In D23115#504641, @Hahnfeld wrote:

Hmm, that will be tricky: Currently only the Intel Compiler will generate proxy tasks for target nowait...

In general, I'd still add the test (just with a comment that it had failed using the Intel compiler). I presume that the test suite runs with the Intel compiler too. Since this uses the target directive, however, I'm not sure it will run with Clang currently, so we might need to wait.

In D23115#504652, @hfinkel wrote:

In general, I'd still add the test (just with a comment that it had failed using the Intel compiler). I presume that the test suite runs with the Intel compiler too. Since this uses the target directive, however, I'm not sure it will run with Clang currently, so we might need to wait.

Hmm, I don't quite see much sense in a test that would not fail with the compilers the tests are most commonly (and automatically) run with.
Maybe we could reuse the idea of tasking/kmp_taskloop.c where the compiler generated code is manually emitted...

In D23115#504656, @Hahnfeld wrote:

Hmm, I don't quite see much sense in a test that would not fail with the compilers the tests are most commonly (and automatically) run with.

We probably should have a buildbot for libomp that uses the Intel compiler so long as that's a configuration we don't want to break. Regardless, this is already an issue for lots of different reasons. We have tests that only fail with asserts enabled, we have tests for the JIT that only represent bugs on certain host platforms, etc. A test that everyone can use to verify correct behavior is best, but even if not, there is still value in it.

Maybe we could reuse the idea of tasking/kmp_taskloop.c where the compiler generated code is manually emitted...

That works too.

Add test emulating what I think the Intel Compiler generates and which shows the issue

In D23115#504658, @hfinkel wrote:

Maybe we could reuse the idea of tasking/kmp_taskloop.c where the compiler generated code is manually emitted...

That works too.

But it is not much fun. We should think about splitting kmp.h to make the compiler-facing functions easily available in the tests; for now I have duplicated them.

AndreyChurbanov requested changes to this revision. Aug 4 2016, 7:33 AM

AndreyChurbanov edited edge metadata.

AndreyChurbanov added inline comments.

runtime/src/kmp_tasking.c
599 ↗	(On Diff #66794)	This change breaks the following code:

#pragma omp task
{
    #pragma omp task
    {
    }
}

The problem is that for a serial task its parent task is most likely still running and thus cannot be freed prematurely. To me, the correct fix would be to remember the status of initial task at the beginning of the routine, e.g.

kmp_int32 task_serial = taskdata->td_flags.task_serial;

then in the loop check this condition:

if ( task_serial || taskdata -> td_flags.tasktype == TASK_IMPLICIT ) return;

I think checking task_serial flag here is better than team_serial or tasking_serial (as was done before the change), because a task can be serialized even if team is active and tasking is active (e.g. no room in thread's task queue).

This revision now requires changes to proceed. Aug 4 2016, 7:33 AM

Hahnfeld mentioned this in rL277730: Add test case for nested creation of tasks. Aug 4 2016, 8:03 AM

Hahnfeld added inline comments. Aug 4 2016, 8:04 AM

runtime/src/kmp_tasking.c
599 ↗	(On Diff #66794)	I'm afraid your proposal doesn't work as expected: I've just committed another test for nested task creation (I thought there already was one) and it will fail with this patch. Will have a deeper look at it tomorrow

After some offline discussion with Andrey, this seems to work and doesn't leak any allocated task structs

LGTM

This revision is now accepted and ready to land. Aug 8 2016, 2:47 AM

Closed by commit rL277991: __kmp_free_task: Fix for serial explicit tasks producing proxy tasks (authored by Hahnfeld). Aug 8 2016, 3:16 AM

This revision was automatically updated to reflect the committed changes.

AndreyChurbanov mentioned this in D25510: Fix for mistake done by https://reviews.llvm.org/D23115. Oct 12 2016, 4:21 AM

Revision Contents

Path                                                         Size
openmp/trunk/runtime/src/kmp_tasking.c                       24 lines
openmp/trunk/runtime/test/tasking/bug_nested_proxy_task.c    128 lines

Diff 67138

openmp/trunk/runtime/src/kmp_tasking.c

 //
 // gtid: Global thread ID of calling thread
 // taskdata: task to free
 // thread: thread data structure of caller

 static void
 __kmp_free_task_and_ancestors( kmp_int32 gtid, kmp_taskdata_t * taskdata, kmp_info_t * thread )
 {
-    kmp_int32 children = 0;
-    kmp_int32 team_or_tasking_serialized = taskdata -> td_flags.team_serial || taskdata -> td_flags.tasking_ser;
-
+    // Proxy tasks must always be allowed to free their parents
+    // because they can be run in background even in serial mode.
+    kmp_int32 task_serial = taskdata->td_flags.task_serial && !taskdata->td_flags.proxy;
     KMP_DEBUG_ASSERT( taskdata -> td_flags.tasktype == TASK_EXPLICIT );

-    if ( !team_or_tasking_serialized ) {
-        children = KMP_TEST_THEN_DEC32( (kmp_int32 *)(& taskdata -> td_allocated_child_tasks) ) - 1;
-        KMP_DEBUG_ASSERT( children >= 0 );
-    }
+    kmp_int32 children = KMP_TEST_THEN_DEC32( (kmp_int32 *)(& taskdata -> td_allocated_child_tasks) ) - 1;
+    KMP_DEBUG_ASSERT( children >= 0 );

     // Now, go up the ancestor tree to see if any ancestors can now be freed.
     while ( children == 0 )
     {
         kmp_taskdata_t * parent_taskdata = taskdata -> td_parent;

         KA_TRACE(20, ("__kmp_free_task_and_ancestors(enter): T#%d task %p complete "
                       "and freeing itself\n", gtid, taskdata) );

         // --- Deallocate my ancestor task ---
         __kmp_free_task( gtid, taskdata, thread );

         taskdata = parent_taskdata;

-        // Stop checking ancestors at implicit task or if tasking serialized
+        // Stop checking ancestors at implicit task
         // instead of walking up ancestor tree to avoid premature deallocation of ancestors.
-        if ( team_or_tasking_serialized || taskdata -> td_flags.tasktype == TASK_IMPLICIT )
+        if ( task_serial || taskdata -> td_flags.tasktype == TASK_IMPLICIT )
             return;

-        if ( !team_or_tasking_serialized ) {
-            // Predecrement simulated by "- 1" calculation
-            children = KMP_TEST_THEN_DEC32( (kmp_int32 *)(& taskdata -> td_allocated_child_tasks) ) - 1;
-            KMP_DEBUG_ASSERT( children >= 0 );
-        }
+        // Predecrement simulated by "- 1" calculation
+        children = KMP_TEST_THEN_DEC32( (kmp_int32 *)(& taskdata -> td_allocated_child_tasks) ) - 1;
+        KMP_DEBUG_ASSERT( children >= 0 );
     }

     KA_TRACE(20, ("__kmp_free_task_and_ancestors(exit): T#%d task %p has %d children; "
                   "not freeing it yet\n", gtid, taskdata, children) );
 }

 //---------------------------------------------------------------------
 // __kmp_task_finish: bookkeeping to do when a task finishes execution
 // gtid: global thread ID for calling thread

openmp/trunk/runtime/test/tasking/bug_nested_proxy_task.c

// RUN: %libomp-compile -lpthread && %libomp-run
#include <stdio.h>
#include <omp.h>
#include <pthread.h>
#include "omp_my_sleep.h"

/*
 With task dependencies one can generate proxy tasks from an explicit task
 being executed by a serial task team. The OpenMP runtime library didn't
 expect that and tries to free the explicit task that is the parent of the
 proxy task still working in background. It therefore has incomplete children
 which triggers a debugging assertion.
*/

// Compiler-generated code (emulation)
typedef long kmp_intptr_t;
typedef int kmp_int32;

typedef char bool;

typedef struct ident {
    kmp_int32 reserved_1;   /**< might be used in Fortran; see above */
    kmp_int32 flags;        /**< also f.flags; KMP_IDENT_xxx flags; KMP_IDENT_KMPC identifies this union member */
    kmp_int32 reserved_2;   /**< not really used in Fortran any more; see above */
#if USE_ITT_BUILD
                            /* but currently used for storing region-specific ITT */
                            /* contextual information. */
#endif /* USE_ITT_BUILD */
    kmp_int32 reserved_3;   /**< source[4] in Fortran, do not use for C++ */
    char const *psource;    /**< String describing the source location.
                            The string is composed of semi-colon separated fields which describe the source file,
                            the function and a pair of line numbers that delimit the construct.
                            */
} ident_t;

typedef struct kmp_depend_info {
    kmp_intptr_t base_addr;
    size_t len;
    struct {
        bool in:1;
        bool out:1;
    } flags;
} kmp_depend_info_t;

struct kmp_task;
typedef kmp_int32 (* kmp_routine_entry_t)( kmp_int32, struct kmp_task * );

typedef struct kmp_task {           /* GEH: Shouldn't this be aligned somehow? */
    void * shareds;                 /**< pointer to block of pointers to shared vars */
    kmp_routine_entry_t routine;    /**< pointer to routine to call for executing task */
    kmp_int32 part_id;              /**< part id for the task */
} kmp_task_t;

#ifdef __cplusplus
extern "C" {
#endif
kmp_int32 __kmpc_global_thread_num ( ident_t * );
kmp_task_t*
__kmpc_omp_task_alloc( ident_t *loc_ref, kmp_int32 gtid, kmp_int32 flags,
                       size_t sizeof_kmp_task_t, size_t sizeof_shareds,
                       kmp_routine_entry_t task_entry );
void __kmpc_proxy_task_completed_ooo ( kmp_task_t *ptask );
kmp_int32 __kmpc_omp_task_with_deps ( ident_t *loc_ref, kmp_int32 gtid, kmp_task_t *new_task,
                                      kmp_int32 ndeps, kmp_depend_info_t *dep_list,
                                      kmp_int32 ndeps_noalias, kmp_depend_info_t *noalias_dep_list );
kmp_int32
__kmpc_omp_task( ident_t *loc_ref, kmp_int32 gtid, kmp_task_t *new_task );
#ifdef __cplusplus
}
#endif

void *target(void *task)
{
    my_sleep( 0.1 );
    __kmpc_proxy_task_completed_ooo((kmp_task_t*) task);
    return NULL;
}

pthread_t target_thread;

// User's code
int task_entry(kmp_int32 gtid, kmp_task_t *task)
{
    pthread_create(&target_thread, NULL, &target, task);
    return 0;
}

int main()
{
    int dep;

#pragma omp taskgroup
{
/*
 * Corresponds to:
    #pragma omp target nowait depend(out: dep)
    {
        my_sleep( 0.1 );
    }
*/
    kmp_depend_info_t dep_info;
    dep_info.base_addr = (long) &dep;
    dep_info.len = sizeof(int);
    // out = inout per spec and runtime expects this
    dep_info.flags.in = 1;
    dep_info.flags.out = 1;

    kmp_int32 gtid = __kmpc_global_thread_num(NULL);
    kmp_task_t *proxy_task = __kmpc_omp_task_alloc(NULL,gtid,17,sizeof(kmp_task_t),0,&task_entry);
    __kmpc_omp_task_with_deps(NULL,gtid,proxy_task,1,&dep_info,0,NULL);

    #pragma omp task depend(in: dep)
    {
/*
 * Corresponds to:
        #pragma omp target nowait depend(out: dep)
        {
            my_sleep( 0.1 );
        }
*/
        kmp_task_t *nested_proxy_task = __kmpc_omp_task_alloc(NULL,gtid,17,sizeof(kmp_task_t),0,&task_entry);
        __kmpc_omp_task(NULL,gtid,nested_proxy_task);
    }
}

    // only check that it didn't crash
    return 0;
}