This is an archive of the discontinued LLVM Phabricator instance.

Differential D19229

[STATS] Use partitioned timer scheme
Closed, Public

Authored by jlpeyton on Apr 18 2016, 10:39 AM.

Details

Reviewers

jcownie
tlwilmar

Commits

rG11dc82fa83b9: [STATS] Use partitioned timer scheme
rOMP268640: [STATS] Use partitioned timer scheme
rL268640: [STATS] Use partitioned timer scheme

Summary

This change replaces the current timers with ones that partition time properly. The current timers are nested, so if a new timer, B, starts while the current timer, A, is already running, A's time will include B's. To eliminate this problem, the partitioned timers stop the current timer (A), let the new timer (B) run, and, when the new timer finishes, restart the previously running timer (A). With this partitioning of time, a thread's timers all sum to the OMP_worker_thread_life time and can now easily show the percentage of time a thread spends in different parts of the runtime or user code.
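The stop/run/restart behaviour described above can be sketched as a small timer stack. This is only an illustration under stated assumptions, not the code from the patch: the class name `PartitionedTimers`, its `init`/`push`/`pop`/`total` methods, and the explicit tick arguments are all hypothetical (the patch itself goes through macros such as KMP_PUSH_PARTITIONED_TIMER and KMP_POP_PARTITIONED_TIMER). Time is passed in by the caller so the arithmetic is easy to follow.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a partitioned timer stack; not the runtime's class.
class PartitionedTimers {
public:
    // Start the bottom timer (for example, a thread-lifetime timer).
    void init(const std::string &name, uint64_t now) {
        stack_.push_back(name);
        last_ = now;
    }
    // Pause the running timer and start `name` on top of it.
    void push(const std::string &name, uint64_t now) {
        charge(now);
        stack_.push_back(name);
    }
    // Stop the running timer and resume the one beneath it, if any.
    void pop(uint64_t now) {
        charge(now);
        if (!stack_.empty())
            stack_.pop_back();
    }
    // Accumulated ticks charged to `name` so far.
    uint64_t total(const std::string &name) const {
        auto it = totals_.find(name);
        return it == totals_.end() ? 0 : it->second;
    }

private:
    // Charge the elapsed ticks to whichever timer is on top of the stack.
    void charge(uint64_t now) {
        if (!stack_.empty())
            totals_[stack_.back()] += now - last_;
        last_ = now;
    }
    std::vector<std::string> stack_;
    std::map<std::string, uint64_t> totals_;
    uint64_t last_ = 0;
};

// Timer B starts while A is running; A is not charged for B's time.
uint64_t demo_sum() {
    PartitionedTimers t;
    t.init("life", 0);
    t.push("A", 10);
    t.push("B", 30);
    t.pop(70);   // B is charged ticks 30..70
    t.pop(100);  // A is charged ticks 10..30 and 70..100
    t.pop(120);  // "life" is charged ticks 0..10 and 100..120
    return t.total("life") + t.total("A") + t.total("B");
}
```

Because each tick is charged to exactly one timer, the per-timer totals in `demo_sum` add up to the whole 120-tick lifetime, which is the property the summary relies on for percentage breakdowns.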

There is also a new state variable associated with each thread that records where the thread is when it executes a task. This corresponds to the OMP_task_* timers: for example, if time is spent in OMP_task_taskwait, then that thread executed tasks inside a #pragma omp taskwait construct.
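A block-scoped state change of the kind the KMP_SET_THREAD_STATE_BLOCK macro suggests could look roughly like the sketch below. The enumerators are the ones added to kmp_stats.h in this diff; the `ScopedThreadState` helper and the `demo_state_inside` function are hypothetical and only illustrate the save/set/restore idea, not how the runtime actually stores or updates the state.

```cpp
#include <cassert>

// States as listed in the stats_state_e enum added by this revision.
enum stats_state_e {
    IDLE,
    SERIAL_REGION,
    FORK_JOIN_BARRIER,
    PLAIN_BARRIER,
    TASKWAIT,
    TASKYIELD,
    TASKGROUP,
    IMPLICIT_TASK,
    EXPLICIT_TASK
};

// Hypothetical RAII helper: set a new state for the enclosing block and
// restore the previous state when the block exits.
class ScopedThreadState {
public:
    ScopedThreadState(stats_state_e &slot, stats_state_e next)
        : slot_(slot), saved_(slot) {
        slot_ = next;
    }
    ~ScopedThreadState() { slot_ = saved_; }
    ScopedThreadState(const ScopedThreadState &) = delete;
    ScopedThreadState &operator=(const ScopedThreadState &) = delete;

private:
    stats_state_e &slot_;   // the thread's state variable
    stats_state_e saved_;   // state to restore on scope exit
};

// A thread running its implicit task enters a taskwait and then leaves it.
stats_state_e demo_state_inside(stats_state_e &after) {
    stats_state_e state = IMPLICIT_TASK;
    stats_state_e inside;
    {
        ScopedThreadState guard(state, TASKWAIT);
        inside = state;  // a task run here would be attributed to the taskwait
    }
    after = state;       // previous state is back after the block
    return inside;
}
```

A task scheduler could consult the current state when it starts a task and pick the matching OMP_task_* timer, so that time spent running tasks inside a taskwait is reported separately from time spent in other constructs.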

The changes mostly consist of switching the macros to the new PARTITIONED_* macros, adding the new partitionedTimers class and its methods, and adding the new state logic.

Diff Detail

Repository: rL LLVM

Event Timeline

jlpeyton updated this revision to Diff 54080. Apr 18 2016, 10:39 AM

jlpeyton retitled this revision from to [STATS] Use partitioned timer scheme.

jlpeyton updated this object.

jlpeyton added reviewers: tlwilmar, jcownie.

jlpeyton set the repository for this revision to rL LLVM.

jlpeyton added a subscriber: openmp-commits.

Herald added a subscriber: sanjoy. Apr 18 2016, 10:39 AM

LGTM

This revision is now accepted and ready to land. Apr 28 2016, 9:13 AM

I am somewhat surprised by this set of changes. The set of thread states being maintained is very close to what OMPT supports. Are the differences worthwhile? If so, why not have someone from Intel participate in the OMPT discussions and lobby for changes? If the differences are not worthwhile, then what is the point of maintaining two independent implementations of state maintenance that largely duplicate one another? Why not just use the OMPT states? If states aren't being maintained properly for the OpenMP library (I know that not all of the OMPT state maintenance is implemented or correct), why doesn't someone from Intel with intimate knowledge of how the runtime works fix it? If I were to fix the OMPT implementation, I would consider simply putting OMPT state maintenance in the same places that Intel engineers who wrote the library put them (where they are in this patch).

After going to the effort of patching OMPT into this library, I've noticed the distinct lack of Intel investment in helping move the implementation forward. To me, this is a sign that Intel would prefer that the efforts to standardize the OMPT API would just wither and die. If I'm mistaken, please correct me and let's explore how to collaborate.

Closed by commit rL268640: [STATS] Use partitioned timer scheme (authored by jlpeyton). May 5 2016, 9:22 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path: openmp/trunk/runtime/src/

Size per changed file: 18 lines, 19 lines, 2 lines, 23 lines, 2 lines, 205 lines, 79 lines, 24 lines, 26 lines, 30 lines, 9 lines

Diff 56300

openmp/trunk/runtime/src/kmp_barrier.cpp

Show First 20 Lines • Show All 1,042 Lines • ▼ Show 20 Lines
/* If is_split is true, do a split barrier, otherwise, do a plain barrier		/* If is_split is true, do a split barrier, otherwise, do a plain barrier
If reduce is non-NULL, do a split reduction barrier, otherwise, do a split barrier		If reduce is non-NULL, do a split reduction barrier, otherwise, do a split barrier
Returns 0 if master thread, 1 if worker thread. */		Returns 0 if master thread, 1 if worker thread. */
int		int
__kmp_barrier(enum barrier_type bt, int gtid, int is_split, size_t reduce_size,		__kmp_barrier(enum barrier_type bt, int gtid, int is_split, size_t reduce_size,
void *reduce_data, void (*reduce)(void *, void *))		void *reduce_data, void (*reduce)(void *, void *))
{		{
KMP_TIME_DEVELOPER_BLOCK(KMP_barrier);		KMP_TIME_DEVELOPER_BLOCK(KMP_barrier);
		KMP_SET_THREAD_STATE_BLOCK(PLAIN_BARRIER);
		KMP_TIME_PARTITIONED_BLOCK(OMP_plain_barrier);
register int tid = __kmp_tid_from_gtid(gtid);		register int tid = __kmp_tid_from_gtid(gtid);
register kmp_info_t *this_thr = __kmp_threads[gtid];		register kmp_info_t *this_thr = __kmp_threads[gtid];
register kmp_team_t *team = this_thr->th.th_team;		register kmp_team_t *team = this_thr->th.th_team;
register int status = 0;		register int status = 0;
ident_t *loc = __kmp_threads[gtid]->th.th_ident;		ident_t *loc = __kmp_threads[gtid]->th.th_ident;
#if OMPT_SUPPORT		#if OMPT_SUPPORT
ompt_task_id_t my_task_id;		ompt_task_id_t my_task_id;
ompt_parallel_id_t my_parallel_id;		ompt_parallel_id_t my_parallel_id;
▲ Show 20 Lines • Show All 284 Lines • ▼ Show 20 Lines	if (!team->t.t_serialized) {
}		}
}		}
}		}


void		void
__kmp_join_barrier(int gtid)		__kmp_join_barrier(int gtid)
{		{
		KMP_TIME_PARTITIONED_BLOCK(OMP_fork_join_barrier);
		KMP_SET_THREAD_STATE_BLOCK(FORK_JOIN_BARRIER);
KMP_TIME_DEVELOPER_BLOCK(KMP_join_barrier);		KMP_TIME_DEVELOPER_BLOCK(KMP_join_barrier);
register kmp_info_t *this_thr = __kmp_threads[gtid];		register kmp_info_t *this_thr = __kmp_threads[gtid];
register kmp_team_t *team;		register kmp_team_t *team;
register kmp_uint nproc;		register kmp_uint nproc;
kmp_info_t *master_thread;		kmp_info_t *master_thread;
int tid;		int tid;
#ifdef KMP_DEBUG		#ifdef KMP_DEBUG
int team_id;		int team_id;
▲ Show 20 Lines • Show All 99 Lines • ▼ Show 20 Lines	/* From this point on, the team data structure may be deallocated at any time by the
master thread - it is unsafe to reference it in any of the worker threads. Any per-team		master thread - it is unsafe to reference it in any of the worker threads. Any per-team
data items that need to be referenced before the end of the barrier should be moved to		data items that need to be referenced before the end of the barrier should be moved to
the kmp_task_team_t structs. */		the kmp_task_team_t structs. */
if (KMP_MASTER_TID(tid)) {		if (KMP_MASTER_TID(tid)) {
if (__kmp_tasking_mode != tskm_immediate_exec) {		if (__kmp_tasking_mode != tskm_immediate_exec) {
__kmp_task_team_wait(this_thr, team		__kmp_task_team_wait(this_thr, team
USE_ITT_BUILD_ARG(itt_sync_obj) );		USE_ITT_BUILD_ARG(itt_sync_obj) );
}		}
		#if KMP_STATS_ENABLED
		// Have master thread flag the workers to indicate they are now waiting for
		// next parallel region, Also wake them up so they switch their timers to idle.
		for (int i=0; i<team->t.t_nproc; ++i) {
		kmp_info_t* team_thread = team->t.t_threads[i];
		if (team_thread == this_thr)
		continue;
		team_thread->th.th_stats->setIdleFlag();
		if (__kmp_dflt_blocktime != KMP_MAX_BLOCKTIME && team_thread->th.th_sleep_loc != NULL)
		__kmp_null_resume_wrapper(__kmp_gtid_from_thread(team_thread), team_thread->th.th_sleep_loc);
		}
		#endif
#if USE_ITT_BUILD		#if USE_ITT_BUILD
if (__itt_sync_create_ptr || KMP_ITT_DEBUG)		if (__itt_sync_create_ptr || KMP_ITT_DEBUG)
__kmp_itt_barrier_middle(gtid, itt_sync_obj);		__kmp_itt_barrier_middle(gtid, itt_sync_obj);
#endif /* USE_ITT_BUILD */		#endif /* USE_ITT_BUILD */

# if USE_ITT_BUILD && USE_ITT_NOTIFY		# if USE_ITT_BUILD && USE_ITT_NOTIFY
// Join barrier - report frame end		// Join barrier - report frame end
if ((__itt_frame_submit_v3_ptr || KMP_ITT_DEBUG) && __kmp_forkjoin_frames_mode &&		if ((__itt_frame_submit_v3_ptr || KMP_ITT_DEBUG) && __kmp_forkjoin_frames_mode &&
▲ Show 20 Lines • Show All 67 Lines • ▼ Show 20 Lines
#endif		#endif
}		}


// TODO release worker threads' fork barriers as we are ready instead of all at once		// TODO release worker threads' fork barriers as we are ready instead of all at once
void		void
__kmp_fork_barrier(int gtid, int tid)		__kmp_fork_barrier(int gtid, int tid)
{		{
		KMP_TIME_PARTITIONED_BLOCK(OMP_fork_join_barrier);
		KMP_SET_THREAD_STATE_BLOCK(FORK_JOIN_BARRIER);
KMP_TIME_DEVELOPER_BLOCK(KMP_fork_barrier);		KMP_TIME_DEVELOPER_BLOCK(KMP_fork_barrier);
kmp_info_t *this_thr = __kmp_threads[gtid];		kmp_info_t *this_thr = __kmp_threads[gtid];
kmp_team_t *team = (tid == 0) ? this_thr->th.th_team : NULL;		kmp_team_t *team = (tid == 0) ? this_thr->th.th_team : NULL;
#if USE_ITT_BUILD		#if USE_ITT_BUILD
void * itt_sync_obj = NULL;		void * itt_sync_obj = NULL;
#endif /* USE_ITT_BUILD */		#endif /* USE_ITT_BUILD */

KA_TRACE(10, ("__kmp_fork_barrier: T#%d(%d:%d) has arrived\n",		KA_TRACE(10, ("__kmp_fork_barrier: T#%d(%d:%d) has arrived\n",
▲ Show 20 Lines • Show All 190 Lines • Show Last 20 Lines

openmp/trunk/runtime/src/kmp_csupport.c

Show First 20 Lines • Show All 284 Lines • ▼ Show 20 Lines
#if (KMP_STATS_ENABLED)		#if (KMP_STATS_ENABLED)
int inParallel = __kmpc_in_parallel(loc);		int inParallel = __kmpc_in_parallel(loc);
if (inParallel)		if (inParallel)
{		{
KMP_COUNT_BLOCK(OMP_NESTED_PARALLEL);		KMP_COUNT_BLOCK(OMP_NESTED_PARALLEL);
}		}
else		else
{		{
KMP_STOP_EXPLICIT_TIMER(OMP_serial);
KMP_COUNT_BLOCK(OMP_PARALLEL);		KMP_COUNT_BLOCK(OMP_PARALLEL);
}		}
#endif		#endif

// maybe to save thr_state is enough here		// maybe to save thr_state is enough here
{		{
va_list ap;		va_list ap;
va_start( ap, microtask );		va_start( ap, microtask );
Show All 38 Lines

#if OMPT_SUPPORT		#if OMPT_SUPPORT
if (ompt_enabled) {		if (ompt_enabled) {
parent_team->t.t_implicit_task_taskdata[tid].		parent_team->t.t_implicit_task_taskdata[tid].
ompt_task_info.frame.reenter_runtime_frame = 0;		ompt_task_info.frame.reenter_runtime_frame = 0;
}		}
#endif		#endif
}		}
#if (KMP_STATS_ENABLED)
if (!inParallel)
KMP_START_EXPLICIT_TIMER(OMP_serial);
#endif
}		}

#if OMP_40_ENABLED		#if OMP_40_ENABLED
/*!		/*!
@ingroup PARALLEL		@ingroup PARALLEL
@param loc source location information		@param loc source location information
@param global_tid global thread number		@param global_tid global thread number
@param num_teams number of teams requested for the teams construct		@param num_teams number of teams requested for the teams construct
▲ Show 20 Lines • Show All 304 Lines • ▼ Show 20 Lines
@param global_tid thread id.		@param global_tid thread id.

Execute a barrier.		Execute a barrier.
*/		*/
void		void
__kmpc_barrier(ident_t *loc, kmp_int32 global_tid)		__kmpc_barrier(ident_t *loc, kmp_int32 global_tid)
{		{
KMP_COUNT_BLOCK(OMP_BARRIER);		KMP_COUNT_BLOCK(OMP_BARRIER);
KMP_TIME_BLOCK(OMP_barrier);
KC_TRACE( 10, ("__kmpc_barrier: called T#%d\n", global_tid ) );		KC_TRACE( 10, ("__kmpc_barrier: called T#%d\n", global_tid ) );

if (! TCR_4(__kmp_init_parallel))		if (! TCR_4(__kmp_init_parallel))
__kmp_parallel_initialize();		__kmp_parallel_initialize();

if ( __kmp_env_consistency_check ) {		if ( __kmp_env_consistency_check ) {
if ( loc == 0 ) {		if ( loc == 0 ) {
KMP_WARNING( ConstructIdentInvalid ); // ??? What does it mean for the user?		KMP_WARNING( ConstructIdentInvalid ); // ??? What does it mean for the user?
Show All 27 Lines	__kmpc_master(ident_t *loc, kmp_int32 global_tid)

KC_TRACE( 10, ("__kmpc_master: called T#%d\n", global_tid ) );		KC_TRACE( 10, ("__kmpc_master: called T#%d\n", global_tid ) );

if( ! TCR_4( __kmp_init_parallel ) )		if( ! TCR_4( __kmp_init_parallel ) )
__kmp_parallel_initialize();		__kmp_parallel_initialize();

if( KMP_MASTER_GTID( global_tid )) {		if( KMP_MASTER_GTID( global_tid )) {
KMP_COUNT_BLOCK(OMP_MASTER);		KMP_COUNT_BLOCK(OMP_MASTER);
KMP_START_EXPLICIT_TIMER(OMP_master);		KMP_PUSH_PARTITIONED_TIMER(OMP_master);
status = 1;		status = 1;
}		}

#if OMPT_SUPPORT && OMPT_TRACE		#if OMPT_SUPPORT && OMPT_TRACE
if (status) {		if (status) {
if (ompt_enabled &&		if (ompt_enabled &&
ompt_callbacks.ompt_callback(ompt_event_master_begin)) {		ompt_callbacks.ompt_callback(ompt_event_master_begin)) {
kmp_info_t *this_thr = __kmp_threads[ global_tid ];		kmp_info_t *this_thr = __kmp_threads[ global_tid ];
Show All 33 Lines
that executes the <tt>master</tt> region.		that executes the <tt>master</tt> region.
*/		*/
void		void
__kmpc_end_master(ident_t *loc, kmp_int32 global_tid)		__kmpc_end_master(ident_t *loc, kmp_int32 global_tid)
{		{
KC_TRACE( 10, ("__kmpc_end_master: called T#%d\n", global_tid ) );		KC_TRACE( 10, ("__kmpc_end_master: called T#%d\n", global_tid ) );

KMP_DEBUG_ASSERT( KMP_MASTER_GTID( global_tid ));		KMP_DEBUG_ASSERT( KMP_MASTER_GTID( global_tid ));
KMP_STOP_EXPLICIT_TIMER(OMP_master);		KMP_POP_PARTITIONED_TIMER();

#if OMPT_SUPPORT && OMPT_TRACE		#if OMPT_SUPPORT && OMPT_TRACE
kmp_info_t *this_thr = __kmp_threads[ global_tid ];		kmp_info_t *this_thr = __kmp_threads[ global_tid ];
kmp_team_t *team = this_thr -> th.th_team;		kmp_team_t *team = this_thr -> th.th_team;
if (ompt_enabled &&		if (ompt_enabled &&
ompt_callbacks.ompt_callback(ompt_event_master_end)) {		ompt_callbacks.ompt_callback(ompt_event_master_end)) {
int tid = __kmp_tid_from_gtid( global_tid );		int tid = __kmp_tid_from_gtid( global_tid );
ompt_callbacks.ompt_callback(ompt_event_master_end)(		ompt_callbacks.ompt_callback(ompt_event_master_end)(
▲ Show 20 Lines • Show All 308 Lines • ▼ Show 20 Lines
*/		*/
void		void
__kmpc_critical( ident_t * loc, kmp_int32 global_tid, kmp_critical_name * crit )		__kmpc_critical( ident_t * loc, kmp_int32 global_tid, kmp_critical_name * crit )
{		{
#if KMP_USE_DYNAMIC_LOCK		#if KMP_USE_DYNAMIC_LOCK
__kmpc_critical_with_hint(loc, global_tid, crit, omp_lock_hint_none);		__kmpc_critical_with_hint(loc, global_tid, crit, omp_lock_hint_none);
#else		#else
KMP_COUNT_BLOCK(OMP_CRITICAL);		KMP_COUNT_BLOCK(OMP_CRITICAL);
KMP_TIME_BLOCK(OMP_critical_wait); /* Time spent waiting to enter the critical section */		KMP_TIME_PARTITIONED_BLOCK(OMP_critical_wait); /* Time spent waiting to enter the critical section */
kmp_user_lock_p lck;		kmp_user_lock_p lck;

KC_TRACE( 10, ("__kmpc_critical: called T#%d\n", global_tid ) );		KC_TRACE( 10, ("__kmpc_critical: called T#%d\n", global_tid ) );

//TODO: add THR_OVHD_STATE		//TODO: add THR_OVHD_STATE

KMP_CHECK_USER_LOCK_INIT();		KMP_CHECK_USER_LOCK_INIT();

▲ Show 20 Lines • Show All 145 Lines • ▼ Show 20 Lines
# endif		# endif
KMP_I_LOCK_FUNC(ilk, set)(lck, global_tid);		KMP_I_LOCK_FUNC(ilk, set)(lck, global_tid);
}		}

#if USE_ITT_BUILD		#if USE_ITT_BUILD
__kmp_itt_critical_acquired( lck );		__kmp_itt_critical_acquired( lck );
#endif /* USE_ITT_BUILD */		#endif /* USE_ITT_BUILD */

		KMP_PUSH_PARTITIONED_TIMER(OMP_critical);
KA_TRACE( 15, ("__kmpc_critical: done T#%d\n", global_tid ));		KA_TRACE( 15, ("__kmpc_critical: done T#%d\n", global_tid ));
} // __kmpc_critical_with_hint		} // __kmpc_critical_with_hint

#endif // KMP_USE_DYNAMIC_LOCK		#endif // KMP_USE_DYNAMIC_LOCK

/*!		/*!
@ingroup WORK_SHARING		@ingroup WORK_SHARING
@param loc source location information.		@param loc source location information.
▲ Show 20 Lines • Show All 76 Lines • ▼ Show 20 Lines	#if OMPT_SUPPORT && OMPT_BLAME
if (ompt_enabled &&		if (ompt_enabled &&
ompt_callbacks.ompt_callback(ompt_event_release_critical)) {		ompt_callbacks.ompt_callback(ompt_event_release_critical)) {
ompt_callbacks.ompt_callback(ompt_event_release_critical)(		ompt_callbacks.ompt_callback(ompt_event_release_critical)(
(uint64_t) lck);		(uint64_t) lck);
}		}
#endif		#endif

#endif // KMP_USE_DYNAMIC_LOCK		#endif // KMP_USE_DYNAMIC_LOCK
KMP_STOP_EXPLICIT_TIMER(OMP_critical);		KMP_POP_PARTITIONED_TIMER();
KA_TRACE( 15, ("__kmpc_end_critical: done T#%d\n", global_tid ));		KA_TRACE( 15, ("__kmpc_end_critical: done T#%d\n", global_tid ));
}		}

/*!		/*!
@ingroup SYNCHRONIZATION		@ingroup SYNCHRONIZATION
@param loc source location information		@param loc source location information
@param global_tid thread id.		@param global_tid thread id.
@return one if the thread should execute the master block, zero otherwise		@return one if the thread should execute the master block, zero otherwise
▲ Show 20 Lines • Show All 105 Lines • ▼ Show 20 Lines
kmp_int32		kmp_int32
__kmpc_single(ident_t *loc, kmp_int32 global_tid)		__kmpc_single(ident_t *loc, kmp_int32 global_tid)
{		{
kmp_int32 rc = __kmp_enter_single( global_tid, loc, TRUE );		kmp_int32 rc = __kmp_enter_single( global_tid, loc, TRUE );

if (rc) {		if (rc) {
// We are going to execute the single statement, so we should count it.		// We are going to execute the single statement, so we should count it.
KMP_COUNT_BLOCK(OMP_SINGLE);		KMP_COUNT_BLOCK(OMP_SINGLE);
KMP_START_EXPLICIT_TIMER(OMP_single);		KMP_PUSH_PARTITIONED_TIMER(OMP_single);
}		}

#if OMPT_SUPPORT && OMPT_TRACE		#if OMPT_SUPPORT && OMPT_TRACE
kmp_info_t *this_thr = __kmp_threads[ global_tid ];		kmp_info_t *this_thr = __kmp_threads[ global_tid ];
kmp_team_t *team = this_thr -> th.th_team;		kmp_team_t *team = this_thr -> th.th_team;
int tid = __kmp_tid_from_gtid( global_tid );		int tid = __kmp_tid_from_gtid( global_tid );

if (ompt_enabled) {		if (ompt_enabled) {
Show All 26 Lines
Mark the end of a <tt>single</tt> construct. This function should		Mark the end of a <tt>single</tt> construct. This function should
only be called by the thread that executed the block of code protected		only be called by the thread that executed the block of code protected
by the `single` construct.		by the `single` construct.
*/		*/
void		void
__kmpc_end_single(ident_t *loc, kmp_int32 global_tid)		__kmpc_end_single(ident_t *loc, kmp_int32 global_tid)
{		{
__kmp_exit_single( global_tid );		__kmp_exit_single( global_tid );
KMP_STOP_EXPLICIT_TIMER(OMP_single);		KMP_POP_PARTITIONED_TIMER();

#if OMPT_SUPPORT && OMPT_TRACE		#if OMPT_SUPPORT && OMPT_TRACE
kmp_info_t *this_thr = __kmp_threads[ global_tid ];		kmp_info_t *this_thr = __kmp_threads[ global_tid ];
kmp_team_t *team = this_thr -> th.th_team;		kmp_team_t *team = this_thr -> th.th_team;
int tid = __kmp_tid_from_gtid( global_tid );		int tid = __kmp_tid_from_gtid( global_tid );

if (ompt_enabled &&		if (ompt_enabled &&
ompt_callbacks.ompt_callback(ompt_event_single_in_block_end)) {		ompt_callbacks.ompt_callback(ompt_event_single_in_block_end)) {
▲ Show 20 Lines • Show All 1,806 Lines • Show Last 20 Lines

openmp/trunk/runtime/src/kmp_dispatch.cpp

Show First 20 Lines • Show All 1,418 Lines • ▼ Show 20 Lines	) {
typedef typename traits_t< T >::floating_t DBL;		typedef typename traits_t< T >::floating_t DBL;
#if ( KMP_STATIC_STEAL_ENABLED && KMP_ARCH_X86_64 )		#if ( KMP_STATIC_STEAL_ENABLED && KMP_ARCH_X86_64 )
static const int ___kmp_size_type = sizeof( UT );		static const int ___kmp_size_type = sizeof( UT );
#endif		#endif

// This is potentially slightly misleading, schedule(runtime) will appear here even if the actual runtme schedule		// This is potentially slightly misleading, schedule(runtime) will appear here even if the actual runtme schedule
// is static. (Which points out a disadavantage of schedule(runtime): even when static scheduling is used it costs		// is static. (Which points out a disadavantage of schedule(runtime): even when static scheduling is used it costs
// more than a compile time choice to use static scheduling would.)		// more than a compile time choice to use static scheduling would.)
KMP_TIME_BLOCK(FOR_dynamic_scheduling);		KMP_TIME_PARTITIONED_BLOCK(FOR_dynamic_scheduling);

int status;		int status;
dispatch_private_info_template< T > * pr;		dispatch_private_info_template< T > * pr;
kmp_info_t * th = __kmp_threads[ gtid ];		kmp_info_t * th = __kmp_threads[ gtid ];
kmp_team_t * team = th -> th.th_team;		kmp_team_t * team = th -> th.th_team;

KMP_DEBUG_ASSERT( p_lb && p_ub && p_st ); // AC: these cannot be NULL		KMP_DEBUG_ASSERT( p_lb && p_ub && p_st ); // AC: these cannot be NULL
#ifdef KMP_DEBUG		#ifdef KMP_DEBUG
▲ Show 20 Lines • Show All 1,206 Lines • Show Last 20 Lines

openmp/trunk/runtime/src/kmp_runtime.c

Show First 20 Lines • Show All 1,537 Lines • ▼ Show 20 Lines	#endif
/* OMPT state */		/* OMPT state */
master_th->th.ompt_thread_info.state = ompt_state_work_parallel;		master_th->th.ompt_thread_info.state = ompt_state_work_parallel;
} else {		} else {
exit_runtime_p = &dummy;		exit_runtime_p = &dummy;
}		}
#endif		#endif

{		{
KMP_TIME_BLOCK(OMP_work);		KMP_TIME_PARTITIONED_BLOCK(OMP_parallel);
		KMP_SET_THREAD_STATE_BLOCK(IMPLICIT_TASK);
__kmp_invoke_microtask( microtask, gtid, 0, argc, parent_team->t.t_argv		__kmp_invoke_microtask( microtask, gtid, 0, argc, parent_team->t.t_argv
#if OMPT_SUPPORT		#if OMPT_SUPPORT
, exit_runtime_p		, exit_runtime_p
#endif		#endif
);		);
}		}

#if OMPT_SUPPORT		#if OMPT_SUPPORT
▲ Show 20 Lines • Show All 58 Lines • ▼ Show 20 Lines	#endif
__kmp_internal_fork( loc, gtid, parent_team );		__kmp_internal_fork( loc, gtid, parent_team );
KF_TRACE( 10, ( "__kmp_fork_call: after internal fork: root=%p, team=%p, master_th=%p, gtid=%d\n", root, parent_team, master_th, gtid ) );		KF_TRACE( 10, ( "__kmp_fork_call: after internal fork: root=%p, team=%p, master_th=%p, gtid=%d\n", root, parent_team, master_th, gtid ) );

/* Invoke microtask for MASTER thread */		/* Invoke microtask for MASTER thread */
KA_TRACE( 20, ("__kmp_fork_call: T#%d(%d:0) invoke microtask = %p\n",		KA_TRACE( 20, ("__kmp_fork_call: T#%d(%d:0) invoke microtask = %p\n",
gtid, parent_team->t.t_id, parent_team->t.t_pkfn ) );		gtid, parent_team->t.t_id, parent_team->t.t_pkfn ) );

{		{
KMP_TIME_BLOCK(OMP_work);		KMP_TIME_PARTITIONED_BLOCK(OMP_parallel);
		KMP_SET_THREAD_STATE_BLOCK(IMPLICIT_TASK);
if (! parent_team->t.t_invoke( gtid )) {		if (! parent_team->t.t_invoke( gtid )) {
KMP_ASSERT2( 0, "cannot invoke microtask for MASTER thread" );		KMP_ASSERT2( 0, "cannot invoke microtask for MASTER thread" );
}		}
}		}
KA_TRACE( 20, ("__kmp_fork_call: T#%d(%d:0) done microtask = %p\n",		KA_TRACE( 20, ("__kmp_fork_call: T#%d(%d:0) done microtask = %p\n",
gtid, parent_team->t.t_id, parent_team->t.t_pkfn ) );		gtid, parent_team->t.t_id, parent_team->t.t_pkfn ) );
KMP_MB(); /* Flush all pending memory write invalidates. */		KMP_MB(); /* Flush all pending memory write invalidates. */

▲ Show 20 Lines • Show All 103 Lines • ▼ Show 20 Lines	#endif
/* OMPT state */		/* OMPT state */
master_th->th.ompt_thread_info.state = ompt_state_work_parallel;		master_th->th.ompt_thread_info.state = ompt_state_work_parallel;
} else {		} else {
exit_runtime_p = &dummy;		exit_runtime_p = &dummy;
}		}
#endif		#endif

{		{
KMP_TIME_BLOCK(OMP_work);		KMP_TIME_PARTITIONED_BLOCK(OMP_parallel);
		KMP_SET_THREAD_STATE_BLOCK(IMPLICIT_TASK);
__kmp_invoke_microtask( microtask, gtid, 0, argc, parent_team->t.t_argv		__kmp_invoke_microtask( microtask, gtid, 0, argc, parent_team->t.t_argv
#if OMPT_SUPPORT		#if OMPT_SUPPORT
, exit_runtime_p		, exit_runtime_p
#endif		#endif
);		);
}		}

#if OMPT_SUPPORT		#if OMPT_SUPPORT
Show All 40 Lines	# endif
// Get args from parent team for teams construct		// Get args from parent team for teams construct
argv[i] = parent_team->t.t_argv[i];		argv[i] = parent_team->t.t_argv[i];
}		}
// AC: revert change made in __kmpc_serialized_parallel()		// AC: revert change made in __kmpc_serialized_parallel()
// because initial code in teams should have level=0		// because initial code in teams should have level=0
team->t.t_level--;		team->t.t_level--;
// AC: call special invoker for outer "parallel" of the teams construct		// AC: call special invoker for outer "parallel" of the teams construct
{		{
KMP_TIME_BLOCK(OMP_work);		KMP_TIME_PARTITIONED_BLOCK(OMP_parallel);
		KMP_SET_THREAD_STATE_BLOCK(IMPLICIT_TASK);
invoker(gtid);		invoker(gtid);
}		}
} else {		} else {
#endif /* OMP_40_ENABLED */		#endif /* OMP_40_ENABLED */
argv = args;		argv = args;
for( i=argc-1; i >= 0; --i )		for( i=argc-1; i >= 0; --i )
// TODO: revert workaround for Intel(R) 64 tracker #96		// TODO: revert workaround for Intel(R) 64 tracker #96
#if (KMP_ARCH_X86_64 || KMP_ARCH_ARM || KMP_ARCH_AARCH64) && KMP_OS_LINUX		#if (KMP_ARCH_X86_64 || KMP_ARCH_ARM || KMP_ARCH_AARCH64) && KMP_OS_LINUX
Show All 30 Lines	#endif
/* OMPT state */		/* OMPT state */
master_th->th.ompt_thread_info.state = ompt_state_work_parallel;		master_th->th.ompt_thread_info.state = ompt_state_work_parallel;
} else {		} else {
exit_runtime_p = &dummy;		exit_runtime_p = &dummy;
}		}
#endif		#endif

{		{
KMP_TIME_BLOCK(OMP_work);		KMP_TIME_PARTITIONED_BLOCK(OMP_parallel);
		KMP_SET_THREAD_STATE_BLOCK(IMPLICIT_TASK);
__kmp_invoke_microtask( microtask, gtid, 0, argc, args		__kmp_invoke_microtask( microtask, gtid, 0, argc, args
#if OMPT_SUPPORT		#if OMPT_SUPPORT
, exit_runtime_p		, exit_runtime_p
#endif		#endif
);		);
}		}

#if OMPT_SUPPORT		#if OMPT_SUPPORT
▲ Show 20 Lines • Show All 319 Lines • ▼ Show 20 Lines	#endif /* OMP_40_ENABLED */
}		}

/* Invoke microtask for MASTER thread */		/* Invoke microtask for MASTER thread */
KA_TRACE( 20, ("__kmp_fork_call: T#%d(%d:0) invoke microtask = %p\n",		KA_TRACE( 20, ("__kmp_fork_call: T#%d(%d:0) invoke microtask = %p\n",
gtid, team->t.t_id, team->t.t_pkfn ) );		gtid, team->t.t_id, team->t.t_pkfn ) );
} // END of timer KMP_fork_call block		} // END of timer KMP_fork_call block

{		{
KMP_TIME_BLOCK(OMP_work);		KMP_TIME_PARTITIONED_BLOCK(OMP_parallel);
		KMP_SET_THREAD_STATE_BLOCK(IMPLICIT_TASK);
// KMP_TIME_DEVELOPER_BLOCK(USER_master_invoke);		// KMP_TIME_DEVELOPER_BLOCK(USER_master_invoke);
if (! team->t.t_invoke( gtid )) {		if (! team->t.t_invoke( gtid )) {
KMP_ASSERT2( 0, "cannot invoke microtask for MASTER thread" );		KMP_ASSERT2( 0, "cannot invoke microtask for MASTER thread" );
}		}
}		}
KA_TRACE( 20, ("__kmp_fork_call: T#%d(%d:0) done microtask = %p\n",		KA_TRACE( 20, ("__kmp_fork_call: T#%d(%d:0) done microtask = %p\n",
gtid, team->t.t_id, team->t.t_pkfn ) );		gtid, team->t.t_id, team->t.t_pkfn ) );
KMP_MB(); /* Flush all pending memory write invalidates. */		KMP_MB(); /* Flush all pending memory write invalidates. */
▲ Show 20 Lines • Show All 3,253 Lines • ▼ Show 20 Lines	#if OMPT_SUPPORT
int tid = __kmp_tid_from_gtid(gtid);		int tid = __kmp_tid_from_gtid(gtid);
task_info->task_id = __ompt_task_id_new(tid);		task_info->task_id = __ompt_task_id_new(tid);
}		}
#endif		#endif

KMP_STOP_DEVELOPER_EXPLICIT_TIMER(USER_launch_thread_loop);		KMP_STOP_DEVELOPER_EXPLICIT_TIMER(USER_launch_thread_loop);
{		{
KMP_TIME_DEVELOPER_BLOCK(USER_worker_invoke);		KMP_TIME_DEVELOPER_BLOCK(USER_worker_invoke);
		KMP_TIME_PARTITIONED_BLOCK(OMP_parallel);
		KMP_SET_THREAD_STATE_BLOCK(IMPLICIT_TASK);
rc = (*pteam)->t.t_invoke( gtid );		rc = (*pteam)->t.t_invoke( gtid );
}		}
KMP_START_DEVELOPER_EXPLICIT_TIMER(USER_launch_thread_loop);		KMP_START_DEVELOPER_EXPLICIT_TIMER(USER_launch_thread_loop);
KMP_ASSERT( rc );		KMP_ASSERT( rc );

#if OMPT_SUPPORT		#if OMPT_SUPPORT
if (ompt_enabled) {		if (ompt_enabled) {
/* no frame set while outside task */		/* no frame set while outside task */
▲ Show 20 Lines • Show All 1,319 Lines • ▼ Show 20 Lines	if (ompt_enabled &&
ompt_callbacks.ompt_callback(ompt_event_implicit_task_begin)) {		ompt_callbacks.ompt_callback(ompt_event_implicit_task_begin)) {
ompt_callbacks.ompt_callback(ompt_event_implicit_task_begin)(		ompt_callbacks.ompt_callback(ompt_event_implicit_task_begin)(
my_parallel_id, my_task_id);		my_parallel_id, my_task_id);
}		}
#endif		#endif
#endif		#endif

{		{
KMP_TIME_BLOCK(OMP_work);		KMP_TIME_PARTITIONED_BLOCK(OMP_parallel);
		KMP_SET_THREAD_STATE_BLOCK(IMPLICIT_TASK);
rc = __kmp_invoke_microtask( (microtask_t) TCR_SYNC_PTR(team->t.t_pkfn),		rc = __kmp_invoke_microtask( (microtask_t) TCR_SYNC_PTR(team->t.t_pkfn),
gtid, tid, (int) team->t.t_argc, (void **) team->t.t_argv		gtid, tid, (int) team->t.t_argc, (void **) team->t.t_argv
#if OMPT_SUPPORT		#if OMPT_SUPPORT
, exit_runtime_p		, exit_runtime_p
#endif		#endif
);		);
}		}

▲ Show 20 Lines • Show All 791 Lines • Show Last 20 Lines

openmp/trunk/runtime/src/kmp_sched.cpp

Show First 20 Lines • Show All 78 Lines • ▼ Show 20 Lines	__kmp_for_static_init(
kmp_int32 *plastiter,		kmp_int32 *plastiter,
T *plower,		T *plower,
T *pupper,		T *pupper,
typename traits_t< T >::signed_t *pstride,		typename traits_t< T >::signed_t *pstride,
typename traits_t< T >::signed_t incr,		typename traits_t< T >::signed_t incr,
typename traits_t< T >::signed_t chunk		typename traits_t< T >::signed_t chunk
) {		) {
KMP_COUNT_BLOCK(OMP_FOR_static);		KMP_COUNT_BLOCK(OMP_FOR_static);
KMP_TIME_BLOCK (FOR_static_scheduling);		KMP_TIME_PARTITIONED_BLOCK(FOR_static_scheduling);

typedef typename traits_t< T >::unsigned_t UT;		typedef typename traits_t< T >::unsigned_t UT;
typedef typename traits_t< T >::signed_t ST;		typedef typename traits_t< T >::signed_t ST;
/* this all has to be changed back to TID and such.. */		/* this all has to be changed back to TID and such.. */
register kmp_int32 gtid = global_tid;		register kmp_int32 gtid = global_tid;
register kmp_uint32 tid;		register kmp_uint32 tid;
register kmp_uint32 nth;		register kmp_uint32 nth;
register UT trip_count;		register UT trip_count;
▲ Show 20 Lines • Show All 856 Lines • Show Last 20 Lines

openmp/trunk/runtime/src/kmp_stats.h

Show All 21 Lines
* Statistics accumulator.		* Statistics accumulator.
* Accumulates number of samples and computes min, max, mean, standard deviation on the fly.		* Accumulates number of samples and computes min, max, mean, standard deviation on the fly.
*		*
* Online variance calculation algorithm from http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#On-line_algorithm		* Online variance calculation algorithm from http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#On-line_algorithm
*/		*/

#include <limits>		#include <limits>
#include <math.h>		#include <math.h>
		#include <vector>
#include <string>		#include <string>
#include <stdint.h>		#include <stdint.h>
#include <new> // placement new		#include <new> // placement new
#include "kmp_stats_timing.h"		#include "kmp_stats_timing.h"

/*		/*
* Enable developer statistics here if you want them. They are more detailed than is useful for application characterisation and		* Enable developer statistics here if you want them. They are more detailed than is useful for application characterisation and
* are intended for the runtime library developer.		* are intended for the runtime library developer.
Show All 9 Lines	enum stats_flags_e {
noTotal = 1<<0, //!< do not show a TOTAL_aggregation for this statistic		noTotal = 1<<0, //!< do not show a TOTAL_aggregation for this statistic
onlyInMaster = 1<<1, //!< statistic is valid only for master		onlyInMaster = 1<<1, //!< statistic is valid only for master
noUnits = 1<<2, //!< statistic doesn't need units printed next to it in output		noUnits = 1<<2, //!< statistic doesn't need units printed next to it in output
notInMaster = 1<<3, //!< statistic is valid only for non-master threads		notInMaster = 1<<3, //!< statistic is valid only for non-master threads
logEvent = 1<<4 //!< statistic can be logged on the event timeline when KMP_STATS_EVENTS is on (valid only for timers)		logEvent = 1<<4 //!< statistic can be logged on the event timeline when KMP_STATS_EVENTS is on (valid only for timers)
};		};

/*!		/*!
		* @ingroup STATS_GATHERING
		* \brief the states which a thread can be in
		*
		*/
		enum stats_state_e {
		IDLE,
		SERIAL_REGION,
		FORK_JOIN_BARRIER,
		PLAIN_BARRIER,
		TASKWAIT,
		TASKYIELD,
		TASKGROUP,
		IMPLICIT_TASK,
		EXPLICIT_TASK
		};

		/*!
* \brief Add new counters under KMP_FOREACH_COUNTER() macro in kmp_stats.h		* \brief Add new counters under KMP_FOREACH_COUNTER() macro in kmp_stats.h
*		*
* @param macro a user defined macro that takes three arguments - macro(COUNTER_NAME, flags, arg)		* @param macro a user defined macro that takes three arguments - macro(COUNTER_NAME, flags, arg)
* @param arg a user defined argument to send to the user defined macro		* @param arg a user defined argument to send to the user defined macro
*		*
* \details A counter counts the occurrence of some event.		* \details A counter counts the occurrence of some event.
* Each thread accumulates its own count, at the end of execution the counts are aggregated treating each thread		* Each thread accumulates its own count, at the end of execution the counts are aggregated treating each thread
* as a separate measurement. (Unless onlyInMaster is set, in which case there's only a single measurement).		* as a separate measurement. (Unless onlyInMaster is set, in which case there's only a single measurement).
[35 lines not shown]
* For most timers the printing code also provides an aggregation over the thread totals. These are printed as TOTAL_foo.		* For most timers the printing code also provides an aggregation over the thread totals. These are printed as TOTAL_foo.
* The count is normally a time (in ticks), hence the name "timer". (But can be any value, so we use this for "number of arguments passed to fork"		* The count is normally a time (in ticks), hence the name "timer". (But can be any value, so we use this for "number of arguments passed to fork"
* as well).		* as well).
* For timers the threads are not significant, it's the individual observations that count, so the statistics are at that level.		* For timers the threads are not significant, it's the individual observations that count, so the statistics are at that level.
* Format is "macro(name, flags, arg)"		* Format is "macro(name, flags, arg)"
*		*
* @ingroup STATS_GATHERING2		* @ingroup STATS_GATHERING2
*/		*/
#define KMP_FOREACH_TIMER(macro, arg) \		#define KMP_FOREACH_TIMER(macro, arg) \
macro (OMP_start_end, stats_flags_e::onlyInMaster \| stats_flags_e::noTotal, arg) \		macro (OMP_worker_thread_life, 0, arg) \
macro (OMP_serial, stats_flags_e::onlyInMaster \| stats_flags_e::noTotal, arg) \
macro (OMP_work, 0, arg) \
macro (OMP_barrier, 0, arg) \
macro (FOR_static_scheduling, 0, arg) \		macro (FOR_static_scheduling, 0, arg) \
macro (FOR_dynamic_scheduling, 0, arg) \		macro (FOR_dynamic_scheduling, 0, arg) \
macro (OMP_task, 0, arg) \
macro (OMP_critical, 0, arg) \		macro (OMP_critical, 0, arg) \
macro (OMP_critical_wait, 0, arg) \		macro (OMP_critical_wait, 0, arg) \
macro (OMP_single, 0, arg) \		macro (OMP_single, 0, arg) \
macro (OMP_master, 0, arg) \		macro (OMP_master, 0, arg) \
		macro (OMP_idle, 0, arg) \
		macro (OMP_plain_barrier, 0, arg) \
		macro (OMP_fork_join_barrier, 0, arg) \
		macro (OMP_parallel, 0, arg) \
		macro (OMP_task_immediate, 0, arg) \
		macro (OMP_task_taskwait, 0, arg) \
		macro (OMP_task_taskyield, 0, arg) \
		macro (OMP_task_taskgroup, 0, arg) \
		macro (OMP_task_join_bar, 0, arg) \
		macro (OMP_task_plain_bar, 0, arg) \
		macro (OMP_serial, 0, arg) \
macro (OMP_set_numthreads, stats_flags_e::noUnits \| stats_flags_e::noTotal, arg) \		macro (OMP_set_numthreads, stats_flags_e::noUnits \| stats_flags_e::noTotal, arg) \
macro (OMP_PARALLEL_args, stats_flags_e::noUnits \| stats_flags_e::noTotal, arg) \		macro (OMP_PARALLEL_args, stats_flags_e::noUnits \| stats_flags_e::noTotal, arg) \
macro (FOR_static_iterations, stats_flags_e::noUnits \| stats_flags_e::noTotal, arg) \		macro (FOR_static_iterations, stats_flags_e::noUnits \| stats_flags_e::noTotal, arg) \
macro (FOR_dynamic_iterations,stats_flags_e::noUnits \| stats_flags_e::noTotal, arg) \		macro (FOR_dynamic_iterations,stats_flags_e::noUnits \| stats_flags_e::noTotal, arg) \
KMP_FOREACH_DEVELOPER_TIMER(macro, arg) \		KMP_FOREACH_DEVELOPER_TIMER(macro, arg) \
macro (LAST,0, arg)		macro (LAST,0, arg)


// OMP_start_end -- Time from when OpenMP is initialized until the stats are printed at exit		// OMP_start_end -- Time from when OpenMP is initialized until the stats are printed at exit
// OMP_serial -- Thread zero time executing serial code		// OMP_serial -- Thread zero time executing serial code
// OMP_work -- Elapsed time in code dispatched by a fork (measured in the thread)		// OMP_work -- Elapsed time in code dispatched by a fork (measured in the thread)
// OMP_barrier -- Time at "real" barriers (includes task time)		// OMP_barrier -- Time at "real" barriers (includes task time)
// FOR_static_scheduling -- Time spent doing scheduling for a static "for"		// FOR_static_scheduling -- Time spent doing scheduling for a static "for"
// FOR_dynamic_scheduling -- Time spent doing scheduling for a dynamic "for"		// FOR_dynamic_scheduling -- Time spent doing scheduling for a dynamic "for"
// OMP_task -- Time spent executing tasks		// OMP_idle -- Worker threads time spent waiting for inclusion in a parallel region
		// OMP_plain_barrier -- Time spent in a barrier construct
		// OMP_fork_join_barrier -- Time spent in the fork-join barrier surrounding a parallel region
		// OMP_parallel -- Time spent inside a parallel construct
		// OMP_task_immediate -- Time spent executing non-deferred tasks
		// OMP_task_taskwait -- Time spent executing tasks inside a taskwait construct
		// OMP_task_taskyield -- Time spent executing tasks inside a taskyield construct
		// OMP_task_taskgroup -- Time spent executing tasks inside a taskgroup construct
		// OMP_task_join_bar -- Time spent executing tasks inside a join barrier
		// OMP_task_plain_bar -- Time spent executing tasks inside a barrier construct
// OMP_single -- Time spent executing a "single" region		// OMP_single -- Time spent executing a "single" region
// OMP_master -- Time spent executing a "master" region		// OMP_master -- Time spent executing a "master" region
// OMP_set_numthreads -- Values passed to omp_set_num_threads		// OMP_set_numthreads -- Values passed to omp_set_num_threads
// OMP_PARALLEL_args -- Number of arguments passed to a parallel region		// OMP_PARALLEL_args -- Number of arguments passed to a parallel region
// FOR_static_iterations -- Number of available parallel chunks of work in a static for		// FOR_static_iterations -- Number of available parallel chunks of work in a static for
// FOR_dynamic_iterations -- Number of available parallel chunks of work in a dynamic for		// FOR_dynamic_iterations -- Number of available parallel chunks of work in a dynamic for
// Both adjust for any chunking, so if there were an iteration count of 20 but a chunk size of 10, we'd record 2.		// Both adjust for any chunking, so if there were an iteration count of 20 but a chunk size of 10, we'd record 2.

[51 lines not shown]
* \details Explicit timers are ones where we need to allocate a timer itself (as well as the accumulated timing statistics).		* \details Explicit timers are ones where we need to allocate a timer itself (as well as the accumulated timing statistics).
* We allocate these on a per-thread basis, and explicitly start and stop them.		* We allocate these on a per-thread basis, and explicitly start and stop them.
* Block timers just allocate the timer itself on the stack, and use the destructor to notice block exit; they don't		* Block timers just allocate the timer itself on the stack, and use the destructor to notice block exit; they don't
* need to be defined here.		* need to be defined here.
* The name here should be the same as that of a timer above.		* The name here should be the same as that of a timer above.
*		*
* @ingroup STATS_GATHERING		* @ingroup STATS_GATHERING
*/		*/
#define KMP_FOREACH_EXPLICIT_TIMER(macro, arg) \		#define KMP_FOREACH_EXPLICIT_TIMER(macro, arg) \
macro(OMP_serial, 0, arg) \		macro(OMP_worker_thread_life, 0, arg) \
macro(OMP_start_end, 0, arg) \		macro(FOR_static_scheduling, 0, arg) \
		macro(FOR_dynamic_scheduling, 0, arg) \
macro(OMP_critical, 0, arg) \		macro(OMP_critical, 0, arg) \
		macro(OMP_critical_wait, 0, arg) \
macro(OMP_single, 0, arg) \		macro(OMP_single, 0, arg) \
macro(OMP_master, 0, arg) \		macro(OMP_master, 0, arg) \
		macro(OMP_idle, 0, arg) \
		macro(OMP_plain_barrier, 0, arg) \
		macro(OMP_fork_join_barrier, 0, arg) \
		macro(OMP_parallel, 0, arg) \
		macro(OMP_task_immediate, 0, arg) \
		macro(OMP_task_taskwait, 0, arg) \
		macro(OMP_task_taskyield, 0, arg) \
		macro(OMP_task_taskgroup, 0, arg) \
		macro(OMP_task_join_bar, 0, arg) \
		macro(OMP_task_plain_bar, 0, arg) \
		macro(OMP_serial, 0, arg) \
KMP_FOREACH_EXPLICIT_DEVELOPER_TIMER(macro,arg) \		KMP_FOREACH_EXPLICIT_DEVELOPER_TIMER(macro,arg) \
macro(LAST, 0, arg)		macro(LAST, 0, arg)

#if (KMP_DEVELOPER_STATS)		#if (KMP_DEVELOPER_STATS)
# define KMP_FOREACH_EXPLICIT_DEVELOPER_TIMER(macro, arg) \		# define KMP_FOREACH_EXPLICIT_DEVELOPER_TIMER(macro, arg) \
macro(USER_launch_thread_loop, stats_flags_e::logEvent, arg)		macro(USER_launch_thread_loop, stats_flags_e::logEvent, arg)
#else		#else
# define KMP_FOREACH_EXPLICIT_DEVELOPER_TIMER(macro, arg)		# define KMP_FOREACH_EXPLICIT_DEVELOPER_TIMER(macro, arg)
#endif		#endif

#define ENUMERATE(name,ignore,prefix) prefix##name,		#define ENUMERATE(name,ignore,prefix) prefix##name,
enum timer_e {		enum timer_e {
KMP_FOREACH_TIMER(ENUMERATE, TIMER_)		KMP_FOREACH_TIMER(ENUMERATE, TIMER_)
};		};

enum explicit_timer_e {		enum explicit_timer_e {
KMP_FOREACH_EXPLICIT_TIMER(ENUMERATE, EXPLICIT_TIMER_)		KMP_FOREACH_EXPLICIT_TIMER(ENUMERATE, EXPLICIT_TIMER_)
};		};

enum counter_e {		enum counter_e {
KMP_FOREACH_COUNTER(ENUMERATE, COUNTER_)		KMP_FOREACH_COUNTER(ENUMERATE, COUNTER_)
};		};
#undef ENUMERATE		#undef ENUMERATE

		class timerPair {
		explicit_timer_e timer_index;
		timer_e timer;
		public:
		timerPair(explicit_timer_e ti, timer_e t) : timer_index(ti), timer(t) {}
		inline explicit_timer_e get_index() const { return timer_index; }
		inline timer_e get_timer() const { return timer; }
		bool operator==(const timerPair & rhs) {
		return this->get_index() == rhs.get_index();
		}
		bool operator!=(const timerPair & rhs) {
		return !(*this == rhs);
		}
		};

class statistic		class statistic
{		{
double minVal;		double minVal;
double maxVal;		double maxVal;
double meanVal;		double meanVal;
double m2;		double m2;
uint64_t sampleCount;		uint64_t sampleCount;

[51 lines not shown]

// Where we need explicitly to start and end the timer, this version can be used		// Where we need explicitly to start and end the timer, this version can be used
// Since these timers normally aren't nicely scoped, so don't have a good place to live		// Since these timers normally aren't nicely scoped, so don't have a good place to live
// on the stack of the thread, they're more work to use.		// on the stack of the thread, they're more work to use.
class explicitTimer		class explicitTimer
{		{
timeStat * stat;		timeStat * stat;
tsc_tick_count startTime;		tsc_tick_count startTime;
		tsc_tick_count pauseStartTime;
		tsc_tick_count::tsc_interval_t totalPauseTime;

public:		public:
explicitTimer () : stat(0), startTime(0) { }		explicitTimer () : stat(0), startTime(0), pauseStartTime(0), totalPauseTime() { }
explicitTimer (timeStat * s) : stat(s), startTime() { }		explicitTimer (timeStat * s) : stat(s), startTime(), pauseStartTime(0), totalPauseTime() { }

void setStat (timeStat *s) { stat = s; }		void setStat (timeStat *s) { stat = s; }
void start(timer_e timerEnumValue);		void start(timer_e timerEnumValue);
		void pause() { pauseStartTime = tsc_tick_count::now(); }
		void resume() { totalPauseTime += (tsc_tick_count::now() - pauseStartTime); }
void stop(timer_e timerEnumValue);		void stop(timer_e timerEnumValue);
void reset() { startTime = 0; }		void reset() { startTime = 0; pauseStartTime = 0; totalPauseTime = 0; }
};		};

// Where all you need is to time a block, this is enough.		// Where all you need is to time a block, this is enough.
// (It avoids the need to have an explicit end, leaving the scope suffices.)		// (It avoids the need to have an explicit end, leaving the scope suffices.)
class blockTimer : public explicitTimer		class blockTimer : public explicitTimer
{		{
timer_e timerEnumValue;		timer_e timerEnumValue;
public:		public:
blockTimer (timeStat * s, timer_e newTimerEnumValue) : timerEnumValue(newTimerEnumValue), explicitTimer(s) { start(timerEnumValue); }		blockTimer (timeStat * s, timer_e newTimerEnumValue) : timerEnumValue(newTimerEnumValue), explicitTimer(s) { start(timerEnumValue); }
~blockTimer() { stop(timerEnumValue); }		~blockTimer() { stop(timerEnumValue); }
};		};

		// Where you need to partition a thread's clock ticks into separate states
		// e.g., a partitionedTimers class with two timers of EXECUTING_TASK, and
		// DOING_NOTHING would render these conditions:
		// time(EXECUTING_TASK) + time(DOING_NOTHING) = total time thread is alive
		// No clock tick in the EXECUTING_TASK is a member of DOING_NOTHING and vice versa
		class partitionedTimers
		{
		private:
		explicitTimer* timers[EXPLICIT_TIMER_LAST+1];
		std::vector<timerPair> timer_stack;
		public:
		partitionedTimers();
		void add_timer(explicit_timer_e timer_index, explicitTimer* timer_pointer);
		void init(timerPair timer_index);
		void push(timerPair timer_index);
		void pop();
		void windup();
		};

		// Special wrapper around the partitioned timers to aid timing code blocks
		// It avoids the need to have an explicit end, leaving the scope suffices.
		class blockPartitionedTimer
		{
		partitionedTimers* part_timers;
		timerPair timer_pair;
		public:
		blockPartitionedTimer(partitionedTimers* pt, timerPair tp) : part_timers(pt), timer_pair(tp) { part_timers->push(timer_pair); }
		~blockPartitionedTimer() { part_timers->pop(); }
		};

		// Special wrapper around the thread state to aid in keeping state in code blocks
		// It avoids the need to have an explicit end, leaving the scope suffices.
		class blockThreadState
		{
		stats_state_e* state_pointer;
		stats_state_e old_state;
		public:
		blockThreadState(stats_state_e* thread_state_pointer, stats_state_e new_state) : state_pointer(thread_state_pointer), old_state(*thread_state_pointer) {
		*state_pointer = new_state;
		}
		~blockThreadState() { *state_pointer = old_state; }
		};

// If all you want is a count, then you can use this...		// If all you want is a count, then you can use this...
// The individual per-thread counts will be aggregated into a statistic at program exit.		// The individual per-thread counts will be aggregated into a statistic at program exit.
class counter		class counter
{		{
uint64_t value;		uint64_t value;
static const statInfo counterInfo[];		static const statInfo counterInfo[];

public:		public:
[142 lines not shown]	/* ****************************************************************
store "dummy" statistics before __kmp_create_worker() is called.		store "dummy" statistics before __kmp_create_worker() is called.

**************************************************************** */		**************************************************************** */
class kmp_stats_list {		class kmp_stats_list {
int gtid;		int gtid;
timeStat _timers[TIMER_LAST+1];		timeStat _timers[TIMER_LAST+1];
counter _counters[COUNTER_LAST+1];		counter _counters[COUNTER_LAST+1];
explicitTimer _explicitTimers[EXPLICIT_TIMER_LAST+1];		explicitTimer _explicitTimers[EXPLICIT_TIMER_LAST+1];
		partitionedTimers _partitionedTimers;
int _nestLevel; // one per thread		int _nestLevel; // one per thread
kmp_stats_event_vector _event_vector;		kmp_stats_event_vector _event_vector;
kmp_stats_list* next;		kmp_stats_list* next;
kmp_stats_list* prev;		kmp_stats_list* prev;
		stats_state_e state;
		int thread_is_idle_flag;
public:		public:
kmp_stats_list() : next(this) , prev(this) , _event_vector(), _nestLevel(0) {		kmp_stats_list() : _nestLevel(0), _event_vector(), next(this), prev(this),
		state(IDLE), thread_is_idle_flag(0) {
#define doInit(name,ignore1,ignore2) \		#define doInit(name,ignore1,ignore2) \
getExplicitTimer(EXPLICIT_TIMER_##name)->setStat(getTimer(TIMER_##name));		getExplicitTimer(EXPLICIT_TIMER_##name)->setStat(getTimer(TIMER_##name)); \
		_partitionedTimers.add_timer(EXPLICIT_TIMER_##name, getExplicitTimer(EXPLICIT_TIMER_##name));
KMP_FOREACH_EXPLICIT_TIMER(doInit,0);		KMP_FOREACH_EXPLICIT_TIMER(doInit,0);
#undef doInit		#undef doInit
}		}
~kmp_stats_list() { }		~kmp_stats_list() { }
inline timeStat * getTimer(timer_e idx) { return &_timers[idx]; }		inline timeStat * getTimer(timer_e idx) { return &_timers[idx]; }
inline counter * getCounter(counter_e idx) { return &_counters[idx]; }		inline counter * getCounter(counter_e idx) { return &_counters[idx]; }
inline explicitTimer * getExplicitTimer(explicit_timer_e idx) { return &_explicitTimers[idx]; }		inline explicitTimer * getExplicitTimer(explicit_timer_e idx) { return &_explicitTimers[idx]; }
		inline partitionedTimers * getPartitionedTimers() { return &_partitionedTimers; }
inline timeStat * getTimers() { return _timers; }		inline timeStat * getTimers() { return _timers; }
inline counter * getCounters() { return _counters; }		inline counter * getCounters() { return _counters; }
inline explicitTimer * getExplicitTimers() { return _explicitTimers; }		inline explicitTimer * getExplicitTimers() { return _explicitTimers; }
inline kmp_stats_event_vector & getEventVector() { return _event_vector; }		inline kmp_stats_event_vector & getEventVector() { return _event_vector; }
inline void resetEventVector() { _event_vector.reset(); }		inline void resetEventVector() { _event_vector.reset(); }
inline void incrementNestValue() { _nestLevel++; }		inline void incrementNestValue() { _nestLevel++; }
inline int getNestValue() { return _nestLevel; }		inline int getNestValue() { return _nestLevel; }
inline void decrementNestValue() { _nestLevel--; }		inline void decrementNestValue() { _nestLevel--; }
inline int getGtid() const { return gtid; }		inline int getGtid() const { return gtid; }
inline void setGtid(int newgtid) { gtid = newgtid; }		inline void setGtid(int newgtid) { gtid = newgtid; }
		inline void setState(stats_state_e newstate) { state = newstate; }
		inline stats_state_e getState() const { return state; }
		inline stats_state_e * getStatePointer() { return &state; }
		inline bool isIdle() { return thread_is_idle_flag==1; }
		inline void setIdleFlag() { thread_is_idle_flag = 1; }
		inline void resetIdleFlag() { thread_is_idle_flag = 0; }
kmp_stats_list* push_back(int gtid); // returns newly created list node		kmp_stats_list* push_back(int gtid); // returns newly created list node
inline void push_event(uint64_t start_time, uint64_t stop_time, int nest_level, timer_e name) {		inline void push_event(uint64_t start_time, uint64_t stop_time, int nest_level, timer_e name) {
_event_vector.push_back(start_time, stop_time, nest_level, name);		_event_vector.push_back(start_time, stop_time, nest_level, name);
}		}
void deallocate();		void deallocate();
class iterator;		class iterator;
kmp_stats_list::iterator begin();		kmp_stats_list::iterator begin();
kmp_stats_list::iterator end();		kmp_stats_list::iterator end();
[185 lines not shown]
* It should be noted that all statistics are reset when this macro is called.		* It should be noted that all statistics are reset when this macro is called.
*		*
* @ingroup STATS_GATHERING		* @ingroup STATS_GATHERING
*/		*/
#define KMP_OUTPUT_STATS(heading_string) \		#define KMP_OUTPUT_STATS(heading_string) \
__kmp_output_stats(heading_string)		__kmp_output_stats(heading_string)

/*!		/*!
		* \brief Initializes the partitioned timers to begin with name.
		*
		* @param name timer which you want this thread to begin with
		*
		* @ingroup STATS_GATHERING
		*/
		#define KMP_INIT_PARTITIONED_TIMERS(name) \
		__kmp_stats_thread_ptr->getPartitionedTimers()->init(timerPair(EXPLICIT_TIMER_##name, TIMER_##name))

		#define KMP_TIME_PARTITIONED_BLOCK(name) \
		blockPartitionedTimer __PBLOCKTIME__(__kmp_stats_thread_ptr->getPartitionedTimers(), \
		timerPair(EXPLICIT_TIMER_##name, TIMER_##name))

		#define KMP_PUSH_PARTITIONED_TIMER(name) \
		__kmp_stats_thread_ptr->getPartitionedTimers()->push(timerPair(EXPLICIT_TIMER_##name, TIMER_##name))

		#define KMP_POP_PARTITIONED_TIMER() \
		__kmp_stats_thread_ptr->getPartitionedTimers()->pop()

		#define KMP_SET_THREAD_STATE(state_name) \
		__kmp_stats_thread_ptr->setState(state_name)

		#define KMP_GET_THREAD_STATE() \
		__kmp_stats_thread_ptr->getState()

		#define KMP_SET_THREAD_STATE_BLOCK(state_name) \
		blockThreadState __BTHREADSTATE__(__kmp_stats_thread_ptr->getStatePointer(), state_name)

		/*!
* \brief resets all stats (counters to 0, timers to 0 elapsed ticks)		* \brief resets all stats (counters to 0, timers to 0 elapsed ticks)
*		*
* \details Reset all stats for all threads.		* \details Reset all stats for all threads.
*		*
* @ingroup STATS_GATHERING		* @ingroup STATS_GATHERING
*/		*/
#define KMP_RESET_STATS() __kmp_reset_stats()		#define KMP_RESET_STATS() __kmp_reset_stats()

[24 lines not shown]
#define KMP_OUTPUT_STATS(heading_string) ((void)0)		#define KMP_OUTPUT_STATS(heading_string) ((void)0)
#define KMP_RESET_STATS() ((void)0)		#define KMP_RESET_STATS() ((void)0)

#define KMP_TIME_DEVELOPER_BLOCK(n) ((void)0)		#define KMP_TIME_DEVELOPER_BLOCK(n) ((void)0)
#define KMP_COUNT_DEVELOPER_VALUE(n,v) ((void)0)		#define KMP_COUNT_DEVELOPER_VALUE(n,v) ((void)0)
#define KMP_COUNT_DEVELOPER_BLOCK(n) ((void)0)		#define KMP_COUNT_DEVELOPER_BLOCK(n) ((void)0)
#define KMP_START_DEVELOPER_EXPLICIT_TIMER(n) ((void)0)		#define KMP_START_DEVELOPER_EXPLICIT_TIMER(n) ((void)0)
#define KMP_STOP_DEVELOPER_EXPLICIT_TIMER(n) ((void)0)		#define KMP_STOP_DEVELOPER_EXPLICIT_TIMER(n) ((void)0)
		#define KMP_INIT_PARTITIONED_TIMERS(name) ((void)0)
		#define KMP_TIME_PARTITIONED_BLOCK(name) ((void)0)
		#define KMP_PUSH_PARTITIONED_TIMER(name) ((void)0)
		#define KMP_POP_PARTITIONED_TIMER() ((void)0)
		#define KMP_SET_THREAD_STATE(state_name) ((void)0)
		#define KMP_GET_THREAD_STATE() ((void)0)
		#define KMP_SET_THREAD_STATE_BLOCK(state_name) ((void)0)
#endif // KMP_STATS_ENABLED		#endif // KMP_STATS_ENABLED

#endif // KMP_STATS_H		#endif // KMP_STATS_H
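
The `blockThreadState` and `blockPartitionedTimer` wrappers in this header both rely on the scope-guard idiom: the constructor installs something and the destructor undoes it. The following is a minimal standalone sketch of that idiom for the thread state only. `SketchState`, `ScopedState`, and `demo_scoped_state` are hypothetical names used for illustration and are not part of the patch.

```cpp
#include <cassert>

// Illustrative subset of the states listed in stats_state_e (assumed names).
enum SketchState { IDLE, IMPLICIT_TASK, TASKWAIT };

// Same shape as blockThreadState: remember the old state, install the new
// one, and restore the old one when the guard goes out of scope.
class ScopedState {
    SketchState* slot;
    SketchState saved;
public:
    ScopedState(SketchState* s, SketchState next) : slot(s), saved(*s) {
        *slot = next;
    }
    ~ScopedState() { *slot = saved; }
    // A guard that could be copied would restore the state twice.
    ScopedState(const ScopedState&) = delete;
    ScopedState& operator=(const ScopedState&) = delete;
};

// Nested guards unwind in reverse order, so the state seen after each
// closing brace is the one that was active before the matching guard.
inline SketchState demo_scoped_state() {
    SketchState state = IDLE;
    {
        ScopedState outer(&state, IMPLICIT_TASK);
        assert(state == IMPLICIT_TASK);
        {
            ScopedState inner(&state, TASKWAIT);
            assert(state == TASKWAIT);
        }
        assert(state == IMPLICIT_TASK);
    }
    return state;
}
```

Because the restore happens in the destructor, early returns from the guarded block also put the previous state back, which is presumably why the patch offers a block form alongside `KMP_SET_THREAD_STATE`.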

openmp/trunk/runtime/src/kmp_stats.cpp

[151 lines not shown]	std::string statistic::format(char unit, bool total) const
return result;		return result;
}		}

/* ********************************************************** */		/* ********************************************************** */
/* *********** explicitTimer member functions *********** */		/* *********** explicitTimer member functions *********** */

void explicitTimer::start(timer_e timerEnumValue) {		void explicitTimer::start(timer_e timerEnumValue) {
startTime = tsc_tick_count::now();		startTime = tsc_tick_count::now();
		totalPauseTime = 0;
if(timeStat::logEvent(timerEnumValue)) {		if(timeStat::logEvent(timerEnumValue)) {
__kmp_stats_thread_ptr->incrementNestValue();		__kmp_stats_thread_ptr->incrementNestValue();
}		}
return;		return;
}		}

void explicitTimer::stop(timer_e timerEnumValue) {		void explicitTimer::stop(timer_e timerEnumValue) {
if (startTime.getValue() == 0)		if (startTime.getValue() == 0)
return;		return;

tsc_tick_count finishTime = tsc_tick_count::now();		tsc_tick_count finishTime = tsc_tick_count::now();

//stat->addSample ((tsc_tick_count::now() - startTime).ticks());		//stat->addSample ((tsc_tick_count::now() - startTime).ticks());
stat->addSample ((finishTime - startTime).ticks());		stat->addSample(((finishTime - startTime) - totalPauseTime).ticks());

if(timeStat::logEvent(timerEnumValue)) {		if(timeStat::logEvent(timerEnumValue)) {
__kmp_stats_thread_ptr->push_event(startTime.getValue() - __kmp_stats_start_time.getValue(), finishTime.getValue() - __kmp_stats_start_time.getValue(), __kmp_stats_thread_ptr->getNestValue(), timerEnumValue);		__kmp_stats_thread_ptr->push_event(startTime.getValue() - __kmp_stats_start_time.getValue(), finishTime.getValue() - __kmp_stats_start_time.getValue(), __kmp_stats_thread_ptr->getNestValue(), timerEnumValue);
__kmp_stats_thread_ptr->decrementNestValue();		__kmp_stats_thread_ptr->decrementNestValue();
}		}

/* We accept the risk that we drop a sample because it really did start at t==0. */		/* We accept the risk that we drop a sample because it really did start at t==0. */
startTime = 0;		startTime = 0;
return;		return;
}		}

		/* ************************************************************** */
		/* *********** partitionedTimers member functions *********** */
		partitionedTimers::partitionedTimers() {
		timer_stack.reserve(8);
		}

		// add a timer to this collection of partitioned timers.
		void partitionedTimers::add_timer(explicit_timer_e timer_index, explicitTimer* timer_pointer) {
		KMP_DEBUG_ASSERT((int)timer_index < (int)EXPLICIT_TIMER_LAST+1);
		timers[timer_index] = timer_pointer;
		}

		// initialize the partitioned timers to an initial timer
		void partitionedTimers::init(timerPair init_timer_pair) {
		KMP_DEBUG_ASSERT(this->timer_stack.size() == 0);
		timer_stack.push_back(init_timer_pair);
		timers[init_timer_pair.get_index()]->start(init_timer_pair.get_timer());
		}

		// stop/save the current timer, and start the new timer (timer_pair)
		// There is a special condition where if the current timer is equal to
		// the one you are trying to push, then it only manipulates the stack,
		// and it won't stop/start the currently running timer.
		void partitionedTimers::push(timerPair timer_pair) {
		// get the current timer
		// stop current timer
		// push new timer
		// start the new timer
		KMP_DEBUG_ASSERT(this->timer_stack.size() > 0);
		timerPair current_timer = timer_stack.back();
		timer_stack.push_back(timer_pair);
		if(current_timer != timer_pair) {
		timers[current_timer.get_index()]->pause();
		timers[timer_pair.get_index()]->start(timer_pair.get_timer());
		}
		}

		// stop/discard the current timer, and start the previously saved timer
		void partitionedTimers::pop() {
		// get the current timer
		// stop current timer
		// pop current timer
		// get the new current timer and start it back up
		KMP_DEBUG_ASSERT(this->timer_stack.size() > 1);
		timerPair current_timer = timer_stack.back();
		timer_stack.pop_back();
		timerPair new_timer = timer_stack.back();
		if(current_timer != new_timer) {
		timers[current_timer.get_index()]->stop(current_timer.get_timer());
		timers[new_timer.get_index()]->resume();
		}
		}

		// Wind up all the currently running timers.
		// This pops off all the timers from the stack and clears the stack
		// After this is called, init() must be run again to initialize the
		// stack of timers
		void partitionedTimers::windup() {
		while(timer_stack.size() > 1) {
		this->pop();
		}
		if(timer_stack.size() > 0) {
		timerPair last_timer = timer_stack.back();
		timer_stack.pop_back();
		timers[last_timer.get_index()]->stop(last_timer.get_timer());
		}
		}

/* ******************************************************************* */		/* ******************************************************************* */
/* *********** kmp_stats_event_vector member functions *********** */		/* *********** kmp_stats_event_vector member functions *********** */

void kmp_stats_event_vector::deallocate() {		void kmp_stats_event_vector::deallocate() {
__kmp_free(events);		__kmp_free(events);
internal_size = 0;		internal_size = 0;
allocated_size = 0;		allocated_size = 0;
events = NULL;		events = NULL;
[199 lines not shown]

void kmp_stats_output_module::windupExplicitTimers()		void kmp_stats_output_module::windupExplicitTimers()
{		{
// Wind up any explicit timers. We assume that it's fair at this point to just walk all the explicit timers in all threads		// Wind up any explicit timers. We assume that it's fair at this point to just walk all the explicit timers in all threads
// and say "it's over".		// and say "it's over".
// If the timer wasn't running, this won't record anything anyway.		// If the timer wasn't running, this won't record anything anyway.
kmp_stats_list::iterator it;		kmp_stats_list::iterator it;
for(it = __kmp_stats_list.begin(); it != __kmp_stats_list.end(); it++) {		for(it = __kmp_stats_list.begin(); it != __kmp_stats_list.end(); it++) {
		kmp_stats_list* ptr = *it;
		ptr->getPartitionedTimers()->windup();
for (int timer=0; timer<EXPLICIT_TIMER_LAST; timer++) {		for (int timer=0; timer<EXPLICIT_TIMER_LAST; timer++) {
(*it)->getExplicitTimer(explicit_timer_e(timer))->stop((timer_e)timer);		ptr->getExplicitTimer(explicit_timer_e(timer))->stop((timer_e)timer);
}		}
}		}
}		}

void kmp_stats_output_module::printPloticusFile() {		void kmp_stats_output_module::printPloticusFile() {
int i;		int i;
int size = __kmp_stats_list.size();		int size = __kmp_stats_list.size();
FILE* plotOut = fopen(plotFileName, "w+");		FILE* plotOut = fopen(plotFileName, "w+");
[180 lines not shown]	for(it = __kmp_stats_list.begin(); it != __kmp_stats_list.end(); it++) {
for (int c = 0; c<COUNTER_LAST; c++)		for (int c = 0; c<COUNTER_LAST; c++)
counters[c].reset();		counters[c].reset();

for (int t=0; t<EXPLICIT_TIMER_LAST; t++)		for (int t=0; t<EXPLICIT_TIMER_LAST; t++)
eTimers[t].reset();		eTimers[t].reset();

// reset the event vector so all previous events are "erased"		// reset the event vector so all previous events are "erased"
(*it)->resetEventVector();		(*it)->resetEventVector();

// May need to restart the explicit timers in thread zero?
}		}
KMP_START_EXPLICIT_TIMER(OMP_serial);
KMP_START_EXPLICIT_TIMER(OMP_start_end);
}		}

// This function will reset all stats and stop all threads' explicit timers if they haven't been stopped already.		// This function will reset all stats and stop all threads' explicit timers if they haven't been stopped already.
void __kmp_output_stats(const char * heading)		void __kmp_output_stats(const char * heading)
{		{
__kmp_stats_global_output.outputStats(heading);		__kmp_stats_global_output.outputStats(heading);
__kmp_reset_stats();		__kmp_reset_stats();
}		}
[16 lines not shown]
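
The push/pop logic in `partitionedTimers` can be followed more easily with a deterministic clock. The sketch below is a simplified model, not the runtime code: ticks are passed in as plain integers instead of being read from `tsc_tick_count::now()`, timers are identified by small integers instead of `timerPair`, and the `startTime == 0` "not running" check in `explicitTimer::stop` is left out. All names (`TimerSketch`, `PartitionSketch`, `demo_partition`) are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One timer: accumulates (end - start) minus the time it spent paused.
struct TimerSketch {
    int64_t start = 0, pause_start = 0, paused = 0, total = 0;
    void begin(int64_t now)  { start = now; paused = 0; }
    void pause(int64_t now)  { pause_start = now; }
    void resume(int64_t now) { paused += now - pause_start; }
    void end(int64_t now)    { total += (now - start) - paused; }
};

// Stack of timer ids; only the timer on top of the stack is accumulating.
class PartitionSketch {
    std::vector<TimerSketch> timers;
    std::vector<int> stack;
public:
    explicit PartitionSketch(int n) : timers(n) {}
    void init(int id, int64_t now) {
        assert(stack.empty());
        stack.push_back(id);
        timers[id].begin(now);
    }
    void push(int id, int64_t now) {
        assert(!stack.empty());
        int cur = stack.back();
        stack.push_back(id);
        if (cur != id) {            // same timer pushed again: stack only
            timers[cur].pause(now);
            timers[id].begin(now);
        }
    }
    void pop(int64_t now) {
        assert(stack.size() > 1);
        int cur = stack.back();
        stack.pop_back();
        int prev = stack.back();
        if (cur != prev) {
            timers[cur].end(now);
            timers[prev].resume(now);
        }
    }
    void windup(int64_t now) {
        while (stack.size() > 1) pop(now);
        if (!stack.empty()) {
            timers[stack.back()].end(now);
            stack.pop_back();
        }
    }
    int64_t total(int id) const { return timers[id].total; }
};

// A thread alive for 100 ticks that enters a parallel region at tick 10,
// waits in a barrier from 40 to 55, and leaves the region at tick 90.
inline std::vector<int64_t> demo_partition() {
    enum { LIFE = 0, PARALLEL = 1, BARRIER = 2 };
    PartitionSketch p(3);
    p.init(LIFE, 0);
    p.push(PARALLEL, 10);
    p.push(BARRIER, 40);
    p.pop(55);
    p.pop(90);
    p.windup(100);
    return { p.total(LIFE), p.total(PARALLEL), p.total(BARRIER) };
}
```

In this scenario the three totals come out as 20, 65, and 15 ticks, which sum to the 100 ticks the thread was alive. That is the partitioning property the summary describes: no tick is counted by more than one timer.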

openmp/trunk/runtime/src/kmp_stats_timing.h

[34 lines not shown]	class tsc_interval_t {
explicit tsc_interval_t(int64_t _value) : value(_value) {}		explicit tsc_interval_t(int64_t _value) : value(_value) {}
public:		public:
tsc_interval_t() : value(0) {}; // Construct 0 time duration		tsc_interval_t() : value(0) {}; // Construct 0 time duration
#if KMP_HAVE_TICK_TIME		#if KMP_HAVE_TICK_TIME
double seconds() const; // Return the length of a time interval in seconds		double seconds() const; // Return the length of a time interval in seconds
#endif		#endif
double ticks() const { return double(value); }		double ticks() const { return double(value); }
int64_t getValue() const { return value; }		int64_t getValue() const { return value; }
		tsc_interval_t& operator=(int64_t nvalue) { value = nvalue; return *this; }

friend class tsc_tick_count;		friend class tsc_tick_count;

friend tsc_interval_t operator-(		friend tsc_interval_t operator-(const tsc_tick_count& t1,
const tsc_tick_count t1, const tsc_tick_count t0);		const tsc_tick_count& t0);
		friend tsc_interval_t operator-(const tsc_tick_count::tsc_interval_t& i1,
		const tsc_tick_count::tsc_interval_t& i0);
		friend tsc_interval_t& operator+=(tsc_tick_count::tsc_interval_t& i1,
		const tsc_tick_count::tsc_interval_t& i0);
};		};

#if KMP_HAVE___BUILTIN_READCYCLECOUNTER		#if KMP_HAVE___BUILTIN_READCYCLECOUNTER
tsc_tick_count() : my_count(static_cast<int64_t>(__builtin_readcyclecounter())) {}		tsc_tick_count() : my_count(static_cast<int64_t>(__builtin_readcyclecounter())) {}
#elif KMP_HAVE___RDTSC		#elif KMP_HAVE___RDTSC
tsc_tick_count() : my_count(static_cast<int64_t>(__rdtsc())) {};		tsc_tick_count() : my_count(static_cast<int64_t>(__rdtsc())) {};
#else		#else
# error Must have high resolution timer defined		# error Must have high resolution timer defined
#endif		#endif
tsc_tick_count(int64_t value) : my_count(value) {};		tsc_tick_count(int64_t value) : my_count(value) {};
int64_t getValue() const { return my_count; }		int64_t getValue() const { return my_count; }
tsc_tick_count later (tsc_tick_count const other) const {		tsc_tick_count later (tsc_tick_count const other) const {
return my_count > other.my_count ? (*this) : other;		return my_count > other.my_count ? (*this) : other;
}		}
tsc_tick_count earlier(tsc_tick_count const other) const {		tsc_tick_count earlier(tsc_tick_count const other) const {
return my_count < other.my_count ? (*this) : other;		return my_count < other.my_count ? (*this) : other;
}		}
#if KMP_HAVE_TICK_TIME		#if KMP_HAVE_TICK_TIME
static double tick_time(); // returns seconds per cycle (period) of clock		static double tick_time(); // returns seconds per cycle (period) of clock
#endif		#endif
static tsc_tick_count now() { return tsc_tick_count(); } // returns the rdtsc register value		static tsc_tick_count now() { return tsc_tick_count(); } // returns the rdtsc register value
friend tsc_tick_count::tsc_interval_t operator-(const tsc_tick_count t1, const tsc_tick_count t0);		friend tsc_tick_count::tsc_interval_t operator-(const tsc_tick_count& t1, const tsc_tick_count& t0);
};		};

inline tsc_tick_count::tsc_interval_t operator-(const tsc_tick_count t1, const tsc_tick_count t0)		inline tsc_tick_count::tsc_interval_t operator-(const tsc_tick_count& t1, const tsc_tick_count& t0)
{		{
return tsc_tick_count::tsc_interval_t( t1.my_count-t0.my_count );		return tsc_tick_count::tsc_interval_t( t1.my_count-t0.my_count );
}		}

		inline tsc_tick_count::tsc_interval_t operator-(const tsc_tick_count::tsc_interval_t& i1, const tsc_tick_count::tsc_interval_t& i0)
		{
		return tsc_tick_count::tsc_interval_t( i1.value-i0.value );
		}

		inline tsc_tick_count::tsc_interval_t& operator+=(tsc_tick_count::tsc_interval_t& i1, const tsc_tick_count::tsc_interval_t& i0)
		{
		i1.value += i0.value;
		return i1;
		}

#if KMP_HAVE_TICK_TIME		#if KMP_HAVE_TICK_TIME
inline double tsc_tick_count::tsc_interval_t::seconds() const		inline double tsc_tick_count::tsc_interval_t::seconds() const
{		{
return value*tick_time();		return value*tick_time();
}		}
#endif		#endif

extern std::string formatSI(double interval, int width, char unit);		extern std::string formatSI(double interval, int width, char unit);
Show All 12 Lines
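
Note on the interval operators added above: `operator-` and `operator+=` on `tsc_interval_t` let elapsed intervals be subtracted and accumulated, which is what a partitioned timer scheme needs. The following is a minimal sketch of that scheme, not the code from this patch. `Interval`, `PartitionedTimers`, and the explicit `now` arguments are hypothetical stand-ins (the runtime would read the cycle counter instead), chosen so the arithmetic is deterministic.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for tsc_tick_count::tsc_interval_t: a signed tick count.
struct Interval {
    int64_t value;
    explicit Interval(int64_t v = 0) : value(v) {}
};

inline Interval operator-(const Interval& a, const Interval& b) {
    return Interval(a.value - b.value);
}

inline Interval& operator+=(Interval& a, const Interval& b) {
    a.value += b.value;
    return a;
}

// Hypothetical partitioned timer stack: only the timer on top accumulates time.
class PartitionedTimers {
public:
    explicit PartitionedTimers(std::size_t n) : totals_(n), last_(0) {}

    // Pause the running timer (if any) and start timing `id`.
    void push(std::size_t id, int64_t now) {
        if (!stack_.empty()) charge(now);
        stack_.push_back(id);
        last_ = now;
    }

    // Stop the running timer and resume the one beneath it.
    void pop(int64_t now) {
        if (stack_.empty()) return;
        charge(now);
        stack_.pop_back();
        last_ = now;
    }

    int64_t total(std::size_t id) const { return totals_[id].value; }

private:
    // Credit the time since the last push/pop to the timer on top of the stack.
    void charge(int64_t now) {
        totals_[stack_.back()] += Interval(now) - Interval(last_);
    }

    std::vector<Interval> totals_;
    std::vector<std::size_t> stack_;
    int64_t last_;
};
```

With timer 0 pushed at tick 0, timer 1 pushed at tick 10 and popped at tick 25, and timer 0 popped at tick 40, timer 0 is credited 25 ticks and timer 1 is credited 15. The two totals sum to the 40-tick lifetime, which is the property the summary describes.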

openmp/trunk/runtime/src/kmp_tasking.c

Show All 30 Lines
static void __kmp_enable_tasking( kmp_task_team_t *task_team, kmp_info_t *this_thr );		static void __kmp_enable_tasking( kmp_task_team_t *task_team, kmp_info_t *this_thr );
static void __kmp_alloc_task_deque( kmp_info_t *thread, kmp_thread_data_t *thread_data );		static void __kmp_alloc_task_deque( kmp_info_t *thread, kmp_thread_data_t *thread_data );
static int __kmp_realloc_task_threads_data( kmp_info_t *thread, kmp_task_team_t *task_team );		static int __kmp_realloc_task_threads_data( kmp_info_t *thread, kmp_task_team_t *task_team );

#ifdef OMP_41_ENABLED		#ifdef OMP_41_ENABLED
static void __kmp_bottom_half_finish_proxy( kmp_int32 gtid, kmp_task_t * ptask );		static void __kmp_bottom_half_finish_proxy( kmp_int32 gtid, kmp_task_t * ptask );
#endif		#endif

static inline void __kmp_null_resume_wrapper(int gtid, volatile void *flag) {
if (!flag) return;
// Attempt to wake up a thread: examine its type and call appropriate template
switch (((kmp_flag_64 *)flag)->get_type()) {
case flag32: __kmp_resume_32(gtid, NULL); break;
case flag64: __kmp_resume_64(gtid, NULL); break;
case flag_oncore: __kmp_resume_oncore(gtid, NULL); break;
}
}

#ifdef BUILD_TIED_TASK_STACK		#ifdef BUILD_TIED_TASK_STACK

//---------------------------------------------------------------------------		//---------------------------------------------------------------------------
// __kmp_trace_task_stack: print the tied tasks from the task stack in order		// __kmp_trace_task_stack: print the tied tasks from the task stack in order
// from top to bottom		// from top to bottom
//		//
// gtid: global thread identifier for thread containing stack		// gtid: global thread identifier for thread containing stack
// thread_data: thread data for task team thread containing stack		// thread_data: thread data for task team thread containing stack
▲ Show 20 Lines • Show All 1,145 Lines • ▼ Show 20 Lines	if (__kmp_omp_cancellation) {
}		}
}		}

//		//
// Invoke the task routine and pass in relevant data.		// Invoke the task routine and pass in relevant data.
// Thunks generated by gcc take a different argument list.		// Thunks generated by gcc take a different argument list.
//		//
if (!discard) {		if (!discard) {
		#if KMP_STATS_ENABLED
KMP_COUNT_BLOCK(TASK_executed);		KMP_COUNT_BLOCK(TASK_executed);
KMP_TIME_BLOCK (OMP_task);		switch(KMP_GET_THREAD_STATE()) {
		case FORK_JOIN_BARRIER: KMP_PUSH_PARTITIONED_TIMER(OMP_task_join_bar); break;
		case PLAIN_BARRIER: KMP_PUSH_PARTITIONED_TIMER(OMP_task_plain_bar); break;
		case TASKYIELD: KMP_PUSH_PARTITIONED_TIMER(OMP_task_taskyield); break;
		case TASKWAIT: KMP_PUSH_PARTITIONED_TIMER(OMP_task_taskwait); break;
		case TASKGROUP: KMP_PUSH_PARTITIONED_TIMER(OMP_task_taskgroup); break;
		default: KMP_PUSH_PARTITIONED_TIMER(OMP_task_immediate); break;
		}
		#endif // KMP_STATS_ENABLED
#endif // OMP_40_ENABLED		#endif // OMP_40_ENABLED

#if OMPT_SUPPORT && OMPT_TRACE		#if OMPT_SUPPORT && OMPT_TRACE
/* let OMPT know that we're about to run this task */		/* let OMPT know that we're about to run this task */
if (ompt_enabled &&		if (ompt_enabled &&
ompt_callbacks.ompt_callback(ompt_event_task_switch))		ompt_callbacks.ompt_callback(ompt_event_task_switch))
{		{
ompt_callbacks.ompt_callback(ompt_event_task_switch)(		ompt_callbacks.ompt_callback(ompt_event_task_switch)(
current_task->ompt_task_info.task_id,		current_task->ompt_task_info.task_id,
taskdata->ompt_task_info.task_id);		taskdata->ompt_task_info.task_id);
}		}
#endif		#endif

#ifdef KMP_GOMP_COMPAT		#ifdef KMP_GOMP_COMPAT
if (taskdata->td_flags.native) {		if (taskdata->td_flags.native) {
((void (*)(void *))(*(task->routine)))(task->shareds);		((void (*)(void *))(*(task->routine)))(task->shareds);
}		}
else		else
#endif /* KMP_GOMP_COMPAT */		#endif /* KMP_GOMP_COMPAT */
{		{
(*(task->routine))(gtid, task);		(*(task->routine))(gtid, task);
}		}
		KMP_POP_PARTITIONED_TIMER();

#if OMPT_SUPPORT && OMPT_TRACE		#if OMPT_SUPPORT && OMPT_TRACE
/* let OMPT know that we're returning to the callee task */		/* let OMPT know that we're returning to the callee task */
if (ompt_enabled &&		if (ompt_enabled &&
ompt_callbacks.ompt_callback(ompt_event_task_switch))		ompt_callbacks.ompt_callback(ompt_event_task_switch))
{		{
ompt_callbacks.ompt_callback(ompt_event_task_switch)(		ompt_callbacks.ompt_callback(ompt_event_task_switch)(
taskdata->ompt_task_info.task_id,		taskdata->ompt_task_info.task_id,
▲ Show 20 Lines • Show All 122 Lines • ▼ Show 20 Lines
//		//
// TASK_CURRENT_NOT_QUEUED (0) if did not suspend and queue current task to be resumed later.		// TASK_CURRENT_NOT_QUEUED (0) if did not suspend and queue current task to be resumed later.
// TASK_CURRENT_QUEUED (1) if suspended and queued the current task to be resumed later.		// TASK_CURRENT_QUEUED (1) if suspended and queued the current task to be resumed later.

kmp_int32		kmp_int32
__kmpc_omp_task( ident_t *loc_ref, kmp_int32 gtid, kmp_task_t * new_task)		__kmpc_omp_task( ident_t *loc_ref, kmp_int32 gtid, kmp_task_t * new_task)
{		{
kmp_int32 res;		kmp_int32 res;
		KMP_SET_THREAD_STATE_BLOCK(EXPLICIT_TASK);

#if KMP_DEBUG		#if KMP_DEBUG
kmp_taskdata_t * new_taskdata = KMP_TASK_TO_TASKDATA(new_task);		kmp_taskdata_t * new_taskdata = KMP_TASK_TO_TASKDATA(new_task);
#endif		#endif
KA_TRACE(10, ("__kmpc_omp_task(enter): T#%d loc=%p task=%p\n",		KA_TRACE(10, ("__kmpc_omp_task(enter): T#%d loc=%p task=%p\n",
gtid, loc_ref, new_taskdata ) );		gtid, loc_ref, new_taskdata ) );

res = __kmp_omp_task(gtid,new_task,true);		res = __kmp_omp_task(gtid,new_task,true);

KA_TRACE(10, ("__kmpc_omp_task(exit): T#%d returning TASK_CURRENT_NOT_QUEUED: loc=%p task=%p\n",		KA_TRACE(10, ("__kmpc_omp_task(exit): T#%d returning TASK_CURRENT_NOT_QUEUED: loc=%p task=%p\n",
gtid, loc_ref, new_taskdata ) );		gtid, loc_ref, new_taskdata ) );
return res;		return res;
}		}

//-------------------------------------------------------------------------------------		//-------------------------------------------------------------------------------------
// __kmpc_omp_taskwait: Wait until all tasks generated by the current task are complete		// __kmpc_omp_taskwait: Wait until all tasks generated by the current task are complete

kmp_int32		kmp_int32
__kmpc_omp_taskwait( ident_t *loc_ref, kmp_int32 gtid )		__kmpc_omp_taskwait( ident_t *loc_ref, kmp_int32 gtid )
{		{
kmp_taskdata_t * taskdata;		kmp_taskdata_t * taskdata;
kmp_info_t * thread;		kmp_info_t * thread;
int thread_finished = FALSE;		int thread_finished = FALSE;
		KMP_SET_THREAD_STATE_BLOCK(TASKWAIT);

KA_TRACE(10, ("__kmpc_omp_taskwait(enter): T#%d loc=%p\n", gtid, loc_ref) );		KA_TRACE(10, ("__kmpc_omp_taskwait(enter): T#%d loc=%p\n", gtid, loc_ref) );

if ( __kmp_tasking_mode != tskm_immediate_exec ) {		if ( __kmp_tasking_mode != tskm_immediate_exec ) {
// GEH TODO: shouldn't we have some sort of OMPRAP API calls here to mark begin wait?		// GEH TODO: shouldn't we have some sort of OMPRAP API calls here to mark begin wait?

thread = __kmp_threads[ gtid ];		thread = __kmp_threads[ gtid ];
taskdata = thread -> th.th_current_task;		taskdata = thread -> th.th_current_task;
▲ Show 20 Lines • Show All 73 Lines • ▼ Show 20 Lines
kmp_int32		kmp_int32
__kmpc_omp_taskyield( ident_t *loc_ref, kmp_int32 gtid, int end_part )		__kmpc_omp_taskyield( ident_t *loc_ref, kmp_int32 gtid, int end_part )
{		{
kmp_taskdata_t * taskdata;		kmp_taskdata_t * taskdata;
kmp_info_t * thread;		kmp_info_t * thread;
int thread_finished = FALSE;		int thread_finished = FALSE;

KMP_COUNT_BLOCK(OMP_TASKYIELD);		KMP_COUNT_BLOCK(OMP_TASKYIELD);
		KMP_SET_THREAD_STATE_BLOCK(TASKYIELD);

KA_TRACE(10, ("__kmpc_omp_taskyield(enter): T#%d loc=%p end_part = %d\n",		KA_TRACE(10, ("__kmpc_omp_taskyield(enter): T#%d loc=%p end_part = %d\n",
gtid, loc_ref, end_part) );		gtid, loc_ref, end_part) );

if ( __kmp_tasking_mode != tskm_immediate_exec && __kmp_init_parallel ) {		if ( __kmp_tasking_mode != tskm_immediate_exec && __kmp_init_parallel ) {
// GEH TODO: shouldn't we have some sort of OMPRAP API calls here to mark begin wait?		// GEH TODO: shouldn't we have some sort of OMPRAP API calls here to mark begin wait?

thread = __kmp_threads[ gtid ];		thread = __kmp_threads[ gtid ];
▲ Show 20 Lines • Show All 64 Lines • ▼ Show 20 Lines
{		{
kmp_info_t * thread = __kmp_threads[ gtid ];		kmp_info_t * thread = __kmp_threads[ gtid ];
kmp_taskdata_t * taskdata = thread->th.th_current_task;		kmp_taskdata_t * taskdata = thread->th.th_current_task;
kmp_taskgroup_t * taskgroup = taskdata->td_taskgroup;		kmp_taskgroup_t * taskgroup = taskdata->td_taskgroup;
int thread_finished = FALSE;		int thread_finished = FALSE;

KA_TRACE(10, ("__kmpc_end_taskgroup(enter): T#%d loc=%p\n", gtid, loc) );		KA_TRACE(10, ("__kmpc_end_taskgroup(enter): T#%d loc=%p\n", gtid, loc) );
KMP_DEBUG_ASSERT( taskgroup != NULL );		KMP_DEBUG_ASSERT( taskgroup != NULL );
		KMP_SET_THREAD_STATE_BLOCK(TASKGROUP);

if ( __kmp_tasking_mode != tskm_immediate_exec ) {		if ( __kmp_tasking_mode != tskm_immediate_exec ) {
#if USE_ITT_BUILD		#if USE_ITT_BUILD
// For ITT the taskgroup wait is similar to taskwait until we need to distinguish them		// For ITT the taskgroup wait is similar to taskwait until we need to distinguish them
void * itt_sync_obj = __kmp_itt_taskwait_object( gtid );		void * itt_sync_obj = __kmp_itt_taskwait_object( gtid );
if ( itt_sync_obj != NULL )		if ( itt_sync_obj != NULL )
__kmp_itt_taskwait_starting( gtid, itt_sync_obj );		__kmp_itt_taskwait_starting( gtid, itt_sync_obj );
#endif /* USE_ITT_BUILD */		#endif /* USE_ITT_BUILD */
▲ Show 20 Lines • Show All 1,588 Lines • Show Last 20 Lines
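
Note on the state logic in this file: entry points such as `__kmpc_omp_taskwait` set a per-thread state for the duration of the call, and the task invocation code picks an `OMP_task_*` timer from whatever state is current. The sketch below illustrates that pattern under stated assumptions. The definition of `KMP_SET_THREAD_STATE_BLOCK` is not shown in this section, so `ScopedThreadState` is a hypothetical RAII guard that restores the previous state on scope exit, and `task_timer_for` returns timer names as strings purely for illustration.

```cpp
#include <cassert>
#include <string>

// State names as they appear in the diff; the enum itself is a hypothetical stand-in.
enum ThreadState {
    IDLE, SERIAL_REGION, FORK_JOIN_BARRIER, PLAIN_BARRIER,
    TASKWAIT, TASKYIELD, TASKGROUP, EXPLICIT_TASK
};

// Hypothetical guard: set a state now, restore the previous one when the scope ends.
class ScopedThreadState {
public:
    ScopedThreadState(ThreadState* slot, ThreadState s) : slot_(slot), saved_(*slot) {
        *slot_ = s;
    }
    ~ScopedThreadState() { *slot_ = saved_; }
    ScopedThreadState(const ScopedThreadState&) = delete;
    ScopedThreadState& operator=(const ScopedThreadState&) = delete;

private:
    ThreadState* slot_;
    ThreadState saved_;
};

// Mirrors the switch in the task invocation hunk: map the current state to a timer name.
inline const char* task_timer_for(ThreadState s) {
    switch (s) {
    case FORK_JOIN_BARRIER: return "OMP_task_join_bar";
    case PLAIN_BARRIER:     return "OMP_task_plain_bar";
    case TASKYIELD:         return "OMP_task_taskyield";
    case TASKWAIT:          return "OMP_task_taskwait";
    case TASKGROUP:         return "OMP_task_taskgroup";
    default:                return "OMP_task_immediate";
    }
}
```

In this sketch a task executed while a `TASKWAIT` guard is alive is attributed to `OMP_task_taskwait`. Once the guard goes out of scope the previous state returns, and any state not listed in the switch falls through to `OMP_task_immediate`.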

openmp/trunk/runtime/src/kmp_wait_release.h

Show All 12 Lines
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//


#ifndef KMP_WAIT_RELEASE_H		#ifndef KMP_WAIT_RELEASE_H
#define KMP_WAIT_RELEASE_H		#define KMP_WAIT_RELEASE_H

#include "kmp.h"		#include "kmp.h"
#include "kmp_itt.h"		#include "kmp_itt.h"
		#include "kmp_stats.h"

/*!		/*!
@defgroup WAIT_RELEASE Wait/Release operations		@defgroup WAIT_RELEASE Wait/Release operations

The definitions and functions here implement the lowest level thread		The definitions and functions here implement the lowest level thread
synchronizations of suspending a thread and awaking it. They are used		synchronizations of suspending a thread and awaking it. They are used
to build higher level operations such as barriers and fork/join.		to build higher level operations such as barriers and fork/join.
*/		*/
▲ Show 20 Lines • Show All 70 Lines • ▼ Show 20 Lines	__kmp_wait_template(kmp_info_t *this_thr, C *flag, int final_spin

KMP_FSYNC_SPIN_INIT(spin, NULL);		KMP_FSYNC_SPIN_INIT(spin, NULL);
if (flag->done_check()) {		if (flag->done_check()) {
KMP_FSYNC_SPIN_ACQUIRED(spin);		KMP_FSYNC_SPIN_ACQUIRED(spin);
return;		return;
}		}
th_gtid = this_thr->th.th_info.ds.ds_gtid;		th_gtid = this_thr->th.th_info.ds.ds_gtid;
KA_TRACE(20, ("__kmp_wait_sleep: T#%d waiting for flag(%p)\n", th_gtid, flag));		KA_TRACE(20, ("__kmp_wait_sleep: T#%d waiting for flag(%p)\n", th_gtid, flag));
		#if KMP_STATS_ENABLED
		stats_state_e thread_state = KMP_GET_THREAD_STATE();
		#endif

#if OMPT_SUPPORT && OMPT_BLAME		#if OMPT_SUPPORT && OMPT_BLAME
ompt_state_t ompt_state = this_thr->th.ompt_thread_info.state;		ompt_state_t ompt_state = this_thr->th.ompt_thread_info.state;
if (ompt_enabled &&		if (ompt_enabled &&
ompt_state != ompt_state_undefined) {		ompt_state != ompt_state_undefined) {
if (ompt_state == ompt_state_idle) {		if (ompt_state == ompt_state_idle) {
if (ompt_callbacks.ompt_callback(ompt_event_idle_begin)) {		if (ompt_callbacks.ompt_callback(ompt_event_idle_begin)) {
ompt_callbacks.ompt_callback(ompt_event_idle_begin)(th_gtid + 1);		ompt_callbacks.ompt_callback(ompt_event_idle_begin)(th_gtid + 1);
▲ Show 20 Lines • Show All 103 Lines • ▼ Show 20 Lines	while (flag->notdone_check()) {
}		}
else { // Recently transferred from pool to team		else { // Recently transferred from pool to team
KMP_TEST_THEN_DEC32((kmp_int32 *) &__kmp_thread_pool_active_nth);		KMP_TEST_THEN_DEC32((kmp_int32 *) &__kmp_thread_pool_active_nth);
KMP_DEBUG_ASSERT(TCR_4(__kmp_thread_pool_active_nth) >= 0);		KMP_DEBUG_ASSERT(TCR_4(__kmp_thread_pool_active_nth) >= 0);
this_thr->th.th_active_in_pool = FALSE;		this_thr->th.th_active_in_pool = FALSE;
}		}
}		}

		#if KMP_STATS_ENABLED
		// Check if thread has been signalled to idle state
		// This indicates that the logical "join-barrier" has finished
		if (this_thr->th.th_stats->isIdle() && KMP_GET_THREAD_STATE() == FORK_JOIN_BARRIER) {
		KMP_SET_THREAD_STATE(IDLE);
		KMP_PUSH_PARTITIONED_TIMER(OMP_idle);
		}
		#endif

// Don't suspend if KMP_BLOCKTIME is set to "infinite"		// Don't suspend if KMP_BLOCKTIME is set to "infinite"
if (__kmp_dflt_blocktime == KMP_MAX_BLOCKTIME)		if (__kmp_dflt_blocktime == KMP_MAX_BLOCKTIME)
continue;		continue;

// Don't suspend if there is a likelihood of new tasks being spawned.		// Don't suspend if there is a likelihood of new tasks being spawned.
if ((task_team != NULL) && TCR_4(task_team->tt.tt_found_tasks))		if ((task_team != NULL) && TCR_4(task_team->tt.tt_found_tasks))
continue;		continue;

Show All 34 Lines	if (ompt_enabled &&
} else {		} else {
pId = this_thr->th.th_team->t.ompt_team_info.parallel_id;		pId = this_thr->th.th_team->t.ompt_team_info.parallel_id;
tId = this_thr->th.th_current_task->ompt_task_info.task_id;		tId = this_thr->th.th_current_task->ompt_task_info.task_id;
}		}
ompt_callbacks.ompt_callback(ompt_event_wait_barrier_end)(pId, tId);		ompt_callbacks.ompt_callback(ompt_event_wait_barrier_end)(pId, tId);
}		}
}		}
#endif		#endif
		#if KMP_STATS_ENABLED
		// If we were put into idle state, pop that off the state stack
		if (KMP_GET_THREAD_STATE() == IDLE) {
		KMP_POP_PARTITIONED_TIMER();
		KMP_SET_THREAD_STATE(thread_state);
		this_thr->th.th_stats->resetIdleFlag();
		}
		#endif

KMP_FSYNC_SPIN_ACQUIRED(spin);		KMP_FSYNC_SPIN_ACQUIRED(spin);
}		}

/* Release any threads specified as waiting on the flag by releasing the flag and resume the waiting thread		/* Release any threads specified as waiting on the flag by releasing the flag and resume the waiting thread
if indicated by the sleep bit(s). A thread that calls __kmp_wait_template must call this function to wake		if indicated by the sleep bit(s). A thread that calls __kmp_wait_template must call this function to wake
up the potentially sleeping thread and prevent deadlocks! */		up the potentially sleeping thread and prevent deadlocks! */
template <class C>		template <class C>
▲ Show 20 Lines • Show All 267 Lines • ▼ Show 20 Lines	int execute_tasks(kmp_info_t *this_thr, kmp_int32 gtid, int final_spin, int *thread_finished
return __kmp_execute_tasks_oncore(this_thr, gtid, this, final_spin, thread_finished		return __kmp_execute_tasks_oncore(this_thr, gtid, this, final_spin, thread_finished
USE_ITT_BUILD_ARG(itt_sync_obj), is_constrained);		USE_ITT_BUILD_ARG(itt_sync_obj), is_constrained);
}		}
kmp_uint8 *get_stolen() { return NULL; }		kmp_uint8 *get_stolen() { return NULL; }
enum barrier_type get_bt() { return bt; }		enum barrier_type get_bt() { return bt; }
flag_type get_ptr_type() { return flag_oncore; }		flag_type get_ptr_type() { return flag_oncore; }
};		};

		// Used to wake up threads, volatile void* flag is usually the th_sleep_loc associated
		// with int gtid.
		static inline void __kmp_null_resume_wrapper(int gtid, volatile void *flag) {
		switch (((kmp_flag_64 *)flag)->get_type()) {
		case flag32: __kmp_resume_32(gtid, NULL); break;
		case flag64: __kmp_resume_64(gtid, NULL); break;
		case flag_oncore: __kmp_resume_oncore(gtid, NULL); break;
		}
		}

/*!		/*!
@}		@}
*/		*/

#endif // KMP_WAIT_RELEASE_H		#endif // KMP_WAIT_RELEASE_H

openmp/trunk/runtime/src/z_Linux_util.c

Show First 20 Lines • Show All 691 Lines • ▼ Show 20 Lines	#endif
gtid = ((kmp_info_t*)thr) -> th.th_info.ds.ds_gtid;		gtid = ((kmp_info_t*)thr) -> th.th_info.ds.ds_gtid;
__kmp_gtid_set_specific( gtid );		__kmp_gtid_set_specific( gtid );
#ifdef KMP_TDATA_GTID		#ifdef KMP_TDATA_GTID
__kmp_gtid = gtid;		__kmp_gtid = gtid;
#endif		#endif
#if KMP_STATS_ENABLED		#if KMP_STATS_ENABLED
// set __thread local index to point to thread-specific stats		// set __thread local index to point to thread-specific stats
__kmp_stats_thread_ptr = ((kmp_info_t*)thr)->th.th_stats;		__kmp_stats_thread_ptr = ((kmp_info_t*)thr)->th.th_stats;
		KMP_START_EXPLICIT_TIMER(OMP_worker_thread_life);
		KMP_SET_THREAD_STATE(IDLE);
		KMP_INIT_PARTITIONED_TIMERS(OMP_idle);
#endif		#endif

#if USE_ITT_BUILD		#if USE_ITT_BUILD
__kmp_itt_thread_name( gtid );		__kmp_itt_thread_name( gtid );
#endif /* USE_ITT_BUILD */		#endif /* USE_ITT_BUILD */

#if KMP_AFFINITY_SUPPORTED		#if KMP_AFFINITY_SUPPORTED
__kmp_affinity_set_init_mask( gtid, FALSE );		__kmp_affinity_set_init_mask( gtid, FALSE );
▲ Show 20 Lines • Show All 259 Lines • ▼ Show 20 Lines	#if KMP_STATS_ENABLED
// th->th.th_stats is used to transfer thread specific stats-pointer to __kmp_launch_worker		// th->th.th_stats is used to transfer thread specific stats-pointer to __kmp_launch_worker
// So when thread is created (goes into __kmp_launch_worker) it will		// So when thread is created (goes into __kmp_launch_worker) it will
// set its __thread local pointer to th->th.th_stats		// set its __thread local pointer to th->th.th_stats
th->th.th_stats = __kmp_stats_list.push_back(gtid);		th->th.th_stats = __kmp_stats_list.push_back(gtid);
if(KMP_UBER_GTID(gtid)) {		if(KMP_UBER_GTID(gtid)) {
__kmp_stats_start_time = tsc_tick_count::now();		__kmp_stats_start_time = tsc_tick_count::now();
__kmp_stats_thread_ptr = th->th.th_stats;		__kmp_stats_thread_ptr = th->th.th_stats;
__kmp_stats_init();		__kmp_stats_init();
KMP_START_EXPLICIT_TIMER(OMP_serial);		KMP_START_EXPLICIT_TIMER(OMP_worker_thread_life);
KMP_START_EXPLICIT_TIMER(OMP_start_end);		KMP_SET_THREAD_STATE(SERIAL_REGION);
		KMP_INIT_PARTITIONED_TIMERS(OMP_serial);
}		}
__kmp_release_tas_lock(&__kmp_stats_lock, gtid);		__kmp_release_tas_lock(&__kmp_stats_lock, gtid);

#endif // KMP_STATS_ENABLED		#endif // KMP_STATS_ENABLED

if ( KMP_UBER_GTID(gtid) ) {		if ( KMP_UBER_GTID(gtid) ) {
KA_TRACE( 10, ("__kmp_create_worker: uber thread (%d)\n", gtid ) );		KA_TRACE( 10, ("__kmp_create_worker: uber thread (%d)\n", gtid ) );
th -> th.th_info.ds.ds_thread = pthread_self();		th -> th.th_info.ds.ds_thread = pthread_self();
▲ Show 20 Lines • Show All 866 Lines • ▼ Show 20 Lines
}		}
void __kmp_resume_oncore(int target_gtid, kmp_flag_oncore *flag) {		void __kmp_resume_oncore(int target_gtid, kmp_flag_oncore *flag) {
__kmp_resume_template(target_gtid, flag);		__kmp_resume_template(target_gtid, flag);
}		}

void		void
__kmp_resume_monitor()		__kmp_resume_monitor()
{		{
		KMP_TIME_DEVELOPER_BLOCK(USER_resume);
int status;		int status;
#ifdef KMP_DEBUG		#ifdef KMP_DEBUG
int gtid = TCR_4(__kmp_init_gtid) ? __kmp_get_gtid() : -1;		int gtid = TCR_4(__kmp_init_gtid) ? __kmp_get_gtid() : -1;
KF_TRACE( 30, ( "__kmp_resume_monitor: T#%d wants to wakeup T#%d enter\n",		KF_TRACE( 30, ( "__kmp_resume_monitor: T#%d wants to wakeup T#%d enter\n",
gtid, KMP_GTID_MONITOR ) );		gtid, KMP_GTID_MONITOR ) );
KMP_DEBUG_ASSERT( gtid != KMP_GTID_MONITOR );		KMP_DEBUG_ASSERT( gtid != KMP_GTID_MONITOR );
#endif		#endif
status = pthread_mutex_lock( &__kmp_wait_mx.m_mutex );		status = pthread_mutex_lock( &__kmp_wait_mx.m_mutex );
▲ Show 20 Lines • Show All 802 Lines • Show Last 20 Lines