Page MenuHomePhabricator

[STATS] Use partitioned timer scheme
ClosedPublic

Authored by jlpeyton on Apr 18 2016, 10:39 AM.

Details

Summary

This change removes the current timers with ones that partition time properly. The current timers are nested, so that if a new timer, B, starts when the current timer, A, is already timing, A's time will include B's. To eliminate this problem, the partitioned timers are designed to stop the current timer (A), let the new timer run (B), and when the new timer is finished, restart the previously running timer (A). With this partitioning of time, a threads' timers all sum up to the OMP_worker_thread_life time and can now easily show the percentage of time a thread is spending in different parts of the runtime or user code.

There is also a new state variable associated with each thread which tells where it is executing a task. This corresponds with the timers: OMP_task_*, e.g., if time is spent in OMP_task_taskwait, then that thread executed tasks inside a #pragma omp taskwait construct.

The changes are mostly changing the MACROs to use the new PARITIONED_* macros, the new partitionedTimers class and its methods, and new state logic.

Diff Detail

Repository
rL LLVM

Event Timeline

jlpeyton updated this revision to Diff 54080.Apr 18 2016, 10:39 AM
jlpeyton retitled this revision from to [STATS] Use partitioned timer scheme.
jlpeyton updated this object.
jlpeyton added reviewers: tlwilmar, jcownie.
jlpeyton set the repository for this revision to rL LLVM.
jlpeyton added a subscriber: openmp-commits.
tlwilmar accepted this revision.Apr 28 2016, 9:13 AM
tlwilmar edited edge metadata.

LGTM

This revision is now accepted and ready to land.Apr 28 2016, 9:13 AM

I am somewhat surprised by this set of changes. The set of thread states being maintained is very close to what OMPT supports. Are the differences worthwhile? If so, why not have someone from Intel participate in the OMPT discussions and lobby for changes. If the differences are not worthwhile, then what is the point of having maintaining two independent implementations of state maintenance that largely duplicate one another? Why not just use the OMPT states? If states aren't being maintained properly for the OpenMP library (I know that not all of the OMPT state maintenance is implemented or correct), why doesn't someone from Intel with intimate knowledge of how the runtime works fix it? If I were to fix the OMPT implementation, I would consider simply putting OMPT state maintenance in the same places that Intel engineers who wrote the library put them (where they are in this patch).

After going to the effort of patching OMPT into this library, I've noticed the distinct lack of Intel investment in helping move the implementation forward. To me, this is a sign that Intel would prefer that the efforts to standardize the OMPT API would just wither and die. If I'm mistaken, please correct me and let's explore how to collaborate.

This revision was automatically updated to reflect the committed changes.