This is an archive of the discontinued LLVM Phabricator instance.

Fix performance regression in SPEC kdtree test
ClosedPublic

Authored by jlpeyton on Dec 17 2018, 1:32 PM.

Details

Summary

Make __ompt_implicit_task_end a static function and remove the inline part. This fixes small regression in SPEC kdtree benchmark. Also reformat some of __ompt_implicit_task_end.

Diff Detail

Repository
rOMP OpenMP

Event Timeline

jlpeyton created this revision.Dec 17 2018, 1:32 PM
Hahnfeld added inline comments.
runtime/src/kmp_wait_release.h
254–261

Are you sure that you want to remove this lookup? This means always passing NULL AFAICS

runtime/src/kmp_wait_release.h
254–261

Once the value of pId is never used it should be OK to have it NULL for now. Increased code size impacts performance of some tests, so we try to eliminate unused code when possible.

Hahnfeld added inline comments.Dec 18 2018, 1:07 AM
runtime/src/kmp_wait_release.h
264

Well, it's used in this event.

runtime/src/kmp_wait_release.h
264

Here is the complete code of the event you mentioned:

static inline void __ompt_implicit_task_end(kmp_info_t *this_thr,
                                          ompt_state_t ompt_state,
                                          ompt_data_t *tId,
                                          ompt_data_t *pId) {
int ds_tid = this_thr->th.th_info.ds.ds_tid;
if (ompt_state == ompt_state_wait_barrier_implicit) {
  this_thr->th.ompt_thread_info.state = ompt_state_overhead;
  void *codeptr = NULL;
  if (ompt_enabled.ompt_callback_sync_region_wait) {
    ompt_callbacks.ompt_callback(ompt_callback_sync_region_wait)(
        ompt_sync_region_barrier, ompt_scope_end, NULL, tId, codeptr);
  }
  if (ompt_enabled.ompt_callback_sync_region) {
    ompt_callbacks.ompt_callback(ompt_callback_sync_region)(
        ompt_sync_region_barrier, ompt_scope_end, NULL, tId, codeptr);
  }
  if (!KMP_MASTER_TID(ds_tid)) {
    if (ompt_enabled.ompt_callback_implicit_task) {
      ompt_callbacks.ompt_callback(ompt_callback_implicit_task)(
          ompt_scope_end, NULL, tId, 0, ds_tid);
    }
    // return to idle state
    this_thr->th.ompt_thread_info.state = ompt_state_idle;
  } else {
    this_thr->th.ompt_thread_info.state = ompt_state_overhead;
  }
}
}

The pId parameter is not used here.

Hahnfeld added inline comments.Dec 18 2018, 1:22 AM
runtime/src/kmp_wait_release.h
264

Ah, didn't look into the function. Can we then remove the argument and the variable completely?

protze.joachim added inline comments.
runtime/src/kmp_wait_release.h
264

I would also suggest to completely remove pId here.
The final OpenMP 5.0 spec says, that parallel_data here is always NULL, so it is not needed anymore.

jlpeyton updated this revision to Diff 178726.Dec 18 2018, 10:21 AM

Removed pId variable

This revision is now accepted and ready to land.Dec 19 2018, 2:32 AM
This revision was automatically updated to reflect the committed changes.