This is an archive of the discontinued LLVM Phabricator instance.

[OpenMP] Add option to use different units for blocktime
ClosedPublic

Authored by tlwilmar on Aug 10 2023, 12:49 PM.

Details

Summary

This change adds the option of using different units for blocktimes specified via the KMP_BLOCKTIME environment variable. The parsing of the environment now recognizes units suffixes: ms and us. If a units suffix is not specified, the default unit is ms. Thus default behavior is still the same, and any previous usage still works the same. Internally, blocktime is now converted to microseconds everywhere, so settings that exceed INT_MAX in microseconds are considered "infinite".

kmp_set/get_blocktime are updated to use the units the user specified with KMP_BLOCKTIME, and if not specified, ms are used.

Added better range checking and inform messages for the two time units. Large values of blocktime for default (ms) case (beyond INT_MAX/1000) are no longer allowed, but will autocorrect with an INFORM message.

Diff Detail

Event Timeline

tlwilmar created this revision.Aug 10 2023, 12:49 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 10 2023, 12:49 PM
tlwilmar requested review of this revision.Aug 10 2023, 12:49 PM

Do we need to change __kmp_initialize_system_tick for other archs too?

Do we need to change __kmp_initialize_system_tick for other archs too?

It is only defined for x86. Is the current implementation sufficient for other archs? I don't have the ability to test it on anything else.

__kmp_initialize_system_tick is only used for an X86-specific optimization using rdtsc.
For other architectures, gettimeofday is used to measure time. From my experience, clock_gettime(CLOCK_MONOTONIC, ..) is less expensive, but this experience is also limited to X68.

I just added an inline comment to update a comment.

openmp/runtime/src/z_Linux_util.cpp
2006

Is this still 50~100 usec, or 50~100 msec now?

tlwilmar updated this revision to Diff 551276.Aug 17 2023, 2:43 PM

The delay for determining ticks per usec was lowered. It is now 1 million ticks which was calculated as ~450us based on 2.2GHz clock which is pretty typical base clock frequency on X86:

(1e6 Ticks)  /  (2.2e9 Ticks/sec)  *  (1e6 usec/sec)  =  454 usec

Really short benchmarks can be affected by longer delay.

I wasn't so much concerned about the actual time, but wanted to have the comment updated.

Code looks good.

Please update the documentation in openmp/docs/design/Runtimes.rst
Remove suffixes no longer supported, add new suffixes, milli->micro, put infinite into quotes.

tlwilmar updated this revision to Diff 551574.Aug 18 2023, 11:10 AM

Updated KMP_BLOCKTIME documentation.

This revision is now accepted and ready to land.Aug 18 2023, 11:41 AM
This revision was automatically updated to reflect the committed changes.