This is an archive of the discontinued LLVM Phabricator instance.

[libomptarget][amdgcn] Implement get_wtime
Abandoned · Public

Authored by JonChesterfield on Mar 3 2020, 4:07 PM.

Details

Summary

[libomptarget][amdgcn] Implement get_wtime

as a scaled call to an intrinsic. Implementation identical to that in aomp.
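
To illustrate what "a scaled call to an intrinsic" amounts to, here is a minimal host-side sketch, not the patch itself. The names `kAssumedTimerHz` and `ticks_to_seconds` are hypothetical, and the tick count is passed in as a parameter because the device counter is not readable off-device. The 745 MHz value is the constant discussed in this review, whose provenance is unknown.

```cpp
#include <cstdint>

// Assumption: counter frequency in Hz. 745 MHz is the unexplained constant
// from this review, not a documented hardware value.
constexpr double kAssumedTimerHz = 745000000.0;

// Scale a raw tick count into seconds using the assumed counter frequency.
// On the device, `ticks` would come from the clock intrinsic.
constexpr double ticks_to_seconds(std::uint64_t ticks,
                                  double timer_hz = kAssumedTimerHz) {
  return static_cast<double>(ticks) / timer_hz;
}
```

The result is only as accurate as the frequency constant, which is the point the inline comments go on to debate.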

Event Timeline

JonChesterfield created this revision. Mar 3 2020, 4:07 PM
JonChesterfield marked an inline comment as done. Mar 3 2020, 4:11 PM
JonChesterfield added inline comments.
openmp/libomptarget/deviceRTLs/amdgcn/src/target_impl.hip
50

I don't have a source for the magic number. It's from before my time. I can offer that rough checks from running applications look ok, and that it has been shipping as part of aomp for ages without user complaints.

grokos added inline comments. Mar 3 2020, 6:36 PM
openmp/libomptarget/deviceRTLs/amdgcn/src/target_impl.hip
50

Can you at least add a comment saying what this magic number is? It is obviously the clock speed in Hz, but it would be nice to have some sort of documentation. Even better, use a macro, because this frequency may change in future chips.
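
One way the macro suggestion could look is sketched below. `AMDGCN_TIMER_HZ` and `wtime_from_ticks` are hypothetical names that do not appear in the patch, and the default value is simply the 745 MHz constant under discussion.

```cpp
#include <cstdint>

// Hypothetical macro: frequency in Hz of the counter behind get_wtime.
// The guard lets a build override it per target, for example with
// -DAMDGCN_TIMER_HZ=25000000ULL, if chips turn out to differ.
#ifndef AMDGCN_TIMER_HZ
#define AMDGCN_TIMER_HZ 745000000ULL
#endif

// Convert raw counter ticks to seconds using the configured frequency.
constexpr double wtime_from_ticks(std::uint64_t ticks) {
  return static_cast<double>(ticks) / static_cast<double>(AMDGCN_TIMER_HZ);
}
```

Keeping the value behind a single documented name means a per-chip correction would touch one definition rather than every use.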

JonChesterfield marked an inline comment as done. Mar 3 2020, 7:45 PM
JonChesterfield added inline comments.
openmp/libomptarget/deviceRTLs/amdgcn/src/target_impl.hip
50

The ISA docs say that the clock operates as if at a fixed frequency. It seems plausible that the various amdgcn chips would all have a timer running at the same rate, but I'd be happier with a reference for that.

I'll ask around. It would be useful to know whether the scaling factor should be different for different chips.

JonChesterfield marked an inline comment as done. Mar 3 2020, 7:57 PM
JonChesterfield added inline comments.
openmp/libomptarget/deviceRTLs/amdgcn/src/target_impl.hip
50

Yes, though I will first ask around internally to find out what the 745 MHz was based on, and whether it is chip dependent.

I don't have a documentation reference for this yet. A crude benchmark suggests the frequency is 25 MHz on a Vega 20, but I'm hoping for a source other than experimentation.
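
The arithmetic behind such a crude benchmark, and the size of the mismatch with the 745 MHz constant, can be sketched as follows. This is a sketch under stated assumptions: both function names are hypothetical, and the tick readings and elapsed wall time are assumed to be measured elsewhere.

```cpp
#include <cstdint>

// Estimate the counter frequency from two tick readings taken
// `elapsed_seconds` of wall-clock time apart.
constexpr double estimate_timer_hz(std::uint64_t start_ticks,
                                   std::uint64_t end_ticks,
                                   double elapsed_seconds) {
  return static_cast<double>(end_ticks - start_ticks) / elapsed_seconds;
}

// Ratio of reported time to true time when the code divides by
// `assumed_hz` but the counter actually runs at `actual_hz`.
constexpr double reported_over_true(double assumed_hz, double actual_hz) {
  return actual_hz / assumed_hz;
}
```

If the counter really ran at 25 MHz while the code divided by 745 MHz, `reported_over_true(745e6, 25e6)` is 25/745, so reported intervals would be roughly 30 times too short.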

I'm unable to find a good reference for what this frequency should be, so I am considering unconditionally returning zero from this function. That won't 'work' in any useful sense, but it's an improvement on failing to link because __clock64 is undefined.

jdoerfert added inline comments. Aug 26 2020, 5:57 PM
openmp/libomptarget/deviceRTLs/amdgcn/src/target_impl.hip
50

My money is on: NVIDIA Tesla K40 Graphic Card - 1 GPUs - 745 MHz Core

JonChesterfield abandoned this revision. Oct 20 2020, 3:12 AM