This is an archive of the discontinued LLVM Phabricator instance.

[libc] Support timing information in libc tests
ClosedPublic

Authored by jhuber6 on Jul 4 2023, 7:29 AM.

Details

Summary

This patch adds the necessary support to provide timing information in
libc tests. This is useful for determining which tests look what
amount of time. We also can use this as a test basis for providing more
fine-grained timing when implementing things on the GPU.

The main difficulty with this is the fact that the AMDGPU fixed
frequency clock operates at an unknown frequency. We need to read this
on a per-card basis from the driver and then copy it in. NVPTX on the
other hand has a fixed clock at a resolution of 1ns. I have also
increased the resolution of the print-outs as the majority of these are
below a millisecond for me.

Diff Detail

Event Timeline

jhuber6 created this revision.Jul 4 2023, 7:29 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptJul 4 2023, 7:29 AM
jhuber6 requested review of this revision.Jul 4 2023, 7:29 AM
jhuber6 updated this revision to Diff 537406.Jul 5 2023, 10:07 AM

Unify the memcpy into a common function

Let's take an educated guess that this won't be the one true unique global we end up doing this with and make it a struct up front

libc/startup/gpu/amdgpu/start.cpp
21

This will be in .bss. Is that a good thing? Means we have no option to initialise it before loading the image, i.e. we're stuck with some variant of crossing the bus and waiting to write four bytes

libc/test/UnitTest/LibcTest.cpp
22

Looks sketchy that this is a different type on the various platforms. Safer would be a uint64_t, or at least annotate the type so it isn't a signed integer.

Really not keen on it being a macro pretending to be part of time.h when it isn't in time.h and this is all tied up in testing the implementation of libc.

154

Please parenthesize to indicate what order of operations you want on the multiplication and division here, and if you don't want to give CLOCKS_PER_TYPE an actual type, static cast it to uint64_t at the use site so signed/unsigned integer nonsense can't bite us.

When this is a macro expanding to a non-const global, are we stuck with a separate load for each instance? I _think_ clang will constant fold it anyway because it isn't volatile or atomic but haven't checked.

libc/utils/gpu/loader/amdgpu/Loader.cpp
272

Boilerplate looks ok here, though I'd probably have gone with the synchronous copy instead (no need to mess around with signals then, though I have a vague recollection of there being other hazards with the non-async one)

442

All this stuff? Pretty much why I prefer writing to the global before copying it to the GPU

jhuber6 added inline comments.Jul 5 2023, 11:35 AM
libc/startup/gpu/amdgpu/start.cpp
21

I should probably make this in constant memory.

libc/test/UnitTest/LibcTest.cpp
22

Honestly the whole CLOCKS_PER_SEC thing confuses me in general. POSIX says it's always 1,000,000, other platforms say it's something else. Other's scale the clock so that it matches 1,000,000... We could probably avoid it here if necessary and make it a constant or something.

154

I could probably make it const since it should be set by the loader / whoever's standing up the runtime.

libc/utils/gpu/loader/amdgpu/Loader.cpp
272

Yeah, didn't see any simpler ways to do this in the HSA interface.

442

I didn't want to pass it via the kernel arguments because then it would require changing the NVPTX plugin as well. Plus in the future we'll probably want to abstract this into some general environment constant.

jhuber6 updated this revision to Diff 537453.Jul 5 2023, 11:53 AM

Making suggested changes

libc/test/UnitTest/LibcTest.cpp
159

Overflow hazards abound here. I'm not totally sure what to do about them.

jhuber6 added inline comments.Jul 5 2023, 12:02 PM
libc/test/UnitTest/LibcTest.cpp
159

These are uint64_t, if these overflow I think we'll be fine for just running some unit tests and getting a time that says 0ms instead of a few trillion.

JonChesterfield accepted this revision.Jul 5 2023, 12:10 PM

Seems reasonable. Thanks

This revision is now accepted and ready to land.Jul 5 2023, 12:10 PM
This revision was automatically updated to reflect the committed changes.