This is an archive of the discontinued LLVM Phabricator instance.

[OpenMP] Prototype opt-in new GPU device RTL
ClosedPublic

Authored by jdoerfert on Jul 26 2021, 9:24 AM.

Details

Summary

The "old" OpenMP GPU device runtime (D14254) has served us well for many
years but modernizing it has caused some pain recently. This patch
introduces an alternative which is mostly written from scratch embracing
OpenMP 5.X, C++, LLVM coding style (where applicable), and conceptual
interfaces. This new runtime is opt-in through a clang flag (D106793).
The new runtime is currently only build for nvptx and has "-new" in its
name.

The design is tailored towards middle-end optimizations rather than
front-end code generation choices, a trend we already started in the old
runtime a while back. In contrast to the old one, state is organized in
a simple manner rather than a "smart" one. While this can induce costs
it helps optimizations. Our expectation is that the majority of codes
can be optimized and a "simple" design is therefore preferable. The new
runtime does also avoid users to pay for things they do not use,
especially wrt. memory. The unlikely case of nested parallelism is
supported but costly to make the more likely case use less resources.

The worksharing and reduction implementation have been taken from the
old runtime and will be rewritten in the future if necessary.

Documentation and debug features are still mostly missing and will be
added over time.

All external symbols start with __kmpc for legacy reasons but should
be renamed once we switch over to a single runtime. All internal symbols
are placed in appropriate namespaces (anonymous or _OMP) to avoid name
clashes with user symbols.

Diff Detail

Event Timeline

jdoerfert created this revision.Jul 26 2021, 9:24 AM
jdoerfert requested review of this revision.Jul 26 2021, 9:24 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 26 2021, 9:24 AM
Herald added a subscriber: sstefan1. · View Herald Transcript
JonChesterfield accepted this revision.Jul 26 2021, 10:05 AM

Some minor comments inline from a scan through. I'm still in favour of landing this and iterating in trunk, keeping the current deviceRTL live (and use the current one by default for now) until we're pretty sure the new one works as well. If it proves to be long lived we might want to include some files into both of them to cut out the duplication+divergence, but for now I'm hoping that won't be warranted.

openmp/libomptarget/DeviceRTL/include/Mapping.h
80

Does this mean number of streaming multiprocessor / compute unit? A little surprised we need to know that.

Can't see a call site, can we delete this function?

openmp/libomptarget/DeviceRTL/include/State.h
99

typo mookup?

openmp/libomptarget/DeviceRTL/include/Types.h
16

freestanding has stdint.h, stddef.h, couple of other headers reliably available

openmp/libomptarget/DeviceRTL/include/Utils.h
43

Nice. Might want a _Static_assert that sizeof(unsigned)==4 and sizeof(long) == 8.

openmp/libomptarget/DeviceRTL/src/Configuration.cpp
23

number devices is an int32_t in some places but uint32_t here. I'd like it to be uint32_t everywhere, and if we're using -1 to indicate an error, to stop doing that

openmp/libomptarget/DeviceRTL/src/Debug.cpp
28

Does this change anything? The !cond above hopefully propagates without the assume

openmp/libomptarget/DeviceRTL/src/Kernel.cpp
22

this declaration in a header I think, safer to include said header than redeclare it

49

existing hazard. a shame that this construct blocks inlining the work function, even when the only two values stored to the variable are 0 and a given work function

This revision is now accepted and ready to land.Jul 26 2021, 10:05 AM
jdoerfert marked 8 inline comments as done.Jul 26 2021, 11:32 AM

I address things locally before I push, I'll wait a little longer but want to get this into 13.

openmp/libomptarget/DeviceRTL/include/Mapping.h
80

Yes it is. Call site:
int omp_get_num_procs(void) { return mapping::getNumberOfProcessorElements(); }

openmp/libomptarget/DeviceRTL/include/State.h
99

on purpose to check if people pay attention ;)

openmp/libomptarget/DeviceRTL/include/Types.h
16

I'll leave that for a follow up cleanup. Static assert will catch problems for now.

openmp/libomptarget/DeviceRTL/include/Utils.h
43

I tried to avoid non-sized types, did I miss any?

openmp/libomptarget/DeviceRTL/src/Configuration.cpp
23

uint32 everywhere now.

openmp/libomptarget/DeviceRTL/src/Debug.cpp
28

It translates flow into value information, LLVM can't always do that by itself for now. Also, if we are in non-debug mode the entire conditional is just deleted, which is what we want, so cond would not provide information without the assume.

openmp/libomptarget/DeviceRTL/src/Kernel.cpp
49

it does not as we build custom state machines anyway. This is the O0 fallback only, functionally correct.

openmp/libomptarget/DeviceRTL/include/Utils.h
43

It's the suffix on the builtin. ffs vs ffsl vs ffsll. I was burned by sizeof(long)==4 in the past so now compulsively put the static assert next to builtin functions that pass a uintxx_t to a function named based on normal/long/longlong. I hope windows builds of device code don't assume sizeof(long)==4 but wouldn't be totally shocked if they did.

openmp/libomptarget/DeviceRTL/src/Configuration.cpp
23

Not in the host plugins, so I assume not in libomptarget. Good to hear it's consistently unsigned in the device code.

openmp/libomptarget/DeviceRTL/src/Debug.cpp
28

Ah, shame. You're right, if (!cond && false) can't fold into builtin_assume(cond).

jdoerfert marked 10 inline comments as done.Jul 26 2021, 11:56 AM
jdoerfert added inline comments.
openmp/libomptarget/DeviceRTL/include/Utils.h
43

I see now. Will do that.

openmp/libomptarget/DeviceRTL/src/Configuration.cpp
23

yeah, to be fair it's not even hooked up to the host right now.

jdoerfert updated this revision to Diff 361825.Jul 26 2021, 3:22 PM
jdoerfert marked 2 inline comments as done.

Update and fixes

better to remove commented code before landing.

jdoerfert updated this revision to Diff 361856.Jul 26 2021, 5:06 PM

More cleanups and fixes

This revision was landed with ongoing or failed builds.Jul 26 2021, 10:57 PM
This revision was automatically updated to reflect the committed changes.