Page MenuHomePhabricator

[OpenMP][NFCI] Cleanup new device RT mapping interface
ClosedPublic

Authored by jdoerfert on Oct 20 2021, 9:30 AM.

Details

Summary

Minimize the impl interface and clean up some uses of mapping
functions.

Diff Detail

Event Timeline

jdoerfert created this revision.Oct 20 2021, 9:30 AM
jdoerfert requested review of this revision.Oct 20 2021, 9:30 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 20 2021, 9:30 AM
Herald added a subscriber: sstefan1. · View Herald Transcript
jhuber6 added inline comments.Oct 20 2021, 9:32 AM
openmp/libomptarget/DeviceRTL/src/Mapping.cpp
232–239

Why do we want to get rid of these? I thought the idea was to keep these functions so we can generate generic IR in Clang that gets dispatched by the runtime once it's linked in.

jdoerfert added inline comments.Oct 20 2021, 3:34 PM
openmp/libomptarget/DeviceRTL/src/Mapping.cpp
232–239

I think we should replace runtime call folding with store->load forwarding. The new runtime does generic store->load forwarding for IsSPMD, as an example, and it works better as we do not need to keep calls around. That said, there is an argument made for both to exist. However, even if, we can simply look for the mangled names of the new runtime once we switched over rather than some __kmpc names.

jhuber6 added inline comments.Oct 20 2021, 5:02 PM
openmp/libomptarget/DeviceRTL/src/Mapping.cpp
232–239

We use these in OpenMPOpt and CGOpenMPRuntimeGPU, could we replace the definitions in OMPKinds.def with the mangled names?

Cleanups good, be a shame to lose the functions clang calls into just as we're on the edge of deleting the amdgcn and nvptx subclasses from openmp codegen.

Is getNumberOfProcessorElements the number of SMs / CUs? That may only be known accurately at application runtime

openmp/libomptarget/DeviceRTL/src/Mapping.cpp
232–239

The _kmpc prefix is used for the rest of the functions clang emits calls to, not that keen to replace all those with mangled symbols written in clang

openmp/libomptarget/DeviceRTL/include/Interface.h
204

These, plus one more nyi, let clang emit 'gpu' IR with calls to these instead of platform specific intrinsics

I can put them back if clang needs them.

tianshilei1992 added inline comments.Oct 22 2021, 6:38 PM
openmp/libomptarget/DeviceRTL/src/Mapping.cpp
232–239

We also (plan to) use these function calls to replace the emission of intrinsics directly from front end, aka those functions in CGOpenMPRuntimeGPUAMDGCN and CGOpenMPRuntimeGPUNVPTX.

jdoerfert updated this revision to Diff 382137.Oct 25 2021, 3:10 PM

Add the generic external interface back (but somewhat hidden)

Did you test this with SPMDzation? I'm pretty sure we generate a call to __kmpc_get_hardware_thread_id_in_block for that, and needed to keep it alive or else there would be no definition in the module.

jdoerfert updated this revision to Diff 384445.Nov 3 2021, 8:08 AM

Add assertions, keep external interface, fix amdgpu

jdoerfert updated this revision to Diff 384446.Nov 3 2021, 8:10 AM

Fix oversight

jhuber6 accepted this revision.Nov 3 2021, 8:11 AM

LGTM

This revision is now accepted and ready to land.Nov 3 2021, 8:11 AM
jdoerfert updated this revision to Diff 384520.Nov 3 2021, 10:56 AM

Fix more rebasing oversights

This revision was automatically updated to reflect the committed changes.