Minimize the impl interface and clean up some uses of mapping
functions.
Repository: rG LLVM Github Monorepo
openmp/libomptarget/DeviceRTL/src/Mapping.cpp | ||
---|---|---|
242–251 | Why do we want to get rid of these? I thought the idea was to keep these functions so we can generate generic IR in Clang that gets dispatched by the runtime once it's linked in. |
openmp/libomptarget/DeviceRTL/src/Mapping.cpp | ||
---|---|---|
242–251 | I think we should replace runtime call folding with store->load forwarding. The new runtime already does generic store->load forwarding for IsSPMD, for example, and it works better because we do not need to keep calls around. That said, there is an argument for keeping both. Even then, once we have switched over we can simply look for the mangled names of the new runtime rather than some __kmpc names. |
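To make the distinction concrete, here is a minimal host-side sketch of why store->load forwarding can stand in for call folding. Every name in it (`sketch::IsSPMDMode`, `sketch::init`, `sketch::isSPMDMode`, `sketch::numWorkerThreads`, `sketch::kernelEntry`) is hypothetical and is not the DeviceRTL's actual interface, and the thread-count arithmetic is illustrative only.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {

// Hypothetical runtime state; stands in for a flag the runtime records
// once at kernel initialization.
static bool IsSPMDMode = false;

// The store: executed once at kernel entry.
static inline void init(bool IsSPMD) { IsSPMDMode = IsSPMD; }

// The load: a trivial accessor that can be inlined away.
static inline bool isSPMDMode() { return IsSPMDMode; }

// A query that depends on the mode. The "generic mode reserves one warp"
// rule here is an assumption made for illustration.
static inline uint32_t numWorkerThreads(uint32_t BlockSize, uint32_t WarpSize) {
  assert(BlockSize >= WarpSize && "block must hold at least one warp");
  return isSPMDMode() ? BlockSize : BlockSize - WarpSize;
}

// Once init() and numWorkerThreads() are inlined here, the optimizer sees
// the store to IsSPMDMode directly followed by the load and can forward
// the stored value. No call needs to be kept around to be folded later.
static inline uint32_t kernelEntry(bool IsSPMD, uint32_t BlockSize,
                                   uint32_t WarpSize) {
  init(IsSPMD);
  return numWorkerThreads(BlockSize, WarpSize);
}

} // namespace sketch
```

With call folding, by contrast, the optimizer would have to recognize a call by its symbol name and replace it with a constant, which is why the name it looks for (a `__kmpc` name or a mangled one) matters in this discussion.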
openmp/libomptarget/DeviceRTL/src/Mapping.cpp | ||
---|---|---|
242–251 | We use these in OpenMPOpt and CGOpenMPRuntimeGPU. Could we replace the definitions in OMPKinds.def with the mangled names? |
The cleanups are good, but it would be a shame to lose the functions clang calls into just as we're on the edge of deleting the amdgcn and nvptx subclasses from openmp codegen.
Is getNumberOfProcessorElements the number of SMs / CUs? That may only be known accurately at application runtime.
openmp/libomptarget/DeviceRTL/src/Mapping.cpp | ||
---|---|---|
242–251 | The __kmpc prefix is used for the rest of the functions clang emits calls to. I'm not that keen to replace all of those with mangled symbols written in clang. |
openmp/libomptarget/DeviceRTL/include/Interface.h | ||
---|---|---|
204 ↗ | (On Diff #380997) | These, plus one more that is not yet implemented, let clang emit 'gpu' IR with calls to these functions instead of platform-specific intrinsics. |
openmp/libomptarget/DeviceRTL/src/Mapping.cpp | ||
---|---|---|
242–251 | We also (plan to) use these function calls to replace the emission of intrinsics directly from the front end, i.e. the functions in CGOpenMPRuntimeGPUAMDGCN and CGOpenMPRuntimeGPUNVPTX. |
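The pattern being discussed can be sketched roughly as follows: the front end emits a call to one target-independent entry point, and the runtime's definition of that entry point selects the platform intrinsic. This is only an illustration under stated assumptions. The names `sketch_hw::threadIdX` and `sketch_get_hardware_thread_id_in_block` are hypothetical stand-ins (the real entry point carries the `__kmpc` prefix), and the host fallback returns a fixed value so the sketch compiles off-device.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch_hw {

// Platform-specific part: the only place that names a target builtin.
static inline uint32_t threadIdX() {
#if defined(__NVPTX__)
  return __nvvm_read_ptx_sreg_tid_x();
#elif defined(__AMDGCN__)
  return __builtin_amdgcn_workitem_id_x();
#else
  return 7; // Host stub (assumption): fixed value so the sketch runs anywhere.
#endif
}

} // namespace sketch_hw

// Generic entry point: the front end would emit a call to this symbol
// instead of a target intrinsic, so the emitted IR is the same for
// every GPU target until the runtime is linked in.
extern "C" uint32_t sketch_get_hardware_thread_id_in_block() {
  return sketch_hw::threadIdX();
}

// Example caller, as generic code might look after front-end emission.
static inline bool isFirstThreadInBlock() {
  return sketch_get_hardware_thread_id_in_block() == 0;
}
```

The trade-off raised in the review shows up here: if the generic entry point is removed from the runtime, the front end has to go back to choosing between the two builtin branches itself.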
Did you test this with SPMDzation? I'm pretty sure we generate a call to __kmpc_get_hardware_thread_id_in_block for that, and needed to keep it alive or else there would be no definition in the module.