Page MenuHomePhabricator

[libomptarget] Build DeviceRTL for amdgpu
ClosedPublic

Authored by JonChesterfield on Oct 21 2021, 8:28 AM.

Details

Diff Detail

Event Timeline

JonChesterfield requested review of this revision.Oct 21 2021, 8:28 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald Transcript
JonChesterfield edited the summary of this revision. (Show Details)Oct 21 2021, 8:28 AM
JonChesterfield added inline comments.
openmp/libomptarget/DeviceRTL/src/Configuration.cpp
23

Otherwise the missing symbols prevents linking, not clear why it works on nvptx64

openmp/libomptarget/DeviceRTL/src/Synchronization.cpp
71

This is not good, need to revise sema checking on these intrinsics and add some lowering in clang/llvm that builds the switch. Written longhand here to get things running.

jdoerfert added inline comments.Oct 21 2021, 8:56 AM
openmp/libomptarget/DeviceRTL/src/Configuration.cpp
23

linking what? Clang emits the symbol, maybe just not for amdgpu.

openmp/libomptarget/DeviceRTL/src/Configuration.cpp
23

Where? The only reference I can find to it is here, and it's marked extern.

Subscribed some AMD people to this. I wanted to apply this patch as-is to amd-stg-open to feed it to the internal testing, but it doesn't apply because Driver/ToolChains/AMDGPUOpenMP.cpp in rocm is significantly different to trunk (in particular the call to addOpenMPDeviceRTL is commented out)

  • Enable tests on amdgpu, with same ones marked xfail/unsupported as on the old runtime
JonChesterfield edited the summary of this revision. (Show Details)Oct 26 2021, 12:37 PM
  • drop outdated comment
  • delete the extern debug_kind variable

I think this is good enough for now. It drops the not yet used debug variable and writes out the lowering for runtime values of memory ordering manually. The latter will be simplified once clang learns to emit the switch instead of error. Omp lock is a problem I don't have a good solution to.

clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp
255

'amdgcn' appears to be a subset of 'amdgpu', so this seems a reasonable point to rename it.

openmp/libomptarget/DeviceRTL/CMakeLists.txt
182

rearranging the naming here - the llvm-link file is now prefixed linked_, with the optimised library left without a prefix. Updated depends / output clauses to match.

openmp/libomptarget/DeviceRTL/src/Configuration.cpp
23

I think the cuda toolchain treats unresolved references as 'just use zero', in which case deleting this is a no-op on nvptx. Maybe it's intended to be patched by cuda rtl.cpp in the future? If so can reintroduce it then

  • rebase, having landed D111987 this time

Problem with missing symbol for __omp_rtl_debug_kind was a local error. I did the initial testing of this with a jury rigged clang that linked the new bitcode and ignored the old. The generation of this integer is guarded by which runtime clang thinks it is compiling for. Thus, my local clang compiled for the old runtime and linked with the new, which it turns out does not work.

I was under the impression that pointing clang at the new runtime with libomptarget-amdgcn-bc-path was sufficient, but evidently not. Doesn't matter for this patch now that the variable is reinstated.

Requires either D112544 or disabling bug51982 on newRTL before landing

jdoerfert accepted this revision.Oct 26 2021, 3:54 PM

LG, I'll land my fix and with this we can switch over to the new RT.

This revision is now accepted and ready to land.Oct 26 2021, 3:54 PM
This revision was automatically updated to reflect the committed changes.
JonChesterfield reopened this revision.EditedOct 27 2021, 5:03 PM

Didn't fare very well under CI, investigating. Ten failures at https://lab.llvm.org/buildbot/#/builders/193/builds/915, but they all pass locally.

This revision is now accepted and ready to land.Oct 27 2021, 5:03 PM
openmp/libomptarget/DeviceRTL/src/Synchronization.cpp
196–201

Error here - syncThreadsAligned is deleted but should not be

  • Reintroduce syncThreadsAligned, dropped in git merge conflict
  • fix new+old test enabling

Tagging Ron as this is current stuck on the mystery of passing locally and failing CI

This revision was automatically updated to reflect the committed changes.

Landed a slightly modified version of this - code and test changes are included, but the tests are not run by default. I'm hopeful this will help the process of working out why ~10 are failing on CI and passing locally.