This is an archive of the discontinued LLVM Phabricator instance.

[OpenMP] Remove 'keep_alive' functionality from the device RTL
ClosedPublic

Authored by jhuber6 on May 24 2023, 6:08 AM.

Details

Summary

The OpenMP DeviceRTL uses a hacky workaround to keep certain runtime
calls alive: a function that references them so that they cannot be
optimized out. We needed this hack because the 'OpenMPOpt' pass likes to
introduce new runtime calls into the TU. This interacted badly with
linking the bitcode file per-TU, as we do for Nvidia: the OpenMPOpt pass
could generate a call to a runtime function that was never linked in.
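
As a rough illustration of the pattern described above, the sketch below pins a set of entry points by calling them from a function marked with the `used` attribute (a GCC/Clang extension). All names here are hypothetical stand-ins and are not taken from the DeviceRTL sources; a real implementation would likely also need to stop the optimizer from inlining the callees away.

```cpp
// Hypothetical stand-ins for device runtime entry points. The names are
// illustrative only and do not come from the DeviceRTL sources.
extern "C" int example_rtl_thread_id() { return 0; }
extern "C" int example_rtl_warp_size() { return 32; }

// The "keep alive" pattern: this function is pinned with the `used`
// attribute, so the compiler must emit it even if nothing calls it.
// Because it calls the entry points above, they stay referenced and are
// not discarded as dead code when the module is optimized.
extern "C" __attribute__((used)) int example_keep_alive() {
  return example_rtl_thread_id() + example_rtl_warp_size();
}
```

The cost of this approach is the one the summary goes on to mention: anything pinned this way is also harder for the optimizer to reason about.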

This should not be a problem anymore because we unconditionally link in
the libomptarget.devicertl.a runtime library. Linking against it should
only extract the symbols that are still undefined. So, if we do end up
with an unresolved reference, it will be resolved by the static library.
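
The extraction behaviour relied on here can be modelled with a small, self-contained sketch. This is a simplified model of archive linking, not the actual linker or the libomptarget tooling: `Member`, `extractMembers`, and the symbol names are assumptions made for illustration, and real linkers differ in details such as ordering.

```cpp
#include <map>
#include <set>
#include <string>

// One archive member: the symbols it defines and the symbols it references.
struct Member {
  std::set<std::string> defines;
  std::set<std::string> needs;
};

// Returns the names of the members that would be pulled out of the archive.
// A member is extracted only if it defines a symbol that is currently
// undefined; its own references may in turn make further members necessary,
// so the scan repeats until nothing changes.
std::set<std::string> extractMembers(
    const std::map<std::string, Member> &archive,
    std::set<std::string> defined, std::set<std::string> undefined) {
  std::set<std::string> extracted;
  bool changed = true;
  while (changed) {
    changed = false;
    for (const auto &[name, member] : archive) {
      if (extracted.count(name))
        continue;
      bool resolves = false;
      for (const auto &sym : member.defines)
        if (undefined.count(sym)) {
          resolves = true;
          break;
        }
      if (!resolves)
        continue;
      extracted.insert(name);
      changed = true;
      // Everything the member defines is now available.
      for (const auto &sym : member.defines) {
        defined.insert(sym);
        undefined.erase(sym);
      }
      // References the member introduces become undefined unless already met.
      for (const auto &sym : member.needs)
        if (!defined.count(sym))
          undefined.insert(sym);
    }
  }
  return extracted;
}
```

In this model, a module with no undefined runtime symbols extracts nothing, while a late-introduced reference pulls in only the member that defines it plus whatever that member itself needs.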

The downside is that if a non-LTO NVPTX compilation introduces one of
these calls, the callee will be linked from outside the module and will
therefore incur the overhead of an external function call.
However, removing this flag should make optimizing things easier. We
will need to see whether that performance cost is a problem.

Event Timeline

jhuber6 created this revision. May 24 2023, 6:08 AM
Herald added a project: Restricted Project. May 24 2023, 6:08 AM
jhuber6 requested review of this revision. May 24 2023, 6:08 AM

@ye-luo Do you remember which workload exhibited these problems? Would it be possible for you to check whether it indeed no longer fails, and how big the performance penalty is for not inlining those functions?

ye-luo accepted this revision. May 31 2023, 3:07 PM

I don't remember which workload choked on this. My quick testing shows everything working fine, so let us just land this.

This revision is now accepted and ready to land. May 31 2023, 3:07 PM