This is an archive of the discontinued LLVM Phabricator instance.

[OpenMP] Replace OpenMP register requires constructor with a global array
Needs ReviewPublic

Authored by jhuber6 on Sep 8 2022, 4:52 PM.

Details

Summary

Currently, every module containing OpenMP offloading code uses a global
constructor to register its flags from the omp requires clause. This
is not ideal as it creates several entry points into libomptarget that
requires the runtime to be initialized first. This makes it difficult
for us to transition to a scheme where the lifetime of libomptarget is
explicitly manages by the uses of the __tgt_[un]register_lib calls.

Instead, this patch changes the method to create global values similar
to how offloading entries are registered. The linker will create a
__start/__stop pointer for these values which we can then iterate
through in the runtime. This requires widening the __tgt_bin_desc
struct which is an ABI break as old programs will attempt to read
invalid memory and the old interface will be removed. This will be
acceptable if we go through with the plans in D133277 and remove the
backwards compatibility in libomptarget.

If we do not wish to add a new field to __tgt_bin_desc we could
instead add it to the exiting __tgt_offload_entry and filter it out.
But this would require using the Size field or something similar to
hold the value and updating all the plugins.

Diff Detail

Event Timeline

jhuber6 created this revision.Sep 8 2022, 4:52 PM
Herald added a project: Restricted Project. · View Herald TranscriptSep 8 2022, 4:52 PM
jhuber6 requested review of this revision.Sep 8 2022, 4:52 PM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptSep 8 2022, 4:52 PM
jhuber6 updated this revision to Diff 458937.Sep 8 2022, 6:27 PM

Fix tests

jhuber6 updated this revision to Diff 458948.Sep 8 2022, 7:39 PM

Fix LLVM test.

Overall this approach looks good to me, though I'd suggest some (hopefully minor) changes that would help avoid breaking the ABI. It looks like __tgt_bin_desc is only used by __tgt_[un]register_lib -- rather than changing those entry points, can you please add new ones (e.g., __tgt_[un]register_lib_v2 or __tgt[un]register_lib_with_requires)? You can still remove the old entry points from libomptarget. This should avoid any ABI issues with the old entry points, since calls to the old entry points will expect the old struct and calls to the new entry points will expect the new struct. This will also make it possible for vendors to continue supporting the old entry points, if desired.

Also, would you mind renaming the __tgt_bin_desc struct to __tgt_bin_desc_v2 as well, to indicate it has changed? This isn't part of the ABI, so certainly not required, but it might help to convey that the struct has changed from older versions of the source.

Also, it looks like this patch only contains the compiler and test changes, correct? Is there another patch that contains the corresponding libomptarget changes?

Overall this approach looks good to me, though I'd suggest some (hopefully minor) changes that would help avoid breaking the ABI. It looks like __tgt_bin_desc is only used by __tgt_[un]register_lib -- rather than changing those entry points, can you please add new ones (e.g., __tgt_[un]register_lib_v2 or __tgt[un]register_lib_with_requires)? You can still remove the old entry points from libomptarget. This should avoid any ABI issues with the old entry points, since calls to the old entry points will expect the old struct and calls to the new entry points will expect the new struct. This will also make it possible for vendors to continue supporting the old entry points, if desired.

Adding a new registration function would probably be the best way to maintain backwards compatibility, otherwise we'd segfault after reading past the struct. I can update this patch to do that.

Also, would you mind renaming the __tgt_bin_desc struct to __tgt_bin_desc_v2 as well, to indicate it has changed? This isn't part of the ABI, so certainly not required, but it might help to convey that the struct has changed from older versions of the source.

Also, it looks like this patch only contains the compiler and test changes, correct? Is there another patch that contains the corresponding libomptarget changes?

I wrote it but was waiting on this to get some attention before uploading it.

jhuber6 updated this revision to Diff 490915.Jan 20 2023, 11:15 AM

Updating. The problem is that I really don't want these global constructors
around but fixing up the existing ABI is a little painful. The previous option
was to just mark old applications as hard failures and crash. Introducing new
runtime functions made from clang is similarly painful since it can cause some
linker errors when mixing older versions.

I tried this method of changin the struct but detecting if it's the old version
and converting to it. I'm not really a fan, but it works. I set the first field
to be a negative version number so we can detect if it's new or old. If other
people are comfortable with just forcing people to recompile then I'd be happier
with that, but this is backwards compatible to an extent.

Herald added a project: Restricted Project. · View Herald TranscriptJan 20 2023, 11:15 AM

Generally happy with this. Some minor comments. @sandoval, this works for you?

clang/lib/CodeGen/CGOpenMPRuntime.cpp
3294–3299
10919

Really? Not an error or unexpected state?

openmp/libomptarget/src/rtl.cpp
360 ↗(On Diff #490915)

Why do we allow to be called with an old struct, wouldn't we have converted it at this point?

364 ↗(On Diff #490915)

We have llvm::make_range, or similar

jhuber6 marked 2 inline comments as done.Jan 20 2023, 2:36 PM
jhuber6 added inline comments.
clang/lib/CodeGen/CGOpenMPRuntime.cpp
10919

Probably a good point. Would this be an unreachable? Since I think Sema should catch anything we don't expect here.

openmp/libomptarget/src/rtl.cpp
360 ↗(On Diff #490915)

Leftover, will remove.

jhuber6 updated this revision to Diff 491020.Jan 20 2023, 7:36 PM

Addressing comments

clang/test/OpenMP/teams_distribute_parallel_for_simd_reduction_codegen.cpp