This is an archive of the discontinued LLVM Phabricator instance.

[libomptarget][devicertl] Drop templated atomic functions
ClosedPublic

Authored by JonChesterfield on Jan 20 2021, 2:19 PM.

Details

Summary

[libomptarget][devicertl] Drop templated atomic functions

The five __kmpc_atomic templates are instantiated a total of seven times.
This change replaces the template with explictly typed functions, which
have the same prototype for amdgcn and nvptx, and implements them with
the same code presently in use.

Rolls in the accepted but not yet landed D95085.

The unsigned long long type can be replaced with uint64_t when replacing
the cuda function. Until then, clang warns on casting a pointer to one to
a pointer to the other.

Diff Detail

Event Timeline

JonChesterfield requested review of this revision.Jan 20 2021, 2:19 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 20 2021, 2:19 PM
openmp/libomptarget/deviceRTLs/nvptx/src/target_impl.cu
143

D95085 with the possibly unnecessary DEVICE annotation

I didn't understand why we need to drop the template?

JonChesterfield added a comment.EditedJan 20 2021, 2:57 PM

Most of the intrinsic functions are defined on a single type, e.g. if we wanted to use the nvvm encoded one for atomic_inc.

Primarily, functions are simpler than templates. E.g. I like being able to disassemble target_impl.cu to see whether changing from atomicExch to an openmp pragma or a clang intrinsic changes the IR generated.

Also means we can use the same declaration for amdgcn and nvptx with different implementations, without doing the explicit template instantiation thing. We're not there yet, but most of target_impl.h should be identical across targets and moved into common.

tianshilei1992 accepted this revision.Jan 21 2021, 7:30 PM

LGTM with some nits. We might want to rewrite these atomics with LLVM intrinsics.

openmp/libomptarget/deviceRTLs/amdgcn/src/target_impl.hip
134

This change seems unrelated.

140–141

ditto

This revision is now accepted and ready to land.Jan 21 2021, 7:30 PM

I'd like to call the clang/llvm intrinsics instead, plus replace unsigned long long with uint64_t. Easier to check the correctness of that after the call sites are limited to one file.

It's probably also time to move the common parts of the two target_impl files to a header in common, will try to think of a good name for it.

openmp/libomptarget/deviceRTLs/amdgcn/src/target_impl.hip
134

Yeah, result of clang-format at file scope

This revision was landed with ongoing or failed builds.Jan 22 2021, 6:48 AM
This revision was automatically updated to reflect the committed changes.