This is an archive of the discontinued LLVM Phabricator instance.

[Clang] Introduce `--offload-link` option to perform offload device linking
ClosedPublic

Authored by jhuber6 on May 25 2022, 10:30 AM.

Details

Summary

The new driver uses an augmented linker wrapper to perform the device
linking phase, but to the user looks like a regular linker invocation.
Contrary to the old driver, the new driver contains all the information
necessary to produce a linked device image in the host object itself.
Currently, we infer the usage of the device linker by the user
specifying an offloading toolchain, e.g. (--offload-arch=...) or
(-fopenmp-targets=...), but this shouldn't be strictly necessary.
This patch introduces a new option --offload-link to tell
the driver to use the offloading linker instead. So a compilation flow
can now look like this,

clang foo.cu --offload-new-driver -fgpu-rdc --offload-arch=sm_70 -c
clang foo.o --offload-link -lcudart

I was considering if this could be merged into the -fuse-ld option,
but because the device linker wraps over the users linker it would
conflict with that. In the future it's possible to merge this into lld
completely or gold via a plugin and we would use this option to
enable the device linking feature. Let me know what you think for this.

Diff Detail

Event Timeline

jhuber6 created this revision.May 25 2022, 10:30 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 25 2022, 10:30 AM
Herald added a subscriber: StephenFan. · View Herald Transcript
jhuber6 requested review of this revision.May 25 2022, 10:30 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 25 2022, 10:30 AM
tra added a comment.May 25 2022, 11:08 AM

Currently, we infer the usage of the device linker by the user
specifying an offloading toolchain, e.g. (--offload-arch=...) or
(-fopenmp-targets=...), but this shouldn't be strictly necessary.

Yup. Whether we want to perform device link or not is orthogonal to those options.

This patch introduces a new option -dl to tell the driver to use the
offloading linker instead. So a compilation flow can now look like this,

clang foo.cu --offload-new-driver -fgpu-rdc --offload-arch=sm_70 -c
clang foo.o -dl -lcudart

It's an essential feature, as we do want to be able to produce libraries with host object files, but with fully linked GPU executables.
Naming, as usual, is hard. I would prefer a more explicit --offload-link which would be in line with other --offload* options we have by now.
-dl is cryptic for uninitiated and is uncomfortably close to commonly used -ldl. If it gets mistyped as -ld, it would lead to a legitimate but unrelated error about missing libd. Or it might silently succeed linking with libd without actually doing any device linking.

I was considering if this could be merged into the -fuse-ld option,
but becuse the device linker wraps over the users linker it would
conflict with that. In the future it's possible to merge this into lld
completely or gold via a plugin. Let me know what you think for this.

Naming, as usual, is hard. I would prefer a more explicit --offload-link which would be in line with other --offload* options we have by now.
-dl is cryptic for uninitiated and is uncomfortably close to commonly used -ldl. If it gets mistyped as -ld, it would lead to a legitimate but unrelated error about missing libd. Or it might silently succeed linking with libd without actually doing any device linking.

Yeah, I can see your point, --offload-link definitely works but it would be nice to have something less verbose. Maybe could just use -dlink or something.

I was considering if this could be merged into the -fuse-ld option,
but becuse the device linker wraps over the users linker it would
conflict with that. In the future it's possible to merge this into lld
completely or gold via a plugin. Let me know what you think for this.

I should also add, even if we built-in support for this into LLD or Gold, we'll probably still need a flag like this to tell Gold to use the plugin, or LLD to do the extra processing.

Unrelated, but in the future I'm also considering making the linker wrapper add the necessary libraries for whatever offloading kinds it found. E.g. if the link job finds embedded OpenMP code it will add -lomp -lomptarget if not already present.

jhuber6 retitled this revision from [Clang] Introduce `-dl` option to perform offload device linking to [Clang] Introduce `-dlink` option to perform offload device linking.May 25 2022, 11:39 AM
jhuber6 edited the summary of this revision. (Show Details)
jhuber6 updated this revision to Diff 432060.May 25 2022, 11:40 AM

Changing to use --offload-link and use -dlink as an alias.

tra added inline comments.May 25 2022, 11:53 AM
clang/include/clang/Driver/Options.td
825–826

We typically use option aliases to provide compatibility with the legacy options. AFAICT there are no current uses of Alias for the sake of saving a few characters in an option name.

Is -dlink really needed? It's not going to be typed manually all that often, and a slightly longer option does not make any difference for cmake or make files.

I assume that partial motivation for -dlink is that it's a shortened alias used by NVCC for its functionally similar --device-link option. I do not think it buys us anything. We never intended to be option-compatible with nvcc and clang's CUDA compilation and relevant options have only partial overlap with NVCC's functionality-wise and almost none syntax-wise. Adding one rarely used option for the same of matching NVCC's is not worth it, IMO.

jhuber6 added inline comments.May 25 2022, 11:59 AM
clang/include/clang/Driver/Options.td
825–826

Yeah, it's somewhat mental compatibility with Nvidia parlance, see -fgpu-rdc and -rdc=true, if you think that's not necessary I'll just make it --offload-link.

jhuber6 retitled this revision from [Clang] Introduce `-dlink` option to perform offload device linking to [Clang] Introduce `--offload-link` option to perform offload device linking.May 25 2022, 12:00 PM
jhuber6 edited the summary of this revision. (Show Details)
jhuber6 updated this revision to Diff 432071.May 25 2022, 12:02 PM

Removing -dlink

tra accepted this revision.May 25 2022, 12:04 PM
This revision is now accepted and ready to land.May 25 2022, 12:04 PM