This is an archive of the discontinued LLVM Phabricator instance.

Differential D126226

[OpenMP] Add `-Xoffload-linker` to forward input to the device linker
ClosedPublic

Authored by jhuber6 on May 23 2022, 10:44 AM.

Details

Reviewers

markdewing
jdoerfert
tianshilei1992
JonChesterfield
tra

Commits

rGf37101983fc9: [OpenMP] Add `-Xoffload-linker` to forward input to the device linker

Summary

We use the clang-linker-wrapper to perform device linking of embedded
offloading object files. This is done by generating those jobs inside of
the linker-wrapper itself. This patch adds an argument in Clang and the
linker-wrapper that allows users to forward input to the device linking
phase. This can either be done for every device linker, or for a
specific target triple. We use the -Xoffload-linker <arg> and the
-Xoffload-linker-<triple> <arg> syntax to accomplish this.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

jhuber6 created this revision. May 23 2022, 10:44 AM

Herald added a project: Restricted Project. May 23 2022, 10:44 AM

Herald added subscribers: guansong, yaxunl.

jhuber6 requested review of this revision. May 23 2022, 10:44 AM

Herald added subscribers: cfe-commits, sstefan1, MaskRay.

Works for passing libraries to nvlink.

This revision is now accepted and ready to land. May 23 2022, 11:10 AM

Harbormaster completed remote builds in B165877: Diff 431424. May 23 2022, 11:20 AM

-Xoffload-linker=<triple> <arg>

The syntax is confusing. Normally only triple would be the argument for -Xoffload-linker option.
Having both -Xoffload-linker and -Xoffload-linker= variants also looks odd to me.

In effect you're making -Xoffload-linker=foo a full option (as opposed to it being an option -Xoffload-linker= + argument foo) with a separate argument that follows. I guess that might work, but it's a rather unconventional use of command line parser, IMO.

I think the main issue with this approach is that it makes the command line hard to understand. When one sees -Xsomething=a -b it's impossible to tell whether -b is a regular option or something to be passed to -Xsomething=a. My assumption would be the former as -Xsomething= already got its argument a and should have no business grabbing the next one.

I think it would work better if the option could use a - or `_` for the variant that passes the triple. E.g. -Xoffload-linker-nvptx64=-foo or -Xoffload-linker-nvptx64 -foo would be easily interpretable.

tra requested changes to this revision. May 23 2022, 11:45 AM

This revision now requires changes to proceed. May 23 2022, 11:45 AM

In D126226#3532127, @tra wrote:

-Xoffload-linker=<triple> <arg>

The syntax is confusing. Normally only triple would be the argument for -Xoffload-linker option.
Having both -Xoffload-linker and -Xoffload-linker= variants also looks odd to me.

In effect you're making -Xoffload-linker=foo a full option (as opposed to it being an option -Xoffload-linker= + argument foo) with a separate argument that follows. I guess that might work, but it's a rather unconventional use of command line parser, IMO.

I think the main issue with this approach is that it makes the command line hard to understand. When one sees -Xsomething=a -b it's impossible to tell whether -b is a regular option or something to be passed to -Xsomething=a. My assumption would be the former as -Xsomething= already got its argument a and should have no business grabbing the next one.

I think it would work better if the option could use a - or `_` for the variant that passes the triple. E.g. -Xoffload-linker-nvptx64=-foo or -Xoffload-linker-nvptx64 -foo would be easily interpretable.

We already use this approach for the -Xopenmp-target=<triple> <arg> option that forwards arguments to the given toolchain, so that's what I was using as a basis. I'm not sure if I want to hard-code the triple value into the argument itself, since this could theoretically be used for any number of triples, e.g. OpenMP offloading to ppcle64 and x86_64, so it would get a little messy. It's definitely a bit harder to read, but it's a bit of a special-case option so it's to be expected I think.

In D126226#3532147, @jhuber6 wrote:

We already use this approach for the -Xopenmp-target=<triple> <arg> option that forwards arguments to the given toolchain, so that's what I was using as a basis.

Yup. I've noticed that. It's unfortunate.

I'm not sure if I want to hard-code the triple value into the argument itself, since this could theoretically be used for any number of triples, e.g. OpenMP offloading to ppcle64 and x86_64, so it would get a little messy. It's definitely a bit harder to read, but it's a bit of a special-case option so it's to be expected I think.

You do not need to hardcode it. The idea of JoinedAndSeparate is that an option foo accepts two arguments, one glued to it and another following after a whitespace.
So, when you define an option -Xoffload-linker-, and then pass -Xoffload-linker-nvptx64=foo, you will get OPT_offload-linker__ with two arguments. As an example see implementation of plugin_arg which deals with the same kind of problem of passing arguments to an open-ended set of plugins.
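To make the two-value behaviour described here concrete, the following is a minimal standalone sketch. It is not the llvm::opt implementation: `parseJoinedAndSeparate` is a hypothetical helper written only for illustration, and it assumes the option is registered under a fixed prefix such as `-Xoffload-linker-`. Whatever is glued to the prefix becomes the first value, and the next command-line token becomes the second.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper, not part of llvm::opt. Matches a "joined and separate"
// option at Argv[Index]: the text glued to Prefix is the first value and the
// following token is the second value. On success both tokens are consumed.
std::optional<std::pair<std::string, std::string>>
parseJoinedAndSeparate(const std::vector<std::string> &Argv, std::size_t &Index,
                       const std::string &Prefix) {
  assert(Index < Argv.size() && "Index must refer to an existing token");
  const std::string &Tok = Argv[Index];
  // The token must start with the prefix, and a second token must follow it.
  if (Tok.compare(0, Prefix.size(), Prefix) != 0 || Index + 1 >= Argv.size())
    return std::nullopt;
  std::string Joined = Tok.substr(Prefix.size()); // May be empty.
  std::string Separate = Argv[Index + 1];
  Index += 2; // Consume the option token and its separate value.
  return std::make_pair(std::move(Joined), std::move(Separate));
}
```

With the tokens `-Xoffload-linker-nvptx64 -foo` and the prefix `-Xoffload-linker-`, the sketch yields the pair (`nvptx64`, `-foo`), so the set of triples stays open-ended and nothing is hardcoded in the option table.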

In D126226#3532216, @tra wrote:

You do not need to hardcode it. The idea of JoinedAndSeparate is that an option foo accepts two arguments, one glued to it and another following after a whitespace.
So, when you define an option -Xoffload-linker-, and then pass -Xoffload-linker-nvptx64=foo, you will get OPT_offload-linker__ with two arguments. As an example see implementation of plugin_arg which deals with the same kind of problem of passing arguments to an open-ended set of plugins.

I see, it's a little weird since the -Xopenmp-target option will be done differently, but changing to this should just require switching out the = with -. I'll go ahead and do it.

Changing the -Xoffload-linker= to -Xoffload-linker-.

jhuber6 edited the summary of this revision. May 23 2022, 12:30 PM

It's better to avoid JoinedAndSeparate for new options. It is for --xxx val and --xxxval but not intended for the option this patch will add.

This revision now requires changes to proceed. May 23 2022, 12:51 PM

Herald added a subscriber: StephenFan. May 23 2022, 12:51 PM

In D126226#3532257, @MaskRay wrote:

It's better to avoid JoinedAndSeparate for new options. It is for --xxx val and --xxxval but not intended for the option this patch will add.

So how should I pass these two arguments instead? I'm not sure if there's a good way to bind two arguments to a single command line argument that's readable if we're not allowed to use this joined version.

Harbormaster completed remote builds in B165893: Diff 431452. May 23 2022, 1:02 PM

IIRC there is no built-in way supporting multiple (but fixed number of) values for an option (e.g. -Xoffload-linker-<triple> <arg>). In D105330 (llvm-nm option refactoring) I used a hack to support -s __DATA __data.
The multiple-value support for OptTable does not allow positional arguments after the option.

Consider something like -Xoffload-linker-triple <triple>=<arg>

Updating to use @MaskRay's suggestion.

In D126226#3532257, @MaskRay wrote:

It's better to avoid JoinedAndSeparate for new options. It is for --xxx val and --xxxval but not intended for the option this patch will add.

I'm not sure I understand your argument. The two cases where I see JoinedAndSeparate are used right now (-Xarch_ and -plugin-arg) both are using it for the purposes similar to this patch.
I also do not quite see how JoinedAndSeparate is applicable to --xxxval/--xxx val.
Could you elaborate, please?

In D126226#3532301, @MaskRay wrote:

Consider something like -Xoffload-linker-triple <triple>=<arg>

That could work.

We keep running into the same old underlying issue that we do not have a good way to name/reference specific parts of the compilation pipeline. -Xfoo used to work OK for the linear 'standard' compilation pipeline, but these days when compilation grew from a simple linear pipe it's no longer adequate and we need to extend it.

Speaking of triples. I think using triple as the selector is insufficient for general offloading use. We may have offload variants that would use the same triple, but would be compiled using their own pipeline. E.g. the GPU binaries for sm_60 and sm_80 GPUs will use the same nvptx64 triple, but would presumably be linked with different linker instances and may need different options. My understanding is that AMDGPU has even more detailed offload variants (same triple, same GPU arch, different features). I don't know whether it's applicable to OpenMP, though. I think it is. IIRC OpenMP has a way to specialize offload to a particular GPU variant and that would probably give you multiple offload targets, all with the same triple.

In D126226#3532423, @tra wrote:

We keep running into the same old underlying issue that we do not have a good way to name/reference specific parts of the compilation pipeline. -Xfoo used to work OK for the linear 'standard' compilation pipeline, but these days when compilation grew from a simple linear pipe it's no longer adequate and we need to extend it.

Yeah, it's getting increasingly complicated to refer to certain portions of the compilation toolchain as we start adding more complicated stuff. Just recently I had a problem that I wanted to pass an -Xclang argument only to the CUDA toolchain, and there's no way to do it as far as I can tell. It may be worth revisiting this whole concept to support more arbitrary combinations.

Speaking of triples. I think using triple as the selector is insufficient for general offloading use. We may have offload variants that would use the same triple, but would be compiled using their own pipeline. E.g. the GPU binaries for sm_60 and sm_80 GPUs will use the same nvptx64 triple, but would presumably be linked with different linker instances and may need different options. My understanding is that AMDGPU has even more detailed offload variants (same triple, same GPU arch, different features). I don't know whether it's applicable to OpenMP, though. I think it is. IIRC OpenMP has a way to specialize offload to a particular GPU variant and that would probably give you multiple offload targets, all with the same triple.

Yes, it's not a truly generic solution. But I figured that just being able to specify it for each "tool-chain" was sufficient for the use-case here and we can expand it as needed. I added support for OpenMP to use --offload-arch recently so we definitely use it. The OpenMP offloading GPU runtime library is now built as a static library with --offload-arch= for all 32 supported architectures currently, it works surprisingly well.

Harbormaster completed remote builds in B165912: Diff 431475. May 23 2022, 2:26 PM

In D126226#3532471, @jhuber6 wrote:

In D126226#3532423, @tra wrote:

We keep running into the same old underlying issue that we do not have a good way to name/reference specific parts of the compilation pipeline. -Xfoo used to work OK for the linear 'standard' compilation pipeline, but these days when compilation grew from a simple linear pipe it's no longer adequate and we need to extend it.

Yeah, it's getting increasingly complicated to refer to certain portions of the compilation toolchain as we start adding more complicated stuff. Just recently I had a problem that I wanted to pass an -Xclang argument only to the CUDA toolchain, and there's no way to do it as far as I can tell. It may be worth revisiting this whole concept to support more arbitrary combinations.

-Xarch_device should do that for all device compilations, or you could use -Xarch_sm_XX if you need to pass it only to the compilation targeting sm_XX.

Speaking of triples. I think using triple as the selector is insufficient for general offloading use.

Yes, it's not a truly generic solution. But I figured that just being able to specify it for each "tool-chain" was sufficient for the use-case here and we can expand it as needed. I added support for OpenMP to use --offload-arch recently so we definitely use it. The OpenMP offloading GPU runtime library is now built as a static library with --offload-arch= for all 32 supported architectures currently, it works surprisingly well.

The comment was largely intended as a counterargument to @MaskRay's proposal to hardcode triples into arguments. It's doable, but with an ever-expanding set of offloading targets it will be a source of unnecessary churn. If it were just triples, it would be fine, but our set is potentially a Cartesian product of {triple, GPU variant}, and both AMDGPU and nvptx have quite a few GPU variants they can target.

In D126226#3532423, @tra wrote:

In D126226#3532257, @MaskRay wrote:

It's better to avoid JoinedAndSeparate for new options. It is for --xxx val and --xxxval but not intended for the option this patch will add.

I'm not sure I understand your argument. The two cases where I see JoinedAndSeparate are used right now (-Xarch_ and -plugin-arg) both are using it for the purposes similar to this patch.
I also do not quite see how JoinedAndSeparate is applicable to --xxxval/--xxx val.
Could you elaborate, please?

In D126226#3532301, @MaskRay wrote:

Consider something like -Xoffload-linker-triple <triple>=<arg>

That could work.

We keep running into the same old underlying issue that we do not have a good way to name/reference specific parts of the compilation pipeline. -Xfoo used to work OK for the linear 'standard' compilation pipeline, but these days when compilation grew from a simple linear pipe it's no longer adequate and we need to extend it.

Speaking of triples. I think using triple as the selector is insufficient for general offloading use. We may have offload variants that would use the same triple, but would be compiled using their own pipeline. E.g. the GPU binaries for sm_60 and sm_80 GPUs will use the same nvptx64 triple, but would presumably be linked with different linker instances and may need different options. My understanding is that AMDGPU has even more detailed offload variants (same triple, same GPU arch, different features). I don't know whether it's applicable to OpenMP, though. I think it is. IIRC OpenMP has a way to specialize offload to a particular GPU variant and that would probably give you multiple offload targets, all with the same triple.

OK, please ignore my comments. I see JoinedAndSeparate that supports 2 arguments. The relevant code is llvm/lib/Option/Option.cpp:197.
I hereby retract my objection.

MaskRay removed a reviewer: MaskRay. May 23 2022, 2:40 PM

Go back to the old joined method and also change the name to remove _EQ.

Harbormaster completed remote builds in B165924: Diff 431491. May 23 2022, 3:29 PM

tra added inline comments. May 23 2022, 3:39 PM

clang/include/clang/Driver/Options.td
826	This option still stands out as a sore thumb. Could we fold it into the one below as `-Xoffload-linker-all` ? Or, maybe make `Xoffload_linker_arg` use `"Xoffload-linker"` prefix and then check that the first argument is either empty (which would mean "for all") or "-<target>". Maybe we don't even need separate "-Xoffload_linker" option(s). I wonder if it would make sense to extend the existing `-Xlinker` and use `-Xlinker-<target>` ?

jhuber6 added inline comments. May 23 2022, 3:45 PM

clang/include/clang/Driver/Options.td
826	I don't think we could rework `-Xlinker` as it works by forwarding the arguments to the linker job, this requires some handling inside of Clang to format it properly. But I should definitely make it a single argument by just checking if the joined argument is empty.

Merging into a single argument and checking if the joined arg is empty.

Harbormaster completed remote builds in B165940: Diff 431511. May 23 2022, 4:34 PM

Couple of nits. LGTM otherwise.

clang/include/clang/Driver/Options.td
827	The comment may need updating now.
828	I think this is backwards. I think the first one here should be `<triple>` (the one joined with `-Xoffload-linker`), followed by `<arg>`

This revision is now accepted and ready to land. May 23 2022, 4:51 PM

jhuber6 marked 2 inline comments as done. May 23 2022, 4:53 PM

jhuber6 added inline comments.

clang/include/clang/Driver/Options.td
827	Sure, I'll change it before I commit.

Closed by commit rGf37101983fc9: [OpenMP] Add `-Xoffload-linker` to forward input to the device linker (authored by jhuber6). May 24 2022, 6:11 AM

This revision was automatically updated to reflect the committed changes.

jhuber6 marked an inline comment as done.

jhuber6 added a commit: rGf37101983fc9: [OpenMP] Add `-Xoffload-linker` to forward input to the device linker.

Revision Contents

Path                                                      Size
clang/include/clang/Driver/Options.td                     3 lines
clang/lib/Driver/ToolChains/Clang.cpp                     14 lines
clang/test/Driver/linker-wrapper.c                        12 lines
clang/test/Driver/openmp-offload-gpu-new.c                6 lines
clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp   20 lines

Diff 431511

clang/include/clang/Driver/Options.td

[817 lines not shown]
 def Xopenmp_target_EQ : JoinedAndSeparate<["-"], "Xopenmp-target=">, Group<CompileOnly_Group>,
   HelpText<"Pass <arg> to the target offloading toolchain identified by <triple>.">,
   MetaVarName<"<triple> <arg>">;
 def z : Separate<["-"], "z">, Flags<[LinkerInput, RenderAsInput]>,
   HelpText<"Pass -z <arg> to the linker">, MetaVarName<"<arg>">,
   Group<Link_Group>;
 def Xlinker : Separate<["-"], "Xlinker">, Flags<[LinkerInput, RenderAsInput]>,
   HelpText<"Pass <arg> to the linker">, MetaVarName<"<arg>">,
   Group<Link_Group>;
+def Xoffload_linker : JoinedAndSeparate<["-"], "Xoffload-linker">,
+  HelpText<"Pass <arg> to the offload linker identified by <triple>">,
+  MetaVarName<"<arg> <triple>">, Group<Link_Group>;
 def Xpreprocessor : Separate<["-"], "Xpreprocessor">, Group<Preprocessor_Group>,
   HelpText<"Pass <arg> to the preprocessor">, MetaVarName<"<arg>">;
 def X_Flag : Flag<["-"], "X">, Group<Link_Group>;
 def X_Joined : Joined<["-"], "X">, IgnoredGCCCompat;
 def Z_Flag : Flag<["-"], "Z">, Group<Link_Group>;
 // FIXME: All we do with this is reject it. Remove.
 def Z_Joined : Joined<["-"], "Z">;
 def all__load : Flag<["-"], "all_load">;
[5,962 lines not shown]

clang/lib/Driver/ToolChains/Clang.cpp

[8,384 lines not shown]
     CmdArgs.push_back(
         Args.MakeArgString(Twine("-pass-remarks-analysis=") + A->getValue()));
   if (Args.getLastArg(options::OPT_save_temps_EQ))
     CmdArgs.push_back("-save-temps");

   // Construct the link job so we can wrap around it.
   Linker->ConstructJob(C, JA, Output, Inputs, Args, LinkingOutput);
   const auto &LinkCommand = C.getJobs().getJobs().back();

+  // Forward -Xoffload-linker<-triple> arguments to the device link job.
+  for (auto *Arg : Args.filtered(options::OPT_Xoffload_linker)) {
+    StringRef Val = Arg->getValue(0);
+    if (Val.empty())
+      CmdArgs.push_back(
+          Args.MakeArgString(Twine("-device-linker=") + Arg->getValue(1)));
+    else
+      CmdArgs.push_back(Args.MakeArgString(
+          "-device-linker=" +
+          ToolChain::getOpenMPTriple(Val.drop_front()).getTriple() + "=" +
+          Arg->getValue(1)));
+  }
+  Args.ClaimAllArgs(options::OPT_Xoffload_linker);
+
   // Add the linker arguments to be forwarded by the wrapper.
   CmdArgs.push_back("-linker-path");
   CmdArgs.push_back(LinkCommand->getExecutable());
   CmdArgs.push_back("--");
   for (const char *LinkArg : LinkCommand->getArguments())
     CmdArgs.push_back(LinkArg);

   const char *Exec =
       Args.MakeArgString(getToolChain().GetProgramPath("clang-linker-wrapper"));

   // Replace the executable and arguments of the link job with the
   // wrapper.
   LinkCommand->replaceExecutable(Exec);
   LinkCommand->replaceArguments(CmdArgs);
 }
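The driver-side translation in the hunk above can be exercised in isolation with a small sketch. This is only an illustration under stated assumptions: `normalizeTriple` is a hypothetical stand-in for `ToolChain::getOpenMPTriple` that covers just the one expansion the driver test relies on (`nvptx64` to `nvptx64-nvidia-cuda`), and `makeDeviceLinkerArg` is a made-up name for the logic inside the loop. The assertion on the leading `-` is an addition of the sketch; the patch itself calls `drop_front()` unconditionally.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for ToolChain::getOpenMPTriple. Only the short form
// used in the driver test is expanded; everything else is returned unchanged.
std::string normalizeTriple(const std::string &Triple) {
  if (Triple == "nvptx64")
    return "nvptx64-nvidia-cuda";
  return Triple;
}

// Builds the argument passed to the linker wrapper from the two values of
// -Xoffload-linker. Joined is empty for "-Xoffload-linker <arg>" and has the
// form "-<triple>" for "-Xoffload-linker-<triple> <arg>".
std::string makeDeviceLinkerArg(const std::string &Joined,
                                const std::string &Value) {
  if (Joined.empty())
    return "-device-linker=" + Value;
  assert(Joined.front() == '-' && "joined value is expected to be -<triple>");
  // Drop the leading '-' before normalizing the triple.
  return "-device-linker=" + normalizeTriple(Joined.substr(1)) + "=" + Value;
}
```

The three cases correspond to the `-Xoffload-linker a -Xoffload-linker-nvptx64-nvidia-cuda b -Xoffload-linker-nvptx64 c` invocation in openmp-offload-gpu-new.c: an empty joined value applies to every device linker, and both spellings of the NVPTX triple end up as `nvptx64-nvidia-cuda`.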

clang/test/Driver/linker-wrapper.c

[74 lines not shown]
 // RUN: %clang -cc1 %s -triple x86_64-unknown-linux-gnu -emit-obj -o %t.o \
 // RUN: -fembed-offload-object=%t.out
 // RUN: clang-linker-wrapper --dry-run --host-triple x86_64-unknown-linux-gnu -linker-path \
 // RUN: /usr/bin/ld -- %t.o -o a.out 2>&1 | FileCheck %s --check-prefix=CUDA

 // CUDA: nvlink{{.*}}-m64 -o {{.*}}.out -arch sm_52 {{.*}}.o
 // CUDA: nvlink{{.*}}-m64 -o {{.*}}.out -arch sm_70 {{.*}}.o {{.*}}.o
 // CUDA: fatbinary{{.*}}-64 --create {{.*}}.fatbin --image=profile=sm_52,file={{.*}}.out --image=profile=sm_70,file={{.*}}.out
+
+// RUN: clang-offload-packager -o %t.out \
+// RUN: --image=file=%S/Inputs/dummy-elf.o,kind=openmp,triple=amdgcn-amd-amdhsa,arch=gfx908 \
+// RUN: --image=file=%S/Inputs/dummy-elf.o,kind=openmp,triple=nvptx64-nvidia-cuda,arch=sm_70
+// RUN: %clang -cc1 %s -triple x86_64-unknown-linux-gnu -emit-obj -o %t.o \
+// RUN: -fembed-offload-object=%t.out
+// RUN: clang-linker-wrapper --dry-run --host-triple x86_64-unknown-linux-gnu -linker-path \
+// RUN: /usr/bin/ld --device-linker=a --device-linker=nvptx64-nvidia-cuda=b -- \
+// RUN: %t.o -o a.out 2>&1 | FileCheck %s --check-prefix=LINKER_ARGS
+
+// LINKER_ARGS: lld{{.*}}-flavor gnu --no-undefined -shared -o {{.*}}.out {{.*}}.o a
+// LINKER_ARGS: nvlink{{.*}}-m64 -o {{.*}}.out -arch sm_70 {{.*}}.o a b

clang/test/Driver/openmp-offload-gpu-new.c

[98 lines not shown]
 // RUN: %clang -### --target=x86_64-unknown-linux-gnu -ccc-print-bindings -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda \
 // RUN: --offload-device-only -E -nogpulib %s 2>&1 | FileCheck %s --check-prefix=CHECK-DEVICE-ONLY-PP
 // CHECK-DEVICE-ONLY-PP: "nvptx64-nvidia-cuda" - "clang", inputs: ["[[INPUT:.*]]"], output: "-"

 // RUN: %clang -### --target=x86_64-unknown-linux-gnu -fopenmp --offload-arch=sm_52 -nogpulib \
 // RUN: -foffload-lto %s 2>&1 | FileCheck --check-prefix=CHECK-LTO-LIBRARY %s

 // CHECK-LTO-LIBRARY: {{.*}}-lomptarget{{.*}}-lomptarget.devicertl
+
+// RUN: %clang -### --target=x86_64-unknown-linux-gnu -fopenmp --offload-arch=sm_52 -nogpulib \
+// RUN: -Xoffload-linker a -Xoffload-linker-nvptx64-nvidia-cuda b -Xoffload-linker-nvptx64 c \
+// RUN: %s 2>&1 | FileCheck --check-prefix=CHECK-XLINKER %s
+
+// CHECK-XLINKER: -device-linker=a{{.*}}-device-linker=nvptx64-nvidia-cuda=b{{.*}}-device-linker=nvptx64-nvidia-cuda=c{{.*}}--

clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp

[102 lines not shown]
     HostTriple("host-triple", cl::ZeroOrMore,
                cl::init(sys::getDefaultTargetTriple()),
                cl::cat(ClangLinkerWrapperCategory));

 static cl::list<std::string>
     PtxasArgs("ptxas-args", cl::ZeroOrMore,
               cl::desc("Argument to pass to the ptxas invocation"),
               cl::cat(ClangLinkerWrapperCategory));

+static cl::list<std::string>
+    LinkerArgs("device-linker", cl::ZeroOrMore,
+               cl::desc("Arguments to pass to the device linker invocation"),
+               cl::value_desc("<value> or <triple>=<value>"),
+               cl::cat(ClangLinkerWrapperCategory));
+
 static cl::opt<bool> Verbose("v", cl::ZeroOrMore,
                              cl::desc("Verbose output from tools"),
                              cl::init(false),
                              cl::cat(ClangLinkerWrapperCategory));

 static cl::opt<DebugKind> DebugInfo(
     cl::desc("Choose debugging level:"), cl::init(NoDebugInfo),
     cl::values(clEnumValN(NoDebugInfo, "g0", "No debug information"),
[102 lines not shown]
 void printCommands(ArrayRef<StringRef> CmdArgs) {
   if (CmdArgs.empty())
     return;

   llvm::errs() << " \"" << CmdArgs.front() << "\" ";
   for (auto IC = std::next(CmdArgs.begin()), IE = CmdArgs.end(); IC != IE; ++IC)
     llvm::errs() << *IC << (std::next(IC) != IE ? " " : "\n");
 }

+// Forward user requested arguments to the device linking job.
+void renderXLinkerArgs(SmallVectorImpl<StringRef> &Args, StringRef Triple) {
+  for (StringRef Arg : LinkerArgs) {
+    auto TripleAndValue = Arg.split('=');
+    if (TripleAndValue.second.empty())
+      Args.push_back(TripleAndValue.first);
+    else if (TripleAndValue.first == Triple)
+      Args.push_back(TripleAndValue.second);
+  }
+}
+
 std::string getMainExecutable(const char *Name) {
   void *Ptr = (void *)(intptr_t)&getMainExecutable;
   auto COWPath = sys::fs::getMainExecutable(Name, Ptr);
   return sys::path::parent_path(COWPath).str();
 }

 /// Extract the device file from the string '<kind>-<triple>-<arch>=<library>'.
 DeviceFile getBitcodeLibrary(StringRef LibraryStr) {
[289 lines not shown; context: Expected<std::string> link(ArrayRef<std::string> InputFiles, Triple TheTriple,]
   CmdArgs.push_back(TempFile);
   CmdArgs.push_back("-arch");
   CmdArgs.push_back(Arch);

   // Add extracted input files.
   for (StringRef Input : InputFiles)
     CmdArgs.push_back(Input);

+  renderXLinkerArgs(CmdArgs, TheTriple.getTriple());
   if (Error Err = executeCommands(*NvlinkPath, CmdArgs))
     return std::move(Err);

   return static_cast<std::string>(TempFile);
 }

 Expected<std::string> fatbinary(ArrayRef<StringRef> InputFiles,
                                 Triple TheTriple, ArrayRef<StringRef> Archs) {
[52 lines not shown; context: Expected<std::string> link(ArrayRef<std::string> InputFiles, Triple TheTriple,]
   CmdArgs.push_back("-shared");
   CmdArgs.push_back("-o");
   CmdArgs.push_back(TempFile);

   // Add extracted input files.
   for (StringRef Input : InputFiles)
     CmdArgs.push_back(Input);

+  renderXLinkerArgs(CmdArgs, TheTriple.getTriple());
   if (Error Err = executeCommands(*LLDPath, CmdArgs))
     return std::move(Err);

   return static_cast<std::string>(TempFile);
 }
 } // namespace amdgcn

 namespace generic {
[61 lines not shown; context: Expected<std::string> link(ArrayRef<std::string> InputFiles, Triple TheTriple,]
   CmdArgs.push_back("-Bsymbolic");
   CmdArgs.push_back("-o");
   CmdArgs.push_back(TempFile);

   // Add extracted input files.
   for (StringRef Input : InputFiles)
     CmdArgs.push_back(Input);

+  renderXLinkerArgs(CmdArgs, TheTriple.getTriple());
   if (Error Err = executeCommands(LinkerUserPath, CmdArgs))
     return std::move(Err);

   return static_cast<std::string>(TempFile);
 }
 } // namespace generic

 Expected<std::string> linkDevice(ArrayRef<std::string> InputFiles,
[681 lines not shown]
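The wrapper-side filtering done by `renderXLinkerArgs` can be reproduced without LLVM headers. The sketch below is an approximation, not the patch itself: `selectLinkerArgs` is a hypothetical name, and `std::string` with `find`/`substr` stands in for `StringRef::split('=')`. Each forwarded entry is either `<value>`, which applies to every device linker, or `<triple>=<value>`, which applies only when the triple matches.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical standalone version of the filtering in renderXLinkerArgs.
// Splits each entry at the first '=' and keeps the values that apply to the
// device linker identified by Triple.
std::vector<std::string>
selectLinkerArgs(const std::vector<std::string> &LinkerArgs,
                 const std::string &Triple) {
  std::vector<std::string> Out;
  for (const std::string &Arg : LinkerArgs) {
    std::size_t Eq = Arg.find('=');
    std::string First = Arg.substr(0, Eq); // Whole string when no '=' exists.
    std::string Second =
        Eq == std::string::npos ? std::string() : Arg.substr(Eq + 1);
    if (Second.empty())
      Out.push_back(First); // Plain "<value>": forwarded to every target.
    else if (First == Triple)
      Out.push_back(Second); // "<triple>=<value>": forwarded on a match only.
  }
  return Out;
}

// Mirrors the LINKER_ARGS expectations in linker-wrapper.c.
void checkSelectLinkerArgs() {
  const std::vector<std::string> Args = {"a", "nvptx64-nvidia-cuda=b"};
  const std::vector<std::string> ForAMD = {"a"};
  const std::vector<std::string> ForNVPTX = {"a", "b"};
  assert(selectLinkerArgs(Args, "amdgcn-amd-amdhsa") == ForAMD);
  assert(selectLinkerArgs(Args, "nvptx64-nvidia-cuda") == ForNVPTX);
}
```

One consequence of splitting at the first `=` is visible in the sketch: a forwarded value that itself contains `=`, such as `--foo=bar`, is read as `<triple>=<value>` and is dropped unless `--foo` happens to equal the triple.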
