This is an archive of the discontinued LLVM Phabricator instance.


Differential D149978

[Clang][NVPTX] Allow passing arguments to the linker while standalone
Needs Revision · Public

Authored by jhuber6 on May 5 2023, 12:03 PM.


Details

Reviewers

JonChesterfield
tra
yaxunl
MaskRay

Summary

We support standalone compilation for the NVPTX architecture using
'nvlink' as our linker. Because of the special handling required to
transform input files to cubins, as nvlink expects for some reason, we
didn't use the standard AddLinkerInput method. However, this also
meant that we weren't forwarding options passed with -Wl to the
linker. Add this support in for the standalone toolchain path.
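As a rough illustration of what forwarding `-Wl` options means, the sketch below expands each `-Wl,a,b,c` driver argument into separate linker arguments while passing file names through unchanged. It is a standalone approximation, not the clang driver code: `splitWlValues` and `buildLinkerArgs` are hypothetical names, and the real patch relies on the driver's own argument handling rather than string splitting.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical helper (not a clang API): split the payload of one
// "-Wl,a,b,c" driver argument into the individual linker arguments.
std::vector<std::string> splitWlValues(std::string_view Arg) {
  constexpr std::string_view Prefix = "-Wl,";
  assert(Arg.substr(0, Prefix.size()) == Prefix && "expected a -Wl, argument");
  Arg.remove_prefix(Prefix.size());

  std::vector<std::string> Values;
  while (true) {
    std::size_t Comma = Arg.find(',');
    // substr(0, npos) yields the remainder when no comma is left.
    Values.emplace_back(Arg.substr(0, Comma));
    if (Comma == std::string_view::npos)
      break;
    Arg.remove_prefix(Comma + 1);
  }
  return Values;
}

// Hypothetical stand-in for the linker job: anything that is not a -Wl,
// argument is appended as-is, and every -Wl, argument is expanded in place.
std::vector<std::string>
buildLinkerArgs(const std::vector<std::string> &DriverArgs) {
  std::vector<std::string> CmdArgs;
  for (const std::string &A : DriverArgs) {
    if (A.rfind("-Wl,", 0) == 0) {
      for (std::string &V : splitWlValues(A))
        CmdArgs.push_back(std::move(V));
    } else {
      CmdArgs.push_back(A);
    }
  }
  return CmdArgs;
}
```

Under these assumptions, `buildLinkerArgs({"kernel.cubin", "-Wl,--suppress-stack-size-warning"})` yields the file name followed by `--suppress-stack-size-warning`, which is the kind of invocation the patch aims to enable.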

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

jhuber6 created this revision. May 5 2023, 12:03 PM

Herald added a project: Restricted Project. May 5 2023, 12:03 PM

Herald added subscribers: mattd, gchakrabarti, asavonic.

jhuber6 requested review of this revision. May 5 2023, 12:03 PM

Herald added a subscriber: cfe-commits.

hliao added a subscriber: hliao. May 5 2023, 12:04 PM

tra added inline comments. May 5 2023, 12:18 PM

clang/lib/Driver/ToolChains/Cuda.cpp
594	Is removal of this line intentional?
637–646	I'd prefer to replace it with `if (II.isFilename()) { do stuff... } else { if (!II.isNothing()) II.getInputArg().renderAsInput(Args, CmdArgs); }`

jhuber6 marked an inline comment as done. May 5 2023, 1:07 PM

jhuber6 added inline comments.

clang/lib/Driver/ToolChains/Cuda.cpp
594	No, thanks. That was left over from when I originally tried to use `AddLinkerInput`, but it didn't work because of the `cubin` thing.

Addressing comments

tra accepted this revision. May 5 2023, 1:26 PM

This revision is now accepted and ready to land. May 5 2023, 1:26 PM

Harbormaster completed remote builds in B230313: Diff 519957. May 5 2023, 1:43 PM

Somewhat annoying, I've discovered that LLVM adds -Wl,-fcolor-diagnostics which obviously isn't supported by nvlink so it fails while including this in libc's CMake. Any clue if there's a way to work around that?

The main reason I made this patch was to allow passing --suppress-stack-size-warning to nvlink. But it turns out it's a little more difficult there.

In D149978#4323210, @jhuber6 wrote:

Somewhat annoying, I've discovered that LLVM adds -Wl,-fcolor-diagnostics which obviously isn't supported by nvlink so it fails while including this in libc's CMake. Any clue if there's a way to work around that?

I guess the options are to either filter out the automatically added option or to avoid adding that particular argument if we know that the target is NVPTX. The latter would probably be preferable as there would be only one place where the decision is made.

In D149978#4323221, @tra wrote:

In D149978#4323210, @jhuber6 wrote:

Somewhat annoying, I've discovered that LLVM adds -Wl,-fcolor-diagnostics which obviously isn't supported by nvlink so it fails while including this in libc's CMake. Any clue if there's a way to work around that?

I guess the options are to either filter out the automatically added option or to avoid adding that particular argument if we know that the target is NVPTX. The latter would probably be preferable as there would be only one place where the decision is made.

The latter is a little difficult, the logic adds it based on the host linker, but we explicitly override the host triple when we build via --target=. So there'd be no way to turn it off in LLVM unless it's a blanket check on building libc. And since it's a global flag I can't just disable it only for the target. So I think the options are to either filter it out manually here or make a new flag called -Xcuda-nvlink, which I wouldn't like to do.

Putting up the hack that works around my problem with libc. Definitely not a good solution though.

The latter is a little difficult,

The more we dig, the more we want GPU-capable lld. :-)

clang/lib/Driver/ToolChains/Cuda.cpp
641	Can there ever be more than one value returned by `II.getInputArg().getValues()`? If so, we probably don't want to skip all of them if one of them is `--color-diagnostics`. We may want to ignore only singleton `--color-diagnostics` and let all other combinations error out.

In D149978#4323328, @tra wrote:

The latter is a little difficult,

The more we dig, the more we want GPU-capable lld. :-)

My thoughts exactly. I had a small chat with @MaskRay about how difficult it would be to spin up support for NVPTX. But it would probably be a reasonably large project and, considering who I work for, it would be difficult for me to do it as more than a hobby.

clang/lib/Driver/ToolChains/Cuda.cpp
641	Yeah, you can do `-Wl,arg1,arg2,arg3`. This was just because I couldn't think of an easy way to separate them out, considering that we rely on `renderAsInput` we'd need to create an entirely new arg. Which is doable, but I wasn't sure if it was worth the effort.
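The difference between the two behaviours discussed here can be sketched outside the driver. In the standalone example below, `skipsWholeArg` mirrors the `any_of` check in the diff (the whole `-Wl,` argument is dropped if any value matches), while `filterUnsupported` is a hypothetical per-value alternative that drops only the offending value. Neither function is clang code, and the list of values nvlink rejects is an assumption limited to `--color-diagnostics`.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>
#include <vector>

// Mirrors the check in the diff: true if ANY value of the argument is
// "--color-diagnostics", in which case the whole argument would be skipped.
bool skipsWholeArg(const std::vector<std::string> &Values) {
  return std::any_of(Values.begin(), Values.end(), [](const std::string &V) {
    return V == "--color-diagnostics";
  });
}

// Hypothetical alternative (not part of the patch): drop only the value
// nvlink is assumed to reject and keep the other values of the same argument.
std::vector<std::string>
filterUnsupported(const std::vector<std::string> &Values) {
  std::vector<std::string> Kept;
  std::copy_if(Values.begin(), Values.end(), std::back_inserter(Kept),
               [](const std::string &V) { return V != "--color-diagnostics"; });
  return Kept;
}
```

For the values of `-Wl,-v,--color-diagnostics`, the first function reports that everything would be skipped, including `-v`, whereas the second keeps `-v`. As noted in the reply, doing the latter in the driver would mean building a new argument instead of calling `renderAsInput` on the original one.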

Harbormaster completed remote builds in B230329: Diff 519977. May 5 2023, 3:18 PM

I've discovered that LLVM adds -Wl,-fcolor-diagnostics

Can you tell me where it's done?

In D149978#4323452, @tra wrote:

I've discovered that LLVM adds -Wl,-fcolor-diagnostics

Can you tell me where it's done?

llvm/cmake/modules/HandleLLVMOptions.cmake:994

barannikov88 added a subscriber: barannikov88. May 5 2023, 3:44 PM

In D149978#4323457, @jhuber6 wrote:

llvm/cmake/modules/HandleLLVMOptions.cmake:994

I do not think that we should work around this particular source of options in clang driver.

This sounds like something that may need to be dealt with in cmake.
The root cause is that cmake assumes that if the linker accepts the flag for target X, it will accept that flag for target Y. Or that the flags will be used only for compiling for the default target. Considering that clang is a cross-compiler, neither of the assumptions is universally true.

We may need to add a concept of per-offload-arch options that would be checked with specific --target=.... It's a bigger can of worms than just filtering the argument out, but I think we'll need to deal with it sooner or later anyways.

As a short-term stop-gap solution, I would suggest adding a cmake knob to disable linker color output altogether. This should unblock you and would not affect anybody else until we have a better fix.

This revision now requires changes to proceed. May 5 2023, 4:04 PM

In D149978#4323457, @jhuber6 wrote:

In D149978#4323452, @tra wrote:

I've discovered that LLVM adds -Wl,-fcolor-diagnostics

Can you tell me where it's done?

llvm/cmake/modules/HandleLLVMOptions.cmake:994

This might have something to do with

# Handle common options used by all runtimes.
include(AddLLVM)

in runtimes/CMakeLists.txt

which is totally wrong in my opinion.
In an LLVM_ENABLE_PROJECTS build (not sure about an LLVM_ENABLE_RUNTIMES build), AddLLVM determines flags for building LLVM itself, not for building runtime libraries.
The results of the tests of the host toolchain therefore affect the invocation of the target toolchain.
I came across this once in https://reviews.llvm.org/D146920#4240370

Revision Contents

Path	Size
clang/lib/Driver/ToolChains/Cuda.cpp	53 lines
clang/test/Driver/cuda-cross-compiling.c	9 lines

Diff 519977

clang/lib/Driver/ToolChains/Cuda.cpp

void NVPTX::Linker::ConstructJob(Compilation &C, const JobAction &JA,

   StringRef GPUArch = Args.getLastArgValue(options::OPT_march_EQ);
   assert(!GPUArch.empty() && "At least one GPU Arch required for nvlink.");

   CmdArgs.push_back("-arch");
   CmdArgs.push_back(Args.MakeArgString(GPUArch));

   // Add paths specified in LIBRARY_PATH environment variable as -L options.
   addDirectoryList(Args, CmdArgs, "-L", "LIBRARY_PATH");

   // Add paths for the default clang library path.
   SmallString<256> DefaultLibPath =
       llvm::sys::path::parent_path(TC.getDriver().Dir);
   llvm::sys::path::append(DefaultLibPath, CLANG_INSTALL_LIBDIR_BASENAME);
   CmdArgs.push_back(Args.MakeArgString(Twine("-L") + DefaultLibPath));

   for (const auto &II : Inputs) {
     if (II.getType() == types::TY_LLVM_IR || II.getType() == types::TY_LTO_IR ||
         II.getType() == types::TY_LTO_BC || II.getType() == types::TY_LLVM_BC) {
       C.getDriver().Diag(diag::err_drv_no_linker_llvm_support)
           << getToolChain().getTripleString();
       continue;
     }

-    // Currently, we only pass the input files to the linker, we do not pass
-    // any libraries that may be valid only for the host.
-    if (!II.isFilename())
-      continue;
-
     // The 'nvlink' application performs RDC-mode linking when given a '.o'
     // file and device linking when given a '.cubin' file. We always want to
     // perform device linking, so just rename any '.o' files.
     // FIXME: This should hopefully be removed if NVIDIA updates their tooling.
+    if (II.isFilename()) {
       auto InputFile = getToolChain().getInputFilename(II);
       if (llvm::sys::path::extension(InputFile) != ".cubin") {
-      // If there are no actions above this one then this is direct input and we
-      // can copy it. Otherwise the input is internal so a `.cubin` file should
-      // exist.
+        // If there are no actions above this one then this is direct input and
+        // we can copy it. Otherwise the input is internal so a `.cubin` file
+        // should exist.
         if (II.getAction() && II.getAction()->getInputs().size() == 0) {
           const char *CubinF =
               Args.MakeArgString(getToolChain().getDriver().GetTemporaryPath(
                   llvm::sys::path::stem(InputFile), "cubin"));
           if (std::error_code EC =
                   llvm::sys::fs::copy_file(InputFile, C.addTempFile(CubinF)))
             continue;

           CmdArgs.push_back(CubinF);
         } else {
           SmallString<256> Filename(InputFile);
           llvm::sys::path::replace_extension(Filename, "cubin");
           CmdArgs.push_back(Args.MakeArgString(Filename));
         }
       } else {
         CmdArgs.push_back(Args.MakeArgString(InputFile));
       }
+      continue;
+    } else if (!II.isNothing()) {
+      // This option is commonly passed by LLVM by default, but isn't supported
+      // by nvlink.
+      if (llvm::any_of(II.getInputArg().getValues(), [](StringRef Arg) {
+            return Arg.equals("--color-diagnostics");
+          }))
+        continue;
+      // Render any remaining arguments as input to nvlink.
+      II.getInputArg().renderAsInput(Args, CmdArgs);
+    }
   }

   C.addCommand(std::make_unique<Command>(
       JA, *this,
       ResponseFileSupport{ResponseFileSupport::RF_Full, llvm::sys::WEM_UTF8,
                           "--options-file"},
       Args.MakeArgString(getToolChain().GetProgramPath("nvlink")), CmdArgs,
       Inputs, Output));
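The extension handling in the loop above can be approximated without LLVM's path utilities. The sketch below uses `std::filesystem` and a hypothetical `cubinNameFor` helper to show only the renaming step: any input that does not already end in `.cubin` is given that extension. It does not model the other branch of the real code, which copies a direct input to a temporary `.cubin` file.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

// Hypothetical helper (not a clang API): return the name that would be handed
// to nvlink for an input. Assumption: every extension other than ".cubin" is
// rewritten to ".cubin"; a name that already ends in ".cubin" is unchanged.
std::string cubinNameFor(const std::string &InputFile) {
  assert(!InputFile.empty() && "expected an input file name");
  std::filesystem::path P(InputFile);
  if (P.extension() != ".cubin")
    P.replace_extension(".cubin");
  return P.string();
}
```

For example, `cubinNameFor("kernel.o")` gives `kernel.cubin`, while `cubinNameFor("kernel.cubin")` is returned unchanged.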

clang/test/Driver/cuda-cross-compiling.c

 //
 // Test to ensure that we enable handling global constructors in a freestanding
 // Nvidia compilation.
 //
 // RUN: %clang -target nvptx64-nvidia-cuda -march=sm_70 %s -### 2>&1 \
 // RUN:   | FileCheck -check-prefix=LOWERING %s

 // LOWERING: -cc1" "-triple" "nvptx64-nvidia-cuda" {{.*}} "-mllvm" "--nvptx-lower-global-ctor-dtor"
+
+//
+// Test passing arguments directly to nvlink.
+//
+// RUN: %clang -target nvptx64-nvidia-cuda -Wl,-v -Wl,--color-diagnostics -### %s 2>&1 \
+// RUN:   | FileCheck -check-prefix=LINKER-ARGS %s
+
+// LINKER-ARGS: nvlink{{.*}}"-v"
+// LINKER-ARGS-NOT: nvlink{{.*}}"--color-diagnostics"