This is an archive of the discontinued LLVM Phabricator instance.

Differential D96877

[libomptarget] Try a fallback devicertl if the preferred one is missing
Abandoned · Public

Authored by JonChesterfield on Feb 17 2021, 10:02 AM.

Details

Reviewers

jdoerfert
ye-luo
tianshilei1992
ABataev
grokos

Summary

Clang may be used with a cuda version that is newer than the clang release.
This may fail in various ways, but there is also a credible chance it will
work OK, for sufficiently small version slip.

As discussed in the weekly call, this change adds a fallback path to clang
which looks for a devicertl library that doesn't have a specific cuda version
encoded in the name, specifically 'libomptarget-nvptx-unknown.bc'. It also
warns about this, and recommends that the user manually specify a devicertl
they expect to work.

Diff Detail

Repository: rG LLVM Github Monorepo

Unit Tests: Failed

Time	Test
590 ms	x64 debian > Clang.Driver::amdgpu-openmp-toolchain.c
1,360 ms	x64 debian > Clang.Driver::openmp-offload-gpu.c
40 ms	x64 debian > Clang.Misc::warning-flags.c
430 ms	x64 windows > Clang.Driver::amdgpu-openmp-toolchain.c
10,920 ms	x64 windows > Clang.Driver::openmp-offload-gpu.c

Full test results: 6 failed.

Event Timeline

JonChesterfield requested review of this revision. (Feb 17 2021, 10:02 AM)

JonChesterfield created this revision.

Herald added a project: Restricted Project. (Feb 17 2021, 10:02 AM)

Herald added subscribers: cfe-commits, sstefan1.

drop whitespace from .td

tianshilei1992 added inline comments. (Feb 17 2021, 10:11 AM)

clang/lib/Driver/ToolChains/CommonArgs.cpp
1695–1707	If you're using `Twine`, `+` is not needed: `llvm::Twine LibOmpTargetName("libomptarget-", BitcodeSuffix, ".bc");`
1697	The `BitcodeSuffix` for NVPTX consists of three parts: `nvptx`, `cuda_xxx`, and `sm_yy`. We might want to have an unknown for every `sm`?
1699–1700	`bool FoundBCLibrary = tryAppendBuiltinBitcodeGivenPaths(LibraryPaths, DriverArgs, CC1Args, LibOmpTargetName);` Remove the definition above?

tianshilei1992 added inline comments. (Feb 17 2021, 10:33 AM)

clang/include/clang/Basic/DiagnosticDriverKinds.td
265	Besides, we might also need to tell users this bitcode library might not work as expected.

review comments

JonChesterfield marked an inline comment as done. (Feb 17 2021, 10:48 AM)

JonChesterfield added inline comments.

clang/lib/Driver/ToolChains/CommonArgs.cpp
1647	^clang-tidy warning about a twine in a parameter list [llvm-twine-local] looks like a bug
1695–1707	This doesn't appear to be the case. Doesn't compile anyway, no matching constructor. The discussion around the clang-tidy warning states that Twine variables shouldn't be used as local variables because they don't make copies of the arguments, so one can hit use after free. I don't agree with that, but it doesn't matter very much so have changed to std::string.
1697	Let's not go there. 'Unknown' is a reasonably strong indicator for 'this might not work'. One per sm suggests we've somehow tested whether it's going to be OK or not. I'd prefer we stick with 'using a cuda that didn't exist when your clang shipped is a bad idea, get a newer clang or an older cuda'. This is a sketch of how we could go the other way.
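
The clang-tidy concern discussed above is that a deferred-concatenation object keeps references to its operands instead of copying them. A minimal sketch of the alternative the author says he switched to, building the names as owning `std::string` values up front, might look like this. The helper names `makePreferredName` and `makeFallbackName` are hypothetical and do not appear in the patch.

```cpp
#include <string>
#include <string_view>

// Builds "libomptarget-<suffix>.bc" as an owning std::string, so the result
// remains valid after the inputs have gone out of scope.
std::string makePreferredName(std::string_view BitcodeSuffix) {
  std::string Name = "libomptarget-";
  Name += BitcodeSuffix;
  Name += ".bc";
  return Name;
}

// Builds the version-agnostic fallback name, e.g. "libomptarget-nvptx-unknown.bc".
std::string makeFallbackName(std::string_view ArchPrefix) {
  std::string Name = "libomptarget-";
  Name += ArchPrefix;
  Name += "-unknown.bc";
  return Name;
}
```

Because both results own their storage, they can be passed to a search helper and to a diagnostic later without any lifetime concerns, at the cost of an allocation that is negligible in driver code.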

JonChesterfield added inline comments. (Feb 17 2021, 10:49 AM)

clang/include/clang/Basic/DiagnosticDriverKinds.td
265	Could you propose preferred wording?

Harbormaster completed remote builds in B89564: Diff 324341. (Feb 17 2021, 11:03 AM)

Harbormaster completed remote builds in B89565: Diff 324342.

Harbormaster completed remote builds in B89570: Diff 324354. (Feb 17 2021, 11:41 AM)

Suggestion is to resolve libomptarget-nvptx-unknown.bc to a copy of the bitcode library built for the newest sm_xx and ptx version clang knows of.

tianshilei1992 added inline comments. (Feb 17 2021, 3:12 PM)

clang/include/clang/Basic/DiagnosticDriverKinds.td
265	Maybe the following one? `def warn_drv_omp_offload_target_missingbcruntime : Warning<"No library '%0' found in the default clang lib directory or in LIBRARY_PATH. Fall back to '%1' instead.">;`

Please also update the test.

In D96877#2569861, @JonChesterfield wrote:

Suggestion is to resolve libomptarget-nvptx-unknown.bc to a copy of the bitcode library built for the newest sm_xx and ptx version clang knows of.

Why a copy instead of a symlink?

address review comment

In D96877#2569863, @tianshilei1992 wrote:

Please also update the test.

In D96877#2569861, @JonChesterfield wrote:

Suggestion is to resolve libomptarget-nvptx-unknown.bc to a copy of the bitcode library built for the newest sm_xx and ptx version clang knows of.

Why a copy instead of a symlink?

I don't trust symlinks to survive packaging (tar or deb) or copying around the filesystem during install. Possibly showing scars from windows development there.

Tests passing locally, as expected. Do you mean add a new one that jury rigs cuda version and has a libomptarget-nvptx-unknown.bc on disk somewhere?

Harbormaster completed remote builds in B89621: Diff 324444. (Feb 17 2021, 4:51 PM)

Can we wrap this up and backport it? It is the last known issue we should fix for 12.

If you like it, accept it. If there's more stuff to do first, please say so.

Does this patch include creating 'libomptarget-nvptx-unknown.bc'?

LGTM. Please also update the test(s) before commit.

This revision is now accepted and ready to land. (Feb 18 2021, 10:13 AM)

Which tests?

Choosing a file to duplicate with the unknown.bc name is left separate. This just adds code to look for that file if the first choice failed.

Letting users copy the bc file is not feasible.
Please handle this in CMake.
libomptarget-nvptx-unknown.bc

This revision now requires changes to proceed. (Feb 18 2021, 12:18 PM)

Got it. Copying a file can be tricky. Compiling one more can easily be done in CMake.

This revision is now accepted and ready to land. (Feb 18 2021, 2:56 PM)

Is this obsolete with the change to devicertl cmake? I would prefer to abandon rather than land if so.

I think we might not need this patch. We're not going to support old versions of CUDA anyway.

To me this is still desired + cmake creating libomptarget-nvptx-unknown.bc as a solution for forward compatibility until a clean solution lands. I don't want to repeatedly ask the admin to copy files for every new CUDA toolkit installation.

In D96877#2578748, @ye-luo wrote:

to me this is still desired + cmake creating libomptarget-nvptx-unknown.bc as a solution for forward compatibility until a clean solution lands.

We’ll have a newer LLVM version, like 12.1 or 12.01, with a *right* solution. I don’t think we need to think that far ahead.

In D96877#2578752, @tianshilei1992 wrote:

In D96877#2578748, @ye-luo wrote:

to me this is still desired + cmake creating libomptarget-nvptx-unknown.bc as a solution for forward compatibility until a clean solution lands.

We’ll have a newer LLVM version, like 12.1 or 12.01, with a *right* solution. I don’t think we need to think that far ahead.

This doesn't help people who need to run exactly 12.0. They also cannot wait for a minor release; things need to work right away when a new cuda toolkit is installed intentionally.

In D96877#2578756, @ye-luo wrote:

In D96877#2578752, @tianshilei1992 wrote:

In D96877#2578748, @ye-luo wrote:

to me this is still desired + cmake creating libomptarget-nvptx-unknown.bc as a solution for forward compatibility until a clean solution lands.

We’ll have a newer LLVM version, like 12.1 or 12.01, with a *right* solution. I don’t think we need to think that far ahead.

This doesn't help people who need to run exactly 12.0. They also cannot wait for a minor release; things need to work right away when a new cuda toolkit is installed intentionally.

First, we don’t know whether a new version of CUDA will be released during this time. Considering the release history of CUDA, 11.3 is not that likely, and 12 will not come out so soon. Second, even if NVIDIA’s people are so brilliant that they release CUDA 12 in a short time, we have a mechanism for users to work around the issue: the option --libomptarget-nvptx-bc-path. This patch is just a workaround, and it contains so many uncertainties that it cannot be called “forward compatibility” at all.

I'm going to abandon this. I'm not confident that a cuda toolkit that is newer than the compiler will work with it correctly, and would prefer it take some jury rigging on the end user's part to put the two together.

I get that 'works out of the box, whatever cuda is installed' would be an amazing feature to have. This patch sort of promises that without necessarily delivering on the 'works' part.

Revision Contents

Path	Size
clang/include/clang/Basic/DiagnosticDriverKinds.td	4 lines
clang/lib/Driver/ToolChains/CommonArgs.cpp	58 lines

Diff 324341

clang/include/clang/Basic/DiagnosticDriverKinds.td

[256 unchanged lines not shown]
 def err_drv_omp_host_ir_file_not_found : Error<
   "The provided host compiler IR file '%0' is required to generate code for OpenMP target regions but cannot be found.">;
 def err_drv_omp_host_target_not_supported : Error<
   "The target '%0' is not a supported OpenMP host target.">;
 def err_drv_expecting_fopenmp_with_fopenmp_targets : Error<
   "The option -fopenmp-targets must be used in conjunction with a -fopenmp option compatible with offloading, please use -fopenmp=libomp or -fopenmp=libiomp5.">;
 def err_drv_omp_offload_target_missingbcruntime : Error<
   "No library '%0' found in the default clang lib directory or in LIBRARY_PATH. Please use --libomptarget-%1-bc-path to specify %1 bitcode library.">;
+def warn_drv_omp_offload_target_missingbcruntime : Warning<"No library '%0' found in the default clang lib directory or in LIBRARY_PATH, using '%1' instead. Please use --libomptarget-%2-bc-path to specify %2 bitcode library.">;
 def err_drv_omp_offload_target_bcruntime_not_found : Error<"Bitcode library '%0' does not exist.">;
 def warn_drv_omp_offload_target_duplicate : Warning<
   "The OpenMP offloading target '%0' is similar to target '%1' already specified - will be ignored.">,
   InGroup<OpenMPTarget>;
 def err_drv_unsupported_embed_bitcode
     : Error<"%0 is not supported with -fembed-bitcode">;
 def err_drv_bitcode_unsupported_on_toolchain : Error<
   "-fembed-bitcode is not supported on versions of iOS prior to 6.0">;
[268 unchanged lines not shown]

clang/lib/Driver/ToolChains/CommonArgs.cpp

[1,622 unchanged lines not shown; context: if (A->getOption().matches(options::OPT_moutline)) {]
       }
     } else {
       // Disable all outlining behaviour.
       addArg(Twine("-enable-machine-outliner=never"));
     }
   }
 }
 
+namespace {
+bool tryAppendBuiltinBitcode(const llvm::opt::ArgList &DriverArgs,
+                             llvm::opt::ArgStringList &CC1Args, StringRef lib) {
+  if (llvm::sys::fs::exists(lib)) {
+    CC1Args.push_back("-mlink-builtin-bitcode");
+    CC1Args.push_back(DriverArgs.MakeArgString(lib));
+    return true;
+  } else {
+    return false;
+  }
+}
+
+bool tryAppendBuiltinBitcodeGivenPaths(
+    const SmallVector<StringRef, 8> &LibraryPaths,
+    const llvm::opt::ArgList &DriverArgs, llvm::opt::ArgStringList &CC1Args,
+    llvm::Twine lib) {
+  for (StringRef LibraryPath : LibraryPaths) {
+    SmallString<128> TargetFile(LibraryPath);
+    llvm::sys::path::append(TargetFile, lib);
+    if (tryAppendBuiltinBitcode(DriverArgs, CC1Args, TargetFile)) {
+      return true;
+    }
+  }
+  return false;
+}
+} // namespace
+
 void tools::addOpenMPDeviceRTL(const Driver &D,
                                const llvm::opt::ArgList &DriverArgs,
                                llvm::opt::ArgStringList &CC1Args,
                                StringRef BitcodeSuffix,
                                const llvm::Triple &Triple) {
   SmallVector<StringRef, 8> LibraryPaths;
   // Add user defined library paths from LIBRARY_PATH.
   llvm::Optional<std::string> LibPath =
[14 unchanged lines not shown; context: void tools::addOpenMPDeviceRTL(const Driver &D,]
   OptSpecifier LibomptargetBCPathOpt =
       Triple.isAMDGCN() ? options::OPT_libomptarget_amdgcn_bc_path_EQ
                         : options::OPT_libomptarget_nvptx_bc_path_EQ;
 
   StringRef ArchPrefix = Triple.isAMDGCN() ? "amdgcn" : "nvptx";
   // First check whether user specifies bc library
   if (const Arg *A = DriverArgs.getLastArg(LibomptargetBCPathOpt)) {
     std::string LibOmpTargetName(A->getValue());
-    if (llvm::sys::fs::exists(LibOmpTargetName)) {
-      CC1Args.push_back("-mlink-builtin-bitcode");
-      CC1Args.push_back(DriverArgs.MakeArgString(LibOmpTargetName));
-    } else {
+    if (!tryAppendBuiltinBitcode(DriverArgs, CC1Args, LibOmpTargetName)) {
       D.Diag(diag::err_drv_omp_offload_target_bcruntime_not_found)
           << LibOmpTargetName;
     }
   } else {
     bool FoundBCLibrary = false;
 
-    std::string LibOmpTargetName =
-        "libomptarget-" + BitcodeSuffix.str() + ".bc";
-
-    for (StringRef LibraryPath : LibraryPaths) {
-      SmallString<128> LibOmpTargetFile(LibraryPath);
-      llvm::sys::path::append(LibOmpTargetFile, LibOmpTargetName);
-      if (llvm::sys::fs::exists(LibOmpTargetFile)) {
-        CC1Args.push_back("-mlink-builtin-bitcode");
-        CC1Args.push_back(DriverArgs.MakeArgString(LibOmpTargetFile));
-        FoundBCLibrary = true;
-        break;
+    llvm::Twine LibOmpTargetName = "libomptarget-" + BitcodeSuffix + ".bc";
+    llvm::Twine FallbackTargetName =
+        "libomptarget-" + ArchPrefix + "-unknown.bc";
+
+    FoundBCLibrary = tryAppendBuiltinBitcodeGivenPaths(
+        LibraryPaths, DriverArgs, CC1Args, LibOmpTargetName);
+
+    if (!FoundBCLibrary) {
+      FoundBCLibrary = tryAppendBuiltinBitcodeGivenPaths(
+          LibraryPaths, DriverArgs, CC1Args, FallbackTargetName);
+      if (FoundBCLibrary) {
+        D.Diag(diag::warn_drv_omp_offload_target_missingbcruntime)
+            << LibOmpTargetName.str() << FallbackTargetName.str() << ArchPrefix;
       }
     }
 
     if (!FoundBCLibrary)
       D.Diag(diag::err_drv_omp_offload_target_missingbcruntime)
-          << LibOmpTargetName << ArchPrefix;
+          << LibOmpTargetName.str() << ArchPrefix;
   }
 }

Lint (pre-merge checks): clang-tidy: warning: twine variables are prone to use-after-free bugs [llvm-twine-local], reported on `llvm::Twine lib`, `llvm::Twine LibOmpTargetName` and `llvm::Twine FallbackTargetName`.