
Details

Reviewers

jdoerfert
JonChesterfield
ronlieb
saiislam
ABataev
jansvoboda11
tianshilei1992

Commits

rG79401b43ce4e: [OpenMP][AMDGPU] Add support for linking libomptarget bitcode

Summary

This patch uses the existing logic of CUDA for searching libomptarget
and extracts it to a common method.
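
The extraction described above can be pictured with a small standalone sketch. This is not the patch's code: `findDeviceRTL`, its parameters and the injected `Exists` predicate are hypothetical names chosen for illustration, and standard-library types stand in for LLVM's `StringRef`, `SmallVector` and `llvm::sys::fs::exists`. It assumes only the behaviour visible in the diff: a user-supplied path wins, otherwise the first directory that contains `libomptarget-<suffix>.bc` is used.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for the shared helper: returns the bitcode file to
// link, or std::nullopt where the driver would emit a diagnostic instead.
std::optional<std::string>
findDeviceRTL(const std::vector<std::string> &LibraryPaths,
              const std::string &BitcodeSuffix,
              const std::optional<std::string> &UserPath,
              const std::function<bool(const std::string &)> &Exists) {
  assert(!BitcodeSuffix.empty() && "caller must supply a target suffix");

  // An explicit path from the command line takes precedence over the search.
  if (UserPath) {
    if (Exists(*UserPath))
      return *UserPath;
    return std::nullopt;
  }

  // Otherwise build the conventional file name and take the first match.
  const std::string FileName = "libomptarget-" + BitcodeSuffix + ".bc";
  for (const std::string &Dir : LibraryPaths) {
    const std::string Candidate = Dir + "/" + FileName;
    if (Exists(Candidate))
      return Candidate;
  }
  return std::nullopt;
}
```

Under these assumptions, each toolchain would only differ in the suffix it passes, for example `"amdgcn-gfx906"` for AMDGPU.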

Diff Detail

Repository: rG LLVM Github Monorepo

Unit Tests: Failed

	Time	Test
	330 ms	x64 debian > libarcher.races::task-dependency.c
	340 ms	x64 debian > libarcher.races::task-taskgroup-unrelated.c
	340 ms	x64 debian > libarcher.races::task-taskwait-nested.c
	260 ms	x64 debian > libarcher.races::task-two.c
	360 ms	x64 debian > libarcher.task::task-barrier.c
		(13 failed in total)

Event Timeline

pdhaliwal created this revision. · Feb 8 2021, 1:15 AM

Herald added a reviewer: jansvoboda11. · Feb 8 2021, 1:15 AM

Herald added subscribers: dang, kerbowa, guansong and 7 others.

pdhaliwal requested review of this revision. · Feb 8 2021, 1:15 AM

Herald added a project: Restricted Project. · Feb 8 2021, 1:15 AM

Herald added subscribers: cfe-commits, sstefan1, wdng.

JonChesterfield added a reviewer: tianshilei1992. · Feb 8 2021, 1:30 AM

Missed some changes,

Fix openmp-offload.c test failure
Fix amdgpu-openmp-toolchain.c test failure

Harbormaster completed remote builds in B88255: Diff 322048. · Feb 8 2021, 1:53 AM

Harbormaster completed remote builds in B88256: Diff 322052. · Feb 8 2021, 2:32 AM

I like this. Using the same logic, in the same function call, to find this library on either gpu is the right thing to do. Looks like a non-functional change on nvptx, though phab doesn't make that obvious.

clang/include/clang/Driver/Options.td
937	I think there's an aliasing mechanism in the Options handling, where we can use device_bc_path as the canonical choice and nvptx_bc_path as a backwards-compatible argument that sets device_bc_path
clang/lib/Driver/ToolChains/CommonArgs.cpp
1655	This, I think, can be reduced to checking OPT_libomptarget_device_bc_path_EQ if the aliasing machinery I remember does actually exist

Addressed review comments.

pdhaliwal marked 2 inline comments as done. · Feb 8 2021, 5:47 AM

LGTM. Let's wait for someone using nvptx to sanity check

Harbormaster completed remote builds in B88273: Diff 322090. · Feb 8 2021, 6:26 AM

Edit: Debugged further, rewriting comment.

Current error message on missing library is:
'error: No library 'libomptarget-amdgcn-gfx906.bc' found in the default clang lib directory or in LIBRARY_PATH. Please use --libomptarget-nvptx-bc-path to specify nvptx bitcode libarary'

The message is written in ./clang/include/clang/Basic/DiagnosticDriverKinds.td, entry err_drv_omp_offload_target_missingbcruntime. It should probably refer to 'device' instead of 'nvptx' (error message change only).

Cuda.cpp calls addOpenMPDeviceRTL guarded by nogpulib, whereas AMDGPUOpenMP calls it unconditionally. That means the deviceRTL is needed on disk when building the deviceRTL. Not so good. We need an `if (DriverArgs.hasArg(options::OPT_nogpulib)) return;` in AMDGPUOpenMPToolChain::addClangTargetOptions. Comment inline as well.

The existing search logic looks in clang's lib and LIBRARY_PATH, I think we should probably look in the runtime directory as well for running from the build tree. That's separate to this change though.

JonChesterfield added inline comments. · Feb 8 2021, 12:46 PM

clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp
193	Need `if (DriverArgs.hasArg(options::OPT_nogpulib)) return;` here or we can't build the deviceRTL without already having one on disk
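
The guard requested in this inline comment amounts to returning before the device runtime is attached whenever `-nogpulib` is present. A minimal sketch of that control flow follows, assuming simplified stand-ins: `DriverFlags` and `deviceCC1Args` are hypothetical names, not part of the clang driver, and the runtime path is taken as already resolved.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for the driver's parsed arguments.
struct DriverFlags {
  bool NoGpuLib;       // models the presence of -nogpulib
  std::string GpuArch; // for example "gfx906"
};

// Builds the cc1 arguments for the device pass. RTLPath is the bitcode file a
// prior lookup found; how it was found is outside the scope of this sketch.
std::vector<std::string> deviceCC1Args(const DriverFlags &Flags,
                                       const std::string &RTLPath) {
  assert(!Flags.GpuArch.empty() && "Must have an explicit GPU arch.");

  std::vector<std::string> CC1Args = {"-target-cpu", Flags.GpuArch,
                                      "-fcuda-is-device", "-emit-llvm-bc"};

  // With -nogpulib the device runtime is not linked, so building the runtime
  // itself does not require a previously built copy on disk.
  if (Flags.NoGpuLib)
    return CC1Args;

  CC1Args.push_back("-mlink-builtin-bitcode");
  CC1Args.push_back(RTLPath);
  return CC1Args;
}
```

In this sketch the early return sits before any use of the runtime path, which is the property the comment asks for.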

Generally LGTM.

In D96248#2549339, @JonChesterfield wrote:

The existing search logic looks in clang's lib and LIBRARY_PATH, I think we should probably look in the runtime directory as well for running from the build tree. That's separate to this change though.

I don't think it's necessary as we can add the runtime directory to LIBRARY_PATH when configuring lit.
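
To make the LIBRARY_PATH suggestion concrete, here is a rough sketch of how such a value could be turned into a list of search directories. `splitLibraryPath` is a hypothetical helper written with the standard library only; it loosely mirrors the split-and-trim loop in the diff, and dropping empty entries is an assumption of this sketch rather than a documented guarantee.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Splits a LIBRARY_PATH-style value into trimmed, non-empty directory names.
// Hypothetical helper; Separator would be ':' on POSIX-like hosts.
std::vector<std::string> splitLibraryPath(const std::string &Value,
                                          char Separator) {
  assert(Separator != ' ' && Separator != '\t' &&
         "separator must differ from the blanks that get trimmed");
  std::vector<std::string> Dirs;
  std::string::size_type Begin = 0;
  while (Begin <= Value.size()) {
    std::string::size_type End = Value.find(Separator, Begin);
    if (End == std::string::npos)
      End = Value.size();
    const std::string Dir = Value.substr(Begin, End - Begin);
    // Trim surrounding spaces and tabs; skip entries that become empty.
    const std::string::size_type First = Dir.find_first_not_of(" \t");
    if (First != std::string::npos) {
      const std::string::size_type Last = Dir.find_last_not_of(" \t");
      Dirs.push_back(Dir.substr(First, Last - First + 1));
    }
    Begin = End + 1;
  }
  return Dirs;
}
```

A lit configuration that prepends the runtime build directory to LIBRARY_PATH would therefore put that directory first in the resulting search order.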

clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp
193	FWIW, NVPTX `deviceRTLs` is built by directly calling FE, not via clang driver. `clang -fopenmp -fopenmp-targets=xxx` basically consists of two passes, and therefore generates two IRs, which is not what we expect. I'm not sure we really need the if statement.

Added check for nogpulib
Fixed diagnostic message

JonChesterfield added inline comments. · Feb 8 2021, 11:48 PM

clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp
193	That explains what I was missing about the ptx cmake I think. I've had to locally hack around clang creating a bundle, which llvm-link chokes on, because cuda-device-only et al are ignored by openmp. I think this check is right - it means nogpulib will exclude the rtl on both GPUs. Nvptx already has it in the control flow. Whether RTL cmake should bypass the driver is interesting, but I think separate to this patch.

Harbormaster completed remote builds in B88414: Diff 322298. · Feb 9 2021, 12:11 AM

JonChesterfield accepted this revision. · Feb 9 2021, 2:03 AM

This revision is now accepted and ready to land. · Feb 9 2021, 2:03 AM

tianshilei1992 added inline comments. · Feb 9 2021, 6:11 AM

clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp
193	`cuda-device-only` is only for CUDA compilation. We don’t have an option to invoke device compilation only.

Some background discussion about the diagnostic. This change means people using nvptx, where the library cannot be found, will now be advised to use libomptarget-device-bc-path instead of libomptarget-nvptx-bc-path. If they use libomptarget-nvptx-bc-path anyway, it'll work as intended. That avoids us adding libomptarget-amdgcn-bc-path and requiring some more conditional compilation for multiarch codebases.

@tianshilei1992 @jdoerfert can we agree on 'libomptarget-device-bc-path' being the better one to recommend in the error message, despite that being a minor change to the current behaviour? It'll slightly surprise people parsing error messages, such as our test, but shouldn't cause any confusion.

I'm personally OK with using libomptarget-nvptx-bc-path to indicate where to look for the amdgcn bitcode as well but can see that causing confusion. I'm assuming that both gpu runtimes will go in same directory - there may be a future where one invocation targets nvptx and amdgcn at the same time, but even then I'd prefer all the runtimes live in the same place in the filesystem.

In D96248#2551503, @JonChesterfield wrote:

@tianshilei1992 @jdoerfert can we agree on 'libomptarget-device-bc-path' being the better one to recommend in the error message, despite that being a minor change to the current behaviour? It'll slightly surprise people parsing error messages, such as our test, but shouldn't cause any confusion.

After a second thought, I don't think it is feasible. Consider the following scenario:

clang -fopenmp -fopenmp-targets=nvptx64,amdgcn source.cpp

We cannot use one option for the two different targets, and the alias might not work as well, especially in terms of the driver.

This revision now requires changes to proceed. · Feb 9 2021, 9:20 AM

Using one option for both targets seems great - if both have put the devicertl in the same folder. Which I suppose they might not have.

Maybe keep it separate for now, one for nvptx and one for amdgcn, and hope for a common 'device' later.

I have removed libomptarget-device-bc-path and have added an amdgcn one. For the diagnostic,
instead of having one per architecture, I have reused the same one and added a second
parameter to err_drv_omp_offload_target_missingbcruntime to produce an arch-specific message.
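
As a rough illustration of a two-parameter diagnostic, the sketch below formats one message from the missing file name and an architecture tag. `missingRuntimeMessage` is a hypothetical function, the wording is modelled on the error quoted earlier in this thread rather than taken from the final patch, and the amdgcn option name is assumed by analogy with `--libomptarget-nvptx-bc-path`.

```cpp
#include <cassert>
#include <string>

// Formats a "missing bitcode runtime" message from two parameters: the file
// that was not found and the architecture tag used in the suggested option.
// The wording is illustrative only.
std::string missingRuntimeMessage(const std::string &LibName,
                                  const std::string &ArchPrefix) {
  assert(!ArchPrefix.empty() && "need an architecture tag such as nvptx");
  return "No library '" + LibName +
         "' found in the default clang lib directory or in LIBRARY_PATH. "
         "Please use --libomptarget-" +
         ArchPrefix + "-bc-path to specify " + ArchPrefix +
         " bitcode library";
}
```

With one template and an architecture argument, each toolchain can point users at its own option without a separate diagnostic per target.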

Harbormaster completed remote builds in B88605: Diff 322638. · Feb 10 2021, 3:59 AM

Parameter to err_drv_omp_offload_target_missingbcruntime is sensible. @tianshilei1992? With this we can set the path for nvptx and amdgcn independently.

LGTM

This revision is now accepted and ready to land. · Feb 11 2021, 11:07 AM

This revision was landed with ongoing or failed builds. · Feb 11 2021, 9:42 PM

Closed by commit rG79401b43ce4e: [OpenMP][AMDGPU] Add support for linking libomptarget bitcode (authored by pdhaliwal).

This revision was automatically updated to reflect the committed changes.

pdhaliwal added a commit: rG79401b43ce4e: [OpenMP][AMDGPU] Add support for linking libomptarget bitcode.

JonChesterfield mentioned this in D101935: [clang] Search runtimes build tree for openmp runtime. · May 5 2021, 11:52 AM

jdoerfert reopened this revision. · Jan 8 2022, 11:31 AM

jdoerfert added inline comments.

clang/include/clang/Driver/Options.td
936	Why do we need two options that literally do the same thing? I cannot think of a use case where we would specify two distinct paths, can anyone else?

This revision is now accepted and ready to land. · Jan 8 2022, 11:31 AM

JonChesterfield added inline comments. · Jan 8 2022, 11:52 AM

clang/include/clang/Driver/Options.td
936	It's for when someone has decided to put nvptx bitcode in one directory and amdgpu in another. That's presently useless. It might be more helpful once we can target both platforms from one compile, but even then passing bitcode-path once per arch seems better. In favour of throwing one option away and renaming the other to work on both/all targets

Diff 322090

clang/include/clang/Driver/Options.td

[... 924 lines not shown ...]
 def gpu_max_threads_per_block_EQ : Joined<["--"], "gpu-max-threads-per-block=">,
   Flags<[CC1Option]>,
   HelpText<"Default max threads per block for kernel launch bounds for HIP">,
   MarshallingInfoStringInt<LangOpts<"GPUMaxThreadsPerBlock">, "256">,
   ShouldParseIf<hip.KeyPath>;
 def gpu_instrument_lib_EQ : Joined<["--"], "gpu-instrument-lib=">,
   HelpText<"Instrument device library for HIP, which is a LLVM bitcode containing "
   "__cyg_profile_func_enter and __cyg_profile_func_exit">;
+def libomptarget_device_bc_path_EQ : Joined<["--"], "libomptarget-device-bc-path=">, Group<i_Group>,
+  HelpText<"Path to libomptarget bitcode library">;
 def libomptarget_nvptx_bc_path_EQ : Joined<["--"], "libomptarget-nvptx-bc-path=">, Group<i_Group>,
-  HelpText<"Path to libomptarget-nvptx bitcode library">;
+  Alias<libomptarget_device_bc_path_EQ>;
 def dD : Flag<["-"], "dD">, Group<d_Group>, Flags<[CC1Option]>,
   HelpText<"Print macro definitions in -E mode in addition to normal output">;
 def dI : Flag<["-"], "dI">, Group<d_Group>, Flags<[CC1Option]>,
   HelpText<"Print include directives in -E mode in addition to normal output">,
   MarshallingInfoFlag<PreprocessorOutputOpts<"ShowIncludeDirectives">>;
 def dM : Flag<["-"], "dM">, Group<d_Group>, Flags<[CC1Option]>,
   HelpText<"Print macro definitions in -E mode instead of normal output">;
 def dead__strip : Flag<["-"], "dead_strip">;
 def dependency_file : Separate<["-"], "dependency-file">, Flags<[CC1Option]>,
[... 4,986 lines not shown ...]

clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp

[... 184 lines not shown ...]
@@ void AMDGPUOpenMPToolChain::addClangTargetOptions(
   assert(!GpuArch.empty() && "Must have an explicit GPU arch.");
   assert(DeviceOffloadingKind == Action::OFK_OpenMP &&
          "Only OpenMP offloading kinds are supported.");

   CC1Args.push_back("-target-cpu");
   CC1Args.push_back(DriverArgs.MakeArgStringRef(GpuArch));
   CC1Args.push_back("-fcuda-is-device");
   CC1Args.push_back("-emit-llvm-bc");
+
+  std::string BitcodeSuffix = "amdgcn-" + GpuArch.str();
+  addOpenMPDeviceRTL(getDriver(), DriverArgs, CC1Args, BitcodeSuffix);
 }

 llvm::opt::DerivedArgList *AMDGPUOpenMPToolChain::TranslateArgs(
     const llvm::opt::DerivedArgList &Args, StringRef BoundArch,
     Action::OffloadKind DeviceOffloadKind) const {
   DerivedArgList *DAL =
       HostTC.TranslateArgs(Args, BoundArch, DeviceOffloadKind);
   if (!DAL)
[... 62 lines not shown ...]

clang/lib/Driver/ToolChains/CommonArgs.h

[... 139 lines not shown ...]

 unsigned getOrCheckAMDGPUCodeObjectVersion(const Driver &D,
                                            const llvm::opt::ArgList &Args,
                                            bool Diagnose = false);

 void addMachineOutlinerArgs(const Driver &D, const llvm::opt::ArgList &Args,
                             llvm::opt::ArgStringList &CmdArgs,
                             const llvm::Triple &Triple, bool IsLTO);
+
+void addOpenMPDeviceRTL(const Driver &D, const llvm::opt::ArgList &DriverArgs,
+                        llvm::opt::ArgStringList &CC1Args,
+                        StringRef BitcodeSuffix);
 } // end namespace tools
 } // end namespace driver
 } // end namespace clang

 #endif // LLVM_CLANG_LIB_DRIVER_TOOLCHAINS_COMMONARGS_H

clang/lib/Driver/ToolChains/CommonArgs.cpp

[... 1,621 lines not shown ...]
@@ if (A->getOption().matches(options::OPT_moutline)) {
         addArg(Twine("-enable-machine-outliner"));
       }
     } else {
       // Disable all outlining behaviour.
       addArg(Twine("-enable-machine-outliner=never"));
     }
   }
 }
+
+void tools::addOpenMPDeviceRTL(const Driver &D,
+                               const llvm::opt::ArgList &DriverArgs,
+                               llvm::opt::ArgStringList &CC1Args,
+                               StringRef BitcodeSuffix) {
+  SmallVector<StringRef, 8> LibraryPaths;
+  // Add user defined library paths from LIBRARY_PATH.
+  llvm::Optional<std::string> LibPath =
+      llvm::sys::Process::GetEnv("LIBRARY_PATH");
+  if (LibPath) {
+    SmallVector<StringRef, 8> Frags;
+    const char EnvPathSeparatorStr[] = {llvm::sys::EnvPathSeparator, '\0'};
+    llvm::SplitString(*LibPath, Frags, EnvPathSeparatorStr);
+    for (StringRef Path : Frags)
+      LibraryPaths.emplace_back(Path.trim());
+  }
+
+  // Add path to lib / lib64 folder.
+  SmallString<256> DefaultLibPath = llvm::sys::path::parent_path(D.Dir);
+  llvm::sys::path::append(DefaultLibPath, Twine("lib") + CLANG_LIBDIR_SUFFIX);
+  LibraryPaths.emplace_back(DefaultLibPath.c_str());
+
+  // First check whether user specifies bc library
+  if (const Arg *A =
+          DriverArgs.getLastArg(options::OPT_libomptarget_device_bc_path_EQ)) {
+    std::string LibOmpTargetName(A->getValue());
+    if (llvm::sys::fs::exists(LibOmpTargetName)) {
+      CC1Args.push_back("-mlink-builtin-bitcode");
+      CC1Args.push_back(DriverArgs.MakeArgString(LibOmpTargetName));
+    } else {
+      D.Diag(diag::err_drv_omp_offload_target_bcruntime_not_found)
+          << LibOmpTargetName;
+    }
+  } else {
+    bool FoundBCLibrary = false;
+
+    std::string LibOmpTargetName =
+        "libomptarget-" + BitcodeSuffix.str() + ".bc";
+
+    for (StringRef LibraryPath : LibraryPaths) {
+      SmallString<128> LibOmpTargetFile(LibraryPath);
+      llvm::sys::path::append(LibOmpTargetFile, LibOmpTargetName);
+      if (llvm::sys::fs::exists(LibOmpTargetFile)) {
+        CC1Args.push_back("-mlink-builtin-bitcode");
+        CC1Args.push_back(DriverArgs.MakeArgString(LibOmpTargetFile));
+        FoundBCLibrary = true;
+        break;
+      }
+    }
+    if (!FoundBCLibrary)
+      D.Diag(diag::err_drv_omp_offload_target_missingbcruntime)
+          << LibOmpTargetName;
+  }
+}

clang/lib/Driver/ToolChains/Cuda.cpp

[... 743 lines not shown ...]
@@ if (DriverArgs.hasFlag(options::OPT_fcuda_short_ptr,
     CC1Args.append({"-mllvm", "--nvptx-short-ptr"});

   if (CudaInstallation.version() >= CudaVersion::UNKNOWN)
     CC1Args.push_back(DriverArgs.MakeArgString(
         Twine("-target-sdk-version=") +
         CudaVersionToString(CudaInstallation.version())));

   if (DeviceOffloadingKind == Action::OFK_OpenMP) {
-    SmallVector<StringRef, 8> LibraryPaths;
-    // Add user defined library paths from LIBRARY_PATH.
-    llvm::Optional<std::string> LibPath =
-        llvm::sys::Process::GetEnv("LIBRARY_PATH");
-    if (LibPath) {
-      SmallVector<StringRef, 8> Frags;
-      const char EnvPathSeparatorStr[] = {llvm::sys::EnvPathSeparator, '\0'};
-      llvm::SplitString(*LibPath, Frags, EnvPathSeparatorStr);
-      for (StringRef Path : Frags)
-        LibraryPaths.emplace_back(Path.trim());
-    }
-
-    // Add path to lib / lib64 folder.
-    SmallString<256> DefaultLibPath =
-        llvm::sys::path::parent_path(getDriver().Dir);
-    llvm::sys::path::append(DefaultLibPath, Twine("lib") + CLANG_LIBDIR_SUFFIX);
-    LibraryPaths.emplace_back(DefaultLibPath.c_str());
-
-    // First check whether user specifies bc library
-    if (const Arg *A =
-            DriverArgs.getLastArg(options::OPT_libomptarget_nvptx_bc_path_EQ)) {
-      std::string LibOmpTargetName(A->getValue());
-      if (llvm::sys::fs::exists(LibOmpTargetName)) {
-        CC1Args.push_back("-mlink-builtin-bitcode");
-        CC1Args.push_back(DriverArgs.MakeArgString(LibOmpTargetName));
-      } else {
-        getDriver().Diag(diag::err_drv_omp_offload_target_bcruntime_not_found)
-            << LibOmpTargetName;
-      }
-    } else {
-      bool FoundBCLibrary = false;
-
-      std::string LibOmpTargetName = "libomptarget-nvptx-cuda_" +
-                                     CudaVersionStr + "-" + GpuArch.str() +
-                                     ".bc";
-
-      for (StringRef LibraryPath : LibraryPaths) {
-        SmallString<128> LibOmpTargetFile(LibraryPath);
-        llvm::sys::path::append(LibOmpTargetFile, LibOmpTargetName);
-        if (llvm::sys::fs::exists(LibOmpTargetFile)) {
-          CC1Args.push_back("-mlink-builtin-bitcode");
-          CC1Args.push_back(DriverArgs.MakeArgString(LibOmpTargetFile));
-          FoundBCLibrary = true;
-          break;
-        }
-      }
-      if (!FoundBCLibrary)
-        getDriver().Diag(diag::err_drv_omp_offload_target_missingbcruntime)
-            << LibOmpTargetName;
-    }
+    std::string BitcodeSuffix =
+        "nvptx-cuda_" + CudaVersionStr + "-" + GpuArch.str();
+    addOpenMPDeviceRTL(getDriver(), DriverArgs, CC1Args, BitcodeSuffix);
   }
 }

 llvm::DenormalMode CudaToolChain::getDefaultDenormalModeForType(
     const llvm::opt::ArgList &DriverArgs, const JobAction &JA,
     const llvm::fltSemantics *FPType) const {
   if (JA.getOffloadingDeviceKind() == Action::OFK_Cuda) {
     if (FPType && FPType == &llvm::APFloat::IEEEsingle() &&
[... 146 lines not shown ...]

clang/test/Driver/Inputs/hip_dev_lib/libomptarget-amdgcn-gfx803.bc

This file was added.

This is an empty file.

clang/test/Driver/Inputs/hip_dev_lib/libomptarget-amdgcn-gfx906.bc

This file was added.

This is an empty file.

clang/test/Driver/amdgpu-openmp-toolchain.c

 // REQUIRES: amdgpu-registered-target
-// RUN: %clang -### --target=x86_64-unknown-linux-gnu -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa -Xopenmp-target=amdgcn-amd-amdhsa -march=gfx906 %s 2>&1 \
+// RUN: env LIBRARY_PATH=%S/Inputs/hip_dev_lib %clang -### --target=x86_64-unknown-linux-gnu -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa -Xopenmp-target=amdgcn-amd-amdhsa -march=gfx906 %s 2>&1 \
 // RUN: | FileCheck %s

 // verify the tools invocations
 // CHECK: clang{{.*}}"-cc1" "-triple" "x86_64-unknown-linux-gnu"{{.*}}"-x" "c"{{.*}}
 // CHECK: clang{{.*}}"-cc1" "-triple" "x86_64-unknown-linux-gnu"{{.*}}"-x" "ir"{{.*}}
-// CHECK: clang{{.*}}"-cc1"{{.*}}"-triple" "amdgcn-amd-amdhsa"{{.*}}"-target-cpu" "gfx906" "-fcuda-is-device" "-emit-llvm-bc"{{.*}}
+// CHECK: clang{{.*}}"-cc1"{{.*}}"-triple" "amdgcn-amd-amdhsa"{{.*}}"-target-cpu" "gfx906" "-fcuda-is-device" "-emit-llvm-bc" "-mlink-builtin-bitcode"{{.*}}libomptarget-amdgcn-gfx906.bc"{{.*}}
 // CHECK: llvm-link{{.*}}"-o" "{{.*}}amdgpu-openmp-toolchain-{{.*}}-gfx906-linked-{{.*}}.bc"
 // CHECK: llc{{.*}}amdgpu-openmp-toolchain-{{.*}}-gfx906-linked-{{.*}}.bc" "-mtriple=amdgcn-amd-amdhsa" "-mcpu=gfx906" "-filetype=obj" "-o"{{.*}}amdgpu-openmp-toolchain-{{.*}}-gfx906-{{.*}}.o"
 // CHECK: lld{{.*}}"-flavor" "gnu" "--no-undefined" "-shared" "-o"{{.*}}amdgpu-openmp-toolchain-{{.*}}.out" "{{.*}}amdgpu-openmp-toolchain-{{.*}}-gfx906-{{.*}}.o"
 // CHECK: clang-offload-wrapper{{.*}}"-target" "x86_64-unknown-linux-gnu" "-o" "{{.*}}a-{{.*}}.bc" {{.*}}amdgpu-openmp-toolchain-{{.*}}.out"
 // CHECK: clang{{.*}}"-cc1" "-triple" "x86_64-unknown-linux-gnu"{{.*}}"-o" "{{.*}}a-{{.*}}.o" "-x" "ir" "{{.*}}a-{{.*}}.bc"
 // CHECK: ld{{.*}}"-o" "a.out"{{.*}}"{{.*}}amdgpu-openmp-toolchain-{{.*}}.o" "{{.*}}a-{{.*}}.o" "-lomp" "-lomptarget"

 // RUN: %clang -ccc-print-phases --target=x86_64-unknown-linux-gnu -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa -Xopenmp-target=amdgcn-amd-amdhsa -march=gfx906 %s 2>&1 \
[... 12 lines not shown ...]
 // CHECK-PHASES: 10: assembler, {9}, object, (device-openmp)
 // CHECK-PHASES: 11: linker, {10}, image, (device-openmp)
 // CHECK-PHASES: 12: offload, "device-openmp (amdgcn-amd-amdhsa)" {11}, image
 // CHECK-PHASES: 13: clang-offload-wrapper, {12}, ir, (host-openmp)
 // CHECK-PHASES: 14: backend, {13}, assembler, (host-openmp)
 // CHECK-PHASES: 15: assembler, {14}, object, (host-openmp)
 // CHECK-PHASES: 16: linker, {4, 15}, image, (host-openmp)
+
+// handling of --libomptarget-device-bc-path
+// RUN: %clang -### --target=x86_64-unknown-linux-gnu -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa -Xopenmp-target=amdgcn-amd-amdhsa -march=gfx803 --libomptarget-device-bc-path=%S/Inputs/hip_dev_lib/libomptarget-amdgcn-gfx803.bc %s 2>&1 | FileCheck %s --check-prefix=CHECK-LIBOMPTARGET
+// CHECK-LIBOMPTARGET: clang{{.*}}"-cc1"{{.*}}"-triple" "amdgcn-amd-amdhsa"{{.*}}"-target-cpu" "gfx803" "-fcuda-is-device" "-emit-llvm-bc" "-mlink-builtin-bitcode"{{.*}}Inputs/hip_dev_lib/libomptarget-amdgcn-gfx803.bc"{{.*}}

This is an archive of the discontinued LLVM Phabricator instance.

[OpenMP][AMDGPU] Add support for linking libomptarget bitcode
Accepted · Public
