This patch takes the logic CUDA already uses to search for libomptarget and extracts it into a common method.
Missed some changes:
- Fix openmp-offload.c test failure
- Fix amdgpu-openmp-toolchain.c test failure
I like this. Using the same logic, in the same function call, to find this library on either GPU is the right thing to do. Looks like a non-functional change on nvptx, though phab doesn't make that obvious.
clang/include/clang/Driver/Options.td:949
I think there's an aliasing mechanism in the Options handling, where we can use device_bc_path as the canonical choice and nvptx_bc_path as a backwards-compatible argument that sets device_bc_path.

clang/lib/Driver/ToolChains/CommonArgs.cpp:1655
This, I think, can be reduced to checking OPT_libomptarget_device_bc_path_EQ if the aliasing machinery I remember does actually exist.
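For reference, a minimal sketch of that aliasing mechanism, assuming the option names proposed in this review (the TableGen side appears as a comment since the final spellings aren't settled; the fragment would sit inside the shared search function):

```cpp
// Options.td (TableGen), roughly:
//   def libomptarget_device_bc_path_EQ
//       : Joined<["--"], "libomptarget-device-bc-path=">;
//   def libomptarget_nvptx_bc_path_EQ
//       : Joined<["--"], "libomptarget-nvptx-bc-path=">,
//         Alias<libomptarget_device_bc_path_EQ>;
//
// Option::matches() follows the unalias chain, so once the alias exists the
// driver only needs to query the canonical option; uses of the old nvptx
// spelling are reported under OPT_libomptarget_device_bc_path_EQ as well.
if (const llvm::opt::Arg *A =
        DriverArgs.getLastArg(options::OPT_libomptarget_device_bc_path_EQ))
  LibraryPaths.emplace_back(A->getValue());
```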
Edit: Debugged further, rewriting comment.
The current error message on a missing library is:
'error: No library 'libomptarget-amdgcn-gfx906.bc' found in the default clang lib directory or in LIBRARY_PATH. Please use --libomptarget-nvptx-bc-path to specify nvptx bitcode library'
The entry err_drv_omp_offload_target_missingbcruntime in ./clang/include/clang/Basic/DiagnosticDriverKinds.td, where that text lives, should probably refer to 'device' instead of 'nvptx' (an error-message change only).
Cuda.cpp calls addOpenMPDeviceRTL guarded by nogpulib; AMDGPUOpenMP calls it unconditionally. That means the deviceRTL must already be on disk when building the deviceRTL itself. Not so good. We need an if (DriverArgs.hasArg(options::OPT_nogpulib)) return; in AMDGPUOpenMPToolChain::addClangTargetOptions, sketched below. Comment inline as well.
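A sketch of where that guard would sit, assuming this patch's names (addOpenMPDeviceRTL and a GPUArch member), which may differ in the final revision:

```cpp
// Illustrative only: mirrors the nogpulib guard already present in Cuda.cpp.
void AMDGPUOpenMPToolChain::addClangTargetOptions(
    const llvm::opt::ArgList &DriverArgs, llvm::opt::ArgStringList &CC1Args,
    Action::OffloadKind DeviceOffloadingKind) const {
  // ... existing option handling ...

  // With -nogpulib we must not require a deviceRTL on disk, otherwise the
  // deviceRTL itself cannot be built through the driver.
  if (DriverArgs.hasArg(options::OPT_nogpulib))
    return;

  std::string BitcodeSuffix = "amdgcn-" + GPUArch; // e.g. "amdgcn-gfx906"
  addOpenMPDeviceRTL(getDriver(), DriverArgs, CC1Args, BitcodeSuffix,
                     getTriple());
}
```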
The existing search logic looks in clang's lib and LIBRARY_PATH, I think we should probably look in the runtime directory as well for running from the build tree. That's separate to this change though.
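For context, the rough shape of that existing search as extracted into CommonArgs.cpp (paraphrased, not the patch verbatim; D and BitcodeSuffix are the helper's parameters):

```cpp
// Candidate directories: LIBRARY_PATH entries first, then clang's lib dir.
SmallVector<std::string, 8> LibraryPaths;

if (llvm::Optional<std::string> LibPath =
        llvm::sys::Process::GetEnv("LIBRARY_PATH")) {
  SmallVector<StringRef, 8> Dirs;
  const char Sep[] = {llvm::sys::EnvPathSeparator, '\0'};
  llvm::SplitString(*LibPath, Dirs, Sep);
  for (StringRef Dir : Dirs)
    LibraryPaths.emplace_back(Dir.trim().str());
}

SmallString<256> DefaultLibPath(llvm::sys::path::parent_path(D.Dir));
llvm::sys::path::append(DefaultLibPath, "lib");
LibraryPaths.emplace_back(DefaultLibPath.str().str());

// First directory containing libomptarget-<suffix>.bc wins.
std::string LibOmpTargetName =
    ("libomptarget-" + BitcodeSuffix + ".bc").str();
for (const std::string &Dir : LibraryPaths) {
  SmallString<256> Candidate(Dir);
  llvm::sys::path::append(Candidate, LibOmpTargetName);
  if (llvm::sys::fs::exists(Candidate)) {
    CC1Args.push_back("-mlink-builtin-bitcode");
    CC1Args.push_back(DriverArgs.MakeArgString(Candidate));
    return;
  }
}
// Otherwise: emit err_drv_omp_offload_target_missingbcruntime.
```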
clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp:193
Need if (DriverArgs.hasArg(options::OPT_nogpulib)) return; here, or we can't build the deviceRTL without already having one on disk.
Generally LGTM.
I don't think it's necessary as we can add the runtime directory to LIBRARY_PATH when configuring lit.
clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp:193
FWIW, the NVPTX deviceRTL is built by directly calling the FE, not via the clang driver. clang -fopenmp -fopenmp-targets=xxx basically consists of two passes, and therefore generates two IRs, which is not what we expect. I'm not sure we really need the if statement.
clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp:193
That explains what I was missing about the ptx cmake, I think. I've had to locally hack around clang creating a bundle, which llvm-link chokes on, because cuda-device-only et al. are ignored by OpenMP. I think this check is right: it means nogpulib will exclude the RTL on both GPUs. Nvptx already has it in the control flow. Whether the RTL cmake should bypass the driver is interesting, but I think separate to this patch.
clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp:193
cuda-device-only is only for CUDA compilation. We don't have an option to invoke device compilation only.
Some background discussion about the diagnostic. This change means people using nvptx, where the library cannot be found, will now be advised to use libomptarget-device-bc-path instead of libomptarget-nvptx-bc-path. If they use libomptarget-nvptx-bc-path anyway, it'll work as intended. That avoids us adding libomptarget-amdgcn-bc-path and requiring some more conditional compilation for multiarch codebases.
@tianshilei1992 @jdoerfert can we agree on 'libomptarget-device-bc-path' being the better one to recommend in the error message, despite that being a minor change to the current behaviour? It'll slightly surprise people parsing error messages, such as our test, but shouldn't cause any confusion.
I'm personally OK with using libomptarget-nvptx-bc-path to indicate where to look for the amdgcn bitcode as well, but can see that causing confusion. I'm assuming that both GPU runtimes will go in the same directory. There may be a future where one invocation targets nvptx and amdgcn at the same time, but even then I'd prefer all the runtimes live in the same place in the filesystem.
On second thought, I don't think it is feasible. Consider the following scenario:
clang -fopenmp -fopenmp-targets=nvptx64,amdgcn source.cpp
We cannot use one option for the two different targets, and the alias might not work either, especially in terms of the driver.
Using one option for both targets seems great - if both have put the devicertl in the same folder. Which I suppose they might not have.
Maybe keep it separate for now, one for nvptx and one for amdgcn, and hope for a common 'device' later.
I have removed libomptarget-device-bc-path and added an amdgcn one. For the diagnostic, instead of having one per architecture, I have kept a single entry and added a second parameter to err_drv_omp_offload_target_missingbcruntime so the message is arch-specific.
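A sketch of what that looks like; the diagnostic wording here is illustrative, not the committed text, and the TableGen entry is shown as a comment:

```cpp
// DiagnosticDriverKinds.td (TableGen), roughly:
//   def err_drv_omp_offload_target_missingbcruntime : Error<
//     "no library '%0' found in the default clang lib directory or in "
//     "LIBRARY_PATH; use '--libomptarget-%1-bc-path' to specify %1 bitcode "
//     "library">;
//
// Driver side: the architecture string becomes the second argument, so nvptx
// and amdgcn get matching, arch-specific advice from one entry.
D.Diag(diag::err_drv_omp_offload_target_missingbcruntime)
    << LibOmpTargetName << /*arch, e.g.*/ "amdgcn";
```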
The parameter to err_drv_omp_offload_target_missingbcruntime is sensible. @tianshilei1992? With this we can set the path for nvptx and amdgcn independently.
clang/include/clang/Driver/Options.td:948
Why do we need two options that literally do the same thing?
clang/include/clang/Driver/Options.td:948
It's for when someone has decided to put nvptx bitcode in one directory and amdgpu in another. That's presently useless. It might be more helpful once we can target both platforms from one compile, but even then passing a bitcode path once per arch seems better. I'm in favour of throwing one option away and renaming the other to work on both/all targets.
Why do we need two options that literally do the same thing?
I cannot think of a use case where we would specify two distinct paths, can anyone else?