Details

Reviewers

JonChesterfield
ronlieb
jdoerfert
ye-luo
gregrodgers
yaxunl
scchan
b-sumner
jhuber6
tianshilei1992
t-tye

Commits

rG9830f902e4d0: [AMDGPU][OpenMP] Support linking of math libraries

Summary

Math libraries are linked only when -lm is specified, because the
host system could be missing rocm-device-libs.
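
As a rough illustration of the gate described in the summary, the check boils down to asking whether "m" is among the values of the -l options. The sketch below is standalone C++ rather than driver code; `requestsLibm` is a hypothetical name, and the vector stands in for whatever the driver returns for the collected -l values.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper, not part of clang: returns true when "m" appears among
// the values collected from every -l<name> option (-lm yields the value "m").
bool requestsLibm(const std::vector<std::string> &LinkLibValues) {
  return std::find(LinkLibValues.begin(), LinkLibValues.end(), "m") !=
         LinkLibValues.end();
}
```

With this shape, a command line carrying only -lpthread would leave the device math libraries out, while -lpthread -lm would pull them in.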

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

pdhaliwal created this revision. Jul 14 2021, 7:36 AM

Herald added subscribers: kerbowa, guansong, t-tye and 6 others. Jul 14 2021, 7:36 AM

pdhaliwal requested review of this revision. Jul 14 2021, 7:36 AM

Herald added a project: Restricted Project. Jul 14 2021, 7:36 AM

Herald added subscribers: cfe-commits, sstefan1, wdng.

JonChesterfield added reviewers: ye-luo, gregrodgers, yaxunl, scchan, b-sumner, jhuber6, tianshilei1992. Jul 14 2021, 7:46 AM

JonChesterfield added inline comments. Jul 14 2021, 7:49 AM

clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp
260	I recognise this comment. Is this a bunch of logic that can be moved into the base class and then called from here and hip?

pdhaliwal added inline comments. Jul 14 2021, 7:55 AM

clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp
260	This is copied (after removing stuff related to opencl) from https://github.com/llvm/llvm-project/blob/main/clang/lib/Driver/ToolChains/AMDGPU.cpp#L841. I wanted to call `ROCMToolChain::addClangTargetOptions`, but it contains some extra logic that is irrelevant to OpenMP. I will move the library linking into a separate common method as you suggest.

Harbormaster completed remote builds in B113974: Diff 358598. Jul 14 2021, 8:13 AM

Move linking logic to a common method.

JonChesterfield added a reviewer: t-tye. Jul 14 2021, 8:34 AM

JonChesterfield added inline comments.

clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp
275	Gone searching. This stuff has already been copied & pasted between AMDGPU.cpp and HIP.cpp. And diverged, looks like HIP has gained a bunch of inverse flags that AMDGPU has not, and some flags are duplicated (OPT_cl_fp32_correctly_rounded_divide_sqrt, OPT_fhip_fp32_correctly_rounded_divide_sqrt).

@b-sumner / @t-tye / @yaxunl / @scchan I'd like to suggest that we use the same handling of these flags on opencl / hip / c++ openmp. Since the names have diverged and we don't want to break backwards compatibility, I'd like to check for all three (hip, no_hip, cl) flags for each one, with last one wins semantics, and do that from a single method called by all the amdgpu language drivers. That way hip and opencl will continue working as they do today, except hip will additionally accept some opencl flags and do the right thing, and likewise opencl.

If that is unacceptable, the near future involves OPT_fopenmp_fp32_correctly_rounded_device_sqrt and similar, and I'd much rather use the same path for all the languages than copy&paste again.

@pdhaliwal getCommonBitcodeLibs will splice in ockl as well as ocml and dependencies. OpenMP doesn't call into ockl so I'd like to avoid that, preferably by moving the ockl append out of getCommonBitcodeLibs.

@amdgpu in general can we get rid of this oclc_isa_version_xxx.bc file, which only defines a single constant in each case `@__oclc_ISA_version = linkonce_odr protected local_unnamed_addr addrspace(4) constant i32 9006, align 4` in favour of emitting that linkonce_odr symbol from clang? Clang already knows which gpu it is compiling for, and uses that to find the file with the corresponding name on disk, so it could instead emit that symbol, letting us drop the O(N) tiny files from the install tree and slightly improving compile time?
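
To make the "last one wins" proposal concrete, here is a minimal standalone sketch. It does not use the clang option table; `FlagSpellings`, `correctSqrtSpellings` and `resolveLastOneWins` are hypothetical names, and only the three flag spellings are taken from the options named in the comment above.

```cpp
#include <string>
#include <vector>

// Hypothetical container: every spelling that enables or disables one option.
struct FlagSpellings {
  std::vector<std::string> Enable;
  std::vector<std::string> Disable;
};

// Spellings for the correctly-rounded divide/sqrt option (hip, cl, no_hip).
FlagSpellings correctSqrtSpellings() {
  return {{"-fhip-fp32-correctly-rounded-divide-sqrt",
           "-cl-fp32-correctly-rounded-divide-sqrt"},
          {"-fno-hip-fp32-correctly-rounded-divide-sqrt"}};
}

// Scan the command line left to right; the last matching spelling decides.
// Falls back to Default when no spelling is present.
bool resolveLastOneWins(const std::vector<std::string> &Args,
                        const FlagSpellings &Flag, bool Default) {
  bool Value = Default;
  for (const std::string &Arg : Args) {
    for (const std::string &S : Flag.Enable)
      if (Arg == S)
        Value = true;
    for (const std::string &S : Flag.Disable)
      if (Arg == S)
        Value = false;
  }
  return Value;
}
```

A single method of this shape, called from each AMDGPU language driver, would keep existing HIP and OpenCL command lines working while letting either front end accept the other's spelling.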

Harbormaster completed remote builds in B113989: Diff 358614. Jul 14 2021, 8:53 AM

yaxunl added inline comments. Jul 14 2021, 9:07 AM

clang/lib/Driver/ToolChains/AMDGPU.cpp
910–920	I agree that we'd better refactor this part as a common function used by HIP/OpenMP/OpenCL. However, I disagree with letting HIP use OpenCL options. HIP uses clang generic options or HIP/CUDA shared options to control these flags. I think OpenMP may consider using options similar to HIP's instead of OpenCL options.

JonChesterfield added inline comments. Jul 14 2021, 9:28 AM

clang/lib/Driver/ToolChains/AMDGPU.cpp
910–920	It's a mixture. Some flags look generic, some have hip in the name. My preference would be to have a generic named flag for each one, and alias hip/opencl/cuda onto the generic one. Aliasing the flags would mean what works today continues working while simplifying the control flow in clang. Keeping flags with different names that do the same thing is certainly possible but doesn't seem a feature. It makes clang more complicated in order to make user build scripts more complicated.

yaxunl added inline comments. Jul 14 2021, 9:47 AM

clang/lib/Driver/ToolChains/AMDGPU.cpp
910–920	OpenCL options are defined by the OpenCL spec. It may be difficult to alias them to other options. I am OK with aliasing HIP options to more generic options.

JonChesterfield added inline comments. Jul 14 2021, 10:59 AM

clang/lib/Driver/ToolChains/AMDGPU.cpp
910–920	Works for me. We could fold HIP and OpenMP together, both using generic options, and then leave OpenCL to adopt the same if they wish.

Extract the options from HIP/OpenMP to a common method in base class.

pdhaliwal added inline comments. Jul 27 2021, 3:51 AM

clang/lib/Driver/ToolChains/AMDGPU.cpp
923–924	I wanted to rename these to something generic like -fgpu-fp32.... but for some reason aliasing wasn't working. Anyhow, my suggestion is to make that change in a separate patch.

Harbormaster completed remote builds in B116387: Diff 361967. Jul 27 2021, 4:33 AM

I don't like that this pulls in ockl automatically but don't think that's a blocker. OK on my side, @yaxunl?

Due to the current state of the math headers, I was unable to test this patch without ockl. But the last time the headers were working, I was actually required to link ockl for a symbol (I forgot the name). I will update once I am able to get the math headers working again.

yaxunl added inline comments. Jul 28 2021, 7:12 AM

clang/lib/Driver/ToolChains/AMDGPU.cpp
831–860	I think we'd better absorb this part into the newly added function getCommonDeviceLibOptions so that we have a centralized location for determining device libs. We could use the offload kind of the toolchain to differentiate between OpenCL/HIP/OpenMP.

JonChesterfield added inline comments. Jul 28 2021, 7:43 AM

clang/lib/Driver/ToolChains/AMDGPU.cpp
831–860	getCommonBitcodeLibs is called by opencl with some other set of constraints around argument names. Persuading opencl to use the same arguments, getting rid of some of the files, doing things with aliasing, or however else we want to dice this problem is separable from linking the bitcode into openmp and can be left for a later patch. Using a common path for HIP and OpenMP seems a step in the right direction. It might take quite a long time to reach consensus on how to deduplicate the two remaining copies, which I'd guess is why they were copy/pasted to begin with.

yaxunl added inline comments. Jul 28 2021, 8:28 AM

clang/lib/Driver/ToolChains/AMDGPU.cpp
831–860	We do not need to use shared options for HIP/OpenMP/OpenCL. We could use if/else to select different options for HIP and OpenCL. I am suggesting this because this part of the code is largely the same as getCommonDeviceLibOptions, except for the option names. Also, the name getCommonDeviceLibOptions indicates that it is a common function for all languages; otherwise it should be renamed to getCommonDeviceLibOptionsForHIPAndOpenMP.

clang/lib/Driver/ToolChains/AMDGPU.h
141	Maybe rename it to getCommonDeviceLibNames, since it returns the bc file names instead of options. Please add a comment like "returns a list of device library names shared by different languages".

yaxunl added inline comments. Jul 28 2021, 8:33 AM

clang/lib/Driver/ToolChains/AMDGPU.cpp
831–860	If it is out of scope for this patch I am OK to leave this part out. I can create a patch to refactor this part later.
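
For reference, the if/else approach suggested above could look roughly like the following standalone sketch. The enum and struct are hypothetical stand-ins (the driver has its own offload-kind type); the flag spellings correspond to the OPT_cl_* and generic options that appear in the diff.

```cpp
#include <string>

// Hypothetical stand-in for the toolchain's offload kind.
enum class OffloadKind { OpenCL, HIP, OpenMP };

// Names of the options that feed the same three math booleans.
struct MathOptionNames {
  std::string FiniteOnly;
  std::string UnsafeMath;
  std::string FastMath;
};

// OpenCL keeps its spec-defined spellings; HIP and OpenMP share the generic
// clang spellings, as agreed earlier in this review.
MathOptionNames mathOptionNamesFor(OffloadKind Kind) {
  if (Kind == OffloadKind::OpenCL)
    return {"-cl-finite-math-only", "-cl-unsafe-math-optimizations",
            "-cl-fast-relaxed-math"};
  return {"-ffinite-math-only", "-funsafe-math-optimizations", "-ffast-math"};
}
```

The rest of the function (wave size, ISA file lookup, the call into getCommonBitcodeLibs) would then be identical for all three languages, which is the deduplication being asked for.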

Rename method to getCommonDeviceLibNames

Missed comment.

Harbormaster completed remote builds in B116931: Diff 362718. Jul 29 2021, 5:24 AM

I think this is good enough for now, assuming the majority of OvO is running locally. More cleanups to do after landing.

This revision is now accepted and ready to land. Jul 30 2021, 6:14 AM

LGTM. Thanks.

Closed by commit rG9830f902e4d0: [AMDGPU][OpenMP] Support linking of math libraries (authored by pdhaliwal). Jul 30 2021, 6:54 AM

This revision was automatically updated to reflect the committed changes.

pdhaliwal added a commit: rG9830f902e4d0: [AMDGPU][OpenMP] Support linking of math libraries.

It's been pointed out to me that -lm is a linker flag so it's weird to require it at compile time. I haven't thought of a good fix for that yet.

We don't need to splice in ocml for each compilation unit, so can move the test+splice into the link phase, except that amdgcn doesn't have a very well defined link phase as it keeps everything in IR. We could look for -c or equivalent, i.e. only check for ocml when building an executable, and not when compiling a translation unit to IR.
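
One possible shape for the "-c or equivalent" check is sketched below in standalone C++. The function name is hypothetical, the raw argument vector stands in for the driver's parsed argument list, and treating -S and -E as the equivalents of -c is an assumption rather than something settled in this review.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical predicate, not clang API: link the device math libraries only
// when -lm was requested and the invocation proceeds to a link step.
bool shouldLinkDeviceMathLibs(const std::vector<std::string> &Args) {
  auto Has = [&](const char *Flag) {
    return std::find(Args.begin(), Args.end(), Flag) != Args.end();
  };
  // Assumption: -c, -S and -E are the cases that stop before linking.
  bool StopsBeforeLink = Has("-c") || Has("-S") || Has("-E");
  return Has("-lm") && !StopsBeforeLink;
}
```

This would avoid splicing ocml into every translation unit, at the cost of teaching the device toolchain about options that normally only matter to the host link.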

Diff 363069

clang/lib/Driver/ToolChains/AMDGPU.h

[130 lines not shown]
 class LLVM_LIBRARY_VISIBILITY ROCMToolChain : public AMDGPUToolChain {
 public:
   ROCMToolChain(const Driver &D, const llvm::Triple &Triple,
                 const llvm::opt::ArgList &Args);
   void
   addClangTargetOptions(const llvm::opt::ArgList &DriverArgs,
                         llvm::opt::ArgStringList &CC1Args,
                         Action::OffloadKind DeviceOffloadKind) const override;
+
+  // Returns a list of device library names shared by different languages
+  llvm::SmallVector<std::string, 12>
+  getCommonDeviceLibNames(const llvm::opt::ArgList &DriverArgs,
+                          const std::string &GPUArch) const;
 };

 } // end namespace toolchains
 } // end namespace driver
 } // end namespace clang

 #endif // LLVM_CLANG_LIB_DRIVER_TOOLCHAINS_AMDGPU_H

clang/lib/Driver/ToolChains/AMDGPU.cpp

[822 lines not shown; the hunk below is inside ROCMToolChain::addClangTargetOptions]

   if (!RocmInstallation.hasDeviceLibrary()) {
     getDriver().Diag(diag::err_drv_no_rocm_device_lib) << 0;
     return;
   }

   // Get the device name and canonicalize it
   const StringRef GpuArch = getGPUArch(DriverArgs);
   auto Kind = llvm::AMDGPU::parseArchAMDGCN(GpuArch);
   const StringRef CanonArch = llvm::AMDGPU::getArchNameAMDGCN(Kind);
   std::string LibDeviceFile = RocmInstallation.getLibDeviceFile(CanonArch);
   if (LibDeviceFile.empty()) {
     getDriver().Diag(diag::err_drv_no_rocm_device_lib) << 1 << GpuArch;
     return;
   }

   bool Wave64 = isWave64(DriverArgs, Kind);

   // TODO: There are way too many flags that change this. Do we need to check
   // them all?
   bool DAZ = DriverArgs.hasArg(options::OPT_cl_denorms_are_zero) ||
              getDefaultDenormsAreZeroForTarget(Kind);
   bool FiniteOnly = DriverArgs.hasArg(options::OPT_cl_finite_math_only);

   bool UnsafeMathOpt =
       DriverArgs.hasArg(options::OPT_cl_unsafe_math_optimizations);
   bool FastRelaxedMath = DriverArgs.hasArg(options::OPT_cl_fast_relaxed_math);
   bool CorrectSqrt =
       DriverArgs.hasArg(options::OPT_cl_fp32_correctly_rounded_divide_sqrt);

   // Add the OpenCL specific bitcode library.
   llvm::SmallVector<std::string, 12> BCLibs;
   BCLibs.push_back(RocmInstallation.getOpenCLPath().str());

   // Add the generic set of libraries.
   BCLibs.append(RocmInstallation.getCommonBitcodeLibs(
       DriverArgs, LibDeviceFile, Wave64, DAZ, FiniteOnly, UnsafeMathOpt,
       FastRelaxedMath, CorrectSqrt));

   llvm::for_each(BCLibs, [&](StringRef BCFile) {
     CC1Args.push_back("-mlink-builtin-bitcode");
     CC1Args.push_back(DriverArgs.MakeArgString(BCFile));
   });
 }

 llvm::SmallVector<std::string, 12>
[19 lines not shown]
 }

 bool AMDGPUToolChain::shouldSkipArgument(const llvm::opt::Arg *A) const {
   Option O = A->getOption();
   if (O.matches(options::OPT_fPIE) || O.matches(options::OPT_fpie))
     return true;
   return false;
 }
+
+llvm::SmallVector<std::string, 12>
+ROCMToolChain::getCommonDeviceLibNames(const llvm::opt::ArgList &DriverArgs,
+                                       const std::string &GPUArch) const {
+  auto Kind = llvm::AMDGPU::parseArchAMDGCN(GPUArch);
+  const StringRef CanonArch = llvm::AMDGPU::getArchNameAMDGCN(Kind);
+
+  std::string LibDeviceFile = RocmInstallation.getLibDeviceFile(CanonArch);
+  if (LibDeviceFile.empty()) {
+    getDriver().Diag(diag::err_drv_no_rocm_device_lib) << 1 << GPUArch;
+    return {};
+  }
+
+  // If --hip-device-lib is not set, add the default bitcode libraries.
+  // TODO: There are way too many flags that change this. Do we need to check
+  // them all?
+  bool DAZ = DriverArgs.hasFlag(options::OPT_fgpu_flush_denormals_to_zero,
+                                options::OPT_fno_gpu_flush_denormals_to_zero,
+                                getDefaultDenormsAreZeroForTarget(Kind));
+  bool FiniteOnly = DriverArgs.hasFlag(
+      options::OPT_ffinite_math_only, options::OPT_fno_finite_math_only, false);
+  bool UnsafeMathOpt =
+      DriverArgs.hasFlag(options::OPT_funsafe_math_optimizations,
+                         options::OPT_fno_unsafe_math_optimizations, false);
+  bool FastRelaxedMath = DriverArgs.hasFlag(options::OPT_ffast_math,
+                                            options::OPT_fno_fast_math, false);
+  bool CorrectSqrt = DriverArgs.hasFlag(
+      options::OPT_fhip_fp32_correctly_rounded_divide_sqrt,
+      options::OPT_fno_hip_fp32_correctly_rounded_divide_sqrt);
+  bool Wave64 = isWave64(DriverArgs, Kind);
+
+  return RocmInstallation.getCommonBitcodeLibs(
+      DriverArgs, LibDeviceFile, Wave64, DAZ, FiniteOnly, UnsafeMathOpt,
+      FastRelaxedMath, CorrectSqrt);
+}
\ No newline at end of file

clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp

 //===- AMDGPUOpenMP.cpp - AMDGPUOpenMP ToolChain Implementation -- C++ --===//
 //
 // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
 // See https://llvm.org/LICENSE.txt for license information.
 // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
 //
 //===----------------------------------------------------------------------===//

 #include "AMDGPUOpenMP.h"
 #include "AMDGPU.h"
 #include "CommonArgs.h"
+#include "ToolChains/ROCm.h"
 #include "clang/Basic/DiagnosticDriver.h"
 #include "clang/Driver/Compilation.h"
 #include "clang/Driver/Driver.h"
 #include "clang/Driver/DriverDiagnostic.h"
 #include "clang/Driver/InputInfo.h"
 #include "clang/Driver/Options.h"
+#include "llvm/ADT/STLExtras.h"
 #include "llvm/Support/FileSystem.h"
 #include "llvm/Support/FormatAdapters.h"
 #include "llvm/Support/FormatVariadic.h"
 #include "llvm/Support/Path.h"

 using namespace clang::driver;
 using namespace clang::driver::toolchains;
 using namespace clang::driver::tools;
[201 lines not shown; the hunk below is inside AMDGPUOpenMPToolChain::addClangTargetOptions]
   if (DriverArgs.hasFlag(options::OPT_fopenmp_target_new_runtime,
                          options::OPT_fno_openmp_target_new_runtime, false))
     BitcodeSuffix = "new-amdgcn-" + GPUArch;
   else
     BitcodeSuffix = "amdgcn-" + GPUArch;

   addOpenMPDeviceRTL(getDriver(), DriverArgs, CC1Args, BitcodeSuffix,
                      getTriple());
+
+  if (!DriverArgs.hasArg(options::OPT_l))
+    return;
+
+  auto Lm = DriverArgs.getAllArgValues(options::OPT_l);
+  bool HasLibm = false;
+  for (auto &Lib : Lm) {
+    if (Lib == "m") {
+      HasLibm = true;
+      break;
+    }
+  }
+
+  if (HasLibm) {
+    SmallVector<std::string, 12> BCLibs =
+        getCommonDeviceLibNames(DriverArgs, GPUArch);
+    llvm::for_each(BCLibs, [&](StringRef BCFile) {
+      CC1Args.push_back("-mlink-builtin-bitcode");
+      CC1Args.push_back(DriverArgs.MakeArgString(BCFile));
+    });
+  }
 }

 llvm::opt::DerivedArgList *AMDGPUOpenMPToolChain::TranslateArgs(
     const llvm::opt::DerivedArgList &Args, StringRef BoundArch,
     Action::OffloadKind DeviceOffloadKind) const {
   DerivedArgList *DAL =
       HostTC.TranslateArgs(Args, BoundArch, DeviceOffloadKind);
   if (!DAL)
     DAL = new DerivedArgList(Args.getBaseArgs());

   const OptTable &Opts = getDriver().getOpts();

   if (DeviceOffloadKind != Action::OFK_OpenMP) {
     for (Arg *A : Args) {
       DAL->append(A);
     }
   }

   if (!BoundArch.empty()) {
     DAL->eraseArg(options::OPT_march_EQ);
     DAL->AddJoinedArg(nullptr, Opts.getOption(options::OPT_march_EQ),
                       BoundArch);
   }

   return DAL;
 }
[44 lines not shown]

clang/lib/Driver/ToolChains/HIP.cpp

[389 lines not shown; the hunk below follows `if (!BCLibArgs.empty()) {`]
     });
   } else {
     if (!RocmInstallation.hasDeviceLibrary()) {
       getDriver().Diag(diag::err_drv_no_rocm_device_lib) << 0;
       return {};
     }
     StringRef GpuArch = getGPUArch(DriverArgs);
     assert(!GpuArch.empty() && "Must have an explicit GPU arch.");
-    (void)GpuArch;
-    auto Kind = llvm::AMDGPU::parseArchAMDGCN(GpuArch);
-    const StringRef CanonArch = llvm::AMDGPU::getArchNameAMDGCN(Kind);
-
-    std::string LibDeviceFile = RocmInstallation.getLibDeviceFile(CanonArch);
-    if (LibDeviceFile.empty()) {
-      getDriver().Diag(diag::err_drv_no_rocm_device_lib) << 1 << GpuArch;
-      return {};
-    }

     // If --hip-device-lib is not set, add the default bitcode libraries.
-    // TODO: There are way too many flags that change this. Do we need to check
-    // them all?
-    bool DAZ = DriverArgs.hasFlag(options::OPT_fgpu_flush_denormals_to_zero,
-                                  options::OPT_fno_gpu_flush_denormals_to_zero,
-                                  getDefaultDenormsAreZeroForTarget(Kind));
-    bool FiniteOnly =
-        DriverArgs.hasFlag(options::OPT_ffinite_math_only,
-                           options::OPT_fno_finite_math_only, false);
-    bool UnsafeMathOpt =
-        DriverArgs.hasFlag(options::OPT_funsafe_math_optimizations,
-                           options::OPT_fno_unsafe_math_optimizations, false);
-    bool FastRelaxedMath = DriverArgs.hasFlag(
-        options::OPT_ffast_math, options::OPT_fno_fast_math, false);
-    bool CorrectSqrt = DriverArgs.hasFlag(
-        options::OPT_fhip_fp32_correctly_rounded_divide_sqrt,
-        options::OPT_fno_hip_fp32_correctly_rounded_divide_sqrt);
-    bool Wave64 = isWave64(DriverArgs, Kind);
-
     if (DriverArgs.hasFlag(options::OPT_fgpu_sanitize,
                            options::OPT_fno_gpu_sanitize, false)) {
       auto AsanRTL = RocmInstallation.getAsanRTLPath();
       if (AsanRTL.empty()) {
         unsigned DiagID = getDriver().getDiags().getCustomDiagID(
             DiagnosticsEngine::Error,
             "AMDGPU address sanitizer runtime library (asanrtl) is not found. "
             "Please install ROCm device library which supports address "
             "sanitizer");
         getDriver().Diag(DiagID);
         return {};
       } else
         BCLibs.push_back(AsanRTL.str());
     }

     // Add the HIP specific bitcode library.
     BCLibs.push_back(RocmInstallation.getHIPPath().str());

-    // Add the generic set of libraries.
-    BCLibs.append(RocmInstallation.getCommonBitcodeLibs(
-        DriverArgs, LibDeviceFile, Wave64, DAZ, FiniteOnly, UnsafeMathOpt,
-        FastRelaxedMath, CorrectSqrt));
+    // Add common device libraries like ocml etc.
+    BCLibs.append(getCommonDeviceLibNames(DriverArgs, GpuArch.str()));

     // Add instrument lib.
     auto InstLib =
         DriverArgs.getLastArgValue(options::OPT_gpu_instrument_lib_EQ);
     if (InstLib.empty())
       return BCLibs;
     if (llvm::sys::fs::exists(InstLib))
       BCLibs.push_back(InstLib.str());
[31 lines not shown]

clang/test/Driver/amdgpu-openmp-toolchain.c

[68 lines not shown]
 // CHECK-C: "amdgcn-amd-amdhsa" - "clang",{{.*}}output: "[[DEVICE_I:.*]]"
 // CHECK-C: "amdgcn-amd-amdhsa" - "clang", inputs: ["[[DEVICE_I]]", "[[HOST_BC]]"]
 // CHECK-C: "x86_64-unknown-linux-gnu" - "clang"
 // CHECK-C: "x86_64-unknown-linux-gnu" - "clang::as"
 // CHECK-C: "x86_64-unknown-linux-gnu" - "offload bundler"

 // RUN: %clang -### --target=x86_64-unknown-linux-gnu -emit-llvm -S -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa -Xopenmp-target=amdgcn-amd-amdhsa -march=gfx803 -nogpulib %s 2>&1 | FileCheck %s --check-prefix=CHECK-EMIT-LLVM-IR
 // CHECK-EMIT-LLVM-IR: clang{{.*}}"-cc1"{{.*}}"-triple" "amdgcn-amd-amdhsa"{{.*}}"-emit-llvm"
+
+// RUN: env LIBRARY_PATH=%S/Inputs/hip_dev_lib %clang -### -target x86_64-pc-linux-gnu -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa -Xopenmp-target=amdgcn-amd-amdhsa -march=gfx803 -lm --rocm-device-lib-path=%S/Inputs/rocm/amdgcn/bitcode %s 2>&1 | FileCheck %s --check-prefix=CHECK-LIB-DEVICE
+// CHECK-LIB-DEVICE: clang{{.*}}"-cc1"{{.*}}"-triple" "amdgcn-amd-amdhsa"{{.*}}"-mlink-builtin-bitcode"{{.*}}libomptarget-amdgcn-gfx803.bc"{{.*}}"-mlink-builtin-bitcode"{{.*}}ocml.bc" "-mlink-builtin-bitcode"{{.*}}ockl.bc" "-mlink-builtin-bitcode"{{.*}}oclc_daz_opt_on.bc" "-mlink-builtin-bitcode"{{.*}}oclc_unsafe_math_off.bc" "-mlink-builtin-bitcode"{{.*}}oclc_finite_only_off.bc" "-mlink-builtin-bitcode"{{.*}}oclc_correctly_rounded_sqrt_on.bc" "-mlink-builtin-bitcode"{{.*}}oclc_wavefrontsize64_on.bc" "-mlink-builtin-bitcode"{{.*}}oclc_isa_version_803.bc"
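
The CHECK-LIB-DEVICE line shows how the boolean options end up as a list of bitcode file names. The standalone sketch below mirrors that mapping; `commonBitcodeLibNames` is a hypothetical stand-in for RocmInstallation.getCommonBitcodeLibs (which returns full paths), the file names and their order come from the CHECK line, and folding the fast-relaxed-math setting into the unsafe-math and finite-only choices is an assumption about the real implementation.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in: builds bare file names in the order the test expects.
std::vector<std::string>
commonBitcodeLibNames(const std::string &IsaVersionFile, bool Wave64, bool DAZ,
                      bool FiniteOnly, bool UnsafeMathOpt,
                      bool FastRelaxedMath, bool CorrectSqrt) {
  // Each control library ships as a pair of <stem>_on.bc / <stem>_off.bc.
  auto OnOff = [](const std::string &Stem, bool Enabled) {
    return Stem + (Enabled ? "_on.bc" : "_off.bc");
  };
  return {
      "ocml.bc",
      "ockl.bc",
      OnOff("oclc_daz_opt", DAZ),
      // Assumption: fast-relaxed math implies unsafe math and finite-only.
      OnOff("oclc_unsafe_math", UnsafeMathOpt || FastRelaxedMath),
      OnOff("oclc_finite_only", FiniteOnly || FastRelaxedMath),
      OnOff("oclc_correctly_rounded_sqrt", CorrectSqrt),
      OnOff("oclc_wavefrontsize64", Wave64),
      IsaVersionFile,
  };
}
```

Called with the settings the test implies for gfx803 (DAZ on, wave64 on, correctly rounded sqrt on, everything else off) and "oclc_isa_version_803.bc", it yields the same eight names that follow libomptarget-amdgcn-gfx803.bc in the CHECK line.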

This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU][OpenMP] Support linking of math libraries
ClosedPublic
