This is an archive of the discontinued LLVM Phabricator instance.

Differential D141105

[OpenMP] Add support for '--offload-arch=native' to OpenMP offloading
ClosedPublic

Authored by jhuber6 on Jan 5 2023, 7:49 PM.

Details

Reviewers

jdoerfert
tianshilei1992
JonChesterfield
tra
yaxunl

Commits

rGa17ab7aa3be0: [OpenMP] Add support for '--offload-arch=native' to OpenMP offloading

Summary

This patch adds support for '--offload-arch=native' to OpenMP
offloading. This will automatically generate the toolchains required to
fulfil whatever GPUs the user has installed. Getting this to work
requires a bit of a hack. The problem is that we need the ToolChain to
launch its searching program. But we do not yet have that ToolChain
built. I had to temporarily make the ToolChain and also add some logic
to ignore regular warnings & errors.
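
The workaround described in the summary can be sketched roughly as follows. This is a simplified, self-contained model and not the driver code itself: `MockToolChain`, `Diagnostics`, `queryArchs`, and `probeNativeArchs` are hypothetical names invented for illustration. It only shows the shape of the idea, which is to build a short-lived toolchain per candidate target, query it with errors suppressed, and report a single error if nothing was found.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a toolchain that can run an arch-detection tool.
struct MockToolChain {
  std::string Triple;
  bool DetectionFailed;              // true if the detection tool failed
  std::vector<std::string> Detected; // archs reported by the tool
};

// Hypothetical diagnostic sink; the real driver reports errors through its
// own diagnostics machinery.
struct Diagnostics {
  std::vector<std::string> Errors;
};

// Query the archs visible to one toolchain. When SuppressError is true, a
// failed detection is silently dropped instead of being reported.
std::set<std::string> queryArchs(const MockToolChain &TC, Diagnostics &Diags,
                                 bool SuppressError) {
  std::set<std::string> Archs;
  if (TC.DetectionFailed) {
    if (!SuppressError)
      Diags.Errors.push_back("cannot determine GPU arch for " + TC.Triple);
    return Archs;
  }
  Archs.insert(TC.Detected.begin(), TC.Detected.end());
  return Archs;
}

// Create a temporary toolchain per candidate, probe it quietly, and merge the
// results. Only an overall failure (nothing detected) is reported.
std::set<std::string>
probeNativeArchs(const std::vector<MockToolChain> &Candidates,
                 Diagnostics &Diags) {
  std::set<std::string> Archs;
  for (const MockToolChain &Candidate : Candidates) {
    MockToolChain TempTC = Candidate; // lives only for this iteration
    std::set<std::string> Found =
        queryArchs(TempTC, Diags, /*SuppressError=*/true);
    Archs.insert(Found.begin(), Found.end());
  }
  if (Archs.empty())
    Diags.Errors.push_back("failed to deduce triple for 'native'");
  return Archs;
}
```

With one failing and one succeeding candidate, the failure is swallowed and only the detected architecture is returned; an error appears only when every probe comes back empty.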

Depends on D141078

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

jhuber6 created this revision. Jan 5 2023, 7:49 PM

Herald added a project: Restricted Project. Jan 5 2023, 7:49 PM

Herald added a subscriber: guansong.

jhuber6 requested review of this revision. Jan 5 2023, 7:49 PM

Herald added a project: Restricted Project. Jan 5 2023, 7:49 PM

Herald added subscribers: cfe-commits, sstefan1, MaskRay.

Harbormaster completed remote builds in B206031: Diff 486742. Jan 5 2023, 8:26 PM

Possible naming hazard here. march=native means target the local processor architecture, zen2 or whatever, and we have the host CPU as an offloading target already. So what I'd expect this to do is host offloading with the openmp runtime compiled for the local variant of x86 or aarch64, not for it to have a guess at a GPU target.

What do you think of offload-arch=GPU for picking a plausible GPU? That distinguishes it from other things we might want to offload to. Open question whether it should create a vgpu instance if it can't detect a physical card.

In D141105#4031103, @JonChesterfield wrote:

Possible naming hazard here. march=native means target the local processor architecture, zen2 or whatever, and we have the host CPU as an offloading target already. So what I'd expect this to do is host offloading with the openmp runtime compiled for the local variant of x86 or aarch64, not for it to have a guess at a GPU target.

What do you think of offload-arch=GPU for picking a plausible GPU? That distinguishes it from other things we might want to offload to. Open question whether it should create a vgpu instance if it can't detect a physical card.

There is some prior art here, e.g. CUDA and CMake, but we don't necessarily need to follow them. I'm in favor of native because it tracks with what the user "expects" native to do. It may be somewhat ambiguous given that we can offload to the host, but I think native in the future should just go with whatever we can detect on the user's system. So if someone has some future FPGA it'll detect that. We can still use native in the host way with this syntax, we just can't infer the triple. e.g. `clang foo.c -fopenmp -fopenmp-targets=x86_64-unknown-linux-gnu --offload-arch=native` will do what you expect.
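
To make the point about triple inference concrete, here is a minimal sketch of grouping architecture names by the offloading triple they imply. It is an assumption-laden simplification: the driver classifies names through its CUDA/HIP architecture tables, whereas this sketch only looks at the `sm_` and `gfx` name prefixes, and `startsWith` and `deriveTriples` are hypothetical helpers. The triple strings are the ones checked in the test added by this revision.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Prefix check used for the simplified classification below. Matching on
// name prefixes is an assumption of this sketch, not the driver's method.
static bool startsWith(const std::string &S, const std::string &Prefix) {
  return S.compare(0, Prefix.size(), Prefix) == 0;
}

// Group each arch under the triple it implies. Returns false and records the
// offending name in BadArch for the first arch that implies no known triple.
bool deriveTriples(const std::vector<std::string> &Archs,
                   std::map<std::string, std::set<std::string>> &DerivedArchs,
                   std::string &BadArch) {
  for (const std::string &Arch : Archs) {
    if (startsWith(Arch, "sm_")) {
      DerivedArchs["nvptx64-nvidia-cuda"].insert(Arch);
    } else if (startsWith(Arch, "gfx")) {
      DerivedArchs["amdgcn-amd-amdhsa"].insert(Arch);
    } else {
      BadArch = Arch;
      return false;
    }
  }
  return true;
}
```

A host CPU name such as `znver2` falls through to the failure branch, which is why the host case still needs an explicit `-fopenmp-targets=` triple.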

Add test

Harbormaster completed remote builds in B206137: Diff 486901. Jan 6 2023, 9:59 AM

Looks reasonable to me. See comments though

clang/include/clang/Driver/Driver.h
488	I wouldn't call it query but it's not too bad either. I'd call it "SuppressErrors".
clang/lib/Driver/Driver.cpp
905	Does this necessarily mean we failed with `=native`, if so it's ok. Just didn't follow the logic all the way.

This revision is now accepted and ready to land. Jan 10 2023, 10:56 PM

jhuber6 marked an inline comment as done. Jan 11 2023, 6:05 AM

jhuber6 added inline comments.

clang/lib/Driver/Driver.cpp
905	I think this might trigger if someone passed `--offload-arch=` maybe we should treat that the same as native?

Change to SuppressError and make `--offload-arch=` just default to native.
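
A small sketch of what treating an empty value like `native` means when the option value is a comma-separated list. The helpers `splitCommas` and `expandOffloadArchs` are hypothetical and use only the standard library; the key assumption is that splitting an empty string yields one empty entry, so `--offload-arch=` takes the same path as `--offload-arch=native`.

```cpp
#include <set>
#include <string>
#include <vector>

// Split on commas and keep empty fields, so "" yields a single empty entry.
std::vector<std::string> splitCommas(const std::string &Value) {
  std::vector<std::string> Parts;
  std::string::size_type Start = 0;
  while (true) {
    std::string::size_type Comma = Value.find(',', Start);
    if (Comma == std::string::npos) {
      Parts.push_back(Value.substr(Start));
      return Parts;
    }
    Parts.push_back(Value.substr(Start, Comma - Start));
    Start = Comma + 1;
  }
}

// Expand "native" and empty entries to the detected archs; keep explicit
// names as they are.
std::set<std::string>
expandOffloadArchs(const std::string &Value,
                   const std::vector<std::string> &Detected) {
  std::set<std::string> Archs;
  std::vector<std::string> Parts = splitCommas(Value);
  for (const std::string &Arch : Parts) {
    if (Arch == "native" || Arch.empty())
      Archs.insert(Detected.begin(), Detected.end());
    else
      Archs.insert(Arch);
  }
  return Archs;
}
```

Under these assumptions, `native,sm_75` with a detected `sm_70` gives both `sm_70` and `sm_75`, and an empty value gives just the detected set.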

Harbormaster completed remote builds in B207074: Diff 488193. Jan 11 2023, 6:43 AM

This revision was landed with ongoing or failed builds. Jan 11 2023, 8:31 AM

Closed by commit rGa17ab7aa3be0: [OpenMP] Add support for '--offload-arch=native' to OpenMP offloading (authored by jhuber6).

This revision was automatically updated to reflect the committed changes.

jhuber6 added a commit: rGa17ab7aa3be0: [OpenMP] Add support for '--offload-arch=native' to OpenMP offloading.

Just a heads up: This commit is causing our bootstrap build to fail (your new openmp-system-arch.c test is failing in our stage1 compiler).

FYI: I fixed the problem in https://github.com/llvm/llvm-project/commit/0a11a1b1868dd2ab183c4313ccbfbe126e91ca08.

In D141105#4046400, @gribozavr2 wrote:

FYI: I fixed the problem in https://github.com/llvm/llvm-project/commit/0a11a1b1868dd2ab183c4313ccbfbe126e91ca08.

Thanks, I forgot to update that test after fixing a similar problem before.

Revision Contents

Path                                      Size
clang/include/clang/Driver/Driver.h       5 lines
clang/lib/Driver/Driver.cpp               73 lines
clang/test/Driver/openmp-system-arch.c    56 lines

Diff 488238

clang/include/clang/Driver/Driver.h

[475 lines elided; context: public:]
   /// \param HostAction - The host action used in the offloading toolchain.
   Action *BuildOffloadingActions(Compilation &C,
                                  llvm::opt::DerivedArgList &Args,
                                  const InputTy &Input,
                                  Action *HostAction) const;
 
   /// Returns the set of bound architectures active for this offload kind.
   /// If there are no bound architctures we return a set containing only the
-  /// empty string.
+  /// empty string. The \p SuppressError option is used to suppress errors.
   llvm::DenseSet<StringRef>
   getOffloadArchs(Compilation &C, const llvm::opt::DerivedArgList &Args,
-                  Action::OffloadKind Kind, const ToolChain *TC) const;
+                  Action::OffloadKind Kind, const ToolChain *TC,
+                  bool SuppressError = false) const;
[Inline comment, jdoerfert (Done): I wouldn't call it query but it's not too bad either. I'd call it "SuppressErrors".]
 
   /// Check that the file referenced by Value exists. If it doesn't,
   /// issue a diagnostic and return false.
   /// If TypoCorrect is true and the file does not exist, see if it looks
   /// like a likely typo for a flag and if so print a "did you mean" blurb.
   bool DiagnoseInputExistence(const llvm::opt::DerivedArgList &Args,
                               StringRef Value, types::ID Ty,
                               bool TypoCorrect) const;
[298 lines elided]

clang/lib/Driver/Driver.cpp

[853 lines elided; context: if (IsOpenMPOffloading) {]
     } else if (C.getInputArgs().hasArg(options::OPT_offload_arch_EQ) &&
                !IsHIP && !IsCuda) {
       const ToolChain *HostTC = C.getSingleOffloadToolChain<Action::OFK_Host>();
       auto AMDTriple = getHIPOffloadTargetTriple(*this, C.getInputArgs());
       auto NVPTXTriple = getNVIDIAOffloadTargetTriple(*this, C.getInputArgs(),
                                                       HostTC->getTriple());
 
       // Attempt to deduce the offloading triple from the set of architectures.
-      // We can only correctly deduce NVPTX / AMDGPU triples currently.
-      llvm::DenseSet<StringRef> Archs =
-          getOffloadArchs(C, C.getArgs(), Action::OFK_OpenMP, nullptr);
+      // We can only correctly deduce NVPTX / AMDGPU triples currently. We need
+      // to temporarily create these toolchains so that we can access tools for
+      // inferring architectures.
+      llvm::DenseSet<StringRef> Archs;
+      if (NVPTXTriple) {
+        auto TempTC = std::make_unique<toolchains::CudaToolChain>(
+            *this, *NVPTXTriple, *HostTC, C.getInputArgs());
+        for (StringRef Arch : getOffloadArchs(
+                 C, C.getArgs(), Action::OFK_OpenMP, &*TempTC, true))
+          Archs.insert(Arch);
+      }
+      if (AMDTriple) {
+        auto TempTC = std::make_unique<toolchains::AMDGPUOpenMPToolChain>(
+            *this, *AMDTriple, *HostTC, C.getInputArgs());
+        for (StringRef Arch : getOffloadArchs(
+                 C, C.getArgs(), Action::OFK_OpenMP, &*TempTC, true))
+          Archs.insert(Arch);
+      }
+      if (!AMDTriple && !NVPTXTriple) {
+        for (StringRef Arch :
+             getOffloadArchs(C, C.getArgs(), Action::OFK_OpenMP, nullptr, true))
+          Archs.insert(Arch);
+      }
 
       for (StringRef Arch : Archs) {
         if (NVPTXTriple && IsNVIDIAGpuArch(StringToCudaArch(
                                getProcessorFromTargetID(*NVPTXTriple, Arch)))) {
           DerivedArchs[NVPTXTriple->getTriple()].insert(Arch);
         } else if (AMDTriple &&
                    IsAMDGpuArch(StringToCudaArch(
                        getProcessorFromTargetID(*AMDTriple, Arch)))) {
           DerivedArchs[AMDTriple->getTriple()].insert(Arch);
         } else {
           Diag(clang::diag::err_drv_failed_to_deduce_target_from_arch) << Arch;
           return;
         }
       }
 
+      // If the set is empty then we failed to find a native architecture.
+      if (Archs.empty()) {
+        Diag(clang::diag::err_drv_failed_to_deduce_target_from_arch)
+            << "native";
+        return;
+      }
[Inline comment, jdoerfert (Not Done): Does this necessarily mean we failed with `=native`, if so it's ok. Just didn't follow the logic all the way.]
[Inline comment, jhuber6 (Author, Done): I think this might trigger if someone passed `--offload-arch=` maybe we should treat that the same as native?]
+
       for (const auto &TripleAndArchs : DerivedArchs)
         OpenMPTriples.push_back(TripleAndArchs.first());
     }
 
     for (StringRef Val : OpenMPTriples) {
       llvm::Triple TT(ToolChain::getOpenMPTriple(Val));
       std::string NormalizedName = TT.normalize();
 
[3,301 lines elided; context: void Driver::BuildActions(Compilation &C, DerivedArgList &Args,]
     Args.ClaimAllArgs(options::OPT_cl_ignored_Group);
 }
 
 /// Returns the canonical name for the offloading architecture when using a HIP
 /// or CUDA architecture.
 static StringRef getCanonicalArchString(Compilation &C,
                                         const llvm::opt::DerivedArgList &Args,
                                         StringRef ArchStr,
-                                        const llvm::Triple &Triple) {
+                                        const llvm::Triple &Triple,
+                                        bool SuppressError = false) {
   // Lookup the CUDA / HIP architecture string. Only report an error if we were
   // expecting the triple to be only NVPTX / AMDGPU.
   CudaArch Arch = StringToCudaArch(getProcessorFromTargetID(Triple, ArchStr));
-  if (Triple.isNVPTX() &&
+  if (!SuppressError && Triple.isNVPTX() &&
       (Arch == CudaArch::UNKNOWN || !IsNVIDIAGpuArch(Arch))) {
     C.getDriver().Diag(clang::diag::err_drv_offload_bad_gpu_arch)
         << "CUDA" << ArchStr;
     return StringRef();
-  } else if (Triple.isAMDGPU() &&
+  } else if (!SuppressError && Triple.isAMDGPU() &&
              (Arch == CudaArch::UNKNOWN || !IsAMDGpuArch(Arch))) {
     C.getDriver().Diag(clang::diag::err_drv_offload_bad_gpu_arch)
         << "HIP" << ArchStr;
     return StringRef();
   }
 
   if (IsNVIDIAGpuArch(Arch))
     return Args.MakeArgStringRef(CudaArchToString(Arch));
[26 lines elided; context: getConflictOffloadArchCombination(const llvm::DenseSet<StringRef> &Archs,]
 
   std::set<StringRef> ArchSet;
   llvm::copy(Archs, std::inserter(ArchSet, ArchSet.begin()));
   return getConflictTargetIDCombination(ArchSet);
 }

 llvm::DenseSet<StringRef>
 Driver::getOffloadArchs(Compilation &C, const llvm::opt::DerivedArgList &Args,
-                        Action::OffloadKind Kind, const ToolChain *TC) const {
+                        Action::OffloadKind Kind, const ToolChain *TC,
+                        bool SuppressError) const {
   if (!TC)
     TC = &C.getDefaultToolChain();
 
   // --offload and --offload-arch options are mutually exclusive.
   if (Args.hasArgNoClaim(options::OPT_offload_EQ) &&
       Args.hasArgNoClaim(options::OPT_offload_arch_EQ,
                          options::OPT_no_offload_arch_EQ)) {
     C.getDriver().Diag(diag::err_opt_not_valid_with_opt)
[17 lines elided; context: if (Arg->getOption().matches(options::OPT_Xopenmp_target_EQ) &&]
       ExtractedArg = getOpts().ParseOneArg(Args, Index);
       Arg = ExtractedArg.get();
     }
 
     // Add or remove the seen architectures in order of appearance. If an
     // invalid architecture is given we simply exit.
     if (Arg->getOption().matches(options::OPT_offload_arch_EQ)) {
       for (StringRef Arch : llvm::split(Arg->getValue(), ",")) {
-        if (Arch == "native") {
+        if (Arch == "native" || Arch.empty()) {
           auto GPUsOrErr = TC->getSystemGPUArchs(Args);
           if (!GPUsOrErr) {
+            if (SuppressError)
+              llvm::consumeError(GPUsOrErr.takeError());
+            else
-            TC->getDriver().Diag(diag::err_drv_undetermined_gpu_arch)
-                << llvm::Triple::getArchTypeName(TC->getArch())
-                << llvm::toString(GPUsOrErr.takeError()) << "--offload-arch";
+              TC->getDriver().Diag(diag::err_drv_undetermined_gpu_arch)
+                  << llvm::Triple::getArchTypeName(TC->getArch())
+                  << llvm::toString(GPUsOrErr.takeError()) << "--offload-arch";
             continue;
           }
 
-          for (auto ArchStr : *GPUsOrErr)
+          for (auto ArchStr : *GPUsOrErr) {
             Archs.insert(
-                getCanonicalArchString(C, Args, ArchStr, TC->getTriple()));
+                getCanonicalArchString(C, Args, Args.MakeArgString(ArchStr),
+                                       TC->getTriple(), SuppressError));
+          }
         } else {
-          StringRef ArchStr =
-              getCanonicalArchString(C, Args, Arch, TC->getTriple());
+          StringRef ArchStr = getCanonicalArchString(
+              C, Args, Arch, TC->getTriple(), SuppressError);
           if (ArchStr.empty())
             return Archs;
           Archs.insert(ArchStr);
         }
       }
     } else if (Arg->getOption().matches(options::OPT_no_offload_arch_EQ)) {
       for (StringRef Arch : llvm::split(Arg->getValue(), ",")) {
         if (Arch == "all") {
           Archs.clear();
         } else {
-          StringRef ArchStr =
-              getCanonicalArchString(C, Args, Arch, TC->getTriple());
+          StringRef ArchStr = getCanonicalArchString(
+              C, Args, Arch, TC->getTriple(), SuppressError);
           if (ArchStr.empty())
             return Archs;
           Archs.erase(ArchStr);
         }
       }
     }
   }
 
   if (auto ConflictingArchs = getConflictOffloadArchCombination(Archs, Kind)) {
     C.getDriver().Diag(clang::diag::err_drv_bad_offload_arch_combo)
         << ConflictingArchs->first << ConflictingArchs->second;
     C.setContainsError();
   }
 
+  // Skip filling defaults if we're just querying what is availible.
+  if (SuppressError)
+    return Archs;
+
   if (Archs.empty()) {
     if (Kind == Action::OFK_Cuda)
       Archs.insert(CudaArchToString(CudaArch::CudaDefault));
     else if (Kind == Action::OFK_HIP)
       Archs.insert(CudaArchToString(CudaArch::HIPDefault));
     else if (Kind == Action::OFK_OpenMP)
       Archs.insert(StringRef());
   } else {
[1,986 lines elided]

clang/test/Driver/openmp-system-arch.c

This file was added.

// RUN: mkdir -p %t
// RUN: cp %S/Inputs/amdgpu-arch/amdgpu_arch_fail %t/
// RUN: cp %S/Inputs/amdgpu-arch/amdgpu_arch_gfx906 %t/
// RUN: cp %S/Inputs/nvptx-arch/nvptx_arch_fail %t/
// RUN: cp %S/Inputs/nvptx-arch/nvptx_arch_sm_70 %t/
// RUN: echo '#!/bin/sh' > %t/amdgpu_arch_empty
// RUN: chmod +x %t/amdgpu_arch_fail
// RUN: chmod +x %t/amdgpu_arch_gfx906
// RUN: chmod +x %t/amdgpu_arch_empty
// RUN: echo '#!/bin/sh' > %t/nvptx_arch_empty
// RUN: chmod +x %t/nvptx_arch_fail
// RUN: chmod +x %t/nvptx_arch_sm_70
// RUN: chmod +x %t/nvptx_arch_empty

// case when nvptx-arch and amdgpu-arch return nothing or fails
// RUN: %clang -### --target=x86_64-unknown-linux-gnu -nogpulib -fopenmp --offload-arch=native \
// RUN: --nvptx-arch-tool=%t/nvptx_arch_fail --amdgpu-arch-tool=%t/amdgpu_arch_fail %s 2>&1 \
// RUN: | FileCheck %s --check-prefix=NO-OUTPUT-ERROR
// RUN: %clang -### --target=x86_64-unknown-linux-gnu -nogpulib -fopenmp --offload-arch=native \
// RUN: --nvptx-arch-tool=%t/nvptx_arch_empty --amdgpu-arch-tool=%t/amdgpu_arch_empty %s 2>&1 \
// RUN: | FileCheck %s --check-prefix=NO-OUTPUT-ERROR
// RUN: %clang -### --target=x86_64-unknown-linux-gnu -nogpulib -fopenmp --offload-arch= \
// RUN: --nvptx-arch-tool=%t/nvptx_arch_fail --amdgpu-arch-tool=%t/amdgpu_arch_fail %s 2>&1 \
// RUN: | FileCheck %s --check-prefix=NO-OUTPUT-ERROR
// RUN: %clang -### --target=x86_64-unknown-linux-gnu -nogpulib -fopenmp --offload-arch= \
// RUN: --nvptx-arch-tool=%t/nvptx_arch_empty --amdgpu-arch-tool=%t/amdgpu_arch_empty %s 2>&1 \
// RUN: | FileCheck %s --check-prefix=NO-OUTPUT-ERROR
// NO-OUTPUT-ERROR: error: failed to deduce triple for target architecture 'native'; specify the triple using '-fopenmp-targets' and '-Xopenmp-target' instead.

// case when amdgpu-arch succeeds.
// RUN: %clang -### --target=x86_64-unknown-linux-gnu -nogpulib -fopenmp --offload-arch=native \
// RUN: --nvptx-arch-tool=%t/nvptx_arch_fail --amdgpu-arch-tool=%t/amdgpu_arch_gfx906 %s 2>&1 \
// RUN: | FileCheck %s --check-prefix=ARCH-GFX906
// ARCH-GFX906: "-cc1" "-triple" "amdgcn-amd-amdhsa"{{.*}}"-target-cpu" "gfx906"

// case when nvptx-arch succeeds.
// RUN: %clang -### --target=x86_64-unknown-linux-gnu -nogpulib -fopenmp --offload-arch=native \
// RUN: --nvptx-arch-tool=%t/nvptx_arch_sm_70 --amdgpu-arch-tool=%t/amdgpu_arch_fail %s 2>&1 \
// RUN: | FileCheck %s --check-prefix=ARCH-SM_70
// ARCH-SM_70: "-cc1" "-triple" "nvptx64-nvidia-cuda"{{.*}}"-target-cpu" "sm_70"

// case when both nvptx-arch and amdgpu-arch succeed.
// RUN: %clang -### --target=x86_64-unknown-linux-gnu -nogpulib -fopenmp --offload-arch=native \
// RUN: --nvptx-arch-tool=%t/nvptx_arch_sm_70 --amdgpu-arch-tool=%t/amdgpu_arch_gfx906 %s 2>&1 \
// RUN: | FileCheck %s --check-prefix=ARCH-SM_70-GFX906
// ARCH-SM_70-GFX906: "-cc1" "-triple" "amdgcn-amd-amdhsa"{{.*}}"-target-cpu" "gfx906"
// ARCH-SM_70-GFX906: "-cc1" "-triple" "nvptx64-nvidia-cuda"{{.*}}"-target-cpu" "sm_70"

// case when both nvptx-arch and amdgpu-arch succeed with other archs.
// RUN: %clang -### --target=x86_64-unknown-linux-gnu -nogpulib -fopenmp --offload-arch=native,sm_75,gfx1030 \
// RUN: --nvptx-arch-tool=%t/nvptx_arch_sm_70 --amdgpu-arch-tool=%t/amdgpu_arch_gfx906 %s 2>&1 \
// RUN: | FileCheck %s --check-prefix=ARCH-MULTIPLE
// ARCH-MULTIPLE: "-cc1" "-triple" "amdgcn-amd-amdhsa"{{.*}}"-target-cpu" "gfx1030"
// ARCH-MULTIPLE: "-cc1" "-triple" "amdgcn-amd-amdhsa"{{.*}}"-target-cpu" "gfx906"
// ARCH-MULTIPLE: "-cc1" "-triple" "nvptx64-nvidia-cuda"{{.*}}"-target-cpu" "sm_70"
// ARCH-MULTIPLE: "-cc1" "-triple" "nvptx64-nvidia-cuda"{{.*}}"-target-cpu" "sm_75"
