This is an archive of the discontinued LLVM Phabricator instance.

Differential D21845

[Driver][OpenMP] Add specialized action builder for OpenMP offloading actions.
Closed, Public

Authored by sfantao on Jun 29 2016, 10:34 AM.

Details

Reviewers

tra
ABataev
echristo
jlebar
hfinkel
rsmith

Commits

rG28c4f18bfecd: [Driver][OpenMP] Add specialized action builder for OpenMP offloading actions.
rC285314: [Driver][OpenMP] Add specialized action builder for OpenMP offloading actions.
rL285314: [Driver][OpenMP] Add specialized action builder for OpenMP offloading actions.

Summary

This patch adds a new specialized action builder to create OpenMP offloading actions. The specialized builder is added to the action builder already containing the CUDA specialized builder.

OpenMP offloading dependences between host and device actions (expressed with OffloadActions) are different from what is used for CUDA:

Device compile action depends on the host compile action - the device frontend extracts the information about the declarations that have to be emitted by looking into the metadata produced by the host frontend.
The host link action depends on the device link actions - the device images are embedded in the host binary at link time.
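
The two dependences above can be sketched with a small, self-contained C++ model. This is only an illustration: `SketchAction`, `SketchGraph` and `buildOpenMPSketch` are hypothetical names that are not part of clang, and the real driver expresses these edges with OffloadAction.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for a driver action (not clang's Action).
struct SketchAction {
  std::string Kind;                // e.g. "compiler", "linker", "offload"
  std::string Side;                // "host", "device", or "" for offload nodes
  std::vector<std::size_t> Inputs; // indices of the actions this one depends on
};

// Hypothetical container that records the actions in creation order.
struct SketchGraph {
  std::vector<SketchAction> Actions;
  std::size_t HostCompile = 0, HostLink = 0, TopLevel = 0;
  std::vector<std::size_t> DeviceLinks; // one link action per device toolchain

  std::size_t add(std::string Kind, std::string Side,
                  std::vector<std::size_t> Inputs) {
    Actions.push_back({std::move(Kind), std::move(Side), std::move(Inputs)});
    return Actions.size() - 1;
  }
};

// Build the host pipeline, then one device pipeline per device toolchain.
SketchGraph buildOpenMPSketch(std::size_t NumDeviceToolChains) {
  assert(NumDeviceToolChains > 0 && "Expected at least one device toolchain.");
  SketchGraph G;

  // Host: input -> preprocessor -> compiler -> backend -> assembler -> linker.
  std::size_t In = G.add("input", "host", {});
  std::size_t PP = G.add("preprocessor", "host", {In});
  G.HostCompile = G.add("compiler", "host", {PP});
  std::size_t BE = G.add("backend", "host", {G.HostCompile});
  std::size_t AS = G.add("assembler", "host", {BE});
  G.HostLink = G.add("linker", "host", {AS});

  for (std::size_t I = 0; I < NumDeviceToolChains; ++I) {
    std::size_t DIn = G.add("input", "device", {});
    std::size_t DPP = G.add("preprocessor", "device", {DIn});
    std::size_t DCompile = G.add("compiler", "device", {DPP});
    // Dependence 1: the device compile result is joined with the host compile
    // result, so the rest of the device pipeline depends on both.
    std::size_t Off = G.add("offload", "", {G.HostCompile, DCompile});
    std::size_t DBE = G.add("backend", "device", {Off});
    std::size_t DAS = G.add("assembler", "device", {DBE});
    G.DeviceLinks.push_back(G.add("linker", "device", {DAS}));
  }

  // Dependence 2: the host link result depends on every device link result,
  // because the device images are embedded in the host binary.
  std::vector<std::size_t> TopInputs{G.HostLink};
  TopInputs.insert(TopInputs.end(), G.DeviceLinks.begin(), G.DeviceLinks.end());
  G.TopLevel = G.add("offload", "", TopInputs);
  return G;
}
```

With a single device toolchain, `buildOpenMPSketch(1)` produces 14 nodes: node 9 is the offload node joining the host compile (2) with the device compile (8), and node 13 joins the host link (5) with the device link (12).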

Diff Detail

Build Status

Buildable 767
Build 767: arc lint + arc unit

Event Timeline

sfantao updated this revision to Diff 62238. (Jun 29 2016, 10:34 AM)

sfantao retitled this revision to [Driver][OpenMP] Add specialized action builder for OpenMP offloading actions.

sfantao updated this object.

sfantao added reviewers: echristo, tra, jlebar, hfinkel, ABataev, rsmith.

sfantao added subscribers: caomhin, carlo.bertolli, arpith-jacob and 3 others.

Herald added a subscriber: mehdi_amini. (Jun 29 2016, 10:34 AM)

sfantao added a parent revision: D21843: [Driver][OpenMP] Create tool chains for OpenMP offloading kind. (Jun 29 2016, 10:34 AM)

sfantao added a child revision: D21847: [Driver][OpenMP] Build jobs for OpenMP offloading actions for targets using gcc tool chains. (Jun 29 2016, 10:54 AM)

ABataev added inline comments. (Jun 29 2016, 9:23 PM)

lib/Driver/Driver.cpp
1834	'final'

Mark class as final and remove \brief from comments.

Hi Alexey,

Thanks for the review! Addressed your comment in the new diff. Also removed \brief from the comments.

Hi, Alexy. Would you mind not asking for 'final' in additional reviews until we've resolved this thread elsewhere? Feel free to find me on IRC if you want to talk about it synchronously.

Thanks!

Rebase

Rebase.

whchung added a subscriber: whchung. (Jul 18 2016, 8:44 AM)

mkuron added a subscriber: mkuron. (Jul 26 2016, 5:27 AM)

Rebase.

Rebase.

hfinkel added inline comments. (Sep 28 2016, 12:02 PM)

lib/Driver/Driver.cpp
1846	Depences - Spelling?
1864	to a -> as a
1889	as dependence -> as a dependence (or as the dependence)
1890	declaration -> declarations
1891	have prevent -> prevent
1928	related with -> related to
1960	Since we can have both OpenMP offloading and CUDA, please add a test that the phases work correctly for that case (or that we produce an error if that can't currently work correctly).

Fix typos and add test that checks phases when OpenMP and CUDA are used simultaneously.

Hi Hal,

Thanks for the review! Fixed the typos in the new diff.

lib/Driver/Driver.cpp
1960	Added new test for that. The phase generation should work when CUDA and OpenMP offloading are used on the same file. However, the bindings for these phases cannot be generated yet, because the NVPTX toolchain support for OpenMP is not implemented and the CUDA implementation interprets actions differently: in CUDA, linking combines the binaries of different devices (GPUs), whereas for OpenMP actual linking takes place, i.e. symbols are resolved by looking into other compilation units.

LGTM

lib/Driver/Driver.cpp
1960	Okay; after this is committed, please file a PR showing what happens and explaining the issue.

This revision is now accepted and ready to land. (Oct 26 2016, 3:18 PM)

sfantao closed this revision. (Oct 27 2016, 10:17 AM)

A PR was generated as requested by Hal explaining why we do not generate jobs for NVPTX targets yet.

https://llvm.org/bugs/show_bug.cgi?id=30812

I think OffloadAction::DeviceDependences::add(..., ..., /*BoundArch=*/nullptr, Action::OFK_OpenMP) is never sufficient. The invalid BoundArch eventually ends up in NVPTX::Assembler::ConstructJob and triggers an assert; I don't think there is any code path with OpenMP offloading where the GPU architecture is set correctly. If I compile a simple test file with

clang -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda -c example.c -march=sm_30

the error message is the following:

clang: /llvm/tools/clang/lib/Driver/Tools.cpp:11960: virtual void clang::driver::tools::NVPTX::Assembler::ConstructJob(clang::driver::Compilation&, const clang::driver::JobAction&, const clang::driver::InputInfo&, const InputInfoList&, const llvm::opt::ArgList&, const char*) const: Assertion `gpu_arch != CudaArch::UNKNOWN && "Device action expected to have an architecture."' failed.

On a related but different note, leaving out -march=sm_30 in the clang call above causes an earlier assert to trigger:

clang: /llvm/tools/clang/lib/Driver/ToolChains.cpp:5049: virtual void clang::driver::toolchains::CudaToolChain::addClangTargetOptions(const llvm::opt::ArgList&, llvm::opt::ArgStringList&) const: Assertion `!GpuArch.empty() && "Must have an explicit GPU arch."' failed.

The more appropriate flag would probably be --cuda-gpu-arch=sm_30, but that is not recognized.

I thought I'd just report this here as it seemed to me that with the merge of all of @sfantao's code yesterday the OpenMP offloading support should mostly work. If this is not the case or I should report the issue elsewhere, please let me know. Also, I'm not sure if/how this relates to the bug report you mentioned.

Hi Michael,

These patches do not implement any specific support for GPUs. Only toolchains based on gcc are expected to work. GPUs will require some extra work on the toolchain, which is in progress.

In any case, it is not nice to have these assertions when trying an unsupported toolchain. I'll work on a diagnostic so that the driver stops before attempting to create jobs for unsupported toolchains.

Thanks for reporting this!

Revision Contents

Path	Size
lib/Driver/Driver.cpp	129 lines
test/Driver/openmp-offload.c	138 lines

Commit	Tree	Parents	Author	Summary	Date
b5902d3e0ffc	ca3903c2c6e1	ca6a23827c16	Samuel Antao	Fix typos and add test that checks phases when OpenMP and CUDA are used… (Show More…)	Oct 25 2016, 9:41 AM
ca6a23827c16	5ae55abc54c1	bdda3e397d4c 9de7a800b30c	Samuel Antao	Merge branch 'CCCpatch-D21843-depends-on-patch-D21840' into DDDpatch-D21845… (Show More…)	Oct 25 2016, 7:53 AM
bdda3e397d4c	cf2643b70f1e	4b982aca5a27 67b806f446c8	Samuel Antao	Merge branch 'CCCpatch-D21843-depends-on-patch-D21840' into DDDpatch-D21845… (Show More…)	Oct 24 2016, 7:34 AM
4b982aca5a27	b1d2d65bdcb9	81e5e4393deb 7a8de78e3b47	Samuel Antao	Merge branch 'CCCpatch-D21843-depends-on-patch-D21840' into DDDpatch-D21845… (Show More…)	Oct 20 2016, 5:26 AM
81e5e4393deb	77532e4ce599	2c31304c787a dbe6391a5e9f	Samuel Antao	Merge branch 'CCCpatch-D21843-depends-on-patch-D21840' into DDDpatch-D21845… (Show More…)	Sep 30 2016, 8:51 AM
2c31304c787a	3f931457db38	39813f76bd5f fca6a77ad568	Samuel Antao	Merge branch 'CCCpatch-D21843-depends-on-patch-D21840' into DDDpatch-D21845… (Show More…)	Sep 21 2016, 3:21 PM
39813f76bd5f	c7fd2a10d26b	449e4fe980ed 472c1bd7a981	Samuel Antao	Merge branch 'CCCpatch-D21843-depends-on-patch-D21840' into DDDpatch-D21845… (Show More…)	Sep 19 2016, 11:22 AM
449e4fe980ed	e964308309d2	a92fc3c7c3cb 1ae757aa48ef	Samuel Antao	Merge branch 'CCCpatch-D21843-depends-on-patch-D21840' into DDDpatch-D21845… (Show More…)	Aug 28 2016, 8:04 AM
a92fc3c7c3cb	8afa0603342d	6b376ee5108f 405d8dac7110	Samuel Antao	Merge branch 'CCCpatch-D21843-depends-on-patch-D21840' into DDDpatch-D21845… (Show More…)	Aug 26 2016, 8:36 AM
6b376ee5108f	6d48863107da	b5759851fd5a 1b6cf2caf185	Samuel Antao	Merge branch 'CCCpatch-D21843-depends-on-patch-D21840' into DDDpatch-D21845… (Show More…)	Aug 25 2016, 5:06 PM
b5759851fd5a	4bfcf3e39697	882f15cceafd 7e2e006d1ebb	Samuel Antao	Merge branch 'CCCpatch-D21843-depends-on-patch-D21840' into DDDpatch-D21845… (Show More…)	Aug 24 2016, 9:04 AM
882f15cceafd	d94369a936b3	3560ceb1cfb5 accdb1e0c2eb	Samuel Antao	Merge branch 'CCCpatch-D21843-depends-on-patch-D21840' into DDDpatch-D21845… (Show More…)	Aug 24 2016, 7:39 AM
3560ceb1cfb5	52a74903aa2b	45acf0f75c80 b25a5acc5b42	Samuel Antao	Merge branch 'CCCpatch-D21843-depends-on-patch-D21840' into DDDpatch-D21845… (Show More…)	Aug 17 2016, 6:25 PM
45acf0f75c80	165bcdd3c9d6	9ba2caff7358 d2cd76e0052d	Samuel Antao	Merge branch 'CCCpatch-D21843-depends-on-patch-D21840' into DDDpatch-D21845… (Show More…)	Aug 16 2016, 8:00 AM
9ba2caff7358	7141624b07fe	ce8c876ed124 38a270e0f07f	Samuel Antao	Merge branch 'CCCpatch-D21843-depends-on-patch-D21840' into DDDpatch-D21845… (Show More…)	Aug 15 2016, 1:24 PM
ce8c876ed124	889d5ca82555	796ec0272bac c4a0ba0d835c	Samuel Antao	Merge branch 'CCCpatch-D21843-depends-on-patch-D21840' into DDDpatch-D21845… (Show More…)	Aug 11 2016, 10:38 AM
796ec0272bac	f230b79d6c4f	4f7aebafa478 6135ecfbd32c	Samuel Antao	Merge branch 'CCCpatch-D21843-depends-on-patch-D21840' into DDDpatch-D21845… (Show More…)	Jul 29 2016, 3:46 PM
4f7aebafa478	36a612ab9af5	149fa6659393 3901d595ca0e	Samuel Antao	Merge branch 'CCCpatch-D21843-depends-on-patch-D21840' into DDDpatch-D21845… (Show More…)	Jul 28 2016, 4:28 PM
149fa6659393	cd59eb7f56e7	7f9ac7bd2904 c9f4cb859530	Samuel Antao	Merge branch 'CCCpatch-D21843-depends-on-patch-D21840' into DDDpatch-D21845… (Show More…)	Jul 28 2016, 9:58 AM
7f9ac7bd2904	cedb2091c430	2f7c05bfed68 8fcb47483e8e	Samuel Antao	Merge branch 'CCCpatch-D21843-depends-on-patch-D21840' into DDDpatch-D21845… (Show More…)	Jul 11 2016, 5:20 PM
2f7c05bfed68	d539ff5e45a1	edefc4acb6f7 4fbbea9f7414	Samuel Antao	Merge branch 'CCCpatch-D21843-depends-on-patch-D21840' into DDDpatch-D21845… (Show More…)	Jul 1 2016, 5:05 PM
edefc4acb6f7	5f6d856e1227	0737d07ed94e	Samuel Antao	Mark class as final and remove \brief from comments.	Jul 1 2016, 12:05 PM
0737d07ed94e	dff031f79a59	863c9da55b90 b595c68b27c3	Samuel Antao	Merge branch 'CCCpatch-D21843-depends-on-patch-D21840' into DDDpatch-D21845… (Show More…)	Jul 1 2016, 11:57 AM
863c9da55b90	092103f47c0c	980c0b487099	Samuel Antao	Fix comment.	Jun 29 2016, 10:21 AM
980c0b487099	b199356d6bdd	0070b847ddc2 790418655422	Samuel Antao	Merge branch 'CCC-Create-OpenMP-Toolchains-depends-on-AAA-BBB' into DDD-Create… (Show More…)	Jun 29 2016, 8:18 AM
0070b847ddc2	8c5329b75e7a	14e39868b8f5	Samuel Antao	Add OpenMP action builder.	Jun 24 2016, 10:22 PM

Diff 75722

lib/Driver/Driver.cpp

Show First 20 Lines • Show All 1,541 Lines • ▼ Show 20 Lines	DeviceActionBuilder(Compilation &C, DerivedArgList &Args,
: C(C), Args(Args), Inputs(Inputs),		: C(C), Args(Args), Inputs(Inputs),
AssociatedOffloadKind(AssociatedOffloadKind) {}		AssociatedOffloadKind(AssociatedOffloadKind) {}
virtual ~DeviceActionBuilder() {}		virtual ~DeviceActionBuilder() {}

/// Fill up the array \a DA with all the device dependences that should be		/// Fill up the array \a DA with all the device dependences that should be
/// added to the provided host action \a HostAction. By default it is		/// added to the provided host action \a HostAction. By default it is
/// inactive.		/// inactive.
virtual ActionBuilderReturnCode		virtual ActionBuilderReturnCode
getDeviceDepences(OffloadAction::DeviceDependences &DA, phases::ID CurPhase,		getDeviceDependences(OffloadAction::DeviceDependences &DA,
phases::ID FinalPhase, PhasesTy &Phases) {		phases::ID CurPhase, phases::ID FinalPhase,
		PhasesTy &Phases) {
return ABRT_Inactive;		return ABRT_Inactive;
}		}

/// Update the state to include the provided host action \a HostAction as a		/// Update the state to include the provided host action \a HostAction as a
/// dependency of the current device action. By default it is inactive.		/// dependency of the current device action. By default it is inactive.
virtual ActionBuilderReturnCode addDeviceDepences(Action *HostAction) {		virtual ActionBuilderReturnCode addDeviceDepences(Action *HostAction) {
return ABRT_Inactive;		return ABRT_Inactive;
}		}
▲ Show 20 Lines • Show All 41 Lines • ▼ Show 20 Lines	class CudaActionBuilder final : public DeviceActionBuilder {
bool IsActive = false;		bool IsActive = false;

public:		public:
CudaActionBuilder(Compilation &C, DerivedArgList &Args,		CudaActionBuilder(Compilation &C, DerivedArgList &Args,
const Driver::InputList &Inputs)		const Driver::InputList &Inputs)
: DeviceActionBuilder(C, Args, Inputs, Action::OFK_Cuda) {}		: DeviceActionBuilder(C, Args, Inputs, Action::OFK_Cuda) {}

ActionBuilderReturnCode		ActionBuilderReturnCode
getDeviceDepences(OffloadAction::DeviceDependences &DA, phases::ID CurPhase,		getDeviceDependences(OffloadAction::DeviceDependences &DA,
phases::ID FinalPhase, PhasesTy &Phases) override {		phases::ID CurPhase, phases::ID FinalPhase,
		PhasesTy &Phases) override {
if (!IsActive)		if (!IsActive)
return ABRT_Inactive;		return ABRT_Inactive;

// If we don't have more CUDA actions, we don't have any dependences to		// If we don't have more CUDA actions, we don't have any dependences to
// create for the host.		// create for the host.
if (CudaDeviceActions.empty())		if (CudaDeviceActions.empty())
return ABRT_Success;		return ABRT_Success;

▲ Show 20 Lines • Show All 202 Lines • ▼ Show 20 Lines	bool initialize() override {
// sm_20 code should work correctly, if suboptimally, on all newer GPUs.		// sm_20 code should work correctly, if suboptimally, on all newer GPUs.
if (GpuArchList.empty())		if (GpuArchList.empty())
GpuArchList.push_back(CudaArch::SM_20);		GpuArchList.push_back(CudaArch::SM_20);

return Error;		return Error;
}		}
};		};

/// Add the implementation for other specialized builders here.		/// OpenMP action builder. The host bitcode is passed to the device frontend
		/// and all the device linked images are passed to the host link phase.
		class OpenMPActionBuilder final : public DeviceActionBuilder {
		/// The OpenMP actions for the current input.
		ActionList OpenMPDeviceActions;

		/// The linker inputs obtained for each toolchain.
		SmallVector<ActionList, 8> DeviceLinkerInputs;

		public:
		OpenMPActionBuilder(Compilation &C, DerivedArgList &Args,
		const Driver::InputList &Inputs)
		: DeviceActionBuilder(C, Args, Inputs, Action::OFK_OpenMP) {}

		ActionBuilderReturnCode
		getDeviceDependences(OffloadAction::DeviceDependences &DA,
		phases::ID CurPhase, phases::ID FinalPhase,
		PhasesTy &Phases) override {

		// We should always have an action for each input.
		assert(OpenMPDeviceActions.size() == ToolChains.size() &&
		"Number of OpenMP actions and toolchains do not match.");

		// The host only depends on device action in the linking phase, when all
		// the device images have to be embedded in the host image.
		if (CurPhase == phases::Link) {
		assert(ToolChains.size() == DeviceLinkerInputs.size() &&
		"Toolchains and linker inputs sizes do not match.");
		auto LI = DeviceLinkerInputs.begin();
		for (auto *A : OpenMPDeviceActions) {
		LI->push_back(A);
		++LI;
		}

		// We passed the device action as a host dependence, so we don't need to
		// do anything else with them.
		OpenMPDeviceActions.clear();
		return ABRT_Success;
		}

		// By default, we produce an action for each device arch.
		for (Action *&A : OpenMPDeviceActions)
		A = C.getDriver().ConstructPhaseAction(C, Args, CurPhase, A);

		return ABRT_Success;
		}

		ActionBuilderReturnCode addDeviceDepences(Action *HostAction) override {

		// If this is an input action replicate it for each OpenMP toolchain.
		if (auto *IA = dyn_cast<InputAction>(HostAction)) {
		OpenMPDeviceActions.clear();
		for (unsigned I = 0; I < ToolChains.size(); ++I)
		OpenMPDeviceActions.push_back(
		C.MakeAction<InputAction>(IA->getInputArg(), IA->getType()));
		return ABRT_Success;
		}

		// When generating code for OpenMP we use the host compile phase result as
		// a dependence to the device compile phase so that it can learn what
		// declarations should be emitted. However, this is not the only use for
		// the host action, so we prevent it from being collapsed.
		if (isa<CompileJobAction>(HostAction)) {
		HostAction->setCannotBeCollapsedWithNextDependentAction();
		assert(ToolChains.size() == OpenMPDeviceActions.size() &&
		"Toolchains and device action sizes do not match.");
		OffloadAction::HostDependence HDep(
		*HostAction, *C.getSingleOffloadToolChain<Action::OFK_Host>(),
		/*BoundArch=*/nullptr, Action::OFK_OpenMP);
		auto TC = ToolChains.begin();
		for (Action *&A : OpenMPDeviceActions) {
		assert(isa<CompileJobAction>(A));
		OffloadAction::DeviceDependences DDep;
		DDep.add(*A, **TC, /*BoundArch=*/nullptr, Action::OFK_OpenMP);
		A = C.MakeAction<OffloadAction>(HDep, DDep);
		++TC;
		}
		}
		return ABRT_Success;
		}

		void appendLinkDependences(OffloadAction::DeviceDependences &DA) override {
		assert(ToolChains.size() == DeviceLinkerInputs.size() &&
		"Toolchains and linker inputs sizes do not match.");

		// Append a new link action for each device.
		auto TC = ToolChains.begin();
		for (auto &LI : DeviceLinkerInputs) {
		auto *DeviceLinkAction =
		C.MakeAction<LinkJobAction>(LI, types::TY_Image);
		DA.add(*DeviceLinkAction, **TC, /*BoundArch=*/nullptr,
		Action::OFK_OpenMP);
		++TC;
		}
		}

		bool initialize() override {
		// Get the OpenMP toolchains. If we don't get any, the action builder will
		// know there is nothing to do related to OpenMP offloading.
		auto OpenMPTCRange = C.getOffloadToolChains<Action::OFK_OpenMP>();
		for (auto TI = OpenMPTCRange.first, TE = OpenMPTCRange.second; TI != TE;
		++TI)
		ToolChains.push_back(TI->second);

		DeviceLinkerInputs.resize(ToolChains.size());
		return false;
		}
		};

		///
		/// TODO: Add the implementation for other specialized builders here.
		///

/// Specialized builders being used by this offloading action builder.		/// Specialized builders being used by this offloading action builder.
SmallVector<DeviceActionBuilder *, 4> SpecializedBuilders;		SmallVector<DeviceActionBuilder *, 4> SpecializedBuilders;

public:		public:
OffloadingActionBuilder(Compilation &C, DerivedArgList &Args,		OffloadingActionBuilder(Compilation &C, DerivedArgList &Args,
const Driver::InputList &Inputs)		const Driver::InputList &Inputs)
: C(C), Args(Args) {		: C(C), Args(Args) {
// Create a specialized builder for each device toolchain.		// Create a specialized builder for each device toolchain.

IsValid = true;		IsValid = true;

// Create a specialized builder for CUDA.		// Create a specialized builder for CUDA.
SpecializedBuilders.push_back(new CudaActionBuilder(C, Args, Inputs));		SpecializedBuilders.push_back(new CudaActionBuilder(C, Args, Inputs));

		// Create a specialized builder for OpenMP.
		SpecializedBuilders.push_back(new OpenMPActionBuilder(C, Args, Inputs));

//		//
// TODO: Build other specialized builders here.		// TODO: Build other specialized builders here.
//		//

// Initialize all the builders, keeping track of errors.		// Initialize all the builders, keeping track of errors.
for (auto *SB : SpecializedBuilders)		for (auto *SB : SpecializedBuilders)
IsValid = IsValid && !SB->initialize();		IsValid = IsValid && !SB->initialize();
}		}
Show All 26 Lines	addDeviceDependencesToHostAction(Action *HostAction, const Arg *InputArg,
unsigned InactiveBuilders = 0u;		unsigned InactiveBuilders = 0u;
unsigned IgnoringBuilders = 0u;		unsigned IgnoringBuilders = 0u;
for (auto *SB : SpecializedBuilders) {		for (auto *SB : SpecializedBuilders) {
if (!SB->isValid()) {		if (!SB->isValid()) {
++InactiveBuilders;		++InactiveBuilders;
continue;		continue;
}		}

auto RetCode = SB->getDeviceDepences(DDeps, CurPhase, FinalPhase, Phases);		auto RetCode =
		SB->getDeviceDependences(DDeps, CurPhase, FinalPhase, Phases);

// If the builder explicitly says the host action should be ignored,		// If the builder explicitly says the host action should be ignored,
// we need to increment the variable that tracks the builders that request		// we need to increment the variable that tracks the builders that request
// the host object to be ignored.		// the host object to be ignored.
if (RetCode == DeviceActionBuilder::ABRT_Ignore_Host)		if (RetCode == DeviceActionBuilder::ABRT_Ignore_Host)
++IgnoringBuilders;		++IgnoringBuilders;

// Unless the builder was inactive for this action, we have to record the		// Unless the builder was inactive for this action, we have to record the
▲ Show 20 Lines • Show All 1,588 Lines • Show Last 20 Lines

test/Driver/openmp-offload.c

	///			///
	/// Perform several driver tests for OpenMP offloading			/// Perform several driver tests for OpenMP offloading
	///			///

				// REQUIRES: clang-driver
				// REQUIRES: x86-registered-target
				// REQUIRES: powerpc-registered-target
				// REQUIRES: nvptx-registered-target

	/// ###########################################################################			/// ###########################################################################

	/// Check whether an invalid OpenMP target is specified:			/// Check whether an invalid OpenMP target is specified:
	// RUN: %clang -### -fopenmp=libomp -fopenmp-targets=aaa-bbb-ccc-ddd %s 2>&1 \			// RUN: %clang -### -fopenmp=libomp -fopenmp-targets=aaa-bbb-ccc-ddd %s 2>&1 \
	// RUN:   | FileCheck -check-prefix=CHK-INVALID-TARGET %s			// RUN:   | FileCheck -check-prefix=CHK-INVALID-TARGET %s
	// RUN: %clang -### -fopenmp -fopenmp-targets=aaa-bbb-ccc-ddd %s 2>&1 \			// RUN: %clang -### -fopenmp -fopenmp-targets=aaa-bbb-ccc-ddd %s 2>&1 \
	// RUN:   | FileCheck -check-prefix=CHK-INVALID-TARGET %s			// RUN:   | FileCheck -check-prefix=CHK-INVALID-TARGET %s
	// CHK-INVALID-TARGET: error: OpenMP target is invalid: 'aaa-bbb-ccc-ddd'			// CHK-INVALID-TARGET: error: OpenMP target is invalid: 'aaa-bbb-ccc-ddd'
	Show All 17 Lines
	// CHK-NO-FOPENMP: error: The option -fopenmp-targets must be used in conjunction with a -fopenmp option compatible with offloading, please use -fopenmp=libomp or -fopenmp=libiomp5.			// CHK-NO-FOPENMP: error: The option -fopenmp-targets must be used in conjunction with a -fopenmp option compatible with offloading, please use -fopenmp=libomp or -fopenmp=libiomp5.

	/// ###########################################################################			/// ###########################################################################

	/// Check warning for duplicate offloading targets.			/// Check warning for duplicate offloading targets.
	// RUN: %clang -### -ccc-print-phases -fopenmp -fopenmp-targets=powerpc64le-ibm-linux-gnu,powerpc64le-ibm-linux-gnu %s 2>&1 \			// RUN: %clang -### -ccc-print-phases -fopenmp -fopenmp-targets=powerpc64le-ibm-linux-gnu,powerpc64le-ibm-linux-gnu %s 2>&1 \
	// RUN:   | FileCheck -check-prefix=CHK-DUPLICATES %s			// RUN:   | FileCheck -check-prefix=CHK-DUPLICATES %s
	// CHK-DUPLICATES: warning: The OpenMP offloading target 'powerpc64le-ibm-linux-gnu' is similar to target 'powerpc64le-ibm-linux-gnu' already specified - will be ignored.			// CHK-DUPLICATES: warning: The OpenMP offloading target 'powerpc64le-ibm-linux-gnu' is similar to target 'powerpc64le-ibm-linux-gnu' already specified - will be ignored.

				/// ###########################################################################

				/// Check the phases graph when using a single target, different from the host.
				/// We should have an offload action joining the host compile and device
				/// preprocessor and another one joining the device linking outputs to the host
				/// action.
				// RUN: %clang -ccc-print-phases -fopenmp -target powerpc64le-ibm-linux-gnu -fopenmp-targets=x86_64-pc-linux-gnu %s 2>&1 \
				// RUN:   | FileCheck -check-prefix=CHK-PHASES %s
				// CHK-PHASES: 0: input, "[[INPUT:.+\.c]]", c, (host-openmp)
				// CHK-PHASES: 1: preprocessor, {0}, cpp-output, (host-openmp)
				// CHK-PHASES: 2: compiler, {1}, ir, (host-openmp)
				// CHK-PHASES: 3: backend, {2}, assembler, (host-openmp)
				// CHK-PHASES: 4: assembler, {3}, object, (host-openmp)
				// CHK-PHASES: 5: linker, {4}, image, (host-openmp)
				// CHK-PHASES: 6: input, "[[INPUT]]", c, (device-openmp)
				// CHK-PHASES: 7: preprocessor, {6}, cpp-output, (device-openmp)
				// CHK-PHASES: 8: compiler, {7}, ir, (device-openmp)
				// CHK-PHASES: 9: offload, "host-openmp (powerpc64le-ibm-linux-gnu)" {2}, "device-openmp (x86_64-pc-linux-gnu)" {8}, ir
				// CHK-PHASES: 10: backend, {9}, assembler, (device-openmp)
				// CHK-PHASES: 11: assembler, {10}, object, (device-openmp)
				// CHK-PHASES: 12: linker, {11}, image, (device-openmp)
				// CHK-PHASES: 13: offload, "host-openmp (powerpc64le-ibm-linux-gnu)" {5}, "device-openmp (x86_64-pc-linux-gnu)" {12}, image

				/// ###########################################################################

				/// Check the phases when using multiple targets. Here we also add a library to
				/// make sure it is treated as input by the device.
				// RUN: %clang -ccc-print-phases -lsomelib -fopenmp -target powerpc64-ibm-linux-gnu -fopenmp-targets=x86_64-pc-linux-gnu,powerpc64-ibm-linux-gnu %s 2>&1 \
				// RUN:   | FileCheck -check-prefix=CHK-PHASES-LIB %s
				// CHK-PHASES-LIB: 0: input, "somelib", object, (host-openmp)
				// CHK-PHASES-LIB: 1: input, "[[INPUT:.+\.c]]", c, (host-openmp)
				// CHK-PHASES-LIB: 2: preprocessor, {1}, cpp-output, (host-openmp)
				// CHK-PHASES-LIB: 3: compiler, {2}, ir, (host-openmp)
				// CHK-PHASES-LIB: 4: backend, {3}, assembler, (host-openmp)
				// CHK-PHASES-LIB: 5: assembler, {4}, object, (host-openmp)
				// CHK-PHASES-LIB: 6: linker, {0, 5}, image, (host-openmp)
				// CHK-PHASES-LIB: 7: input, "somelib", object, (device-openmp)
				// CHK-PHASES-LIB: 8: input, "[[INPUT]]", c, (device-openmp)
				// CHK-PHASES-LIB: 9: preprocessor, {8}, cpp-output, (device-openmp)
				// CHK-PHASES-LIB: 10: compiler, {9}, ir, (device-openmp)
				// CHK-PHASES-LIB: 11: offload, "host-openmp (powerpc64-ibm-linux-gnu)" {3}, "device-openmp (x86_64-pc-linux-gnu)" {10}, ir
				// CHK-PHASES-LIB: 12: backend, {11}, assembler, (device-openmp)
				// CHK-PHASES-LIB: 13: assembler, {12}, object, (device-openmp)
				// CHK-PHASES-LIB: 14: linker, {7, 13}, image, (device-openmp)
				// CHK-PHASES-LIB: 15: input, "somelib", object, (device-openmp)
				// CHK-PHASES-LIB: 16: input, "[[INPUT]]", c, (device-openmp)
				// CHK-PHASES-LIB: 17: preprocessor, {16}, cpp-output, (device-openmp)
				// CHK-PHASES-LIB: 18: compiler, {17}, ir, (device-openmp)
				// CHK-PHASES-LIB: 19: offload, "host-openmp (powerpc64-ibm-linux-gnu)" {3}, "device-openmp (powerpc64-ibm-linux-gnu)" {18}, ir
				// CHK-PHASES-LIB: 20: backend, {19}, assembler, (device-openmp)
				// CHK-PHASES-LIB: 21: assembler, {20}, object, (device-openmp)
				// CHK-PHASES-LIB: 22: linker, {15, 21}, image, (device-openmp)
				// CHK-PHASES-LIB: 23: offload, "host-openmp (powerpc64-ibm-linux-gnu)" {6}, "device-openmp (x86_64-pc-linux-gnu)" {14}, "device-openmp (powerpc64-ibm-linux-gnu)" {22}, image


				/// ###########################################################################

				/// Check the phases when using multiple targets and multiple source files
				// RUN: echo " " > %t.c
				// RUN: %clang -ccc-print-phases -lsomelib -fopenmp -target powerpc64-ibm-linux-gnu -fopenmp-targets=x86_64-pc-linux-gnu,powerpc64-ibm-linux-gnu %s %t.c 2>&1 \
				// RUN:   | FileCheck -check-prefix=CHK-PHASES-FILES %s
				// CHK-PHASES-FILES: 0: input, "somelib", object, (host-openmp)
				// CHK-PHASES-FILES: 1: input, "[[INPUT1:.+\.c]]", c, (host-openmp)
				// CHK-PHASES-FILES: 2: preprocessor, {1}, cpp-output, (host-openmp)
				// CHK-PHASES-FILES: 3: compiler, {2}, ir, (host-openmp)
				// CHK-PHASES-FILES: 4: backend, {3}, assembler, (host-openmp)
				// CHK-PHASES-FILES: 5: assembler, {4}, object, (host-openmp)
				// CHK-PHASES-FILES: 6: input, "[[INPUT2:.+\.c]]", c, (host-openmp)
				// CHK-PHASES-FILES: 7: preprocessor, {6}, cpp-output, (host-openmp)
				// CHK-PHASES-FILES: 8: compiler, {7}, ir, (host-openmp)
				// CHK-PHASES-FILES: 9: backend, {8}, assembler, (host-openmp)
				// CHK-PHASES-FILES: 10: assembler, {9}, object, (host-openmp)
				// CHK-PHASES-FILES: 11: linker, {0, 5, 10}, image, (host-openmp)
				// CHK-PHASES-FILES: 12: input, "somelib", object, (device-openmp)
				// CHK-PHASES-FILES: 13: input, "[[INPUT1]]", c, (device-openmp)
				// CHK-PHASES-FILES: 14: preprocessor, {13}, cpp-output, (device-openmp)
				// CHK-PHASES-FILES: 15: compiler, {14}, ir, (device-openmp)
				// CHK-PHASES-FILES: 16: offload, "host-openmp (powerpc64-ibm-linux-gnu)" {3}, "device-openmp (x86_64-pc-linux-gnu)" {15}, ir
				// CHK-PHASES-FILES: 17: backend, {16}, assembler, (device-openmp)
				// CHK-PHASES-FILES: 18: assembler, {17}, object, (device-openmp)
				// CHK-PHASES-FILES: 19: input, "[[INPUT2]]", c, (device-openmp)
				// CHK-PHASES-FILES: 20: preprocessor, {19}, cpp-output, (device-openmp)
				// CHK-PHASES-FILES: 21: compiler, {20}, ir, (device-openmp)
				// CHK-PHASES-FILES: 22: offload, "host-openmp (powerpc64-ibm-linux-gnu)" {8}, "device-openmp (x86_64-pc-linux-gnu)" {21}, ir
				// CHK-PHASES-FILES: 23: backend, {22}, assembler, (device-openmp)
				// CHK-PHASES-FILES: 24: assembler, {23}, object, (device-openmp)
				// CHK-PHASES-FILES: 25: linker, {12, 18, 24}, image, (device-openmp)
				// CHK-PHASES-FILES: 26: input, "somelib", object, (device-openmp)
				// CHK-PHASES-FILES: 27: input, "[[INPUT1]]", c, (device-openmp)
				// CHK-PHASES-FILES: 28: preprocessor, {27}, cpp-output, (device-openmp)
				// CHK-PHASES-FILES: 29: compiler, {28}, ir, (device-openmp)
				// CHK-PHASES-FILES: 30: offload, "host-openmp (powerpc64-ibm-linux-gnu)" {3}, "device-openmp (powerpc64-ibm-linux-gnu)" {29}, ir
				// CHK-PHASES-FILES: 31: backend, {30}, assembler, (device-openmp)
				// CHK-PHASES-FILES: 32: assembler, {31}, object, (device-openmp)
				// CHK-PHASES-FILES: 33: input, "[[INPUT2]]", c, (device-openmp)
				// CHK-PHASES-FILES: 34: preprocessor, {33}, cpp-output, (device-openmp)
				// CHK-PHASES-FILES: 35: compiler, {34}, ir, (device-openmp)
				// CHK-PHASES-FILES: 36: offload, "host-openmp (powerpc64-ibm-linux-gnu)" {8}, "device-openmp (powerpc64-ibm-linux-gnu)" {35}, ir
				// CHK-PHASES-FILES: 37: backend, {36}, assembler, (device-openmp)
				// CHK-PHASES-FILES: 38: assembler, {37}, object, (device-openmp)
				// CHK-PHASES-FILES: 39: linker, {26, 32, 38}, image, (device-openmp)
				// CHK-PHASES-FILES: 40: offload, "host-openmp (powerpc64-ibm-linux-gnu)" {11}, "device-openmp (x86_64-pc-linux-gnu)" {25}, "device-openmp (powerpc64-ibm-linux-gnu)" {39}, image

				/// ###########################################################################

				/// Check the phases graph when using a single GPU target, and check that the
				/// OpenMP and CUDA phases are articulated correctly.
				// RUN: %clang -ccc-print-phases -fopenmp -target powerpc64le-ibm-linux-gnu -fopenmp-targets=nvptx64-nvidia-cuda -x cuda %s 2>&1 \
				// RUN: | FileCheck -check-prefix=CHK-PHASES-WITH-CUDA %s
				// CHK-PHASES-WITH-CUDA: 0: input, "[[INPUT:.+\.c]]", cuda, (host-cuda-openmp)
				// CHK-PHASES-WITH-CUDA: 1: preprocessor, {0}, cuda-cpp-output, (host-cuda-openmp)
				// CHK-PHASES-WITH-CUDA: 2: compiler, {1}, ir, (host-cuda-openmp)
				// CHK-PHASES-WITH-CUDA: 3: input, "[[INPUT]]", cuda, (device-cuda, sm_20)
				// CHK-PHASES-WITH-CUDA: 4: preprocessor, {3}, cuda-cpp-output, (device-cuda, sm_20)
				// CHK-PHASES-WITH-CUDA: 5: compiler, {4}, ir, (device-cuda, sm_20)
				// CHK-PHASES-WITH-CUDA: 6: backend, {5}, assembler, (device-cuda, sm_20)
				// CHK-PHASES-WITH-CUDA: 7: assembler, {6}, object, (device-cuda, sm_20)
				// CHK-PHASES-WITH-CUDA: 8: offload, "device-cuda (nvptx64-nvidia-cuda:sm_20)" {7}, object
				// CHK-PHASES-WITH-CUDA: 9: offload, "device-cuda (nvptx64-nvidia-cuda:sm_20)" {6}, assembler
				// CHK-PHASES-WITH-CUDA: 10: linker, {8, 9}, cuda-fatbin, (device-cuda)
				// CHK-PHASES-WITH-CUDA: 11: offload, "host-cuda-openmp (powerpc64le-ibm-linux-gnu)" {2}, "device-cuda (nvptx64-nvidia-cuda)" {10}, ir
				// CHK-PHASES-WITH-CUDA: 12: backend, {11}, assembler, (host-cuda-openmp)
				// CHK-PHASES-WITH-CUDA: 13: assembler, {12}, object, (host-cuda-openmp)
				// CHK-PHASES-WITH-CUDA: 14: linker, {13}, image, (host-cuda-openmp)
				// CHK-PHASES-WITH-CUDA: 15: input, "[[INPUT]]", cuda, (device-openmp)
				// CHK-PHASES-WITH-CUDA: 16: preprocessor, {15}, cuda-cpp-output, (device-openmp)
				// CHK-PHASES-WITH-CUDA: 17: compiler, {16}, ir, (device-openmp)
				// CHK-PHASES-WITH-CUDA: 18: offload, "host-cuda-openmp (powerpc64le-ibm-linux-gnu)" {2}, "device-openmp (nvptx64-nvidia-cuda)" {17}, ir
				// CHK-PHASES-WITH-CUDA: 19: backend, {18}, assembler, (device-openmp)
				// CHK-PHASES-WITH-CUDA: 20: assembler, {19}, object, (device-openmp)
				// CHK-PHASES-WITH-CUDA: 21: linker, {20}, image, (device-openmp)
				// CHK-PHASES-WITH-CUDA: 22: offload, "host-cuda-openmp (powerpc64le-ibm-linux-gnu)" {14}, "device-openmp (nvptx64-nvidia-cuda)" {21}, image