This is an archive of the discontinued LLVM Phabricator instance.

Paths

include/clang/Driver/Action.h
include/clang/Driver/Compilation.h
include/clang/Driver/Driver.h
lib/Driver/Action.cpp
lib/Driver/Driver.cpp
lib/Driver/ToolChain.cpp
lib/Driver/ToolChains.h
lib/Driver/ToolChains.cpp
lib/Driver/Tools.h
lib/Driver/Tools.cpp
lib/Frontend/CreateInvocationFromCommandLine.cpp

Differential D18171

[CUDA][OpenMP] Create generic offload action
ClosedPublic

Authored by sfantao on Mar 14 2016, 6:29 PM.

Details

Reviewers

tra
ABataev
echristo
jlebar
hfinkel

Commits

rGd06239d359df: [CUDA][OpenMP] Create generic offload action
rC275645: [CUDA][OpenMP] Create generic offload action
rL275645: [CUDA][OpenMP] Create generic offload action

Summary

This patch replaces the CUDA-specific action with a generic offload action. The offload action may have multiple dependences, classified as "host" and "device". The generic offload action is used much as the CUDA implementation uses its actions today: it sets a specific toolchain and architecture on its dependences during the generation of jobs.

This patch also proposes propagating the offloading information through the action graph so that it can be easily retrieved at any time during the generation of commands. This allows, e.g., the "clang tool" to evaluate whether CUDA should be supported for the device or host, and ptxas to easily retrieve the target architecture.

This is an example of what the action graph would look like (compilation of a single CUDA file with two GPU architectures):

0: input, "cudatests.cu", cuda, (host-cuda)
1: preprocessor, {0}, cuda-cpp-output, (host-cuda)
2: compiler, {1}, ir, (host-cuda)
3: input, "cudatests.cu", cuda, (device-cuda, sm_35)
4: preprocessor, {3}, cuda-cpp-output, (device-cuda, sm_35)
5: compiler, {4}, ir, (device-cuda, sm_35)
6: backend, {5}, assembler, (device-cuda, sm_35)
7: assembler, {6}, object, (device-cuda, sm_35)
8: offload, "device-cuda (nvptx64-nvidia-cuda:sm_35)" {7}, object
9: offload, "device-cuda (nvptx64-nvidia-cuda:sm_35)" {6}, assembler
10: input, "cudatests.cu", cuda, (device-cuda, sm_37)
11: preprocessor, {10}, cuda-cpp-output, (device-cuda, sm_37)
12: compiler, {11}, ir, (device-cuda, sm_37)
13: backend, {12}, assembler, (device-cuda, sm_37)
14: assembler, {13}, object, (device-cuda, sm_37)
15: offload, "device-cuda (nvptx64-nvidia-cuda:sm_37)" {14}, object
16: offload, "device-cuda (nvptx64-nvidia-cuda:sm_37)" {13}, assembler
17: linker, {8, 9, 15, 16}, cuda-fatbin, (device-cuda)
18: offload, "host-cuda (powerpc64le-unknown-linux-gnu)" {2}, "device-cuda (nvptx64-nvidia-cuda)" {17}, ir
19: backend, {18}, assembler
20: assembler, {19}, object
21: input, "cuda", object
22: input, "cudart", object
23: linker, {20, 21, 22}, image
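
As a rough illustration of the propagation described in the summary, here is a minimal sketch, assuming a simplified `Action` type. The member names mirror those that appear in the diff (`ActiveOffloadKindMask`, `OffloadingDeviceKind`, `propagateDeviceOffloadInfo`, `propagateHostOffloadInfo`), but the signatures and bodies are illustrative assumptions, not the code in the patch, and the offload node itself is not modelled.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins; not the real clang::driver interface.
enum OffloadKind : unsigned {
  OFK_None = 0x00,
  OFK_Host = 0x01, // host side of an offloading compilation
  OFK_Cuda = 0x02, // one bit per device programming model
};

struct Action {
  std::vector<Action *> Inputs;

  unsigned ActiveOffloadKindMask = 0u;         // host: mask of kinds in use
  OffloadKind OffloadingDeviceKind = OFK_None; // device: a single kind
  std::string OffloadingArch;                  // e.g. "sm_35"; empty if none

  // Tag this action as device code and push the tag down to its inputs.
  void propagateDeviceOffloadInfo(OffloadKind Kind, const std::string &Arch) {
    OffloadingDeviceKind = Kind;
    OffloadingArch = Arch;
    for (Action *In : Inputs)
      In->propagateDeviceOffloadInfo(Kind, Arch);
  }

  // Accumulate host-side kinds and push them down to the inputs.
  void propagateHostOffloadInfo(unsigned Kinds) {
    ActiveOffloadKindMask |= Kinds;
    for (Action *In : Inputs)
      In->propagateHostOffloadInfo(Kinds);
  }

  bool isHostOffloading(OffloadKind Kind) const {
    return (ActiveOffloadKindMask & Kind) != 0;
  }
  bool isDeviceOffloading(OffloadKind Kind) const {
    return OffloadingDeviceKind == Kind;
  }
};
```

With this shape, any action in a device chain (such as actions 3 to 7 in the listing above) can answer "which architecture am I compiling for?" without walking back up to the offload node.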

The changes in this patch pass the existing regression tests (keeping the existing functionality), and the resulting binaries execute correctly on a Power8+K40 machine.

Diff Detail

Event Timeline

sfantao updated this revision to Diff 50689. Mar 14 2016, 6:29 PM

sfantao retitled this revision to [CUDA][OpenMP] Create generic offload action.

sfantao updated this object.

sfantao added reviewers: ABataev, jlebar, tra, echristo, hfinkel.

sfantao added a parent revision: D18170: [CUDA][OpenMP] Create generic offload toolchains.

sfantao added subscribers: caomhin, carlo.bertolli, arpith-jacob, cfe-commits.

sfantao added a child revision: D18172: [CUDA][OpenMP] Add a generic offload action builder. Mar 14 2016, 6:33 PM

mkuron added a subscriber: mkuron. Mar 19 2016, 1:42 AM

tcramer added a subscriber: tcramer. Mar 21 2016, 1:31 AM

Thank you for making these changes. They don't solve all the shortcomings, but they improve things quite a bit, IMO.

Overall I'm happy with the changes, though use of mutable and changing action state from const functions may need a look from someone with better C++-fu skills than myself.
@jlebar: Justin, any suggestions on what we can/should do regarding that?

tra added inline comments. Mar 23 2016, 2:12 PM

include/clang/Driver/Action.h
99	I assume that by combination you imply that this is a mask of all offloading kinds we intend to use. Perhaps it should be renamed to reflect it better. ActiveOffloadKindMask?
99	These fields appear to be part of the Action state, yet you want to modify them from const functions, which seems wrong to me. Perhaps functions that set these fields should not be const.
143	propagateXXX implies modification which contradicts 'const' which implies that there will be no state change. Which one is telling the truth?
294–301	Are these independent or is it expected that all four are populated/modified in sync? If they all need to be updated at the same time (which is what the code below does), it should be documented. Perhaps these arrays could be converted into vector of structs.
306	Can any of parameters be nullptr?
320–326	Cosmetic nit: Naming is somewhat inconsistent. Device dependencies use acronyms, but HostDependence uses mix of acronyms and words. I'd use camelcase words in both.
include/clang/Driver/Driver.h
423	TC is only used to get triple to construct file name prefix. Perhaps just pass that string explicitly.
lib/Driver/Action.cpp
45	Given that these functions change state they probably should not be const.
104	There's no need for ToolChain here. A string is all it needs as an input.
223	returning nullptr if .size() != 1 looks strange. Perhaps there should be an assert.
lib/Driver/Driver.cpp
1442–1446	Perhaps this should be moved down closer to where it's used. Perhaps even inside of if(PartialCompilation ...)
1512	Is toolchain needed for fatbin action?
1956	You could fold both ifs into something like this: if (auto *DDAP = dyn_cast_or_null<T>(OA->getSingleDeviceDependence()))
2127	It may be worth adding a comment explaining what happens if OffloadDeviceInputInfos.size() != 1.
lib/Driver/Tools.cpp
3829	All we need is a target triple here. Now that we have device offloading info, perhaps we can bypass AuxToolchain and let offloading info provide host or device triple directly. That would render FIXME above obsolete, IMO.
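
The folding suggested for Driver.cpp line 1956 can be sketched with standard C++ alone. This is a minimal sketch, assuming `dynamic_cast` as a stand-in for LLVM's `dyn_cast_or_null` (both yield a null pointer for a null operand); the `AssembleJob` and `CompileJob` types and both function names are hypothetical.

```cpp
#include <cassert>

// Hypothetical action hierarchy, for illustration only.
struct Action {
  virtual ~Action() = default;
};
struct AssembleJob : Action {};
struct CompileJob : Action {};

// Two-step form: test the pointer first, then test the dynamic type.
AssembleJob *asAssembleTwoStep(Action *Dep) {
  if (Dep) {
    if (auto *AJ = dynamic_cast<AssembleJob *>(Dep))
      return AJ;
  }
  return nullptr;
}

// Folded form: the cast already handles a null operand, so a single
// condition with a declaration covers both checks.
AssembleJob *asAssembleFolded(Action *Dep) {
  if (auto *AJ = dynamic_cast<AssembleJob *>(Dep))
    return AJ;
  return nullptr;
}
```

Both forms return the same result for every input; the folded one just has one fewer nesting level.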

@jlebar: Justin, any suggestions on what we can/should do regarding [const correctness issues]?

Using "mutable" to work around const-incorrectness should be a last resort. I've been frustrated by the const-incorrectness of Action as well, but if it's an issue here, I think it needs to be fixed, not hacked around.
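
To make the const-correctness point concrete, here is a minimal sketch, assuming a hypothetical `Node` class rather than the real `Action`. Mutating operations are non-const and queries are const, so no `mutable` member is needed.

```cpp
#include <cassert>
#include <vector>

// Hypothetical graph node; names are illustrative, not from the patch.
class Node {
  std::vector<Node *> Inputs;
  unsigned KindMask = 0u;

public:
  void addInput(Node &N) { Inputs.push_back(&N); }

  // Non-const: "propagate" changes state, and the signature says so.
  void propagateKinds(unsigned Kinds) {
    KindMask |= Kinds;
    for (Node *In : Inputs)
      In->propagateKinds(Kinds);
  }

  // Const: a pure query with no side effects.
  unsigned getKinds() const { return KindMask; }
};

// Read-only consumers take a const reference; they can query the mask
// but the compiler rejects any attempt to propagate through it.
bool usesKind(const Node &N, unsigned Kind) {
  return (N.getKinds() & Kind) != 0;
}
```

The trade-off is that propagation must happen while the graph is still held through non-const pointers, for example when the actions are being built.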

andreybokhanko added a subscriber: andreybokhanko. Apr 6 2016, 8:21 AM

Address Art, Justin and Eric comments.

Hi Art, Justin,

Thanks for the review and feedback! I tried to address your concerns. Let me know of any other suggestions you may have.

Thanks again,
Samuel

include/clang/Driver/Action.h
99	That's correct. This is meant to let a host action be queried for all the programming models used in its dependences. I am now using `ActiveOffloadKindMask` as you suggested.
99	Yeah... I was not very happy with these mutable members either... So, after revisiting the issue, the best possible solution was to do all the propagation when the actions are appended to the action list. For that I keep track of the offloading kinds employed for a given input and compilation and use that to propagate the information to top-level actions. I will try to abstract that a little better in the offload handler I am proposing in http://reviews.llvm.org/D18172. Let me know if you still find issues with the approach I am adopting now.
143	I refactored the propagation to work on top of non-const actions. So this is not const anymore.
294–301	Yes, they are meant to be populated in sync. I'll document that in the comment as you suggest. The reason I am not using array of structs is that it is more convenient to have the action list separate to forward it to initialize the base class.
306	Only BoundArch can be null. I am changing the signature of `add()` to use reference types for the arguments that must not be null.
320–326	Ok, I am using camelcase in both now. I also noticed I was not using the typedef types in the private members of the dependencies. I fixed that as well.
include/clang/Driver/Driver.h
423	Ok, I am just passing the string now.
lib/Driver/Action.cpp
45	I fixed that.
104	Ok, passing the string with the normalized triple now.
223	This is used when actions get collapsed. In general we can have multiple device dependences, and if so we do not collapse. Therefore this member function was also working as a check. I have changed this to have a 'get' and a 'has' version, and use the assertion in the 'get' version.
lib/Driver/Driver.cpp
1442–1446	I moved it right before the if(Partial...) statement, because it is also used after that.
1512	Yes, it is required to enquire which link tool should be used for the device action.
1956	Given that I am adding a new query to check the existence of the Host or single-device action, I am keeping the two ifs separate. `getSingleDeviceDependence` now does an assertion. Let me know if you prefer me to do this differently.
2127	I am elaborating on that in the comment now. I also got rid of doOnHostDependence here and I use `OA->getHostDependence()` that now contains the assertion.
lib/Driver/Tools.cpp
3829	I removed the use of AuxToolChain. It was being used also for the preprocessor argument. I added some new logic in there so that the information is extracted from the toolchains owned by compilation. Let me know if that is what you had in mind.
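
Several of the replies above (lists populated in sync, `add()` taking references for arguments that must not be null, and the 'has'/'get' pair with an assertion) can be pictured together. The following is a minimal sketch under stated assumptions: `Action` and `ToolChain` are empty placeholder types, `std::vector` replaces the `SmallVector` typedefs, and `hasSingleDependence`/`getSingleDependence` are hypothetical names placed on the dependence list for brevity rather than on the offload action.

```cpp
#include <cassert>
#include <vector>

// Placeholder types, for illustration only.
struct Action {};
struct ToolChain {};
enum OffloadKind { OFK_None = 0x00, OFK_Host = 0x01, OFK_Cuda = 0x02 };

class DeviceDependences {
  // Parallel lists: element i of each list describes the same dependence.
  // Keeping the actions in their own list makes it easy to hand them to a
  // base class that only wants the actions.
  std::vector<Action *> Actions;
  std::vector<const ToolChain *> ToolChains;
  std::vector<const char *> BoundArchs;
  std::vector<OffloadKind> Kinds;

public:
  // References for what must not be null; only BoundArch may be nullptr.
  // All four lists grow together, which keeps them in sync.
  void add(Action &A, const ToolChain &TC, const char *BoundArch,
           OffloadKind Kind) {
    Actions.push_back(&A);
    ToolChains.push_back(&TC);
    BoundArchs.push_back(BoundArch);
    Kinds.push_back(Kind);
  }

  const std::vector<Action *> &getActions() const { return Actions; }
  const std::vector<const char *> &getBoundArchs() const { return BoundArchs; }

  // 'has' is the check; 'get' asserts instead of returning nullptr.
  bool hasSingleDependence() const { return Actions.size() == 1; }
  Action *getSingleDependence() const {
    assert(hasSingleDependence() && "expected exactly one dependence");
    return Actions.front();
  }
};
```

Callers that may legitimately see several dependences test with the 'has' version first; callers that already know there is exactly one call the 'get' version directly and rely on the assertion to catch misuse.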

Rebase.

Herald added a subscriber: mehdi_amini. Jun 13 2016, 11:58 AM

Any more comments on this one?

Thanks!
Samuel

mehdi_amini removed a subscriber: mehdi_amini. Jun 13 2016, 12:01 PM

Better organize how the offload action is used and add more comments to document what is going on.
Rebase.

@tra, any other comments/suggestions about this patch?

Thanks!
Samuel

No '\brief's

include/clang/Driver/Action.h
206	'final'
211	'final'
249	'final'
255–257	default initializers
279	Default initializer
lib/Driver/Driver.cpp
1942–1964	Three slashes

Mark classes with final and fix comments.

Hi Alexey,

Thanks for the review! I addressed your comments in the last diff.

In D18171#471044, @ABataev wrote:

No '\brief's

When you say "no \brief" do you mean I am using that where I shouldn't or the other way around. I am only adding that for member functions/vars that I create, should I do that for member I am modifying too? I noticed two places where I was using that for static functions, I removed \brief in those cases. Let me know if that is what you wanted me to do.

Thanks again,
Samuel

Ok, I got you. In my comments the abstract can always be the first sentence, so \brief is not required. I'll remove that as you suggest.

Thanks,
Samuel

Remove \brief.

Rebase

guansong added a subscriber: guansong. Jul 5 2016, 10:44 AM

Rebase.
Remove static function no longer necessary.

@tra, any more comments about this patch?

Thanks!
Samuel

Few minor nits and suggestions. Other than that I'm OK with the patch.

lib/Driver/Action.cpp
185	Minor style nit -- LLVM coding standard says : ... we strongly prefer loops to be written so that they evaluate it once before the loop starts. `for (unsigned i = 0, e = getInputs().size(); i != e; ++i)` Please check other for loops throughout the patch.
193–200	It could be rephrased as "do work if we have dependencies" and make code a bit more concise. if (auto *A = DDeps.getActions()[i]) { getInputs().push_back(A); A-> propagate... }
206	Please add an assert to verify that getInputs() is not empty. It may be worth doing throughout the patch, as there are several places where we are indexing into the getInputs() result without verifying its size. It's not at all obvious from the code that it's always OK to do so.
212–223	You may want to add an assert that I and TI are both valid within the loop.
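
The two requests above (evaluate the loop bound once, and assert before indexing) can be combined in one small pattern. This is a minimal sketch, assuming a hypothetical `sumInputs` helper over plain integers rather than driver actions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: sums a list that the caller promises is non-empty.
int sumInputs(const std::vector<int> &Inputs) {
  // State the assumption explicitly before indexing.
  assert(!Inputs.empty() && "expected at least one input");
  int Sum = Inputs[0];

  // The bound E is evaluated once, before the loop starts.
  for (std::size_t I = 1, E = Inputs.size(); I != E; ++I)
    Sum += Inputs[I];
  return Sum;
}
```

Writing the bound as `E` also documents that the container is not expected to change size inside the loop body.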

Add assertions before accessing getInputs() when assumptions are made about the number of elements.
Add missing evaluations of end iterator in the beginning of loops.
Address other comments by Art.

Add one more assert.

Hi Art,

Thanks for the review! Addressed your comments in the last diff.

Thanks again,
Samuel

lib/Driver/Action.cpp
212–223	I added an assertion for `TI`. I didn't do that for `I` though, as it is the exit condition of the loop, so it will be always valid. Let me know if you still want me to add that.

tra added inline comments. Jul 13 2016, 1:54 PM

lib/Driver/Action.cpp
212–223	I don't see any changes in this function in your latest patch. Did you add that assert somewhere else? I'm not worried about the validity of `I`, which is indeed ensured by the loop, but rather want to verify that the number of inputs we process and the number of elements in DevToolChains match. While running out of TI elements early would be obviously wrong, I assume that exiting the loop with some remaining TI elements would also be unexpected and that we should assert() that it does not happen.

Modify assertion to check that sizes of input dependences and device toolchains is consistent.

sfantao added inline comments. Jul 13 2016, 3:05 PM

lib/Driver/Action.cpp
212–223	The only change in this function was the assertion inside the loop. It's possible my message got in before the actual diff, sorry about that... Ok, got it. I'm replacing the assertion in the loop body by an assertion that checks if the sizes of the inputs and toolchains are consistent. Let me know if you'd rather have a check of the iterator inside and after the loop.
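
The size-consistency check agreed on here can be sketched as follows. This is a minimal sketch, assuming a hypothetical `pairInputsWithToolChains` helper over strings; the iterator names `I` and `TI` follow the discussion, everything else is illustrative.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: walks two parallel lists in lockstep.
std::vector<std::pair<std::string, std::string>>
pairInputsWithToolChains(const std::vector<std::string> &Inputs,
                         const std::vector<std::string> &ToolChains) {
  // A single up-front check covers both failure modes: running out of
  // toolchains early and leaving some toolchains unused.
  assert(Inputs.size() == ToolChains.size() &&
         "inputs and toolchains must have the same number of elements");

  std::vector<std::pair<std::string, std::string>> Result;
  auto TI = ToolChains.begin();
  for (auto I = Inputs.begin(), E = Inputs.end(); I != E; ++I, ++TI)
    Result.emplace_back(*I, *TI);
  return Result;
}
```

Checking the sizes once keeps the loop body free of per-iteration assertions while still guarding every dereference of `TI`.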

The changes look good.

Now we just need some tests. Something along the lines of test/Driver/phases.c should do.

Add test to check the generated phases for CUDA.
Fix typo in comment.

tra added inline comments. Jul 13 2016, 4:50 PM

test/Driver/cuda_phases.cu
1 ↗	(On Diff #63880)	A few words describing the test would be nice to have. You may also want to add a few `REQUIRES` fields so the test does not break for builds w/o PPC or NVPTX:
// REQUIRES: clang-driver
// REQUIRES: nvptx-registered-target
// REQUIRES: powerpc-registered-target
I wonder if the test needs the host arch specified at all. I think it should be able to run on any host as long as it supports NVPTX.

Add comments and REQUIRE directives to test.

sfantao marked an inline comment as done. Jul 13 2016, 5:26 PM

sfantao added inline comments.

test/Driver/cuda_phases.cu
2 ↗	(On Diff #63889)	Oh, ok. Thought that the registration of the targets was checked after the phases generation. I added the `REQUIRES` and comments in the test. I'm still using the explicit target architecture in the commands given that it is also being matched in the tests. Let me know if you prefer to use a wildcard in the tests instead.

LGTM.

test/Driver/cuda_phases.cu
48 ↗	(On Diff #63889)	architectures. There are few more copy-pasted cases below.

This revision is now accepted and ready to land. Jul 13 2016, 5:28 PM

sfantao closed this revision. Jul 15 2016, 4:20 PM

sfantao marked an inline comment as done.

Revision Contents

Path	Size
include/clang/Driver/Action.h	189 lines
include/clang/Driver/Compilation.h	11 lines
include/clang/Driver/Driver.h	27 lines
lib/Driver/	224 lines, 371 lines, 3 lines, 5 lines, 15 lines, 3 lines, 106 lines
lib/Frontend/CreateInvocationFromCommandLine.cpp	16 lines

Commit	Tree	Parents	Author	Summary	Date
83b200708b85	5666f79413d8	7d2cc9e58b03	Samuel Antao	Remove \brief.	Jun 30 2016, 8:34 PM
7d2cc9e58b03	c564db259938	c8d2e92d0379	Samuel Antao	Mark classes with final and fix comments.	Jun 30 2016, 3:57 PM
c8d2e92d0379	03916f2cf52c	60735086cdd8 908ddc528028	Samuel Antao	Merge branch 'master' into patch-D18171	Jun 30 2016, 3:23 PM
60735086cdd8	b0e8fbfe5843	f2c1b1f9800b 5f57c65083ce	Samuel Antao	Merge branch 'master' into patch-D18171	Jun 29 2016, 7:47 AM
f2c1b1f9800b	5187ce06f649	d480063a7fd7	Samuel Antao	Fix typo.	Jun 24 2016, 6:24 PM
d480063a7fd7	305a5dac6992	f459bf3a85a3	Samuel Antao	Better organize how the offload action is used and add more comments to…	Jun 24 2016, 5:55 PM
f459bf3a85a3	6baf2d3e10eb	266891462cd1 191558e10b78	Samuel Antao	Merge branch 'master' into patch-D18171	Jun 20 2016, 3:07 PM
266891462cd1	51f344634510	f49892cbdc9a def8b33bd90f	Samuel Antao	Merge branch 'master' into patch-D18171	Jun 13 2016, 11:54 AM
f49892cbdc9a	8045bf933913	56966ec96132 514ab388572a	Samuel Antao	Merge branch 'patch-D18170' into patch-D18171-depends-on-patch-D18170	Jun 13 2016, 11:15 AM
514ab388572a	1f2f1fccba7d	4d6f6e53afef b5f5768d4530	Samuel Antao	Merge branch 'master' into patch-D18170	Jun 13 2016, 10:49 AM
4d6f6e53afef	c73e024d979c	84d9b82ff77c f0c013a48e30	Samuel Antao	Merge branch 'master' into patch-D18170	Jun 13 2016, 10:35 AM
84d9b82ff77c	f62814ddbe62	69db7b0c2456 cfd0eb5a9747	Samuel Antao	Merge branch 'master' into patch-D18170	Jun 13 2016, 9:20 AM
69db7b0c2456	42fd51cda5f8	530b15227c0e 3317d0fa0bd1	Samuel Antao	Merge branch 'master' into patch-D18170	May 27 2016, 8:00 AM
56966ec96132	dc1a99ed3d48	36bcd314bbf6 530b15227c0e	Samuel Antao	Merge branch 'patch-D18170' into patch-D18171-depends-on-patch-D18170	Apr 22 2016, 2:47 PM
530b15227c0e	73194523d7ec	b122ff91dc1c 4b380bc1db8b	Samuel Antao	Merge branch 'master' into patch-D18170	Apr 22 2016, 2:13 PM
36bcd314bbf6	df473f57b3ca	26f97a1549ad b122ff91dc1c	Samuel Antao	Merge branch 'patch-D18170' into patch-D18171-depends-on-patch-D18170	Apr 6 2016, 4:58 PM
b122ff91dc1c	5a0bcbe1a580	d59f25d052c9	Samuel Antao	Organize the code a little better.	Apr 6 2016, 4:58 PM
26f97a1549ad	df473f57b3ca	d53e577035f7	Samuel Antao	Address review comments.	Apr 5 2016, 3:58 PM
d53e577035f7	b5c823c72ced	ec30d592d0ad d59f25d052c9	Samuel Antao	Merge branch 'patch-D18170' into patch-D18171-depends-on-patch-D18170	Apr 5 2016, 8:33 AM
d59f25d052c9	7f8afb8e9cb9	0eeee9aa1a81	Samuel Antao	Address comments from Art review.	Apr 4 2016, 5:47 PM
0eeee9aa1a81	af2dc2a8c693	49683701f63b 241bba2e6902	Samuel Antao	Merge branch 'master' into cs2-A-create-toolchains	Apr 4 2016, 3:30 PM
ec30d592d0ad	b1f53227ce71	71e35bfffd92	Samuel Antao	Add support for device empty dependence.	Mar 14 2016, 5:32 PM
71e35bfffd92	acbcda48314c	49683701f63b	Samuel Antao	Use new generic offload action.	Mar 14 2016, 4:06 PM
49683701f63b	d1d22c8aaecf	463a8d110532	Samuel Antao	Create generic offloading toolchain and create the offload kinds.	Mar 14 2016, 2:53 PM

Diff 62572

include/clang/Driver/Action.h

//===--- Action.h - Abstract compilation steps ------------------*- C++ -*-===//		//===--- Action.h - Abstract compilation steps ------------------*- C++ -*-===//
//		//
// The LLVM Compiler Infrastructure		// The LLVM Compiler Infrastructure
//		//
// This file is distributed under the University of Illinois Open Source		// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.		// License. See LICENSE.TXT for details.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#ifndef LLVM_CLANG_DRIVER_ACTION_H		#ifndef LLVM_CLANG_DRIVER_ACTION_H
#define LLVM_CLANG_DRIVER_ACTION_H		#define LLVM_CLANG_DRIVER_ACTION_H

#include "clang/Driver/Types.h"		#include "clang/Driver/Types.h"
#include "clang/Driver/Util.h"		#include "clang/Driver/Util.h"
		#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/SmallVector.h"		#include "llvm/ADT/SmallVector.h"

namespace llvm {		namespace llvm {

class StringRef;		class StringRef;

namespace opt {		namespace opt {
class Arg;		class Arg;
}		}
}		}

namespace clang {		namespace clang {
namespace driver {		namespace driver {

		class ToolChain;

/// Action - Represent an abstract compilation step to perform.		/// Action - Represent an abstract compilation step to perform.
///		///
/// An action represents an edge in the compilation graph; typically		/// An action represents an edge in the compilation graph; typically
/// it is a job to transform an input using some tool.		/// it is a job to transform an input using some tool.
///		///
/// The current driver is hard wired to expect actions which produce a		/// The current driver is hard wired to expect actions which produce a
/// single primary output, at least in terms of controlling the		/// single primary output, at least in terms of controlling the
/// compilation. Actions can produce auxiliary files, but can only		/// compilation. Actions can produce auxiliary files, but can only
/// produce a single output to feed into subsequent actions.		/// produce a single output to feed into subsequent actions.
///		///
/// Actions are usually owned by a Compilation, which creates new		/// Actions are usually owned by a Compilation, which creates new
/// actions via MakeAction().		/// actions via MakeAction().
class Action {		class Action {
public:		public:
typedef ActionList::size_type size_type;		typedef ActionList::size_type size_type;
typedef ActionList::iterator input_iterator;		typedef ActionList::iterator input_iterator;
typedef ActionList::const_iterator input_const_iterator;		typedef ActionList::const_iterator input_const_iterator;
typedef llvm::iterator_range<input_iterator> input_range;		typedef llvm::iterator_range<input_iterator> input_range;
typedef llvm::iterator_range<input_const_iterator> input_const_range;		typedef llvm::iterator_range<input_const_iterator> input_const_range;

enum ActionClass {		enum ActionClass {
InputClass = 0,		InputClass = 0,
BindArchClass,		BindArchClass,
CudaDeviceClass,		OffloadClass,
CudaHostClass,
PreprocessJobClass,		PreprocessJobClass,
PrecompileJobClass,		PrecompileJobClass,
AnalyzeJobClass,		AnalyzeJobClass,
MigrateJobClass,		MigrateJobClass,
CompileJobClass,		CompileJobClass,
BackendJobClass,		BackendJobClass,
AssembleJobClass,		AssembleJobClass,
LinkJobClass,		LinkJobClass,
LipoJobClass,		LipoJobClass,
DsymutilJobClass,		DsymutilJobClass,
VerifyDebugInfoJobClass,		VerifyDebugInfoJobClass,
VerifyPCHJobClass,		VerifyPCHJobClass,

JobClassFirst=PreprocessJobClass,		JobClassFirst=PreprocessJobClass,
JobClassLast=VerifyPCHJobClass		JobClassLast=VerifyPCHJobClass
};		};

// The offloading kind determines if this action is binded to a particular		// The offloading kind determines if this action is binded to a particular
// programming model. Each entry reserves one bit. We also have a special kind		// programming model. Each entry reserves one bit. We also have a special kind
// to designate the host offloading tool chain.		// to designate the host offloading tool chain.
//
// FIXME: This is currently used to indicate that tool chains are used in a
// given programming, but will be used here as well once a generic offloading
// action is implemented.
enum OffloadKind {		enum OffloadKind {
OFK_None = 0x00,		OFK_None = 0x00,
// The host offloading tool chain.		// The host offloading tool chain.
OFK_Host = 0x01,		OFK_Host = 0x01,
// The device offloading tool chains - one bit for each programming model.		// The device offloading tool chains - one bit for each programming model.
OFK_Cuda = 0x02,		OFK_Cuda = 0x02,
};		};

static const char *getClassName(ActionClass AC);		static const char *getClassName(ActionClass AC);

private:		private:
ActionClass Kind;		ActionClass Kind;

/// The output type of this action.		/// The output type of this action.
types::ID Type;		types::ID Type;

ActionList Inputs;		ActionList Inputs;

protected:		protected:
		///
		/// Offload information.
		///

		/// The host offloading kind - a combination of kinds encoded in a mask.
		/// Multiple programming models may be supported simultaneously by the same
		/// host.
		unsigned ActiveOffloadKindMask = 0u;
		/// Offloading kind of the device.
		OffloadKind OffloadingDeviceKind = OFK_None;
		/// The Offloading architecture associated with this action.
		const char *OffloadingArch = nullptr;

Action(ActionClass Kind, types::ID Type) : Action(Kind, ActionList(), Type) {}		Action(ActionClass Kind, types::ID Type) : Action(Kind, ActionList(), Type) {}
Action(ActionClass Kind, Action *Input, types::ID Type)		Action(ActionClass Kind, Action *Input, types::ID Type)
: Action(Kind, ActionList({Input}), Type) {}		: Action(Kind, ActionList({Input}), Type) {}
Action(ActionClass Kind, Action *Input)		Action(ActionClass Kind, Action *Input)
: Action(Kind, ActionList({Input}), Input->getType()) {}		: Action(Kind, ActionList({Input}), Input->getType()) {}
Action(ActionClass Kind, const ActionList &Inputs, types::ID Type)		Action(ActionClass Kind, const ActionList &Inputs, types::ID Type)
: Kind(Kind), Type(Type), Inputs(Inputs) {}		: Kind(Kind), Type(Type), Inputs(Inputs) {}

[... 13 lines not shown ...]
input_iterator input_begin() { return Inputs.begin(); }		input_iterator input_begin() { return Inputs.begin(); }
input_iterator input_end() { return Inputs.end(); }		input_iterator input_end() { return Inputs.end(); }
input_range inputs() { return input_range(input_begin(), input_end()); }		input_range inputs() { return input_range(input_begin(), input_end()); }
input_const_iterator input_begin() const { return Inputs.begin(); }		input_const_iterator input_begin() const { return Inputs.begin(); }
input_const_iterator input_end() const { return Inputs.end(); }		input_const_iterator input_end() const { return Inputs.end(); }
input_const_range inputs() const {		input_const_range inputs() const {
return input_const_range(input_begin(), input_end());		return input_const_range(input_begin(), input_end());
}		}

		/// Return a string containing the offload kind of the action.
		std::string getOffloadingKindPrefix() const;
		/// Return a string that can be used as prefix in order to generate unique
		/// files for each offloading kind.
		std::string getOffloadingFileNamePrefix(StringRef NormalizedTriple) const;

		/// Set the device offload info of this action and propagate it to its
		/// dependences.
		void propagateDeviceOffloadInfo(OffloadKind OKind, const char *OArch);
		/// Append the host offload info of this action and propagate it to its
		/// dependences.
		void propagateHostOffloadInfo(unsigned OKinds, const char *OArch);
		/// Set the offload info of this action to be the same as the provided action,
		/// and propagate it to its dependences.
		void propagateOffloadInfo(const Action *A);

		unsigned getOffloadingHostActiveKinds() const {
		return ActiveOffloadKindMask;
		}
		OffloadKind getOffloadingDeviceKind() const { return OffloadingDeviceKind; }
		const char *getOffloadingArch() const { return OffloadingArch; }

		/// Check if this action have any offload kinds. Note that host offload kinds
		/// are only set if the action is a dependence to a host offload action.
		bool isHostOffloading(OffloadKind OKind) const {
		return ActiveOffloadKindMask & OKind;
		}
		bool isDeviceOffloading(OffloadKind OKind) const {
		return OffloadingDeviceKind == OKind;
		}
		bool isOffloading(OffloadKind OKind) const {
		return isHostOffloading(OKind) || isDeviceOffloading(OKind);
		}
};		};

class InputAction : public Action {		class InputAction : public Action {
virtual void anchor();		virtual void anchor();
const llvm::opt::Arg &Input;		const llvm::opt::Arg &Input;

public:		public:
InputAction(const llvm::opt::Arg &Input, types::ID Type);		InputAction(const llvm::opt::Arg &Input, types::ID Type);
[... 16 lines not shown ...]

const char *getArchName() const { return ArchName; }		const char *getArchName() const { return ArchName; }

static bool classof(const Action *A) {		static bool classof(const Action *A) {
return A->getKind() == BindArchClass;		return A->getKind() == BindArchClass;
}		}
};		};

class CudaDeviceAction : public Action {		/// An offload action combines host or/and device actions according to the
		/// programming model implementation needs and propagates the offloading kind to
		/// its dependences.
		class OffloadAction final : public Action {
virtual void anchor();		virtual void anchor();
/// GPU architecture to bind. Always of the form /sm_\d+/ or null (when the
/// action applies to multiple architectures).
const char *GpuArchName;
/// True when action results are not consumed by the host action (e.g when
/// -fsyntax-only or --cuda-device-only options are used).
bool AtTopLevel;

public:		public:
CudaDeviceAction(Action *Input, const char *ArchName, bool AtTopLevel);		/// Type used to communicate device actions. It associates bound architecture,
		/// toolchain, and offload kind to each action.
		class DeviceDependences final {
		public:
		typedef SmallVector<const ToolChain *, 3> ToolChainList;
		typedef SmallVector<const char *, 3> BoundArchList;
		typedef SmallVector<OffloadKind, 3> OffloadKindList;

const char *getGpuArchName() const { return GpuArchName; }		private:
		// Lists that keep the information for each dependency. All the lists are
		// meant to be updated in sync. We are adopting separate lists instead of a
		// list of structs, because that simplifies forwarding the actions list to
		// initialize the inputs of the base Action class.

/// Gets the compute_XX that corresponds to getGpuArchName(). Returns null		/// The dependence actions.
/// when getGpuArchName() is null.		ActionList DeviceActions;
const char *getComputeArchName() const;		/// The offloading toolchains that should be used with the action.
		ToolChainList DeviceToolChains;
		/// The architectures that should be used with this action.
		BoundArchList DeviceBoundArchs;
		/// The offload kind of each dependence.
		OffloadKindList DeviceOffloadKinds;

		public:
		/// Add an action along with the associated toolchain, bound arch, and
		/// offload kind.
		void add(Action &A, const ToolChain &TC, const char *BoundArch,
		OffloadKind OKind);

		/// Get each of the individual arrays.
		const ActionList &getActions() const { return DeviceActions; };
		const ToolChainList &getToolChains() const { return DeviceToolChains; };
		const BoundArchList &getBoundArchs() const { return DeviceBoundArchs; };
		const OffloadKindList &getOffloadKinds() const {
		return DeviceOffloadKinds;
		};
		};

bool isAtTopLevel() const { return AtTopLevel; }		/// Type used to communicate host actions. It associates bound architecture,
		/// toolchain, and offload kinds to the host action.
		class HostDependence final {
		ABataev: 'final'
		/// The dependence action.
		Action &HostAction;
		/// The offloading toolchain that should be used with the action.
		const ToolChain &HostToolChain;
		/// The architectures that should be used with this action.
		const char *HostBoundArch = nullptr;
		/// The offload kind of each dependence.
		unsigned HostOffloadKinds = 0u;
		ABataev: default initializers

		public:
		HostDependence(Action &A, const ToolChain &TC, const char *BoundArch,
		const unsigned OffloadKinds)
		: HostAction(A), HostToolChain(TC), HostBoundArch(BoundArch),
		HostOffloadKinds(OffloadKinds){};
		/// Constructor version that obtains the offload kinds from the device
		/// dependencies.
		HostDependence(Action &A, const ToolChain &TC, const char *BoundArch,
		const DeviceDependences &DDeps);
		Action *getAction() const { return &HostAction; };
		const ToolChain *getToolChain() const { return &HostToolChain; };
		const char *getBoundArch() const { return HostBoundArch; };
		unsigned getOffloadKinds() const { return HostOffloadKinds; };
		};

static bool IsValidGpuArchName(llvm::StringRef ArchName);		typedef llvm::function_ref<void(Action *, const ToolChain *, const char *)>
		OffloadActionWorkTy;

static bool classof(const Action *A) {		private:
return A->getKind() == CudaDeviceClass;		/// The host offloading toolchain that should be used with the action.
}		const ToolChain *HostTC = nullptr;
		ABataev: Default initializer
};

class CudaHostAction : public Action {		/// The tool chains associated with the list of actions.
virtual void anchor();		DeviceDependences::ToolChainList DevToolChains;
ActionList DeviceActions;

public:		public:
CudaHostAction(Action *Input, const ActionList &DeviceActions);		OffloadAction(const HostDependence &HDep);
		OffloadAction(const DeviceDependences &DDeps, types::ID Ty);
		OffloadAction(const HostDependence &HDep, const DeviceDependences &DDeps);

		/// Execute the work specified in \a Work on the host dependence.
		void doOnHostDependence(const OffloadActionWorkTy &Work) const;

		/// Execute the work specified in \a Work on each device dependence.
		void doOnEachDeviceDependence(const OffloadActionWorkTy &Work) const;

		/// Execute the work specified in \a Work on each dependence.
		void doOnEachDependence(const OffloadActionWorkTy &Work) const;

		/// Execute the work specified in \a Work on each host or device dependence if
		/// \a IsHostDependence is true or false, respectively.
		void doOnEachDependence(bool IsHostDependence,
		const OffloadActionWorkTy &Work) const;
		tra: Are these independent or is it expected that all four are populated/modified in sync? If they all need to be updated at the same time (which is what the code below does), it should be documented. Perhaps these arrays could be converted into a vector of structs.
		sfantao: Yes, they are meant to be populated in sync. I'll document that in the comment as you suggest. The reason I am not using an array of structs is that it is more convenient to have the action list separate to forward it to initialize the base class.

		/// Return true if the action has a host dependence.
		bool hasHostDependence() const;

		/// Return the host dependence of this action. This function is only expected
		tra: Can any of the parameters be nullptr?
		sfantao: Only BoundArch can be null. I am changing the signature of `add()` to use reference types for the arguments that must not be null.
		/// to be called if the host dependence exists.
		Action *getHostDependence() const;

		/// Return true if the action has a single device dependence. If \a
		/// DoNotConsiderHostActions is set, ignore the host dependence, if any, while
		/// accounting for the number of dependences.
		bool hasSingleDeviceDependence(bool DoNotConsiderHostActions = false) const;

const ActionList &getDeviceActions() const { return DeviceActions; }		/// Return the single device dependence of this action. This function is only
		/// expected to be called if a single device dependence exists. If \a
		/// DoNotConsiderHostActions is set, a host dependence is allowed.
		Action *
		getSingleDeviceDependence(bool DoNotConsiderHostActions = false) const;

static bool classof(const Action *A) { return A->getKind() == CudaHostClass; }		static bool classof(const Action *A) { return A->getKind() == OffloadClass; }
};		};

class JobAction : public Action {		class JobAction : public Action {
virtual void anchor();		virtual void anchor();
protected:		protected:
		tra: Cosmetic nit: Naming is somewhat inconsistent. Device dependencies use acronyms, but HostDependence uses a mix of acronyms and words. I'd use camelcase words in both.
		sfantao: Ok, I am using camelcase in both now. I also noticed I was not using the typedef types in the private members of the dependencies. I fixed that as well.
JobAction(ActionClass Kind, Action *Input, types::ID Type);		JobAction(ActionClass Kind, Action *Input, types::ID Type);
JobAction(ActionClass Kind, const ActionList &Inputs, types::ID Type);		JobAction(ActionClass Kind, const ActionList &Inputs, types::ID Type);

public:		public:
static bool classof(const Action *A) {		static bool classof(const Action *A) {
return (A->getKind() >= JobClassFirst &&		return (A->getKind() >= JobClassFirst &&
A->getKind() <= JobClassLast);		A->getKind() <= JobClassLast);
}		}
▲ Show 20 Lines • Show All 134 Lines • Show Last 20 Lines
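
The design point discussed in the inline comments above (separate lists populated in sync, rather than one list of structs) can be illustrated with a minimal sketch. The types below are hypothetical stand-ins built on `std::vector`, not the real clang `Action`, `ToolChain`, or `ActionList`; `BaseAction` and `inSync()` are names invented for this example. The sketch only shows why a single `add()` keeps the lists aligned and why a separate action list is easy to forward to a base-class constructor.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins for the driver types; not the real clang classes.
struct Action { std::string Name; };
struct ToolChain { std::string Triple; };
enum OffloadKind { OFK_None = 0, OFK_Host = 1, OFK_Cuda = 2 };

class DeviceDependences {
  // Parallel lists: entry i of each list describes the same dependence.
  std::vector<Action *> Actions;
  std::vector<const ToolChain *> ToolChains;
  std::vector<const char *> BoundArchs;
  std::vector<OffloadKind> Kinds;

public:
  // The only mutator, so the four lists cannot drift out of sync.
  // Only BoundArch may be null, hence references for the other arguments.
  void add(Action &A, const ToolChain &TC, const char *BoundArch,
           OffloadKind Kind) {
    Actions.push_back(&A);
    ToolChains.push_back(&TC);
    BoundArchs.push_back(BoundArch);
    Kinds.push_back(Kind);
  }

  const std::vector<Action *> &getActions() const { return Actions; }
  const std::vector<const ToolChain *> &getToolChains() const {
    return ToolChains;
  }
  const std::vector<OffloadKind> &getOffloadKinds() const { return Kinds; }

  // Sanity check used by this sketch only.
  bool inSync() const {
    std::size_t N = Actions.size();
    return ToolChains.size() == N && BoundArchs.size() == N &&
           Kinds.size() == N;
  }
};

// A base class that takes the list of inputs directly. Keeping the actions
// in their own list means it can be forwarded here without repacking.
struct BaseAction {
  std::vector<Action *> Inputs;
  explicit BaseAction(const std::vector<Action *> &In) : Inputs(In) {}
};
```

With a list of structs, the constructor would first have to extract the action pointers into a temporary list before handing them to the base class; that is the convenience the author refers to.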

include/clang/Driver/Compilation.h

Show First 20 Lines • Show All 92 Lines • ▼ Show 20 Lines	public:
Compilation(const Driver &D, const ToolChain &DefaultToolChain,		Compilation(const Driver &D, const ToolChain &DefaultToolChain,
llvm::opt::InputArgList *Args,		llvm::opt::InputArgList *Args,
llvm::opt::DerivedArgList *TranslatedArgs);		llvm::opt::DerivedArgList *TranslatedArgs);
~Compilation();		~Compilation();

const Driver &getDriver() const { return TheDriver; }		const Driver &getDriver() const { return TheDriver; }

const ToolChain &getDefaultToolChain() const { return DefaultToolChain; }		const ToolChain &getDefaultToolChain() const { return DefaultToolChain; }
const ToolChain *getOffloadingHostToolChain() const {
auto It = OrderedOffloadingToolchains.find(Action::OFK_Host);
if (It != OrderedOffloadingToolchains.end())
return It->second;
return nullptr;
}
unsigned isOffloadingHostKind(Action::OffloadKind Kind) const {		unsigned isOffloadingHostKind(Action::OffloadKind Kind) const {
return ActiveOffloadMask & Kind;		return ActiveOffloadMask & Kind;
}		}

/// Iterator that visits device toolchains of a given kind.		/// Iterator that visits device toolchains of a given kind.
typedef const std::multimap<Action::OffloadKind,		typedef const std::multimap<Action::OffloadKind,
const ToolChain *>::const_iterator		const ToolChain *>::const_iterator
const_offload_toolchains_iterator;		const_offload_toolchains_iterator;
typedef std::pair<const_offload_toolchains_iterator,		typedef std::pair<const_offload_toolchains_iterator,
const_offload_toolchains_iterator>		const_offload_toolchains_iterator>
const_offload_toolchains_range;		const_offload_toolchains_range;

template <Action::OffloadKind Kind>		template <Action::OffloadKind Kind>
const_offload_toolchains_range getOffloadToolChains() const {		const_offload_toolchains_range getOffloadToolChains() const {
return OrderedOffloadingToolchains.equal_range(Kind);		return OrderedOffloadingToolchains.equal_range(Kind);
}		}

// Return an offload toolchain of the provided kind. Only one is expected to		/// Return an offload toolchain of the provided kind. Only one is expected to
// exist.		/// exist.
template <Action::OffloadKind Kind>		template <Action::OffloadKind Kind>
const ToolChain *getSingleOffloadToolChain() const {		const ToolChain *getSingleOffloadToolChain() const {
auto TCs = getOffloadToolChains<Kind>();		auto TCs = getOffloadToolChains<Kind>();

assert(TCs.first != TCs.second &&		assert(TCs.first != TCs.second &&
"No tool chains of the selected kind exist!");		"No tool chains of the selected kind exist!");
assert(std::next(TCs.first) == TCs.second &&		assert(std::next(TCs.first) == TCs.second &&
"More than one tool chain of this kind exists.");		"More than one tool chain of this kind exists.");
▲ Show 20 Lines • Show All 136 Lines • Show Last 20 Lines
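
The toolchain lookup in this header relies on `std::multimap::equal_range`. A minimal sketch of the same idea follows, with strings standing in for `const ToolChain *` and a hypothetical helper name `getSingleToolChain`. Unlike the code in the diff, which asserts, this sketch returns null when zero or several toolchains are registered; that difference is a simplification made here so the behaviour is easy to observe.

```cpp
#include <cassert>
#include <iterator>
#include <map>
#include <string>

enum OffloadKind { OFK_None = 0, OFK_Host = 1, OFK_Cuda = 2 };

// Strings stand in for toolchain pointers in this sketch.
using ToolChainMap = std::multimap<OffloadKind, std::string>;

// Returns the only toolchain registered for Kind, or nullptr if there is
// none or more than one.
const std::string *getSingleToolChain(const ToolChainMap &M,
                                      OffloadKind Kind) {
  auto Range = M.equal_range(Kind);
  if (Range.first == Range.second)
    return nullptr; // No toolchain of this kind.
  if (std::next(Range.first) != Range.second)
    return nullptr; // More than one toolchain of this kind.
  return &Range.first->second;
}
```

Treating the host as just another key in the same map is what lets the patch drop the dedicated `getOffloadingHostToolChain()` accessor in favour of `getSingleOffloadToolChain<Action::OFK_Host>()`.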

include/clang/Driver/Driver.h

Show First 20 Lines • Show All 388 Lines • ▼ Show 20 Lines	public:
/// \p Phase on the \p Input, taking in to account arguments		/// \p Phase on the \p Input, taking in to account arguments
/// like -fsyntax-only or --analyze.		/// like -fsyntax-only or --analyze.
Action *ConstructPhaseAction(Compilation &C, const llvm::opt::ArgList &Args,		Action *ConstructPhaseAction(Compilation &C, const llvm::opt::ArgList &Args,
phases::ID Phase, Action *Input) const;		phases::ID Phase, Action *Input) const;

/// BuildJobsForAction - Construct the jobs to perform for the action \p A and		/// BuildJobsForAction - Construct the jobs to perform for the action \p A and
/// return an InputInfo for the result of running \p A. Will only construct		/// return an InputInfo for the result of running \p A. Will only construct
/// jobs for a given (Action, ToolChain, BoundArch) tuple once.		/// jobs for a given (Action, ToolChain, BoundArch) tuple once.
InputInfo BuildJobsForAction(Compilation &C, const Action *A,		InputInfo
const ToolChain *TC, const char *BoundArch,		BuildJobsForAction(Compilation &C, const Action *A, const ToolChain *TC,
bool AtTopLevel, bool MultipleArchs,		const char *BoundArch, bool AtTopLevel, bool MultipleArchs,
const char *LinkingOutput,		const char *LinkingOutput,
std::map<std::pair<const Action *, std::string>,		std::map<std::pair<const Action *, std::string>, InputInfo>
InputInfo> &CachedResults) const;		&CachedResults,
		bool BuildForOffloadDevice) const;

/// Returns the default name for linked images (e.g., "a.out").		/// Returns the default name for linked images (e.g., "a.out").
const char *getDefaultImageName() const;		const char *getDefaultImageName() const;

/// GetNamedOutputPath - Return the name to use for the output of		/// GetNamedOutputPath - Return the name to use for the output of
/// the action \p JA. The result is appended to the compilation's		/// the action \p JA. The result is appended to the compilation's
/// list of temporary or result files, as appropriate.		/// list of temporary or result files, as appropriate.
///		///
/// \param C - The compilation.		/// \param C - The compilation.
/// \param JA - The action of interest.		/// \param JA - The action of interest.
/// \param BaseInput - The original input file that this action was		/// \param BaseInput - The original input file that this action was
/// triggered by.		/// triggered by.
/// \param BoundArch - The bound architecture.		/// \param BoundArch - The bound architecture.
/// \param AtTopLevel - Whether this is a "top-level" action.		/// \param AtTopLevel - Whether this is a "top-level" action.
/// \param MultipleArchs - Whether multiple -arch options were supplied.		/// \param MultipleArchs - Whether multiple -arch options were supplied.
const char *GetNamedOutputPath(Compilation &C,		/// \param NormalizedTriple - The normalized triple of the relevant target.
const JobAction &JA,		const char *GetNamedOutputPath(Compilation &C, const JobAction &JA,
const char *BaseInput,		const char *BaseInput, const char *BoundArch,
const char *BoundArch,		bool AtTopLevel, bool MultipleArchs,
bool AtTopLevel,		StringRef NormalizedTriple) const;
		tra: TC is only used to get the triple to construct the file name prefix. Perhaps just pass that string explicitly.
		sfantao: Ok, I am just passing the string now.
bool MultipleArchs) const;

/// GetTemporaryPath - Return the pathname of a temporary file to use		/// GetTemporaryPath - Return the pathname of a temporary file to use
/// as part of compilation; the file will have the given prefix and suffix.		/// as part of compilation; the file will have the given prefix and suffix.
///		///
/// GCC goes to extra lengths here to be a bit more robust.		/// GCC goes to extra lengths here to be a bit more robust.
std::string GetTemporaryPath(StringRef Prefix, const char *Suffix) const;		std::string GetTemporaryPath(StringRef Prefix, const char *Suffix) const;

/// Return the pathname of the pch file in clang-cl mode.		/// Return the pathname of the pch file in clang-cl mode.
Show All 30 Lines	private:
/// Helper used in BuildJobsForAction. Doesn't use the cache when building		/// Helper used in BuildJobsForAction. Doesn't use the cache when building
/// jobs specifically for the given action, but will use the cache when		/// jobs specifically for the given action, but will use the cache when
/// building jobs for the Action's inputs.		/// building jobs for the Action's inputs.
InputInfo BuildJobsForActionNoCache(		InputInfo BuildJobsForActionNoCache(
Compilation &C, const Action *A, const ToolChain *TC,		Compilation &C, const Action *A, const ToolChain *TC,
const char *BoundArch, bool AtTopLevel, bool MultipleArchs,		const char *BoundArch, bool AtTopLevel, bool MultipleArchs,
const char *LinkingOutput,		const char *LinkingOutput,
std::map<std::pair<const Action *, std::string>, InputInfo>		std::map<std::pair<const Action *, std::string>, InputInfo>
&CachedResults) const;		&CachedResults,
		bool BuildForOffloadDevice) const;

public:		public:
/// GetReleaseVersion - Parse (([0-9]+)(.([0-9]+)(.([0-9]+)?))?)? and		/// GetReleaseVersion - Parse (([0-9]+)(.([0-9]+)(.([0-9]+)?))?)? and
/// return the grouped values as integers. Numbers which are not		/// return the grouped values as integers. Numbers which are not
/// provided are set to 0.		/// provided are set to 0.
///		///
/// \return True if the entire string was parsed (9.2), or all		/// \return True if the entire string was parsed (9.2), or all
/// groups were parsed (10.3.5extrastuff). HadExtra is true if all		/// groups were parsed (10.3.5extrastuff). HadExtra is true if all
Show All 23 Lines
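
The comment on `BuildJobsForAction` says jobs are constructed only once per (Action, ToolChain, BoundArch) tuple, and the cache is keyed on a pair of action pointer and string. A minimal sketch of that memoization pattern is shown below. How the string key is formed is an assumption of this sketch (triple, then `-` and the bound arch when present); the excerpt above does not show it. `buildJobs`, `buildNoCache`, and `BuildCount` are hypothetical names.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-ins for the driver types.
struct Action { int Id; };
struct InputInfo { std::string Filename; };

using Cache = std::map<std::pair<const Action *, std::string>, InputInfo>;

int BuildCount = 0; // Counts how often the uncached path runs.

// Stand-in for the expensive, uncached job construction.
InputInfo buildNoCache(const Action *A, const std::string &Key) {
  ++BuildCount;
  return InputInfo{"out-" + std::to_string(A->Id) + "-" + Key};
}

// Cached wrapper. The key layout (triple plus optional arch) is an
// assumption of this sketch, not taken from the diff.
InputInfo buildJobs(const Action *A, const std::string &Triple,
                    const char *BoundArch, Cache &Cached) {
  std::string Key = Triple;
  if (BoundArch) {
    Key += "-";
    Key += BoundArch;
  }
  auto It = Cached.find({A, Key});
  if (It != Cached.end())
    return It->second;
  InputInfo Result = buildNoCache(A, Key);
  Cached.emplace(std::make_pair(A, Key), Result);
  return Result;
}
```

Splitting the work into a cached wrapper and a `NoCache` helper mirrors the two declarations in the header: the helper skips the cache for the action itself but still benefits from it when building the action's inputs.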

lib/Driver/Action.cpp

	//===--- Action.cpp - Abstract compilation steps --------------------------===//			//===--- Action.cpp - Abstract compilation steps --------------------------===//
	//			//
	// The LLVM Compiler Infrastructure			// The LLVM Compiler Infrastructure
	//			//
	// This file is distributed under the University of Illinois Open Source			// This file is distributed under the University of Illinois Open Source
	// License. See LICENSE.TXT for details.			// License. See LICENSE.TXT for details.
	//			//
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	#include "clang/Driver/Action.h"			#include "clang/Driver/Action.h"
				#include "clang/Driver/ToolChain.h"
	#include "llvm/ADT/StringSwitch.h"			#include "llvm/ADT/StringSwitch.h"
	#include "llvm/Support/ErrorHandling.h"			#include "llvm/Support/ErrorHandling.h"
	#include "llvm/Support/Regex.h"			#include "llvm/Support/Regex.h"
	#include <cassert>			#include <cassert>
	using namespace clang::driver;			using namespace clang::driver;
	using namespace llvm::opt;			using namespace llvm::opt;

	Action::~Action() {}			Action::~Action() {}

	const char *Action::getClassName(ActionClass AC) {			const char *Action::getClassName(ActionClass AC) {
	switch (AC) {			switch (AC) {
	case InputClass: return "input";			case InputClass: return "input";
	case BindArchClass: return "bind-arch";			case BindArchClass: return "bind-arch";
	case CudaDeviceClass: return "cuda-device";			case OffloadClass:
	case CudaHostClass: return "cuda-host";			return "offload";
	case PreprocessJobClass: return "preprocessor";			case PreprocessJobClass: return "preprocessor";
	case PrecompileJobClass: return "precompiler";			case PrecompileJobClass: return "precompiler";
	case AnalyzeJobClass: return "analyzer";			case AnalyzeJobClass: return "analyzer";
	case MigrateJobClass: return "migrator";			case MigrateJobClass: return "migrator";
	case CompileJobClass: return "compiler";			case CompileJobClass: return "compiler";
	case BackendJobClass: return "backend";			case BackendJobClass: return "backend";
	case AssembleJobClass: return "assembler";			case AssembleJobClass: return "assembler";
	case LinkJobClass: return "linker";			case LinkJobClass: return "linker";
	case LipoJobClass: return "lipo";			case LipoJobClass: return "lipo";
	case DsymutilJobClass: return "dsymutil";			case DsymutilJobClass: return "dsymutil";
	case VerifyDebugInfoJobClass: return "verify-debug-info";			case VerifyDebugInfoJobClass: return "verify-debug-info";
	case VerifyPCHJobClass: return "verify-pch";			case VerifyPCHJobClass: return "verify-pch";
	}			}

	llvm_unreachable("invalid class");			llvm_unreachable("invalid class");
	}			}

	void InputAction::anchor() {}			void Action::propagateDeviceOffloadInfo(OffloadKind OKind, const char *OArch) {
				// Offload actions set their own kinds on their dependences.
				tra: Given that these functions change state they probably should not be const.
				sfantao: I fixed that.
				if (Kind == OffloadClass)
				return;

	InputAction::InputAction(const Arg &_Input, types::ID _Type)			assert((OffloadingDeviceKind == OKind || OffloadingDeviceKind == OFK_None) &&
	: Action(InputClass, _Type), Input(_Input) {			"Setting device kind to a different device??");
				assert(!ActiveOffloadKindMask && "Setting a device kind in a host action??");
				OffloadingDeviceKind = OKind;
				OffloadingArch = OArch;

				for (auto *A : Inputs)
				A->propagateDeviceOffloadInfo(OffloadingDeviceKind, OArch);
	}			}

	void BindArchAction::anchor() {}			void Action::propagateHostOffloadInfo(unsigned OKinds, const char *OArch) {
				// Offload actions set their own kinds on their dependences.
				if (Kind == OffloadClass)
				return;

	BindArchAction::BindArchAction(Action *Input, const char *_ArchName)			assert(OffloadingDeviceKind == OFK_None &&
	: Action(BindArchClass, Input), ArchName(_ArchName) {}			"Setting a host kind in a device action.");
				ActiveOffloadKindMask |= OKinds;
				OffloadingArch = OArch;

	// Converts CUDA GPU architecture, e.g. "sm_21", to its corresponding virtual			for (auto *A : Inputs)
	// compute arch, e.g. "compute_20". Returns null if the input arch is null or			A->propagateHostOffloadInfo(ActiveOffloadKindMask, OArch);
	// doesn't match an existing arch.
	static const char* GpuArchToComputeName(const char *ArchName) {
	if (!ArchName)
	return nullptr;
	return llvm::StringSwitch<const char *>(ArchName)
	.Cases("sm_20", "sm_21", "compute_20")
	.Case("sm_30", "compute_30")
	.Case("sm_32", "compute_32")
	.Case("sm_35", "compute_35")
	.Case("sm_37", "compute_37")
	.Case("sm_50", "compute_50")
	.Case("sm_52", "compute_52")
	.Case("sm_53", "compute_53")
	.Default(nullptr);
	}			}

	void CudaDeviceAction::anchor() {}			void Action::propagateOffloadInfo(const Action *A) {
				if (unsigned HK = A->getOffloadingHostActiveKinds())
				propagateHostOffloadInfo(HK, A->getOffloadingArch());
				else
				propagateDeviceOffloadInfo(A->getOffloadingDeviceKind(),
				A->getOffloadingArch());
				}

				std::string Action::getOffloadingKindPrefix() const {
				switch (OffloadingDeviceKind) {
				case OFK_None:
				break;
				case OFK_Host:
				llvm_unreachable("Host kind is not an offloading device kind.");
				break;
				case OFK_Cuda:
				return "device-cuda";

	CudaDeviceAction::CudaDeviceAction(Action *Input, const char *ArchName,			// TODO: Add other programming models here.
	bool AtTopLevel)
	: Action(CudaDeviceClass, Input), GpuArchName(ArchName),
	AtTopLevel(AtTopLevel) {
	assert(!GpuArchName || IsValidGpuArchName(GpuArchName));
	}			}

	const char *CudaDeviceAction::getComputeArchName() const {			if (!ActiveOffloadKindMask)
	return GpuArchToComputeName(GpuArchName);			return "";

				std::string Res("host");
				if (ActiveOffloadKindMask & OFK_Cuda)
				Res += "-cuda";

				// TODO: Add other programming models here.

				return Res;
	}			}
				tra: There's no need for ToolChain here. A string is all it needs as an input.
				sfantao: Ok, passing the string with the normalized triple now.

	bool CudaDeviceAction::IsValidGpuArchName(llvm::StringRef ArchName) {			std::string
	return GpuArchToComputeName(ArchName.data()) != nullptr;			Action::getOffloadingFileNamePrefix(StringRef NormalizedTriple) const {
				// A file prefix is only generated for device actions and consists of the
				// offload kind and triple.
				if (!OffloadingDeviceKind)
				return "";

				std::string Res("-");
				Res += getOffloadingKindPrefix();
				Res += "-";
				Res += NormalizedTriple;
				return Res;
	}			}

	void CudaHostAction::anchor() {}			void InputAction::anchor() {}

				InputAction::InputAction(const Arg &_Input, types::ID _Type)
				: Action(InputClass, _Type), Input(_Input) {
				}

				void BindArchAction::anchor() {}

				BindArchAction::BindArchAction(Action *Input, const char *_ArchName)
				: Action(BindArchClass, Input), ArchName(_ArchName) {}

				void OffloadAction::anchor() {}

	CudaHostAction::CudaHostAction(Action *Input, const ActionList &DeviceActions)			OffloadAction::OffloadAction(const HostDependence &HDep)
	: Action(CudaHostClass, Input), DeviceActions(DeviceActions) {}			: Action(OffloadClass, HDep.getAction()), HostTC(HDep.getToolChain()) {
				OffloadingArch = HDep.getBoundArch();
				ActiveOffloadKindMask = HDep.getOffloadKinds();
				HDep.getAction()->propagateHostOffloadInfo(HDep.getOffloadKinds(),
				HDep.getBoundArch());
				};

				OffloadAction::OffloadAction(const DeviceDependences &DDeps, types::ID Ty)
				: Action(OffloadClass, DDeps.getActions(), Ty),
				DevToolChains(DDeps.getToolChains()) {
				auto &OKinds = DDeps.getOffloadKinds();
				auto &BArchs = DDeps.getBoundArchs();

				// If all inputs agree on the same kind, use it also for this action.
				if (llvm::all_of(OKinds, [&](OffloadKind K) { return K == OKinds.front(); }))
				OffloadingDeviceKind = OKinds.front();

				// If we have a single dependency, inherit the architecture from it.
				if (OKinds.size() == 1)
				OffloadingArch = BArchs.front();

				// Propagate info to the dependencies.
				for (unsigned i = 0; i < getInputs().size(); ++i)
				getInputs()[i]->propagateDeviceOffloadInfo(OKinds[i], BArchs[i]);
				}

				OffloadAction::OffloadAction(const HostDependence &HDep,
				const DeviceDependences &DDeps)
				: Action(OffloadClass, HDep.getAction()), HostTC(HDep.getToolChain()),
				DevToolChains(DDeps.getToolChains()) {
				// We use the kinds of the host dependence for this action.
				OffloadingArch = HDep.getBoundArch();
				ActiveOffloadKindMask = HDep.getOffloadKinds();
				HDep.getAction()->propagateHostOffloadInfo(HDep.getOffloadKinds(),
				HDep.getBoundArch());

				// Add device inputs and propagate info to the device actions.
				for (unsigned i = 0; i < DDeps.getActions().size(); ++i) {
				auto *A = DDeps.getActions()[i];
				// Skip actions of empty dependences.
				if (!A)
				continue;
				getInputs().push_back(A);
				A->propagateDeviceOffloadInfo(DDeps.getOffloadKinds()[i],
				DDeps.getBoundArchs()[i]);
				}
				}

				void OffloadAction::doOnHostDependence(const OffloadActionWorkTy &Work) const {
				if (!HostTC)
				return;
				auto *A = getInputs().front();
				tra: Minor style nit -- the LLVM coding standard (http://llvm.org/docs/CodingStandards.html#don-t-evaluate-end-every-time…) says: ... we strongly prefer loops to be written so that they evaluate it once before the loop starts. `for (unsigned i = 0, e = getInputs().size(); i != e; ++i)` Please check other for loops throughout the patch.
				Work(A, HostTC, A->getOffloadingArch());
				}

				void OffloadAction::doOnEachDeviceDependence(
				const OffloadActionWorkTy &Work) const {
				auto I = getInputs().begin();
				auto E = getInputs().end();
				if (I == E)
				return;

				// Skip host action
				if (HostTC)
				++I;

				auto TI = DevToolChains.begin();
				tra: It could be rephrased as "do work if we have dependencies" and make the code a bit more concise. `if (auto A = DDeps.getActions()[i]) { getInputs().push_back(A); A->propagate... }`
				for (; I != E; ++I, ++TI)
				Work(*I, *TI, (*I)->getOffloadingArch());
				}

				void OffloadAction::doOnEachDependence(const OffloadActionWorkTy &Work) const {
				doOnHostDependence(Work);
				tra: Please add an assert to verify that getInputs() is not empty. It may be worth doing throughout the patch, as there are several places where we index into the getInputs() result without verifying its size. It's not at all obvious from the code that it's always OK to do so.
				doOnEachDeviceDependence(Work);
				}

				void OffloadAction::doOnEachDependence(bool IsHostDependence,
				const OffloadActionWorkTy &Work) const {
				if (IsHostDependence)
				doOnHostDependence(Work);
				else
				doOnEachDeviceDependence(Work);
				}

				bool OffloadAction::hasHostDependence() const { return HostTC != nullptr; }

				Action *OffloadAction::getHostDependence() const {
				assert(hasHostDependence() && "Host dependence does not exist!");
				return HostTC ? getInputs().front() : nullptr;
				}
				tra: Returning nullptr if .size() != 1 looks strange. Perhaps there should be an assert.
				sfantao: This is used when actions get collapsed. In general we can have multiple device dependences, and if so we do not collapse. Therefore this member function was also working as a check. I have changed this to have a 'get' and a 'has' version, and use the assertion in the 'get' version.
				tra: You may want to add an assert that I and TI are both valid within the loop.
				sfantao: I added an assertion for `TI`. I didn't do that for `I` though, as it is the exit condition of the loop, so it will always be valid. Let me know if you still want me to add that.
				tra: I don't see any changes in this function in your latest patch. Did you add that assert somewhere else? I'm not worried about the validity of `I`, which is indeed ensured by the loop, but rather want to verify that the number of inputs we process and the number of elements in DevToolChains match. While running out of TI elements early would be obviously wrong, I assume that exiting the loop with some remaining TI elements would also be unexpected and that we should assert() that it does not happen.
				sfantao: The only change in this function was the assertion inside the loop. It's possible my message got in before the actual diff, sorry about that... Ok, got it. I'm replacing the assertion in the loop body with an assertion that checks whether the sizes of the inputs and toolchains are consistent. Let me know if you'd rather have a check of the iterator inside and after the loop.

				bool OffloadAction::hasSingleDeviceDependence(
				bool DoNotConsiderHostActions) const {
				if (DoNotConsiderHostActions)
				return getInputs().size() == (HostTC ? 2 : 1);
				return !HostTC && getInputs().size() == 1;
				}

				Action *
				OffloadAction::getSingleDeviceDependence(bool DoNotConsiderHostActions) const {
				assert(hasSingleDeviceDependence(DoNotConsiderHostActions) &&
				"Single device dependence does not exist!");
				return HostTC ? getInputs()[1] : getInputs().front();
				}

				void OffloadAction::DeviceDependences::add(Action &A, const ToolChain &TC,
				const char *BoundArch,
				OffloadKind OKind) {
				DeviceActions.push_back(&A);
				DeviceToolChains.push_back(&TC);
				DeviceBoundArchs.push_back(BoundArch);
				DeviceOffloadKinds.push_back(OKind);
				}

				OffloadAction::HostDependence::HostDependence(Action &A, const ToolChain &TC,
				const char *BoundArch,
				const DeviceDependences &DDeps)
				: HostAction(A), HostToolChain(TC), HostBoundArch(BoundArch) {
				for (auto K : DDeps.getOffloadKinds())
				HostOffloadKinds |= K;
				}

	void JobAction::anchor() {}			void JobAction::anchor() {}

	JobAction::JobAction(ActionClass Kind, Action *Input, types::ID Type)			JobAction::JobAction(ActionClass Kind, Action *Input, types::ID Type)
	: Action(Kind, Input, Type) {}			: Action(Kind, Input, Type) {}

	JobAction::JobAction(ActionClass Kind, const ActionList &Inputs, types::ID Type)			JobAction::JobAction(ActionClass Kind, const ActionList &Inputs, types::ID Type)
	: Action(Kind, Inputs, Type) {			: Action(Kind, Inputs, Type) {
	[74 lines not shown]

lib/Driver/Driver.cpp

[428 lines not shown]	void Driver::CreateOffloadingDeviceToolChains(Compilation &C,
// CUDA		// CUDA
//		//
// We need to generate a CUDA toolchain if any of the inputs has a CUDA type.		// We need to generate a CUDA toolchain if any of the inputs has a CUDA type.
if (llvm::any_of(Inputs, [](std::pair<types::ID, const llvm::opt::Arg *> &I) {		if (llvm::any_of(Inputs, [](std::pair<types::ID, const llvm::opt::Arg *> &I) {
return types::isCuda(I.first);		return types::isCuda(I.first);
})) {		})) {
const ToolChain &TC = getToolChain(		const ToolChain &TC = getToolChain(
C.getInputArgs(),		C.getInputArgs(),
llvm::Triple(C.getOffloadingHostToolChain()->getTriple().isArch64Bit()		llvm::Triple(C.getSingleOffloadToolChain<Action::OFK_Host>()
		->getTriple()
		.isArch64Bit()
? "nvptx64-nvidia-cuda"		? "nvptx64-nvidia-cuda"
: "nvptx-nvidia-cuda"));		: "nvptx-nvidia-cuda"));
C.addOffloadDeviceToolChain(&TC, Action::OFK_Cuda);		C.addOffloadDeviceToolChain(&TC, Action::OFK_Cuda);
}		}

//		//
// TODO: Add support for other offloading programming models here.		// TODO: Add support for other offloading programming models here.
//		//
[570 lines not shown]	static unsigned PrintActions1(const Compilation &C, Action *A,
llvm::raw_string_ostream os(str);		llvm::raw_string_ostream os(str);

os << Action::getClassName(A->getKind()) << ", ";		os << Action::getClassName(A->getKind()) << ", ";
if (InputAction *IA = dyn_cast<InputAction>(A)) {		if (InputAction *IA = dyn_cast<InputAction>(A)) {
os << "\"" << IA->getInputArg().getValue() << "\"";		os << "\"" << IA->getInputArg().getValue() << "\"";
} else if (BindArchAction *BIA = dyn_cast<BindArchAction>(A)) {		} else if (BindArchAction *BIA = dyn_cast<BindArchAction>(A)) {
os << '"' << BIA->getArchName() << '"' << ", {"		os << '"' << BIA->getArchName() << '"' << ", {"
<< PrintActions1(C, *BIA->input_begin(), Ids) << "}";		<< PrintActions1(C, *BIA->input_begin(), Ids) << "}";
} else if (CudaDeviceAction *CDA = dyn_cast<CudaDeviceAction>(A)) {		} else if (OffloadAction *OA = dyn_cast<OffloadAction>(A)) {
os << '"'		bool IsFirst = true;
<< (CDA->getGpuArchName() ? CDA->getGpuArchName() : "(multiple archs)")		OA->doOnEachDependence(
<< '"' << ", {" << PrintActions1(C, *CDA->input_begin(), Ids) << "}";		[&](Action *A, const ToolChain *TC, const char *BoundArch) {
		// E.g. for two CUDA device dependences whose bound arch is sm_20 and
		// sm_35 this will generate:
		// "cuda-device" (nvptx64-nvidia-cuda:sm_20) {#ID}, "cuda-device"
		// (nvptx64-nvidia-cuda:sm_35) {#ID}
		if (!IsFirst)
		os << ", ";
		os << '"';
		if (TC)
		os << A->getOffloadingKindPrefix();
		else
		os << "host";
		os << " (";
		os << TC->getTriple().normalize();

		if (BoundArch)
		os << ":" << BoundArch;
		os << ")";
		os << '"';
		os << " {" << PrintActions1(C, A, Ids) << "}";
		IsFirst = false;
		});
} else {		} else {
const ActionList *AL;		const ActionList *AL = &A->getInputs();
if (CudaHostAction *CHA = dyn_cast<CudaHostAction>(A)) {
os << "{" << PrintActions1(C, *CHA->input_begin(), Ids) << "}"
<< ", gpu binaries ";
AL = &CHA->getDeviceActions();
} else
AL = &A->getInputs();

if (AL->size()) {		if (AL->size()) {
const char *Prefix = "{";		const char *Prefix = "{";
for (Action *PreRequisite : *AL) {		for (Action *PreRequisite : *AL) {
os << Prefix << PrintActions1(C, PreRequisite, Ids);		os << Prefix << PrintActions1(C, PreRequisite, Ids);
Prefix = ", ";		Prefix = ", ";
}		}
os << "}";		os << "}";
} else		} else
os << "{}";		os << "{}";
}		}

		// Append offload info for all options other than the offloading action
		// itself (e.g. (cuda-device, sm_20) or (cuda-host)).
		std::string offload_str;
		llvm::raw_string_ostream offload_os(offload_str);
		if (!isa<OffloadAction>(A)) {
		auto S = A->getOffloadingKindPrefix();
		if (!S.empty()) {
		offload_os << ", (" << S;
		if (A->getOffloadingArch())
		offload_os << ", " << A->getOffloadingArch();
		offload_os << ")";
		}
		}

unsigned Id = Ids.size();		unsigned Id = Ids.size();
Ids[A] = Id;		Ids[A] = Id;
llvm::errs() << Id << ": " << os.str() << ", "		llvm::errs() << Id << ": " << os.str() << ", "
<< types::getTypeName(A->getType()) << "\n";		<< types::getTypeName(A->getType()) << offload_os.str() << "\n";

return Id;		return Id;
}		}

// Print the action graphs in a compilation C.		// Print the action graphs in a compilation C.
// For example "clang -c file1.c file2.c" is composed of two subgraphs.		// For example "clang -c file1.c file2.c" is composed of two subgraphs.
void Driver::PrintActions(const Compilation &C) const {		void Driver::PrintActions(const Compilation &C) const {
std::map<Action *, unsigned> Ids;		std::map<Action *, unsigned> Ids;
[311 lines not shown]	Arg *PartialCompilationArg = Args.getLastArg(
options::OPT_cuda_compile_host_device);		options::OPT_cuda_compile_host_device);
bool CompileHostOnly =		bool CompileHostOnly =
PartialCompilationArg &&		PartialCompilationArg &&
PartialCompilationArg->getOption().matches(options::OPT_cuda_host_only);		PartialCompilationArg->getOption().matches(options::OPT_cuda_host_only);
bool CompileDeviceOnly =		bool CompileDeviceOnly =
PartialCompilationArg &&		PartialCompilationArg &&
PartialCompilationArg->getOption().matches(options::OPT_cuda_device_only);		PartialCompilationArg->getOption().matches(options::OPT_cuda_device_only);

if (CompileHostOnly)		if (CompileHostOnly) {
return C.MakeAction<CudaHostAction>(HostAction, ActionList());		OffloadAction::HostDependence HDep(
		*HostAction, *C.getSingleOffloadToolChain<Action::OFK_Host>(),
		/*BoundArch=*/nullptr, Action::OFK_Cuda);
		return C.MakeAction<OffloadAction>(HDep);
		}

// Collect all cuda_gpu_arch parameters, removing duplicates.		// Collect all cuda_gpu_arch parameters, removing duplicates.
SmallVector<const char *, 4> GpuArchList;		SmallVector<const char *, 4> GpuArchList;
llvm::StringSet<> GpuArchNames;		llvm::StringSet<> GpuArchNames;
for (Arg *A : Args) {		for (Arg *A : Args) {
if (!A->getOption().matches(options::OPT_cuda_gpu_arch_EQ))		if (!A->getOption().matches(options::OPT_cuda_gpu_arch_EQ))
continue;		continue;
A->claim();		A->claim();

const auto& Arch = A->getValue();		const auto& Arch = A->getValue();
if (!CudaDeviceAction::IsValidGpuArchName(Arch))		if (!toolchains::CudaToolChain::GpuArchToComputeName(Arch))
C.getDriver().Diag(clang::diag::err_drv_cuda_bad_gpu_arch) << Arch;		C.getDriver().Diag(clang::diag::err_drv_cuda_bad_gpu_arch) << Arch;
else if (GpuArchNames.insert(Arch).second)		else if (GpuArchNames.insert(Arch).second)
GpuArchList.push_back(Arch);		GpuArchList.push_back(Arch);
}		}

// Default to sm_20 which is the lowest common denominator for supported GPUs.		// Default to sm_20 which is the lowest common denominator for supported GPUs.
// sm_20 code should work correctly, if suboptimally, on all newer GPUs.		// sm_20 code should work correctly, if suboptimally, on all newer GPUs.
if (GpuArchList.empty())		if (GpuArchList.empty())
GpuArchList.push_back("sm_20");		GpuArchList.push_back("sm_20");
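The collection step above keeps the first occurrence of each requested architecture, preserves the order, and falls back to sm_20 when none is given. A minimal standalone sketch of that logic follows. It uses standard containers in place of the LLVM ones and leaves out the validity check and the diagnostic. The function name `collectGpuArchs` is hypothetical.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Deduplicate requested GPU architectures while preserving order, and
// default to "sm_20" when the request is empty. Validation is omitted.
std::vector<std::string>
collectGpuArchs(const std::vector<std::string> &Requested) {
  std::vector<std::string> List;
  std::set<std::string> Seen;
  for (const std::string &Arch : Requested)
    if (Seen.insert(Arch).second) // true only for the first occurrence
      List.push_back(Arch);
  if (List.empty())
    List.push_back("sm_20");
  return List;
}
```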

// Replicate inputs for each GPU architecture.		// Replicate inputs for each GPU architecture.
Driver::InputList CudaDeviceInputs;		Driver::InputList CudaDeviceInputs;
for (unsigned I = 0, E = GpuArchList.size(); I != E; ++I)		for (unsigned I = 0, E = GpuArchList.size(); I != E; ++I)
CudaDeviceInputs.push_back(std::make_pair(types::TY_CUDA_DEVICE, InputArg));		CudaDeviceInputs.push_back(std::make_pair(types::TY_CUDA_DEVICE, InputArg));

// Build actions for all device inputs.		// Build actions for all device inputs.
assert(C.getSingleOffloadToolChain<Action::OFK_Cuda>() &&
"Missing toolchain for device-side compilation.");
ActionList CudaDeviceActions;		ActionList CudaDeviceActions;
C.getDriver().BuildActions(C, Args, CudaDeviceInputs, CudaDeviceActions);		C.getDriver().BuildActions(C, Args, CudaDeviceInputs, CudaDeviceActions);
assert(GpuArchList.size() == CudaDeviceActions.size() &&		assert(GpuArchList.size() == CudaDeviceActions.size() &&
"Failed to create actions for all devices");		"Failed to create actions for all devices");
		tra [Done]: Perhaps this should be moved down closer to where it's used. Perhaps even inside of if(PartialCompilation ...)
		sfantao (Author) [Not Done]: I moved it right before the if(Partial...) statement, because it is also used after that.

// Check whether any of device actions stopped before they could generate PTX.		// Check whether any of device actions stopped before they could generate PTX.
bool PartialCompilation =		bool PartialCompilation =
llvm::any_of(CudaDeviceActions, [](const Action *a) {		llvm::any_of(CudaDeviceActions, [](const Action *a) {
return a->getKind() != Action::AssembleJobClass;		return a->getKind() != Action::AssembleJobClass;
});		});

		const ToolChain *CudaTC = C.getSingleOffloadToolChain<Action::OFK_Cuda>();

// Figure out what to do with device actions -- pass them as inputs to the		// Figure out what to do with device actions -- pass them as inputs to the
// host action or run each of them independently.		// host action or run each of them independently.
if (PartialCompilation || CompileDeviceOnly) {		if (PartialCompilation || CompileDeviceOnly) {
// In case of partial or device-only compilation results of device actions		// In case of partial or device-only compilation results of device actions
// are not consumed by the host action device actions have to be added to		// are not consumed by the host action device actions have to be added to
// top-level actions list with AtTopLevel=true and run independently.		// top-level actions list with AtTopLevel=true and run independently.

// -o is ambiguous if we have more than one top-level action.		// -o is ambiguous if we have more than one top-level action.
if (Args.hasArg(options::OPT_o) &&		if (Args.hasArg(options::OPT_o) &&
(!CompileDeviceOnly || GpuArchList.size() > 1)) {		(!CompileDeviceOnly || GpuArchList.size() > 1)) {
C.getDriver().Diag(		C.getDriver().Diag(
clang::diag::err_drv_output_argument_with_multiple_files);		clang::diag::err_drv_output_argument_with_multiple_files);
return nullptr;		return nullptr;
}		}

for (unsigned I = 0, E = GpuArchList.size(); I != E; ++I)		for (unsigned I = 0, E = GpuArchList.size(); I != E; ++I) {
Actions.push_back(C.MakeAction<CudaDeviceAction>(CudaDeviceActions[I],		OffloadAction::DeviceDependences DDep;
GpuArchList[I],		DDep.add(*CudaDeviceActions[I], *CudaTC, GpuArchList[I],
/* AtTopLevel */ true));		Action::OFK_Cuda);
		Actions.push_back(
		C.MakeAction<OffloadAction>(DDep, CudaDeviceActions[I]->getType()));
		}
// Kill host action in case of device-only compilation.		// Kill host action in case of device-only compilation.
if (CompileDeviceOnly)		if (CompileDeviceOnly)
return nullptr;		return nullptr;
return HostAction;		return HostAction;
}		}

// If we're not a partial or device-only compilation, we compile each arch to		// If we're not a partial or device-only compilation, we compile each arch to
// ptx and assemble to cubin, then feed the cubin and the ptx into a device		// ptx and assemble to cubin, then feed the cubin and the ptx into a device
// "link" action, which uses fatbinary to combine these cubins into one		// "link" action, which uses fatbinary to combine these cubins into one
// fatbin. The fatbin is then an input to the host compilation.		// fatbin. The fatbin is then an input to the host compilation.
ActionList DeviceActions;		ActionList DeviceActions;
for (unsigned I = 0, E = GpuArchList.size(); I != E; ++I) {		for (unsigned I = 0, E = GpuArchList.size(); I != E; ++I) {
Action* AssembleAction = CudaDeviceActions[I];		Action* AssembleAction = CudaDeviceActions[I];
assert(AssembleAction->getType() == types::TY_Object);		assert(AssembleAction->getType() == types::TY_Object);
assert(AssembleAction->getInputs().size() == 1);		assert(AssembleAction->getInputs().size() == 1);

Action* BackendAction = AssembleAction->getInputs()[0];		Action* BackendAction = AssembleAction->getInputs()[0];
assert(BackendAction->getType() == types::TY_PP_Asm);		assert(BackendAction->getType() == types::TY_PP_Asm);

for (const auto& A : {AssembleAction, BackendAction}) {		for (auto &A : {AssembleAction, BackendAction}) {
DeviceActions.push_back(C.MakeAction<CudaDeviceAction>(		OffloadAction::DeviceDependences DDep;
A, GpuArchList[I], /* AtTopLevel */ false));		DDep.add(*A, *CudaTC, GpuArchList[I], Action::OFK_Cuda);
		DeviceActions.push_back(C.MakeAction<OffloadAction>(DDep, A->getType()));
}		}
}		}
auto FatbinAction = C.MakeAction<CudaDeviceAction>(		auto FatbinAction =
C.MakeAction<LinkJobAction>(DeviceActions, types::TY_CUDA_FATBIN),		C.MakeAction<LinkJobAction>(DeviceActions, types::TY_CUDA_FATBIN);
/* GpuArchName = */ nullptr,
/* AtTopLevel = */ false);
// Return a new host action that incorporates original host action and all		// Return a new host action that incorporates original host action and all
// device actions.		// device actions.
return C.MakeAction<CudaHostAction>(std::move(HostAction),		OffloadAction::HostDependence HDep(
ActionList({FatbinAction}));		*HostAction, *C.getSingleOffloadToolChain<Action::OFK_Host>(),
		/*BoundArch=*/nullptr, Action::OFK_Cuda);
		OffloadAction::DeviceDependences DDep;
		DDep.add(*FatbinAction, *CudaTC, /*BoundArch=*/nullptr, Action::OFK_Cuda);
		tra [Done]: Is a toolchain needed for the fatbin action?
		sfantao (Author) [Not Done]: Yes, it is required to enquire which link tool should be used for the device action.
		return C.MakeAction<OffloadAction>(HDep, DDep);
}		}

void Driver::BuildActions(Compilation &C, DerivedArgList &Args,		void Driver::BuildActions(Compilation &C, DerivedArgList &Args,
const InputList &Inputs, ActionList &Actions) const {		const InputList &Inputs, ActionList &Actions) const {
llvm::PrettyStackTraceString CrashInfo("Building compilation actions");		llvm::PrettyStackTraceString CrashInfo("Building compilation actions");

if (!SuppressMissingInputWarning && Inputs.empty()) {		if (!SuppressMissingInputWarning && Inputs.empty()) {
Diag(clang::diag::err_drv_no_input_files);		Diag(clang::diag::err_drv_no_input_files);
[92 lines not shown]	if (Args.hasArg(options::OPT__SLASH_Y_)) {
// /Y- disables all pch handling. Rather than check for it everywhere,		// /Y- disables all pch handling. Rather than check for it everywhere,
// just remove clang-cl pch-related flags here.		// just remove clang-cl pch-related flags here.
Args.eraseArg(options::OPT__SLASH_Fp);		Args.eraseArg(options::OPT__SLASH_Fp);
Args.eraseArg(options::OPT__SLASH_Yc);		Args.eraseArg(options::OPT__SLASH_Yc);
Args.eraseArg(options::OPT__SLASH_Yu);		Args.eraseArg(options::OPT__SLASH_Yu);
YcArg = YuArg = nullptr;		YcArg = YuArg = nullptr;
}		}

		// Track the host offload kinds used on this compilation.
		unsigned CompilationActiveOffloadHostKinds = 0u;

// Construct the actions to perform.		// Construct the actions to perform.
ActionList LinkerInputs;		ActionList LinkerInputs;

llvm::SmallVector<phases::ID, phases::MaxNumberOfPhases> PL;		llvm::SmallVector<phases::ID, phases::MaxNumberOfPhases> PL;
for (auto &I : Inputs) {		for (auto &I : Inputs) {
types::ID InputType = I.first;		types::ID InputType = I.first;
const Arg *InputArg = I.second;		const Arg *InputArg = I.second;

[52 lines not shown]	for (auto &I : Inputs) {
}		}

phases::ID CudaInjectionPhase =		phases::ID CudaInjectionPhase =
(phases::Compile < FinalPhase &&		(phases::Compile < FinalPhase &&
llvm::find(PL, phases::Compile) != PL.end())		llvm::find(PL, phases::Compile) != PL.end())
? phases::Compile		? phases::Compile
: FinalPhase;		: FinalPhase;

		// Track the host offload kinds used on this input.
		unsigned InputActiveOffloadHostKinds = 0u;

// Build the pipeline for this file.		// Build the pipeline for this file.
Action *Current = C.MakeAction<InputAction>(*InputArg, InputType);		Action *Current = C.MakeAction<InputAction>(*InputArg, InputType);
for (SmallVectorImpl<phases::ID>::iterator i = PL.begin(), e = PL.end();		for (SmallVectorImpl<phases::ID>::iterator i = PL.begin(), e = PL.end();
i != e; ++i) {		i != e; ++i) {
phases::ID Phase = *i;		phases::ID Phase = *i;

// We are done if this step is past what the user requested.		// We are done if this step is past what the user requested.
if (Phase > FinalPhase)		if (Phase > FinalPhase)
[15 lines not shown]	for (SmallVectorImpl<phases::ID>::iterator i = PL.begin(), e = PL.end();

// Otherwise construct the appropriate action.		// Otherwise construct the appropriate action.
Current = ConstructPhaseAction(C, Args, Phase, Current);		Current = ConstructPhaseAction(C, Args, Phase, Current);

if (InputType == types::TY_CUDA && Phase == CudaInjectionPhase) {		if (InputType == types::TY_CUDA && Phase == CudaInjectionPhase) {
Current = buildCudaActions(C, Args, InputArg, Current, Actions);		Current = buildCudaActions(C, Args, InputArg, Current, Actions);
if (!Current)		if (!Current)
break;		break;

		// We produced a CUDA action for this input, so the host has to support
		// CUDA.
		InputActiveOffloadHostKinds |= Action::OFK_Cuda;
		CompilationActiveOffloadHostKinds |= Action::OFK_Cuda;
}		}

if (Current->getType() == types::TY_Nothing)		if (Current->getType() == types::TY_Nothing)
break;		break;
}		}

// If we ended with something, add to the output list.		// If we ended with something, add to the output list. Also, propagate the
if (Current)		// offload information to the top-level host action related with the current
		// input.
		if (Current) {
		if (InputActiveOffloadHostKinds)
		Current->propagateHostOffloadInfo(InputActiveOffloadHostKinds,
		/*BoundArch=*/nullptr);
Actions.push_back(Current);		Actions.push_back(Current);
}		}
		}

// Add a link action if necessary.		// Add a link action if necessary and propagate the offload information for
if (!LinkerInputs.empty())		// the current compilation.
		if (!LinkerInputs.empty()) {
Actions.push_back(		Actions.push_back(
C.MakeAction<LinkJobAction>(LinkerInputs, types::TY_Image));		C.MakeAction<LinkJobAction>(LinkerInputs, types::TY_Image));
		Actions.back()->propagateHostOffloadInfo(CompilationActiveOffloadHostKinds,
		/*BoundArch=*/nullptr);
		}

// If we are linking, claim any options which are obviously only used for		// If we are linking, claim any options which are obviously only used for
// compilation.		// compilation.
if (FinalPhase == phases::Link && PL.size() == 1) {		if (FinalPhase == phases::Link && PL.size() == 1) {
Args.ClaimAllArgs(options::OPT_CompileOnly_Group);		Args.ClaimAllArgs(options::OPT_CompileOnly_Group);
Args.ClaimAllArgs(options::OPT_cl_compile_Group);		Args.ClaimAllArgs(options::OPT_cl_compile_Group);
}		}

[119 lines not shown]	if (isa<LipoJobAction>(A)) {
else		else
LinkingOutput = getDefaultImageName();		LinkingOutput = getDefaultImageName();
}		}

BuildJobsForAction(C, A, &C.getDefaultToolChain(),		BuildJobsForAction(C, A, &C.getDefaultToolChain(),
/*BoundArch*/ nullptr,		/*BoundArch*/ nullptr,
/*AtTopLevel*/ true,		/*AtTopLevel*/ true,
/*MultipleArchs*/ ArchNames.size() > 1,		/*MultipleArchs*/ ArchNames.size() > 1,
/*LinkingOutput*/ LinkingOutput, CachedResults);		/*LinkingOutput*/ LinkingOutput, CachedResults,
		/*BuildForOffloadDevice*/ false);
}		}

// If the user passed -Qunused-arguments or there were errors, don't warn		// If the user passed -Qunused-arguments or there were errors, don't warn
// about any unused arguments.		// about any unused arguments.
if (Diags.hasErrorOccurred() ||		if (Diags.hasErrorOccurred() ||
C.getArgs().hasArg(options::OPT_Qunused_arguments))		C.getArgs().hasArg(options::OPT_Qunused_arguments))
return;		return;

[32 lines not shown]	if (!A->isClaimed()) {
// In clang-cl, don't mention unknown arguments here since they have		// In clang-cl, don't mention unknown arguments here since they have
// already been warned about.		// already been warned about.
if (!IsCLMode() || !A->getOption().matches(options::OPT_UNKNOWN))		if (!IsCLMode() || !A->getOption().matches(options::OPT_UNKNOWN))
Diag(clang::diag::warn_drv_unused_argument)		Diag(clang::diag::warn_drv_unused_argument)
<< A->getAsString(C.getArgs());		<< A->getAsString(C.getArgs());
}		}
}		}
}		}
		/// Collapse an offloading action looking for a job of the given type. The input
		/// action is changed to the input of the collapsed sequence. If we effectively
		/// had a collapse return the corresponding offloading action, otherwise return
		/// null.
		template <typename T>
		static OffloadAction *collapseOffloadingAction(Action *&CurAction) {
		if (!CurAction)
		return nullptr;
		if (auto *OA = dyn_cast<OffloadAction>(CurAction)) {
		if (OA->hasHostDependence())
		if (auto *HDep = dyn_cast<T>(OA->getHostDependence())) {
		CurAction = HDep;
		return OA;
		}
		if (OA->hasSingleDeviceDependence())
		tra [Done]: You could fold both ifs into something like this: if (auto *DDAP = dyn_cast_or_null<T>(OA->getSingleDeviceDependence()))
		sfantao (Author) [Not Done]: Given that I am adding a new query to check the existence of the host or single-device action, I am keeping the two ifs separate. `getSingleDeviceDependence` now does an assertion. Let me know if you prefer me to do this differently.
		if (auto *DDep = dyn_cast<T>(OA->getSingleDeviceDependence())) {
		CurAction = DDep;
		return OA;
		}
		}
		return nullptr;
		}
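The inline discussion above contrasts two ways of looking through a wrapper action. One is a null-tolerant cast over a getter that may return null. The other, which the author kept, is an explicit query followed by an asserting getter. A minimal standalone sketch of the second pattern follows. It uses `dynamic_cast` in place of LLVM's `dyn_cast`, and all type and function names are hypothetical stand-ins rather than the driver's real classes. The single-device query is also simplified, since here the host dependence is stored separately from the device list.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for the driver's action hierarchy.
struct Action {
  virtual ~Action() = default;
};
struct CompileAction : Action {};
struct BackendAction : Action {};

struct WrapperAction : Action {
  Action *Host = nullptr;        // optional host dependence
  std::vector<Action *> Devices; // device dependences

  bool hasHost() const { return Host != nullptr; }
  bool hasSingleDevice() const { return Devices.size() == 1; }
  Action *getSingleDevice() const {
    // Asserting getter: callers are expected to query hasSingleDevice() first.
    assert(hasSingleDevice() && "Single device dependence does not exist!");
    return Devices.front();
  }
};

// If Cur is a wrapper around an action of type T, advance Cur to the wrapped
// action and return the wrapper. Otherwise leave Cur alone and return nullptr.
template <typename T> WrapperAction *collapseWrapper(Action *&Cur) {
  auto *W = dynamic_cast<WrapperAction *>(Cur);
  if (!W)
    return nullptr;
  if (W->hasHost())
    if (auto *H = dynamic_cast<T *>(W->Host)) {
      Cur = H;
      return W;
    }
  if (W->hasSingleDevice())
    if (auto *D = dynamic_cast<T *>(W->getSingleDevice())) {
      Cur = D;
      return W;
    }
  return nullptr;
}
```

Keeping the query separate costs one extra `if`, but the getter can then assert its precondition instead of silently returning null.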
// Returns a Tool for a given JobAction. In case the action and its		// Returns a Tool for a given JobAction. In case the action and its
		ABataev [Not Done]: Three slashes
// predecessors can be combined, updates Inputs with the inputs of the		// predecessors can be combined, updates Inputs with the inputs of the
// first combined action. If one of the collapsed actions is a		// first combined action. If one of the collapsed actions is a
// CudaHostAction, updates CollapsedCHA with the pointer to it so the		// CudaHostAction, updates CollapsedCHA with the pointer to it so the
// caller can deal with extra handling such action requires.		// caller can deal with extra handling such action requires.
static const Tool *selectToolForJob(Compilation &C, bool SaveTemps,		static const Tool *selectToolForJob(Compilation &C, bool SaveTemps,
bool EmbedBitcode, const ToolChain *TC,		bool EmbedBitcode, const ToolChain *TC,
const JobAction *JA,		const JobAction *JA,
const ActionList *&Inputs,		const ActionList *&Inputs,
const CudaHostAction *&CollapsedCHA) {		ActionList &CollapsedOffloadAction) {
const Tool *ToolForJob = nullptr;		const Tool *ToolForJob = nullptr;
CollapsedCHA = nullptr;		CollapsedOffloadAction.clear();

// See if we should look for a compiler with an integrated assembler. We match		// See if we should look for a compiler with an integrated assembler. We match
// bottom up, so what we are actually looking for is an assembler job with a		// bottom up, so what we are actually looking for is an assembler job with a
// compiler input.		// compiler input.

		// Look through offload actions between assembler and backend actions.
		Action *BackendJA = (isa<AssembleJobAction>(JA) && Inputs->size() == 1)
		? *Inputs->begin()
		: nullptr;
		auto *BackendOA = collapseOffloadingAction<BackendJobAction>(BackendJA);

if (TC->useIntegratedAs() && !SaveTemps &&		if (TC->useIntegratedAs() && !SaveTemps &&
!C.getArgs().hasArg(options::OPT_via_file_asm) &&		!C.getArgs().hasArg(options::OPT_via_file_asm) &&
!C.getArgs().hasArg(options::OPT__SLASH_FA) &&		!C.getArgs().hasArg(options::OPT__SLASH_FA) &&
!C.getArgs().hasArg(options::OPT__SLASH_Fa) &&		!C.getArgs().hasArg(options::OPT__SLASH_Fa) && BackendJA &&
isa<AssembleJobAction>(JA) && Inputs->size() == 1 &&		isa<BackendJobAction>(BackendJA)) {
isa<BackendJobAction>(*Inputs->begin())) {
// A BackendJob is always preceded by a CompileJob, and without -save-temps		// A BackendJob is always preceded by a CompileJob, and without -save-temps
// or -fembed-bitcode, they will always get combined together, so instead of		// or -fembed-bitcode, they will always get combined together, so instead of
// checking the backend tool, check if the tool for the CompileJob has an		// checking the backend tool, check if the tool for the CompileJob has an
// integrated assembler. For -fembed-bitcode, CompileJob is still used to		// integrated assembler. For -fembed-bitcode, CompileJob is still used to
// look up tools for BackendJob, but they need to match before we can split		// look up tools for BackendJob, but they need to match before we can split
// them.		// them.
const ActionList *BackendInputs = &(*Inputs)[0]->getInputs();
// Compile job may be wrapped in CudaHostAction, extract it if		// Look through offload actions between backend and compile actions.
// that's the case and update CollapsedCHA if we combine phases.		Action *CompileJA = *BackendJA->getInputs().begin();
CudaHostAction *CHA = dyn_cast<CudaHostAction>(*BackendInputs->begin());		auto *CompileOA = collapseOffloadingAction<CompileJobAction>(CompileJA);
JobAction *CompileJA = cast<CompileJobAction>(
CHA ? *CHA->input_begin() : *BackendInputs->begin());		assert(CompileJA && isa<CompileJobAction>(CompileJA) &&
assert(CompileJA && "Backend job is not preceeded by compile job.");		"Backend job is not preceeded by compile job.");
const Tool *Compiler = TC->SelectTool(*CompileJA);		const Tool *Compiler = TC->SelectTool(*cast<CompileJobAction>(CompileJA));
if (!Compiler)		if (!Compiler)
return nullptr;		return nullptr;
// When using -fembed-bitcode, it is required to have the same tool (clang)		// When using -fembed-bitcode, it is required to have the same tool (clang)
// for both CompilerJA and BackendJA. Otherwise, combine two stages.		// for both CompilerJA and BackendJA. Otherwise, combine two stages.
if (EmbedBitcode) {		if (EmbedBitcode) {
JobAction *InputJA = cast<JobAction>(*Inputs->begin());		JobAction *InputJA = cast<JobAction>(*Inputs->begin());
const Tool *BackendTool = TC->SelectTool(*InputJA);		const Tool *BackendTool = TC->SelectTool(*InputJA);
if (BackendTool == Compiler)		if (BackendTool == Compiler)
CompileJA = InputJA;		CompileJA = InputJA;
}		}
if (Compiler->hasIntegratedAssembler()) {		if (Compiler->hasIntegratedAssembler()) {
Inputs = &CompileJA->getInputs();		Inputs = &CompileJA->getInputs();
ToolForJob = Compiler;		ToolForJob = Compiler;
CollapsedCHA = CHA;		// Save the collapsed offload actions because they may still contain
		// device actions.
		if (CompileOA)
		CollapsedOffloadAction.push_back(CompileOA);
		if (BackendOA)
		CollapsedOffloadAction.push_back(BackendOA);
}		}
}		}

// A backend job should always be combined with the preceding compile job		// A backend job should always be combined with the preceding compile job
// unless OPT_save_temps or OPT_fembed_bitcode is enabled and the compiler is		// unless OPT_save_temps or OPT_fembed_bitcode is enabled and the compiler is
// capable of emitting LLVM IR as an intermediate output.		// capable of emitting LLVM IR as an intermediate output.
if (isa<BackendJobAction>(JA)) {		if (isa<BackendJobAction>(JA)) {
// Check if the compiler supports emitting LLVM IR.		// Check if the compiler supports emitting LLVM IR.
assert(Inputs->size() == 1);		assert(Inputs->size() == 1);
// Compile job may be wrapped in CudaHostAction, extract it if
// that's the case and update CollapsedCHA if we combine phases.		// Look through offload actions between backend and compile actions.
CudaHostAction *CHA = dyn_cast<CudaHostAction>(*Inputs->begin());		Action *CompileJA = *JA->getInputs().begin();
JobAction *CompileJA =		auto *CompileOA = collapseOffloadingAction<CompileJobAction>(CompileJA);
cast<CompileJobAction>(CHA ? *CHA->input_begin() : *Inputs->begin());
assert(CompileJA && "Backend job is not preceeded by compile job.");		assert(CompileJA && isa<CompileJobAction>(CompileJA) &&
const Tool *Compiler = TC->SelectTool(*CompileJA);		"Backend job is not preceeded by compile job.");
		const Tool *Compiler = TC->SelectTool(*cast<CompileJobAction>(CompileJA));
if (!Compiler)		if (!Compiler)
return nullptr;		return nullptr;
if (!Compiler->canEmitIR() ||		if (!Compiler->canEmitIR() ||
(!SaveTemps && !EmbedBitcode)) {		(!SaveTemps && !EmbedBitcode)) {
Inputs = &CompileJA->getInputs();		Inputs = &CompileJA->getInputs();
ToolForJob = Compiler;		ToolForJob = Compiler;
CollapsedCHA = CHA;
		if (CompileOA)
		CollapsedOffloadAction.push_back(CompileOA);
}		}
}		}

// Otherwise use the tool for the current job.		// Otherwise use the tool for the current job.
if (!ToolForJob)		if (!ToolForJob)
ToolForJob = TC->SelectTool(*JA);		ToolForJob = TC->SelectTool(*JA);

// See if we should use an integrated preprocessor. We do so when we have		// See if we should use an integrated preprocessor. We do so when we have
// exactly one input, since this is the only use case we care about		// exactly one input, since this is the only use case we care about
// (irrelevant since we don't support combine yet).		// (irrelevant since we don't support combine yet).
if (Inputs->size() == 1 && isa<PreprocessJobAction>(*Inputs->begin()) &&
		// Look through offload actions after preprocessing.
		Action *PreprocessJA = (Inputs->size() == 1) ? *Inputs->begin() : nullptr;
		auto *PreprocessOA =
		collapseOffloadingAction<PreprocessJobAction>(PreprocessJA);

		if (PreprocessJA && isa<PreprocessJobAction>(PreprocessJA) &&
!C.getArgs().hasArg(options::OPT_no_integrated_cpp) &&		!C.getArgs().hasArg(options::OPT_no_integrated_cpp) &&
!C.getArgs().hasArg(options::OPT_traditional_cpp) && !SaveTemps &&		!C.getArgs().hasArg(options::OPT_traditional_cpp) && !SaveTemps &&
!C.getArgs().hasArg(options::OPT_rewrite_objc) &&		!C.getArgs().hasArg(options::OPT_rewrite_objc) &&
ToolForJob->hasIntegratedCPP())		ToolForJob->hasIntegratedCPP()) {
Inputs = &(*Inputs)[0]->getInputs();		Inputs = &PreprocessJA->getInputs();
		if (PreprocessOA)
		CollapsedOffloadAction.push_back(PreprocessOA);
		}

return ToolForJob;		return ToolForJob;
}		}

InputInfo Driver::BuildJobsForAction(		InputInfo Driver::BuildJobsForAction(
Compilation &C, const Action *A, const ToolChain *TC, const char *BoundArch,		Compilation &C, const Action *A, const ToolChain *TC, const char *BoundArch,
bool AtTopLevel, bool MultipleArchs, const char *LinkingOutput,		bool AtTopLevel, bool MultipleArchs, const char *LinkingOutput,
std::map<std::pair<const Action *, std::string>, InputInfo> &CachedResults)		std::map<std::pair<const Action *, std::string>, InputInfo> &CachedResults,
const {		bool BuildForOffloadDevice) const {
// The bound arch is not necessarily represented in the toolchain's triple --		// The bound arch is not necessarily represented in the toolchain's triple --
// for example, armv7 and armv7s both map to the same triple -- so we need		// for example, armv7 and armv7s both map to the same triple -- so we need
// both in our map.		// both in our map.
std::string TriplePlusArch = TC->getTriple().normalize();		std::string TriplePlusArch = TC->getTriple().normalize();
if (BoundArch) {		if (BoundArch) {
TriplePlusArch += "-";		TriplePlusArch += "-";
TriplePlusArch += BoundArch;		TriplePlusArch += BoundArch;
}		}
std::pair<const Action *, std::string> ActionTC = {A, TriplePlusArch};		std::pair<const Action *, std::string> ActionTC = {A, TriplePlusArch};
auto CachedResult = CachedResults.find(ActionTC);		auto CachedResult = CachedResults.find(ActionTC);
if (CachedResult != CachedResults.end()) {		if (CachedResult != CachedResults.end()) {
return CachedResult->second;		return CachedResult->second;
}		}
InputInfo Result =		InputInfo Result = BuildJobsForActionNoCache(
BuildJobsForActionNoCache(C, A, TC, BoundArch, AtTopLevel, MultipleArchs,		C, A, TC, BoundArch, AtTopLevel, MultipleArchs, LinkingOutput,
LinkingOutput, CachedResults);		CachedResults, BuildForOffloadDevice);
CachedResults[ActionTC] = Result;		CachedResults[ActionTC] = Result;
return Result;		return Result;
}		}

InputInfo Driver::BuildJobsForActionNoCache(		InputInfo Driver::BuildJobsForActionNoCache(
Compilation &C, const Action *A, const ToolChain *TC, const char *BoundArch,		Compilation &C, const Action *A, const ToolChain *TC, const char *BoundArch,
bool AtTopLevel, bool MultipleArchs, const char *LinkingOutput,		bool AtTopLevel, bool MultipleArchs, const char *LinkingOutput,
std::map<std::pair<const Action *, std::string>, InputInfo> &CachedResults)		std::map<std::pair<const Action *, std::string>, InputInfo> &CachedResults,
const {		bool BuildForOffloadDevice) const {
llvm::PrettyStackTraceString CrashInfo("Building compilation jobs");		llvm::PrettyStackTraceString CrashInfo("Building compilation jobs");

InputInfoList CudaDeviceInputInfos;		InputInfoList OffloadDependencesInputInfo;
if (const CudaHostAction *CHA = dyn_cast<CudaHostAction>(A)) {		if (const OffloadAction *OA = dyn_cast<OffloadAction>(A)) {
// Append outputs of device jobs to the input list.		// The offload action is expected to be used in four different situations.
for (const Action *DA : CHA->getDeviceActions()) {		//
CudaDeviceInputInfos.push_back(BuildJobsForAction(		// a) Set a toolchain/architecture/kind for a host action:
C, DA, TC, nullptr, AtTopLevel,		// Host Action 1 -> OffloadAction -> Host Action 2
/*MultipleArchs*/ false, LinkingOutput, CachedResults));		//
}		// b) Set a toolchain/architecture/kind for a device action;
// Override current action with a real host compile action and continue		// Device Action 1 -> OffloadAction -> Device Action 2
// processing it.		//
A = *CHA->input_begin();		// c) Specify a device dependences to a host action;
		//    Device Action 1  _
		//                      \
		//      Host Action 1  ---> OffloadAction -> Host Action 2
		//
		// d) Specify a host dependence to a device action.
		tra: It may be worth adding a comment explaining what happens if OffloadDeviceInputInfos.size() != 1.
		sfantao (author): I am elaborating on that in the comment now. I also got rid of doOnHostDependence here and I use `OA->getHostDependence()`, which now contains the assertion.
		//      Host Action 1  _
		//                      \
		//    Device Action 1  ---> OffloadAction -> Device Action 2
		//
		// For a) and b), we just return the job generated for the dependence. For
		// c) and d) we override the current action with the host/device dependence
		// if the current toolchain is host/device and set the offload dependences
		// info with the jobs obtained from the device/host dependence(s).

		// If there is a single device option, just generate the job for it.
		if (OA->hasSingleDeviceDependence()) {
		InputInfo DevA;
		OA->doOnEachDeviceDependence([&](Action *DepA, const ToolChain *DepTC,
		const char *DepBoundArch) {
		DevA =
		BuildJobsForAction(C, DepA, DepTC, DepBoundArch, AtTopLevel,
		/*MultipleArchs*/ !!DepBoundArch, LinkingOutput,
		CachedResults, /*BuildForOffloadDevice=*/true);
		});
		return DevA;
		}

		// If 'Action 2' is host, we generate jobs for the device dependences and
		// override the current action with the host dependence. Otherwise, we
		// generate the host dependences and override the action with the device
		// dependence. The dependences can't therefore be a top-level action.
		OA->doOnEachDependence(
		/*IsHostDependence=*/BuildForOffloadDevice,
		[&](Action *DepA, const ToolChain *DepTC, const char *DepBoundArch) {
		OffloadDependencesInputInfo.push_back(BuildJobsForAction(
		C, DepA, DepTC, DepBoundArch, /*AtTopLevel=*/false,
		/*MultipleArchs*/ !!DepBoundArch, LinkingOutput, CachedResults,
		/*BuildForOffloadDevice=*/DepA->getOffloadingDeviceKind() !=
		Action::OFK_None));
		});

		A = BuildForOffloadDevice
		? OA->getSingleDeviceDependence(/*DoNotConsiderHostActions=*/true)
		: OA->getHostDependence();
}		}

if (const InputAction *IA = dyn_cast<InputAction>(A)) {		if (const InputAction *IA = dyn_cast<InputAction>(A)) {
// FIXME: It would be nice to not claim this here; maybe the old scheme of		// FIXME: It would be nice to not claim this here; maybe the old scheme of
// just using Args was better?		// just using Args was better?
const Arg &Input = IA->getInputArg();		const Arg &Input = IA->getInputArg();
Input.claim();		Input.claim();
if (Input.getOption().matches(options::OPT_INPUT)) {		if (Input.getOption().matches(options::OPT_INPUT)) {
Show All 10 Lines	if (const BindArchAction *BAA = dyn_cast<BindArchAction>(A)) {
if (ArchName)		if (ArchName)
TC = &getToolChain(C.getArgs(),		TC = &getToolChain(C.getArgs(),
computeTargetTriple(*this, DefaultTargetTriple,		computeTargetTriple(*this, DefaultTargetTriple,
C.getArgs(), ArchName));		C.getArgs(), ArchName));
else		else
TC = &C.getDefaultToolChain();		TC = &C.getDefaultToolChain();

return BuildJobsForAction(C, *BAA->input_begin(), TC, ArchName, AtTopLevel,		return BuildJobsForAction(C, *BAA->input_begin(), TC, ArchName, AtTopLevel,
MultipleArchs, LinkingOutput, CachedResults);		MultipleArchs, LinkingOutput, CachedResults,
		BuildForOffloadDevice);
}		}

if (const CudaDeviceAction *CDA = dyn_cast<CudaDeviceAction>(A)) {
// Initial processing of CudaDeviceAction carries host params.
// Call BuildJobsForAction() again, now with correct device parameters.
InputInfo II = BuildJobsForAction(
C, *CDA->input_begin(), C.getSingleOffloadToolChain<Action::OFK_Cuda>(),
CDA->getGpuArchName(), CDA->isAtTopLevel(), /*MultipleArchs=*/true,
LinkingOutput, CachedResults);
// Currently II's Action is *CDA->input_begin(). Set it to CDA instead, so
// that one can retrieve II's GPU arch.
II.setAction(A);
return II;
}

const ActionList *Inputs = &A->getInputs();		const ActionList *Inputs = &A->getInputs();

const JobAction *JA = cast<JobAction>(A);		const JobAction *JA = cast<JobAction>(A);
const CudaHostAction *CollapsedCHA = nullptr;		ActionList CollapsedOffloadActions;

const Tool *T =		const Tool *T =
selectToolForJob(C, isSaveTempsEnabled(), embedBitcodeEnabled(), TC, JA,		selectToolForJob(C, isSaveTempsEnabled(), embedBitcodeEnabled(), TC, JA,
Inputs, CollapsedCHA);		Inputs, CollapsedOffloadActions);
if (!T)		if (!T)
return InputInfo();		return InputInfo();

// If we've collapsed action list that contained CudaHostAction we		// If we've collapsed action list that contained OffloadAction we
// need to build jobs for device-side inputs it may have held.		// need to build jobs for host/device-side inputs it may have held.
if (CollapsedCHA) {		for (const auto *OA : CollapsedOffloadActions)
for (const Action *DA : CollapsedCHA->getDeviceActions()) {		cast<OffloadAction>(OA)->doOnEachDependence(
CudaDeviceInputInfos.push_back(BuildJobsForAction(		/*IsHostDependence=*/BuildForOffloadDevice,
C, DA, TC, "", AtTopLevel,		[&](Action *DepA, const ToolChain *DepTC, const char *DepBoundArch) {
/*MultipleArchs*/ false, LinkingOutput, CachedResults));		OffloadDependencesInputInfo.push_back(BuildJobsForAction(
}		C, DepA, DepTC, DepBoundArch, AtTopLevel,
}		/*MultipleArchs=*/!!DepBoundArch, LinkingOutput, CachedResults,
		/*BuildForOffloadDevice=*/DepA->getOffloadingDeviceKind() !=
		Action::OFK_None));
		});

// Only use pipes when there is exactly one input.		// Only use pipes when there is exactly one input.
InputInfoList InputInfos;		InputInfoList InputInfos;
for (const Action *Input : *Inputs) {		for (const Action *Input : *Inputs) {
// Treat dsymutil and verify sub-jobs as being at the top-level too, they		// Treat dsymutil and verify sub-jobs as being at the top-level too, they
// shouldn't get temporary output names.		// shouldn't get temporary output names.
// FIXME: Clean this up.		// FIXME: Clean this up.
bool SubJobAtTopLevel =		bool SubJobAtTopLevel =
AtTopLevel && (isa<DsymutilJobAction>(A) || isa<VerifyJobAction>(A));		AtTopLevel && (isa<DsymutilJobAction>(A) || isa<VerifyJobAction>(A));
InputInfos.push_back(BuildJobsForAction(C, Input, TC, BoundArch,		InputInfos.push_back(BuildJobsForAction(
SubJobAtTopLevel, MultipleArchs,		C, Input, TC, BoundArch, SubJobAtTopLevel, MultipleArchs, LinkingOutput,
LinkingOutput, CachedResults));		CachedResults, BuildForOffloadDevice));
}		}

// Always use the first input as the base input.		// Always use the first input as the base input.
const char *BaseInput = InputInfos[0].getBaseInput();		const char *BaseInput = InputInfos[0].getBaseInput();

// ... except dsymutil actions, which use their actual input as the base		// ... except dsymutil actions, which use their actual input as the base
// input.		// input.
if (JA->getType() == types::TY_dSYM)		if (JA->getType() == types::TY_dSYM)
BaseInput = InputInfos[0].getFilename();		BaseInput = InputInfos[0].getFilename();

// Append outputs of cuda device jobs to the input list		// Append outputs of offload device jobs to the input list
if (CudaDeviceInputInfos.size())		if (!OffloadDependencesInputInfo.empty())
InputInfos.append(CudaDeviceInputInfos.begin(), CudaDeviceInputInfos.end());		InputInfos.append(OffloadDependencesInputInfo.begin(),
		OffloadDependencesInputInfo.end());

// Determine the place to write output to, if any.		// Determine the place to write output to, if any.
InputInfo Result;		InputInfo Result;
if (JA->getType() == types::TY_Nothing)		if (JA->getType() == types::TY_Nothing)
Result = InputInfo(A, BaseInput);		Result = InputInfo(A, BaseInput);
else		else
Result = InputInfo(A, GetNamedOutputPath(C, *JA, BaseInput, BoundArch,		Result = InputInfo(A, GetNamedOutputPath(C, *JA, BaseInput, BoundArch,
AtTopLevel, MultipleArchs),		AtTopLevel, MultipleArchs,
		TC->getTriple().normalize()),
BaseInput);		BaseInput);

if (CCCPrintBindings && !CCGenDiagnostics) {		if (CCCPrintBindings && !CCGenDiagnostics) {
llvm::errs() << "# \"" << T->getToolChain().getTripleString() << '"'		llvm::errs() << "# \"" << T->getToolChain().getTripleString() << '"'
<< " - \"" << T->getName() << "\", inputs: [";		<< " - \"" << T->getName() << "\", inputs: [";
for (unsigned i = 0, e = InputInfos.size(); i != e; ++i) {		for (unsigned i = 0, e = InputInfos.size(); i != e; ++i) {
llvm::errs() << InputInfos[i].getAsString();		llvm::errs() << InputInfos[i].getAsString();
if (i + 1 != e)		if (i + 1 != e)
▲ Show 20 Lines • Show All 43 Lines • ▼ Show 20 Lines	static const char *MakeCLOutputFilename(const ArgList &Args, StringRef ArgValue,
}		}

return Args.MakeArgString(Filename.c_str());		return Args.MakeArgString(Filename.c_str());
}		}

const char *Driver::GetNamedOutputPath(Compilation &C, const JobAction &JA,		const char *Driver::GetNamedOutputPath(Compilation &C, const JobAction &JA,
const char *BaseInput,		const char *BaseInput,
const char *BoundArch, bool AtTopLevel,		const char *BoundArch, bool AtTopLevel,
bool MultipleArchs) const {		bool MultipleArchs,
		StringRef NormalizedTriple) const {
llvm::PrettyStackTraceString CrashInfo("Computing output path");		llvm::PrettyStackTraceString CrashInfo("Computing output path");
// Output to a user requested destination?		// Output to a user requested destination?
if (AtTopLevel && !isa<DsymutilJobAction>(JA) && !isa<VerifyJobAction>(JA)) {		if (AtTopLevel && !isa<DsymutilJobAction>(JA) && !isa<VerifyJobAction>(JA)) {
if (Arg *FinalOutput = C.getArgs().getLastArg(options::OPT_o))		if (Arg *FinalOutput = C.getArgs().getLastArg(options::OPT_o))
return C.addResultFile(FinalOutput->getValue(), &JA);		return C.addResultFile(FinalOutput->getValue(), &JA);
}		}

// For /P, preprocess to file named after BaseInput.		// For /P, preprocess to file named after BaseInput.
▲ Show 20 Lines • Show All 69 Lines • ▼ Show 20 Lines	NamedOutput =
MakeCLOutputFilename(C.getArgs(), Val, BaseName, types::TY_Image);		MakeCLOutputFilename(C.getArgs(), Val, BaseName, types::TY_Image);
} else if (JA.getType() == types::TY_Image) {		} else if (JA.getType() == types::TY_Image) {
if (IsCLMode()) {		if (IsCLMode()) {
// clang-cl uses BaseName for the executable name.		// clang-cl uses BaseName for the executable name.
NamedOutput =		NamedOutput =
MakeCLOutputFilename(C.getArgs(), "", BaseName, types::TY_Image);		MakeCLOutputFilename(C.getArgs(), "", BaseName, types::TY_Image);
} else if (MultipleArchs && BoundArch) {		} else if (MultipleArchs && BoundArch) {
SmallString<128> Output(getDefaultImageName());		SmallString<128> Output(getDefaultImageName());
		Output += JA.getOffloadingFileNamePrefix(NormalizedTriple);
Output += "-";		Output += "-";
Output.append(BoundArch);		Output.append(BoundArch);
NamedOutput = C.getArgs().MakeArgString(Output.c_str());		NamedOutput = C.getArgs().MakeArgString(Output.c_str());
} else {		} else {
NamedOutput = getDefaultImageName();		NamedOutput = getDefaultImageName();
}		}
} else if (JA.getType() == types::TY_PCH && IsCLMode()) {		} else if (JA.getType() == types::TY_PCH && IsCLMode()) {
NamedOutput = C.getArgs().MakeArgString(GetClPchPath(C, BaseName).c_str());		NamedOutput = C.getArgs().MakeArgString(GetClPchPath(C, BaseName).c_str());
} else {		} else {
const char *Suffix = types::getTypeTempSuffix(JA.getType(), IsCLMode());		const char *Suffix = types::getTypeTempSuffix(JA.getType(), IsCLMode());
assert(Suffix && "All types used for output should have a suffix.");		assert(Suffix && "All types used for output should have a suffix.");

std::string::size_type End = std::string::npos;		std::string::size_type End = std::string::npos;
if (!types::appendSuffixForType(JA.getType()))		if (!types::appendSuffixForType(JA.getType()))
End = BaseName.rfind('.');		End = BaseName.rfind('.');
SmallString<128> Suffixed(BaseName.substr(0, End));		SmallString<128> Suffixed(BaseName.substr(0, End));
		Suffixed += JA.getOffloadingFileNamePrefix(NormalizedTriple);
if (MultipleArchs && BoundArch) {		if (MultipleArchs && BoundArch) {
Suffixed += "-";		Suffixed += "-";
Suffixed.append(BoundArch);		Suffixed.append(BoundArch);
}		}
// When using both -save-temps and -emit-llvm, use a ".tmp.bc" suffix for		// When using both -save-temps and -emit-llvm, use a ".tmp.bc" suffix for
// the unoptimized bitcode so that it does not get overwritten by the ".bc"		// the unoptimized bitcode so that it does not get overwritten by the ".bc"
// optimized bitcode output.		// optimized bitcode output.
if (!AtTopLevel && C.getArgs().hasArg(options::OPT_emit_llvm) &&		if (!AtTopLevel && C.getArgs().hasArg(options::OPT_emit_llvm) &&
▲ Show 20 Lines • Show All 385 Lines • Show Last 20 Lines
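
Reviewer's note on the result cache in `Driver::BuildJobsForAction` above: results are keyed on the action pointer plus the normalized triple with the bound arch appended. Per the code comment, this is because two bound archs (armv7 and armv7s) can map to the same triple. The following is a minimal standalone sketch of that keying, not the driver's code. `FakeAction`, `makeKey`, `buildOnce`, the `int` result type, and the triple strings are hypothetical stand-ins chosen so the snippet compiles without LLVM.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for clang::driver::Action; only its address matters.
struct FakeAction {};

using CacheKey = std::pair<const FakeAction *, std::string>;

// Mirrors the TriplePlusArch construction: the normalized triple, followed by
// "-" and the bound arch when one is set.
CacheKey makeKey(const FakeAction *A, const std::string &NormalizedTriple,
                 const char *BoundArch) {
  std::string TriplePlusArch = NormalizedTriple;
  if (BoundArch) {
    TriplePlusArch += "-";
    TriplePlusArch += BoundArch;
  }
  return {A, TriplePlusArch};
}

// Returns the cached result for Key, building (and counting) only on a miss.
// The int result is a placeholder for the InputInfo the driver would cache.
int buildOnce(std::map<CacheKey, int> &Cache, const CacheKey &Key,
              int &BuildCount) {
  auto It = Cache.find(Key);
  if (It != Cache.end())
    return It->second;
  int Result = ++BuildCount;
  Cache[Key] = Result;
  return Result;
}
```

With this keying, the same action built for two bound archs under one triple gets two distinct cache entries, while a repeated request for the same pair is served from the cache.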

lib/Driver/ToolChain.cpp

Show First 20 Lines • Show All 242 Lines • ▼ Show 20 Lines	Tool *ToolChain::getTool(Action::ActionClass AC) const {
case Action::AssembleJobClass:		case Action::AssembleJobClass:
return getAssemble();		return getAssemble();

case Action::LinkJobClass:		case Action::LinkJobClass:
return getLink();		return getLink();

case Action::InputClass:		case Action::InputClass:
case Action::BindArchClass:		case Action::BindArchClass:
case Action::CudaDeviceClass:		case Action::OffloadClass:
case Action::CudaHostClass:
case Action::LipoJobClass:		case Action::LipoJobClass:
case Action::DsymutilJobClass:		case Action::DsymutilJobClass:
case Action::VerifyDebugInfoJobClass:		case Action::VerifyDebugInfoJobClass:
llvm_unreachable("Invalid tool kind.");		llvm_unreachable("Invalid tool kind.");

case Action::CompileJobClass:		case Action::CompileJobClass:
case Action::PrecompileJobClass:		case Action::PrecompileJobClass:
case Action::PreprocessJobClass:		case Action::PreprocessJobClass:
▲ Show 20 Lines • Show All 441 Lines • Show Last 20 Lines

lib/Driver/ToolChains.h

Show First 20 Lines • Show All 846 Lines • ▼ Show 20 Lines	TranslateArgs(const llvm::opt::DerivedArgList &Args,
const char *BoundArch) const override;		const char *BoundArch) const override;
void addClangTargetOptions(const llvm::opt::ArgList &DriverArgs,		void addClangTargetOptions(const llvm::opt::ArgList &DriverArgs,
llvm::opt::ArgStringList &CC1Args) const override;		llvm::opt::ArgStringList &CC1Args) const override;

// Never try to use the integrated assembler with CUDA; always fork out to		// Never try to use the integrated assembler with CUDA; always fork out to
// ptxas.		// ptxas.
bool useIntegratedAs() const override { return false; }		bool useIntegratedAs() const override { return false; }

		// Converts CUDA GPU architecture, e.g. "sm_21", to its corresponding virtual
		// compute arch, e.g. "compute_20". Returns null if the input arch is null or
		// doesn't match an existing arch.
		static const char *GpuArchToComputeName(const char *ArchName);

protected:		protected:
Tool *buildAssembler() const override; // ptxas		Tool *buildAssembler() const override; // ptxas
Tool *buildLinker() const override; // fatbinary (ok, not really a linker)		Tool *buildLinker() const override; // fatbinary (ok, not really a linker)
};		};

class LLVM_LIBRARY_VISIBILITY MipsLLVMToolChain : public Linux {		class LLVM_LIBRARY_VISIBILITY MipsLLVMToolChain : public Linux {
protected:		protected:
Tool *buildLinker() const override;		Tool *buildLinker() const override;
▲ Show 20 Lines • Show All 338 Lines • Show Last 20 Lines

lib/Driver/ToolChains.cpp

Show First 20 Lines • Show All 4,707 Lines • ▼ Show 20 Lines	CudaToolChain::TranslateArgs(const llvm::opt::DerivedArgList &Args,

if (BoundArch) {		if (BoundArch) {
DAL->eraseArg(options::OPT_march_EQ);		DAL->eraseArg(options::OPT_march_EQ);
DAL->AddJoinedArg(nullptr, Opts.getOption(options::OPT_march_EQ), BoundArch);		DAL->AddJoinedArg(nullptr, Opts.getOption(options::OPT_march_EQ), BoundArch);
}		}
return DAL;		return DAL;
}		}

		const char *CudaToolChain::GpuArchToComputeName(const char *ArchName) {
		if (!ArchName)
		return nullptr;
		return llvm::StringSwitch<const char *>(ArchName)
		.Cases("sm_20", "sm_21", "compute_20")
		.Case("sm_30", "compute_30")
		.Case("sm_32", "compute_32")
		.Case("sm_35", "compute_35")
		.Case("sm_37", "compute_37")
		.Case("sm_50", "compute_50")
		.Case("sm_52", "compute_52")
		.Case("sm_53", "compute_53")
		.Default(nullptr);
		}

Tool *CudaToolChain::buildAssembler() const {		Tool *CudaToolChain::buildAssembler() const {
return new tools::NVPTX::Assembler(*this);		return new tools::NVPTX::Assembler(*this);
}		}

Tool *CudaToolChain::buildLinker() const {		Tool *CudaToolChain::buildLinker() const {
return new tools::NVPTX::Linker(*this);		return new tools::NVPTX::Linker(*this);
}		}

▲ Show 20 Lines • Show All 297 Lines • Show Last 20 Lines
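
Reviewer's note on `CudaToolChain::GpuArchToComputeName` above: the mapping from a real GPU arch to its virtual compute arch can be tried outside the driver. The sketch below reproduces the same table with a plain array and `std::strcmp` in place of `llvm::StringSwitch`. The free function name `gpuArchToComputeName` and the table layout are assumptions made for illustration; the arch pairs are copied from the patch.

```cpp
#include <cstring>

// Standalone rendering of the sm_* -> compute_* mapping shown in the patch.
// Returns nullptr for a null input or an arch that is not in the table.
const char *gpuArchToComputeName(const char *ArchName) {
  struct Entry {
    const char *Sm;
    const char *Compute;
  };
  static const Entry Table[] = {
      {"sm_20", "compute_20"}, {"sm_21", "compute_20"},
      {"sm_30", "compute_30"}, {"sm_32", "compute_32"},
      {"sm_35", "compute_35"}, {"sm_37", "compute_37"},
      {"sm_50", "compute_50"}, {"sm_52", "compute_52"},
      {"sm_53", "compute_53"},
  };

  if (!ArchName)
    return nullptr;
  for (const Entry &E : Table)
    if (std::strcmp(ArchName, E.Sm) == 0)
      return E.Compute;
  return nullptr;
}
```

Note that `sm_20` and `sm_21` share `compute_20`, which is why the patch uses `.Cases` for that pair.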

lib/Driver/Tools.h

Show First 20 Lines • Show All 51 Lines • ▼ Show 20 Lines	public:
static const char *getDependencyFileName(const llvm::opt::ArgList &Args,		static const char *getDependencyFileName(const llvm::opt::ArgList &Args,
const InputInfoList &Inputs);		const InputInfoList &Inputs);

private:		private:
void AddPreprocessingOptions(Compilation &C, const JobAction &JA,		void AddPreprocessingOptions(Compilation &C, const JobAction &JA,
const Driver &D, const llvm::opt::ArgList &Args,		const Driver &D, const llvm::opt::ArgList &Args,
llvm::opt::ArgStringList &CmdArgs,		llvm::opt::ArgStringList &CmdArgs,
const InputInfo &Output,		const InputInfo &Output,
const InputInfoList &Inputs,		const InputInfoList &Inputs) const;
const ToolChain *AuxToolChain) const;

void AddAArch64TargetArgs(const llvm::opt::ArgList &Args,		void AddAArch64TargetArgs(const llvm::opt::ArgList &Args,
llvm::opt::ArgStringList &CmdArgs) const;		llvm::opt::ArgStringList &CmdArgs) const;
void AddARMTargetArgs(const llvm::Triple &Triple,		void AddARMTargetArgs(const llvm::Triple &Triple,
const llvm::opt::ArgList &Args,		const llvm::opt::ArgList &Args,
llvm::opt::ArgStringList &CmdArgs,		llvm::opt::ArgStringList &CmdArgs,
bool KernelOrKext) const;		bool KernelOrKext) const;
void AddARM64TargetArgs(const llvm::opt::ArgList &Args,		void AddARM64TargetArgs(const llvm::opt::ArgList &Args,
▲ Show 20 Lines • Show All 891 Lines • Show Last 20 Lines

lib/Driver/Tools.cpp

Show First 20 Lines • Show All 282 Lines • ▼ Show 20 Lines

static bool forwardToGCC(const Option &O) {		static bool forwardToGCC(const Option &O) {
// Don't forward inputs from the original command line. They are added from		// Don't forward inputs from the original command line. They are added from
// InputInfoList.		// InputInfoList.
return O.getKind() != Option::InputClass &&		return O.getKind() != Option::InputClass &&
!O.hasFlag(options::DriverOption) && !O.hasFlag(options::LinkerInput);		!O.hasFlag(options::DriverOption) && !O.hasFlag(options::LinkerInput);
}		}

		/// Add the C++ include args of other offloading toolchains. If this is a host
		/// job, the device toolchains are added. If this is a device job, the host
		/// toolchains will be added.
		static void addExtraOffloadCXXStdlibIncludeArgs(Compilation &C,
		const JobAction &JA,
		const ArgList &Args,
		ArgStringList &CmdArgs) {

		if (JA.isHostOffloading(Action::OFK_Cuda))
		C.getSingleOffloadToolChain<Action::OFK_Cuda>()
		->AddClangCXXStdlibIncludeArgs(Args, CmdArgs);
		else if (JA.isDeviceOffloading(Action::OFK_Cuda))
		C.getSingleOffloadToolChain<Action::OFK_Host>()
		->AddClangCXXStdlibIncludeArgs(Args, CmdArgs);

		// TODO: Add support for other programming models here.
		}

		/// Add the include args that are specific of each offloading programming model.
		static void addExtraOffloadSpecificIncludeArgs(Compilation &C,
		const JobAction &JA,
		const ArgList &Args,
		ArgStringList &CmdArgs) {

		if (JA.isHostOffloading(Action::OFK_Cuda))
		C.getSingleOffloadToolChain<Action::OFK_Host>()->AddCudaIncludeArgs(
		Args, CmdArgs);
		else if (JA.isDeviceOffloading(Action::OFK_Cuda))
		C.getSingleOffloadToolChain<Action::OFK_Cuda>()->AddCudaIncludeArgs(
		Args, CmdArgs);

		// TODO: Add support for other programming models here.
		}

void Clang::AddPreprocessingOptions(Compilation &C, const JobAction &JA,		void Clang::AddPreprocessingOptions(Compilation &C, const JobAction &JA,
const Driver &D, const ArgList &Args,		const Driver &D, const ArgList &Args,
ArgStringList &CmdArgs,		ArgStringList &CmdArgs,
const InputInfo &Output,		const InputInfo &Output,
const InputInfoList &Inputs,		const InputInfoList &Inputs) const {
const ToolChain *AuxToolChain) const {
Arg *A;		Arg *A;
const bool IsIAMCU = getToolChain().getTriple().isOSIAMCU();		const bool IsIAMCU = getToolChain().getTriple().isOSIAMCU();

CheckPreprocessingOptions(D, Args);		CheckPreprocessingOptions(D, Args);

Args.AddLastArg(CmdArgs, options::OPT_C);		Args.AddLastArg(CmdArgs, options::OPT_C);
Args.AddLastArg(CmdArgs, options::OPT_CC);		Args.AddLastArg(CmdArgs, options::OPT_CC);

▲ Show 20 Lines • Show All 248 Lines • ▼ Show 20 Lines	void Clang::AddPreprocessingOptions(Compilation &C, const JobAction &JA,
addDirectoryList(Args, CmdArgs, "-c-isystem", "C_INCLUDE_PATH");		addDirectoryList(Args, CmdArgs, "-c-isystem", "C_INCLUDE_PATH");
// CPLUS_INCLUDE_PATH - system includes enabled when compiling C++.		// CPLUS_INCLUDE_PATH - system includes enabled when compiling C++.
addDirectoryList(Args, CmdArgs, "-cxx-isystem", "CPLUS_INCLUDE_PATH");		addDirectoryList(Args, CmdArgs, "-cxx-isystem", "CPLUS_INCLUDE_PATH");
// OBJC_INCLUDE_PATH - system includes enabled when compiling ObjC.		// OBJC_INCLUDE_PATH - system includes enabled when compiling ObjC.
addDirectoryList(Args, CmdArgs, "-objc-isystem", "OBJC_INCLUDE_PATH");		addDirectoryList(Args, CmdArgs, "-objc-isystem", "OBJC_INCLUDE_PATH");
// OBJCPLUS_INCLUDE_PATH - system includes enabled when compiling ObjC++.		// OBJCPLUS_INCLUDE_PATH - system includes enabled when compiling ObjC++.
addDirectoryList(Args, CmdArgs, "-objcxx-isystem", "OBJCPLUS_INCLUDE_PATH");		addDirectoryList(Args, CmdArgs, "-objcxx-isystem", "OBJCPLUS_INCLUDE_PATH");

// Optional AuxToolChain indicates that we need to include headers		// While adding the include arguments, we also attempt to retrieve the
// for more than one target. If that's the case, add include paths		// arguments of related offloading toolchains or arguments that are specific
// from AuxToolChain right after include paths of the same kind for		// of an offloading programming model.
// the current target.

// Add C++ include arguments, if needed.		// Add C++ include arguments, if needed.
if (types::isCXX(Inputs[0].getType())) {		if (types::isCXX(Inputs[0].getType())) {
getToolChain().AddClangCXXStdlibIncludeArgs(Args, CmdArgs);		getToolChain().AddClangCXXStdlibIncludeArgs(Args, CmdArgs);
if (AuxToolChain)		addExtraOffloadCXXStdlibIncludeArgs(C, JA, Args, CmdArgs);
AuxToolChain->AddClangCXXStdlibIncludeArgs(Args, CmdArgs);
}		}

// Add system include arguments for all targets but IAMCU.		// Add system include arguments for all targets but IAMCU.
if (!IsIAMCU) {		if (!IsIAMCU) {
getToolChain().AddClangSystemIncludeArgs(Args, CmdArgs);		getToolChain().AddClangSystemIncludeArgs(Args, CmdArgs);
if (AuxToolChain)		addExtraOffloadCXXStdlibIncludeArgs(C, JA, Args, CmdArgs);
AuxToolChain->AddClangCXXStdlibIncludeArgs(Args, CmdArgs);
} else {		} else {
// For IAMCU add special include arguments.		// For IAMCU add special include arguments.
getToolChain().AddIAMCUIncludeArgs(Args, CmdArgs);		getToolChain().AddIAMCUIncludeArgs(Args, CmdArgs);
}		}

// Add CUDA include arguments, if needed.		// Add offload include arguments, if needed.
if (types::isCuda(Inputs[0].getType()))		addExtraOffloadSpecificIncludeArgs(C, JA, Args, CmdArgs);
getToolChain().AddCudaIncludeArgs(Args, CmdArgs);
}		}

// FIXME: Move to target hook.		// FIXME: Move to target hook.
static bool isSignedCharDefault(const llvm::Triple &Triple) {		static bool isSignedCharDefault(const llvm::Triple &Triple) {
switch (Triple.getArch()) {		switch (Triple.getArch()) {
default:		default:
return true;		return true;

▲ Show 20 Lines • Show All 3,168 Lines • ▼ Show 20 Lines	void Clang::ConstructJob(Compilation &C, const JobAction &JA,
bool IsIAMCU = getToolChain().getTriple().isOSIAMCU();		bool IsIAMCU = getToolChain().getTriple().isOSIAMCU();

// Check number of inputs for sanity. We need at least one input.		// Check number of inputs for sanity. We need at least one input.
assert(Inputs.size() >= 1 && "Must have at least one input.");		assert(Inputs.size() >= 1 && "Must have at least one input.");
const InputInfo &Input = Inputs[0];		const InputInfo &Input = Inputs[0];
// CUDA compilation may have multiple inputs (source file + results of		// CUDA compilation may have multiple inputs (source file + results of
// device-side compilations). All other jobs are expected to have exactly one		// device-side compilations). All other jobs are expected to have exactly one
// input.		// input.
bool IsCuda = types::isCuda(Input.getType());		bool IsCuda = JA.isOffloading(Action::OFK_Cuda);
assert((IsCuda || Inputs.size() == 1) && "Unable to handle multiple inputs.");		assert((IsCuda || Inputs.size() == 1) && "Unable to handle multiple inputs.");

// C++ is not supported for IAMCU.		// C++ is not supported for IAMCU.
if (IsIAMCU && types::isCXX(Input.getType()))		if (IsIAMCU && types::isCXX(Input.getType()))
D.Diag(diag::err_drv_clang_unsupported) << "C++ for IAMCU";		D.Diag(diag::err_drv_clang_unsupported) << "C++ for IAMCU";

// Invoke ourselves in -cc1 mode.		// Invoke ourselves in -cc1 mode.
//		//
// FIXME: Implement custom jobs for internal actions.		// FIXME: Implement custom jobs for internal actions.
CmdArgs.push_back("-cc1");		CmdArgs.push_back("-cc1");

// Add the "effective" target triple.		// Add the "effective" target triple.
CmdArgs.push_back("-triple");		CmdArgs.push_back("-triple");
CmdArgs.push_back(Args.MakeArgString(TripleStr));		CmdArgs.push_back(Args.MakeArgString(TripleStr));

const ToolChain *AuxToolChain = nullptr;
if (IsCuda) {		if (IsCuda) {
// FIXME: We need a (better) way to pass information about		// We have to pass the triple of the host if compiling for a CUDA device and
// particular compilation pass we're constructing here. For now we		// vice-versa.
// can check which toolchain we're using and pick the other one to		StringRef NormalizedTriple;
// extract the triple.		if (JA.isDeviceOffloading(Action::OFK_Cuda))
if (&getToolChain() == C.getSingleOffloadToolChain<Action::OFK_Cuda>())		NormalizedTriple = C.getSingleOffloadToolChain<Action::OFK_Host>()
AuxToolChain = C.getOffloadingHostToolChain();		->getTriple()
else if (&getToolChain() == C.getOffloadingHostToolChain())		.normalize();
AuxToolChain = C.getSingleOffloadToolChain<Action::OFK_Cuda>();		else
else		NormalizedTriple = C.getSingleOffloadToolChain<Action::OFK_Cuda>()
llvm_unreachable("Can't figure out CUDA compilation mode.");		->getTriple()
assert(AuxToolChain != nullptr && "No aux toolchain.");		.normalize();

CmdArgs.push_back("-aux-triple");		CmdArgs.push_back("-aux-triple");
CmdArgs.push_back(Args.MakeArgString(AuxToolChain->getTriple().str()));		CmdArgs.push_back(Args.MakeArgString(NormalizedTriple));
		tra: All we need is a target triple here. Now that we have device offloading info, perhaps we can bypass AuxToolchain and let the offloading info provide the host or device triple directly. That would render the FIXME above obsolete, IMO.
		sfantao (author): I removed the use of AuxToolChain. It was also being used for the preprocessor argument. I added some new logic in there so that the information is extracted from the toolchains owned by the compilation. Let me know if that is what you had in mind.
}		}

if (Triple.isOSWindows() && (Triple.getArch() == llvm::Triple::arm ||		if (Triple.isOSWindows() && (Triple.getArch() == llvm::Triple::arm ||
Triple.getArch() == llvm::Triple::thumb)) {		Triple.getArch() == llvm::Triple::thumb)) {
unsigned Offset = Triple.getArch() == llvm::Triple::arm ? 4 : 6;		unsigned Offset = Triple.getArch() == llvm::Triple::arm ? 4 : 6;
unsigned Version;		unsigned Version;
Triple.getArchName().substr(Offset).getAsInteger(10, Version);		Triple.getArchName().substr(Offset).getAsInteger(10, Version);
if (Version < 7)		if (Version < 7)
▲ Show 20 Lines • Show All 861 Lines • ▼ Show 20 Lines	if (const Arg *A = Args.getLastArg(options::OPT_ccc_objcmt_migrate)) {
Args.AddLastArg(CmdArgs, options::OPT_objcmt_whitelist_dir_path);		Args.AddLastArg(CmdArgs, options::OPT_objcmt_whitelist_dir_path);
}		}

// Add preprocessing options like -I, -D, etc. if we are using the		// Add preprocessing options like -I, -D, etc. if we are using the
// preprocessor.		// preprocessor.
//		//
// FIXME: Support -fpreprocessed		// FIXME: Support -fpreprocessed
if (types::getPreprocessedType(InputType) != types::TY_INVALID)		if (types::getPreprocessedType(InputType) != types::TY_INVALID)
AddPreprocessingOptions(C, JA, D, Args, CmdArgs, Output, Inputs,		AddPreprocessingOptions(C, JA, D, Args, CmdArgs, Output, Inputs);
AuxToolChain);

// Don't warn about "clang -c -DPIC -fPIC test.i" because libtool.m4 assumes		// Don't warn about "clang -c -DPIC -fPIC test.i" because libtool.m4 assumes
// that "The compiler can only warn and ignore the option if not recognized".		// that "The compiler can only warn and ignore the option if not recognized".
// When building with ccache, it will pass -D options to clang even on		// When building with ccache, it will pass -D options to clang even on
// preprocessed inputs and configure concludes that -fPIC is not supported.		// preprocessed inputs and configure concludes that -fPIC is not supported.
Args.ClaimAllArgs(options::OPT_D);		Args.ClaimAllArgs(options::OPT_D);

// Manually translate -O4 to -O3; let clang reject others.		// Manually translate -O4 to -O3; let clang reject others.
[6,439 lines not shown]	void NVPTX::Assembler::ConstructJob(Compilation &C, const JobAction &JA,
const InputInfo &Output,		const InputInfo &Output,
const InputInfoList &Inputs,		const InputInfoList &Inputs,
const ArgList &Args,		const ArgList &Args,
const char *LinkingOutput) const {		const char *LinkingOutput) const {
const auto &TC =		const auto &TC =
static_cast<const toolchains::CudaToolChain &>(getToolChain());		static_cast<const toolchains::CudaToolChain &>(getToolChain());
assert(TC.getTriple().isNVPTX() && "Wrong platform");		assert(TC.getTriple().isNVPTX() && "Wrong platform");

std::vector<std::string> gpu_archs =		// Obtain architecture from the action.
Args.getAllArgValues(options::OPT_march_EQ);		const char *gpu_arch = JA.getOffloadingArch();
assert(gpu_archs.size() == 1 && "Exactly one GPU Arch required for ptxas.");		assert(gpu_arch && "Device action expected to have an architecture.");
const std::string& gpu_arch = gpu_archs[0];

ArgStringList CmdArgs;		ArgStringList CmdArgs;
CmdArgs.push_back(TC.getTriple().isArch64Bit() ? "-m64" : "-m32");		CmdArgs.push_back(TC.getTriple().isArch64Bit() ? "-m64" : "-m32");
if (Args.hasFlag(options::OPT_cuda_noopt_device_debug,		if (Args.hasFlag(options::OPT_cuda_noopt_device_debug,
options::OPT_no_cuda_noopt_device_debug, false)) {		options::OPT_no_cuda_noopt_device_debug, false)) {
// ptxas does not accept -g option if optimization is enabled, so		// ptxas does not accept -g option if optimization is enabled, so
// we ignore the compiler's -O* options if we want debug info.		// we ignore the compiler's -O* options if we want debug info.
CmdArgs.push_back("-g");		CmdArgs.push_back("-g");
[58 lines not shown]	void NVPTX::Linker::ConstructJob(Compilation &C, const JobAction &JA,

ArgStringList CmdArgs;		ArgStringList CmdArgs;
CmdArgs.push_back("--cuda");		CmdArgs.push_back("--cuda");
CmdArgs.push_back(TC.getTriple().isArch64Bit() ? "-64" : "-32");		CmdArgs.push_back(TC.getTriple().isArch64Bit() ? "-64" : "-32");
CmdArgs.push_back(Args.MakeArgString("--create"));		CmdArgs.push_back(Args.MakeArgString("--create"));
CmdArgs.push_back(Args.MakeArgString(Output.getFilename()));		CmdArgs.push_back(Args.MakeArgString(Output.getFilename()));

for (const auto& II : Inputs) {		for (const auto& II : Inputs) {
auto* A = cast<const CudaDeviceAction>(II.getAction());		auto *A = II.getAction();
		assert(A->getInputs().size() == 1 &&
		"Device offload action is expected to have a single input");
		const char *gpu_arch = A->getOffloadingArch();
		assert(gpu_arch &&
		"Device action expected to have associated a GPU architecture!");

// We need to pass an Arch of the form "sm_XX" for cubin files and		// We need to pass an Arch of the form "sm_XX" for cubin files and
// "compute_XX" for ptx.		// "compute_XX" for ptx.
const char *Arch = (II.getType() == types::TY_PP_Asm)		const char *Arch =
? A->getComputeArchName()		(II.getType() == types::TY_PP_Asm)
: A->getGpuArchName();		? toolchains::CudaToolChain::GpuArchToComputeName(gpu_arch)
		: gpu_arch;
CmdArgs.push_back(Args.MakeArgString(llvm::Twine("--image=profile=") +		CmdArgs.push_back(Args.MakeArgString(llvm::Twine("--image=profile=") +
Arch + ",file=" + II.getFilename()));		Arch + ",file=" + II.getFilename()));
}		}

for (const auto& A : Args.getAllArgValues(options::OPT_Xcuda_fatbinary))		for (const auto& A : Args.getAllArgValues(options::OPT_Xcuda_fatbinary))
CmdArgs.push_back(Args.MakeArgString(A));		CmdArgs.push_back(Args.MakeArgString(A));

const char *Exec = Args.MakeArgString(TC.GetProgramPath("fatbinary"));		const char *Exec = Args.MakeArgString(TC.GetProgramPath("fatbinary"));
C.addCommand(llvm::make_unique<Command>(JA, *this, Exec, CmdArgs, Inputs));		C.addCommand(llvm::make_unique<Command>(JA, *this, Exec, CmdArgs, Inputs));
}		}
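
The architecture selection in the loop above can be illustrated in isolation. The following is a minimal sketch, not the driver code: `gpuArchToComputeName` and `makeImageArg` are hypothetical stand-ins, and the "sm_XX" to "compute_XX" mapping is a simplifying assumption (the real `CudaToolChain::GpuArchToComputeName` may special-case some architectures).

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for CudaToolChain::GpuArchToComputeName.
// Simplifying assumption: "sm_XX" always maps to "compute_XX".
std::string gpuArchToComputeName(const std::string &GpuArch) {
  assert(GpuArch.compare(0, 3, "sm_") == 0 && "expected an sm_XX name");
  return "compute_" + GpuArch.substr(3);
}

// Builds one fatbinary "--image" argument. PTX inputs use the "compute_XX"
// profile; cubin inputs use the "sm_XX" profile unchanged.
std::string makeImageArg(const std::string &GpuArch, bool IsPtx,
                         const std::string &File) {
  const std::string Arch = IsPtx ? gpuArchToComputeName(GpuArch) : GpuArch;
  return "--image=profile=" + Arch + ",file=" + File;
}
```

The point of the change is that the architecture now comes from the action feeding each input (`getOffloadingArch()`), so no cast to a CUDA-specific action class is needed.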

lib/Frontend/CreateInvocationFromCommandLine.cpp

[54 lines not shown]	clang::createInvocationFromCommandLine(ArrayRef<const char *> ArgList,

// Just print the cc1 options if -### was present.		// Just print the cc1 options if -### was present.
if (C->getArgs().hasArg(driver::options::OPT__HASH_HASH_HASH)) {		if (C->getArgs().hasArg(driver::options::OPT__HASH_HASH_HASH)) {
C->getJobs().Print(llvm::errs(), "\n", true);		C->getJobs().Print(llvm::errs(), "\n", true);
return nullptr;		return nullptr;
}		}

// We expect to get back exactly one command job, if we didn't something		// We expect to get back exactly one command job, if we didn't something
// failed. CUDA compilation is an exception as it creates multiple jobs. If		// failed. Offload compilation is an exception as it creates multiple jobs. If
// that's the case, we proceed with the first job. If caller needs particular		// that's the case, we proceed with the first job. If caller needs a
// CUDA job, it should be controlled via --cuda-{host|device}-only option		// particular job, it should be controlled via options (e.g.
// passed to the driver.		// --cuda-{host|device}-only for CUDA) passed to the driver.
const driver::JobList &Jobs = C->getJobs();		const driver::JobList &Jobs = C->getJobs();
bool CudaCompilation = false;		bool OffloadCompilation = false;
if (Jobs.size() > 1) {		if (Jobs.size() > 1) {
for (auto &A : C->getActions()){		for (auto &A : C->getActions()){
// On MacOSX real actions may end up being wrapped in BindArchAction		// On MacOSX real actions may end up being wrapped in BindArchAction
if (isa<driver::BindArchAction>(A))		if (isa<driver::BindArchAction>(A))
A = *A->input_begin();		A = *A->input_begin();
if (isa<driver::CudaDeviceAction>(A)) {		if (isa<driver::OffloadAction>(A)) {
CudaCompilation = true;		OffloadCompilation = true;
break;		break;
}		}
}		}
}		}
if (Jobs.size() == 0 ||  !isa<driver::Command>(*Jobs.begin()) ||		if (Jobs.size() == 0 || !isa<driver::Command>(*Jobs.begin()) ||
(Jobs.size() > 1 && !CudaCompilation)) {		(Jobs.size() > 1 && !OffloadCompilation)) {
SmallString<256> Msg;		SmallString<256> Msg;
llvm::raw_svector_ostream OS(Msg);		llvm::raw_svector_ostream OS(Msg);
Jobs.Print(OS, "; ", true);		Jobs.Print(OS, "; ", true);
Diags->Report(diag::err_fe_expected_compiler_job) << OS.str();		Diags->Report(diag::err_fe_expected_compiler_job) << OS.str();
return nullptr;		return nullptr;
}		}

const driver::Command &Cmd = cast<driver::Command>(*Jobs.begin());		const driver::Command &Cmd = cast<driver::Command>(*Jobs.begin());
[15 lines not shown]
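
The job-count check in this hunk can be summarized as a small decision function. This is a rough sketch under stated assumptions: `JobKind`, `ActionKind`, `Action`, `isOffloadCompilation`, and `canUseFirstJob` are hypothetical stand-ins for the driver types, used only to show the accept/reject logic.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for the driver's job and action kinds.
enum class JobKind { Command, Other };
enum class ActionKind { Input, Compile, BindArch, Offload };

struct Action {
  ActionKind Kind;
  const Action *Input; // first input, or nullptr if there is none
};

// True if any top-level action is an offload action, looking through a
// BindArch wrapper as the original code does for MacOSX.
bool isOffloadCompilation(const std::vector<const Action *> &Actions) {
  for (const Action *A : Actions) {
    if (A->Kind == ActionKind::BindArch && A->Input)
      A = A->Input;
    if (A->Kind == ActionKind::Offload)
      return true;
  }
  return false;
}

// The first job is usable if it exists and is a command, and either it is
// the only job or the extra jobs come from an offload compilation.
bool canUseFirstJob(const std::vector<JobKind> &Jobs,
                    const std::vector<const Action *> &Actions) {
  if (Jobs.empty() || Jobs.front() != JobKind::Command)
    return false;
  if (Jobs.size() > 1 && !isOffloadCompilation(Actions))
    return false;
  return true;
}
```

With this shape, any offloading model that wraps its device work in an offload action is accepted, rather than CUDA alone.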

Revision Contents

Diff 62572
