This is an archive of the discontinued LLVM Phabricator instance.

Paths

include/clang/Basic/DiagnosticDriverKinds.td
include/clang/Driver/Action.h
include/clang/Driver/CC1Options.td
include/clang/Driver/Driver.h
include/clang/Driver/Options.td
include/clang/Driver/ToolChain.h
include/clang/Driver/Types.h
lib/Driver/Action.cpp
lib/Driver/Compilation.cpp
lib/Driver/Driver.cpp
lib/Driver/ToolChain.cpp
lib/Driver/ToolChains.h
lib/Driver/ToolChains.cpp
lib/Driver/Tools.h
lib/Driver/Tools.cpp
lib/Driver/Types.cpp
test/OpenMP/target_driver.c

Differential D9888

[OPENMP] Driver support for OpenMP offloading
Abandoned, Public

Authored by sfantao on May 20 2015, 10:42 AM.

Details

Reviewers

tra
chandlerc
caomhin
rjmccall
carlo.bertolli
ABataev
echristo
hfinkel
rsmith

Summary

With a full implementation of OpenMP 3.1 already available upstream, we aim to continue that work and add support for OpenMP 4.0 as well. One important component introduced by OpenMP 4.0 is offloading, which enables the execution of a given structured block to be transferred to a device other than the host.

An implementation of the OpenMP offloading infrastructure in clang is proposed in http://goo.gl/L1rnKJ. This document is already a second iteration that includes contributions from several vendors and members of the LLVM community. It was published at http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-April/084304.html for discussion by the community, and so far no major concerns about the design have been raised.

Unlike other OpenMP components, offloading requires support from the compiler driver, given that for the same source file several (host and target) objects will be generated using potentially different toolchains. At the same time, the compiler needs a mechanism to relate variables in the host with the ones generated for the target, so communication between toolchains is required. The way this relation is supported by the driver will also have implications for code generation.

This patch proposes an implementation of the driver support for offloading. The following summarizes the main changes this patch introduces:

a) clang can be invoked with -fopenmp=libiomp5 -omptargets=triple1,…,tripleN, where each triple is a target triple the user wants to be able to offload to.

b) The driver detects whether the offloading triples are valid and whether the corresponding toolchain is prepared to offload. This patch only enables offloading for Linux toolchains.

c) Each target compiler phase takes the host IR (the result of the host compiler phase) as a second input. This enables host code generation to specify, in the form of metadata, the variables that should be emitted for the target; the target frontend can then read this metadata.

d) Given that the same host IR result is used by the different toolchains, the driver keeps a cache of results so that the job that generates a given result is not emitted twice.

e) Offloading leverages the argument translation functionality in order to convert host arguments into target arguments. This is currently used to make sure a shared library is always produced by the target toolchain - a library that can be loaded by the OpenMP runtime library.

f) The target shared libraries are embedded into the host binary by using a linker script produced by the driver and passed to the host linker.

g) The driver passes the frontend an offloading option that specifies whether the frontend is producing code for a target. This is required because code generation for the target and for the host has to differ.

h) A full path to the original source file is passed to the frontend so it can be used to produce unique IDs that are the same for the host and targets.
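
As an illustration of points a) and b), here is a minimal sketch of how a comma-separated list of offloading triples could be split and filtered. The helper names splitOffloadTriples and isOffloadCapable are hypothetical and are not taken from the patch, which relies on the driver's own triple and toolchain handling; the "linux" substring test is only an assumption that mirrors the Linux-only restriction stated above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: split the value of an option such as
// -omptargets=triple1,...,tripleN into individual triple strings.
// Empty items (e.g. from "a,,b") are skipped.
std::vector<std::string> splitOffloadTriples(const std::string &value) {
  std::vector<std::string> triples;
  std::string::size_type start = 0;
  while (start <= value.size()) {
    std::string::size_type comma = value.find(',', start);
    if (comma == std::string::npos)
      comma = value.size();
    if (comma > start)
      triples.push_back(value.substr(start, comma - start));
    start = comma + 1;
  }
  return triples;
}

// Hypothetical check standing in for the driver's validation: a triple is
// accepted only if it mentions "linux", mirroring the statement that this
// patch only enables offloading for Linux toolchains.
bool isOffloadCapable(const std::string &triple) {
  assert(!triple.empty() && "expected a non-empty triple");
  return triple.find("linux") != std::string::npos;
}
```

With these helpers, splitting "powerpc64le-ibm-linux-gnu,x86_64-unknown-linux-gnu" yields two triples, both of which pass the Linux check, while a triple such as nvptx64-nvidia-cuda would be rejected by this simplified test.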

Thanks!
Samuel

Event Timeline

sfantao updated this revision to Diff 26158. May 20 2015, 10:42 AM

sfantao retitled this revision from to [OPENMP] Driver support for OpenMP offloading.

sfantao updated this object.

sfantao edited the test plan for this revision.

sfantao added reviewers: ABataev, hfinkel, rsmith, rjmccall, chandlerc.

sfantao added a subscriber: Unknown Object (MLST).

Herald added a subscriber: jfb. May 20 2015, 10:42 AM

Hmm. Using the host IR as an implicit line of communication is an interesting approach. Can you expand on what kind of information needs to flow from host to target there, or at least link to a place in the previous discussion?

Hi John

Thanks for looking into this patch!

Sure, let me expand on the host-target communication. Just a little bit of context before I do that:

During code generation, the target frontend has to decide whether a given declaration or target region has to be emitted or not. Without any information communicated from the host frontend, this decision becomes complicated for cases like:

#pragma omp target regions in static functions or class members;
static declarations delimited by #pragma omp declare target regions that end up not being used;
#pragma omp target in template functions.

In order for the target frontend to correctly identify all the declarations that need to be emitted, it would have to somehow emulate the actions done by the host frontend, which would make code generation messy in places that are not even related to OpenMP.

On top of that, in order to have an efficient mapping between the host and target entries tables (global declarations/target regions; this is discussed in section 5.1 of the document, where __tgt_offload_entry is introduced), the compiler would have to emit the corresponding entries on the host and target side in the same order. This is useful for devices whose toolchains maintain the order of the symbols, given that the order of the entries in the host and target tables will then be the same after linking, so knowing an index is enough to do the mapping. For that to happen, the target frontend would have to know that order, which would also be hard to extract if no information is communicated from the host.
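
The index-based mapping described above can be sketched as follows. This is only an illustration under stated assumptions: the Entry type is a simplified, hypothetical stand-in (the real __tgt_offload_entry layout is defined in the design document), and the function names are invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for an offload entry.
struct Entry {
  const void *address; // address of the global or target region
  std::string name;    // mangled name, kept only for illustration
};

// Find the position of a host address in the host table.
// Returns hostTable.size() if the address is not an offload entry.
std::size_t indexOfHostEntry(const std::vector<Entry> &hostTable,
                             const void *hostAddress) {
  for (std::size_t i = 0; i < hostTable.size(); ++i)
    if (hostTable[i].address == hostAddress)
      return i;
  return hostTable.size();
}

// Because both tables were emitted in the same order, the index found on the
// host side identifies the matching device entry directly.
const Entry *lookupDeviceEntry(const std::vector<Entry> &hostTable,
                               const std::vector<Entry> &deviceTable,
                               const void *hostAddress) {
  assert(hostTable.size() == deviceTable.size() &&
         "tables must have the same number of entries");
  std::size_t index = indexOfHostEntry(hostTable, hostAddress);
  return index < deviceTable.size() ? &deviceTable[index] : nullptr;
}
```

No name comparison is needed at lookup time; the only requirement is that both frontends agree on the emission order, which is exactly the information the host would have to communicate.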

So, the information that needs to be propagated to make what I described above possible is basically i) the mangled names of the declarations and ii) the order in which they were emitted. This information could be communicated in the form of metadata that is emitted by the host frontend when the module is released and loaded by the target frontend when CGOpenMPRuntime is created. However, this information has to be encoded in slightly different ways for different kinds of declarations. Let me explain this with an example:

//######################################
#pragma omp declare target
struct MyClass{
  ...

  MyClass &add(MyClass &op){...}
  
  MyClass &add(int (&op)[N]){...}
  
  bool eq(MyClass &op){...}

  MyClass() {...}
  
  ~MyClass() {...}
};

MyClass C;
MyClass D;
#pragma omp end declare target 	

void foo(){
  int AA[N];
  MyClass H, T;
  MyClass HC;
  
  ...

  #pragma omp target
  {
    MyClass TC;
    T.add(AA).add(HC);
  }
  
  if (H.eq(T)) {...}
  
  #pragma omp target
  {
    T.add(AA);
  } 
}
//######################################

I was planning the metadata for this example to look more or less like this:

; Named metadata that encloses all the offloading information
!openmp.offloading.info = !{!1, !2, !3, !4, !5, !6, !7, !8, !9, !10}

; Global variables that require a map between host and target:
; Entry 0 -> ID for this type of metadata (0)
; Entry 1 -> Mangled name of the variable
; Entry 2 -> Order it was emitted
!1 = !{i32 0, !"C", i32 0}
!2 = !{i32 0, !"D", i32 2}

; Functions with target regions
; Entry 0 -> ID for this type of metadata (1)
; Entry 1 -> Mangled name of the function that was emitted for the host and encloses target regions
; Entry 2-n -> Order the target regions in the functions (in the same sequence the statements are found) are emitted 
!3 = !{i32 1, !"_Z3foov", i32 4, i32 5}

; Global initializers
; Entry 0 -> ID for this type of metadata (2)
; Entry 1-n -> Order the initializers are emitted in descending order of priority (we will require a target region per set of initializers with the same priority)
!4 = !{i32 2, i32 6}

; Global Dtors
; Entry 0 -> ID for this type of metadata (3)
; Entry 1 -> Mangled name of the variable to be destructed 
; Entry 2 -> Order the destructor was emitted (we will have a target region per variable being destroyed - this can probably be optimized)
!5 = !{i32 3, !"C", i32 1}
!6 = !{i32 3, !"D", i32 3}

; Other functions that should be emitted in the target but do not require to be mapped to the host
; Entry 0 -> ID for this type of metadata (4)
; Entry 1 -> Mangled name of the function that has to be emitted.
!7 = !{i32 4, !"_ZN7MyClass3addERA64_i"}
!8 = !{i32 4, !"_ZN7MyClass3addERS_"}
!9 = !{i32 4, !"_ZN7MyClassC2Ev"}
!10 = !{i32 4, !"_ZN7MyClassD2Ev"}
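
To make the role of the emission indices concrete, here is a small sketch of how a consumer could rebuild the entry order from such records. The InfoKind and InfoRecord types and the entryOrder function are hypothetical and only mirror the five metadata IDs shown above; they are not part of the proposed patch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Kinds mirror the IDs used in the example metadata (entry 0 of each node).
enum class InfoKind {
  GlobalVar = 0,
  TargetRegions = 1,
  GlobalInit = 2,
  GlobalDtor = 3,
  OtherFunction = 4
};

// Hypothetical in-memory form of one metadata node.
struct InfoRecord {
  InfoKind kind;
  std::string name;        // mangled name; empty for GlobalInit
  std::vector<int> orders; // emission indices; empty for OtherFunction
};

// Rebuild the entry order: slot i holds a label for the entry emitted i-th.
std::vector<std::string> entryOrder(const std::vector<InfoRecord> &records) {
  std::size_t count = 0;
  for (const InfoRecord &r : records)
    count += r.orders.size();
  std::vector<std::string> slots(count);
  for (const InfoRecord &r : records) {
    for (int order : r.orders) {
      assert(order >= 0 && static_cast<std::size_t>(order) < count &&
             "emission indices must be dense");
      std::string label = r.name.empty() ? "<init>" : r.name;
      if (r.kind == InfoKind::GlobalDtor)
        label += " (dtor)";
      if (r.kind == InfoKind::TargetRegions)
        label += " (region)";
      slots[order] = label;
    }
  }
  return slots;
}
```

Fed with the records from the example, this yields seven slots: C, the destructor region for C, D, the destructor region for D, the two target regions of foo, and the initializer region. The functions with ID 4 carry no index and therefore occupy no slot.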

I realize this is the kind of information I should propose as a patch to the codegen part of offloading, but I think it makes sense to discuss it now as the driver has to enable it.

I also foresee the communication between target and host being useful for other cases, like the propagation of alias information from host to target. However, I don't have a proposal for that at this moment.

Hope I haven’t been either too brief or too exhaustive! Let me know if I can clarify anything else for you.

Thanks!
Samuel

sfantao added reviewers: carlo.bertolli, caomhin. May 29 2015, 9:25 AM

Are there any other comments or questions about this patch?

Many thanks!
Samuel

echristo added a reviewer: echristo. Jun 8 2015, 8:57 AM

I've just noticed Chad owns the compiler driver, so I believe he should also be added to the list of reviewers for this patch.

Thanks!
Samuel

Quite a big patch, I'd definitely like to take a look at this as well. It relates to how some of the cuda work is progressing too.

Thanks!

-eric

In D9888#188244, @echristo wrote:

Quite a big patch, I'd definitely like to take a look at this as well. It relates to how some of the cuda work is progressing too.

Thanks!

-eric

Thanks eric,

Please let me know any comments you may have.

I agree the patch is quite big... I had a hard time trying to find a better partition that would make sense (this requires small but related changes in several places) and that would map to something meaningful in terms of the regression tests. If you see a good way to partition the patch, let me know and I will gladly do it.

Thanks again!
Samuel

hyviquel added a subscriber: hyviquel. Jun 17 2015, 7:38 AM

mcrosier removed a reviewer: mcrosier. Jul 6 2015, 11:54 AM

Hahnfeld added a subscriber: Hahnfeld. Jul 16 2015, 1:22 AM

tra added a subscriber: tra. Aug 19 2015, 5:06 PM

I think this has to be updated for the current trunk...

This diff refactors the original patch and is rebased on top of the latest offloading changes added for CUDA.

Here I don't touch the CUDA support. I tried, however, to make the implementation modular enough that it could eventually be combined with the CUDA implementation. In my view, OpenMP offloading is more general in the sense that it does not refer to a given toolchain; instead, it uses existing toolchains to generate code for offloading devices. So I believe that a toolchain (which I did not include in this patch) targeting NVPTX will be able to handle both CUDA and OpenMP offloading models.

Chris, Art, I understand you have worked out the latest CUDA changes so any feedback from you is greatly appreciated!

Here are a few more details about this diff:

a) Add tool to bundle and unbundle corresponding host and device files into a single one.

One of the goals of OpenMP offloading is to enable users to offload with little effort, by annotating the code with a few pragmas. I'd also like to save users the trouble of changing their existing applications' build systems. So it is desirable for the compiler to always return a single file, instead of one for the host and one for each target, even if the user is doing separate compilation.

This diff includes a tool named clang-offload-bundler (happy to change the name or even include it in the driver if someone thinks that is the best direction to go). It is used on all input files that are not source files to unbundle them, and on top-level jobs that are not linking jobs to bundle the results obtained for the host and each target.

The format of the bundled files is currently very simple: text formats are concatenated with comments that have a magic string and target identifying triple in between, and binary formats have a header that contains the triple and the offset and size of the code for host and each target.
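
A minimal sketch of the text-format case described above, assuming a made-up marker string: the actual magic string and comment syntax used by the tool are not given in this thread, so kMagic, bundleText, and unbundleText are hypothetical names chosen for illustration only.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Hypothetical marker; the real magic string is not stated in this thread.
static const std::string kMagic = "__BUNDLE_START__ ";

// Concatenate per-triple text, preceding each part with a comment line made
// of the comment prefix (e.g. ";" for LLVM IR), the marker, and the triple.
std::string bundleText(const std::map<std::string, std::string> &parts,
                       const std::string &commentPrefix) {
  assert(!commentPrefix.empty() && "a comment prefix is required");
  std::string out;
  for (const auto &part : parts) {
    out += commentPrefix + " " + kMagic + part.first + "\n";
    out += part.second;
    if (!part.second.empty() && part.second.back() != '\n')
      out += '\n';
  }
  return out;
}

// Split a bundle back into its per-triple parts.
std::map<std::string, std::string>
unbundleText(const std::string &bundle, const std::string &commentPrefix) {
  const std::string marker = commentPrefix + " " + kMagic;
  std::map<std::string, std::string> parts;
  std::string current; // triple being read; empty before the first marker
  std::istringstream in(bundle);
  std::string line;
  while (std::getline(in, line)) {
    if (line.compare(0, marker.size(), marker) == 0) {
      current = line.substr(marker.size());
      parts.emplace(current, std::string());
    } else if (!current.empty()) {
      parts[current] += line + "\n";
    }
  }
  return parts;
}
```

Because the separators are comments in the bundled language, the bundle stays a valid text file of that type, and unbundling is a single pass over the lines. The binary case with a header of offsets and sizes is not covered by this sketch.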

This tool still has to be improved in the future to deal with archive files so that each individual file in the archive is properly handled. Archives are very commonly used in current applications to combine separate compilation results, so I'm convinced users would enjoy this feature.

b) The building of the driver actions is unchanged.

I don't create device-specific actions. Instead, only the bundling/unbundling actions are inserted as the first or last action if the file type requires that.

c) Add offloading kind to ToolChain

Offloading does not require a new toolchain to be created. Existing toolchains are used and the offloading kind is used to drive specific behavior in each toolchain so that valid device code is generated.

This is a major difference from what is currently done for CUDA. But I guess the CUDA implementation easily fits this design and the Nvidia GPU toolchain could be reused for both CUDA and OpenMP offloading.

d) Use Job results cache to easily use host results in device actions and vice-versa.

An array of the results for each job is kept so that the device job can take the result previously generated for the host and use it as input, or vice versa.
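
A minimal sketch of such a results cache, under simplifying assumptions: an action is identified by an integer, a toolchain by its triple string, and a result by an output file name. The class JobResultCache and its members are hypothetical and do not reflect the data structures used in the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical key: (action id, toolchain triple).
using ResultKey = std::pair<int, std::string>;

class JobResultCache {
public:
  // Return the cached result for (action, triple), running `buildJob` only
  // the first time the pair is requested.
  const std::string &getOrBuild(int actionId, const std::string &triple,
                                const std::function<std::string()> &buildJob) {
    assert(buildJob && "a job builder is required");
    ResultKey key(actionId, triple);
    auto it = results.find(key);
    if (it == results.end())
      it = results.emplace(key, buildJob()).first;
    return it->second;
  }

  std::size_t size() const { return results.size(); }

private:
  std::map<ResultKey, std::string> results;
};
```

A device compile job that needs the host IR would call getOrBuild with the host action and host triple; if the host job was already constructed, its result is returned and no second job is emitted.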

In OpenMP the device declarations have to be communicated from the host frontend to the device frontend, so this cache is used to conveniently pass that information. Unlike CUDA, OpenMP doesn't already have outlined functions with "device" attributes that the frontend can rely on to decide what should be emitted.

The result cache can also be updated to keep the information the CUDA implementation requires to decide how host and device binaries are combined (injection is the term used in the code). However, I don't have a concrete proposal for that, given that it is not clear to me what the plans are for CUDA to support separate compilation. I understand that the CUDA binary is inserted directly in the host IR (Art, can you shed some light on this?).

e) Use compiler generated linker script to do the device/host code combining and correctly support separate compilation.

Currently the OpenMP support in the toolchains is only implemented for Generic GCC targets and a linker script is used to embed the resulting device images into the host binary ELF sections. Also, the linker script defines the symbols that are emitted during code generation so that the address of the images can be easily retrieved.

f) Minor refactoring of the existing code to enable reuse.

I've outlined some of the existing code into static functions so that it can be reused by the new offloading-related hooks.

Any comments/remarks are very welcome!

Thanks!
Samuel

Currently trying to test, but

Offloading to the same target isn't supported (x86_64-unknown-linux-gnu as host and device) - this was working with clang-omp

The produced IR isn't showing any calls to the target library and on linkage it complains:

undefined reference to `.omp_offloading.img_start.x86_64-unknown-linux-gnu'
undefined reference to `.omp_offloading.img_end.x86_64-unknown-linux-gnu'
undefined reference to `.omp_offloading.entries_begin'
undefined reference to `.omp_offloading.entries_end'
undefined reference to `.omp_offloading.entries_begin'
undefined reference to `.omp_offloading.entries_end'

(btw: clang-offload-bundler saves the IR file to $TMP with -S -emit-llvm, this seems to be a bug - I had to use --save-temps)

I can't seem to figure out the target triple for NVIDIA GPUs. It should be nvptx[64]-nvidia-cuda which gives me

include/llvm/Option/Option.h:101: const llvm::opt::Option llvm::opt::Option::getAlias() const: Assertion `Info && "Must have a valid info!"' failed.

In clang-omp it was nvptxsm_35-nvidia-cuda but this is now invalid...

In D9888#257904, @sfantao wrote:

This diff refactors the original patch and is rebased on top of the latest offloading changes added for CUDA.

Here I don't touch the CUDA support. I tried, however, to make the implementation modular enough that it could eventually be combined with the CUDA implementation. In my view, OpenMP offloading is more general in the sense that it does not refer to a given toolchain; instead, it uses existing toolchains to generate code for offloading devices. So I believe that a toolchain (which I did not include in this patch) targeting NVPTX will be able to handle both CUDA and OpenMP offloading models.

What do you mean by "does not refer to a given toolchain"? Do you have the toolchain patch available?

Creating a separate toolchain for CUDA was a crutch that was available to craft the appropriate cc1 command line for device-side compilation using an existing toolchain. It works, but it's a rather rigid arrangement. Creating an NVPTX toolchain that can be parameterized to produce CUDA or OpenMP would be an improvement.

Ideally toolchain tweaking should probably be done outside of the toolchain itself so that it can be used with any combination of {CUDA or OpenMP target tweaks}x{toolchains capable of generating target code}.

b) The building of the driver actions is unchanged.

I don't create device-specific actions. Instead, only the bundling/unbundling actions are inserted as the first or last action if the file type requires that.

Could you elaborate on that? The way I read it, the driver sees a linear chain of compilation steps plus bundling/unbundling at the beginning/end, and each action would result in multiple compiler invocations, presumably one per target.

If that's the case, then it may present a bit of a challenge in case one part of compilation depends on results of another. That's the case for CUDA where results of device-side compilation must be present for host-side compilation so we can generate additional code to initialize it at runtime.

c) Add offloading kind to ToolChain

Offloading does not require a new toolchain to be created. Existing toolchains are used and the offloading kind is used to drive specific behavior in each toolchain so that valid device code is generated.

This is a major difference from what is currently done for CUDA. But I guess the CUDA implementation easily fits this design and the Nvidia GPU toolchain could be reused for both CUDA and OpenMP offloading.

Sounds good. I'd be happy to make the necessary changes so that CUDA support uses it.

d) Use Job results cache to easily use host results in device actions and vice-versa.

An array of the results for each job is kept so that the device job can take the result previously generated for the host and use it as input, or vice versa.

Nice. That's something that will be handy for CUDA and may help to avoid passing bits of info about other jobs explicitly throughout the driver.

The result cache can also be updated to keep the information the CUDA implementation requires to decide how host and device binaries are combined (injection is the term used in the code). However, I don't have a concrete proposal for that, given that it is not clear to me what the plans are for CUDA to support separate compilation. I understand that the CUDA binary is inserted directly in the host IR (Art, can you shed some light on this?).

Currently CUDA depends on libcudart which assumes that GPU code and its initialization is done the way nvcc does it. Currently we do include PTX assembly (as in readable text) generated on device side into host-side IR *and* generate some host data structures and init code to register GPU binaries with libcudart. I haven't figured out a way to compile host/device sides of CUDA without a host-side compilation depending on device results.

Long-term we're considering implementing CUDA runtime support based on plain driver interface which would give us more control over where we keep GPU code and how we initialize it. Then we could simplify things and, for example, incorporate GPU code via linker script. Alas for the time being we're stuck with libcudart and sequential device and host compilation phases.

As for separate compilation -- compilation part is doable. It's using the results of such compilation that becomes tricky. CUDA's triple-bracket kernel launch syntax depends on libcudart and will not work, because we would not generate init code. You can still launch kernels manually using raw driver API, but it's quite a bit more convoluted.

--Artem

include/clang/Driver/Driver.h
226	re -> are
228	"If offloading is not supported" perhaps?
lib/Driver/Driver.cpp
2133	"has to be"

Make the offloading ELF sections consistent with what is in http://reviews.llvm.org/D12614.

Fix a bug in the AtTopLevel flag, so that the bundling job is always considered a top-level job.

Fix several typos.

Art, Jonas,

Thanks for the comments!

In D9888#261434, @Hahnfeld wrote:
Currently trying to test, but

Offloading to the same target isn't supported (x86_64-unknown-linux-gnu as host and device) - this was working with clang-omp

The produced IR isn't showing any calls to the target library and on linkage it complains:
undefined reference to `.omp_offloading.img_start.x86_64-unknown-linux-gnu'
undefined reference to `.omp_offloading.img_end.x86_64-unknown-linux-gnu'
undefined reference to `.omp_offloading.entries_begin'
undefined reference to `.omp_offloading.entries_end'
undefined reference to `.omp_offloading.entries_begin'
undefined reference to `.omp_offloading.entries_end'

I assume you were trying this using the diff in http://reviews.llvm.org/D12614. There was an inconsistency in the names of the ELF sections and symbols defined by the linker script in these two patches. This is now fixed.

Note that if you are using the libomptarget library from clang-omp, you need to replace .openmptgt_host_entries with .omp_offloading.entries in the code. I changed the names so that all of them are consistent with what is already in place for other OpenMP directives.

I also changed the file generation so that different files are used even if target and host have the same triple.

Please, let me know if it still does not work for you.

(btw: clang-offload-bundler saves the IR file to $TMP with -S -emit-llvm, this seems to be a bug - I had to use --save-temps)

Yes, the bundling job was not being marked as top level. It is now fixed!

I can't seem to figure out the target triple for NVIDIA GPUs. It should be nvptx[64]-nvidia-cuda which gives me
include/llvm/Option/Option.h:101: const llvm::opt::Option llvm::opt::Option::getAlias() const: Assertion `Info && "Must have a valid info!"' failed.
In clang-omp it was nvptxsm_35-nvidia-cuda but this is now invalid...

I didn't implement the triples logic for the nvptx targets yet. I'll port that from clang-omp once we have the basic functionality working upstream.

I'll address Art's comments in a separate message.

Thanks again,
Samuel

include/clang/Driver/Driver.h
226	Fixed!
228	Fixed!
lib/Driver/Driver.cpp
2133	Fixed!

In D9888#262325, @tra wrote:

In D9888#257904, @sfantao wrote:

This diff refactors the original patch and is rebased on top of the latest offloading changes added for CUDA.

Here I don't touch the CUDA support. I tried, however, to make the implementation modular enough that it could eventually be combined with the CUDA implementation. In my view, OpenMP offloading is more general in the sense that it does not refer to a given toolchain; instead, it uses existing toolchains to generate code for offloading devices. So I believe that a toolchain (which I did not include in this patch) targeting NVPTX will be able to handle both CUDA and OpenMP offloading models.

What do you mean by "does not refer to a given toolchain"? Do you have the toolchain patch available?

I mean not having to create a toolchain for a specific offloading model. OpenMP offloading is meant for any target and possibly many different targets simultaneously, so having a toolchain for each combination would be overwhelming.

I don't have a patch for the toolchain out for review yet. I'm planning to port what we have in clang-omp for the NVPTX toolchain once I have the host functionality in place. In there (https://github.com/clang-omp/clang_trunk/tree/master/lib/Driver) the Driver is implemented in a different way; I guess the version I'm proposing here is much cleaner. However, the ToolChains shouldn't be that different. All the tweaking is moved to the Tool itself, and I imagine I can drive that using the ToolChain offloading kind I'm proposing here. In https://github.com/clang-omp/clang_trunk/blob/master/lib/Driver/Tools.cpp I basically pick some arguments to forward to the tool and do some tricks to include libdevice in the compilation when required. Do you think something like that could also work for CUDA?

Creating a separate toolchain for CUDA was a crutch that was available to craft the appropriate cc1 command line for device-side compilation using an existing toolchain. It works, but it's a rather rigid arrangement. Creating an NVPTX toolchain that can be parameterized to produce CUDA or OpenMP would be an improvement.

Ideally toolchain tweaking should probably be done outside of the toolchain itself so that it can be used with any combination of {CUDA or OpenMP target tweaks}x{toolchains capable of generating target code}.

I agree. I decided to move all the offloading tweaking to the tools, given that this is what the clang tool already does: it customizes the arguments based on the ToolChain that is passed to it.

b) The building of the driver actions is unchanged.

I don't create device-specific actions. Instead, only the bundling/unbundling actions are inserted as the first or last action if the file type requires that.

Could you elaborate on that? The way I read it, the driver sees a linear chain of compilation steps plus bundling/unbundling at the beginning/end, and each action would result in multiple compiler invocations, presumably one per target.

If that's the case, then it may present a bit of a challenge in case one part of compilation depends on results of another. That's the case for CUDA where results of device-side compilation must be present for host-side compilation so we can generate additional code to initialize it at runtime.

That's right. I try to tackle the challenge of passing host/device results to device/host jobs by using a cache of results, as I described in d). The goal here is to add the flexibility required to accommodate different offloading models. In OpenMP we use host compile results in device compile jobs, and device link results in host link jobs, whereas in CUDA the assemble result is used in the compile job. I believe that we can have that cache include whatever information is required to suit all needs.

c) Add offloading kind to ToolChain

Offloading does not require a new toolchain to be created. Existing toolchains are used and the offloading kind is used to drive specific behavior in each toolchain so that valid device code is generated.

This is a major difference from what is currently done for CUDA. But I guess the CUDA implementation easily fits this design and the Nvidia GPU toolchain could be reused for both CUDA and OpenMP offloading.

Sounds good. I'd be happy to make the necessary changes so that CUDA support uses it.

Great! Thanks.

d) Use Job results cache to easily use host results in device actions and vice-versa.

An array of the results for each job is kept so that the device job can take the result previously generated for the host and use it as input, or vice versa.

Nice. That's something that will be handy for CUDA and may help to avoid passing bits of info about other jobs explicitly throughout the driver.

The result cache can also be updated to keep the information the CUDA implementation requires to decide how host and device binaries are combined (injection is the term used in the code). However, I don't have a concrete proposal for that, given that it is not clear to me what the plans are for CUDA to support separate compilation. I understand that the CUDA binary is inserted directly in the host IR (Art, can you shed some light on this?).

Currently CUDA depends on libcudart which assumes that GPU code and its initialization is done the way nvcc does it. Currently we do include PTX assembly (as in readable text) generated on device side into host-side IR *and* generate some host data structures and init code to register GPU binaries with libcudart. I haven't figured out a way to compile host/device sides of CUDA without a host-side compilation depending on device results.

Long-term we're considering implementing CUDA runtime support based on plain driver interface which would give us more control over where we keep GPU code and how we initialize it. Then we could simplify things and, for example, incorporate GPU code via linker script. Alas for the time being we're stuck with libcudart and sequential device and host compilation phases.

As for separate compilation -- compilation part is doable. It's using the results of such compilation that becomes tricky. CUDA's triple-bracket kernel launch syntax depends on libcudart and will not work, because we would not generate init code. You can still launch kernels manually using raw driver API, but it's quite a bit more convoluted.

Ok, I see. I am not aware of what exactly libcudart does, but I can elaborate on what the OpenMP offloading implementation we have in place does:

We have a descriptor that is registered with the runtime library (we generate a function for that, called before any global initializers are executed). This descriptor has (among other things) fields that are initialized with the symbols defined by the linker script (so that the runtime library can immediately get the CUDA module) and also the names of the kernels (in OpenMP we don't have user-defined names for these kernels, so we generate some mangling to make sure they are unique). While launching the kernel, the runtime gets a pointer from which it can easily retrieve the name, and the CUDA driver API is used to get the CUDA function to be launched. We have been successfully generating a CUDA module that works well with separate compilation using ptxas and nvlink.
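
The registration and lookup flow described above could look roughly like the sketch below. Every name here (KernelInfo, ImageDescriptor, OffloadRegistry) is hypothetical and chosen for illustration; the actual descriptor layout and runtime entry points live in the runtime library and are not shown in this thread. No CUDA driver API call is made; the sketch stops at retrieving the kernel name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record tying a host-side handle to a generated kernel name.
struct KernelInfo {
  const void *hostHandle; // pointer the host code passes at launch time
  std::string deviceName; // compiler-generated unique kernel name
};

// Hypothetical descriptor; in the described scheme the image bounds would be
// initialized from symbols defined by the linker script.
struct ImageDescriptor {
  const char *imageBegin;
  const char *imageEnd;
  std::vector<KernelInfo> kernels;
};

class OffloadRegistry {
public:
  // Called once per descriptor, before any kernel launch.
  void registerDescriptor(const ImageDescriptor &desc) {
    assert(desc.imageBegin <= desc.imageEnd && "malformed image range");
    descriptors.push_back(desc);
  }

  // Map a host handle to the device kernel name; empty string if unknown.
  std::string kernelNameFor(const void *hostHandle) const {
    for (const ImageDescriptor &desc : descriptors)
      for (const KernelInfo &k : desc.kernels)
        if (k.hostHandle == hostHandle)
          return k.deviceName;
    return std::string();
  }

private:
  std::vector<ImageDescriptor> descriptors;
};
```

In a real runtime the returned name would then be handed to the device API to obtain the function to launch; a linear search is used here only to keep the example short.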

Part of my work is also to port the runtime library in clang-omp to the LLVM OpenMP project. I see CUDA as a simplified version of what OpenMP does, given that the user controls the data mappings explicitly, so I am sure we can find some synergies in the runtime library too, and you may be able to use something that we already have in there.

Thanks!
Samuel

--Artem

In D9888#262389, @sfantao wrote:

[...]

I assume you were trying this using the diff in http://reviews.llvm.org/D12614. There was an inconsistency in the names of the ELF sections and symbols defined by the linker script in these two patches. This is now fixed.

Note that if you are using the libomptarget library from clang-omp, you need to replace in the code .openmptgt_host_entries by .omp_offloading.entries. I changed the names so that all of them are consistent with what is already in place for other OpenMP directives.

I also changed the files generation so that different files are used even if target and host have the same triple.

Please, let me know if it still does not work for you.

Thanks for your help, a small test program now seems to work!

[...]

I didn't implement the triples logic for the nvptx targets yet. I'll port that from clang-omp once we have the basic functionality working upstream.

Ok, I'll wait then. Thanks for your work and finally upstreaming this!
Jonas

Are there any more comments/suggestions about this patch?

Thanks!
Samuel

Move clang-offload-bundler to a separate review: http://reviews.llvm.org/D13909.

This patch depends on http://reviews.llvm.org/D13909.

Rebase.

sfantao added a comment.Dec 8 2015, 4:03 PM

This comment was removed by sfantao.

Will this eventually receive a final review and get merged?

DmitryPolukhin added a subscriber: DmitryPolukhin.Jan 29 2016, 4:37 AM

arpith-jacob added a subscriber: arpith-jacob.Feb 4 2016, 6:38 AM

andreybokhanko added a subscriber: andreybokhanko.Feb 10 2016, 2:24 AM

@rsmith could you possibly take a look at this one? It has been around for roughly 8 months now and hasn't received much feedback

guansong added a subscriber: guansong.Feb 17 2016, 8:07 AM

tcramer added a subscriber: tcramer.Feb 25 2016, 12:04 AM

jfifield added a subscriber: jfifield.Mar 1 2016, 1:51 PM

jprice added a subscriber: jprice.Mar 9 2016, 9:03 AM

rsmith added a reviewer: tra.Mar 18 2016, 9:45 AM

@echristo, you asked for time to review this; if you still want to, please can you do so?
@tra, it looks like you're happy with this design (and with moving the CUDA offloading support in this direction), please let us know if not!

include/clang/Driver/Options.td
1617–1618	This is an unfortunate flag name; `-oblah` already means something. Is this name chosen for compatibility with some other system, or could we change it to, say, `-fopenmp-targets=`?
lib/Driver/Tools.cpp
316	s -> is

Hi Richard,

Thanks for your review. I partitioned some of the stuff I am proposing here in smaller patches:

http://reviews.llvm.org/D18170
http://reviews.llvm.org/D18171
http://reviews.llvm.org/D18172

These patches already try to incorporate the feedback I got in http://lists.llvm.org/pipermail/cfe-dev/2016-February/047547.html related with the generation of actions.

Thanks again,
Samuel

include/clang/Driver/Options.td
1617–1618	You are right, we are now using -fomptargets in codegen exactly because of that. I can change it to `-fopenmp-targets=`; we don't have any compatibility issues at this point.
lib/Driver/Tools.cpp
316	I'll fix it.

mkuron added a subscriber: mkuron.Mar 19 2016, 1:14 AM

The three smaller patches into which you divided this one appear to be missing some things. For example, AddOpenMPLinkerScript in lib/Driver/Tools.cpp from this patch appears to still be necessary to get the desired functionality, but it is not present in any of the three.

Hi Michael,

In D9888#380225, @mkuron wrote:

The three smaller patches into which you divided this one appear to be missing some things. For example, AddOpenMPLinkerScript in lib/Driver/Tools.cpp from this patch appears to still be necessary to get the desired functionality, but it is not present in any of the three.

Those three patches do not add any OpenMP-specific code yet, so they do not cover the whole implementation I have here. I am doing things in a slightly different way in the new patches given the feedback I got on the mailing list, and I am waiting for review to see if the approach I have in there is acceptable. If so, I'll continue with the OpenMP-related patches afterwards.

Thanks,
Samuel

First I'd like to note that the code quality here is really high; most of my comments are about higher-level design decisions for the driver and the implementation here rather than about code quality.

One meta comment: offload appears to be something that could be used for CUDA and OpenMP (and OpenACC etc) as a term. I think we should either merge these concepts or pick a different name :)

Thanks for all of your work and patience here! The rest of the comments are inline.

-eric

include/clang/Driver/Driver.h
210–213	Example?
216–217	Any reason?
427–435	This function is starting to get a little silly. Perhaps we should look into refactoring such that this doesn't need to be "the one function that rules them all". Perhaps a different ownership model for the things that are arguments here?
lib/Driver/Compilation.cpp
66–67	Hmm?
lib/Driver/Driver.cpp
224–225	This can probably be done separately? Can you split this out and make it generally useful?
2045–2051	Might be time to make some specialized versions of this function. This may take it from "ridiculously confusing" to "code no one should ever look at" :)
lib/Driver/Tools.cpp
6032	Should we get the offload bundler in first so that the interface is there and testable? (Honest question, no particular opinion here). Though the command lines there will affect how this code is written.
test/OpenMP/target_driver.c
41–47	Do we really think the phases should be a DAG check?
54	How do you pass options to individual omptargets? e.g. -mvsx or -mavx2?

Hi Eric,

Thanks for the review!

As you are probably aware, I started partitioning this patch following your initial concern about its size and the feedback I got from http://lists.llvm.org/pipermail/cfe-dev/2016-February/047547.html. I am keeping this patch as it shows the big picture of what I am trying to accomplish, so if you prefer to add other higher-level suggestions here, that's perfectly fine. Let me know if there is a more proper way to link patches.

So, I am incorporating your suggestions here in the partitioned patches as specified in the inline comments. The partitioned patches are http://reviews.llvm.org/D18170, http://reviews.llvm.org/D18171 and http://reviews.llvm.org/D18172.

One meta comment: offload appears to be something that could be used for CUDA and OpenMP (and OpenACC etc) as a term. I think we should either merge these concepts or pick a different name :)

Yes, I agree. I am now using offloading. I only refer to the programming model name if the code relates to something specific of that programming model.

Thanks again,
Samuel

include/clang/Driver/Driver.h
210–213	I got rid of this extra toolchain cache and I am organizing it in a multimap by offload kind as Art suggested in http://reviews.llvm.org/D18170. That avoids the multiple containers for the offloading toolchains (this one and the ordered one).
216–217	Currently in OpenMP, any directive that relates to offloading supports a `device()` clause that specifies which device to use for that region or data transfer. E.g. void foo() { ... } void bar(int i) { #pragma omp target device(i) foo(); } ... here foo is going to be executed on the device `i`. The problem is that the device is an integer - it does not tell which device type it is - so it is up to the implementation to decide how `i` is interpreted. So, if we have a system with two GPUs and two DSP devices, we may bind 0-1 to the GPUs and 2-3 to the DSPs. My goal with preserving the order of the toolchains was to allow codegen to leverage that information and make a better decision on how to bind devices to integers. For instance, if the user requests the GPU toolchain first, they may be interested in prioritizing its use, so the first IDs would map to GPUs. To make a long story short, this is only about preserving information so that codegen can use it. In any case, this is going to change in the future, as the OpenMP language committee is working on having a device identifier to use instead of an integer. So, if you prefer to remove `ordered` from the name, I am not opposed to that.
427–435	This has changed a little in recent CUDA work: in the version that http://reviews.llvm.org/D18171 is based on, `Result` is returned instead of being passed by reference, and we have a string/action-result map. I'll eventually have to add the offloading kind to that string, but in the partitioned patches I didn't touch that yet. Do you suggest having that cache owned by the driver instead of passing it along?
lib/Driver/Compilation.cpp
66–67	This relates to some extent to your other question: how do we pass device-specific options? Right now we are relying on the host options to derive device-specific options. This hook was meant to tune the host options so that things that do not make sense on the device are filtered out. Also, the resulting device image is usually a shared library so that it can be easily loaded; this hook is also used to specify the options that result in a shared library, even if the host options don't ask for a host shared library. Can you think of a better way to abstract this?
lib/Driver/Driver.cpp
224–225	Given the feedback I got in http://lists.llvm.org/pipermail/cfe-dev/2016-February/047547.html, I ended up moving most of the functionality that I have in jobs creation to the creation of actions. Having an action graph that shows the offloading specifics was a desired feature. As a result, what gets more complex is the dump of the actions. In http://reviews.llvm.org/D18171 I have an example of what that dump looks like. That patch also proposes a unified offloading action that should be reused by the different offloading programming models. Does this address your concern?
2045–2051	I agree. This function is really messy... :S In http://reviews.llvm.org/D18171 I am proposing `collapseOffloadingAction`, which drives the collapsing of offload actions and abstracts some of the complexity in `selectToolForJob`. Do you think that goes in the right direction, or do you think I should do something else?
lib/Driver/Tools.cpp
6032	Yes, sure, I proposed an implementation of the bundler using a generic format in http://reviews.llvm.org/D13909. Let me know any comments you have about that specific component. I still need to add testing specific to http://reviews.llvm.org/D13909, which I haven't done yet because I didn't know where it was supposed to live - maybe in the Driver? Do you have an opinion about that? Also, in http://lists.llvm.org/pipermail/cfe-dev/2016-February/047547.html, the general opinion was that the bundler should use the host object format to bundle whenever possible. So, I also have to add a default behavior for the binary bundler when the input is an object file. For the other input types, I don't think there were any strong opinions. Do you happen to have one? In any case, I was planning to add the object-file-specific bundling in a separate patch, which seems to me a natural way to partition the bundler functionality. Does that sound like a good plan?
test/OpenMP/target_driver.c
41–47	Using a DAG seemed to me a robust way to test that. I'd have to double check, but several map containers are used for the inputs and actions, so the order may depend on the implementation of the container. I was just trying to use a safe way to test. Do you prefer to change this to the exact sequence I am getting?
54	Well, currently I don't. In http://lists.llvm.org/pipermail/cfe-dev/2016-February/047547.html I was proposing something to tackle that, but the opinion was that it was somewhat secondary and the driver design should be settled first. What I was proposing was some sort of group option associated with the device triple. The idea was to avoid proliferation of device-specific options and reuse what we already have, just organizing it in groups so that it could be forwarded to the right toolchain. The goal was to make things like this possible: clang -mcpu=pwr8 -target-offload=nvptx64-nvidia-cuda -fopenmp -mcpu=sm_35 -target-offload=nvptx64-nvidia-cuda -fcuda -mcpu=sm_32 a.c ... where mcpu is used to specify the cpu/gpu for the different toolchains and programming models. This would also be useful to specify include and library paths that only make sense for the device. Do you have any opinion about that?
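
The option-grouping proposal in the last reply can be sketched roughly as follows. This is only an illustration under stated assumptions: `-target-offload=` is the flag proposed in the comment, not an existing clang option, and `OffloadGroup`, `GroupedArgs` and `groupOffloadOptions` are hypothetical names that do not appear in the patch. The sketch assumes that options before the first `-target-offload=` belong to the host, that each `-target-offload=<triple>` opens a new group, and that arguments not starting with `-` are input files.

```cpp
#include <string>
#include <vector>

// Options collected for one "-target-offload=<triple>" occurrence.
struct OffloadGroup {
  std::string Triple;
  std::vector<std::string> Args;
};

struct GroupedArgs {
  std::vector<std::string> HostArgs;  // options seen before any group opens
  std::vector<OffloadGroup> Groups;   // one entry per -target-offload=
  std::vector<std::string> Inputs;    // arguments that are not options
};

// Split a flat argument list into host options, per-target option groups
// and input files, following the assumptions listed in the lead-in.
GroupedArgs groupOffloadOptions(const std::vector<std::string> &Args) {
  const std::string Prefix = "-target-offload=";
  GroupedArgs Result;
  for (const std::string &Arg : Args) {
    if (Arg.compare(0, Prefix.size(), Prefix) == 0) {
      // A new group starts; later options are forwarded to this target.
      OffloadGroup Group;
      Group.Triple = Arg.substr(Prefix.size());
      Result.Groups.push_back(Group);
    } else if (Arg.empty() || Arg[0] != '-') {
      Result.Inputs.push_back(Arg);
    } else if (Result.Groups.empty()) {
      Result.HostArgs.push_back(Arg);
    } else {
      Result.Groups.back().Args.push_back(Arg);
    }
  }
  return Result;
}
```

Applied to the command line from the comment, this would give the host `-mcpu=pwr8`, the first nvptx64 group `-fopenmp -mcpu=sm_35`, and the second `-fcuda -mcpu=sm_32`, so the same `-mcpu` spelling can be reused for each toolchain.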

I think these changes have been contributed to trunk in multiple commits so this can be closed?

Hi Jonas,

In D9888#581809, @Hahnfeld wrote:

I think these changes have been contributed to trunk in multiple commits so this can be closed?

You're right, this can be closed now.

Thanks!
Samuel

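Revision Contents

Path	Size
include/clang/Basic/DiagnosticDriverKinds.td	2 lines
include/clang/Driver/Action.h	31 lines
include/clang/Driver/CC1Options.td	9 lines
include/clang/Driver/Driver.h	58 lines
include/clang/Driver/Options.td	2 lines
include/clang/Driver/ToolChain.h	24 lines
include/clang/Driver/Types.h	5 lines
lib/Driver/Action.cpp	19 lines
lib/Driver/Compilation.cpp	13 lines
lib/Driver/Driver.cpp	438 lines
lib/Driver/ToolChain.cpp	19 lines
lib/Driver/ToolChains.h	3 lines
lib/Driver/ToolChains.cpp	39 lines
lib/Driver/Tools.h	13 lines
lib/Driver/Tools.cpp	220 lines
lib/Driver/Types.cpp	4 lines
test/OpenMP/target_driver.c	195 lines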

Commit	Tree	Parents	Author	Summary	Date
c6a50e55c27a	380f768afab5	2aec100df28c 3ffadbc05bf1	Samuel Antao	Merge branch 'patch-D13909' into patch-D9888-depends-on-patch-D13909	Nov 23 2015, 5:10 PM
2aec100df28c	9131350065a0	d3d1d8dc822e 91ec36c0893a	Samuel Antao	Merge branch 'patch-D13909' into patch-D9888-depends-on-patch-D13909	Nov 6 2015, 2:32 PM
d3d1d8dc822e	2ca61f9aaec7	4261eadf9c1a 9ddafdc1bbf0	Samuel Antao	Merge branch 'patch-D9888-tool-only' into patch-D9888	Oct 20 2015, 11:33 AM
4261eadf9c1a	2ca61f9aaec7	719979b7301e 5cbdeae35804	Samuel Antao	Merge branch 'patch-D9888-tool-only' into patch-D9888	Oct 20 2015, 9:58 AM
719979b7301e	2ca61f9aaec7	da0ae3c7b5d2 5fd2cdb71e21	Samuel Antao	Merge branch 'master' into patch-D9888	Oct 20 2015, 9:48 AM
da0ae3c7b5d2	e9caae9441dd	b030e7d5908c 8f3d8be71a48	Samuel Antao	Merge branch 'master' into patch-D9888	Oct 20 2015, 9:34 AM
b030e7d5908c	a09783ddb398	c209189d7558	Samuel Antao	Fix ELF sections issues and typos here and there.	Oct 7 2015, 5:33 PM
c209189d7558	67259fc85d40	e9edcb1c971f 8c622f1ee1c9	Samuel Antao	Merge branch 'master' into patch-D9888	Oct 7 2015, 12:50 PM
e9edcb1c971f	1e7212185897	42275222dede db216d5fd213	Samuel Antao	Merge branch 'master' into driver-2nd-attempt	Oct 1 2015, 9:32 AM
42275222dede	0691069972c2	5d078c2d42e5	Samuel Antao	Fix Formatting.	Sep 30 2015, 7:09 PM
5d078c2d42e5	e183f657a786	a5a0dfccb847	Samuel Antao	Working with regression test.	Sep 30 2015, 7:03 PM
a5a0dfccb847	7e9cae71591b	731a211b6732	Samuel Antao	Fix some of the formatting.	Sep 30 2015, 1:30 PM
731a211b6732	30f521a463c0	a9dc50e45470	Samuel Antao	Working but with no regression tests yet.	Sep 30 2015, 8:21 AM
a9dc50e45470	5e588a957fd3	f066baa0951e	Samuel Antao	Before redoing the actions stuff with the changed toolchain.	Sep 24 2015, 3:58 PM
f066baa0951e	0bf6dfd94d50	6ab573c71b43	Samuel Antao	Fix formatting of the bundler.	Sep 22 2015, 4:12 PM
6ab573c71b43	7d3bd48ac27d	4749f38b6ede	Samuel Antao	Fix formatting of the bundler.	Sep 22 2015, 3:57 PM
4749f38b6ede	15702d0eefc5	e1b47fb599d9	Samuel Antao	initial version of the bundler.	Sep 22 2015, 3:07 PM

Diff 41001

include/clang/Basic/DiagnosticDriverKinds.td

[...]	def err_drv_mg_requires_m_or_mm : Error<
"option '-MG' requires '-M' or '-MM'">;		"option '-MG' requires '-M' or '-MM'">;
def err_drv_unknown_objc_runtime : Error<		def err_drv_unknown_objc_runtime : Error<
"unknown or ill-formed Objective-C runtime '%0'">;		"unknown or ill-formed Objective-C runtime '%0'">;
def err_drv_emit_llvm_link : Error<		def err_drv_emit_llvm_link : Error<
"-emit-llvm cannot be used when linking">;		"-emit-llvm cannot be used when linking">;
def err_drv_optimization_remark_pattern : Error<		def err_drv_optimization_remark_pattern : Error<
"%0 in '%1'">;		"%0 in '%1'">;
def err_drv_no_neon_modifier : Error<"[no]neon is not accepted as modifier, please use [no]simd instead">;		def err_drv_no_neon_modifier : Error<"[no]neon is not accepted as modifier, please use [no]simd instead">;
		def err_drv_invalid_omp_target : Error<
		"OpenMP target is invalid: '%0'">;

def warn_O4_is_O3 : Warning<"-O4 is equivalent to -O3">, InGroup<Deprecated>;		def warn_O4_is_O3 : Warning<"-O4 is equivalent to -O3">, InGroup<Deprecated>;
def warn_drv_lto_libpath : Warning<"libLTO.dylib relative to clang installed dir not found; using 'ld' default search path instead">,		def warn_drv_lto_libpath : Warning<"libLTO.dylib relative to clang installed dir not found; using 'ld' default search path instead">,
InGroup<LibLTO>;		InGroup<LibLTO>;
def warn_drv_optimization_value : Warning<"optimization level '%0' is not supported; using '%1%2' instead">,		def warn_drv_optimization_value : Warning<"optimization level '%0' is not supported; using '%1%2' instead">,
InGroup<InvalidCommandLineArgument>;		InGroup<InvalidCommandLineArgument>;
def warn_ignored_gcc_optimization : Warning<"optimization flag '%0' is not supported">,		def warn_ignored_gcc_optimization : Warning<"optimization flag '%0' is not supported">,
InGroup<IgnoredOptimizationArgument>;		InGroup<IgnoredOptimizationArgument>;
[...]

include/clang/Driver/Action.h

[...]	enum ActionClass {
CompileJobClass,		CompileJobClass,
BackendJobClass,		BackendJobClass,
AssembleJobClass,		AssembleJobClass,
LinkJobClass,		LinkJobClass,
LipoJobClass,		LipoJobClass,
DsymutilJobClass,		DsymutilJobClass,
VerifyDebugInfoJobClass,		VerifyDebugInfoJobClass,
VerifyPCHJobClass,		VerifyPCHJobClass,
		OffloadBundlingJobClass,
		OffloadUnbundlingJobClass,

JobClassFirst=PreprocessJobClass,		JobClassFirst = PreprocessJobClass,
JobClassLast=VerifyPCHJobClass		JobClassLast = OffloadUnbundlingJobClass
};		};

static const char *getClassName(ActionClass AC);		static const char *getClassName(ActionClass AC);

private:		private:
ActionClass Kind;		ActionClass Kind;

/// The output type of this action.		/// The output type of this action.
[...]	public:
const ActionList &getDeviceActions() const { return DeviceActions; }		const ActionList &getDeviceActions() const { return DeviceActions; }

static bool classof(const Action *A) { return A->getKind() == CudaHostClass; }		static bool classof(const Action *A) { return A->getKind() == CudaHostClass; }
};		};

class JobAction : public Action {		class JobAction : public Action {
virtual void anchor();		virtual void anchor();
protected:		protected:
		JobAction(ActionClass Kind, std::unique_ptr<Action> Input);
JobAction(ActionClass Kind, std::unique_ptr<Action> Input, types::ID Type);		JobAction(ActionClass Kind, std::unique_ptr<Action> Input, types::ID Type);
JobAction(ActionClass Kind, const ActionList &Inputs, types::ID Type);		JobAction(ActionClass Kind, const ActionList &Inputs, types::ID Type);

public:		public:
static bool classof(const Action *A) {		static bool classof(const Action *A) {
return (A->getKind() >= JobClassFirst &&		return (A->getKind() >= JobClassFirst &&
A->getKind() <= JobClassLast);		A->getKind() <= JobClassLast);
}		}
};		};

		class OffloadBundlingJobAction : public JobAction {
		void anchor() override;

		public:
		// Offloading bundling doesn't change the type of output.
		OffloadBundlingJobAction(std::unique_ptr<Action> Input);

		static bool classof(const Action *A) {
		return A->getKind() == OffloadBundlingJobClass;
		}
		};

		class OffloadUnbundlingJobAction : public JobAction {
		void anchor() override;

		public:
		// Offloading unbundling doesn't change the type of output.
		OffloadUnbundlingJobAction(std::unique_ptr<Action> Input);

		static bool classof(const Action *A) {
		return A->getKind() == OffloadUnbundlingJobClass;
		}
		};

class PreprocessJobAction : public JobAction {		class PreprocessJobAction : public JobAction {
void anchor() override;		void anchor() override;
public:		public:
PreprocessJobAction(std::unique_ptr<Action> Input, types::ID OutputType);		PreprocessJobAction(std::unique_ptr<Action> Input, types::ID OutputType);

static bool classof(const Action *A) {		static bool classof(const Action *A) {
return A->getKind() == PreprocessJobClass;		return A->getKind() == PreprocessJobClass;
}		}
[...]

include/clang/Driver/CC1Options.td

	[...]
	def fcuda_disable_target_call_checks : Flag<["-"],			def fcuda_disable_target_call_checks : Flag<["-"],
	"fcuda-disable-target-call-checks">,			"fcuda-disable-target-call-checks">,
	HelpText<"Disable all cross-target (host, device, etc.) call checks in CUDA">;			HelpText<"Disable all cross-target (host, device, etc.) call checks in CUDA">;
	def fcuda_include_gpubinary : Separate<["-"], "fcuda-include-gpubinary">,			def fcuda_include_gpubinary : Separate<["-"], "fcuda-include-gpubinary">,
	HelpText<"Incorporate CUDA device-side binary into host object file.">;			HelpText<"Incorporate CUDA device-side binary into host object file.">;
	def fcuda_target_overloads : Flag<["-"], "fcuda-target-overloads">,			def fcuda_target_overloads : Flag<["-"], "fcuda-target-overloads">,
	HelpText<"Enable function overloads based on CUDA target attributes.">;			HelpText<"Enable function overloads based on CUDA target attributes.">;

				//===----------------------------------------------------------------------===//
				// OpenMP Options
				//===----------------------------------------------------------------------===//

				def fopenmp_is_device : Flag<["-"], "fopenmp-is-device">,
				HelpText<"Generate code only for an OpenMP target device.">;
				def omp_host_ir_file_path : Separate<["-"], "omp-host-ir-file-path">,
				HelpText<"Path to the IR file produced by the frontend for the host.">;

	} // let Flags = [CC1Option]			} // let Flags = [CC1Option]


	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	// cc1as-only Options			// cc1as-only Options
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	let Flags = [CC1AsOption, NoDriverOption] in {			let Flags = [CC1AsOption, NoDriverOption] in {
	[...]

include/clang/Driver/Driver.h

//===--- Driver.h - Clang GCC Compatible Driver -----------------*- C++ -*-===//		//===--- Driver.h - Clang GCC Compatible Driver -----------------*- C++ -*-===//
//		//
// The LLVM Compiler Infrastructure		// The LLVM Compiler Infrastructure
//		//
// This file is distributed under the University of Illinois Open Source		// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.		// License. See LICENSE.TXT for details.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#ifndef LLVM_CLANG_DRIVER_DRIVER_H		#ifndef LLVM_CLANG_DRIVER_DRIVER_H
#define LLVM_CLANG_DRIVER_DRIVER_H		#define LLVM_CLANG_DRIVER_DRIVER_H

#include "clang/Basic/Diagnostic.h"		#include "clang/Basic/Diagnostic.h"
#include "clang/Basic/LLVM.h"		#include "clang/Basic/LLVM.h"
#include "clang/Driver/Phases.h"		#include "clang/Driver/Phases.h"
#include "clang/Driver/Types.h"		#include "clang/Driver/Types.h"
		#include "clang/Driver/Tool.h"
		#include "clang/Driver/ToolChain.h"
#include "clang/Driver/Util.h"		#include "clang/Driver/Util.h"
#include "llvm/ADT/StringMap.h"		#include "llvm/ADT/StringMap.h"
#include "llvm/ADT/StringRef.h"		#include "llvm/ADT/StringRef.h"
#include "llvm/ADT/Triple.h"		#include "llvm/ADT/Triple.h"
#include "llvm/Support/Path.h" // FIXME: Kill when CompilationInfo		#include "llvm/Support/Path.h" // FIXME: Kill when CompilationInfo
#include <memory>		#include <memory>
// lands.		// lands.
#include <list>		#include <list>
[...]
class FileSystem;		class FileSystem;
}		}

namespace driver {		namespace driver {

class Action;		class Action;
class Command;		class Command;
class Compilation;		class Compilation;
class InputInfo;		class InputAction;
class JobList;		class JobList;
class JobAction;		class JobAction;
class SanitizerArgs;		class SanitizerArgs;
class ToolChain;

/// Describes the kind of LTO mode selected via -f(no-)?lto(=.*)? options.		/// Describes the kind of LTO mode selected via -f(no-)?lto(=.*)? options.
enum LTOKind {		enum LTOKind {
LTOK_None,		LTOK_None,
LTOK_Full,		LTOK_Full,
LTOK_Thin,		LTOK_Thin,
LTOK_Unknown		LTOK_Unknown
};		};
[...]	private:

/// \brief Cache of all the ToolChains in use by the driver.		/// \brief Cache of all the ToolChains in use by the driver.
///		///
/// This maps from the string representation of a triple to a ToolChain		/// This maps from the string representation of a triple to a ToolChain
/// created targeting that triple. The driver owns all the ToolChain objects		/// created targeting that triple. The driver owns all the ToolChain objects
/// stored in it, and will clean them up when torn down.		/// stored in it, and will clean them up when torn down.
mutable llvm::StringMap<ToolChain *> ToolChains;		mutable llvm::StringMap<ToolChain *> ToolChains;

		/// \brief Cache of all the ToolChains in use by the driver.
		///
		/// This maps from the string representation of a triple that refers to an
		/// offloading target to a ToolChain created targeting that triple. The driver
		/// owns all the ToolChain objects stored in it, and will clean them up when
		/// torn down. We use a different cache for offloading as it is possible to
		/// have offloading toolchains with the same triple the host has, and the
		/// implementation has to differentiate the two in order to adjust the
		/// commands for offloading.
		echristo: Example?
		sfantao: I got rid of this extra toolchain cache and I am organizing it in a multimap by offload kind as Art suggested in http://reviews.llvm.org/D18170. That avoids the multiple containers for the offloading toolchains (this one and the ordered one).
		mutable llvm::StringMap<ToolChain *> OffloadToolChains;

		/// \brief Array of the toolchains of offloading targets in the order they
		/// were requested by the user.
		echristo: Any reason?
		sfantao: Currently in OpenMP, any directive that relates to offloading supports a `device()` clause that specifies which device to use for that region or data transfer. E.g. void foo() { ... } void bar(int i) { #pragma omp target device(i) foo(); } ... here foo is going to be executed on the device `i`. The problem is that the device is an integer - it does not tell which device type it is - so it is up to the implementation to decide how `i` is interpreted. So, if we have a system with two GPUs and two DSP devices, we may bind 0-1 to the GPUs and 2-3 to the DSPs. My goal with preserving the order of the toolchains was to allow codegen to leverage that information and make a better decision on how to bind devices to integers. For instance, if the user requests the GPU toolchain first, they may be interested in prioritizing its use, so the first IDs would map to GPUs. To make a long story short, this is only about preserving information so that codegen can use it. In any case, this is going to change in the future, as the OpenMP language committee is working on having a device identifier to use instead of an integer. So, if you prefer to remove `ordered` from the name, I am not opposed to that.
		SmallVector<const ToolChain *, 4> OrderedOffloadingToolchains;

		/// \brief Type for the cache of the results for the offloading host emitted
		/// so far. The host results can be required by the device tools.
		typedef llvm::DenseMap<const Action *, InputInfoList> OffloadingHostResultsTy;

private:		private:
		/// CreateUnbundledOffloadingResult - Create a command to unbundle the input
		/// and use the resulting input info. If there are inputs already cached in
		tra: re -> are
		sfantao: Fixed!
		/// OffloadingHostResults for that action use them instead. If offloading
		/// is not supported, just return the provided input info.
		tra: "If offloading is not supported" perhaps?
		sfantao: Fixed!
		InputInfo CreateUnbundledOffloadingResult(
		Compilation &C, const OffloadUnbundlingJobAction *CurAction,
		const ToolChain *TC, InputInfo Result,
		OffloadingHostResultsTy &OffloadingHostResults) const;

		/// CreateBundledOffloadingResult - Create a bundle of all provided results
		/// and return the InputInfo of the bundled file.
		InputInfo CreateBundledOffloadingResult(
		Compilation &C, const OffloadBundlingJobAction *CurAction,
		const ToolChain *TC, InputInfoList Results) const;

		/// PostProcessOffloadingInputsAndResults - Update the input and output
		/// information to suit the needs of the offloading implementation. This used
		/// to, e.g., to pass extra results from host to device side and vice-versa.
		void PostProcessOffloadingInputsAndResults(
		Compilation &C, const JobAction *JA, const ToolChain *TC,
		InputInfoList &Inputs, InputInfo &Result,
		OffloadingHostResultsTy &OffloadingHostResults) const;

/// TranslateInputArgs - Create a new derived argument list from the input		/// TranslateInputArgs - Create a new derived argument list from the input
/// arguments, after applying the standard argument translations.		/// arguments, after applying the standard argument translations.
llvm::opt::DerivedArgList *		llvm::opt::DerivedArgList *
TranslateInputArgs(const llvm::opt::InputArgList &Args) const;		TranslateInputArgs(const llvm::opt::InputArgList &Args) const;

// getFinalPhase - Determine which compilation mode we are in and record		// getFinalPhase - Determine which compilation mode we are in and record
// which option we used to determine the final phase.		// which option we used to determine the final phase.
phases::ID getFinalPhase(const llvm::opt::DerivedArgList &DAL,		phases::ID getFinalPhase(const llvm::opt::DerivedArgList &DAL,
[...]	public:
/// \p Phase on the \p Input, taking in to account arguments		/// \p Phase on the \p Input, taking in to account arguments
/// like -fsyntax-only or --analyze.		/// like -fsyntax-only or --analyze.
std::unique_ptr<Action>		std::unique_ptr<Action>
ConstructPhaseAction(const ToolChain &TC, const llvm::opt::ArgList &Args,		ConstructPhaseAction(const ToolChain &TC, const llvm::opt::ArgList &Args,
phases::ID Phase, std::unique_ptr<Action> Input) const;		phases::ID Phase, std::unique_ptr<Action> Input) const;

/// BuildJobsForAction - Construct the jobs to perform for the		/// BuildJobsForAction - Construct the jobs to perform for the
/// action \p A.		/// action \p A.
void BuildJobsForAction(Compilation &C,		void BuildJobsForAction(Compilation &C,
const Action *A,		const Action *A,
const ToolChain *TC,		const ToolChain *TC,
const char *BoundArch,		const char *BoundArch,
bool AtTopLevel,		bool AtTopLevel,
bool MultipleArchs,		bool MultipleArchs,
const char *LinkingOutput,		const char *LinkingOutput,
InputInfo &Result) const;		InputInfo &Result,
		OffloadingHostResultsTy &OffloadingHostResults) const;
		echristo: This function is starting to get a little silly. Perhaps we should look into refactoring such that this doesn't need to be "the one function that rules them all". Perhaps a different ownership model for the things that are arguments here?
		sfantao: This has changed a little in recent CUDA work: in the version that http://reviews.llvm.org/D18171 is based on, `Result` is returned instead of being passed by reference, and we have a string/action-result map. I'll eventually have to add the offloading kind to that string, but in the partitioned patches I didn't touch that yet. Do you suggest having that cache owned by the driver instead of passing it along?

/// Returns the default name for linked images (e.g., "a.out").		/// Returns the default name for linked images (e.g., "a.out").
const char *getDefaultImageName() const;		const char *getDefaultImageName() const;

/// GetNamedOutputPath - Return the name to use for the output of		/// GetNamedOutputPath - Return the name to use for the output of
/// the action \p JA. The result is appended to the compilation's		/// the action \p JA. The result is appended to the compilation's
/// list of temporary or result files, as appropriate.		/// list of temporary or result files, as appropriate.
///		///
Show All 30 Lines
private:		private:
/// Parse the \p Args list for LTO options and record the type of LTO		/// Parse the \p Args list for LTO options and record the type of LTO
/// compilation based on which -f(no-)?lto(=.*)? option occurs last.		/// compilation based on which -f(no-)?lto(=.*)? option occurs last.
void setLTOMode(const llvm::opt::ArgList &Args);		void setLTOMode(const llvm::opt::ArgList &Args);

/// \brief Retrieves a ToolChain for a particular \p Target triple.		/// \brief Retrieves a ToolChain for a particular \p Target triple.
///		///
/// Will cache ToolChains for the life of the driver object, and create them		/// Will cache ToolChains for the life of the driver object, and create them
/// on-demand.		/// on-demand. \a OffloadingKind specifies if the toolchain being created
const ToolChain &getToolChain(const llvm::opt::ArgList &Args,		/// refers to any kind of offloading (e.g. OpenMP).
const llvm::Triple &Target) const;		const ToolChain &getToolChain(
		const llvm::opt::ArgList &Args, const llvm::Triple &Target,
		ToolChain::OffloadingKind OffloadingKind = ToolChain::OK_None) const;

/// @}		/// @}

/// \brief Get bitmasks for which option flags to include and exclude based on		/// \brief Get bitmasks for which option flags to include and exclude based on
/// the driver mode.		/// the driver mode.
std::pair<unsigned, unsigned> getIncludeExcludeOptionFlagMasks() const;		std::pair<unsigned, unsigned> getIncludeExcludeOptionFlagMasks() const;

public:		public:
Show All 20 Lines

include/clang/Driver/Options.td

	Show First 20 Lines • Show All 1,608 Lines • ▼ Show 20 Lines
	def nostdinc : Flag<["-"], "nostdinc">;			def nostdinc : Flag<["-"], "nostdinc">;
	def nostdlibinc : Flag<["-"], "nostdlibinc">;			def nostdlibinc : Flag<["-"], "nostdlibinc">;
	def nostdincxx : Flag<["-"], "nostdinc++">, Flags<[CC1Option]>,			def nostdincxx : Flag<["-"], "nostdinc++">, Flags<[CC1Option]>,
	HelpText<"Disable standard #include directories for the C++ standard library">;			HelpText<"Disable standard #include directories for the C++ standard library">;
	def nostdlib : Flag<["-"], "nostdlib">;			def nostdlib : Flag<["-"], "nostdlib">;
	def object : Flag<["-"], "object">;			def object : Flag<["-"], "object">;
	def o : JoinedOrSeparate<["-"], "o">, Flags<[DriverOption, RenderAsInput, CC1Option, CC1AsOption]>,			def o : JoinedOrSeparate<["-"], "o">, Flags<[DriverOption, RenderAsInput, CC1Option, CC1AsOption]>,
	HelpText<"Write output to <file>">, MetaVarName<"<file>">;			HelpText<"Write output to <file>">, MetaVarName<"<file>">;
				def omptargets_EQ : CommaJoined<["-"], "omptargets=">, Flags<[DriverOption, CC1Option]>,
				HelpText<"Specify comma-separated list of triples OpenMP offloading targets to be supported">;
				rsmith: This is an unfortunate flag name; `-oblah` already means something. Is this name chosen for compatibility with some other system, or could we change it to, say, `-fopenmp-targets=`?
				sfantao (author): You are right, we are now using -fomptargets in codegen exactly because of that. I can change it to `-fopenmp-targets=`; we don't have any compatibility issues at this point.
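To illustrate the comma-separated value this option carries, here is a minimal standalone sketch of splitting a `-fopenmp-targets=` argument into its triples. It does not use the LLVM option library; `splitTargets` and `parseOpenMPTargets` are hypothetical helpers, and the spelling `-fopenmp-targets=` is the one proposed in the thread above, not the one in this revision of the patch.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Split a comma-separated list, skipping empty entries.
std::vector<std::string> splitTargets(const std::string &Value) {
  std::vector<std::string> Out;
  std::string::size_type Start = 0;
  while (Start <= Value.size()) {
    std::string::size_type End = Value.find(',', Start);
    if (End == std::string::npos)
      End = Value.size();
    if (End > Start)
      Out.push_back(Value.substr(Start, End - Start));
    Start = End + 1;
  }
  return Out;
}

// Return true if Arg uses the proposed spelling; Triples receives the values.
// An empty list corresponds to the case the patch warns about.
bool parseOpenMPTargets(const std::string &Arg,
                        std::vector<std::string> &Triples) {
  const std::string Prefix = "-fopenmp-targets=";
  if (Arg.compare(0, Prefix.size(), Prefix) != 0)
    return false;
  Triples = splitTargets(Arg.substr(Prefix.size()));
  return true;
}
```

Because the proposed spelling does not begin with `-o`, it cannot be confused with the joined form of the output option, which is the concern raised in the review.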
	def pagezero__size : JoinedOrSeparate<["-"], "pagezero_size">;			def pagezero__size : JoinedOrSeparate<["-"], "pagezero_size">;
	def pass_exit_codes : Flag<["-", "--"], "pass-exit-codes">, Flags<[Unsupported]>;			def pass_exit_codes : Flag<["-", "--"], "pass-exit-codes">, Flags<[Unsupported]>;
	def pedantic_errors : Flag<["-", "--"], "pedantic-errors">, Group<pedantic_Group>, Flags<[CC1Option]>;			def pedantic_errors : Flag<["-", "--"], "pedantic-errors">, Group<pedantic_Group>, Flags<[CC1Option]>;
	def pedantic : Flag<["-", "--"], "pedantic">, Group<pedantic_Group>, Flags<[CC1Option]>;			def pedantic : Flag<["-", "--"], "pedantic">, Group<pedantic_Group>, Flags<[CC1Option]>;
	def pg : Flag<["-"], "pg">, HelpText<"Enable mcount instrumentation">, Flags<[CC1Option]>;			def pg : Flag<["-"], "pg">, HelpText<"Enable mcount instrumentation">, Flags<[CC1Option]>;
	def pipe : Flag<["-", "--"], "pipe">,			def pipe : Flag<["-", "--"], "pipe">,
	HelpText<"Use pipes between commands, when possible">;			HelpText<"Use pipes between commands, when possible">;
	def prebind__all__twolevel__modules : Flag<["-"], "prebind_all_twolevel_modules">;			def prebind__all__twolevel__modules : Flag<["-"], "prebind_all_twolevel_modules">;
	▲ Show 20 Lines • Show All 464 Lines • Show Last 20 Lines

include/clang/Driver/ToolChain.h

Show First 20 Lines • Show All 58 Lines • ▼ Show 20 Lines	public:

enum RTTIMode {		enum RTTIMode {
RM_EnabledExplicitly,		RM_EnabledExplicitly,
RM_EnabledImplicitly,		RM_EnabledImplicitly,
RM_DisabledExplicitly,		RM_DisabledExplicitly,
RM_DisabledImplicitly		RM_DisabledImplicitly
};		};

		enum OffloadingKind {
		OK_None,
		OK_OpenMP_Host,
		OK_OpenMP_Device,
		};

private:		private:
const Driver &D;		const Driver &D;
const llvm::Triple Triple;		const llvm::Triple Triple;
const llvm::opt::ArgList &Args;		const llvm::opt::ArgList &Args;
// We need to initialize CachedRTTIArg before CachedRTTIMode		// We need to initialize CachedRTTIArg before CachedRTTIMode
const llvm::opt::Arg *const CachedRTTIArg;		const llvm::opt::Arg *const CachedRTTIArg;
const RTTIMode CachedRTTIMode;		const RTTIMode CachedRTTIMode;
		OffloadingKind CachedOffloadingKind;

/// The list of toolchain specific path prefixes to search for		/// The list of toolchain specific path prefixes to search for
/// files.		/// files.
path_list FilePaths;		path_list FilePaths;

/// The list of toolchain specific path prefixes to search for		/// The list of toolchain specific path prefixes to search for
/// programs.		/// programs.
path_list ProgramPaths;		path_list ProgramPaths;

mutable std::unique_ptr<Tool> Clang;		mutable std::unique_ptr<Tool> Clang;
mutable std::unique_ptr<Tool> Assemble;		mutable std::unique_ptr<Tool> Assemble;
mutable std::unique_ptr<Tool> Link;		mutable std::unique_ptr<Tool> Link;
		mutable std::unique_ptr<Tool> OffloadBundler;
Tool *getClang() const;		Tool *getClang() const;
Tool *getAssemble() const;		Tool *getAssemble() const;
Tool *getLink() const;		Tool *getLink() const;
Tool *getClangAs() const;		Tool *getClangAs() const;
		Tool *getOffloadBundler() const;

mutable std::unique_ptr<SanitizerArgs> SanitizerArguments;		mutable std::unique_ptr<SanitizerArgs> SanitizerArguments;

protected:		protected:
MultilibSet Multilibs;		MultilibSet Multilibs;
const char *DefaultLinker = "ld";		const char *DefaultLinker = "ld";

ToolChain(const Driver &D, const llvm::Triple &T,		ToolChain(const Driver &D, const llvm::Triple &T,
Show All 24 Lines	public:
virtual ~ToolChain();		virtual ~ToolChain();

// Accessors		// Accessors

const Driver &getDriver() const { return D; }		const Driver &getDriver() const { return D; }
vfs::FileSystem &getVFS() const;		vfs::FileSystem &getVFS() const;
const llvm::Triple &getTriple() const { return Triple; }		const llvm::Triple &getTriple() const { return Triple; }

		OffloadingKind getOffloadingKind() const { return CachedOffloadingKind; }
		void setOffloadingKind(OffloadingKind OT);

llvm::Triple::ArchType getArch() const { return Triple.getArch(); }		llvm::Triple::ArchType getArch() const { return Triple.getArch(); }
StringRef getArchName() const { return Triple.getArchName(); }		StringRef getArchName() const { return Triple.getArchName(); }
StringRef getPlatform() const { return Triple.getVendorName(); }		StringRef getPlatform() const { return Triple.getVendorName(); }
StringRef getOS() const { return Triple.getOSName(); }		StringRef getOS() const { return Triple.getOSName(); }

/// \brief Provide the default architecture name (as expected by -arch) for		/// \brief Provide the default architecture name (as expected by -arch) for
/// this toolchain. Note t		/// this toolchain. Note t
StringRef getDefaultUniversalArchName() const;		StringRef getDefaultUniversalArchName() const;
Show All 40 Lines	public:
///		///
/// \param BoundArch - The bound architecture name, or 0.		/// \param BoundArch - The bound architecture name, or 0.
virtual llvm::opt::DerivedArgList *		virtual llvm::opt::DerivedArgList *
TranslateArgs(const llvm::opt::DerivedArgList &Args,		TranslateArgs(const llvm::opt::DerivedArgList &Args,
const char *BoundArch) const {		const char *BoundArch) const {
return nullptr;		return nullptr;
}		}

		/// TranslateOffloadArgs - Create a new derived argument list for any argument
		/// translations this ToolChain may wish to perform if supporting offloading,
		// or 0 if no tool chain specific translations are needed. If this tool chain
		// does not refer to an offloading tool chain 0 is returned too.
		///
		/// \param BoundArch - The bound architecture name, or 0.
		virtual llvm::opt::DerivedArgList *
		TranslateOffloadArgs(const llvm::opt::DerivedArgList &Args,
		const char *BoundArch) const {
		return nullptr;
		}

/// Choose a tool to use to handle the action \p JA.		/// Choose a tool to use to handle the action \p JA.
///		///
/// This can be overridden when a particular ToolChain needs to use		/// This can be overridden when a particular ToolChain needs to use
/// a C compiler other than Clang.		/// a C compiler other than Clang.
virtual Tool *SelectTool(const JobAction &JA) const;		virtual Tool *SelectTool(const JobAction &JA) const;

// Helper methods		// Helper methods

▲ Show 20 Lines • Show All 219 Lines • Show Last 20 Lines

include/clang/Driver/Types.h

Show First 20 Lines • Show All 63 Lines • ▼ Show 20 Lines	#undef TYPE
bool isCXX(ID Id);		bool isCXX(ID Id);

/// isCuda - Is this a CUDA input.		/// isCuda - Is this a CUDA input.
bool isCuda(ID Id);		bool isCuda(ID Id);

/// isObjC - Is this an "ObjC" input (Obj-C and Obj-C++ sources and headers).		/// isObjC - Is this an "ObjC" input (Obj-C and Obj-C++ sources and headers).
bool isObjC(ID Id);		bool isObjC(ID Id);

		/// isSrcFile - Is this a source file, i.e. something that still has to be
		/// preprocessed. The logic behind this is the same that decides the first
		/// compilation phase is a preprocessing one.
		bool isSrcFile(ID Id);

/// lookupTypeForExtension - Lookup the type to use for the file		/// lookupTypeForExtension - Lookup the type to use for the file
/// extension \p Ext.		/// extension \p Ext.
ID lookupTypeForExtension(const char *Ext);		ID lookupTypeForExtension(const char *Ext);

/// lookupTypeForTypSpecifier - Lookup the type to use for a user		/// lookupTypeForTypSpecifier - Lookup the type to use for a user
/// specified type name.		/// specified type name.
ID lookupTypeForTypeSpecifier(const char *Name);		ID lookupTypeForTypeSpecifier(const char *Name);

Show All 15 Lines

lib/Driver/Action.cpp

	Show All 20 Lines
	}			}

	const char *Action::getClassName(ActionClass AC) {			const char *Action::getClassName(ActionClass AC) {
	switch (AC) {			switch (AC) {
	case InputClass: return "input";			case InputClass: return "input";
	case BindArchClass: return "bind-arch";			case BindArchClass: return "bind-arch";
	case CudaDeviceClass: return "cuda-device";			case CudaDeviceClass: return "cuda-device";
	case CudaHostClass: return "cuda-host";			case CudaHostClass: return "cuda-host";
				case OffloadBundlingJobClass:
				return "clang-offload-bundler";
				case OffloadUnbundlingJobClass:
				return "clang-offload-unbundler";
	case PreprocessJobClass: return "preprocessor";			case PreprocessJobClass: return "preprocessor";
	case PrecompileJobClass: return "precompiler";			case PrecompileJobClass: return "precompiler";
	case AnalyzeJobClass: return "analyzer";			case AnalyzeJobClass: return "analyzer";
	case MigrateJobClass: return "migrator";			case MigrateJobClass: return "migrator";
	case CompileJobClass: return "compiler";			case CompileJobClass: return "compiler";
	case BackendJobClass: return "backend";			case BackendJobClass: return "backend";
	case AssembleJobClass: return "assembler";			case AssembleJobClass: return "assembler";
	case LinkJobClass: return "linker";			case LinkJobClass: return "linker";
	Show All 33 Lines

	CudaHostAction::~CudaHostAction() {			CudaHostAction::~CudaHostAction() {
	for (auto &DA : DeviceActions)			for (auto &DA : DeviceActions)
	delete DA;			delete DA;
	}			}

	void JobAction::anchor() {}			void JobAction::anchor() {}

				JobAction::JobAction(ActionClass Kind, std::unique_ptr<Action> Input)
				: Action(Kind, std::move(Input)) {}

	JobAction::JobAction(ActionClass Kind, std::unique_ptr<Action> Input,			JobAction::JobAction(ActionClass Kind, std::unique_ptr<Action> Input,
	types::ID Type)			types::ID Type)
	: Action(Kind, std::move(Input), Type) {}			: Action(Kind, std::move(Input), Type) {}

	JobAction::JobAction(ActionClass Kind, const ActionList &Inputs, types::ID Type)			JobAction::JobAction(ActionClass Kind, const ActionList &Inputs, types::ID Type)
	: Action(Kind, Inputs, Type) {			: Action(Kind, Inputs, Type) {
	}			}

				void OffloadBundlingJobAction::anchor() {}

				OffloadBundlingJobAction::OffloadBundlingJobAction(
				std::unique_ptr<Action> Input)
				: JobAction(OffloadBundlingJobClass, std::move(Input)) {}

				void OffloadUnbundlingJobAction::anchor() {}

				OffloadUnbundlingJobAction::OffloadUnbundlingJobAction(
				std::unique_ptr<Action> Input)
				: JobAction(OffloadUnbundlingJobClass, std::move(Input)) {}

	void PreprocessJobAction::anchor() {}			void PreprocessJobAction::anchor() {}

	PreprocessJobAction::PreprocessJobAction(std::unique_ptr<Action> Input,			PreprocessJobAction::PreprocessJobAction(std::unique_ptr<Action> Input,
	types::ID OutputType)			types::ID OutputType)
	: JobAction(PreprocessJobClass, std::move(Input), OutputType) {}			: JobAction(PreprocessJobClass, std::move(Input), OutputType) {}

	void PrecompileJobAction::anchor() {}			void PrecompileJobAction::anchor() {}

	▲ Show 20 Lines • Show All 72 Lines • Show Last 20 Lines

lib/Driver/Compilation.cpp

	Show First 20 Lines • Show All 54 Lines • ▼ Show 20 Lines

	const DerivedArgList &Compilation::getArgsForToolChain(const ToolChain *TC,			const DerivedArgList &Compilation::getArgsForToolChain(const ToolChain *TC,
	const char *BoundArch) {			const char *BoundArch) {
	if (!TC)			if (!TC)
	TC = &DefaultToolChain;			TC = &DefaultToolChain;

	DerivedArgList *&Entry = TCArgs[std::make_pair(TC, BoundArch)];			DerivedArgList *&Entry = TCArgs[std::make_pair(TC, BoundArch)];
	if (!Entry) {			if (!Entry) {
	Entry = TC->TranslateArgs(*TranslatedArgs, BoundArch);			DerivedArgList *DefaultArgs = TC->TranslateArgs(*TranslatedArgs, BoundArch);
	if (!Entry)			Entry = (DefaultArgs) ? DefaultArgs : TranslatedArgs;
	Entry = TranslatedArgs;
				// Check if there is any offloading specific translation to do.
				DerivedArgList *OffloadArgs = TC->TranslateOffloadArgs(*Entry, BoundArch);
				echristo: Hmm?
				sfantao (author): This relates to some extent to your other question: how do we pass device-specific options. Right now we rely on the host options to derive device-specific options. This hook is meant to tune the host options so that things that do not make sense on the device are filtered out. Also, the resulting device image is usually a shared library so that it can be easily loaded, so this hook is also used to specify the options that produce a shared library, even if the host options don't ask for a host shared library. Can you think of a better way to abstract this?
				if (OffloadArgs) {
				// There are offloading translated args, so we have to use them instead.
				delete DefaultArgs;
				Entry = OffloadArgs;
				}
	}			}

	return *Entry;			return *Entry;
	}			}
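The role of the offloading translation hook discussed in the thread above (filter host options that make no sense on the device, and force a shared-library device image) can be sketched in isolation as follows. This is an assumption-laden illustration over plain strings rather than `DerivedArgList`: `translateOffloadArgs` and `HostOnlyFlags` are hypothetical, and the flags chosen for filtering are examples only, not the set any real toolchain uses.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

using ArgVec = std::vector<std::string>;

// Hypothetical list of host flags dropped for the device (illustrative only).
static const ArgVec HostOnlyFlags = {"-pg", "-static"};

// Derive device arguments from the host arguments. Returns false when the
// toolchain is not a device toolchain, mirroring the hook returning nullptr.
bool translateOffloadArgs(const ArgVec &HostArgs, bool IsDeviceToolChain,
                          ArgVec &DeviceArgs) {
  if (!IsDeviceToolChain)
    return false;

  DeviceArgs.clear();
  // Keep every host argument that is not on the host-only list.
  for (const std::string &A : HostArgs)
    if (std::find(HostOnlyFlags.begin(), HostOnlyFlags.end(), A) ==
        HostOnlyFlags.end())
      DeviceArgs.push_back(A);

  // The device image is expected to be a shared library, even if the host
  // link did not ask for one.
  if (std::find(DeviceArgs.begin(), DeviceArgs.end(), "-shared") ==
      DeviceArgs.end())
    DeviceArgs.push_back("-shared");
  return true;
}
```

In the patch itself the caller falls back to the default translated arguments when the hook yields nothing, which corresponds to the `false` return here.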

	bool Compilation::CleanupFile(const char *File, bool IssueErrors) const {			bool Compilation::CleanupFile(const char *File, bool IssueErrors) const {
	// FIXME: Why are we trying to remove files that we have not created? For			// FIXME: Why are we trying to remove files that we have not created? For
	// example we should only try to remove a temporary assembly file if			// example we should only try to remove a temporary assembly file if
	▲ Show 20 Lines • Show All 165 Lines • Show Last 20 Lines

lib/Driver/Driver.cpp

Show First 20 Lines • Show All 79 Lines • ▼ Show 20 Lines	Driver::Driver(StringRef ClangExecutable, StringRef DefaultTargetTriple,
}		}
ResourceDir = P.str();		ResourceDir = P.str();
}		}

Driver::~Driver() {		Driver::~Driver() {
delete Opts;		delete Opts;

llvm::DeleteContainerSeconds(ToolChains);		llvm::DeleteContainerSeconds(ToolChains);
		llvm::DeleteContainerSeconds(OffloadToolChains);
}		}

void Driver::ParseDriverMode(ArrayRef<const char *> Args) {		void Driver::ParseDriverMode(ArrayRef<const char *> Args) {
const std::string OptName =		const std::string OptName =
getOpts().getOption(options::OPT_driver_mode).getPrefixedName();		getOpts().getOption(options::OPT_driver_mode).getPrefixedName();

for (const char *ArgPtr : Args) {		for (const char *ArgPtr : Args) {
// Ignore nullptrs, they are response file's EOL markers		// Ignore nullptrs, they are response file's EOL markers
Show All 39 Lines	InputArgList Driver::ParseArgStrings(ArrayRef<const char *> ArgStrings) {
// Check for unsupported options.		// Check for unsupported options.
for (const Arg *A : Args) {		for (const Arg *A : Args) {
if (A->getOption().hasFlag(options::Unsupported)) {		if (A->getOption().hasFlag(options::Unsupported)) {
Diag(clang::diag::err_drv_unsupported_opt) << A->getAsString(Args);		Diag(clang::diag::err_drv_unsupported_opt) << A->getAsString(Args);
continue;		continue;
}		}

// Warn about -mcpu= without an argument.		// Warn about -mcpu= without an argument.
if (A->getOption().matches(options::OPT_mcpu_EQ) && A->containsValue("")) {		if ((A->getOption().matches(options::OPT_mcpu_EQ) &&
		A->containsValue("")) ||
		(A->getOption().matches(options::OPT_omptargets_EQ) &&
		!A->getNumValues())) {
Diag(clang::diag::warn_drv_empty_joined_argument) << A->getAsString(Args);		Diag(clang::diag::warn_drv_empty_joined_argument) << A->getAsString(Args);
}		}
}		}

for (const Arg *A : Args.filtered(options::OPT_UNKNOWN))		for (const Arg *A : Args.filtered(options::OPT_UNKNOWN))
Diags.Report(diag::err_drv_unknown_argument) << A->getAsString(Args);		Diags.Report(diag::err_drv_unknown_argument) << A->getAsString(Args);

return Args;		return Args;
Show All 39 Lines	if (CCCIsCPP() \|\| (PhaseArg = DAL.getLastArg(options::OPT_E)) \|\|
FinalPhase = phases::Link;		FinalPhase = phases::Link;

if (FinalPhaseArg)		if (FinalPhaseArg)
*FinalPhaseArg = PhaseArg;		*FinalPhaseArg = PhaseArg;

return FinalPhase;		return FinalPhase;
}		}

		/// \brief Return true if the provided arguments require OpenMP offloading.
		static bool RequiresOpenMPOffloading(ArgList &Args) {
		if (Args.hasFlag(options::OPT_fopenmp, options::OPT_fopenmp_EQ,
		options::OPT_fno_openmp, false)) {
		StringRef OpenMPRuntimeName(CLANG_DEFAULT_OPENMP_RUNTIME);
		if (const Arg *A = Args.getLastArg(options::OPT_fopenmp_EQ))
		OpenMPRuntimeName = A->getValue();

		if (OpenMPRuntimeName == "libomp" || OpenMPRuntimeName == "libiomp5") {
		auto *A = Args.getLastArg(options::OPT_omptargets_EQ);
		return A != nullptr && A->getNumValues();
		}
		}
		return false;
		}
		/// \brief Return true if the provided tool chain requires OpenMP offloading.
		static bool RequiresOpenMPOffloading(const ToolChain *TC) {
		return TC->getOffloadingKind() == ToolChain::OK_OpenMP_Host ||
		TC->getOffloadingKind() == ToolChain::OK_OpenMP_Device;
		}

		/// \brief Dump the job bindings for a given action.
		static void DumpJobBindings(ArrayRef<const ToolChain *> TCs, StringRef ToolName,
		echristo: This can probably be done separately? Can you split this out and make it generally useful?
		sfantao (author): Given the feedback I got in http://lists.llvm.org/pipermail/cfe-dev/2016-February/047547.html, I ended up moving most of the functionality that I had in job creation to the creation of actions. Having an action graph that shows the offloading specifics was a desired feature. As a result, what gets more complex is the dump of the actions. In http://reviews.llvm.org/D18171 I have an example of what that dump looks like. That patch also proposes a unified offloading action that should be reused by the different offloading programming models. Does this address your concern?
		ArrayRef<InputInfo> Inputs,
		ArrayRef<InputInfo> Outputs) {

		llvm::errs() << "# \"";
		for (unsigned i = 0, e = TCs.size(); i != e; ++i) {
		llvm::errs() << TCs[i]->getTripleString();
		if (i + 1 != e)
		llvm::errs() << ", ";
		}

		llvm::errs() << "\" - \"" << ToolName << "\", inputs: [";
		for (unsigned i = 0, e = Inputs.size(); i != e; ++i) {
		llvm::errs() << Inputs[i].getAsString();
		if (i + 1 != e)
		llvm::errs() << ", ";
		}
		llvm::errs() << "], ";
		llvm::errs() << ((Outputs.size() > 1) ? "outputs: [" : "output: ");
		for (unsigned i = 0, e = Outputs.size(); i != e; ++i) {
		llvm::errs() << Outputs[i].getAsString();
		if (i + 1 != e)
		llvm::errs() << ", ";
		}
		llvm::errs() << ((Outputs.size() > 1) ? "]\n" : "\n");
		return;
		}

		/// \brief Create output for a given action, if any.
		static InputInfo CreateActionResult(Compilation &C, const Action *A,
		const char *BaseInput,
		const char *BoundArch, bool AtTopLevel,
		bool MultipleArchs) {
		InputInfo Result;
		const JobAction *JA = cast<JobAction>(A);
		if (JA->getType() == types::TY_Nothing)
		Result = InputInfo(A->getType(), BaseInput);
		else
		Result =
		InputInfo(C.getDriver().GetNamedOutputPath(C, *JA, BaseInput, BoundArch,
		AtTopLevel, MultipleArchs),
		A->getType(), BaseInput);
		return Result;
		}

		static const char *CreateOffloadingPseudoArchName(Compilation &C,
		const ToolChain *TC) {
		SmallString<128> Name;
		switch (TC->getOffloadingKind()) {
		default:
		llvm_unreachable("Offload information was not specified.");
		break;
		case ToolChain::OK_OpenMP_Host:
		Name = "offload-host-";
		break;
		case ToolChain::OK_OpenMP_Device:
		Name = "offload-device-";
		break;
		}

		Name += TC->getTripleString();
		return C.getArgs().MakeArgString(Name.str());
		}

		InputInfo Driver::CreateUnbundledOffloadingResult(
		Compilation &C, const OffloadUnbundlingJobAction *CurAction,
		const ToolChain *TC, InputInfo Result,
		OffloadingHostResultsTy &OffloadingHostResults) const {
		assert(!OrderedOffloadingToolchains.empty() &&
		!types::isSrcFile(Result.getType()) &&
		"Not expecting to create a bundling action!");

		// If this is an offloading device toolchain, we need to use the results
		// cached when the host input was processed, except if the input is a source
		// file.
		if (TC->getOffloadingKind() == ToolChain::OK_OpenMP_Device) {
		// If this is not a source file, it had to be part of a bundle. So we need
		// to checkout the results created by the host when this input was processed
		// for the host toolchain.
		auto ILIt = OffloadingHostResults.find(CurAction);
		assert(ILIt != OffloadingHostResults.end() &&
		"Offloading inputs do not exist??");
		InputInfoList &IL = ILIt->getSecond();
		assert(IL.size() == OrderedOffloadingToolchains.size() + 1 &&
		"Not all offloading inputs exist??");

		// Get the order of the toolchain and retrieve the input;
		unsigned Order = 1;
		for (auto *OffloadTC : OrderedOffloadingToolchains) {
		if (OffloadTC == TC)
		break;
		++Order;
		}
		return IL[Order];
		}

		// Otherwise, this input is expected to be bundled. Therefore we need to issue
		// an unbundling command.

		// The bundled file is the input.
		InputInfo BundledFile = Result;

		// Create the input info for the unbundled files.
		InputInfoList &UnbundledFiles = OffloadingHostResults[CurAction];
		{
		InputInfo HostResult = CreateActionResult(
		C, CurAction, Result.getBaseInput(),
		CreateOffloadingPseudoArchName(C, TC), /*AtTopLevel=*/
		false, /*MultipleArchs=*/false);
		UnbundledFiles.push_back(HostResult);
		for (auto *OffloadTC : OrderedOffloadingToolchains) {
		InputInfo TargetResult = CreateActionResult(
		C, CurAction, Result.getBaseInput(),
		CreateOffloadingPseudoArchName(C, OffloadTC), /*AtTopLevel=*/
		false, /*MultipleArchs=*/false);
		UnbundledFiles.push_back(TargetResult);
		}
		}

		auto OffloadBundlerTool = TC->SelectTool(*CurAction);

		// Emit the command or dump the bindings.
		if (CCCPrintBindings && !CCGenDiagnostics) {
		SmallVector<const ToolChain *, 4> AllToolChains;
		AllToolChains.push_back(TC);
		AllToolChains.append(OrderedOffloadingToolchains.begin(),
		OrderedOffloadingToolchains.end());
		DumpJobBindings(AllToolChains, OffloadBundlerTool->getName(), BundledFile,
		UnbundledFiles);
		} else {
		OffloadBundlerTool->ConstructJob(C, *CurAction, BundledFile, UnbundledFiles,
		C.getArgs(), nullptr);
		}

		// The host result is the first of the unbundled files.
		return UnbundledFiles.front();
		}
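The indexing convention used by the unbundling code above (host result first, then one result per device toolchain in command-line order) can be sketched on its own. This is a simplified illustration using strings instead of `ToolChain` and `InputInfo`; `unbundledPseudoArchs` and `unbundledSlot` are hypothetical helper names, while the `offload-host-` and `offload-device-` prefixes are taken from `CreateOffloadingPseudoArchName`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Pseudo-arch names for the unbundled files: slot 0 is the host, followed by
// one slot per device toolchain in the order the targets were specified.
std::vector<std::string>
unbundledPseudoArchs(const std::string &HostTriple,
                     const std::vector<std::string> &OrderedDeviceTriples) {
  std::vector<std::string> Names;
  Names.push_back("offload-host-" + HostTriple);
  for (const std::string &T : OrderedDeviceTriples)
    Names.push_back("offload-device-" + T);
  return Names;
}

// Slot of a device triple in the unbundled list. Returns
// OrderedDeviceTriples.size() + 1, which is out of range, when the triple is
// not found, so callers should check the result before indexing.
std::size_t unbundledSlot(const std::vector<std::string> &OrderedDeviceTriples,
                          const std::string &Triple) {
  std::size_t Order = 1; // Slot 0 is reserved for the host result.
  for (const std::string &T : OrderedDeviceTriples) {
    if (T == Triple)
      break;
    ++Order;
  }
  return Order;
}
```

The same linear search appears in the device branch of the function above; the sketch makes explicit that a toolchain missing from the ordered list would yield an index one past the end of the cached results.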

		InputInfo Driver::CreateBundledOffloadingResult(
		Compilation &C, const OffloadBundlingJobAction *CurAction,
		const ToolChain *TC, InputInfoList Results) const {
		assert(!OrderedOffloadingToolchains.empty() &&
		"Not expecting to create a bundling action!");

		// Get the result file based on BaseInput file name and the previous host
		// action.
		InputInfo BundledFile = CreateActionResult(
		C, *CurAction->begin(), Results[0].getBaseInput(), /*BoundArch=*/nullptr,
		/*AtTopLevel=*/true, /*MultipleArchs=*/false);

		// The unbundled files are the previous action result for each target.
		InputInfoList &UnbundledFiles = Results;

		// Create the bundling command.
		auto OffloadBundlerTool = TC->SelectTool(*CurAction);

		// Emit the command or dump the bindings.
		if (CCCPrintBindings && !CCGenDiagnostics) {
		SmallVector<const ToolChain *, 4> AllToolChains;
		AllToolChains.push_back(TC);
		AllToolChains.append(OrderedOffloadingToolchains.begin(),
		OrderedOffloadingToolchains.end());
		DumpJobBindings(AllToolChains, OffloadBundlerTool->getName(),
		UnbundledFiles, BundledFile);
		} else {
		OffloadBundlerTool->ConstructJob(C, *CurAction, BundledFile, UnbundledFiles,
		C.getArgs(), nullptr);
		}

		return BundledFile;
		}

		void Driver::PostProcessOffloadingInputsAndResults(
		Compilation &C, const JobAction *JA, const ToolChain *TC,
		InputInfoList &Inputs, InputInfo &Result,
		OffloadingHostResultsTy &OffloadingHostResults) const {

		// If this driver run requires OpenMP offloading we need to make sure
		// everything gets combined at link time. Also, all the compile phase results
		// obtained for the host are used as inputs in the device side.
		if (RequiresOpenMPOffloading(TC)) {

		if (isa<LinkJobAction>(JA) &&
		TC->getOffloadingKind() == ToolChain::OK_OpenMP_Host) {
		// Get link results for all targets.
		InputInfoList TgtLinkResults(OrderedOffloadingToolchains.size());
		for (unsigned i = 0; i < OrderedOffloadingToolchains.size(); ++i) {
		const ToolChain *TgtTC = OrderedOffloadingToolchains[i];
		BuildJobsForAction(C, JA, TgtTC,
		CreateOffloadingPseudoArchName(C, TgtTC),
		/*AtTopLevel=*/false,
		/*MultipleArchs=*/true, /*LinkingOutput=*/nullptr,
		TgtLinkResults[i], OffloadingHostResults);
		}
		Inputs.append(TgtLinkResults.begin(), TgtLinkResults.end());
		return;
		}

		if (isa<CompileJobAction>(JA) &&
		TC->getOffloadingKind() == ToolChain::OK_OpenMP_Device) {
		// Find the host compile result.
		auto ILIt = OffloadingHostResults.find(JA);
		assert(ILIt != OffloadingHostResults.end() &&
		"The OpenMP host side action is expected to be processed before!");
		InputInfoList &IL = ILIt->getSecond();
		assert(IL.size() == 1 && "Host compile results should only be one!");
		Inputs.push_back(IL.front());
		return;
		}

		// If this is a host action, make sure it is recorded in the offloading
		// results cache.
		if (TC->getOffloadingKind() == ToolChain::OK_OpenMP_Host)
		OffloadingHostResults[JA].push_back(Result);

		return;
		}

		//
		// Add post-processing code for other offloading implementations here.
		//
		}

static Arg *MakeInputArg(DerivedArgList &Args, OptTable *Opts,		static Arg *MakeInputArg(DerivedArgList &Args, OptTable *Opts,
StringRef Value) {		StringRef Value) {
Arg *A = new Arg(Opts->getOption(options::OPT_INPUT), Value,		Arg *A = new Arg(Opts->getOption(options::OPT_INPUT), Value,
Args.getBaseArgs().MakeIndex(Value), Value.data());		Args.getBaseArgs().MakeIndex(Value), Value.data());
Args.AddSynthesizedArg(A);		Args.AddSynthesizedArg(A);
A->claim();		A->claim();
return A;		return A;
}		}
▲ Show 20 Lines • Show All 272 Lines • ▼ Show 20 Lines	Compilation Driver::BuildCompilation(ArrayRef<const char > ArgList) {
setLTOMode(Args);		setLTOMode(Args);

std::unique_ptr<llvm::opt::InputArgList> UArgs =		std::unique_ptr<llvm::opt::InputArgList> UArgs =
llvm::make_unique<InputArgList>(std::move(Args));		llvm::make_unique<InputArgList>(std::move(Args));

// Perform the default argument translations.		// Perform the default argument translations.
DerivedArgList *TranslatedArgs = TranslateInputArgs(*UArgs);		DerivedArgList *TranslatedArgs = TranslateInputArgs(*UArgs);

		// Check if we need offloading support by the toolchains.
		ToolChain::OffloadingKind HostOffloadingKind = ToolChain::OK_None;
		ToolChain::OffloadingKind DeviceOffloadingKind = ToolChain::OK_None;
		// Check if we need OpenMP offloading
		if (RequiresOpenMPOffloading(*UArgs)) {
		HostOffloadingKind = ToolChain::OK_OpenMP_Host;
		DeviceOffloadingKind = ToolChain::OK_OpenMP_Device;
		}

// Owned by the host.		// Owned by the host.
const ToolChain &TC =		const ToolChain &TC =
getToolChain(*UArgs, computeTargetTriple(DefaultTargetTriple, *UArgs));		getToolChain(*UArgs, computeTargetTriple(DefaultTargetTriple, *UArgs),
		HostOffloadingKind);

		// Get the toolchains for the offloading targets if any. We need to read the
		// offloading toolchains only if we have a compatible runtime library, and
		// that would be either libomp or libiomp.
		OrderedOffloadingToolchains.clear();

		if (DeviceOffloadingKind == ToolChain::OK_OpenMP_Device) {
		Arg *Tgts = UArgs->getLastArg(options::OPT_omptargets_EQ);
		assert(Tgts && Tgts->getNumValues() &&
		"OpenMP offloading has to have targets specified.");

		for (unsigned v = 0; v < Tgts->getNumValues(); ++v) {
		const char *Val = Tgts->getValue(v);
		llvm::Triple TT(Val);

		// If the specified target is invalid, emit error
		if (TT.getArch() == llvm::Triple::UnknownArch)
		Diag(clang::diag::err_drv_invalid_omp_target) << Val;
		else {
		const ToolChain &OffloadTC =
		getToolChain(*UArgs, TT, DeviceOffloadingKind);
		OrderedOffloadingToolchains.push_back(&OffloadTC);
		}
		}
		}

// The compilation takes ownership of Args.		// The compilation takes ownership of Args.
Compilation *C = new Compilation(*this, TC, UArgs.release(), TranslatedArgs);		Compilation *C = new Compilation(*this, TC, UArgs.release(), TranslatedArgs);

C->setCudaDeviceToolChain(		C->setCudaDeviceToolChain(
&getToolChain(C->getArgs(), llvm::Triple(TC.getTriple().isArch64Bit()		&getToolChain(C->getArgs(), llvm::Triple(TC.getTriple().isArch64Bit()
? "nvptx64-nvidia-cuda"		? "nvptx64-nvidia-cuda"
: "nvptx-nvidia-cuda")));		: "nvptx-nvidia-cuda")));
▲ Show 20 Lines • Show All 975 Lines • ▼ Show 20 Lines	for (auto &I : Inputs) {
for (const auto &Phase : PL)		for (const auto &Phase : PL)
if (Phase <= FinalPhase && Phase == phases::Compile) {		if (Phase <= FinalPhase && Phase == phases::Compile) {
CudaInjectionPhase = Phase;		CudaInjectionPhase = Phase;
break;		break;
}		}

// Build the pipeline for this file.		// Build the pipeline for this file.
std::unique_ptr<Action> Current(new InputAction(*InputArg, InputType));		std::unique_ptr<Action> Current(new InputAction(*InputArg, InputType));

		// If we need to support offloading, run an unbundling job before each input
		// to make sure that bundled files get unbundled. If the input is a source
		// file that is not required.
		if (!OrderedOffloadingToolchains.empty() &&
		InputArg->getOption().getKind() == llvm::opt::Option::InputClass &&
		!types::isSrcFile(InputType))
		Current.reset(new OffloadUnbundlingJobAction(std::move(Current)));

for (SmallVectorImpl<phases::ID>::iterator i = PL.begin(), e = PL.end();		for (SmallVectorImpl<phases::ID>::iterator i = PL.begin(), e = PL.end();
i != e; ++i) {		i != e; ++i) {
phases::ID Phase = *i;		phases::ID Phase = *i;

// We are done if this step is past what the user requested.		// We are done if this step is past what the user requested.
if (Phase > FinalPhase)		if (Phase > FinalPhase)
break;		break;

Show All 20 Lines	for (SmallVectorImpl<phases::ID>::iterator i = PL.begin(), e = PL.end();
break;		break;
}		}

if (Current->getType() == types::TY_Nothing)		if (Current->getType() == types::TY_Nothing)
break;		break;
}		}

// If we ended with something, add to the output list.		// If we ended with something, add to the output list.
if (Current)		if (Current) {
		// If we need to support offloading, run a bundling job for each output
		// that is not a linker action. The link step is where device images are
		// usually embedded into the host to form a fat binary.
		if (!OrderedOffloadingToolchains.empty())
		Current.reset(new OffloadBundlingJobAction(std::move(Current)));

Actions.push_back(Current.release());		Actions.push_back(Current.release());
}		}
		}

// Add a link action if necessary.		// Add a link action if necessary.
if (!LinkerInputs.empty())		if (!LinkerInputs.empty())
Actions.push_back(new LinkJobAction(LinkerInputs, types::TY_Image));		Actions.push_back(new LinkJobAction(LinkerInputs, types::TY_Image));

// If we are linking, claim any options which are obviously only used for		// If we are linking, claim any options which are obviously only used for
// compilation.		// compilation.
if (FinalPhase == phases::Link && PL.size() == 1) {		if (FinalPhase == phases::Link && PL.size() == 1) {
▲ Show 20 Lines • Show All 113 Lines • ▼ Show 20 Lines	void Driver::BuildJobs(Compilation &C) const {

// Collect the list of architectures.		// Collect the list of architectures.
llvm::StringSet<> ArchNames;		llvm::StringSet<> ArchNames;
if (C.getDefaultToolChain().getTriple().isOSBinFormatMachO())		if (C.getDefaultToolChain().getTriple().isOSBinFormatMachO())
for (const Arg *A : C.getArgs())		for (const Arg *A : C.getArgs())
if (A->getOption().matches(options::OPT_arch))		if (A->getOption().matches(options::OPT_arch))
ArchNames.insert(A->getValue());		ArchNames.insert(A->getValue());

		// Cleanup the offloading host cache so that cached results of previous runs
		// are not used. This is required for when clang is used as library.
		OffloadingHostResultsTy OffloadingHostResults;

for (Action *A : C.getActions()) {		for (Action *A : C.getActions()) {
// If we are linking an image for multiple archs then the linker wants		// If we are linking an image for multiple archs then the linker wants
// -arch_multiple and -final_output <final image name>. Unfortunately, this		// -arch_multiple and -final_output <final image name>. Unfortunately, this
// doesn't fit in cleanly because we have to pass this information down.		// doesn't fit in cleanly because we have to pass this information down.
//		//
// FIXME: This is a hack; find a cleaner way to integrate this into the		// FIXME: This is a hack; find a cleaner way to integrate this into the
// process.		// process.
const char *LinkingOutput = nullptr;		const char *LinkingOutput = nullptr;
if (isa<LipoJobAction>(A)) {		if (isa<LipoJobAction>(A)) {
if (FinalOutput)		if (FinalOutput)
LinkingOutput = FinalOutput->getValue();		LinkingOutput = FinalOutput->getValue();
else		else
LinkingOutput = getDefaultImageName();		LinkingOutput = getDefaultImageName();
}		}

InputInfo II;		InputInfo II;
BuildJobsForAction(C, A, &C.getDefaultToolChain(),		BuildJobsForAction(C, A, &C.getDefaultToolChain(),
/*BoundArch*/ nullptr,		/*BoundArch*/ nullptr,
/*AtTopLevel*/ true,		/*AtTopLevel*/ true,
/*MultipleArchs*/ ArchNames.size() > 1,		/*MultipleArchs*/ ArchNames.size() > 1,
/*LinkingOutput*/ LinkingOutput, II);		/*LinkingOutput*/ LinkingOutput, II,
		OffloadingHostResults);
}		}

// If the user passed -Qunused-arguments or there were errors, don't warn		// If the user passed -Qunused-arguments or there were errors, don't warn
// about any unused arguments.		// about any unused arguments.
if (Diags.hasErrorOccurred() ||		if (Diags.hasErrorOccurred() ||
C.getArgs().hasArg(options::OPT_Qunused_arguments))		C.getArgs().hasArg(options::OPT_Qunused_arguments))
return;		return;

▲ Show 20 Lines • Show All 54 Lines • ▼ Show 20 Lines	if (TC->useIntegratedAs() && !SaveTemps &&
!C.getArgs().hasArg(options::OPT_via_file_asm) &&		!C.getArgs().hasArg(options::OPT_via_file_asm) &&
!C.getArgs().hasArg(options::OPT__SLASH_FA) &&		!C.getArgs().hasArg(options::OPT__SLASH_FA) &&
!C.getArgs().hasArg(options::OPT__SLASH_Fa) &&		!C.getArgs().hasArg(options::OPT__SLASH_Fa) &&
isa<AssembleJobAction>(JA) && Inputs->size() == 1 &&		isa<AssembleJobAction>(JA) && Inputs->size() == 1 &&
isa<BackendJobAction>(*Inputs->begin())) {		isa<BackendJobAction>(*Inputs->begin())) {
// A BackendJob is always preceded by a CompileJob, and without		// A BackendJob is always preceded by a CompileJob, and without
// -save-temps they will always get combined together, so instead of		// -save-temps they will always get combined together, so instead of
// checking the backend tool, check if the tool for the CompileJob		// checking the backend tool, check if the tool for the CompileJob
// has an integrated assembler.		// has an integrated assembler. However, if OpenMP offloading is required
const ActionList *BackendInputs = &(*Inputs)[0]->getInputs();		// the backend and compile jobs have to be kept separate and an integrated
		// assembler of the backend job will be queried instead.
		JobAction *CurJA = cast<BackendJobAction>(*Inputs->begin());
		const ActionList *BackendInputs = &CurJA->getInputs();
		CudaHostAction *CHA = nullptr;
		if (!RequiresOpenMPOffloading(TC)) {
		echristo: Might be time to make some specialized versions of this function. This may take it from "ridiculously confusing" to "code no one should ever look at" :)
		sfantao (author): I agree. This function is really messy... :S In http://reviews.llvm.org/D18171 I am proposing `collapseOffloadingAction`, which drives the collapsing of offload actions and abstracts some of the complexity in `selectToolForJob`. Do you think that goes in the right direction, or do you think I should do something else?
// Compile job may be wrapped in CudaHostAction, extract it if		// Compile job may be wrapped in CudaHostAction, extract it if
// that's the case and update CollapsedCHA if we combine phases.		// that's the case and update CollapsedCHA if we combine phases.
CudaHostAction *CHA = dyn_cast<CudaHostAction>(*BackendInputs->begin());		CHA = dyn_cast<CudaHostAction>(*CurJA->begin());
JobAction *CompileJA =		CurJA =
cast<CompileJobAction>(CHA ? *CHA->begin() : *BackendInputs->begin());		cast<CompileJobAction>(CHA ? *CHA->begin() : *BackendInputs->begin());
assert(CompileJA && "Backend job is not preceeded by compile job.");		assert(CurJA && "Backend job is not preceeded by compile job.");
const Tool *Compiler = TC->SelectTool(*CompileJA);		}
if (!Compiler)		const Tool *CurTool = TC->SelectTool(*CurJA);
		if (!CurTool)
return nullptr;		return nullptr;
if (Compiler->hasIntegratedAssembler()) {		if (CurTool->hasIntegratedAssembler()) {
Inputs = &CompileJA->getInputs();		Inputs = &CurJA->getInputs();
ToolForJob = Compiler;		ToolForJob = CurTool;
CollapsedCHA = CHA;		CollapsedCHA = CHA;
}		}
}		}

// A backend job should always be combined with the preceding compile job		// A backend job should always be combined with the preceding compile job
// unless OPT_save_temps is enabled and the compiler is capable of emitting		// unless OPT_save_temps is enabled and the compiler is capable of emitting
// LLVM IR as an intermediate output.		// LLVM IR as an intermediate output. The OpenMP offloading implementation
if (isa<BackendJobAction>(JA)) {		// also requires the Compile and Backend jobs to be separate.
		if (isa<BackendJobAction>(JA) && !RequiresOpenMPOffloading(TC)) {
// Check if the compiler supports emitting LLVM IR.		// Check if the compiler supports emitting LLVM IR.
assert(Inputs->size() == 1);		assert(Inputs->size() == 1);
// Compile job may be wrapped in CudaHostAction, extract it if		// Compile job may be wrapped in CudaHostAction, extract it if
// that's the case and update CollapsedCHA if we combine phases.		// that's the case and update CollapsedCHA if we combine phases.
CudaHostAction *CHA = dyn_cast<CudaHostAction>(*Inputs->begin());		CudaHostAction *CHA = dyn_cast<CudaHostAction>(*Inputs->begin());
JobAction *CompileJA =		JobAction *CompileJA =
cast<CompileJobAction>(CHA ? *CHA->begin() : *Inputs->begin());		cast<CompileJobAction>(CHA ? *CHA->begin() : *Inputs->begin());
assert(CompileJA && "Backend job is not preceeded by compile job.");		assert(CompileJA && "Backend job is not preceeded by compile job.");
Show All 23 Lines	static const Tool *selectToolForJob(Compilation &C, bool SaveTemps,

return ToolForJob;		return ToolForJob;
}		}

void Driver::BuildJobsForAction(Compilation &C, const Action *A,		void Driver::BuildJobsForAction(Compilation &C, const Action *A,
const ToolChain *TC, const char *BoundArch,		const ToolChain *TC, const char *BoundArch,
bool AtTopLevel, bool MultipleArchs,		bool AtTopLevel, bool MultipleArchs,
const char *LinkingOutput,		const char *LinkingOutput,
InputInfo &Result) const {		InputInfo &Result,
		OffloadingHostResultsTy &OffloadingHostResults) const {
llvm::PrettyStackTraceString CrashInfo("Building compilation jobs");		llvm::PrettyStackTraceString CrashInfo("Building compilation jobs");

InputInfoList CudaDeviceInputInfos;		InputInfoList CudaDeviceInputInfos;
if (const CudaHostAction *CHA = dyn_cast<CudaHostAction>(A)) {		if (const CudaHostAction *CHA = dyn_cast<CudaHostAction>(A)) {
InputInfo II;		InputInfo II;
// Append outputs of device jobs to the input list.		// Append outputs of device jobs to the input list.
for (const Action *DA : CHA->getDeviceActions()) {		for (const Action *DA : CHA->getDeviceActions()) {
BuildJobsForAction(C, DA, TC, nullptr, AtTopLevel,		BuildJobsForAction(C, DA, TC, nullptr, AtTopLevel,
/*MultipleArchs*/ false, LinkingOutput, II);		/*MultipleArchs*/ false, LinkingOutput, II, OffloadingHostResults);
CudaDeviceInputInfos.push_back(II);		CudaDeviceInputInfos.push_back(II);
}		}
// Override current action with a real host compile action and continue		// Override current action with a real host compile action and continue
// processing it.		// processing it.
A = *CHA->begin();		A = *CHA->begin();
}		}

		if (const OffloadUnbundlingJobAction *OUA =
		dyn_cast<OffloadUnbundlingJobAction>(A)) {
		// The input of the unbundling job has to be a single input non-source file,
		tra: "has to be"
		sfantao (author): Fixed!
		// so we do not consider it having multiple architectures. We just use the
		// naming that a regular host input file would have.
		BuildJobsForAction(C, *OUA->begin(), TC, BoundArch, AtTopLevel,
		/*MultipleArchs=*/false, LinkingOutput, Result,
		OffloadingHostResults);
		Result = CreateUnbundledOffloadingResult(C, OUA, TC, Result,
		OffloadingHostResults);
		return;
		}

		if (const OffloadBundlingJobAction *OBA =
		dyn_cast<OffloadBundlingJobAction>(A)) {
		// Compute the input action for all devices and emit a bundling command.
		InputInfoList Results(OrderedOffloadingToolchains.size() + 1);
		for (unsigned i = 0; i < Results.size(); ++i) {
		const ToolChain *CurTC = i ? OrderedOffloadingToolchains[i - 1] : TC;
		// The input job of the bundling action is meant for multiple targets and
		// is not a top level job - the bundling job is the top level for the
		// current output.
		BuildJobsForAction(C, *OBA->begin(), CurTC,
		CreateOffloadingPseudoArchName(C, CurTC),
		/*AtTopLevel=*/false,
		/*MultipleArchs=*/true, LinkingOutput, Results[i],
		OffloadingHostResults);
		}
		Result = CreateBundledOffloadingResult(C, OBA, TC, Results);
		return;
		}

if (const InputAction *IA = dyn_cast<InputAction>(A)) {		if (const InputAction *IA = dyn_cast<InputAction>(A)) {
// FIXME: It would be nice to not claim this here; maybe the old scheme of		// FIXME: It would be nice to not claim this here; maybe the old scheme of
// just using Args was better?		// just using Args was better?
const Arg &Input = IA->getInputArg();		const Arg &Input = IA->getInputArg();
Input.claim();		Input.claim();
if (Input.getOption().matches(options::OPT_INPUT)) {		if (Input.getOption().matches(options::OPT_INPUT)) {
const char *Name = Input.getValue();		const char *Name = Input.getValue();
Result = InputInfo(Name, A->getType(), Name);		Result = InputInfo(Name, A->getType(), Name);
Show All 10 Lines	if (const BindArchAction *BAA = dyn_cast<BindArchAction>(A)) {
if (ArchName)		if (ArchName)
TC = &getToolChain(		TC = &getToolChain(
C.getArgs(),		C.getArgs(),
computeTargetTriple(DefaultTargetTriple, C.getArgs(), ArchName));		computeTargetTriple(DefaultTargetTriple, C.getArgs(), ArchName));
else		else
TC = &C.getDefaultToolChain();		TC = &C.getDefaultToolChain();

BuildJobsForAction(C, *BAA->begin(), TC, ArchName, AtTopLevel,		BuildJobsForAction(C, *BAA->begin(), TC, ArchName, AtTopLevel,
MultipleArchs, LinkingOutput, Result);		MultipleArchs, LinkingOutput, Result,
		OffloadingHostResults);
return;		return;
}		}

if (const CudaDeviceAction *CDA = dyn_cast<CudaDeviceAction>(A)) {		if (const CudaDeviceAction *CDA = dyn_cast<CudaDeviceAction>(A)) {
// Initial processing of CudaDeviceAction carries host params.		// Initial processing of CudaDeviceAction carries host params.
// Call BuildJobsForAction() again, now with correct device parameters.		// Call BuildJobsForAction() again, now with correct device parameters.
assert(CDA->getGpuArchName() && "No GPU name in device action.");		assert(CDA->getGpuArchName() && "No GPU name in device action.");
BuildJobsForAction(C, *CDA->begin(), C.getCudaDeviceToolChain(),		BuildJobsForAction(C, *CDA->begin(), C.getCudaDeviceToolChain(),
CDA->getGpuArchName(), CDA->isAtTopLevel(),		CDA->getGpuArchName(), CDA->isAtTopLevel(),
/*MultipleArchs*/ true, LinkingOutput, Result);		/*MultipleArchs*/ true, LinkingOutput, Result, OffloadingHostResults);
return;		return;
}		}

const ActionList *Inputs = &A->getInputs();		const ActionList *Inputs = &A->getInputs();

const JobAction *JA = cast<JobAction>(A);		const JobAction *JA = cast<JobAction>(A);
const CudaHostAction *CollapsedCHA = nullptr;		const CudaHostAction *CollapsedCHA = nullptr;
const Tool *T =		const Tool *T =
selectToolForJob(C, isSaveTempsEnabled(), TC, JA, Inputs, CollapsedCHA);		selectToolForJob(C, isSaveTempsEnabled(), TC, JA, Inputs, CollapsedCHA);
if (!T)		if (!T)
return;		return;

// If we've collapsed action list that contained CudaHostAction we		// If we've collapsed action list that contained CudaHostAction we
// need to build jobs for device-side inputs it may have held.		// need to build jobs for device-side inputs it may have held.
if (CollapsedCHA) {		if (CollapsedCHA) {
InputInfo II;		InputInfo II;
for (const Action *DA : CollapsedCHA->getDeviceActions()) {		for (const Action *DA : CollapsedCHA->getDeviceActions()) {
BuildJobsForAction(C, DA, TC, "", AtTopLevel,		BuildJobsForAction(C, DA, TC, "", AtTopLevel,
/*MultipleArchs*/ false, LinkingOutput, II);		/*MultipleArchs*/ false, LinkingOutput, II,
		OffloadingHostResults);
CudaDeviceInputInfos.push_back(II);		CudaDeviceInputInfos.push_back(II);
}		}
}		}

// Only use pipes when there is exactly one input.		// Only use pipes when there is exactly one input.
InputInfoList InputInfos;		InputInfoList InputInfos;
for (const Action *Input : *Inputs) {		for (const Action *Input : *Inputs) {
// Treat dsymutil and verify sub-jobs as being at the top-level too, they		// Treat dsymutil and verify sub-jobs as being at the top-level too, they
// shouldn't get temporary output names.		// shouldn't get temporary output names.
// FIXME: Clean this up.		// FIXME: Clean this up.
bool SubJobAtTopLevel = false;		bool SubJobAtTopLevel = false;
if (AtTopLevel && (isa<DsymutilJobAction>(A) || isa<VerifyJobAction>(A)))		if (AtTopLevel && (isa<DsymutilJobAction>(A) || isa<VerifyJobAction>(A)))
SubJobAtTopLevel = true;		SubJobAtTopLevel = true;

InputInfo II;		InputInfo II;
BuildJobsForAction(C, Input, TC, BoundArch, SubJobAtTopLevel, MultipleArchs,		BuildJobsForAction(C, Input, TC, BoundArch, SubJobAtTopLevel, MultipleArchs,
LinkingOutput, II);		LinkingOutput, II, OffloadingHostResults);
InputInfos.push_back(II);		InputInfos.push_back(II);
}		}

// Always use the first input as the base input.		// Always use the first input as the base input.
const char *BaseInput = InputInfos[0].getBaseInput();		const char *BaseInput = InputInfos[0].getBaseInput();

// ... except dsymutil actions, which use their actual input as the base		// ... except dsymutil actions, which use their actual input as the base
// input.		// input.
if (JA->getType() == types::TY_dSYM)		if (JA->getType() == types::TY_dSYM)
BaseInput = InputInfos[0].getFilename();		BaseInput = InputInfos[0].getFilename();

// Append outputs of cuda device jobs to the input list		// Append outputs of cuda device jobs to the input list
if (CudaDeviceInputInfos.size())		if (CudaDeviceInputInfos.size())
InputInfos.append(CudaDeviceInputInfos.begin(), CudaDeviceInputInfos.end());		InputInfos.append(CudaDeviceInputInfos.begin(), CudaDeviceInputInfos.end());

// Determine the place to write output to, if any.		// Determine the place to write output to, if any.
if (JA->getType() == types::TY_Nothing)		Result =
Result = InputInfo(A->getType(), BaseInput);		CreateActionResult(C, A, BaseInput, BoundArch, AtTopLevel, MultipleArchs);
else
Result = InputInfo(GetNamedOutputPath(C, *JA, BaseInput, BoundArch,
AtTopLevel, MultipleArchs),
A->getType(), BaseInput);

if (CCCPrintBindings && !CCGenDiagnostics) {		// Post-process inputs and results to suit the needs of the offloading
llvm::errs() << "# \"" << T->getToolChain().getTripleString() << '"'		// implementations.
<< " - \"" << T->getName() << "\", inputs: [";		PostProcessOffloadingInputsAndResults(C, JA, TC, InputInfos, Result,
for (unsigned i = 0, e = InputInfos.size(); i != e; ++i) {		OffloadingHostResults);
llvm::errs() << InputInfos[i].getAsString();
if (i + 1 != e)		if (CCCPrintBindings && !CCGenDiagnostics)
llvm::errs() << ", ";		DumpJobBindings(&T->getToolChain(), T->getName(), InputInfos, Result);
}		else
llvm::errs() << "], output: " << Result.getAsString() << "\n";
} else {
T->ConstructJob(C, *JA, Result, InputInfos,		T->ConstructJob(C, *JA, Result, InputInfos,
C.getArgsForToolChain(TC, BoundArch), LinkingOutput);		C.getArgsForToolChain(TC, BoundArch), LinkingOutput);
}		}
}
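
The OffloadBundlingJobAction branch of BuildJobsForAction above builds one input per toolchain: slot 0 is the host and slots 1..N are the offloading devices in the order they were listed. A minimal standalone sketch of that indexing follows; `orderedBundleTargets` and the plain-string triples are hypothetical stand-ins for illustration, not part of the patch or of the clang driver API.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not in the patch): returns the bundling order,
// with the host in slot 0 and the devices in slots 1..N.
std::vector<std::string>
orderedBundleTargets(const std::string &HostTriple,
                     const std::vector<std::string> &DeviceTriples) {
  std::vector<std::string> Slots(DeviceTriples.size() + 1);
  for (std::size_t i = 0; i < Slots.size(); ++i)
    // Mirrors `i ? OrderedOffloadingToolchains[i - 1] : TC` in the diff.
    Slots[i] = i ? DeviceTriples[i - 1] : HostTriple;
  return Slots;
}
```

Keeping the host in slot 0 means the bundler and unbundler can agree on positions without any extra bookkeeping, which appears to be the intent of the loop in the patch.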

const char *Driver::getDefaultImageName() const {		const char *Driver::getDefaultImageName() const {
llvm::Triple Target(llvm::Triple::normalize(DefaultTargetTriple));		llvm::Triple Target(llvm::Triple::normalize(DefaultTargetTriple));
return Target.isOSWindows() ? "a.exe" : "a.out";		return Target.isOSWindows() ? "a.exe" : "a.out";
}		}

/// \brief Create output filename based on ArgValue, which could either be a		/// \brief Create output filename based on ArgValue, which could either be a
/// full filename, filename without extension, or a directory. If ArgValue		/// full filename, filename without extension, or a directory. If ArgValue
▲ Show 20 Lines • Show All 282 Lines • ▼ Show 20 Lines	std::string Driver::GetTemporaryPath(StringRef Prefix,
if (EC) {		if (EC) {
Diag(clang::diag::err_unable_to_make_temp) << EC.message();		Diag(clang::diag::err_unable_to_make_temp) << EC.message();
return "";		return "";
}		}

return Path.str();		return Path.str();
}		}

const ToolChain &Driver::getToolChain(const ArgList &Args,		const ToolChain &
const llvm::Triple &Target) const {		Driver::getToolChain(const ArgList &Args, const llvm::Triple &Target,
		ToolChain::OffloadingKind OffloadingKind) const {
ToolChain *&TC = ToolChains[Target.str()];		// If this is an offload toolchain we need to try to get it from the right
		// cache.
		bool IsOffloadingDevice = (OffloadingKind == ToolChain::OK_OpenMP_Device);
		ToolChain *&TC = *((IsOffloadingDevice) ? &OffloadToolChains[Target.str()]
		: &ToolChains[Target.str()]);
if (!TC) {		if (!TC) {
switch (Target.getOS()) {		switch (Target.getOS()) {
case llvm::Triple::CloudABI:		case llvm::Triple::CloudABI:
TC = new toolchains::CloudABI(*this, Target, Args);		TC = new toolchains::CloudABI(*this, Target, Args);
break;		break;
case llvm::Triple::Darwin:		case llvm::Triple::Darwin:
case llvm::Triple::MacOSX:		case llvm::Triple::MacOSX:
case llvm::Triple::IOS:		case llvm::Triple::IOS:
▲ Show 20 Lines • Show All 89 Lines • ▼ Show 20 Lines	default:
TC = new toolchains::Generic_ELF(*this, Target, Args);		TC = new toolchains::Generic_ELF(*this, Target, Args);
else if (Target.isOSBinFormatMachO())		else if (Target.isOSBinFormatMachO())
TC = new toolchains::MachO(*this, Target, Args);		TC = new toolchains::MachO(*this, Target, Args);
else		else
TC = new toolchains::Generic_GCC(*this, Target, Args);		TC = new toolchains::Generic_GCC(*this, Target, Args);
}		}
}		}
}		}
		// Set the offloading kind for this toolchain.
		TC->setOffloadingKind(OffloadingKind);
return *TC;		return *TC;
}		}
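
The change to getToolChain above keeps device toolchains in a separate cache from host toolchains, so the same triple can yield two distinct objects depending on the offloading kind. The following is a minimal standalone sketch of that idea; `ToyDriver`, `ToyToolChain` and the `OffloadingKind` enum are hypothetical simplifications, and unlike the patch the sketch records the kind only when the entry is first created.

```cpp
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-ins for ToolChain::OffloadingKind and ToolChain.
enum class OffloadingKind { None, OpenMPHost, OpenMPDevice };

struct ToyToolChain {
  std::string Triple;
  OffloadingKind Kind = OffloadingKind::None;
};

class ToyDriver {
  // Two caches keyed by triple string: one for host, one for devices.
  std::map<std::string, std::unique_ptr<ToyToolChain>> HostCache;
  std::map<std::string, std::unique_ptr<ToyToolChain>> DeviceCache;

public:
  ToyToolChain &getToolChain(const std::string &Triple, OffloadingKind Kind) {
    // Pick the cache that matches the requested offloading kind.
    auto &Cache =
        (Kind == OffloadingKind::OpenMPDevice) ? DeviceCache : HostCache;
    std::unique_ptr<ToyToolChain> &Slot = Cache[Triple];
    if (!Slot) {
      // First request for this triple in this cache: create and tag it.
      Slot = std::make_unique<ToyToolChain>();
      Slot->Triple = Triple;
      Slot->Kind = Kind;
    }
    return *Slot;
  }
};
```

With this layout, asking for the same triple once as host and once as device returns two different objects, while repeated requests with the same kind return the cached one.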

bool Driver::ShouldUseClangCompiler(const JobAction &JA) const {		bool Driver::ShouldUseClangCompiler(const JobAction &JA) const {
// Say "no" if there is not exactly one input of a type clang understands.		// Say "no" if there is not exactly one input of a type clang understands.
if (JA.size() != 1 || !types::isAcceptedByClang((*JA.begin())->getType()))		if (JA.size() != 1 || !types::isAcceptedByClang((*JA.begin())->getType()))
return false;		return false;

▲ Show 20 Lines • Show All 64 Lines • Show Last 20 Lines

lib/Driver/ToolChain.cpp

Show First 20 Lines • Show All 61 Lines • ▼ Show 20 Lines	if (Exceptions &&
return ToolChain::RM_EnabledImplicitly;		return ToolChain::RM_EnabledImplicitly;

return ToolChain::RM_DisabledImplicitly;		return ToolChain::RM_DisabledImplicitly;
}		}

ToolChain::ToolChain(const Driver &D, const llvm::Triple &T,		ToolChain::ToolChain(const Driver &D, const llvm::Triple &T,
const ArgList &Args)		const ArgList &Args)
: D(D), Triple(T), Args(Args), CachedRTTIArg(GetRTTIArgument(Args)),		: D(D), Triple(T), Args(Args), CachedRTTIArg(GetRTTIArgument(Args)),
CachedRTTIMode(CalculateRTTIMode(Args, Triple, CachedRTTIArg)) {		CachedRTTIMode(CalculateRTTIMode(Args, Triple, CachedRTTIArg)),
		CachedOffloadingKind(OK_None) {
if (Arg *A = Args.getLastArg(options::OPT_mthread_model))		if (Arg *A = Args.getLastArg(options::OPT_mthread_model))
if (!isThreadModelSupported(A->getValue()))		if (!isThreadModelSupported(A->getValue()))
D.Diag(diag::err_drv_invalid_thread_model_for_target)		D.Diag(diag::err_drv_invalid_thread_model_for_target)
<< A->getValue() << A->getAsString(Args);		<< A->getValue() << A->getAsString(Args);
}		}

ToolChain::~ToolChain() {		ToolChain::~ToolChain() {
}		}
▲ Show 20 Lines • Show All 100 Lines • ▼ Show 20 Lines	ToolChain::getTargetAndModeFromProgramName(StringRef PN) {
std::string IgnoredError;		std::string IgnoredError;
std::string Target;		std::string Target;
if (llvm::TargetRegistry::lookupTarget(Prefix, IgnoredError)) {		if (llvm::TargetRegistry::lookupTarget(Prefix, IgnoredError)) {
Target = Prefix;		Target = Prefix;
}		}
return std::make_pair(Target, ModeFlag);		return std::make_pair(Target, ModeFlag);
}		}

		void ToolChain::setOffloadingKind(OffloadingKind OK) {
		assert(CachedOffloadingKind == OK_None &&
		"Offloading kind not expected to change once it is set.");
		CachedOffloadingKind = OK;
		}

StringRef ToolChain::getDefaultUniversalArchName() const {		StringRef ToolChain::getDefaultUniversalArchName() const {
// In universal driver terms, the arch name accepted by -arch isn't exactly		// In universal driver terms, the arch name accepted by -arch isn't exactly
// the same as the ones that appear in the triple. Roughly speaking, this is		// the same as the ones that appear in the triple. Roughly speaking, this is
// an inverse of the darwin::getArchTypeForDarwinArchName() function, but the		// an inverse of the darwin::getArchTypeForDarwinArchName() function, but the
// only interesting special case is powerpc.		// only interesting special case is powerpc.
switch (Triple.getArch()) {		switch (Triple.getArch()) {
case llvm::Triple::ppc:		case llvm::Triple::ppc:
return "ppc";		return "ppc";
Show All 37 Lines
}		}

Tool *ToolChain::getLink() const {		Tool *ToolChain::getLink() const {
if (!Link)		if (!Link)
Link.reset(buildLinker());		Link.reset(buildLinker());
return Link.get();		return Link.get();
}		}

		Tool *ToolChain::getOffloadBundler() const {
		if (!OffloadBundler)
		OffloadBundler.reset(new tools::OffloadBundler(*this));
		return OffloadBundler.get();
		}

Tool *ToolChain::getTool(Action::ActionClass AC) const {		Tool *ToolChain::getTool(Action::ActionClass AC) const {
switch (AC) {		switch (AC) {
case Action::AssembleJobClass:		case Action::AssembleJobClass:
return getAssemble();		return getAssemble();

case Action::LinkJobClass:		case Action::LinkJobClass:
return getLink();		return getLink();

Show All 9 Lines	Tool *ToolChain::getTool(Action::ActionClass AC) const {
case Action::CompileJobClass:		case Action::CompileJobClass:
case Action::PrecompileJobClass:		case Action::PrecompileJobClass:
case Action::PreprocessJobClass:		case Action::PreprocessJobClass:
case Action::AnalyzeJobClass:		case Action::AnalyzeJobClass:
case Action::MigrateJobClass:		case Action::MigrateJobClass:
case Action::VerifyPCHJobClass:		case Action::VerifyPCHJobClass:
case Action::BackendJobClass:		case Action::BackendJobClass:
return getClang();		return getClang();

		case Action::OffloadBundlingJobClass:
		case Action::OffloadUnbundlingJobClass:
		return getOffloadBundler();
}		}

llvm_unreachable("Invalid tool kind.");		llvm_unreachable("Invalid tool kind.");
}		}

static StringRef getArchNameForCompilerRTLib(const ToolChain &TC,		static StringRef getArchNameForCompilerRTLib(const ToolChain &TC,
const ArgList &Args) {		const ArgList &Args) {
const llvm::Triple &Triple = TC.getTriple();		const llvm::Triple &Triple = TC.getTriple();
▲ Show 20 Lines • Show All 395 Lines • Show Last 20 Lines

lib/Driver/ToolChains.h

Show First 20 Lines • Show All 199 Lines • ▼ Show 20 Lines	public:

void printVerboseInfo(raw_ostream &OS) const override;		void printVerboseInfo(raw_ostream &OS) const override;

bool IsUnwindTablesDefault() const override;		bool IsUnwindTablesDefault() const override;
bool isPICDefault() const override;		bool isPICDefault() const override;
bool isPIEDefault() const override;		bool isPIEDefault() const override;
bool isPICDefaultForced() const override;		bool isPICDefaultForced() const override;
bool IsIntegratedAssemblerDefault() const override;		bool IsIntegratedAssemblerDefault() const override;
		llvm::opt::DerivedArgList *
		TranslateOffloadArgs(const llvm::opt::DerivedArgList &Args,
		const char *BoundArch) const override;

protected:		protected:
Tool *getTool(Action::ActionClass AC) const override;		Tool *getTool(Action::ActionClass AC) const override;
Tool *buildAssembler() const override;		Tool *buildAssembler() const override;
Tool *buildLinker() const override;		Tool *buildLinker() const override;

/// \name ToolChain Implementation Helper Functions		/// \name ToolChain Implementation Helper Functions
/// @{		/// @{
▲ Show 20 Lines • Show All 913 Lines • Show Last 20 Lines

lib/Driver/ToolChains.cpp

Show First 20 Lines • Show All 2,414 Lines • ▼ Show 20 Lines	if ((GCCMultiarchTriple.empty() && TargetMultiarchTriple.empty()) \|\|
addSystemInclude(DriverArgs, CC1Args,		addSystemInclude(DriverArgs, CC1Args,
Base + "/" + TargetMultiarchTriple + Suffix);		Base + "/" + TargetMultiarchTriple + Suffix);
}		}

addSystemInclude(DriverArgs, CC1Args, Base + Suffix + "/backward");		addSystemInclude(DriverArgs, CC1Args, Base + Suffix + "/backward");
return true;		return true;
}		}

		llvm::opt::DerivedArgList *
		Generic_GCC::TranslateOffloadArgs(const llvm::opt::DerivedArgList &Args,
		const char *BoundArch) const {
		// Make sure we always generate a shared library for an OpenMP offloading
		// target, regardless of the options the user passed to the host.

		if (getOffloadingKind() != OK_OpenMP_Device)
		return nullptr;

		DerivedArgList *DAL = new DerivedArgList(Args.getBaseArgs());
		const OptTable &Opts = getDriver().getOpts();

		// Request the shared library.
		DAL->AddFlagArg(0, Opts.getOption(options::OPT_shared));
		DAL->AddFlagArg(0, Opts.getOption(options::OPT_fPIC));

		// Filter out all the arguments we do not want to pass to the offloading
		// toolchain, as they can interfere with the creation of a shared library.
		for (auto *A : Args) {
		switch ((options::ID)A->getOption().getID()) {
		default:
		DAL->append(A);
		break;
		case options::OPT_shared:
		case options::OPT_static:
		case options::OPT_fPIC:
		case options::OPT_fno_PIC:
		case options::OPT_fpic:
		case options::OPT_fno_pic:
		case options::OPT_fPIE:
		case options::OPT_fno_PIE:
		case options::OPT_fpie:
		case options::OPT_fno_pie:
		break;
		}
		}

		return DAL;
		}
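
Generic_GCC::TranslateOffloadArgs above forces `-shared` and `-fPIC` for the device and drops any host option that could conflict with building a shared library. A minimal standalone sketch of the same filtering follows, using plain strings instead of the LLVM option classes; `translateOffloadArgs` and its string-based interface are assumptions made for illustration only.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical string-based version of the filtering done in the patch.
std::vector<std::string>
translateOffloadArgs(const std::vector<std::string> &HostArgs) {
  // Options that decide the output kind or the PIC/PIE model; these are
  // dropped so they cannot override the forced settings below.
  static const std::unordered_set<std::string> Dropped = {
      "-shared", "-static", "-fPIC", "-fno-PIC", "-fpic",
      "-fno-pic", "-fPIE",  "-fno-PIE", "-fpie", "-fno-pie"};

  // Always request a position-independent shared library for the device.
  std::vector<std::string> DeviceArgs = {"-shared", "-fPIC"};

  // Forward every other host argument unchanged, preserving order.
  for (const std::string &A : HostArgs)
    if (!Dropped.count(A))
      DeviceArgs.push_back(A);
  return DeviceArgs;
}
```

For example, host arguments `-O2 -static -fpie -g` would become `-shared -fPIC -O2 -g` for the device under this sketch.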

void Generic_ELF::addClangTargetOptions(const ArgList &DriverArgs,		void Generic_ELF::addClangTargetOptions(const ArgList &DriverArgs,
ArgStringList &CC1Args) const {		ArgStringList &CC1Args) const {
const Generic_GCC::GCCVersion &V = GCCInstallation.getVersion();		const Generic_GCC::GCCVersion &V = GCCInstallation.getVersion();
bool UseInitArrayDefault =		bool UseInitArrayDefault =
getTriple().getArch() == llvm::Triple::aarch64 ||		getTriple().getArch() == llvm::Triple::aarch64 ||
getTriple().getArch() == llvm::Triple::aarch64_be ||		getTriple().getArch() == llvm::Triple::aarch64_be ||
(getTriple().getOS() == llvm::Triple::Linux &&		(getTriple().getOS() == llvm::Triple::Linux &&
▲ Show 20 Lines • Show All 2,121 Lines • Show Last 20 Lines

lib/Driver/Tools.h

Show First 20 Lines • Show All 126 Lines • ▼ Show 20 Lines	public:
bool hasIntegratedCPP() const override { return false; }		bool hasIntegratedCPP() const override { return false; }

void ConstructJob(Compilation &C, const JobAction &JA,		void ConstructJob(Compilation &C, const JobAction &JA,
const InputInfo &Output, const InputInfoList &Inputs,		const InputInfo &Output, const InputInfoList &Inputs,
const llvm::opt::ArgList &TCArgs,		const llvm::opt::ArgList &TCArgs,
const char *LinkingOutput) const override;		const char *LinkingOutput) const override;
};		};

		/// \brief Offload bundler tool.
		class LLVM_LIBRARY_VISIBILITY OffloadBundler : public Tool {
		public:
		OffloadBundler(const ToolChain &TC)
		: Tool("Offload bundler", "clang-offload-bundler", TC) {}

		bool hasIntegratedCPP() const override { return false; }
		void ConstructJob(Compilation &C, const JobAction &JA,
		const InputInfo &Output, const InputInfoList &Inputs,
		const llvm::opt::ArgList &TCArgs,
		const char *LinkingOutput) const override;
		};

/// \brief Base class for all GNU tools that provide the same behavior when		/// \brief Base class for all GNU tools that provide the same behavior when
/// it comes to response files support		/// it comes to response files support
class LLVM_LIBRARY_VISIBILITY GnuTool : public Tool {		class LLVM_LIBRARY_VISIBILITY GnuTool : public Tool {
virtual void anchor();		virtual void anchor();

public:		public:
GnuTool(const char *Name, const char *ShortName, const ToolChain &TC)		GnuTool(const char *Name, const char *ShortName, const ToolChain &TC)
: Tool(Name, ShortName, TC, RF_Full, llvm::sys::WEM_CurrentCodePage) {}		: Tool(Name, ShortName, TC, RF_Full, llvm::sys::WEM_CurrentCodePage) {}

lib/Driver/Tools.cpp


Show First 20 Lines • Show All 203 Lines • ▼ Show 20 Lines	if (CombinedArg) {
CmdArgs.push_back(Args.MakeArgString(Dirs));		CmdArgs.push_back(Args.MakeArgString(Dirs));
}		}
}		}
}		}

static void AddLinkerInputs(const ToolChain &TC, const InputInfoList &Inputs,		static void AddLinkerInputs(const ToolChain &TC, const InputInfoList &Inputs,
const ArgList &Args, ArgStringList &CmdArgs) {		const ArgList &Args, ArgStringList &CmdArgs) {
const Driver &D = TC.getDriver();		const Driver &D = TC.getDriver();
		unsigned NumberOfInputs = Inputs.size();

		// If the current toolchain is an OpenMP host toolchain, we need to ignore
		// the last inputs - one for each offloading device - as they are going to be
		// embedded in the fat binary by a custom linker script.
		if (TC.getOffloadingKind() == ToolChain::OK_OpenMP_Host) {
		Arg *Tgts = Args.getLastArg(options::OPT_omptargets_EQ);
		assert(Tgts && Tgts->getNumValues() &&
		"OpenMP offloading has to have targets specified.");
		NumberOfInputs -= Tgts->getNumValues();
		}

// Add extra linker input arguments which are not treated as inputs		// Add extra linker input arguments which are not treated as inputs
// (constructed via -Xarch_).		// (constructed via -Xarch_).
Args.AddAllArgValues(CmdArgs, options::OPT_Zlinker_input);		Args.AddAllArgValues(CmdArgs, options::OPT_Zlinker_input);

for (const auto &II : Inputs) {		for (unsigned i = 0; i < NumberOfInputs; ++i) {
		const auto &II = Inputs[i];
if (!TC.HasNativeLLVMSupport()) {		if (!TC.HasNativeLLVMSupport()) {
// Don't try to pass LLVM inputs unless we have native support.		// Don't try to pass LLVM inputs unless we have native support.
if (II.getType() == types::TY_LLVM_IR ||		if (II.getType() == types::TY_LLVM_IR ||
II.getType() == types::TY_LTO_IR ||		II.getType() == types::TY_LTO_IR ||
II.getType() == types::TY_LLVM_BC || II.getType() == types::TY_LTO_BC)		II.getType() == types::TY_LLVM_BC || II.getType() == types::TY_LTO_BC)
D.Diag(diag::err_drv_no_linker_llvm_support) << TC.getTripleString();		D.Diag(diag::err_drv_no_linker_llvm_support) << TC.getTripleString();
}		}

Show All 21 Lines	static void AddLinkerInputs(const ToolChain &TC, const InputInfoList &Inputs,
}		}

// LIBRARY_PATH - included following the user specified library paths.		// LIBRARY_PATH - included following the user specified library paths.
// and only supported on native toolchains.		// and only supported on native toolchains.
if (!TC.isCrossCompiling())		if (!TC.isCrossCompiling())
addDirectoryList(Args, CmdArgs, "-L", "LIBRARY_PATH");		addDirectoryList(Args, CmdArgs, "-L", "LIBRARY_PATH");
}		}
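
Note: the input trimming that this patch adds to AddLinkerInputs can be sketched standalone. This is a simplified illustration, assuming file names as plain strings; `hostLinkerInputs` and its parameters are hypothetical names, not part of the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Returns the inputs that would be handed to the host linker directly. For an
// OpenMP host toolchain, the last NumTargets inputs are device images and are
// skipped, because the linker script embeds them instead.
std::vector<std::string>
hostLinkerInputs(const std::vector<std::string> &Inputs,
                 std::size_t NumTargets, bool IsOpenMPHost) {
  std::size_t NumberOfInputs = Inputs.size();
  if (IsOpenMPHost) {
    assert(NumTargets > 0 && NumTargets <= NumberOfInputs &&
           "OpenMP offloading has to have targets specified.");
    NumberOfInputs -= NumTargets;
  }
  return std::vector<std::string>(
      Inputs.begin(),
      Inputs.begin() + static_cast<std::ptrdiff_t>(NumberOfInputs));
}
```

With two offloading targets, a list such as `a.o, b.o, dev1.so, dev2.so` would be reduced to `a.o, b.o` for the host link, while a non-offloading toolchain keeps all four.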

		/// \brief Add OpenMP linker script arguments at the end of the argument list
		/// so that the fat binary is built by embedding each of the device images into
		/// the host. The device images are the last inputs, one for each device and
		/// come in the same order the triples are passed through the omptargets option.
		/// The linker script also defines a few symbols required by the code generation
		/// so that the images can be easily retrieved at runtime by the offloading
		/// library. This should be used in tool chains that support linker scripts.
		static void AddOpenMPLinkerScript(const ToolChain &TC, Compilation &C,
		const InputInfo &Output,
		const InputInfoList &Inputs,
		const ArgList &Args, ArgStringList &CmdArgs) {

		// If this is not an OpenMP host toolchain, we don't need to do anything.
		if (TC.getOffloadingKind() != ToolChain::OK_OpenMP_Host)
		return;

		// Gather the pairs (target triple)-(file name). The file names are at the
		// end of the input list, so we scan in reverse.
		SmallVector<std::pair<llvm::Triple, const char *>, 4> Targets;

		Arg *Tgts = Args.getLastArg(options::OPT_omptargets_EQ);
		assert(Tgts && Tgts->getNumValues() &&
		"OpenMP offloading has to have targets specified.");

		auto TriplesIt = Tgts->getValues().end();
		auto FileNamesIt = Inputs.end();
		for (unsigned i = 0; i < Tgts->getNumValues(); ++i) {
		--TriplesIt;
		--FileNamesIt;
		Targets.push_back(
		std::make_pair(llvm::Triple(*TriplesIt), FileNamesIt->getFilename()));
		}

		// Create temporary linker script
		StringRef Name = llvm::sys::path::filename(Output.getFilename());
		std::pair<StringRef, StringRef> Split = Name.rsplit('.');
		std::string TmpName = C.getDriver().GetTemporaryPath(Split.first, "lk");
		const char *LKS = C.addTempFile(C.getArgs().MakeArgString(TmpName.c_str()));

		// Open script file in order to write contents
		std::error_code EC;
		llvm::raw_fd_ostream Lksf(LKS, EC, llvm::sys::fs::F_None);

		if (EC) {
		C.getDriver().Diag(clang::diag::err_unable_to_make_temp) << EC.message();
		return;
		}

		// Add commands to embed target binaries. We ensure that each section and
		// image s 16-byte aligned. This is not mandatory, but increases the
		rsmith: s -> is
		sfantao: I'll fix it.
		// likelihood of data to be aligned with a cache block in several main host
		// machines.
		Lksf << "TARGET(binary)\n";
		for (unsigned i = 0; i < Targets.size(); ++i)
		Lksf << "INPUT(" << Targets[i].second << ")\n";

		Lksf << "SECTIONS\n";
		Lksf << "{\n";
		Lksf << " .omp_offloading :\n";
		Lksf << " ALIGN(0x10)\n";
		Lksf << " {\n";

		for (unsigned i = 0; i < Targets.size(); ++i) {
		std::string TgtName(Targets[i].first.getTriple());
		// std::replace(TgtName.begin(), TgtName.end(), '-', '_');
		Lksf << " . = ALIGN(0x10);\n";
		Lksf << " PROVIDE_HIDDEN(.omp_offloading.img_start." << TgtName
		<< " = .);\n";
		Lksf << " " << Targets[i].second << "\n";
		Lksf << " PROVIDE_HIDDEN(.omp_offloading.img_end." << TgtName
		<< " = .);\n";
		}

		Lksf << " }\n";
		// Add commands to define host entries begin and end
		Lksf << " .omp_offloading.entries :\n";
		Lksf << " ALIGN(0x10)\n";
		Lksf << " SUBALIGN(0x01)\n";
		Lksf << " {\n";
		Lksf << " PROVIDE_HIDDEN(.omp_offloading.entries_begin = .);\n";
		Lksf << " *(.omp_offloading.entries)\n";
		Lksf << " PROVIDE_HIDDEN(.omp_offloading.entries_end = .);\n";
		Lksf << " }\n";
		Lksf << "}\n";
		Lksf << "INSERT BEFORE .data\n";

		Lksf.close();

		CmdArgs.push_back("-T");
		CmdArgs.push_back(LKS);
		}
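
Note: the script that AddOpenMPLinkerScript writes can be reproduced in isolation to see its shape. The sketch below is an approximation under stated assumptions: it uses `std::ostringstream` rather than `llvm::raw_fd_ostream`, takes triples and file names as plain strings, and the names `TargetImage`, `pairTargetsWithImages` and `buildOffloadLinkerScript` are hypothetical. Indentation inside the generated script is illustrative only.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

using TargetImage = std::pair<std::string, std::string>; // (triple, image file)

// Pairs each triple with its device image. The images are the last inputs, so
// both lists are scanned from the end, as in the patch; the result is therefore
// in reverse order relative to the triples list.
std::vector<TargetImage>
pairTargetsWithImages(const std::vector<std::string> &Triples,
                      const std::vector<std::string> &Inputs) {
  assert(!Triples.empty() && Triples.size() <= Inputs.size() &&
         "OpenMP offloading has to have targets specified.");
  std::vector<TargetImage> Result;
  auto TripleIt = Triples.end();
  auto FileIt = Inputs.end();
  for (std::size_t i = 0; i < Triples.size(); ++i) {
    --TripleIt;
    --FileIt;
    Result.emplace_back(*TripleIt, *FileIt);
  }
  return Result;
}

// Builds the linker script text: one INPUT per image, an .omp_offloading
// section with start/end symbols per image, and an entries section.
std::string buildOffloadLinkerScript(const std::vector<TargetImage> &Targets) {
  std::ostringstream OS;
  OS << "TARGET(binary)\n";
  for (const TargetImage &T : Targets)
    OS << "INPUT(" << T.second << ")\n";

  OS << "SECTIONS\n{\n  .omp_offloading :\n  ALIGN(0x10)\n  {\n";
  for (const TargetImage &T : Targets) {
    OS << "    . = ALIGN(0x10);\n";
    OS << "    PROVIDE_HIDDEN(.omp_offloading.img_start." << T.first
       << " = .);\n";
    OS << "    " << T.second << "\n";
    OS << "    PROVIDE_HIDDEN(.omp_offloading.img_end." << T.first
       << " = .);\n";
  }
  OS << "  }\n";

  OS << "  .omp_offloading.entries :\n  ALIGN(0x10)\n  SUBALIGN(0x01)\n  {\n";
  OS << "    PROVIDE_HIDDEN(.omp_offloading.entries_begin = .);\n";
  OS << "    *(.omp_offloading.entries)\n";
  OS << "    PROVIDE_HIDDEN(.omp_offloading.entries_end = .);\n";
  OS << "  }\n}\n";
  OS << "INSERT BEFORE .data\n";
  return OS.str();
}
```

The driver then passes the temporary script to the linker with `-T`, so the host link embeds each device image between its `img_start` and `img_end` symbols.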

/// \brief Determine whether Objective-C automated reference counting is		/// \brief Determine whether Objective-C automated reference counting is
/// enabled.		/// enabled.
static bool isObjCAutoRefCount(const ArgList &Args) {		static bool isObjCAutoRefCount(const ArgList &Args) {
return Args.hasFlag(options::OPT_fobjc_arc, options::OPT_fno_objc_arc, false);		return Args.hasFlag(options::OPT_fobjc_arc, options::OPT_fno_objc_arc, false);
}		}

/// \brief Determine whether we are linking the ObjC runtime.		/// \brief Determine whether we are linking the ObjC runtime.
static bool isObjCRuntimeLinked(const ArgList &Args) {		static bool isObjCRuntimeLinked(const ArgList &Args) {
▲ Show 20 Lines • Show All 3,023 Lines • ▼ Show 20 Lines	bool IsWindowsCygnus =
getToolChain().getTriple().isWindowsCygwinEnvironment();		getToolChain().getTriple().isWindowsCygwinEnvironment();
bool IsWindowsMSVC = getToolChain().getTriple().isWindowsMSVCEnvironment();		bool IsWindowsMSVC = getToolChain().getTriple().isWindowsMSVCEnvironment();
bool IsPS4CPU = getToolChain().getTriple().isPS4CPU();		bool IsPS4CPU = getToolChain().getTriple().isPS4CPU();

// Check number of inputs for sanity. We need at least one input.		// Check number of inputs for sanity. We need at least one input.
assert(Inputs.size() >= 1 && "Must have at least one input.");		assert(Inputs.size() >= 1 && "Must have at least one input.");
const InputInfo &Input = Inputs[0];		const InputInfo &Input = Inputs[0];
// CUDA compilation may have multiple inputs (source file + results of		// CUDA compilation may have multiple inputs (source file + results of
// device-side compilations). All other jobs are expected to have exactly one		// device-side compilations). OpenMP offloading device compile jobs also take
// input.		// the host IR as an extra input. All other jobs are expected to have exactly
		// one input.
bool IsCuda = types::isCuda(Input.getType());		bool IsCuda = types::isCuda(Input.getType());
assert((IsCuda || Inputs.size() == 1) && "Unable to handle multiple inputs.");		bool IsOpenMPDeviceCompileJob =
		isa<CompileJobAction>(JA) &&
		getToolChain().getOffloadingKind() == ToolChain::OK_OpenMP_Device;
		assert((IsCuda || (IsOpenMPDeviceCompileJob && Inputs.size() == 2) ||
		Inputs.size() == 1) &&
		"Unable to handle multiple inputs.");

// Invoke ourselves in -cc1 mode.		// Invoke ourselves in -cc1 mode.
//		//
// FIXME: Implement custom jobs for internal actions.		// FIXME: Implement custom jobs for internal actions.
CmdArgs.push_back("-cc1");		CmdArgs.push_back("-cc1");

// Add the "effective" target triple.		// Add the "effective" target triple.
CmdArgs.push_back("-triple");		CmdArgs.push_back("-triple");
▲ Show 20 Lines • Show All 2,025 Lines • ▼ Show 20 Lines	#endif
// Host-side cuda compilation receives device-side outputs as Inputs[1...].		// Host-side cuda compilation receives device-side outputs as Inputs[1...].
// Include them with -fcuda-include-gpubinary.		// Include them with -fcuda-include-gpubinary.
if (IsCuda && Inputs.size() > 1)		if (IsCuda && Inputs.size() > 1)
for (auto I = std::next(Inputs.begin()), E = Inputs.end(); I != E; ++I) {		for (auto I = std::next(Inputs.begin()), E = Inputs.end(); I != E; ++I) {
CmdArgs.push_back("-fcuda-include-gpubinary");		CmdArgs.push_back("-fcuda-include-gpubinary");
CmdArgs.push_back(I->getFilename());		CmdArgs.push_back(I->getFilename());
}		}

		// OpenMP offloading device jobs take the argument -omp-host-ir-file-path
		// to specify the result of the compile phase on the host, so the meaningful
		// device declarations can be identified. Also, -fopenmp-is-device is passed
		// along to tell the frontend that it is generating code for a device, so that
		// only the relevant declarations are emitted.
		if (IsOpenMPDeviceCompileJob) {
		CmdArgs.push_back("-fopenmp-is-device");
		CmdArgs.push_back("-omp-host-ir-file-path");
		CmdArgs.push_back(Args.MakeArgString(Inputs.back().getFilename()));
		}

		// For all the host OpenMP offloading compile jobs we need to pass the targets
		// information using -omptargets= option.
		if (isa<CompileJobAction>(JA) &&
		getToolChain().getOffloadingKind() == ToolChain::OK_OpenMP_Host) {
		SmallString<128> TargetInfo("-omptargets=");

		Arg *Tgts = Args.getLastArg(options::OPT_omptargets_EQ);
		assert(Tgts && Tgts->getNumValues() &&
		"OpenMP offloading has to have targets specified.");
		for (unsigned i = 0; i < Tgts->getNumValues(); ++i) {
		if (i)
		TargetInfo += ',';
		// We need to get the string from the triple because it may be not exactly
		// the same as the one we get directly from the arguments.
		llvm::Triple T(Tgts->getValue(i));
		TargetInfo += T.getTriple();
		}
		CmdArgs.push_back(Args.MakeArgString(TargetInfo.str()));
		}

// Finally add the compile command to the compilation.		// Finally add the compile command to the compilation.
if (Args.hasArg(options::OPT__SLASH_fallback) &&		if (Args.hasArg(options::OPT__SLASH_fallback) &&
Output.getType() == types::TY_Object &&		Output.getType() == types::TY_Object &&
(InputType == types::TY_C || InputType == types::TY_CXX)) {		(InputType == types::TY_C || InputType == types::TY_CXX)) {
auto CLCommand =		auto CLCommand =
getCLFallback()->GetCommand(C, JA, Output, Inputs, Args, LinkingOutput);		getCLFallback()->GetCommand(C, JA, Output, Inputs, Args, LinkingOutput);
C.addCommand(llvm::make_unique<FallbackCommand>(		C.addCommand(llvm::make_unique<FallbackCommand>(
JA, *this, Exec, CmdArgs, Inputs, std::move(CLCommand)));		JA, *this, Exec, CmdArgs, Inputs, std::move(CLCommand)));
▲ Show 20 Lines • Show All 529 Lines • ▼ Show 20 Lines	void ClangAs::ConstructJob(Compilation &C, const JobAction &JA,
// creating an object.		// creating an object.
// TODO: Currently only works on linux with newer objcopy.		// TODO: Currently only works on linux with newer objcopy.
if (Args.hasArg(options::OPT_gsplit_dwarf) &&		if (Args.hasArg(options::OPT_gsplit_dwarf) &&
getToolChain().getTriple().isOSLinux())		getToolChain().getTriple().isOSLinux())
SplitDebugInfo(getToolChain(), C, *this, JA, Args, Output,		SplitDebugInfo(getToolChain(), C, *this, JA, Args, Output,
SplitDebugName(Args, Input));		SplitDebugName(Args, Input));
}		}

		void OffloadBundler::ConstructJob(Compilation &C, const JobAction &JA,
		const InputInfo &Output,
		const InputInfoList &Inputs,
		const llvm::opt::ArgList &TCArgs,
		const char *LinkingOutput) const {

		// The (un)bundling command looks like this:
		// clang-offload-bundler -type=bc
		echristo: Should we get the offload bundler in first so that the interface is there and testable? (Honest question, no particular opinion here). Though the command lines there will affect how this code is written.
		sfantao: Yes, sure, I proposed an implementation of the bundler, using a generic format, in http://reviews.llvm.org/D13909. Let me know any comments you have about that specific component. I still need to add testing specific to http://reviews.llvm.org/D13909, which I haven't done yet because I didn't know where it was supposed to live - maybe in the Driver? Do you have an opinion about that? Also, in http://lists.llvm.org/pipermail/cfe-dev/2016-February/047547.html, the general opinion was that the bundler should use the host object format to bundle whenever possible. So, I also have to add a default behavior for the binary bundler when the input is an object file. For the other input types, I don't think there were any strong opinions. Do you happen to have one? In any case, I was planning to add the object-file-specific bundling in a separate patch, which seems to me a natural way to partition the bundler functionality. Does that sound like a good plan?
		// -omptargets=host-triple,device-triple1,device-triple2
		// -inputs=input_file
		// -outputs=unbundle_file_host,unbundle_file_tgt1,unbundle_file_tgt2
		// (-unbundle)

		auto BundledFile = Output;
		auto UnbundledFiles = Inputs;

		bool IsUnbundle = isa<OffloadUnbundlingJobAction>(JA);

		ArgStringList CmdArgs;

		// Get the type.
		CmdArgs.push_back(TCArgs.MakeArgString(
		Twine("-type=") + types::getTypeTempSuffix(BundledFile.getType())));

		// Get the triples. The order is the same as in the omptargets option.
		{
		SmallString<128> Triples;
		Triples += "-targets=offload-host-";
		Triples += getToolChain().getTripleString();

		Arg *TargetsArg = TCArgs.getLastArg(options::OPT_omptargets_EQ);
		for (auto *A : TargetsArg->getValues()) {
		// We have to use the string that exactly matches the triple here.
		llvm::Triple T(A);
		Triples += ",offload-device-";
		Triples += T.getTriple();
		}
		CmdArgs.push_back(TCArgs.MakeArgString(Triples));
		}

		// Get bundled file command.
		CmdArgs.push_back(
		TCArgs.MakeArgString(Twine(IsUnbundle ? "-inputs=" : "-outputs=") +
		BundledFile.getFilename()));

		// Get unbundled files command.
		{
		SmallString<128> UB(IsUnbundle ? "-outputs=" : "-inputs=");
		for (unsigned i = 0; i < UnbundledFiles.size(); ++i) {
		if (i)
		UB += ',';
		UB += UnbundledFiles[i].getFilename();
		}
		CmdArgs.push_back(TCArgs.MakeArgString(UB));
		}

		if (IsUnbundle)
		CmdArgs.push_back("-unbundle");

		// All the inputs are encoded as commands.
		C.addCommand(llvm::make_unique<Command>(
		JA, *this,
		TCArgs.MakeArgString(getToolChain().GetProgramPath(getShortName())),
		CmdArgs, None));
		}
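
Note: the argument list that OffloadBundler::ConstructJob assembles can be sketched without the driver classes. This is a simplified illustration, assuming plain strings for the type suffix, triples and file names; `buildBundlerArgs` and its parameter names are hypothetical. The flag spellings (`-type=`, `-targets=offload-host-...`, `-inputs=`, `-outputs=`, `-unbundle`) follow the code above.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Builds the clang-offload-bundler arguments. When unbundling, the bundled
// file is the input and the per-toolchain files are the outputs; when
// bundling, the roles are swapped.
std::vector<std::string>
buildBundlerArgs(const std::string &TypeSuffix, const std::string &HostTriple,
                 const std::vector<std::string> &DeviceTriples,
                 const std::string &BundledFile,
                 const std::vector<std::string> &UnbundledFiles,
                 bool IsUnbundle) {
  std::vector<std::string> Args;
  Args.push_back("-type=" + TypeSuffix);

  // Host triple first, then one entry per device in the order given.
  std::string Targets = "-targets=offload-host-" + HostTriple;
  for (const std::string &D : DeviceTriples)
    Targets += ",offload-device-" + D;
  Args.push_back(Targets);

  Args.push_back(std::string(IsUnbundle ? "-inputs=" : "-outputs=") +
                 BundledFile);

  // Comma-separated list of the per-toolchain files.
  std::string UB = IsUnbundle ? "-outputs=" : "-inputs=";
  for (std::size_t i = 0; i < UnbundledFiles.size(); ++i) {
    if (i)
      UB += ',';
    UB += UnbundledFiles[i];
  }
  Args.push_back(UB);

  if (IsUnbundle)
    Args.push_back("-unbundle");
  return Args;
}
```

Called with type `s`, host `powerpc64le--linux`, and devices `powerpc64le-ibm-linux-gnu` and `x86_64-pc-linux-gnu` in unbundle mode, this yields the same argument shape that the CHK-COMMANDS-SEP checks in test/OpenMP/target_driver.c expect.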

void GnuTool::anchor() {}		void GnuTool::anchor() {}

void gcc::Common::ConstructJob(Compilation &C, const JobAction &JA,		void gcc::Common::ConstructJob(Compilation &C, const JobAction &JA,
const InputInfo &Output,		const InputInfo &Output,
const InputInfoList &Inputs, const ArgList &Args,		const InputInfoList &Inputs, const ArgList &Args,
const char *LinkingOutput) const {		const char *LinkingOutput) const {
const Driver &D = getToolChain().getDriver();		const Driver &D = getToolChain().getDriver();
ArgStringList CmdArgs;		ArgStringList CmdArgs;
▲ Show 20 Lines • Show All 2,762 Lines • ▼ Show 20 Lines	if (!Args.hasArg(options::OPT_nodefaultlibs)) {
break;		break;
case OMPRT_IOMP5:		case OMPRT_IOMP5:
CmdArgs.push_back("-liomp5");		CmdArgs.push_back("-liomp5");
break;		break;
case OMPRT_Unknown:		case OMPRT_Unknown:
// Already diagnosed.		// Already diagnosed.
break;		break;
}		}
		if (getToolChain().getOffloadingKind() == ToolChain::OK_OpenMP_Host)
		CmdArgs.push_back("-lomptarget");
}		}

AddRunTimeLibs(ToolChain, D, CmdArgs, Args);		AddRunTimeLibs(ToolChain, D, CmdArgs, Args);

if (WantPthread && !isAndroid)		if (WantPthread && !isAndroid)
CmdArgs.push_back("-lpthread");		CmdArgs.push_back("-lpthread");

CmdArgs.push_back("-lc");		CmdArgs.push_back("-lc");
Show All 16 Lines	if (!Args.hasArg(options::OPT_nostartfiles)) {
if (HasCRTBeginEndFiles)		if (HasCRTBeginEndFiles)
CmdArgs.push_back(Args.MakeArgString(ToolChain.GetFilePath(crtend)));		CmdArgs.push_back(Args.MakeArgString(ToolChain.GetFilePath(crtend)));
if (!isAndroid)		if (!isAndroid)
CmdArgs.push_back(Args.MakeArgString(ToolChain.GetFilePath("crtn.o")));		CmdArgs.push_back(Args.MakeArgString(ToolChain.GetFilePath("crtn.o")));
}		}
} else if (Args.hasArg(options::OPT_rtlib_EQ))		} else if (Args.hasArg(options::OPT_rtlib_EQ))
AddRunTimeLibs(ToolChain, D, CmdArgs, Args);		AddRunTimeLibs(ToolChain, D, CmdArgs, Args);

		// Add OpenMP offloading linker script args if required.
		AddOpenMPLinkerScript(getToolChain(), C, Output, Inputs, Args, CmdArgs);

C.addCommand(llvm::make_unique<Command>(JA, *this, Exec, CmdArgs, Inputs));		C.addCommand(llvm::make_unique<Command>(JA, *this, Exec, CmdArgs, Inputs));
}		}

// NaCl ARM assembly (inline or standalone) can be written with a set of macros		// NaCl ARM assembly (inline or standalone) can be written with a set of macros
// for the various SFI requirements like register masking. The assembly tool		// for the various SFI requirements like register masking. The assembly tool
// inserts the file containing the macros as an input into all the assembly		// inserts the file containing the macros as an input into all the assembly
// jobs.		// jobs.
void nacltools::AssemblerARM::ConstructJob(Compilation &C, const JobAction &JA,		void nacltools::AssemblerARM::ConstructJob(Compilation &C, const JobAction &JA,

lib/Driver/Types.cpp

Show First 20 Lines • Show All 134 Lines • ▼ Show 20 Lines	bool types::isCuda(ID Id) {

case TY_CUDA:		case TY_CUDA:
case TY_PP_CUDA:		case TY_PP_CUDA:
case TY_CUDA_DEVICE:		case TY_CUDA_DEVICE:
return true;		return true;
}		}
}		}

		bool types::isSrcFile(ID Id) {
		return Id != TY_Object && getPreprocessedType(Id) != TY_INVALID;
		}

types::ID types::lookupTypeForExtension(const char *Ext) {		types::ID types::lookupTypeForExtension(const char *Ext) {
return llvm::StringSwitch<types::ID>(Ext)		return llvm::StringSwitch<types::ID>(Ext)
.Case("c", TY_C)		.Case("c", TY_C)
.Case("i", TY_PP_C)		.Case("i", TY_PP_C)
.Case("m", TY_ObjC)		.Case("m", TY_ObjC)
.Case("M", TY_ObjCXX)		.Case("M", TY_ObjCXX)
.Case("h", TY_CHeader)		.Case("h", TY_CHeader)
.Case("C", TY_CXX)		.Case("C", TY_CXX)

test/OpenMP/target_driver.c

This file was added.

				///
				/// Perform several driver tests for OpenMP offloading
				///

				/// ###########################################################################

				/// Check whether an invalid OpenMP target is specified:
				// RUN: %clang -### -fopenmp=libomp -omptargets=aaa-bbb-ccc-ddd %s 2>&1 \
				// RUN: | FileCheck -check-prefix=CHK-INVALID-TARGET %s
				// CHK-INVALID-TARGET: error: OpenMP target is invalid: 'aaa-bbb-ccc-ddd'

				/// ###########################################################################

				/// Check warning for empty -omptargets
				// RUN: %clang -### -fopenmp=libomp -omptargets= %s 2>&1 \
				// RUN: | FileCheck -check-prefix=CHK-EMPTY-OMPTARGETS %s
				// CHK-EMPTY-OMPTARGETS: warning: joined argument expects additional value: '-omptargets='

				/// ###########################################################################

				/// Check the phases graph when using a single target, different from the host.
				/// The actions should be exactly the same as if no offloading were being used.
				// RUN: %clang -ccc-print-phases -fopenmp=libomp -target powerpc64-ibm-linux-gnu -omptargets=x86_64-pc-linux-gnu %s 2>&1 \
				// RUN: | FileCheck -check-prefix=CHK-PHASES %s

				// CHK-PHASES-DAG: {{.*}}: linker, {[[A0:[0-9]+]]}, image
				// CHK-PHASES-DAG: [[A0]]: assembler, {[[A1:[0-9]+]]}, object
				// CHK-PHASES-DAG: [[A1]]: backend, {[[A2:[0-9]+]]}, assembler
				// CHK-PHASES-DAG: [[A2]]: compiler, {[[A3:[0-9]+]]}, ir
				// CHK-PHASES-DAG: [[A3]]: preprocessor, {[[I:[0-9]+]]}, cpp-output
				// CHK-PHASES-DAG: [[I]]: input, {{.*}}, c

				/// ###########################################################################

				/// Check the phases when using multiple targets. Again, the actions are the
				/// same as if no offloading was being used. Here we also add a library to make
				/// sure it is not treated as input.
				// RUN: %clang -ccc-print-phases -lm -fopenmp=libomp -target powerpc64-ibm-linux-gnu -omptargets=x86_64-pc-linux-gnu,powerpc64-ibm-linux-gnu %s 2>&1 \
				// RUN: | FileCheck -check-prefix=CHK-PHASES-LIB %s

				// CHK-PHASES-LIB-DAG: {{.*}}: linker, {[[L0:[0-9]+]], [[A0:[0-9]+]]}, image
				// CHK-PHASES-LIB-DAG: [[A0]]: assembler, {[[A1:[0-9]+]]}, object
				// CHK-PHASES-LIB-DAG: [[A1]]: backend, {[[A2:[0-9]+]]}, assembler
				// CHK-PHASES-LIB-DAG: [[A2]]: compiler, {[[A3:[0-9]+]]}, ir
				// CHK-PHASES-LIB-DAG: [[A3]]: preprocessor, {[[I:[0-9]+]]}, cpp-output
				// CHK-PHASES-LIB-DAG: [[I]]: input, {{.*}}, c
				// CHK-PHASES-LIB-DAG: [[L0]]: input, "m", object
				echristo: Do we really think the phases should be a DAG check?
				sfantao: Using a DAG seemed to me a robust way to test that. I'd have to double check, but several map containers are used for the inputs and actions, so the order may depend on the implementation of the container. I was just trying to use a safe way to test. Do you prefer to change this to the exact sequence I am getting?

				/// ###########################################################################

				/// Check the phases when using multiple targets and passing an object file as
				/// input. An unbundling action has to be created.
				// RUN: echo 'bla' > %t.o
				// RUN: %clang -ccc-print-phases -lm -fopenmp=libomp -target powerpc64-ibm-linux-gnu -omptargets=x86_64-pc-linux-gnu,powerpc64-ibm-linux-gnu %s %t.o 2>&1 \
				echristo: How do you pass options to individual omptargets? e.g. -mvsx or -mavx2?
				sfantao: Well, currently I don't. In http://lists.llvm.org/pipermail/cfe-dev/2016-February/047547.html I was proposing something to tackle that, but the opinion was that it was somewhat secondary and the driver design should be settled first. What I was proposing was some sort of group option associated with the device triple. The idea was to avoid proliferation of device-specific options and reuse what we already have, just organize it in groups so that it could be forwarded to the right tool chain. The goal was to make things like this possible: clang -mcpu=pwr8 -target-offload=nvptx64-nvidia-cuda -fopenmp -mcpu=sm_35 -target-offload=nvptx64-nvidia-cuda -fcuda -mcpu=sm_32 a.c ... where mcpu is used to specify the cpu/gpu for the different tool chains and programming models. This would also be useful to specify include and library paths that only make sense to the device. Do you have any opinion about that?
				// RUN: | FileCheck -check-prefix=CHK-PHASES-OBJ %s

				// CHK-PHASES-OBJ-DAG: {{.*}}: linker, {[[L0:[0-9]+]], [[A0:[0-9]+]], [[B0:[0-9]+]]}, image
				// CHK-PHASES-OBJ-DAG: [[A0]]: assembler, {[[A1:[0-9]+]]}, object
				// CHK-PHASES-OBJ-DAG: [[A1]]: backend, {[[A2:[0-9]+]]}, assembler
				// CHK-PHASES-OBJ-DAG: [[A2]]: compiler, {[[A3:[0-9]+]]}, ir
				// CHK-PHASES-OBJ-DAG: [[A3]]: preprocessor, {[[I:[0-9]+]]}, cpp-output
				// CHK-PHASES-OBJ-DAG: [[I]]: input, {{.*}}, c
				// CHK-PHASES-OBJ-DAG: [[L0]]: input, "m", object
				// CHK-PHASES-OBJ-DAG: [[B0]]: clang-offload-unbundler, {[[B1:[0-9]+]]}, object
				// CHK-PHASES-OBJ-DAG: [[B1]]: input, "{{.*}}.o", object

				/// ###########################################################################

				/// Check the phases when using multiple targets and separate compilation.
				// RUN: echo 'bla' > %t.s
				// RUN: %clang -ccc-print-phases -c -lm -fopenmp=libomp -target powerpc64-ibm-linux-gnu -omptargets=x86_64-pc-linux-gnu,powerpc64-ibm-linux-gnu %t.s -x cpp-output %s 2>&1 \
				// RUN: | FileCheck -check-prefix=CHK-PHASES-SEP %s

				// CHK-PHASES-SEP-DAG: [[A:[0-9]+]]: input, "{{.*}}.c", cpp-output
				// CHK-PHASES-SEP-DAG: [[A1:[0-9]+]]: clang-offload-unbundler, {[[A]]}, cpp-output
				// CHK-PHASES-SEP-DAG: [[A2:[0-9]+]]: compiler, {[[A1]]}, ir
				// CHK-PHASES-SEP-DAG: [[A3:[0-9]+]]: backend, {[[A2]]}, assembler
				// CHK-PHASES-SEP-DAG: [[A4:[0-9]+]]: assembler, {[[A3]]}, object
				// CHK-PHASES-SEP-DAG: {{.*}}: clang-offload-bundler, {[[A4]]}, object

				// CHK-PHASES-SEP-DAG: [[B:[0-9]+]]: input, "{{.*}}.s", assembler
				// CHK-PHASES-SEP-DAG: [[B1:[0-9]+]]: clang-offload-unbundler, {[[B]]}, assembler
				// CHK-PHASES-SEP-DAG: [[B2:[0-9]+]]: assembler, {[[B1]]}, object
				// CHK-PHASES-SEP-DAG: {{.*}}: clang-offload-bundler, {[[B2]]}, object

				/// ###########################################################################

				/// Check the commands passed to each tool when using valid OpenMP targets.
				/// Here we also check that offloading does not break the use of integrated
				/// assembler. It does however preclude the use of integrated preprocessor as
				/// host IR is shared by all the compile phases. There are also two offloading
				/// specific commands:
				/// -fopenmp-is-device: will tell the frontend that it will generate code for a
				/// target.
				/// -omp-host-ir-file-path: specifies the host IR file that can be loaded by
				/// the target code generation to gather information about which declarations
				/// really need to be emitted.
				///
				// RUN: %clang -### -fopenmp=libomp -target powerpc64le-linux -omptargets=powerpc64le-ibm-linux-gnu,x86_64-pc-linux-gnu %s 2>&1 \
				// RUN: | FileCheck -check-prefix=CHK-COMMANDS %s
				// RUN: %clang -### -fopenmp=libomp -target powerpc64le-linux -omptargets=powerpc64le-ibm-linux-gnu,x86_64-pc-linux-gnu %s -save-temps 2>&1 \
				// RUN: | FileCheck -check-prefix=CHK-COMMANDS-ST %s
				//

				// Final linking - host (ppc64le)
				// CHK-COMMANDS-DAG: ld" {{.*}}"-m" "elf64lppc" {{.*}}"-o" "a.out" {{.*}}"[[HSTOBJ:.+]].o" "-lomp" "-lomptarget" {{.*}}"-T" "[[LKSCRIPT:.+]].lk"
				// CHK-COMMANDS-ST-DAG: ld" {{.*}}"-m" "elf64lppc" {{.*}}"-o" "a.out" {{.*}}"[[HSTOBJ:.+]].o" "-lomp" "-lomptarget" {{.*}}"-T" "[[LKSCRIPT:.+]].lk"

				// Target 2 commands (x86_64)
				// CHK-COMMANDS-DAG: ld" {{.*}}"-m" "elf_x86_64" {{.*}}"-shared" {{.*}}"-o" "[[T2LIB:.+]]" {{.*}}"[[T2OBJ:.+]].o" {{.*}}"-lomp"
				// CHK-COMMANDS-DAG: clang{{.*}}" "-cc1" "-triple" "x86_64-pc-linux-gnu" "-emit-obj" {{.*}}"-fopenmp" {{.*}}"-o" "[[T2OBJ]].o" "-x" "ir" "[[T2BC:.+]].bc"
				// CHK-COMMANDS-DAG: clang{{.*}}" "-cc1" "-triple" "x86_64-pc-linux-gnu" "-emit-llvm-bc" {{.*}}"-fopenmp" {{.*}}"-o" "[[T2BC]].bc" "-x" "c" "[[SRC:.+]].c" "-fopenmp-is-device" "-omp-host-ir-file-path" "[[HSTBC:.+]].bc"

				// CHK-COMMANDS-ST-DAG: ld" {{.*}}"-m" "elf_x86_64" {{.*}}"-shared" {{.*}}"-o" "[[T2LIB:.+]]" {{.*}}"[[T2OBJ:.+]].o" {{.*}}"-lomp"
				// CHK-COMMANDS-ST-DAG: clang{{.*}}" "-cc1as" "-triple" "x86_64-pc-linux-gnu" "-filetype" "obj" {{.*}}"-o" "[[T2OBJ]].o" "[[T2ASM:.+]].s"
				// CHK-COMMANDS-ST-DAG: clang{{.*}}" "-cc1" "-triple" "x86_64-pc-linux-gnu" "-S" {{.*}}"-fopenmp" {{.*}}"-o" "[[T2ASM]].s" "-x" "ir" "[[T2BC:.+]].bc"
				// CHK-COMMANDS-ST-DAG: clang{{.*}}" "-cc1" "-triple" "x86_64-pc-linux-gnu" "-emit-llvm-bc" {{.*}}"-fopenmp" {{.*}}"-o" "[[T2BC]].bc" "-x" "cpp-output" "[[T2PP:.+]].i" "-fopenmp-is-device" "-omp-host-ir-file-path" "[[HSTBC:.+]].bc"
				// CHK-COMMANDS-ST-DAG: clang{{.*}}" "-cc1" "-triple" "x86_64-pc-linux-gnu" "-E" {{.*}}"-fopenmp" {{.*}}"-o" "[[T2PP]].i" "-x" "c" "[[SRC:.+]].c"

				// Target 1 commands (ppc64le)
				// CHK-COMMANDS-DAG: ld" {{.*}}"-m" "elf64lppc" {{.*}}"-shared" {{.*}}"-o" "[[T1LIB:.+]]" {{.*}}"[[T1OBJ:.+]].o" {{.*}}"-lomp"
				// CHK-COMMANDS-DAG: clang{{.*}}" "-cc1" "-triple" "powerpc64le-ibm-linux-gnu" "-emit-obj" {{.*}}"-fopenmp" {{.*}}"-o" "[[T1OBJ]].o" "-x" "ir" "[[T1BC:.+]].bc"
				// CHK-COMMANDS-DAG: clang{{.*}}" "-cc1" "-triple" "powerpc64le-ibm-linux-gnu" "-emit-llvm-bc" {{.*}}"-fopenmp" {{.*}}"-o" "[[T1BC]].bc" "-x" "c" "[[SRC]].c" "-fopenmp-is-device" "-omp-host-ir-file-path" "[[HSTBC]].bc"

				// CHK-COMMANDS-ST-DAG: ld" {{.*}}"-m" "elf64lppc" {{.*}}"-shared" {{.*}}"-o" "[[T1LIB:.+]]" {{.*}}"[[T1OBJ:.+]].o" {{.*}}"-lomp"
				// CHK-COMMANDS-ST-DAG: clang{{.*}}" "-cc1as" "-triple" "powerpc64le-ibm-linux-gnu" "-filetype" "obj" {{.*}}"-o" "[[T1OBJ]].o" "[[T1ASM:.+]].s"
				// CHK-COMMANDS-ST-DAG: clang{{.*}}" "-cc1" "-triple" "powerpc64le-ibm-linux-gnu" "-S" {{.*}}"-fopenmp" {{.*}}"-o" "[[T1ASM]].s" "-x" "ir" "[[T1BC:.+]].bc"
				// CHK-COMMANDS-ST-DAG: clang{{.*}}" "-cc1" "-triple" "powerpc64le-ibm-linux-gnu" "-emit-llvm-bc" {{.*}}"-fopenmp" {{.*}}"-o" "[[T1BC]].bc" "-x" "cpp-output" "[[T1PP:.+]].i" "-fopenmp-is-device" "-omp-host-ir-file-path" "[[HSTBC]].bc"
				// CHK-COMMANDS-ST-DAG: clang{{.*}}" "-cc1" "-triple" "powerpc64le-ibm-linux-gnu" "-E" {{.*}}"-fopenmp" {{.*}}"-o" "[[T1PP]].i" "-x" "c" "[[SRC]].c"

				// Host object generation
				// CHK-COMMANDS-DAG: clang{{.*}}" "-cc1" "-triple" "powerpc64le--linux" "-emit-obj" {{.*}}"-fopenmp" {{.*}}"-o" "[[HSTOBJ]].o" "-x" "ir" "[[HSTBC]].bc"
				// CHK-COMMANDS-DAG: clang{{.*}}" "-cc1" "-triple" "powerpc64le--linux" "-emit-llvm-bc"{{.*}}"-fopenmp" {{.*}}"-o" "[[HSTBC]].bc" "-x" "c" "[[SRC]].c" "-omptargets=powerpc64le-ibm-linux-gnu,x86_64-pc-linux-gnu"

				// CHK-COMMANDS-ST-DAG: clang{{.*}}" "-cc1as" "-triple" "powerpc64le--linux" "-filetype" "obj" {{.*}}"-o" "[[HSTOBJ]].o" "[[HSTASM:.+]].s"
				// CHK-COMMANDS-ST-DAG: clang{{.*}}" "-cc1" "-triple" "powerpc64le--linux" "-S"{{.*}}"-fopenmp" {{.*}}"-o" "[[HSTASM]].s" "-x" "ir" "[[HSTBC:.+]].bc"
				// CHK-COMMANDS-ST-DAG: clang{{.*}}" "-cc1" "-triple" "powerpc64le--linux" "-emit-llvm-bc"{{.*}}"-fopenmp" {{.*}}"-o" "[[HSTBC]].bc" "-x" "cpp-output" "[[HSTPP:.+]].i" "-omptargets=powerpc64le-ibm-linux-gnu,x86_64-pc-linux-gnu"
				// CHK-COMMANDS-ST-DAG: clang{{.*}}" "-cc1" "-triple" "powerpc64le--linux" "-E"{{.*}}"-fopenmp" {{.*}}"-o" "[[HSTPP]].i" "-x" "c" "[[SRC]].c"

				/// ###########################################################################

				/// Check separate compilation
				///
				// RUN: echo 'bla' > %t.s
				// RUN: %clang -### -fopenmp=libomp -c -target powerpc64le-linux -omptargets=powerpc64le-ibm-linux-gnu,x86_64-pc-linux-gnu %t.s -x cpp-output %s 2>&1 \
				// RUN: | FileCheck -check-prefix=CHK-COMMANDS-SEP %s
				// RUN: %clang -### -fopenmp=libomp -c -target powerpc64le-linux -omptargets=powerpc64le-ibm-linux-gnu,x86_64-pc-linux-gnu %t.s -x cpp-output %s -save-temps 2>&1 \
				// RUN: | FileCheck -check-prefix=CHK-COMMANDS-SEP-ST %s
				//

				// Unbundle the input files.
				// CHK-COMMANDS-SEP-DAG: clang-offload-bundler{{.*}}" "-type=s" "-targets=offload-host-powerpc64le--linux,offload-device-powerpc64le-ibm-linux-gnu,offload-device-x86_64-pc-linux-gnu" "-inputs=[[AAASM:.+]].s" "-outputs=[[AAHASM:.+]].s,[[AAT1ASM:.+]].s,[[AAT2ASM:.+]].s" "-unbundle"
				// CHK-COMMANDS-SEP-DAG: clang-offload-bundler{{.*}}" "-type=i" "-targets=offload-host-powerpc64le--linux,offload-device-powerpc64le-ibm-linux-gnu,offload-device-x86_64-pc-linux-gnu" "-inputs=[[BBPP:.+]].c" "-outputs=[[BBHPP:.+]].i,[[BBT1PP:.+]].i,[[BBT2PP:.+]].i" "-unbundle"

				// CHK-COMMANDS-SEP-ST-DAG: clang-offload-bundler{{.*}}" "-type=s" "-targets=offload-host-powerpc64le--linux,offload-device-powerpc64le-ibm-linux-gnu,offload-device-x86_64-pc-linux-gnu" "-inputs=[[AAASM:.+]].s" "-outputs=[[AAHASM:.+]].s,[[AAT1ASM:.+]].s,[[AAT2ASM:.+]].s" "-unbundle"
				// CHK-COMMANDS-SEP-ST-DAG: clang-offload-bundler{{.*}}" "-type=i" "-targets=offload-host-powerpc64le--linux,offload-device-powerpc64le-ibm-linux-gnu,offload-device-x86_64-pc-linux-gnu" "-inputs=[[BBPP:.+]].c" "-outputs=[[BBHPP:.+]].i,[[BBT1PP:.+]].i,[[BBT2PP:.+]].i" "-unbundle"

				// Create 1st bundle.
				// CHK-COMMANDS-SEP-DAG: clang{{.*}}" "-cc1as" "-triple" "powerpc64le--linux" "-filetype" "obj" {{.*}}"-o" "[[AAHOBJ:.+]].o" "[[AAHASM]].s"
				// CHK-COMMANDS-SEP-DAG: clang{{.*}}" "-cc1as" "-triple" "powerpc64le-ibm-linux-gnu" "-filetype" "obj" {{.*}}"-o" "[[AAT1OBJ:.+]].o" "[[AAT1ASM]].s"
				// CHK-COMMANDS-SEP-DAG: clang{{.*}}" "-cc1as" "-triple" "x86_64-pc-linux-gnu" "-filetype" "obj" {{.*}}"-o" "[[AAT2OBJ:.+]].o" "[[AAT2ASM]].s"
				// CHK-COMMANDS-SEP-DAG: clang-offload-bundler{{.*}}" "-type=o" "-targets=offload-host-powerpc64le--linux,offload-device-powerpc64le-ibm-linux-gnu,offload-device-x86_64-pc-linux-gnu" "-outputs=[[AAOBJ:.+]].o" "-inputs=[[AAHOBJ]].o,[[AAT1OBJ]].o,[[AAT2OBJ]].o"

				// CHK-COMMANDS-SEP-ST-DAG: clang{{.*}}" "-cc1as" "-triple" "powerpc64le--linux" "-filetype" "obj" {{.*}}"-o" "[[AAHOBJ:.+]].o" "[[AAHASM]].s"
				// CHK-COMMANDS-SEP-ST-DAG: clang{{.*}}" "-cc1as" "-triple" "powerpc64le-ibm-linux-gnu" "-filetype" "obj" {{.*}}"-o" "[[AAT1OBJ:.+]].o" "[[AAT1ASM]].s"
				// CHK-COMMANDS-SEP-ST-DAG: clang{{.*}}" "-cc1as" "-triple" "x86_64-pc-linux-gnu" "-filetype" "obj" {{.*}}"-o" "[[AAT2OBJ:.+]].o" "[[AAT2ASM]].s"
				// CHK-COMMANDS-SEP-ST-DAG: clang-offload-bundler{{.*}}" "-type=o" "-targets=offload-host-powerpc64le--linux,offload-device-powerpc64le-ibm-linux-gnu,offload-device-x86_64-pc-linux-gnu" "-outputs=[[AAOBJ:.+]].o" "-inputs=[[AAHOBJ]].o,[[AAT1OBJ]].o,[[AAT2OBJ]].o"

				// Create 2nd bundle.
				// CHK-COMMANDS-SEP-DAG: clang{{.*}}" "-cc1" "-triple" "powerpc64le--linux" "-emit-llvm-bc"{{.*}}"-fopenmp" {{.*}}"-o" "[[BBHBC:.+]].bc" "-x" "cpp-output" "[[BBHPP]].i" "-omptargets=powerpc64le-ibm-linux-gnu,x86_64-pc-linux-gnu"
				// CHK-COMMANDS-SEP-DAG: clang{{.*}}" "-cc1" "-triple" "powerpc64le--linux" "-emit-obj" {{.*}}"-fopenmp" {{.*}}"-o" "[[BBHOBJ:.+]].o" "-x" "ir" "[[BBHBC]].bc"

				// CHK-COMMANDS-SEP-ST-DAG: clang{{.*}}" "-cc1" "-triple" "powerpc64le--linux" "-emit-llvm-bc"{{.*}}"-fopenmp" {{.*}}"-o" "[[BBHBC:.+]].bc" "-x" "cpp-output" "[[BBHPP]].i" "-omptargets=powerpc64le-ibm-linux-gnu,x86_64-pc-linux-gnu"
				// CHK-COMMANDS-SEP-ST-DAG: clang{{.*}}" "-cc1" "-triple" "powerpc64le--linux" "-S" {{.*}}"-fopenmp" {{.*}}"-o" "[[BBHASM:.+]].s" "-x" "ir" "[[BBHBC]].bc"
				// CHK-COMMANDS-SEP-ST-DAG: clang{{.*}}" "-cc1as" "-triple" "powerpc64le--linux" "-filetype" "obj" {{.*}}"-o" "[[BBHOBJ:.+]].o" "[[BBHASM]].s"

				// CHK-COMMANDS-SEP-DAG: clang{{.*}}" "-cc1" "-triple" "powerpc64le-ibm-linux-gnu" "-emit-llvm-bc" {{.*}}"-fopenmp" {{.*}}"-o" "[[BBT1BC:.+]].bc" "-x" "cpp-output" "[[BBT1PP]].i" "-fopenmp-is-device" "-omp-host-ir-file-path" "[[BBHBC]].bc"
				// CHK-COMMANDS-SEP-DAG: clang{{.*}}" "-cc1" "-triple" "powerpc64le-ibm-linux-gnu" "-emit-obj" {{.*}}"-fopenmp" {{.*}}"-o" "[[BBT1OBJ:.+]].o" "-x" "ir" "[[BBT1BC]].bc"

				// CHK-COMMANDS-SEP-ST-DAG: clang{{.*}}" "-cc1" "-triple" "powerpc64le-ibm-linux-gnu" "-emit-llvm-bc" {{.*}}"-fopenmp" {{.*}}"-o" "[[BBT1BC:.+]].bc" "-x" "cpp-output" "[[BBT1PP]].i" "-fopenmp-is-device" "-omp-host-ir-file-path" "[[BBHBC]].bc"
				// CHK-COMMANDS-SEP-ST-DAG: clang{{.*}}" "-cc1" "-triple" "powerpc64le-ibm-linux-gnu" "-S" {{.*}}"-fopenmp" {{.*}}"-o" "[[BBT1ASM:.+]].s" "-x" "ir" "[[BBT1BC]].bc"
				// CHK-COMMANDS-SEP-ST-DAG: clang{{.*}}" "-cc1as" "-triple" "powerpc64le-ibm-linux-gnu" "-filetype" "obj" {{.*}}"-o" "[[BBT1OBJ:.+]].o" "[[BBT1ASM]].s"

				// CHK-COMMANDS-SEP-DAG: clang{{.*}}" "-cc1" "-triple" "x86_64-pc-linux-gnu" "-emit-llvm-bc" {{.*}}"-fopenmp" {{.*}}"-o" "[[BBT2BC:.+]].bc" "-x" "cpp-output" "[[BBT2PP]].i" "-fopenmp-is-device" "-omp-host-ir-file-path" "[[BBHBC]].bc"
				// CHK-COMMANDS-SEP-DAG: clang{{.*}}" "-cc1" "-triple" "x86_64-pc-linux-gnu" "-emit-obj" {{.*}}"-fopenmp" {{.*}}"-o" "[[BBT2OBJ:.+]].o" "-x" "ir" "[[BBT2BC]].bc"

				// CHK-COMMANDS-SEP-ST-DAG: clang{{.*}}" "-cc1" "-triple" "x86_64-pc-linux-gnu" "-emit-llvm-bc" {{.*}}"-fopenmp" {{.*}}"-o" "[[BBT2BC:.+]].bc" "-x" "cpp-output" "[[BBT2PP]].i" "-fopenmp-is-device" "-omp-host-ir-file-path" "[[BBHBC]].bc"
				// CHK-COMMANDS-SEP-ST-DAG: clang{{.*}}" "-cc1" "-triple" "x86_64-pc-linux-gnu" "-S" {{.*}}"-fopenmp" {{.*}}"-o" "[[BBT2ASM:.+]].s" "-x" "ir" "[[BBT2BC]].bc"
				// CHK-COMMANDS-SEP-ST-DAG: clang{{.*}}" "-cc1as" "-triple" "x86_64-pc-linux-gnu" "-filetype" "obj" {{.*}}"-o" "[[BBT2OBJ:.+]].o" "[[BBT2ASM]].s"

				// CHK-COMMANDS-SEP-DAG: clang-offload-bundler{{.*}}" "-type=o" "-targets=offload-host-powerpc64le--linux,offload-device-powerpc64le-ibm-linux-gnu,offload-device-x86_64-pc-linux-gnu" "-outputs=[[BBOBJ:.+]].o" "-inputs=[[BBHOBJ]].o,[[BBT1OBJ]].o,[[BBT2OBJ]].o"
				// CHK-COMMANDS-SEP-ST-DAG: clang-offload-bundler{{.*}}" "-type=o" "-targets=offload-host-powerpc64le--linux,offload-device-powerpc64le-ibm-linux-gnu,offload-device-x86_64-pc-linux-gnu" "-outputs=[[BBOBJ:.+]].o" "-inputs=[[BBHOBJ]].o,[[BBT1OBJ]].o,[[BBT2OBJ]].o"

Diff 41001