This is an archive of the discontinued LLVM Phabricator instance.

Differential D125165

[Clang] Introduce clang-offload-packager tool to bundle device files
ClosedPublic

Authored by jhuber6 on May 7 2022, 4:59 AM.

Details

Reviewers

jdoerfert
JonChesterfield
saiislam
yaxunl
tra

Commits

rG26eb04268f4c: [Clang] Introduce clang-offload-packager tool to bundle device files

Summary

In order to do offloading compilation we need to embed files into the
host and create fatbinaries. Clang uses a special binary format to
bundle several files along with their metadata into a single binary
image. This is currently performed using the -fembed-offload-binary
option. However, this is not very extensible since it requires changing
the command flag every time we want to add something and makes optional
arguments difficult. This patch introduces a new tool called
clang-offload-packager that behaves similarly to CUDA's fatbinary.
This tool takes several input files with metadata and embeds them into a
single image that can then be embedded in the host.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

jhuber6 created this revision. May 7 2022, 4:59 AM

Herald added a project: Restricted Project. May 7 2022, 4:59 AM

Herald added subscribers: ormris, kerbowa, mgorny, jvesely.

jhuber6 requested review of this revision. May 7 2022, 4:59 AM

Herald added a project: Restricted Project. May 7 2022, 4:59 AM

Herald added subscribers: cfe-commits, sstefan1, MaskRay.

Fix test.

Harbormaster completed remote builds in B163308: Diff 427849. May 7 2022, 6:12 AM

Fix missing file in test.

Harbormaster completed remote builds in B163315: Diff 427858. May 7 2022, 7:37 AM

tra added inline comments. May 9 2022, 2:36 PM

clang/docs/ClangOffloadBinary.rst
8 ↗	(On Diff #427858)	Naming nit: `binary` may not be the best term for what we're trying to do here. Perhaps something like `package`, `container` or `collection` would work better.
42 ↗	(On Diff #427858)	This appears to be a one-way process. How would one examine what's in the binary and unpack or extract a specific component from it?
clang/test/Frontend/embed-object.ll
7	What will happen if an openMP file compiled this way is linked with the older version of OpenMP runtime which presumably expected to see extra data in `.llvm.offloading`? Will it provide a sensible error? Perhaps we should change the section name, too.
clang/tools/clang-offload-binary/ClangOffloadBinary.cpp
70 ↗	(On Diff #427858)	It would be useful to add a comment describing the 'special' keys `file` and `kind`.
75 ↗	(On Diff #427858)	Should `kind` also be required? If not, what's the default kind?
99 ↗	(On Diff #427858)	Nit: `write` is a rather misleading function name here. AFAICT, we're not actually writing anything, but rather packing the `ImageBinary` into a memory buffer, which we then return.

Thanks for the comments.

clang/docs/ClangOffloadBinary.rst
8 ↗	(On Diff #427858)	Yeah, I would've used `bundler` but that name is taken. I think I can go with `clang-offload-container`
42 ↗	(On Diff #427858)	This is done by the linker wrapper, but I think it would be good to teach `llvm-objdump` how to handle these. Then we could basically just treat it the same way as `cuobjdump`.
clang/test/Frontend/embed-object.ll
7	I didn't change the actual data being embedded, only the method to do it. Previously this command line did the work of the offload binary tool. Now it just embeds the file that the tool spits out. This test just makes sure that we run it and get the contents in the IR.
clang/tools/clang-offload-binary/ClangOffloadBinary.cpp
70 ↗	(On Diff #427858)	I think I'll add that to the help message.
75 ↗	(On Diff #427858)	The default kind is filled when we default construct the `OffloadingImage` below, which gives `OFK_None`. This has the effect of being used for linking jobs, but not emitting any registration code.
99 ↗	(On Diff #427858)	I read it as "write to buffer", but I can see your point. It's not related to this patch but I could see changing it.

Conceptually fine with me, @tra?

clang/docs/ClangOffloadBinary.rst
8 ↗	(On Diff #427858)	clang-offload-packager?

Changing name from clang-offload-binary to clang-offload-packager and updating some help messages.

jhuber6 retitled this revision from [Clang] Introduce clang-offload-binary tool to bundle device files to [Clang] Introduce clang-offload-packager tool to bundle device files. May 10 2022, 7:50 AM

jhuber6 edited the summary of this revision.

yaxunl added inline comments. May 10 2022, 8:20 AM

clang/docs/ClangOffloadBinary.rst
15 ↗	(On Diff #428379)	It would help if more details are given, e.g, offset and size of members of the header and layout of the string map.
clang/test/Frontend/embed-object.c
3	Is this due to the embedded object being empty? So now the bitcode for different targets are bundled by clang-offload-packager then embedded as one file in the relocatable object file? In the old scheme the bitcode for different targets are bundled by clang-offload-bundler then embedded in the relocatable object file, right? What's the advantage of clang-offload-packager compared with clang-offload-bundler?

jhuber6 added inline comments. May 10 2022, 8:33 AM

clang/docs/ClangOffloadBinary.rst
15 ↗	(On Diff #428379)	I can probably add some more documentation on that, would definitely help people inspecting these. Later I intend to let `llvm-objdump` extract these as well.
clang/test/Frontend/embed-object.c
3	> Is this due to the embedded object being empty?
Yes, we used to do the binary format in Clang itself so we got the binary stuff along with the empty file. Now this flag simply embeds a file at a section, the file is empty so we get a zeroinitializer. What's important in this test is just that the option puts the contents in the IR.

> So now the bitcode for different targets are bundled by clang-offload-packager then embedded as one file in the relocatable object file?
Yes, this is basically like what fatbinary does for CUDA. We take all the files and put it into a single binary. The binary then contains metadata which lets us find these files later at link time.

> In the old scheme the bitcode for different targets are bundled by clang-offload-bundler then embedded in the relocatable object file, right? What's the advantage of clang-offload-packager compared with clang-offload-bundler?
The old clang offload bundler did some similar stuff, namely embedding multiple files into the host. It was similarly an ELF section if the target is an object file. Conceptually this only creates the actual binary that's being embedded and puts it in one big blob, this then just gets embedded directly in the IR. The benefit to this approach in my mind is that the host and device phases are more distinct, we don't need to call the `clang-offload-bundler` on the host files as well. I could've worked around the current clang offload bundler to make it do something similar, but I didn't see the utility when I'm doing different stuff using a different binary format.

Harbormaster completed remote builds in B163701: Diff 428379. May 10 2022, 9:16 AM

Adding some extra documentation.

LGTM in principle.

Given that we're introducing a new tool dependency we may want to get a stamp from someone dealing with build and release.
@tstellar -- do we need to change anything else for the new binary to ship with clang releases?

clang/docs/ClangOffloadBinary.rst
42 ↗	(On Diff #427858)	SGTM. That may be a good motivation for someone to write a proper disassembler for NVIDIA's GPU binaries. Or we could teach it to invoke nvdisasm or cuobjdump if it sees an NVIDIA ELF file.

In D125165#3504252, @tra wrote:

LGTM in principle.

Given that we're introducing a new tool dependency we may want to get a stamp from someone dealing with build and release.
@tstellar -- do we need to change anything else for the new binary to ship with clang releases?

We did break ABI with LLVM 14 seeing as we supported -fopenmp-new-driver in the release. This used a different method of encoding which isn't compatible with this one. But since that functionality was hidden behind an opt-in experimental flag I would think it's okay to change it.

In D125165#3504252, @tra wrote:

LGTM in principle.

Given that we're introducing a new tool dependency we may want to get a stamp from someone dealing with build and release.
@tstellar -- do we need to change anything else for the new binary to ship with clang releases?

If the tool is built by default, then the test-release.sh build script will include it in the release binaries.

Harbormaster completed remote builds in B163743: Diff 428443. May 10 2022, 1:08 PM

jhuber6 mentioned this in D123810: [Cuda] Add initial support for wrapping CUDA images in the new driver. May 10 2022, 3:09 PM

OK. LGTM.

This revision is now accepted and ready to land. May 10 2022, 3:28 PM

This revision was landed with ongoing or failed builds. May 11 2022, 6:39 AM

Closed by commit rG26eb04268f4c: [Clang] Introduce clang-offload-packager tool to bundle device files (authored by jhuber6).

This revision was automatically updated to reflect the committed changes.

jhuber6 added a commit: rG26eb04268f4c: [Clang] Introduce clang-offload-packager tool to bundle device files.

Herald added a subscriber: kosarev. May 11 2022, 6:39 AM

We now have clang-offload-bundler, clang-offload-packager, clang-offload-wrapper. Do we really need that many distinct binaries for offloading? Any chance we could combine some of those?

clang/test/Driver/linker-wrapper.c
5	Since you're calling this from a test, you have to edit clang/test/CMakeLists.txt and add a dep on the new tool.

thakis added inline comments. May 11 2022, 7:10 AM

clang/test/Driver/linker-wrapper.c
5	Sorry, I missed the `add_dependencies(clang clang-offload-packager)` line in the new cmakelists.txt file. All good.

In D125165#3506172, @thakis wrote:

We now have clang-offload-bundler, clang-offload-packager, clang-offload-wrapper. Do we really need that many distinct binaries for offloading? Any chance we could combine some of those?

Yeah, it's not ideal. I would've liked to use clang-offload-bundler for this tool but we're keeping the old ones around for backwards compatibility. The clang-offload-bundler and clang-offload-wrapper have already been merged into clang and the clang-linker-wrapper respectively, they're kept around for backwards compatibility currently. I think HIP still uses the clang-offload-bundler but I'm in the process of changing that. I'm not sure if it's possible to delete these tools once they're introduced since someone might rely on the binary.

We could add a "clang-offload-bundler and clang-offload-wrapper are deprecated, replace them with $whatever" in the release notes and then remove them a release later, assuming the replacement is straightforward.

In D125165#3506448, @thakis wrote:

We could add a "clang-offload-bundler and clang-offload-wrapper are deprecated, replace them with $whatever" in the release notes and then remove them a release later, assuming the replacement is straightforward.

I think it is still too early to say clang-offload-bundler is deprecated. It is used by HIP toolchain and has functionality currently not available in clang-offload-packager.

In D125165#3506477, @yaxunl wrote:

I think it is still too early to say clang-offload-bundler is deprecated. It is used by HIP toolchain and has functionality currently not available in clang-offload-packager.

If I read the above right, jhuber says it's been merged into clang itself, not that it's being replaced by clang-offload-packager (?)

In D125165#3506502, @thakis wrote:

If I read the above right, jhuber says it's been merged into clang itself, not that it's being replaced by clang-offload-packager (?)

clang-offload-bundler is not merged into clang itself (https://github.com/llvm/llvm-project/blob/main/clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp)

Currently, its functionality is not replaced by clang-offload-packager. I am not sure about future.

In D125165#3506502, @thakis wrote:

If I read the above right, jhuber says it's been merged into clang itself, not that it's being replaced by clang-offload-packager (?)

I'll clarify: the functionality of the clang-offload-bundler is to embed device files into the host. I now do this directly in clang by creating a global string in the LLVM-IR of the host rather than calling a tool. The HIP toolchain still uses the clang-offload-bundler, but I'm planning on putting patches up to move away from that. The current clang-offload-bundler and this new tool have different purposes; this one simply creates a binary that can then be embedded into the host. There is still functionality that the clang-offload-bundler provides that I don't intend to replace, namely the bundling and un-bundling of text files. I don't think we want to stick with the clang-offload-bundler approach, because the files that the clang-offload-bundler spat out weren't valid input to the rest of LLVM, e.g. clang -S -emit-llvm --offload-arch=gfx908 foo.hip -o - | opt would break.

In D125165#3506529, @jhuber6 wrote:

I'll clarify: the functionality of the clang-offload-bundler is to embed device files into the host. I now do this directly in clang by creating a global string in the LLVM-IR of the host rather than calling a tool. The HIP toolchain still uses the clang-offload-bundler, but I'm planning on putting patches up to move away from that. The current clang-offload-bundler and this new tool have different purposes; this one simply creates a binary that can then be embedded into the host. There is still functionality that the clang-offload-bundler provides that I don't intend to replace, namely the bundling and un-bundling of text files. I don't think we want to stick with the clang-offload-bundler approach, because the files that the clang-offload-bundler spat out weren't valid input to the rest of LLVM, e.g. clang -S -emit-llvm --offload-arch=gfx908 foo.hip -o - | opt would break.

The clang-offload-bundler textual format can be consumed by clang since clang can automatically unbundle them.

The textual format allows clang to emit one output for -E, which can be used by ccache for calculating hashes.

Another usage of the clang-offload-bundler textual format is bundled assembly code. Users can modify them and use clang to assemble them.

For embedding bitcode in relocatable object files, clang-offload-packager can be a replacement for clang-offload-bundler, since this is consumed by the compiler.

For the HIP toolchain, clang-offload-bundler is also used to generate fatbinary files which can be loaded dynamically at run time through module APIs. So far I don't think this can be replaced by clang-offload-packager in a short time, since it needs a HIP runtime change.

yaxunl added inline comments. May 11 2022, 10:23 AM

clang/docs/ClangOffloadPackager.rst
31–32	This makes the file format depend on LLVM version and potentially standard C++ library version. If it is consumed by the same version of LLVM, it may be fine. However, if it is intended for a generic file format to be consumed by generic offloading language runtimes, it is better to use C-like simple data layout which does not depend on LLVM or standard C++ library.

jhuber6 added inline comments. May 11 2022, 10:26 AM

clang/docs/ClangOffloadPackager.rst
31–32	That format just stores the data before it's serialized to a binary format. The binary format is basically just a few headers and a string table. See https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/Object/OffloadBinary.h#L91 for the real format. I didn't want to explain it all in detail here.

yaxunl added inline comments. May 11 2022, 11:26 AM

clang/docs/ClangOffloadPackager.rst
31–32	If the file format is intended to be consumed by generic offloading language runtimes or development tools, it would be better to describe its layout as in https://clang.llvm.org/docs/ClangOffloadBundler.html, since language runtimes or development tools may not use LLVM to load the file. The documentation serves as a spec for this binary format. In particular, it is not clear from this documentation where to find the target triple and GPU arch for each image.

jhuber6 added inline comments. May 11 2022, 11:33 AM

clang/docs/ClangOffloadPackager.rst
31–32	Sure, I'll add some more comprehensive documentation.
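
For readers following the format discussion above: the documentation added in this revision says each embedded binary starts with the ``0x10FF10AD`` magic bytes, so that the embedded images can still be located after the linker has merged sections. A minimal sketch of that lookup follows. It assumes the four bytes are stored in the order 0x10, 0xFF, 0x10, 0xAD, and the names `kOffloadMagic` and `findOffloadMagic` are hypothetical. This is not the linker wrapper's code; a real reader would more likely use the size recorded in each header to step from one binary to the next, since a plain byte scan can also match image payload bytes.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Magic bytes quoted in ClangOffloadPackager.rst. Storing them in this byte
// order is an assumption of this sketch.
constexpr std::array<std::uint8_t, 4> kOffloadMagic = {0x10, 0xFF, 0x10, 0xAD};

// Hypothetical helper: return the offset of every occurrence of the magic
// bytes inside the raw contents of a section.
std::vector<std::size_t>
findOffloadMagic(const std::vector<std::uint8_t> &Section) {
  std::vector<std::size_t> Offsets;
  auto It = Section.begin();
  while (true) {
    It = std::search(It, Section.end(), kOffloadMagic.begin(),
                     kOffloadMagic.end());
    if (It == Section.end())
      break;
    Offsets.push_back(static_cast<std::size_t>(It - Section.begin()));
    It += kOffloadMagic.size(); // resume the search after this match
  }
  return Offsets;
}
```

Each returned offset is only a candidate start of an embedded binary; the version field and header that follow the magic bytes would still need to be validated.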

@jhuber6 -- @MaskRay has found that ninja install is failing in a clean build with:

clang: error: unable to execute command: Executable "clang-offload-packager" doesn't exist!

It looks like a missing dependency somewhere.

In D125165#3557015, @tra wrote:
@jhuber6 -- @MaskRay has found that ninja install is failing in a clean build with:
clang: error: unable to execute command: Executable "clang-offload-packager" doesn't exist!
It looks like a missing dependency somewhere.

Weird, I'll try a fresh build myself and see if I can figure it out. I'm not sure if it's the problem, but I used add_clang_executable instead of add_clang_tool in the CMake, that's the only thing that stands out I can see.

In D125165#3557015, @tra wrote:
@jhuber6 -- @MaskRay has found that ninja install is failing in a clean build with:
clang: error: unable to execute command: Executable "clang-offload-packager" doesn't exist!
It looks like a missing dependency somewhere.

I couldn't reproduce it with a clean build, but I went ahead and pushed a patch that builds it using the clang tool preset instead. If the problem persists I can look into the specific build configuration.

Add openmp to LLVM_ENABLE_PROJECTS to trigger the issue:

cmake -GNinja -Sllvm -B/tmp/out/play -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PROJECTS='clang;openmp' -DCMAKE_CXX_COMPILER=~/Stable/bin/clang++ -DCMAKE_C_COMPILER=~/Stable/bin/clang -DCMAKE_INSTALL_PREFIX=/tmp/install/play
ninja -C /tmp/out/play install

clang: error: unable to execute command: Executable "clang-offload-packager" doesn't exist!
clang: error: clang-offload-packager command failed with exit code 1 (use -v to see invocation)
[527/5061] Building CXX object tools/clang/utils/TableGen/CMakeFiles/clang-tblgen.dir/ClangAttrEmitter.cpp.o
ninja: build stopped: subcommand failed.

[Clang] Change the offload packager build to be a clang tool does not fix the issue.

In D125165#3557441, @MaskRay wrote:

Add openmp to LLVM_ENABLE_PROJECTS to trigger the issue:

cmake -GNinja -Sllvm -B/tmp/out/play -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PROJECTS='clang;openmp' -DCMAKE_CXX_COMPILER=~/Stable/bin/clang++ -DCMAKE_C_COMPILER=~/Stable/bin/clang -DCMAKE_INSTALL_PREFIX=/tmp/install/play
ninja -C /tmp/out/play install

clang: error: unable to execute command: Executable "clang-offload-packager" doesn't exist!
clang: error: clang-offload-packager command failed with exit code 1 (use -v to see invocation)
[527/5061] Building CXX object tools/clang/utils/TableGen/CMakeFiles/clang-tblgen.dir/ClangAttrEmitter.cpp.o
ninja: build stopped: subcommand failed.

[Clang] Change the offload packager build to be a clang tool does not fix the issue.

This worked fine for me using a fresh build

$ cmake -G Ninja ../llvm -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PROJECTS='clang;openmp' -DCMAKE_CXX_COMPILER=clang++ -DCMAKE_C_COMPILER=clang -DCMAKE_INSTALL_PREFIX=$HOME/clang && ninja install
$ ls ~/clang/bin/clang-offload-packager 
/home2/3n4/clang/bin/clang-offload-packager

The above doesn't work for you right?

In D125165#3557693, @jhuber6 wrote:

This worked fine for me using a fresh build

$ cmake -G Ninja ../llvm -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PROJECTS='clang;openmp' -DCMAKE_CXX_COMPILER=clang++ -DCMAKE_C_COMPILER=clang -DCMAKE_INSTALL_PREFIX=$HOME/clang && ninja install
$ ls ~/clang/bin/clang-offload-packager 
/home2/3n4/clang/bin/clang-offload-packager

The above doesn't work for you right?

Sorry, it's not this patch's fault.
This is a problem of https://raw.githubusercontent.com/chromium/chromium/main/tools/clang/scripts/update.py . Its clang has picked up this change which tries to spawn clang-offload-packager but the chromium prebuilt tools don't provide clang-offload-packager.
Some openmp directories require the host tool clang-offload-packager. (@thakis)

I can use -DLIBOMPTARGET_BUILD_DEVICERTL_BCLIB=off -DLIBOMPTARGET_BUILD_AMDGPU_PLUGIN=off -DLIBOMPTARGET_BUILD_CUDA_PLUGIN=off to disable these directories.

Revision Contents

Path	Size
clang/docs/ClangOffloadPackager.rst	72 lines
clang/include/clang/Basic/CodeGenOptions.h	8 lines
clang/include/clang/Driver/Action.h	12 lines
clang/include/clang/Driver/ToolChain.h	2 lines
clang/lib/CodeGen/BackendUtil.cpp	23 lines
clang/lib/Driver/Action.cpp	8 lines
clang/lib/Driver/Driver.cpp	14 lines
clang/lib/Driver/ToolChain.cpp	8 lines
clang/lib/Driver/ToolChains/Clang.h	13 lines
clang/lib/Driver/ToolChains/Clang.cpp	55 lines
clang/test/Driver/amdgpu-openmp-toolchain-new.c	19 lines
clang/test/Driver/cuda-openmp-driver.cu	3 lines
clang/test/Driver/cuda-phases.cu	33 lines
clang/test/Driver/linker-wrapper-image.c	6 lines
clang/test/Driver/linker-wrapper.c	61 lines
clang/test/Driver/openmp-offload-gpu-new.c	23 lines
clang/test/Driver/openmp-offload-infer.c	6 lines
clang/test/Frontend/embed-object.c	5 lines
clang/test/Frontend/embed-object.ll	8 lines
clang/test/lit.cfg.py	2 lines
clang/tools/CMakeLists.txt	1 line
clang/tools/clang-offload-packager/CMakeLists.txt	28 lines
clang/tools/clang-offload-packager/ClangOffloadPackager.cpp	114 lines

Diff 428645

clang/docs/ClangOffloadPackager.rst

This file was added.

				======================
				Clang Offload Packager
				======================

				.. contents::
				:local:

				.. _clang-offload-packager:

				Introduction
				============

				This tool bundles device files into a single image containing necessary
				metadata. We use a custom binary format for bundling all the device images
				together. The image format is a small header wrapping around a string map. This
				tool creates bundled binaries so that they can be embedded into the host to
				create a fat-binary.

				An embedded binary is marked by the ``0x10FF10AD`` magic bytes, followed by a
				version. Each created binary contains its own magic bytes. This allows us to
				locate all the embedded offloading sections even after they may have been merged
				by the linker, such as when using relocatable linking. The format used is
				primarily a binary serialization of the following struct.

				.. code-block:: c++

				struct OffloadingImage {
				uint16_t TheImageKind;
				uint16_t TheOffloadKind;
				uint32_t Flags;
				StringMap<StringRef> StringData;
				MemoryBufferRef Image;
				};

				Usage
				=====

				This tool can be used with the following arguments. Generally information is
				passed as a key-value pair to the ``image=`` argument. The ``file``, ``triple``,
				and ``arch`` arguments are considered mandatory to make a valid image.

				.. code-block:: console

				OVERVIEW: A utility for bundling several object files into a single binary.
				The output binary can then be embedded into the host section table
				to create a fatbinary containing offloading code.

				USAGE: clang-offload-packager [options]

				OPTIONS:

				Generic Options:

				--help - Display available options (--help-hidden for more)
				--help-list - Display list of available options (--help-list-hidden for more)
				--version - Display the version of this program

				clang-offload-packager options:

				--image=<<key>=<value>,...> - List of key and value arguments. Required
				keywords are 'file' and 'triple'.
				-o=<file> - Write output to <file>.

				Example
				=======

				This tool simply takes many input files from the ``image`` option and creates a
				single output file with all the images combined.

				.. code-block:: console

				clang-offload-packager -o out.bin --image=file=input.o,triple=nvptx64,arch=sm_70
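
The ``--image=`` value shown above is a comma-separated list of ``key=value`` pairs. As a rough illustration of how such a value could be split and checked, here is a minimal sketch. It is not the tool's actual implementation: `parseImageArg` is a hypothetical helper, and it enforces only the `file` and `triple` keywords that the help text lists as required (the Usage prose in this file also names `arch`).

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper, not the tool's real code: split the value of one
// --image= option ("key=value,key=value,...") into Fields. Returns false if
// an entry has no key or no '=', or if 'file' or 'triple' is missing.
bool parseImageArg(const std::string &Arg,
                   std::map<std::string, std::string> &Fields) {
  Fields.clear();
  std::istringstream Stream(Arg);
  std::string Entry;
  while (std::getline(Stream, Entry, ',')) {
    const std::size_t Eq = Entry.find('=');
    if (Eq == std::string::npos || Eq == 0)
      return false; // malformed entry such as "file" or "=value"
    Fields[Entry.substr(0, Eq)] = Entry.substr(Eq + 1);
  }
  // Keywords the help text marks as required.
  return Fields.count("file") != 0 && Fields.count("triple") != 0;
}
```

Called on `file=input.o,triple=nvptx64,arch=sm_70`, this fills `Fields` with three entries and returns true. Keys the sketch does not know about, such as `kind`, are simply stored, which mirrors the extensibility argument made in the summary.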

clang/include/clang/Basic/CodeGenOptions.h

Show First 20 Lines • Show All 270 Lines • ▼ Show 20 Lines	public:

/// Prefix to use for -save-temps output.		/// Prefix to use for -save-temps output.
std::string SaveTempsFilePrefix;		std::string SaveTempsFilePrefix;

/// Name of file passed with -fcuda-include-gpubinary option to forward to		/// Name of file passed with -fcuda-include-gpubinary option to forward to
/// CUDA runtime back-end for incorporating them into host-side object file.		/// CUDA runtime back-end for incorporating them into host-side object file.
std::string CudaGpuBinaryFileName;		std::string CudaGpuBinaryFileName;

/// List of filenames and metadata passed in using the -fembed-offload-object		/// List of filenames passed in using the -fembed-offload-object option. These
/// option to embed device-side offloading objects into the host as a named		/// are offloading binaries containing device images and metadata.
/// section. Input passed in as 'filename,kind,triple,arch'.
///
/// NOTE: This will need to be expanded whenever we want to pass in more
/// metadata, at some point this should be its own clang tool.
std::vector<std::string> OffloadObjects;		std::vector<std::string> OffloadObjects;

/// The name of the file to which the backend should save YAML optimization		/// The name of the file to which the backend should save YAML optimization
/// records.		/// records.
std::string OptRecordFile;		std::string OptRecordFile;

/// The regex that filters the passes that should be saved to the optimization		/// The regex that filters the passes that should be saved to the optimization
/// records.		/// records.
▲ Show 20 Lines • Show All 201 Lines • Show Last 20 Lines

clang/include/clang/Driver/Action.h

Show First 20 Lines • Show All 68 Lines • ▼ Show 20 Lines	enum ActionClass {
IfsMergeJobClass,		IfsMergeJobClass,
LipoJobClass,		LipoJobClass,
DsymutilJobClass,		DsymutilJobClass,
VerifyDebugInfoJobClass,		VerifyDebugInfoJobClass,
VerifyPCHJobClass,		VerifyPCHJobClass,
OffloadBundlingJobClass,		OffloadBundlingJobClass,
OffloadUnbundlingJobClass,		OffloadUnbundlingJobClass,
OffloadWrapperJobClass,		OffloadWrapperJobClass,
		OffloadPackagerJobClass,
LinkerWrapperJobClass,		LinkerWrapperJobClass,
StaticLibJobClass,		StaticLibJobClass,

JobClassFirst = PreprocessJobClass,		JobClassFirst = PreprocessJobClass,
JobClassLast = StaticLibJobClass		JobClassLast = StaticLibJobClass
};		};

// The offloading kind determines if this action is binded to a particular		// The offloading kind determines if this action is binded to a particular
▲ Show 20 Lines • Show All 579 Lines • ▼ Show 20 Lines
public:		public:
OffloadWrapperJobAction(ActionList &Inputs, types::ID Type);		OffloadWrapperJobAction(ActionList &Inputs, types::ID Type);

static bool classof(const Action *A) {		static bool classof(const Action *A) {
return A->getKind() == OffloadWrapperJobClass;		return A->getKind() == OffloadWrapperJobClass;
}		}
};		};

		class OffloadPackagerJobAction : public JobAction {
		void anchor() override;

		public:
		OffloadPackagerJobAction(ActionList &Inputs, types::ID Type);

		static bool classof(const Action *A) {
		return A->getKind() == OffloadPackagerJobClass;
		}
		};

class LinkerWrapperJobAction : public JobAction {		class LinkerWrapperJobAction : public JobAction {
void anchor() override;		void anchor() override;

public:		public:
LinkerWrapperJobAction(ActionList &Inputs, types::ID Type);		LinkerWrapperJobAction(ActionList &Inputs, types::ID Type);

static bool classof(const Action *A) {		static bool classof(const Action *A) {
return A->getKind() == LinkerWrapperJobClass;		return A->getKind() == LinkerWrapperJobClass;
Show All 18 Lines

clang/include/clang/Driver/ToolChain.h

Show First 20 Lines • Show All 145 Lines • ▼ Show 20 Lines	private:
mutable std::unique_ptr<Tool> Clang;		mutable std::unique_ptr<Tool> Clang;
mutable std::unique_ptr<Tool> Flang;		mutable std::unique_ptr<Tool> Flang;
mutable std::unique_ptr<Tool> Assemble;		mutable std::unique_ptr<Tool> Assemble;
mutable std::unique_ptr<Tool> Link;		mutable std::unique_ptr<Tool> Link;
mutable std::unique_ptr<Tool> StaticLibTool;		mutable std::unique_ptr<Tool> StaticLibTool;
mutable std::unique_ptr<Tool> IfsMerge;		mutable std::unique_ptr<Tool> IfsMerge;
mutable std::unique_ptr<Tool> OffloadBundler;		mutable std::unique_ptr<Tool> OffloadBundler;
mutable std::unique_ptr<Tool> OffloadWrapper;		mutable std::unique_ptr<Tool> OffloadWrapper;
		mutable std::unique_ptr<Tool> OffloadPackager;
mutable std::unique_ptr<Tool> LinkerWrapper;		mutable std::unique_ptr<Tool> LinkerWrapper;

Tool *getClang() const;		Tool *getClang() const;
Tool *getFlang() const;		Tool *getFlang() const;
Tool *getAssemble() const;		Tool *getAssemble() const;
Tool *getLink() const;		Tool *getLink() const;
Tool *getStaticLibTool() const;		Tool *getStaticLibTool() const;
Tool *getIfsMerge() const;		Tool *getIfsMerge() const;
Tool *getClangAs() const;		Tool *getClangAs() const;
Tool *getOffloadBundler() const;		Tool *getOffloadBundler() const;
Tool *getOffloadWrapper() const;		Tool *getOffloadWrapper() const;
		Tool *getOffloadPackager() const;
Tool *getLinkerWrapper() const;		Tool *getLinkerWrapper() const;

mutable bool SanitizerArgsChecked = false;		mutable bool SanitizerArgsChecked = false;
mutable std::unique_ptr<XRayArgs> XRayArguments;		mutable std::unique_ptr<XRayArgs> XRayArguments;

/// The effective clang triple for the current Job.		/// The effective clang triple for the current Job.
mutable llvm::Triple EffectiveTriple;		mutable llvm::Triple EffectiveTriple;

▲ Show 20 Lines • Show All 591 Lines • Show Last 20 Lines

clang/lib/CodeGen/BackendUtil.cpp

	Show First 20 Lines • Show All 1,205 Lines • ▼ Show 20 Lines
	}			}

	void clang::EmbedObject(llvm::Module *M, const CodeGenOptions &CGOpts,			void clang::EmbedObject(llvm::Module *M, const CodeGenOptions &CGOpts,
	DiagnosticsEngine &Diags) {			DiagnosticsEngine &Diags) {
	if (CGOpts.OffloadObjects.empty())			if (CGOpts.OffloadObjects.empty())
	return;			return;

	for (StringRef OffloadObject : CGOpts.OffloadObjects) {			for (StringRef OffloadObject : CGOpts.OffloadObjects) {
	SmallVector<StringRef, 4> ObjectFields;
	OffloadObject.split(ObjectFields, ',');

	if (ObjectFields.size() != 4) {
	auto DiagID = Diags.getCustomDiagID(
	DiagnosticsEngine::Error, "Expected at least four arguments '%0'");
	Diags.Report(DiagID) << OffloadObject;
	return;
	}

	llvm::ErrorOr<std::unique_ptr<llvm::MemoryBuffer>> ObjectOrErr =			llvm::ErrorOr<std::unique_ptr<llvm::MemoryBuffer>> ObjectOrErr =
	llvm::MemoryBuffer::getFileOrSTDIN(ObjectFields[0]);			llvm::MemoryBuffer::getFileOrSTDIN(OffloadObject);
	if (std::error_code EC = ObjectOrErr.getError()) {			if (std::error_code EC = ObjectOrErr.getError()) {
	auto DiagID = Diags.getCustomDiagID(DiagnosticsEngine::Error,			auto DiagID = Diags.getCustomDiagID(DiagnosticsEngine::Error,
	"could not open '%0' for embedding");			"could not open '%0' for embedding");
	Diags.Report(DiagID) << ObjectFields[0];			Diags.Report(DiagID) << OffloadObject;
	return;			return;
	}			}

	OffloadBinary::OffloadingImage Image{};			llvm::embedBufferInModule(M, *ObjectOrErr, ".llvm.offloading",
	Image.TheImageKind = getImageKind(ObjectFields[0].rsplit(".").second);
	Image.TheOffloadKind = getOffloadKind(ObjectFields[1]);
	Image.StringData = {{"triple", ObjectFields[2]}, {"arch", ObjectFields[3]}};
	Image.Image = **ObjectOrErr;

	std::unique_ptr<MemoryBuffer> OffloadBuffer = OffloadBinary::write(Image);
	llvm::embedBufferInModule(M, OffloadBuffer, ".llvm.offloading",
	Align(OffloadBinary::getAlignment()));			Align(OffloadBinary::getAlignment()));
	}			}
	}			}

clang/lib/Driver/Action.cpp

Show All 39 Lines	const char *Action::getClassName(ActionClass AC) {
case VerifyDebugInfoJobClass: return "verify-debug-info";		case VerifyDebugInfoJobClass: return "verify-debug-info";
case VerifyPCHJobClass: return "verify-pch";		case VerifyPCHJobClass: return "verify-pch";
case OffloadBundlingJobClass:		case OffloadBundlingJobClass:
return "clang-offload-bundler";		return "clang-offload-bundler";
case OffloadUnbundlingJobClass:		case OffloadUnbundlingJobClass:
return "clang-offload-unbundler";		return "clang-offload-unbundler";
case OffloadWrapperJobClass:		case OffloadWrapperJobClass:
return "clang-offload-wrapper";		return "clang-offload-wrapper";
		case OffloadPackagerJobClass:
		return "clang-offload-packager";
case LinkerWrapperJobClass:		case LinkerWrapperJobClass:
return "clang-linker-wrapper";		return "clang-linker-wrapper";
case StaticLibJobClass:		case StaticLibJobClass:
return "static-lib-linker";		return "static-lib-linker";
}		}

llvm_unreachable("invalid class");		llvm_unreachable("invalid class");
}		}
▲ Show 20 Lines • Show All 371 Lines • ▼ Show 20 Lines	OffloadUnbundlingJobAction::OffloadUnbundlingJobAction(Action *Input)
: JobAction(OffloadUnbundlingJobClass, Input, Input->getType()) {}		: JobAction(OffloadUnbundlingJobClass, Input, Input->getType()) {}

void OffloadWrapperJobAction::anchor() {}		void OffloadWrapperJobAction::anchor() {}

OffloadWrapperJobAction::OffloadWrapperJobAction(ActionList &Inputs,		OffloadWrapperJobAction::OffloadWrapperJobAction(ActionList &Inputs,
types::ID Type)		types::ID Type)
: JobAction(OffloadWrapperJobClass, Inputs, Type) {}		: JobAction(OffloadWrapperJobClass, Inputs, Type) {}

		void OffloadPackagerJobAction::anchor() {}

		OffloadPackagerJobAction::OffloadPackagerJobAction(ActionList &Inputs,
		types::ID Type)
		: JobAction(OffloadPackagerJobClass, Inputs, Type) {}

void LinkerWrapperJobAction::anchor() {}		void LinkerWrapperJobAction::anchor() {}

LinkerWrapperJobAction::LinkerWrapperJobAction(ActionList &Inputs,		LinkerWrapperJobAction::LinkerWrapperJobAction(ActionList &Inputs,
types::ID Type)		types::ID Type)
: JobAction(LinkerWrapperJobClass, Inputs, Type) {}		: JobAction(LinkerWrapperJobClass, Inputs, Type) {}

void StaticLibJobAction::anchor() {}		void StaticLibJobAction::anchor() {}

StaticLibJobAction::StaticLibJobAction(ActionList &Inputs, types::ID Type)		StaticLibJobAction::StaticLibJobAction(ActionList &Inputs, types::ID Type)
: JobAction(StaticLibJobClass, Inputs, Type) {}		: JobAction(StaticLibJobClass, Inputs, Type) {}

clang/lib/Driver/Driver.cpp

Show First 20 Lines • Show All 4,376 Lines • ▼ Show 20 Lines	const bool DeviceOnly =
Mode && Mode->getOption().matches(options::OPT_offload_device_only);		Mode && Mode->getOption().matches(options::OPT_offload_device_only);

// Don't build offloading actions if explicitly disabled or we do not have a		// Don't build offloading actions if explicitly disabled or we do not have a
// compile action to embed it in. If preprocessing only ignore embedding.		// compile action to embed it in. If preprocessing only ignore embedding.
if (HostOnly || !(isa<CompileJobAction>(HostAction) ||		if (HostOnly || !(isa<CompileJobAction>(HostAction) ||
getFinalPhase(Args) == phases::Preprocess))		getFinalPhase(Args) == phases::Preprocess))
return HostAction;		return HostAction;

		ActionList OffloadActions;
OffloadAction::DeviceDependences DDeps;		OffloadAction::DeviceDependences DDeps;

const Action::OffloadKind OffloadKinds[] = {		const Action::OffloadKind OffloadKinds[] = {
Action::OFK_OpenMP, Action::OFK_Cuda, Action::OFK_HIP};		Action::OFK_OpenMP, Action::OFK_Cuda, Action::OFK_HIP};

for (Action::OffloadKind Kind : OffloadKinds) {		for (Action::OffloadKind Kind : OffloadKinds) {
SmallVector<const ToolChain *, 2> ToolChains;		SmallVector<const ToolChain *, 2> ToolChains;
ActionList DeviceActions;		ActionList DeviceActions;
▲ Show 20 Lines • Show All 59 Lines • ▼ Show 20 Lines	for (phases::ID Phase : PL) {
}		}
++TCAndArch;		++TCAndArch;
}		}
}		}

auto TCAndArch = TCAndArchs.begin();		auto TCAndArch = TCAndArchs.begin();
for (Action *A : DeviceActions) {		for (Action *A : DeviceActions) {
DDeps.add(A, TCAndArch->first, TCAndArch->second.data(), Kind);		DDeps.add(A, TCAndArch->first, TCAndArch->second.data(), Kind);
		OffloadAction::DeviceDependences DDep;
		DDep.add(A, TCAndArch->first, TCAndArch->second.data(), Kind);
		OffloadActions.push_back(C.MakeAction<OffloadAction>(DDep, A->getType()));
++TCAndArch;		++TCAndArch;
}		}
}		}

if (DeviceOnly)		if (DeviceOnly)
return C.MakeAction<OffloadAction>(DDeps, types::TY_Nothing);		return C.MakeAction<OffloadAction>(DDeps, types::TY_Nothing);

		Action *OffloadPackager =
		C.MakeAction<OffloadPackagerJobAction>(OffloadActions, types::TY_Image);
		OffloadAction::DeviceDependences DDep;
		DDep.add(OffloadPackager, C.getSingleOffloadToolChain<Action::OFK_Host>(),
		nullptr, Action::OFK_None);
OffloadAction::HostDependence HDep(		OffloadAction::HostDependence HDep(
HostAction, C.getSingleOffloadToolChain<Action::OFK_Host>(),		HostAction, C.getSingleOffloadToolChain<Action::OFK_Host>(),
/*BoundArch=*/nullptr, DDeps);		/*BoundArch=*/nullptr, isa<CompileJobAction>(HostAction) ? DDep : DDeps);
return C.MakeAction<OffloadAction>(HDep, DDeps);		return C.MakeAction<OffloadAction>(
		HDep, isa<CompileJobAction>(HostAction) ? DDep : DDeps);
}		}

Action *Driver::ConstructPhaseAction(		Action *Driver::ConstructPhaseAction(
Compilation &C, const ArgList &Args, phases::ID Phase, Action *Input,		Compilation &C, const ArgList &Args, phases::ID Phase, Action *Input,
Action::OffloadKind TargetDeviceOffloadKind) const {		Action::OffloadKind TargetDeviceOffloadKind) const {
llvm::PrettyStackTraceString CrashInfo("Constructing phase actions");		llvm::PrettyStackTraceString CrashInfo("Constructing phase actions");

// Some types skip the assembler phase (e.g., llvm-bc), but we can't		// Some types skip the assembler phase (e.g., llvm-bc), but we can't
▲ Show 20 Lines • Show All 1,782 Lines • Show Last 20 Lines

clang/lib/Driver/ToolChain.cpp

Show First 20 Lines • Show All 322 Lines • ▼ Show 20 Lines
}		}

Tool *ToolChain::getOffloadWrapper() const {		Tool *ToolChain::getOffloadWrapper() const {
if (!OffloadWrapper)		if (!OffloadWrapper)
OffloadWrapper.reset(new tools::OffloadWrapper(*this));		OffloadWrapper.reset(new tools::OffloadWrapper(*this));
return OffloadWrapper.get();		return OffloadWrapper.get();
}		}

		Tool *ToolChain::getOffloadPackager() const {
		if (!OffloadPackager)
		OffloadPackager.reset(new tools::OffloadPackager(*this));
		return OffloadPackager.get();
		}

Tool *ToolChain::getLinkerWrapper() const {		Tool *ToolChain::getLinkerWrapper() const {
if (!LinkerWrapper)		if (!LinkerWrapper)
LinkerWrapper.reset(new tools::LinkerWrapper(*this, getLink()));		LinkerWrapper.reset(new tools::LinkerWrapper(*this, getLink()));
return LinkerWrapper.get();		return LinkerWrapper.get();
}		}

Tool *ToolChain::getTool(Action::ActionClass AC) const {		Tool *ToolChain::getTool(Action::ActionClass AC) const {
switch (AC) {		switch (AC) {
Show All 29 Lines	case Action::BackendJobClass:
return getClang();		return getClang();

case Action::OffloadBundlingJobClass:		case Action::OffloadBundlingJobClass:
case Action::OffloadUnbundlingJobClass:		case Action::OffloadUnbundlingJobClass:
return getOffloadBundler();		return getOffloadBundler();

case Action::OffloadWrapperJobClass:		case Action::OffloadWrapperJobClass:
return getOffloadWrapper();		return getOffloadWrapper();
		case Action::OffloadPackagerJobClass:
		return getOffloadPackager();
case Action::LinkerWrapperJobClass:		case Action::LinkerWrapperJobClass:
return getLinkerWrapper();		return getLinkerWrapper();
}		}

llvm_unreachable("Invalid tool kind.");		llvm_unreachable("Invalid tool kind.");
}		}

static StringRef getArchNameForCompilerRTLib(const ToolChain &TC,		static StringRef getArchNameForCompilerRTLib(const ToolChain &TC,
▲ Show 20 Lines • Show All 885 Lines • Show Last 20 Lines

clang/lib/Driver/ToolChains/Clang.h

Show First 20 Lines • Show All 164 Lines • ▼ Show 20 Lines	public:

bool hasIntegratedCPP() const override { return false; }		bool hasIntegratedCPP() const override { return false; }
void ConstructJob(Compilation &C, const JobAction &JA,		void ConstructJob(Compilation &C, const JobAction &JA,
const InputInfo &Output, const InputInfoList &Inputs,		const InputInfo &Output, const InputInfoList &Inputs,
const llvm::opt::ArgList &TCArgs,		const llvm::opt::ArgList &TCArgs,
const char *LinkingOutput) const override;		const char *LinkingOutput) const override;
};		};

		/// Offload binary tool.
		class LLVM_LIBRARY_VISIBILITY OffloadPackager final : public Tool {
		public:
		OffloadPackager(const ToolChain &TC)
		: Tool("Offload::Packager", "clang-offload-packager", TC) {}

		bool hasIntegratedCPP() const override { return false; }
		void ConstructJob(Compilation &C, const JobAction &JA,
		const InputInfo &Output, const InputInfoList &Inputs,
		const llvm::opt::ArgList &TCArgs,
		const char *LinkingOutput) const override;
		};

/// Linker wrapper tool.		/// Linker wrapper tool.
class LLVM_LIBRARY_VISIBILITY LinkerWrapper final : public Tool {		class LLVM_LIBRARY_VISIBILITY LinkerWrapper final : public Tool {
const Tool *Linker;		const Tool *Linker;

public:		public:
LinkerWrapper(const ToolChain &TC, const Tool *Linker)		LinkerWrapper(const ToolChain &TC, const Tool *Linker)
: Tool("Offload::Linker", "linker", TC), Linker(Linker) {}		: Tool("Offload::Linker", "linker", TC), Linker(Linker) {}

Show All 13 Lines

clang/lib/Driver/ToolChains/Clang.cpp


Show First 20 Lines • Show All 6,979 Lines • ▼ Show 20 Lines	if (IsOpenMPDevice) {
if (OpenMPDeviceInput) {		if (OpenMPDeviceInput) {
CmdArgs.push_back("-fopenmp-host-ir-file-path");		CmdArgs.push_back("-fopenmp-host-ir-file-path");
CmdArgs.push_back(Args.MakeArgString(OpenMPDeviceInput->getFilename()));		CmdArgs.push_back(Args.MakeArgString(OpenMPDeviceInput->getFilename()));
}		}
}		}

// Host-side offloading recieves the device object files and embeds it in a		// Host-side offloading recieves the device object files and embeds it in a
// named section including the associated target triple and architecture.		// named section including the associated target triple and architecture.
for (const InputInfo Input : HostOffloadingInputs) {		for (const InputInfo Input : HostOffloadingInputs)
const Action *OffloadAction = Input.getAction();		CmdArgs.push_back(Args.MakeArgString("-fembed-offload-object=" +
const ToolChain *TC = OffloadAction->getOffloadingToolChain();		TC.getInputFilename(Input)));
const ArgList &TCArgs =
C.getArgsForToolChain(TC, OffloadAction->getOffloadingArch(),
OffloadAction->getOffloadingDeviceKind());
StringRef File = C.getArgs().MakeArgString(TC->getInputFilename(Input));
StringRef Arch = (OffloadAction->getOffloadingArch())
? OffloadAction->getOffloadingArch()
: TCArgs.getLastArgValue(options::OPT_march_EQ);

CmdArgs.push_back(Args.MakeArgString(
"-fembed-offload-object=" + File + "," +
Action::GetOffloadKindName(OffloadAction->getOffloadingDeviceKind()) +
"," + TC->getTripleString() + "," + Arch));
}

if (Triple.isAMDGPU()) {		if (Triple.isAMDGPU()) {
handleAMDGPUCodeObjectVersionOptions(D, Args, CmdArgs);		handleAMDGPUCodeObjectVersionOptions(D, Args, CmdArgs);

Args.addOptInFlag(CmdArgs, options::OPT_munsafe_fp_atomics,		Args.addOptInFlag(CmdArgs, options::OPT_munsafe_fp_atomics,
options::OPT_mno_unsafe_fp_atomics);		options::OPT_mno_unsafe_fp_atomics);
}		}

▲ Show 20 Lines • Show All 1,220 Lines • ▼ Show 20 Lines	void OffloadWrapper::ConstructJob(Compilation &C, const JobAction &JA,
}		}

C.addCommand(std::make_unique<Command>(		C.addCommand(std::make_unique<Command>(
JA, *this, ResponseFileSupport::None(),		JA, *this, ResponseFileSupport::None(),
Args.MakeArgString(getToolChain().GetProgramPath(getShortName())),		Args.MakeArgString(getToolChain().GetProgramPath(getShortName())),
CmdArgs, Inputs, Output));		CmdArgs, Inputs, Output));
}		}

		void OffloadPackager::ConstructJob(Compilation &C, const JobAction &JA,
		const InputInfo &Output,
		const InputInfoList &Inputs,
		const llvm::opt::ArgList &Args,
		const char *LinkingOutput) const {
		ArgStringList CmdArgs;

		// Add the output file name.
		assert(Output.isFilename() && "Invalid output.");
		CmdArgs.push_back("-o");
		CmdArgs.push_back(Output.getFilename());

		// Create the inputs to bundle the needed metadata.
		for (const InputInfo &Input : Inputs) {
		const Action *OffloadAction = Input.getAction();
		const ToolChain *TC = OffloadAction->getOffloadingToolChain();
		const ArgList &TCArgs =
		C.getArgsForToolChain(TC, OffloadAction->getOffloadingArch(),
		OffloadAction->getOffloadingDeviceKind());
		StringRef File = C.getArgs().MakeArgString(TC->getInputFilename(Input));
		StringRef Arch = (OffloadAction->getOffloadingArch())
		? OffloadAction->getOffloadingArch()
		: TCArgs.getLastArgValue(options::OPT_march_EQ);

		CmdArgs.push_back(Args.MakeArgString(
		"--image=file=" + File + "," + "triple=" + TC->getTripleString() + "," +
		"arch=" + Arch + "," + "kind=" +
		Action::GetOffloadKindName(OffloadAction->getOffloadingDeviceKind())));
		}

		C.addCommand(std::make_unique<Command>(
		JA, *this, ResponseFileSupport::None(),
		Args.MakeArgString(getToolChain().GetProgramPath(getShortName())),
		CmdArgs, Inputs, Output));
		}

void LinkerWrapper::ConstructJob(Compilation &C, const JobAction &JA,		void LinkerWrapper::ConstructJob(Compilation &C, const JobAction &JA,
const InputInfo &Output,		const InputInfo &Output,
const InputInfoList &Inputs,		const InputInfoList &Inputs,
const ArgList &Args,		const ArgList &Args,
const char *LinkingOutput) const {		const char *LinkingOutput) const {
const Driver &D = getToolChain().getDriver();		const Driver &D = getToolChain().getDriver();
const llvm::Triple TheTriple = getToolChain().getTriple();		const llvm::Triple TheTriple = getToolChain().getTriple();
auto OpenMPTCRange = C.getOffloadToolChains<Action::OFK_OpenMP>();		auto OpenMPTCRange = C.getOffloadToolChains<Action::OFK_OpenMP>();
▲ Show 20 Lines • Show All 147 Lines • Show Last 20 Lines
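
The `OffloadPackager::ConstructJob` hunk above replaces the old positional `-fembed-offload-object=<file>,<kind>,<triple>,<arch>` form with one `--image=` argument per device input, written as comma-separated `key=value` pairs. The string it assembles can be sketched in plain C++ as below. This is only an illustration: `buildImageArg` is a hypothetical helper that does not appear in the patch, and it uses `std::string` rather than the `StringRef`/`ArgList` machinery of the driver.

```cpp
#include <string>

// Hypothetical helper (not part of the patch): joins the four metadata
// fields into a single "--image=" argument, using the same key order as
// OffloadPackager::ConstructJob (file, triple, arch, kind).
std::string buildImageArg(const std::string &File, const std::string &Triple,
                          const std::string &Arch, const std::string &Kind) {
  return "--image=file=" + File + ",triple=" + Triple + ",arch=" + Arch +
         ",kind=" + Kind;
}
```

Because each field is named, a new optional field could presumably be appended without changing how the existing ones are parsed, which is the extensibility concern the summary raises about the positional form.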

clang/test/Driver/amdgpu-openmp-toolchain-new.c

	Show All 9 Lines
	// verify the tools invocations			// verify the tools invocations
	// CHECK: "-cc1" "-triple" "x86_64-unknown-linux-gnu"{{.*}}"-emit-llvm-bc"{{.*}}"-x" "c"			// CHECK: "-cc1" "-triple" "x86_64-unknown-linux-gnu"{{.*}}"-emit-llvm-bc"{{.*}}"-x" "c"
	// CHECK: "-cc1" "-triple" "amdgcn-amd-amdhsa" "-aux-triple" "x86_64-unknown-linux-gnu"{{.*}}"-target-cpu" "gfx906"{{.*}}"-fcuda-is-device"{{.*}}"-mlink-builtin-bitcode" "{{.*}}libomptarget-amdgpu-gfx906.bc"			// CHECK: "-cc1" "-triple" "amdgcn-amd-amdhsa" "-aux-triple" "x86_64-unknown-linux-gnu"{{.*}}"-target-cpu" "gfx906"{{.*}}"-fcuda-is-device"{{.*}}"-mlink-builtin-bitcode" "{{.*}}libomptarget-amdgpu-gfx906.bc"
	// CHECK: "-cc1" "-triple" "x86_64-unknown-linux-gnu"{{.*}}"-emit-obj"			// CHECK: "-cc1" "-triple" "x86_64-unknown-linux-gnu"{{.*}}"-emit-obj"
	// CHECK: clang-linker-wrapper{{.*}}"--"{{.*}} "-o" "a.out"			// CHECK: clang-linker-wrapper{{.*}}"--"{{.*}} "-o" "a.out"

	// RUN: %clang -ccc-print-phases --target=x86_64-unknown-linux-gnu -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa -Xopenmp-target=amdgcn-amd-amdhsa -march=gfx906 %s 2>&1 \			// RUN: %clang -ccc-print-phases --target=x86_64-unknown-linux-gnu -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa -Xopenmp-target=amdgcn-amd-amdhsa -march=gfx906 %s 2>&1 \
	// RUN: | FileCheck --check-prefix=CHECK-PHASES %s			// RUN: | FileCheck --check-prefix=CHECK-PHASES %s
	// CHECK-PHASES: 0: input, "[[INPUT:.*]]", c, (host-openmp)			// CHECK-PHASES: 0: input, "[[INPUT:.+]]", c, (host-openmp)
	// CHECK-PHASES: 1: preprocessor, {0}, cpp-output, (host-openmp)			// CHECK-PHASES: 1: preprocessor, {0}, cpp-output, (host-openmp)
	// CHECK-PHASES: 2: compiler, {1}, ir, (host-openmp)			// CHECK-PHASES: 2: compiler, {1}, ir, (host-openmp)
	// CHECK-PHASES: 3: input, "[[INPUT]]", c, (device-openmp)			// CHECK-PHASES: 3: input, "[[INPUT]]", c, (device-openmp)
	// CHECK-PHASES: 4: preprocessor, {3}, cpp-output, (device-openmp)			// CHECK-PHASES: 4: preprocessor, {3}, cpp-output, (device-openmp)
	// CHECK-PHASES: 5: compiler, {4}, ir, (device-openmp)			// CHECK-PHASES: 5: compiler, {4}, ir, (device-openmp)
	// CHECK-PHASES: 6: offload, "host-openmp (x86_64-unknown-linux-gnu)" {2}, "device-openmp (amdgcn-amd-amdhsa)" {5}, ir			// CHECK-PHASES: 6: offload, "host-openmp (x86_64-unknown-linux-gnu)" {2}, "device-openmp (amdgcn-amd-amdhsa)" {5}, ir
	// CHECK-PHASES: 7: backend, {6}, assembler, (device-openmp)			// CHECK-PHASES: 7: backend, {6}, assembler, (device-openmp)
	// CHECK-PHASES: 8: assembler, {7}, object, (device-openmp)			// CHECK-PHASES: 8: assembler, {7}, object, (device-openmp)
	// CHECK-PHASES: 9: offload, "host-openmp (x86_64-unknown-linux-gnu)" {2}, "device-openmp (amdgcn-amd-amdhsa)" {8}, ir			// CHECK-PHASES: 9: offload, "device-openmp (amdgcn-amd-amdhsa)" {8}, object
	// CHECK-PHASES: 10: backend, {9}, assembler, (host-openmp)			// CHECK-PHASES: 10: clang-offload-packager, {9}, image
	// CHECK-PHASES: 11: assembler, {10}, object, (host-openmp)			// CHECK-PHASES: 11: offload, "host-openmp (x86_64-unknown-linux-gnu)" {2}, " (x86_64-unknown-linux-gnu)" {10}, ir
	// CHECK-PHASES: 12: clang-linker-wrapper, {11}, image, (host-openmp)			// CHECK-PHASES: 12: backend, {11}, assembler, (host-openmp)
				// CHECK-PHASES: 13: assembler, {12}, object, (host-openmp)
				// CHECK-PHASES: 14: clang-linker-wrapper, {13}, image, (host-openmp)

	// handling of --libomptarget-amdgpu-bc-path			// handling of --libomptarget-amdgpu-bc-path
	// RUN: %clang -### --target=x86_64-unknown-linux-gnu -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa -Xopenmp-target=amdgcn-amd-amdhsa -march=gfx803 --libomptarget-amdgpu-bc-path=%S/Inputs/hip_dev_lib/libomptarget-amdgpu-gfx803.bc %s 2>&1 | FileCheck %s --check-prefix=CHECK-LIBOMPTARGET			// RUN: %clang -### --target=x86_64-unknown-linux-gnu -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa -Xopenmp-target=amdgcn-amd-amdhsa -march=gfx803 --libomptarget-amdgpu-bc-path=%S/Inputs/hip_dev_lib/libomptarget-amdgpu-gfx803.bc %s 2>&1 | FileCheck %s --check-prefix=CHECK-LIBOMPTARGET
	// CHECK-LIBOMPTARGET: "-cc1" "-triple" "amdgcn-amd-amdhsa"{{.*}}"-target-cpu" "gfx803" "-fcuda-is-device" "-mlink-builtin-bitcode"{{.*}}Inputs/hip_dev_lib/libomptarget-amdgpu-gfx803.bc"{{.*}}			// CHECK-LIBOMPTARGET: "-cc1" "-triple" "amdgcn-amd-amdhsa"{{.*}}"-target-cpu" "gfx803" "-fcuda-is-device" "-mlink-builtin-bitcode"{{.*}}Inputs/hip_dev_lib/libomptarget-amdgpu-gfx803.bc"{{.*}}

	// RUN: %clang -### --target=x86_64-unknown-linux-gnu -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa -Xopenmp-target=amdgcn-amd-amdhsa -march=gfx803 -nogpulib %s 2>&1 | FileCheck %s --check-prefix=CHECK-NOGPULIB			// RUN: %clang -### --target=x86_64-unknown-linux-gnu -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa -Xopenmp-target=amdgcn-amd-amdhsa -march=gfx803 -nogpulib %s 2>&1 | FileCheck %s --check-prefix=CHECK-NOGPULIB
	// CHECK-NOGPULIB-NOT: "-cc1" "-triple" "amdgcn-amd-amdhsa"{{.*}}"-target-cpu" "gfx803" "-fcuda-is-device" "-mlink-builtin-bitcode"{{.*}}libomptarget-amdgpu-gfx803.bc"{{.*}}			// CHECK-NOGPULIB-NOT: "-cc1" "-triple" "amdgcn-amd-amdhsa"{{.*}}"-target-cpu" "gfx803" "-fcuda-is-device" "-mlink-builtin-bitcode"{{.*}}libomptarget-amdgpu-gfx803.bc"{{.*}}

	// RUN: %clang -### --target=x86_64-unknown-linux-gnu -ccc-print-bindings -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa -Xopenmp-target=amdgcn-amd-amdhsa -march=gfx803 -nogpulib %s 2>&1 | FileCheck %s --check-prefix=CHECK-BINDINGS			// RUN: %clang -### --target=x86_64-unknown-linux-gnu -ccc-print-bindings -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa -Xopenmp-target=amdgcn-amd-amdhsa -march=gfx803 -nogpulib %s 2>&1 | FileCheck %s --check-prefix=CHECK-BINDINGS
	// RUN: %clang -### --target=x86_64-unknown-linux-gnu -ccc-print-bindings -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa --offload-arch=gfx803 -nogpulib %s 2>&1 | FileCheck %s --check-prefix=CHECK-BINDINGS			// RUN: %clang -### --target=x86_64-unknown-linux-gnu -ccc-print-bindings -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa --offload-arch=gfx803 -nogpulib %s 2>&1 | FileCheck %s --check-prefix=CHECK-BINDINGS
	// CHECK-BINDINGS: "x86_64-unknown-linux-gnu" - "clang", inputs: ["[[INPUT:.*]]"], output: "[[HOST_BC:.*]]"			// CHECK-BINDINGS: "x86_64-unknown-linux-gnu" - "clang", inputs: ["[[INPUT:.+]]"], output: "[[HOST_BC:.+]]"
	// CHECK-BINDINGS: "amdgcn-amd-amdhsa" - "clang", inputs: ["[[INPUT]]", "[[HOST_BC]]"], output: "[[DEVICE_BC:.*]]"			// CHECK-BINDINGS: "amdgcn-amd-amdhsa" - "clang", inputs: ["[[INPUT]]", "[[HOST_BC]]"], output: "[[DEVICE_BC:.+]]"
	// CHECK-BINDINGS: "x86_64-unknown-linux-gnu" - "clang", inputs: ["[[HOST_BC]]", "[[DEVICE_BC]]"], output: "[[HOST_OBJ:.*]]"			// CHECK-BINDINGS: "x86_64-unknown-linux-gnu" - "Offload::Packager", inputs: ["[[DEVICE_BC]]"], output: "[[BINARY:.+]]"
				// CHECK-BINDINGS: "x86_64-unknown-linux-gnu" - "clang", inputs: ["[[HOST_BC]]", "[[BINARY]]"], output: "[[HOST_OBJ:.+]]"
	// CHECK-BINDINGS: "x86_64-unknown-linux-gnu" - "Offload::Linker", inputs: ["[[HOST_OBJ]]"], output: "a.out"			// CHECK-BINDINGS: "x86_64-unknown-linux-gnu" - "Offload::Linker", inputs: ["[[HOST_OBJ]]"], output: "a.out"

	// RUN: %clang -### --target=x86_64-unknown-linux-gnu -emit-llvm -S -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa -Xopenmp-target=amdgcn-amd-amdhsa -march=gfx803 -nogpulib %s 2>&1 | FileCheck %s --check-prefix=CHECK-EMIT-LLVM-IR			// RUN: %clang -### --target=x86_64-unknown-linux-gnu -emit-llvm -S -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa -Xopenmp-target=amdgcn-amd-amdhsa -march=gfx803 -nogpulib %s 2>&1 | FileCheck %s --check-prefix=CHECK-EMIT-LLVM-IR
	// CHECK-EMIT-LLVM-IR: "-cc1" "-triple" "amdgcn-amd-amdhsa"{{.*}}"-emit-llvm"			// CHECK-EMIT-LLVM-IR: "-cc1" "-triple" "amdgcn-amd-amdhsa"{{.*}}"-emit-llvm"

	// RUN: %clang -### -target x86_64-pc-linux-gnu -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa -Xopenmp-target=amdgcn-amd-amdhsa -march=gfx803 -lm --rocm-device-lib-path=%S/Inputs/rocm/amdgcn/bitcode -fopenmp-new-driver %s 2>&1 | FileCheck %s --check-prefix=CHECK-LIB-DEVICE-NEW			// RUN: %clang -### -target x86_64-pc-linux-gnu -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa -Xopenmp-target=amdgcn-amd-amdhsa -march=gfx803 -lm --rocm-device-lib-path=%S/Inputs/rocm/amdgcn/bitcode -fopenmp-new-driver %s 2>&1 | FileCheck %s --check-prefix=CHECK-LIB-DEVICE-NEW
	// CHECK-LIB-DEVICE-NEW: {{.*}}clang-linker-wrapper{{.*}}-target-library=openmp-amdgcn-amd-amdhsa-gfx803={{.*}}ocml.bc"{{.*}}ockl.bc"{{.*}}oclc_daz_opt_on.bc"{{.*}}oclc_unsafe_math_off.bc"{{.*}}oclc_finite_only_off.bc"{{.*}}oclc_correctly_rounded_sqrt_on.bc"{{.*}}oclc_wavefrontsize64_on.bc"{{.*}}oclc_isa_version_803.bc"			// CHECK-LIB-DEVICE-NEW: {{.*}}clang-linker-wrapper{{.*}}-target-library=openmp-amdgcn-amd-amdhsa-gfx803={{.*}}ocml.bc"{{.*}}ockl.bc"{{.*}}oclc_daz_opt_on.bc"{{.*}}oclc_unsafe_math_off.bc"{{.*}}oclc_finite_only_off.bc"{{.*}}oclc_correctly_rounded_sqrt_on.bc"{{.*}}oclc_wavefrontsize64_on.bc"{{.*}}oclc_isa_version_803.bc"

clang/test/Driver/cuda-openmp-driver.cu

	// REQUIRES: x86-registered-target			// REQUIRES: x86-registered-target
	// REQUIRES: nvptx-registered-target			// REQUIRES: nvptx-registered-target

	// RUN: %clang -### -target x86_64-linux-gnu -nocudalib -ccc-print-bindings -fgpu-rdc \			// RUN: %clang -### -target x86_64-linux-gnu -nocudalib -ccc-print-bindings -fgpu-rdc \
	// RUN: --offload-new-driver --offload-arch=sm_35 --offload-arch=sm_70 %s 2>&1 \			// RUN: --offload-new-driver --offload-arch=sm_35 --offload-arch=sm_70 %s 2>&1 \
	// RUN: | FileCheck -check-prefix BINDINGS %s			// RUN: | FileCheck -check-prefix BINDINGS %s

	// BINDINGS: "nvptx64-nvidia-cuda" - "clang", inputs: ["[[INPUT:.+]]"], output: "[[PTX_SM_35:.+]]"			// BINDINGS: "nvptx64-nvidia-cuda" - "clang", inputs: ["[[INPUT:.+]]"], output: "[[PTX_SM_35:.+]]"
	// BINDINGS-NEXT: "nvptx64-nvidia-cuda" - "NVPTX::Assembler", inputs: ["[[PTX_SM_35]]"], output: "[[CUBIN_SM_35:.+]]"			// BINDINGS-NEXT: "nvptx64-nvidia-cuda" - "NVPTX::Assembler", inputs: ["[[PTX_SM_35]]"], output: "[[CUBIN_SM_35:.+]]"
	// BINDINGS-NEXT: "nvptx64-nvidia-cuda" - "NVPTX::Linker", inputs: ["[[CUBIN_SM_35]]", "[[PTX_SM_35]]"], output: "[[FATBIN_SM_35:.+]]"			// BINDINGS-NEXT: "nvptx64-nvidia-cuda" - "NVPTX::Linker", inputs: ["[[CUBIN_SM_35]]", "[[PTX_SM_35]]"], output: "[[FATBIN_SM_35:.+]]"
	// BINDINGS-NEXT: "nvptx64-nvidia-cuda" - "clang", inputs: ["[[INPUT]]"], output: "[[PTX_SM_70:.+]]"			// BINDINGS-NEXT: "nvptx64-nvidia-cuda" - "clang", inputs: ["[[INPUT]]"], output: "[[PTX_SM_70:.+]]"
	// BINDINGS-NEXT: "nvptx64-nvidia-cuda" - "NVPTX::Assembler", inputs: ["[[PTX_SM_70:.+]]"], output: "[[CUBIN_SM_70:.+]]"			// BINDINGS-NEXT: "nvptx64-nvidia-cuda" - "NVPTX::Assembler", inputs: ["[[PTX_SM_70:.+]]"], output: "[[CUBIN_SM_70:.+]]"
	// BINDINGS-NEXT: "nvptx64-nvidia-cuda" - "NVPTX::Linker", inputs: ["[[CUBIN_SM_70]]", "[[PTX_SM_70:.+]]"], output: "[[FATBIN_SM_70:.+]]"			// BINDINGS-NEXT: "nvptx64-nvidia-cuda" - "NVPTX::Linker", inputs: ["[[CUBIN_SM_70]]", "[[PTX_SM_70:.+]]"], output: "[[FATBIN_SM_70:.+]]"
	// BINDINGS-NEXT: "x86_64-unknown-linux-gnu" - "clang", inputs: ["[[INPUT]]", "[[FATBIN_SM_35]]", "[[FATBIN_SM_70]]"], output: "[[HOST_OBJ:.+]]"			// BINDINGS-NEXT: "x86_64-unknown-linux-gnu" - "Offload::Packager", inputs: ["[[FATBIN_SM_35]]", "[[FATBIN_SM_70]]"], output: "[[BINARY:.+]]"
				// BINDINGS-NEXT: "x86_64-unknown-linux-gnu" - "clang", inputs: ["[[INPUT]]", "[[BINARY]]"], output: "[[HOST_OBJ:.+]]"
	// BINDINGS-NEXT: "x86_64-unknown-linux-gnu" - "Offload::Linker", inputs: ["[[HOST_OBJ]]"], output: "a.out"			// BINDINGS-NEXT: "x86_64-unknown-linux-gnu" - "Offload::Linker", inputs: ["[[HOST_OBJ]]"], output: "a.out"

	// RUN: %clang -### -nocudalib --offload-new-driver %s 2>&1 | FileCheck -check-prefix RDC %s			// RUN: %clang -### -nocudalib --offload-new-driver %s 2>&1 | FileCheck -check-prefix RDC %s
	// RDC: error: Using '--offload-new-driver' requires '-fgpu-rdc'			// RDC: error: Using '--offload-new-driver' requires '-fgpu-rdc'

	// RUN: %clang -### -target x86_64-linux-gnu -nocudalib -ccc-print-bindings -fgpu-rdc \			// RUN: %clang -### -target x86_64-linux-gnu -nocudalib -ccc-print-bindings -fgpu-rdc \
	// RUN: --offload-new-driver --offload-arch=sm_35 --offload-arch=sm_70 %s 2>&1 \			// RUN: --offload-new-driver --offload-arch=sm_35 --offload-arch=sm_70 %s 2>&1 \
	// RUN: | FileCheck -check-prefix BINDINGS-HOST %s			// RUN: | FileCheck -check-prefix BINDINGS-HOST %s
	Show All 11 Lines

clang/test/Driver/cuda-phases.cu

	Show First 20 Lines • Show All 217 Lines • ▼ Show 20 Lines
	// DASM2-DAG: [[P9:[0-9]+]]: offload, "device-[[T]] ([[TRIPLE]]:[[ARCH2]])" {[[P8]]}, assembler			// DASM2-DAG: [[P9:[0-9]+]]: offload, "device-[[T]] ([[TRIPLE]]:[[ARCH2]])" {[[P8]]}, assembler
	// DASM2-NOT: host			// DASM2-NOT: host

	//			//
	// Test the phases generated when using the new offloading driver.			// Test the phases generated when using the new offloading driver.
	//			//
	// RUN: %clang -### -target powerpc64le-ibm-linux-gnu -ccc-print-phases --offload-new-driver \			// RUN: %clang -### -target powerpc64le-ibm-linux-gnu -ccc-print-phases --offload-new-driver \
	// RUN: --offload-arch=sm_52 --offload-arch=sm_70 %s 2>&1 | FileCheck --check-prefix=NEW_DRIVER %s			// RUN: --offload-arch=sm_52 --offload-arch=sm_70 %s 2>&1 | FileCheck --check-prefix=NEW_DRIVER %s
	// NEW_DRIVER: 0: input, "[[INPUT:.*]]", cuda, (host-cuda)			// NEW_DRIVER: 0: input, "[[INPUT:.+]]", cuda
	// NEW_DRIVER: 1: preprocessor, {0}, cuda-cpp-output, (host-cuda)			// NEW_DRIVER: 1: preprocessor, {0}, cuda-cpp-output
	// NEW_DRIVER: 2: compiler, {1}, ir, (host-cuda)			// NEW_DRIVER: 2: compiler, {1}, ir
	// NEW_DRIVER: 3: input, "[[INPUT]]", cuda, (device-cuda, sm_52)			// NEW_DRIVER: 3: input, "[[INPUT]]", cuda, (device-cuda, sm_52)
	// NEW_DRIVER: 4: preprocessor, {3}, cuda-cpp-output, (device-cuda, sm_52)			// NEW_DRIVER: 4: preprocessor, {3}, cuda-cpp-output, (device-cuda, sm_52)
	// NEW_DRIVER: 5: compiler, {4}, ir, (device-cuda, sm_52)			// NEW_DRIVER: 5: compiler, {4}, ir, (device-cuda, sm_52)
	// NEW_DRIVER: 6: backend, {5}, assembler, (device-cuda, sm_52)			// NEW_DRIVER: 6: backend, {5}, assembler, (device-cuda, sm_52)
	// NEW_DRIVER: 7: assembler, {6}, object, (device-cuda, sm_52)			// NEW_DRIVER: 7: assembler, {6}, object, (device-cuda, sm_52)
	// NEW_DRIVER: 8: offload, "device-cuda (nvptx64-nvidia-cuda:sm_52)" {7}, object			// NEW_DRIVER: 8: offload, "device-cuda (nvptx64-nvidia-cuda:sm_52)" {7}, object
	// NEW_DRIVER: 9: offload, "device-cuda (nvptx64-nvidia-cuda:sm_52)" {6}, assembler			// NEW_DRIVER: 9: offload, "device-cuda (nvptx64-nvidia-cuda:sm_52)" {6}, assembler
	// NEW_DRIVER: 10: linker, {8, 9}, cuda-fatbin, (device-cuda, sm_52)			// NEW_DRIVER: 10: linker, {8, 9}, cuda-fatbin, (device-cuda, sm_52)
	// NEW_DRIVER: 11: input, "[[INPUT]]", cuda, (device-cuda, sm_70)			// NEW_DRIVER: 11: offload, "device-cuda (nvptx64-nvidia-cuda:sm_52)" {10}, cuda-fatbin
	// NEW_DRIVER: 12: preprocessor, {11}, cuda-cpp-output, (device-cuda, sm_70)			// NEW_DRIVER: 12: input, "[[INPUT]]", cuda, (device-cuda, sm_70)
	// NEW_DRIVER: 13: compiler, {12}, ir, (device-cuda, sm_70)			// NEW_DRIVER: 13: preprocessor, {12}, cuda-cpp-output, (device-cuda, sm_70)
	// NEW_DRIVER: 14: backend, {13}, assembler, (device-cuda, sm_70)			// NEW_DRIVER: 14: compiler, {13}, ir, (device-cuda, sm_70)
	// NEW_DRIVER: 15: assembler, {14}, object, (device-cuda, sm_70)			// NEW_DRIVER: 15: backend, {14}, assembler, (device-cuda, sm_70)
	// NEW_DRIVER: 16: offload, "device-cuda (nvptx64-nvidia-cuda:sm_70)" {15}, object			// NEW_DRIVER: 16: assembler, {15}, object, (device-cuda, sm_70)
	// NEW_DRIVER: 17: offload, "device-cuda (nvptx64-nvidia-cuda:sm_70)" {14}, assembler			// NEW_DRIVER: 17: offload, "device-cuda (nvptx64-nvidia-cuda:sm_70)" {16}, object
	// NEW_DRIVER: 18: linker, {16, 17}, cuda-fatbin, (device-cuda, sm_70)			// NEW_DRIVER: 18: offload, "device-cuda (nvptx64-nvidia-cuda:sm_70)" {15}, assembler
	// NEW_DRIVER: 19: offload, "host-cuda (powerpc64le-ibm-linux-gnu)" {2}, "device-cuda (nvptx64-nvidia-cuda:sm_52)" {10}, "device-cuda (nvptx64-nvidia-cuda:sm_70)" {18}, ir			// NEW_DRIVER: 19: linker, {17, 18}, cuda-fatbin, (device-cuda, sm_70)
	// NEW_DRIVER: 20: backend, {19}, assembler, (host-cuda)			// NEW_DRIVER: 20: offload, "device-cuda (nvptx64-nvidia-cuda:sm_70)" {19}, cuda-fatbin
	// NEW_DRIVER: 21: assembler, {20}, object, (host-cuda)			// NEW_DRIVER: 21: clang-offload-packager, {11, 20}, image
	// NEW_DRIVER: 22: clang-linker-wrapper, {21}, image, (host-cuda)			// NEW_DRIVER: 22: offload, " (powerpc64le-ibm-linux-gnu)" {2}, " (powerpc64le-ibm-linux-gnu)" {21}, ir
				// NEW_DRIVER: 23: backend, {22}, assembler, (host-cuda)
				// NEW_DRIVER: 24: assembler, {23}, object, (host-cuda)
				// NEW_DRIVER: 25: clang-linker-wrapper, {24}, image, (host-cuda)

clang/test/Driver/linker-wrapper-image.c

	// REQUIRES: x86-registered-target			// REQUIRES: x86-registered-target
	// REQUIRES: nvptx-registered-target			// REQUIRES: nvptx-registered-target
	// REQUIRES: amdgpu-registered-target			// REQUIRES: amdgpu-registered-target

				// RUN: clang-offload-packager -o %t.out --image=file=%S/Inputs/dummy-elf.o,kind=openmp,triple=nvptx64-nvidia-cuda,arch=sm_70
	// RUN: %clang -cc1 %s -triple x86_64-unknown-linux-gnu -emit-obj -o %t.o \			// RUN: %clang -cc1 %s -triple x86_64-unknown-linux-gnu -emit-obj -o %t.o \
	// RUN: -fembed-offload-object=%S/Inputs/dummy-elf.o,openmp,nvptx64-nvida-cuda,sm_70			// RUN: -fembed-offload-object=%t.out
	// RUN: clang-linker-wrapper --print-wrapped-module --dry-run --host-triple x86_64-unknown-linux-gnu \			// RUN: clang-linker-wrapper --print-wrapped-module --dry-run --host-triple x86_64-unknown-linux-gnu \
	// RUN: -linker-path /usr/bin/ld -- %t.o -o a.out 2>&1 | FileCheck %s --check-prefix=OPENMP			// RUN: -linker-path /usr/bin/ld -- %t.o -o a.out 2>&1 | FileCheck %s --check-prefix=OPENMP

	// OPENMP: @__start_omp_offloading_entries = external hidden constant %__tgt_offload_entry			// OPENMP: @__start_omp_offloading_entries = external hidden constant %__tgt_offload_entry
	// OPENMP-NEXT: @__stop_omp_offloading_entries = external hidden constant %__tgt_offload_entry			// OPENMP-NEXT: @__stop_omp_offloading_entries = external hidden constant %__tgt_offload_entry
	// OPENMP-NEXT: @__dummy.omp_offloading.entry = hidden constant [0 x %__tgt_offload_entry] zeroinitializer, section "omp_offloading_entries"			// OPENMP-NEXT: @__dummy.omp_offloading.entry = hidden constant [0 x %__tgt_offload_entry] zeroinitializer, section "omp_offloading_entries"
	// OPENMP-NEXT: @.omp_offloading.device_image = internal unnamed_addr constant [0 x i8] zeroinitializer			// OPENMP-NEXT: @.omp_offloading.device_image = internal unnamed_addr constant [0 x i8] zeroinitializer
	// OPENMP-NEXT: @.omp_offloading.device_images = internal unnamed_addr constant [1 x %__tgt_device_image] [%__tgt_device_image { i8* getelementptr inbounds ([0 x i8], [0 x i8]* @.omp_offloading.device_image, i64 0, i64 0), i8* getelementptr inbounds ([0 x i8], [0 x i8]* @.omp_offloading.device_image, i64 0, i64 0), %__tgt_offload_entry* @__start_omp_offloading_entries, %__tgt_offload_entry* @__stop_omp_offloading_entries }]			// OPENMP-NEXT: @.omp_offloading.device_images = internal unnamed_addr constant [1 x %__tgt_device_image] [%__tgt_device_image { i8* getelementptr inbounds ([0 x i8], [0 x i8]* @.omp_offloading.device_image, i64 0, i64 0), i8* getelementptr inbounds ([0 x i8], [0 x i8]* @.omp_offloading.device_image, i64 0, i64 0), %__tgt_offload_entry* @__start_omp_offloading_entries, %__tgt_offload_entry* @__stop_omp_offloading_entries }]
	// OPENMP-NEXT: @.omp_offloading.descriptor = internal constant %__tgt_bin_desc { i32 1, %__tgt_device_image* getelementptr inbounds ([1 x %__tgt_device_image], [1 x %__tgt_device_image]* @.omp_offloading.device_images, i64 0, i64 0), %__tgt_offload_entry* @__start_omp_offloading_entries, %__tgt_offload_entry* @__stop_omp_offloading_entries }			// OPENMP-NEXT: @.omp_offloading.descriptor = internal constant %__tgt_bin_desc { i32 1, %__tgt_device_image* getelementptr inbounds ([1 x %__tgt_device_image], [1 x %__tgt_device_image]* @.omp_offloading.device_images, i64 0, i64 0), %__tgt_offload_entry* @__start_omp_offloading_entries, %__tgt_offload_entry* @__stop_omp_offloading_entries }
	// OPENMP-NEXT: @llvm.global_ctors = appending global [1 x { i32, void ()*, i8* }] [{ i32, void ()*, i8* } { i32 1, void ()* @.omp_offloading.descriptor_reg, i8* null }]			// OPENMP-NEXT: @llvm.global_ctors = appending global [1 x { i32, void ()*, i8* }] [{ i32, void ()*, i8* } { i32 1, void ()* @.omp_offloading.descriptor_reg, i8* null }]
	// OPENMP-NEXT: @llvm.global_dtors = appending global [1 x { i32, void ()*, i8* }] [{ i32, void ()*, i8* } { i32 1, void ()* @.omp_offloading.descriptor_unreg, i8* null }]			// OPENMP-NEXT: @llvm.global_dtors = appending global [1 x { i32, void ()*, i8* }] [{ i32, void ()*, i8* } { i32 1, void ()* @.omp_offloading.descriptor_unreg, i8* null }]

	// OPENMP: define internal void @.omp_offloading.descriptor_reg() section ".text.startup" {			// OPENMP: define internal void @.omp_offloading.descriptor_reg() section ".text.startup" {
	// OPENMP-NEXT: entry:			// OPENMP-NEXT: entry:
	// OPENMP-NEXT: call void @__tgt_register_lib(%__tgt_bin_desc* @.omp_offloading.descriptor)			// OPENMP-NEXT: call void @__tgt_register_lib(%__tgt_bin_desc* @.omp_offloading.descriptor)
	// OPENMP-NEXT: ret void			// OPENMP-NEXT: ret void
	// OPENMP-NEXT: }			// OPENMP-NEXT: }

	// OPENMP: define internal void @.omp_offloading.descriptor_unreg() section ".text.startup" {			// OPENMP: define internal void @.omp_offloading.descriptor_unreg() section ".text.startup" {
	// OPENMP-NEXT: entry:			// OPENMP-NEXT: entry:
	// OPENMP-NEXT: call void @__tgt_unregister_lib(%__tgt_bin_desc* @.omp_offloading.descriptor)			// OPENMP-NEXT: call void @__tgt_unregister_lib(%__tgt_bin_desc* @.omp_offloading.descriptor)
	// OPENMP-NEXT: ret void			// OPENMP-NEXT: ret void
	// OPENMP-NEXT: }			// OPENMP-NEXT: }

				// RUN: clang-offload-packager -o %t.out --image=file=%S/Inputs/dummy-elf.o,kind=cuda,triple=nvptx64-nvidia-cuda,arch=sm_70
	// RUN: %clang -cc1 %s -triple x86_64-unknown-linux-gnu -emit-obj -o %t.o \			// RUN: %clang -cc1 %s -triple x86_64-unknown-linux-gnu -emit-obj -o %t.o \
	// RUN: -fembed-offload-object=%S/Inputs/dummy-elf.o,cuda,nvptx64-nvida-cuda,sm_70			// RUN: -fembed-offload-object=%t.out
	// RUN: clang-linker-wrapper --print-wrapped-module --dry-run --host-triple x86_64-unknown-linux-gnu \			// RUN: clang-linker-wrapper --print-wrapped-module --dry-run --host-triple x86_64-unknown-linux-gnu \
	// RUN: -linker-path /usr/bin/ld -- %t.o -o a.out 2>&1 | FileCheck %s --check-prefix=CUDA			// RUN: -linker-path /usr/bin/ld -- %t.o -o a.out 2>&1 | FileCheck %s --check-prefix=CUDA

	// CUDA: @.fatbin_image = internal constant [0 x i8] zeroinitializer, section ".nv_fatbin"			// CUDA: @.fatbin_image = internal constant [0 x i8] zeroinitializer, section ".nv_fatbin"
	// CUDA-NEXT: @.fatbin_wrapper = internal constant %fatbin_wrapper { i32 1180844977, i32 1, i8* getelementptr inbounds ([0 x i8], [0 x i8]* @.fatbin_image, i32 0, i32 0), i8* null }, section ".nvFatBinSegment", align 8			// CUDA-NEXT: @.fatbin_wrapper = internal constant %fatbin_wrapper { i32 1180844977, i32 1, i8* getelementptr inbounds ([0 x i8], [0 x i8]* @.fatbin_image, i32 0, i32 0), i8* null }, section ".nvFatBinSegment", align 8
	// CUDA-NEXT: @__dummy.cuda_offloading.entry = hidden constant [0 x %__tgt_offload_entry] zeroinitializer, section "cuda_offloading_entries"			// CUDA-NEXT: @__dummy.cuda_offloading.entry = hidden constant [0 x %__tgt_offload_entry] zeroinitializer, section "cuda_offloading_entries"
	// CUDA-NEXT: @.cuda.binary_handle = internal global i8** null			// CUDA-NEXT: @.cuda.binary_handle = internal global i8** null
	// CUDA-NEXT: @__start_cuda_offloading_entries = external hidden constant [0 x %__tgt_offload_entry]			// CUDA-NEXT: @__start_cuda_offloading_entries = external hidden constant [0 x %__tgt_offload_entry]

clang/test/Driver/linker-wrapper.c

	// REQUIRES: x86-registered-target			// REQUIRES: x86-registered-target
	// REQUIRES: nvptx-registered-target			// REQUIRES: nvptx-registered-target
	// REQUIRES: amdgpu-registered-target			// REQUIRES: amdgpu-registered-target

	// RUN: %clang -cc1 %s -triple x86_64-unknown-linux-gnu -emit-obj -o %t.o \			// RUN: clang-offload-packager -o %t.out \
				thakis: Since you're calling this from a test, you have to edit clang/test/CMakeLists.txt and add a dep on the new tool.
				thakis: Sorry, I missed the `add_dependencies(clang clang-offload-packager)` line in the new cmakelists.txt file. All good.
	// RUN: -fembed-offload-object=%S/Inputs/dummy-elf.o,openmp,nvptx64-nvida-cuda,sm_70 \			// RUN: --image=file=%S/Inputs/dummy-elf.o,kind=openmp,triple=nvptx64-nvidia-cuda,arch=sm_70 \
	// RUN: -fembed-offload-object=%S/Inputs/dummy-elf.o,openmp,nvptx64-nvida-cuda,sm_70			// RUN: --image=file=%S/Inputs/dummy-elf.o,kind=openmp,triple=nvptx64-nvidia-cuda,arch=sm_70
				// RUN: %clang -cc1 %s -triple x86_64-unknown-linux-gnu -emit-obj -o %t.o -fembed-offload-object=%t.out
	// RUN: clang-linker-wrapper --host-triple x86_64-unknown-linux-gnu --dry-run -linker-path \			// RUN: clang-linker-wrapper --host-triple x86_64-unknown-linux-gnu --dry-run -linker-path \
	// RUN: /usr/bin/ld -- %t.o -o a.out 2>&1 | FileCheck %s --check-prefix=NVPTX_LINK			// RUN: /usr/bin/ld -- %t.o -o a.out 2>&1 | FileCheck %s --check-prefix=NVPTX_LINK

	// NVPTX_LINK: nvlink{{.*}}-m64 -o {{.*}}.out -arch sm_70 {{.*}}.o {{.*}}.o			// NVPTX_LINK: nvlink{{.*}}-m64 -o {{.*}}.out -arch sm_70 {{.*}}.o {{.*}}.o

	// RUN: %clang -cc1 %s -triple x86_64-unknown-linux-gnu -emit-obj -o %t.o \			// RUN: clang-offload-packager -o %t.out \
	// RUN: -fembed-offload-object=%S/Inputs/dummy-elf.o,openmp,amdgcn-amd-amdhsa,gfx908 \			// RUN: --image=file=%S/Inputs/dummy-elf.o,kind=openmp,triple=amdgcn-amd-amdhsa,arch=gfx908 \
	// RUN: -fembed-offload-object=%S/Inputs/dummy-elf.o,openmp,amdgcn-amd-amdhsa,gfx908			// RUN: --image=file=%S/Inputs/dummy-elf.o,kind=openmp,triple=amdgcn-amd-amdhsa,arch=gfx908
				// RUN: %clang -cc1 %s -triple x86_64-unknown-linux-gnu -emit-obj -o %t.o -fembed-offload-object=%t.out
	// RUN: clang-linker-wrapper --host-triple x86_64-unknown-linux-gnu --dry-run -linker-path \			// RUN: clang-linker-wrapper --host-triple x86_64-unknown-linux-gnu --dry-run -linker-path \
	// RUN: /usr/bin/ld -- %t.o -o a.out 2>&1 | FileCheck %s --check-prefix=AMDGPU_LINK			// RUN: /usr/bin/ld -- %t.o -o a.out 2>&1 | FileCheck %s --check-prefix=AMDGPU_LINK

	// AMDGPU_LINK: lld{{.*}}-flavor gnu --no-undefined -shared -o {{.*}}.out {{.*}}.o {{.*}}.o			// AMDGPU_LINK: lld{{.*}}-flavor gnu --no-undefined -shared -o {{.*}}.out {{.*}}.o {{.*}}.o

	// RUN: %clang -cc1 %s -triple x86_64-unknown-linux-gnu -emit-obj -o %t.o \			// RUN: clang-offload-packager -o %t.out \
	// RUN: -fembed-offload-object=%S/Inputs/dummy-elf.o,openmp,x86_64-unknown-linux-gnu, \			// RUN: --image=file=%S/Inputs/dummy-elf.o,kind=openmp,triple=x86_64-unknown-linux-gnu \
	// RUN: -fembed-offload-object=%S/Inputs/dummy-elf.o,openmp,x86_64-unknown-linux-gnu,			// RUN: --image=file=%S/Inputs/dummy-elf.o,kind=openmp,triple=x86_64-unknown-linux-gnu
				// RUN: %clang -cc1 %s -triple x86_64-unknown-linux-gnu -emit-obj -o %t.o -fembed-offload-object=%t.out
	// RUN: clang-linker-wrapper --host-triple x86_64-unknown-linux-gnu --dry-run -linker-path \			// RUN: clang-linker-wrapper --host-triple x86_64-unknown-linux-gnu --dry-run -linker-path \
	// RUN: /usr/bin/ld.lld -- %t.o -o a.out 2>&1 | FileCheck %s --check-prefix=CPU_LINK			// RUN: /usr/bin/ld.lld -- %t.o -o a.out 2>&1 | FileCheck %s --check-prefix=CPU_LINK

	// CPU_LINK: ld.lld{{.*}}-m elf_x86_64 -shared -Bsymbolic -o {{.*}}.out {{.*}}.o {{.*}}.o			// CPU_LINK: ld.lld{{.*}}-m elf_x86_64 -shared -Bsymbolic -o {{.*}}.out {{.*}}.o {{.*}}.o

	// RUN: %clang -cc1 %s -triple x86_64-unknown-linux-gnu -emit-obj -o %t.o			// RUN: %clang -cc1 %s -triple x86_64-unknown-linux-gnu -emit-obj -o %t.o
	// RUN: clang-linker-wrapper --dry-run --host-triple x86_64-unknown-linux-gnu -linker-path \			// RUN: clang-linker-wrapper --dry-run --host-triple x86_64-unknown-linux-gnu -linker-path \
	// RUN: /usr/bin/ld.lld -- -a -b -c %t.o -o a.out 2>&1 | FileCheck %s --check-prefix=HOST_LINK			// RUN: /usr/bin/ld.lld -- -a -b -c %t.o -o a.out 2>&1 | FileCheck %s --check-prefix=HOST_LINK

	// HOST_LINK: ld.lld{{.*}}-a -b -c {{.*}}.o -o a.out			// HOST_LINK: ld.lld{{.*}}-a -b -c {{.*}}.o -o a.out

	// RUN: %clang -cc1 %s -triple x86_64-unknown-linux-gnu -emit-obj -o %t.o \			// RUN: clang-offload-packager -o %t.out \
	// RUN: -fembed-offload-object=%S/Inputs/dummy-bc.bc,openmp,nvptx64-nvida-cuda,sm_70 \			// RUN: --image=file=%S/Inputs/dummy-bc.bc,kind=openmp,triple=nvptx64-nvidia-cuda,arch=sm_70 \
	// RUN: -fembed-offload-object=%S/Inputs/dummy-bc.bc,openmp,nvptx64-nvida-cuda,sm_70			// RUN: --image=file=%S/Inputs/dummy-bc.bc,kind=openmp,triple=nvptx64-nvidia-cuda,arch=sm_70
				// RUN: %clang -cc1 %s -triple x86_64-unknown-linux-gnu -emit-obj -o %t.o -fembed-offload-object=%t.out
	// RUN: clang-linker-wrapper --host-triple x86_64-unknown-linux-gnu --dry-run -linker-path \			// RUN: clang-linker-wrapper --host-triple x86_64-unknown-linux-gnu --dry-run -linker-path \
	// RUN: /usr/bin/ld -- %t.o -o a.out 2>&1 | FileCheck %s --check-prefix=LTO			// RUN: /usr/bin/ld -- %t.o -o a.out 2>&1 | FileCheck %s --check-prefix=LTO

	// LTO: ptxas{{.*}}-m64 -o {{.*}}.cubin -O2 --gpu-name sm_70 {{.*}}.s			// LTO: ptxas{{.*}}-m64 -o {{.*}}.cubin -O2 --gpu-name sm_70 {{.*}}.s
	// LTO-NOT: nvlink			// LTO-NOT: nvlink

	// RUN: %clang -cc1 %s -triple x86_64-unknown-linux-gnu -emit-obj -o %t.o \			// RUN: clang-offload-packager -o %t.out \
	// RUN: -fembed-offload-object=%S/Inputs/dummy-elf.o,openmp,nvptx64-nvida-cuda,sm_70 \			// RUN: --image=file=%S/Inputs/dummy-elf.o,kind=openmp,triple=nvptx64-nvidia-cuda,arch=sm_70 \
	// RUN: -fembed-offload-object=%S/Inputs/dummy-elf.o,cuda,nvptx64-nvida-cuda,sm_70			// RUN: --image=file=%S/Inputs/dummy-elf.o,kind=cuda,triple=nvptx64-nvidia-cuda,arch=sm_70
				// RUN: %clang -cc1 %s -triple x86_64-unknown-linux-gnu -emit-obj -o %t.o -fembed-offload-object=%t.out
	// RUN: clang-linker-wrapper --host-triple x86_64-unknown-linux-gnu --dry-run -linker-path \			// RUN: clang-linker-wrapper --host-triple x86_64-unknown-linux-gnu --dry-run -linker-path \
	// RUN: /usr/bin/ld -- %t.o -o a.out 2>&1 | FileCheck %s --check-prefix=CUDA_OMP_LINK			// RUN: /usr/bin/ld -- %t.o -o a.out 2>&1 | FileCheck %s --check-prefix=CUDA_OMP_LINK

	// CUDA_OMP_LINK: nvlink{{.*}}-m64 -o {{.*}}.out -arch sm_70 {{.*}}.o {{.*}}.o			// CUDA_OMP_LINK: nvlink{{.*}}-m64 -o {{.*}}.out -arch sm_70 {{.*}}.o {{.*}}.o

	// RUN: %clang -cc1 %s -triple x86_64-unknown-linux-gnu -emit-obj -o %t-lib.o \			// RUN: clang-offload-packager -o %t-lib.out \
	// RUN: -fembed-offload-object=%S/Inputs/dummy-elf.o,openmp,nvptx64-nvida-cuda,sm_70 \			// RUN: --image=file=%S/Inputs/dummy-elf.o,kind=openmp,triple=nvptx64-nvidia-cuda,arch=sm_70 \
	// RUN: -fembed-offload-object=%S/Inputs/dummy-elf.o,openmp,nvptx64-nvida-cuda,sm_52			// RUN: --image=file=%S/Inputs/dummy-elf.o,kind=cuda,triple=nvptx64-nvidia-cuda,arch=sm_52
	// RUN: llvm-ar rcs %t.a %t-lib.o			// RUN: %clang -cc1 %s -triple x86_64-unknown-linux-gnu -emit-obj -o %t.o -fembed-offload-object=%t-lib.out
	// RUN: %clang -cc1 %s -triple x86_64-unknown-linux-gnu -emit-obj -o %t-obj.o \			// RUN: llvm-ar rcs %t.a %t.o
	// RUN: -fembed-offload-object=%S/Inputs/dummy-elf.o,openmp,nvptx64-nvida-cuda,sm_70			// RUN: clang-offload-packager -o %t.out \
				// RUN: --image=file=%S/Inputs/dummy-elf.o,kind=openmp,triple=nvptx64-nvidia-cuda,arch=sm_70
				// RUN: %clang -cc1 %s -triple x86_64-unknown-linux-gnu -emit-obj -o %t-obj.o -fembed-offload-object=%t.out
	// RUN: clang-linker-wrapper --host-triple x86_64-unknown-linux-gnu --dry-run -linker-path \			// RUN: clang-linker-wrapper --host-triple x86_64-unknown-linux-gnu --dry-run -linker-path \
	// RUN: /usr/bin/ld -- %t.a %t-obj.o -o a.out 2>&1 | FileCheck %s --check-prefix=STATIC-LIBRARY			// RUN: /usr/bin/ld -- %t.a %t-obj.o -o a.out 2>&1 | FileCheck %s --check-prefix=STATIC-LIBRARY

	// STATIC-LIBRARY: nvlink{{.*}} -arch sm_70			// STATIC-LIBRARY: nvlink{{.*}} -arch sm_70
	// STATIC-LIBRARY-NOT: nvlink{{.*}} -arch sm_50			// STATIC-LIBRARY-NOT: nvlink{{.*}} -arch sm_50

				// RUN: clang-offload-packager -o %t.out \
				// RUN: --image=file=%S/Inputs/dummy-elf.o,kind=cuda,triple=nvptx64-nvidia-cuda,arch=sm_70 \
				// RUN: --image=file=%S/Inputs/dummy-elf.o,kind=openmp,triple=nvptx64-nvidia-cuda,arch=sm_70 \
				// RUN: --image=file=%S/Inputs/dummy-elf.o,kind=cuda,triple=nvptx64-nvidia-cuda,arch=sm_52
	// RUN: %clang -cc1 %s -triple x86_64-unknown-linux-gnu -emit-obj -o %t.o \			// RUN: %clang -cc1 %s -triple x86_64-unknown-linux-gnu -emit-obj -o %t.o \
	// RUN: -fembed-offload-object=%S/Inputs/dummy-elf.o,cuda,nvptx64-nvida-cuda,sm_70 \			// RUN: -fembed-offload-object=%t.out
	// RUN: -fembed-offload-object=%S/Inputs/dummy-elf.o,openmp,nvptx64-nvida-cuda,sm_70 \
	// RUN: -fembed-offload-object=%S/Inputs/dummy-elf.o,cuda,nvptx64-nvida-cuda,sm_52
	// RUN: clang-linker-wrapper --dry-run --host-triple x86_64-unknown-linux-gnu -linker-path \			// RUN: clang-linker-wrapper --dry-run --host-triple x86_64-unknown-linux-gnu -linker-path \
	// RUN: /usr/bin/ld -- %t.o -o a.out 2>&1 | FileCheck %s --check-prefix=CUDA			// RUN: /usr/bin/ld -- %t.o -o a.out 2>&1 | FileCheck %s --check-prefix=CUDA

	// CUDA: nvlink{{.*}}-m64 -o {{.*}}.out -arch sm_70 {{.*}}.o {{.*}}.o
	// CUDA: nvlink{{.*}}-m64 -o {{.*}}.out -arch sm_52 {{.*}}.o			// CUDA: nvlink{{.*}}-m64 -o {{.*}}.out -arch sm_52 {{.*}}.o
	// CUDA: fatbinary{{.*}}-64 --create {{.*}}.fatbin --image=profile=sm_70,file={{.*}}.out --image=profile=sm_52,file={{.*}}.out			// CUDA: nvlink{{.*}}-m64 -o {{.*}}.out -arch sm_70 {{.*}}.o {{.*}}.o
				// CUDA: fatbinary{{.*}}-64 --create {{.*}}.fatbin --image=profile=sm_52,file={{.*}}.out --image=profile=sm_70,file={{.*}}.out

clang/test/Driver/openmp-offload-gpu-new.c

	// verify the tools invocations			// verify the tools invocations
	// CHECK: "-cc1" "-triple" "x86_64-unknown-linux-gnu"{{.*}}"-emit-llvm-bc"{{.*}}"-x" "c"			// CHECK: "-cc1" "-triple" "x86_64-unknown-linux-gnu"{{.*}}"-emit-llvm-bc"{{.*}}"-x" "c"
	// CHECK: "-cc1" "-triple" "nvptx64-nvidia-cuda" "-aux-triple" "x86_64-unknown-linux-gnu"{{.*}}"-target-cpu" "sm_52"			// CHECK: "-cc1" "-triple" "nvptx64-nvidia-cuda" "-aux-triple" "x86_64-unknown-linux-gnu"{{.*}}"-target-cpu" "sm_52"
	// CHECK: "-cc1" "-triple" "x86_64-unknown-linux-gnu"{{.*}}"-emit-obj"			// CHECK: "-cc1" "-triple" "x86_64-unknown-linux-gnu"{{.*}}"-emit-obj"
	// CHECK: clang-linker-wrapper{{.*}}"--"{{.*}} "-o" "a.out"			// CHECK: clang-linker-wrapper{{.*}}"--"{{.*}} "-o" "a.out"

	// RUN: %clang -ccc-print-phases --target=x86_64-unknown-linux-gnu -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda -Xopenmp-target=nvptx64-nvidia-cuda -march=sm_52 %s 2>&1 \			// RUN: %clang -ccc-print-phases --target=x86_64-unknown-linux-gnu -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda -Xopenmp-target=nvptx64-nvidia-cuda -march=sm_52 %s 2>&1 \
	// RUN: | FileCheck --check-prefix=CHECK-PHASES %s			// RUN: | FileCheck --check-prefix=CHECK-PHASES %s
	// CHECK-PHASES: 0: input, "[[INPUT:.*]]", c, (host-openmp)			// CHECK-PHASES: 0: input, "[[INPUT:.+]]", c, (host-openmp)
	// CHECK-PHASES: 1: preprocessor, {0}, cpp-output, (host-openmp)			// CHECK-PHASES: 1: preprocessor, {0}, cpp-output, (host-openmp)
	// CHECK-PHASES: 2: compiler, {1}, ir, (host-openmp)			// CHECK-PHASES: 2: compiler, {1}, ir, (host-openmp)
	// CHECK-PHASES: 3: input, "[[INPUT]]", c, (device-openmp)			// CHECK-PHASES: 3: input, "[[INPUT]]", c, (device-openmp)
	// CHECK-PHASES: 4: preprocessor, {3}, cpp-output, (device-openmp)			// CHECK-PHASES: 4: preprocessor, {3}, cpp-output, (device-openmp)
	// CHECK-PHASES: 5: compiler, {4}, ir, (device-openmp)			// CHECK-PHASES: 5: compiler, {4}, ir, (device-openmp)
	// CHECK-PHASES: 6: offload, "host-openmp (x86_64-unknown-linux-gnu)" {2}, "device-openmp (nvptx64-nvidia-cuda)" {5}, ir			// CHECK-PHASES: 6: offload, "host-openmp (x86_64-unknown-linux-gnu)" {2}, "device-openmp (nvptx64-nvidia-cuda)" {5}, ir
	// CHECK-PHASES: 7: backend, {6}, assembler, (device-openmp)			// CHECK-PHASES: 7: backend, {6}, assembler, (device-openmp)
	// CHECK-PHASES: 8: assembler, {7}, object, (device-openmp)			// CHECK-PHASES: 8: assembler, {7}, object, (device-openmp)
	// CHECK-PHASES: 9: offload, "host-openmp (x86_64-unknown-linux-gnu)" {2}, "device-openmp (nvptx64-nvidia-cuda)" {8}, ir			// CHECK-PHASES: 9: offload, "device-openmp (nvptx64-nvidia-cuda)" {8}, object
	// CHECK-PHASES: 10: backend, {9}, assembler, (host-openmp)			// CHECK-PHASES: 10: clang-offload-packager, {9}, image
	// CHECK-PHASES: 11: assembler, {10}, object, (host-openmp)			// CHECK-PHASES: 11: offload, "host-openmp (x86_64-unknown-linux-gnu)" {2}, " (x86_64-unknown-linux-gnu)" {10}, ir
	// CHECK-PHASES: 12: clang-linker-wrapper, {11}, image, (host-openmp)			// CHECK-PHASES: 12: backend, {11}, assembler, (host-openmp)
				// CHECK-PHASES: 13: assembler, {12}, object, (host-openmp)
				// CHECK-PHASES: 14: clang-linker-wrapper, {13}, image, (host-openmp)

	// RUN: %clang -### --target=x86_64-unknown-linux-gnu -ccc-print-bindings -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda -Xopenmp-target=nvptx64-nvidia-cuda -march=sm_52 -nogpulib %s 2>&1 | FileCheck %s --check-prefix=CHECK-BINDINGS			// RUN: %clang -### --target=x86_64-unknown-linux-gnu -ccc-print-bindings -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda -Xopenmp-target=nvptx64-nvidia-cuda -march=sm_52 -nogpulib %s 2>&1 | FileCheck %s --check-prefix=CHECK-BINDINGS
	// CHECK-BINDINGS: "x86_64-unknown-linux-gnu" - "clang", inputs: ["[[INPUT:.*]]"], output: "[[HOST_BC:.*]]"			// CHECK-BINDINGS: "x86_64-unknown-linux-gnu" - "clang", inputs: ["[[INPUT:.*]]"], output: "[[HOST_BC:.*]]"
	// CHECK-BINDINGS: "nvptx64-nvidia-cuda" - "clang", inputs: ["[[INPUT]]", "[[HOST_BC]]"], output: "[[DEVICE_BC:.*]]"			// CHECK-BINDINGS: "nvptx64-nvidia-cuda" - "clang", inputs: ["[[INPUT]]", "[[HOST_BC]]"], output: "[[DEVICE_BC:.*]]"
	// CHECK-BINDINGS: "nvptx64-nvidia-cuda" - "NVPTX::Assembler", inputs: ["[[DEVICE_BC]]"], output: "[[DEVICE_OBJ:.*]]"			// CHECK-BINDINGS: "nvptx64-nvidia-cuda" - "NVPTX::Assembler", inputs: ["[[DEVICE_BC]]"], output: "[[DEVICE_OBJ:.*]]"
	// CHECK-BINDINGS: "x86_64-unknown-linux-gnu" - "clang", inputs: ["[[HOST_BC]]", "[[DEVICE_OBJ]]"], output: "[[HOST_OBJ:.*]]"			// CHECK-BINDINGS: "x86_64-unknown-linux-gnu" - "Offload::Packager", inputs: ["[[DEVICE_OBJ]]"], output: "[[BINARY:.*]]"
				// CHECK-BINDINGS: "x86_64-unknown-linux-gnu" - "clang", inputs: ["[[HOST_BC]]", "[[BINARY]]"], output: "[[HOST_OBJ:.*]]"
	// CHECK-BINDINGS: "x86_64-unknown-linux-gnu" - "Offload::Linker", inputs: ["[[HOST_OBJ]]"], output: "a.out"			// CHECK-BINDINGS: "x86_64-unknown-linux-gnu" - "Offload::Linker", inputs: ["[[HOST_OBJ]]"], output: "a.out"

	// RUN: %clang -### --target=x86_64-unknown-linux-gnu -ccc-print-bindings -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda --offload-arch=sm_52 --offload-arch=sm_70 -nogpulib %s 2>&1 | FileCheck %s --check-prefix=CHECK-ARCH-BINDINGS			// RUN: %clang -### --target=x86_64-unknown-linux-gnu -ccc-print-bindings -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda --offload-arch=sm_52 --offload-arch=sm_70 -nogpulib %s 2>&1 | FileCheck %s --check-prefix=CHECK-ARCH-BINDINGS
	// CHECK-ARCH-BINDINGS: "x86_64-unknown-linux-gnu" - "clang", inputs: ["[[INPUT:.*]]"], output: "[[HOST_BC:.*]]"			// CHECK-ARCH-BINDINGS: "x86_64-unknown-linux-gnu" - "clang", inputs: ["[[INPUT:.*]]"], output: "[[HOST_BC:.*]]"
	// CHECK-ARCH-BINDINGS: "nvptx64-nvidia-cuda" - "clang", inputs: ["[[INPUT]]", "[[HOST_BC]]"], output: "[[DEVICE_BC_SM_52:.*]]"			// CHECK-ARCH-BINDINGS: "nvptx64-nvidia-cuda" - "clang", inputs: ["[[INPUT]]", "[[HOST_BC]]"], output: "[[DEVICE_BC_SM_52:.*]]"
	// CHECK-ARCH-BINDINGS: "nvptx64-nvidia-cuda" - "NVPTX::Assembler", inputs: ["[[DEVICE_BC_SM_52]]"], output: "[[DEVICE_OBJ_SM_52:.*]]"			// CHECK-ARCH-BINDINGS: "nvptx64-nvidia-cuda" - "NVPTX::Assembler", inputs: ["[[DEVICE_BC_SM_52]]"], output: "[[DEVICE_OBJ_SM_52:.*]]"
	// CHECK-ARCH-BINDINGS: "nvptx64-nvidia-cuda" - "clang", inputs: ["[[INPUT]]", "[[HOST_BC]]"], output: "[[DEVICE_BC_SM_70:.*]]"			// CHECK-ARCH-BINDINGS: "nvptx64-nvidia-cuda" - "clang", inputs: ["[[INPUT]]", "[[HOST_BC]]"], output: "[[DEVICE_BC_SM_70:.*]]"
	// CHECK-ARCH-BINDINGS: "nvptx64-nvidia-cuda" - "NVPTX::Assembler", inputs: ["[[DEVICE_BC_SM_70]]"], output: "[[DEVICE_OBJ_SM_70:.*]]"			// CHECK-ARCH-BINDINGS: "nvptx64-nvidia-cuda" - "NVPTX::Assembler", inputs: ["[[DEVICE_BC_SM_70]]"], output: "[[DEVICE_OBJ_SM_70:.*]]"
	// CHECK-ARCH-BINDINGS: "x86_64-unknown-linux-gnu" - "clang", inputs: ["[[HOST_BC]]", "[[DEVICE_OBJ_SM_52]]", "[[DEVICE_OBJ_SM_70]]"], output: "[[HOST_OBJ:.*]]"			// CHECK-ARCH-BINDINGS: "x86_64-unknown-linux-gnu" - "Offload::Packager", inputs: ["[[DEVICE_OBJ_SM_52]]", "[[DEVICE_OBJ_SM_70]]"], output: "[[BINARY:.*]]"
				// CHECK-ARCH-BINDINGS: "x86_64-unknown-linux-gnu" - "clang", inputs: ["[[HOST_BC]]", "[[BINARY]]"], output: "[[HOST_OBJ:.*]]"
	// CHECK-ARCH-BINDINGS: "x86_64-unknown-linux-gnu" - "Offload::Linker", inputs: ["[[HOST_OBJ]]"], output: "a.out"			// CHECK-ARCH-BINDINGS: "x86_64-unknown-linux-gnu" - "Offload::Linker", inputs: ["[[HOST_OBJ]]"], output: "a.out"

	// RUN: %clang -### --target=x86_64-unknown-linux-gnu -ccc-print-bindings -fopenmp \			// RUN: %clang -### --target=x86_64-unknown-linux-gnu -ccc-print-bindings -fopenmp \
	// RUN: -fopenmp-targets=nvptx64-nvidia-cuda,amdgcn-amd-amdhsa -Xopenmp-target=nvptx64-nvidia-cuda --offload-arch=sm_70 \			// RUN: -fopenmp-targets=nvptx64-nvidia-cuda,amdgcn-amd-amdhsa -Xopenmp-target=nvptx64-nvidia-cuda --offload-arch=sm_70 \
	// RUN: -fopenmp-targets=nvptx64-nvidia-cuda,amdgcn-amd-amdhsa -Xopenmp-target=amdgcn-amd-amdhsa --offload-arch=gfx908 \			// RUN: -fopenmp-targets=nvptx64-nvidia-cuda,amdgcn-amd-amdhsa -Xopenmp-target=amdgcn-amd-amdhsa --offload-arch=gfx908 \
	// RUN: -nogpulib %s 2>&1 | FileCheck %s --check-prefix=CHECK-NVIDIA-AMDGPU			// RUN: -nogpulib %s 2>&1 | FileCheck %s --check-prefix=CHECK-NVIDIA-AMDGPU

	// CHECK-NVIDIA-AMDGPU: "x86_64-unknown-linux-gnu" - "clang", inputs: ["[[INPUT:.+]]"], output: "[[HOST_BC:.+]]"			// CHECK-NVIDIA-AMDGPU: "x86_64-unknown-linux-gnu" - "clang", inputs: ["[[INPUT:.+]]"], output: "[[HOST_BC:.+]]"
	// CHECK-NVIDIA-AMDGPU: "nvptx64-nvidia-cuda" - "clang", inputs: ["[[INPUT]]", "[[HOST_BC]]"], output: "[[NVIDIA_PTX:.+]]"			// CHECK-NVIDIA-AMDGPU: "nvptx64-nvidia-cuda" - "clang", inputs: ["[[INPUT]]", "[[HOST_BC]]"], output: "[[NVIDIA_PTX:.+]]"
	// CHECK-NVIDIA-AMDGPU: "nvptx64-nvidia-cuda" - "NVPTX::Assembler", inputs: ["[[NVIDIA_PTX]]"], output: "[[NVIDIA_CUBIN:.+]]"			// CHECK-NVIDIA-AMDGPU: "nvptx64-nvidia-cuda" - "NVPTX::Assembler", inputs: ["[[NVIDIA_PTX]]"], output: "[[NVIDIA_CUBIN:.+]]"
	// CHECK-NVIDIA-AMDGPU: "amdgcn-amd-amdhsa" - "clang", inputs: ["[[INPUT]]", "[[HOST_BC]]"], output: "[[AMD_BC:.+]]"			// CHECK-NVIDIA-AMDGPU: "amdgcn-amd-amdhsa" - "clang", inputs: ["[[INPUT]]", "[[HOST_BC]]"], output: "[[AMD_BC:.+]]"
	// CHECK-NVIDIA-AMDGPU: "x86_64-unknown-linux-gnu" - "clang", inputs: ["[[HOST_BC]]", "[[NVIDIA_CUBIN]]", "[[AMD_BC]]"], output: "[[HOST_OBJ:.+]]"			// CHECK-NVIDIA-AMDGPU: "x86_64-unknown-linux-gnu" - "Offload::Packager", inputs: ["[[NVIDIA_CUBIN]]", "[[AMD_BC]]"], output: "[[BINARY:.*]]"
				// CHECK-NVIDIA-AMDGPU: "x86_64-unknown-linux-gnu" - "clang", inputs: ["[[HOST_BC]]", "[[BINARY]]"], output: "[[HOST_OBJ:.+]]"
	// CHECK-NVIDIA-AMDGPU: "x86_64-unknown-linux-gnu" - "Offload::Linker", inputs: ["[[HOST_OBJ]]"], output: "a.out"			// CHECK-NVIDIA-AMDGPU: "x86_64-unknown-linux-gnu" - "Offload::Linker", inputs: ["[[HOST_OBJ]]"], output: "a.out"

	// RUN: %clang -### --target=x86_64-unknown-linux-gnu -emit-llvm -S -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda -Xopenmp-target=nvptx64-nvidia-cuda -march=sm_52 -nogpulib %s 2>&1 | FileCheck %s --check-prefix=CHECK-EMIT-LLVM-IR			// RUN: %clang -### --target=x86_64-unknown-linux-gnu -emit-llvm -S -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda -Xopenmp-target=nvptx64-nvidia-cuda -march=sm_52 -nogpulib %s 2>&1 | FileCheck %s --check-prefix=CHECK-EMIT-LLVM-IR
	// CHECK-EMIT-LLVM-IR: "-cc1"{{.*}}"-triple" "nvptx64-nvidia-cuda"{{.*}}"-emit-llvm"			// CHECK-EMIT-LLVM-IR: "-cc1"{{.*}}"-triple" "nvptx64-nvidia-cuda"{{.*}}"-emit-llvm"

	// RUN: %clang -### -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda -Xopenmp-target=nvptx64-nvida-cuda -march=sm_70 \			// RUN: %clang -### -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda -Xopenmp-target=nvptx64-nvida-cuda -march=sm_70 \
	// RUN: --libomptarget-nvptx-bc-path=%S/Inputs/libomptarget/libomptarget-new-nvptx-test.bc \			// RUN: --libomptarget-nvptx-bc-path=%S/Inputs/libomptarget/libomptarget-new-nvptx-test.bc \
	// RUN: -nogpulib %s -o openmp-offload-gpu 2>&1 \			// RUN: -nogpulib %s -o openmp-offload-gpu 2>&1 \
	// RUN: | FileCheck -check-prefix=DRIVER_EMBEDDING %s			// RUN: | FileCheck -check-prefix=DRIVER_EMBEDDING %s

	// DRIVER_EMBEDDING: -fembed-offload-object=[[CUBIN:.*\.cubin]],openmp,nvptx64-nvidia-cuda,sm_70			// DRIVER_EMBEDDING: -fembed-offload-object={{.*}}.out

	// RUN: %clang -### --target=x86_64-unknown-linux-gnu -ccc-print-bindings -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda \			// RUN: %clang -### --target=x86_64-unknown-linux-gnu -ccc-print-bindings -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda \
	// RUN: --offload-host-only -nogpulib %s 2>&1 | FileCheck %s --check-prefix=CHECK-HOST-ONLY			// RUN: --offload-host-only -nogpulib %s 2>&1 | FileCheck %s --check-prefix=CHECK-HOST-ONLY
	// CHECK-HOST-ONLY: "x86_64-unknown-linux-gnu" - "clang", inputs: ["[[INPUT:.*]]"], output: "[[OUTPUT:.*]]"			// CHECK-HOST-ONLY: "x86_64-unknown-linux-gnu" - "clang", inputs: ["[[INPUT:.*]]"], output: "[[OUTPUT:.*]]"
	// CHECK-HOST-ONLY: "x86_64-unknown-linux-gnu" - "Offload::Linker", inputs: ["[[OUTPUT]]"], output: "a.out"			// CHECK-HOST-ONLY: "x86_64-unknown-linux-gnu" - "Offload::Linker", inputs: ["[[OUTPUT]]"], output: "a.out"

	// RUN: %clang -### --target=x86_64-unknown-linux-gnu -ccc-print-bindings -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda \			// RUN: %clang -### --target=x86_64-unknown-linux-gnu -ccc-print-bindings -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda \
	// RUN: --offload-device-only -nogpulib %s 2>&1 | FileCheck %s --check-prefix=CHECK-DEVICE-ONLY			// RUN: --offload-device-only -nogpulib %s 2>&1 | FileCheck %s --check-prefix=CHECK-DEVICE-ONLY
	// CHECK-DEVICE-ONLY: "x86_64-unknown-linux-gnu" - "clang", inputs: ["[[INPUT:.*]]"], output: "[[HOST_BC:.*]]"			// CHECK-DEVICE-ONLY: "x86_64-unknown-linux-gnu" - "clang", inputs: ["[[INPUT:.*]]"], output: "[[HOST_BC:.*]]"
	// CHECK-DEVICE-ONLY: "nvptx64-nvidia-cuda" - "clang", inputs: ["[[INPUT]]", "[[HOST_BC]]"], output: "[[DEVICE_ASM:.*]]"			// CHECK-DEVICE-ONLY: "nvptx64-nvidia-cuda" - "clang", inputs: ["[[INPUT]]", "[[HOST_BC]]"], output: "[[DEVICE_ASM:.*]]"
	// CHECK-DEVICE-ONLY: "nvptx64-nvidia-cuda" - "NVPTX::Assembler", inputs: ["[[DEVICE_ASM]]"], output: "{{.*}}-openmp-nvptx64-nvidia-cuda.o"			// CHECK-DEVICE-ONLY: "nvptx64-nvidia-cuda" - "NVPTX::Assembler", inputs: ["[[DEVICE_ASM]]"], output: "{{.*}}-openmp-nvptx64-nvidia-cuda.o"

	// RUN: %clang -### --target=x86_64-unknown-linux-gnu -ccc-print-bindings -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda \			// RUN: %clang -### --target=x86_64-unknown-linux-gnu -ccc-print-bindings -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda \
	// RUN: --offload-device-only -E -nogpulib %s 2>&1 | FileCheck %s --check-prefix=CHECK-DEVICE-ONLY-PP			// RUN: --offload-device-only -E -nogpulib %s 2>&1 | FileCheck %s --check-prefix=CHECK-DEVICE-ONLY-PP
	// CHECK-DEVICE-ONLY-PP: "nvptx64-nvidia-cuda" - "clang", inputs: ["[[INPUT:.*]]"], output: "-"			// CHECK-DEVICE-ONLY-PP: "nvptx64-nvidia-cuda" - "clang", inputs: ["[[INPUT:.*]]"], output: "-"

clang/test/Driver/openmp-offload-infer.c

	// RUN: %clang -### --target=x86_64-unknown-linux-gnu -ccc-print-bindings -fopenmp \			// RUN: %clang -### --target=x86_64-unknown-linux-gnu -ccc-print-bindings -fopenmp \
	// RUN: --offload-arch=sm_70 --offload-arch=gfx908:sramecc+:xnack- \			// RUN: --offload-arch=sm_70 --offload-arch=gfx908:sramecc+:xnack- \
	// RUN: -nogpulib %s 2>&1 | FileCheck %s --check-prefix=CHECK-NVIDIA-AMDGPU			// RUN: -nogpulib %s 2>&1 | FileCheck %s --check-prefix=CHECK-NVIDIA-AMDGPU

	// CHECK-NVIDIA-AMDGPU: "x86_64-unknown-linux-gnu" - "clang", inputs: ["[[INPUT:.+]]"], output: "[[HOST_BC:.+]]"			// CHECK-NVIDIA-AMDGPU: "x86_64-unknown-linux-gnu" - "clang", inputs: ["[[INPUT:.+]]"], output: "[[HOST_BC:.+]]"
	// CHECK-NVIDIA-AMDGPU: "amdgcn-amd-amdhsa" - "clang", inputs: ["[[INPUT]]", "[[HOST_BC]]"], output: "[[AMD_BC:.+]]"			// CHECK-NVIDIA-AMDGPU: "amdgcn-amd-amdhsa" - "clang", inputs: ["[[INPUT]]", "[[HOST_BC]]"], output: "[[AMD_BC:.+]]"
	// CHECK-NVIDIA-AMDGPU: "nvptx64-nvidia-cuda" - "clang", inputs: ["[[INPUT]]", "[[HOST_BC]]"], output: "[[NVIDIA_PTX:.+]]"			// CHECK-NVIDIA-AMDGPU: "nvptx64-nvidia-cuda" - "clang", inputs: ["[[INPUT]]", "[[HOST_BC]]"], output: "[[NVIDIA_PTX:.+]]"
	// CHECK-NVIDIA-AMDGPU: "nvptx64-nvidia-cuda" - "NVPTX::Assembler", inputs: ["[[NVIDIA_PTX]]"], output: "[[NVIDIA_CUBIN:.+]]"			// CHECK-NVIDIA-AMDGPU: "nvptx64-nvidia-cuda" - "NVPTX::Assembler", inputs: ["[[NVIDIA_PTX]]"], output: "[[NVIDIA_CUBIN:.+]]"
	// CHECK-NVIDIA-AMDGPU: "x86_64-unknown-linux-gnu" - "clang", inputs: ["[[HOST_BC]]", "[[AMD_BC]]", "[[NVIDIA_CUBIN]]"], output: "[[HOST_OBJ:.+]]"			// CHECK-NVIDIA-AMDGPU: "x86_64-unknown-linux-gnu" - "Offload::Packager", inputs: ["[[AMD_BC]]", "[[NVIDIA_CUBIN]]"], output: "[[BINARY:.+]]"
				// CHECK-NVIDIA-AMDGPU: "x86_64-unknown-linux-gnu" - "clang", inputs: ["[[HOST_BC]]", "[[BINARY]]"], output: "[[HOST_OBJ:.+]]"
	// CHECK-NVIDIA-AMDGPU: "x86_64-unknown-linux-gnu" - "Offload::Linker", inputs: ["[[HOST_OBJ]]"], output: "a.out"			// CHECK-NVIDIA-AMDGPU: "x86_64-unknown-linux-gnu" - "Offload::Linker", inputs: ["[[HOST_OBJ]]"], output: "a.out"

	// RUN: %clang -### --target=x86_64-unknown-linux-gnu -ccc-print-bindings -fopenmp \			// RUN: %clang -### --target=x86_64-unknown-linux-gnu -ccc-print-bindings -fopenmp \
	// RUN: --offload-arch=sm_52 --offload-arch=sm_70 -nogpulib %s 2>&1 | FileCheck %s --check-prefix=CHECK-ARCH-BINDINGS			// RUN: --offload-arch=sm_52 --offload-arch=sm_70 -nogpulib %s 2>&1 | FileCheck %s --check-prefix=CHECK-ARCH-BINDINGS

	// CHECK-ARCH-BINDINGS: "x86_64-unknown-linux-gnu" - "clang", inputs: ["[[INPUT:.*]]"], output: "[[HOST_BC:.*]]"			// CHECK-ARCH-BINDINGS: "x86_64-unknown-linux-gnu" - "clang", inputs: ["[[INPUT:.*]]"], output: "[[HOST_BC:.*]]"
	// CHECK-ARCH-BINDINGS: "nvptx64-nvidia-cuda" - "clang", inputs: ["[[INPUT]]", "[[HOST_BC]]"], output: "[[DEVICE_BC_SM_52:.*]]"			// CHECK-ARCH-BINDINGS: "nvptx64-nvidia-cuda" - "clang", inputs: ["[[INPUT]]", "[[HOST_BC]]"], output: "[[DEVICE_BC_SM_52:.*]]"
	// CHECK-ARCH-BINDINGS: "nvptx64-nvidia-cuda" - "NVPTX::Assembler", inputs: ["[[DEVICE_BC_SM_52]]"], output: "[[DEVICE_OBJ_SM_52:.*]]"			// CHECK-ARCH-BINDINGS: "nvptx64-nvidia-cuda" - "NVPTX::Assembler", inputs: ["[[DEVICE_BC_SM_52]]"], output: "[[DEVICE_OBJ_SM_52:.*]]"
	// CHECK-ARCH-BINDINGS: "nvptx64-nvidia-cuda" - "clang", inputs: ["[[INPUT]]", "[[HOST_BC]]"], output: "[[DEVICE_BC_SM_70:.*]]"			// CHECK-ARCH-BINDINGS: "nvptx64-nvidia-cuda" - "clang", inputs: ["[[INPUT]]", "[[HOST_BC]]"], output: "[[DEVICE_BC_SM_70:.*]]"
	// CHECK-ARCH-BINDINGS: "nvptx64-nvidia-cuda" - "NVPTX::Assembler", inputs: ["[[DEVICE_BC_SM_70]]"], output: "[[DEVICE_OBJ_SM_70:.*]]"			// CHECK-ARCH-BINDINGS: "nvptx64-nvidia-cuda" - "NVPTX::Assembler", inputs: ["[[DEVICE_BC_SM_70]]"], output: "[[DEVICE_OBJ_SM_70:.*]]"
	// CHECK-ARCH-BINDINGS: "x86_64-unknown-linux-gnu" - "clang", inputs: ["[[HOST_BC]]", "[[DEVICE_OBJ_SM_52]]", "[[DEVICE_OBJ_SM_70]]"], output: "[[HOST_OBJ:.*]]"			// CHECK-ARCH-BINDINGS: "x86_64-unknown-linux-gnu" - "Offload::Packager", inputs: ["[[DEVICE_OBJ_SM_52]]", "[[DEVICE_OBJ_SM_70]]"], output: "[[BINARY:.+]]"
				// CHECK-ARCH-BINDINGS: "x86_64-unknown-linux-gnu" - "clang", inputs: ["[[HOST_BC]]", "[[BINARY]]"], output: "[[HOST_OBJ:.*]]"
	// CHECK-ARCH-BINDINGS: "x86_64-unknown-linux-gnu" - "Offload::Linker", inputs: ["[[HOST_OBJ]]"], output: "a.out"			// CHECK-ARCH-BINDINGS: "x86_64-unknown-linux-gnu" - "Offload::Linker", inputs: ["[[HOST_OBJ]]"], output: "a.out"

	// RUN: %clang -### --target=x86_64-unknown-linux-gnu -ccc-print-bindings -fopenmp \			// RUN: %clang -### --target=x86_64-unknown-linux-gnu -ccc-print-bindings -fopenmp \
	// RUN: --offload-arch=sm_70 --offload-arch=gfx908 --offload-arch=native \			// RUN: --offload-arch=sm_70 --offload-arch=gfx908 --offload-arch=native \
	// RUN: -nogpulib %s 2>&1 | FileCheck %s --check-prefix=CHECK-FAILED			// RUN: -nogpulib %s 2>&1 | FileCheck %s --check-prefix=CHECK-FAILED

	// CHECK-FAILED: error: failed to deduce triple for target architecture 'native'; specify the triple using '-fopenmp-targets' and '-Xopenmp-target' instead.			// CHECK-FAILED: error: failed to deduce triple for target architecture 'native'; specify the triple using '-fopenmp-targets' and '-Xopenmp-target' instead.

	// RUN: %clang -### --target=x86_64-unknown-linux-gnu -ccc-print-bindings -fopenmp \			// RUN: %clang -### --target=x86_64-unknown-linux-gnu -ccc-print-bindings -fopenmp \
	// RUN: --offload-arch=sm_70 --offload-arch=gfx908 -fno-openmp \			// RUN: --offload-arch=sm_70 --offload-arch=gfx908 -fno-openmp \
	// RUN: -nogpulib %s 2>&1 | FileCheck %s --check-prefix=CHECK-DISABLED			// RUN: -nogpulib %s 2>&1 | FileCheck %s --check-prefix=CHECK-DISABLED

	// CHECK-DISABLED-NOT: "nvptx64-nvidia-cuda" - "clang",			// CHECK-DISABLED-NOT: "nvptx64-nvidia-cuda" - "clang",

clang/test/Frontend/embed-object.c

	// RUN: %clang_cc1 -x c -triple x86_64-unknown-linux-gnu -emit-llvm -fembed-offload-object=%S/Inputs/empty.h,,, -o - %s | FileCheck %s			// RUN: %clang_cc1 -x c -triple x86_64-unknown-linux-gnu -emit-llvm -fembed-offload-object=%S/Inputs/empty.h -o - %s | FileCheck %s

	// CHECK: @[[OBJECT:.+]] = private constant [120 x i8] c"\10\FF\10\AD{{.*}}", section ".llvm.offloading", align 8			// CHECK: @[[OBJECT:.+]] = private constant [0 x i8] zeroinitializer, section ".llvm.offloading", align 8
				yaxunl (inline comment, not done): Is this due to the embedded object being empty? So now the bitcode for different targets is bundled by clang-offload-packager and then embedded as one file in the relocatable object file? In the old scheme, the bitcode for different targets was bundled by clang-offload-bundler and then embedded in the relocatable object file, right? What's the advantage of clang-offload-packager compared with clang-offload-bundler?
				jhuber6 (author, done):
				> Is this due to the embedded object being empty?
				Yes. We used to build the binary format in Clang itself, so we got the binary data along with the empty file. Now this flag simply embeds a file in a section; the file is empty, so we get a zeroinitializer. What's important in this test is just that the option puts the contents in the IR.
				> So now the bitcode for different targets are bundled by clang-offload-packager then embedded as one file in the relocatable object file?
				Yes, this is basically what fatbinary does for CUDA. We take all the files and put them into a single binary. The binary then contains metadata that lets us find these files later at link time.
				> In the old scheme the bitcode for different targets are bundled by clang-offload-bundler then embedded in the relocatable object file, right? What's the advantage of clang-offload-packager compared with clang-offload-bundler?
				The old clang-offload-bundler did some similar things, namely embedding multiple files into the host. It likewise used an ELF section if the target was an object file. Conceptually, this tool only creates the actual binary that's being embedded and puts it in one big blob, which then gets embedded directly in the IR. The benefit of this approach, in my mind, is that the host and device phases are more distinct: we don't need to call `clang-offload-bundler` on the host files as well. I could have worked around the current clang-offload-bundler to make it do something similar, but I didn't see the utility when I'm doing different things with a different binary format.
	// CHECK: @llvm.compiler.used = appending global [1 x ptr] [ptr @[[OBJECT]]], section "llvm.metadata"			// CHECK: @llvm.compiler.used = appending global [1 x ptr] [ptr @[[OBJECT]]], section "llvm.metadata"


	void foo(void) {}			void foo(void) {}
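
The reply above describes taking all device files and putting them into a single binary whose metadata lets each file be found again at link time. A minimal sketch of that idea follows. The length-prefixed layout, `appendEntry`, and `extractEntries` are hypothetical names invented for illustration; this is not the actual OffloadBinary format, which the patch produces through `OffloadBinary::write`.

```cpp
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Toy layout (an assumption for illustration, NOT the OffloadBinary layout):
// every entry is an 8-byte host-endian size followed by that many payload bytes.
void appendEntry(std::string &Blob, const std::string &Payload) {
  std::uint64_t Size = Payload.size();
  char Header[sizeof(Size)];
  std::memcpy(Header, &Size, sizeof(Size));
  Blob.append(Header, sizeof(Size));
  Blob.append(Payload);
}

// Walk the blob entry by entry, using each size header to find the next one.
std::vector<std::string> extractEntries(const std::string &Blob) {
  std::vector<std::string> Entries;
  std::size_t Offset = 0;
  while (Offset + sizeof(std::uint64_t) <= Blob.size()) {
    std::uint64_t Size;
    std::memcpy(&Size, Blob.data() + Offset, sizeof(Size));
    Offset += sizeof(Size);
    if (Size > Blob.size() - Offset)
      break; // truncated entry: stop instead of reading out of bounds
    Entries.emplace_back(Blob, Offset, static_cast<std::size_t>(Size));
    Offset += static_cast<std::size_t>(Size);
  }
  return Entries;
}
```

Because each entry describes its own length, several entries can be concatenated into one blob and still be separated later, which is roughly the property the packager's output relies on when it appends one serialized image after another.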

clang/test/Frontend/embed-object.ll

	; RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -emit-llvm \			; RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -emit-llvm \
	; RUN: -fembed-offload-object=%S/Inputs/empty.h,,, \			; RUN: -fembed-offload-object=%S/Inputs/empty.h \
	; RUN: -fembed-offload-object=%S/Inputs/empty.h,,, -x ir %s -o - \			; RUN: -fembed-offload-object=%S/Inputs/empty.h -x ir %s -o - \
	; RUN: | FileCheck %s -check-prefix=CHECK			; RUN: | FileCheck %s -check-prefix=CHECK

	; CHECK: @[[OBJECT_1:.+]] = private constant [120 x i8] c"\10\FF\10\AD{{.*}}\00", section ".llvm.offloading", align 8			; CHECK: @[[OBJECT_1:.+]] = private constant [0 x i8] zeroinitializer, section ".llvm.offloading", align 8
	; CHECK: @[[OBJECT_2:.+]] = private constant [120 x i8] c"\10\FF\10\AD{{.*}}\00", section ".llvm.offloading", align 8			; CHECK: @[[OBJECT_2:.+]] = private constant [0 x i8] zeroinitializer, section ".llvm.offloading", align 8
				tra (inline comment, not done): What will happen if an OpenMP file compiled this way is linked with an older version of the OpenMP runtime, which presumably expected to see extra data in `.llvm.offloading`? Will it provide a sensible error? Perhaps we should change the section name, too.
				jhuber6 (author, done): I didn't change the actual data being embedded, only the method used to embed it. Previously, this command-line option did the work of the offload binary tool. Now it just embeds the file that the tool produces. This test just makes sure that we run it and get the contents in the IR.
	; CHECK: @llvm.compiler.used = appending global [3 x ptr] [ptr @x, ptr @[[OBJECT_1]], ptr @[[OBJECT_2]]], section "llvm.metadata"			; CHECK: @llvm.compiler.used = appending global [3 x ptr] [ptr @x, ptr @[[OBJECT_1]], ptr @[[OBJECT_2]]], section "llvm.metadata"

	@x = private constant i8 1			@x = private constant i8 1
	@llvm.compiler.used = appending global [1 x ptr] [ptr @x], section "llvm.metadata"			@llvm.compiler.used = appending global [1 x ptr] [ptr @x], section "llvm.metadata"

	define i32 @foo() {			define i32 @foo() {
	ret i32 0			ret i32 0
	}			}
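
The CHECK lines above change from a 120-byte string starting with `\10\FF\10\AD` to `[0 x i8] zeroinitializer` because the flag now embeds the input file's bytes as they are, and `empty.h` has none. The sketch below illustrates how an embedded byte string might map to those two spellings. `renderInitializer` is a hypothetical helper that only approximates how textual LLVM IR spells an `i8` array initializer; it is not code from LLVM or from this patch.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: spell a byte string as an i8 array initializer,
// approximating (by assumption) the textual IR form matched by the CHECK lines.
std::string renderInitializer(const std::string &Bytes) {
  const std::string Type = "[" + std::to_string(Bytes.size()) + " x i8] ";
  // All-zero data, including the empty array, is spelled zeroinitializer.
  if (Bytes.find_first_not_of('\0') == std::string::npos)
    return Type + "zeroinitializer";

  static const char Hex[] = "0123456789ABCDEF";
  std::string Out = Type + "c\"";
  for (unsigned char C : Bytes) {
    if (C == '\\') {
      Out += "\\\\"; // a backslash is doubled
    } else if (std::isprint(C) && C != '"') {
      Out += static_cast<char>(C); // printable bytes appear literally
    } else {
      Out += '\\'; // everything else becomes a two-digit hex escape
      Out += Hex[C >> 4];
      Out += Hex[C & 0x0F];
    }
  }
  return Out + "\"";
}
```

Under these assumptions, an empty input yields the new expectation, while any input that begins with the four bytes 0x10 0xFF 0x10 0xAD yields a string starting with the escapes seen in the old expectation.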

clang/test/lit.cfg.py


	# For each occurrence of a clang tool name, replace it with the full path to			# For each occurrence of a clang tool name, replace it with the full path to
	# the build directory holding that tool. We explicitly specify the directories			# the build directory holding that tool. We explicitly specify the directories
	# to search to ensure that we get the tools just built and not some random			# to search to ensure that we get the tools just built and not some random
	# tools that might happen to be in the user's PATH.			# tools that might happen to be in the user's PATH.
	tool_dirs = [config.clang_tools_dir, config.llvm_tools_dir]			tool_dirs = [config.clang_tools_dir, config.llvm_tools_dir]

	tools = [			tools = [
	'apinotes-test', 'c-index-test', 'clang-diff', 'clang-format', 'clang-repl',			'apinotes-test', 'c-index-test', 'clang-diff', 'clang-format', 'clang-repl', 'clang-offload-packager',
	'clang-tblgen', 'clang-scan-deps', 'opt', 'llvm-ifs', 'yaml2obj', 'clang-linker-wrapper',			'clang-tblgen', 'clang-scan-deps', 'opt', 'llvm-ifs', 'yaml2obj', 'clang-linker-wrapper',
	ToolSubst('%clang_extdef_map', command=FindTool(			ToolSubst('%clang_extdef_map', command=FindTool(
	'clang-extdef-mapping'), unresolved='ignore'),			'clang-extdef-mapping'), unresolved='ignore'),
	ToolSubst('%clang_dxc', command=config.clang,			ToolSubst('%clang_dxc', command=config.clang,
	extra_args=['--driver-mode=dxc']),			extra_args=['--driver-mode=dxc']),
	]			]

	if config.clang_examples:			if config.clang_examples:

clang/tools/CMakeLists.txt

	create_subdirectory_options(CLANG TOOL)			create_subdirectory_options(CLANG TOOL)

	add_clang_subdirectory(diagtool)			add_clang_subdirectory(diagtool)
	add_clang_subdirectory(driver)			add_clang_subdirectory(driver)
	add_clang_subdirectory(apinotes-test)			add_clang_subdirectory(apinotes-test)
	add_clang_subdirectory(clang-diff)			add_clang_subdirectory(clang-diff)
	add_clang_subdirectory(clang-format)			add_clang_subdirectory(clang-format)
	add_clang_subdirectory(clang-format-vs)			add_clang_subdirectory(clang-format-vs)
	add_clang_subdirectory(clang-fuzzer)			add_clang_subdirectory(clang-fuzzer)
	add_clang_subdirectory(clang-import-test)			add_clang_subdirectory(clang-import-test)
	add_clang_subdirectory(clang-nvlink-wrapper)			add_clang_subdirectory(clang-nvlink-wrapper)
	add_clang_subdirectory(clang-linker-wrapper)			add_clang_subdirectory(clang-linker-wrapper)
				add_clang_subdirectory(clang-offload-packager)
	add_clang_subdirectory(clang-offload-bundler)			add_clang_subdirectory(clang-offload-bundler)
	add_clang_subdirectory(clang-offload-wrapper)			add_clang_subdirectory(clang-offload-wrapper)
	add_clang_subdirectory(clang-scan-deps)			add_clang_subdirectory(clang-scan-deps)
	add_clang_subdirectory(clang-repl)			add_clang_subdirectory(clang-repl)

	add_clang_subdirectory(c-index-test)			add_clang_subdirectory(c-index-test)

	add_clang_subdirectory(clang-rename)			add_clang_subdirectory(clang-rename)

clang/tools/clang-offload-packager/CMakeLists.txt

This file was added.

				set(LLVM_LINK_COMPONENTS
				  ${LLVM_TARGETS_TO_BUILD}
				  Object
				  Support)

				if(NOT CLANG_BUILT_STANDALONE)
				  set(tablegen_deps intrinsics_gen)
				endif()

				add_clang_executable(clang-offload-packager
				  ClangOffloadPackager.cpp

				  DEPENDS
				  ${tablegen_deps}
				  )

				set(CLANG_LINKER_WRAPPER_LIB_DEPS
				  clangBasic
				  )

				add_dependencies(clang clang-offload-packager)

				target_link_libraries(clang-offload-packager
				  PRIVATE
				  ${CLANG_LINKER_WRAPPER_LIB_DEPS}
				  )

				install(TARGETS clang-offload-packager RUNTIME DESTINATION bin)

clang/tools/clang-offload-packager/ClangOffloadPackager.cpp

This file was added.

				//===-- clang-offload-packager/ClangOffloadPackager.cpp - file bundler ---===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===---------------------------------------------------------------------===//
				//
				// This tool takes several device object files and bundles them into a single
				// binary image using a custom binary format. This is intended to be used to
				// embed many device files into an application to create a fat binary.
				//
				//===---------------------------------------------------------------------===//

				#include "clang/Basic/Version.h"

				#include "llvm/Object/Binary.h"
				#include "llvm/Object/ObjectFile.h"
				#include "llvm/Object/OffloadBinary.h"
				#include "llvm/Support/CommandLine.h"
				#include "llvm/Support/FileOutputBuffer.h"
				#include "llvm/Support/MemoryBuffer.h"
				#include "llvm/Support/Path.h"
				#include "llvm/Support/Signals.h"
				#include "llvm/Support/WithColor.h"

				using namespace llvm;
				using namespace llvm::object;

				static cl::opt<bool> Help("h", cl::desc("Alias for -help"), cl::Hidden);

				static cl::OptionCategory
				    ClangOffloadPackagerCategory("clang-offload-packager options");

				static cl::opt<std::string> OutputFile("o", cl::Required,
				                                       cl::desc("Write output to <file>."),
				                                       cl::value_desc("file"),
				                                       cl::cat(ClangOffloadPackagerCategory));

				static cl::list<std::string>
				    DeviceImages("image", cl::ZeroOrMore,
				                 cl::desc("List of key and value arguments. Required keywords "
				                          "are 'file' and 'triple'."),
				                 cl::value_desc("<key>=<value>,..."),
				                 cl::cat(ClangOffloadPackagerCategory));

				static void PrintVersion(raw_ostream &OS) {
				  OS << clang::getClangToolFullVersion("clang-offload-packager") << '\n';
				}

				int main(int argc, const char **argv) {
				  sys::PrintStackTraceOnErrorSignal(argv[0]);
				  cl::HideUnrelatedOptions(ClangOffloadPackagerCategory);
				  cl::SetVersionPrinter(PrintVersion);
				  cl::ParseCommandLineOptions(
				      argc, argv,
				      "A utility for bundling several object files into a single binary.\n"
				      "The output binary can then be embedded into the host section table\n"
				      "to create a fatbinary containing offloading code.\n");

				  if (Help) {
				    cl::PrintHelpMessage();
				    return EXIT_SUCCESS;
				  }

				  auto reportError = [argv](Error E) {
				    logAllUnhandledErrors(std::move(E), WithColor::error(errs(), argv[0]));
				    return EXIT_FAILURE;
				  };

				  SmallVector<char, 1024> BinaryData;
				  raw_svector_ostream OS(BinaryData);
				  for (StringRef Image : DeviceImages) {
				    StringMap<StringRef> Args;
				    for (StringRef Arg : llvm::split(Image, ","))
				      Args.insert(Arg.split("="));

				    if (!Args.count("triple") || !Args.count("file"))
				      return reportError(createStringError(
				          inconvertibleErrorCode(),
				          "'file' and 'triple' are required image arguments"));

				    OffloadBinary::OffloadingImage ImageBinary{};
				    std::unique_ptr<llvm::MemoryBuffer> DeviceImage;
				    for (const auto &KeyAndValue : Args) {
				      StringRef Key = KeyAndValue.getKey();
				      if (Key == "file") {
				        llvm::ErrorOr<std::unique_ptr<llvm::MemoryBuffer>> ObjectOrErr =
				            llvm::MemoryBuffer::getFileOrSTDIN(KeyAndValue.getValue());
				        if (std::error_code EC = ObjectOrErr.getError())
				          return reportError(errorCodeToError(EC));
				        DeviceImage = std::move(*ObjectOrErr);
				        ImageBinary.Image = *DeviceImage;
				        ImageBinary.TheImageKind = getImageKind(
				            sys::path::extension(KeyAndValue.getValue()).drop_front());
				      } else if (Key == "kind") {
				        ImageBinary.TheOffloadKind = getOffloadKind(KeyAndValue.getValue());
				      } else {
				        ImageBinary.StringData[Key] = KeyAndValue.getValue();
				      }
				    }
				    std::unique_ptr<MemoryBuffer> Buffer = OffloadBinary::write(ImageBinary);
				    OS << Buffer->getBuffer();
				  }

				  Expected<std::unique_ptr<FileOutputBuffer>> OutputOrErr =
				      FileOutputBuffer::create(OutputFile, BinaryData.size());
				  if (!OutputOrErr)
				    return reportError(OutputOrErr.takeError());
				  std::unique_ptr<FileOutputBuffer> Output = std::move(*OutputOrErr);
				  std::copy(BinaryData.begin(), BinaryData.end(), Output->getBufferStart());
				  if (Error E = Output->commit())
				    return reportError(std::move(E));
				}
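
The loop in `main` above splits each `image` argument on commas, splits every item at its first `=`, and rejects the argument unless both `file` and `triple` are present. A standalone sketch of that parsing step, using only the standard library, might look as follows. `parseImageArg` and `hasRequiredKeys` are hypothetical helper names rather than part of the patch, and edge cases such as empty items may behave differently from `llvm::split`.

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper (not part of the patch): turn "k1=v1,k2=v2" into a map.
// Roughly mirrors the split-on-',' then split-at-first-'=' loop in main().
std::map<std::string, std::string> parseImageArg(const std::string &Image) {
  std::map<std::string, std::string> Args;
  std::istringstream Stream(Image);
  std::string Item;
  while (std::getline(Stream, Item, ',')) {
    const std::string::size_type Eq = Item.find('=');
    std::string Key = Item.substr(0, Eq);
    // An item without '=' gets an empty value, like StringRef::split.
    std::string Value = Eq == std::string::npos ? "" : Item.substr(Eq + 1);
    // emplace keeps the first value seen for a key, like StringMap::insert.
    Args.emplace(std::move(Key), std::move(Value));
  }
  return Args;
}

// The tool requires both 'file' and 'triple' for every image argument.
bool hasRequiredKeys(const std::map<std::string, std::string> &Args) {
  return Args.count("file") != 0 && Args.count("triple") != 0;
}
```

For example, parsing the made-up argument `file=a.o,triple=nvptx64-nvidia-cuda,kind=openmp` gives three entries and passes the required-key check, whereas `file=a.o` alone fails it, matching the error path in the tool.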

Revision Contents

Diff 428645
