[clang-offload-bundler] Add unbundling of archives containing bundled object files into device specific archives
Needs Review · Public

Authored by saiislam on Dec 18 2020, 1:51 AM.

Details

Summary

This patch adds support for unbundling an archive file. The tool takes
an archive file and a set of offload targets as input, and produces one
device-specific archive for each given offload target. The input archive
contains code objects bundled using clang-offload-bundler. Each
generated device-specific archive contains a set of device code object
files named <Parent Bundle Name>-<CodeObject-TargetID>.

Targets can be specified with or without a TargetID.

Entries in the input archive can be of any binary type supported by
clang-offload-bundler, such as *.o, *.bc, etc. Output archives
contain files of the same type.

Example Usage:

clang-offload-bundler --unbundle --inputs=lib-generic.a -type=a -targets=openmp-amdgcn-amdhsa-gfx906,openmp-amdgcn-amdhsa-gfx908
      -outputs=devicelib-gfx906.a,deviceLib-gfx908.a
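
The relationship between the -targets and -outputs lists in the example above, and the member naming scheme from the summary, can be sketched as follows. This is a minimal illustration, not code from the patch: splitCommaList, pairTargetsWithOutputs and deviceMemberName are hypothetical helpers, and the positional pairing (the i-th output archive receives the code objects of the i-th target) is an assumption about how the two lists correspond.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Split a comma-separated option value, such as the value of -targets.
std::vector<std::string> splitCommaList(const std::string &Value) {
  std::vector<std::string> Parts;
  std::string::size_type Begin = 0;
  while (true) {
    std::string::size_type End = Value.find(',', Begin);
    if (End == std::string::npos) {
      Parts.push_back(Value.substr(Begin));
      break;
    }
    Parts.push_back(Value.substr(Begin, End - Begin));
    Begin = End + 1;
  }
  return Parts;
}

// Pair the i-th target with the i-th output archive (assumed positional rule).
std::vector<std::pair<std::string, std::string>>
pairTargetsWithOutputs(const std::string &Targets, const std::string &Outputs) {
  std::vector<std::string> T = splitCommaList(Targets);
  std::vector<std::string> O = splitCommaList(Outputs);
  if (T.size() != O.size())
    throw std::invalid_argument("need exactly one output archive per target");
  std::vector<std::pair<std::string, std::string>> Pairs;
  for (std::size_t I = 0; I < T.size(); ++I)
    Pairs.emplace_back(T[I], O[I]);
  return Pairs;
}

// Member name following the summary: <Parent Bundle Name>-<CodeObject-TargetID>.
std::string deviceMemberName(const std::string &ParentBundle,
                             const std::string &TargetID) {
  assert(!ParentBundle.empty() && "a member needs a parent bundle name");
  return ParentBundle + "-" + TargetID;
}
```

With the command above, this pairing would send gfx906 code objects to devicelib-gfx906.a and gfx908 code objects to deviceLib-gfx908.a.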

Event Timeline

saiislam requested review of this revision. Dec 18 2020, 1:51 AM
saiislam created this revision.
Herald added a project: Restricted Project. Dec 18 2020, 1:51 AM
Herald added a subscriber: cfe-commits.
saiislam edited the summary of this revision. Dec 18 2020, 1:54 AM
ABataev added inline comments. Dec 18 2020, 8:49 AM
clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp
170–172

No need for else here

1064–1068

No need for else here

1071–1073

I think llvm Support lib has all required functions for this.

1111

Do not use auto where the type is not obvious.

1159–1160

Just continue and make else if just if

saiislam updated this revision to Diff 314633. Jan 5 2021, 8:42 AM

Modified to handle multiple targets/outputs in one run of the tool for archive unbundling. Other minor changes as requested in the review.

saiislam marked 3 inline comments as done. Jan 5 2021, 8:47 AM
saiislam added inline comments.
clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp
1159–1160

This wasn't possible with the current code flow. There is stuff to be processed in case of failure as well.

saiislam edited the summary of this revision. Jan 5 2021, 8:49 AM
saiislam added reviewers: yaxunl, t-tye.

Can you document this in ClangOffloadBundler.rst? I think we need a clear description of how clang-offload-bundler knows which file in the .a file belongs to which target.

How does the .a relate to bundled code objects? Does the .a have a number of bundled code objects? If so wouldn't the identity of code objects be defined by the existing bundled code object ABI already documented? If the .a is a set of non-bundled code objects then defining how they are identified is not part of the clang-offload-bundler documentation as there are no bundled code objects involved. It would seem that the documentation belongs with the OpenMP runtime/compiler that is choosing to use .a files in this manner.

Bundles (created using clang-offload-bundler) are passed to llvm-ar to create an archive of bundled objects (*.a file). An archive can have bundles for multiple device types. So, yes, the identity of code objects is defined by the existing bundled code object ABI.
This patch reads such an archive and produces a device-specific archive for each of the target devices given as input. Each device-specific archive contains all the code objects corresponding to that particular device and is written in the LLVM archive format.

Here is a snippet of relevant lit run lines:

// RUN: %clang -O0 -target %itanium_abi_triple %s -c -o %t.o

// RUN: echo 'Content of device file 1' > %t.tgt1
// RUN: clang-offload-bundler -type=o -targets=host-%itanium_abi_triple,openmp-amdgcn-amd-amdhsa-gfx900 -inputs=%t.o,%t.tgt1 -outputs=%t.abundle1.o
 
// RUN: echo 'Content of device file 2' > %t.tgt2
// RUN: clang-offload-bundler -type=o -targets=host-%itanium_abi_triple,openmp-amdgcn-amd-amdhsa-gfx900 -inputs=%t.o,%t.tgt2 -outputs=%t.abundle2.o
 
// RUN: llvm-ar cr %t.lib.a %t.abundle1.o %t.abundle2.o

This patch ==>
// RUN: clang-offload-bundler -unbundle -type=a -targets=openmp-amdgcn-amd-amdhsa-gfx900 -inputs=%t.lib.a -outputs=%t.devicelib.a

%t.devicelib.a will contain all device objects corresponding to gfx900.

Though my interest originates from the OpenMP side, device-specific archive libraries created like this can be used by other offloading languages like HIP, CUDA, and OpenCL. Please refer to D81109 for an earlier patch in the series of patches which will enable this.

t-tye added a comment. Jan 13 2021, 8:07 AM

The naming of code objects in a bundled code object includes the processor name and the settings for target features (see https://clang.llvm.org/docs/ClangOffloadBundler.html#target-id and https://llvm.org/docs/AMDGPUUsage.html#target-id). The compatibility of code objects considers both target processor matching and target feature compatibility. Target features can have three settings: on, off and any. The compatibility rule is that each feature that is on/off must match exactly, while any matches either on or off.

So when unbundling an archive, how is the desired code object being requested? How is it handling the target features? For example, if code objects that are compatible with a feature being on are required, then the matching code objects in the archive would be those that have that feature set to either on or any.
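
The compatibility rule described in this comment can be sketched as below. This is only an illustration under stated assumptions, not the bundler's or any runtime's implementation: TargetID, parseTargetID and isCompatible are hypothetical names, the parser handles only the processor[:feature+|-]... part of an entry ID (for example gfx908:sramecc+:xnack-), and a feature that is not listed on either side is treated as "any". An actual runtime may apply a stricter, one-directional check.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical model of a target ID: a processor plus explicit feature
// settings. A feature absent from the map means "any".
struct TargetID {
  std::string Processor;
  std::map<std::string, bool> Features; // true = on ('+'), false = off ('-')
};

// Parse text such as "gfx908:sramecc+:xnack-".
TargetID parseTargetID(const std::string &Text) {
  TargetID ID;
  std::istringstream Stream(Text);
  std::getline(Stream, ID.Processor, ':');
  std::string Part;
  while (std::getline(Stream, Part, ':')) {
    if (Part.size() < 2 || (Part.back() != '+' && Part.back() != '-'))
      throw std::invalid_argument("malformed feature setting: " + Part);
    ID.Features[Part.substr(0, Part.size() - 1)] = (Part.back() == '+');
  }
  return ID;
}

// Processors must be equal. A feature that is explicitly on/off on both
// sides must have the same setting; "any" on either side matches.
bool isCompatible(const TargetID &Requested, const TargetID &Available) {
  if (Requested.Processor != Available.Processor)
    return false;
  for (const auto &Feature : Requested.Features) {
    auto It = Available.Features.find(Feature.first);
    if (It != Available.Features.end() && It->second != Feature.second)
      return false;
  }
  return true;
}
```

Under this sketch, a request for gfx908:xnack+ accepts code objects built for gfx908:xnack+ or for plain gfx908, and rejects gfx908:xnack-.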

At the moment this patch defines compatibility as an exact string match of the bundler entry ID. So it doesn't fully support the target ID concept.
Supporting target ID requires a little more work and discussion. However, the following example works:

// RUN: clang-offload-bundler -type=o -targets=host-%itanium_abi_triple,openmp-amdgcn-amd-amdhsa--gfx908 -inputs=%t.o,%t.tgt1 -outputs=%t.abundle1.o
// RUN: clang-offload-bundler -type=o -targets=host-%itanium_abi_triple,openmp-amdgcn-amd-amdhsa--gfx908:sramecc+:xnack+,openmp-amdgcn-amd-amdhsa--gfx908:sramecc-:xnack+ -inputs=%t.o,%t.tgt1,%t.tgt2 -outputs=%t.targetIDbundle.o
// RUN: llvm-ar cr %t.targetIDlib.a %t.abundle1.o %t.targetIDbundle.o
// RUN: clang-offload-bundler -unbundle -type=a -targets=openmp-amdgcn-amd-amdhsa--gfx908:sramecc+:xnack+ -inputs=%t.targetIDlib.a -outputs=%t.devicelibt-sramecc+.a
// RUN: llvm-ar t %t.devicelibt-sramecc+.a | FileCheck %s -check-prefix=SRAMECCplus
// SRAMECCplus: targetIDbundle.bc
// SRAMECCplus-NOT: abundle1.bc
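
The exact-string-match behaviour exercised by this test can be modeled roughly as follows. This is a simplified in-memory sketch, not the patch's code: Bundle and selectByExactID are hypothetical names, and an archive is represented as a list of bundles, each mapping entry IDs to code object contents.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical model of one archive member: a bundle that maps bundler
// entry IDs to the code object stored under that ID.
struct Bundle {
  std::string Name;
  std::map<std::string, std::string> Entries;
};

// Return the names of all bundles that hold an entry whose ID is exactly
// equal to Target. No target ID feature matching is attempted.
std::vector<std::string> selectByExactID(const std::vector<Bundle> &Archive,
                                         const std::string &Target) {
  std::vector<std::string> Selected;
  for (const Bundle &B : Archive)
    if (B.Entries.count(Target) != 0)
      Selected.push_back(B.Name);
  return Selected;
}
```

In this model, requesting openmp-amdgcn-amd-amdhsa--gfx908:sramecc+:xnack+ selects only the bundle that carries that exact entry ID, which corresponds to the SRAMECCplus and SRAMECCplus-NOT checks above. A bundle whose entry is plain gfx908 is skipped even though the target ID rules would treat it as compatible.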

At the moment this patch defines compatibility as an exact string match of the bundler entry ID.
[...]
Supporting target ID requires a little more work and discussion.

Let's get this in first, then revisit target ID support as we need it.

t-tye requested changes to this revision. Jan 20 2021, 8:16 AM

I do not think this patch should ignore target ID as that is now upstreamed and documented. What is involved in making the compatibility test follow the target ID rules? There are examples of doing this in all the runtimes and I can help if that is useful.

This revision now requires changes to proceed. Jan 20 2021, 8:16 AM

First, there is no reason not to have multiple patches as long as they are self-contained and testable. Arguably, smaller patches are better.

That said, target ID is a new feature and, as discussed in the OpenMP call today, there is a chance we have to revisit this to support more involved information. As this discussion is open ended (and hasn't started yet), it seems absolutely sensible to continue with a tested and working patch that provides features we need for sure instead of forcing some support of a feature we don't use right now anyway.

saiislam updated this revision to Diff 322724. Wed, Feb 10, 9:58 AM

Added support for optional TargetID during unbundling of archives.

saiislam retitled this revision from "[OpenMP] Add unbundling of archives containing bundled object files into device specific archives" to "[clang-offload-bundler] Add unbundling of archives containing bundled object files into device specific archives". Wed, Feb 10, 10:00 AM
saiislam edited the summary of this revision.