This patch adds the file type "a", which together with the --unbundle flag, unbundles an archive of bundled object files into an archive containing the device-specific code.
Example:
clang-offload-bundler --unbundle --inputs=libMyLib.a -type=a -outputs=libMyDeviceLib.a -targets=openmp-amdgcn-amdhsa-gfx906
Details
Diff Detail
Event Timeline
clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp | ||
---|---|---|
965 | Need to add comments about what is in the input file and by what criteria the files are extracted and what format is the output file, does not contain index, etc. |
clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp | ||
---|---|---|
927–931 | Does the K_GNU case cover all other OSes? What about Windows? | |
933–938 | Can we add a default case which will return a generic extension (or exit with an "unsupported device" error)? Something like this: if (Device.contains("gfx")) return ".bc"; else if (Device.contains("nvidia-cuda")) return ".cubin"; // Maybe also print a warning here return ".o"; It will help identify things that need to be patched when a new architecture gets support in clang. |
clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp | ||
---|---|---|
927–931 | From what I could find out the PE/COFF variant is the same as K_GNU, so it should work. | |
933–938 | Yes, I think that would be fine. I was also considering adding a new flag to specify the file extension instead of having this function. We may want to keep this kind of logic in the driver. |
Made matching for ".cubin" exension explicit, and created a fallback to ".o" with a warning.
clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp | ||
---|---|---|
933–938 | I opted for now to keep this function and explicitly match for .cubin extension and have a fallback to .o with a warning. |
The patch looks good now. I don't have any other comments. We can revisit the driver flag proposal for setting the file extension in the future.
clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp | ||
---|---|---|
939–940 | I think a space is missing between archive and members. |
clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp | ||
---|---|---|
132–140 | The use of Triple is pretty confusing in this tool, it is actually more than a triple. It is a pair of a llvm::Triple string and a device e.g. "gfx-906" or "sm_61" separated by a "-". I kept the convention of using Triple. |
clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp | ||
---|---|---|
132–140 | I argued that somewhere else before. Are we really sure it makes sense to somehow glue together two things so we don't have to add something new? We have a triple in the module, and if we need a module wide cpu/device identifier, let's add one? I mean, you basically gave the perfect argument of not doing it the way it is implemented: "The use of Triple is pretty confusing in this tool, it is actually more than a triple." |
clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp | ||
---|---|---|
132–140 | Having a target cpu in the module is probably a good idea, but it wouldn't directly solve this unless the llvm::Triple was modified to include the cpu/device. Together with the "kind" it is used to name the ELF sections for the bundling, so it is really 3 things put together that makes up the whole identifier. |
Fixed string memory leak, replaced if-then-else with conditional expression. Added check in case there is no device code in an archive child, and we can skip processing it.
clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp | ||
---|---|---|
132–140 | Remind me, what is "kind" in this context. |
clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp | ||
---|---|---|
132–140 | It is the offload kind, which is currently "host", "cuda", "hip" or "openmp". |
There's already some outstanding comments - maybe we leave this until you're back from vacation?
clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp | ||
---|---|---|
132–140 | To summarize, the target is something like "openmp-amdgcn-amd-amdhsa-gfx906" and getDevice is being used to get the device (gfx906). So, it is a combination of "<kind>-<llvm-target>-<device>". | |
1063 | Done in D93525 | |
1105–1106 | For our use case, getting a single device specific archive from this unbundle flag option is sufficient. Can you give any use case(s) where we need to generate multiple device specific archives in one run of the tool? |
clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp | ||
---|---|---|
1105–1106 | I assume in general case unbundling for archives will be done in a similar way as for object files and for objects unbundling is currently done into multiple outputs in one run. There are many example of such use cases for object files in clang/test/Driver/openmp-offload.c. So, it seems natural to have the same functionality for archives as well. |
clang/tools/clang-offload-bundler/ClangOffloadBundler.cpp | ||
---|---|---|
1105–1106 | We can probably reuse code for unbundling archives from https://github.com/intel/llvm branch. This implementation does not have one target/output limitation and it is based on the FileHandler interface as the rest of the file type handlers. I have prepared a draft for llvm.org which is based on the intel/llvm code - https://reviews.llvm.org/D94005 |
Thank you, @sdmitriev. I have updated the patch to handle multiple targets/outputs in one run of the tool (https://reviews.llvm.org/D93525?id=314633). Please have a look.
Better to use Triple class for parsing triple rather than invent something new.