This is an archive of the discontinued LLVM Phabricator instance.

[HIP] Fix unbundling archive
ClosedPublic

Authored by yaxunl on Sep 12 2022, 9:06 AM.

Details

Summary

HIP is able to unbundle archive of bundled bitcode.
However currently there are two bugs:

  1. archives passed by -l: are not unbundled.
  2. archives passed as input files are not unbundled

The actual file name of an archive passed by -l: should
not be prefixed with lib and appended with '.a',
but the file path is prefixed with paths in '-L' options.

The actual file name of an archive passed as an input file
stays the same, not affected by the '-L' options.

Diff Detail

Event Timeline

yaxunl created this revision.Sep 12 2022, 9:06 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 12 2022, 9:06 AM
yaxunl requested review of this revision.Sep 12 2022, 9:06 AM
yaxunl updated this revision to Diff 459482.Sep 12 2022, 9:09 AM

sorry. update with the correct patch.

yaxunl updated this revision to Diff 459484.Sep 12 2022, 9:14 AM

remove debug output

scchan requested changes to this revision.Sep 12 2022, 9:39 AM
scchan added a subscriber: scchan.
scchan added inline comments.
clang/lib/Driver/ToolChains/CommonArgs.cpp
1841

The AOBFileNames small vector needs to be cleared if !FoundAOB or just move its declaration into the LibraryPaths loop.

This revision now requires changes to proceed.Sep 12 2022, 9:39 AM
yaxunl marked an inline comment as done.Sep 12 2022, 10:08 AM
yaxunl added inline comments.
clang/lib/Driver/ToolChains/CommonArgs.cpp
1841

The AOBFileNames small vector needs to be cleared if !FoundAOB or just move its declaration into the LibraryPaths loop.

will move the decl into LibraryPaths loop.

yaxunl updated this revision to Diff 459502.Sep 12 2022, 10:09 AM
yaxunl marked an inline comment as done.

revised by Siu Chi's comments

scchan accepted this revision.Sep 12 2022, 10:57 AM

LGTM

This revision is now accepted and ready to land.Sep 12 2022, 10:57 AM
tra added a comment.Sep 12 2022, 11:57 AM

Archives passed by -l: should not be prefixed with
prefix lib and appended with '.a', but still need to be prefixed with
paths in -L options.
Archives passed as input files should not be prefixed
or appended with anything.

I'm not sure I understand the requirements. WDYM by "archives passed as input files should not be prefixed or appended by anything" ? E.g. if i do clang -o foo foo.o mylib.a, is mylib.a prefixed with something or has something appended?

It may help if you could rephrase it in terms of what *does* the code expect. E.g. 'archives with names lib*.a passed as direct inputs are not unbundled. All others are treated as potentially containing device code and are unbundled during [linking?]'.

Also, using lib*.a as pattern to tell device libraries from the host-ony one will be insufficient. There will be libraries with other extensions (e.g. .lo is fairly common) and there will be libraries that do not have lib prefix. I.e. nothing stops me from building mylib.a and linking with it as in the example above.

Archives passed by -l: should not be prefixed with
prefix lib and appended with '.a', but still need to be prefixed with
paths in -L options.
Archives passed as input files should not be prefixed
or appended with anything.

I'm not sure I understand the requirements. WDYM by "archives passed as input files should not be prefixed or appended by anything" ? E.g. if i do clang -o foo foo.o mylib.a, is mylib.a prefixed with something or has something appended?

It may help if you could rephrase it in terms of what *does* the code expect. E.g. 'archives with names lib*.a passed as direct inputs are not unbundled. All others are treated as potentially containing device code and are unbundled during [linking?]'.

It means "the file name is not prefixed with 'lib' and appended with '.a', and the file path is not prefixed with values from '-L' options. This is in contrast to archive names passed through '-l' options. The directly passed archives are still unbundled. I can update the description.

Also, using lib*.a as pattern to tell device libraries from the host-ony one will be insufficient. There will be libraries with other extensions (e.g. .lo is fairly common) and there will be libraries that do not have lib prefix. I.e. nothing stops me from building mylib.a and linking with it as in the example above.

For archives passed as input files, we use the extension ('.a' on Linux and '.lib' on MSVC) to identify them. We do not restrict the file names to 'lib*.a', but they need to be '*.a' or '*.lib' to be identified as archives.

For arbitrary file names to be identified as archives, users can pass them by '-l:', e.g. '-l:file.bin', then clang will treat 'file.bin' as an archive.

yaxunl edited the summary of this revision. (Show Details)Sep 12 2022, 8:19 PM
tra added a comment.Sep 15 2022, 4:24 PM

Also, using lib*.a as pattern to tell device libraries from the host-ony one will be insufficient. There will be libraries with other extensions (e.g. .lo is fairly common) and there will be libraries that do not have lib prefix. I.e. nothing stops me from building mylib.a and linking with it as in the example above.

For archives passed as input files, we use the extension ('.a' on Linux and '.lib' on MSVC) to identify them. We do not restrict the file names to 'lib*.a', but they need to be '*.a' or '*.lib' to be identified as archives.

As I've pointed above that will create issues.

For arbitrary file names to be identified as archives, users can pass them by '-l:', e.g. '-l:file.bin', then clang will treat 'file.bin' as an archive.

It's not always possible. Linking with dependencies is often handled by the build tools that the end user may not control. E.g. TF is built w/ bazel which handles virtually all linking under the hood with almost no user controls over the names/extensions it assigns to files or how they are passed to the linker.

@MaskRay - WDYT?

MaskRay added a comment.EditedSep 15 2022, 5:32 PM

I know very little about HIP, but I am concerned with relying on extensions as well. For example, I've seen libc++.a.1 (we use this for the real archive while libc++.a is a linker script) and .la (libtool).
A .so file is sometimes a linker script and it may reference an archive file. For example glibc libc.so typically does something like GROUP ( /lib/x86_64-linux-gnu/libc.so.6 /usr/lib/x86_64-linux-gnu/libc_nonshared.a AS_NEEDED ( /lib64/ld-linux-x86-64.so.2 ) )

An ELF linker doesn't really care what extensions are used for what kind of input (archive,relocatable file,shared object). .a and .so are only used for -lfoo lookup.

Also, using lib*.a as pattern to tell device libraries from the host-ony one will be insufficient. There will be libraries with other extensions (e.g. .lo is fairly common) and there will be libraries that do not have lib prefix. I.e. nothing stops me from building mylib.a and linking with it as in the example above.

For archives passed as input files, we use the extension ('.a' on Linux and '.lib' on MSVC) to identify them. We do not restrict the file names to 'lib*.a', but they need to be '*.a' or '*.lib' to be identified as archives.

As I've pointed above that will create issues.

For arbitrary file names to be identified as archives, users can pass them by '-l:', e.g. '-l:file.bin', then clang will treat 'file.bin' as an archive.

It's not always possible. Linking with dependencies is often handled by the build tools that the end user may not control. E.g. TF is built w/ bazel which handles virtually all linking under the hood with almost no user controls over the names/extensions it assigns to files or how they are passed to the linker.

@MaskRay - WDYT?

A side note: this patch does not change clang's original behavior about how to link archives in host compilation. It only changes how clang determines what files to be unbundled as archives. The current patch only treats '*.a' or '*.lib' input files as archives and tries to unbundle them.

I think the root issue is that clang lacks a way to mark an arbitrary file name as an archive file. One solution might be to introduce a new file type 'Archive' and a '-x' value 'archive' to represent that. However, there are some complications with how currently clang treated '.lib' files. This makes me hesitate to introduce such changes. Since such unavoidably will affect how clang links archives for host compilation. My current patch does not change that. It has a limitation, but that limitation only affects HIP and I think it is acceptable.

I know very little about HIP, but I am concerned with relying on extensions as well. For example, I've seen libc++.a.1 (we use this for the real archive while libc++.a is a linker script) and .la (libtool).
A .so file is sometimes a linker script and it may reference an archive file. For example glibc libc.so typically does something like GROUP ( /lib/x86_64-linux-gnu/libc.so.6 /usr/lib/x86_64-linux-gnu/libc_nonshared.a AS_NEEDED ( /lib64/ld-linux-x86-64.so.2 ) )

An ELF linker doesn't really care what extensions are used for what kind of input (archive,relocatable file,shared object). .a and .so are only used for -lfoo lookup.

For HIP, we only need to unbundle archive files. We always try to unbundle files passed by -l options. If it is not archive files, it is OK, since the unbundled will generate an empty file. For input files, the current patch only tries to unbundle '*.a' or '*.lib' files. If it is a concern that some archive files not with extension '*.a' or '*.lib' are missed, one solution might be to try to unbundle any files classified as 'Nothing' (which is how clang classifies archive files).

yaxunl updated this revision to Diff 462189.Sep 22 2022, 9:04 AM

allow archive files to have unknown extension

tra added a comment.Sep 22 2022, 12:14 PM

It's still not quite clear to me what the patch is intended to do. The description describes the existing issues (we don't unbundle some libraries), but is not clear what we do want to have in the end.

Are all library arguments expected to be unbundled? If we do not want to unbundle some libraries, then which arguments would lead to that outcome. The tests only seem to verify the cases where unbundling does happen. It would be great to test the cases when it should not. That is, assuming that there are such cases -- the code seems to

clang/lib/Driver/Driver.cpp
2907–2908

This is confusing. I do not understand how/why therefore should not be unbundled is inferred from they are actually archives.
The patch description says that not unbundling the archives is that the patch is intended to fix. The tests below with MSVC label show that we do unbundle the inputs with .lib extension.

Should this comment be fixed/updated? What do I miss?

clang/lib/Driver/ToolChains/CommonArgs.cpp
1829–1839

Nit: I'd restructure it a bit and move library file name construction out of the loop. Makes it a bit easier to see what's going on.

auto LibFile = Lib.startswith(":") ? Lib.drop_front() :  
                         IsMSVC ? Lib+Ext : "lib" + Lib + Ext;
for (auto Prefix : {"/libdevice/", "/"})
    AOBFileNames.push_back( Twine(LPath + Prefix + LibFile).str());

You may also skip AOBFileNames altogether and just check for the file existence directly within that for loop.

1834

Is it intentional that we're not adding lib prefix to library names passed with -l?

clang/test/Driver/hip-link-bundle-archive.hip
63

Just curious. Does GNU have some meaning in the check label?
GNU[12] seem to deal with unbundling while GNU-L* seem to be regular host libraries. It would be useful to use somewhat more descriptive labels. E.g. HOST*/OFFLOAD* ?

yaxunl marked 4 inline comments as done.Sep 22 2022, 7:03 PM
yaxunl added inline comments.
clang/lib/Driver/Driver.cpp
2907–2908

Here the driver is trying to unbundle bundled objects. This is different from unbundle archives. clang classifies '.lib' files as objects. If we do not exclude '.lib' files here, they will be incorrectly unbundled as objects here. The unbundling of '.lib' as archives should be done at other places.

Since this comment is confusing, I will change it to:

'therefore should not be unbundled as objects here. They will be handled at other places.'

clang/lib/Driver/ToolChains/CommonArgs.cpp
1829–1839

will do

1834

This is for MSVC, which assumes -labc corresponds to the lib file name 'abc.lib'.

clang/test/Driver/hip-link-bundle-archive.hip
63

Here 'GNU' is versus 'MSVC' regarding the library name ('.a' vs '.lib').

'GNU-L' is for '-l'. 'GNU-LA' is for '-l:*.a'. 'GNU-A' is for '*.a'.

yaxunl updated this revision to Diff 462376.Sep 22 2022, 7:05 PM
yaxunl marked 4 inline comments as done.

revised by Artem's comments

tra accepted this revision.Sep 23 2022, 10:49 AM
tra added inline comments.
clang/lib/Driver/Driver.cpp
2907–2908

Thank you for the explanation. It makes more sense to me now.

yaxunl updated this revision to Diff 462673.Sep 24 2022, 7:38 AM

I just found clang-offload-bundler reports an error when trying to unbundle an archive but the input file is not an archive.

This update let clang-offload-bundler to extract empty archives when the input file is not an archive and -allow-missing-bundles is specified.

tra added inline comments.Sep 26 2022, 11:52 AM
clang/test/Driver/clang-offload-bundler.c
410–412

Is there a distinction between "an archive, but has no offloading data in it" and "not an archive, at all"?
I can see how we may want the former case work with --allow-missing-bundles, but would we/should we report the error for the latter? I think we should, if we don't.

yaxunl marked an inline comment as done.Sep 26 2022, 1:50 PM
yaxunl added inline comments.
clang/test/Driver/clang-offload-bundler.c
410–412

I will let clang-offload-bundler keep the original behavior about reporting an error for the "not an archive" case.

I will let clang check the magic bytes of the input file so that it won't pass non-archive files to clang-offload-bundler. This also makes the compilation stages cleaner since there will not be spurious unbundlings.

yaxunl updated this revision to Diff 463016.Sep 26 2022, 2:03 PM
yaxunl marked an inline comment as done.

check file magic and only unbundle real archives

tra accepted this revision.Sep 26 2022, 2:44 PM

LGTM.