This is an archive of the discontinued LLVM Phabricator instance.

--thinlto-full-index to index both native and bitcode objects
Needs ReviewPublic

Authored by weiwang on Aug 2 2022, 2:45 AM.

Details

Reviewers
MaskRay
tejohnson
Summary

Add new option --thinlto-full-index to work together with --thinlto-index-only= to produce an extended index list including both prebuilt native and bitcode objects.

For file name thinlto.index given to --thinlto-index-only=, the new option writes the extended list to thinlto.index.full. Then distributed backend can use thinlto.index to guide its compilation and use thinlto.index.full can be fed into final link as the full input list with the intended resolution order.

Currently --thinlto-index-only appends to the list for non-lazy bitcode object file as they are encountered and lazy bitcode object files when they are parsed. It basically emits the symbol resolution order for those files. However when user inputs consist of both prebuilt native objects and bitcode objects, the list is incomplete and it is hard to achieve the same symbol resolution order during final link to produce the same result as the inputs were going through local ThinLTO.

--thinlto-full-index appends to the list for non-lazy object files, both native and bitcode, as they are encountered and lazy object files, both native and bitcode, when they are parsed.

Diff Detail

Event Timeline

weiwang created this revision.Aug 2 2022, 2:45 AM
Herald added a project: Restricted Project. · View Herald Transcript
Herald added subscribers: ormris, hoy, wenlei and 6 others. · View Herald Transcript
weiwang requested review of this revision.Aug 2 2022, 2:45 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 2 2022, 2:45 AM
weiwang retitled this revision from thinlto-full-index to --thinlto-full-index to index both native and bitcode objects.Aug 2 2022, 3:03 AM
weiwang edited the summary of this revision. (Show Details)
weiwang added a subscriber: LeeHowes.
MaskRay added a comment.EditedAug 2 2022, 11:17 AM

I considered this approach. I went with D130229 as I have thought of some minor issues:

  • The requires clang -nostdlib ...
  • An archive member cannot be serialized as file->getName(). Actually lld doesn't support syntax reading just an archive member instead of the whole archive. As an example, glibc libc.so reads libc_nonshared.a which contains elf-init.oS and the patch will incorrectly print elf-init.oS. This approach assuredly needs an option doing something similar to --remapping-file= in spirit, but more involved.
  • A build system needs to recognize all input files, and remove them for the final native link. This is nearly mostly feasible with Bazel/Buck style build systems, but not with others. Even in Bazel/Buck, I don't think linkopts = ["file_added_as_an_option_instead_of_src_or_deps.o"] can be recognized.

The above is off the top of my head.

And we lose opportunities to fix some problems for free:

  • If we are to support bitcode files in archives, we will need more options.
  • This still misses an opportunity to properly fix "libcall symbols defined in bitcode member of an archive".

I wish that you can give a fair consideration of D130229 and not simply dismiss it with "it differs from in-process ThinLTO, it's not right."

weiwang added a comment.EditedAug 2 2022, 2:18 PM

Thanks for the quick feedback!

I considered this approach. I went with D130229 as I have thought of some minor issues:

  • The requires clang -nostdlib ...
  • An archive member cannot be serialized as file->getName(). Actually lld doesn't support syntax reading just an archive member instead of the whole archive. As an example, glibc libc.so reads libc_nonshared.a which contains elf-init.oS and the patch will incorrectly print elf-init.oS. This approach assuredly needs an option doing something similar to --remapping-file= in spirit, but more involved.
  • A build system needs to recognize all input files, and remove them for the final native link. This is nearly mostly feasible with Bazel/Buck style build systems, but not with others. Even in Bazel/Buck, I don't think linkopts = ["file_added_as_an_option_instead_of_src_or_deps.o"] can be recognized.

The above is off the top of my head.

Your concerns are valid. I encountered them as well during local testing.

  • Our internal build system (buck) applies -nostdlib throughout linking, so this requirement is satisfied.
  • There are no direct support to extract specific members from archive. We are updating build system to make all regular archives into thin archive (--[start|end]-lib). With that done, archive members can be accessed directly. And you are right about elf-init.oS. During my local testing, I had to extract it manually, so some work needs to be done here. In a more general approach, formatting the index with ("archivename", "object") pair and using a separate flag to parse the index (like "--remapping-file=") is also doable.
  • The index file contains line separated input file list, and can be used directly as response file in the final link (I should've just used @thinlto.index.full in the test case provided).

And we lose opportunities to fix some problems for free:

  • If we are to support bitcode files in archives, we will need more options.
  • This still misses an opportunity to properly fix "libcall symbols defined in bitcode member of an archive".

I wish that you can give a fair consideration of D130229 and not simply dismiss it with "it differs from in-process ThinLTO, it's not right."

The concern we have is not about which linking order is right or wrong, it's rather making it consistent with its local in-process thinlto counterpart. We are kind of late in the game of distributed thinlto and we have large scale of services (20K inputs for linking) running with local thinlto for years. We can't guarantee that changing the resolution order of inputs from local to distributed thinlto won't make a difference in binary runtime behavior, and it would be a big headache to debug after everything goes production. So our current approach is to

  1. Make sure that distributed thinlto thinlink uses the same inputs and order as local thinlto. This is more of the build system level change.
  2. Make sure that distributed thinlto final link uses the same resolution order as thinlink with the help of this new option.
  3. Verify that binaries generated from both local and distributed thinlink matches at per symbol level.
MaskRay added a comment.EditedAug 2 2022, 2:55 PM

And we lose opportunities to fix some problems for free:

  • If we are to support bitcode files in archives, we will need more options.
  • This still misses an opportunity to properly fix "libcall symbols defined in bitcode member of an archive".

I wish that you can give a fair consideration of D130229 and not simply dismiss it with "it differs from in-process ThinLTO, it's not right."

The concern we have is not about which linking order is right or wrong, it's rather making it consistent with its local in-process thinlto counterpart. We are kind of late in the game of distributed thinlto and we have large scale of services (20K inputs for linking) running with local thinlto for years. We can't guarantee that changing the resolution order of inputs from local to distributed thinlto won't make a difference in binary runtime behavior, and it would be a big headache to debug after everything goes production. So our current approach is to

  1. Make sure that distributed thinlto thinlink uses the same inputs and order as local thinlto. This is more of the build system level change.
  2. Make sure that distributed thinlto final link uses the same resolution order as thinlink with the help of this new option.
  3. Verify that binaries generated from both local and distributed thinlink matches at per symbol level.

Thanks for being open. However, this is something I'd push back a bit.

See the last paragraph of https://reviews.llvm.org/D130229#3674024 : I think lld users need to ensure their build targets and the dependencies are in a decent state.
In the past, I have fixed a lot of internal builds to cooperate with the executable default --no-allow-shlib-undefined and later --warn-backrefs (https://maskray.me/blog/2021-06-13-dependency-related-linker-options#warn-backrefs).
Perhaps these things indirectly help stabilize our ThinLTO builds.
I did not add options to lld just so that my company could quickly adapt the dependency restriction features.

In addition, I suspect our current archive member extraction logic isn't ideal and we may need to revamp it. See mold which has implemented great parallel input file parsing.
Applications tied to every implementation detail of the archive member extraction is brittle and should be avoided.
Improving lld performance isn't my work priority, so I basically do it in my spare time in a slow but hopefully steady pace.
See https://discourse.llvm.org/t/parallel-input-file-parsing/60164 and a few patches from me mentioning "parallel". There have been some changes like D95985, removing ArchiveFile, etc.
I think we will do similar things if parallel non-local symbol resolution can become more feasible, as long as the symbol resolution logic still makes sense.

As mentioned, the in-process ThinLTO symbol resolution probably isn't that ideal and we need to fix that, too, though probably with a low priority.
For this patch, I want to give a clear NAK unless you can come up with a reasonable example that --thinlto-index= + --remapping-file= don't work well (not a brittle archive shadow example).