This is an archive of the discontinued LLVM Phabricator instance.

[ELF][Distributed ThinLTO] Do not generate empty index when bitcode object is linked
ClosedPublic

Authored by weiwang on Sep 12 2022, 5:36 PM.

Details

Summary

When the same bitcode object file is given multiple times from the Command-line as lazy object file, empty index is generated which overwrites the one from thinlink. This could cause undefined symbols during final link.

Diff Detail

Event Timeline

weiwang created this revision.Sep 12 2022, 5:36 PM
Herald added a project: Restricted Project. · View Herald Transcript
Herald added subscribers: ormris, hoy, wenlei and 6 others. · View Herald Transcript
weiwang requested review of this revision.Sep 12 2022, 5:36 PM
Herald added a project: Restricted Project. · View Herald TranscriptSep 12 2022, 5:36 PM
weiwang edited the summary of this revision. (Show Details)Sep 12 2022, 5:42 PM
weiwang added a reviewer: tejohnson.
MaskRay added inline comments.Sep 12 2022, 6:04 PM
lld/ELF/LTO.cpp
293

DenseMap<StringRef>

lld/test/ELF/lto/thinlto-index-only.ll
51

Don't repeat the comments. You can place the RUN lines immediately after the last block.

weiwang updated this revision to Diff 459644.Sep 12 2022, 9:54 PM

address comments

weiwang marked 2 inline comments as done.Sep 12 2022, 9:54 PM

For Bazel like build systems, it seems weird to have one object file in two modes, lazy and non-lazy. Do you know why it happens for your build system?
I think the change will improve robustness of lld's distributed ThinLTO usage so we probably need to make the change.

For Bazel like build systems, it seems weird to have one object file in two modes, lazy and non-lazy. Do you know why it happens for your build system?
I think the change will improve robustness of lld's distributed ThinLTO usage so we probably need to make the change.

We've seen the case of 2 same lazy objects in buck, and the cause for that is still a mystery to us. @christylee may know more.

I listed the case of lazy and non-lazy just for the sake of test coverage.

MaskRay accepted this revision.Sep 13 2022, 12:06 PM

LGTM.

This revision is now accepted and ready to land.Sep 13 2022, 12:06 PM

For Bazel like build systems, it seems weird to have one object file in two modes, lazy and non-lazy. Do you know why it happens for your build system?
I think the change will improve robustness of lld's distributed ThinLTO usage so we probably need to make the change.

We've seen the case of 2 same lazy objects in buck, and the cause for that is still a mystery to us. @christylee may know more.

Such a build system usually deduplicates linker input. I have seen duplicate archives but they are -lc, -lc++, etc, not from cc_library like targets.
Technically with linker scripts INPUT and GROUP an archive can be referenced by two cc_library, but that's a very rare case usually implying a user mistake but I feel the linker can guard against it, which will also help traditional build systems which don't do a great job deduplicating input.

I listed the case of lazy and non-lazy just for the sake of test coverage.

For Bazel like build systems, it seems weird to have one object file in two modes, lazy and non-lazy. Do you know why it happens for your build system?
I think the change will improve robustness of lld's distributed ThinLTO usage so we probably need to make the change.

We've seen the case of 2 same lazy objects in buck, and the cause for that is still a mystery to us. @christylee may know more.

Such a build system usually deduplicates linker input. I have seen duplicate archives but they are -lc, -lc++, etc, not from cc_library like targets.
Technically with linker scripts INPUT and GROUP an archive can be referenced by two cc_library, but that's a very rare case usually implying a user mistake but I feel the linker can guard against it, which will also help traditional build systems which don't do a great job deduplicating input.

We've seen two cases:

  1. Older rust compilers will duplicate transitive dependencies, including C++ dependencies.
  2. Third-party prebuilt libraries that are wrapped with --start-group/--end-group and can be referenced by multiple cc_library. This is technically a user error but it's very hard to debug.

I listed the case of lazy and non-lazy just for the sake of test coverage.