This is an archive of the discontinued LLVM Phabricator instance.

[ELF] Parse archives as --start-lib object files
ClosedPublic

Authored by MaskRay on Feb 6 2022, 2:02 AM.

Details

Summary

https://maskray.me/blog/2022-01-16-archives-and-start-lib

For every definition in an extracted archive member, we intern the symbol twice,
once for the archive index entry, once for the .o symbol table after extraction.
This is inefficient.

Symbols in a --start-lib ObjFile/BitcodeFile are only interned once because the
result is cached in symbols[i].

Just handle an archive using the --start-lib code path. We can therefore remove
ArchiveFile and LazyArchive. For many projects, archive member extraction ratio
is high and it is a net performance win. Linking a Release build of clang is
1.01x as fast.

Note: --start-lib scans symbols in the same order that llvm-ar adds them to the
index, so in the common case the semantics should be identical. If the archive
symbol table was created in a different order, or is incomplete, this strategy
may have different semantics. Such cases are considered user error.

The is neither ET_REL nor LLVM bitcode error is changed to a warning.
Previously an archive may have such members without a diagnostic. Using a
warning prevents breakage.

  • For some tests, the diagnostics get improved where we did not consider the archive member name: b.a: => b.a(b.o):.
  • no-obj.s: the link is now allowed, matching GNU ld
  • archive-no-index.s: the is neither ET_REL nor LLVM bitcode diagnostic is demoted to a warning.
  • incompatible.s: even when an archive is unextracted, we may report an "incompatible with" error.

I recently decreased sizeof(SymbolUnion) by 8 and decreased memory usage quite a
bit, so retaining symbols for un-extracted archive members should not cause a
memory usage problem.

Diff Detail

Event Timeline

MaskRay created this revision.Feb 6 2022, 2:02 AM
MaskRay requested review of this revision.Feb 6 2022, 2:02 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 6 2022, 2:02 AM

Just come back from being out of office, will try and work my way through this but may take a bit of time. Hopefully by the end of the week.

MaskRay updated this revision to Diff 407035.Feb 8 2022, 7:53 PM
MaskRay edited the summary of this revision. (Show Details)

rebase after recent header clean-up

MaskRay updated this revision to Diff 407802.Feb 11 2022, 12:55 AM

add a test to incompatible.s: we may now report "incompatible with" for an unextracted archive

Apologies, I ran out of time this Friday, will try again early next week.

I've had a chance to look through this. Assuming the user-visible behaviour changes are limited to what has been mentioned in the description I'm in favour of the change. I've only made one suggestion surrounding a comment. To check my understanding all files in archives are handled with createLazyFile which would make their symbols LazyObject ? If I'm wrong then I'll need to re-read the code to make sure I understand first.

lld/ELF/Driver.cpp
233

Is it worth a design decision comment here? I'm thinking of people new to LLD wondering where the traditional archive handling code is.

For example

LLD chooses to handle archives and --start-lib and --end-lib using the same code-path. This scans all the ELF files in the archive rather than just the index file, with the benefit that the symbols are only loaded once. For many projects archives see high utilization rates so in these cases it is a net performance win.
lld/ELF/Symbols.h
411

As I understand it, traditional archives are effectively being handled as if they've been expanded by --start-lib and --end-lib is it worth mentioning this here.

// LazyObject symbols represents symbols in object files between --start-lib and --end-lib options. LLD also handles traditional archives as if all the files in the archive where extracted while being surrounded by --start-lib and --end-lib.
MaskRay updated this revision to Diff 408693.Feb 14 2022, 7:27 PM
MaskRay marked 2 inline comments as done.
MaskRay edited the summary of this revision. (Show Details)

improve comments

peter.smith accepted this revision.Feb 15 2022, 12:28 AM

Thanks for the update. LGTM.

This revision is now accepted and ready to land.Feb 15 2022, 12:28 AM
bd1976llvm added inline comments.Feb 15 2022, 1:15 AM
lld/ELF/Driver.cpp
238

Maybe worth expanding this some more: "--start-lib and --end-lib scans symbols in the same order that llvm-ar adds them, so in the common case the semantics should be identical. If the archive symbol table was created in a different order, or is incomplete, this strategy has different semantics, such cases are considered user error."

MaskRay updated this revision to Diff 408911.Feb 15 2022, 9:14 AM
MaskRay marked an inline comment as done.

update comments

I am slightly concerned about penalising links with low utilisation archives. However, I don't have any specific data on low utilisation archives or any example of them. We did do some testing of a similar feature (in our proprietary linker) on game code, and low utilisation archives did not cause a problem for any of the games tested. Users should be able to create high utilisation versions of problematic archives in any case.

LGTM from me.

MaskRay edited the summary of this revision. (Show Details)Feb 15 2022, 9:27 AM
This revision was landed with ongoing or failed builds.Feb 15 2022, 9:38 AM
This revision was automatically updated to reflect the committed changes.
MaskRay added inline comments.Feb 15 2022, 9:40 AM
lld/ELF/Driver.cpp
233

Thanks for the suggestion!

MaskRay added a comment.EditedFeb 15 2022, 9:57 AM

I am slightly concerned about penalising links with low utilisation archives. However, I don't have any specific data on low utilisation archives or any example of them. We did do some testing of a similar feature (in our proprietary linker) on game code, and low utilisation archives did not cause a problem for any of the games tested. Users should be able to create high utilisation versions of problematic archives in any case.

LGTM from me.

I think the utilization ratio needs to be quite low to observe a performance loss (I guess for 50% utilization with regular defined-vs-undefined ratio, it's still a net win). Even in that case the performance loss can be restored by parallel input file scanning (https://discourse.llvm.org/t/parallel-input-file-parsing/60164).

The removal of 155 line code is also important.

MaskRay added a comment.EditedFeb 15 2022, 11:29 AM

For a default build of chromium, this change makes the link 1.05x as fast :)

% hyperfine --warmup 2 --min-runs 16 "numactl -C 20-27 "{/tmp/c/0,/tmp/c/1}" -flavor gnu @response.txt --threads=8 -o lld"
Benchmark 1: numactl -C 20-27 /tmp/c/0 -flavor gnu @response.txt --threads=8 -o lld
  Time (mean ± σ):      6.911 s ±  0.034 s    [User: 10.447 s, System: 2.791 s]
  Range (min … max):    6.843 s …  6.980 s    16 runs
 
Benchmark 2: numactl -C 20-27 /tmp/c/1 -flavor gnu @response.txt --threads=8 -o lld
  Time (mean ± σ):      6.574 s ±  0.034 s    [User: 10.068 s, System: 2.844 s]
  Range (min … max):    6.505 s …  6.624 s    16 runs
 
Summary
  'numactl -C 20-27 /tmp/c/1 -flavor gnu @response.txt --threads=8 -o lld' ran
    1.05 ± 0.01 times faster than 'numactl -C 20-27 /tmp/c/0 -flavor gnu @response.txt --threads=8 -o lld'
thakis added a subscriber: thakis.Feb 16 2022, 6:54 AM

Hooray for the speedup!

Looks like this breaks tests on Windows though: https://logs.chromium.org/logs/chromium/buildbucket/cr-buildbucket/8822094137116818593/+/u/package_clang/stdout?format=raw

tail: cannot open `+9' for reading: No such file or directory

Looks like the gnuwin version of tail doesn't like -c +9 syntax. Is there a different way of writing this?

MaskRay added a comment.EditedFeb 16 2022, 9:10 AM

Hooray for the speedup!

Looks like this breaks tests on Windows though: https://logs.chromium.org/logs/chromium/buildbucket/cr-buildbucket/8822094137116818593/+/u/package_clang/stdout?format=raw

tail: cannot open `+9' for reading: No such file or directory

Looks like the gnuwin version of tail doesn't like -c +9 syntax. Is there a different way of writing this?

The -c +9 usage conforms to https://pubs.opengroup.org/onlinepubs/007904975/utilities/tail.html

We can unsupport the test on Windows. Is it that old? http://gnuwin32.sourceforge.net/packages/coreutils.htm

Hooray for the speedup!

Looks like this breaks tests on Windows though: https://logs.chromium.org/logs/chromium/buildbucket/cr-buildbucket/8822094137116818593/+/u/package_clang/stdout?format=raw

tail: cannot open `+9' for reading: No such file or directory

Looks like the gnuwin version of tail doesn't like -c +9 syntax. Is there a different way of writing this?

The -c +9 usage conforms to https://pubs.opengroup.org/onlinepubs/007904975/utilities/tail.html

We can unsupport the test on Windows. Is it that old? http://gnuwin32.sourceforge.net/packages/coreutils.htm

Unsupporting the test on Windows is fine, I can send out a patch.
We can't change gnuwin32 and it's what all the LLVM docs say to use, so we have to deal with it.

Unsupporting the test on Windows is fine, I can send out a patch.
We can't change gnuwin32 and it's what all the LLVM docs say to use, so we have to deal with it.

Which docs are you referring to? The latest LLVM docs no longer say to use GNUWin32 (because it's so old). See D108513. The consensus nowadays seems to be to rely on the tools shipped with git.

Unsupporting the test on Windows is fine, I can send out a patch.
We can't change gnuwin32 and it's what all the LLVM docs say to use, so we have to deal with it.

Which docs are you referring to? The latest LLVM docs no longer say to use GNUWin32 (because it's so old). See D108513. The consensus nowadays seems to be to rely on the tools shipped with git.

Ah I do vaguely remember that patch going through.
I grepped for gnuwin and saw a bunch of references to it, all of them saying/implying that we should be using gnuwin. Perhaps those should be updated?
The thing with git bash tools is that although they're great for development, it's non trivial to download them and unpack them somewhere as part of bot builds.

Unsupporting the test on Windows is fine, I can send out a patch.
We can't change gnuwin32 and it's what all the LLVM docs say to use, so we have to deal with it.

Which docs are you referring to? The latest LLVM docs no longer say to use GNUWin32 (because it's so old). See D108513. The consensus nowadays seems to be to rely on the tools shipped with git.

Ah I do vaguely remember that patch going through.
I grepped for gnuwin and saw a bunch of references to it, all of them saying/implying that we should be using gnuwin. Perhaps those should be updated?

Yes, probably!

The thing with git bash tools is that although they're great for development, it's non trivial to download them and unpack them somewhere as part of bot builds.

Don't these bots need git installed already though to checkout the LLVM repository?

Unsupporting the test on Windows is fine, I can send out a patch.
We can't change gnuwin32 and it's what all the LLVM docs say to use, so we have to deal with it.

Which docs are you referring to? The latest LLVM docs no longer say to use GNUWin32 (because it's so old). See D108513. The consensus nowadays seems to be to rely on the tools shipped with git.

Ah I do vaguely remember that patch going through.
I grepped for gnuwin and saw a bunch of references to it, all of them saying/implying that we should be using gnuwin. Perhaps those should be updated?

Yes, probably!

The thing with git bash tools is that although they're great for development, it's non trivial to download them and unpack them somewhere as part of bot builds.

Don't these bots need git installed already though to checkout the LLVM repository?

let's move this conversation to D108513