This is an archive of the discontinued LLVM Phabricator instance.

[lld-macho] Cache readFile results
ClosedPublic

Authored by keith on Nov 3 2021, 5:17 PM.

Details

Reviewers
int3
smeenai
gkm
Group Reviewers
Restricted Project
Commits
rGd49e7244cc01: [lld-macho] Cache readFile results
Summary

In one of our links lld was reading 760k files, but the unique number of
files was only 1500. This takes that link from 30 seconds to 8.

This seems like a heavy hammer, especially since some things don't need
to be cached, like the filelist arguments and the passed static
archives (the latter is already cached as a one off), but it seems ld64
does something similar here to short circuit these duplicate reads:

https://github.com/keith/ld64/blob/82e429e186488529111b0ef86af33a3b1b9438c7/src/ld/InputFiles.cpp#L644-L665

Of the types of files being read for our iOS app, the biggest problem
was constantly re-reading small tbd files:

% wc -l /tmp/read.txt
761414 /tmp/read.txt
% cat /tmp/read.txt | sort -u | wc -l
1503

% cat /tmp/read.txt | grep "\.a$" | wc -l
43721
% cat /tmp/read.txt | grep "\.tbd$" | wc -l
717656

We could likely hoist this logic up to not cache at this level, but it
would be a more invasive change to make sure all callers that needed it
cached the results.

I could see this being an issue with OOMs, and I'm not a linker expert so
maybe there's another way we should solve this problem? Feedback welcome!

Diff Detail

Event Timeline

keith created this revision.Nov 3 2021, 5:17 PM
Herald added a project: Restricted Project. · View Herald Transcript
Herald added a reviewer: Restricted Project. · View Herald Transcript
keith requested review of this revision.Nov 3 2021, 5:17 PM
Herald added a project: Restricted Project. · View Herald TranscriptNov 3 2021, 5:17 PM
int3 accepted this revision.Nov 3 2021, 9:25 PM

I could see this being an issue with OOMs

We are pretty liberal with our memory use at the moment. E.g. we basically don't free anything until process exit, with the view that most of things we allocate live for most of the linker process' short lifetime anyway. I think we can run with this and revisit if it causes any problems in practice.

lld/MachO/InputFiles.cpp
176

Would be nice to have a comment here about how this is primarily for the caching of .tbd / common library files and is a bit of a heavy hammer (as mentioned in the commit message)

This revision is now accepted and ready to land.Nov 3 2021, 9:25 PM
keith updated this revision to Diff 384654.Nov 3 2021, 9:46 PM
keith marked an inline comment as done.

Add comment

This revision was automatically updated to reflect the committed changes.

This one needs to be reset in the cleanupCallback as well (https://github.com/llvm/llvm-project/blob/e7fdff403e849b18d93cd4a5cb760cba66a92c0b/lld/MachO/Driver.cpp#L1098), as do all the other caches you added (if they aren't already). The cache itself would technically be valid, but we free all our memory arenas at the end of a linker run, so the entries would point to freed memory.

Out of curiosity, is the link that takes 8 seconds now the same one that ld64 takes 20 seconds for? If not, I'd be curious to know how the speed for that one compares after all your changes.

Thanks for all the fixes!

keith added a comment.Nov 4 2021, 9:45 AM

This one needs to be reset in the cleanupCallback as well (https://github.com/llvm/llvm-project/blob/e7fdff403e849b18d93cd4a5cb760cba66a92c0b/lld/MachO/Driver.cpp#L1098), as do all the other caches you added (if they aren't already). The cache itself would technically be valid, but we free all our memory arenas at the end of a linker run, so the entries would point to freed memory.

Ah thanks, I wondered but should have asked. https://reviews.llvm.org/D113198

Out of curiosity, is the link that takes 8 seconds now the same one that ld64 takes 20 seconds for? If not, I'd be curious to know how the speed for that one compares after all your changes.

Yes, so for the same link our benchmarks have been:

  • lld before my changes: 1 minute 45 seconds
  • ld64 20s
  • zld (optimized ld64 fork) ~13s
  • lld with all my changes ~8s

Unfortunately the produced binary crashes on what appears to be issues with dropped Objective-C categories, I'm working on reducing a repro case, but if there are known issues around this (I couldn't find any on the bug tracker) I would love to know!

Thanks for all the fixes!

Thanks for the reviews!

Unfortunately the produced binary crashes on what appears to be issues with dropped Objective-C categories, I'm working on reducing a repro case, but if there are known issues around this (I couldn't find any on the bug tracker) I would love to know!

We haven't seen any issues like that on our end yet, as far as I'm aware. The repro will be interesting once you've got it :)

Unfortunately the produced binary crashes on what appears to be issues with dropped Objective-C categories, I'm working on reducing a repro case, but if there are known issues around this (I couldn't find any on the bug tracker) I would love to know!

We haven't seen any issues like that on our end yet, as far as I'm aware. The repro will be interesting once you've got it :)

Filed https://bugs.llvm.org/show_bug.cgi?id=52480 with a repro!