Previously functions that aren't included in the symtab were also
excluded from the function starts. Symbols missing from function starts
degrades the debugger experience in the case you don't have debug info
for them.
Details
- Reviewers
oontvoo MaskRay - Group Reviewers
Restricted Project - Commits
- rG0ab09a9009b6: [test][lld-macho] Improve LC_FUNCTION_STARTS test coverage
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
If I'm understanding correctly, it's not just that Obj-C symbols need to be in LC_FUNCTION_STARTS, but all local symbols in code sections as well. If that's the case, can the test not be objc-specific?
lld/MachO/SyntheticSections.cpp | ||
---|---|---|
803–805 | ||
809–812 | this is existing code, but I think it is redundant... the symbol isLive check above should cover it. |
Maybe we can reuse the code here: https://github.com/llvm/llvm-project/blob/main/lld/MachO/SyntheticSections.cpp#L952
(should also check if ld64 also skips emitting Lfoo symbols in LC_FUNCTION_STARTS... I would be surprised if it didn't)
You're right, updated.
The conditional in there is a bit different I guess, are you thinking we should extract the loops into another function and maybe have some callback for each symbol?
(should also check if ld64 also skips emitting Lfoo symbols in LC_FUNCTION_STARTS... I would be surprised if it didn't)
I verified my new test passes with ld64 as well. The relevant logic appears to be here https://github.com/keith/ld64/blob/eddf3fd275669887e70392a1f7d8355222fa295e/src/ld/LinkEdit.hpp#L1978-L1994 although it's not super clear from this snippet that it includes local symbols, but it seems to.
lld/MachO/SyntheticSections.cpp | ||
---|---|---|
806 | Do you happen to have performance data for this change? On the other hand, I agree debug experience is important... |
lld/MachO/SyntheticSections.cpp | ||
---|---|---|
806 | I've haven't tested this on our large links yet so I'll have to check, I agree, I felt bad writing this |
lld/MachO/SyntheticSections.cpp | ||
---|---|---|
806–819 | How would people feel about making this configurable? ( either via making up some new flag or grouping it into an existing one) |
lld/MachO/SyntheticSections.cpp | ||
---|---|---|
806–819 | Maybe -no-function-starts is enough for this level of configuration? We actually have that enabled today for our builds for binary size. |
lld/MachO/SyntheticSections.cpp | ||
---|---|---|
806 | Hmm... I think we can parallelize this. This might offset the need to gate it behind a flag. Although, we can hold off on this until more concrete data on how bad this affects the link speed. Why can't we add these symbols to the symtab? |
The conditional in there is a bit different I guess, are you thinking we should extract the loops into another function and maybe have some callback for each symbol?
Yeah I was thinking about a callback, but I'm good with the current implementation too. It's not a lot of code duplication.
I verified my new test passes with ld64 as well.
Nice! The linked code also excludes lfoo and LFoo symbols, though, and I think we should do the same thing here (assuming ld64 does) and test for it.
lld/MachO/SyntheticSections.cpp | ||
---|---|---|
806 |
the symtab is for global (aka extern) symbols, which need to have unique names across all object files. local symbols do not have that constraint. Anyway, this is being done in parallel with other stuff (see finalizeLinkEditSegment()), so its perf impact might actually be negligible in practice | |
806 |
We've been using chromium_framework as an ad-hoc benchmark, since it's something that all of us can repro: https://bugs.llvm.org/show_bug.cgi?id=48657#c0 Would be nice to include some numbers for that in the commit message. For running the benchmark, I like to use https://github.com/asayers/cbdr/blob/master/cbdr.md My usual invocation is cbdr sample "base:~/bench-base/ld64.lld @response.txt" "diff:~/bench-diff/ld64.lld @response.txt" --timeout=300s | tee results.csv | cbdr analyze -s 95 | |
806–819 | I would prefer we not add too many config flags. I'm also not sure there's much value in having partially emitted debug info -- I figure you either want or don't want it. -no-function-starts is probably enough for now. |
lld/MachO/SyntheticSections.cpp | ||
---|---|---|
806–819 | SG 👍 |
lld/MachO/SyntheticSections.cpp | ||
---|---|---|
806 |
To be frank, chromium is good for most benchmarking purposes - but it isn't quite representative enough for most of our internal apps (for one, it's much smaller). Unfortunately it's the only thing we have public so ... ¯\_(ツ)_/¯ But anyways, I've run your diff on YT and int3 is right that there's negligible difference in link time. x ./LLD_YT_released_base.txt + ./LLD_TY_released_diff.txt N Min Max Median Avg Stddev x 5 13.33 13.72 13.56 13.534 0.15565989 + 5 13.56 13.92 13.79 13.76 0.15636496 No difference proven at 95.0% confidence (No diff in CPU/SYS time either) Sorry for the noise here :) |
It looks like the logic change from this was committed in https://reviews.llvm.org/D110040, @oontvoo was that intentional or did it sneak in since you tested this? I think we were good to go here, so I'll update this to just include the test, just wanted to make sure I followed what happened
P.S: D'oh! Oops! I see it now. Yes, I must've messed up the branches and accidentally included this patch's diff into the personality symbol patch. I'm so sorry!
No problem! I think we were good with that version of the change, but lmk if you have any thoughts and I can fix it up.
lld/MachO/SyntheticSections.cpp | ||
---|---|---|
806 | As far is perf is concerned, when linking the largest lyft ios app, we end up with ~1.1 million function starts and the time difference doesn't seem measurable with this when using this or passing -no_function_starts |
You may consider [test] ... as an alternative to ... NFC.
[test] is clearer that there is no code change.