Page MenuHomePhabricator

[lld-macho] Include Objective-C functions in LC_FUNCTION_STARTS
Needs ReviewPublic

Authored by keith on Nov 19 2021, 12:54 PM.

Details

Reviewers
None
Group Reviewers
Restricted Project
Summary

Previously since Objective-C functions aren't included in the symtab
they were also excluded from the function starts data. This degraded
the debugger experience in the case you didn't have debug info for for
some Objective-C code.

I'm not convinced this is the right way to do this, I'd love feedback!

Diff Detail

Event Timeline

keith created this revision.Nov 19 2021, 12:54 PM
Herald added a project: Restricted Project. · View Herald TranscriptNov 19 2021, 12:54 PM
Herald added a reviewer: Restricted Project. · View Herald Transcript
keith requested review of this revision.Nov 19 2021, 12:54 PM
Herald added a project: Restricted Project. · View Herald TranscriptNov 19 2021, 12:54 PM
int3 added a subscriber: int3.Nov 19 2021, 1:51 PM

If I'm understanding correctly, it's not just that Obj-C symbols need to be in LC_FUNCTION_STARTS, but all local symbols in code sections as well. If that's the case, can the test not be objc-specific?

lld/MachO/SyntheticSections.cpp
802–804
803–806

this is existing code, but I think it is redundant... the symbol isLive check above should cover it.

int3 added a comment.Nov 19 2021, 1:59 PM

Maybe we can reuse the code here: https://github.com/llvm/llvm-project/blob/main/lld/MachO/SyntheticSections.cpp#L952

(should also check if ld64 also skips emitting Lfoo symbols in LC_FUNCTION_STARTS... I would be surprised if it didn't)

keith updated this revision to Diff 388664.Nov 19 2021, 4:43 PM
keith marked 2 inline comments as done.

Improve logic and test

keith added a comment.Nov 19 2021, 4:45 PM

If I'm understanding correctly, it's not just that Obj-C symbols need to be in LC_FUNCTION_STARTS, but all local symbols in code sections as well. If that's the case, can the test not be objc-specific?

You're right, updated.

The conditional in there is a bit different I guess, are you thinking we should extract the loops into another function and maybe have some callback for each symbol?

(should also check if ld64 also skips emitting Lfoo symbols in LC_FUNCTION_STARTS... I would be surprised if it didn't)

I verified my new test passes with ld64 as well. The relevant logic appears to be here https://github.com/keith/ld64/blob/eddf3fd275669887e70392a1f7d8355222fa295e/src/ld/LinkEdit.hpp#L1978-L1994 although it's not super clear from this snippet that it includes local symbols, but it seems to.

oontvoo added inline comments.
lld/MachO/SyntheticSections.cpp
800

Do you happen to have performance data for this change?
seems like it could take a hit here , esp. for cases where the number of input files is large - ie., most of our apps :)

On the other hand, I agree debug experience is important...

keith added inline comments.Nov 19 2021, 5:00 PM
lld/MachO/SyntheticSections.cpp
800

I've haven't tested this on our large links yet so I'll have to check, I agree, I felt bad writing this

oontvoo added inline comments.Nov 19 2021, 5:07 PM
lld/MachO/SyntheticSections.cpp
800–813

How would people feel about making this configurable? ( either via making up some new flag or grouping it into an existing one)

keith added inline comments.Nov 19 2021, 5:11 PM
lld/MachO/SyntheticSections.cpp
800–813

Maybe -no-function-starts is enough for this level of configuration? We actually have that enabled today for our builds for binary size.

thevinster added inline comments.
lld/MachO/SyntheticSections.cpp
800

Hmm... I think we can parallelize this. This might offset the need to gate it behind a flag. Although, we can hold off on this until more concrete data on how bad this affects the link speed.

Why can't we add these symbols to the symtab?

int3 added a comment.Nov 19 2021, 11:57 PM

The conditional in there is a bit different I guess, are you thinking we should extract the loops into another function and maybe have some callback for each symbol?

Yeah I was thinking about a callback, but I'm good with the current implementation too. It's not a lot of code duplication.

I verified my new test passes with ld64 as well.

Nice! The linked code also excludes lfoo and LFoo symbols, though, and I think we should do the same thing here (assuming ld64 does) and test for it.

lld/MachO/SyntheticSections.cpp
800

Why can't we add these symbols to the symtab?

the symtab is for global (aka extern) symbols, which need to have unique names across all object files. local symbols do not have that constraint.

Anyway, this is being done in parallel with other stuff (see finalizeLinkEditSegment()), so its perf impact might actually be negligible in practice

800

I've haven't tested this on our large links yet

We've been using chromium_framework as an ad-hoc benchmark, since it's something that all of us can repro: https://bugs.llvm.org/show_bug.cgi?id=48657#c0

Would be nice to include some numbers for that in the commit message. For running the benchmark, I like to use https://github.com/asayers/cbdr/blob/master/cbdr.md

My usual invocation is cbdr sample "base:~/bench-base/ld64.lld @response.txt" "diff:~/bench-diff/ld64.lld @response.txt" --timeout=300s | tee results.csv | cbdr analyze -s 95

800–813

I would prefer we not add too many config flags. I'm also not sure there's much value in having partially emitted debug info -- I figure you either want or don't want it. -no-function-starts is probably enough for now.

oontvoo added inline comments.Nov 20 2021, 9:23 AM
lld/MachO/SyntheticSections.cpp
800–813

SG 👍

oontvoo added inline comments.Nov 20 2021, 10:53 AM
lld/MachO/SyntheticSections.cpp
800

I've haven't tested this on our large links yet

We've been using chromium_framework as an ad-hoc benchmark, since it's something that all of us can repro: https://bugs.llvm.org/show_bug.cgi?id=48657#c0

To be frank, chromium is good for most benchmarking purposes - but it isn't quite representative enough for most of our internal apps (for one, it's much smaller). Unfortunately it's the only thing we have public so ... ¯\_(ツ)_/¯

But anyways, I've run your diff on YT and int3 is right that there's negligible difference in link time.

x ./LLD_YT_released_base.txt
+ ./LLD_TY_released_diff.txt
    N           Min           Max        Median           Avg        Stddev
x   5         13.33         13.72         13.56        13.534    0.15565989
+   5         13.56         13.92         13.79         13.76    0.15636496
No difference proven at 95.0% confidence
(No diff in CPU/SYS time either)

Sorry for the noise here :)

int3 added a comment.Nov 22 2021, 9:10 AM

(remember to update the diff title/description too)