This is an archive of the discontinued LLVM Phabricator instance.

[lld/mac] Implement for section$start and section$ end symbols
ClosedPublic

Authored by thakis on Jul 22 2021, 7:14 PM.

Details

Summary

With this, libclang_rt.profile_osx.a can be linked, that is coverage
and PGO-instrumented builds should now work with lld.

section$start and section$end symbols can create non-existing sections.
They're also undefined symbols that are only magic if there isn't a
regular symbol with their name, which means the need to be handled
in treatUndefined() instead of just looping over all existing
sections and adding start and end symbols like the ELF port does.

To represent the actual symbols, this uses absolute symbols that
get their value updated once an output section is layed out.

segment$start and segment$end are still missing for now, but they produce a
nicer error message after this patch.

Main part of PR50760.

Diff Detail

Event Timeline

thakis created this revision.Jul 22 2021, 7:14 PM
Herald added a project: Restricted Project. · View Herald Transcript
Herald added a subscriber: wenlei. · View Herald Transcript

Here's a coverage report of "check-lld" made with lld with this patch as host linker. (Built lld without coverage with this patch, then build everything again with coverage, and using the lld from the first build as linker): https://drive.google.com/file/d/1D8Zmd45neE-_S2I4Yu2veaZLhy6CPZv3/view?usp=sharing Decent test coverage!

Here's a coverage report of "check-lld" made with lld with this patch as host linker. (Built lld without coverage with this patch, then build everything again with coverage, and using the lld from the first build as linker): https://drive.google.com/file/d/1D8Zmd45neE-_S2I4Yu2veaZLhy6CPZv3/view?usp=sharing Decent test coverage!

For comparison, here's the same report with ld as host linker. Things look pretty similar, so that's good :) ld shows more lines in ELF CallGraphSort.cpp, lld more in COFF. Maybe those get partially ICF'd, or the runtime gets confused by these files containing functions with the same names (in unnamed namespaces each, so shouldn't be an ODR issue).

thakis edited the summary of this revision. (Show Details)Jul 23 2021, 7:22 AM
int3 added a subscriber: int3.Jul 23 2021, 7:57 AM
int3 added inline comments.
lld/MachO/OutputSection.h
71–72

when will we have more than one symbol here?

also, if we're usually going to have just one element, since a TinyPtrVector better?

lld/MachO/SymbolTable.cpp
243

shouldn't we be passing in a non-null isec here?

lld/test/MachO/start-end.s
73–79

😂

thakis updated this revision to Diff 361237.Jul 23 2021, 8:41 AM

tinyptrvector

lld/MachO/OutputSection.h
71–72

when will we have more than one symbol here?

e.g. if you have section$start$FOO$bar and section$start$BAZ$quux and sectrename FOO bar BAZ quux -- both symbols will refer to the same section.

also, if we're usually going to have just one element, since a TinyPtrVector better?

There are on the order of 10 OutputSections so I don't think we _really_ need to worry about its size. But sure, done, and TIL about TinyPtrVector :) (It means we have to include Symbols.h in the header, but /shruggie)

lld/MachO/SymbolTable.cpp
243

To me this semantically seems like an absolute symbol, and absolute symbols don't have isec. Defined::getVA() has an early exit for absolute symbols which is attractive since I don't want any isec-based adjustments for the address.

Nice! Thanks!

lld/MachO/SymbolTable.cpp
219

What about section$start$__DATA$__<....> ?

thakis added inline comments.Jul 23 2021, 9:15 AM
lld/MachO/SymbolTable.cpp
219

That's usually a ConcatInputSection. The "create new used" branch below works for those. (We could walk through the existing isecs and not alloc a new one if we find one with the right name, but that's more code and more chance or something to go wrong, with little benefit, so I didn't do that. An earlier version of this patch on the linked PR did that, but it's not necessary so I got rid of it.) __DATA is covered by the test below, too.

int3 accepted this revision.Jul 23 2021, 12:16 PM

lgtm, though I'm a bit confused about how the symbol liveness is working out

lld/MachO/SymbolTable.cpp
219
243

hmm, but the comment above says that "The replacing Defined gets its liveness from the underlying isec" -- how does that work if the isec isn't set here?

lld/test/MachO/start-end.s
142
261

would it be worthwhile to add a zerofill section to the test, to make sure the $end symbol occurs after the zeroes?

This revision is now accepted and ready to land.Jul 23 2021, 12:16 PM
oontvoo accepted this revision.Jul 23 2021, 12:21 PM

LGTM thanks!

thakis marked an inline comment as done.Jul 23 2021, 1:00 PM

Thanks!

lld/MachO/SymbolTable.cpp
243

That's an excellent question! I wrote the comment when there was still a backing isec. Absolute defineds get their liveness from the used bit on the superclass, so https://reviews.llvm.org/D106565 made that part work. We still need an explicitly-live isec to make sure that the OutputSection the new symbol is added to is emitted. I adjusted the comment.

lld/test/MachO/start-end.s
261

Sure, done. Seems to Just Work, but yes, that's a good thing to test.

oontvoo added inline comments.Jul 23 2021, 1:02 PM
lld/test/MachO/start-end.s
261

"Just Work", famous last words 🤔

Herald added a project: Restricted Project. · View Herald TranscriptJul 23 2021, 1:04 PM
thakis marked an inline comment as done.EditedJul 23 2021, 3:12 PM
This comment has been deleted.
lld/test/MachO/start-end.s
142

Whoops, missed this. Fixed in a follow-up.

261

I'm sure I managed to hide a bug or two somewhere :)