This is an archive of the discontinued LLVM Phabricator instance.

[lld/mac] Implement support for private extern symbols
ClosedPublic

Authored by thakis on Dec 20 2020, 8:23 PM.

Details

Reviewers
int3
Group Reviewers
Restricted Project
Commits
rG13f439a1872b: [lld/mac] Implement support for private extern symbols
Summary

Private extern symbols are used for things scoped to the linkage unit.
They cause duplicate symbol errors (so they're in the symbol table,
unlike TU-scoped truly local symbols), but they don't make it into the
export trie. They are created e.g. by compiling with
-fvisibility=hidden.

If two weak symbols have differing privateness, the combined symbol is
non-private external. (Example: inline functions and some TUs that
include the header defining it were built with
-fvisibility-inlines-hidden and some weren't).

A weak private external symbol implicitly has its "weak" dropped and
behaves like a regular strong private external symbol: Weak is an export
trie concept, and private symbols are not in the export trie.

If a weak and a strong symbol have different privateness, the strong
symbol wins.

If two common symbols have differing privateness, the larger symbol
wins. If they have the same size, the privateness of the symbol seen
later during the link wins (!) -- this is a bit lame, but it matches
ld64 and this behavior takes 2 lines less to implement than the less
surprising "result is non-private external), so match ld64.
(Example: int a in two .c files, both built with -fcommon,
one built with -fvisibility=hidden and one without.)

This also makes __dyld_private a true TU-local symbol, matching ld64.
To make this work, make the const char* StringRefZ ctor to correctly
set size (without this, writing the string table crashed when calling
getName() on the __dyld_private symbol).

Mention in CommonSymbol's comment that common symbols are now disabled
by default in clang.

Mention in -keep_private_externs's HelpText that the flag only has an
effect with -r (which we don't implement yet -- so this patch here
doesn't regress any behavior around -r + -keep_private_externs)). ld64
doesn't explicitly document it, but the commit text of
http://reviews.llvm.org/rL216146 does, and ld64's
OutputFile::buildSymbolTable() checks `_options.outputKind() ==
Options::kObjectFile` before calling _options.keepPrivateExterns()
(the only reference to that function).

Fixes PR48536.

Diff Detail

Event Timeline

thakis created this revision.Dec 20 2020, 8:23 PM
thakis requested review of this revision.Dec 20 2020, 8:23 PM
thakis edited the summary of this revision. (Show Details)
thakis updated this revision to Diff 313098.Dec 21 2020, 6:32 AM
thakis edited the summary of this revision. (Show Details)
  • update tests (even found a bug in the common symbol handling)
  • tweak behavior of two commons with same size to match ld64
  • add many tests. i verified that ld64 has the same behavior on these tests

still need to look at weak private external XXXs.

MaskRay added a subscriber: lhames.Dec 21 2020, 8:51 AM
int3 added a subscriber: int3.Dec 21 2020, 11:13 AM

looks pretty good!

lld/MachO/Driver.cpp
528

nit: seems unnecessary to have the argument name in a comment when the argument itself is descriptive, but I guess it might look more uniform this way. up to you

lld/MachO/Symbols.h
28

This defeats the purpose of using StringRefZ over StringRef though -- it's meant to be a performance optimization (see the comment on the identically-named class in lld/ELF... I probably should have copied it over and/or factored the class out)

Would a constexpr constructor work here?

123

how about combining privateExtern and external into one enum, so that the invalid state of external = false, privateExtern = true is not representable?

lld/test/MachO/private-extern.s
118

how do you feel about hyphenating the check prefix?

lld/test/MachO/symtab.s
39

I think including the string offsets in the expected test output was a mistake (probably mine), let's remove it to reduce future churn

thakis updated this revision to Diff 313165.Dec 21 2020, 11:37 AM
thakis edited the summary of this revision. (Show Details)

address XXXs about weak private externs and add test for them

I'll reply to comments in a bit. This is now done (except for review comments).

thakis added inline comments.Dec 21 2020, 11:42 AM
lld/MachO/SyntheticSections.cpp
381

This is the only change from defined->isWeakDef() && defined->isExternal() to isExternalWeakDef() where I couldn't come up with a test case that shows the difference.

thakis updated this revision to Diff 313172.Dec 21 2020, 12:06 PM
thakis marked 4 inline comments as done.

address comments

Comments addressed, thanks!

Re binding vs private externals: See also "references to internal symbol never need binding" comments in ld64.

lld/MachO/Driver.cpp
528

Yup, done. That's a left-over from an earlier iteration.

lld/MachO/Symbols.h
28

This ctor is exclusively used for the __dyld_private symbol, so performance probably isn't all that important :) Do you expect this to be used more often? Maybe it's not intentional that it's used so little?

But sure, undid this and instead imported the technique for dealing with this from ELF to Symbol::getName().

123

The current setup has the advantage that the external bit can stay const, which is why I when with two independent bits. It's also closer to how things look in the input and output files. ld64 does this dance where it converts all file bits to its own concepts at input and back at output and I find that more confusing then helpful.

I don't feel terribly strongly about that though. If you want I can change it to an enum.

lld/test/MachO/private-extern.s
118

No strong opinion. The advantage of no hyphens is that -NOT, -DAG etc look more obviously special, the downside is that it's a bit harder to read. Added hyphens.

thakis edited the summary of this revision. (Show Details)Dec 21 2020, 12:07 PM
int3 accepted this revision.Dec 21 2020, 4:50 PM
int3 added a subscriber: smeenai.

nice work!

lld/MachO/SymbolTable.cpp
15

leftover include?

lld/MachO/Symbols.h
28

oops, not having that ctor used elsewhere was an oversight I guess

51

is the cast necessary?

123

ah fair enough. it's fine as-is

lld/test/MachO/private-extern.s
106

ultra nit

lld/test/MachO/weak-private-extern.s
16
21–22

other reviewers have expressed a preference against using obj2yaml in tests unless absolutely necessary... though I don't remember the exact reasons (cc @smeenai). In this case I think we could use llvm-readobj --syms -- it would make the expected value more readable too

26

I was going to suggest that it might be worth testing TLV weak private externs too, then I realized that we lacked a test for regular TLV weakdefs 🤭

we could address both in a separate diff I guess

This revision is now accepted and ready to land.Dec 21 2020, 4:50 PM
thakis marked 4 inline comments as done.Dec 21 2020, 6:22 PM

Thanks! Submitting.

lld/MachO/Symbols.h
51

../../lld/MachO/Symbols.h:51:18: warning: comparison of integers of different signs: 'uint32_t' (aka 'unsigned int') and 'int' [-Wsign-compare]

lld/test/MachO/weak-private-extern.s
21–22

Oh yeah, that's nicer, thanks.

26

Will send you tests for TLV weaks tomorrow :)

thakis marked an inline comment as done.Dec 21 2020, 6:24 PM

(Also makes linking Chromium Framework faster >20%, from ~5.1s to ~3.8s or so. Didn't profile why -- maybe the export trie construction algorithm needs some speeding up if that should be the cause. I guess I should profile before/after.)

This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptDec 21 2020, 6:26 PM

Actual bench.py / ministat before/after data:

    N           Min           Max        Median           Avg        Stddev
x   5     4.9493809      5.168751     5.0355039     5.0416348   0.082035677
+   5     3.7375929     3.9309309     3.7923329     3.8123539   0.077711957
Difference at 95.0% confidence
	-1.22928 +/- 0.116534
	-24.3826% +/- 2.31143%
	(Student's t, pooled s = 0.0799031)

Here's a profile of the "before" run: https://imgur.com/a/fzG2yqw In the "after" profile, TrieBuilder isn't in the profile at all (…not surprisingly, since it's ~20 exports vs 797022 before). So for links that have many exports (possibly Chromiums component build; haven't tried that yet, but it's many smaller dylibs there instead of a single huge one), TrieBuilder might need work at some point in the future. For now, this change here sidesteps that :)