This is an archive of the discontinued LLVM Phabricator instance.

[lld-macho] Partial support for weak definitions
ClosedPublic

Authored by int3 on Jul 9 2020, 11:18 PM.

Details

Reviewers
compnerd
Group Reviewers
Restricted Project
Commits
rG31d58858425f: [lld-macho] Partial support for weak definitions
Summary

This diff adds support for weak definitions, though it doesn't handle weak
symbols in dylibs quite correctly -- we need to emit binding opcodes for them
in the weak binding section rather than the lazy binding section.

What *is* covered in this diff:

  1. Reading the weak flag from symbol table / export trie, and writing it to the export trie
  2. Refining the symbol table's rules for choosing one symbol definition over another. Wrote a few dozen test cases to make sure we were matching ld64's behavior.

We can now link basic C++ programs.

Diff Detail

Event Timeline

int3 created this revision.Jul 9 2020, 11:18 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 9 2020, 11:18 PM
compnerd added inline comments.
lld/MachO/ExportTrie.cpp
63

Can you please mark this explicit?

lld/MachO/ExportTrie.h
40 ↗(On Diff #276918)

Are we sure we wont need any other flags? I wonder if it's better to just treat weakness as a flag. IIRC, there is a EXPORT_SYMBOL_FLAGS_REEXPORT and EXPORT_SYMBOL_FLAGS_KIND_THREAD_LOCAL that would be fairly good to account for.

lld/MachO/InputFiles.cpp
234

It's okay if you want to use braces, but please use them on both sides. However, I think that this is better written as:

if (sym.n_type & N_EXT)
  return symtab->addDefined(name, isec, value, sym.n_desc & N_WEAK_DEF);
return make<Defined>(name, isec, value, sym.n_desc & N_WEAK_DEF);
lld/MachO/SymbolTable.cpp
51

What do you think of doing an early return instead if the symbol was inserted?

Can you explain why currently it is not an error if the symbol is not inserted and not defined? Seems like a comment for that would be good.

int3 updated this revision to Diff 277137.Jul 10 2020, 1:27 PM
int3 marked 6 inline comments as done.

address comments

lld/MachO/ExportTrie.h
40 ↗(On Diff #276918)

Yeah I wasn't super sure about how to organize this. My reason for making it a bool is so that the trie-decoding-specific logic can all live in ExportTrie.cpp. But having a whole bunch of boolean parameters in a callback isn't the prettiest either. But I figure we can switch it back later if necessary as we support more flags (or think of a different, cleaner API)

lld/MachO/InputFiles.cpp
234

oh yeah the braces were accidentally left over after some editing

lld/MachO/SymbolTable.cpp
51

not inserted and not defined => we will fall through to replaceSymbol below which will replace any Undefined/Lazy/Dylib symbols with the Defined one

compnerd added inline comments.Jul 14 2020, 9:04 AM
lld/MachO/ExportTrie.h
40 ↗(On Diff #276918)

I'm pretty confident that we will need to add a few more flags at the very least. IMO the migration from uint64_t to bool here is really a step backwards.

lld/MachO/SymbolTable.cpp
51

I think that my confusion here is that the not inserted implies that it already exists, but because the cast to Defined failed, it cannot be a defined symbol, so you are replacing it with an undefined symbol or or something else? Do we know what is replacing the symbol? If it is not defined, then why is this function called addDefined?

int3 marked 2 inline comments as done.Jul 14 2020, 11:55 PM
int3 added inline comments.
lld/MachO/ExportTrie.h
40 ↗(On Diff #276918)

okay, I don't feel super strongly about this so I can revert

lld/MachO/SymbolTable.cpp
51

the cast to Defined failing means that the existing symbol isn't Defined, and so we're replacing it with a Defined symbol. I'm not sure why you think we may be replacing it with an undefined symbol, but note that replaceSymbol is changing the value and type of s via placement new (maybe you thought it was inserting s?)

int3 updated this revision to Diff 278085.EditedJul 15 2020, 12:16 AM
int3 marked 2 inline comments as done.

revert to flags

smeenai added inline comments.
lld/MachO/InputFiles.cpp
398

Super super nit: convention is to format the comments as /*isWeakDef=*/false

lld/MachO/Symbols.h
57

How would you feel about making the field private and adding an accessor which asserts it's only called on the right kind of symbol? (Ordinarily I would have suggested a virtual method with subclass overrides, but it looks like lld keeps Symbol free of virtual methods, and with an assert we wouldn't have any overhead in no-assert builds.)

83

Not that it holds any meaning for undefined symbols, but how come we're setting isWeakDef to true?

107

Nit: missing the comment for the boolean argument.

lld/test/MachO/weak-definition-direct-fetch.s
31

LLD will read archives of all formats fine, but technically we should pass --format=darwin to get a Darwin archive.

48

Was the order of -lweakfoo and -lfoo meant to be switched in this line?

65

I assume the intended behavior is that whichever comes first on the command line (the archive or the object) wins out. Is it worth testing that explicitly? (As in, by creating a different object file that's part of the archive, perhaps with the symbol in a different section.)

70

Huh, interesting.

lld/test/MachO/weak-definition-indirect-fetch.s
7

"some other symbol in the archive member being referenced" would be more precise.

14

Is it necessary to have the callq _foo?

lld/test/MachO/weak-definition-order.s
13

Do we need this test for archives as well?

lld/test/MachO/weak-definition-over-dysym.s
27

Doesn't this contradict the test in weak-definition-direct-fetch that I commented "Huh, interesting" on?

Is weak-definition-direct-fetch not testing all these cases already?

int3 updated this revision to Diff 278375.Jul 15 2020, 11:24 PM
int3 marked 16 inline comments as done.

address comments

lld/MachO/Symbols.h
57

I actually add virtual methods to Symbol in D83603, so I can do it here too

83

no reason really, Undefined symbols are just weak compared to everything else, so I figured why not :p but yeah not having redundant members will make more sense

lld/test/MachO/weak-definition-direct-fetch.s
48

oops yeah

65

I'm confused. Was your comment intended for this line? We're only testing archive vs dylib here...

But in any case, for object vs archive, the object always wins out. Lines 82-85 demonstrate that.

lld/test/MachO/weak-definition-indirect-fetch.s
14

nope, will remove

lld/test/MachO/weak-definition-order.s
13

from the other tests it's pretty clear that the weak flag has no impact on archive symbols, so I figured it wasn't really necessary to test them here...

lld/test/MachO/weak-definition-over-dysym.s
27

The difference is that this isn't a "direct fetch". Pulling in _bar causes _foo to be pulled in too and take priority over the dylib symbol.

compnerd accepted this revision.Jul 16 2020, 10:56 AM
compnerd added inline comments.
lld/MachO/SymbolTable.cpp
51

Ah, now I follow. I think it would be nice to have that as a comment.

This revision is now accepted and ready to land.Jul 16 2020, 10:56 AM
int3 updated this revision to Diff 280602.Jul 24 2020, 3:11 PM
int3 marked 2 inline comments as done.

add comment

This revision was automatically updated to reflect the committed changes.