This is an archive of the discontinued LLVM Phabricator instance.

[lld-macho] Dylib symbols should always replace undefined symbols
ClosedPublic

Authored by int3 on Apr 29 2020, 12:10 PM.

Details

Summary

Otherwise we get undefined symbol errors depending on the order of
arguments on the command line.

Depends on D78270.

Diff Detail

Event Timeline

int3 created this revision.Apr 29 2020, 12:10 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 29 2020, 12:10 PM
Harbormaster failed remote builds in B55157: Diff 260986!
Harbormaster failed remote builds in B55155: Diff 260982!
smeenai added inline comments.Apr 29 2020, 1:10 PM
lld/test/MachO/resolution.s
28–29

Where are _bar and _baz defined in this assembly file?

Not this diff, but it appears ld64 always prefers symbols from object files over symbols from dylibs. For example:

$ cat f1.c
int f() { return 1; }

$ cat f2.c:
int f() { return 2; }

$ cat main.c
int f(void);
int main() { return f(); }

$ clang -fPIC -dynamiclib -o libf1.dylib f1.c
$ clang -c f2.c
$ clang -c main.c

$ ld libf1.dylib main.o f2.o -lSystem
$ ./a.out; echo $?
2

$ ld main.o libf1.dylib f2.o -lSystem
$ ./a.out; echo $?
2

# sanity check
$ ld main.o libf1.dylib -lSystem

I believe we should already be implementing this behavior, courtesy of a Defined replacing any other symbol, but it's worth adding a test.

int3 marked an inline comment as done.Apr 29 2020, 7:54 PM
int3 added inline comments.
lld/test/MachO/resolution.s
28–29

ah, I was under the impression that I didn't need to define them because they were already appearing in the symtab, but I realize they were appearing as undefined symbols instead of the global/local defined symbols that I wanted. Will fix, thanks

I believe we should already be implementing this behavior, courtesy of a Defined replacing any other symbol, but it's worth adding a test.

Yeah this is what _bar and _baz are supposed to test :)

int3 updated this revision to Diff 261116.Apr 29 2020, 7:55 PM

fix test

smeenai added inline comments.Apr 30 2020, 12:18 AM
lld/test/MachO/resolution.s
28–29

For the particular case I was thinking of, you'd need to have the dylib appear in the link line before the object file, to confirm that a Defined symbol replaces a DylibSymbol.

Also, it's not too relevant for this particular test, but it's an interesting question of what it means to have a GOTPCREL relocation against a symbol in your own binary (especially a symbol that's not even external, like _baz here). ld64 creates GOT entries for those symbols. I believe we won't right now, since our GOT creation is based on whether the relocation is targeting a dylib symbol vs. what the type of the relocation is.

int3 marked an inline comment as done.Apr 30 2020, 2:43 AM
int3 added inline comments.
lld/test/MachO/resolution.s
28–29

Oh gotcha. And yeah, you're right, we're not handling those GOT relocations quite correctly right now...

int3 updated this revision to Diff 261164.Apr 30 2020, 2:43 AM

extend test, also simplify libresolution.s

This behavior is reasonable to me but where can we find authoritative symbol resolution rules? (You may be aware of https://lld.llvm.org/NewLLD.html "Efficient archive file handling". The ELF port has different resolution rules regarding archives.)

lld/test/MachO/resolution.s
3

Consider inlining short Inputs/* file:

# RUN: echo '.globl _foo, _bar, _baz; _foo: _bar: _baz' | \
# RUN:   llvm-mc -filetype=obj -triple=x86_64-apple-darwin - -o %t/libresolution.o
int3 marked an inline comment as done.May 5 2020, 4:46 PM

where can we find authoritative symbol resolution rules?

I guess the authoritative source is whatever ld64 is doing. I took a brief look at their SymbolTable.cpp but it seems rather complex, handling many kinds of symbols. The test behavior in this diff does emulate the observed behavior of ld64 though.

Thanks for the pointer to the ELF resolution rule design, will keep that in mind as we expand support for symbol kinds.

int3 updated this revision to Diff 262262.May 5 2020, 4:46 PM

define libresolution inline

smeenai accepted this revision.May 6 2020, 12:15 AM

LGTM.

where can we find authoritative symbol resolution rules?

I guess the authoritative source is whatever ld64 is doing. I took a brief look at their SymbolTable.cpp but it seems rather complex, handling many kinds of symbols. The test behavior in this diff does emulate the observed behavior of ld64 though.

Thanks for the pointer to the ELF resolution rule design, will keep that in mind as we expand support for symbol kinds.

I added some commentary on ld64's behavior in https://reviews.llvm.org/D78342#inline-728182.

This revision is now accepted and ready to land.May 6 2020, 12:15 AM
This revision was automatically updated to reflect the committed changes.