This is an archive of the discontinued LLVM Phabricator instance.

[lld-macho] Initial support for common symbols
ClosedPublic

Authored by int3 on Aug 31 2020, 11:10 PM.

Details

Summary

On Unix, it is traditionally allowed to write variable definitions without
initialization expressions (such as "int foo;") to header files. These are
called tentative definitions.

The compiler creates common symbols when it sees tentative definitions. When
linking the final binary, if there are remaining common symbols after name
resolution is complete, the linker converts them to regular defined symbols in
a __common section.

This diff implements most of that functionality, though we do not yet handle
the case where there are both common and non-common definitions of the same
symbol.

Diff Detail

Event Timeline

int3 created this revision.Aug 31 2020, 11:10 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 31 2020, 11:10 PM
int3 requested review of this revision.Aug 31 2020, 11:10 PM
MaskRay added a subscriber: MaskRay.EditedSep 3 2020, 5:34 PM

Some interesting tests

  • common.o regular.o - Does regular.o win?
  • common.o archive.a (archive.a has a regular definition) - Is the regular definition in archive.a fetched? GNU ld's ELF port fetches archive.a in this case (Sun/HP-UX behavior https://sourceware.org/pipermail/binutils/1999-December/001283.html). lld/ELF doesn't
  • common.o archive.a (archive.a has a common) - Is the regular definition in archive.a fetched?
  • common.o shared.so - Does common.o win? In ELF, common-shared.s
  • common0.o common1.o - the larger size is picked. the larger alignment is picked.
int3 added a comment.Sep 4 2020, 10:31 AM

@MaskRay -- I split the common vs non-common symbol handling into D86910. Sorry if that made it confusing... I think it covers all those cases you mentioned, aside from

common.o archive.a (archive.a has a common) - Is the regular definition in archive.a fetched?

I will add that.

common0.o common1.o - the larger size is picked. the larger alignment is picked.

Unlike lld-ELF, ld64 actually just picks the larger size, so I've emulated that here.

MaskRay added inline comments.Sep 4 2020, 1:55 PM
lld/MachO/SymbolTable.cpp
85

This is strange. Is it >?

int3 added inline comments.Sep 4 2020, 2:03 PM
lld/MachO/SymbolTable.cpp
85

This is intentional. The same-size.s test verifies this behavior, and can be used to check that ld64 does the same thing.

int3 planned changes to this revision.Sep 4 2020, 2:13 PM
int3 added inline comments.
lld/MachO/SymbolTable.cpp
85

actually, I may have messed up the test...

int3 added inline comments.Sep 4 2020, 3:04 PM
lld/MachO/SymbolTable.cpp
85

okay fixed. ld64 actually takes the smaller alignment if the sizes are equal. still different from lld-ELF's behavior though...

MaskRay added inline comments.Sep 4 2020, 3:13 PM
lld/MachO/SymbolTable.cpp
85

This rule is hard to believe...

A larger alignment imposes a larger requirement. Shouldn't the larger alignment be used?

In ELF, size and align are separately considered.

int3 added inline comments.Sep 4 2020, 3:35 PM
lld/MachO/SymbolTable.cpp
85

I dunno... ld64 really does seem to behave like this. I haven't found the corresponding part of its code that handles common symbol resolution though

ld64 does take a -max_default_common_align option, and I double-checked to make sure that the max was high enough that it wouldn't interfere with our test

int3 planned changes to this revision.Sep 4 2020, 3:44 PM
int3 updated this revision to Diff 290048.Sep 4 2020, 4:40 PM

revert to original behavior, fix test, handle zero alignment

lld/MachO/SymbolTable.cpp
85

My bad... my initial test was wrong in two places, but the >= was actually correct. I've fixed the test and reverted the behavior to the original. I'm fairly confident that it is correctly replicating ld64's behavior now, though I do agree it's a little weird

int3 updated this revision to Diff 293753.Sep 23 2020, 8:26 AM

update SymbolUnion

gkm accepted this revision.Sep 23 2020, 5:44 PM
gkm added a subscriber: gkm.

Macro llvm-logo: Ship It

lld/MachO/Driver.cpp
443

Nit, bordering on Super Nit:
Say more about the nature of this replacement

  • replaceCommonSymbolsWithDefinedSymbols ?
  • defineCommonSymbols ?
  • upgradeCommonSymbolsToDefined ?
This revision is now accepted and ready to land.Sep 23 2020, 5:44 PM
int3 added inline comments.Sep 23 2020, 7:18 PM
lld/MachO/Driver.cpp
443

I agree the name could be more informative, but this name is taken from a similar function in lld-elf, and I'd prefer to match them where possible

This revision was automatically updated to reflect the committed changes.
omjavaid reopened this revision.Sep 24 2020, 12:27 AM
omjavaid added a subscriber: omjavaid.

I have temporarily reverted this change as i broke LLDB build on Arm http://lab.llvm.org:8011/builders/lldb-arm-ubuntu/builds/4409

This revision is now accepted and ready to land.Sep 24 2020, 12:27 AM
omjavaid requested changes to this revision.Sep 24 2020, 12:27 AM
This revision now requires changes to proceed.Sep 24 2020, 12:27 AM
int3 added a comment.Sep 24 2020, 5:02 AM

Whoops, I hadn't realized that the test failure was related to this diff. Thanks for catching it.

This revision was not accepted when it landed; it landed in state Needs Revision.Sep 24 2020, 3:00 PM
This revision was automatically updated to reflect the committed changes.