This is an archive of the discontinued LLVM Phabricator instance.

[lld-macho] Handle weak vs strong case in symbol resolution
ClosedPublic

Authored by oontvoo on Nov 1 2021, 7:13 PM.

Details

Summary

Previously, LLD only allows symbols with the same name if all of them are weak.

With this patch, we can allow symbols with the same name if one of them is weak. (This takes care of a TODO)

PR/52372

Diff Detail

Event Timeline

oontvoo created this revision.Nov 1 2021, 7:13 PM
Herald added a project: Restricted Project. · View Herald Transcript
Herald added a reviewer: Restricted Project. · View Herald Transcript
oontvoo requested review of this revision.Nov 1 2021, 7:13 PM
Herald added a project: Restricted Project. · View Herald TranscriptNov 1 2021, 7:13 PM
oontvoo added a comment.EditedNov 1 2021, 7:20 PM

Oh, wait - actually we should just remove the assert
ICF also needs fold CompactUnwind ...

thakis requested changes to this revision.Nov 2 2021, 7:10 AM
thakis added a subscriber: thakis.

This looks wrong to me.

lld/MachO/Driver.cpp
1475 ↗(On Diff #383943)

We definitely want to run dead stripping before ICF (no need to spend time ICF'ing code that's dead -- dead-stripping is faster than ICFing). This doesn't lg.

This revision now requires changes to proceed.Nov 2 2021, 7:10 AM
oontvoo planned changes to this revision.Nov 2 2021, 8:12 AM
oontvoo added inline comments.
lld/MachO/Driver.cpp
1475 ↗(On Diff #383943)

Right - on the other hand, compactUnwindEntries need to be setup before deadstripping (but it shouldn't be done before ICF ...)

Let me re-think this ... there's a cyclic depependency here.

int3 added a comment.Nov 2 2021, 11:13 AM

I don't think this has anything to do with ICF. The assert was trying to verify that any subsections associated with a coalesced weak symbol has wasCoalesced = true. But now I see that it's not always true, because of this FIXME: https://github.com/llvm/llvm-project/blob/main/lld/MachO/SymbolTable.cpp#L73

We could *probably* remove the assert with no ill effects; I think the only effect is that we will emit some subsections containing dead code that have their associated compact unwind info dropped. But we should ideally handle the FIXME. @oontvoo do you want to take a stab at it, or should I?

oontvoo updated this revision to Diff 384206.Nov 2 2021, 1:30 PM

handle FIXME, as pointed out by int3

The test attached (and all the existing tests) here should cover it.
Please let me know if additional tests are needed

int3 added a comment.Nov 2 2021, 2:04 PM

We can probably just extend weak-definition-gc.s instead of creating a new test. And we should verify that the associated contents are indeed deleted.

IMO once we cover that case, I don't think it's particularly important to test that the assert in registerCompactUnwind isn't fired. The "coalesced symbol => coalesced section contents" invariant is the more fundamental thing.

cc @thakis since he wrote the original weak symbol GC code

lld/MachO/SymbolTable.cpp
73–78

why are we moving these outside the if block? we should be coalescing away the subsection associated with the weak symbol, but not the subsection associated with the non-weak symbol

90

given that we only assign to curIsec in one place, we can just inline the block below here

int3 added a comment.Nov 2 2021, 2:11 PM

Also we should test that coalescing works correctly regardless of which order we encounter the symbols (i.e. weak first then non-weak, and vice versa)

oontvoo updated this revision to Diff 384238.Nov 2 2021, 2:34 PM
oontvoo marked an inline comment as done.

updated diff

oontvoo marked an inline comment as done and an inline comment as not done.Nov 2 2021, 2:35 PM
oontvoo added inline comments.
lld/MachO/SymbolTable.cpp
73–78

Yes, for this if-branch (ie., both are weak, then we're coalescing the new one)
For the other branch, we're coalescing the weak symbol's (ie., the old one)

int3 added inline comments.Nov 2 2021, 2:43 PM
lld/MachO/SymbolTable.cpp
73–74

is this what ld64 does for these two attributes too? (can we test it?)

73–78

oh right. okay yeah I was confusing myself

oontvoo added inline comments.Nov 2 2021, 6:17 PM
lld/MachO/SymbolTable.cpp
73–74

No, this is NOT what LD64 does. (In fact this was one of the differences I was trying to note in the draft doc a couple weeks ago).
LD64 picks one of the two (based on things like scope, alignment, whether symbols are auto-hide ) and doesn't merge anything.
I guess we could imitate that here ... didn't want to change that since this current approach seems to be working fine...

int3 added inline comments.Nov 2 2021, 7:23 PM
lld/MachO/SymbolTable.cpp
73–74

gotcha. Nonetheless, this is changing our behavior slightly, since we are now setting referencedDynamically and noDeadStrip when coalescing a weak with a non-weak, whereas before we would only do it when both symbols were weak. Can we keep the previous behavior for now?

oontvoo updated this revision to Diff 384459.Nov 3 2021, 8:33 AM

move the attributes merging back to the weak-weak case to presrve old behaviour - also added some note in case we want to revisit in the future

lld/MachO/SymbolTable.cpp
73–74

^ is done but. leaving open in case thakis@ wants to chime in

thevinster added inline comments.
lld/test/MachO/dup-symbols-weak-def.s
16 ↗(On Diff #384459)

nit: I imagine this might be a name mangled from your internal systems. I think it would cleaner if we just renamed these to our usual names in all of our tests - foo, bar, baz, etc.

20 ↗(On Diff #384459)

nit: Here to .cfi_endproc, could we align this with .cfi_startproc? Same for the below function as well.

oontvoo updated this revision to Diff 384742.Nov 4 2021, 7:09 AM
oontvoo marked 2 inline comments as done.

fixed name in tests + fixed inconsistent indentations

thakis added inline comments.Nov 4 2021, 7:46 AM
lld/MachO/SymbolTable.cpp
69

There's a long comment about this in createDefined() in InputFiles.cpp. Maybe refer to that instead.

(Also, typo ,.)

int3 added a comment.Nov 4 2021, 7:46 AM

Hmm I think you missed my earlier comment about the test:

We can probably just extend weak-definition-gc.s instead of creating a new test. And we should verify that the associated contents are indeed deleted.

IMO once we cover that case, I don't think it's particularly important to test that the assert in registerCompactUnwind isn't fired. The "coalesced symbol => coalesced section contents" invariant is the more fundamental thing.

oontvoo updated this revision to Diff 385214.Nov 5 2021, 4:32 PM
oontvoo marked an inline comment as done.

added test

int3 added a comment.Nov 8 2021, 10:15 AM

Thanks! Last nits: can we remove the dup-symbols-weak-def.s test now since the new logic is already covered by weak-definition-gc.s, and can we update the commit message to reflect the new approach?

lld/MachO/SymbolTable.cpp
83

IMO this is redundant since we should never be using coalesced sections in any meaningful capacity. Maybe comment it out and leave a note as to why it's not needed?

lld/test/MachO/weak-definition-gc.s
83–85
int3 accepted this revision.Nov 8 2021, 11:21 AM
oontvoo updated this revision to Diff 385581.Nov 8 2021, 11:50 AM

deleted redundant tests + removed extra spaces in tests

oontvoo added inline comments.Nov 8 2021, 11:55 AM
lld/MachO/SymbolTable.cpp
83

This was needed because only **coalesced-weak** sections are omitted from output.

Eg., without this, you could see my new tests emitting the following (commentary is mine, of course):

$ otool -jtV /Users/vyng/repo/llvm-project/build_lld/tools/lld/test/MachO/Output/weak-definition-gc.s.tmp/weak-strong-mixed.dylib
/Users/vyng/repo/llvm-project/build_lld/tools/lld/test/MachO/Output/weak-definition-gc.s.tmp/weak-strong-mixed.dylib:
(__TEXT,__text) section
>>>>>> all this stuff is from the weak symbol and should not have made it here.
0000000000000440	55              	pushq	%rbp
0000000000000441	4889e5          	movq	%rsp, %rbp
0000000000000444	5d              	popq	%rbp
0000000000000445	c3              	retq
<<<<<<<   END
_foo:
0000000000000446	3333            	xorl	(%rbx), %esi
0000000000000448	3333            	xorl	(%rbx), %esi
000000000000044a	c3              	retq

We could change the logic everywhere but I thought the easiest way was to remove the weak symbol here (as if it'd never been added when we already had a strong symbol)

WDYT?

oontvoo retitled this revision from [lld-macho] Fix failed assertion in registerCompactUnwind to [lld-macho] Handle weak vs strong case in symbol resolution.Nov 8 2021, 1:07 PM
oontvoo edited the summary of this revision. (Show Details)
int3 added inline comments.Nov 8 2021, 1:58 PM
lld/MachO/SymbolTable.cpp
83

huh interesting. I thought we would have omitted the section content regardless of whether that symbol were removed from the coalesced section... but I'm too lazy to dig into it right now. I also realized that ConcatInputSection::foldIdentical also takes care to remove the symbols, and this is basically a truncated version of that. Yeah ship it, and maybe I'll look into it another day

oontvoo marked an inline comment as done.Nov 8 2021, 4:33 PM

Thanks!

lld/MachO/SymbolTable.cpp
83

foldIdenticalSections() isn't run for this test. (It's not run by default ....)
But yeah, i'll have a look in a separate patch.

oontvoo marked an inline comment as done.Nov 8 2021, 4:34 PM

@thakis friendly ping?

int3 accepted this revision.Nov 8 2021, 7:25 PM

my 2c is that you can land this without waiting for thakis :)

hmm, seems to break windows build - reverted to investigate ...

oontvoo updated this revision to Diff 385822.Nov 9 2021, 7:46 AM
oontvoo marked an inline comment as done.

s/$otool/llvm-otool

otool doesn't exist on non-OSX platforms!

oontvoo updated this revision to Diff 385839.EditedNov 9 2021, 8:30 AM
  • add llvm-otool to the set of tools used by test so that the bot will use the <build_dir>/bin/llvm-otool instead of the unqualified llvm-otool (which may not exist)
  • update tests since the latest (TOT) llvm-otool prints a space between two bytes and the old one doesn't.

the bots look green now - submitting now. Thanks!

oontvoo removed a reviewer: thakis.Nov 9 2021, 9:02 AM
This revision is now accepted and ready to land.Nov 9 2021, 9:02 AM