Page MenuHomePhabricator

[lld-macho] Handle encoding personalities of same names but different kinds
ClosedPublic

Authored by oontvoo on Aug 4 2021, 10:01 PM.

Details

Summary

Sometimes people intentionally re-define a dylib personlity symbol as a local defined symbol as a workaround to a ld -r bug.
As a result, we could see "too many personalities" to encode. This patch tries to handle this case by ignoring the local symbols entirely.

More generally, this patch also tries to resolve which personality symbols getting "picked" when there are more than one with the same name to avoid encoding too many personalities. (eg., personality symbols could be from 3 groups, local, global, and dylib. Granted, this is not a usual scenario but it does happen sometimes and LD64 seems to allow it - so I think LLD should , too.)

    
PR: 51262
    
Differential Revision: https://reviews.llvm.org/D107533

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
oontvoo updated this revision to Diff 365281.Aug 9 2021, 1:38 PM

readjust triple version

oontvoo edited the summary of this revision. (Show Details)Aug 9 2021, 1:45 PM
oontvoo updated this revision to Diff 365287.Aug 9 2021, 2:09 PM

add postlink check and fix lint error

oontvoo updated this revision to Diff 365328.Aug 9 2021, 6:26 PM

ignore tombstoned personality too

int3 added a comment.EditedAug 11 2021, 3:05 PM

Interesting example... after playing around with it for a bit, I'm not convinced that ld64 is actually behaving sensibly here πŸ˜•

First, when linking against combined.o, the dylib symbol "wins":

(base) ~/unwind: ld -arch "x86_64" -platform_version macos 10.15 11.0   -syslibroot "/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk" -lSystem -lc++ user_2.o combined.o -o out-ld64
ld: warning: object file (user_2.o) was built for newer macOS version (11.0) than being linked (10.15)
ld: warning: object file (combined.o) was built for newer macOS version (11.0) than being linked (10.15)
(base) ~/unwind: llvm-objdump --macho --bind out-ld64
out-ld64:

Bind table:
segment  section            address    type       addend dylib            symbol
__DATA_CONST __got              0x100004000 pointer         0 libc++           ___gxx_personality_v0

In contrast, the code in this diff lets the local symbol win. But... I put "wins" above in quotes because ld64 doesn't actually seem to consider the local symbol a valid candidate at all! If we remove -lc++:

(base) ~/unwind: ld -arch "x86_64" -platform_version macos 10.15 11.0   -syslibroot "/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk" -lSystem user_2.o combined.o -o out-ld64
ld: warning: object file (user_2.o) was built for newer macOS version (11.0) than being linked (10.15)
ld: warning: object file (combined.o) was built for newer macOS version (11.0) than being linked (10.15)
Undefined symbols for architecture x86_64:
  "___gxx_personality_v0", referenced from:
      _main in user_2.o
      Dwarf Exception Unwind Info (__eh_frame) in user_2.o
      _foo in combined.o
      Dwarf Exception Unwind Info (__eh_frame) in combined.o
ld: symbol(s) not found for architecture x86_64

However, linking user_1.o and defined.o in place of combined.o does cause the personalities to be resolved to the global symbol (as expected):

(base) ~/unwind: ld -arch "x86_64" -platform_version macos 10.15 11.0   -syslibroot "/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk" -lSystem user_2.o user_1.o defined.o -o out-ld64
ld: warning: object file (user_2.o) was built for newer macOS version (11.0) than being linked (10.15)
ld: warning: object file (user_1.o) was built for newer macOS version (11.0) than being linked (10.15)
ld: warning: object file (defined.o) was built for newer macOS version (11.0) than being linked (10.15)
ld: warning: could not create compact unwind for ___gxx_personality_v0: FDE found for zero size function
(base) ~/unwind: llvm-objdump --macho --bind out-ld64
out-ld64:

Bind table:
segment  section            address    type       addend dylib            symbol

I'm not convinced that we should attempt to emulate this bizarre behavior. I don't see why the result of a link should differ depending on whether some of its intermediate inputs have been run through ld -r. Would it be possible to change your build system to work around this instead?

Interesting example...

Yep! this is "inspired" by for ios apps that try really hard to do funny things static linking

after playing around with it for a bit, I'm not convinced that ld64 is actually behaving sensibly here πŸ˜•

First, when linking against combined.o, the dylib symbol "wins":
In contrast, the code in this diff lets the local symbol win. But... I put "wins" above in quotes because ld64 doesn't actually seem to consider the local symbol a valid candidate at all! If we remove -lc++:

Actually this diff doesn't change which symbol "wins" in the final output - (it's pretty inconsequential to that)
Similarly, without supplying -lc++, LLD would also fail with undefined symbol here.

$  ../bin/ld64.lld.darwinnew   -arch "x86_64" -platform_version macos 10.15 11.0   -syslibroot "/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk"  -lc++ -lSystem user_2.o combined.o -o lld_output
ld64.lld.darwinnew: warning: user_2.o has version 11.0.0, which is newer than target minimum of 10.15
ld64.lld.darwinnew: warning: combined.o has version 11.0.0, which is newer than target minimum of 10.15

$ llvm-objdump --macho --bind lld_output 
lld_output:
Bind table:
segment  section            address    type       addend dylib            symbol
__DATA_CONST __got              0x100002000 pointer         0 libc++           ___gxx_personality_v0

$  ../bin/ld64.lld.darwinnew   -arch "x86_64" -platform_version macos 10.15 11.0   -syslibroot "/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk"   -lSystem user_2.o combined.o -o lld_output 
ld64.lld.darwinnew: warning: user_2.o has version 11.0.0, which is newer than target minimum of 10.15
ld64.lld.darwinnew: warning: combined.o has version 11.0.0, which is newer than target minimum of 10.15
ld64.lld.darwinnew: error: undefined symbol: ___gxx_personality_v0
>>> referenced by user_2.o
ld64.lld.darwinnew: error: undefined symbol: ___gxx_personality_v0
>>> referenced by user_2.o

However, linking user_1.o and defined.o in place of combined.o does cause the personalities to be resolved to the global symbol (as expected):

user_1.o + defined.o is not the same thing as combined.o .
(The trick here was that ___gxx_personality_v0 was defined as private-extern in defined.o, ie., T, and ld -r changed it to local defined, ie t in combined.o)

I'm not convinced that we should attempt to emulate this bizarre behavior. I don't see why the result of a link should differ depending on whether some of its intermediate inputs have been run through ld -r. Would it be possible to change your build system to work around this instead?

I think the most convoluted part here is how ld -r changes a a private extern to local πŸ€”
The rest of the things that we've observed here are by-products

I think the most convoluted part here is how ld -r changes a a private extern to local πŸ€”
The rest of the things that we've observed here are by-products

I believe -keep_private_externs (which isn't the default) will turn that off.

int3 added a comment.Aug 16 2021, 2:26 PM

Actually this diff doesn't change which symbol "wins" in the final output - (it's pretty inconsequential to that)

It changes which symbol "wins" as the personality symbol (as your "Prioritize Defined over others." comment indicates). ld64 always picks the dylib symbol; your diff seems to pick whichever symbol comes first:

(base) ~/unwind: ld64.lld -arch "x86_64" -platform_version macos 10.15 11.0   -syslibroot "/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk" -lSystem -lc++ combined.o user_2.o -o out-lld
ld64.lld: warning: combined.o has version 11.0.0, which is newer than target minimum of 10.15
ld64.lld: warning: user_2.o has version 11.0.0, which is newer than target minimum of 10.15
(base) ~/unwind:  llvm-objdump --macho --bind --unwind-info out-lld
out-lld:
Contents of __unwind_info section:
  Version:                                   0x1
  Common encodings array section offset:     0x1c
  Number of common encodings in array:       0x3
  Personality function array section offset: 0x28
  Number of personality functions in array:  0x1
  Index array section offset:                0x2c
  Number of indices in array:                0x2
  Common encodings: (count = 3)
    encoding[0]: 0x12020000
    encoding[1]: 0x02020000
    encoding[2]: 0x00000000
  Personality functions: (count = 1)
    personality[1]: 0x00002000
  Top level indices: (count = 2)
    [0]: function offset=0x00000000, 2nd level page offset=0x00000044, LSDA offset=0x00000044
    [1]: function offset=0x00000439, 2nd level page offset=0x00000000, LSDA offset=0x00000044
  LSDA descriptors:
  Second level indices:
    Second level index[0]: offset in section=0x00000044, base function offset=0x00000000
      [0]: function offset=0x00000000, encoding=0x12020000
      [1]: function offset=0x00000001, encoding=0x02020000
      [2]: function offset=0x00000430, encoding=0x00000000
      [3]: function offset=0x00000438, encoding=0x12020000

Bind table:
segment  section            address    type       addend dylib            symbol
__DATA_CONST __got              0x100002008 pointer         0 libc++           ___gxx_personality_v0

Note that the personality symbol does not match the dynamically bound address. (We should probably not be binding ___gxx_personality_v0 at all, but anyway...)

user_1.o + defined.o is not the same thing as combined.o .

I understood that. My point is that it *should* be the same thing. I would rather we not try to handle this weird edge case of ld64 if possible.

I believe -keep_private_externs (which isn't the default) will turn that off.

Nice, this seems to work in my local testing. @oontvoo, can you use that flag to work around the issue? Or are you trying to link some binary blob for which you don't have the source?

I believe -keep_private_externs (which isn't the default) will turn that off.

Thanks for the suggestion!

Nice, this seems to work in my local testing. @oontvoo, can you use that flag to work around the issue? Or are you trying to link some binary blob for which you don't have the source?

Well, in theory do have access to all the source files - but for this specific build, it's just a pre-compiled object file we have to deal with.
Let me talk to them and see if setting this flag would be an option.

Just so I have a sense, how strongly opposed are you to supporting this? :) 100% ? 80%, etc?

int3 added a comment.Aug 16 2021, 7:30 PM

Well, assuming it's multiple-choice, I pick 80% ;p

I mean, if you can't find a workaround, and/or there are multiple builds all over the place that need this, then I guess we should support it somehow

Re: setting -keep_private_externs

Turnt out, the flag was *not* set for a specific reason

https://chromium.googlesource.com/chromium/src/+/refs/heads/main/components/cronet/tools/hide_symbols.py#124

# When compiling for 64bit targets, the symbols in call_with_eh_frame.o are
# referenced in assembly and eventually stripped by the call to ld -r above,
# perhaps because the linker incorrectly assumes that those symbols are not
# used. Using -keep_private_externs fixes the compile issue, but breaks
# other parts of cronet. Instead, simply add a second .o file with the
# personality routine. Note that this issue was not caught by Chrome tests,
# it was only detected when apps tried to link the resulting .a file.

Maybe when/if we implement lld -r we can fix the bug here ...

But anyways, back the previous question of whether we want this, cronet (part of Chrominium) is a large part of what we want to support, so I'd really like to get this fixed somehow
(I see now this patch has a bug but it can be fixed).
Thoughts?

int3 added a comment.Aug 18 2021, 6:56 PM

Fair enough, let's support it then :)

I think we can have a considerably simpler implementation though, by handling this case as part of UnwindInfoSectionImpl<Ptr>::prepareRelocations. In particular, we already have logic that special-cases the personality relocation in order to handle section relocations: https://github.com/llvm/llvm-project/blob/main/lld/MachO/UnwindInfoSection.cpp#L212. We could add something that replaces local defined symbols with dylib symbols of the same name. Also, I would guess that it's pretty rare for personality symbols to be local, so we can skip the expensive string hash in the common case.

(Worth checking: what happens if we have local defined + global defined? Presumably the same sort of replacement happens?)

lld/MachO/UnwindInfoSection.cpp
318–319

do we really need to care about <internal> symbols?

oontvoo planned changes to this revision.Aug 30 2021, 11:08 AM

make picking a symbol deterministic

oontvoo updated this revision to Diff 371179.Sep 7 2021, 1:53 PM

addressed comment

(Worth checking: what happens if we have local defined + global defined? Presumably the same sort of replacement happens?)

I think if you have both local and global, then the local would take precedence (as expected).

oontvoo planned changes to this revision.Sep 7 2021, 2:58 PM

needs test for local vs global too

oontvoo updated this revision to Diff 371416.Sep 8 2021, 12:22 PM

more tests

I think this is ready for review. PTAL. Thanks!

I only reviewed the mechanical changes as I'm not too familiar with the functional changes. I'll let someone more experienced comment on that.

lld/MachO/UnwindInfoSection.cpp
225

Probably it's just my lack of knowledge here, but is this line useful? It eventually gets overridden in the start of the loop on L228 right?

309

Why does this need to be initialized to 0? Isn't it always going to be eventually defined below?

lld/test/MachO/compact-unwind-both-local-and-dylib-personality.s
26

Shouldn't there also be a check on the symbol table for b.out and c.out?

oontvoo marked an inline comment as done.Sep 9 2021, 6:04 PM
oontvoo added inline comments.
lld/MachO/UnwindInfoSection.cpp
225

No, it's a reference to the entry into the map. I guess it would've been clearer to have written:

Line: 228: auto it = personalityTable.find(....);
< .... >

HERE: it->second = s;
309

sorry - spurious change .

int3 added inline comments.Sep 9 2021, 10:57 PM
lld/MachO/UnwindInfoSection.cpp
145–146

hm, what do you mean "intentionally defined ... as a hack to get around another bug"? Did you figure out why ld -r is emitting these local symbols?

175–178

omg... I assume you've verified that this what ld64 does? this seems super bizarre. It's not even a valid total order!

179
213
oontvoo marked an inline comment as done.Sep 10 2021, 10:21 AM
oontvoo added inline comments.
lld/MachO/UnwindInfoSection.cpp
175–178

Yeah - the last two I could understand a little. The first two are completely unexpected.
Now that we're here, actually. I, personally, would be fine with LLD re-defining its own rules - as long as it's consistent.

(The problem I'm trying to fix here is the duplicate personality symbols - hence as long as we have a set of rules to pick one, i'm good)
Any inclination on which way to go from here?

oontvoo retitled this revision from Handle encoding personalities of same names but different kinds. to [lld-macho] Handle encoding personalities of same names but different kinds.Sep 10 2021, 11:47 AM
oontvoo edited the summary of this revision. (Show Details)
oontvoo added inline comments.Sep 10 2021, 2:21 PM
lld/MachO/UnwindInfoSection.cpp
145–146

To be clear, the "bug" here is *NOT* that ld -r produces local symbols for previously private extern symbols . In fact, it seems sort of reasonable(in hindsight 😊) - if you consider the output of ld -r to be a shared library, then making sure private symbols remain local makes sense.

The "bug" here is that the LD64 incorrectly thinks symbols from one of our .o files were not used, and therefore strip them from the final output when we run ld -r. We've been working around this by having an additional .o file containing the personality symbol as a local symbol.

int3 added inline comments.Sep 13 2021, 10:49 PM
lld/MachO/UnwindInfoSection.cpp
175–178

I was thinking we could special-case just local symbols. For each local personality symbol, we check the symbol table to see if an entry with the same name exists there. If so, we replace the relocation's reference to the local symbol with the symbol in the symbol table.

oontvoo marked 2 inline comments as done.Sep 16 2021, 11:24 AM
oontvoo added inline comments.
lld/MachO/UnwindInfoSection.cpp
175–178

Ok - let's try that - it might be simpler. (I wasn't super in favour of that because we might end up special-casing a lot of things ... in which case, it would've been better to do it in a separate pass, as attempted here).

oontvoo updated this revision to Diff 373017.Sep 16 2021, 11:25 AM
oontvoo marked an inline comment as done.
oontvoo retitled this revision from [lld-macho] Handle encoding personalities of same names but different kinds to [lld-macho] Handle encoding personalities of same names but different kinds.
oontvoo edited the summary of this revision. (Show Details)
  • Simplify the handling to just deal with local sym. (left a fixme on the possibly non-determinism of this)
  • Added more checks in test
int3 added inline comments.Sep 16 2021, 12:57 PM
lld/MachO/UnwindInfoSection.cpp
212

why not use SymbolTable::find() instead?

Right now, this just pick whatever comes first

There will only be one symbol of a given name in the global symbol table, so that issue is avoided too :)

oontvoo updated this revision to Diff 373067.Sep 16 2021, 2:12 PM
oontvoo marked an inline comment as done.

simplify

int3 accepted this revision.Sep 16 2021, 11:06 PM

Thanks!

lld/MachO/UnwindInfoSection.cpp
195
200

symtab->find will never return an Undefined, so we can use else if here

lld/test/MachO/compact-unwind-both-local-and-dylib-personality.s
18

let's try to be consistent with commas and spaces :) (same for the lines below)

19

-pie is enabled by default

36

suggestion: instead of _LOW and _HI, how about GXX_PERSONALITY_DYLIB and GXX_PERSONALITY?

57
58
This revision is now accepted and ready to land.Sep 16 2021, 11:06 PM
oontvoo updated this revision to Diff 373248.Sep 17 2021, 9:09 AM
oontvoo marked 6 inline comments as done.

fixed formating issues

lld/MachO/UnwindInfoSection.cpp
195

🧐 🧐 🧐

200

Ah, no actually - it *can* find an undefined (dylib) symbol. (The third testcase (c.out) covers this).

lld/test/MachO/compact-unwind-both-local-and-dylib-personality.s
18

(the 3 RUNs were consistent with each other tho)

36

But it's not the _DYLIB that's picked in b.out and c.out . so calling it that is a bit mis-leading, I think.

int3 added inline comments.Sep 17 2021, 9:52 AM
lld/MachO/UnwindInfoSection.cpp
200

oh yeah I forgot that while we exit early if some undefined symbols remain unresolved, that exit happens after this function runs...

lld/test/MachO/compact-unwind-both-local-and-dylib-personality.s
36

ah gotcha

oontvoo updated this revision to Diff 373265.Sep 17 2021, 9:59 AM

more minor formatting + rebase

thanks for the review!

This revision was landed with ongoing or failed builds.Sep 17 2021, 10:01 AM
This revision was automatically updated to reflect the committed changes.