This is an archive of the discontinued LLVM Phabricator instance.

[lld-macho] Change loadReexport to handle the case where a TAPI re-exports to reference documents nested within other TBD.
ClosedPublic

Authored by oontvoo on Feb 24 2021, 8:57 PM.

Details

Summary

Currently, it was delibrately impleneted to not handle this case, but as it has turnt out, we need this feature.
The concrete use case is

`System/Library/Frameworks/Cocoa.framework/Versions/A/Cocoa` reexports
        /System/Library/Frameworks/AppKit.framework/Versions/C/AppKit , which then rexports
             /System/Library/PrivateFrameworks/UIFoundation.framework/Versions/A/UIFoundation`

The current implemention uses a global currentTopLevelTapi, which is not reset until it finishes loading the whole tree.
This is a problem because if the top-level is set to Cocoa, then when we get to UIFoundation, it will try to find UIFoundation in the current top level, which is Cocoa and will not find it.

The rules should be:

  • When loading a library from a TBD file, re-exports need to be looked up in the auxiliary documents within the same TBD.
  • When loading from an actual dylib, no additional TBD documents need to be examined.
  • In no case does a re-export mentioned in one TBD file need to be looked up in a document in an auxiliary document from a different TBD file

OPEN-ISSUE:

Haven't been able to write a good test for this. The test checked in here, ideally, should have failed with errors like lld: error: unable to locate re-export with install name <path to the second-level dylib> before this patch. Unfortunately, it kept failing with "undefined symbols". Can someone tell me what I did wrong in the test?

Diff Detail

Event Timeline

oontvoo created this revision.Feb 24 2021, 8:57 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 24 2021, 8:57 PM
Herald added a reviewer: Restricted Project. · View Herald Transcript
oontvoo requested review of this revision.Feb 24 2021, 8:57 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 24 2021, 8:57 PM
oontvoo edited the summary of this revision. (Show Details)Feb 24 2021, 8:58 PM
oontvoo updated this revision to Diff 326277.Feb 24 2021, 9:01 PM

<cosmetic changes only>

IIUC, the requirement is:

  • When loading a library from a TBD file, re-exports need to be looked up in the auxiliary documents within the same TBD. (even if no file on disk exists for the referenced library).
  • When loading from an actual dylib, no additional TBD documents need to be examined.
  • In no case does a re-export mentioned in one TBD file need to be looked up in a document in an auxiliary document from a different TBD file

So, with the addition of getParent() to InterfaceFile, I think this logic can be fully within within the Dylib(InterfaceFile*, ...) constructor, no external info required. It just needs to pass "interface" down to loadReexport/loadReexportHelper (which can call getParent() before accessing documents()).

So, I think all the other API changes can be reverted as unnecessary.

As for the test: Looks like you have only one TBD file, rather than the two in the original example: Cocoa (in Cocoa.tbd) re-exports AppKit (in AppKit.tbd) re-exports UIFoundation (also in AppKit.tbd). For your test, if you add a new "SystemReexport.tbd" which re-exports libSystem, I think that'd show the problem.

lld/MachO/Driver.h
18

LLVM doesn't require C++17 yet, so you can't use this syntax.

int3 added a comment.Feb 26 2021, 10:55 AM

Like the bundle_loader diff, this one needs more motivating detail in the commit message. It doesn't help that the Cocoa / AppKit libraries on my system don't include TBDs (perhaps because I'm not yet on Big Sur), so I'm not entirely sure what your commit message example is supposed to show. (And we should not assume that future readers will have a local system that matches yours, either.)

@jyknight's comment helps clarify things a bit, but I'm still not sure why we need a parent pointer instead of the global currentTopLevelTapi pointer that we currently have. Is it because we might have a TBD file with child documents that re-exports another TBD file, also with child documents?

lld/test/MachO/Inputs/iPhoneSimulator.sdk/usr/lib/libSystem.tbd
7–9

how is this different from the re-exports list on line 12 below?

(Scanning through TextStub.cpp, it seems like reexported-libraries is a TBD v4 thing, but this file is TBD v3, so I'm not sure it's doing anything. Are the system TBDs on your machine v4?)

60

nit: I would prefer we keep libSystem.tbd looking like a subset of the real libSystem, i.e. ideally this symbol would be something that's actually defined in libsystem_sim_pthread_host

int3 added inline comments.Feb 26 2021, 11:43 AM
lld/test/MachO/Inputs/iPhoneSimulator.sdk/usr/lib/libSystem.tbd
7–9

(perhaps this is why your test wasn't failing as expected)

oontvoo marked 3 inline comments as done.Feb 26 2021, 1:01 PM

As for the test: Looks like you have only one TBD file, rather than the two in the original example: Cocoa (in Cocoa.tbd) re-exports AppKit (in AppKit.tbd) re-exports UIFoundation (also in AppKit.tbd). For your test, if you add a new "SystemReexport.tbd" which re-exports libSystem, I think that'd show the problem.

I've added a libReexportSystem.tbd, which reexport libSystem.tbd but still coulnd't get it to reproduce the re-export error ... :(
I think we have to tell the test to say load the libReexportSystem right?

Like the bundle_loader diff, this one needs more motivating detail in the commit message. It doesn't help that the Cocoa / AppKit libraries on my system don't include TBDs (perhaps because I'm not yet on Big Sur), so I'm not entirely sure what your commit message example is supposed to show. (And we should not assume that future readers will have a local system that matches yours, either.)

@jyknight's comment helps clarify things a bit, but I'm still not sure why we need a parent pointer instead of the global currentTopLevelTapi pointer that we currently have. Is it because we might have a TBD file with child documents that re-exports another TBD file, also with child documents?

Yes, that's correct. Say you have A re-exports B, B (which has some child docs) reexports c (where c is defined in B)
If you keep the globalTopLevelTapi as A, then in reexportHelper, you can't find c in there.

lld/test/MachO/Inputs/iPhoneSimulator.sdk/usr/lib/libSystem.tbd
7–9

Yes, good point. This is TBD v4.
I've reverted changes to this file and added a new libReexportSystem (as suggested by jyknight)

oontvoo updated this revision to Diff 326781.Feb 26 2021, 1:01 PM
oontvoo marked an inline comment as done.

reverted unneccesary API changes.

oontvoo updated this revision to Diff 326784.Feb 26 2021, 1:05 PM

clang-format

oontvoo updated this revision to Diff 326785.Feb 26 2021, 1:08 PM

updated commit message with more details

oontvoo edited the summary of this revision. (Show Details)Feb 26 2021, 1:09 PM
oontvoo edited the summary of this revision. (Show Details)
int3 added inline comments.Feb 26 2021, 2:21 PM
lld/MachO/Driver.h
21
lld/MachO/InputFiles.cpp
543–544

since this comment no longer precedes a variable declaration, it's not entirely clear what it's referring to any more :)

546–551

this comment still seems relevant

707–708

naming convention is camelCase

708–709

Isn't the current interface having no parent the same as a TAPI being the top-level one?

710–713

nit: this could be a ternary

also, since there are only two levels in the document tree, I guess this should just be "second level exporters", no "+"

lld/test/MachO/Inputs/iPhoneSimulator.sdk/usr/lib/libReexportSystem.tbd
7

usually the install name matches the tbd name

9
lld/test/MachO/reexport-nested-lib.s
3
6

I like to

6

lt doesn't seem like libReexportSystem is actually being used in this test

Also, there doesn't seem to be a need to use the iPhoneSimulator SDK over the macOS SDK here -- %lld defaults to include the macOS SDK as syslibroot, so it's a bit more convenient to use that.

Another nit: I would rather libReexportSystem to not be placed under /usr/lib -- but I acknowledge there's no strong rationale here, I just like our fake /usr/lib to look as much like a realistic /usr/lib as possible. Personally I would place that tbd directly under %S/Inputs.

Finally, I think it's nice to test that the binary contains the right symbol binding via llvm-objdump --macho --bind, rather than just assuming it from a successful LLD exit code.

15

I would add a comment explaining why __crashreporter_info__ is an interesting symbol to test here

int3 added inline comments.Feb 26 2021, 2:22 PM
lld/test/MachO/reexport-nested-lib.s
6

I like to

sorry, stray comment, ignore it

Harbormaster completed remote builds in B91106: Diff 326784.
oontvoo marked 13 inline comments as done.Mar 1 2021, 1:24 PM
oontvoo added inline comments.
lld/test/MachO/reexport-nested-lib.s
6

lt doesn't seem like libReexportSystem is actually being used in this test

Also, there doesn't seem to be a need to use the iPhoneSimulator SDK over the macOS SDK here -- %lld defaults to include the macOS SDK as syslibroot, so it's a bit more convenient to use that.

Another nit: I would rather libReexportSystem to not be placed under /usr/lib -- but I acknowledge there's no strong rationale here, I just like our fake /usr/lib to look as much like a realistic /usr/lib as possible. Personally I would place that tbd directly under %S/Inputs.

Finally, I think it's nice to test that the binary contains the right symbol binding via llvm-objdump --macho --bind, rather than just assuming it from a successful LLD exit code.

oontvoo updated this revision to Diff 327265.Mar 1 2021, 1:25 PM
oontvoo marked an inline comment as done.

Updated diff:

  • fixed styling/comment issues
  • updated test
int3 added inline comments.Mar 1 2021, 1:28 PM
lld/test/MachO/reexport-nested-lib.s
6

Did you forget to type something? You just quoted my comment in its entirety :P

oontvoo updated this revision to Diff 327271.Mar 1 2021, 1:37 PM

updated test to check for the symbols in result binary

oontvoo added inline comments.Mar 1 2021, 1:46 PM
lld/test/MachO/reexport-nested-lib.s
6

Did you forget to type something? You just quoted my comment in its entirety :P

Yes, sorry! I thought I typed the responses in-line. not sure why it's not showing up.

Anyway, re-posting it again:

lt doesn't seem like libReexportSystem is actually being used in this test

Fixed (updated the test to reference symbols from both the exporter and the exportee.

Also, there doesn't seem to be a need to use the iPhoneSimulator SDK over the macOS SDK here -- %lld defaults to include the macOS SDK as syslibroot, so it's a bit more convenient to use that.

The macOS SDK's libSystem.tbd doesn't have a "benign" symbol (like crash_reporter symbol) I could use, so I thought it was easier to use the iOS.

Another nit: I would rather libReexportSystem to not be placed under /usr/lib -- but I acknowledge there's no strong rationale here, I just like our fake /usr/lib to look as much like a realistic /usr/lib as possible. Personally I would place that tbd directly under %S/Inputs.
Finally, I think it's nice to test that the binary contains the right symbol binding via llvm-objdump --macho --bind, rather than just

assuming it from a successful LLD exit code.

Added the objdump CHECKs

int3 added inline comments.Mar 1 2021, 4:20 PM
lld/test/MachO/reexport-nested-lib.s
20–22

I don't think this is testing the right thing. (The test passes for me even without your code changes applied.)

I believe the case we are trying to fix is as follows:

  • libFooBar.tbd contains libFoo.dylib and libBar.dylib. libFoo re-exports libBar. libBar contains some symbol Bar.
  • libReexportsFoo.tbd contains libReexportsFoo.dylib which re-exports libFoo. It doesn't matter whether it re-exports anything else.

Prior to this fix, we would have failed to look up libBar because we would be looking inside libReexportsFoo.tbd, instead of libFoo.tbd.

Right now we are testing that a symbol in libFoo can be resolved by the linker, but we actually need to test for a symbol that's in a library that libFoo (libSystem in your case) re-exports. And FWIW, the fake macOS libSystem has __nan which should fit the bill :)

P.S. it would be good to have a comment that explains the motivation behind the test. Right now the comment talks about re-exports but doesn't touch on the TBD files that contain them, which I think is the key bit.

oontvoo updated this revision to Diff 327324.Mar 1 2021, 4:48 PM

Updated test. Also added a brief description on what it's testing

oontvoo marked an inline comment as done.Mar 1 2021, 4:49 PM
oontvoo added inline comments.
lld/test/MachO/reexport-nested-lib.s
20–22

Yeah, that's the right summary of it. :) Thanks.

Updated the test. Confirmed that w/o the patch, it'd fail with something like this:

--
lld: error: unable to locate re-export with install name /usr/lib/system/libdyld.dylib
lld: error: unable to locate re-export with install name /usr/lib/system/libsystem_c.dylib
lld: error: unable to locate re-export with install name /usr/lib/system/libsystem_m.dylib
oontvoo updated this revision to Diff 327325.Mar 1 2021, 4:51 PM
oontvoo marked an inline comment as done.

fixed stale comment

int3 accepted this revision.Mar 2 2021, 8:50 AM

lgtm

lld/test/MachO/reexport-nested-lib.s
10
This revision is now accepted and ready to land.Mar 2 2021, 8:50 AM
oontvoo updated this revision to Diff 327483.Mar 2 2021, 9:14 AM

updated diff

This revision was landed with ongoing or failed builds.Mar 2 2021, 9:15 AM
This revision was automatically updated to reflect the committed changes.