This is an archive of the discontinued LLVM Phabricator instance.

[llvm-nm][MachO] Add support for `MH_FILESET`
ClosedPublic

Authored by antoniofrighetto on Aug 31 2023, 9:24 AM.

Details

Summary

This patch builds upon Jonas’ D132432 and aims to support symbols printing of a MH_FILESET MachO within llvm-nm. Currently llvm-nm returns “no symbols” for a fileset; as a result, discerning the presence of symbols within MachO entries may require manual inspection of the binary. One concern is that, among other things, this might inadvertently lead to the improper assumption that symbols may have been stripped (as LC_SYMTAB of main MachO has no symbols). This patch addresses this issue by establishing whether symbols actually exist or not within each MachO entry.

One possible implementation is to specialize dumpSymbolNamesFromObject function to encompass fileset handling as well. While it may be slightly suboptimal to retain the entire MachO in memory for each LC_FILESET_ENTRY, this ensures accurate MachO parsing, let us extend MachOObjectFile constructor to include just an offset that locates each MachO entry from the file's start, and it makes the patch as non-invasive as possible. Also, this would be exceptionally done only for a specific type of MachO. Note that the MachO would still need to be parsed (at least) twice, once to determine the MachO type, and again to locate the LC_SYMTAB within the entry.


The test is a simple mock MH_FILESET binary, generated from an iOS kernelcache, and trimmed so as to preserve one load command (a LC_FILESET_ENTRY), another LC within a MachO entry (a LC_SYMTAB), and the actual symbol table.

Diff Detail

Event Timeline

Herald added a project: Restricted Project. · View Herald Transcript
Herald added a subscriber: hiraditya. · View Herald Transcript
antoniofrighetto requested review of this revision.Aug 31 2023, 9:24 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 31 2023, 9:24 AM

Binary file was not uploaded correctly, attempting to fix this.

fileset.macho-aarch64 test can be found at https://reviews.llvm.org/file/view/28943932/hexdump, as, for some reason, Phabricator does not seem to handle the binary gracefully (despite --binary option to git diff), leading pre-merge checks to fail.

JDevlieghere added inline comments.Sep 1 2023, 7:51 AM
llvm/include/llvm/BinaryFormat/MachO.h
901–906

Why is the ifdef necessary? At first I thought the layout of the struct is different for 32-bit vs 64-bit targets but looking at loader.h I don't see any immediate evidence of that. Either way, changing the layout based on a compile-time constant/host property seems almost guaranteed to be wrong.

llvm/include/llvm/BinaryFormat/MachO.h
901–906

Right, clearly the fact of compiling LLVM for a 32-bit target does not entail that the lc_str union has a ptr member. I think my original intent here was – mistakenly – to adhere to the XNU header, but the layout of the struct should remain constant regardless of the host's architecture (I still left the union just to reflect the header). Dropped it, thanks!

This revision is now accepted and ready to land.Sep 1 2023, 2:38 PM

Thanks for reviewing this! :)

This revision was landed with ongoing or failed builds.Sep 5 2023, 9:51 AM
This revision was automatically updated to reflect the committed changes.