Page MenuHomePhabricator

[llvm-objcopy][MachO] Add support for universal binaries

Authored by alexshap on Sep 28 2020, 3:42 AM.

Diff Detail

Event Timeline

alexshap created this revision.Sep 28 2020, 3:42 AM
alexshap created this object with visibility "All Users".
Herald added a project: Restricted Project. · View Herald Transcript
alexshap requested review of this revision.Sep 28 2020, 3:42 AM
alexshap updated this revision to Diff 294722.Sep 28 2020, 8:54 AM

Just a few minor comments from me, but I don't really follow the logic as always, so this will need others to comment on it.


You could also just inline this text directly in the file and feed %s as the input to llvm-as.


Here and the other consumeError site probably deserve comments explaining why it's okay to throw away the error reported.


Preferred error message style starts with a lower-case letter.

Is there anything that allows us to uniquely identify which slice is the problematic one (e.g. a member name or similar)? If so, it might be helpful to report it as part of the message.

alexshap updated this revision to Diff 294910.Sep 29 2020, 1:48 AM

Address comments

jhenderson added inline comments.Sep 30 2020, 12:25 AM

Not quite what I had in mind, but I guess this is okay.


I think my question is "why is it okay to ignore the type mismatch error?"

alexshap added inline comments.Sep 30 2020, 1:37 AM

Not sure what's expected here.
The interface of ObjectForArch is somewhat suboptimal and if I am not mistaken it has already been discussed in the past (i don't remember - online or offline), refining ObjectForArch (from libObject) / doing a global refactoring is a non-goal for the current effort,
this interface had been around long before llvm-objcopy showed up, it's used in multiple places, but if i remember the previous discussions correctly changing this interface was not a completely trivial problem.
Realistically this is what's going on here - the slice can be one of 3 types, the accessors return Expected and as you know the Error owned by an instance of Expected can't (and shouldn't) be ignored. In general, there are specifics of a particular call site, e.g. if the call site expects a slice of a particular type or if it handles just a few types then it can treat this error as a real error or consume it, but at the moment there is no other way to find out the actual type without calling these accessors. This makes sense to some extent but has the aforementioned unfortunate consequences.
I see a couple of options for now: 1. the current code (it is similar to llvm-dwarfdump.cpp) 2. one can add one more check that the error being consumed is indeed the type mismatch. Perhaps, isNotObjectErrorInvalidFileType should do the trick. On the functional side this doesn't really change anything (if the slice is not a macho object nor is it an archive we report an error anyway).
@jhenderson - does it answer your question / have I missed something ?

jhenderson added inline comments.Sep 30 2020, 2:05 AM

Sorry for the confusion. Perhaps just changing the comment slightly to something like the following would clarify things:

// The methods getAsArchive, getAsObjectFile, getAsIRObject of the class
// ObjectForArch return an Error in case of the type mismatch. We need to check each
// in turn to see what kind of file this is, so ignore errors produced along the way.

(The key bit being that this answers the question why we shouldn't report the error for ArOrErr - because we also need to see if it is an object file)

alexshap updated this revision to Diff 295395.Sep 30 2020, 1:31 PM

Flesh out the comment.

jhenderson accepted this revision.Oct 1 2020, 1:27 AM

Looks good, but someone with Mach-O expertise probably wants to sign off on that side.

This revision is now accepted and ready to land.Oct 1 2020, 1:27 AM

LGTM barring the comment about more tests.


Is the expectation that this constructor will be used when the Archive & is from an existing universal binary, and it's therefore okay to bypass all the checks in the Slice::create overload that takes an Archive &? If so, we should explicitly comment that (preferably as a Doxygen comment in the header), otherwise people might be confused in the future about when they want to use which function.


All of these tests are just for copies. We should also have some tests for performing actual operations (e.g. stripping) to show that we correctly apply the operation to each slice.


You probably want an rm -f %t.archive.i386 before this, otherwise I believe ar will append to any existing file with that name (e.g. a leftover from a previous test run).


This is actually invalid, but I guess it doesn't hurt to support it. From the Apple man page for libtool:

Ranlib takes all correct forms of libraries (universal files containing archives, and simple archives) and updates the table of contents for all archives in the file. Ranlib also takes one common incorrect form of archive, an archive whose members are universal object files, adding or updating the table of contents and producing the library in correct form (a universal file containing multiple archives).


I'd up both of these to 2, since I imagine that's the most common universal binary size.

alexshap added inline comments.Oct 3 2020, 6:28 PM

yeah, this constructor takes cputype, cpusubtype, and align and doesn't infer them from the archive members (e.g. both lipo/llvm-lipo allow to specify a custom value for the alignment). The main use case is creating a Slice from an archive coming from a universal binary.
I'll add a comment explaining these details.



alexshap added inline comments.Oct 3 2020, 6:37 PM

sure, I'll add a test for stripping as well

This revision was automatically updated to reflect the committed changes.
gribozavr2 changed the visibility from "All Users" to "Public (No Login Required)".Oct 6 2020, 2:09 AM

Unfortunately I see this change causing ASan failures in llvm-objcopy/MachO/strip-all.test:

ERROR: AddressSanitizer: stack-use-after-scope on address 0x7fed6fb0f550 at pc 0x55e8dd15e257 bp 0x7ffe7c565d90 sp 0x7ffe7c565558
READ of size 6 at 0x7fed6fb0f550 thread T0
    #0 0x55e8dd15e256 in __asan_memcpy llvm-project/compiler-rt/lib/asan/asan_interceptors_memintrinsics.cpp:22:3
    #1 0x55e8de076a4b in CopyStringRef(char*, llvm::StringRef) llvm-project/llvm/lib/Support/MemoryBuffer.cpp:61:5
    #2 0x55e8de07678e in llvm::WritableMemoryBuffer::getNewUninitMemBuffer(unsigned long, llvm::Twine const&) llvm-project/llvm/lib/Support/MemoryBuffer.cpp:303:3
    #3 0x55e8de076d3e in llvm::WritableMemoryBuffer::getNewMemBuffer(unsigned long, llvm::Twine const&) llvm-project/llvm/lib/Support/MemoryBuffer.cpp:315:13
    #4 0x55e8dd178c27 in llvm::objcopy::MemBuffer::allocate(unsigned long) llvm-project/llvm/tools/llvm-objcopy/Buffer.cpp:64:9
    #5 0x55e8dd379b13 in llvm::objcopy::macho::MachOWriter::write() llvm-project/llvm/tools/llvm-objcopy/MachO/MachOWriter.cpp:502:19
    #6 0x55e8dd33b80e in llvm::objcopy::macho::executeObjcopyOnBinary(llvm::objcopy::CopyConfig const&, llvm::object::MachOObjectFile&, llvm::objcopy::Buffer&) llvm-project/llvm/tools/llvm-objcopy/MachO/MachOObjcopy.cpp:390:17
    #7 0x55e8dd33d216 in llvm::objcopy::macho::executeObjcopyOnMachOUniversalBinary(llvm::objcopy::CopyConfig&, llvm::object::MachOUniversalBinary const&, llvm::objcopy::Buffer&) llvm-project/llvm/tools/llvm-objcopy/MachO/MachOObjcopy.cpp:439:19
    #8 0x55e8dd38bcf2 in executeObjcopyOnBinary(llvm::objcopy::CopyConfig&, llvm::object::Binary&, llvm::objcopy::Buffer&) llvm-project/llvm/tools/llvm-objcopy/llvm-objcopy.cpp:150:12
    #9 0x55e8dd38d4bc in executeObjcopy(llvm::objcopy::CopyConfig&) llvm-project/llvm/tools/llvm-objcopy/llvm-objcopy.cpp:290:21

I reverted this commit in 8ed7946a7d94f9d23b7f33356a1903d481daa5a0 and 80f66ac0d544d2d9d3108033148d60bb4760b319.