This is an archive of the discontinued LLVM Phabricator instance.

Universal MachO: support LLVM IR objects
ClosedPublic

Authored by aguinet on Aug 11 2020, 8:42 AM.

Details

Summary

This adds support for LLVM IR "objects" within universal MachO, as this can happen when using LTO when compiling iOS/tvOS applications (or any universal MachO object compiled with clang).

This also improves llvm-lipo in order to support these LLVM IR objects. llvm-lipo is now able to extract/thin, create and dump information for universal MachO with LLVM IR bitcode objects.

One strategy to test this properly would require implementing LLVM IR description within the ObjectYAML library. Before doing so, I would like to here about other potential strategies!

Diff Detail

Event Timeline

aguinet created this revision.Aug 11 2020, 8:42 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 11 2020, 8:42 AM
aguinet requested review of this revision.Aug 11 2020, 8:42 AM
aguinet edited the summary of this revision. (Show Details)Aug 11 2020, 8:46 AM
aguinet edited the summary of this revision. (Show Details)Aug 11 2020, 9:46 AM

@aguinet,
thanks for the patch!

  1. This code has been recently refactored (see https://github.com/llvm/llvm-project/commit/df69492cdfa82ff6453c887cd45b3a5903b79afb ) so you'd need to rebase your diff.
  2. Bitcode files can also come from archives (static libraries) (e.g. an archive can contain bitcode files exclusively) - so when we create a Slice from an archive we also need to handle that.
  3. We need tests.

Somewhat related to this area - If I'm not mistaken there is a small issue in SymbolicFile::createSymbolicFile - when LLVMContext is null the function appears to return a misleading error - I was planning to fix that soon.

Hello @alexshap!

Thanks for the comments. Answers below:

In D85740#2210765, @alexshap wrote:
  1. This code has been recently refactored (see https://github.com/llvm/llvm-project/commit/df69492cdfa82ff6453c887cd45b3a5903b79afb ) so you'd need to rebase your diff.

I pushed a new patch rebased on master, sorry about this conflict (did this patch before going to holidays, should have rebased before submitting when coming back :)).

  1. Bitcode files can also come from archives (static libraries) (e.g. an archive can contain bitcode files exclusively) - so when we create a Slice from an archive we also need to handle that.

That's a good catch. I will take a closer look and adapt the patch accordingly.

  1. We need tests.

About that part, like I said in the description, the way I see it is to enhance the ObjectYAML library so that we can generate LLVM bitcode file format (as described in https://llvm.org/docs/BitCodeFormat.html), so that yaml2obj can generate appropriate LLVM IR test objects. That would be used for instance to test the -create flag. Do you see another way to do this?

Somewhat related to this area - If I'm not mistaken there is a small issue in SymbolicFile::createSymbolicFile - when LLVMContext is null the function appears to return a misleading error - I was planning to fix that soon.

Indeed. If LLVMCOntext is null, it says that it is in an invalid format. Not ideal what should be the error, as this is an error which is forwarded to the end user (and I'm not sure the user would understand "you need to pass an LLVMContext object to SymbolicFile::createSymbolicFile to parse IR objects" :))

aguinet updated this revision to Diff 285008.Aug 12 2020, 1:48 AM

About that part, like I said in the description, the way I see it is to enhance the ObjectYAML library so that we can generate LLVM bitcode file format (as described in https://llvm.org/docs/BitCodeFormat.html), so that yaml2obj can generate appropriate LLVM IR test objects. That would be used for instance to test the -create flag. Do you see another way to do this?

Can't you just use llvm-as to produce bitcode files and feed those to llvm-lipo?

About that part, like I said in the description, the way I see it is to enhance the ObjectYAML library so that we can generate LLVM bitcode file format (as described in https://llvm.org/docs/BitCodeFormat.html), so that yaml2obj can generate appropriate LLVM IR test objects. That would be used for instance to test the -create flag. Do you see another way to do this?

Can't you just use llvm-as to produce bitcode files and feed those to llvm-lipo?

Indeed, I can, that's easier... I will try and do that properly next week. Thanks for the hint!

llvm/tools/llvm-lipo/llvm-lipo.cpp
30

nit: these includes need to be moved up (if i remember correctly we usually try to sort includes alphabetically)

aguinet updated this revision to Diff 285958.EditedAug 17 2020, 4:18 AM

Hello @alexshap,

Here is a new patch that includes:

  • support for archives that includes LLVM IR objects
  • testing using llvm-ar
  • fix the order of includes in llvm-lipo (as per your comment)

It also includes a few refactoring of code.

I don't know how much we need to be consistent on the original lipo regarding the error message, but I tried to follow the existing convention.

aguinet updated this revision to Diff 285965.Aug 17 2020, 4:39 AM

Fix fatal error message of MachOUniversalBinary::ObjectForArch::getAsIRObject

aguinet marked an inline comment as done.Aug 17 2020, 5:36 AM
aguinet updated this revision to Diff 286001.Aug 17 2020, 6:26 AM

Clang-tidy + format remarks

aguinet updated this revision to Diff 287199.Aug 22 2020, 10:02 AM

Cleanup/refactoring

llvm/lib/Object/MachOUniversalWriter.cpp
150 ↗(On Diff #287199)

not particularly important, but I'd probably convert "if" + "continue" into

if ( ... ) { ... } else if ( ... ) { ... } else { ... }

(it seems to be more readable)

206 ↗(On Diff #287199)

perhaps, given the check above this assert is not super useful
if you decide to keep it then we need to add a string message

assert (.... && "....")
215 ↗(On Diff #287199)

nit:

  1. auto -> explicit type (since it's not clear from the right-hand side),
  2. for a single-line "if" braces are not necessary.

(and the same applies to lines 216 - 219)

llvm/tools/llvm-lipo/llvm-lipo.cpp
163

what would you say to the following renaming:

irSlice -> createSliceFromIR
archiveSlice -> createSliceFromArchive
163

I'm wondering if we really need to pass the name of the file here - can't it be extracted from the first argument (IRO->getFileName()) ?

(if yes - then probably it can be removed from archiveSlice as well)

P.S. https://llvm.org/doxygen/MachOUniversal_8cpp_source.html#l00078

558

nit: explicit type

724

nit: explicit type

744

nit: explicit type

llvm/lib/Object/MachOUniversalWriter.cpp
164 ↗(On Diff #287199)

nit: explicit type

Thanks @alexshap for your review and comments! Answers are inline.

I will try and push a new diff with the suggested modifications during the day.

llvm/lib/Object/MachOUniversalWriter.cpp
150 ↗(On Diff #287199)

Okay will fix with new diff!

206 ↗(On Diff #287199)

Indeed. I put it during development to make sure I didn't make any mistake, but I will remove it.

215 ↗(On Diff #287199)

Ack will fix with new diff

llvm/tools/llvm-lipo/llvm-lipo.cpp
163

Probably. I will test and let you know :)

No problem for me for the renaming you propose, will put it in the new diff.

aguinet updated this revision to Diff 287344.Aug 24 2020, 5:08 AM
aguinet marked 8 inline comments as done.

@alexshap there we go, it should fix all your remarks. Please let my know if there are still some issues.

aguinet updated this revision to Diff 287346.Aug 24 2020, 5:12 AM
aguinet marked an inline comment as done.

Just a small variable naming cleanup

This revision is now accepted and ready to land.Aug 24 2020, 9:50 AM

I think llvm-libtool-darwin would also need some adjustments to handle bitcode objects, archives, and universal binaries. That's definitely out of scope for this diff, but I'm just flagging it as another tool with missing bitcode support.

llvm/include/llvm/Object/MachOUniversalWriter.h
19 ↗(On Diff #287346)

Nit: how come you're including the header here instead of forward declaring IRObjectFile?

llvm/lib/Object/MachOUniversalWriter.cpp
91 ↗(On Diff #287346)

nit: a blank line between 90 and 91 would not hurt

aguinet updated this revision to Diff 287576.Aug 25 2020, 12:12 AM
aguinet marked an inline comment as done.

Fresh air!

aguinet marked an inline comment as done.Aug 25 2020, 12:15 AM

Thanks for the new reviews! Fixed in the incoming diff

llvm/include/llvm/Object/MachOUniversalWriter.h
19 ↗(On Diff #287346)

Nice catch! Fixed in the new diff

aguinet updated this revision to Diff 287578.Aug 25 2020, 12:15 AM
aguinet marked an inline comment as done.

@alexshap note that I don't have RW access to the llvm repository, so I can't merge this myself

aguinet updated this revision to Diff 287663.Aug 25 2020, 7:36 AM

This patch fixes one bug that was present: running llvm-lipo -archs on an LLVM IR object would crash, because we didn't check for that case in printBinaryArchs. This new patch fixes it, with an associated test.

This revision was landed with ongoing or failed builds.Aug 25 2020, 9:11 PM
This revision was automatically updated to reflect the committed changes.

Thanks for the merge and reviews @alexshap (and @smeenai) !