This is an archive of the discontinued LLVM Phabricator instance.

[llvm-objcopy][MachO] Implement a layout algorithm for executables
ClosedPublic

Authored by seiya on Jul 31 2019, 2:34 PM.

Event Timeline

seiya created this revision.Jul 31 2019, 2:34 PM
seiya updated this revision to Diff 212669.Jul 31 2019, 2:42 PM
  • Add a newline at end of file.

Just a couple of minor comments from me. I don't know enough about Mach-O to be able to review the majority of this.

llvm/test/tools/llvm-objcopy/MachO/basic-executable-copy.test
2

Add a comment to the top of this test saying what the purpose of the test is.

4

--file-headers is the canonical option, I believe (we provide --file-header as an alias, but we should probably try to only use the real one here).

6

There's a lot going on in this YAML. Is it the minimal it can be to show all the behaviour, or can it be reduced further?

llvm/test/tools/llvm-objcopy/MachO/basic-shared-library-copy.test
1 ↗(On Diff #212669)

Same comments as previous test.

seiya updated this revision to Diff 213577.Aug 6 2019, 4:05 AM
seiya marked 4 inline comments as done.
  • Addressed review comments.
seiya updated this revision to Diff 213579.Aug 6 2019, 4:11 AM
  • Remove basic-shared-library-copy.test because shared libraries support is not sufficient.
seiya retitled this revision from [llvm-objcopy][MachO] Implement a layout algorithm for executables/shared libraries to [llvm-objcopy][MachO] Implement a layout algorithm for executables.Aug 6 2019, 4:11 AM
seiya edited the summary of this revision. (Show Details)
seiya added inline comments.
llvm/test/tools/llvm-objcopy/MachO/basic-executable-copy.test
6

It's too long but I think we should keep this verbose YAML to test all executable/shared library-specific load commands like LC_LOAD_DYLINKER.

seiya marked an inline comment as done.EditedAug 6 2019, 4:13 AM

Removed "shared libraries support " from the title and description. I noticed that shared libraries support is not sufficient. I'll submit a separate patch for this at some time.

llvm/tools/llvm-objcopy/MachO/MachOLayoutBuilder.cpp
274

I mean here.

jhenderson added inline comments.Aug 6 2019, 4:34 AM
llvm/test/tools/llvm-objcopy/MachO/basic-executable-copy.test
2–3

Replace the second "This test" with "It uses"

some part -> some parts

seiya updated this revision to Diff 213739.Aug 6 2019, 3:41 PM
seiya marked an inline comment as done.
  • Fixed review comments.

Right, my comments have been addressed, but somebody (e.g. @alexshap) with Mach-O knowledge will need to review the rest.

seiya added a comment.Aug 13 2019, 7:28 PM

Friendly ping.

in general i like the approach, i have added a few minor comments and one question (maybe I'm missing smth).

llvm/tools/llvm-objcopy/MachO/MachOLayoutBuilder.cpp
20

const auto &MLC

104

const bool

151

khm, maybe I'm missing smth, but why do we have Sec.Addr + Sec.Size here ?

seiya updated this revision to Diff 215340.Aug 15 2019, 1:32 AM
seiya marked 4 inline comments as done.
  • Added const.
llvm/tools/llvm-objcopy/MachO/MachOLayoutBuilder.cpp
151

We need to recompute the VMSize in case a new section is added at the end of the segment. Note that in an object file, it seems that Sec.Addr is offset from the beginning of the first section data.

Why we need to use std::max() instead of simply assigning Sec.Addr + Sec.Size is that sections are not necessarily sorted by their vm addresses Here's an example:

Sections [
  Section {
    Index: 0
    Name: __text (5F 5F 74 65 78 74 00 00 00 00 00 00 00 00 00 00)
    Segment: __TEXT (5F 5F 54 45 58 54 00 00 00 00 00 00 00 00 00 00)
    Address: 0x0
    Size: 0xC
    Offset: 472
    ...
  }
  Section {
    Index: 1
    Name: __bss (5F 5F 62 73 73 00 00 00 00 00 00 00 00 00 00 00)
    Segment: __DATA (5F 5F 44 41 54 41 00 00 00 00 00 00 00 00 00 00)
    Address: 0x50
    Size: 0x4
    Offset: 0
    ...
  }
  Section {
    Index: 2
    Name: __eh_frame (5F 5F 65 68 5F 66 72 61 6D 65 00 00 00 00 00 00)
    Segment: __TEXT (5F 5F 54 45 58 54 00 00 00 00 00 00 00 00 00 00)
    Address: 0x10
    Size: 0x40
    Offset: 488
    ...
  }
]
Segment {
  Cmd: LC_SEGMENT_64
  Name: 
  Size: 312
  vmaddr: 0x0
  vmsize: 0x54
  ...
}

In this object file, __bss is located at the end of the segment so the vmsize should be (__bss' vmaddr) + (__bss' vmsize), namely, 0x54. However, since __eh_frame comes after the __bss in the section list, we need to determine the end of the segment by std::max.

FYI, this part comes from lib/MC/MachObjectWriter.cpp.

This revision is now accepted and ready to land.Aug 15 2019, 9:52 AM
seiya updated this revision to Diff 215820.Aug 18 2019, 10:29 PM

Rebase for the sake of arc patch.

This revision was automatically updated to reflect the committed changes.