Page MenuHomePhabricator

[llvm-objcopy][MachO] Recompute and update offset/size fields in the writer
ClosedPublic

Authored by seiya on May 30 2019, 5:03 AM.

Details

Summary

Recompute and update offset/size fields so that we can implement llvm-objcopy options like --only-section.

This patch is the first step and focuses on supporting load commands that covered by existing tests: executable files and
dynamic libraries are not supported.

Diff Detail

Repository
rL LLVM

Event Timeline

seiya created this revision.May 30 2019, 5:03 AM

great start! I will take a closer look at the diff tomorrow + over the weekend

compnerd added inline comments.
llvm/test/tools/llvm-objcopy/MachO/Inputs/various-symbols.s
2 ↗(On Diff #202155)

NIT: common is misspelt

4 ↗(On Diff #202155)

global3 is unreferenced. Is that intended?

llvm/test/tools/llvm-objcopy/MachO/real-world-input-copy.test
13 ↗(On Diff #202155)

The test input seems oddly complicated for testing that. It seems that anything that contains a LC_DYNSYMTAB which means anything with an externally visible symbol should be sufficient. The BSS section would be emitted by any static. Something like this seems much more concise which is better for testing the cases:

static int i;
int f(void) { return i; }
llvm/tools/llvm-objcopy/MachO/MachOReader.cpp
22 ↗(On Diff #202155)

Isn't this just strnlen?

43 ↗(On Diff #202155)

I believe that the 16 here is supposed to be the maximum length of the section/segment name in MachO. Can you use sizeof(llvm::MachO::section::sectname) and sizeof(llvm::MachO::section::segname) instead please?

llvm/tools/llvm-objcopy/MachO/MachOWriter.cpp
145 ↗(On Diff #202155)

I wish that we had the structs built using packed_endian types

149 ↗(On Diff #202155)

Can the Sec be const?

152 ↗(On Diff #202155)

I think that you can drop copyStringWithPadding. Instead zero initialise the section that you are constructing, which also gives you more reproducible output.

478 ↗(On Diff #202155)

Can't Sec be const?

486 ↗(On Diff #202155)

VM is more appropriate - it is an initialism for Virtual Memory.

517 ↗(On Diff #202155)

Can't LC be const?

518 ↗(On Diff #202155)

Can't Sec be const?

526 ↗(On Diff #202155)

Can't LC be const?

llvm/tools/llvm-objcopy/MachO/MachOWriter.h
32 ↗(On Diff #202155)

Might be more readable as updateDySymTab

alexshap added inline comments.May 30 2019, 1:31 PM
llvm/test/tools/llvm-objcopy/MachO/real-world-input-copy.test
13 ↗(On Diff #202155)

+1

seiya updated this revision to Diff 202344.May 30 2019, 6:12 PM
  • Simplified various-symbols.s.
  • Replaced strlenOrMaxLen (introduced by this patch) with strnlen.
  • Replaced the hard-coded lengths of segment/section name with sizeof().
  • Renamed updateDysymtab to updateDySymTab .
  • Replaced copyStringWithPadding (introduced by this patch) with memset-ing by zero and memcpy.
  • Add const to a variable.
seiya marked 22 inline comments as done.May 30 2019, 6:30 PM
seiya added inline comments.
llvm/test/tools/llvm-objcopy/MachO/real-world-input-copy.test
13 ↗(On Diff #202155)

Indeed. Simplified the test case.

llvm/tools/llvm-objcopy/MachO/MachOReader.cpp
22 ↗(On Diff #202155)

Quite so :)

llvm/tools/llvm-objcopy/MachO/MachOWriter.cpp
145 ↗(On Diff #202155)

I wonder if we could do this better too. Could you tell me what packed_endian is? You mean llvm/Support/Endian.h? I'd like to take a look at it.

478 ↗(On Diff #202155)

No. It's mutated in the loop.

517 ↗(On Diff #202155)

It can't be const since its sections are mutated in the loop.

518 ↗(On Diff #202155)

No. It's mutated in the loop.

526 ↗(On Diff #202155)

No. LC. MachOLoadCommand is mutated in the loop.

487 ↗(On Diff #202344)

I've replaced the calculation of VMSize by the algorithm used in MachObjectWriter::writeObject

seiya marked 6 inline comments as done.May 30 2019, 6:31 PM
seiya marked an inline comment as done.May 30 2019, 6:33 PM
seiya added inline comments.
llvm/test/tools/llvm-objcopy/MachO/Inputs/various-symbols.s
4 ↗(On Diff #202155)

That was a mistake. I've updated this file so it no longer matters btw.

seiya updated this revision to Diff 202400.May 31 2019, 2:19 AM
  • add some consts
alexshap added inline comments.Jun 3 2019, 3:03 PM
llvm/tools/llvm-objcopy/MachO/MachOWriter.cpp
135 ↗(On Diff #202400)

nit: I'd use the explicit type here (for readability)

147 ↗(On Diff #202400)

khm, here and in some other similar places, i'd use sizeof(...)
or create a named constant for this number (16).

174 ↗(On Diff #202400)

here and the lines above (143 - 170) are very similar,
let's create a helper function for this and avoid code duplication.

438 ↗(On Diff #202400)

explicit type + I think naming here doesn't follow our style guide)

448 ↗(On Diff #202400)

see the comment on the line 438

481 ↗(On Diff #202400)
   
1 << Sec.Align

P.S. pow works with floating point numbers, I think we don't really need these conversions here

seiya updated this revision to Diff 202842.Jun 3 2019, 7:38 PM

Address review comments.

seiya marked 5 inline comments as done.Jun 3 2019, 7:39 PM
seiya updated this revision to Diff 202845.Jun 3 2019, 7:41 PM

Address remaining review comments.

seiya marked an inline comment as done.Jun 3 2019, 7:42 PM
alexshap accepted this revision.Jun 4 2019, 7:00 PM

so in general this looks very promising to me,
however test coverage needs improvements imo, though given that at the moment we don't expose any non-trivial options here, we can add more tests incrementally. Btw - for DYSYMTAB, SYMTAB - is it possible to add YAML-based tests ? or yaml2obj's support is not sufficient for this ?

I'd also wait for @jhenderson and/or @rupprecht to take a look at this code as well.

This revision is now accepted and ready to land.Jun 4 2019, 7:00 PM

I'm struggling for time at the moment, and am not really familiar with MachO at all, so it would take me quite some time to review this. If @rupprecht is happy, I am happy.

seiya added a comment.Jun 5 2019, 5:43 PM

so in general this looks very promising to me,
however test coverage needs improvements imo, though given that at the moment we don't expose any non-trivial options here, we can add more tests incrementally. Btw - for DYSYMTAB, SYMTAB - is it possible to add YAML-based tests ? or yaml2obj's support is not sufficient for this ?

I'd also wait for @jhenderson and/or @rupprecht to take a look at this code as well.

It seems that yaml2obj supports DYSYMTAB and SYMTAB. I'll add YAML-based tests in another patch which implements symbol table rebuilding.

@seiya, ok, i think if nobody comments on this diff in the coming days it's safe to proceed and commit it, I kinda don't want to block it and I'm looking forward to more tests in the follow-up patches .

rupprecht accepted this revision.Jun 6 2019, 1:19 PM

Also not a Mach-O expert, but I looked through it and what I can understand looks fine (+ it passes existing tests). More tests in a future patch are always welcome though.

seiya added a comment.Jun 6 2019, 4:22 PM

Thank you all. I'll commit this later.

This revision was automatically updated to reflect the committed changes.