Page MenuHomePhabricator

[llvm-objcopy] Handle -O <format> flag.
ClosedPublic

Authored by rupprecht on Oct 24 2018, 1:14 PM.

Details

Summary

The -O flag is currently being mostly ignored; it's only checked whether or not the output format is "binary". This adds support for a few formats (e.g. elf64-x86-64), so that when specified, the output can change between 32/64 bit and sizes/alignments are updated accordingly.

As a side effect, this also fixes a bad alignment in symbol table when the input format is "binary".

This fixes PR39135

Diff Detail

Repository
rL LLVM

Event Timeline

rupprecht created this revision.Oct 24 2018, 1:14 PM

@rupprecht Jordan, if you don't mind I will commit my patch https://reviews.llvm.org/D53311 (this probably will cause some minor rebase for you), and then I will cherry-pick your changes (locally) and send my future patch (to move elf-specific things into a separate file and subfolder) on top of yours

@rupprecht Jordan, if you don't mind I will commit my patch https://reviews.llvm.org/D53311 (this probably will cause some minor rebase for you), and then I will cherry-pick your changes (locally) and send my future patch (to move elf-specific things into a separate file and subfolder) on top of yours

No worries, I was already planning on rebasing once you submit D53311.
I actually need to fix another case -- this doesn't handle -I binary -B <arch1> -O <arch2> correctly. Hopefully I'll have time to upload a better patch today...

rupprecht updated this revision to Diff 171009.Oct 24 2018, 3:41 PM
  • Fix output arch format when --input is binary
alexshap added inline comments.Oct 25 2018, 11:34 AM
tools/llvm-objcopy/llvm-objcopy.cpp
259 ↗(On Diff #171017)

maybe i'm missing smth and Jake / James can clarify / comment on it,
but looks like that this basically propagates the info about the format to Object (Object = "intermediate representation" which has been arch-agnostic (mostly) so far (if i understand correctly)) (so that different elf/binary input/output formats can be mixed and matched by substituting different Readers/Writers (if i'm not mistaken that was the main idea)). More precisely, I'm wondering if Object should be aware of the Arch or, alternatively, we should make Writer take care of this. cc: @jakehehrlich

I spoke a bit offline with Jordan. I'm going to put a patch up by end of day that refactors things so that this can be done in the Writer. It's a bit hariy but I really don't want setArch or anything like it. The issue we were aware of before was that size was format dependent and we (me really) kluded around it. This actully points out that many other things are as well. The general shift that I've liked here is to slowy remove state from the internal representation and push those things toward constants in the writers. This change I have in mind will remove a lot of state from sections and additionally add that section headers be written on a section by section basis.

rupprecht planned changes to this revision.Oct 25 2018, 12:34 PM

I spoke a bit offline with Jordan. I'm going to put a patch up by end of day that refactors things so that this can be done in the Writer. It's a bit hariy but I really don't want setArch or anything like it. The issue we were aware of before was that size was format dependent and we (me really) kluded around it. This actully points out that many other things are as well. The general shift that I've liked here is to slowy remove state from the internal representation and push those things toward constants in the writers. This change I have in mind will remove a lot of state from sections and additionally add that section headers be written on a section by section basis.

Yep, sorry I forgot to update this revision after that convo. Marking as needing further action. I'll update this when Jake's patch goes in.

Some other notes from when I was trying to work on this:

  • Arch-dependent things are not just Size, but also EntrySize and AddressAlignment (possibly others, but that's what I've observed so far)
  • The tricky thing (IMO) is that in some cases (e.g. data section) it _does_ make sense for a section to know what size it is, but in others (e.g. symbol tables) it's tied to the output arch (just 32/64 bit really)... but not entirely constant: e.g. for symbol tables EntrySize is either 16 or 24 depending on the arch (don't quote me on those constants :) ), but Size = EntrySize * <num of symbols>, so the writer can't just "tell" the section what its size is, but neither can the section just "tell" the writer what its size is. Writing an API to be flexible for that gets weird. Good luck :)

@jakehehrlich - if your patch is correct, it should require the change here for tools/llvm-objcopy/binary-input.test. The rest of it I can submit separately after your refactoring.

rupprecht updated this revision to Diff 180147.Thu, Jan 3, 2:54 PM

This change is now ready for review. It's limited to just driver changes now that ELF/Object.cpp has the necessary changes to avoid the weird setArch() method from before.

jhenderson added inline comments.Fri, Jan 4, 2:36 AM
test/tools/llvm-objcopy/ELF/binary-input-with-arch.test
2 ↗(On Diff #180147)

Nit: full stop, here and elsewhere.

8 ↗(On Diff #180147)

I don't think you need to check most of the lines that you check in this test. Keep it to the minimum that is actually interesting. I think that's the Format, the Arch, the AddressSize, the Class, possibly the DataEncoding the Machine and the SectionHeaderEntrySize. It might even be more minimal than that.

test/tools/llvm-objcopy/ELF/cross-arch-bad-format.test
1 ↗(On Diff #180147)

I'm not sure what the "cross-arch" bit refers to in this test name. Perhaps it should be called "bad-output-format.test"?

13 ↗(On Diff #180147)

I think this message needs to specify that its the Output format that is bad (i.e. the argument to -O).

test/tools/llvm-objcopy/ELF/cross-arch-headers.test
3 ↗(On Diff #180147)

Let's put a space between -O and its argument to make it a little more readable and consistent with other tests.

4 ↗(On Diff #180147)

Nit: -file-headers uses a single dash, but --check-prefixes uses two. Please be consistent (I'd prefer double, but that's just me).

58 ↗(On Diff #180147)

Similar comments apply here to the above test. Remove the things that aren't relevant to this test.

test/tools/llvm-objcopy/ELF/cross-arch-sections-symbols.test
14 ↗(On Diff #180147)

Maybe worth having a non-zero size for the sections?

21 ↗(On Diff #180147)

Maybe worth having a non-zero size for the symbols?

26 ↗(On Diff #180147)

These values should probably be inside the sections, once they have non-zero size.

tools/llvm-objcopy/CopyConfig.h
51 ↗(On Diff #180147)

Binary -> binary, I believe? The comment above needs updating to be consistent with this one too.

tools/llvm-objcopy/ELF/ELFObjcopy.cpp
550 ↗(On Diff #180147)

infer if -> infer it

rupprecht updated this revision to Diff 180292.Fri, Jan 4, 11:59 AM
rupprecht marked 15 inline comments as done.
  • Rename test case
  • Specify 'output format' in error message
  • Grammar
  • Use spaces in -O<format> for clarity
  • Reduce test cases
test/tools/llvm-objcopy/ELF/cross-arch-headers.test
3 ↗(On Diff #180147)

Done here, but I do want to make sure that it works this way to ease migration, so I've left in a single test case (that also checks for no spaces in -Ibinary and -Barch) w/o the spaces.

4 ↗(On Diff #180147)

Done + I'll fix the other test cases in this dir in a separate NFC commit

test/tools/llvm-objcopy/ELF/cross-arch-sections-symbols.test
26 ↗(On Diff #180147)

Done; is this what you meant?

jhenderson accepted this revision.Mon, Jan 7, 2:53 AM

LGTM, with a couple of minor comments.

test/tools/llvm-objcopy/ELF/binary-input-with-arch.test
16 ↗(On Diff #180292)

Super-nit: remove the extra spaces between CHECK and the pattern here :)

test/tools/llvm-objcopy/ELF/cross-arch-headers.test
3 ↗(On Diff #180147)

Okay, that makes a lot of sense.

4 ↗(On Diff #180147)

Sounds good.

test/tools/llvm-objcopy/ELF/cross-arch-sections-symbols.test
26 ↗(On Diff #180147)

Almost. The symbol value of foo points to somewhere that isn't inside .text (i.e. it has value 0x1234, but the section size is only 32, so it's a long way past the end). It's not technically invalid ELF format, but it is a bit weird. If you don't want to update again, that's fine, but changing the value to, say, 16, would be my preference.

This revision is now accepted and ready to land.Mon, Jan 7, 2:53 AM
rupprecht updated this revision to Diff 180491.Mon, Jan 7, 7:16 AM
  • Fix whitespace + .text test value
rupprecht marked 2 inline comments as done.Mon, Jan 7, 7:18 AM
rupprecht added inline comments.
test/tools/llvm-objcopy/ELF/cross-arch-sections-symbols.test
26 ↗(On Diff #180147)

Oh, I see what you mean now -- thanks, changed to 16.

This revision was automatically updated to reflect the committed changes.
rupprecht marked an inline comment as done.