This is an archive of the discontinued LLVM Phabricator instance.

Differential D50343

[llvm-objcopy] Add support for -I binary -B <arch>.
ClosedPublic

Authored by rupprecht on Aug 6 2018, 11:34 AM.

Details

Reviewers

jakehehrlich
alexander-shaposhnikov
jhenderson
javed.absar

Commits

rGcf67633e66de: [llvm-objcopy] Add support for -I binary -B <arch>.
rL340070: [llvm-objcopy] Add support for -I binary -B <arch>.

Summary

The -I (--input-target) and -B (--binary-architecture) flags exist but are currently silently ignored. This adds support for -I binary for architectures i386, x86-64 (and alias i386:x86-64), arm, and aarch64. This is largely based on D41687.

This is done by implementing an additional subclass of Reader, BinaryReader, which interprets the input file as the contents of a .data section, sets up a synthetic header, and adds additional sections/symbols (e.g. _binary__tmp_data_txt_start). Additionally, the symbol table now owns its symbol names (StringRef is changed to std::string), as the synthetic symbols added are otherwise unowned.
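
The naming scheme for the synthetic symbols can be sketched as follows. This is only an illustration, not the code from the patch: the helper names `sanitizeFilename` and `binarySymbolName` and the input path `/tmp/data.txt` are assumptions, chosen so that the result matches the `_binary__tmp_data_txt_start` example above. It assumes every non-alphanumeric character of the input path is replaced with `_`.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper (not from the patch): replace every character that is
// not a letter or digit with '_'.
std::string sanitizeFilename(const std::string &Path) {
  std::string Out = Path;
  for (char &C : Out)
    if (!std::isalnum(static_cast<unsigned char>(C)))
      C = '_';
  return Out;
}

// Hypothetical helper: build "_binary_<sanitized path>_<suffix>", where the
// suffix is expected to be "start", "end", or "size".
std::string binarySymbolName(const std::string &Path,
                             const std::string &Suffix) {
  assert(!Suffix.empty() && "expected start, end, or size");
  return "_binary_" + sanitizeFilename(Path) + "_" + Suffix;
}
```

Under these assumptions, `binarySymbolName("/tmp/data.txt", "start")` yields `_binary__tmp_data_txt_start`. Returning an owned `std::string` mirrors the ownership point in the summary: a freshly built name has no backing buffer for a `StringRef` to point into.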

Diff Detail

Repository

rL LLVM

Build Status

Buildable 21471
Build 21471: arc lint + arc unit

Event Timeline

rupprecht created this revision. Aug 6 2018, 11:34 AM

Herald added a reviewer: javed.absar. Aug 6 2018, 11:34 AM

Herald added a subscriber: kristof.beyls.

Harbormaster completed remote builds in B21129: Diff 159343. Aug 6 2018, 11:34 AM

Ah, so I think I'm against BinaryReader being templated on ELFT. That is required to implement getElfType(), however, and getElfType drives which writer we create. That shouldn't be how things work. I'll go and add my related comments to the previous patch. Sorry if I didn't see what was going on in the other patch and led you astray. This patch makes things a bit clearer.

tools/llvm-objcopy/Object.cpp
617	Correct me if I'm wrong but I don't think GNU objcopy produces any program headers when it does this. The resulting file should be relocatable and thus program headers are pointless. Even if GNU objcopy does this, we probably shouldn't.
650	The MemBuf should outlive the section, in which case you can just use `Section`.
654	nit: Can you use a Twine? You'll have to use a std::string for the buffer identifier part, unfortunately, but other than that you should be able to use a Twine.
674	I think you're OK to initialize the symbol table in whatever order you want here, since there are no relocations. It's a current thorn that you even have to initialize things in a different order. I need to carve out some time to refactor llvm-objcopy to work differently to get there, however.

Refactor Object based on code review comments

Harbormaster completed remote builds in B21155: Diff 159407. Aug 6 2018, 3:18 PM

In D50343#1190050, @jakehehrlich wrote:

Ah, so I think I'm against BinaryReader being templated on ELFT. That is required to implement getElfType(), however, and getElfType drives which writer we create. That shouldn't be how things work. I'll go and add my related comments to the previous patch. Sorry if I didn't see what was going on in the other patch and led you astray. This patch makes things a bit clearer.

Yeah, it's not my first choice either, but it did make the implementation easier and seemed somewhat consistent with the pattern elsewhere in this file.

I wasn't sure what plans you had for refactoring things here (still haven't seen whatever thread was referenced earlier). My hunch is that "the right way" would be to not have any extra ELF type here and just always directly use what's in Object/ELFTypes.h, and to do things with composition instead of templated-inheritance, and I'm happy to make that change (or whatever alternative change makes sense), but that seems like an orthogonal discussion that isn't related to the features implemented here.

tools/llvm-objcopy/Object.cpp
617	I just verified this; commenting out this section causes an invalid object file (readelf/llvm-readobj can't read it at all). This is ElfHdrSegment. I think you are confusing this with ProgramHdrSegment? However, speaking of program headers, removing line 1290 (i.e. `OrderedSegments.push_back(&Obj.ProgramHdrSegment)`) seems to change nothing (all the tests pass) and also gets rid of the error `readelf: Warning: possibly corrupt ELF header - it has a non-zero program header offset, but no program headers` when running readelf -a on object files that llvm-objcopy produces. I'd like to take a look at that after this patch; it seems to be a preexisting issue.

rupprecht mentioned this in D50117: [llvm-objcopy] NFC: Refactor main objcopy method that takes an ELFReader to a generic Reader. Aug 8 2018, 9:16 AM

Pass MachineInfo into BinaryReader to push ELFType template down to only BinaryELFBuilder<>
Move getElfType out of Reader (and also ELFReader); it can be implemented purely in the driver class. This makes it clear that "Reader" is not tied to ELF types, even if that's all that's supported currently.

In D50343#1190161, @rupprecht wrote:

In D50343#1190050, @jakehehrlich wrote:

Ah, so I think I'm against BinaryReader being templated on ELFT. That is required to implement getElfType(), however, and getElfType drives which writer we create. That shouldn't be how things work. I'll go and add my related comments to the previous patch. Sorry if I didn't see what was going on in the other patch and led you astray. This patch makes things a bit clearer.

Yeah, it's not my first choice either, but it did make the implementation easier and seemed somewhat consistent with the pattern elsewhere in this file.

OK, I removed ELF templates from BinaryReader; having them remain on BinaryELFBuilder still seems appropriate though.

jakehehrlich added inline comments. Aug 8 2018, 1:16 PM

tools/llvm-objcopy/Object.cpp
590	So I think this function is the real problem child. We don't actually use the Ident anywhere. It is also composed entirely of information that the ELFWriter knows but nothing else knows. We should make the ELFWriter construct the Ident and just not store it in Object.
617	You're right. I confused it with ProgramHdrSegment. I now remember that someone had the clever idea to make the ELF header an implicit segment so that the same layout algorithm that had all the bugs worked out could already be used. As for the warning, I guess I knew about that issue. The `OrderedSegments.push_back(&Obj.ProgramHdrSegment)` should accomplish adding the program headers to the layout in a consistent way when you have a PT_PHDR segment. The warning is a separate issue: line 1078 (in this patch) does not check whether the program headers are empty before blindly setting phoff to the offset of that segment. This is in full compliance with the ELF standard because the number of program headers is still zero. It's just that no tool normally has a reason to produce an ELF that has phoff != 0 and phnum = 0, so it's normally a sign of corruption.
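
The intent behind the phoff remark can be sketched in isolation. This is not llvm-objcopy code: `HeaderFields` and `makeHeaderFields` are hypothetical stand-ins, and the sketch assumes the caller already knows how many program headers will be written. It only shows the rule being discussed, namely that the offset is left at zero when the count is zero.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the two ELF header fields under discussion.
struct HeaderFields {
  std::uint64_t PhOff; // would become e_phoff
  std::uint16_t PhNum; // would become e_phnum
};

// Only point at the program header table when there is at least one entry;
// otherwise keep the offset at 0 so readers do not flag the file as corrupt.
HeaderFields makeHeaderFields(std::size_t NumProgramHeaders,
                              std::uint64_t PhdrSegmentOffset) {
  assert(NumProgramHeaders < 0xffff && "count must fit in a 16-bit field");
  HeaderFields H;
  H.PhNum = static_cast<std::uint16_t>(NumProgramHeaders);
  H.PhOff = NumProgramHeaders == 0 ? 0 : PhdrSegmentOffset;
  return H;
}
```

With no program headers, `makeHeaderFields(0, 64)` gives an offset of 0; with three, `makeHeaderFields(3, 64)` keeps the offset of 64. Which member of `Object` supplies the real count is exactly what the rest of this thread tries to pin down, so treat the inputs here as placeholders.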

rupprecht marked an inline comment as done. Aug 8 2018, 2:38 PM

rupprecht added inline comments.

tools/llvm-objcopy/Object.cpp
590	Nice, I was able to completely remove Ident from Object.
617	I tried adding that check, and several tests failed -- I might not have been checking the right thing, or maybe the tests are bad. I added a TODO on that line to investigate further.

Remove Ident from object

Harbormaster completed remote builds in B21267: Diff 159801. Aug 8 2018, 2:39 PM

jakehehrlich added inline comments. Aug 8 2018, 2:58 PM

tools/llvm-objcopy/Object.cpp
617	How did the tests fail? Some of the tests do a literal check against the header. e.g. they may be checking to see that 0x40 is used even when (to comply with the warning) 0x0 should be used instead.

Add sparc/ppc to arch map

Herald added subscribers: jrtc27, fedor.sergeev, kbarton and 2 others. Aug 8 2018, 3:01 PM

Harbormaster completed remote builds in B21269: Diff 159809. Aug 8 2018, 3:01 PM

rupprecht added inline comments. Aug 8 2018, 3:10 PM

tools/llvm-objcopy/Object.cpp

617

With this change:

// Obj.ProgramHdrSegment.firstSection() == nullptr implies
// Obj.ProgramHdrSegment.Sections is empty
Ehdr.e_phoff = Obj.ProgramHdrSegment.firstSection() == nullptr
                   ? 0
                   : Obj.ProgramHdrSegment.Offset;

The failures usually looked like:

Command Output (stderr):
--
<src>/llvm/test/tools/llvm-objcopy/triple-overlap.test:72:14: error: CHECK-NEXT: expected string not found in input
#CHECK-NEXT: Type: PT_LOAD (0x1)
             ^
<stdin>:9:2: note: scanning from here
 Type: (0x464C457F)
 ^
<stdin>:21:2: note: possible intended match here
 Type: (0x400004)
 ^

Actually, it looks like the object is corrupted with that change, e.g. the first program header for that test has an alignment of 15762873573703680, vs 4096 for all of them on the base side.

jakehehrlich added inline comments. Aug 8 2018, 3:12 PM

tools/llvm-objcopy/Object.cpp
617	Yeah, that's not the right check. Use `Obj.ProgramHeaders.size() == 0`.

rupprecht added inline comments. Aug 8 2018, 3:35 PM

tools/llvm-objcopy/Object.cpp
617	I can't seem to get that to work:
Obj doesn't have a field called ProgramHeaders. You probably mean `Obj.ProgramHdrSegment`, but `Segment` doesn't have a method size().
If I implement size() as returning either `Contents.size()` or `Sections.size()` and then try:
Ehdr.e_phoff = Obj.ProgramHdrSegment.size() == 0 ? 0 : Obj.ProgramHdrSegment.Offset;
then I get the same failure as before.

jakehehrlich added inline comments. Aug 8 2018, 10:39 PM

tools/llvm-objcopy/Object.cpp
605	I think ELFWriter should assign these. I think I need to do some refactoring to make Object not so ELF specific (with respect to not containing fields that are not known until write time). This change is exposing a lot of points where I failed to properly separate those concerns. In the meantime, if you could move this code into ELFWriter and call it from assignOffset, that would be ideal.
617	hmm...that's bothersome; I must not be understanding something. I'll see about looking into that. Thanks for letting me know about this!

Extract out ehdr initialization into initEhdr()

Harbormaster completed remote builds in B21289: Diff 159939. Aug 9 2018, 9:21 AM

rupprecht marked an inline comment as done. Aug 9 2018, 9:22 AM

rupprecht added inline comments.

tools/llvm-objcopy/Object.cpp
605	I didn't find a way to refactor out setting ElfHdr.Index, since `ELFBuilder<ELFT>::readProgramHeaders()` is setting it to `ElfHdr.Index = Index++` in the middle of some other sections, but the rest was simple to refactor into ELFWriter.

Consistently use ELFT typename aliases in headers

Harbormaster completed remote builds in B21290: Diff 159941. Aug 9 2018, 9:30 AM

Awesome! I'd like to have someone else look over this but this pretty well LGTM.

tools/llvm-objcopy/Object.cpp
614	And this is the last thing where we can't easily factor out dependence on ElfType. This is going to require that Size be handled differently from how it is now. I'm not going to block this change on that since the fix is just not simple. Under the current design of llvm-objcopy that would require a new visitor to calculate the size. I'm not sure I want to further things along that path. Can you add a TODO here for me?

Add Elf_Sym TODO

Harbormaster completed remote builds in B21295: Diff 159961. Aug 9 2018, 11:03 AM

Thanks for the review!

Alex/James/anyone else who's lurking, mind taking a look for a second set of eyes?

In D50343#1194094, @rupprecht wrote:

Thanks for the review!

Alex/James/anyone else who's lurking, mind taking a look for a second set of eyes?

Yup, just going to look now. Sorry, it's been a busy couple of days, otherwise I'd have done so earlier.

jhenderson added inline comments. Aug 10 2018, 3:16 AM

test/tools/llvm-objcopy/binary-input-aarch64.test
1 ↗	(On Diff #159961)	Use echo, not printf. I'm not sure if printf is implemented in the lit test system etc, but echo is the more common approach anyway.
3 ↗	(On Diff #159961)	You dump the symbols here, but don't check any of them... you should probably do so!
8 ↗	(On Diff #159961)	Does this line add anything? If not, can we remove it?
test/tools/llvm-objcopy/binary-input-arm.test
1 ↗	(On Diff #159961)	Could you fold all of the regular tests into a single test file? There isn't much difference between them. Just run llvm-objcopy N times in a single file with different -B options, and use check prefixes to match the different sections (e.g. COMMON, ARM, AARCH64... etc).
test/tools/llvm-objcopy/binary-input-error.test
2	printf -> echo
test/tools/llvm-objcopy/binary-input-reconstitute.test
1 ↗	(On Diff #159961)	I'm not sure I understand what "reconstitute" means here. Perhaps rename the test? I assume the aim is for binary input and binary output? So maybe a better name is "binary-input-and-output.test".
10 ↗	(On Diff #159961)	Technically, this only shows that the contents include "abcd", not that they are explicitly and solely "abcd". So this would also match a file containing other junk. A better way might be to just compare the output files against the original input files using diff. Also, a common technique in our tests is to copy the input file to a separate file before running llvm-objcopy and then diff the input against the copy to show that the input hasn't been modified.
test/tools/llvm-objcopy/binary-input.test
3	I'm not sure how this test is different from the others (apart from the symbols). Is i386:x86-64 just a synonym of x86-64? If so, I think this test can be folded into the others, like mentioned above.
tools/llvm-objcopy/Object.cpp
65–66	Could you do these little unrelated tidy-ups in a separate NFC commit, please. No need for a review. Essentially, I'd like this review to just be of the things required for your change.
621	AR -> Data. "AR" is meaningless. It saddens me that different parts of LLVM can't agree on whether to use chars or uint8_t.
624	Rather than abbreviate this to something that is unclear, just call it "DataSection"
633	I think it's considered bad form to have explicit local variable Twines that aren't just function parameters, if I remember, based on comments in a previous review (I might be mistaken though, so am happy to be proven wrong). Please also name it something more descriptive like "Prefix".
642	Should the end symbol have a size? That seems weird that it does. I might expect the start symbol to, but not the end symbol. What does GNU objcopy do?
1255	Could you rename this initEhdrSegment, since it doesn't actually initialise the ELF header itself.
1267	I'm not convinced that the ELF header segment should be initialised inside something called "assignOffsets", since it's doing a lot more than assigning it an offset (which actually is always 0, so could be initialised at the same time as the rest of the header). It should probably be called outside this function.
tools/llvm-objcopy/Object.h
403	Side note: I think a recent change for --prefix-symbols makes this part of the diff identical. I recommend rebasing when you next update the diff.
458	Don't abbreviate variable names: Sz -> Size.
tools/llvm-objcopy/llvm-objcopy.cpp
135–136	It looks to me like we are grouping things in CopyConfig by type, so this should probably be moved to before (or after) the StringRef block.
313	I feel like this might read better if everything is constructed inline:
static const StringMap<MachineInfo> ArchMap{
    // Name, EM value, 64bit, LittleEndian
    {"aarch64", {EM_AARCH64, true, true}},
    {"arm", {EM_ARM, false, true}},
    /* More entries here */
};
I'd order the entries either alphabetically by name or numerically by their EM value. You could also put "headers" as shown to avoid needing to duplicate the 64/Endianness comment in each part.
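
For readers without an LLVM checkout, the inline-table suggestion for line 313 can be approximated with the standard library. This is a sketch under assumptions, not the patch: `std::map` stands in for `StringMap`, `lookupBinaryArch` is a hypothetical name, and the numeric machine values and 32/64-bit flags are copied from the binary-input-arch.test expectations in this revision.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Field names are assumed; the review only shows the initializer order.
struct MachineInfo {
  std::uint16_t EMachine;
  bool Is64Bit;
  bool IsLittleEndian;
};

// Hypothetical lookup helper: returns nullptr for an unknown -B value.
const MachineInfo *lookupBinaryArch(const std::string &Name) {
  static const std::map<std::string, MachineInfo> ArchMap = {
      // Name, {EM value, 64bit, LittleEndian}
      {"aarch64", {0xB7, true, true}},          // EM_AARCH64
      {"arm", {0x28, false, true}},             // EM_ARM
      {"i386", {0x3, false, true}},             // EM_386
      {"i386:x86-64", {0x3E, true, true}},      // EM_X86_64
      {"powerpc:common64", {0x15, true, true}}, // EM_PPC64
      {"sparc", {0x2, false, true}},            // EM_SPARC
      {"x86-64", {0x3E, true, true}},           // EM_X86_64
  };
  auto It = ArchMap.find(Name);
  return It == ArchMap.end() ? nullptr : &It->second;
}
```

A caller would treat a null result as the "Invalid architecture" error case, for example `lookupBinaryArch("xyz") == nullptr`. Keeping the table alphabetical by name follows the ordering suggestion in the comment above.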

Fold arch-specific file header tests into a single templated test
Replace printf w/ echo -n
Other code review comments

Harbormaster completed remote builds in B21337: Diff 160161. Aug 10 2018, 11:35 AM

rupprecht added inline comments. Aug 10 2018, 11:36 AM

test/tools/llvm-objcopy/binary-input-aarch64.test
1 ↗	(On Diff #159961)	Done -- I think printf was chosen to avoid the \n, but that can be done with "echo -n". Looks like lit does support it, as it's used in test/tools/llvm-objcopy/add-gnu-debuglink.test, as well as some non-objcopy tests. I'll resist the temptation to fix those in this patch :)
3 ↗	(On Diff #159961)	Actually, the point of these arch-specific tests is just to check the file header. I removed -sections and -symbols from this test; those are checked by binary-input.test, which shouldn't depend much on the specific arch.
test/tools/llvm-objcopy/binary-input-reconstitute.test
1 ↗	(On Diff #159961)	Yes, essentially it's an integration test that some payload doesn't change when it's converted back and forth (i.e. payload -> object file w/ payload shoved into .data -> payload). I'm not sure why I thought "reconstitute" would be a good word for that, so I'll take your name suggestion.
test/tools/llvm-objcopy/binary-input.test
3	This test is focused on the generic stuff, namely section/symbols. I dropped the arch-specific stuff since that's covered elsewhere.
tools/llvm-objcopy/Object.cpp
633	I originally had string, but changed to Twine based on Jake's suggestion here: https://reviews.llvm.org/D50343?id=159343#inline-442972
I'm trying to understand why it would be a bad idea to write this. Looking at http://llvm.org/docs/ProgrammersManual.html#dss-twine, the discouraged pattern is:
void foo(const Twine &T);
...
StringRef X = ...
unsigned i = ...
const Twine &Tmp = X + "." + Twine(i);
foo(Tmp);
Which is bad because Tmp is a ref, whereas the Twine here is a regular stack variable that persists past all the calls to addSymbol (which calls .str() on the input twine to save/own the symbol). I renamed to "Prefix", but kept this as a Twine here, since I think this is safe. Otherwise, I think I'd have to do:
SymTab->addSymbol(Twine("_binary_") + SanitizedFilename + "_start", ...);
SymTab->addSymbol(Twine("_binary_") + SanitizedFilename + "_end", ...);
SymTab->addSymbol(Twine("_binary_") + SanitizedFilename + "_size", ...);
642	Nope, this is Value, not Size, although it's confusing, especially because addSymbol has so many parameters. I commented the param names here to make this clear. The sizes/values here match gnu objcopy.
tools/llvm-objcopy/Object.h
458	Done, although this is existing code -- it's only showing up as a diff because clang-format put it on a new line after I changed StringRef->Twine...
tools/llvm-objcopy/llvm-objcopy.cpp
135–136	Done -- I added some comments to try to logically explain how this config is laid out, but I'm not very familiar with all of them... I'm open to suggestions here.

Rebase

Harbormaster completed remote builds in B21338: Diff 160163. Aug 10 2018, 11:38 AM

jhenderson added inline comments. Aug 13 2018, 3:58 AM

test/tools/llvm-objcopy/binary-input-and-output.test
2	Is the no new line important to this test? I think it isn't any more.
tools/llvm-objcopy/Object.cpp
642	Right, I get it now. Thanks for the comments.
tools/llvm-objcopy/Object.h
458	Ah, yes of course. But the renaming is good anyway.
701	If this is a clang-format added difference, please do it in another patch.
tools/llvm-objcopy/llvm-objcopy.cpp
135–136	I'd like @jakehehrlich to comment on these comments, if that's okay, as he may have a specific desire as to how this class is laid out.

echo -n -> echo where it doesn't matter

Harbormaster completed remote builds in B21400: Diff 160409. Aug 13 2018, 11:28 AM

jakehehrlich added inline comments. Aug 13 2018, 11:41 AM

tools/llvm-objcopy/llvm-objcopy.cpp
134–136	Do we use these anywhere? I think I added them thinking about how I was going to use them to do exactly what this change does, but embedding MachineInfo seems much nicer and conveys the same information. Maybe we should just do that?
135–136	To date I have had no rhyme or reason to how I've added these... in fact I'm not sure if I have even added the majority at this point. A pattern of grouping by type does seem to have formed organically however. Let's stick with it. Long-term ideals on how this should be laid out:
1. The names should be derived consistently from the option name where possible. I don't think I've done a very good job of this and it isn't clear how to best solve this issue.
2. Grouping first by type and then alphabetically is probably a good idea. I'm pretty trash at finding things in alphabetized lists but I always know the basic type of thing I'm looking for. AddSection is the only thing I know of which currently violates this (we should make it work the way Paul made SectionsToRename work).

rupprecht added inline comments. Aug 13 2018, 3:05 PM

tools/llvm-objcopy/llvm-objcopy.cpp
134–136	We do. Essentially, we use this to decide whether we're going to need a BinaryReader vs ELFReader for InputFormat, and, separately, whether we're going to need a BinaryWriter vs ELFWriter for OutputFormat. BinaryArch is extra information that only applies when using BinaryReader.

Rebase/fix other method names to be lower cased.

Harbormaster completed remote builds in B21417: Diff 160461. Aug 13 2018, 3:05 PM

No null symbol is a blocker, so I've marked this as requesting changes. Sorry!

test/tools/llvm-objcopy/binary-input-and-output.test
6	I'm not sure if this is important or not in this case, but the usual pattern in llvm-objcopy tests is to copy the input file to a backup to verify that it hasn't been modified before doing the diff.
test/tools/llvm-objcopy/binary-input-arch.test
2	I don't think -n is important here? Indeed, you could do away with this file entirely and just feed in %s into llvm-objcopy, if you wanted, although this is probably less confusing.
test/tools/llvm-objcopy/binary-input-error.test
2	The -n isn't important here either.
test/tools/llvm-objcopy/binary-input.test
74	This is an illegal symbol table: it has no null symbol.
tools/llvm-objcopy/Object.cpp
613	Doesn't this assignment rely on knowing that the symbol table is added immediately after the string table? That seems like poor design to me. Better would be to pass in the index or string table section. I might well be forgetting how llvm-objcopy is designed in this area, but don't we need to explicitly add a null symbol as the first symbol in the symbol table?
639–640	Is there any point in adding this local section symbol? It can't be referenced, so I think it's superfluous.

This revision now requires changes to proceed. Aug 14 2018, 2:55 AM

Fix tests, add null symbol to symtab, make binary object creation less fragile, and automatically assign section indices

tools/llvm-objcopy/Object.cpp
613	Yes, this is very fragile. I think I understand the issue I was having before a little better. The Link needs to be set to the StrTab Index, as you mention. The Object::addSection() helper was not assigning any index, so this would implicitly be zero (which is SHN_UNDEF), and that was throwing errors when initializing it. Using size(Obj->sections()) - 1 was more of a reverse-engineered way of getting things working. Changing Object::addSection() to automatically assign the index lets us use the index from StrTab directly. I think this might allow us to stop manually assigning indices elsewhere in this file, but I'll save that for another change. Also, I fixed the symbol table to include a null symbol.
639–640	GNU objcopy adds it, but it doesn't seem to be necessary. I'll add it back if it turns out to be needed.
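
The index fix described for line 613 can be sketched in isolation. This is a simplified model, not the real llvm-objcopy classes: `Section`, `Object`, and `SymbolTable` here are hypothetical miniatures, and the sketch assumes index 0 is reserved for the null section, so the first added section gets index 1. The symbol table likewise starts with an empty-named null entry at index 0.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-in for a section header.
struct Section {
  std::string Name;
  std::uint32_t Index = 0; // 0 is SHN_UNDEF
  std::uint32_t Link = 0;
};

class Object {
  std::vector<std::unique_ptr<Section>> Sections;

public:
  // Assign the index at insertion time so callers never compute it by hand.
  Section &addSection(std::string Name) {
    auto Sec = std::make_unique<Section>();
    Sec->Name = std::move(Name);
    Sections.push_back(std::move(Sec));
    Sections.back()->Index = static_cast<std::uint32_t>(Sections.size());
    return *Sections.back();
  }
};

// Simplified symbol table that always begins with the mandatory null symbol.
struct SymbolTable {
  std::vector<std::string> Names;
  SymbolTable() { Names.emplace_back(); } // null symbol at index 0
  void addSymbol(std::string Name) {
    assert(!Name.empty() && "only the null symbol may be unnamed");
    Names.push_back(std::move(Name));
  }
};
```

With this shape, a builder can add `.strtab`, then `.symtab`, and set the symbol table's `Link` straight from the string table's `Index` instead of relying on insertion order. That matches the section listing in binary-input.test, where .strtab is index 1 and .symtab has Link 1.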

jakehehrlich added inline comments. Aug 14 2018, 12:10 PM

tools/llvm-objcopy/Object.cpp
613	Yeah that probably should have functioned that way the whole time. Thanks for fixing that!

LGTM, but there is one thing still outstanding, I think, which @jakehehrlich mentioned in a comment regarding the CopyConfig options:

A pattern of grouping by type does seem to have formed organically however. Let's stick with it.

I don't have a strong preference either way, but you should get @jakehehrlich to confirm he is happy before committing this change.

This revision is now accepted and ready to land. Aug 15 2018, 2:22 AM

In D50343#1200318, @jhenderson wrote:

LGTM, but there is one thing still outstanding, I think, which @jakehehrlich mentioned in a comment regarding the CopyConfig options:

A pattern of grouping by type does seem to have formed organically however. Let's stick with it.

I don't have a strong preference either way, but you should get @jakehehrlich to confirm he is happy before committing this change.

Ok, I split out each section by type + alphabetized (as suggested after that comment). I also don't have any strong preference though; I'm happy to apply whatever organization/comments/etc. others want here.

There was also a suggestion in that comment to rename config names to more closely match flag names which is also a good idea, but I'd rather submit that as an NFC after this to avoid putting myself in merge conflict hell :)

Reorganize CopyConfig sections

rebase/fix stale filename in a test case

Harbormaster completed remote builds in B21591: Diff 161086. Aug 16 2018, 11:51 AM

Closed by commit rL340070: [llvm-objcopy] Add support for -I binary -B <arch>. (authored by rupprecht). Aug 17 2018, 11:52 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path	Size
test/tools/llvm-objcopy/binary-input-and-output.test	15 lines
test/tools/llvm-objcopy/binary-input-arch.test	75 lines
test/tools/llvm-objcopy/binary-input-error.test	10 lines
test/tools/llvm-objcopy/binary-input.test	112 lines
tools/llvm-objcopy/Object.h	54 lines
tools/llvm-objcopy/Object.cpp	147 lines
tools/llvm-objcopy/llvm-objcopy.cpp	102 lines

Diff 160656

test/tools/llvm-objcopy/binary-input-and-output.test

This file was added.

				# RUN: echo abcd > %t.txt

				# Preserve input to verify it is not modified
				# RUN: cp %t.txt %t-copy.txt

				# -I binary -O binary preserves payload through in-memory representation
				# RUN: llvm-objcopy -I binary -B i386:x86-64 -O binary %t.txt %t.2.txt
				# RUN: cmp %t.txt %t.2.txt
				# RUN: cmp %t.txt %t-copy.txt

				# -I binary -O binary preserves payload through an intermediate object file
				# RUN: llvm-objcopy -I binary -B i386:x86-64 %t.txt %t.o
				# RUN: llvm-objcopy -O binary %t.o %t.3.txt
				# RUN: cmp %t.txt %t.3.txt
				# RUN: cmp %t.txt %t-copy.txt

test/tools/llvm-objcopy/binary-input-arch.test

This file was added.

				# RUN: echo abcd > %t.txt

				# RUN: llvm-objcopy -I binary -B aarch64 %t.txt %t.aarch64.o
				# RUN: llvm-readobj -file-headers %t.aarch64.o | FileCheck %s --check-prefixes=CHECK,AARCH64,64

				# RUN: llvm-objcopy -I binary -B arm %t.txt %t.arm.o
				# RUN: llvm-readobj -file-headers %t.arm.o | FileCheck %s --check-prefixes=CHECK,ARM,32

				# RUN: llvm-objcopy -I binary -B i386 %t.txt %t.i386.o
				# RUN: llvm-readobj -file-headers %t.i386.o | FileCheck %s --check-prefixes=CHECK,I386,32

				# RUN: llvm-objcopy -I binary -B i386:x86-64 %t.txt %t.i386:x86-64.o
				# RUN: llvm-readobj -file-headers %t.i386:x86-64.o | FileCheck %s --check-prefixes=CHECK,X86-64,64

				# RUN: llvm-objcopy -I binary -B powerpc:common64 %t.txt %t.powerpc:common64.o
				# RUN: llvm-readobj -file-headers %t.powerpc:common64.o | FileCheck %s --check-prefixes=CHECK,PPC,64

				# RUN: llvm-objcopy -I binary -B sparc %t.txt %t.sparc.o
				# RUN: llvm-readobj -file-headers %t.sparc.o | FileCheck %s --check-prefixes=CHECK,SPARC,32

				# RUN: llvm-objcopy -I binary -B x86-64 %t.txt %t.x86-64.o
				# RUN: llvm-readobj -file-headers %t.x86-64.o | FileCheck %s --check-prefixes=CHECK,X86-64,64

				# CHECK: Format:
				# AARCH64-SAME: ELF64-aarch64-little
				# ARM-SAME: ELF32-arm-little
				# I386-SAME: ELF32-i386
				# PPC-SAME: ELF64-ppc64
				# SPARC-SAME: ELF32-sparc
				# X86-64-SAME: ELF64-x86-64

				# AARCH64-NEXT: Arch: aarch64
				# ARM-NEXT: Arch: arm
				# I386-NEXT: Arch: i386
				# PPC-NEXT: Arch: powerpc64le
				# SPARC-NEXT: Arch: sparcel
				# X86-64-NEXT: Arch: x86_64

				# 32-NEXT: AddressSize: 32bit
				# 64-NEXT: AddressSize: 64bit

				# CHECK: ElfHeader {
				# CHECK-NEXT: Ident {
				# CHECK-NEXT: Magic: (7F 45 4C 46)
				# 32-NEXT: Class: 32-bit (0x1)
				# 64-NEXT: Class: 64-bit (0x2)
				# CHECK-NEXT: DataEncoding: LittleEndian (0x1)
				# CHECK-NEXT: FileVersion: 1
				# CHECK-NEXT: OS/ABI: SystemV (0x0)
				# CHECK-NEXT: ABIVersion: 0
				# CHECK-NEXT: Unused: (00 00 00 00 00 00 00)
				# CHECK-NEXT: }
				# CHECK-NEXT: Type: Relocatable (0x1)
				# AARCH64-NEXT: Machine: EM_AARCH64 (0xB7)
				# ARM-NEXT: Machine: EM_ARM (0x28)
				# I386-NEXT: Machine: EM_386 (0x3)
				# PPC-NEXT: Machine: EM_PPC64 (0x15)
				# SPARC-NEXT: Machine: EM_SPARC (0x2)
				# X86-64-NEXT: Machine: EM_X86_64 (0x3E)
				# CHECK-NEXT: Version: 1
				# CHECK-NEXT: Entry: 0x0
				# CHECK-NEXT: ProgramHeaderOffset:
				# CHECK-NEXT: SectionHeaderOffset:
				# CHECK-NEXT: Flags [ (0x0)
				# CHECK-NEXT: ]
				# 32-NEXT: HeaderSize: 52
				# 64-NEXT: HeaderSize: 64
				# 32-NEXT: ProgramHeaderEntrySize: 32
				# 64-NEXT: ProgramHeaderEntrySize: 56
				# CHECK-NEXT: ProgramHeaderCount: 0
				# 32-NEXT: SectionHeaderEntrySize: 40
				# 64-NEXT: SectionHeaderEntrySize: 64
				# CHECK-NEXT: SectionHeaderCount: 4
				# CHECK-NEXT: StringTableSectionIndex:
				# CHECK-NEXT: }

test/tools/llvm-objcopy/binary-input-error.test

This file was added.

				# RUN: echo abcd > %t.txt

				# RUN: not llvm-objcopy -I binary %t.txt %t.o 2>&1 \
				# RUN: | FileCheck %s --check-prefix=MISSING-BINARY-ARCH

				# RUN: not llvm-objcopy -I binary -B xyz %t.txt %t.o 2>&1 \
				# RUN: | FileCheck %s --check-prefix=BAD-BINARY-ARCH

				# MISSING-BINARY-ARCH: Specified binary input without specifiying an architecture.
				# BAD-BINARY-ARCH: Invalid architecture: 'xyz'.

test/tools/llvm-objcopy/binary-input.test

This file was added.

				# RUN: echo -n abcd > %t.x-txt
				# Preserve input to verify it is not modified
				# RUN: cp %t.txt %t-copy.txt
				jhendersonUnsubmitted Done Reply Inline Actions I'm not sure how this test is different from the others (apart from the symbols). Is i386:x86-64 just a synonym of x86-64? If so, I think this test can be folded into the others, like mentioned above. jhenderson: I'm not sure how this test is different from the others (apart from the symbols). Is i386:x86…
				rupprechtAuthorUnsubmitted Done Reply Inline Actions This test is focused on the generic stuff, namely section/symbols. I dropped the arch-specific stuff since that's covered elsewhere. rupprecht: This test is focused on the generic stuff, namely section/symbols. I dropped the arch-specific…
				# RUN: llvm-objcopy -I binary -B i386:x86-64 %t.x-txt %t.o
				# RUN: llvm-readobj -sections -symbols %t.o | FileCheck %s
				# RUN: cmp %t.x-txt %t-copy.txt

				# CHECK: Sections [
				# CHECK-NEXT: Section {
				# CHECK-NEXT: Index: 0
				# CHECK-NEXT: Name: (0)
				# CHECK-NEXT: Type: SHT_NULL (0x0)
				# CHECK-NEXT: Flags [ (0x0)
				# CHECK-NEXT: ]
				# CHECK-NEXT: Address: 0x0
				# CHECK-NEXT: Offset:
				# CHECK-NEXT: Size:
				# CHECK-NEXT: Link: 0
				# CHECK-NEXT: Info: 0
				# CHECK-NEXT: AddressAlignment: 0
				# CHECK-NEXT: EntrySize: 0
				# CHECK-NEXT: }
				# CHECK-NEXT: Section {
				# CHECK-NEXT: Index: 1
				# CHECK-NEXT: Name: .strtab
				# CHECK-NEXT: Type: SHT_STRTAB (0x3)
				# CHECK-NEXT: Flags [ (0x0)
				# CHECK-NEXT: ]
				# CHECK-NEXT: Address: 0x0
				# CHECK-NEXT: Offset:
				# CHECK-NEXT: Size:
				# CHECK-NEXT: Link: 0
				# CHECK-NEXT: Info: 0
				# CHECK-NEXT: AddressAlignment: 1
				# CHECK-NEXT: EntrySize: 0
				# CHECK-NEXT: }
				# CHECK-NEXT: Section {
				# CHECK-NEXT: Index: 2
				# CHECK-NEXT: Name: .symtab
				# CHECK-NEXT: Type: SHT_SYMTAB (0x2)
				# CHECK-NEXT: Flags [ (0x0)
				# CHECK-NEXT: ]
				# CHECK-NEXT: Address: 0x0
				# CHECK-NEXT: Offset:
				# CHECK-NEXT: Size:
				# CHECK-NEXT: Link: 1
				# CHECK-NEXT: Info: 1
				# CHECK-NEXT: AddressAlignment: 1
				# CHECK-NEXT: EntrySize: 24
				# CHECK-NEXT: }
				# CHECK-NEXT: Section {
				# CHECK-NEXT: Index: 3
				# CHECK-NEXT: Name: .data
				# CHECK-NEXT: Type: SHT_PROGBITS (0x1)
				# CHECK-NEXT: Flags [ (0x3)
				# CHECK-NEXT: SHF_ALLOC (0x2)
				# CHECK-NEXT: SHF_WRITE (0x1)
				# CHECK-NEXT: ]
				# CHECK-NEXT: Address: 0x0
				# CHECK-NEXT: Offset:
				# CHECK-NEXT: Size: 4
				# CHECK-NEXT: Link: 0
				# CHECK-NEXT: Info: 0
				# CHECK-NEXT: AddressAlignment: 1
				# CHECK-NEXT: EntrySize: 0
				# CHECK-NEXT: }
				# CHECK-NEXT: ]

				# Note: the symbol names are derived from the full path (with non-alnum values
				# replaced with "_"), e.g. "/tmp/a-b.c" should yield
				# _binary__tmp_a_b_c_{start,end,size}.
				# Just check for _binary_{{[_a-zA-Z0-9]*}}_x_txt_{start,end,size} to avoid
				# making assumptions about how this test is run.

				jhenderson: This is an illegal symbol table: it has no null symbol.
				# CHECK: Symbols [
				# CHECK-NEXT: Symbol {
				# CHECK-NEXT: Name:
				# CHECK-NEXT: Value: 0x0
				# CHECK-NEXT: Size: 0
				# CHECK-NEXT: Binding: Local (0x0)
				# CHECK-NEXT: Type: None (0x0)
				# CHECK-NEXT: Other: 0
				# CHECK-NEXT: Section: Undefined (0x0)
				# CHECK-NEXT: }
				# CHECK-NEXT: Symbol {
				# CHECK-NEXT: Name: _binary_{{[_a-zA-Z0-9]*}}_x_txt_start
				# CHECK-NEXT: Value: 0x0
				# CHECK-NEXT: Size: 0
				# CHECK-NEXT: Binding: Global (0x1)
				# CHECK-NEXT: Type: None (0x0)
				# CHECK-NEXT: Other: 0
				# CHECK-NEXT: Section: .data
				# CHECK-NEXT: }
				# CHECK-NEXT: Symbol {
				# CHECK-NEXT: Name: _binary_{{[_a-zA-Z0-9]*}}_x_txt_end
				# CHECK-NEXT: Value: 0x4
				# CHECK-NEXT: Size: 0
				# CHECK-NEXT: Binding: Global (0x1)
				# CHECK-NEXT: Type: None (0x0)
				# CHECK-NEXT: Other: 0
				# CHECK-NEXT: Section: .data
				# CHECK-NEXT: }
				# CHECK-NEXT: Symbol {
				# CHECK-NEXT: Name: _binary_{{[_a-zA-Z0-9]*}}_x_txt_size
				# CHECK-NEXT: Value: 0x4
				# CHECK-NEXT: Size: 0
				# CHECK-NEXT: Binding: Global (0x1)
				# CHECK-NEXT: Type: None (0x0)
				# CHECK-NEXT: Other: 0
				# CHECK-NEXT: Section: Absolute
				# CHECK-NEXT: }
				# CHECK-NEXT: ]
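
The naming rule described in the test comment above (non-alphanumeric path characters replaced with "_", then wrapped as _binary_<path>_{start,end,size}) can be sketched in standalone C++. This is only an illustration under stated assumptions: sanitizePath and binarySymbolNames are hypothetical helpers that do not appear in the patch, and std::string stands in for LLVM's Twine.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper (not in the patch): replace every character that is not
// alphanumeric with '_', as described in the test comment.
std::string sanitizePath(std::string Path) {
  std::replace_if(
      Path.begin(), Path.end(),
      [](unsigned char C) { return !std::isalnum(C); }, '_');
  return Path;
}

// Hypothetical helper: the three synthetic symbol names expected for a path,
// in the order start, end, size.
std::array<std::string, 3> binarySymbolNames(const std::string &Path) {
  assert(!Path.empty() && "expected a non-empty input path");
  const std::string Prefix = "_binary_" + sanitizePath(Path);
  return {{Prefix + "_start", Prefix + "_end", Prefix + "_size"}};
}
```

For the path "/tmp/a-b.c" used in the test comment, the sanitized form is "_tmp_a_b_c", which is why the resulting names contain a double underscore after "_binary".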

tools/llvm-objcopy/Object.h

Show First 20 Lines • Show All 58 Lines • ▼ Show 20 Lines	public:
SectionBase *getSection(uint32_t Index, Twine ErrMsg);		SectionBase *getSection(uint32_t Index, Twine ErrMsg);

template <class T>		template <class T>
T *getSectionOfType(uint32_t Index, Twine IndexErrMsg, Twine TypeErrMsg);		T *getSectionOfType(uint32_t Index, Twine IndexErrMsg, Twine TypeErrMsg);
};		};

enum ElfType { ELFT_ELF32LE, ELFT_ELF64LE, ELFT_ELF32BE, ELFT_ELF64BE };		enum ElfType { ELFT_ELF32LE, ELFT_ELF64LE, ELFT_ELF32BE, ELFT_ELF64BE };

		// This type keeps track of the machine info for various architectures. This
		// lets us map architecture names to ELF types and the e_machine value of the
		// ELF file.
		struct MachineInfo {
		uint16_t EMachine;
		bool Is64Bit;
		bool IsLittleEndian;
		};

class SectionVisitor {		class SectionVisitor {
public:		public:
virtual ~SectionVisitor();		virtual ~SectionVisitor();

virtual void visit(const Section &Sec) = 0;		virtual void visit(const Section &Sec) = 0;
virtual void visit(const OwnedDataSection &Sec) = 0;		virtual void visit(const OwnedDataSection &Sec) = 0;
virtual void visit(const StringTableSection &Sec) = 0;		virtual void visit(const StringTableSection &Sec) = 0;
virtual void visit(const SymbolTableSection &Sec) = 0;		virtual void visit(const SymbolTableSection &Sec) = 0;
▲ Show 20 Lines • Show All 116 Lines • ▼ Show 20 Lines

template <class ELFT> class ELFWriter : public Writer {		template <class ELFT> class ELFWriter : public Writer {
private:		private:
using Elf_Addr = typename ELFT::Addr;		using Elf_Addr = typename ELFT::Addr;
using Elf_Shdr = typename ELFT::Shdr;		using Elf_Shdr = typename ELFT::Shdr;
using Elf_Phdr = typename ELFT::Phdr;		using Elf_Phdr = typename ELFT::Phdr;
using Elf_Ehdr = typename ELFT::Ehdr;		using Elf_Ehdr = typename ELFT::Ehdr;

		void initEhdrSegment();

void writeEhdr();		void writeEhdr();
void writePhdr(const Segment &Seg);		void writePhdr(const Segment &Seg);
void writeShdr(const SectionBase &Sec);		void writeShdr(const SectionBase &Sec);

void writePhdrs();		void writePhdrs();
void writeShdrs();		void writeShdrs();
void writeSectionData();		void writeSectionData();

▲ Show 20 Lines • Show All 177 Lines • ▼ Show 20 Lines	enum SymbolShndxType {
SYMBOL_XINDEX = ELF::SHN_XINDEX,		SYMBOL_XINDEX = ELF::SHN_XINDEX,
};		};

struct Symbol {		struct Symbol {
uint8_t Binding;		uint8_t Binding;
SectionBase *DefinedIn = nullptr;		SectionBase *DefinedIn = nullptr;
SymbolShndxType ShndxType;		SymbolShndxType ShndxType;
uint32_t Index;		uint32_t Index;
std::string Name;		std::string Name;
		jhenderson: Side note: I think a recent change for --prefix-symbols makes this part of the diff identical. I recommend rebasing when you next update the diff.
uint32_t NameIndex;		uint32_t NameIndex;
uint64_t Size;		uint64_t Size;
uint8_t Type;		uint8_t Type;
uint64_t Value;		uint64_t Value;
uint8_t Visibility;		uint8_t Visibility;
bool Referenced = false;		bool Referenced = false;

uint16_t getShndx() const;		uint16_t getShndx() const;
Show All 34 Lines
protected:		protected:
std::vector<std::unique_ptr<Symbol>> Symbols;		std::vector<std::unique_ptr<Symbol>> Symbols;
StringTableSection *SymbolNames = nullptr;		StringTableSection *SymbolNames = nullptr;
SectionIndexSection *SectionIndexTable = nullptr;		SectionIndexSection *SectionIndexTable = nullptr;

using SymPtr = std::unique_ptr<Symbol>;		using SymPtr = std::unique_ptr<Symbol>;

public:		public:
void addSymbol(StringRef Name, uint8_t Bind, uint8_t Type,		SymbolTableSection() { Type = ELF::SHT_SYMTAB; }
SectionBase *DefinedIn, uint64_t Value, uint8_t Visibility,
uint16_t Shndx, uint64_t Sz);		void addSymbol(Twine Name, uint8_t Bind, uint8_t Type, SectionBase *DefinedIn,
		uint64_t Value, uint8_t Visibility, uint16_t Shndx,
		uint64_t Size);
		jhenderson: Don't abbreviate variable names: Sz -> Size.
		rupprecht (Author): Done, although this is existing code -- it's only showing up as a diff because clang-format put it on a new line after I changed StringRef->Twine...
		jhenderson: Ah, yes of course. But the renaming is good anyway.
void prepareForLayout();		void prepareForLayout();
// An 'empty' symbol table still contains a null symbol.		// An 'empty' symbol table still contains a null symbol.
bool empty() const { return Symbols.size() == 1; }		bool empty() const { return Symbols.size() == 1; }
void setShndxTable(SectionIndexSection *ShndxTable) {		void setShndxTable(SectionIndexSection *ShndxTable) {
SectionIndexTable = ShndxTable;		SectionIndexTable = ShndxTable;
}		}
const SectionIndexSection *getShndxTable() const { return SectionIndexTable; }		const SectionIndexSection *getShndxTable() const { return SectionIndexTable; }
const SectionBase *getStrTab() const { return SymbolNames; }		const SectionBase *getStrTab() const { return SymbolNames; }
▲ Show 20 Lines • Show All 167 Lines • ▼ Show 20 Lines	public:
virtual std::unique_ptr<Object> create() const = 0;		virtual std::unique_ptr<Object> create() const = 0;
};		};

using object::Binary;		using object::Binary;
using object::ELFFile;		using object::ELFFile;
using object::ELFObjectFile;		using object::ELFObjectFile;
using object::OwningBinary;		using object::OwningBinary;

		template <class ELFT> class BinaryELFBuilder {
		using Elf_Sym = typename ELFT::Sym;

		uint16_t EMachine;
		MemoryBuffer *MemBuf;
		std::unique_ptr<Object> Obj;

		void initFileHeader();
		void initHeaderSegment();
		StringTableSection *addStrTab();
		SymbolTableSection *addSymTab(StringTableSection *StrTab);
		void addData(SymbolTableSection *SymTab);
		void initSections();

		public:
		BinaryELFBuilder(uint16_t EM, MemoryBuffer *MB)
		: EMachine(EM), MemBuf(MB), Obj(llvm::make_unique<Object>()) {}

		std::unique_ptr<Object> build();
		};

template <class ELFT> class ELFBuilder {		template <class ELFT> class ELFBuilder {
private:		private:
using Elf_Addr = typename ELFT::Addr;		using Elf_Addr = typename ELFT::Addr;
using Elf_Shdr = typename ELFT::Shdr;		using Elf_Shdr = typename ELFT::Shdr;
using Elf_Ehdr = typename ELFT::Ehdr;
using Elf_Word = typename ELFT::Word;		using Elf_Word = typename ELFT::Word;

const ELFFile<ELFT> &ElfFile;		const ELFFile<ELFT> &ElfFile;
Object &Obj;		Object &Obj;

void setParentSegment(Segment &Child);		void setParentSegment(Segment &Child);
void readProgramHeaders();		void readProgramHeaders();
void initGroupSection(GroupSection *GroupSec);		void initGroupSection(GroupSection *GroupSec);
void initSymbolTable(SymbolTableSection *SymTab);		void initSymbolTable(SymbolTableSection *SymTab);
void readSectionHeaders();		void readSectionHeaders();
SectionBase &makeSection(const Elf_Shdr &Shdr);		SectionBase &makeSection(const Elf_Shdr &Shdr);

public:		public:
ELFBuilder(const ELFObjectFile<ELFT> &ElfObj, Object &Obj)		ELFBuilder(const ELFObjectFile<ELFT> &ElfObj, Object &Obj)
: ElfFile(*ElfObj.getELFFile()), Obj(Obj) {}		: ElfFile(*ElfObj.getELFFile()), Obj(Obj) {}

void build();		void build();
};		};

		class BinaryReader : public Reader {
		const MachineInfo &MInfo;
		MemoryBuffer *MemBuf;

		public:
		BinaryReader(const MachineInfo &MI, MemoryBuffer *MB)
		: MInfo(MI), MemBuf(MB) {}
		std::unique_ptr<Object> create() const override;
		};

class ELFReader : public Reader {		class ELFReader : public Reader {
Binary *Bin;		Binary *Bin;

public:		public:
ElfType getElfType() const;
std::unique_ptr<Object> create() const override;		std::unique_ptr<Object> create() const override;
explicit ELFReader(Binary *B) : Bin(B) {}		explicit ELFReader(Binary *B) : Bin(B) {}
		jhenderson: If this is a clang-format added difference, please do it in another patch.
};		};

class Object {		class Object {
private:		private:
using SecPtr = std::unique_ptr<SectionBase>;		using SecPtr = std::unique_ptr<SectionBase>;
using SegPtr = std::unique_ptr<Segment>;		using SegPtr = std::unique_ptr<Segment>;

std::vector<SecPtr> Sections;		std::vector<SecPtr> Sections;
Show All 12 Lines	public:
// not present in any segment. This could be a problem during file layout,		// not present in any segment. This could be a problem during file layout,
// because other segments may get assigned an offset where either of the		// because other segments may get assigned an offset where either of the
// two should reside, which will effectively corrupt the resulting binary.		// two should reside, which will effectively corrupt the resulting binary.
// Other than that we use these segments to track program header offsets		// Other than that we use these segments to track program header offsets
// when they may not follow the ELF header.		// when they may not follow the ELF header.
Segment ElfHdrSegment;		Segment ElfHdrSegment;
Segment ProgramHdrSegment;		Segment ProgramHdrSegment;

uint8_t Ident[16];
uint64_t Entry;		uint64_t Entry;
uint64_t SHOffset;		uint64_t SHOffset;
uint32_t Type;		uint32_t Type;
uint32_t Machine;		uint32_t Machine;
uint32_t Version;		uint32_t Version;
uint32_t Flags;		uint32_t Flags;

StringTableSection *SectionNames = nullptr;		StringTableSection *SectionNames = nullptr;
Show All 9 Lines	public:
ConstRange<Segment> segments() const { return make_pointee_range(Segments); }		ConstRange<Segment> segments() const { return make_pointee_range(Segments); }

void removeSections(std::function<bool(const SectionBase &)> ToRemove);		void removeSections(std::function<bool(const SectionBase &)> ToRemove);
void removeSymbols(function_ref<bool(const Symbol &)> ToRemove);		void removeSymbols(function_ref<bool(const Symbol &)> ToRemove);
template <class T, class... Ts> T &addSection(Ts &&... Args) {		template <class T, class... Ts> T &addSection(Ts &&... Args) {
auto Sec = llvm::make_unique<T>(std::forward<Ts>(Args)...);		auto Sec = llvm::make_unique<T>(std::forward<Ts>(Args)...);
auto Ptr = Sec.get();		auto Ptr = Sec.get();
Sections.emplace_back(std::move(Sec));		Sections.emplace_back(std::move(Sec));
		Ptr->Index = Sections.size();
return *Ptr;		return *Ptr;
}		}
Segment &addSegment(ArrayRef<uint8_t> Data) {		Segment &addSegment(ArrayRef<uint8_t> Data) {
Segments.emplace_back(llvm::make_unique<Segment>(Data));		Segments.emplace_back(llvm::make_unique<Segment>(Data));
return *Segments.back();		return *Segments.back();
}		}
};		};
} // end namespace objcopy		} // end namespace objcopy
} // end namespace llvm		} // end namespace llvm

#endif // LLVM_TOOLS_OBJCOPY_OBJECT_H		#endif // LLVM_TOOLS_OBJCOPY_OBJECT_H
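
As a rough illustration of how the MachineInfo struct above could map a -B architecture name to an ELF type, here is a minimal standalone sketch. It assumes the architecture names listed in the patch summary (i386, x86-64, i386:x86-64, arm, aarch64); lookupBinaryArch is a hypothetical name, the table is not copied from the patch, and the e_machine numbers are the standard ELF values written as literals so the example does not need LLVM headers.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <string>

// Same shape as the MachineInfo struct added to Object.h in this patch.
struct MachineInfo {
  uint16_t EMachine;
  bool Is64Bit;
  bool IsLittleEndian;
};

// Hypothetical lookup helper (C++17). The e_machine numbers are the standard
// ELF values: EM_386 = 3, EM_ARM = 40, EM_X86_64 = 62, EM_AARCH64 = 183.
std::optional<MachineInfo> lookupBinaryArch(const std::string &Arch) {
  static const std::map<std::string, MachineInfo> Table = {
      {"aarch64", {183, true, true}},
      {"arm", {40, false, true}},
      {"i386", {3, false, true}},
      {"i386:x86-64", {62, true, true}}, // alias of x86-64
      {"x86-64", {62, true, true}},
  };
  auto It = Table.find(Arch);
  if (It == Table.end())
    return std::nullopt; // a caller would report an invalid architecture here
  return It->second;
}
```

A caller would then pick the writer from Is64Bit and IsLittleEndian (for example ELFT_ELF64LE for x86-64) and store EMachine in the synthetic file header.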

tools/llvm-objcopy/Object.cpp

Show First 20 Lines • Show All 56 Lines • ▼ Show 20 Lines
uint8_t *MemBuffer::getBufferStart() {		uint8_t *MemBuffer::getBufferStart() {
return reinterpret_cast<uint8_t *>(Buf->getBufferStart());		return reinterpret_cast<uint8_t *>(Buf->getBufferStart());
}		}

std::unique_ptr<WritableMemoryBuffer> MemBuffer::releaseMemoryBuffer() {		std::unique_ptr<WritableMemoryBuffer> MemBuffer::releaseMemoryBuffer() {
return std::move(Buf);		return std::move(Buf);
}		}

template <class ELFT> void ELFWriter<ELFT>::writePhdr(const Segment &Seg) {		template <class ELFT> void ELFWriter<ELFT>::writePhdr(const Segment &Seg) {
uint8_t *B = Buf.getBufferStart();		uint8_t *B = Buf.getBufferStart();
jhenderson: Could you do these little unrelated tidy-ups in a separate NFC commit, please. No need for a review. Essentially, I'd like this review to just be of the things required for your change.
B += Obj.ProgramHdrSegment.Offset + Seg.Index * sizeof(Elf_Phdr);		B += Obj.ProgramHdrSegment.Offset + Seg.Index * sizeof(Elf_Phdr);
Elf_Phdr &Phdr = *reinterpret_cast<Elf_Phdr *>(B);		Elf_Phdr &Phdr = *reinterpret_cast<Elf_Phdr *>(B);
Phdr.p_type = Seg.Type;		Phdr.p_type = Seg.Type;
Phdr.p_flags = Seg.Flags;		Phdr.p_flags = Seg.Flags;
Phdr.p_offset = Seg.Offset;		Phdr.p_offset = Seg.Offset;
Phdr.p_vaddr = Seg.VAddr;		Phdr.p_vaddr = Seg.VAddr;
Phdr.p_paddr = Seg.PAddr;		Phdr.p_paddr = Seg.PAddr;
Phdr.p_filesz = Seg.FileSize;		Phdr.p_filesz = Seg.FileSize;
▲ Show 20 Lines • Show All 150 Lines • ▼ Show 20 Lines
}		}

void SymbolTableSection::assignIndices() {		void SymbolTableSection::assignIndices() {
uint32_t Index = 0;		uint32_t Index = 0;
for (auto &Sym : Symbols)		for (auto &Sym : Symbols)
Sym->Index = Index++;		Sym->Index = Index++;
}		}

void SymbolTableSection::addSymbol(StringRef Name, uint8_t Bind, uint8_t Type,		void SymbolTableSection::addSymbol(Twine Name, uint8_t Bind, uint8_t Type,
SectionBase *DefinedIn, uint64_t Value,		SectionBase *DefinedIn, uint64_t Value,
uint8_t Visibility, uint16_t Shndx,		uint8_t Visibility, uint16_t Shndx,
uint64_t Sz) {		uint64_t Size) {
Symbol Sym;		Symbol Sym;
Sym.Name = Name;		Sym.Name = Name.str();
Sym.Binding = Bind;		Sym.Binding = Bind;
Sym.Type = Type;		Sym.Type = Type;
Sym.DefinedIn = DefinedIn;		Sym.DefinedIn = DefinedIn;
if (DefinedIn != nullptr)		if (DefinedIn != nullptr)
DefinedIn->HasSymbol = true;		DefinedIn->HasSymbol = true;
if (DefinedIn == nullptr) {		if (DefinedIn == nullptr) {
if (Shndx >= SHN_LORESERVE)		if (Shndx >= SHN_LORESERVE)
Sym.ShndxType = static_cast<SymbolShndxType>(Shndx);		Sym.ShndxType = static_cast<SymbolShndxType>(Shndx);
else		else
Sym.ShndxType = SYMBOL_SIMPLE_INDEX;		Sym.ShndxType = SYMBOL_SIMPLE_INDEX;
}		}
Sym.Value = Value;		Sym.Value = Value;
Sym.Visibility = Visibility;		Sym.Visibility = Visibility;
Sym.Size = Sz;		Sym.Size = Size;
Sym.Index = Symbols.size();		Sym.Index = Symbols.size();
Symbols.emplace_back(llvm::make_unique<Symbol>(Sym));		Symbols.emplace_back(llvm::make_unique<Symbol>(Sym));
Size += this->EntrySize;		Size += this->EntrySize;
}		}
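
The reason the symbol table has to own its names (Sym.Name = Name.str() above) is that the synthetic names are built from temporaries that do not outlive the call. A minimal standalone sketch of that ownership pattern follows; it uses std::string_view and std::string in place of LLVM's Twine and StringRef, and the SymbolTable class and addSyntheticSymbols function are hypothetical simplifications, not the patch's types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

struct Symbol {
  std::string Name; // owned copy, valid for the lifetime of the table
  uint64_t Value = 0;
};

// Hypothetical, heavily simplified stand-in for SymbolTableSection.
class SymbolTable {
  std::vector<std::unique_ptr<Symbol>> Symbols;

public:
  void addSymbol(std::string_view Name, uint64_t Value) {
    auto Sym = std::make_unique<Symbol>();
    Sym->Name = std::string(Name); // copy before the caller's buffer goes away
    Sym->Value = Value;
    Symbols.push_back(std::move(Sym));
  }
  const Symbol &at(std::size_t I) const { return *Symbols.at(I); }
  std::size_t size() const { return Symbols.size(); }
};

// The concatenated names are temporaries that die at the end of each
// statement, so a table holding non-owning views would be left dangling.
void addSyntheticSymbols(SymbolTable &Tab, const std::string &Prefix,
                         uint64_t DataSize) {
  Tab.addSymbol(Prefix + "_start", 0);
  Tab.addSymbol(Prefix + "_end", DataSize);
  Tab.addSymbol(Prefix + "_size", DataSize);
}
```

The values mirror the test expectations for a four-byte input: start is 0, while end and size are both the data size.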

void SymbolTableSection::removeSectionReferences(const SectionBase *Sec) {		void SymbolTableSection::removeSectionReferences(const SectionBase *Sec) {
if (SectionIndexTable == Sec)		if (SectionIndexTable == Sec)
SectionIndexTable = nullptr;		SectionIndexTable = nullptr;
▲ Show 20 Lines • Show All 321 Lines • ▼ Show 20 Lines
static bool compareSegmentsByPAddr(const Segment *A, const Segment *B) {		static bool compareSegmentsByPAddr(const Segment *A, const Segment *B) {
if (A->PAddr < B->PAddr)		if (A->PAddr < B->PAddr)
return true;		return true;
if (A->PAddr > B->PAddr)		if (A->PAddr > B->PAddr)
return false;		return false;
return A->Index < B->Index;		return A->Index < B->Index;
}		}

		template <class ELFT> void BinaryELFBuilder<ELFT>::initFileHeader() {
		jakehehrlich: So I think this function is the real problem child. We don't actually use the Ident anywhere. This is also 100% comprised of information that the ELFWriter knows but nothing else knows. We should make the ELFWriter construct the Ident and just not store it in Object.
		rupprecht (Author): Nice, I was able to completely remove Ident from Object.
		Obj->Flags = 0x0;
		Obj->Type = ET_REL;
		Obj->Entry = 0x0;
		Obj->Machine = EMachine;
		Obj->Version = 1;
		}

		template <class ELFT> void BinaryELFBuilder<ELFT>::initHeaderSegment() {
		Obj->ElfHdrSegment.Index = 0;
		}

		template <class ELFT> StringTableSection *BinaryELFBuilder<ELFT>::addStrTab() {
		auto &StrTab = Obj->addSection<StringTableSection>();
		StrTab.Name = ".strtab";

		jakehehrlich: I think ELFWriter should assign these. I think I need to do some refactoring to make Object not so ELF specific (with respect to not containing fields that are not known until write time). This change is exposing a lot of points where I failed to properly separate those concerns. In the meantime, if you could move this code into ELFWriter and call it from assignOffset that would be ideal.
		rupprecht (Author): I didn't find a way to refactor out setting ElfHdr.Index, since `ELFBuilder<ELFT>::readProgramHeaders()` is setting it to `ElfHdr.Index = Index++` in the middle of some other sections, but the rest was simple to refactor into ELFWriter.
		Obj->SectionNames = &StrTab;
		return &StrTab;
		}

		template <class ELFT>
		SymbolTableSection *
		BinaryELFBuilder<ELFT>::addSymTab(StringTableSection *StrTab) {
		auto &SymTab = Obj->addSection<SymbolTableSection>();
		jhenderson: Doesn't this assignment rely on knowing that the symbol table is added immediately after the string table? That seems like poor design to me. Better would be to pass in the index or string table section. I might well be forgetting how llvm-objcopy is designed in this area, but don't we need to explicitly add a null symbol as the first symbol in the symbol table?
		rupprecht (Author): Yes, this is very fragile, I think I understand the issue I was having before a little better. The Link needs to be set to the StrTab Index, as you mention. The Object::addSection() helper was not assigning any index, so this would implicitly be zero (which is SHN_UNDEF), and that was throwing errors when initializing it. Using size(Obj->sections()) - 1 was more of a reverse-engineered way of getting things working. Changing Object::addSection() to automatically assign the index lets us use the index from StrTab directly. I think this might allow us to stop manually assigning indices elsewhere in this file, but I'll save that for another change. Also -- fixed the symbol table to include a null symbol.
		jakehehrlich: Yeah that probably should have functioned that way the whole time. Thanks for fixing that!

		jakehehrlich: And this is the last thing where we can't easily factor out dependence on ElfType. This is going to require that Size be handled differently from how it is now. I'm not going to block this change on that since the fix is just not simple. Under the current design of llvm-objcopy that would require a new visitor to calculate the size. I'm not sure I want to further things along that path. Can you add a TODO here for me?
		SymTab.Name = ".symtab";
		SymTab.Link = StrTab->Index;
		// TODO: Factor out dependence on ElfType here.
		jakehehrlich: Correct me if I'm wrong but I don't think GNU objcopy produces any program headers when it does this. The resulting file should be relocatable and thus program headers are pointless. Even if GNU objcopy does this, we probably shouldn't.
		rupprecht (Author): I just verified this; commenting out this section causes an invalid object file (readelf/llvm-readobj can't read it at all). This is ElfHdrSegment. I think you are confusing this with ProgramHdrSegment? However, speaking of program headers, line 1290 (i.e. `OrderedSegments.push_back(&Obj.ProgramHdrSegment)`) seems to do nothing -- all the tests pass -- and also gets rid of the error `readelf: Warning: possibly corrupt ELF header - it has a non-zero program header offset, but no program headers` when running readelf -a on object files that llvm-objcopy produces. I'd like to take a look at that after this patch; it seems to be a preexisting issue.
		jakehehrlich: You're right. I confused it with ProgramHdrSegment. I now remember that someone had the clever idea to make the ELF header an implicit segment so that the same layout algorithm that had all the bugs worked out could already be used. As for the warning I guess I knew about that issue. The `OrderedSegments.push_back(&Obj.ProgramHdrSegment)` should accomplish adding the program headers to layout when you have a PT_PHDR segment in a consistent way. The warning is a separate issue on line 1078 (in this patch) not checking if the program headers are empty before blindly setting phoff to the offset of that segment. This is in full compliance with the ELF standard because the number of program headers is still zero. It's just that no tool normally has a reason to produce an ELF that has phoff != 0 and phnum = 0 so it's normally a sign of corruption.
		rupprecht (Author): I tried adding that check, and several tests failed -- I might not have been checking the right thing, or maybe the tests are bad. I added a TODO on that line to investigate further.
		jakehehrlich: How did the tests fail? Some of the tests do a literal check against the header. e.g. they may be checking to see that 0x40 is used even when (to comply with the warning) 0x0 should be used instead.
		rupprecht (Author): With this change:

		    // Obj.ProgramHdrSegment.firstSection() == nullptr implies
		    // Obj.ProgramHdrSegment.Sections is empty
		    Ehdr.e_phoff = Obj.ProgramHdrSegment.firstSection() == nullptr
		                       ? 0
		                       : Obj.ProgramHdrSegment.Offset;

		The failures usually looked like:

		    Command Output (stderr):
		    --
		    <src>/llvm/test/tools/llvm-objcopy/triple-overlap.test:72:14: error: CHECK-NEXT: expected string not found in input
		    #CHECK-NEXT: Type: PT_LOAD (0x1)
		    ^
		    <stdin>:9:2: note: scanning from here
		    Type: (0x464C457F)
		    ^
		    <stdin>:21:2: note: possible intended match here
		    Type: (0x400004)
		    ^

		Actually, it looks like the object is corrupted with that change, e.g. the first program header for that test has an alignment of 15762873573703680, vs 4096 for all of them on the base side.
		jakehehrlich: Yeah that's not the right check. Use `Obj.ProgramHeaders.size() == 0`
		rupprecht (Author): I can't seem to get that to work:
		- Obj doesn't have a field called ProgramHeaders
		- You probably mean `Obj.ProgramHdrSegment`, but `Segment` doesn't have a method size()
		- If I implement size() as returning either `Contents.size()` or `Sections.size()` and then try:
		    Ehdr.e_phoff = Obj.ProgramHdrSegment.size() == 0 ? 0 : Obj.ProgramHdrSegment.Offset;
		  Then I get the same failure as before
		jakehehrlich: hmm...that's bothersome; I must not be understanding something. I'll see about looking into that. Thanks for letting me know about this!
		SymTab.EntrySize = sizeof(Elf_Sym);

		// The symbol table always needs a null symbol
		SymTab.addSymbol("", 0, 0, nullptr, 0, 0, 0, 0);
		jhenderson: AR -> Data. "AR" is meaningless. It saddens me that different parts of LLVM can't agree on whether to use chars or uint8_t.

		Obj->SymbolTable = &SymTab;
		return &SymTab;
		jhenderson: Rather than abbreviate this to something that is unclear, just call it "DataSection"
		}
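
The index discussion in the comments above (having addSection() assign the section index so that .symtab can link to the real .strtab index) can be sketched without LLVM as follows. This is an assumption-laden simplification: the classes below are stripped-down stand-ins for the patch's types, and addSymTab is written as a free function purely for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct SectionBase {
  std::string Name;
  uint32_t Index = 0; // 0 is the null section index (SHN_UNDEF)
  uint32_t Link = 0;
  virtual ~SectionBase() = default;
};
struct StringTableSection : SectionBase {};
struct SymbolTableSection : SectionBase {};

// Simplified stand-in for Object: the null section is implicit, so the first
// section added receives index 1.
class Object {
  std::vector<std::unique_ptr<SectionBase>> Sections;

public:
  template <class T, class... Ts> T &addSection(Ts &&... Args) {
    auto Sec = std::make_unique<T>(std::forward<Ts>(Args)...);
    T *Ptr = Sec.get();
    Sections.emplace_back(std::move(Sec));
    Ptr->Index = static_cast<uint32_t>(Sections.size());
    return *Ptr;
  }
};

// Link the symbol table to the string table through its assigned index
// instead of relying on the order in which sections were added.
SymbolTableSection &addSymTab(Object &Obj, const StringTableSection &StrTab) {
  auto &SymTab = Obj.addSection<SymbolTableSection>();
  SymTab.Name = ".symtab";
  SymTab.Link = StrTab.Index;
  return SymTab;
}
```

With .strtab added first and .symtab second, the indices come out as 1 and 2 and the link as 1, matching the Index and Link values checked in binary-input.test.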

		template <class ELFT>
		void BinaryELFBuilder<ELFT>::addData(SymbolTableSection *SymTab) {
		auto Data = ArrayRef<uint8_t>(
		reinterpret_cast<const uint8_t *>(MemBuf->getBufferStart()),
		MemBuf->getBufferSize());
		auto &DataSection = Obj->addSection<Section>(Data);
		DataSection.Name = ".data";
		jhenderson: I think it's considered bad form to have explicit local variable Twines that aren't just function parameters, if I remember, based on comments in a previous review (I might be mistaken though, so am happy to be proven wrong). Please also name it something more descriptive like "Prefix".
		rupprecht (Author): I originally had string, but changed to Twine based on Jake's suggestion here: https://reviews.llvm.org/D50343?id=159343#inline-442972

		I'm trying to understand why it would be a bad idea to write this. Looking at http://llvm.org/docs/ProgrammersManual.html#dss-twine, the discouraged pattern is:

		    void foo(const Twine &T);
		    ...
		    StringRef X = ...
		    unsigned i = ...
		    const Twine &Tmp = X + "." + Twine(i);
		    foo(Tmp);

		Which is bad because Tmp is a ref, whereas the Twine here is a regular stack variable that persists past all the calls to addSymbol (which calls .str() on the input twine to save/own the symbol). I renamed to "Prefix", but kept this as a Twine here, since I think this is safe. Otherwise, I think I'd have to do:

		    SymTab->addSymbol(Twine("_binary_") + SanitizedFilename + "_start", ...);
		    SymTab->addSymbol(Twine("_binary_") + SanitizedFilename + "_end", ...);
		    SymTab->addSymbol(Twine("_binary_") + SanitizedFilename + "_size", ...);
		DataSection.Type = ELF::SHT_PROGBITS;
		DataSection.Size = Data.size();
		DataSection.Flags = ELF::SHF_ALLOC \| ELF::SHF_WRITE;

		std::string SanitizedFilename = MemBuf->getBufferIdentifier().str();
		std::replace_if(std::begin(SanitizedFilename), std::end(SanitizedFilename),
		[](char c) { return !isalnum(c); }, '_');
		jhenderson: Is there any point in adding this local section symbol? It can't be referenced, so I think it's superfluous.
		rupprecht (Author): GNU objcopy adds it, but it doesn't seem to be necessary. I'll add it back if it turns out to be needed.
		Twine Prefix = Twine("_binary_") + SanitizedFilename;

		jhenderson: Should the end symbol have a size? That seems weird that it does. I might expect the start symbol to, but not the end symbol. What does GNU objcopy do?
		rupprecht (Author): Nope, this is Value, not Size, although it's confusing, especially because addSymbol has so many parameters. I commented the param names here to make this clear. The sizes/values here match GNU objcopy.
		jhenderson: Right, I get it now. Thanks for the comments.
		SymTab->addSymbol(Prefix + "_start", STB_GLOBAL, STT_NOTYPE, &DataSection,
		/*Value=*/0, STV_DEFAULT, 0, 0);
		SymTab->addSymbol(Prefix + "_end", STB_GLOBAL, STT_NOTYPE, &DataSection,
		/*Value=*/DataSection.Size, STV_DEFAULT, 0, 0);
		SymTab->addSymbol(Prefix + "_size", STB_GLOBAL, STT_NOTYPE, nullptr,
		/*Value=*/DataSection.Size, STV_DEFAULT, SHN_ABS, 0);
		}
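
The naming and value scheme discussed in the comments above can be sketched in standard C++ without LLVM types. This is only an illustration under assumptions: `SketchSymbol`, `sanitizeFilename`, and `makeBinarySymbols` are hypothetical names that do not exist in llvm-objcopy, and the real code goes through `SymbolTableSection::addSymbol` with more parameters. The point it tries to show is that `_start` and `_end` carry section-relative values (0 and the data size), while `_size` is an absolute symbol whose value is the data size.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for one symbol-table entry; not the llvm-objcopy type.
struct SketchSymbol {
  std::string Name;
  uint64_t Value;
  bool IsAbsolute; // true when the symbol is not tied to a section (SHN_ABS)
};

// Replace every non-alphanumeric character with '_', as addData() does with
// the buffer identifier.
inline std::string sanitizeFilename(std::string Name) {
  std::replace_if(Name.begin(), Name.end(),
                  [](unsigned char C) { return !std::isalnum(C); }, '_');
  return Name;
}

// Build the three synthetic symbols for an input of DataSize bytes.
inline std::vector<SketchSymbol> makeBinarySymbols(const std::string &Filename,
                                                   uint64_t DataSize) {
  assert(!Filename.empty() && "expected a non-empty buffer identifier");
  const std::string Prefix = "_binary_" + sanitizeFilename(Filename);
  return {
      {Prefix + "_start", 0, false},       // offset 0 within the data section
      {Prefix + "_end", DataSize, false},  // one past the last data byte
      {Prefix + "_size", DataSize, true},  // absolute value, no section
  };
}
```

For an input named `/tmp/data.txt`, the first symbol produced by this sketch is `_binary__tmp_data_txt_start`, matching the example in the revision summary.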

		jakehehrlich: The MemBuf should outlive the section, in which case you can just use `Section`
		template <class ELFT> void BinaryELFBuilder<ELFT>::initSections() {
		for (auto &Section : Obj->sections()) {
		Section.initialize(Obj->sections());
		}
		jakehehrlich: nit: Can you use a Twine? You'll have to use an std::string for the buffer identifier part unfortunately, but other than that you should be able to use a Twine.
		}

		template <class ELFT> std::unique_ptr<Object> BinaryELFBuilder<ELFT>::build() {
		initFileHeader();
		initHeaderSegment();
		StringTableSection *StrTab = addStrTab();
		SymbolTableSection *SymTab = addSymTab(StrTab);
		initSections();
		addData(SymTab);

		return std::move(Obj);
		}

template <class ELFT> void ELFBuilder<ELFT>::setParentSegment(Segment &Child) {		template <class ELFT> void ELFBuilder<ELFT>::setParentSegment(Segment &Child) {
for (auto &Parent : Obj.segments()) {		for (auto &Parent : Obj.segments()) {
// Every segment will overlap with itself but we don't want a segment to		// Every segment will overlap with itself but we don't want a segment to
// be its own parent so we avoid that situation.		// be its own parent so we avoid that situation.
if (&Child != &Parent && segmentOverlapsSegment(Child, Parent)) {		if (&Child != &Parent && segmentOverlapsSegment(Child, Parent)) {
// We want a canonical "most parental" segment but this requires		// We want a canonical "most parental" segment but this requires
// inspecting the ParentSegment.		// inspecting the ParentSegment.
		jakehehrlich: I think you're ok to initialize the symbol table in whatever order you want here since there are no relocations. That's a current thorn that you even have to initialize things in a different order. I need to carve out some time to refactor llvm-objcopy to work differently to get there however.
if (compareSegmentsByOffset(&Parent, &Child))		if (compareSegmentsByOffset(&Parent, &Child))
if (Child.ParentSegment == nullptr ||		if (Child.ParentSegment == nullptr ||
compareSegmentsByOffset(&Parent, Child.ParentSegment)) {		compareSegmentsByOffset(&Parent, Child.ParentSegment)) {
Child.ParentSegment = &Parent;		Child.ParentSegment = &Parent;
}		}
}		}
}		}
}		}
[21 lines not shown]	context: for (auto &Section : Obj.sections()) {
Section.ParentSegment->Offset > Seg.Offset) {		Section.ParentSegment->Offset > Seg.Offset) {
Section.ParentSegment = &Seg;		Section.ParentSegment = &Seg;
}		}
}		}
}		}
}		}

auto &ElfHdr = Obj.ElfHdrSegment;		auto &ElfHdr = Obj.ElfHdrSegment;
// Creating multiple PT_PHDR segments technically is not valid, but PT_LOAD
// segments must not overlap, and other types fit even less.
ElfHdr.Type = PT_PHDR;
ElfHdr.Flags = 0;
ElfHdr.OriginalOffset = ElfHdr.Offset = 0;
ElfHdr.VAddr = 0;
ElfHdr.PAddr = 0;
ElfHdr.FileSize = ElfHdr.MemSize = sizeof(Elf_Ehdr);
ElfHdr.Align = 0;
ElfHdr.Index = Index++;		ElfHdr.Index = Index++;

const auto &Ehdr = *ElfFile.getHeader();		const auto &Ehdr = *ElfFile.getHeader();
auto &PrHdr = Obj.ProgramHdrSegment;		auto &PrHdr = Obj.ProgramHdrSegment;
PrHdr.Type = PT_PHDR;		PrHdr.Type = PT_PHDR;
PrHdr.Flags = 0;		PrHdr.Flags = 0;
// The spec requires us to have p_vaddr % p_align == p_offset % p_align.		// The spec requires us to have p_vaddr % p_align == p_offset % p_align.
// Whereas this works automatically for ElfHdr, here OriginalOffset is		// Whereas this works automatically for ElfHdr, here OriginalOffset is
[238 lines not shown]	context: if (auto RelSec = dyn_cast<RelocationSection>(&Section)) {
initGroupSection(GroupSec);		initGroupSection(GroupSec);
}		}
}		}
}		}

template <class ELFT> void ELFBuilder<ELFT>::build() {		template <class ELFT> void ELFBuilder<ELFT>::build() {
const auto &Ehdr = *ElfFile.getHeader();		const auto &Ehdr = *ElfFile.getHeader();

std::copy(Ehdr.e_ident, Ehdr.e_ident + 16, Obj.Ident);
Obj.Type = Ehdr.e_type;		Obj.Type = Ehdr.e_type;
Obj.Machine = Ehdr.e_machine;		Obj.Machine = Ehdr.e_machine;
Obj.Version = Ehdr.e_version;		Obj.Version = Ehdr.e_version;
Obj.Entry = Ehdr.e_entry;		Obj.Entry = Ehdr.e_entry;
Obj.Flags = Ehdr.e_flags;		Obj.Flags = Ehdr.e_flags;

readSectionHeaders();		readSectionHeaders();
readProgramHeaders();		readProgramHeaders();
[15 lines not shown]
template <class R> size_t size(R &&Range) {		template <class R> size_t size(R &&Range) {
return static_cast<size_t>(std::end(Range) - std::begin(Range));		return static_cast<size_t>(std::end(Range) - std::begin(Range));
}		}

Writer::~Writer() {}		Writer::~Writer() {}

Reader::~Reader() {}		Reader::~Reader() {}

ElfType ELFReader::getElfType() const {		std::unique_ptr<Object> BinaryReader::create() const {
if (isa<ELFObjectFile<ELF32LE>>(Bin))		if (MInfo.Is64Bit)
return ELFT_ELF32LE;		return MInfo.IsLittleEndian
if (isa<ELFObjectFile<ELF64LE>>(Bin))		? BinaryELFBuilder<ELF64LE>(MInfo.EMachine, MemBuf).build()
return ELFT_ELF64LE;		: BinaryELFBuilder<ELF64BE>(MInfo.EMachine, MemBuf).build();
if (isa<ELFObjectFile<ELF32BE>>(Bin))		else
return ELFT_ELF32BE;		return MInfo.IsLittleEndian
if (isa<ELFObjectFile<ELF64BE>>(Bin))		? BinaryELFBuilder<ELF32LE>(MInfo.EMachine, MemBuf).build()
return ELFT_ELF64BE;		: BinaryELFBuilder<ELF32BE>(MInfo.EMachine, MemBuf).build();
llvm_unreachable("Invalid ELFType");
}		}

std::unique_ptr<Object> ELFReader::create() const {		std::unique_ptr<Object> ELFReader::create() const {
auto Obj = llvm::make_unique<Object>();		auto Obj = llvm::make_unique<Object>();
if (auto *o = dyn_cast<ELFObjectFile<ELF32LE>>(Bin)) {		if (auto *o = dyn_cast<ELFObjectFile<ELF32LE>>(Bin)) {
ELFBuilder<ELF32LE> Builder(o, Obj);		ELFBuilder<ELF32LE> Builder(o, Obj);
Builder.build();		Builder.build();
return Obj;		return Obj;
[11 lines not shown]	context: if (auto *o = dyn_cast<ELFObjectFile<ELF32LE>>(Bin)) {
return Obj;		return Obj;
}		}
error("Invalid file type");		error("Invalid file type");
}		}

template <class ELFT> void ELFWriter<ELFT>::writeEhdr() {		template <class ELFT> void ELFWriter<ELFT>::writeEhdr() {
uint8_t *B = Buf.getBufferStart();		uint8_t *B = Buf.getBufferStart();
Elf_Ehdr &Ehdr = *reinterpret_cast<Elf_Ehdr *>(B);		Elf_Ehdr &Ehdr = *reinterpret_cast<Elf_Ehdr *>(B);
std::copy(Obj.Ident, Obj.Ident + 16, Ehdr.e_ident);		std::fill(Ehdr.e_ident, Ehdr.e_ident + 16, 0);
		Ehdr.e_ident[EI_MAG0] = 0x7f;
		Ehdr.e_ident[EI_MAG1] = 'E';
		Ehdr.e_ident[EI_MAG2] = 'L';
		Ehdr.e_ident[EI_MAG3] = 'F';
		Ehdr.e_ident[EI_CLASS] = ELFT::Is64Bits ? ELFCLASS64 : ELFCLASS32;
		Ehdr.e_ident[EI_DATA] =
		ELFT::TargetEndianness == support::big ? ELFDATA2MSB : ELFDATA2LSB;
		Ehdr.e_ident[EI_VERSION] = EV_CURRENT;
		Ehdr.e_ident[EI_OSABI] = ELFOSABI_NONE;
		Ehdr.e_ident[EI_ABIVERSION] = 0;

Ehdr.e_type = Obj.Type;		Ehdr.e_type = Obj.Type;
Ehdr.e_machine = Obj.Machine;		Ehdr.e_machine = Obj.Machine;
Ehdr.e_version = Obj.Version;		Ehdr.e_version = Obj.Version;
Ehdr.e_entry = Obj.Entry;		Ehdr.e_entry = Obj.Entry;
		// TODO: Only set phoff when a program header exists, to avoid tools
		// thinking this is corrupt data.
Ehdr.e_phoff = Obj.ProgramHdrSegment.Offset;		Ehdr.e_phoff = Obj.ProgramHdrSegment.Offset;
Ehdr.e_flags = Obj.Flags;		Ehdr.e_flags = Obj.Flags;
Ehdr.e_ehsize = sizeof(Elf_Ehdr);		Ehdr.e_ehsize = sizeof(Elf_Ehdr);
Ehdr.e_phentsize = sizeof(Elf_Phdr);		Ehdr.e_phentsize = sizeof(Elf_Phdr);
Ehdr.e_phnum = size(Obj.segments());		Ehdr.e_phnum = size(Obj.segments());
Ehdr.e_shentsize = sizeof(Elf_Shdr);		Ehdr.e_shentsize = sizeof(Elf_Shdr);
if (WriteSectionHeaders) {		if (WriteSectionHeaders) {
Ehdr.e_shoff = Obj.SHOffset;		Ehdr.e_shoff = Obj.SHOffset;
[188 lines not shown]	context: if (Section.ParentSegment != nullptr) {
Section.Offset = Offset;		Section.Offset = Offset;
if (Section.Type != SHT_NOBITS)		if (Section.Type != SHT_NOBITS)
Offset += Section.Size;		Offset += Section.Size;
}		}
}		}
return Offset;		return Offset;
}		}

		template <class ELFT> void ELFWriter<ELFT>::initEhdrSegment() {
		jhenderson: Could you rename this initEhdrSegment, since it doesn't actually initialise the ELF header itself.
		auto &ElfHdr = Obj.ElfHdrSegment;
		ElfHdr.Type = PT_PHDR;
		ElfHdr.Flags = 0;
		ElfHdr.OriginalOffset = ElfHdr.Offset = 0;
		ElfHdr.VAddr = 0;
		ElfHdr.PAddr = 0;
		ElfHdr.FileSize = ElfHdr.MemSize = sizeof(Elf_Ehdr);
		ElfHdr.Align = 0;
		}

template <class ELFT> void ELFWriter<ELFT>::assignOffsets() {		template <class ELFT> void ELFWriter<ELFT>::assignOffsets() {
// We need a temporary list of segments that has a special order to it		// We need a temporary list of segments that has a special order to it
		jhenderson: I'm not convinced that the ELF header segment should be initialised inside something called "assignOffsets", since it's doing a lot more than assigning it an offset (which actually is always 0, so could be initialised at the same time as the rest of the header). It should probably be called outside this function.
// so that we know that anytime ->ParentSegment is set that segment has		// so that we know that anytime ->ParentSegment is set that segment has
// already had its offset properly set.		// already had its offset properly set.
std::vector<Segment *> OrderedSegments;		std::vector<Segment *> OrderedSegments;
for (auto &Segment : Obj.segments())		for (auto &Segment : Obj.segments())
OrderedSegments.push_back(&Segment);		OrderedSegments.push_back(&Segment);
OrderedSegments.push_back(&Obj.ElfHdrSegment);		OrderedSegments.push_back(&Obj.ElfHdrSegment);
OrderedSegments.push_back(&Obj.ProgramHdrSegment);		OrderedSegments.push_back(&Obj.ProgramHdrSegment);
OrderSegments(OrderedSegments);		OrderSegments(OrderedSegments);
[73 lines not shown]	context: template <class ELFT> void ELFWriter<ELFT>::finalize() {

// Make sure we add the names of all the sections. Importantly this must be		// Make sure we add the names of all the sections. Importantly this must be
// done after we decide to add or remove SectionIndexes.		// done after we decide to add or remove SectionIndexes.
if (Obj.SectionNames != nullptr)		if (Obj.SectionNames != nullptr)
for (const auto &Section : Obj.sections()) {		for (const auto &Section : Obj.sections()) {
Obj.SectionNames->addString(Section.Name);		Obj.SectionNames->addString(Section.Name);
}		}

		initEhdrSegment();
// Before we can prepare for layout the indexes need to be finalized.		// Before we can prepare for layout the indexes need to be finalized.
uint64_t Index = 0;		uint64_t Index = 0;
for (auto &Sec : Obj.sections())		for (auto &Sec : Obj.sections())
Sec.Index = Index++;		Sec.Index = Index++;

// The symbol table does not update all other sections on update. For		// The symbol table does not update all other sections on update. For
// instance, symbol names are not added as new symbols are added. This means		// instance, symbol names are not added as new symbols are added. This means
// that some sections, like .strtab, don't yet have their final size.		// that some sections, like .strtab, don't yet have their final size.
[111 lines not shown]	context: void BinaryWriter::finalize() {

Buf.allocate(TotalSize);		Buf.allocate(TotalSize);
SecWriter = llvm::make_unique<BinarySectionWriter>(Buf);		SecWriter = llvm::make_unique<BinarySectionWriter>(Buf);
}		}

namespace llvm {		namespace llvm {
namespace objcopy {		namespace objcopy {

		template class BinaryELFBuilder<ELF64LE>;
		template class BinaryELFBuilder<ELF64BE>;
		template class BinaryELFBuilder<ELF32LE>;
		template class BinaryELFBuilder<ELF32BE>;

template class ELFBuilder<ELF64LE>;		template class ELFBuilder<ELF64LE>;
template class ELFBuilder<ELF64BE>;		template class ELFBuilder<ELF64BE>;
template class ELFBuilder<ELF32LE>;		template class ELFBuilder<ELF32LE>;
template class ELFBuilder<ELF32BE>;		template class ELFBuilder<ELF32BE>;

template class ELFWriter<ELF64LE>;		template class ELFWriter<ELF64LE>;
template class ELFWriter<ELF64BE>;		template class ELFWriter<ELF64BE>;
template class ELFWriter<ELF32LE>;		template class ELFWriter<ELF32LE>;
template class ELFWriter<ELF32BE>;		template class ELFWriter<ELF32BE>;
} // end namespace objcopy		} // end namespace objcopy
} // end namespace llvm		} // end namespace llvm

tools/llvm-objcopy/llvm-objcopy.cpp

[28 lines not shown]
#include "llvm/Support/Casting.h"		#include "llvm/Support/Casting.h"
#include "llvm/Support/CommandLine.h"		#include "llvm/Support/CommandLine.h"
#include "llvm/Support/Compiler.h"		#include "llvm/Support/Compiler.h"
#include "llvm/Support/Error.h"		#include "llvm/Support/Error.h"
#include "llvm/Support/ErrorHandling.h"		#include "llvm/Support/ErrorHandling.h"
#include "llvm/Support/ErrorOr.h"		#include "llvm/Support/ErrorOr.h"
#include "llvm/Support/FileOutputBuffer.h"		#include "llvm/Support/FileOutputBuffer.h"
#include "llvm/Support/InitLLVM.h"		#include "llvm/Support/InitLLVM.h"
		#include "llvm/Support/Memory.h"
#include "llvm/Support/Path.h"		#include "llvm/Support/Path.h"
#include "llvm/Support/WithColor.h"		#include "llvm/Support/WithColor.h"
#include "llvm/Support/raw_ostream.h"		#include "llvm/Support/raw_ostream.h"
#include <algorithm>		#include <algorithm>
#include <cassert>		#include <cassert>
#include <cstdlib>		#include <cstdlib>
#include <functional>		#include <functional>
#include <iterator>		#include <iterator>
[77 lines not shown]

struct SectionRename {		struct SectionRename {
StringRef OriginalName;		StringRef OriginalName;
StringRef NewName;		StringRef NewName;
Optional<uint64_t> NewFlags;		Optional<uint64_t> NewFlags;
};		};

struct CopyConfig {		struct CopyConfig {
		// Main input/output options
StringRef OutputFilename;		StringRef OutputFilename;
StringRef InputFilename;		StringRef InputFilename;
StringRef OutputFormat;		StringRef OutputFormat;
StringRef InputFormat;		StringRef InputFormat;
StringRef BinaryArch;

		jhenderson: It looks to me like we are grouping things in CopyConfig by type, so this should probably be moved to before (or after) the StringRef block.
		rupprecht: Done -- I added some comments to try to logically explain how this config is laid out, but I'm not very familiar with all of them... I'm open to suggestions here.
		jhenderson: I'd like @jakehehrlich to comment on these comments, if that's okay, as he may have a specific desire as to how this class is laid out.
		jakehehrlich: To date I have had no reason or rhyme to how I've added these...in fact I'm not sure if I have even added the majority at this point. A pattern of grouping by type does seem to have formed organically however. Let's stick with it. Long term ideals on how this should be laid out: The names should have a consistent method for naming them derived from the option name where possible. I don't think I've done a very good job of this and it isn't clear how to best solve this issue. Grouping first by type and then by alphabetization is probably a good idea. I'm pretty trash at finding things in alphabetized lists but I always know the basic type of thing I'm looking for. AddSection is the only thing I know of which currently violates this. (we should make it work the way Paul made SectionsToRename work).
		jakehehrlich: Do we use these anywhere? I think I added them thinking about how I was going to use them to do exactly what this change does but embedding MachineInfo seems much nicer and conveys the same information. Maybe we should just do that?
		rupprecht: We do. Essentially, we use this to decide whether we're going to need a BinaryReader vs ELFReader for InputFormat, and, separately, whether we're going to need a BinaryWriter vs ELFWriter for OutputFormat. BinaryArch is extra information that only applies when using BinaryReader.
		// Only applicable for --input-format=Binary
		MachineInfo BinaryArch;

		// Advanced options
StringRef SplitDWO;		StringRef SplitDWO;
StringRef AddGnuDebugLink;		StringRef AddGnuDebugLink;
StringRef SymbolsPrefix;		StringRef SymbolsPrefix;
std::vector<StringRef> ToRemove;		std::vector<StringRef> ToRemove;
std::vector<StringRef> Keep;		std::vector<StringRef> Keep;
std::vector<StringRef> OnlyKeep;		std::vector<StringRef> OnlyKeep;
std::vector<StringRef> AddSection;		std::vector<StringRef> AddSection;
std::vector<StringRef> DumpSection;		std::vector<StringRef> DumpSection;
std::vector<StringRef> SymbolsToLocalize;		std::vector<StringRef> SymbolsToLocalize;
std::vector<StringRef> SymbolsToGlobalize;		std::vector<StringRef> SymbolsToGlobalize;
std::vector<StringRef> SymbolsToWeaken;		std::vector<StringRef> SymbolsToWeaken;
std::vector<StringRef> SymbolsToRemove;		std::vector<StringRef> SymbolsToRemove;
std::vector<StringRef> SymbolsToKeep;		std::vector<StringRef> SymbolsToKeep;
StringMap<SectionRename> SectionsToRename;		StringMap<SectionRename> SectionsToRename;
StringMap<StringRef> SymbolsToRename;		StringMap<StringRef> SymbolsToRename;

		// Boolean options
bool StripAll = false;		bool StripAll = false;
bool StripAllGNU = false;		bool StripAllGNU = false;
bool StripDebug = false;		bool StripDebug = false;
bool StripSections = false;		bool StripSections = false;
bool StripNonAlloc = false;		bool StripNonAlloc = false;
bool StripDWO = false;		bool StripDWO = false;
bool StripUnneeded = false;		bool StripUnneeded = false;
bool ExtractDWO = false;		bool ExtractDWO = false;
[129 lines not shown]	context: static bool onlyKeepDWOPred(const Object &Obj, const SectionBase &Sec) {
// We can't remove the section header string table.		// We can't remove the section header string table.
if (&Sec == Obj.SectionNames)		if (&Sec == Obj.SectionNames)
return false;		return false;
// Short of keeping the string table we want to keep everything that is a DWO		// Short of keeping the string table we want to keep everything that is a DWO
// section and remove everything else.		// section and remove everything else.
return !isDWOSection(Sec);		return !isDWOSection(Sec);
}		}

		static const StringMap<MachineInfo> ArchMap{
		// Name, {EMachine, 64bit, LittleEndian}
		{"aarch64", {EM_AARCH64, true, true}},
		{"arm", {EM_ARM, false, true}},
		{"i386", {EM_386, false, true}},
		{"i386:x86-64", {EM_X86_64, true, true}},
		{"powerpc:common64", {EM_PPC64, true, true}},
		{"sparc", {EM_SPARC, false, true}},
		{"x86-64", {EM_X86_64, true, true}},
		};

		jhenderson: I feel like this might read better if everything is constructed inline:
			static const StringMap<MachineInfo> ArchMap{
			// Name, EM value, 64bit, LittleEndian
			{"aarch64", {EM_AARCH64, true, true}},
			{"arm", {EM_ARM, false, true}},
			/* More entries here */
			};
		I'd order the entries either alphabetically by name or numerically by their EM value. You could also put "headers" as shown to avoid needing to duplicate the 64/Endianness comment in each part.
		static const MachineInfo &getMachineInfo(StringRef Arch) {
		auto Iter = ArchMap.find(Arch);
		if (Iter == std::end(ArchMap))
		error("Invalid architecture: '" + Arch + "'");
		return Iter->getValue();
		}
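
The lookup in `getMachineInfo()` above can be tried out without LLVM using a plain `std::map`. This is a sketch under assumptions: `SketchMachineInfo` and `findMachineInfo` are hypothetical names, only a subset of the table is reproduced, and the numeric `e_machine` values (3 for `EM_386`, 40 for `EM_ARM`, 62 for `EM_X86_64`, 183 for `EM_AARCH64`) are written out because the ELF headers are not included. Unlike the patch, which calls `error()` on an unknown name, this version returns `nullptr` so that the failure path can be checked.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical mirror of the MachineInfo triple used in the patch.
struct SketchMachineInfo {
  uint16_t EMachine;
  bool Is64Bit;
  bool IsLittleEndian;
};

// Look up an architecture name as passed to -B; nullptr means "unknown".
inline const SketchMachineInfo *findMachineInfo(const std::string &Arch) {
  static const std::map<std::string, SketchMachineInfo> ArchMap{
      // Name, {EMachine, 64bit, LittleEndian}
      {"aarch64", {183, true, true}},
      {"arm", {40, false, true}},
      {"i386", {3, false, true}},
      {"i386:x86-64", {62, true, true}},
      {"x86-64", {62, true, true}},
  };
  auto Iter = ArchMap.find(Arch);
  return Iter == ArchMap.end() ? nullptr : &Iter->second;
}
```

Keeping `i386:x86-64` and `x86-64` as two keys with the same value is how the alias mentioned in the summary falls out of the table, with no special-case code.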

		static ElfType getOutputElfType(const Binary &Bin) {
		// Infer output ELF type from the input ELF object
		if (isa<ELFObjectFile<ELF32LE>>(Bin))
		return ELFT_ELF32LE;
		if (isa<ELFObjectFile<ELF64LE>>(Bin))
		return ELFT_ELF64LE;
		if (isa<ELFObjectFile<ELF32BE>>(Bin))
		return ELFT_ELF32BE;
		if (isa<ELFObjectFile<ELF64BE>>(Bin))
		return ELFT_ELF64BE;
		llvm_unreachable("Invalid ELFType");
		}

		static ElfType getOutputElfType(const MachineInfo &MI) {
		// Infer output ELF type from the binary arch specified
		if (MI.Is64Bit)
		return MI.IsLittleEndian ? ELFT_ELF64LE : ELFT_ELF64BE;
		else
		return MI.IsLittleEndian ? ELFT_ELF32LE : ELFT_ELF32BE;
		}

static std::unique_ptr<Writer> createWriter(const CopyConfig &Config,		static std::unique_ptr<Writer> createWriter(const CopyConfig &Config,
Object &Obj, Buffer &Buf,		Object &Obj, Buffer &Buf,
ElfType OutputElfType) {		ElfType OutputElfType) {
if (Config.OutputFormat == "binary") {		if (Config.OutputFormat == "binary") {
return llvm::make_unique<BinaryWriter>(Obj, Buf);		return llvm::make_unique<BinaryWriter>(Obj, Buf);
}		}
// Depending on the initial ELFT and OutputFormat we need a different Writer.		// Depending on the initial ELFT and OutputFormat we need a different Writer.
switch (OutputElfType) {		switch (OutputElfType) {
[292 lines not shown]	context: for (const auto &Flag : Config.DumpSection) {
reportError(Config.InputFilename, std::move(E));		reportError(Config.InputFilename, std::move(E));
}		}
}		}

if (!Config.AddGnuDebugLink.empty())		if (!Config.AddGnuDebugLink.empty())
Obj.addSection<GnuDebugLinkSection>(Config.AddGnuDebugLink);		Obj.addSection<GnuDebugLinkSection>(Config.AddGnuDebugLink);
}		}

static void executeElfObjcopyOnBinary(const CopyConfig &Config, Binary &Binary,		static void executeElfObjcopyOnBinary(const CopyConfig &Config, Reader &Reader,
Buffer &Out) {		Buffer &Out, ElfType OutputElfType) {
ELFReader Reader(&Binary);
std::unique_ptr<Object> Obj = Reader.create();		std::unique_ptr<Object> Obj = Reader.create();

handleArgs(Config, *Obj, Reader, Reader.getElfType());		handleArgs(Config, *Obj, Reader, OutputElfType);

std::unique_ptr<Writer> Writer =		std::unique_ptr<Writer> Writer =
createWriter(Config, *Obj, Out, Reader.getElfType());		createWriter(Config, *Obj, Out, OutputElfType);
Writer->finalize();		Writer->finalize();
Writer->write();		Writer->write();
}		}

// For regular archives this function simply calls llvm::writeArchive,		// For regular archives this function simply calls llvm::writeArchive,
// For thin archives it writes the archive file itself as well as its members.		// For thin archives it writes the archive file itself as well as its members.
static Error deepWriteArchive(StringRef ArcName,		static Error deepWriteArchive(StringRef ArcName,
ArrayRef<NewArchiveMember> NewMembers,		ArrayRef<NewArchiveMember> NewMembers,
[25 lines not shown]
static void executeElfObjcopyOnArchive(const CopyConfig &Config,		static void executeElfObjcopyOnArchive(const CopyConfig &Config,
const Archive &Ar) {		const Archive &Ar) {
std::vector<NewArchiveMember> NewArchiveMembers;		std::vector<NewArchiveMember> NewArchiveMembers;
Error Err = Error::success();		Error Err = Error::success();
for (const Archive::Child &Child : Ar.children(Err)) {		for (const Archive::Child &Child : Ar.children(Err)) {
Expected<std::unique_ptr<Binary>> ChildOrErr = Child.getAsBinary();		Expected<std::unique_ptr<Binary>> ChildOrErr = Child.getAsBinary();
if (!ChildOrErr)		if (!ChildOrErr)
reportError(Ar.getFileName(), ChildOrErr.takeError());		reportError(Ar.getFileName(), ChildOrErr.takeError());
		Binary *Bin = ChildOrErr->get();

Expected<StringRef> ChildNameOrErr = Child.getName();		Expected<StringRef> ChildNameOrErr = Child.getName();
if (!ChildNameOrErr)		if (!ChildNameOrErr)
reportError(Ar.getFileName(), ChildNameOrErr.takeError());		reportError(Ar.getFileName(), ChildNameOrErr.takeError());

MemBuffer MB(ChildNameOrErr.get());		MemBuffer MB(ChildNameOrErr.get());
executeElfObjcopyOnBinary(Config, **ChildOrErr, MB);		ELFReader Reader(Bin);
		executeElfObjcopyOnBinary(Config, Reader, MB, getOutputElfType(*Bin));

Expected<NewArchiveMember> Member =		Expected<NewArchiveMember> Member =
NewArchiveMember::getOldMember(Child, true);		NewArchiveMember::getOldMember(Child, true);
if (!Member)		if (!Member)
reportError(Ar.getFileName(), Member.takeError());		reportError(Ar.getFileName(), Member.takeError());
Member->Buf = MB.releaseMemoryBuffer();		Member->Buf = MB.releaseMemoryBuffer();
Member->MemberName = Member->Buf->getBufferIdentifier();		Member->MemberName = Member->Buf->getBufferIdentifier();
NewArchiveMembers.push_back(std::move(*Member));		NewArchiveMembers.push_back(std::move(*Member));
}		}

if (Err)		if (Err)
reportError(Config.InputFilename, std::move(Err));		reportError(Config.InputFilename, std::move(Err));
if (Error E =		if (Error E =
deepWriteArchive(Config.OutputFilename, NewArchiveMembers,		deepWriteArchive(Config.OutputFilename, NewArchiveMembers,
Ar.hasSymbolTable(), Ar.kind(), true, Ar.isThin()))		Ar.hasSymbolTable(), Ar.kind(), true, Ar.isThin()))
reportError(Config.OutputFilename, std::move(E));		reportError(Config.OutputFilename, std::move(E));
}		}

static void executeElfObjcopy(const CopyConfig &Config) {		static void executeElfObjcopy(const CopyConfig &Config) {
		if (Config.InputFormat == "binary") {
		auto BufOrErr = MemoryBuffer::getFile(Config.InputFilename);
		if (!BufOrErr)
		reportError(Config.InputFilename, BufOrErr.getError());

		FileBuffer FB(Config.OutputFilename);
		BinaryReader Reader(Config.BinaryArch, BufOrErr->get());
		executeElfObjcopyOnBinary(Config, Reader, FB,
		getOutputElfType(Config.BinaryArch));
		} else {
Expected<OwningBinary<llvm::object::Binary>> BinaryOrErr =		Expected<OwningBinary<llvm::object::Binary>> BinaryOrErr =
createBinary(Config.InputFilename);		createBinary(Config.InputFilename);
if (!BinaryOrErr)		if (!BinaryOrErr)
reportError(Config.InputFilename, BinaryOrErr.takeError());		reportError(Config.InputFilename, BinaryOrErr.takeError());

if (Archive *Ar = dyn_cast<Archive>(BinaryOrErr.get().getBinary()))		if (Archive *Ar = dyn_cast<Archive>(BinaryOrErr.get().getBinary()))
return executeElfObjcopyOnArchive(Config, *Ar);		return executeElfObjcopyOnArchive(Config, *Ar);

FileBuffer FB(Config.OutputFilename);		FileBuffer FB(Config.OutputFilename);
executeElfObjcopyOnBinary(Config, *BinaryOrErr.get().getBinary(), FB);		Binary *Bin = BinaryOrErr.get().getBinary();
		ELFReader Reader(Bin);
		executeElfObjcopyOnBinary(Config, Reader, FB, getOutputElfType(*Bin));
		}
}		}
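
The shape of the refactoring above, where the reader no longer reports an ELF type and the caller passes the output type explicitly, can be sketched with minimal stand-in classes. All `Sketch*` names and `runCopy` are hypothetical and only model the control flow; the real `Reader`, `Object`, and `Writer` classes carry far more state.

```cpp
#include <cassert>
#include <memory>
#include <string>

enum class SketchElfType { ELF32LE, ELF64LE, ELF32BE, ELF64BE };

// Hypothetical object model: records only which reader produced it.
struct SketchObject {
  std::string Origin;
};

// Readers only build an object; choosing the output ELF type is the
// caller's job, as in the updated executeElfObjcopyOnBinary().
class SketchReader {
public:
  virtual ~SketchReader() = default;
  virtual std::unique_ptr<SketchObject> create() const = 0;
};

class SketchELFReader : public SketchReader {
public:
  std::unique_ptr<SketchObject> create() const override {
    return std::unique_ptr<SketchObject>(new SketchObject{"elf"});
  }
};

class SketchBinaryReader : public SketchReader {
public:
  std::unique_ptr<SketchObject> create() const override {
    return std::unique_ptr<SketchObject>(new SketchObject{"binary"});
  }
};

// Stand-in for the copy pipeline: the same code path serves both readers.
inline std::string runCopy(const SketchReader &Reader,
                           SketchElfType OutputType) {
  std::unique_ptr<SketchObject> Obj = Reader.create();
  assert(Obj && "reader must produce an object");
  static const char *const Names[] = {"ELF32LE", "ELF64LE", "ELF32BE",
                                      "ELF64BE"};
  return Obj->Origin + " -> " + Names[static_cast<int>(OutputType)];
}
```

With this split, a raw binary input derives its output type from the `-B` architecture, while an ELF input derives it from the input file, and neither choice leaks into the reader interface.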

// ParseObjcopyOptions returns the config and sets the input arguments. If a		// ParseObjcopyOptions returns the config and sets the input arguments. If a
// help flag is set then ParseObjcopyOptions will print the help message and		// help flag is set then ParseObjcopyOptions will print the help message and
// exit.		// exit.
static CopyConfig parseObjcopyOptions(ArrayRef<const char *> ArgsArr) {		static CopyConfig parseObjcopyOptions(ArrayRef<const char *> ArgsArr) {
ObjcopyOptTable T;		ObjcopyOptTable T;
unsigned MissingArgumentIndex, MissingArgumentCount;		unsigned MissingArgumentIndex, MissingArgumentCount;
[24 lines not shown]	context: static CopyConfig parseObjcopyOptions(ArrayRef<const char *> ArgsArr) {
if (Positional.size() > 2)		if (Positional.size() > 2)
error("Too many positional arguments");		error("Too many positional arguments");

CopyConfig Config;		CopyConfig Config;
Config.InputFilename = Positional[0];		Config.InputFilename = Positional[0];
Config.OutputFilename = Positional[Positional.size() == 1 ? 0 : 1];		Config.OutputFilename = Positional[Positional.size() == 1 ? 0 : 1];
Config.InputFormat = InputArgs.getLastArgValue(OBJCOPY_input_target);		Config.InputFormat = InputArgs.getLastArgValue(OBJCOPY_input_target);
Config.OutputFormat = InputArgs.getLastArgValue(OBJCOPY_output_target);		Config.OutputFormat = InputArgs.getLastArgValue(OBJCOPY_output_target);
Config.BinaryArch = InputArgs.getLastArgValue(OBJCOPY_binary_architecture);		if (Config.InputFormat == "binary") {
		auto BinaryArch = InputArgs.getLastArgValue(OBJCOPY_binary_architecture);
		if (BinaryArch.empty())
		error("Specified binary input without specifiying an architecture");
		Config.BinaryArch = getMachineInfo(BinaryArch);
		}

Config.SplitDWO = InputArgs.getLastArgValue(OBJCOPY_split_dwo);		Config.SplitDWO = InputArgs.getLastArgValue(OBJCOPY_split_dwo);
Config.AddGnuDebugLink = InputArgs.getLastArgValue(OBJCOPY_add_gnu_debuglink);		Config.AddGnuDebugLink = InputArgs.getLastArgValue(OBJCOPY_add_gnu_debuglink);
Config.SymbolsPrefix = InputArgs.getLastArgValue(OBJCOPY_prefix_symbols);		Config.SymbolsPrefix = InputArgs.getLastArgValue(OBJCOPY_prefix_symbols);

for (auto Arg : InputArgs.filtered(OBJCOPY_redefine_symbol)) {		for (auto Arg : InputArgs.filtered(OBJCOPY_redefine_symbol)) {
if (!StringRef(Arg->getValue()).contains('='))		if (!StringRef(Arg->getValue()).contains('='))
error("Bad format for --redefine-sym");		error("Bad format for --redefine-sym");
[112 lines not shown]


Diff 160656
