This is an archive of the discontinued LLVM Phabricator instance.

[llvm][llvm-objcopy] Added support for outputting to binary in llvm-objcopy
ClosedPublic

Authored by jakehehrlich on Jun 21 2017, 3:05 PM.

Details

Reviewers

phosek
jhenderson
Bigcheese

Commits

Summary

This change adds the "-O binary" flag, which directs llvm-objcopy to write the object file in the same format that GNU objcopy produces when given "-O binary". This was done by splitting the Object class into two subclasses, ObjectELF and ObjectBinary, which each output a different format but rely on the same code in Object to read in the object.

This change depends on D33964

Diff Detail

Repository: rL LLVM

Event Timeline

jakehehrlich created this revision. Jun 21 2017, 3:05 PM

tpimh added a subscriber: tpimh. Jun 23 2017, 4:33 AM

jakehehrlich updated this revision to Diff 107409. Jul 19 2017, 4:19 PM

jakehehrlich edited the summary of this revision.

jakehehrlich added a parent revision: D33964: [LLVM][llvm-objcopy] Added basic plumbing to get things started.

jakehehrlich added reviewers: phosek, jhenderson, Bigcheese.

jakehehrlich removed a subscriber: phosek.

Updated this to solve Mac build issues and MSan issues in D33964

It looks like you've accidentally lost program-headers.test again and regained hello-world.test.

I'm not familiar with the binary output format, and the objcopy documentation isn't exactly clear on the matter. Would you mind giving a quick overview, please?

tools/llvm-objcopy/Object.cpp
359–368	I can't help but notice that this method and totalSize above are basically the same. Is there any way we can sensibly factor out the duplication?
tools/llvm-objcopy/Object.h
151	"= default" maybe?
154	I feel like "ELFObject" (and "BinaryObject") are more natural names. Could we use those instead, please?
159–161	Are these needed?

In D34480#818627, @jhenderson wrote:

It looks like you've accidentally lost program-headers.test again and regained hello-world.test.

I'm not familiar with the binary output format, and the objcopy documentation isn't exactly clear on the matter. Would you mind giving a quick overview, please?

Sure, I had to play with objtool to figure out what was going on because it isn't at all clear otherwise. Loosely the "binary" format should stack the memory image of each segment into a single file. The loader is responsible for knowing where these segments need to go and where they lie in the file. There is an exception however for NOBITS sections that occur at the end of a segment. Those are not included in the file. Additionally alignment needs to be respected so you have to bump things to alignment boundaries. The goal is that if you know the offset, filesize, and virtual address of each segment in the output then you could use nothing more than that information to load each segment into memory. All extraneous information is excluded.
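
A minimal sketch of the layout described above, not the code from this revision. `LoadSegment`, `alignTo`, and `layoutBinary` are hypothetical names. It assumes the segments are already in the order they should be stacked and that `FileSize` already excludes any trailing NOBITS data.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a loadable segment; not the Segment class from
// the patch.
struct LoadSegment {
  uint64_t Align;    // required alignment (0 or 1 means none)
  uint64_t FileSize; // bytes stored in the file, trailing NOBITS excluded
  uint64_t Offset;   // computed position in the binary output
};

// Round Value up to the next multiple of Align (a power of two).
inline uint64_t alignTo(uint64_t Value, uint64_t Align) {
  if (Align <= 1)
    return Value;
  assert((Align & (Align - 1)) == 0 && "alignment must be a power of two");
  return (Value + Align - 1) & ~(Align - 1);
}

// Stack the segments one after another, bumping each one to its alignment
// boundary. Returns the total size of the output file.
inline uint64_t layoutBinary(std::vector<LoadSegment> &Segments) {
  uint64_t Offset = 0;
  for (LoadSegment &Seg : Segments) {
    Offset = alignTo(Offset, Seg.Align);
    Seg.Offset = Offset;
    Offset += Seg.FileSize;
  }
  return Offset;
}
```

With two 0x1000-aligned segments of 4 bytes and 1 byte, this yields offsets 0 and 0x1000 and a total of 4097 bytes, which agrees with the SIZE check in basic-align-copy.test.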

tools/llvm-objcopy/Object.h
159–161	All 3 are used in ELFObject for file layout

jakehehrlich updated this revision to Diff 107968. Jul 24 2017, 3:57 PM

Thinking a little bit more about the design. Do we anticipate in the future supporting non-ELF inputs? Seeing inheritance used to control the output, made me consider an alternative design based more on templates: rather than have one base Object class which allows sharing the input code, and then derive from it to specify the output code, how about a single class that is templated on two arguments - an input reader class, and an output reader class. This would in theory allow for easy mixing and matching the different input and output types, without having to provide a different class definition for every single combination. A sketch outline is below:

Input Types:
Elf64LEInput --+   Object <Input, Output>   +-- Elf64LEOutput
Elf32LEInput --+-------------^       ^------+-- Elf32LEOutput
NonElfInput  --+                            +-- BinaryOutput

and the high-level program would look something like:

Object<Elf64LEInput, BinaryOutput> Obj;
Obj.Input.read();
Obj.dostuff();
Obj.Output.write();

I think that this design might also allow for converting between endianness and ELF size, although I don't know if there's any value in that. Overall, it might not be worth it, but it's something to consider.
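
The outline above could be fleshed out roughly as follows. This is only a sketch of the proposed design, not code from the patch: `ObjectData`, the bodies of `Elf64LEInput` and `BinaryOutput`, and `convertToBinary` are all hypothetical, and the "reader" just produces fixed bytes instead of parsing a file.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Intermediate representation shared by every reader and writer (hypothetical).
struct ObjectData {
  std::vector<uint8_t> Bytes;
};

// Hypothetical input policy: stands in for parsing a 64-bit little-endian ELF.
struct Elf64LEInput {
  void read(ObjectData &Data) const { Data.Bytes = {0xc3, 0xc3, 0xc3, 0xc3}; }
};

// Hypothetical output policy: emits the raw bytes with no headers.
struct BinaryOutput {
  std::vector<uint8_t> Written;
  void write(const ObjectData &Data) { Written = Data.Bytes; }
};

// One class template covers every input/output pairing, so no separate class
// definition is needed per combination.
template <class InputT, class OutputT> struct Object {
  InputT Input;
  OutputT Output;
  ObjectData Data;

  void read() { Input.read(Data); }
  void write() { Output.write(Data); }
};

// Usage: read the (pretend) ELF input, then write it out as a flat binary.
inline std::vector<uint8_t> convertToBinary() {
  Object<Elf64LEInput, BinaryOutput> Obj;
  Obj.read();
  Obj.write();
  assert(!Obj.Output.Written.empty() && "reader produced no data");
  return Obj.Output.Written;
}
```

Swapping `BinaryOutput` for another writer type changes the output format without touching the reader, which is the mixing and matching the comment describes.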

In D34480#819388, @jakehehrlich wrote:

Sure, I had to play with objtool to figure out what was going on because it isn't at all clear otherwise. Loosely the "binary" format should stack the memory image of each segment into a single file. The loader is responsible for knowing where these segments need to go and where they lie in the file. There is an exception however for NOBITS sections that occur at the end of a segment. Those are not included in the file. Additionally alignment needs to be respected so you have to bump things to alignment boundaries. The goal is that if you know the offset, filesize, and virtual address of each segment in the output then you could use nothing more than that information to load each segment into memory. All extraneous information is excluded.

Thanks, that makes sense. I take it that we can ignore nested segments here, because you shouldn't nest PT_LOAD segments?

tools/llvm-objcopy/Object.cpp
108	I'm slightly nervous that this might turn into a dangling reference at some point in the future - at the moment, it works fine, because ElfFile remains open until after writing happens, but in theory, I don't see any reason why ElfFile needs to stay open once reading has happened (it might help reduce the memory footprint, for example). However, closing it would make this and any other references directly to the data invalid.
359–368	Yeah, thinking about it a bit more, this could be turned into a common function called "forEachLoadableSegment" or similar, with a lambda or similar abstracting the uncommon bit away.
363	What about PT_GNU_RELRO segments? Does the GNU tool treat these the same as PT_LOAD, since they are still loadable segments?
tools/llvm-objcopy/Object.h
159–161	Oops, I put this comment on the wrong instance - they're repeated in BinaryObject below, where I think they are unnecessary.
185–187	"virtual" is unnecessary here.
tools/llvm-objcopy/llvm-objcopy.cpp
56	Maybe worth documenting the valid options in the help text?
63	Is it worth emitting an error here if the OutputFormat isn't an understood type? At the moment, I could do "-O Elf32LE" or "-O coff" or similar expecting to get 32-bit ELF/COFF out, but actually I get 64-bit ELF.
69	Unused variable. Should the error message report it?
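
One way to address the concern about unrecognized "-O" values is to parse the option into an enum and report failure for anything else. A minimal sketch follows; `OutputKind` and `parseOutputFormat` are hypothetical names, and the "elf" spelling is an assumption (only "binary" appears in this review).

```cpp
#include <cassert>
#include <string>

enum class OutputKind { ELF, Binary };

// Map a -O value to an output kind. Returns false for unknown values so the
// caller can emit an error instead of silently falling back to ELF.
inline bool parseOutputFormat(const std::string &Name, OutputKind &Kind) {
  if (Name == "elf") { // assumed spelling for the default format
    Kind = OutputKind::ELF;
    return true;
  }
  if (Name == "binary") {
    Kind = OutputKind::Binary;
    return true;
  }
  return false;
}
```

With this shape, "-O coff" or "-O Elf32LE" would be rejected up front rather than producing 64-bit ELF.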

In D34480#819822, @jhenderson wrote:

Thinking a little bit more about the design. Do we anticipate in the future supporting non-ELF inputs? Seeing inheritance used to control the output, made me consider an alternative design based more on templates: rather than have one base Object class which allows sharing the input code, and then derive from it to specify the output code, how about a single class that is templated on two arguments - an input reader class, and an output reader class. This would in theory allow for easy mixing and matching the different input and output types, without having to provide a different class definition for every single combination. A sketch outline is below:

I do anticipate multiple output formats long term. There are a few different low-level output formats, like "binary" and Intel HEX, that might be needed by somebody. Additionally, there are a number of operations that have been discussed by various people that change how output works enough that I would consider subclassing ELFObject rather than trying to incorporate the behavior into ELFObject. For instance, one idea is to "completely strip" a binary so that it has no section headers or other extraneous information. It would be just like "-O binary" except that the ELF header and program headers would be retained. Another format (that was discussed but I don't think will be included) is a tool for taking shared objects and stripping them down so much that a linker can link against them but there is no actual content in the output. So I expect multiple things to possibly fit into an "output format"-like thing even if the actual format output is still ELF.

The output format depends heavily on the input format. For instance, converting between ELF and MachO is, as I understand it, not a great idea. Converting between 32-bit and 64-bit isn't possible in general, so I don't think I want to support that. I've also been advised by many people that ELF, COFF, and MachO support should be totally separate and that it isn't easy to create a uniform interface to them. However, some things, like converting between endiannesses, make sense, I think. As long as this stays ELF-specific and everything keeps the same 32-bit/64-bit class, I like the idea. This would allow for converting between endiannesses and perhaps provide a better separation between how things are read in and how things are written out. I'll play around with the design today and see if I can't make something better than this.

Thanks, that makes sense. I take it that we can ignore nested segments here, because you shouldn't nest PT_LOAD segments?

Every time I make an assumption like "this will never happen" I generally find a case where I'm wrong. So I won't say "shouldn't", but I certainly don't see why it makes sense to have overlapping PT_LOAD segments. It would just be redundant from my perspective. PT_GNU_RELRO might come up at some point as a "loadable" segment that overlaps, but I don't think that should have any bearing on the use cases for binary output, since most of those use cases are for kernels and embedded systems.

tools/llvm-objcopy/Object.cpp
359–368	One thought I had was to put this loop in finalize, set the Offset of each segment as I went, and then use the final Offset to set a "TotalSize" field of BinaryObject. This would deduplicate the code but I didn't like using Segment::Offset in that way. My complaint is that Segment::Offset is supposed to represent "p_offset" and "p_offset" is just not relevant in the binary output format. Additionally it requires adding a field. I'm less concerned with that however. I'd be willing to dedup the code this way if you provided an argument for why Segment::Offset should be allowed to be used this way. E.g. the argument should claim that it is ok to divorce Segment::Offset from p_offset. Also currently there is a bit of a blemish in the code in that most "write" interfaces take a FileOutputBuffer but Segment::writeSegment takes a uint8_t pointer. Using Offset in this way would solve that issue. Still I'm concerned that that just isn't the right use of Segment::Offset. I could be convinced that it doesn't matter though. Perhaps there is a general pattern here that goes something like "loop over all loadable segments and calculate their offsets" but that seems very specific. Certainly I could write a method that took a function and did this but it seems too specific. It would be nice if this pattern kept coming up but it only seems to come up twice and I'd bet that it will only come up twice.

Changed how error handling works
Deduplicated looping over loadable segments while calculating their offset

Herald added a subscriber: mgorny. Jul 25 2017, 11:54 AM

In D34480#820413, @jakehehrlich wrote:

I do anticipate multiple output formats long term. There are a few different low-level output formats, like "binary" and Intel HEX, that might be needed by somebody. Additionally, there are a number of operations that have been discussed by various people that change how output works enough that I would consider subclassing ELFObject rather than trying to incorporate the behavior into ELFObject. For instance, one idea is to "completely strip" a binary so that it has no section headers or other extraneous information. It would be just like "-O binary" except that the ELF header and program headers would be retained. Another format (that was discussed but I don't think will be included) is a tool for taking shared objects and stripping them down so much that a linker can link against them but there is no actual content in the output. So I expect multiple things to possibly fit into an "output format"-like thing even if the actual format output is still ELF.

The output format depends heavily on the input format. For instance, converting between ELF and MachO is, as I understand it, not a great idea. Converting between 32-bit and 64-bit isn't possible in general, so I don't think I want to support that. I've also been advised by many people that ELF, COFF, and MachO support should be totally separate and that it isn't easy to create a uniform interface to them. However, some things, like converting between endiannesses, make sense, I think. As long as this stays ELF-specific and everything keeps the same 32-bit/64-bit class, I like the idea. This would allow for converting between endiannesses and perhaps provide a better separation between how things are read in and how things are written out. I'll play around with the design today and see if I can't make something better than this.

Ok, sounds good to me. I can certainly understand that switching between formats might not be the easiest (I have managed it in the past between COFF and ELF to some extent, but it's far from trivial and doesn't seem like a real use case). I'm not sure I understand why going between 32 and 64 bits isn't possible in general personally, but regardless, if there's some value in switching endianness at least, it's certainly worth investigating as a design option.

Thanks, that makes sense. I take it that we can ignore nested segments here, because you shouldn't nest PT_LOAD segments?

Every time I make an assumption like "this will never happen" I generally find a case where I'm wrong. So I won't say "shouldn't", but I certainly don't see why it makes sense to have overlapping PT_LOAD segments. It would just be redundant from my perspective. PT_GNU_RELRO might come up at some point as a "loadable" segment that overlaps, but I don't think that should have any bearing on the use cases for binary output, since most of those use cases are for kernels and embedded systems.

Very true! I think ultimately we can ignore nested segments entirely, since the parent segment will result in them being written. The only possible case we need to be wary of in the future is if segments can overlap - I'm not aware of any use cases for this, but if there were, and some of it involved PT_LOAD (or other loadable segment type), we'd need to do something about it. Not sure that's worth worrying about at this point though.

tools/llvm-objcopy/Object.cpp
359–368	Although p_offset itself doesn't make a huge amount of sense in a BinaryObject, I don't think it's unreasonable to treat Offset as the location of the data in the file. After all, what else does it represent? The p_offset field in a regular ELF represents exactly that: the offset of the segment in the file. Also, now looking at your loop changes, I'm beginning to think that it doesn't look particularly nice after all, so if you can remove the looping twice by moving it into finalize, all the better, but I'm happy with whichever you prefer. If you're concerned about the extra field, perhaps you could simply look at the last Segment in the vector (assuming it's sorted by offset, and that there are no nested segments), and add its size to the offset to get the total size? Alternatively, maybe return the total size from finalize in all cases?
363	Oops, for some reason I was under the impression that PT_GNU_RELRO was a separate non-nested segment. Never mind!
383	CompareSegments, not Sections.

Removed the function that looped through loadable segments while computing offsets. I did this using the change that James and I discussed.
Fixed a naming issue

Ok, sounds good to me. I can certainly understand that switching between formats might not be the easiest (I have managed it in the past between COFF and ELF to some extent, but it's far from trivial and doesn't seem like a real use case). I'm not sure I understand why going between 32 and 64 bits isn't possible in general personally, but regardless, if there's some value in switching endianness at least, it's certainly worth investigating as a design option.

The only reason I say switching between 32 and 64 bit isn't possible in general is that going from 64 to 32 might demand copying an address that doesn't fit in 32 bits. Also, 64-bit ELF files will tend to have 64-bit code, which might not be usable on a 32-bit system. Also, does the endianness switch make sense? I'm not aware of an architecture that is sometimes little endian and sometimes big endian... then again, it most likely exists given the oddity of these things. I'm not sure I want to make a big architectural shift for that reason alone, however. Still, I think there might be other benefits to your separation of input and output, so I'm still looking at it.

tools/llvm-objcopy/Object.cpp
108	I was slightly worried about this as well because I believe that all of the StringRefs for section names and symbol names have this same issue. Additionally the Section type has the same issue as well. So I'd have to make sure that all of those strings were copied as well. I think LLVM has a string saver for this reason. I'm fine demanding that the ELFFile stay open but I understand the concern. Perhaps I should pass a unique pointer containing the ELFFile to the Object and have it take ownership of the ELFFile? I think that would avoid using defensive copies and using a string saver as well as keep the memory footprint lower while still ensuring that the needed data stays open.
363	It ignores them from my testing. Also I don't think PT_GNU_RELRO will be used in the applications of "-O binary" since PT_GNU_RELRO requires virtual memory support that most applications won't have when they're loaded (e.g. Kernels and embedded systems)

In D34480#822009, @jakehehrlich wrote:

The only reason I say switching between 32 and 64 bit isn't possible in general is that going from 64 to 32 might demand copying an address that doesn't fit in 32 bits. Also, 64-bit ELF files will tend to have 64-bit code, which might not be usable on a 32-bit system. Also, does the endianness switch make sense? I'm not aware of an architecture that is sometimes little endian and sometimes big endian... then again, it most likely exists given the oddity of these things. I'm not sure I want to make a big architectural shift for that reason alone, however. Still, I think there might be other benefits to your separation of input and output, so I'm still looking at it.

Ok, sounds reasonable to me.

Did you mean to rename program-headers.test?

tools/llvm-objcopy/Object.cpp
108	"Perhaps I should pass a unique pointer containing the ELFFile to the Object and have it take ownership of the ELFFile? I think that would avoid using defensive copies and using a string saver as well as keep the memory footprint lower while still ensuring that the needed data stays open." I like this idea, but it should probably be done as part of a different change. It reduces the risk of future (and somewhat subtle) mistakes, without being overly defensive in coding.
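
The ownership idea discussed here can be sketched without LLVM types. In the sketch below, `ByteView` plays the role of llvm::ArrayRef<uint8_t>, and `InputFile` and `OwningObject` are hypothetical stand-ins for the ELFFile and the Object. It is not the follow-up change itself, only an illustration of why owning the file keeps the views valid.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Non-owning view of bytes, playing the role of llvm::ArrayRef<uint8_t>.
struct ByteView {
  const uint8_t *Data;
  size_t Size;
};

// Hypothetical stand-in for the opened input file.
struct InputFile {
  std::vector<uint8_t> Bytes;
};

class OwningObject {
  std::unique_ptr<InputFile> File; // keeps the viewed bytes alive
  std::vector<ByteView> SegmentViews;

public:
  explicit OwningObject(std::unique_ptr<InputFile> F) : File(std::move(F)) {}

  // Record a view of [Offset, Offset + Size) within the owned file.
  void addSegment(size_t Offset, size_t Size) {
    assert(Offset <= File->Bytes.size() &&
           Size <= File->Bytes.size() - Offset && "view out of bounds");
    SegmentViews.push_back({File->Bytes.data() + Offset, Size});
  }

  const ByteView &segment(size_t I) const { return SegmentViews[I]; }
};
```

Because the views can only be reached through the object that owns the file, they cannot outlive the data they point into, and no defensive copies are needed.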

jakehehrlich marked 20 inline comments as done. Jul 27 2017, 2:22 PM

fixed file renaming issue (I did not mean to do that)

LGTM, with some fixes for nits in the comments.

tools/llvm-objcopy/Object.cpp
47	Either "... maintain segments' interstitial data..." (with apostrophe) or "... maintain interstitial data and contents of segments..."
48	"this" -> "This"
375–380	Full stop.

This revision is now accepted and ready to land. Jul 28 2017, 1:17 AM

Fixed nits
Brought this in line with the D33964 changes, which were made primarily for Windows compatibility.

jakehehrlich updated this revision to Diff 109026. Jul 31 2017, 4:22 PM

Closed by commit rL309658: [llvm][llvm-objcopy] Added support for outputting to binary in llvm-objcopy (authored by phosek). Jul 31 2017, 10:21 PM

This revision was automatically updated to reflect the committed changes.

On release builds, Object<...>::Object(ELFFile<...>) was giving an undefined symbol error. This appears to be because the explicit instantiations of ELFObject and BinaryObject in Object.cpp did not require that constructor of Object to be instantiated. I added explicit instantiations for Object to ensure that the constructor is defined in Object.cpp.
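
A single-file sketch of the pattern described above. `Base` and `Derived` are hypothetical stand-ins for Object and ELFObject, and this does not reproduce the release-build link failure; it only shows what explicitly instantiating the base class alongside the derived class looks like.

```cpp
#include <cassert>

// Declared in the header: the constructor is only declared here.
template <class T> class Base {
public:
  explicit Base(int V);
  int Value;
};

// Derived class whose inline constructor calls the base constructor.
template <class T> class Derived : public Base<T> {
public:
  explicit Derived(int V) : Base<T>(V) {}
};

// Defined in the .cpp file: out-of-line definition of the base constructor.
template <class T> Base<T>::Base(int V) : Value(V) {}

// Explicit instantiation definitions. Instantiating Base<int> explicitly
// guarantees that Base<int>::Base(int) is emitted in this translation unit.
template class Base<int>;
template class Derived<int>;
```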

On 32-bit builds, size_t is 32 bits wide, which causes a casting warning when converting the 64-bit p_filesz field to size_t.
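
One way to make such a narrowing explicit is a checked conversion. This is a sketch under assumptions: `toSizeT` is a hypothetical helper, and the source does not say how the warning was actually resolved.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// Convert a 64-bit file size to size_t with an explicit cast.
// Returns false if the value cannot be represented, which is only possible
// when size_t is narrower than 64 bits.
inline bool toSizeT(uint64_t FileSize, size_t &Out) {
  if (FileSize > std::numeric_limits<size_t>::max())
    return false;
  Out = static_cast<size_t>(FileSize);
  return true;
}
```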

basic-align-copy.test was failing on big-endian machines because the check for the .data contents depended on byte order. I fixed this by giving the .data section two identical bytes, so the dumped value is the same no matter how the bytes are ordered.
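
The byte-order dependence can be illustrated with a small helper. `wordFromBytes` is a hypothetical function that mimics how a two-byte hex dump combines a pair of bytes on a host of the given endianness.

```cpp
#include <cassert>
#include <cstdint>

// Interpret two consecutive bytes as one 16-bit word in the given byte order.
inline uint16_t wordFromBytes(uint8_t First, uint8_t Second,
                              bool LittleEndian) {
  if (LittleEndian)
    return static_cast<uint16_t>(First | (Second << 8));
  return static_cast<uint16_t>((First << 8) | Second);
}
```

A lone 0x32 byte followed by zero padding reads as 0x0032 on a little-endian host but 0x3200 on a big-endian one, while the pair 0x32 0x32 reads as 0x3232 on both.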

jakehehrlich updated this revision to Diff 109794. Aug 4 2017, 11:51 AM

jhenderson mentioned this in D41619: [llvm-objcopy] Use physical instead of virtual address when aligning and placing sections in binary. Jan 9 2018, 3:25 AM

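Revision Contents

Path                                              Size
test/tools/llvm-objcopy/basic-align-copy.test     37 lines
test/tools/llvm-objcopy/basic-binary-copy.test    25 lines
test/tools/llvm-objcopy/program-header.test       (moved from program-headers.test)
test/tools/llvm-objcopy/program-headers.test      (moved to program-header.test)
tools/llvm-objcopy/Object.h                       58 lines
tools/llvm-objcopy/Object.cpp                     206 lines
tools/llvm-objcopy/llvm-objcopy.cpp               15 lines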

Diff 107968

test/tools/llvm-objcopy/basic-align-copy.test

This file was added.

# RUN: yaml2obj %s -o %t
# RUN: llvm-objcopy -O binary %t %t2
# RUN: od -t x2 %t2 | FileCheck %s
# RUN: wc -c < %t2 | FileCheck %s --check-prefix=SIZE

!ELF
FileHeader:
  Class: ELFCLASS64
  Data: ELFDATA2LSB
  Type: ET_EXEC
  Machine: EM_X86_64
Sections:
  - Name: .text
    Type: SHT_PROGBITS
    Flags: [ SHF_ALLOC, SHF_EXECINSTR ]
    AddressAlign: 0x0000000000001000
    Content: "c3c3c3c3"
  - Name: .data
    Type: SHT_PROGBITS
    Flags: [ SHF_ALLOC, SHF_EXECINSTR ]
    AddressAlign: 0x0000000000001000
    Content: "32"
ProgramHeaders:
  - Type: PT_LOAD
    Flags: [ PF_X, PF_R ]
    Sections:
      - Section: .text
  - Type: PT_LOAD
    Flags: [ PF_R ]
    Sections:
      - Section: .data

# CHECK: 0000000 c3c3 c3c3 0000 0000 0000 0000 0000 0000
# CHECK-NEXT: 0000020 0000 0000 0000 0000 0000 0000 0000 0000
# CHECK-NEXT: *
# CHECK-NEXT: 0010000 0032
# SIZE: 4097

test/tools/llvm-objcopy/basic-binary-copy.test

This file was added.

# RUN: yaml2obj %s -o %t
# RUN: llvm-objcopy -O binary %t %t2
# RUN: od -t x2 -v %t2 | FileCheck %s
# RUN: wc -c < %t2 | FileCheck %s --check-prefix=SIZE

!ELF
FileHeader:
  Class: ELFCLASS64
  Data: ELFDATA2LSB
  Type: ET_EXEC
  Machine: EM_X86_64
Sections:
  - Name: .text
    Type: SHT_PROGBITS
    Flags: [ SHF_ALLOC, SHF_EXECINSTR ]
    AddressAlign: 0x0000000000001000
    Content: "c3c3c3c3"
ProgramHeaders:
  - Type: PT_LOAD
    Flags: [ PF_X, PF_R ]
    Sections:
      - Section: .text

# CHECK: 0000000 c3c3 c3c3
# SIZE: 4

test/tools/llvm-objcopy/program-header.test

This file was moved from test/tools/llvm-objcopy/program-headers.test.

The contents of this file were not changed.

test/tools/llvm-objcopy/program-headers.test

This file was moved to test/tools/llvm-objcopy/program-header.test.

tools/llvm-objcopy/Object.h

@@ 52 lines hidden @@ bool operator()(const SectionBase *Lhs, const SectionBase *Rhs) const {
       // address of the actully stored section.
       if (Lhs->Addr == Rhs->Addr)
         return Lhs < Rhs;
       return Lhs->Addr < Rhs->Addr;
     }
   };
 
   std::set<const SectionBase *, SectionCompare> Sections;
+  llvm::ArrayRef<uint8_t> Contents;
 
 public:
   uint64_t Align;
   uint64_t FileSize;
   uint32_t Flags;
   uint32_t Index;
   uint64_t MemSize;
   uint64_t Offset;
   uint64_t PAddr;
   uint64_t Type;
   uint64_t VAddr;
 
+  Segment(llvm::ArrayRef<uint8_t> Data) : Contents(Data) {}
   void finalize();
   const SectionBase *firstSection() const {
     if (!Sections.empty())
       return *Sections.begin();
     return nullptr;
   }
   void addSection(const SectionBase *sec) { Sections.insert(sec); }
   template <class ELFT> void writeHeader(llvm::FileOutputBuffer &Out) const;
+  void writeSegment(uint8_t *Buf) const;
 };
 
 class Section : public SectionBase {
 private:
   llvm::ArrayRef<uint8_t> Contents;
 
 public:
   Section(llvm::ArrayRef<uint8_t> Data) : Contents(Data) {}
@@ 23 lines hidden @@
 private:
   typedef std::unique_ptr<SectionBase> SecPtr;
   typedef std::unique_ptr<Segment> SegPtr;
 
   typedef typename ELFT::Shdr Elf_Shdr;
   typedef typename ELFT::Ehdr Elf_Ehdr;
   typedef typename ELFT::Phdr Elf_Phdr;
 
-  StringTableSection *SectionNames;
-  std::vector<SecPtr> Sections;
-  std::vector<SegPtr> Segments;
-
-  void sortSections();
-  void assignOffsets();
   SecPtr makeSection(const llvm::object::ELFFile<ELFT> &ElfFile,
                      const Elf_Shdr &Shdr);
   void readProgramHeaders(const llvm::object::ELFFile<ELFT> &ElfFile);
   void readSectionHeaders(const llvm::object::ELFFile<ELFT> &ElfFile);
 
+protected:
+  StringTableSection *SectionNames;
+  std::vector<SecPtr> Sections;
+  std::vector<SegPtr> Segments;
+
   void writeHeader(llvm::FileOutputBuffer &Out) const;
   void writeProgramHeaders(llvm::FileOutputBuffer &Out) const;
   void writeSectionData(llvm::FileOutputBuffer &Out) const;
   void writeSectionHeaders(llvm::FileOutputBuffer &Out) const;
 
 public:
   uint8_t Ident[16];
   uint64_t Entry;
   uint64_t SHOffset;
   uint32_t Type;
   uint32_t Machine;
   uint32_t Version;
   uint32_t Flags;
 
   Object(const llvm::object::ELFObjectFile<ELFT> &Obj);
-  size_t totalSize() const;
-  void finalize();
-  void write(llvm::FileOutputBuffer &Out);
+  virtual size_t totalSize() const = 0;
+  virtual void finalize() = 0;
+  virtual void write(llvm::FileOutputBuffer &Out) = 0;
+  virtual ~Object() = default;
 };
 
+template <class ELFT> class ELFObject : public Object<ELFT> {
+private:
+  typedef std::unique_ptr<SectionBase> SecPtr;
+  typedef std::unique_ptr<Segment> SegPtr;
+
+  typedef typename ELFT::Shdr Elf_Shdr;
+  typedef typename ELFT::Ehdr Elf_Ehdr;
+  typedef typename ELFT::Phdr Elf_Phdr;
+
+  void sortSections();
+  void assignOffsets();
+
+public:
+  ELFObject(const llvm::object::ELFObjectFile<ELFT> &Obj) : Object<ELFT>(Obj) {}
+  void finalize() override;
+  size_t totalSize() const override;
+  void write(llvm::FileOutputBuffer &Out) override;
+};
+
+template <class ELFT> class BinaryObject : public Object<ELFT> {
+private:
+  typedef std::unique_ptr<SectionBase> SecPtr;
+  typedef std::unique_ptr<Segment> SegPtr;
+
+  typedef typename ELFT::Shdr Elf_Shdr;
+  typedef typename ELFT::Ehdr Elf_Ehdr;
+  typedef typename ELFT::Phdr Elf_Phdr;
+
+public:
+  BinaryObject(const llvm::object::ELFObjectFile<ELFT> &Obj)
+      : Object<ELFT>(Obj) {}
+  virtual void finalize() override;
+  virtual size_t totalSize() const override;
+  virtual void write(llvm::FileOutputBuffer &Out) override;
+
+};
 #endif

tools/llvm-objcopy/Object.cpp

Show All 36 Lines	if (FirstSec) {
// this we need to compute the new offset based on how large this gap was		// this we need to compute the new offset based on how large this gap was
// in the source file. Section layout should have already ensured that this		// in the source file. Section layout should have already ensured that this
// space is not used for something else.		// space is not used for something else.
uint64_t OriginalOffset = Offset;		uint64_t OriginalOffset = Offset;
Offset = FirstSec->Offset - (FirstSec->OriginalOffset - OriginalOffset);		Offset = FirstSec->Offset - (FirstSec->OriginalOffset - OriginalOffset);
}		}
}		}

		void Segment::writeSegment(uint8_t *Buf) const {
		// We want to maintain segments interstitial data and contents exactly.
		// this lets us just copy segments directly.
		jhendersonUnsubmitted Not Done Reply Inline Actions Either "... maintain segments' interstitial data..." (with apostrophe) or "... maintain interstitial data and contents of segments..." jhenderson: Either "... maintain segments' interstitial data..." (with apostrophe) or "... maintain…
		std::copy(std::begin(Contents), std::end(Contents), Buf);
		jhendersonUnsubmitted Not Done Reply Inline Actions "this" -> "This" jhenderson: "this" -> "This"
		}

void SectionBase::finalize() {}		void SectionBase::finalize() {}

template <class ELFT>		template <class ELFT>
void SectionBase::writeHeader(FileOutputBuffer &Out) const {		void SectionBase::writeHeader(FileOutputBuffer &Out) const {
uint8_t *Buf = Out.getBufferStart();		uint8_t *Buf = Out.getBufferStart();
Buf += HeaderOffset;		Buf += HeaderOffset;
typename ELFT::Shdr &Shdr = reinterpret_cast<typename ELFT::Shdr >(Buf);		typename ELFT::Shdr &Shdr = reinterpret_cast<typename ELFT::Shdr >(Buf);
Shdr.sh_name = NameIndex;		Shdr.sh_name = NameIndex;
▲ Show 20 Lines • Show All 41 Lines • ▼ Show 20 Lines	static bool sectionWithinSegment(const SectionBase &Section,
return Segment.Offset <= Section.OriginalOffset &&		return Segment.Offset <= Section.OriginalOffset &&
Segment.Offset + Segment.FileSize >= Section.OriginalOffset + SecSize;		Segment.Offset + Segment.FileSize >= Section.OriginalOffset + SecSize;
}		}

template <class ELFT>		template <class ELFT>
void Object<ELFT>::readProgramHeaders(const ELFFile<ELFT> &ElfFile) {		void Object<ELFT>::readProgramHeaders(const ELFFile<ELFT> &ElfFile) {
uint32_t Index = 0;		uint32_t Index = 0;
for (const auto &Phdr : unwrapOrError(ElfFile.program_headers())) {		for (const auto &Phdr : unwrapOrError(ElfFile.program_headers())) {
Segments.emplace_back(make_unique<Segment>());		ArrayRef<uint8_t> Data{ElfFile.base() + Phdr.p_offset, Phdr.p_filesz};
		jhenderson (Done): I'm slightly nervous that this might turn into a dangling reference at some point in the future. At the moment it works fine, because ElfFile remains open until after writing happens, but in theory I don't see any reason why ElfFile needs to stay open once reading has happened (closing it might help reduce the memory footprint, for example). However, closing it would make this and any other references directly to the data invalid.
		jakehehrlich (Author, Done): I was slightly worried about this as well, because I believe that all of the StringRefs for section names and symbol names have this same issue. The Section type has the same issue too, so I'd have to make sure that all of those strings were copied as well. I think LLVM has a string saver for this reason. I'm fine demanding that the ELFFile stay open, but I understand the concern. Perhaps I should pass a unique pointer containing the ELFFile to the Object and have it take ownership of the ELFFile? I think that would avoid using defensive copies and using a string saver as well as keep the memory footprint lower while still ensuring that the needed data stays open.
		jhenderson (Done):
		> Perhaps I should pass a unique pointer containing the ELFFile to the Object and have it take ownership of the ELFFile? I think that would avoid using defensive copies and using a string saver as well as keep the memory footprint lower while still ensuring that the needed data stays open.
		I like this idea, but it should probably be done as part of a different change. It reduces any risk of future (and somewhat subtle) mistakes, without being overly defensive in coding.
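The ownership idea in this thread could look roughly like the sketch below. It is not code from the patch: BackingFile, ByteView and OwningObject are hypothetical stand-ins for ELFFile, ArrayRef<uint8_t> and Object, used only to show that views into the file stay valid for as long as the object owns the file.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for the opened ELF file and its underlying buffer.
struct BackingFile {
  std::vector<uint8_t> Bytes;
};

// Hypothetical stand-in for ArrayRef<uint8_t>: a non-owning view.
struct ByteView {
  const uint8_t *Data;
  size_t Size;
};

// The object takes ownership of the file, so every view handed out below
// remains valid for the lifetime of the object.
class OwningObject {
  std::unique_ptr<BackingFile> File;
  std::vector<ByteView> SegmentContents;

public:
  explicit OwningObject(std::unique_ptr<BackingFile> F) : File(std::move(F)) {}

  // Records a view of [Offset, Offset + Size) without copying any bytes.
  // The caller is assumed to pass a range that lies inside the file.
  void addSegment(size_t Offset, size_t Size) {
    SegmentContents.push_back({File->Bytes.data() + Offset, Size});
  }

  const ByteView &segment(size_t I) const { return SegmentContents[I]; }
};
```

Because the file is held through a unique_ptr, moving it into the object does not relocate the bytes, so no defensive copies or string saver are needed in this sketch.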
		Segments.emplace_back(make_unique<Segment>(Data));
Segment &Seg = *Segments.back();		Segment &Seg = *Segments.back();
Seg.Type = Phdr.p_type;		Seg.Type = Phdr.p_type;
Seg.Flags = Phdr.p_flags;		Seg.Flags = Phdr.p_flags;
Seg.Offset = Phdr.p_offset;		Seg.Offset = Phdr.p_offset;
Seg.VAddr = Phdr.p_vaddr;		Seg.VAddr = Phdr.p_vaddr;
Seg.PAddr = Phdr.p_paddr;		Seg.PAddr = Phdr.p_paddr;
Seg.FileSize = Phdr.p_filesz;		Seg.FileSize = Phdr.p_filesz;
Seg.MemSize = Phdr.p_memsz;		Seg.MemSize = Phdr.p_memsz;
[19 lines hidden]	Object<ELFT>::makeSection(const llvm::object::ELFFile<ELFT> &ElfFile,
switch (Shdr.sh_type) {		switch (Shdr.sh_type) {
case SHT_STRTAB:		case SHT_STRTAB:
return make_unique<StringTableSection>();		return make_unique<StringTableSection>();
case SHT_NOBITS:		case SHT_NOBITS:
return make_unique<Section>(Data);		return make_unique<Section>(Data);
default:		default:
Data = unwrapOrError(ElfFile.getSectionContents(&Shdr));		Data = unwrapOrError(ElfFile.getSectionContents(&Shdr));
return make_unique<Section>(Data);		return make_unique<Section>(Data);
};		}
}		}

template <class ELFT>		template <class ELFT>
void Object<ELFT>::readSectionHeaders(const ELFFile<ELFT> &ElfFile) {		void Object<ELFT>::readSectionHeaders(const ELFFile<ELFT> &ElfFile) {
uint32_t Index = 0;		uint32_t Index = 0;
for (const auto &Shdr : unwrapOrError(ElfFile.sections())) {		for (const auto &Shdr : unwrapOrError(ElfFile.sections())) {
if (Index == 0) {		if (Index == 0) {
++Index;		++Index;
[11 lines hidden]	for (const auto &Shdr : unwrapOrError(ElfFile.sections())) {
Sec->Info = Shdr.sh_info;		Sec->Info = Shdr.sh_info;
Sec->Align = Shdr.sh_addralign;		Sec->Align = Shdr.sh_addralign;
Sec->EntrySize = Shdr.sh_entsize;		Sec->EntrySize = Shdr.sh_entsize;
Sec->Index = Index++;		Sec->Index = Index++;
Sections.push_back(std::move(Sec));		Sections.push_back(std::move(Sec));
}		}
}		}

template <class ELFT> size_t Object<ELFT>::totalSize() const {
// We already have the section header offset so we can calculate the total
// size by just adding up the size of each section header.
return SHOffset + Sections.size() * sizeof(Elf_Shdr) + sizeof(Elf_Shdr);
}

template <class ELFT> Object<ELFT>::Object(const ELFObjectFile<ELFT> &Obj) {		template <class ELFT> Object<ELFT>::Object(const ELFObjectFile<ELFT> &Obj) {
const auto &ElfFile = *Obj.getELFFile();		const auto &ElfFile = *Obj.getELFFile();
const auto &Ehdr = *ElfFile.getHeader();		const auto &Ehdr = *ElfFile.getHeader();

std::copy(Ehdr.e_ident, Ehdr.e_ident + 16, Ident);		std::copy(Ehdr.e_ident, Ehdr.e_ident + 16, Ident);
Type = Ehdr.e_type;		Type = Ehdr.e_type;
Machine = Ehdr.e_machine;		Machine = Ehdr.e_machine;
Version = Ehdr.e_version;		Version = Ehdr.e_version;
Entry = Ehdr.e_entry;		Entry = Ehdr.e_entry;
Flags = Ehdr.e_flags;		Flags = Ehdr.e_flags;

readSectionHeaders(ElfFile);		readSectionHeaders(ElfFile);
readProgramHeaders(ElfFile);		readProgramHeaders(ElfFile);

SectionNames =		SectionNames =
dyn_cast<StringTableSection>(Sections[Ehdr.e_shstrndx - 1].get());		dyn_cast<StringTableSection>(Sections[Ehdr.e_shstrndx - 1].get());
}		}

template <class ELFT> void Object<ELFT>::sortSections() {		template <class ELFT>
		void Object<ELFT>::writeHeader(FileOutputBuffer &Out) const {
		uint8_t *Buf = Out.getBufferStart();
		Elf_Ehdr &Ehdr = reinterpret_cast<Elf_Ehdr >(Buf);
		std::copy(Ident, Ident + 16, Ehdr.e_ident);
		Ehdr.e_type = Type;
		Ehdr.e_machine = Machine;
		Ehdr.e_version = Version;
		Ehdr.e_entry = Entry;
		Ehdr.e_phoff = sizeof(Elf_Ehdr);
		Ehdr.e_shoff = SHOffset;
		Ehdr.e_flags = Flags;
		Ehdr.e_ehsize = sizeof(Elf_Ehdr);
		Ehdr.e_phentsize = sizeof(Elf_Phdr);
		Ehdr.e_phnum = Segments.size();
		Ehdr.e_shentsize = sizeof(Elf_Shdr);
		Ehdr.e_shnum = Sections.size() + 1;
		Ehdr.e_shstrndx = SectionNames->Index;
		}

		template <class ELFT>
		void Object<ELFT>::writeProgramHeaders(FileOutputBuffer &Out) const {
		for (auto &Phdr : Segments)
		Phdr->template writeHeader<ELFT>(Out);
		}

		template <class ELFT>
		void Object<ELFT>::writeSectionHeaders(FileOutputBuffer &Out) const {
		uint8_t *Buf = Out.getBufferStart() + SHOffset;
		// This reference serves to write the dummy section header at the begining
		// of the file.
		Elf_Shdr &Shdr = reinterpret_cast<Elf_Shdr >(Buf);
		Shdr.sh_name = 0;
		Shdr.sh_type = SHT_NULL;
		Shdr.sh_flags = 0;
		Shdr.sh_addr = 0;
		Shdr.sh_offset = 0;
		Shdr.sh_size = 0;
		Shdr.sh_link = 0;
		Shdr.sh_info = 0;
		Shdr.sh_addralign = 0;
		Shdr.sh_entsize = 0;

		for (auto &Section : Sections)
		Section->template writeHeader<ELFT>(Out);
		}

		template <class ELFT>
		void Object<ELFT>::writeSectionData(FileOutputBuffer &Out) const {
		for (auto &Section : Sections)
		Section->writeSection(Out);
		}

		template <class ELFT> void ELFObject<ELFT>::sortSections() {
// Put all sections in offset order. Maintain the ordering as closely as		// Put all sections in offset order. Maintain the ordering as closely as
// possible while meeting that demand however.		// possible while meeting that demand however.
auto CompareSections = [](const SecPtr &A, const SecPtr &B) {		auto CompareSections = [](const SecPtr &A, const SecPtr &B) {
return A->OriginalOffset < B->OriginalOffset;		return A->OriginalOffset < B->OriginalOffset;
};		};
std::stable_sort(std::begin(Sections), std::end(Sections), CompareSections);		std::stable_sort(std::begin(this->Sections), std::end(this->Sections),
		CompareSections);
}		}

template <class ELFT> void Object<ELFT>::assignOffsets() {		template <class ELFT> void ELFObject<ELFT>::assignOffsets() {
// Decide file offsets and indexes.		// Decide file offsets and indexes.
size_t PhdrSize = Segments.size() * sizeof(Elf_Phdr);		size_t PhdrSize = this->Segments.size() * sizeof(Elf_Phdr);
// We can put section data after the ELF header and the program headers.		// We can put section data after the ELF header and the program headers.
uint64_t Offset = sizeof(Elf_Ehdr) + PhdrSize;		uint64_t Offset = sizeof(Elf_Ehdr) + PhdrSize;
uint64_t Index = 1;		uint64_t Index = 1;
for (auto &Section : Sections) {		for (auto &Section : this->Sections) {
// The segment can have a different alignment than the section. In the case		// The segment can have a different alignment than the section. In the case
// that there is a parent segment then as long as we satisfy the alignment		// that there is a parent segment then as long as we satisfy the alignment
// of the segment it should follow that that the section is aligned.		// of the segment it should follow that that the section is aligned.
if (Section->ParentSegment) {		if (Section->ParentSegment) {
auto FirstInSeg = Section->ParentSegment->firstSection();		auto FirstInSeg = Section->ParentSegment->firstSection();
if (FirstInSeg == Section.get()) {		if (FirstInSeg == Section.get()) {
Offset = alignTo(Offset, Section->ParentSegment->Align);		Offset = alignTo(Offset, Section->ParentSegment->Align);
// There can be gaps at the start of a segment before the first section.		// There can be gaps at the start of a segment before the first section.
[30 lines hidden]	if (Section->Type != SHT_NOBITS)
Offset += Section->Size;		Offset += Section->Size;
}		}
// 'offset' should now be just after all the section data so we should set the		// 'offset' should now be just after all the section data so we should set the
// section header table offset to be exactly here. This spot might not be		// section header table offset to be exactly here. This spot might not be
// aligned properly however so we should align it as needed. For 32-bit ELF		// aligned properly however so we should align it as needed. For 32-bit ELF
// this needs to be 4-byte aligned and on 64-bit it needs to be 8-byte aligned		// this needs to be 4-byte aligned and on 64-bit it needs to be 8-byte aligned
// so the size of ELFT::Addr is used to ensure this.		// so the size of ELFT::Addr is used to ensure this.
Offset = alignTo(Offset, sizeof(typename ELFT::Addr));		Offset = alignTo(Offset, sizeof(typename ELFT::Addr));
SHOffset = Offset;		this->SHOffset = Offset;
}		}

template <class ELFT> void Object<ELFT>::finalize() {		template <class ELFT> size_t ELFObject<ELFT>::totalSize() const {
for (auto &Section : Sections)		// We already have the section header offset so we can calculate the total
SectionNames->addString(Section->Name);		// size by just adding up the size of each section header.
		return this->SHOffset + this->Sections.size() * sizeof(Elf_Shdr) +
		sizeof(Elf_Shdr);
		}

		template <class ELFT> void ELFObject<ELFT>::write(FileOutputBuffer &Out) {
		this->writeHeader(Out);
		this->writeProgramHeaders(Out);
		this->writeSectionData(Out);
		this->writeSectionHeaders(Out);
		}

		template <class ELFT> void ELFObject<ELFT>::finalize() {
		for (const auto &Section : this->Sections) {
		this->SectionNames->addString(Section->Name);
		}

sortSections();		sortSections();
assignOffsets();		assignOffsets();

// Finalize SectionNames first so that we can assign name indexes.		// Finalize SectionNames first so that we can assign name indexes.
SectionNames->finalize();		this->SectionNames->finalize();
// Finally now that all offsets and indexes have been set we can finalize any		// Finally now that all offsets and indexes have been set we can finalize any
// remaining issues.		// remaining issues.
uint64_t Offset = SHOffset + sizeof(Elf_Shdr);		uint64_t Offset = this->SHOffset + sizeof(Elf_Shdr);
for (auto &Section : Sections) {		for (auto &Section : this->Sections) {
Section->HeaderOffset = Offset;		Section->HeaderOffset = Offset;
Offset += sizeof(Elf_Shdr);		Offset += sizeof(Elf_Shdr);
Section->NameIndex = SectionNames->findIndex(Section->Name);		Section->NameIndex = this->SectionNames->findIndex(Section->Name);
Section->finalize();		Section->finalize();
}		}

for (auto &Segment : Segments)		for (auto &Segment : this->Segments)
Segment->finalize();		Segment->finalize();
}		}

template <class ELFT>		template <class ELFT> size_t BinaryObject<ELFT>::totalSize() const {
void Object<ELFT>::writeHeader(FileOutputBuffer &Out) const {		size_t TotalSize = 0;
uint8_t *Buf = Out.getBufferStart();		for (const auto &Segment : this->Segments) {
Elf_Ehdr &Ehdr = reinterpret_cast<Elf_Ehdr >(Buf);		if (Segment->Type == PT_LOAD) {
std::copy(Ident, Ident + 16, Ehdr.e_ident);		TotalSize = alignTo(TotalSize, Segment->Align);
Ehdr.e_type = Type;		TotalSize += Segment->FileSize;
Ehdr.e_machine = Machine;
Ehdr.e_version = Version;
Ehdr.e_entry = Entry;
Ehdr.e_phoff = sizeof(Elf_Ehdr);
Ehdr.e_shoff = SHOffset;
Ehdr.e_flags = Flags;
Ehdr.e_ehsize = sizeof(Elf_Ehdr);
Ehdr.e_phentsize = sizeof(Elf_Phdr);
Ehdr.e_phnum = Segments.size();
Ehdr.e_shentsize = sizeof(Elf_Shdr);
Ehdr.e_shnum = Sections.size() + 1;
Ehdr.e_shstrndx = SectionNames->Index;
}		}

template <class ELFT>
void Object<ELFT>::writeProgramHeaders(FileOutputBuffer &Out) const {
for (auto &Phdr : Segments)
Phdr->template writeHeader<ELFT>(Out);
}		}
		return TotalSize;
template <class ELFT>
void Object<ELFT>::writeSectionHeaders(FileOutputBuffer &Out) const {
uint8_t *Buf = Out.getBufferStart() + SHOffset;
// This reference serves to write the dummy section header at the begining
// of the file.
Elf_Shdr &Shdr = reinterpret_cast<Elf_Shdr >(Buf);
Shdr.sh_name = 0;
Shdr.sh_type = SHT_NULL;
Shdr.sh_flags = 0;
Shdr.sh_addr = 0;
Shdr.sh_offset = 0;
Shdr.sh_size = 0;
Shdr.sh_link = 0;
Shdr.sh_info = 0;
Shdr.sh_addralign = 0;
Shdr.sh_entsize = 0;

for (auto &Section : Sections)
Section->template writeHeader<ELFT>(Out);
}		}

template <class ELFT>		template <class ELFT> void BinaryObject<ELFT>::write(FileOutputBuffer &Out) {
void Object<ELFT>::writeSectionData(FileOutputBuffer &Out) const {		uint8_t *Buf = Out.getBufferStart();
for (auto &Section : Sections)		uint64_t Offset = 0;
Section->writeSection(Out);		for (const auto &Segment : this->Segments) {
		if (Segment->Type == PT_LOAD) {
		jhenderson (Done): What about PT_GNU_RELRO segments? Does the GNU tool treat these the same as PT_LOAD, since they are still loadable segments?
		jakehehrlich (Author, Done): It ignores them from my testing. Also, I don't think PT_GNU_RELRO will be used in the applications of "-O binary", since PT_GNU_RELRO requires virtual memory support that most applications won't have when they're loaded (e.g. kernels and embedded systems).
		jhenderson (Done): Oops, for some reason I was under the impression that PT_GNU_RELRO was a separate non-nested segment. Never mind!
		Offset = alignTo(Offset, Segment->Align);
		Segment->writeSegment(Buf + Offset);
		Offset += Segment->FileSize;
		}
		}
		jhenderson (Done): I can't help but notice that this method and totalSize above are basically the same. Is there any way we can sensibly factor out the duplication?
		jakehehrlich (Author, Done): One thought I had was to put this loop in finalize, set the Offset of each segment as I went, and then use the final Offset to set a "TotalSize" field of BinaryObject. This would deduplicate the code, but I didn't like using Segment::Offset in that way. My complaint is that Segment::Offset is supposed to represent "p_offset", and "p_offset" is just not relevant in the binary output format. Additionally, it requires adding a field, though I'm less concerned with that. I'd be willing to dedup the code this way if you provided an argument for why Segment::Offset should be allowed to be used this way, i.e. an argument that it is OK to divorce Segment::Offset from p_offset. Also, there is currently a bit of a blemish in the code in that most "write" interfaces take a FileOutputBuffer but Segment::writeSegment takes a uint8_t pointer. Using Offset in this way would solve that issue. Still, I'm concerned that that just isn't the right use of Segment::Offset. I could be convinced that it doesn't matter though. Perhaps there is a general pattern here that goes something like "loop over all loadable segments and calculate their offsets", but that seems very specific. Certainly I could write a method that took a function and did this, but it seems too specific. It would be nice if this pattern kept coming up, but it only seems to come up twice and I'd bet that it will only come up twice.
		jhenderson (Done): Although p_offset itself doesn't make a huge amount of sense in a BinaryObject, I don't think it's unreasonable to treat Offset as the location of the data in the file. After all, what else does it represent? The p_offset field in a regular ELF represents exactly that: the offset of the segment in the file. Also, now looking at your loop changes, I'm beginning to think that it doesn't look particularly nice after all, so if you can remove the looping twice by moving it into finalize, all the better, but I'm happy with whichever you prefer. If you're concerned about the extra field, perhaps you could simply look at the last Segment in the vector (assuming it's sorted by offset, and that there are no nested segments), and add its size to the offset to get the total size? Alternatively, maybe return the total size from finalize in all cases?
		jhenderson (Done): Yeah, thinking about it a bit more, this could be turned into a common function called "forEachLoadableSegment" or similar, with a lambda or similar abstracting the uncommon bit away.
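A minimal sketch of the "forEachLoadableSegment" suggestion follows. It is an illustration, not the code that landed: Segment here is a cut-down hypothetical stand-in with only the fields the loop reads, PtLoad mirrors the ELF PT_LOAD value, and alignTo is a local substitute for llvm::alignTo so that the sketch builds without LLVM headers.

```cpp
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical stand-in for the patch's Segment; only the fields used by
// the loop are modelled.
struct Segment {
  uint32_t Type;
  uint64_t Align;
  uint64_t FileSize;
};

constexpr uint32_t PtLoad = 1; // PT_LOAD in the ELF specification.

// Local substitute for llvm::alignTo. Treating an alignment of 0 or 1 as
// "no alignment" is an assumption of this sketch.
inline uint64_t alignTo(uint64_t Value, uint64_t Align) {
  return Align <= 1 ? Value : (Value + Align - 1) / Align * Align;
}

// Visits every PT_LOAD segment in vector order, passing the segment and its
// offset in the binary output to Callback. Returns the offset just past the
// last loadable segment, i.e. the total output size.
template <class Fn>
uint64_t
forEachLoadableSegment(const std::vector<std::unique_ptr<Segment>> &Segments,
                       Fn Callback) {
  uint64_t Offset = 0;
  for (const auto &Seg : Segments) {
    if (Seg->Type != PtLoad)
      continue;
    Offset = alignTo(Offset, Seg->Align);
    Callback(*Seg, Offset);
    Offset += Seg->FileSize;
  }
  return Offset;
}

// totalSize becomes a walk with an empty callback; write() would pass a
// callback that copies each segment's contents to Buf + Offset instead.
uint64_t totalSize(const std::vector<std::unique_ptr<Segment>> &Segments) {
  return forEachLoadableSegment(Segments, [](const Segment &, uint64_t) {});
}
```

With this shape the alignment and offset arithmetic lives in one place, and Segment::Offset is left untouched, which sidesteps the p_offset concern raised above.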
}		}

template <class ELFT> void Object<ELFT>::write(FileOutputBuffer &Out) {		template <class ELFT> void BinaryObject<ELFT>::finalize() {
writeHeader(Out);		for (auto &Segment : this->Segments)
writeProgramHeaders(Out);		Segment->finalize();
writeSectionData(Out);
writeSectionHeaders(Out);		// Put all segments in offset order
		auto CompareSections = [](const SegPtr &A, const SegPtr &B) {
		return A->Offset < B->Offset;
		};
		std::sort(std::begin(this->Segments), std::end(this->Segments),
		CompareSections);
		jhenderson (Not Done): Full stop.
}		}

template class Object<ELF64LE>;		template class ELFObject<ELF64LE>;
		jhenderson (Done): CompareSegments, not Sections.
template class Object<ELF64BE>;		template class ELFObject<ELF64BE>;
template class Object<ELF32LE>;		template class ELFObject<ELF32LE>;
template class Object<ELF32BE>;		template class ELFObject<ELF32BE>;

		template class BinaryObject<ELF64LE>;
		template class BinaryObject<ELF64BE>;
		template class BinaryObject<ELF32LE>;
		template class BinaryObject<ELF32BE>;

tools/llvm-objcopy/llvm-objcopy.cpp

[47 lines hidden]	LLVM_ATTRIBUTE_NORETURN void reportError(StringRef File, llvm::Error E) {
errs() << ToolName << ": '" << File << "': " << Buf;		errs() << ToolName << ": '" << File << "': " << Buf;
exit(1);		exit(1);
}		}
}		}

cl::opt<std::string> InputFilename(cl::Positional, cl::desc("<input>"));		cl::opt<std::string> InputFilename(cl::Positional, cl::desc("<input>"));
cl::opt<std::string> OutputFilename(cl::Positional, cl::desc("<output>"),		cl::opt<std::string> OutputFilename(cl::Positional, cl::desc("<output>"),
cl::init("-"));		cl::init("-"));
		cl::opt<std::string> OutputFormat("O", cl::desc("set output format"));
		jhenderson (Done): Maybe worth documenting the valid options in the help text?

void CopyBinary(const ELFObjectFile<ELF64LE> &ObjFile) {		void CopyBinary(const ELFObjectFile<ELF64LE> &ObjFile) {
std::unique_ptr<FileOutputBuffer> Buffer;		std::unique_ptr<FileOutputBuffer> Buffer;
Object<ELF64LE> Obj{ObjFile};		std::unique_ptr<Object<ELF64LE>> Obj;
Obj.finalize();		if (!OutputFormat.empty() && OutputFormat == "binary")
		Obj = make_unique<BinaryObject<ELF64LE>>(ObjFile);
		else
		jhenderson (Done): Is it worth emitting an error here if the OutputFormat isn't an understood type? At the moment, I could do "-O Elf32LE" or "-O coff" or similar expecting to get 32-bit ELF/COFF out, but actually I get 64-bit ELF.
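One way to make unknown formats detectable is to classify the string before choosing an object type, as in the hedged sketch below. OutputKind and parseOutputFormat are hypothetical names, not part of the patch; the sketch assumes an empty "-O" value keeps the default ELF output and that "binary" is the only other accepted value.

```cpp
#include <string>

// Hypothetical classification of the value passed to -O.
enum class OutputKind { ELF, Binary, Unknown };

// Maps the -O argument to an output kind. An empty string means the flag
// was not given, so the default ELF output is kept.
OutputKind parseOutputFormat(const std::string &Format) {
  if (Format.empty())
    return OutputKind::ELF;
  if (Format == "binary")
    return OutputKind::Binary;
  return OutputKind::Unknown;
}
```

The caller could then report an error for OutputKind::Unknown instead of silently producing 64-bit ELF.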
		Obj = make_unique<ELFObject<ELF64LE>>(ObjFile);
		Obj->finalize();
ErrorOr<std::unique_ptr<FileOutputBuffer>> BufferOrErr =		ErrorOr<std::unique_ptr<FileOutputBuffer>> BufferOrErr =
FileOutputBuffer::create(OutputFilename, Obj.totalSize(),		FileOutputBuffer::create(OutputFilename, Obj->totalSize(),
FileOutputBuffer::F_executable);		FileOutputBuffer::F_executable);
if (BufferOrErr.getError())		if (auto EC = BufferOrErr.getError())
		jhenderson (Done): Unused variable. Should the error message report it?
error("failed to open " + OutputFilename);		error("failed to open " + OutputFilename);
else		else
Buffer = std::move(*BufferOrErr);		Buffer = std::move(*BufferOrErr);
std::error_code EC;		std::error_code EC;
std::unique_ptr<tool_output_file> Out =		std::unique_ptr<tool_output_file> Out =
make_unique<tool_output_file>(OutputFilename.data(), EC, sys::fs::F_None);		make_unique<tool_output_file>(OutputFilename.data(), EC, sys::fs::F_None);
if (EC)		if (EC)
report_fatal_error(EC.message());		report_fatal_error(EC.message());
Obj.write(*Buffer);		Obj->write(*Buffer);
if (auto EC = Buffer->commit())		if (auto EC = Buffer->commit())
reportError(OutputFilename, EC);		reportError(OutputFilename, EC);
Out->keep();		Out->keep();
}		}

int main(int argc, char **argv) {		int main(int argc, char **argv) {
// Print a stack trace if we signal out.		// Print a stack trace if we signal out.
sys::PrintStackTraceOnErrorSignal(argv[0]);		sys::PrintStackTraceOnErrorSignal(argv[0]);
[18 lines hidden]