This is an archive of the discontinued LLVM Phabricator instance.

Please do split this into a separate pass-through change and option-hook up. It will keep the scale of the review down by quite a bit. Hooking up the options one or two at a time after that would then make reviewing each of those easier too.

I assume you've seen D54384, where @alexshap is doing the same for MachO? It would make a lot of sense for the two approaches to mirror each other.

I noticed you're planning on adding pre-canned binaries for the tests. Is it possible to use yaml2obj or similar instead?

In D54939#1309348, @jhenderson wrote:

Please do split this into a separate pass-through change and option-hook up. It will keep the scale of the review down by quite a bit. Hooking up the options one or two at a time after that would then make reviewing each of those easier too.

Ok, will split that out. As for adding other options one at a time, I probably can't do it one option at a time, but I'll try to split it into smaller sets of options with the same mechanics at least.

I assume you've seen D54384, where @alexshap is doing the same for MachO? It would make a lot of sense for the two approaches to mirror each other.

I've seen it and looked at it a little for inspiration with this one as well. The general structure of the ELF/COFF/MachO subdirs is the same, but the actual objects and functions used for rebuilding objects obviously differ depending on the formats' structure etc.

I noticed you're planning on adding pre-canned binaries for the tests. Is it possible to use yaml2obj or similar instead?

For most practical functional tests, yes. I have one test for the plain passthrough copying which does a byte-by-byte comparison between the input and output, and tools that generate object files/executables have some freedom in exactly how they are laid out. The input files I use right now have been generated with llvm-mc/lld/msvc, but I can try to see how many of them end up identical if passed via yaml.

Removed the changes that do actual transformations on the binary. Still using hardcoded input files, as it will take a while to sort out what exact differences are produced if going via yaml (I ran into one yaml2obj bug so far), will look more into that later.

In D54939#1309539, @mstorsjo wrote:

Removed the changes that do actual transformations on the binary. Still using hardcoded input files, as it will take a while to sort out what exact differences are produced if going via yaml (I ran into one yaml2obj bug so far), will look more into that later.

FWIW, I can easily change the simple object file inputs into yaml. Executables can also be made into yaml, when the serialization format of VirtualAddress for sections is clarified. (Right now it's broken if you convert to yaml and back.) For executables, I currently pad code segments with 0xcc, to match what LLD does, but yaml2obj doesn't do that, so I'd have to stop doing that if I want a test that matches byte for byte the input data.

For bigobj inputs, yaml2obj doesn't produce that unless the bigobj format is strictly needed. So I can't easily test that with yaml unless I create an object file with over 64k sections. So at least for that, a binary input file is more or less wanted.
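To make the bigobj point concrete, here is a minimal sketch (not code from the patch) of how a writer could pick the header flavour from the section count alone. The names `kMaxSections16`, `HeaderKind` and `chooseHeaderKind` are made up for illustration, and the limit of 65279 is an assumption meant to mirror LLVM's `COFF::MaxNumberOfSections16`; check the real constant before relying on it.

```cpp
#include <cassert>
#include <cstdint>

// Assumed limit for the regular (16-bit) COFF section count. Intended to
// mirror LLVM's COFF::MaxNumberOfSections16; treat the value as an assumption.
constexpr std::int32_t kMaxSections16 = 65279;

enum class HeaderKind { Regular, BigObj };

// Hypothetical helper: choose the file header flavour from the number of
// sections, so bigobj is only produced when it is strictly needed.
inline HeaderKind chooseHeaderKind(std::int32_t NumSections) {
  assert(NumSections >= 0 && "section count cannot be negative");
  return NumSections > kMaxSections16 ? HeaderKind::BigObj
                                      : HeaderKind::Regular;
}
```

With a rule like this, a yaml input only exercises the bigobj path once it describes more sections than the limit, which is why a pre-built binary input is the practical option for that case.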

smeenai added a subscriber: smeenai. Nov 27 2018, 5:38 PM

alexander-shaposhnikov added a subscriber: compnerd. Nov 29 2018, 9:20 PM

i will get to this diff early next week, sorry about the delay. A small side note - probably the simple changes to include/llvm/Object/COFF.h can be factored out from this diff and reviewed separately (and probably much faster).

In D54939#1314029, @alexshap wrote:

i will get to this diff early next week, sorry about the delay. A small side note - probably the simple changes to include/llvm/Object/COFF.h can be factored out from this diff and reviewed separately (and probably much faster).

Ok - splitting them out isn't hard, but I dunno what to make as test for them, if they are going in as such.

In D54939#1314030, @mstorsjo wrote:

In D54939#1314029, @alexshap wrote:

i will get to this diff early next week, sorry about the delay. A small side note - probably the simple changes to include/llvm/Object/COFF.h can be factored out from this diff and reviewed separately (and probably much faster).

Ok - splitting them out isn't hard, but I dunno what to make as test for them, if they are going in as such.

It looks to me like they should be unit-tested. There is already a set of unit tests for LLVMObject, so it shouldn't be too hard to add them, I'd think.

I'm not really a COFF expert, so I haven't yet looked at the details of this change. It broadly looks like the right direction though.

If I get time, I'll try to start reviewing the details, but it would be better if somebody with more familiarity looked at them.

include/llvm/Object/COFF.h
974	We seem to be mixing our styles of defining things in headers and in .cpp files for no particular reason. Would you mind defining these in COFF.cpp, please since there are two that already are? (I'd support moving getDosHeader into the COFF.cpp too in that case). I could be persuaded instead that the two PE header functions should be moved here.
test/tools/llvm-objcopy/COFF/basic-copy.test
25	nit: remove the extra "of"
tools/llvm-objcopy/COFF/Object.cpp
47	This is quite a big function. Would it be feasible to split it up into a few helpers?
144	Again, this is quite a big function. Would it be feasible to split it up into a few helpers?
240	Again, this is quite a big function. Would it be feasible to split it up into a few helpers?
tools/llvm-objcopy/COFF/Object.h
30	To avoid confusion, maybe we should rename the ELF Object to ELFObject, and then this Object to COFFObject, etc?

mstorsjo marked 4 inline comments as done. Dec 3 2018, 4:46 AM

mstorsjo added inline comments.

include/llvm/Object/COFF.h
974	Sure, I don't mind moving them there. It's indeed quite a bit inconsistent right now, although I don't know if there's some rationale for which ones have been made inline so far.
test/tools/llvm-objcopy/COFF/basic-copy.test
25	Thanks, amended the patch locally with that fixed.
tools/llvm-objcopy/COFF/Object.cpp
47	I guess it should be feasible, I'll give it a try.
tools/llvm-objcopy/COFF/Object.h
30	Sure, I don't mind doing that. The same goes for D54674 and MachO as well. Although that also moves the names closer to the class names used by llvm/Object, `COFFObjectFile` and `ELFObjectFile`, but I don't think that's an issue.

mstorsjo added inline comments. Dec 3 2018, 5:11 AM

tools/llvm-objcopy/COFF/Object.cpp
275	Open question to reviewers: Here I pad executable sections with 0xcc, in the same way as LLD does. It was requested to have tests that use yaml input instead of committing binaries, and the tests so far check that objcopy produces bytewise identical files to its input. As yaml2obj doesn't do the same 0xcc padding, we'd either need to make yaml2obj do the same padding as well, or stop doing it here. Which one do you prefer? (Also, LLD/COFF currently doesn't use different paddings for ARM/AArch64, which it probably should, and the same would go here as well.)

jhenderson added inline comments. Dec 3 2018, 5:31 AM

tools/llvm-objcopy/COFF/Object.cpp
275	Not too sure, but it might be nice to find a way to specify the padding for any given file. For example, padding with 0xcc doesn't make much sense for x86 output in the data segment, but makes a lot of sense in the text segment.

(I have no idea why my inline comments were marked done from the start...)

mstorsjo added inline comments. Dec 3 2018, 5:41 AM

tools/llvm-objcopy/COFF/Object.cpp
275	Well currently, the padding is only done for executable sections (`if ((S.Header.Characteristics & IMAGE_SCN_CNT_CODE)` above), other sections are padded with zeros.
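As an illustration of the rule described in this comment (executable sections padded with 0xcc, everything else with zeros), here is a small standalone sketch. It is not the code from the patch: `padByteFor` and `padSection` are hypothetical names, and plain `std::vector` stands in for the real section contents. Only the `IMAGE_SCN_CNT_CODE` value (0x20) comes from the PE/COFF format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Value of the IMAGE_SCN_CNT_CODE section characteristic in PE/COFF.
constexpr std::uint32_t kImageScnCntCode = 0x00000020;

// Hypothetical helper: choose the filler byte for the tail of a section.
// 0xCC is int3 on x86; other architectures may want a different filler.
inline std::uint8_t padByteFor(std::uint32_t Characteristics) {
  return static_cast<std::uint8_t>(
      (Characteristics & kImageScnCntCode) ? 0xCC : 0x00);
}

// Hypothetical helper: return a copy of Contents extended to AlignedSize
// bytes, using the filler byte that matches the section's characteristics.
inline std::vector<std::uint8_t>
padSection(const std::vector<std::uint8_t> &Contents, std::size_t AlignedSize,
           std::uint32_t Characteristics) {
  assert(AlignedSize >= Contents.size() && "padding cannot shrink a section");
  std::vector<std::uint8_t> Out(Contents);
  Out.resize(AlignedSize, padByteFor(Characteristics));
  return Out;
}
```

Keeping the filler choice in one small function would also be the natural place to add per-architecture padding later, as raised above for ARM/AArch64.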

alexander-shaposhnikov added inline comments. Dec 3 2018, 10:20 AM

tools/llvm-objcopy/COFF/Object.h
30	khm, i don't have a strong opinion, but they are already kinda named this way if we take into account the namespace.

Split the large methods in Object.cpp into a bunch of smaller ones, and added a bit more comments. Moved the COFF header getter implementations into the cpp file.

Suggested changes not (yet) done:

- Didn't split the COFF header getter methods into a separate patch, as I don't know of a sensible standalone test for them other than using them here.
- Kept the class named Object instead of COFFObject, as there wasn't clear unanimity about changing it yet.
- Didn't change to use yaml2obj for the tests yet.

Currently, the llvm-objcopy output matches bytewise what lld outputs for executables (except for string tables), and what the LLVM codegen outputs for normal object files. If testing with files synthesized from yaml, yaml2obj needs 3 changes: padding executable sections, skipping the 4 byte section content alignment that yaml2obj does but MC doesn't, and using the MC StringTableBuilder. The first two are trivial to change, but the last requires adding an MC dependency to yaml2obj, which I'm not sure is wanted.

mstorsjo marked 8 inline comments as done. Dec 3 2018, 2:41 PM

On the whole ELFObject vs Object thing: I'm in favor of just using Object but I don't really care too much. Having both the namespace and the name prefix is kind of verbose. A while back we had two kinds of Objects instead of just one, but this was determined over time to be a mistake for the ELF backend. If you want to do conversions between formats, even if the format is "binary", you'll probably want to avoid having two kinds of Object, or if you do, make sure you provide a sufficiently rich base interface to them to allow most code to be written in terms of that.

tools/llvm-objcopy/COFF/Object.h
83	It might be useful to declare this as the OriginalHeader and have a second Header. We discovered over time that this would have been a good structure in the ELF case, but currently we have a horrible hodgepodge of fields prefixed with "Original" and some without. Also, the name "Contents" doesn't necessarily need to change but it should be clear that such a thing is the original contents and nothing more.

Also don't focus on byte for byte accuracy, test semantically so that we can make layout changes independent of content changes. It's clear to me now that we shouldn't even bother attempting exact binary matches even in the first patch.

In D54939#1317502, @jakehehrlich wrote:

Also don't focus on byte for byte accuracy, test semantically so that we can make layout changes independent of content changes. It's clear to me now that we shouldn't even bother attempting exact binary matches even in the first patch.

Sure, in general it's not necessary to have byte for byte accuracy, but as we write a new file, I've tried to mimic the layout of MC and LLD as those should be sane and actually used in the wild.

The test itself doesn't need to check for such accuracy, of course. I guess I could check the input/output with e.g. llvm-readobj -file-headers or something like that, to check that the output generally looks sane.

tools/llvm-objcopy/COFF/Object.h
83	So far, I don't keep a copy of the original header anywhere but I just patch things in this one copy (marginally when just doing a plain copy, patching more when actually changing things) before finally writing it to the output with one memcpy, so there's no distinction between original and current. If we need to keep the original header separately later, couldn't we add the separate OriginalHeader field at that point?

jakehehrlich added inline comments. Dec 3 2018, 3:11 PM

tools/llvm-objcopy/COFF/Object.h
83	I'd say if you ever have to keep a single "Original" field then you should keep the whole original header.

mstorsjo added inline comments. Dec 4 2018, 12:22 AM

tools/llvm-objcopy/COFF/Object.h
83	Right - well I don't keep any "original" fields in the sense that it is the original unmodified value from the input file, but it's the actual header as it will be written to the destination file in the same struct form, with values updated and filled in along the objcopy process. So I don't break out all the individual fields but keep them as they are on disk. The only exception is the Name field that I break out into a separate StringRef, as the Name field of the `coff_section` header only makes sense in the context of a full file. As for the field `Contents`, why would I need to make a distinction that it is the original contents? In the follow-up parts where I synthesize a `.gnu_debuglink` section, `Contents` won't be original contents but newly created ones. (In that case I add a separate field to actually own the allocation, as the `ArrayRef` either points to the original input file contents or to newly synthesized contents.)
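The ownership arrangement described in this comment (a non-owning view that either points into the input file or into a separately owned buffer for synthesized data) could look roughly like the sketch below. This is an assumption-laden illustration, not the struct from the patch: `ByteView` stands in for `llvm::ArrayRef<uint8_t>`, and `SketchSection`, `OwnedStorage`, `setBorrowedContents` and `setOwnedContents` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Non-owning view of bytes, standing in for llvm::ArrayRef<uint8_t>.
struct ByteView {
  const std::uint8_t *Data;
  std::size_t Size;
};

// Hypothetical, simplified section record (not the real objcopy struct).
struct SketchSection {
  std::string Name;                       // broken out of the on-disk header
  ByteView Contents{nullptr, 0};          // what the writer will emit
  std::vector<std::uint8_t> OwnedStorage; // backing store for synthesized data

  SketchSection() = default;
  // Copying would leave Contents pointing into the other object's storage,
  // so copies are disabled in this sketch.
  SketchSection(const SketchSection &) = delete;
  SketchSection &operator=(const SketchSection &) = delete;

  // Point at bytes that live elsewhere, e.g. in the mapped input file.
  void setBorrowedContents(const std::uint8_t *Data, std::size_t Size) {
    assert((Data != nullptr || Size == 0) && "a null view must be empty");
    OwnedStorage.clear();
    Contents = ByteView{Data, Size};
  }

  // Take ownership of newly created bytes and make Contents view them.
  void setOwnedContents(std::vector<std::uint8_t> Bytes) {
    OwnedStorage = std::move(Bytes);
    Contents = ByteView{OwnedStorage.data(), OwnedStorage.size()};
  }
};
```

The writer only ever looks at `Contents`, so it does not need to know whether a section was copied from the input or synthesized, which matches the point that `Contents` is not necessarily "original".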

@rnk Can you have a look at this one wrt the COFF specifics, and how I'm storing/reassembling it?

Removed the RFC tag as it has been through a few rounds of discussion already, and I'd like to see actual progress towards getting it merged.

I changed the tests to use yaml files as input as requested. In order to have sensible tests that actually check that the file as a whole is copied correctly, without tediously writing lots of individual CHECK lines for llvm-readobj+FileCheck, I'm using obj2yaml on both inputs and outputs, and a plain cmp on the output from obj2yaml. Later patches that actually do some transformation can of course use more specific and targeted checks with llvm-readobj+FileCheck.

Ping @rnk, can you give this a review from a generic COFF perspective?

Ping @alexshap, do you have time to have a look?

Ping @jakehehrlich, can you follow up on the discussion where you suggested renaming fields? Can you, if possible, have a look at the patch as a whole, wrt how I read structs from the input file, keeping them mostly packed in the same kinds of structs as in the binary file?

Ping

sorry about the delay, I've also pinged some other people who know coff better than I do (at the moment).
What do you think about keeping the implementation of Reader and Writer in their own files, and (in the future) putting into Object.cpp only the logic for modifying the "intermediate representation" (mutations of COFF)?

In D54939#1330427, @alexshap wrote:

What do you think about keeping the implementation of Reader and Writer in their own files, and (in the future) putting into Object.cpp only the logic for modifying the "intermediate representation" (mutations of COFF)?

I guess that could be doable.

For the actual stripping operations that I've already implemented (but holding off posting until this is merged), most of the code that does modifications actually is outside of the Object class (so far), so the Object class itself is mostly a dumb container. This, because modifications don't (so far) easily map down to simple individual operations that would fit into individual methods. E.g. removing a section requires updating symbols as well - so so far I've kept it as one series of modifications in COFFObjcopy.cpp. But that's

Do you have any opinion on the extra Reader/Writer abstract base classes (vs COFFReader/COFFWriter)? I copied that design from ELF, but I'm not sure if it makes sense here as we have much fewer variants of everything.

that's fine, my point was to "unload" some code from Object.cpp, in particular, the serialization/deserialization into separate files. (probably it won't change that much in the future, but the code for modifying Object will evolve as more features will be added).

I don't have time to go over this with a fine-tooth comb, but I took a look, and all the COFF parts make sense. The string table and the use of "/%d" for long section names are all correct.
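For readers unfamiliar with the "/%d" convention mentioned here: section names of up to 8 bytes are stored inline in the section header, while longer names are placed in the string table and the header stores "/" followed by the decimal offset. The sketch below shows that idea under simplifying assumptions. `LinearStringTable` and `encodeSectionName` are hypothetical names, the table is filled linearly with no deduplication or tail sharing (unlike the MC `StringTableBuilder` the patch uses), and very large offsets that no longer fit the decimal form are simply rejected.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical minimal string table, filled linearly. The first four bytes
// of a COFF string table hold its total size (including those four bytes),
// so the first string lands at offset 4.
class LinearStringTable {
  std::string Strings; // NUL-terminated strings, stored back to back

public:
  // Append Name and return its offset from the start of the table.
  std::size_t add(const std::string &Name) {
    std::size_t Offset = 4 + Strings.size();
    Strings += Name;
    Strings.push_back('\0');
    return Offset;
  }

  // Total size as it would be stored in the leading size field.
  std::uint32_t size() const {
    return static_cast<std::uint32_t>(4 + Strings.size());
  }
};

// Names of up to 8 bytes are returned unchanged (they fit in the header);
// longer ones are added to the table and become "/<decimal offset>".
inline std::string encodeSectionName(const std::string &Name,
                                     LinearStringTable &Table) {
  if (Name.size() <= 8)
    return Name;
  std::string Encoded = "/" + std::to_string(Table.add(Name));
  assert(Encoded.size() <= 8 && "offset too large for the decimal form");
  return Encoded;
}
```

Note that an empty table in this sketch still reports a size of 4, which is one of the two representations of an empty string table listed in the test's comments.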

This revision is now accepted and ready to land. Dec 13 2018, 3:32 PM

Split Object.cpp/h into separate files for Reader and Writer as suggested by @alexshap.

A few minor cleanups and tweaks to the includes and file headers for the split Reader/Writer.

I've looked at the code; there are a few minor nits, but to me this code looks like a pretty good start, many thanks for working on this ))

tools/llvm-objcopy/COFF/Object.h
62	nit: i'd probably replace the names A and B with smth that reflects they are the symbol types
tools/llvm-objcopy/COFF/Reader.cpp
63 ↗	(On Diff #178198)	I would add a comment saying that it's on purpose (that the indices start with 1)
tools/llvm-objcopy/COFF/Reader.h
32 ↗	(On Diff #178198)	nit: don't need private here
40 ↗	(On Diff #178198)	nit: for consistency with the code above I would swap these two lines (first - ctor, second - create)

@jakehehrlich , @jhenderson - are there any concerns with moving this forward ?

Thanks for the review!

tools/llvm-objcopy/COFF/Object.h
62	Sure, will replace with Symbol1Ty/Symbol2Ty.
tools/llvm-objcopy/COFF/Reader.cpp
63 ↗	(On Diff #178198)	Good point, will do.
tools/llvm-objcopy/COFF/Reader.h
32 ↗	(On Diff #178198)	Thanks, leaving it out - doing the same for COFFWriter as well.

Applied the changes suggested by @alexshap.

alexander-shaposhnikov added inline comments. Dec 14 2018, 12:55 PM

tools/llvm-objcopy/COFF/Object.h
49	btw - I don't have a strong opinion on this yet, but wanted to explain the motivation why we were trying not to keep this type of information (Is64 etc) inside the Object. Basically, the idea was to have all encoding-related logic consolidated inside the Readers/Writers and avoid accidental usage / leaking of this information into the model itself. I don't know if it makes sense for COFF (although yes, to some extent it means that this information needs to be passed around) - what do you think ?

mstorsjo added inline comments. Dec 14 2018, 1:26 PM

tools/llvm-objcopy/COFF/Object.h
49	Oh, ok, I see. We could probably quite easily get rid of `IsBigObj` and create a big object whenever it's necessary. (The current design just copies the input struct as such, bringing in the exact values of fields such as Sig1/Sig2/Version/UUID, but if we'd decouple them, we'd probably synthesize a new bigobj header instead.) Wrt `pe32_header` vs `pe32plus_header`, the latter is almost a superset of the former - it has got a few fields lengthened to 64 bit, but it instead lacks the `BaseOfData` field. With a `pe32plus_header` struct plus a `BaseOfData` field, we could store whichever input data we read, but that on the other hand requires hardcoding which architectures are 32 bit and which are 64 bit, which we right now just take from the input file. (Hardcoding the machine type vs bitness isn't much of a practical issue as there only are 4 architectures in common use these days - but nothing else in the COFF objcopy needs to know about the architecture at all.)

Removed the IsBigObj field, and only storing one single coff file header and one pe32 header struct, and converting to/from the intermediate form in the reader/writer, as suggested/hinted by @alexshap.
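A rough sketch of what "converting to/from the intermediate form" can mean for the PE optional header, following the earlier observation that `pe32plus_header` is almost a superset of `pe32_header` apart from `BaseOfData`. The structs below are heavily simplified stand-ins with only a few fields, and all names (`SketchPe32Header`, `SketchPe32PlusHeader`, `SketchPeHeader`, `fromPe32`, `toPe32`) are hypothetical; this is not the code that was committed.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Simplified stand-ins for the on-disk optional headers; the real
// pe32_header / pe32plus_header structs have many more fields.
struct SketchPe32Header {
  std::uint32_t BaseOfData;
  std::uint32_t ImageBase;
  std::uint32_t SizeOfStackReserve;
};

struct SketchPe32PlusHeader {
  std::uint64_t ImageBase;
  std::uint64_t SizeOfStackReserve;
};

// Hypothetical intermediate form: the wider header plus the one field that
// only the 32-bit variant has.
struct SketchPeHeader {
  SketchPe32PlusHeader Header;
  std::uint32_t BaseOfData;
};

// Reader side: widen a 32-bit header into the intermediate form.
inline SketchPeHeader fromPe32(const SketchPe32Header &In) {
  return SketchPeHeader{{In.ImageBase, In.SizeOfStackReserve}, In.BaseOfData};
}

// Writer side: narrow back down; only valid while the values still fit.
inline SketchPe32Header toPe32(const SketchPeHeader &In) {
  const std::uint64_t Max32 = std::numeric_limits<std::uint32_t>::max();
  assert(In.Header.ImageBase <= Max32 && "ImageBase does not fit in 32 bits");
  assert(In.Header.SizeOfStackReserve <= Max32 &&
         "SizeOfStackReserve does not fit in 32 bits");
  return SketchPe32Header{
      In.BaseOfData, static_cast<std::uint32_t>(In.Header.ImageBase),
      static_cast<std::uint32_t>(In.Header.SizeOfStackReserve)};
}
```

The writer still needs some way to decide which on-disk variant to emit; how that decision is made is outside this sketch.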

In D54939#1331487, @alexshap wrote:

@jakehehrlich , @jhenderson - are there any concerns with moving this forward ?

My last day in work until the New Year is on Wednesday, and unfortunately I don't think I'll get a chance to look further at it until then. I'm happy to defer to other people's judgement on whether this is good, and can review it post-commit if needed.

Ping @alexshap - Any reply to your last discussion point and my update of the patch relating to it?

Ping @jakehehrlich - Any objections to this, or can I commit with @alexshap's and @rnk's approval?

on my side i think this is fine, I don't want to block this diff.
I would wait for Jake since he is the code owner for llvm-objcopy, but to me this looks good enough (to start the ball rolling).

High level structure/interface looks good to me. I don't see any obvious code level issues. I'll trust @rnk's assessment of the COFF specific parts of this. I haven't reviewed the testing inputs but it looks like you're able to copy some substantial object files, which looks like a solid start. I'll trust that James and Alex have looked at enough of this to fill in the gaps that I might have missed. We can start iterating now if there are any issues. Land it.

tools/llvm-objcopy/COFF/Writer.cpp
18 ↗	(On Diff #178376)	Why do you need this?
tools/llvm-objcopy/COFF/Writer.h
15 ↗	(On Diff #178376)	Why do you need this?

mstorsjo marked 3 inline comments as done. Dec 18 2018, 11:20 PM

mstorsjo added inline comments.

tools/llvm-objcopy/COFF/Writer.cpp
18 ↗	(On Diff #178376)	On a second look, I don't - I had somehow conflated it with the header for the llvm-objcopy specific `Buffer` class. Will remove before committing.

Closed by commit rL349605: [llvm-objcopy] Initial COFF support (authored by mstorsjo). Dec 18 2018, 11:28 PM

This revision was automatically updated to reflect the committed changes.

mstorsjo marked an inline comment as done.

Diff 175462

include/llvm/Object/COFF.h

(965 lines not shown)
   iterator_range<const debug_directory *> debug_directories() const {
     return make_range(debug_directory_begin(), debug_directory_end());
   }

   const dos_header *getDOSHeader() const {
     if (!PE32Header && !PE32PlusHeader)
       return nullptr;
     return reinterpret_cast<const dos_header *>(base());
   }
+  std::error_code getCOFFHeader(const coff_file_header *&Res) const {
+    Res = COFFHeader;
+    return std::error_code();
+  }
+  std::error_code
+  getCOFFBigObjHeader(const coff_bigobj_file_header *&Res) const {
+    Res = COFFBigObjHeader;
+    return std::error_code();
+  }
   std::error_code getPE32Header(const pe32_header *&Res) const;
   std::error_code getPE32PlusHeader(const pe32plus_header *&Res) const;
   std::error_code getDataDirectory(uint32_t index,
                                    const data_directory *&Res) const;
   std::error_code getSection(int32_t index, const coff_section *&Res) const;
   std::error_code getSection(StringRef SectionName,
                              const coff_section *&Res) const;

(259 lines not shown)

test/tools/llvm-objcopy/COFF/Inputs/i386-big.o

This binary file was added.

test/tools/llvm-objcopy/COFF/Inputs/i386.exe

This binary file was added.

Property	Old Value	New Value
File Mode	null	100755

test/tools/llvm-objcopy/COFF/Inputs/i386.o

This binary file was added.

test/tools/llvm-objcopy/COFF/Inputs/x86_64-big.o

This binary file was added.

test/tools/llvm-objcopy/COFF/Inputs/x86_64.exe

This binary file was added.

Property	Old Value	New Value
File Mode	null	100755

test/tools/llvm-objcopy/COFF/Inputs/x86_64.o

This binary file was added.

test/tools/llvm-objcopy/COFF/basic-copy.test

This file was added.

RUN: llvm-objcopy %p/Inputs/i386.o %t.o
RUN: cmp %p/Inputs/i386.o %t.o

RUN: llvm-objcopy %p/Inputs/x86_64.o %t.o
RUN: cmp %p/Inputs/x86_64.o %t.o

RUN: llvm-objcopy %p/Inputs/i386-big.o %t.o
RUN: cmp %p/Inputs/i386-big.o %t.o

RUN: llvm-objcopy %p/Inputs/x86_64-big.o %t.o
RUN: cmp %p/Inputs/x86_64-big.o %t.o

RUN: llvm-objcopy %p/Inputs/i386.exe %t.exe
RUN: cmp %p/Inputs/i386.exe %t.exe

RUN: llvm-objcopy %p/Inputs/x86_64.exe %t.exe
RUN: cmp %p/Inputs/x86_64.exe %t.exe

Having exactly identical output, as this test requires, is pretty
brittle if considering any random input file. Details that can
vary are:
- The padding of executable sections (lld uses 0xcc, which is int3 on x86)
- The gap between headers and the contents of the first section
  (lld currently can leave a whole empty sector inbetween)
- The actual layout of of the string table (it can be filled linearly,
  strings can be dedupliated, the table can be optimized by sharing tails
  of longer strings; different parts in llvm do each of these three options)
- The size indication for an empty/missing string table can either be 4
  or left out altogether
- Checksums

tools/llvm-objcopy/CMakeLists.txt

(11 lines not shown)
 set(LLVM_TARGET_DEFINITIONS StripOpts.td)
 tablegen(LLVM StripOpts.inc -gen-opt-parser-defs)
 add_public_tablegen_target(StripOptsTableGen)

 add_llvm_tool(llvm-objcopy
   Buffer.cpp
   CopyConfig.cpp
   llvm-objcopy.cpp
+  COFF/COFFObjcopy.cpp
+  COFF/Object.cpp
   ELF/ELFObjcopy.cpp
   ELF/Object.cpp
   DEPENDS
   ObjcopyOptsTableGen
   StripOptsTableGen
   )

 add_llvm_tool_symlink(llvm-strip llvm-objcopy)

 if(LLVM_INSTALL_BINUTILS_SYMLINKS)
   add_llvm_tool_symlink(objcopy llvm-objcopy)
   add_llvm_tool_symlink(strip llvm-objcopy)
 endif()

tools/llvm-objcopy/COFF/COFFObjcopy.h

This file was added.

//===- COFFObjcopy.h -------------------------------------------*- C++ -*-===//
//
//                      The LLVM Compiler Infrastructure
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//

#ifndef LLVM_TOOLS_OBJCOPY_COFFOBJCOPY_H
#define LLVM_TOOLS_OBJCOPY_COFFOBJCOPY_H

namespace llvm {

namespace object {
class COFFObjectFile;
} // end namespace object

namespace objcopy {
struct CopyConfig;
class Buffer;

namespace coff {
void executeObjcopyOnBinary(const CopyConfig &Config,
                            object::COFFObjectFile &In, Buffer &Out);

} // end namespace coff
} // end namespace objcopy
} // end namespace llvm

#endif // LLVM_TOOLS_OBJCOPY_COFFOBJCOPY_H

tools/llvm-objcopy/COFF/COFFObjcopy.cpp

This file was added.

//===- COFFObjcopy.cpp ----------------------------------------------------===//
//
//                      The LLVM Compiler Infrastructure
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//

#include "COFFObjcopy.h"
#include "Buffer.h"
#include "CopyConfig.h"
#include "Object.h"
#include "llvm-objcopy.h"

#include "llvm/Object/Binary.h"
#include "llvm/Object/COFF.h"
#include <cassert>

namespace llvm {
namespace objcopy {
namespace coff {

using namespace object;
using namespace COFF;

void executeObjcopyOnBinary(const CopyConfig &Config,
                            object::COFFObjectFile &In, Buffer &Out) {
  COFFReader Reader(In);
  std::unique_ptr<Object> Obj = Reader.create();
  assert(Obj && "Unable to deserialize COFF object");
  COFFWriter Writer(*Obj, Out);
  Writer.write();
}

} // end namespace coff
} // end namespace objcopy
} // end namespace llvm

tools/llvm-objcopy/COFF/Object.h

This file was added.

				//===- Object.h -------------------------------------------------- C++ --===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//

				#ifndef LLVM_TOOLS_OBJCOPY_COFF_OBJECT_H
				#define LLVM_TOOLS_OBJCOPY_COFF_OBJECT_H

				#include "Buffer.h"
				#include "CopyConfig.h"
				#include "llvm/ADT/ArrayRef.h"
				#include "llvm/ADT/StringRef.h"
				#include "llvm/BinaryFormat/COFF.h"
				#include "llvm/MC/StringTableBuilder.h"
				#include "llvm/Object/COFF.h"
				#include "llvm/Support/FileOutputBuffer.h"
				#include <cstddef>
				#include <cstdint>
				#include <map>
				#include <vector>

				namespace llvm {
				namespace objcopy {
				namespace coff {

				class Object;
				jhendersonUnsubmitted Not Done Reply Inline Actions To avoid confusion, maybe we should rename the ELF Object to ELFObject, and then this Object to COFFObject, etc? jhenderson: To avoid confusion, maybe we should rename the ELF Object to ELFObject, and then this Object to…
				mstorsjoAuthorUnsubmitted Not Done Reply Inline Actions Sure, I don't mind doing that. The same goes for D54674 and MachO as well. Although that also moves the names closer to the class names used by llvm/Object, `COFFObjectFile` and `ELFObjectFile`, but I don't think that's an issue. mstorsjo: Sure, I don't mind doing that. The same goes for D54674 and MachO as well. Although that also…
				alexander-shaposhnikovUnsubmitted Not Done Reply Inline Actions khm, i don't have a strong opinion, but they are already kinda named this way if we take into account the namespace. alexander-shaposhnikov: khm, i don't have a strong opinion, but they are already kinda named this way if we take into…

				using object::Binary;
				using object::COFFObjectFile;

				class Writer {
				protected:
				Object &Obj;
				Buffer &Buf;

				public:
				virtual ~Writer();
				virtual void write() = 0;

				Writer(Object &O, Buffer &B) : Obj(O), Buf(B) {}
				};

				class COFFWriter : public Writer {
				private:
				size_t FileSize;
				alexander-shaposhnikovUnsubmitted Not Done Reply Inline Actions btw - I don't have a strong opinion on this yet, but wanted to explain the motivation why we were trying not to keep this type of information (Is64 etc) inside the Object. Basically, the idea was to have all encoding-related logic consolidated inside the Readers/Writers and avoid accidental usage / leaking of this information into the model itself. I don't know if it makes sense for COFF (although yes, to some extent it means that this information needs to be passed around) - what do you think ? alexander-shaposhnikov: btw - I don't have a strong opinion on this yet, but wanted to explain the motivation why we…
				mstorsjoAuthorUnsubmitted Not Done Reply Inline Actions Oh, ok, I see. We could probably quite easily get rid of `IsBigObj` and create a big object whenever it's necessary. (The current design just copies the input struct as such, bringing in the exact values of fields such as Sig1/Sig2/Version/UUID, but if we'd decouple them, we'd probably synthesize a new bigobj header instead.) Wrt `pe32_header` vs `pe32plus_header`, the latter is almost a superset of the former - it has got a few fields lengthened to 64 bit, but it instead lacks the `BaseOfData` field. With a `pe32plus_header` struct plus a `BaseOfData` field, we could store whichever input data we read, but that on the other hand requires hardcoding which architectures are 32 bit and which are 64 bit, which we right now just take from the input file. (Hardcoding the machine type vs bitness isn't much of a practical issue as there only are 4 architectures in common use these days - but nothing else in the COFF objcopy needs to know about the architecture at all.) mstorsjo: Oh, ok, I see. We could probably quite easily get rid of `IsBigObj` and create a big object…
				size_t FileAlignment;
				StringTableBuilder StrTabBuilder;

				template <class CoffHeaderTy, class PeHeaderTy, class SymbolTy>
				void finalize(CoffHeaderTy &CoffFileHeader, PeHeaderTy &PeHeader);
				template <class CoffHeaderTy, class PeHeaderTy, class SymbolTy>
				void write(CoffHeaderTy &CoffFileHeader, PeHeaderTy &PeHeader);

				void patchDebugDirectory();

				public:
				virtual ~COFFWriter() {}

				alexander-shaposhnikov: nit: I'd probably replace the names A and B with something that reflects that they are the symbol types.
				mstorsjo: Sure, will replace with Symbol1Ty/Symbol2Ty.
				void write() override;
				COFFWriter(Object &Obj, Buffer &Buf)
				: Writer(Obj, Buf), StrTabBuilder(StringTableBuilder::WinCOFF) {}
				};

				class Reader {
				public:
				virtual ~Reader();
				virtual std::unique_ptr<Object> create() const = 0;
				};

				class COFFReader : public Reader {
				const COFFObjectFile &COFFObj;

				public:
				std::unique_ptr<Object> create() const override;
				explicit COFFReader(const COFFObjectFile &O) : COFFObj(O) {}
				};

				struct Section {
				object::coff_section Header;
				jakehehrlich: It might be useful to declare this as the OriginalHeader and have a second Header. We discovered over time that this would have been a good structure in the ELF case, but currently we have a horrible hodgepodge of fields, some prefixed with "Original" and some not. Also, the name "Contents" doesn't necessarily need to change, but it should be clear that such a thing is the original contents and nothing more.
				mstorsjo: So far, I don't keep a copy of the original header anywhere; I just patch things in this one copy (marginally when just doing a plain copy, patching more when actually changing things) before finally writing it to the output with one memcpy, so there's no distinction between original and current. If we need to keep the original header separately later, couldn't we add the separate OriginalHeader field at that point?
				jakehehrlich: I'd say if you ever have to keep a single "Original" field then you should keep the whole original header.
				mstorsjo: Right - well I don't keep any "original" fields in the sense of the original unmodified value from the input file; it's the actual header as it will be written to the destination file in the same struct form, with values updated and filled in along the objcopy process. So I don't break out all the individual fields but keep them as they are on disk. The only exception is the Name field that I break out into a separate StringRef, as the Name field of the `coff_section` header only makes sense in the context of a full file. As for the field `Contents`, why would I need to make a distinction that it is the original contents? In the follow-up parts where I synthesize a `.gnu_debuglink` section, `Contents` won't be any original contents but the newly created ones. (In that case I add a separate field to actually own the allocation, as the `ArrayRef` points either to the original input file contents or to newly synthesized contents.)
				ArrayRef<uint8_t> Contents;
				std::vector<object::coff_relocation> Relocs;
				StringRef Name;
				};
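The split jakehehrlich suggests in the thread above could look roughly like the following. This is only a sketch, not code from the patch: `SectionHeader`, `SectionSketch`, `makeSection` and `wasMovedInFile` are hypothetical names, and `SectionHeader` is a cut-down stand-in for `object::coff_section` so that the example is self-contained.

```cpp
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, cut-down stand-in for object::coff_section; only the fields
// needed for the illustration are included.
struct SectionHeader {
  uint32_t VirtualSize = 0;
  uint32_t VirtualAddress = 0;
  uint32_t SizeOfRawData = 0;
  uint32_t PointerToRawData = 0;
};

// Hypothetical section model keeping the input header apart from the one
// that gets patched and written.
struct SectionSketch {
  SectionHeader OriginalHeader; // As read from the input; never modified.
  SectionHeader Header;         // Working copy, patched before writing.
  std::string Name;
  std::vector<uint8_t> Contents;
};

// Builds a section from an input header, seeding the working copy from it.
SectionSketch makeSection(const SectionHeader &Input, std::string Name) {
  SectionSketch S;
  S.OriginalHeader = Input;
  S.Header = Input;
  S.Name = std::move(Name);
  return S;
}

// Reports whether layout moved the section's raw data within the file.
bool wasMovedInFile(const SectionSketch &S) {
  return S.Header.PointerToRawData != S.OriginalHeader.PointerToRawData;
}
```

With this shape, later passes can compare `Header` against `OriginalHeader` instead of growing individual `Original`-prefixed fields, at the cost of one extra header copy per section.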

				struct Symbol {
				object::coff_symbol32 Sym;
				StringRef Name;
				ArrayRef<uint8_t> AuxData;
				};

				struct Object {
				bool IsPE = false;

				object::dos_header DosHeader;
				ArrayRef<uint8_t> DosStub;

				bool IsBigObj = false;
				object::coff_file_header CoffFileHeader;
				object::coff_bigobj_file_header CoffBigObjFileHeader;

				bool Is64 = false;
				object::pe32_header PeHeader;
				object::pe32plus_header PePlusHeader;

				std::vector<object::data_directory> DataDirectories;
				std::vector<Section> Sections;
				std::vector<Symbol> Symbols;
				};
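The alternative mstorsjo describes in the thread above (a single 64-bit-wide header model plus a `BaseOfData` field, with bitness derived from the machine type) might be sketched as below. All struct and function names are hypothetical and the headers are reduced to a few representative fields; only the machine type values are taken from the PE/COFF specification.

```cpp
#include <cstdint>

// Machine type values from the PE/COFF specification.
constexpr uint16_t MachineI386 = 0x014c;
constexpr uint16_t MachineARMNT = 0x01c4;
constexpr uint16_t MachineAMD64 = 0x8664;
constexpr uint16_t MachineARM64 = 0xaa64;

// Hypothetical cut-down optional headers; the real ones have many more fields.
struct Pe32Sketch {
  uint32_t BaseOfData = 0;
  uint32_t ImageBase = 0;
  uint32_t SizeOfStackReserve = 0;
};
struct Pe32PlusSketch {
  uint64_t ImageBase = 0;
  uint64_t SizeOfStackReserve = 0;
};

// Format-neutral model: 64-bit wide fields plus the PE32-only BaseOfData.
struct PeHeaderModel {
  uint64_t ImageBase = 0;
  uint64_t SizeOfStackReserve = 0;
  uint32_t BaseOfData = 0; // Only meaningful for 32-bit images.
};

// The hardcoded machine-to-bitness mapping that this design requires.
bool is64BitMachine(uint16_t Machine) {
  return Machine == MachineAMD64 || Machine == MachineARM64;
}

PeHeaderModel fromPe32(const Pe32Sketch &H) {
  PeHeaderModel M;
  M.ImageBase = H.ImageBase;
  M.SizeOfStackReserve = H.SizeOfStackReserve;
  M.BaseOfData = H.BaseOfData;
  return M;
}

PeHeaderModel fromPe32Plus(const Pe32PlusSketch &H) {
  PeHeaderModel M;
  M.ImageBase = H.ImageBase;
  M.SizeOfStackReserve = H.SizeOfStackReserve;
  return M;
}

// Narrowing back is lossless as long as the model came from a 32-bit input.
Pe32Sketch toPe32(const PeHeaderModel &M) {
  Pe32Sketch H;
  H.BaseOfData = M.BaseOfData;
  H.ImageBase = static_cast<uint32_t>(M.ImageBase);
  H.SizeOfStackReserve = static_cast<uint32_t>(M.SizeOfStackReserve);
  return H;
}

Pe32PlusSketch toPe32Plus(const PeHeaderModel &M) {
  Pe32PlusSketch H;
  H.ImageBase = M.ImageBase;
  H.SizeOfStackReserve = M.SizeOfStackReserve;
  return H;
}
```

The trade-off is the one raised in the comment: the `Is64` flag disappears from the model, but the writer then has to call something like `is64BitMachine` to pick the on-disk layout.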

				} // end namespace coff
				} // end namespace objcopy
				} // end namespace llvm

				#endif // LLVM_TOOLS_OBJCOPY_COFF_OBJECT_H

tools/llvm-objcopy/COFF/Object.cpp

This file was added.

				//===- Object.cpp ---------------------------------------------------------===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//

				#include "Object.h"
				#include "llvm-objcopy.h"
				#include "llvm/ADT/ArrayRef.h"
				#include "llvm/ADT/STLExtras.h"
				#include "llvm/ADT/StringRef.h"
				#include "llvm/BinaryFormat/COFF.h"
				#include "llvm/Object/COFF.h"
				#include "llvm/Support/Endian.h"
				#include "llvm/Support/ErrorHandling.h"
				#include "llvm/Support/FileOutputBuffer.h"
				#include <algorithm>
				#include <cstddef>
				#include <cstdint>
				#include <vector>

				namespace llvm {
				namespace objcopy {
				namespace coff {

				using namespace object;
				using namespace COFF;

				Writer::~Writer() {}

				Reader::~Reader() {}

				template <class A, class B> static void copySymbol(A &Dest, const B &Src) {
				static_assert(sizeof(Dest.Name.ShortName) == sizeof(Src.Name.ShortName),
				"Mismatched name sizes");
				memcpy(Dest.Name.ShortName, Src.Name.ShortName, sizeof(Dest.Name.ShortName));
				Dest.Value = Src.Value;
				Dest.SectionNumber = Src.SectionNumber;
				Dest.Type = Src.Type;
				Dest.StorageClass = Src.StorageClass;
				Dest.NumberOfAuxSymbols = Src.NumberOfAuxSymbols;
				}

				std::unique_ptr<Object> COFFReader::create() const {
				jhenderson: This is quite a big function. Would it be feasible to split it up into a few helpers?
				mstorsjo: I guess it should be feasible, I'll give it a try.
				auto Obj = llvm::make_unique<Object>();

				const coff_file_header *CFH = nullptr;
				const coff_bigobj_file_header *CBFH = nullptr;
				COFFObj.getCOFFHeader(CFH);
				COFFObj.getCOFFBigObjHeader(CBFH);
				if (CFH) {
				Obj->CoffFileHeader = *CFH;
				Obj->IsBigObj = false;
				} else {
				if (!CBFH)
				reportError(COFFObj.getFileName(),
				make_error<StringError>("No COFF file header returned",
				object_error::parse_failed));
				Obj->CoffBigObjFileHeader = *CBFH;
				Obj->IsBigObj = true;
				}

				const dos_header *DH = COFFObj.getDOSHeader();
				Obj->Is64 = COFFObj.is64();
				if (DH) {
				Obj->IsPE = true;
				Obj->DosHeader = *DH;
				if (DH->AddressOfNewExeHeader > sizeof(*DH))
				Obj->DosStub =
				ArrayRef<uint8_t>(reinterpret_cast<const uint8_t *>(&DH[1]),
				DH->AddressOfNewExeHeader - sizeof(*DH));

				const pe32_header *PE32 = nullptr;
				const pe32plus_header *PE32Plus = nullptr;
				if (COFFObj.is64()) {
				if (auto EC = COFFObj.getPE32PlusHeader(PE32Plus))
				reportError(COFFObj.getFileName(), std::move(EC));
				Obj->PePlusHeader = *PE32Plus;
				} else {
				if (auto EC = COFFObj.getPE32Header(PE32))
				reportError(COFFObj.getFileName(), std::move(EC));
				Obj->PeHeader = *PE32;
				}

				size_t NumberOfDataDirectory =
				PE32Plus ? PE32Plus->NumberOfRvaAndSize : PE32->NumberOfRvaAndSize;
				for (size_t I = 0; I < NumberOfDataDirectory; I++) {
				const data_directory *Dir;
				if (auto EC = COFFObj.getDataDirectory(I, Dir))
				reportError(COFFObj.getFileName(), std::move(EC));
				Obj->DataDirectories.emplace_back(*Dir);
				}
				}

				for (size_t I = 1, E = COFFObj.getNumberOfSections(); I <= E; I++) {
				const coff_section *Sec;
				if (auto EC = COFFObj.getSection(I, Sec))
				reportError(COFFObj.getFileName(), std::move(EC));
				Obj->Sections.push_back(Section());
				Section &S = Obj->Sections.back();
				S.Header = *Sec;
				if (auto EC = COFFObj.getSectionContents(Sec, S.Contents))
				reportError(COFFObj.getFileName(), std::move(EC));
				ArrayRef<coff_relocation> Relocs = COFFObj.getRelocations(Sec);
				S.Relocs.insert(S.Relocs.end(), Relocs.begin(), Relocs.end());
				if (auto EC = COFFObj.getSectionName(Sec, S.Name))
				reportError(COFFObj.getFileName(), std::move(EC));
				if (Sec->hasExtendedRelocations())
				reportError(
				COFFObj.getFileName(),
				make_error<StringError>("Extended relocations not supported yet",
				object_error::parse_failed));
				}

				for (uint32_t I = 0, E = COFFObj.getRawNumberOfSymbols(); I < E;) {
				Expected<COFFSymbolRef> SymOrErr = COFFObj.getSymbol(I);
				if (!SymOrErr)
				reportError(COFFObj.getFileName(), SymOrErr.takeError());
				COFFSymbolRef SymRef = *SymOrErr;

				Obj->Symbols.push_back(Symbol());
				Symbol &Sym = Obj->Symbols.back();
				if (Obj->IsBigObj)
				copySymbol(Sym.Sym,
				*reinterpret_cast<const coff_symbol32 *>(SymRef.getRawPtr()));
				else
				copySymbol(Sym.Sym,
				*reinterpret_cast<const coff_symbol16 *>(SymRef.getRawPtr()));
				if (auto EC = COFFObj.getSymbolName(SymRef, Sym.Name))
				reportError(COFFObj.getFileName(), std::move(EC));
				Sym.AuxData = COFFObj.getSymbolAuxData(SymRef);
				assert((Sym.AuxData.size() % (Obj->IsBigObj ? sizeof(coff_symbol32)
				: sizeof(coff_symbol16))) == 0);
				I += 1 + SymRef.getNumberOfAuxSymbols();
				}

				return Obj;
				}

				template <class CoffHeaderTy, class PeHeaderTy, class SymbolTy>
				void COFFWriter::finalize(CoffHeaderTy &CoffFileHeader, PeHeaderTy &PeHeader) {
				jhenderson: Again, this is quite a big function. Would it be feasible to split it up into a few helpers?
				size_t SizeOfHeaders = 0;
				size_t FileAlignment = 1;
				if (Obj.IsPE) {
				Obj.DosHeader.AddressOfNewExeHeader =
				sizeof(Obj.DosHeader) + Obj.DosStub.size();
				SizeOfHeaders += Obj.DosHeader.AddressOfNewExeHeader + sizeof(PEMagic);

				FileAlignment = PeHeader.FileAlignment;
				PeHeader.NumberOfRvaAndSize = Obj.DataDirectories.size();

				SizeOfHeaders +=
				sizeof(PeHeader) + sizeof(data_directory) * Obj.DataDirectories.size();
				}
				CoffFileHeader.NumberOfSections = Obj.Sections.size();
				SizeOfHeaders += sizeof(CoffFileHeader);
				SizeOfHeaders += sizeof(coff_section) * Obj.Sections.size();
				SizeOfHeaders = alignTo(SizeOfHeaders, FileAlignment);

				// Directly accessing Obj.CoffFileHeader, as CoffBigObjFileHeader doesn't
				// have this field.
				if (!Obj.IsBigObj && Obj.IsPE)
				Obj.CoffFileHeader.SizeOfOptionalHeader =
				sizeof(PeHeader) + sizeof(data_directory) * Obj.DataDirectories.size();

				FileSize = SizeOfHeaders;
				size_t SizeOfInitializedData = 0;
				for (auto &S : Obj.Sections) {
				if (S.Header.SizeOfRawData > 0)
				S.Header.PointerToRawData = FileSize;
				FileSize += S.Header.SizeOfRawData; // For executables, this is aligned to
				// FileAlignment.
				if (S.Header.NumberOfRelocations > 0)
				S.Header.PointerToRelocations = FileSize;
				FileSize += S.Relocs.size() * sizeof(coff_relocation);
				FileSize = alignTo(FileSize, FileAlignment);
				if (S.Name.size() > COFF::NameSize)
				StrTabBuilder.add(S.Name);
				if (S.Header.Characteristics & IMAGE_SCN_CNT_INITIALIZED_DATA)
				SizeOfInitializedData += S.Header.SizeOfRawData;
				}
				if (Obj.IsPE) {
				PeHeader.SizeOfHeaders = SizeOfHeaders;
				PeHeader.SizeOfInitializedData = SizeOfInitializedData;
				PeHeader.CheckSum = 0;
				if (!Obj.Sections.empty()) {
				const Section &S = Obj.Sections.back();
				PeHeader.SizeOfImage =
				alignTo(S.Header.VirtualAddress + S.Header.VirtualSize,
				PeHeader.SectionAlignment);
				}
				}

				size_t SymTabSize = Obj.Symbols.size() * sizeof(SymbolTy);
				for (const auto &S : Obj.Symbols) {
				SymTabSize += S.AuxData.size();
				if (S.Name.size() > COFF::NameSize)
				StrTabBuilder.add(S.Name);
				}

				StrTabBuilder.finalize();

				for (auto &S : Obj.Sections) {
				if (S.Name.size() > COFF::NameSize) {
				snprintf(S.Header.Name, sizeof(S.Header.Name), "/%d",
				(int)StrTabBuilder.getOffset(S.Name));
				} else {
				strncpy(S.Header.Name, S.Name.data(), COFF::NameSize);
				}
				}
				for (auto &S : Obj.Symbols) {
				if (S.Name.size() > COFF::NameSize) {
				S.Sym.Name.Offset.Zeroes = 0;
				S.Sym.Name.Offset.Offset = StrTabBuilder.getOffset(S.Name);
				} else {
				strncpy(S.Sym.Name.ShortName, S.Name.data(), COFF::NameSize);
				}
				}

				size_t StrTabSize = StrTabBuilder.getSize();
				size_t PointerToSymbolTable = FileSize;
				if (SymTabSize == 0 && StrTabSize <= 4) {
				PointerToSymbolTable = 0;
				// For executables, skip the length field of an empty string table.
				if (Obj.IsPE)
				StrTabSize = 0;
				}

				size_t NumRawSymbols = SymTabSize / sizeof(SymbolTy);
				CoffFileHeader.PointerToSymbolTable = PointerToSymbolTable;
				CoffFileHeader.NumberOfSymbols = NumRawSymbols;
				FileSize += SymTabSize + StrTabSize;
				FileSize = alignTo(FileSize, FileAlignment);
				}
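The section-name handling in `finalize` above (short names stored inline, longer names replaced by a `/` followed by the decimal string table offset) can be illustrated in isolation. This is a minimal sketch under stated assumptions: `encodeSectionName` is a hypothetical helper, the string table offset is passed in rather than taken from `StringTableBuilder`, and offsets needing more than 7 digits are left out of scope.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <string>

constexpr size_t NameSize = 8; // Size of the Name field in a section header.

// Fills an 8-byte section-header name field. Names of up to 8 bytes are
// stored inline (zero padded, not NUL terminated when exactly 8 bytes long);
// longer names are stored as "/<decimal offset>" into the string table.
void encodeSectionName(char (&Field)[NameSize], const std::string &Name,
                       size_t StrTabOffset) {
  std::memset(Field, 0, NameSize);
  if (Name.size() <= NameSize) {
    std::memcpy(Field, Name.data(), Name.size());
    return;
  }
  // '/' plus at most 7 decimal digits fit in the field.
  assert(StrTabOffset <= 9999999 && "offset does not fit in 7 digits");
  // Format into a larger temporary so that snprintf's terminating NUL does
  // not cut off the last digit of a 7-digit offset.
  char Tmp[NameSize + 1];
  std::snprintf(Tmp, sizeof(Tmp), "/%zu", StrTabOffset);
  std::memcpy(Field, Tmp, std::strlen(Tmp));
}
```

For example, `.text` is stored inline, while `.debug_info` at string table offset 4 becomes `/4` followed by zero bytes.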

				template <class CoffHeaderTy, class PeHeaderTy, class SymbolTy>
				void COFFWriter::write(CoffHeaderTy &CoffFileHeader, PeHeaderTy &PeHeader) {
				jhenderson: Again, this is quite a big function. Would it be feasible to split it up into a few helpers?
				finalize<CoffHeaderTy, PeHeaderTy, SymbolTy>(CoffFileHeader, PeHeader);

				Buf.allocate(FileSize);

				uint8_t *Ptr = Buf.getBufferStart();
				if (Obj.IsPE) {
				memcpy(Ptr, &Obj.DosHeader, sizeof(Obj.DosHeader));
				Ptr += sizeof(Obj.DosHeader);
				memcpy(Ptr, Obj.DosStub.data(), Obj.DosStub.size());
				Ptr += Obj.DosStub.size();
				memcpy(Ptr, PEMagic, sizeof(PEMagic));
				Ptr += sizeof(PEMagic);
				}
				memcpy(Ptr, &CoffFileHeader, sizeof(CoffFileHeader));
				Ptr += sizeof(CoffFileHeader);
				if (Obj.IsPE) {
				memcpy(Ptr, &PeHeader, sizeof(PeHeader));
				Ptr += sizeof(PeHeader);
				for (const auto &DD : Obj.DataDirectories) {
				memcpy(Ptr, &DD, sizeof(DD));
				Ptr += sizeof(DD);
				}
				}
				for (const auto &S : Obj.Sections) {
				memcpy(Ptr, &S.Header, sizeof(S.Header));
				Ptr += sizeof(S.Header);
				}

				for (const auto &S : Obj.Sections) {
				Ptr = Buf.getBufferStart() + S.Header.PointerToRawData;
				memcpy(Ptr, S.Contents.data(), S.Contents.size());
				if ((S.Header.Characteristics & IMAGE_SCN_CNT_CODE) &&
				S.Header.SizeOfRawData > S.Contents.size())
				memset(Ptr + S.Contents.size(), 0xcc,
				S.Header.SizeOfRawData - S.Contents.size());
				mstorsjo: Open questions to reviewers: Here I pad executable sections with 0xcc, in the same way as LLD does. It was requested to have tests that use yaml input instead of committing binaries, and the tests so far check that objcopy produces bytewise identical files to its input. As yaml2obj doesn't do the same 0xcc padding, we'd either need to make yaml2obj do the same padding as well, or stop doing it here. Which one do you prefer? (Also, LLD/COFF doesn't currently use different paddings for ARM/AArch64, which it probably should, and the same would go here as well.)
				jhenderson: Not too sure, but it might be nice to find a way to specify the padding for any given file. For example, padding with 0xcc doesn't make much sense for x86 output in the data segment, but makes a lot of sense in the text segment.
				mstorsjo: Well, currently the padding is only done for executable sections (`if ((S.Header.Characteristics & IMAGE_SCN_CNT_CODE)` above); other sections are padded with zeros.
				Ptr += S.Header.SizeOfRawData;
				memcpy(Ptr, S.Relocs.data(), S.Relocs.size() * sizeof(coff_relocation));
				}

				Ptr = Buf.getBufferStart() + CoffFileHeader.PointerToSymbolTable;
				for (const auto &S : Obj.Symbols) {
				copySymbol<SymbolTy, coff_symbol32>(*reinterpret_cast<SymbolTy *>(Ptr),
				S.Sym);
				Ptr += sizeof(SymbolTy);
				memcpy(Ptr, S.AuxData.data(), S.AuxData.size());
				Ptr += S.AuxData.size();
				}
				if (StrTabBuilder.getSize() > 4 || !Obj.IsPE) {
				// Always write a string table in object files, even an empty one.
				StrTabBuilder.write(Ptr);
				Ptr += StrTabBuilder.getSize();
				}

				if (Obj.IsPE && Obj.DataDirectories.size() > DEBUG_DIRECTORY)
				patchDebugDirectory();

				if (auto E = Buf.commit())
				reportError(Buf.getName(), errorToErrorCode(std::move(E)));
				}
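The padding question raised in the inline thread above (0xcc for code sections, zeros elsewhere, and possibly a per-file or per-target choice as jhenderson suggests) could be isolated along these lines. This is a sketch only: `PaddingPolicy` and `padSection` are hypothetical names that do not exist in the patch, and the fill values for non-x86 targets are left to the caller rather than guessed.

```cpp
#include <cstdint>
#include <vector>

constexpr uint32_t ScnCntCode = 0x00000020; // IMAGE_SCN_CNT_CODE

// Hypothetical policy object: which byte to use when padding code and data.
struct PaddingPolicy {
  uint8_t CodeFill = 0xcc; // int3 on x86; other targets may want another value.
  uint8_t DataFill = 0x00;
};

// Returns Contents extended to SizeOfRawData bytes, using the fill byte that
// the policy selects for the section's characteristics. Contents that are
// already at least SizeOfRawData bytes long are returned unchanged.
std::vector<uint8_t> padSection(const std::vector<uint8_t> &Contents,
                                uint32_t SizeOfRawData,
                                uint32_t Characteristics,
                                const PaddingPolicy &Policy = PaddingPolicy()) {
  std::vector<uint8_t> Out(Contents);
  if (Out.size() < SizeOfRawData) {
    uint8_t Fill =
        (Characteristics & ScnCntCode) ? Policy.CodeFill : Policy.DataFill;
    Out.resize(SizeOfRawData, Fill);
  }
  return Out;
}
```

Keeping the fill byte in a policy object would let a test that round-trips yaml2obj output set `CodeFill` to zero, while the default keeps the LLD-style 0xcc padding.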

				void COFFWriter::patchDebugDirectory() {
				const data_directory *Dir = &Obj.DataDirectories[DEBUG_DIRECTORY];
				if (Dir->Size <= 0)
				return;
				bool Found = false;
				for (const auto &S : Obj.Sections) {
				if (Dir->RelativeVirtualAddress >= S.Header.VirtualAddress &&
				Dir->RelativeVirtualAddress <
				S.Header.VirtualAddress + S.Header.SizeOfRawData) {
				if (Dir->RelativeVirtualAddress + Dir->Size >
				S.Header.VirtualAddress + S.Header.SizeOfRawData)
				reportError(Buf.getName(),
				make_error<StringError>(
				"Debug directory extends past end of section",
				object_error::parse_failed));
				size_t Offset = Dir->RelativeVirtualAddress - S.Header.VirtualAddress;
				uint8_t *Ptr = Buf.getBufferStart() + S.Header.PointerToRawData + Offset;
				uint8_t *End = Ptr + Dir->Size;
				while (Ptr < End) {
				debug_directory *Debug = reinterpret_cast<debug_directory *>(Ptr);
				Debug->PointerToRawData =
				S.Header.PointerToRawData + Offset + sizeof(debug_directory);
				Ptr += sizeof(debug_directory) + Debug->SizeOfData;
				Offset += sizeof(debug_directory) + Debug->SizeOfData;
				}
				Found = true;
				break;
				}
				}
				if (!Found)
				reportError(Buf.getName(),
				make_error<StringError>("Debug directory missing",
				object_error::parse_failed));
				}

				void COFFWriter::write() {
				assert(!(Obj.IsBigObj && Obj.IsPE) &&
				"Can't have BigObj headers in a PE executable");
				if (Obj.IsBigObj)
				write<coff_bigobj_file_header, pe32_header, coff_symbol32>(
				Obj.CoffBigObjFileHeader, Obj.PeHeader);
				else if (Obj.Is64)
				write<coff_file_header, pe32plus_header, coff_symbol16>(Obj.CoffFileHeader,
				Obj.PePlusHeader);
				else
				write<coff_file_header, pe32_header, coff_symbol16>(Obj.CoffFileHeader,
				Obj.PeHeader);
				}

				} // end namespace coff
				} // end namespace objcopy
				} // end namespace llvm

tools/llvm-objcopy/llvm-objcopy.cpp

	//===- llvm-objcopy.cpp ---------------------------------------------------===//
	//
	// The LLVM Compiler Infrastructure
	//
	// This file is distributed under the University of Illinois Open Source
	// License. See LICENSE.TXT for details.
	//
	//===----------------------------------------------------------------------===//

	#include "llvm-objcopy.h"
	#include "Buffer.h"
+	#include "COFF/COFFObjcopy.h"
	#include "CopyConfig.h"
	#include "ELF/ELFObjcopy.h"

	#include "llvm/ADT/STLExtras.h"
	#include "llvm/ADT/SmallVector.h"
	#include "llvm/ADT/StringRef.h"
	#include "llvm/ADT/Twine.h"
	#include "llvm/Object/Archive.h"
	#include "llvm/Object/ArchiveWriter.h"
	#include "llvm/Object/Binary.h"
+	#include "llvm/Object/COFF.h"
	#include "llvm/Object/ELFObjectFile.h"
	#include "llvm/Object/ELFTypes.h"
	#include "llvm/Object/Error.h"
	#include "llvm/Option/Arg.h"
	#include "llvm/Option/ArgList.h"
	#include "llvm/Option/Option.h"
	#include "llvm/Support/Casting.h"
	#include "llvm/Support/Error.h"
	[90 lines not shown]
	}

	/// The function executeObjcopyOnBinary does the dispatch based on the format
	/// of the input binary (ELF, MachO or COFF).
	static void executeObjcopyOnBinary(const CopyConfig &Config, object::Binary &In,
	Buffer &Out) {
	if (auto *ELFBinary = dyn_cast<object::ELFObjectFileBase>(&In))
	return elf::executeObjcopyOnBinary(Config, *ELFBinary, Out);
+	else if (auto *COFFBinary = dyn_cast<object::COFFObjectFile>(&In))
+	return coff::executeObjcopyOnBinary(Config, *COFFBinary, Out);
	else
	error("Unsupported object file format");
	}

	static void executeObjcopyOnArchive(const CopyConfig &Config,
	const Archive &Ar) {
	std::vector<NewArchiveMember> NewArchiveMembers;
	Error Err = Error::success();
	[93 lines not shown]