This is an archive of the discontinued LLVM Phabricator instance.

[llvm-objcopy] Add support for input types and the -I and -B flags
Needs Review (Public)

Authored by jakehehrlich on Jan 2 2018, 5:09 PM.

Details

Reviewers

Summary

This change adds support for the -I and -B flags from GNU objcopy. The only input type currently supported is "binary", and only 4 architectures (under a total of 5 names) are currently supported. This change adds a constructor for Object and its subtypes that sets up the basic sections and contents that almost any relocatable ELF will have. The constructor needs to know the ELFT and the EMachine code, so some architecture information was needed (oddly, I wasn't able to figure out any good way to piggyback off of LLVM). After that, another method adds the parts that are specific to the binary input type.
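
For illustration, the five -B names exercised by binary-input-archs.test in this revision could be mapped onto ELF parameters roughly as follows. This is a minimal sketch, not the patch's code: MachineInfo, getMachineInfo and the table layout are hypothetical names chosen here, and only the architecture names and the expected class, endianness and machine values come from the test.

```cpp
#include <cstdint>
#include <map>
#include <string>

// ELF e_machine values as defined by the ELF specification.
constexpr uint16_t EM_386 = 3;
constexpr uint16_t EM_ARM = 40;
constexpr uint16_t EM_X86_64 = 62;
constexpr uint16_t EM_AARCH64 = 183;

// Hypothetical record: everything needed to pick an ELFT and fill e_machine.
struct MachineInfo {
  uint16_t EMachine;
  bool Is64Bit;
  bool IsLittleEndian;
};

// Looks up a -B value. Returns nullptr for names that are not supported.
const MachineInfo *getMachineInfo(const std::string &Arch) {
  static const std::map<std::string, MachineInfo> Table = {
      {"aarch64", {EM_AARCH64, true, true}},
      {"arm", {EM_ARM, false, true}},
      {"i386", {EM_386, false, true}},
      {"i386:x86-64", {EM_X86_64, true, true}}, // Two names, one architecture.
      {"x86-64", {EM_X86_64, true, true}},
  };
  auto It = Table.find(Arch);
  return It == Table.end() ? nullptr : &It->second;
}
```

A caller would reject the command line when the lookup returns nullptr, and otherwise use Is64Bit and IsLittleEndian to choose among the four ELF types.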

Diff Detail

Repository: rL LLVM

Event Timeline

jakehehrlich created this revision. Jan 2 2018, 5:09 PM

I'm not sure I'm a big fan of the big stack of functions that gradually pick off an argument, and then choose a specific function to call next. It's effectively a giant nested if statement, whereas what we had before was much more linear. I haven't got a good alternative solution as of yet, but will have a think and try to come up with an alternative.

Otherwise, most of this looks fine. There should be tests for each of the different possible -B values though.

tools/llvm-objcopy/Object.cpp
604	The contents of this function feel a bit out-of-order. It starts with setting stuff to do with sections, then adds a symbol, then sets some properties, based on the sections, then goes off and initialises the elf header, before coming back and updating some more properties to do with the sections. Could you maybe reorder things as follows:
1. Initialise the ELF header, probably in a separate function called by this function.
2. Create the string table section, and add it to the Sections array.
3. Create the symbol table, set the string table property, add the symbol, and add it to the array.
This should make the function a little easier to follow.
tools/llvm-objcopy/llvm-objcopy.cpp
354–356	I wonder if this is an indication that Symbols should own their own names. It would mean a bit of copying in the ELF input case, but could prevent easy-to-make errors if we want to create or rename symbols. If you prefer keeping it as is, I'd make a separate function called "MakeBinarySymbolName(StringRef BaseName, StringRef Suffix)", so that the warts of the name ownership can be kept separate from the adding of symbols, and it can be reused in other places too. Did you consider making InputBinaryFormat a subclass of Object? That would allow you to have a slightly nicer name ownership resolution, apart from anything else.
421	This function should be static (along with the rest of the new functions).
438	I'm not sure this function is well named, given that this function is effectively the driver of the rest of objcopy - it's not really handling the input. It's doing all the work. I'd even go so far as to say that it should be inlined into main. If you do want to keep the function, I'd rename Arch to InArch, and change the argument order to Input, InFmt, InArch, OutFmt (i.e. keeping all input-related arguments together).

I just did some experimenting with GNU objcopy with this, and I'm getting a binary copy of the input, when using -I binary, even with -B i386:x86-64. I can workaround this using -O elf64-x86-64, i.e. "objcopy text.txt text.o -B i386:x86-64 -I binary -O elf64-x86-64". If I drop the -B, I get an EM_NONE machine type, and if I drop the -O, I get a text file. Not sure how you want to handle this!

In D41687#966597, @jhenderson wrote:

I just did some experimenting with GNU objcopy with this, and I'm getting a binary copy of the input, when using -I binary, even with -B i386:x86-64. I can workaround this using -O elf64-x86-64, i.e. "objcopy text.txt text.o -B i386:x86-64 -I binary -O elf64-x86-64". If I drop the -B, I get an EM_NONE machine type, and if I drop the -O, I get a text file. Not sure how you want to handle this!

Well for command line compatibility we're going to have to accept "-O elf64-x86-64" which we don't do right now. In fact for the specific use case that I'm trying to solve here I'll need that to work (whoops). I'm fine copying the behavior of GNU objcopy here but I might add it in two more changes. One change will add extra output formats (like elf64-x86-64) and the other will make the default output type the corresponding input type. Those two changes together with this change should produce command line compatibility. Sound good?

I haven't figured out a solution to the "giant nested if-statement" problem yet. There's a sense in which it's largely intractable because we have to use dynamic information to dispatch to 1 of 4 statically known functions of different types. I'm still thinking about it.

Added test for each -B option (as future architectures/names are added then that test can be extended)
Made any non-template function static (I think it was just the one)
Reordered Object(uint16_t) as mentioned
Made symbols own their names (amazingly this was as simple as just replacing StringRef with std::string in one place)
Made addSymbol take a Twine instead of a StringRef for the symbol name (strictly more general and suits the symbol generation case used in InputBinaryFormat)
Eh...I may be forgetting something but I only changed things that were requested. Hopefully I changed everything that was requested.
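
To illustrate why owned names make generated symbols safe, here is a minimal sketch under simplifying assumptions. Symbol and SymbolTable are cut-down stand-ins for the patch's types, std::string replaces both StringRef and Twine, and addBinarySymbols is a hypothetical helper. The values mirror the _start, _end and _size symbols checked in binary-input.test.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-in for the patch's Symbol: only the fields needed here.
struct Symbol {
  std::string Name; // Owned copy, so it outlives the buffer it was built from.
  uint64_t Value;
};

class SymbolTable {
  std::vector<std::unique_ptr<Symbol>> Symbols;

public:
  // Takes the name by value so callers can pass a temporary concatenation.
  void addSymbol(std::string Name, uint64_t Value) {
    std::unique_ptr<Symbol> Sym(new Symbol{std::move(Name), Value});
    Symbols.push_back(std::move(Sym));
  }
  const Symbol &get(std::size_t Index) const { return *Symbols[Index]; }
  std::size_t size() const { return Symbols.size(); }
};

// Hypothetical helper: each name is a temporary that is destroyed right
// after the call, which would dangle if Symbol only stored a reference.
void addBinarySymbols(SymbolTable &Tab, const std::string &Prefix,
                      uint64_t Size) {
  Tab.addSymbol(Prefix + "_start", 0);
  Tab.addSymbol(Prefix + "_end", Size);
  Tab.addSymbol(Prefix + "_size", Size);
}
```

With a non-owning name type, each of the three calls in addBinarySymbols would leave the table pointing at freed storage; the copy is the price of avoiding that class of bug.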

Context missing.

In D41687#966842, @jakehehrlich wrote:

In D41687#966597, @jhenderson wrote:

I just did some experimenting with GNU objcopy with this, and I'm getting a binary copy of the input, when using -I binary, even with -B i386:x86-64. I can workaround this using -O elf64-x86-64, i.e. "objcopy text.txt text.o -B i386:x86-64 -I binary -O elf64-x86-64". If I drop the -B, I get an EM_NONE machine type, and if I drop the -O, I get a text file. Not sure how you want to handle this!

Well for command line compatibility we're going to have to accept "-O elf64-x86-64" which we don't do right now. In fact for the specific use case that I'm trying to solve here I'll need that to work (whoops). I'm fine copying the behavior of GNU objcopy here but I might add it in two more changes. One change will add extra output formats (like elf64-x86-64) and the other will make the default output type the corresponding input type. Those two changes together with this change should produce command line compatibility. Sound good?

Yes, sounds good.

test/tools/llvm-objcopy/binary-input-archs.test
2	You don't need to change this test in this commit, but it will need updating once you make the -O format changes.
test/tools/llvm-objcopy/binary-input.test
3	Could you modify this test slightly, please, to exercise the non-alnum replacement code. The easiest thing to do would be to add a file extension to the input file (e.g. make it %t.bin rather than %t).
tools/llvm-objcopy/llvm-objcopy.cpp
354–356	The comment needs updating now that symbols do own their own names, and I think the StringSaver header probably can be deleted too.

Removed comment and include for StringSaver stuff
Tweaked the test to show that stemming and replacement of non-alphanumeric characters both happen

Adding ".bin" to %t isn't really ideal. It means that ".tmp" winds up in the stemmed file name, but that extension probably isn't something we should rely on. It turns out (via a means we definitely shouldn't rely on) that the replacement of non-alphanumeric characters was already being tested, because binary-input.test was the stem. I made this explicit in the test so that even if the test file's name changes, the test will still exercise the appropriate things. I added ".bin" as the extension instead of ".tmp". This also shows more explicitly that stemming works correctly.

On the topic of removing the implicit "giant if", I'm not sure there's a great way to solve this. I've considered everything from macros (which are simple and should work) to some rather overly complicated dynamic solutions that check a condition for each ELF type for you and then dispatch to the corresponding template, but a) I wasn't able to get them to work properly and b) they were super complicated. In general you're going to have at least 8 branches because a) there are 4 templates and you need to dispatch to the correct one and b) there are two fundamentally different sources of information that need to be considered to decide which of those 4 should be used. Removing the templates we can from Object doesn't help either, because how we read in the object and how we decide to write out the object still require the same number of branches. Using macros to ensure uniformity of branches is one idea that a) works and b) is simple, but I think I'd rather know explicitly what's going on.

The macro I have in mind looks like this:

FOREACH_ELFT(auto *o = dyn_cast<ELFObjectFile<ELFT>>(&Binary), {
  HandleELF(*o, OutFmt);
  return;
})

ELFT is then typedef'd in each scope, and the scope is copied 4 times, once for each type. ELFT is then available in both the block and the condition. The code block is only executed if the condition is true. For the case where MInfo is used we can do something that looks like this:

FOREACH_ELFT(elftMatchesMachine<ELFT>(MInfo), {
  HandleBinary<ELFT>(Input, OutFmt, MInfo);
})

Using those two changes we can ensure uniformity of these dispatches. Coupling that with a function that returns a unique pointer to an Object and uses the output format to decide which of ELFObject or BinaryObject will be used brings our apparent branches down to a more manageable number. I'm slightly against both of these solutions, but I could be convinced otherwise with minimal effort. The output format decider in particular seems fine to me.
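
One way such a macro could be written is sketched below. This is an assumption-laden illustration rather than code from the revision: the four ELF type structs, MachineInfo, handleBinary and dispatch are simplified stand-ins, the helper macro FOREACH_ELFT_CASE is invented here, and the block is passed variadically so that commas inside it do not split the macro arguments.

```cpp
#include <cstdint>
#include <string>

// Simplified stand-ins for LLVM's four ELF type descriptors.
struct ELF32LE {
  static constexpr bool Is64Bits = false, IsLittleEndian = true;
  static const char *name() { return "ELF32LE"; }
};
struct ELF64LE {
  static constexpr bool Is64Bits = true, IsLittleEndian = true;
  static const char *name() { return "ELF64LE"; }
};
struct ELF32BE {
  static constexpr bool Is64Bits = false, IsLittleEndian = false;
  static const char *name() { return "ELF32BE"; }
};
struct ELF64BE {
  static constexpr bool Is64Bits = true, IsLittleEndian = false;
  static const char *name() { return "ELF64BE"; }
};

// Hypothetical description of the machine selected by -B.
struct MachineInfo {
  uint16_t EMachine;
  bool Is64Bit;
  bool IsLittleEndian;
};

template <class ELFT> bool elftMatchesMachine(const MachineInfo &MI) {
  return ELFT::Is64Bits == MI.Is64Bit &&
         ELFT::IsLittleEndian == MI.IsLittleEndian;
}

// One copy of the scope: ELFT is visible to the condition and to the block.
#define FOREACH_ELFT_CASE(TYPE, COND, ...)                                     \
  {                                                                            \
    using ELFT = TYPE;                                                         \
    if (COND)                                                                  \
      __VA_ARGS__                                                              \
  }

// Stamps out the scope once per ELF type, in a fixed order.
#define FOREACH_ELFT(COND, ...)                                                \
  FOREACH_ELFT_CASE(ELF32LE, COND, __VA_ARGS__)                                \
  FOREACH_ELFT_CASE(ELF64LE, COND, __VA_ARGS__)                                \
  FOREACH_ELFT_CASE(ELF32BE, COND, __VA_ARGS__)                                \
  FOREACH_ELFT_CASE(ELF64BE, COND, __VA_ARGS__)

// Stand-in for the templated handler; it just reports which type ran.
template <class ELFT> std::string handleBinary() { return ELFT::name(); }

std::string dispatch(const MachineInfo &MInfo) {
  FOREACH_ELFT(elftMatchesMachine<ELFT>(MInfo),
               { return handleBinary<ELFT>(); })
  return "unsupported";
}
```

The macro guarantees the four branches stay textually identical, at the cost of hiding the control flow, which is the trade-off weighed in the comment above.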

tools/llvm-objcopy/llvm-objcopy.cpp
354–356	I think switching to Symbols owning their names is a good idea. Symbols and relocations are likely to be the sticking point for optimization at some point in the future, but I'd rather use the conceptually simplest option now and optimize later when we have an issue. I think the biggest optimization for symbol tables will come from lazy loading and not from optimizing the copying of small strings like that. As for making InputBinaryFormat a subclass of Object, I'm not sure. I didn't consider it, but considering it now, I was intending for those subclasses to be the output formats. This change does raise the question of how input formats should be handled, however. For instance, why is the binary input format handled here while the ELF input format is handled inside of Object? That seems kind of off to me. I remember you mentioned an idea a while back about having reader and writer types that map in and out of a common representation. Maybe we should refactor the Object code to expose enough of an interface that code outside of Object can reconstruct the ELF Object the way this code does for the binary input case.

For instance, why is the binary input format handled here while the ELF input format is handled inside of Object? That seems kind of off to me. I remember you mentioned an idea a while back about having reader and writer types that map in and out of a common representation. Maybe we should refactor the Object code to expose enough of an interface that code outside of Object can reconstruct the ELF Object the way this code does for the binary input case.

I think this is exactly the issue with the big "if" statement, and also the naming of the functions. At the moment, we have a path which goes something like: HandleBinaryELFT -> HandleBinary -> HandleArgs (binary object), so we're converting from a binary input into an ELF object, and then emitting it as a binary object. The whole conversion to ELF seems wrong here. The advantage of having separate classes for readers and writers, with a common intermediate representation, is that the process looks a lot clearer.

Your code could look something like:

InputReader Reader(InputFile, InputFormat);
Object *Obj = Reader.createObject(); // Creates an intermediate object based on the input type.
                                     // There would be 5 different paths (one for each ELF format, and one for binary).
Obj->handleArgs(Args);               // Removes/add sections/symbols etc.
Obj->write(OutputFormat);            // Performs section and segment layout and writes out the file, according to the output format.

Note how the "handleArgs" call is hoisted up out of the depths of the nested call, and how the writing is handled separately. It would allow easy adding of new input and output formats, by simply extending the corresponding createObject/write implementation.

Does this sound plausible to you?
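
The reader/writer split described above can be sketched in a self-contained form as follows. Everything here is hypothetical and heavily simplified: Section, Object, Reader, Writer, BinaryReader, BinaryWriter and copyObject are illustrative names, the intermediate Object holds only named byte sections, and no ELF layout is attempted.

```cpp
#include <cstdint>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Section {
  std::string Name;
  std::vector<uint8_t> Contents;
};

// Format-independent intermediate representation.
struct Object {
  std::vector<Section> Sections;
};

// One Reader subclass per input format.
class Reader {
public:
  virtual ~Reader() = default;
  virtual std::unique_ptr<Object> create() const = 0;
};

// One Writer subclass per output format.
class Writer {
public:
  virtual ~Writer() = default;
  virtual std::vector<uint8_t> write(const Object &Obj) const = 0;
};

// Binary input: the whole file becomes the contents of a .data section.
class BinaryReader : public Reader {
  std::vector<uint8_t> Input;

public:
  explicit BinaryReader(std::vector<uint8_t> Data) : Input(std::move(Data)) {}
  std::unique_ptr<Object> create() const override {
    std::unique_ptr<Object> Obj(new Object());
    Obj->Sections.push_back(Section{".data", Input});
    return Obj;
  }
};

// Binary output: section contents are concatenated in order.
class BinaryWriter : public Writer {
public:
  std::vector<uint8_t> write(const Object &Obj) const override {
    std::vector<uint8_t> Out;
    for (const Section &Sec : Obj.Sections)
      Out.insert(Out.end(), Sec.Contents.begin(), Sec.Contents.end());
    return Out;
  }
};

// The driver is linear: read, (handle arguments), write.
std::vector<uint8_t> copyObject(const Reader &In, const Writer &Out) {
  std::unique_ptr<Object> Obj = In.create();
  // Argument handling (removing or adding sections and symbols) would go here.
  return Out.write(*Obj);
}
```

Adding an input or output format then means adding one Reader or Writer subclass, and the driver does not need another branch.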

test/tools/llvm-objcopy/binary-input-test.bin
1	I don't think you meant to add this file?
test/tools/llvm-objcopy/binary-input.test
1	I think you probably want %T, not %p, since that's the test output directory, not the test source directory.
112	I just did some experimenting, and these symbols should include the file extension (so in the current case, this one should be "_binary_binary_input_test_bin_start"), according to GNU objcopy. I think you want to be using filename() instead of stem().
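
As a rough sketch of the naming rule discussed in these comments: take the input's file name including its extension, replace every non-alphanumeric character with an underscore, and wrap the result in "_binary_" and a suffix. The function name makeBinarySymbolName and its signature are assumptions (loosely following the MakeBinarySymbolName suggestion earlier in this review), and reducing the path to its last component follows the filename() suggestion rather than any verified GNU objcopy behaviour.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: builds "_binary_<sanitized file name>_<suffix>".
std::string makeBinarySymbolName(const std::string &InputPath,
                                 const std::string &Suffix) {
  // Keep only the last path component, including the extension.
  std::string::size_type Slash = InputPath.find_last_of('/');
  std::string File =
      Slash == std::string::npos ? InputPath : InputPath.substr(Slash + 1);

  std::string Name = "_binary_";
  // Replace anything that is not a letter or digit with an underscore.
  for (unsigned char C : File)
    Name += std::isalnum(C) ? static_cast<char>(C) : '_';
  Name += '_';
  Name += Suffix;
  return Name;
}
```

For an input named binary-input-test.bin and the suffix "start", this yields _binary_binary_input_test_bin_start, the name the reviewer expects.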

I plan on adding a refactoring to use an InputReader and possibly a Writer object and detemplating Object to make the refactoring work. I think that's a) quite doable and b) a very good idea. I'm going to post the refactoring in another change and put this one on hold for a bit and then rebase the refactor into this change so there isn't one giant change all at once.

test/tools/llvm-objcopy/binary-input-test.bin
1	Uh, crap, my %p solution was wrong. Now I see that you commented on that. Whoops.

rupprecht mentioned this in D50117: [llvm-objcopy] NFC: Refactor main objcopy method that takes an ELFReader to a generic Reader. Aug 1 2018, 11:54 AM

rupprecht mentioned this in D50343: [llvm-objcopy] Add support for -I binary -B <arch>. Aug 6 2018, 11:34 AM

rupprecht mentioned this in rL340070: [llvm-objcopy] Add support for -I binary -B <arch>. Aug 17 2018, 11:52 AM

Revision Contents

Path                                              Size
test/tools/llvm-objcopy/binary-input-archs.test   32 lines
test/tools/llvm-objcopy/binary-input-test.bin     1 line
test/tools/llvm-objcopy/binary-input.test         138 lines
tools/llvm-objcopy/Object.h                       23 lines
tools/llvm-objcopy/Object.cpp                     49 lines
tools/llvm-objcopy/llvm-objcopy.cpp               195 lines

Diff 128987

test/tools/llvm-objcopy/binary-input-archs.test

This file was added.

				# RUN: printf 0000 > %t

				# RUN: llvm-objcopy -I binary -B i386:x86-64 %t %t2
				# RUN: llvm-readobj -file-headers %t2 | FileCheck %s --check-prefix=ARCH_x86_64

				# RUN: llvm-objcopy -I binary -B i386 %t %t2
				# RUN: llvm-readobj -file-headers %t2 | FileCheck %s --check-prefix=ARCH_i386

				# RUN: llvm-objcopy -I binary -B x86-64 %t %t2
				# RUN: llvm-readobj -file-headers %t2 | FileCheck %s --check-prefix=ARCH_x86_64

				# RUN: llvm-objcopy -I binary -B arm %t %t2
				# RUN: llvm-readobj -file-headers %t2 | FileCheck %s --check-prefix=ARCH_arm32

				# RUN: llvm-objcopy -I binary -B aarch64 %t %t2
				# RUN: llvm-readobj -file-headers %t2 | FileCheck %s --check-prefix=ARCH_aarch64

				# ARCH_i386: Class: 32-bit
				# ARCH_i386: DataEncoding: LittleEndian
				# ARCH_i386: Machine: EM_386

				# ARCH_x86_64: Class: 64-bit
				# ARCH_x86_64: DataEncoding: LittleEndian
				# ARCH_x86_64: Machine: EM_X86_64

				# ARCH_arm32: Class: 32-bit
				# ARCH_arm32: DataEncoding: LittleEndian
				# ARCH_arm32: Machine: EM_ARM

				# ARCH_aarch64: Class: 64-bit
				# ARCH_aarch64: DataEncoding: LittleEndian
				# ARCH_aarch64: Machine: EM_AARCH64

test/tools/llvm-objcopy/binary-input-test.bin

This file was added.

				0000
				No newline at end of file

test/tools/llvm-objcopy/binary-input.test

This file was added.

				# RUN: printf 0000 > %p/binary-input-test.bin
				# RUN: llvm-objcopy -I binary -B i386:x86-64 %p/binary-input-test.bin %t2
				# RUN: llvm-readobj -file-headers -sections -symbols %t2 | FileCheck %s

				# CHECK: ElfHeader {
				# CHECK-NEXT: Ident {
				# CHECK-NEXT: Magic: (7F 45 4C 46)
				# CHECK-NEXT: Class: 64-bit (0x2)
				# CHECK-NEXT: DataEncoding: LittleEndian (0x1)
				# CHECK-NEXT: FileVersion: 1
				# CHECK-NEXT: OS/ABI: SystemV (0x0)
				# CHECK-NEXT: ABIVersion: 0
				# CHECK-NEXT: Unused: (00 00 00 00 00 00 00)
				# CHECK-NEXT: }
				# CHECK-NEXT: Type: Relocatable (0x1)
				# CHECK-NEXT: Machine: EM_X86_64 (0x3E)
				# CHECK-NEXT: Version: 1
				# CHECK-NEXT: Entry: 0x0
				# CHECK-NEXT: ProgramHeaderOffset:
				# CHECK-NEXT: SectionHeaderOffset:
				# CHECK-NEXT: Flags [ (0x0)
				# CHECK-NEXT: ]
				# CHECK-NEXT: HeaderSize: 64
				# CHECK-NEXT: ProgramHeaderEntrySize:
				# CHECK-NEXT: ProgramHeaderCount: 0
				# CHECK-NEXT: SectionHeaderEntrySize: 64
				# CHECK-NEXT: SectionHeaderCount: 4
				# CHECK-NEXT: StringTableSectionIndex: 1
				# CHECK-NEXT: }

				# CHECK: Sections [
				# CHECK-NEXT: Section {
				# CHECK-NEXT: Index: 0
				# CHECK-NEXT: Name: (0)
				# CHECK-NEXT: Type: SHT_NULL (0x0)
				# CHECK-NEXT: Flags [ (0x0)
				# CHECK-NEXT: ]
				# CHECK-NEXT: Address: 0x0
				# CHECK-NEXT: Offset:
				# CHECK-NEXT: Size:
				# CHECK-NEXT: Link: 0
				# CHECK-NEXT: Info: 0
				# CHECK-NEXT: AddressAlignment: 0
				# CHECK-NEXT: EntrySize: 0
				# CHECK-NEXT: }
				# CHECK-NEXT: Section {
				# CHECK-NEXT: Index: 1
				# CHECK-NEXT: Name: .strtab
				# CHECK-NEXT: Type: SHT_STRTAB
				# CHECK-NEXT: Flags [ (0x0)
				# CHECK-NEXT: ]
				# CHECK-NEXT: Address: 0x0
				# CHECK-NEXT: Offset:
				# CHECK-NEXT: Size:
				# CHECK-NEXT: Link: 0
				# CHECK-NEXT: Info: 0
				# CHECK-NEXT: AddressAlignment: 1
				# CHECK-NEXT: EntrySize: 0
				# CHECK-NEXT: }
				# CHECK-NEXT: Section {
				# CHECK-NEXT: Index: 2
				# CHECK-NEXT: Name: .symtab
				# CHECK-NEXT: Type: SHT_SYMTAB
				# CHECK-NEXT: Flags [ (0x0)
				# CHECK-NEXT: ]
				# CHECK-NEXT: Address: 0x0
				# CHECK-NEXT: Offset:
				# CHECK-NEXT: Size:
				# CHECK-NEXT: Link: 1
				# CHECK-NEXT: Info: 2
				# CHECK-NEXT: AddressAlignment: 8
				# CHECK-NEXT: EntrySize: 24
				# CHECK-NEXT: }
				# CHECK-NEXT: Section {
				# CHECK-NEXT: Index: 3
				# CHECK-NEXT: Name: .data
				# CHECK-NEXT: Type: SHT_PROGBITS
				# CHECK-NEXT: Flags [
				# CHECK-NEXT: SHF_ALLOC
				# CHECK-NEXT: SHF_WRITE
				# CHECK-NEXT: ]
				# CHECK-NEXT: Address: 0x0
				# CHECK-NEXT: Offset:
				# CHECK-NEXT: Size: 4
				# CHECK-NEXT: Link: 0
				# CHECK-NEXT: Info: 0
				# CHECK-NEXT: AddressAlignment: 1
				# CHECK-NEXT: EntrySize: 0
				# CHECK-NEXT: }
				# CHECK-NEXT: ]

				#CHECK: Symbols [
				#CHECK-NEXT: Symbol {
				#CHECK-NEXT: Name:
				#CHECK-NEXT: Value: 0x0
				#CHECK-NEXT: Size: 0
				#CHECK-NEXT: Binding: Local (0x0)
				#CHECK-NEXT: Type: None (0x0)
				#CHECK-NEXT: Other: 0
				#CHECK-NEXT: Section: Undefined (0x0)
				#CHECK-NEXT: }
				#CHECK-NEXT: Symbol {
				#CHECK-NEXT: Name:
				#CHECK-NEXT: Value: 0x0
				#CHECK-NEXT: Size: 0
				#CHECK-NEXT: Binding: Local (0x0)
				#CHECK-NEXT: Type: Section (0x3)
				#CHECK-NEXT: Other: 0
				#CHECK-NEXT: Section: .data
				#CHECK-NEXT: }
				#CHECK-NEXT: Symbol {
				#CHECK-NEXT: Name: _binary_binary_input_test_start
				#CHECK-NEXT: Value: 0x0
				#CHECK-NEXT: Size: 0
				#CHECK-NEXT: Binding: Global (0x1)
				#CHECK-NEXT: Type: None (0x0)
				#CHECK-NEXT: Other: 0
				#CHECK-NEXT: Section: .data
				#CHECK-NEXT: }
				#CHECK-NEXT: Symbol {
				#CHECK-NEXT: Name: _binary_binary_input_test_end
				#CHECK-NEXT: Value: 0x4
				#CHECK-NEXT: Size: 0
				#CHECK-NEXT: Binding: Global (0x1)
				#CHECK-NEXT: Type: None (0x0)
				#CHECK-NEXT: Other: 0
				#CHECK-NEXT: Section: .data
				#CHECK-NEXT: }
				#CHECK-NEXT: Symbol {
				#CHECK-NEXT: Name: _binary_binary_input_test_size
				#CHECK-NEXT: Value: 0x4
				#CHECK-NEXT: Size: 0
				#CHECK-NEXT: Binding: Global (0x1)
				#CHECK-NEXT: Type: None (0x0)
				#CHECK-NEXT: Other: 0
				#CHECK-NEXT: Section: Absolute (0xFFF1)
				#CHECK-NEXT: }
				#CHECK-NEXT: ]

tools/llvm-objcopy/Object.h

@@ 182 lines hidden @@ enum SymbolShndxType {
   SYMBOL_HEXAGON_SCOMMON_8 = ELF::SHN_HEXAGON_SCOMMON_8,
 };

 struct Symbol {
   uint8_t Binding;
   SectionBase *DefinedIn = nullptr;
   SymbolShndxType ShndxType;
   uint32_t Index;
-  StringRef Name;
+  std::string Name;
   uint32_t NameIndex;
   uint64_t Size;
   uint8_t Type;
   uint64_t Value;
   uint8_t Visibility;

   uint16_t getShndx() const;
 };

 class SymbolTableSection : public SectionBase {
 protected:
   std::vector<std::unique_ptr<Symbol>> Symbols;
   StringTableSection *SymbolNames = nullptr;

   using SymPtr = std::unique_ptr<Symbol>;

 public:
+  SymbolTableSection() {
+    Type = ELF::SHT_SYMTAB;
+    Align = 8;
+  }
   void setStrTab(StringTableSection *StrTab) { SymbolNames = StrTab; }
-  void addSymbol(StringRef Name, uint8_t Bind, uint8_t Type,
-                 SectionBase *DefinedIn, uint64_t Value, uint8_t Visibility,
-                 uint16_t Shndx, uint64_t Sz);
+  void addSymbol(Twine Name, uint8_t Bind, uint8_t Type, SectionBase *DefinedIn,
+                 uint64_t Value, uint8_t Visibility, uint16_t Shndx,
+                 uint64_t Sz);
   void addSymbolNames();
   const SectionBase *getStrTab() const { return SymbolNames; }
   const Symbol *getSymbolByIndex(uint32_t Index) const;
   void removeSectionReferences(const SectionBase *Sec) override;
   void localize(std::function<bool(const Symbol &)> ToLocalize);
   void initialize(SectionTableRef SecTable) override;
   void finalize() override;

   static bool classof(const SectionBase *S) {
     return S->Type == ELF::SHT_SYMTAB;
   }
 };

 // Only writeSection depends on the ELF type so we implement it in a subclass.
 template <class ELFT> class SymbolTableSectionImpl : public SymbolTableSection {
+public:
+  SymbolTableSectionImpl() { EntrySize = sizeof(typename ELFT::Sym); }
   void writeSection(FileOutputBuffer &Out) const override;
 };

 struct Relocation {
   const Symbol *RelocSymbol = nullptr;
   uint64_t Offset;
   uint64_t Addend;
   uint32_t Type;
@@ 113 lines hidden @@
 private:
   using SecPtr = std::unique_ptr<SectionBase>;
   using SegPtr = std::unique_ptr<Segment>;

   using Elf_Shdr = typename ELFT::Shdr;
   using Elf_Ehdr = typename ELFT::Ehdr;
   using Elf_Phdr = typename ELFT::Phdr;

+  void initHeader(uint16_t EMachine);
   void initSymbolTable(const object::ELFFile<ELFT> &ElfFile,
                        SymbolTableSection *SymTab, SectionTableRef SecTable);
   SecPtr makeSection(const object::ELFFile<ELFT> &ElfFile,
                      const Elf_Shdr &Shdr);
   void readProgramHeaders(const object::ELFFile<ELFT> &ElfFile);
   SectionTableRef readSectionHeaders(const object::ELFFile<ELFT> &ElfFile);

 protected:
@@ 12 lines hidden @@ public:
   uint64_t Entry;
   uint64_t SHOffset;
   uint32_t Type;
   uint32_t Machine;
   uint32_t Version;
   uint32_t Flags;
   bool WriteSectionHeaders = true;

+  Object(uint16_t EMachine);
   Object(const object::ELFObjectFile<ELFT> &Obj);
   virtual ~Object() = default;

-  SymbolTableSection *getSymTab() const { return SymbolTable; }
+  SymbolTableSection *getSymTab() { return SymbolTable; }
+  const SymbolTableSection *getSymTab() const { return SymbolTable; }
   const SectionBase *getSectionHeaderStrTab() const { return SectionNames; }
   void removeSections(std::function<bool(const SectionBase &)> ToRemove);
-  void addSection(StringRef SecName, ArrayRef<uint8_t> Data);
+  SectionBase *addSection(StringRef SecName, ArrayRef<uint8_t> Data);
   virtual size_t totalSize() const = 0;
   virtual void finalize() = 0;
   virtual void write(FileOutputBuffer &Out) const = 0;
 };

 template <class ELFT> class ELFObject : public Object<ELFT> {
 private:
   using SecPtr = std::unique_ptr<SectionBase>;
   using SegPtr = std::unique_ptr<Segment>;

   using Elf_Shdr = typename ELFT::Shdr;
   using Elf_Ehdr = typename ELFT::Ehdr;
   using Elf_Phdr = typename ELFT::Phdr;

   void sortSections();
   void assignOffsets();

 public:
+  ELFObject(uint16_t EMachine) : Object<ELFT>(EMachine) {}
   ELFObject(const object::ELFObjectFile<ELFT> &Obj) : Object<ELFT>(Obj) {}

   void finalize() override;
   size_t totalSize() const override;
   void write(FileOutputBuffer &Out) const override;
 };

 template <class ELFT> class BinaryObject : public Object<ELFT> {
 private:
   using SecPtr = std::unique_ptr<SectionBase>;
   using SegPtr = std::unique_ptr<Segment>;

   uint64_t TotalSize;

 public:
+  BinaryObject(uint16_t EMachine) : Object<ELFT>(EMachine) {}
   BinaryObject(const object::ELFObjectFile<ELFT> &Obj) : Object<ELFT>(Obj) {}

   void finalize() override;
   size_t totalSize() const override;
   void write(FileOutputBuffer &Out) const override;
 };

 } // end namespace llvm

 #endif // LLVM_TOOLS_OBJCOPY_OBJECT_H

tools/llvm-objcopy/Object.cpp

Show First 20 Lines • Show All 133 Lines • ▼ Show 20 Lines	uint16_t Symbol::getShndx() const {
case SYMBOL_HEXAGON_SCOMMON_2:		case SYMBOL_HEXAGON_SCOMMON_2:
case SYMBOL_HEXAGON_SCOMMON_4:		case SYMBOL_HEXAGON_SCOMMON_4:
case SYMBOL_HEXAGON_SCOMMON_8:		case SYMBOL_HEXAGON_SCOMMON_8:
return static_cast<uint16_t>(ShndxType);		return static_cast<uint16_t>(ShndxType);
}		}
llvm_unreachable("Symbol with invalid ShndxType encountered");		llvm_unreachable("Symbol with invalid ShndxType encountered");
}		}

void SymbolTableSection::addSymbol(StringRef Name, uint8_t Bind, uint8_t Type,		void SymbolTableSection::addSymbol(Twine Name, uint8_t Bind, uint8_t Type,
SectionBase *DefinedIn, uint64_t Value,		SectionBase *DefinedIn, uint64_t Value,
uint8_t Visibility, uint16_t Shndx,		uint8_t Visibility, uint16_t Shndx,
uint64_t Sz) {		uint64_t Sz) {
Symbol Sym;		Symbol Sym;
Sym.Name = Name;		Sym.Name = Name.str();
Sym.Binding = Bind;		Sym.Binding = Bind;
Sym.Type = Type;		Sym.Type = Type;
Sym.DefinedIn = DefinedIn;		Sym.DefinedIn = DefinedIn;
if (DefinedIn == nullptr) {		if (DefinedIn == nullptr) {
if (Shndx >= SHN_LORESERVE)		if (Shndx >= SHN_LORESERVE)
Sym.ShndxType = static_cast<SymbolShndxType>(Shndx);		Sym.ShndxType = static_cast<SymbolShndxType>(Shndx);
else		else
Sym.ShndxType = SYMBOL_SIMPLE_INDEX;		Sym.ShndxType = SYMBOL_SIMPLE_INDEX;
▲ Show 20 Lines • Show All 426 Lines • ▼ Show 20 Lines	if (auto RelSec = dyn_cast<RelocationSection<ELFT>>(Section.get())) {
initRelocations(RelSec, SymbolTable,		initRelocations(RelSec, SymbolTable,
unwrapOrError(ElfFile.relas(Shdr)));		unwrapOrError(ElfFile.relas(Shdr)));
}		}
}		}

return SecTable;		return SecTable;
}		}

		template <class ELFT> void SetIdent(uint8_t Ident[16]) {
		std::fill(Ident, Ident + 16, 0);
		Ident[EI_MAG0] = 0x7f;
		Ident[EI_MAG1] = 'E';
		Ident[EI_MAG2] = 'L';
		Ident[EI_MAG3] = 'F';
		Ident[EI_CLASS] = ELFT::Is64Bits ? ELFCLASS64 : ELFCLASS32;
		Ident[EI_DATA] =
		ELFT::TargetEndianness == support::big ? ELFDATA2MSB : ELFDATA2LSB;
		Ident[EI_VERSION] = EV_CURRENT;
		Ident[EI_OSABI] = ELFOSABI_NONE;
		Ident[EI_ABIVERSION] = 0;
		}

		template <class ELFT> void Object<ELFT>::initHeader(uint16_t EMachine) {
		jhendersonUnsubmitted Done Reply Inline Actions The contents of this function feel a bit out-of-order. It starts with setting stuff to do with sections, then adds a symbol, then sets some properties, based on the sections, then goes off and initialises the elf header, before coming back and updating some more properties to do with the sections. Could you maybe reorder things as follows: Initialise the ELF header, probably in a separate function called by this function. Create the string table section, and add it to the Sections array. Create the symbol table, set the string table property, add the symbol, and add it to the array. This should make the function a little easier to follow. jhenderson: The contents of this function feel a bit out-of-order. It starts with setting stuff to do with…
		SetIdent<ELFT>(Ident);
		Flags = 0x0;
		Type = ET_REL;
		Entry = 0x0;
		Machine = EMachine;
		Version = 1;
		}

		template <class ELFT> Object<ELFT>::Object(uint16_t EMachine) {
		initHeader(EMachine);

		auto StrTab = llvm::make_unique<StringTableSection>();
		StrTab->Name = ".strtab";
		SectionNames = StrTab.get();
		Sections.emplace_back(std::move(StrTab));

		auto SymTab = llvm::make_unique<SymbolTableSectionImpl<ELFT>>();
		SymTab->Name = ".symtab";
		SymTab->setStrTab(SectionNames);
		// We need to add the null symbol.
		SymTab->addSymbol("", 0, 0, nullptr, 0, 0, 0, 0);
		SymbolTable = SymTab.get();
		Sections.emplace_back(std::move(SymTab));
		}

template <class ELFT> Object<ELFT>::Object(const ELFObjectFile<ELFT> &Obj) {		template <class ELFT> Object<ELFT>::Object(const ELFObjectFile<ELFT> &Obj) {
const auto &ElfFile = *Obj.getELFFile();		const auto &ElfFile = *Obj.getELFFile();
const auto &Ehdr = *ElfFile.getHeader();		const auto &Ehdr = *ElfFile.getHeader();

std::copy(Ehdr.e_ident, Ehdr.e_ident + 16, Ident);		std::copy(Ehdr.e_ident, Ehdr.e_ident + 16, Ident);
Type = Ehdr.e_type;		Type = Ehdr.e_type;
Machine = Ehdr.e_machine;		Machine = Ehdr.e_machine;
Version = Ehdr.e_version;		Version = Ehdr.e_version;
(101 lines not shown)
for (auto &RemoveSec : make_range(Iter, std::end(Sections))) {		for (auto &RemoveSec : make_range(Iter, std::end(Sections))) {
for (auto &KeepSec : make_range(std::begin(Sections), Iter))		for (auto &KeepSec : make_range(std::begin(Sections), Iter))
KeepSec->removeSectionReferences(RemoveSec.get());		KeepSec->removeSectionReferences(RemoveSec.get());
}		}
// Now finally get rid of them all together.		// Now finally get rid of them all together.
Sections.erase(Iter, std::end(Sections));		Sections.erase(Iter, std::end(Sections));
}		}

template <class ELFT>		template <class ELFT>
void Object<ELFT>::addSection(StringRef SecName, ArrayRef<uint8_t> Data) {		SectionBase *Object<ELFT>::addSection(StringRef SecName,
		ArrayRef<uint8_t> Data) {
auto Sec = llvm::make_unique<OwnedDataSection>(SecName, Data);		auto Sec = llvm::make_unique<OwnedDataSection>(SecName, Data);
		auto Out = Sec.get();
Sec->OriginalOffset = ~0ULL;		Sec->OriginalOffset = ~0ULL;
Sections.push_back(std::move(Sec));		Sections.push_back(std::move(Sec));
		return Out;
}		}

template <class ELFT> void ELFObject<ELFT>::sortSections() {		template <class ELFT> void ELFObject<ELFT>::sortSections() {
// Put all sections in offset order. Maintain the ordering as closely as		// Put all sections in offset order. Maintain the ordering as closely as
// possible while meeting that demand however.		// possible while meeting that demand however.
auto CompareSections = [](const SecPtr &A, const SecPtr &B) {		auto CompareSections = [](const SecPtr &A, const SecPtr &B) {
return A->OriginalOffset < B->OriginalOffset;		return A->OriginalOffset < B->OriginalOffset;
};		};
(252 lines not shown)

tools/llvm-objcopy/llvm-objcopy.cpp

(19 lines not shown)
#include "llvm/Support/Casting.h"		#include "llvm/Support/Casting.h"
#include "llvm/Support/CommandLine.h"		#include "llvm/Support/CommandLine.h"
#include "llvm/Support/Compiler.h"		#include "llvm/Support/Compiler.h"
#include "llvm/Support/Error.h"		#include "llvm/Support/Error.h"
#include "llvm/Support/ErrorHandling.h"		#include "llvm/Support/ErrorHandling.h"
#include "llvm/Support/ErrorOr.h"		#include "llvm/Support/ErrorOr.h"
#include "llvm/Support/FileOutputBuffer.h"		#include "llvm/Support/FileOutputBuffer.h"
#include "llvm/Support/ManagedStatic.h"		#include "llvm/Support/ManagedStatic.h"
		#include "llvm/Support/Path.h"
#include "llvm/Support/PrettyStackTrace.h"		#include "llvm/Support/PrettyStackTrace.h"
#include "llvm/Support/Signals.h"		#include "llvm/Support/Signals.h"
#include "llvm/Support/raw_ostream.h"		#include "llvm/Support/raw_ostream.h"
#include <algorithm>		#include <algorithm>
#include <cassert>		#include <cassert>
#include <cstdlib>		#include <cstdlib>
#include <functional>		#include <functional>
#include <iterator>		#include <iterator>
(33 lines not shown)
LLVM_ATTRIBUTE_NORETURN void reportError(StringRef File, Error E) {		LLVM_ATTRIBUTE_NORETURN void reportError(StringRef File, Error E) {
exit(1);		exit(1);
}		}

} // end namespace llvm		} // end namespace llvm

static cl::opt<std::string> InputFilename(cl::Positional, cl::desc("<input>"));		static cl::opt<std::string> InputFilename(cl::Positional, cl::desc("<input>"));
static cl::opt<std::string> OutputFilename(cl::Positional, cl::desc("<output>"),		static cl::opt<std::string> OutputFilename(cl::Positional, cl::desc("<output>"),
cl::init("-"));		cl::init("-"));
		static cl::opt<std::string> BinaryArchitecture(
		"B", cl::desc("Specify the architecture of a binary input file."));
		static cl::opt<std::string>
		InputFormat("I", cl::desc("Set input format to one of the following:"
		"\n\tbinary"));
static cl::opt<std::string>		static cl::opt<std::string>
OutputFormat("O", cl::desc("Set output format to one of the following:"		OutputFormat("O", cl::desc("Set output format to one of the following:"
"\n\tbinary"));		"\n\tbinary"));
static cl::list<std::string> ToRemove("remove-section",		static cl::list<std::string> ToRemove("remove-section",
cl::desc("Remove <section>"),		cl::desc("Remove <section>"),
cl::value_desc("section"));		cl::value_desc("section"));
static cl::alias ToRemoveA("R", cl::desc("Alias for remove-section"),		static cl::alias ToRemoveA("R", cl::desc("Alias for remove-section"),
cl::aliasopt(ToRemove));		cl::aliasopt(ToRemove));
(74 lines not shown)
void SplitDWOToFile(const ELFObjectFile<ELFT> &ObjFile, StringRef File) {		void SplitDWOToFile(const ELFObjectFile<ELFT> &ObjFile, StringRef File) {

DWOFile.removeSections([&](const SectionBase &Sec) {		DWOFile.removeSections([&](const SectionBase &Sec) {
return OnlyKeepDWOPred<ELFT>(DWOFile, Sec);		return OnlyKeepDWOPred<ELFT>(DWOFile, Sec);
});		});
DWOFile.finalize();		DWOFile.finalize();
WriteObjectFile(DWOFile, File);		WriteObjectFile(DWOFile, File);
}		}

		template <class ELFT>
		SectionBase *AddSectionFromFile(Object<ELFT> &Obj, StringRef SecName,
		StringRef File) {
		auto BufOrErr = MemoryBuffer::getFile(File);
		if (!BufOrErr)
		reportError(File, BufOrErr.getError());
		auto Buf = std::move(*BufOrErr);
		auto BufPtr = reinterpret_cast<const uint8_t *>(Buf->getBufferStart());
		auto BufSize = Buf->getBufferSize();
		return Obj.addSection(SecName, ArrayRef<uint8_t>(BufPtr, BufSize));
		}

// This function handles the high level operations of GNU objcopy including		// This function handles the high level operations of GNU objcopy including
// handling command line options. It's important to outline certain properties		// handling command line options. It's important to outline certain properties
// we expect to hold of the command line operations. Any operation that "keeps"		// we expect to hold of the command line operations. Any operation that "keeps"
// should keep regardless of a remove. Additionally any removal should respect		// should keep regardless of a remove. Additionally any removal should respect
// any previous removals. Lastly whether or not something is removed shouldn't		// any previous removals. Lastly whether or not something is removed shouldn't
// depend a) on the order the options occur in or b) on some opaque priority		// depend a) on the order the options occur in or b) on some opaque priority
// system. The only priority is that keeps/copies overrule removes.		// system. The only priority is that keeps/copies overrule removes.
template <class ELFT> void CopyBinary(const ELFObjectFile<ELFT> &ObjFile) {		template <class ELFT> void HandleArgs(Object<ELFT> &Obj) {
std::unique_ptr<Object<ELFT>> Obj;

if (!OutputFormat.empty() && OutputFormat != "binary")
error("invalid output format '" + OutputFormat + "'");
if (!OutputFormat.empty() && OutputFormat == "binary")
Obj = llvm::make_unique<BinaryObject<ELFT>>(ObjFile);
else
Obj = llvm::make_unique<ELFObject<ELFT>>(ObjFile);

if (!SplitDWO.empty())
SplitDWOToFile<ELFT>(ObjFile, SplitDWO.getValue());

// Localize:		// Localize:

if (LocalizeHidden) {		if (LocalizeHidden) {
Obj->getSymTab()->localize([](const Symbol &Sym) {		Obj.getSymTab()->localize([](const Symbol &Sym) {
return Sym.Visibility == STV_HIDDEN || Sym.Visibility == STV_INTERNAL;		return Sym.Visibility == STV_HIDDEN || Sym.Visibility == STV_INTERNAL;
});		});
}		}

SectionPred RemovePred = [](const SectionBase &) { return false; };		SectionPred RemovePred = [](const SectionBase &) { return false; };

// Removes:		// Removes:

if (!ToRemove.empty()) {		if (!ToRemove.empty()) {
RemovePred = [&](const SectionBase &Sec) {		RemovePred = [&](const SectionBase &Sec) {
return std::find(std::begin(ToRemove), std::end(ToRemove), Sec.Name) !=		return std::find(std::begin(ToRemove), std::end(ToRemove), Sec.Name) !=
std::end(ToRemove);		std::end(ToRemove);
};		};
}		}

if (StripDWO || !SplitDWO.empty())		if (StripDWO || !SplitDWO.empty())
RemovePred = [RemovePred](const SectionBase &Sec) {		RemovePred = [RemovePred](const SectionBase &Sec) {
return IsDWOSection(Sec) || RemovePred(Sec);		return IsDWOSection(Sec) || RemovePred(Sec);
};		};

if (ExtractDWO)		if (ExtractDWO)
RemovePred = [RemovePred, &Obj](const SectionBase &Sec) {		RemovePred = [RemovePred, &Obj](const SectionBase &Sec) {
return OnlyKeepDWOPred(*Obj, Sec) || RemovePred(Sec);		return OnlyKeepDWOPred(Obj, Sec) || RemovePred(Sec);
};		};

if (StripAllGNU)		if (StripAllGNU)
RemovePred = [RemovePred, &Obj](const SectionBase &Sec) {		RemovePred = [RemovePred, &Obj](const SectionBase &Sec) {
if (RemovePred(Sec))		if (RemovePred(Sec))
return true;		return true;
if ((Sec.Flags & SHF_ALLOC) != 0)		if ((Sec.Flags & SHF_ALLOC) != 0)
return false;		return false;
if (&Sec == Obj->getSectionHeaderStrTab())		if (&Sec == Obj.getSectionHeaderStrTab())
return false;		return false;
switch (Sec.Type) {		switch (Sec.Type) {
case SHT_SYMTAB:		case SHT_SYMTAB:
case SHT_REL:		case SHT_REL:
case SHT_RELA:		case SHT_RELA:
case SHT_STRTAB:		case SHT_STRTAB:
return true;		return true;
}		}
return Sec.Name.startswith(".debug");		return Sec.Name.startswith(".debug");
};		};

if (StripSections) {		if (StripSections) {
RemovePred = [RemovePred](const SectionBase &Sec) {		RemovePred = [RemovePred](const SectionBase &Sec) {
return RemovePred(Sec) || (Sec.Flags & SHF_ALLOC) == 0;		return RemovePred(Sec) || (Sec.Flags & SHF_ALLOC) == 0;
};		};
Obj->WriteSectionHeaders = false;		Obj.WriteSectionHeaders = false;
}		}

if (StripDebug) {		if (StripDebug) {
RemovePred = [RemovePred](const SectionBase &Sec) {		RemovePred = [RemovePred](const SectionBase &Sec) {
return RemovePred(Sec) || Sec.Name.startswith(".debug");		return RemovePred(Sec) || Sec.Name.startswith(".debug");
};		};
}		}

if (StripNonAlloc)		if (StripNonAlloc)
RemovePred = [RemovePred, &Obj](const SectionBase &Sec) {		RemovePred = [RemovePred, &Obj](const SectionBase &Sec) {
if (RemovePred(Sec))		if (RemovePred(Sec))
return true;		return true;
if (&Sec == Obj->getSectionHeaderStrTab())		if (&Sec == Obj.getSectionHeaderStrTab())
return false;		return false;
return (Sec.Flags & SHF_ALLOC) == 0;		return (Sec.Flags & SHF_ALLOC) == 0;
};		};

if (StripAll)		if (StripAll)
RemovePred = [RemovePred, &Obj](const SectionBase &Sec) {		RemovePred = [RemovePred, &Obj](const SectionBase &Sec) {
if (RemovePred(Sec))		if (RemovePred(Sec))
return true;		return true;
if (&Sec == Obj->getSectionHeaderStrTab())		if (&Sec == Obj.getSectionHeaderStrTab())
return false;		return false;
if (Sec.Name.startswith(".gnu.warning"))		if (Sec.Name.startswith(".gnu.warning"))
return false;		return false;
return (Sec.Flags & SHF_ALLOC) == 0;		return (Sec.Flags & SHF_ALLOC) == 0;
};		};

// Explicit copies:		// Explicit copies:

if (!OnlyKeep.empty()) {		if (!OnlyKeep.empty()) {
RemovePred = [RemovePred, &Obj](const SectionBase &Sec) {		RemovePred = [RemovePred, &Obj](const SectionBase &Sec) {
// Explicitly keep these sections regardless of previous removes.		// Explicitly keep these sections regardless of previous removes.
if (std::find(std::begin(OnlyKeep), std::end(OnlyKeep), Sec.Name) !=		if (std::find(std::begin(OnlyKeep), std::end(OnlyKeep), Sec.Name) !=
std::end(OnlyKeep))		std::end(OnlyKeep))
return false;		return false;

// Allow all implicit removes.		// Allow all implicit removes.
if (RemovePred(Sec)) {		if (RemovePred(Sec)) {
return true;		return true;
}		}

// Keep special sections.		// Keep special sections.
if (Obj->getSectionHeaderStrTab() == &Sec) {		if (Obj.getSectionHeaderStrTab() == &Sec) {
return false;		return false;
}		}
if (Obj->getSymTab() == &Sec || Obj->getSymTab()->getStrTab() == &Sec) {		if (Obj.getSymTab() == &Sec || Obj.getSymTab()->getStrTab() == &Sec) {
return false;		return false;
}		}
// Remove everything else.		// Remove everything else.
return true;		return true;
};		};
}		}

if (!Keep.empty()) {		if (!Keep.empty()) {
RemovePred = [RemovePred](const SectionBase &Sec) {		RemovePred = [RemovePred](const SectionBase &Sec) {
// Explicitly keep these sections regardless of previous removes.		// Explicitly keep these sections regardless of previous removes.
if (std::find(std::begin(Keep), std::end(Keep), Sec.Name) !=		if (std::find(std::begin(Keep), std::end(Keep), Sec.Name) !=
std::end(Keep))		std::end(Keep))
return false;		return false;
// Otherwise defer to RemovePred.		// Otherwise defer to RemovePred.
return RemovePred(Sec);		return RemovePred(Sec);
};		};
}		}

Obj->removeSections(RemovePred);		Obj.removeSections(RemovePred);

if (!AddSection.empty()) {		if (!AddSection.empty()) {
for (const auto &Flag : AddSection) {		for (const auto &Flag : AddSection) {
auto SecPair = StringRef(Flag).split("=");		auto SecPair = StringRef(Flag).split("=");
auto SecName = SecPair.first;		auto SecName = SecPair.first;
auto File = SecPair.second;		auto File = SecPair.second;
auto BufOrErr = MemoryBuffer::getFile(File);		AddSectionFromFile(Obj, SecName, File);
if (!BufOrErr)		}
reportError(File, BufOrErr.getError());		}
auto Buf = std::move(*BufOrErr);
auto BufPtr = reinterpret_cast<const uint8_t *>(Buf->getBufferStart());		Obj.finalize();
auto BufSize = Buf->getBufferSize();		WriteObjectFile(Obj, OutputFilename.getValue());
Obj->addSection(SecName, ArrayRef<uint8_t>(BufPtr, BufSize));
}		}
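The comment above HandleArgs says that keeps overrule removes regardless of option order. A minimal standalone sketch of that layering follows; the `Section` struct, the `kAlloc` constant, and `buildRemovePred` are hypothetical stand-ins for illustration and are not part of the patch. Each later predicate captures the previous one by value, so wrapping the keep check around the outside makes it win.

```cpp
#include <algorithm>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for SectionBase: only the fields the predicates read.
struct Section {
  std::string Name;
  uint64_t Flags;
};

constexpr uint64_t kAlloc = 0x2; // same numeric value as SHF_ALLOC

using SectionPred = std::function<bool(const Section &)>;

// Layers predicates in the same order as HandleArgs: explicit removes first,
// then an implicit strip, then keeps wrapped around everything else.
SectionPred buildRemovePred(const std::vector<std::string> &ToRemove,
                            bool StripNonAlloc,
                            const std::vector<std::string> &Keep) {
  SectionPred RemovePred = [](const Section &) { return false; };

  if (!ToRemove.empty())
    RemovePred = [ToRemove](const Section &Sec) {
      return std::find(ToRemove.begin(), ToRemove.end(), Sec.Name) !=
             ToRemove.end();
    };

  if (StripNonAlloc)
    RemovePred = [RemovePred](const Section &Sec) {
      return RemovePred(Sec) || (Sec.Flags & kAlloc) == 0;
    };

  if (!Keep.empty())
    RemovePred = [RemovePred, Keep](const Section &Sec) {
      // A keep wins over any earlier remove.
      if (std::find(Keep.begin(), Keep.end(), Sec.Name) != Keep.end())
        return false;
      return RemovePred(Sec);
    };

  return RemovePred;
}
```

With `ToRemove = {".comment"}`, `StripNonAlloc = true`, and `Keep = {".comment"}`, the resulting predicate keeps `.comment` even though two earlier layers would remove it.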

		// This type keeps track of the machine info for various architectures. This
		// lets us map architecture names to ELF types and the e_machine value of the
		// ELF file.
		struct MachineInfo {
		uint16_t EMachine;
		bool Is64Bit;
		bool IsLittleEndian;
		};

		static const MachineInfo MI_X86_64{EM_X86_64, true, true};
		static const MachineInfo MI_386{EM_386, false, true};
		static const MachineInfo MI_AARCH64{EM_AARCH64, true, true};
		static const MachineInfo MI_ARM{EM_ARM, false, true};

		static const StringMap<MachineInfo> ArchMap{{"i386", MI_386},
		{"i386:x86-64", MI_X86_64},
		{"x86-64", MI_X86_64},
		{"arm", MI_ARM},
		{"aarch64", MI_AARCH64}};

		static MachineInfo GetMachineInfo(StringRef Arch) {
		auto Iter = ArchMap.find(Arch);
		if (Iter == std::end(ArchMap))
		error("Invalid architecture: " + Arch);
		return Iter->getValue();
		}

		template <class ELFT>
		void InputBinaryFormat(Object<ELFT> &Obj, StringRef BinFile) {
		auto DataSec = AddSectionFromFile(Obj, ".data", BinFile);
		DataSec->Flags = SHF_ALLOC | SHF_WRITE;
		std::string BinFileName = sys::path::stem(BinFile);
		jhenderson (inline comment, Done): I wonder if this is an indication that Symbols should own their own names. It would mean a bit of copying in the ELF input case, but could prevent easy-to-make errors if we want to create or rename symbols. If you prefer keeping it as is, I'd make a separate function called "MakeBinarySymbolName(StringRef BaseName, StringRef Suffix)", so that the warts of the name ownership can be kept separate from the adding of symbols, and it can be reused in other places too. Did you consider making InputBinaryFormat a subclass of Object? That would allow you to have a slightly nicer name ownership resolution, apart from anything else.
		jakehehrlich (author, inline comment, Not Done): I think switching to Symbols owning their names is a good idea. Symbols and relocations are likely to be the sticking point for optimization at some point in the future, but I'd rather use the conceptually simplest option now and optimize later when we have an issue. I think the biggest optimization for symbol tables will come from lazy loading and not from optimizing the copying of small strings like that. As for making InputBinaryFormat a subclass of Object, I'm not sure. I hadn't considered it, but considering it now, I was intending for those subclasses to be the output formats. This change does raise the question of how input formats should be handled, however. For instance, why is the binary input format handled here while the ELF input format is handled inside of Object? That seems kind of off to me. I remember you mentioned an idea a while back about having reader and writer types that map in and out of a common representation. Maybe we should refactor the Object code to expose enough of an interface that code outside of Object can reconstruct the ELF Object the way this code does for the binary input case.
		jhenderson (inline comment, Not Done): The comment needs updating now that symbols do own their own names, and I think the StringSaver header probably can be deleted too.
		std::replace_if(std::begin(BinFileName), std::end(BinFileName),
		[](char c) { return !isalnum(c); }, '_');
		auto BaseName = Twine("_binary_") + BinFileName;
		auto StartName = BaseName + "_start";
		auto EndName = BaseName + "_end";
		auto SizeName = BaseName + "_size";
		auto SymTab = Obj.getSymTab();
		SymTab->addSymbol("", STB_LOCAL, STT_SECTION, DataSec, 0, STV_DEFAULT, 0, 0);
		SymTab->addSymbol(StartName, STB_GLOBAL, STT_NOTYPE, DataSec, 0, STV_DEFAULT,
		0, 0);
		SymTab->addSymbol(EndName, STB_GLOBAL, STT_NOTYPE, DataSec, DataSec->Size,
		STV_DEFAULT, 0, 0);
		SymTab->addSymbol(SizeName, STB_GLOBAL, STT_NOTYPE, nullptr, DataSec->Size,
		STV_DEFAULT, SHN_ABS, 0);
}		}
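The review thread in InputBinaryFormat suggests factoring the name construction into a helper such as MakeBinarySymbolName. A minimal sketch of what such a helper might look like is below, assuming it takes the file stem and a suffix and returns an owned `std::string`; the lowercase name and the exact signature are illustrative only and do not come from the patch. Returning an owned string also avoids any question about how long intermediate concatenation objects live.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper (not in the patch): builds "_binary_<stem>_<suffix>",
// replacing every non-alphanumeric character of the stem with '_'.
std::string makeBinarySymbolName(std::string Stem, const std::string &Suffix) {
  for (char &C : Stem) {
    // Cast first: passing a negative char to std::isalnum is undefined.
    if (!std::isalnum(static_cast<unsigned char>(C)))
      C = '_';
  }
  return "_binary_" + Stem + "_" + Suffix;
}
```

For a stem of `my-file`, the three symbols added by InputBinaryFormat would then be named by calling the helper with `"start"`, `"end"`, and `"size"`.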

Obj->finalize();		template <class ELFT>
WriteObjectFile(*Obj, OutputFilename.getValue());		void HandleBinary(StringRef Input, StringRef OutFmt, MachineInfo MInfo) {
		if (OutFmt == "binary") {
		BinaryObject<ELFT> Obj(MInfo.EMachine);
		InputBinaryFormat(Obj, Input);
		HandleArgs(Obj);
		} else {
		ELFObject<ELFT> Obj(MInfo.EMachine);
		InputBinaryFormat(Obj, Input);
		HandleArgs(Obj);
		}
		}

		static void HandleBinaryELFT(StringRef Input, StringRef OutFmt,
		StringRef Arch) {
		MachineInfo MInfo = GetMachineInfo(Arch);
		if (MInfo.Is64Bit) {
		if (MInfo.IsLittleEndian)
		HandleBinary<ELF64LE>(Input, OutFmt, MInfo);
		else
		HandleBinary<ELF64BE>(Input, OutFmt, MInfo);
		} else {
		if (MInfo.IsLittleEndian)
		HandleBinary<ELF32LE>(Input, OutFmt, MInfo);
		else
		HandleBinary<ELF32BE>(Input, OutFmt, MInfo);
		}
		}
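The lookup and dispatch performed by GetMachineInfo and HandleBinaryELFT can be sketched outside LLVM as follows. This is an illustration under stated assumptions: `lookupMachine` and `elfTypeName` are hypothetical names, `std::map` stands in for `StringMap`, and the numeric `e_machine` values are written out as literals (EM_386 = 3, EM_ARM = 40, EM_X86_64 = 62, EM_AARCH64 = 183) rather than taken from LLVM headers.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Mirrors the MachineInfo struct from the patch.
struct MachineInfo {
  uint16_t EMachine;
  bool Is64Bit;
  bool IsLittleEndian;
};

// Hypothetical lookup: returns nullptr for an unknown architecture name
// instead of reporting an error and exiting.
const MachineInfo *lookupMachine(const std::string &Arch) {
  static const std::map<std::string, MachineInfo> ArchMap = {
      {"i386", {3, false, true}},
      {"i386:x86-64", {62, true, true}},
      {"x86-64", {62, true, true}},
      {"arm", {40, false, true}},
      {"aarch64", {183, true, true}},
  };
  auto Iter = ArchMap.find(Arch);
  return Iter == ArchMap.end() ? nullptr : &Iter->second;
}

// Names the ELFT that the nested if in HandleBinaryELFT would select.
const char *elfTypeName(const MachineInfo &MI) {
  if (MI.Is64Bit)
    return MI.IsLittleEndian ? "ELF64LE" : "ELF64BE";
  return MI.IsLittleEndian ? "ELF32LE" : "ELF32BE";
}
```

All five names in the patch's table are little-endian, so only the `ELF64LE` and `ELF32LE` branches are reachable with the currently supported architectures.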

		template <class ELFT>
		void HandleELF(const ELFObjectFile<ELFT> &ObjFile, StringRef OutFmt) {
		if (!SplitDWO.empty())
		SplitDWOToFile<ELFT>(ObjFile, SplitDWO.getValue());

		if (OutFmt == "binary") {
		BinaryObject<ELFT> Obj(ObjFile);
		HandleArgs(Obj);
		} else {
		ELFObject<ELFT> Obj(ObjFile);
		HandleArgs(Obj);
		}
		}

		static void HandleELFT(StringRef Input, StringRef OutFmt) {
		Expected<OwningBinary<Binary>> BinaryOrErr = createBinary(Input);
		if (!BinaryOrErr)
		reportError(Input, BinaryOrErr.takeError());
		Binary &Binary = *BinaryOrErr.get().getBinary();
		if (auto *o = dyn_cast<ELFObjectFile<ELF64LE>>(&Binary))
		jhenderson (inline comment, Done): This function should be static (along with the rest of the new functions).
		HandleELF(*o, OutFmt);
		else if (auto *o = dyn_cast<ELFObjectFile<ELF32LE>>(&Binary))
		HandleELF(*o, OutFmt);
		else if (auto *o = dyn_cast<ELFObjectFile<ELF64BE>>(&Binary))
		HandleELF(*o, OutFmt);
		else if (auto *o = dyn_cast<ELFObjectFile<ELF32BE>>(&Binary))
		HandleELF(*o, OutFmt);
		else
		reportError(Input, object_error::invalid_file_type);
}		}

int main(int argc, char **argv) {		int main(int argc, char **argv) {
// Print a stack trace if we signal out.		// Print a stack trace if we signal out.
sys::PrintStackTraceOnErrorSignal(argv[0]);		sys::PrintStackTraceOnErrorSignal(argv[0]);
PrettyStackTraceProgram X(argc, argv);		PrettyStackTraceProgram X(argc, argv);
llvm_shutdown_obj Y; // Call llvm_shutdown() on exit.		llvm_shutdown_obj Y; // Call llvm_shutdown() on exit.
cl::ParseCommandLineOptions(argc, argv, "llvm objcopy utility\n");		cl::ParseCommandLineOptions(argc, argv, "llvm objcopy utility\n");
		jhenderson (inline comment, Done): I'm not sure this function is well named, given that this function is effectively the driver of the rest of objcopy - it's not really handling the input. It's doing all the work. I'd even go so far as to say that it should be inlined into main. If you do want to keep the function, I'd rename Arch to InArch, and change the argument order to Input, InFmt, InArch, OutFmt (i.e. keeping all input-related arguments together).
ToolName = argv[0];		ToolName = argv[0];
if (InputFilename.empty()) {		if (InputFilename.empty()) {
cl::PrintHelpMessage();		cl::PrintHelpMessage();
return 2;		return 2;
}		}
Expected<OwningBinary<Binary>> BinaryOrErr = createBinary(InputFilename);		if (InputFormat == "binary")
if (!BinaryOrErr)		HandleBinaryELFT(InputFilename, OutputFormat, BinaryArchitecture);
reportError(InputFilename, BinaryOrErr.takeError());		else
Binary &Binary = *BinaryOrErr.get().getBinary();		HandleELFT(InputFilename, OutputFormat);
if (auto *o = dyn_cast<ELFObjectFile<ELF64LE>>(&Binary)) {
CopyBinary(*o);
return 0;
}
if (auto *o = dyn_cast<ELFObjectFile<ELF32LE>>(&Binary)) {
CopyBinary(*o);
return 0;
}
if (auto *o = dyn_cast<ELFObjectFile<ELF64BE>>(&Binary)) {
CopyBinary(*o);
return 0;
}
if (auto *o = dyn_cast<ELFObjectFile<ELF32BE>>(&Binary)) {
CopyBinary(*o);
return 0;
}
reportError(InputFilename, object_error::invalid_file_type);
}		}