This is an archive of the discontinued LLVM Phabricator instance.

Differential D57009

[llvm-objcopy] [COFF] Fix handling of aux symbols for big objects
ClosedPublic

Authored by mstorsjo on Jan 21 2019, 2:18 AM.

Details

Reviewers

jhenderson
alexander-shaposhnikov
jakehehrlich
rupprecht
rnk
• espindola
serge-sans-paille

Commits

rG1be91958b34c: [llvm-objcopy] [COFF] Fix handling of aux symbols for big objects
rL351947: [llvm-objcopy] [COFF] Fix handling of aux symbols for big objects

Summary

@rnk - I want your input on this one.

Other llvm-objcopy reviewers: I'd like to add a custom hidden option for testing, for triggering using the big object format. Without that, a test would have to create over 32k sections to trigger that.

Currently, the aux symbols are stored in an opaque std::vector<uint8_t>, with contents interpreted according to the rest of the symbol. This allows passing through all the aux symbols we don't need to touch or care about.

If the input was a bigobj but the output isn't, or vice versa, this makes the aux data desync the whole symbol table.

All aux symbol types that use a struct fit in 18 bytes (sizeof(coff_symbol16)), and if written to a bigobj, two extra padding bytes are written after each (as sizeof(coff_symbol32) is 20).
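As a rough illustration of the size relationship above (a sketch, not code from the patch): the constants and function names below are hypothetical stand-ins for sizeof(coff_symbol16) and sizeof(coff_symbol32), using the 18 and 20 byte figures quoted in the summary.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for sizeof(coff_symbol16) and sizeof(coff_symbol32).
constexpr std::size_t kSymbol16Size = 18;
constexpr std::size_t kSymbol32Size = 20;

// Size of one symbol table entry in the chosen output format.
std::size_t entrySize(bool IsBigObj) {
  return IsBigObj ? kSymbol32Size : kSymbol16Size;
}

// Padding bytes that follow each 18-byte aux payload.
std::size_t auxPadding(bool IsBigObj) {
  assert(entrySize(IsBigObj) >= kSymbol16Size);
  return entrySize(IsBigObj) - kSymbol16Size;
}

// Bytes occupied by one symbol together with NumAux aux records.
std::size_t symbolBytes(std::size_t NumAux, bool IsBigObj) {
  return (1 + NumAux) * entrySize(IsBigObj);
}
```

With these assumptions, a symbol with one aux record takes 36 bytes in a regular object and 40 bytes in a bigobj, which is why raw aux bytes cannot be copied between the two formats unchanged.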

This patch implements the following fix: In the llvm-objcopy storage agnostic intermediate representation, store the aux symbols as a series of coff_symbol16 sized opaque blobs within the same std::vector<uint8_t>. (In practice, all such struct based aux symbols only consist of one aux symbol, so this is more flexible than what reality needs.)

The special case is the file aux symbols, which are written in potentially more than one aux symbol slot, without any padding, as one single long string. This can't be stored in the same opaque vector of fixed-size aux symbol entries. The file aux symbols will occupy a different number of aux symbol slots depending on the type of output object file. As nothing in the intermediate process needs to have accurate raw symbol indices, updating them is moved into the writer class.
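The format-dependent slot count for a file name amounts to a ceiling division of the name length by the symbol entry size. The sketch below uses a hypothetical helper name, fileAuxSlots; the patch itself expresses the same computation with alignTo in the writer.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: number of aux symbol slots needed to hold Name when
// each slot is EntrySize bytes (18 for a regular object, 20 for a bigobj).
std::size_t fileAuxSlots(const std::string &Name, std::size_t EntrySize) {
  assert(EntrySize > 0 && "entry size must be non-zero");
  return (Name.size() + EntrySize - 1) / EntrySize;
}
```

For the 19-character name used in the test (abcdefghijklmnopqrs), this gives 2 slots with 18-byte entries and 1 slot with 20-byte entries, so the raw index of every following symbol differs between the two formats.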

Instead of updating the symbol raw indices at the end when the final format is known, one could alternatively choose to waste a bit more space and always allocate indices based on a normal object file. For a bigobj, we could potentially end up with a whole aux entry slot of padding for the filename. As this is rather uncommon (in practice max one per file), the total wasted space would be 20 bytes per file, unless really long file names are stored.

An alternative to the opaque AuxData vector would be to add a set of Optional<coff_aux_section_definition>, Optional<coff_aux_weak_external>, and so on. The upside is that this makes the intermediate format much clearer and neater, but the downside is that we need to explicitly know and care about all sorts of aux symbols (5 types, plus the file names) that we'd otherwise just pass through without touching or even knowing the specifics about.

Event Timeline

mstorsjo created this revision. Jan 21 2019, 2:18 AM

Other llvm-objcopy reviewers: I'd like to add a custom hidden option for testing, for triggering using the big object format. Without that, a test would have to create over 32k sections to trigger that.

I'm not a fan of adding hidden options purely for testing when there are alternatives. In the ELF tests, we use a pre-built zip file containing an object with many sections (see ELF/many-sections.test). I think you could probably do the same thing here.

Regarding the code, I honestly don't really understand it, so I don't feel like I'm qualified to review this COFF-ism.

tools/llvm-objcopy/COFF/Reader.cpp
113	You can get rid of the braces here.
tools/llvm-objcopy/COFF/Writer.cpp
321–332	Strictly speaking, you don't need the braces in this if/else.
327–329	Are coff_symbol16 and SymbolTy supposed to be the same size? Or are you deliberately writing less (or more) into the buffer than you iterate over?

In D57009#1366076, @jhenderson wrote:

Other llvm-objcopy reviewers: I'd like to add a custom hidden option for testing, for triggering using the big object format. Without that, a test would have to create over 32k sections to trigger that.

I'm not a fan of adding hidden options purely for testing when there are alternatives. In the ELF tests, we use a pre-built zip file containing an object with many sections (see ELF/many-sections.test). I think you could probably do the same thing here.

Hmm, ok. Now with --add-gnu-debuglink it's possible to add a section, so with that I guess it should be possible to achieve a test which removes a section to make the big object small, and then add another one to make it big again. Less elegant than a small and neat yaml test input IMO, but probably tolerable.

tools/llvm-objcopy/COFF/Writer.cpp
327–329	I'm deliberately writing more or less, yes. The COFF-ism is that the symbol table can consist of entries of either coff_symbol16 or coff_symbol32, of 18 or 20 bytes each. A symbol can be followed by a number of aux symbols, which can be one of a number of different structs, all 18 bytes each. If the table consists of coff_symbol32 entries, each one of the aux symbols (opaque aux structs) will have 2 bytes of padding at the end. So here I'm writing chunks of 18 bytes at a time out of the stored AuxData (where they are packed tightly), spaced 18 or 20 bytes apart in the output symbol table (depending on the entry size of that symbol table).
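The writer side of this can be sketched as follows. This is a simplified illustration, not the code in Writer.cpp: AuxBlob, kAuxPayloadSize and writeAux are hypothetical names, and the output buffer is a plain std::vector rather than the real output buffer.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the 18-byte opaque aux payload.
constexpr std::size_t kAuxPayloadSize = 18;
using AuxBlob = std::array<std::uint8_t, kAuxPayloadSize>;

// Writes tightly packed aux payloads into a table whose entries are Stride
// bytes apart (18 or 20). Any bytes past the payload stay zero as padding.
std::vector<std::uint8_t> writeAux(const std::vector<AuxBlob> &Aux,
                                   std::size_t Stride) {
  assert(Stride >= kAuxPayloadSize && "stride must hold a full aux payload");
  std::vector<std::uint8_t> Out(Aux.size() * Stride, 0);
  for (std::size_t I = 0; I < Aux.size(); ++I)
    std::copy(Aux[I].begin(), Aux[I].end(),
              Out.begin() + static_cast<std::ptrdiff_t>(I * Stride));
  return Out;
}
```

With a stride of 20, bytes 18 and 19 of every entry are left as padding; with a stride of 18, the output is identical to the packed input.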

rnk added inline comments. Jan 22 2019, 9:56 AM

tools/llvm-objcopy/COFF/Object.h
85–86	Well, they aren't always coff_symbol16 sized are they? For an input bigobj, it'll be coff_symbol32, or we should make this a vector of coff_symbol16 directly. I don't know much about objcopy, but I think it might be more in the spirit of it to widen into coff_symbol32 as is done for the main symbol field above, instead of keeping this as an opaque binary blob.
tools/llvm-objcopy/COFF/Reader.cpp
116	I think this can just be `.rtrim('\0')`, there is a single character overload of rtrim.
121	Should this second `sizeof(coff_symbol16)` be SymSize? Maybe an easier way to express it would be:
	ArrayRef<uint8_t> Chunk = AuxData.take_front(SymSize);
	Sym.AuxData.insert(Sym.AuxData.end(), Chunk.begin(), Chunk.end());
	AuxData = AuxData.drop_front(SymSize);
	It mutates a local variable, but takes less math.

mstorsjo marked 2 inline comments as done. Jan 22 2019, 10:43 AM

mstorsjo added inline comments.

tools/llvm-objcopy/COFF/Object.h
85–86	Even for bigobj inputs, the aux symbols (except for .file) only have coff_symbol16 worth of payload. There's no wide version of any of the `coff_aux_..` structs, so they can't be widened into the intermediate storage. Making it a vector of coff_symbol16 would make things clearer, but as the data actually isn't that struct, maybe `struct { uint8_t Opaque[sizeof(coff_symbol16)]; }` would be more correct? Or alternatively `Optional<coff_aux_*>` for each of the known types - but I prefer being able to passthrough unknown data untouched.
tools/llvm-objcopy/COFF/Reader.cpp
121	No, this is intentionally (within the current patch design) copying 18 bytes from a source which has got either 18 or 20 bytes stride.

Now using a vector of AuxSymbol, an opaque struct of coff_symbol16 size, which makes the code slightly clearer. Didn't change the test to use a large object file to actually trigger generating a big object yet.

rnk added inline comments. Jan 22 2019, 4:36 PM

tools/llvm-objcopy/COFF/Object.h
85–86	I see.
tools/llvm-objcopy/COFF/Reader.cpp
121	Of course now I read you already clarified this. I think there should be a comment about how this is normalizing from coff_symbol32-sized entries to AuxSymbol sized entries, and discarding the padding bytes that are present in a bigobj.
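The normalization discussed here can be sketched in isolation. This is an illustration under simplifying assumptions, not the Reader.cpp code: PackedAux, kPayloadBytes and readAux are hypothetical names, and the raw aux data is modelled as a plain byte vector.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the 18-byte opaque aux payload.
constexpr std::size_t kPayloadBytes = 18;
using PackedAux = std::array<std::uint8_t, kPayloadBytes>;

// Copies the first 18 bytes of every Stride-sized entry (Stride is 18 for a
// regular object, 20 for a bigobj) and discards any trailing padding bytes.
std::vector<PackedAux> readAux(const std::vector<std::uint8_t> &Raw,
                               std::size_t Stride) {
  assert(Stride >= kPayloadBytes && Raw.size() % Stride == 0);
  std::vector<PackedAux> Out;
  for (std::size_t Off = 0; Off < Raw.size(); Off += Stride) {
    PackedAux Blob;
    std::copy_n(Raw.begin() + static_cast<std::ptrdiff_t>(Off), kPayloadBytes,
                Blob.begin());
    Out.push_back(Blob);
  }
  return Out;
}
```

The copy length is always 18 while the step between entries is the input stride, which is the asymmetry both reviewers asked about.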

lgtm. I like the new code, so feel free to commit after adding a comment about the thing both reviewers were confused by. :)

This revision is now accepted and ready to land. Jan 22 2019, 4:47 PM

In D57009#1366076, @jhenderson wrote:

Other llvm-objcopy reviewers: I'd like to add a custom hidden option for testing, for triggering using the big object format. Without that, a test would have to create over 32k sections to trigger that.

I'm not a fan of adding hidden options purely for testing when there are alternatives. In the ELF tests, we use a pre-built zip file containing an object with many sections (see ELF/many-sections.test). I think you could probably do the same thing here.

I'm experimenting with crafting such an input file now. The uncompressed object file weighs in at around 5 MB, and after gzip (as is used for that ELF test) it currently ends up at around 725 KB. Do you think that's acceptable or too large?

In D57009#1367437, @mstorsjo wrote:

I'm experimenting with crafting such an input file now. The uncompressed object file weighs in at around 5 MB, and after gzip (as is used for that ELF test) it currently ends up at around 725 KB. Do you think that's acceptable or too large?

It's not ideal, if I'm honest, but it might be a quirk of the file format, and therefore unavoidable. The equivalent file for ELF is only 152 KB. I don't know if it was somehow more aggressively compressed though. If you can't get it any smaller, I think it's probably acceptable.

By the way, there's a gunzip.py script in the ELF/Inputs directory, which you should probably move to a shared area and use for decompressing.

In D57009#1367445, @jhenderson wrote:

In D57009#1367437, @mstorsjo wrote:

I'm experimenting with crafting such an input file now. The uncompressed object file weighs in at around 5 MB, and after gzip (as is used for that ELF test) it currently ends up at around 725 KB. Do you think that's acceptable or too large?

It's not ideal, if I'm honest, but it might be a quirk of the file format, and therefore unavoidable. The equivalent file for ELF is only 152 KB. I don't know if it was somehow more aggressively compressed though. If you can't get it any smaller, I think it's probably acceptable.

Ok then. At least it makes for a better testcase.

By the way, there's a gunzip.py script in the ELF/Inputs directory, which you should probably move to a shared area and use for decompressing.

Hmm, what place would that be, where it's findable by python within the lit tests? I could just move it up into test/tools/llvm-objcopy/Inputs and refer to it with %p/../Inputs/ungzip.py in the ELF/COFF subdirs - not ideal or elegant or anything, but at least shared between these two directories.

In D57009#1367446, @mstorsjo wrote:

By the way, there's a gunzip.py script in the ELF/Inputs directory, which you should probably move to a shared area and use for decompressing.

Hmm, what place would that be, where it's findable by python within the lit tests? I could just move it up into test/tools/llvm-objcopy/Inputs and refer to it with %p/../Inputs/ungzip.py in the ELF/COFF subdirs - not ideal or elegant or anything, but at least shared between these two directories.

That's where I'd put it. No point in duplicating it after all.

Removed the option for forcing emission of a big object, made a test that operates on a bundled large object file instead.

Herald added a reviewer: • espindola. Jan 23 2019, 2:16 AM

Herald added a reviewer: serge-sans-paille.

Herald added subscribers: arichardson, emaste.

jhenderson added inline comments. Jan 23 2019, 3:03 AM

test/tools/llvm-objcopy/COFF/bigobj.test
6–9	I think it probably is easier to associate comments with the corresponding test case without the blank line between them, but I'm not too bothered, so if you prefer it this way, that's fine.
tools/llvm-objcopy/COFF/Reader.cpp
113	You can still get rid of these braces ;) I think a few code comments around here explaining what you are doing and why would make it much more understandable. Perhaps a brief explanation of the difference in the BigObj format?
tools/llvm-objcopy/COFF/Writer.cpp
147–148	This is another place requiring a code comment, I think, just explaining the "why".
321–333	Again, a few comments around here would be good.

In D57009#1367445, @jhenderson wrote:

In D57009#1367437, @mstorsjo wrote:

I'm experimenting with crafting such an input file now. The uncompressed object file weighs in at around 5 MB, and after gzip (as is used for that ELF test) it currently ends up at around 725 KB. Do you think that's acceptable or too large?

It's not ideal, if I'm honest, but it might be a quirk of the file format, and therefore unavoidable. The equivalent file for ELF is only 152 KB. I don't know if it was somehow more aggressively compressed though. If you can't get it any smaller, I think it's probably acceptable.

I realized I could make the testcase less interesting and remove some aspects that aren't strictly needed for this test. That reduced the uncompressed object from 5 MB to 2.5, and the compressed one from 725 KB to 7 KB. That's probably small enough :-)

tools/llvm-objcopy/COFF/Reader.cpp
113	Oh, right, I forgot about the other comments when focusing on the test data.

Removed the extra braces and added more comments, adjusted the testcase for the smaller test data.

LGTM.

Closed by commit rL351947: [llvm-objcopy] [COFF] Fix handling of aux symbols for big objects (authored by mstorsjo). Jan 23 2019, 3:55 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

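Path	Size
test/tools/llvm-objcopy/COFF/bigobj.test	59 lines
tools/llvm-objcopy/COFF/COFFObjcopy.cpp	2 lines
tools/llvm-objcopy/COFF/Object.h	18 lines
tools/llvm-objcopy/COFF/Object.cpp	6 lines
tools/llvm-objcopy/COFF/Reader.cpp	14 lines
tools/llvm-objcopy/COFF/Writer.h	8 lines
tools/llvm-objcopy/COFF/Writer.cpp	44 lines
tools/llvm-objcopy/CopyConfig.h	3 lines
tools/llvm-objcopy/CopyConfig.cpp	2 lines
tools/llvm-objcopy/ObjcopyOpts.td	3 lines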

Diff 182981

test/tools/llvm-objcopy/COFF/bigobj.test

This file was added.

# RUN: yaml2obj %s > %t.in.o
#
# RUN: llvm-objdump -t %t.in.o | FileCheck %s --check-prefixes=SYMBOLS,SYMBOLS-SMALL
#
# RUN: llvm-objcopy --force-bigobj %t.in.o %t.big.o
# RUN: llvm-objdump -t %t.big.o | FileCheck %s --check-prefixes=SYMBOLS,SYMBOLS-BIG
# RUN: llvm-objcopy %t.big.o %t.small.o
# RUN: llvm-objdump -t %t.small.o | FileCheck %s --check-prefixes=SYMBOLS,SYMBOLS-SMALL

# SYMBOLS: SYMBOL TABLE:
# SYMBOLS-NEXT: [ 0]{{.*}} (nx 1) {{.*}} .text
# SYMBOLS-NEXT: AUX scnlen
# SYMBOLS-SMALL-NEXT: [ 2]{{.*}} (nx 2) {{.*}} .file
# SYMBOLS-BIG-NEXT: [ 2]{{.*}} (nx 1) {{.*}} .file
# SYMBOLS-NEXT: AUX abcdefghijklmnopqrs
# SYMBOLS-SMALL-NEXT: [ 5]{{.*}} (nx 0) {{.*}} foo
# SYMBOLS-BIG-NEXT: [ 4]{{.*}} (nx 0) {{.*}} foo
# SYMBOLS-EMPTY:

--- !COFF
header:
  Machine: IMAGE_FILE_MACHINE_AMD64
  Characteristics: [ ]
sections:
  - Name: .text
    Characteristics: [ ]
    Alignment: 4
    SectionData: 488B0500000000C3
    Relocations:
      - VirtualAddress: 3
        SymbolName: foo
        Type: IMAGE_REL_AMD64_REL32
symbols:
  - Name: .text
    Value: 0
    SectionNumber: 1
    SimpleType: IMAGE_SYM_TYPE_NULL
    ComplexType: IMAGE_SYM_DTYPE_NULL
    StorageClass: IMAGE_SYM_CLASS_STATIC
    SectionDefinition:
      Length: 1
      NumberOfRelocations: 1
      NumberOfLinenumbers: 0
      CheckSum: 583624169
      Number: 1
  - Name: .file
    Value: 0
    SectionNumber: -2
    SimpleType: IMAGE_SYM_TYPE_NULL
    ComplexType: IMAGE_SYM_DTYPE_NULL
    StorageClass: IMAGE_SYM_CLASS_FILE
    File: abcdefghijklmnopqrs
  - Name: foo
    Value: 0
    SectionNumber: 1
    SimpleType: IMAGE_SYM_TYPE_NULL
    ComplexType: IMAGE_SYM_DTYPE_NULL
    StorageClass: IMAGE_SYM_CLASS_EXTERNAL
...

tools/llvm-objcopy/COFF/COFFObjcopy.cpp

@@ void executeObjcopyOnBinary(const CopyConfig &Config,
   COFFReader Reader(In);
   Expected<std::unique_ptr<Object>> ObjOrErr = Reader.create();
   if (!ObjOrErr)
     reportError(Config.InputFilename, ObjOrErr.takeError());
   Object *Obj = ObjOrErr->get();
   assert(Obj && "Unable to deserialize COFF object");
   if (Error E = handleArgs(Config, *Obj))
     reportError(Config.InputFilename, std::move(E));
-  COFFWriter Writer(*Obj, Out);
+  COFFWriter Writer(*Obj, Out, Config.ForceBigObj);
   if (Error E = Writer.write())
     reportError(Config.OutputFilename, std::move(E));
 }

 } // end namespace coff
 } // end namespace objcopy
 } // end namespace llvm

tools/llvm-objcopy/COFF/Object.h

@@ void clearContents() {
     OwnedContents.clear();
   }

 private:
   ArrayRef<uint8_t> ContentsRef;
   std::vector<uint8_t> OwnedContents;
 };

+struct AuxSymbol {
+  AuxSymbol(ArrayRef<uint8_t> In) {
+    assert(In.size() == sizeof(Opaque));
+    std::copy(In.begin(), In.end(), Opaque);
+  }
+
+  ArrayRef<uint8_t> getRef() const {
+    return ArrayRef<uint8_t>(Opaque, sizeof(Opaque));
+  }
+
+  uint8_t Opaque[sizeof(object::coff_symbol16)];
+};
+
 struct Symbol {
   object::coff_symbol32 Sym;
   StringRef Name;
-  std::vector<uint8_t> AuxData;
+  std::vector<AuxSymbol> AuxData;
+  StringRef AuxFile;
   ssize_t TargetSectionId;
   ssize_t AssociativeComdatTargetSectionId = 0;
   Optional<size_t> WeakTargetSymbolId;
   size_t UniqueId;
   size_t RawIndex;
   bool Referenced;
 };

@@ private:

   size_t NextSymbolUniqueId = 0;

   std::vector<Section> Sections;
   DenseMap<ssize_t, Section *> SectionMap;

   ssize_t NextSectionUniqueId = 1; // Allow a UniqueId 0 to mean undefined.

-  // Update SymbolMap and RawIndex in each Symbol.
+  // Update SymbolMap.
   void updateSymbols();

   // Update SectionMap and Index in each Section.
   void updateSections();
 };

 // Copy between coff_symbol16 and coff_symbol32.
 // The source and destination files can use either coff_symbol16 or

tools/llvm-objcopy/COFF/Object.cpp

@@ for (Symbol S : NewSymbols) {
     S.UniqueId = NextSymbolUniqueId++;
     Symbols.emplace_back(S);
   }
   updateSymbols();
 }

 void Object::updateSymbols() {
   SymbolMap = DenseMap<size_t, Symbol *>(Symbols.size());
-  size_t RawSymIndex = 0;
-  for (Symbol &Sym : Symbols) {
+  for (Symbol &Sym : Symbols)
     SymbolMap[Sym.UniqueId] = &Sym;
-    Sym.RawIndex = RawSymIndex;
-    RawSymIndex += 1 + Sym.Sym.NumberOfAuxSymbols;
-  }
 }

 const Symbol *Object::findSymbol(size_t UniqueId) const {
   auto It = SymbolMap.find(UniqueId);
   if (It == SymbolMap.end())
     return nullptr;
   return It->second;
 }

tools/llvm-objcopy/COFF/Reader.cpp

@@ for (uint32_t I = 0, E = COFFObj.getRawNumberOfSymbols(); I < E;) {
     if (IsBigObj)
       copySymbol(Sym.Sym,
                  *reinterpret_cast<const coff_symbol32 *>(SymRef.getRawPtr()));
     else
       copySymbol(Sym.Sym,
                  *reinterpret_cast<const coff_symbol16 *>(SymRef.getRawPtr()));
     if (auto EC = COFFObj.getSymbolName(SymRef, Sym.Name))
       return errorCodeToError(EC);
-    Sym.AuxData = COFFObj.getSymbolAuxData(SymRef);
-    assert((Sym.AuxData.size() %
-            (IsBigObj ? sizeof(coff_symbol32) : sizeof(coff_symbol16))) == 0);
+    ArrayRef<uint8_t> AuxData = COFFObj.getSymbolAuxData(SymRef);
+    size_t SymSize = IsBigObj ? sizeof(coff_symbol32) : sizeof(coff_symbol16);
+    assert(AuxData.size() == SymSize * SymRef.getNumberOfAuxSymbols());
+    if (SymRef.isFileRecord()) {
+      Sym.AuxFile = StringRef(reinterpret_cast<const char *>(AuxData.data()),
+                              AuxData.size())
+                        .rtrim('\0');
+    } else {
+      for (size_t I = 0; I < SymRef.getNumberOfAuxSymbols(); I++)
+        Sym.AuxData.push_back(AuxData.slice(I * SymSize, sizeof(AuxSymbol)));
+    }
     // Find the unique id of the section
     if (SymRef.getSectionNumber() <=
         0) // Special symbol (undefined/absolute/debug)
       Sym.TargetSectionId = SymRef.getSectionNumber();
     else if (static_cast<uint32_t>(SymRef.getSectionNumber() - 1) <
              Sections.size())
       Sym.TargetSectionId = Sections[SymRef.getSectionNumber() - 1].UniqueId;
     else
       return createStringError(object_error::parse_failed,

tools/llvm-objcopy/COFF/Writer.h

 namespace objcopy {
 namespace coff {

 struct Object;

 class COFFWriter {
   Object &Obj;
   Buffer &Buf;
+  bool ForceBigObj;

   size_t FileSize;
   size_t FileAlignment;
   size_t SizeOfInitializedData;
   StringTableBuilder StrTabBuilder;

+  template <class SymbolTy> std::pair<size_t, size_t> finalizeSymbolTable();
   Error finalizeRelocTargets();
   Error finalizeSymbolContents();
   void layoutSections();
   size_t finalizeStringTable();
-  template <class SymbolTy> std::pair<size_t, size_t> finalizeSymbolTable();

   Error finalize(bool IsBigObj);

   void writeHeaders(bool IsBigObj);
   void writeSections();
   template <class SymbolTy> void writeSymbolStringTables();

   Error write(bool IsBigObj);

   Error patchDebugDirectory();

 public:
   virtual ~COFFWriter() {}
   Error write();

-  COFFWriter(Object &Obj, Buffer &Buf)
-      : Obj(Obj), Buf(Buf), StrTabBuilder(StringTableBuilder::WinCOFF) {}
+  COFFWriter(Object &Obj, Buffer &Buf, bool ForceBigObj = false)
+      : Obj(Obj), Buf(Buf), ForceBigObj(ForceBigObj),
+        StrTabBuilder(StringTableBuilder::WinCOFF) {}
 };

 } // end namespace coff
 } // end namespace objcopy
 } // end namespace llvm

 #endif // LLVM_TOOLS_OBJCOPY_COFF_WRITER_H

tools/llvm-objcopy/COFF/Writer.cpp

Show First 20 Lines • Show All 49 Lines • ▼ Show 20 Lines	if (Sym.TargetSectionId <= 0) {
return createStringError(object_error::invalid_symbol_index,		return createStringError(object_error::invalid_symbol_index,
"Symbol '%s' points to a removed section",		"Symbol '%s' points to a removed section",
Sym.Name.str().c_str());		Sym.Name.str().c_str());
Sym.Sym.SectionNumber = Sec->Index;		Sym.Sym.SectionNumber = Sec->Index;

if (Sym.Sym.NumberOfAuxSymbols == 1 &&		if (Sym.Sym.NumberOfAuxSymbols == 1 &&
Sym.Sym.StorageClass == IMAGE_SYM_CLASS_STATIC) {		Sym.Sym.StorageClass == IMAGE_SYM_CLASS_STATIC) {
coff_aux_section_definition *SD =		coff_aux_section_definition *SD =
reinterpret_cast<coff_aux_section_definition *>(Sym.AuxData.data());		reinterpret_cast<coff_aux_section_definition *>(
		Sym.AuxData[0].Opaque);
uint32_t SDSectionNumber;		uint32_t SDSectionNumber;
if (Sym.AssociativeComdatTargetSectionId == 0) {		if (Sym.AssociativeComdatTargetSectionId == 0) {
// Not a comdat associative section; just set the Number field to		// Not a comdat associative section; just set the Number field to
// the number of the section itself.		// the number of the section itself.
SDSectionNumber = Sec->Index;		SDSectionNumber = Sec->Index;
} else {		} else {
Sec = Obj.findSection(Sym.AssociativeComdatTargetSectionId);		Sec = Obj.findSection(Sym.AssociativeComdatTargetSectionId);
if (Sec == nullptr)		if (Sec == nullptr)
return createStringError(		return createStringError(
object_error::invalid_symbol_index,		object_error::invalid_symbol_index,
"Symbol '%s' is associative to a removed section",		"Symbol '%s' is associative to a removed section",
Sym.Name.str().c_str());		Sym.Name.str().c_str());
SDSectionNumber = Sec->Index;		SDSectionNumber = Sec->Index;
}		}
// Update the section definition with the new section number.		// Update the section definition with the new section number.
SD->NumberLowPart = static_cast<uint16_t>(SDSectionNumber);		SD->NumberLowPart = static_cast<uint16_t>(SDSectionNumber);
SD->NumberHighPart = static_cast<uint16_t>(SDSectionNumber >> 16);		SD->NumberHighPart = static_cast<uint16_t>(SDSectionNumber >> 16);
}		}
}		}
// Check that we actually have got AuxData to match the weak symbol target		// Check that we actually have got AuxData to match the weak symbol target
// we want to set. Only >= 1 would be required, but only == 1 makes sense.		// we want to set. Only >= 1 would be required, but only == 1 makes sense.
if (Sym.WeakTargetSymbolId && Sym.Sym.NumberOfAuxSymbols == 1) {		if (Sym.WeakTargetSymbolId && Sym.Sym.NumberOfAuxSymbols == 1) {
coff_aux_weak_external *WE =		coff_aux_weak_external *WE =
reinterpret_cast<coff_aux_weak_external *>(Sym.AuxData.data());		reinterpret_cast<coff_aux_weak_external *>(Sym.AuxData[0].Opaque);
const Symbol Target = Obj.findSymbol(Sym.WeakTargetSymbolId);		const Symbol Target = Obj.findSymbol(Sym.WeakTargetSymbolId);
if (Target == nullptr)		if (Target == nullptr)
return createStringError(object_error::invalid_symbol_index,		return createStringError(object_error::invalid_symbol_index,
"Symbol '%s' is missing its weak target",		"Symbol '%s' is missing its weak target",
Sym.Name.str().c_str());		Sym.Name.str().c_str());
WE->TagIndex = Target->RawIndex;		WE->TagIndex = Target->RawIndex;
}		}
}		}
[... 44 lines not shown ...]	if (S.Name.size() > COFF::NameSize) {
strncpy(S.Sym.Name.ShortName, S.Name.data(), COFF::NameSize);		strncpy(S.Sym.Name.ShortName, S.Name.data(), COFF::NameSize);
}		}
}		}
return StrTabBuilder.getSize();		return StrTabBuilder.getSize();
}		}

template <class SymbolTy>		template <class SymbolTy>
std::pair<size_t, size_t> COFFWriter::finalizeSymbolTable() {		std::pair<size_t, size_t> COFFWriter::finalizeSymbolTable() {
size_t SymTabSize = Obj.getSymbols().size() * sizeof(SymbolTy);		size_t RawSymIndex = 0;
for (const auto &S : Obj.getSymbols())		for (auto &S : Obj.getMutableSymbols()) {
SymTabSize += S.AuxData.size();		if (!S.AuxFile.empty())
return std::make_pair(SymTabSize, sizeof(SymbolTy));		S.Sym.NumberOfAuxSymbols =
		alignTo(S.AuxFile.size(), sizeof(SymbolTy)) / sizeof(SymbolTy);
		jhenderson (Done): This is another place requiring a code comment, I think, just explaining the "why".
		S.RawIndex = RawSymIndex;
		RawSymIndex += 1 + S.Sym.NumberOfAuxSymbols;
		}
		return std::make_pair(RawSymIndex * sizeof(SymbolTy), sizeof(SymbolTy));
}		}
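
As a hedged illustration of the slot count computed in the new finalizeSymbolTable: a file aux symbol is one unpadded string that spans as many symbol-sized slots as it needs, so the count is the string length rounded up to a multiple of the symbol size. The helper name `auxFileSlots` and the two constants are hypothetical stand-ins; the sizes 18 and 20 come from the patch summary. The patch itself uses alignTo followed by a division, which this sketch assumes is equivalent to a ceiling division.

```cpp
#include <cassert>
#include <cstddef>

// Size of one symbol table entry, per the patch summary:
// coff_symbol16 is 18 bytes, coff_symbol32 (bigobj) is 20 bytes.
constexpr std::size_t SymbolSize16 = 18;
constexpr std::size_t SymbolSize32 = 20;

// Hypothetical helper: number of aux symbol slots needed to hold a file
// name of NameLen bytes when each slot is SymbolSize bytes (rounded up).
std::size_t auxFileSlots(std::size_t NameLen, std::size_t SymbolSize) {
  assert(SymbolSize != 0 && "symbol size must be nonzero");
  return (NameLen + SymbolSize - 1) / SymbolSize;
}
```

Under these assumptions, a 19-byte file name needs two slots in a regular object but only one in a bigobj, which is why NumberOfAuxSymbols has to be recomputed for the output format rather than copied from the input.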

Error COFFWriter::finalize(bool IsBigObj) {		Error COFFWriter::finalize(bool IsBigObj) {
		size_t SymTabSize, SymbolSize;
		std::tie(SymTabSize, SymbolSize) = IsBigObj
		? finalizeSymbolTable<coff_symbol32>()
		: finalizeSymbolTable<coff_symbol16>();

if (Error E = finalizeRelocTargets())		if (Error E = finalizeRelocTargets())
return E;		return E;
if (Error E = finalizeSymbolContents())		if (Error E = finalizeSymbolContents())
return E;		return E;

size_t SizeOfHeaders = 0;		size_t SizeOfHeaders = 0;
FileAlignment = 1;		FileAlignment = 1;
size_t PeHeaderSize = 0;		size_t PeHeaderSize = 0;
[... 35 lines not shown ...]	if (Obj.IsPE) {
}		}

// If the PE header had a checksum, clear it, since it isn't valid		// If the PE header had a checksum, clear it, since it isn't valid
// any longer. (We don't calculate a new one.)		// any longer. (We don't calculate a new one.)
Obj.PeHeader.CheckSum = 0;		Obj.PeHeader.CheckSum = 0;
}		}

size_t StrTabSize = finalizeStringTable();		size_t StrTabSize = finalizeStringTable();
size_t SymTabSize, SymbolSize;
std::tie(SymTabSize, SymbolSize) = IsBigObj
? finalizeSymbolTable<coff_symbol32>()
: finalizeSymbolTable<coff_symbol16>();

size_t PointerToSymbolTable = FileSize;		size_t PointerToSymbolTable = FileSize;
// StrTabSize <= 4 is the size of an empty string table, only consisting		// StrTabSize <= 4 is the size of an empty string table, only consisting
// of the length field.		// of the length field.
if (SymTabSize == 0 && StrTabSize <= 4 && Obj.IsPE) {		if (SymTabSize == 0 && StrTabSize <= 4 && Obj.IsPE) {
// For executables, don't point to the symbol table and skip writing		// For executables, don't point to the symbol table and skip writing
// the length field, if both the symbol and string tables are empty.		// the length field, if both the symbol and string tables are empty.
PointerToSymbolTable = 0;		PointerToSymbolTable = 0;
[... 93 lines not shown ...]

template <class SymbolTy> void COFFWriter::writeSymbolStringTables() {		template <class SymbolTy> void COFFWriter::writeSymbolStringTables() {
uint8_t *Ptr = Buf.getBufferStart() + Obj.CoffFileHeader.PointerToSymbolTable;		uint8_t *Ptr = Buf.getBufferStart() + Obj.CoffFileHeader.PointerToSymbolTable;
for (const auto &S : Obj.getSymbols()) {		for (const auto &S : Obj.getSymbols()) {
// Convert symbols back to the right size, from coff_symbol32.		// Convert symbols back to the right size, from coff_symbol32.
copySymbol<SymbolTy, coff_symbol32>(*reinterpret_cast<SymbolTy *>(Ptr),		copySymbol<SymbolTy, coff_symbol32>(*reinterpret_cast<SymbolTy *>(Ptr),
S.Sym);		S.Sym);
Ptr += sizeof(SymbolTy);		Ptr += sizeof(SymbolTy);
std::copy(S.AuxData.begin(), S.AuxData.end(), Ptr);		if (!S.AuxFile.empty()) {
Ptr += S.AuxData.size();		std::copy(S.AuxFile.begin(), S.AuxFile.end(), Ptr);
		// This assumes that unwritten parts of the memory mapped file
		// are initialized to zero.
		Ptr += S.Sym.NumberOfAuxSymbols * sizeof(SymbolTy);
		} else {
		for (const AuxSymbol &AuxSym : S.AuxData) {
		ArrayRef<uint8_t> Ref = AuxSym.getRef();
		std::copy(Ref.begin(), Ref.end(), Ptr);
		jhenderson (Not Done): Are coff_symbol16 and SymbolTy supposed to be the same size? Or are you deliberately writing less (or more) into the buffer than you iterate over?
		mstorsjo (author, Done): I'm deliberately writing more or less, yes. The COFF-ism is that the symbol table can consist of entries of either coff_symbol16 or coff_symbol32, of 18 or 20 bytes each. A symbol can be followed by a number of aux symbols, which can be one of a number of different structs, all 18 bytes each. If the table consists of coff_symbol32 entries, each one of the aux symbols (opaque aux structs) will have 2 bytes of padding at the end. So here I'm writing chunks of 18 bytes at a time out of the stored AuxData (where they are packed tightly), spaced 18 or 20 bytes apart in the output symbol table (depending on the entry size of that symbol table).
		Ptr += sizeof(SymbolTy);
		}
		}
		jhenderson (Not Done): Strictly speaking, you don't need the braces in this if/else.
}		}
		jhenderson (Done): Again, a few comments around here would be good.
if (StrTabBuilder.getSize() > 4 || !Obj.IsPE) {		if (StrTabBuilder.getSize() > 4 || !Obj.IsPE) {
// Always write a string table in object files, even an empty one.		// Always write a string table in object files, even an empty one.
StrTabBuilder.write(Ptr);		StrTabBuilder.write(Ptr);
Ptr += StrTabBuilder.getSize();		Ptr += StrTabBuilder.getSize();
}		}
}		}
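
The spacing described in the inline reply above can be sketched in isolation. This is a minimal illustration, not the patch's code: `AuxBlob` and `writeAuxBlobs` are hypothetical names, and a zero-initialized std::vector stands in for the output buffer that the patch assumes to be zero-filled. Each stored 18-byte blob is copied whole, and the write pointer then advances by the output symbol size, so a 20-byte stride leaves 2 zero padding bytes after every blob.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Every struct-based aux record fits in sizeof(coff_symbol16) == 18 bytes.
constexpr std::size_t AuxBlobSize = 18;
using AuxBlob = std::array<std::uint8_t, AuxBlobSize>;

// Hypothetical stand-in for the writer loop: copy each 18-byte blob into the
// output, spaced SymbolSize bytes apart (18 normally, 20 for bigobj).
// The output starts zeroed, so any gap after a blob reads as padding.
std::vector<std::uint8_t> writeAuxBlobs(const std::vector<AuxBlob> &Blobs,
                                        std::size_t SymbolSize) {
  assert(SymbolSize >= AuxBlobSize && "slot must hold a whole blob");
  std::vector<std::uint8_t> Out(Blobs.size() * SymbolSize, 0);
  std::uint8_t *Ptr = Out.data();
  for (const AuxBlob &B : Blobs) {
    std::copy(B.begin(), B.end(), Ptr);
    Ptr += SymbolSize; // Stride depends on the output format, not the blob.
  }
  return Out;
}
```

Keeping the stored blobs at a fixed 18 bytes means the intermediate representation does not depend on whether the input was a bigobj; only the stride chosen at write time differs.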

Error COFFWriter::write(bool IsBigObj) {		Error COFFWriter::write(bool IsBigObj) {
[... 48 lines not shown ...]	if (Dir->RelativeVirtualAddress >= S.Header.VirtualAddress &&
return Error::success();		return Error::success();
}		}
}		}
return createStringError(object_error::parse_failed,		return createStringError(object_error::parse_failed,
"Debug directory not found");		"Debug directory not found");
}		}

Error COFFWriter::write() {		Error COFFWriter::write() {
bool IsBigObj = Obj.getSections().size() > MaxNumberOfSections16;		bool IsBigObj =
		Obj.getSections().size() > MaxNumberOfSections16 || ForceBigObj;
if (IsBigObj && Obj.IsPE)		if (IsBigObj && Obj.IsPE)
return createStringError(object_error::parse_failed,		return createStringError(object_error::parse_failed,
"Too many sections for executable");		"Too many sections for executable");
return write(IsBigObj);		return write(IsBigObj);
}		}

} // end namespace coff		} // end namespace coff
} // end namespace objcopy		} // end namespace objcopy
} // end namespace llvm		} // end namespace llvm

tools/llvm-objcopy/CopyConfig.h

[... 87 lines not shown ...]	struct CopyConfig {
bool StripDWO = false;		bool StripDWO = false;
bool StripDebug = false;		bool StripDebug = false;
bool StripNonAlloc = false;		bool StripNonAlloc = false;
bool StripSections = false;		bool StripSections = false;
bool StripUnneeded = false;		bool StripUnneeded = false;
bool Weaken = false;		bool Weaken = false;
bool DecompressDebugSections = false;		bool DecompressDebugSections = false;
DebugCompressionType CompressionType = DebugCompressionType::None;		DebugCompressionType CompressionType = DebugCompressionType::None;

		// Options for testing only
		bool ForceBigObj = false;
};		};

// Configuration for the overall invocation of this tool. When invoked as		// Configuration for the overall invocation of this tool. When invoked as
// objcopy, will always contain exactly one CopyConfig. When invoked as strip,		// objcopy, will always contain exactly one CopyConfig. When invoked as strip,
// will contain one or more CopyConfigs.		// will contain one or more CopyConfigs.
struct DriverConfig {		struct DriverConfig {
SmallVector<CopyConfig, 1> CopyConfigs;		SmallVector<CopyConfig, 1> CopyConfigs;
};		};
[... 15 lines not shown ...]

tools/llvm-objcopy/CopyConfig.cpp

[... 368 lines not shown ...]	for (auto Arg : InputArgs.filtered(OBJCOPY_keep_symbol))
Config.SymbolsToKeep.push_back(Arg->getValue());		Config.SymbolsToKeep.push_back(Arg->getValue());

Config.DeterministicArchives = InputArgs.hasFlag(		Config.DeterministicArchives = InputArgs.hasFlag(
OBJCOPY_enable_deterministic_archives,		OBJCOPY_enable_deterministic_archives,
OBJCOPY_disable_deterministic_archives, /*default=*/true);		OBJCOPY_disable_deterministic_archives, /*default=*/true);

Config.PreserveDates = InputArgs.hasArg(OBJCOPY_preserve_dates);		Config.PreserveDates = InputArgs.hasArg(OBJCOPY_preserve_dates);

		Config.ForceBigObj = InputArgs.hasArg(OBJCOPY_force_bigobj);

if (Config.DecompressDebugSections &&		if (Config.DecompressDebugSections &&
Config.CompressionType != DebugCompressionType::None) {		Config.CompressionType != DebugCompressionType::None) {
error("Cannot specify --compress-debug-sections at the same time as "		error("Cannot specify --compress-debug-sections at the same time as "
"--decompress-debug-sections at the same time");		"--decompress-debug-sections at the same time");
}		}

if (Config.DecompressDebugSections && !zlib::isAvailable())		if (Config.DecompressDebugSections && !zlib::isAvailable())
error("LLVM was not compiled with LLVM_ENABLE_ZLIB: cannot decompress.");		error("LLVM was not compiled with LLVM_ENABLE_ZLIB: cannot decompress.");
[... 89 lines not shown ...]

tools/llvm-objcopy/ObjcopyOpts.td

	[... 172 lines not shown ...]
	defm build_id_link_input			defm build_id_link_input
	: Eq<"build-id-link-input", "Hard-link the input to <dir>/xx/xxx<suffix> "			: Eq<"build-id-link-input", "Hard-link the input to <dir>/xx/xxx<suffix> "
	"name derived from hex build ID">,			"name derived from hex build ID">,
	MetaVarName<"suffix">;			MetaVarName<"suffix">;
	defm build_id_link_output			defm build_id_link_output
	: Eq<"build-id-link-output", "Hard-link the output to <dir>/xx/xxx<suffix> "			: Eq<"build-id-link-output", "Hard-link the output to <dir>/xx/xxx<suffix> "
	"name derived from hex build ID">,			"name derived from hex build ID">,
	MetaVarName<"suffix">;			MetaVarName<"suffix">;

				// Undocumented options only used for testing.
				def force_bigobj : Flag<["--"], "force-bigobj">;