This is an archive of the discontinued LLVM Phabricator instance.

I ran this on some internal ihex files we have, and it seems to be dropping one of the sections, so I'll have to poke around to see what the bug is. But some ihex support is better than none, so it's not necessarily blocking :)

test/tools/llvm-objcopy/ELF/ihex-reader.test
2–4 ↗	(On Diff #193725)	The only ihex reading test (besides the error cases) is just consuming ihex that llvm-objcopy produces with -O ihex. Could you add a .hex test file for a more stable test? And then it would be easier to verify, e.g., llvm-objcopy -I ihex -O binary <test> is identical to objcopy -I ihex -O binary <test>
tools/llvm-objcopy/ELF/ELFObjcopy.cpp
670	A more succinct way (and same in executeObjcopyOnRawBinary): const ElfType OutputElfType = getOutputElfType( Config.OutputArch.getValueOr(Config.BinaryArch));
tools/llvm-objcopy/ELF/Object.cpp
160–161	`Fail` is unused in release builds, so you need to add a `(void)Fail;` to silence the error/warning in release builds.
1077	RecAddr should be defined in the loop, where it is used
1420–1430	WDYT about just using llvm::Regex here instead of this method? It may be easier to read code if it just attempts to match ":[0-9A-F]+". It would produce less precise error messages, though.
1421	I think this will crash (or be UB) on an empty line?
1439–1440	I think there should be validation (somewhere) that there are no more records after this
1480	as a tiny optimization, call Records.reserve(Lines.size()) once you know how many lines there are.
1495–1502	Once we've validated it, can we convert the whole hex string to separate ArrayRef<uint8/16_t> fields for each record, so we don't have to worry about it being valid everywhere (i.e. using checkedGetHex)?
1504–1507	How about creating a static method to convert a line into an Expected<IHexRecord>, so we can return an error if it's invalid instead of making the user call getChecksum/checkRecord?
tools/llvm-objcopy/ELF/Object.h
196–199	Why not just uint16_t fields?
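
For illustration, the idea of a single parse entry point that validates a line once and hands back typed fields could look roughly like the sketch below. It is standalone and not the patch's code: `ParsedRecord`, `parseIHexLine`, and the error strings are hypothetical, and a plain `std::string` error stands in for `llvm::Expected<IHexRecord>`. It also assumes that only uppercase hex digits are accepted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical record type: fields are decoded once, so later code
// never has to re-validate hex text.
struct ParsedRecord {
  uint8_t Type = 0;
  uint16_t Addr = 0;
  std::vector<uint8_t> Data;
};

// Returns the value of one hex digit, or -1 if C is not [0-9A-F].
static int hexDigit(char C) {
  if (C >= '0' && C <= '9')
    return C - '0';
  if (C >= 'A' && C <= 'F')
    return C - 'A' + 10;
  return -1;
}

// Parses ":LLAAAATT<data>CC". Returns an empty string on success and
// an error message otherwise (a stand-in for llvm::Expected).
std::string parseIHexLine(const std::string &Line, ParsedRecord &Out) {
  if (Line.empty())
    return "empty line";
  if (Line[0] != ':')
    return "missing ':' at the beginning of line";
  // Shortest record: ':' + count, address, type, checksum = 11 chars.
  if (Line.size() < 11 || Line.size() % 2 == 0)
    return "invalid line length";

  std::vector<uint8_t> Bytes;
  for (size_t I = 1; I < Line.size(); I += 2) {
    int Hi = hexDigit(Line[I]), Lo = hexDigit(Line[I + 1]);
    if (Hi < 0 || Lo < 0) // Report the 1-based column of the bad digit.
      return "invalid character at position " +
             std::to_string(Hi < 0 ? I + 1 : I + 2);
    Bytes.push_back(static_cast<uint8_t>(Hi * 16 + Lo));
  }
  assert(Bytes.size() >= 5 && "guaranteed by the length check above");
  // Bytes = count, addr hi, addr lo, type, data..., checksum.
  if (Bytes.size() != static_cast<size_t>(Bytes[0]) + 5)
    return "data length does not match byte count";

  // All bytes including the checksum must sum to zero modulo 256.
  uint8_t Sum = 0;
  for (uint8_t B : Bytes)
    Sum = static_cast<uint8_t>(Sum + B);
  if (Sum != 0)
    return "incorrect checksum";

  Out.Type = Bytes[3];
  Out.Addr = static_cast<uint16_t>((Bytes[1] << 8) | Bytes[2]);
  Out.Data.assign(Bytes.begin() + 4, Bytes.end() - 1);
  return "";
}
```

With this shape, an empty line is rejected up front rather than indexed into, and callers never touch the raw hex text after a successful parse.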

evgeny777 marked 4 inline comments as done. Apr 5 2019, 10:36 AM

evgeny777 added inline comments.

tools/llvm-objcopy/ELF/Object.cpp
1420–1430	I think that a precise error message is more important. It might be hard in some cases to identify the wrong character, e.g. `I` instead of `1`, `O` instead of `0`, Russian `А` instead of English `A`, and so on.
1421	Line is checked for minimal valid length earlier in the code. Though, it makes sense to assert here on `!Line.empty()`
1439–1440	I think that an `EndOfFile` record should unconditionally cancel further processing. This allows moving the EOF record within a file to temporarily prevent some of the records from loading, which can be useful for testing. It also seems GNU objcopy behaves this way.
1495–1502	It's possible, but I don't see a straightforward way to do this without dynamic memory allocation. As we're checking the string with `checkChars`, we shouldn't really hit a conversion error unless something really weird happens.
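
To make the point about precise messages concrete, here is a minimal sketch of a character check that reports both the column and the byte value, so look-alike characters can be told apart. The names `findInvalidHexChar` and `describeInvalidChar` and the message wording are hypothetical, not the patch's `checkChars`; the sketch assumes only uppercase hex digits are valid.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns the index of the first character after the leading ':' that
// is not an uppercase hex digit, or std::string::npos if all are valid.
size_t findInvalidHexChar(const std::string &Line) {
  assert(!Line.empty() && Line[0] == ':' && "caller checks the prefix");
  return Line.find_first_not_of("0123456789ABCDEF", 1);
}

// Builds a message naming the offending byte and its 1-based column.
// Returns an empty string if the line contains only valid characters.
std::string describeInvalidChar(const std::string &Line) {
  size_t Pos = findInvalidHexChar(Line);
  if (Pos == std::string::npos)
    return "";
  unsigned Byte = static_cast<unsigned char>(Line[Pos]);
  static const char Digits[] = "0123456789ABCDEF";
  std::string Hex;
  Hex += Digits[Byte >> 4];
  Hex += Digits[Byte & 0xF];
  return "invalid character 0x" + Hex + " at position " +
         std::to_string(Pos + 1);
}
```

Printing the byte value matters here because a Cyrillic `А` renders the same as a Latin `A` in most fonts but shows up as a non-ASCII byte in the message.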

Line parser moved to IHexRecord::parse
Better test case for reader
Addressed some of review comments

Ping

In D60270#1463807, @evgeny777 wrote:

Ping

FYI, most of the other llvm-objcopy developers are still away following Euro LLVM, so it may not be until next week that you get any comments from @rupprecht. I don't currently have time to read up on ihex, I'm afraid, but if you don't get any feedback next week, I'll try to look into it if I have time.

FYI, most of the other llvm-objcopy developers are still away following Euro LLVM

Ah, I see. Ok, there is no rush.

In D60270#1464005, @evgeny777 wrote:

FYI, most of the other llvm-objcopy developers are still away following Euro LLVM

Ah, I see. Ok, there is no rush.

Sorry, I should have mentioned earlier that I was going to be busy last week. (In advance: I'm here this week, but I'll be out next week).
Hopefully I'll get to this one today, or if not, then tomorrow.

btw, some people at Euro LLVM also requested srec support, which seems extremely similar to ihex -- so it might be good to think about how generic this handling can be, e.g. maybe most of it should just be a "record" parser which is shared between ihex and srec. I don't think premature generalization should be done to make it more general than it needs to be, but just don't do anything that would be hostile towards refactoring it :)

btw, some people at Euro LLVM also requested srec support, which seems extremely similar to ihex

For me it doesn't look extremely similar to IHEX, except both formats use hexadecimal byte representation.
There are no such things as segment and extended addresses in SREC and even checksum calculation is different.

I think if we implement SREC then part of section builder functionality from IHexELFBuilder::addDataSections can be moved to a common base class,
also it seems SREC would have similar record structure (Type, Address, Data).

Still I expect writer and parser to be completely separate.
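
As a small illustration of the checksum difference mentioned above: Intel HEX stores the two's complement of the byte sum, while SREC stores the one's complement. The sketch below is standalone; the function names are hypothetical and the caller is assumed to pass exactly the bytes each format covers.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Sum of all bytes, truncated to 8 bits.
static uint8_t byteSum(const std::vector<uint8_t> &Bytes) {
  assert(!Bytes.empty() && "a record has at least a count byte");
  uint8_t Sum = 0;
  for (uint8_t B : Bytes)
    Sum = static_cast<uint8_t>(Sum + B);
  return Sum;
}

// Intel HEX: two's complement of the sum of count, address, type, data.
uint8_t ihexChecksum(const std::vector<uint8_t> &Bytes) {
  return static_cast<uint8_t>(0x100 - byteSum(Bytes));
}

// SREC: one's complement of the sum of count, address, data.
uint8_t srecChecksum(const std::vector<uint8_t> &Bytes) {
  return static_cast<uint8_t>(~byteSum(Bytes));
}
```

For the same input bytes the two results always differ by one, so a shared "record" layer would still need a format-specific checksum hook.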

There's a lot of code to review here. I'll keep reviewing it every day, but this is going to take a while. Any help splitting this up into smaller chunks would be appreciated: splitting reading and writing into two separate patches would help, as would removing features that we can add later.

include/llvm/Support/Error.h
1180	size_t here and below is kind of confusing, can we use uint32_t?
tools/llvm-objcopy/ELF/Object.cpp
158	Unless a check that generates an error always precedes this, I think it's best to return an error in this case, not assert-fail. It would be better to roll this into an Expected function in that case anyway, I think.
198	Maybe a raw_ostream would be useful here. We've generally avoided them, but this format seems to lend itself to streams, whereas my opinion was the opposite before. You wouldn't need utohexstr, since those formatting options are already supplied by the library, I believe.
210	This is a very generic name with no comment. In general your comments have been awesome. I'd like to have an idea what this function does without reading the contents.
299	Does this ever make sense if there is no segment?
311	Masking that like this seems redundant, in general the number of places we're converting from 64 to 32 in an unchecked way is really shocking. I'd feel a lot more comfortable if we encapsulated these checks more and made them more clear.
314	Maybe we could split support for extended records out into a separate patch and error out here for now?
tools/llvm-objcopy/ELF/Object.h
267	What's the point in splitting this into two classes? Also does inheriting from BinarySectionWriter make sense? The same visitors will need to be implemented I would suppose the offsets and everything would be very different.
297	In general I think it might be worth considering whether there is a need to use sections at all. Originally with the binary writer we only used program headers. It turns out that people did a lot of stuff in a really odd way with GNU objcopy when using -O binary, which required that we use sections as the primary basis for output. I would imagine that ihex users would not be doing the same sorts of odd tricks and that you could write the output straight from program headers. This would simplify the implementation greatly, I think, and harden it against all sorts of odd corner cases.
495–504	Why do we need this to output a new format?
915	We can probably split this into two changes to make things smaller, one for reading, and one for writing yeah?

There's a lot of code to review here.

I've responded to some of the comments, meanwhile I'm splitting patch into writer (will go first) and reader (will go next). Will update the review soon

tools/llvm-objcopy/ELF/Object.cpp
158	This function can't actually return error, because string has been previously validated (see `checkChars` for example). IMO, it's bad practice to implement runtime checks for one's own logical errors.
198	This function was optimized to avoid any dynamic allocation (IHexLineData is actually a SmallVector), because each line contains only 16 bytes of section data, so it's possible to have a really huge number of lines. What are the benefits of using raw_ostream?
299	It's a helper function which returns section VA if there is no segment. Any suggestion for better name?
311	There is a checkSections method which checks all sections to detect if any of them has a 64-bit address. Bear in mind that the implementation also supports sign-extended 32-bit addresses, i.e. 0xFFFFFFFF80000000 is a valid address, but 0x100000000 is not.
314	I suggest splitting reader and writer. To me it looks like a more logical split compared to removal of certain record types.
tools/llvm-objcopy/ELF/Object.h
267	I inherited IHexSectionWriter from BinarySectionWriter in order to reuse the visitors for RelocationSection, GnuDebugLinkSection, etc., which will never go to IHEX or binary output. Are you suggesting duplication?
297	AFAIU one can't do this with IHEX, because unlike binary IHEX is not a contiguous blob, e.g you can have a gap between sections which won't go to output file.
495–504	This function takes a portion of hexadecimal data from IHexRecord and appends it in binary form to internal vector of OwnedDataSection
915	I think so.
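
The address rule described above (plain 32-bit and sign-extended 32-bit values are accepted, everything else is rejected) can be sketched as follows. This is an illustration under that stated rule rather than the patch's exact code; the function names are hypothetical. Two equivalent formulations are shown: one relying on unsigned wraparound, which is well defined in C++, and one using an explicit mask.

```cpp
#include <cassert>
#include <cstdint>

// True if Addr cannot be represented in an IHEX file: it is neither a
// plain 32-bit value nor a sign-extended 32-bit value.
bool addressOverflows32Bit(uint64_t Addr) {
  // Unsigned addition wraps modulo 2^64. Sign-extended values in
  // [0xFFFFFFFF80000000, 2^64) wrap into [0, 0x7FFFFFFF].
  return Addr > UINT32_MAX && Addr + 0x80000000ULL > UINT32_MAX;
}

// Equivalent formulation with an explicit mask and no wraparound.
bool addressOverflows32BitMask(uint64_t Addr) {
  const uint64_t High = 0xFFFFFFFF80000000ULL;
  return Addr > UINT32_MAX && (Addr & High) != High;
}

// Truncates a validated address to the 32 bits that IHEX can encode.
uint32_t toIHexAddress(uint64_t Addr) {
  assert(!addressOverflows32Bit(Addr) && "validate sections first");
  return static_cast<uint32_t>(Addr);
}
```

Centralizing the check in one helper and asserting in the truncating conversion is one way to address the concern about unchecked 64-to-32-bit conversions scattered through the writer.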

Split the IHEX patch into reader and writer. The diff now contains the "writer" part.

Any comments on this?

Ping

Looked mostly at the test for now, going to take a pass over the code today.

test/tools/llvm-objcopy/ELF/Inputs/ihex-elf-segments.yaml
20	I think this should be .data1? Command Output (stderr): -- error: Unknown section referenced: '.data1' by program header. (I think this is a recent validation added by rL359663)
test/tools/llvm-objcopy/ELF/ihex-writer.test
1	All these hex outputs have diffs when compared to what GNU objcopy produces... is that expected? I haven't yet debugged exactly why.
3	`cat X \| FileCheck` should be replaced with `FileCheck --input-file=X` everywhere
20	When I run GNU objcopy on this test case, I get an error: `address 0xffffffff80001000 out of range for Intel Hex file`. Maybe we shouldn't be supporting it? Are we able to handle it correctly somehow even though GNU objcopy can't?
tools/llvm-objcopy/ELF/Object.cpp
155	Isn't relying on `Addr + 0x80000000` to wrap around UB? Could this just directly check `(Addr & 0xffffffff80000000) == 0xffffffff80000000` instead?

Some insight into the differences:

test/tools/llvm-objcopy/ELF/ihex-writer.test
10	It looks like the addresses in this file don't match up, but I don't have a specific suggestion yet
60	It looks like this should only be printed when the address is not zero: https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=blob;f=bfd/ihex.c;h=101e0a76155fc48f95312c08307739cf9c1ee5eb;hb=HEAD#l880
tools/llvm-objcopy/ELF/Object.cpp
205	It looks like ihex uses `\r\n` line endings 😦 https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=blob;f=bfd/ihex.c;h=101e0a76155fc48f95312c08307739cf9c1ee5eb;hb=HEAD#l752 It seems weird for me to request this, but I think we should write `\r\n`, as this seems like a strange detail that people might need when consuming these files. I don't actually have any examples of this, however.

MaskRay added inline comments. May 22 2019, 6:37 PM

test/tools/llvm-objcopy/ELF/ihex-writer.test
9	The two RUN lines can be written as: `llvm-objcopy -O ihex %t-segs - \| FileCheck --check-prefix=SEGMENTS %s` if the output `%t2-segs.hex` isn't used elsewhere.

Fixed error in section name in one of the tests
Removed cat | FileCheck from test case
Zero start address is no longer written to IHEX
Switched to windows line endings.

test/tools/llvm-objcopy/ELF/ihex-writer.test
1	After I stopped emitting the '03' record for a zero start address, the output from `ihex-elf-sections.yaml` is identical to that of GNU objcopy. However, if the input ELF file contains segments, the situation is different: GNU objcopy seems to ignore segments completely and always uses the section virtual address. This doesn't seem logical to me, and it also doesn't look consistent with the way we're currently generating binary output in llvm-objcopy.
20	Probably the problem is in the version of objcopy you're using. On my machine 2.30 fails, but 2.32.51.20190227 works fine
tools/llvm-objcopy/ELF/Object.cpp
155	As far as I know unsigned overflows (unlike signed) are not UB. Addr is uint64_t.
205	Yes, I've seen this as well. The IHEX spec says nothing about line endings. Wikipedia says: "Programs that create HEX records typically use line termination characters that conform to the conventions of their operating systems". Probably the easiest thing to do is to stick to GNU behavior. I'll update the patch.
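
For reference, a record line with the `\r\n` terminator discussed here can be formatted as in the sketch below. This is illustrative only: `formatIHexRecord` is a hypothetical name, and it uses `std::string` for brevity rather than the allocation-free `IHexLineData` the patch describes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Appends one byte as two uppercase hex digits.
static void appendHexByte(std::string &Out, uint8_t B) {
  static const char Digits[] = "0123456789ABCDEF";
  Out += Digits[B >> 4];
  Out += Digits[B & 0xF];
}

// Formats ":LLAAAATT<data>CC\r\n" for up to 255 data bytes.
std::string formatIHexRecord(uint8_t Type, uint16_t Addr,
                             const uint8_t *Data, size_t Size) {
  assert(Size <= 0xFF && "byte count must fit in one byte");
  std::string Line(1, ':');
  const uint8_t Header[4] = {static_cast<uint8_t>(Size),
                             static_cast<uint8_t>(Addr >> 8),
                             static_cast<uint8_t>(Addr & 0xFF), Type};
  uint8_t Sum = 0;
  for (uint8_t B : Header) {
    appendHexByte(Line, B);
    Sum = static_cast<uint8_t>(Sum + B);
  }
  for (size_t I = 0; I != Size; ++I) {
    appendHexByte(Line, Data[I]);
    Sum = static_cast<uint8_t>(Sum + Data[I]);
  }
  // The checksum is the two's complement of the running sum.
  appendHexByte(Line, static_cast<uint8_t>(0x100 - Sum));
  Line += "\r\n"; // Match GNU objcopy, which writes CRLF.
  return Line;
}
```

Apart from the line terminator, the results agree with the `CHECK` lines in `ihex-writer.test`, for example `:0401000040414243F5` for the four bytes of `.data2` at offset 0x0100.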

seiya added a subscriber: seiya. May 27 2019, 1:42 AM

I ran a few internal tests and this produces identical ihex output for every file I checked! \ o /

test/tools/llvm-objcopy/ELF/ihex-writer.test
20	Yep, that was it, I'm no longer seeing it with GNU objcopy from trunk.

This revision is now accepted and ready to land. May 28 2019, 2:19 PM

evgeny777 retitled this revision from "[llvm-objcopy] Add support for Intel HEX input/output format" to "[llvm-objcopy] Add support for Intel HEX output format". May 29 2019, 3:11 AM

Closed by commit rL361949: [llvm-objcopy] Implement IHEX writer (authored by evgeny777). May 29 2019, 4:37 AM

This revision was automatically updated to reflect the committed changes.

Herald added a project: Restricted Project. May 29 2019, 4:37 AM

Herald added a subscriber: kristina.

evgeny777 mentioned this in D62583: [llvm-objcopy] Implement IHEX reader. May 29 2019, 6:12 AM

simon_tatham mentioned this in D132541: [llvm-objcopy] Introduce 'ihex-flat' output format. Aug 24 2022, 2:46 AM

Revision Contents

Path	Size
include/llvm/Support/Error.h	27 lines
test/tools/llvm-objcopy/ELF/Inputs/ihex-elf-pt-null.yaml	20 lines
test/tools/llvm-objcopy/ELF/Inputs/ihex-elf-sections.yaml	60 lines
test/tools/llvm-objcopy/ELF/Inputs/ihex-elf-sections2.yaml	39 lines
test/tools/llvm-objcopy/ELF/Inputs/ihex-elf-segments.yaml	60 lines
test/tools/llvm-objcopy/ELF/ihex-writer.test	92 lines
tools/llvm-objcopy/CopyConfig.cpp	3 lines
tools/llvm-objcopy/ELF/ELFObjcopy.cpp	45 lines
tools/llvm-objcopy/ELF/Object.h	140 lines
tools/llvm-objcopy/ELF/Object.cpp	258 lines

Diff 197265

include/llvm/Support/Error.h


	/// This class wraps a filename and another Error.			/// This class wraps a filename and another Error.
	///			///
	/// In some cases, an error needs to live along a 'source' name, in order to			/// In some cases, an error needs to live along a 'source' name, in order to
	/// show more detailed information to the user.			/// show more detailed information to the user.
	class FileError final : public ErrorInfo<FileError> {			class FileError final : public ErrorInfo<FileError> {

	friend Error createFileError(const Twine &, Error);			friend Error createFileError(const Twine &, Error);
				friend Error createFileError(const Twine &, size_t, Error);

	public:			public:
	void log(raw_ostream &OS) const override {			void log(raw_ostream &OS) const override {
	assert(Err && !FileName.empty() && "Trying to log after takeError().");			assert(Err && !FileName.empty() && "Trying to log after takeError().");
	OS << "'" << FileName << "': ";			OS << "'" << FileName << "': ";
				if (Line.hasValue())
				OS << "line " << Line.getValue() << ": ";
	Err->log(OS);			Err->log(OS);
	}			}

	Error takeError() { return Error(std::move(Err)); }			Error takeError() { return Error(std::move(Err)); }

	std::error_code convertToErrorCode() const override;			std::error_code convertToErrorCode() const override;

	// Used by ErrorInfo::classID.			// Used by ErrorInfo::classID.
	static char ID;			static char ID;

	private:			private:
	FileError(const Twine &F, std::unique_ptr<ErrorInfoBase> E) {			FileError(const Twine &F, Optional<size_t> LineNum,
				std::unique_ptr<ErrorInfoBase> E) {
	assert(E && "Cannot create FileError from Error success value.");			assert(E && "Cannot create FileError from Error success value.");
	assert(!F.isTriviallyEmpty() &&			assert(!F.isTriviallyEmpty() &&
	"The file name provided to FileError must not be empty.");			"The file name provided to FileError must not be empty.");
	FileName = F.str();			FileName = F.str();
	Err = std::move(E);			Err = std::move(E);
				Line = std::move(LineNum);
	}			}

	static Error build(const Twine &F, Error E) {			static Error build(const Twine &F, Optional<size_t> Line, Error E) {
	return Error(std::unique_ptr<FileError>(new FileError(F, E.takePayload())));			return Error(
				std::unique_ptr<FileError>(new FileError(F, Line, E.takePayload())));
	}			}

	std::string FileName;			std::string FileName;
				Optional<size_t> Line;
	std::unique_ptr<ErrorInfoBase> Err;			std::unique_ptr<ErrorInfoBase> Err;
	};			};

	/// Concatenate a source file path and/or name with an Error. The resulting			/// Concatenate a source file path and/or name with an Error. The resulting
	/// Error is unchecked.			/// Error is unchecked.
	inline Error createFileError(const Twine &F, Error E) {			inline Error createFileError(const Twine &F, Error E) {
	return FileError::build(F, std::move(E));			return FileError::build(F, Optional<size_t>(), std::move(E));
				}

				/// Concatenate a source file path and/or name with line number and an Error.
				/// The resulting Error is unchecked.
				inline Error createFileError(const Twine &F, size_t Line, Error E) {
				return FileError::build(F, Optional<size_t>(Line), std::move(E));
	}			}

	/// Concatenate a source file path and/or name with a std::error_code			/// Concatenate a source file path and/or name with a std::error_code
	/// to form an Error object.			/// to form an Error object.
	inline Error createFileError(const Twine &F, std::error_code EC) {			inline Error createFileError(const Twine &F, std::error_code EC) {
	return createFileError(F, errorCodeToError(EC));			return createFileError(F, errorCodeToError(EC));
	}			}

				/// Concatenate a source file path and/or name with line number and
				/// std::error_code to form an Error object.
				inline Error createFileError(const Twine &F, size_t Line, std::error_code EC) {
				return createFileError(F, Line, errorCodeToError(EC));
				}

	Error createFileError(const Twine &F, ErrorSuccess) = delete;			Error createFileError(const Twine &F, ErrorSuccess) = delete;

	/// Helper for check-and-exit error handling.			/// Helper for check-and-exit error handling.
	///			///
	/// For tool use only. NOT FOR USE IN LIBRARY CODE.			/// For tool use only. NOT FOR USE IN LIBRARY CODE.
	///			///
	class ExitOnError {			class ExitOnError {
	public:			public:

test/tools/llvm-objcopy/ELF/Inputs/ihex-elf-pt-null.yaml

				!ELF
				FileHeader:
				Class: ELFCLASS64
				Data: ELFDATA2LSB
				Type: ET_EXEC
				Machine: EM_X86_64
				Sections:
				- Name: .text
				Type: SHT_PROGBITS
				Flags: [ SHF_ALLOC, SHF_EXECINSTR ]
				Address: 0x0
				AddressAlign: 0x8
				Content: "0001020304"
				ProgramHeaders:
				- Type: PT_NULL
				Flags: [ PF_X, PF_R ]
				VAddr: 0xF00000000
				PAddr: 0x100000
				Sections:
				- Section: .text

test/tools/llvm-objcopy/ELF/Inputs/ihex-elf-sections.yaml

				!ELF
				FileHeader:
				Class: ELFCLASS64
				Data: ELFDATA2LSB
				Type: ET_EXEC
				Machine: EM_X86_64
				Sections:
				- Name: .text
				# This section contents exceeds default IHex line length of 16 bytes
				# so we expect two lines created for it.
				Type: SHT_PROGBITS
				Flags: [ SHF_ALLOC, SHF_EXECINSTR ]
				Address: 0x0
				AddressAlign: 0x8
				Content: "000102030405060708090A0B0C0D0E0F1011121314"
				- Name: .data
				# This section overlap 16-bit segment boundary, so we expect
				# additional 'SegmentAddr' record of type '02'
				Type: SHT_PROGBITS
				Flags: [ SHF_ALLOC ]
				Content: "3031323334353637383940"
				Address: 0xFFF8
				AddressAlign: 0x8
				- Name: .data2
				# Previous section '.data' should have forced creation of
				# 'SegmentAddr'(02) record with segment address of 0x10000,
				# so this section should have address of 0x100.
				Type: SHT_PROGBITS
				Flags: [ SHF_ALLOC ]
				Content: "40414243"
				Address: 0x10100
				AddressAlign: 0x8
				- Name: .data3
				# The last section not only overlaps segment boundary, but
				# also has linear address which doesn't fit 20 bits. The
				# following records should be craeted:
				# 'SegmentAddr'(02) record with address 0x0
				# 'ExtendedAddr'(04) record with address 0x100000
				# 'Data'(00) record with 8 bytes of section data
				# 'SegmentAddr'(02) record with address 0x10000
				# 'Data'(00) record with remaining 3 bytes of data.
				Type: SHT_PROGBITS
				Flags: [ SHF_ALLOC ]
				Content: "5051525354555657585960"
				Address: 0x10FFF8
				AddressAlign: 0x8
				- Name: .bss
				# NOBITS sections are not written to IHex
				Type: SHT_NOBITS
				Flags: [ SHF_ALLOC ]
				Address: 0x10100
				Size: 0x1000
				AddressAlign: 0x8
				- Name: .dummy
				# Non-allocatable sections are not written to IHex
				Type: SHT_PROGBITS
				Flags: [ ]
				Address: 0x20FFF8
				Size: 65536
				AddressAlign: 0x8

test/tools/llvm-objcopy/ELF/Inputs/ihex-elf-sections2.yaml

				!ELF
				FileHeader:
				Class: ELFCLASS64
				Data: ELFDATA2LSB
				Type: ET_EXEC
				Machine: EM_X86_64
				Sections:
				- Name: .text
				# Zero length sections are not exported to IHex
				# 'SegmentAddr' and 'ExtendedAddr' records aren't
				# created either.
				Type: SHT_PROGBITS
				Flags: [ SHF_ALLOC, SHF_EXECINSTR ]
				Address: 0x7FFFFFFF
				AddressAlign: 0x8
				Size: 0
				- Name: .text1
				# Section address is sign-extended 32-bit address
				# Data fits 32-bit range
				Type: SHT_PROGBITS
				Flags: [ SHF_ALLOC, SHF_EXECINSTR ]
				Address: 0xFFFFFFFF80001000
				AddressAlign: 0x8
				Content: "0001020304"
				- Name: .text2
				# Part of section data is in 32-bit address range
				# and part isn't.
				Type: SHT_PROGBITS
				Flags: [ SHF_ALLOC, SHF_EXECINSTR ]
				Address: 0xFFFFFFF8
				AddressAlign: 0x8
				Content: "000102030405060708"
				- Name: .text3
				# Entire secion is outside of 32-bit range
				Type: SHT_PROGBITS
				Flags: [ SHF_ALLOC, SHF_EXECINSTR ]
				Address: 0xFFFFFFFF0
				AddressAlign: 0x8
				Content: "0001020304"

test/tools/llvm-objcopy/ELF/Inputs/ihex-elf-segments.yaml

				# Here we use yaml from ihex-elf-sections.yaml, but add single load
				# segment containing all exported sections. In such case we should
				# use physical address of a section intead of virtual address. Physical
				# addresses start from 0x100000, so we create two additional 'ExtenededAddr'
				# (03) record in the beginning of IHex file with that physical address
				!ELF
				FileHeader:
				Class: ELFCLASS64
				Data: ELFDATA2LSB
				Type: ET_EXEC
				Machine: EM_X86_64
				Entry: 0x100000
				Sections:
				- Name: .text
				Type: SHT_PROGBITS
				Flags: [ SHF_ALLOC, SHF_EXECINSTR ]
				Address: 0x0
				AddressAlign: 0x8
				Content: "000102030405060708090A0B0C0D0E0F1011121314"
				- Name: .data
				Type: SHT_PROGBITS
				Flags: [ SHF_ALLOC ]
				Content: "3031323334353637383940"
				Address: 0xFFF8
				AddressAlign: 0x8
				- Name: .data2
				Type: SHT_PROGBITS
				Flags: [ SHF_ALLOC ]
				Content: "40414243"
				Address: 0x10100
				AddressAlign: 0x8
				- Name: .data3
				Type: SHT_PROGBITS
				Flags: [ SHF_ALLOC ]
				Content: "5051525354555657585960"
				Address: 0x10FFF8
				AddressAlign: 0x8
				- Name: .bss
				Type: SHT_NOBITS
				Flags: [ SHF_ALLOC ]
				Address: 0x10100
				Size: 0x1000
				AddressAlign: 0x8
				- Name: .dummy
				Type: SHT_PROGBITS
				Flags: [ ]
				Address: 0x20FFF8
				Size: 65536
				AddressAlign: 0x8
				ProgramHeaders:
				- Type: PT_LOAD
				Flags: [ PF_X, PF_R ]
				VAddr: 0xF00000000
				PAddr: 0x100000
				Sections:
				- Section: .text
				- Section: .data1
				- Section: .data2
				- Section: .data3
				- Section: .bss

test/tools/llvm-objcopy/ELF/ihex-writer.test

				# RUN: yaml2obj %p/Inputs/ihex-elf-sections.yaml -o %t
				# RUN: llvm-objcopy -O ihex %t %t2.hex
				# RUN: cat %t2.hex \| FileCheck %s

				# Check ihex output, when we have segments in ELF file
				# In such case only sections in PT_LOAD segments will
				# be exported and their physical addresses will be used
				# RUN: yaml2obj %p/Inputs/ihex-elf-segments.yaml -o %t-segs
				# RUN: llvm-objcopy -O ihex %t-segs %t2-segs.hex
				# RUN: cat %t2-segs.hex \| FileCheck %s --check-prefix=SEGMENTS

				# Check that non-load segments are ignored:
				# RUN: yaml2obj %p/Inputs/ihex-elf-pt-null.yaml -o %t2-segs
				# RUN: llvm-objcopy -O ihex %t2-segs %t3-segs.hex
				# RUN: cat %t3-segs.hex \| FileCheck %s --check-prefix=PT_NULL

				# Check that sign-extended 32-bit section addresses are processed
				# correctly
				# RUN: yaml2obj %p/Inputs/ihex-elf-sections2.yaml -o %t-sec2
				# RUN: llvm-objcopy -O ihex --only-section=.text1 %t-sec2 %t-sec2.hex
				# RUN: cat %t-sec2.hex \| FileCheck %s --check-prefix=SIGN_EXTENDED

				# Check that section address range overlapping 32 bit range
				# triggers an error
				# RUN: not llvm-objcopy -O ihex --only-section=.text2 %t-sec2 %t-sec2-2.hex 2>&1 \| FileCheck %s --check-prefix=BAD-ADDR
				# RUN: not llvm-objcopy -O ihex --only-section=.text3 %t-sec2 %t-sec2-3.hex 2>&1 \| FileCheck %s --check-prefix=BAD-ADDR2

				# Check that zero length section is not written
				# RUN: llvm-objcopy -O ihex --only-section=.text %t-sec2 %t-sec2-4.hex
				# RUN: cat %t-sec2-4.hex \| FileCheck %s --check-prefix=ZERO_SIZE_SEC

				# Check 80x86 start address record. It is created for start
				# addresses less than 0x100000
				# RUN: llvm-objcopy -O ihex --set-start=0xFFFF %t %t3.hex
				# RUN: cat %t3.hex \| FileCheck %s --check-prefix=START1

				# Check i386 start address record (05). It is created for
				# start addresses which doesn't fit 20 bits
				# RUN: llvm-objcopy -O ihex --set-start=0x100000 %t %t4.hex
				# RUN: cat %t4.hex \| FileCheck %s --check-prefix=START2

				# We allow sign extended 32 bit start addresses as well.
				# RUN: llvm-objcopy -O ihex --set-start=0xFFFFFFFF80001000 %t %t5.hex
				# RUN: cat %t5.hex \| FileCheck %s --check-prefix=START3

				# Start address which exceeds 32 bit range triggers an error
				# RUN: not llvm-objcopy -O ihex --set-start=0xF00000000 %t %t6.hex 2>&1 \| FileCheck %s --check-prefix=BAD-START

				# CHECK: :10000000000102030405060708090A0B0C0D0E0F78
				# CHECK-NEXT: :05001000101112131491
				# CHECK-NEXT: :08FFF800303132333435363765
				# CHECK-NEXT: :020000021000EC
				# CHECK-NEXT: :030000003839404C
				# CHECK-NEXT: :0401000040414243F5
				# CHECK-NEXT: :020000020000FC
				# CHECK-NEXT: :020000040010EA
				# CHECK-NEXT: :08FFF800505152535455565765
				# CHECK-NEXT: :020000040011E9
				# CHECK-NEXT: :03000000585960EC
				# CHECK-NEXT: :0400000300000000F9
				# CHECK-NEXT: :00000001FF

				# SEGMENTS: :020000040010EA
				# SEGMENTS-NEXT: :1002F800000102030405060708090A0B0C0D0E0F7E
				# SEGMENTS-NEXT: :05030800101112131496
				# SEGMENTS-NEXT: :0B031000303132333435363738394095
				# SEGMENTS-NEXT: :0403200040414243D3
				# SEGMENTS-NEXT: :0B03280050515253545556575859601D
				# SEGMENTS-NEXT: :0400000500100000E7
				# SEGMENTS-NEXT: :00000001FF

				# 'ExtendedAddr' (04) record shouldn't be created
				# PT_NULL-NOT: :02000004

				# SIGN_EXTENDED: :0200000480007A
				# SIGN_EXTENDED-NEXT: :051000000001020304E1
				# SIGN_EXTENDED-NEXT: :0400000300000000F9
				# SIGN_EXTENDED-NEXT: :00000001FF

				# BAD-ADDR: error: Section '.text2' address range [0xfffffff8, 0x100000000] is not 32 bit
				# BAD-ADDR2: error: Section '.text3' address range [0xffffffff0, 0xffffffff4] is not 32 bit

				# There shouldn't be 'ExtendedAddr' nor 'Data' records
				# ZERO_SIZE_SEC-NOT: :02000004
				# ZERO_SIZE_SEC-NOT: :00FFFF00
				# ZERO_SIZE_SEC: :0400000300000000F9
				# ZERO_SIZE_SEC-NEXT: :00000001FF

				# START1: :040000030000FFFFFB
				# START2: :0400000500100000E7
				# START3: :040000058000100067
				# BAD-START: error: Entry point address 0xf00000000 overflows 32 bits
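
The START1 to START3 expectations above can be reproduced with a small sketch of the start-address record selection: type 03 (80x86 CS:IP) for entries that fit 20 bits, type 05 (32-bit linear) otherwise. `StartRecord` and `makeStartRecord` are hypothetical names. The CS:IP split for entries between 0x10000 and 0xFFFFF is an assumption, since the test only exercises 0xFFFF.

```cpp
#include <cassert>
#include <cstdint>

struct StartRecord {
  uint8_t Type;    // 0x03 (80x86 CS:IP) or 0x05 (32-bit linear)
  uint8_t Data[4]; // big-endian payload
};

StartRecord makeStartRecord(uint64_t Entry) {
  // Entries must be 32-bit or sign-extended 32-bit values.
  assert((Entry <= UINT32_MAX || Entry + 0x80000000ULL <= UINT32_MAX) &&
         "entry point overflows 32 bits");
  const uint32_t E = static_cast<uint32_t>(Entry);
  StartRecord R = {};
  if (Entry <= 0xFFFFF) {
    // Assumed split: bits 16-19 go into CS, the low 16 bits into IP.
    const uint16_t CS = static_cast<uint16_t>((E & 0xF0000) >> 4);
    R.Type = 0x03;
    R.Data[0] = static_cast<uint8_t>(CS >> 8);
    R.Data[1] = static_cast<uint8_t>(CS & 0xFF);
    R.Data[2] = static_cast<uint8_t>((E >> 8) & 0xFF);
    R.Data[3] = static_cast<uint8_t>(E & 0xFF);
  } else {
    R.Type = 0x05;
    R.Data[0] = static_cast<uint8_t>(E >> 24);
    R.Data[1] = static_cast<uint8_t>((E >> 16) & 0xFF);
    R.Data[2] = static_cast<uint8_t>((E >> 8) & 0xFF);
    R.Data[3] = static_cast<uint8_t>(E & 0xFF);
  }
  return R;
}
```

For `--set-start=0xFFFF` this yields type 03 with payload `0000FFFF`, and for `--set-start=0x100000` type 05 with payload `00100000`, matching the data fields of the START1 and START2 lines.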

tools/llvm-objcopy/CopyConfig.cpp

if (BinaryArch.empty())		if (BinaryArch.empty())
return createStringError(		return createStringError(
errc::invalid_argument,		errc::invalid_argument,
"Specified binary input without specifiying an architecture");		"Specified binary input without specifiying an architecture");
Expected<const MachineInfo &> MI = getMachineInfo(BinaryArch);		Expected<const MachineInfo &> MI = getMachineInfo(BinaryArch);
if (!MI)		if (!MI)
return MI.takeError();		return MI.takeError();
Config.BinaryArch = *MI;		Config.BinaryArch = *MI;
}		}
if (!Config.OutputFormat.empty() && Config.OutputFormat != "binary") {		if (!Config.OutputFormat.empty() && Config.OutputFormat != "binary" &&
		Config.OutputFormat != "ihex") {
Expected<MachineInfo> MI = getOutputFormatMachineInfo(Config.OutputFormat);		Expected<MachineInfo> MI = getOutputFormatMachineInfo(Config.OutputFormat);
if (!MI)		if (!MI)
return MI.takeError();		return MI.takeError();
Config.OutputArch = *MI;		Config.OutputArch = *MI;
}		}

if (auto Arg = InputArgs.getLastArg(OBJCOPY_compress_debug_sections,		if (auto Arg = InputArgs.getLastArg(OBJCOPY_compress_debug_sections,
OBJCOPY_compress_debug_sections_eq)) {		OBJCOPY_compress_debug_sections_eq)) {
▲ Show 20 Lines • Show All 310 Lines • Show Last 20 Lines

tools/llvm-objcopy/ELF/ELFObjcopy.cpp

Show First 20 Lines • Show All 124 Lines • ▼ Show 20 Lines
static ElfType getOutputElfType(const MachineInfo &MI) {		static ElfType getOutputElfType(const MachineInfo &MI) {
// Infer output ELF type from the binary arch specified		// Infer output ELF type from the binary arch specified
if (MI.Is64Bit)		if (MI.Is64Bit)
return MI.IsLittleEndian ? ELFT_ELF64LE : ELFT_ELF64BE;		return MI.IsLittleEndian ? ELFT_ELF64LE : ELFT_ELF64BE;
else		else
return MI.IsLittleEndian ? ELFT_ELF32LE : ELFT_ELF32BE;		return MI.IsLittleEndian ? ELFT_ELF32LE : ELFT_ELF32BE;
}		}

static std::unique_ptr<Writer> createWriter(const CopyConfig &Config,		static std::unique_ptr<Writer> createELFWriter(const CopyConfig &Config,
Object &Obj, Buffer &Buf,		Object &Obj, Buffer &Buf,
ElfType OutputElfType) {		ElfType OutputElfType) {
if (Config.OutputFormat == "binary") {
return llvm::make_unique<BinaryWriter>(Obj, Buf);
}
// Depending on the initial ELFT and OutputFormat we need a different Writer.		// Depending on the initial ELFT and OutputFormat we need a different Writer.
switch (OutputElfType) {		switch (OutputElfType) {
case ELFT_ELF32LE:		case ELFT_ELF32LE:
return llvm::make_unique<ELFWriter<ELF32LE>>(Obj, Buf,		return llvm::make_unique<ELFWriter<ELF32LE>>(Obj, Buf,
!Config.StripSections);		!Config.StripSections);
case ELFT_ELF64LE:		case ELFT_ELF64LE:
return llvm::make_unique<ELFWriter<ELF64LE>>(Obj, Buf,		return llvm::make_unique<ELFWriter<ELF64LE>>(Obj, Buf,
!Config.StripSections);		!Config.StripSections);
case ELFT_ELF32BE:		case ELFT_ELF32BE:
return llvm::make_unique<ELFWriter<ELF32BE>>(Obj, Buf,		return llvm::make_unique<ELFWriter<ELF32BE>>(Obj, Buf,
!Config.StripSections);		!Config.StripSections);
case ELFT_ELF64BE:		case ELFT_ELF64BE:
return llvm::make_unique<ELFWriter<ELF64BE>>(Obj, Buf,		return llvm::make_unique<ELFWriter<ELF64BE>>(Obj, Buf,
!Config.StripSections);		!Config.StripSections);
}		}
llvm_unreachable("Invalid output format");		llvm_unreachable("Invalid output format");
}		}

		static std::unique_ptr<Writer> createWriter(const CopyConfig &Config,
		Object &Obj, Buffer &Buf,
		ElfType OutputElfType) {
		using Functor = std::function<std::unique_ptr<Writer>()>;
		return StringSwitch<Functor>(Config.OutputFormat)
		.Case("binary", [&] { return llvm::make_unique<BinaryWriter>(Obj, Buf); })
		.Case("ihex", [&] { return llvm::make_unique<IHexWriter>(Obj, Buf); })
		.Default(
		[&] { return createELFWriter(Config, Obj, Buf, OutputElfType); })();
		}
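
The new `createWriter` picks a factory by output format and only then invokes it, so a single writer is constructed. A rough standalone sketch of the same idea, using `std::function` and plain string comparison instead of `llvm::StringSwitch`, is shown here. The `Writer` types are stand-ins defined only for this example, and `kind()` is a hypothetical method that does not exist in the patch.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>

// Stand-in types for illustration only; the real classes live in Object.h.
struct Writer {
  virtual ~Writer() = default;
  virtual std::string kind() const = 0;
};
struct BinaryWriter : Writer {
  std::string kind() const override { return "binary"; }
};
struct IHexWriter : Writer {
  std::string kind() const override { return "ihex"; }
};
struct ELFWriter : Writer {
  std::string kind() const override { return "elf"; }
};

static std::unique_ptr<Writer> createWriter(const std::string &OutputFormat) {
  using Factory = std::function<std::unique_ptr<Writer>()>;
  // Default: anything other than "binary" or "ihex" is treated as ELF.
  Factory Make = [] { return std::make_unique<ELFWriter>(); };
  if (OutputFormat == "binary")
    Make = [] { return std::make_unique<BinaryWriter>(); };
  else if (OutputFormat == "ihex")
    Make = [] { return std::make_unique<IHexWriter>(); };
  // Only the selected factory runs, so only one writer is constructed.
  return Make();
}
```

Storing factories rather than ready-made writers keeps the selection lazy; a plain `if`/`else` chain returning the writer directly would behave the same way.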

template <class ELFT>		template <class ELFT>
static Expected<ArrayRef<uint8_t>>		static Expected<ArrayRef<uint8_t>>
findBuildID(const object::ELFFile<ELFT> &In) {		findBuildID(const object::ELFFile<ELFT> &In) {
for (const auto &Phdr : unwrapOrError(In.program_headers())) {		for (const auto &Phdr : unwrapOrError(In.program_headers())) {
if (Phdr.p_type != PT_NOTE)		if (Phdr.p_type != PT_NOTE)
continue;		continue;
Error Err = Error::success();		Error Err = Error::success();
for (const auto &Note : In.notes(Phdr, Err))		for (const auto &Note : In.notes(Phdr, Err))
▲ Show 20 Lines • Show All 475 Lines • ▼ Show 20 Lines	Obj.SymbolTable->addSymbol(
Sec ? (uint16_t)SYMBOL_SIMPLE_INDEX : (uint16_t)SHN_ABS, 0);		Sec ? (uint16_t)SYMBOL_SIMPLE_INDEX : (uint16_t)SHN_ABS, 0);
}		}

if (Config.EntryExpr)		if (Config.EntryExpr)
Obj.Entry = Config.EntryExpr(Obj.Entry);		Obj.Entry = Config.EntryExpr(Obj.Entry);
return Error::success();		return Error::success();
}		}

		static Error writeOutput(const CopyConfig &Config, Object &Obj, Buffer &Out,
		ElfType OutputElfType) {
		std::unique_ptr<Writer> Writer =
		createWriter(Config, Obj, Out, OutputElfType);
		if (Error E = Writer->finalize())
		return E;
		return Writer->write();
		}

Error executeObjcopyOnRawBinary(const CopyConfig &Config, MemoryBuffer &In,		Error executeObjcopyOnRawBinary(const CopyConfig &Config, MemoryBuffer &In,
Buffer &Out) {		Buffer &Out) {
BinaryReader Reader(Config.BinaryArch, &In);		BinaryReader Reader(Config.BinaryArch, &In);
std::unique_ptr<Object> Obj = Reader.create();		std::unique_ptr<Object> Obj = Reader.create();

// Prefer OutputArch (-O<format>) if set, otherwise fallback to BinaryArch		// Prefer OutputArch (-O<format>) if set, otherwise fallback to BinaryArch
		rupprecht: A more succinct way (and same in executeObjcopyOnRawBinary): `const ElfType OutputElfType = getOutputElfType(Config.OutputArch.getValueOr(Config.BinaryArch));`
// (-B<arch>).		// (-B<arch>).
const ElfType OutputElfType = getOutputElfType(		const ElfType OutputElfType =
Config.OutputArch ? Config.OutputArch.getValue() : Config.BinaryArch);		getOutputElfType(Config.OutputArch.getValueOr(Config.BinaryArch));
if (Error E = handleArgs(Config, *Obj, Reader, OutputElfType))		if (Error E = handleArgs(Config, *Obj, Reader, OutputElfType))
return E;		return E;
std::unique_ptr<Writer> Writer =		return writeOutput(Config, *Obj, Out, OutputElfType);
createWriter(Config, *Obj, Out, OutputElfType);
if (Error E = Writer->finalize())
return E;
return Writer->write();
}		}
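
The "prefer OutputArch, otherwise fall back to BinaryArch" rule is the usual optional-with-default pattern. A small sketch using `std::optional::value_or` as a stand-in for `llvm::Optional::getValueOr` follows. `MachineInfo` here is a reduced stand-in struct and `pickOutputArch` is a hypothetical name, neither taken from the patch.

```cpp
#include <cassert>
#include <optional>

// Reduced stand-in for the real MachineInfo; only two fields are modelled.
struct MachineInfo {
  bool Is64Bit;
  bool IsLittleEndian;
};

// Returns OutputArch if it was set (-O<format>), otherwise BinaryArch (-B<arch>).
static MachineInfo pickOutputArch(const std::optional<MachineInfo> &OutputArch,
                                  const MachineInfo &BinaryArch) {
  return OutputArch.value_or(BinaryArch);
}
```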

Error executeObjcopyOnBinary(const CopyConfig &Config,		Error executeObjcopyOnBinary(const CopyConfig &Config,
object::ELFObjectFileBase &In, Buffer &Out) {		object::ELFObjectFileBase &In, Buffer &Out) {
ELFReader Reader(&In);		ELFReader Reader(&In);
std::unique_ptr<Object> Obj = Reader.create();		std::unique_ptr<Object> Obj = Reader.create();
// Prefer OutputArch (-O<format>) if set, otherwise infer it from the input.		// Prefer OutputArch (-O<format>) if set, otherwise infer it from the input.
const ElfType OutputElfType =		const ElfType OutputElfType =
Show All 13 Lines	Error executeObjcopyOnBinary(const CopyConfig &Config,
if (!Config.BuildIdLinkDir.empty() && Config.BuildIdLinkInput)		if (!Config.BuildIdLinkDir.empty() && Config.BuildIdLinkInput)
if (Error E =		if (Error E =
linkToBuildIdDir(Config, Config.InputFilename,		linkToBuildIdDir(Config, Config.InputFilename,
Config.BuildIdLinkInput.getValue(), BuildIdBytes))		Config.BuildIdLinkInput.getValue(), BuildIdBytes))
return E;		return E;

if (Error E = handleArgs(Config, *Obj, Reader, OutputElfType))		if (Error E = handleArgs(Config, *Obj, Reader, OutputElfType))
return E;		return E;
std::unique_ptr<Writer> Writer =		if (Error E = writeOutput(Config, *Obj, Out, OutputElfType))
createWriter(Config, *Obj, Out, OutputElfType);
if (Error E = Writer->finalize())
return E;
if (Error E = Writer->write())
return E;		return E;
if (!Config.BuildIdLinkDir.empty() && Config.BuildIdLinkOutput)		if (!Config.BuildIdLinkDir.empty() && Config.BuildIdLinkOutput)
if (Error E =		if (Error E =
linkToBuildIdDir(Config, Config.OutputFilename,		linkToBuildIdDir(Config, Config.OutputFilename,
Config.BuildIdLinkOutput.getValue(), BuildIdBytes))		Config.BuildIdLinkOutput.getValue(), BuildIdBytes))
return E;		return E;

return Error::success();		return Error::success();
}		}

} // end namespace elf		} // end namespace elf
} // end namespace objcopy		} // end namespace objcopy
} // end namespace llvm		} // end namespace llvm

tools/llvm-objcopy/ELF/Object.h

Show All 11 Lines
#include "Buffer.h"		#include "Buffer.h"
#include "CopyConfig.h"		#include "CopyConfig.h"
#include "llvm/ADT/ArrayRef.h"		#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/StringRef.h"		#include "llvm/ADT/StringRef.h"
#include "llvm/ADT/Twine.h"		#include "llvm/ADT/Twine.h"
#include "llvm/BinaryFormat/ELF.h"		#include "llvm/BinaryFormat/ELF.h"
#include "llvm/MC/StringTableBuilder.h"		#include "llvm/MC/StringTableBuilder.h"
#include "llvm/Object/ELFObjectFile.h"		#include "llvm/Object/ELFObjectFile.h"
		#include "llvm/Support/Errc.h"
#include "llvm/Support/FileOutputBuffer.h"		#include "llvm/Support/FileOutputBuffer.h"
#include "llvm/Support/JamCRC.h"		#include "llvm/Support/JamCRC.h"
#include <cstddef>		#include <cstddef>
#include <cstdint>		#include <cstdint>
#include <functional>		#include <functional>
#include <memory>		#include <memory>
#include <set>		#include <set>
#include <vector>		#include <vector>
▲ Show 20 Lines • Show All 136 Lines • ▼ Show 20 Lines	public:
void visit(GroupSection &Sec) override;		void visit(GroupSection &Sec) override;
void visit(SectionIndexSection &Sec) override;		void visit(SectionIndexSection &Sec) override;
void visit(CompressedSection &Sec) override;		void visit(CompressedSection &Sec) override;
void visit(DecompressedSection &Sec) override;		void visit(DecompressedSection &Sec) override;
};		};

#define MAKE_SEC_WRITER_FRIEND \		#define MAKE_SEC_WRITER_FRIEND \
friend class SectionWriter; \		friend class SectionWriter; \
		friend class IHexSectionWriterBase; \
		friend class IHexSectionWriter; \
template <class ELFT> friend class ELFSectionWriter; \		template <class ELFT> friend class ELFSectionWriter; \
template <class ELFT> friend class ELFSectionSizer;		template <class ELFT> friend class ELFSectionSizer;

class BinarySectionWriter : public SectionWriter {		class BinarySectionWriter : public SectionWriter {
public:		public:
virtual ~BinarySectionWriter() {}		virtual ~BinarySectionWriter() {}

void visit(const SymbolTableSection &Sec) override;		void visit(const SymbolTableSection &Sec) override;
void visit(const RelocationSection &Sec) override;		void visit(const RelocationSection &Sec) override;
void visit(const GnuDebugLinkSection &Sec) override;		void visit(const GnuDebugLinkSection &Sec) override;
void visit(const GroupSection &Sec) override;		void visit(const GroupSection &Sec) override;
void visit(const SectionIndexSection &Sec) override;		void visit(const SectionIndexSection &Sec) override;
void visit(const CompressedSection &Sec) override;		void visit(const CompressedSection &Sec) override;
void visit(const DecompressedSection &Sec) override;		void visit(const DecompressedSection &Sec) override;

explicit BinarySectionWriter(Buffer &Buf) : SectionWriter(Buf) {}		explicit BinarySectionWriter(Buffer &Buf) : SectionWriter(Buf) {}
};		};

		using IHexLineData = SmallVector<char, 64>;

		struct IHexRecord {
		// Memory address of the record.
		uint16_t Addr;
		// Record type (see below).
		uint16_t Type;
		rupprecht: Why not just uint16_t fields?
		// Record data in hexadecimal form.
		StringRef HexData;

		// Helper method to get file length of the record
		// including newline character
		static size_t getLength(size_t DataSize) {
		// :LLAAAATT[DD...DD]CC'
		return DataSize * 2 + 11;
		}

		// Gets length of line in a file (getLength + CR).
		static size_t getLineLength(size_t DataSize) {
		return getLength(DataSize) + 1;
		}

		// Given type, address and data returns line which can
		// be written to output file.
		static IHexLineData getLine(uint8_t Type, uint16_t Addr,
		ArrayRef<uint8_t> Data);

		// Calculates checksum of stringified record representation
		// S must NOT contain leading ':' and trailing whitespace
		// characters
		static uint8_t getChecksum(StringRef S);

		enum Type {
		// Contains data and a 16-bit starting address for the data.
		// The byte count specifies number of data bytes in the record.
		Data = 0,
		// Must occur exactly once per file in the last line of the file.
		// The data field is empty (thus byte count is 00) and the address
		// field is typically 0000.
		EndOfFile = 1,
		// The data field contains a 16-bit segment base address (thus byte
		// count is always 02) compatible with 80x86 real mode addressing.
		// The address field (typically 0000) is ignored. The segment address
		// from the most recent 02 record is multiplied by 16 and added to each
		// subsequent data record address to form the physical starting address
		// for the data. This allows addressing up to one megabyte of address
		// space.
		SegmentAddr = 2,
		// For 80x86 processors, specifies the initial content of the CS:IP
		// registers. The address field is 0000, the byte count is always 04,
		// the first two data bytes are the CS value, the latter two are the
		// IP value.
		StartAddr80x86 = 3,
		// Allows for 32 bit addressing (up to 4GiB). The record's address field
		// is ignored (typically 0000) and its byte count is always 02. The two
		// data bytes (big endian) specify the upper 16 bits of the 32 bit
		// absolute address for all subsequent type 00 records
		ExtendedAddr = 4,
		// The address field is 0000 (not used) and the byte count is always 04.
		// The four data bytes represent a 32-bit address value. In the case of
		// 80386 and higher CPUs, this address is loaded into the EIP register.
		StartAddr = 5,
		// We have no other valid types
		InvalidType = 6
		};
		};
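
The review suggests validating a line once and then working with decoded fields instead of a hex `StringRef`. A minimal standalone sketch of such a parser is given below. It uses `std::optional` in place of `llvm::Expected`, so it reports failure without an error message. `ParsedRecord`, `nibble` and `parseRecord` are hypothetical names, not the patch's API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical decoded form of one record (not the patch's IHexRecord).
struct ParsedRecord {
  uint16_t Addr;
  uint8_t Type;
  std::vector<uint8_t> Data;
};

// Value of one hex digit, or -1 if C is not a hex digit.
static int nibble(char C) {
  if (C >= '0' && C <= '9') return C - '0';
  if (C >= 'A' && C <= 'F') return C - 'A' + 10;
  if (C >= 'a' && C <= 'f') return C - 'a' + 10;
  return -1;
}

// Parses ":LLAAAATT[DD...]CC" (without line ending). Returns std::nullopt
// for empty, truncated, non-hex, wrong-length, bad-checksum or unknown-type
// input, so callers never see a partially valid record.
static std::optional<ParsedRecord> parseRecord(const std::string &Line) {
  if (Line.size() < 11 || Line[0] != ':' || Line.size() % 2 == 0)
    return std::nullopt;
  std::vector<uint8_t> Bytes;
  for (std::size_t I = 1; I < Line.size(); I += 2) {
    int Hi = nibble(Line[I]);
    int Lo = nibble(Line[I + 1]);
    if (Hi < 0 || Lo < 0)
      return std::nullopt;
    Bytes.push_back(static_cast<uint8_t>(Hi * 16 + Lo));
  }
  // Layout: LL, AA, AA, TT, data bytes, CC. LL must match the data length.
  if (Bytes[0] != Bytes.size() - 5)
    return std::nullopt;
  uint8_t Sum = 0;
  for (uint8_t B : Bytes)
    Sum = static_cast<uint8_t>(Sum + B);
  if (Sum != 0 || Bytes[3] > 5) // Types 00..05 are the only valid ones.
    return std::nullopt;
  ParsedRecord R;
  R.Addr = static_cast<uint16_t>((Bytes[1] << 8) | Bytes[2]);
  R.Type = Bytes[3];
  R.Data.assign(Bytes.begin() + 4, Bytes.end() - 1);
  return R;
}
```

With this shape, an empty line is rejected by the first length check rather than being indexed.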

		// Base class for IHexSectionWriter. This class implements writing algorithm,
		// but doesn't actually write records. It is used for output buffer size
		// calculation in IHexWriter::finalize.
		class IHexSectionWriterBase : public BinarySectionWriter {
		// 20-bit segment address
		uint32_t SegmentAddr = 0;
		// Extended linear address
		uint32_t BaseAddr = 0;
		jakehehrlich: What's the point in splitting this into two classes? Also, does inheriting from BinarySectionWriter make sense? The same visitors will need to be implemented; I would suppose the offsets and everything would be very different.
		evgeny777 (author): I inherited IHexSectionWriter from BinarySectionWriter in order to reuse the visitors for RelocationSection, GnuDebugLinkSection, etc., which will never go to IHEX or binary output. Are you suggesting duplication?

		// Write segment address corresponding to 'Addr'
		uint64_t writeSegmentAddr(uint64_t Addr);
		// Write extended linear (base) address corresponding to 'Addr'
		uint64_t writeBaseAddr(uint64_t Addr);

		protected:
		// Offset in the output buffer
		uint64_t Offset = 0;

		void writeSection(const SectionBase *Sec, ArrayRef<uint8_t> Data);
		virtual void writeData(uint8_t Type, uint16_t Addr, ArrayRef<uint8_t> Data);

		public:
		explicit IHexSectionWriterBase(Buffer &Buf) : BinarySectionWriter(Buf) {}

		uint64_t getBufferOffset() const { return Offset; }
		void visit(const Section &Sec) final;
		void visit(const OwnedDataSection &Sec) final;
		void visit(const StringTableSection &Sec) override;
		void visit(const DynamicRelocationSection &Sec) final;
		using BinarySectionWriter::visit;
		};

		// Real IHEX section writer
		class IHexSectionWriter : public IHexSectionWriterBase {
		public:
		IHexSectionWriter(Buffer &Buf) : IHexSectionWriterBase(Buf) {}

		void writeData(uint8_t Type, uint16_t Addr, ArrayRef<uint8_t> Data) override;
		jakehehrlich: In general I think it might be worth considering whether there is a need to use sections at all. Originally with the binary writer we only used program headers. It turns out that people did a lot of stuff in a really odd way with GNU objcopy when using -O binary that required that we use sections as the primary basis for output. I would imagine that ihex users would not be doing the same sorts of odd tricks and that you could write the output straight from program headers. This would simplify the implementation greatly, I think, and harden the implementation against all sorts of odd corner cases.
		evgeny777 (author): AFAIU one can't do this with IHEX, because unlike binary, IHEX is not a contiguous blob, e.g. you can have a gap between sections which won't go to the output file.
		void visit(const StringTableSection &Sec) override;
		};

class Writer {		class Writer {
protected:		protected:
Object &Obj;		Object &Obj;
Buffer &Buf;		Buffer &Buf;

public:		public:
virtual ~Writer();		virtual ~Writer();
virtual Error finalize() = 0;		virtual Error finalize() = 0;
▲ Show 20 Lines • Show All 43 Lines • ▼ Show 20 Lines

public:		public:
~BinaryWriter() {}		~BinaryWriter() {}
Error finalize() override;		Error finalize() override;
Error write() override;		Error write() override;
BinaryWriter(Object &Obj, Buffer &Buf) : Writer(Obj, Buf) {}		BinaryWriter(Object &Obj, Buffer &Buf) : Writer(Obj, Buf) {}
};		};

		class IHexWriter : public Writer {
		struct SectionCompare {
		bool operator()(const SectionBase *Lhs, const SectionBase *Rhs) const;
		};

		std::set<const SectionBase *, SectionCompare> Sections;
		size_t TotalSize;

		Error checkSection(const SectionBase &Sec);
		uint64_t writeEntryPointRecord(uint8_t *Buf);
		uint64_t writeEndOfFileRecord(uint8_t *Buf);

		public:
		~IHexWriter() {}
		Error finalize() override;
		Error write() override;
		IHexWriter(Object &Obj, Buffer &Buf) : Writer(Obj, Buf) {}
		};

class SectionBase {		class SectionBase {
public:		public:
std::string Name;		std::string Name;
Segment *ParentSegment = nullptr;		Segment *ParentSegment = nullptr;
uint64_t HeaderOffset;		uint64_t HeaderOffset;
uint64_t OriginalOffset = std::numeric_limits<uint64_t>::max();		uint64_t OriginalOffset = std::numeric_limits<uint64_t>::max();
uint32_t Index;		uint32_t Index;
bool HasSymbol = false;		bool HasSymbol = false;
▲ Show 20 Lines • Show All 100 Lines • ▼ Show 20 Lines	public:
OwnedDataSection(StringRef SecName, ArrayRef<uint8_t> Data)		OwnedDataSection(StringRef SecName, ArrayRef<uint8_t> Data)
: Data(std::begin(Data), std::end(Data)) {		: Data(std::begin(Data), std::end(Data)) {
Name = SecName.str();		Name = SecName.str();
Type = ELF::SHT_PROGBITS;		Type = ELF::SHT_PROGBITS;
Size = Data.size();		Size = Data.size();
OriginalOffset = std::numeric_limits<uint64_t>::max();		OriginalOffset = std::numeric_limits<uint64_t>::max();
}		}

		OwnedDataSection(const Twine &SecName, uint64_t SecAddr, uint64_t SecFlags,
		uint64_t SecOff) {
		Name = SecName.str();
		Type = ELF::SHT_PROGBITS;
		Addr = SecAddr;
		Flags = SecFlags;
		OriginalOffset = SecOff;
		}

		void appendHexData(StringRef HexData);
		jakehehrlich: Why do we need this to output a new format?
		evgeny777 (author): This function takes a portion of hexadecimal data from an IHexRecord and appends it in binary form to the internal vector of OwnedDataSection.
void accept(SectionVisitor &Sec) const override;		void accept(SectionVisitor &Sec) const override;
void accept(MutableSectionVisitor &Visitor) override;		void accept(MutableSectionVisitor &Visitor) override;
};		};

class CompressedSection : public SectionBase {		class CompressedSection : public SectionBase {
MAKE_SEC_WRITER_FRIEND		MAKE_SEC_WRITER_FRIEND

DebugCompressionType CompressionType;		DebugCompressionType CompressionType;
▲ Show 20 Lines • Show All 394 Lines • ▼ Show 20 Lines	class BinaryReader : public Reader {
MemoryBuffer *MemBuf;		MemoryBuffer *MemBuf;

public:		public:
BinaryReader(const MachineInfo &MI, MemoryBuffer *MB)		BinaryReader(const MachineInfo &MI, MemoryBuffer *MB)
: MInfo(MI), MemBuf(MB) {}		: MInfo(MI), MemBuf(MB) {}
std::unique_ptr<Object> create() const override;		std::unique_ptr<Object> create() const override;
};		};

class ELFReader : public Reader {		class ELFReader : public Reader {
		jakehehrlich: We can probably split this into two changes to make things smaller, one for reading and one for writing, yeah?
		evgeny777 (author): I think so.
Binary *Bin;		Binary *Bin;

public:		public:
std::unique_ptr<Object> create() const override;		std::unique_ptr<Object> create() const override;
explicit ELFReader(Binary *B) : Bin(B) {}		explicit ELFReader(Binary *B) : Bin(B) {}
};		};

class Object {		class Object {
▲ Show 20 Lines • Show All 76 Lines • Show Last 20 Lines

tools/llvm-objcopy/ELF/Object.cpp

Show All 11 Lines
#include "llvm/ADT/STLExtras.h"		#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/StringRef.h"		#include "llvm/ADT/StringRef.h"
#include "llvm/ADT/Twine.h"		#include "llvm/ADT/Twine.h"
#include "llvm/ADT/iterator_range.h"		#include "llvm/ADT/iterator_range.h"
#include "llvm/BinaryFormat/ELF.h"		#include "llvm/BinaryFormat/ELF.h"
#include "llvm/MC/MCTargetOptions.h"		#include "llvm/MC/MCTargetOptions.h"
#include "llvm/Object/ELFObjectFile.h"		#include "llvm/Object/ELFObjectFile.h"
#include "llvm/Support/Compression.h"		#include "llvm/Support/Compression.h"
#include "llvm/Support/Errc.h"		#include "llvm/Support/Endian.h"
#include "llvm/Support/ErrorHandling.h"		#include "llvm/Support/ErrorHandling.h"
#include "llvm/Support/FileOutputBuffer.h"		#include "llvm/Support/FileOutputBuffer.h"
#include "llvm/Support/Path.h"		#include "llvm/Support/Path.h"
#include <algorithm>		#include <algorithm>
#include <cstddef>		#include <cstddef>
#include <cstdint>		#include <cstdint>
#include <iterator>		#include <iterator>
#include <unordered_set>		#include <unordered_set>
▲ Show 20 Lines • Show All 116 Lines • ▼ Show 20 Lines

void SectionWriter::visit(const Section &Sec) {		void SectionWriter::visit(const Section &Sec) {
if (Sec.Type == SHT_NOBITS)		if (Sec.Type == SHT_NOBITS)
return;		return;
uint8_t *Buf = Out.getBufferStart() + Sec.Offset;		uint8_t *Buf = Out.getBufferStart() + Sec.Offset;
llvm::copy(Sec.Contents, Buf);		llvm::copy(Sec.Contents, Buf);
}		}

		static bool addressOverflows32bit(uint64_t Addr) {
		// Sign extended 32 bit addresses (e.g 0xFFFFFFFF80000000) are ok
		return Addr > UINT32_MAX && Addr + 0x80000000 > UINT32_MAX;
		rupprecht: Isn't relying on `Addr + 0x80000000` to loop around UB? Could this just directly check `Addr & 0xffffffff80000000 == 0xffffffff80000000` instead?
		evgeny777 (author): As far as I know, unsigned overflows (unlike signed) are not UB. Addr is uint64_t.
		}
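
As the author notes, unsigned arithmetic wraps modulo 2^64 and is well defined, so the check above is valid. The mask-based form proposed in the review gives the same answer once it is parenthesised, since `==` binds tighter than `&` in C++. A standalone sketch comparing both is shown here. `addressOverflows32bitMasked` is a hypothetical name for the alternative and is not in the patch.

```cpp
#include <cassert>
#include <cstdint>

// Same expression as in the patch, reproduced standalone. The addition is
// done in uint64_t, so wraparound is well defined.
static bool addressOverflows32bit(uint64_t Addr) {
  // Sign extended 32 bit addresses (e.g. 0xFFFFFFFF80000000) are ok.
  return Addr > UINT32_MAX && Addr + 0x80000000 > UINT32_MAX;
}

// Hypothetical mask-based alternative; note the parentheses around '&'.
static bool addressOverflows32bitMasked(uint64_t Addr) {
  const uint64_t High = 0xFFFFFFFF80000000ULL;
  bool FitsUnsigned = Addr <= UINT32_MAX;
  bool IsSignExtended = (Addr & High) == High;
  return !FitsUnsigned && !IsSignExtended;
}
```

Both functions accept 0xFFFFFFFF80000000 and reject 0x100000000, matching the examples given in the review thread.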

		template <class T> static T checkedGetHex(StringRef S) {
		jakehehrlich: Unless a check that generates an error always precedes this, I think it's best to return an error in this case, not assert fail. It would be better to roll this into an Expected function in that case anyway, I think.
		evgeny777 (author): This function can't actually return an error, because the string has been previously validated (see `checkChars` for example). IMO, it's bad practice to implement runtime checks for one's own logical errors.
		T Value;
		bool Fail = S.getAsInteger(16, Value);
		assert(!Fail);
		rupprecht: `Fail` is unused in release builds, so you need to add a `(void)Fail;` to silence the error/warning in release builds.
		(void)Fail;
		return Value;
		}

		// Fills exactly Len bytes of buffer with hexadecimal characters
		// representing value 'X'
		template <class T, class Iterator>
		static Iterator utohexstr(T X, Iterator It, size_t Len) {
		// Fill range with '0'
		std::fill(It, It + Len, '0');

		for (long I = Len - 1; I >= 0; --I) {
		unsigned char Mod = static_cast<unsigned char>(X) & 15;
		*(It + I) = hexdigit(Mod, false);
		X >>= 4;
		}
		assert(X == 0);
		return It + Len;
		}

		uint8_t IHexRecord::getChecksum(StringRef S) {
		assert((S.size() & 1) == 0);
		uint8_t Checksum = 0;
		while (!S.empty()) {
		Checksum += checkedGetHex<uint8_t>(S.take_front(2));
		S = S.drop_front(2);
		}
		return -Checksum;
		}

		IHexLineData IHexRecord::getLine(uint8_t Type, uint16_t Addr,
		ArrayRef<uint8_t> Data) {
		IHexLineData Line(getLineLength(Data.size()));
		assert(Line.size());
		auto Iter = Line.begin();
		*Iter++ = ':';
		Iter = utohexstr(Data.size(), Iter, 2);
		jakehehrlich: Maybe a raw_ostream would be useful here. We've generally avoided them, but this format seems to lend itself to streams, whereas my opinion was the opposite before. You wouldn't need utohexstr since those formatting options are already supplied by the library, I believe.
		evgeny777 (author): This function was optimized to avoid any dynamic allocation (IHexLineData is actually a SmallVector), because each line contains only 16 bytes of section data, so it's possible to have a really huge number of lines. What are the benefits of using raw_ostream?
		Iter = utohexstr(Addr, Iter, 4);
		Iter = utohexstr(Type, Iter, 2);
		for (uint8_t X : Data)
		Iter = utohexstr(X, Iter, 2);
		StringRef S(Line.data() + 1, std::distance(Line.begin() + 1, Iter));
		Iter = utohexstr(getChecksum(S), Iter, 2);
		*Iter++ = '\n';
		rupprecht: It looks like ihex uses `\r\n` line endings 😦 https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=blob;f=bfd/ihex.c;h=101e0a76155fc48f95312c08307739cf9c1ee5eb;hb=HEAD#l752 It seems weird for me to request this, but I think we should write `\r\n`, as this seems like a strange detail that people might need when consuming these files. I don't actually have any examples of this, however.
		evgeny777 (author): Yes, I've seen this also. Nothing is said in the IHEX spec about line endings. Wikipedia says: "Programs that create HEX records typically use line termination characters that conform to the conventions of their operating systems." Probably the easiest thing to do is to stick to GNU behavior. I'll update the patch.
		assert(Iter == Line.end());
		return Line;
		}
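
The record layout and the line-ending question discussed above can be illustrated with a small standalone formatter. This is only a sketch: it builds a `std::string` rather than the allocation-free `SmallVector` used in the patch, `formatRecord` is a hypothetical name, and the `\r\n` default reflects the GNU behaviour mentioned in the thread rather than what this revision of the patch writes.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Formats one record as ":LLAAAATT[DD...]CC" followed by Eol.
static std::string formatRecord(uint8_t Type, uint16_t Addr,
                                const std::vector<uint8_t> &Data,
                                const std::string &Eol = "\r\n") {
  assert(Data.size() <= 0xFF && "byte count must fit in one byte");
  static const char Digits[] = "0123456789ABCDEF";
  // Byte count, big-endian 16-bit address, record type, then the payload.
  std::vector<uint8_t> Bytes = {static_cast<uint8_t>(Data.size()),
                                static_cast<uint8_t>(Addr >> 8),
                                static_cast<uint8_t>(Addr & 0xFF), Type};
  Bytes.insert(Bytes.end(), Data.begin(), Data.end());
  // Checksum is the two's complement of the sum of all preceding bytes.
  uint8_t Sum = 0;
  for (uint8_t B : Bytes)
    Sum = static_cast<uint8_t>(Sum + B);
  Bytes.push_back(static_cast<uint8_t>(0u - Sum));
  std::string Line = ":";
  for (uint8_t B : Bytes) {
    Line += Digits[B >> 4];
    Line += Digits[B & 15];
  }
  Line += Eol;
  return Line;
}
```

Passing `"\n"` as the last argument reproduces the line ending used by the code above, for example when comparing against the test expectations.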

		static uint64_t sectionPhysicalAddr(const SectionBase *Sec) {
		jakehehrlich: This is a very generic name with no comment. In general your comments have been awesome. I'd like to have an idea what this function does without reading the contents.
		Segment *Seg = Sec->ParentSegment;
		if (Seg && Seg->Type != ELF::PT_LOAD)
		Seg = nullptr;
		return Seg ? Seg->PAddr + Sec->OriginalOffset - Seg->OriginalOffset
		: Sec->Addr;
		}

		void IHexSectionWriterBase::writeSection(const SectionBase *Sec,
		ArrayRef<uint8_t> Data) {
		assert(Data.size() == Sec->Size);
		const uint32_t ChunkSize = 16;
		uint32_t Addr = sectionPhysicalAddr(Sec) & 0xFFFFFFFFU;
		while (!Data.empty()) {
		uint64_t DataSize = std::min<uint64_t>(Data.size(), ChunkSize);
		if (Addr > SegmentAddr + BaseAddr + 0xFFFFU) {
		if (Addr > 0xFFFFFU) {
		// Write extended address record, zeroing segment address
		// if needed.
		if (SegmentAddr != 0)
		SegmentAddr = writeSegmentAddr(0U);
		BaseAddr = writeBaseAddr(Addr);
		} else {
		// We can still remain 16-bit
		SegmentAddr = writeSegmentAddr(Addr);
		}
		}
		uint64_t SegOffset = Addr - BaseAddr - SegmentAddr;
		assert(SegOffset <= 0xFFFFU);
		DataSize = std::min(DataSize, 0x10000U - SegOffset);
		writeData(0, SegOffset, Data.take_front(DataSize));
		Addr += DataSize;
		Data = Data.drop_front(DataSize);
		}
		}
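
The address bookkeeping in `writeSection` can be modelled without any I/O. The sketch below mirrors the same decisions and records which address records (type 02 or 04) would be emitted before a data record. It assumes, as the writer does, that addresses are visited in ascending order; `AddrState` and `offsetFor` are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model of the writer's address state.
struct AddrState {
  uint32_t SegmentAddr = 0; // 20-bit segment address (type 02 records)
  uint32_t BaseAddr = 0;    // extended linear address (type 04 records)
};

// Returns the 16-bit offset to put in the data record for Addr. Appends 2 or
// 4 to Emitted for every address record that would be written first.
// Assumes addresses are passed in ascending order.
static uint16_t offsetFor(uint32_t Addr, AddrState &S,
                          std::vector<int> &Emitted) {
  uint64_t Reachable = uint64_t(S.SegmentAddr) + S.BaseAddr + 0xFFFFU;
  if (Addr > Reachable) {
    if (Addr > 0xFFFFFU) {
      // Beyond 1 MiB: switch to extended linear addressing, clearing any
      // active segment address first.
      if (S.SegmentAddr != 0) {
        S.SegmentAddr = 0;
        Emitted.push_back(2);
      }
      S.BaseAddr = Addr & 0xFFFF0000U;
      Emitted.push_back(4);
    } else {
      // Still within 20 bits: a segment address record is enough.
      S.SegmentAddr = Addr & 0xF0000U;
      Emitted.push_back(2);
    }
  }
  return static_cast<uint16_t>(Addr - S.BaseAddr - S.SegmentAddr);
}
```

For address 0x1002F8 this yields one type 04 record with base 0x00100000 and offset 0x02F8, which corresponds to the `:020000040010EA` and `:1002F800...` lines in the SEGMENTS test expectations.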

		uint64_t IHexSectionWriterBase::writeSegmentAddr(uint64_t Addr) {
		assert(Addr <= 0xFFFFFU);
		uint8_t Data[] = {static_cast<uint8_t>((Addr & 0xF0000U) >> 12), 0};
		writeData(2, 0, Data);
		return Addr & 0xF0000U;
		}

		uint64_t IHexSectionWriterBase::writeBaseAddr(uint64_t Addr) {
		assert(Addr <= 0xFFFFFFFFU);
		uint64_t Base = Addr & 0xFFFF0000U;
		uint8_t Data[] = {static_cast<uint8_t>(Base >> 24),
		static_cast<uint8_t>((Base >> 16) & 0xFF)};
		writeData(4, 0, Data);
		return Base;
		}

		void IHexSectionWriterBase::writeData(uint8_t Type, uint16_t Addr,
		ArrayRef<uint8_t> Data) {
		Offset += IHexRecord::getLineLength(Data.size());
		}

		void IHexSectionWriterBase::visit(const Section &Sec) {
		writeSection(&Sec, Sec.Contents);
		}

		void IHexSectionWriterBase::visit(const OwnedDataSection &Sec) {
		writeSection(&Sec, Sec.Data);
		}

		void IHexSectionWriterBase::visit(const StringTableSection &Sec) {
		// Check that sizer has already done its work
		assert(Sec.Size == Sec.StrTabBuilder.getSize());
		// We are free to pass an invalid pointer to writeSection as long
		// as we don't actually write any data. The real writer class has
		// to override this method.
		writeSection(&Sec, {nullptr, Sec.Size});
		}

		void IHexSectionWriterBase::visit(const DynamicRelocationSection &Sec) {
		writeSection(&Sec, Sec.Contents);
		}

		void IHexSectionWriter::writeData(uint8_t Type, uint16_t Addr,
		ArrayRef<uint8_t> Data) {
		IHexLineData HexData = IHexRecord::getLine(Type, Addr, Data);
		memcpy(Out.getBufferStart() + Offset, HexData.data(), HexData.size());
		Offset += HexData.size();
		}

		void IHexSectionWriter::visit(const StringTableSection &Sec) {
		assert(Sec.Size == Sec.StrTabBuilder.getSize());
		std::vector<uint8_t> Data(Sec.Size);
		Sec.StrTabBuilder.write(Data.data());
		writeSection(&Sec, Data);
		jakehehrlich: Does this ever make sense if there is no segment?
		evgeny777 (author): It's a helper function which returns the section VA if there is no segment. Any suggestion for a better name?
		}

void Section::accept(SectionVisitor &Visitor) const { Visitor.visit(*this); }		void Section::accept(SectionVisitor &Visitor) const { Visitor.visit(*this); }

void Section::accept(MutableSectionVisitor &Visitor) { Visitor.visit(*this); }		void Section::accept(MutableSectionVisitor &Visitor) { Visitor.visit(*this); }

void SectionWriter::visit(const OwnedDataSection &Sec) {		void SectionWriter::visit(const OwnedDataSection &Sec) {
uint8_t *Buf = Out.getBufferStart() + Sec.Offset;		uint8_t *Buf = Out.getBufferStart() + Sec.Offset;
llvm::copy(Sec.Data, Buf);		llvm::copy(Sec.Data, Buf);
}		}

static const std::vector<uint8_t> ZlibGnuMagic = {'Z', 'L', 'I', 'B'};		static const std::vector<uint8_t> ZlibGnuMagic = {'Z', 'L', 'I', 'B'};
		jakehehrlich (Not Done): Masking like this seems redundant. In general, the number of places where we're converting from 64 to 32 bits in an unchecked way is really shocking. I'd feel a lot more comfortable if we encapsulated these checks more and made them clearer.
		evgeny777 (Author, Done): There is a checkSections method which checks all sections to detect whether any of them has a 64-bit address. Bear in mind that the implementation also supports sign-extended 32-bit addresses, i.e. 0xFFFFFFFF80000000 is a valid address, but 0x100000000 is not.
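To make the rule in the reply above concrete, here is a minimal standalone sketch of a check that accepts plain 32-bit addresses and sign-extended 32-bit addresses and rejects everything else. The name `fitsIn32BitOrSignExtended` is hypothetical. The patch's own helper is `addressOverflows32bit`, whose body is not shown in this part of the diff, so this illustrates the stated intent and is not a copy of the implementation.

```cpp
#include <cstdint>

// Hypothetical helper, not taken from the patch.
// Returns true if Addr can be expressed as a 32-bit address: either the
// upper 32 bits are zero, or Addr is a sign-extended negative 32-bit value
// (bits 63..31 are all ones).
inline bool fitsIn32BitOrSignExtended(uint64_t Addr) {
  if (Addr <= UINT32_MAX)
    return true;
  // After shifting out the low 31 bits, a sign-extended value leaves
  // exactly 33 one-bits.
  return (Addr >> 31) == 0x1FFFFFFFFULL;
}
```

With this sketch, 0xFFFFFFFF80000000 is accepted and 0x100000000 is rejected, which matches the two examples given in the reply.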

static bool isDataGnuCompressed(ArrayRef<uint8_t> Data) {		static bool isDataGnuCompressed(ArrayRef<uint8_t> Data) {
return Data.size() > ZlibGnuMagic.size() &&		return Data.size() > ZlibGnuMagic.size() &&
		jakehehrlich (Not Done): Maybe we could split support for extended records out into a separate patch and error out here for now?
		evgeny777 (Author, Done): I suggest splitting the reader and the writer instead. To me that looks like a more logical split than removing certain record types.
std::equal(ZlibGnuMagic.begin(), ZlibGnuMagic.end(), Data.data());		std::equal(ZlibGnuMagic.begin(), ZlibGnuMagic.end(), Data.data());
}		}

template <class ELFT>		template <class ELFT>
static std::tuple<uint64_t, uint64_t>		static std::tuple<uint64_t, uint64_t>
getDecompressedSizeAndAlignment(ArrayRef<uint8_t> Data) {		getDecompressedSizeAndAlignment(ArrayRef<uint8_t> Data) {
const bool IsGnuDebug = isDataGnuCompressed(Data);		const bool IsGnuDebug = isDataGnuCompressed(Data);
const uint64_t DecompressedSize =		const uint64_t DecompressedSize =
[42 lines not shown]
void OwnedDataSection::accept(SectionVisitor &Visitor) const {		void OwnedDataSection::accept(SectionVisitor &Visitor) const {
Visitor.visit(*this);		Visitor.visit(*this);
}		}

void OwnedDataSection::accept(MutableSectionVisitor &Visitor) {		void OwnedDataSection::accept(MutableSectionVisitor &Visitor) {
Visitor.visit(*this);		Visitor.visit(*this);
}		}

		void OwnedDataSection::appendHexData(StringRef HexData) {
		assert((HexData.size() & 1) == 0);
		while (!HexData.empty()) {
		Data.push_back(checkedGetHex<uint8_t>(HexData.take_front(2)));
		HexData = HexData.drop_front(2);
		}
		Size = Data.size();
		}

void BinarySectionWriter::visit(const CompressedSection &Sec) {		void BinarySectionWriter::visit(const CompressedSection &Sec) {
error("Cannot write compressed section '" + Sec.Name + "' ");		error("Cannot write compressed section '" + Sec.Name + "' ");
}		}

template <class ELFT>		template <class ELFT>
void ELFSectionWriter<ELFT>::visit(const CompressedSection &Sec) {		void ELFSectionWriter<ELFT>::visit(const CompressedSection &Sec) {
uint8_t *Buf = Out.getBufferStart();		uint8_t *Buf = Out.getBufferStart();
Buf += Sec.Offset;		Buf += Sec.Offset;
[679 lines not shown]	std::unique_ptr<Object> BinaryELFBuilder::build() {

return std::move(Obj);		return std::move(Obj);
}		}

template <class ELFT> void ELFBuilder<ELFT>::setParentSegment(Segment &Child) {		template <class ELFT> void ELFBuilder<ELFT>::setParentSegment(Segment &Child) {
for (auto &Parent : Obj.segments()) {		for (auto &Parent : Obj.segments()) {
// Every segment will overlap with itself but we don't want a segment to		// Every segment will overlap with itself but we don't want a segment to
// be its own parent so we avoid that situation.		// be its own parent so we avoid that situation.
if (&Child != &Parent && segmentOverlapsSegment(Child, Parent)) {		if (&Child != &Parent && segmentOverlapsSegment(Child, Parent)) {
		rupprecht (Not Done): RecAddr should be defined in the loop, where it is used.
// We want a canonical "most parental" segment but this requires		// We want a canonical "most parental" segment but this requires
// inspecting the ParentSegment.		// inspecting the ParentSegment.
if (compareSegmentsByOffset(&Parent, &Child))		if (compareSegmentsByOffset(&Parent, &Child))
if (Child.ParentSegment == nullptr ||		if (Child.ParentSegment == nullptr ||
compareSegmentsByOffset(&Parent, Child.ParentSegment)) {		compareSegmentsByOffset(&Parent, Child.ParentSegment)) {
Child.ParentSegment = &Parent;		Child.ParentSegment = &Parent;
}		}
}		}
[326 lines not shown]
Writer::~Writer() {}		Writer::~Writer() {}

Reader::~Reader() {}		Reader::~Reader() {}

std::unique_ptr<Object> BinaryReader::create() const {		std::unique_ptr<Object> BinaryReader::create() const {
return BinaryELFBuilder(MInfo.EMachine, MemBuf).build();		return BinaryELFBuilder(MInfo.EMachine, MemBuf).build();
}		}

std::unique_ptr<Object> ELFReader::create() const {		std::unique_ptr<Object> ELFReader::create() const {
auto Obj = llvm::make_unique<Object>();		auto Obj = llvm::make_unique<Object>();
		rupprecht (Not Done): I think this will crash (or be UB) on an empty line?
		evgeny777 (Author, Done): The line is checked for minimal valid length earlier in the code. Still, it makes sense to assert `!Line.empty()` here.
if (auto *O = dyn_cast<ELFObjectFile<ELF32LE>>(Bin)) {		if (auto *O = dyn_cast<ELFObjectFile<ELF32LE>>(Bin)) {
ELFBuilder<ELF32LE> Builder(O, Obj);		ELFBuilder<ELF32LE> Builder(O, Obj);
Builder.build();		Builder.build();
return Obj;		return Obj;
} else if (auto *O = dyn_cast<ELFObjectFile<ELF64LE>>(Bin)) {		} else if (auto *O = dyn_cast<ELFObjectFile<ELF64LE>>(Bin)) {
ELFBuilder<ELF64LE> Builder(O, Obj);		ELFBuilder<ELF64LE> Builder(O, Obj);
Builder.build();		Builder.build();
return Obj;		return Obj;
} else if (auto *O = dyn_cast<ELFObjectFile<ELF32BE>>(Bin)) {		} else if (auto *O = dyn_cast<ELFObjectFile<ELF32BE>>(Bin)) {
		rupprecht (Not Done): WDYT about just using llvm::Regex here instead of this method? The code may be easier to read if it just attempts to match ":[0-9A-F]+". It would produce less precise error messages, though.
		evgeny777 (Author, Done): I think a precise error message is more important. In some cases it can be hard to identify the wrong character, e.g. `I` instead of `1`, `O` instead of `0`, Russian `A` instead of English `A`, and so on.
ELFBuilder<ELF32BE> Builder(O, Obj);		ELFBuilder<ELF32BE> Builder(O, Obj);
Builder.build();		Builder.build();
return Obj;		return Obj;
} else if (auto *O = dyn_cast<ELFObjectFile<ELF64BE>>(Bin)) {		} else if (auto *O = dyn_cast<ELFObjectFile<ELF64BE>>(Bin)) {
ELFBuilder<ELF64BE> Builder(O, Obj);		ELFBuilder<ELF64BE> Builder(O, Obj);
Builder.build();		Builder.build();
return Obj;		return Obj;
}		}
error("Invalid file type");		error("Invalid file type");
}		}
		rupprecht (Not Done): I think there should be validation (somewhere) that there are no more records after this.
		evgeny777 (Author, Done): I think that an `EndOfFile` record should unconditionally cancel further processing. This allows moving the EOF record within a file to temporarily prevent some of the records from loading, which can be useful for testing. Also, it seems GNU objcopy behaves this way.
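The behaviour described in the reply above (stop at the first End Of File record and ignore whatever follows) can be sketched as below. This is a hypothetical illustration and not the patch's reader: `countRecordsBeforeEOF` is an invented name, and the sketch assumes every line has already been validated as `:LLAAAATT...CC`.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical sketch, not taken from the patch.
// Returns how many records precede the first End Of File record (type 01).
// Any lines after that record are ignored rather than reported as errors.
inline std::size_t
countRecordsBeforeEOF(const std::vector<std::string> &Lines) {
  std::size_t Count = 0;
  for (const std::string &Line : Lines) {
    // Characters 7 and 8 (0-based) hold the record type as two hex digits.
    if (Line.size() >= 9 && Line.compare(7, 2, "01") == 0)
      break;
    ++Count;
  }
  return Count;
}
```

Moving the `:00000001FF` line earlier in a file then simply shortens what is loaded, which is the testing use case mentioned in the reply. The stricter alternative suggested by the reviewer would return an error if any line follows the EOF record.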

template <class ELFT> void ELFWriter<ELFT>::writeEhdr() {		template <class ELFT> void ELFWriter<ELFT>::writeEhdr() {
uint8_t *B = Buf.getBufferStart();		uint8_t *B = Buf.getBufferStart();
Elf_Ehdr &Ehdr = *reinterpret_cast<Elf_Ehdr *>(B);		Elf_Ehdr &Ehdr = *reinterpret_cast<Elf_Ehdr *>(B);
std::fill(Ehdr.e_ident, Ehdr.e_ident + 16, 0);		std::fill(Ehdr.e_ident, Ehdr.e_ident + 16, 0);
Ehdr.e_ident[EI_MAG0] = 0x7f;		Ehdr.e_ident[EI_MAG0] = 0x7f;
Ehdr.e_ident[EI_MAG1] = 'E';		Ehdr.e_ident[EI_MAG1] = 'E';
Ehdr.e_ident[EI_MAG2] = 'L';		Ehdr.e_ident[EI_MAG2] = 'L';
[23 lines not shown]	if (WriteSectionHeaders && Obj.sections().size() != 0) {
// If the number of sections is greater than or equal to		// If the number of sections is greater than or equal to
// SHN_LORESERVE (0xff00), this member has the value zero and the actual		// SHN_LORESERVE (0xff00), this member has the value zero and the actual
// number of section header table entries is contained in the sh_size field		// number of section header table entries is contained in the sh_size field
// of the section header at index 0.		// of the section header at index 0.
// """		// """
auto Shnum = Obj.sections().size() + 1;		auto Shnum = Obj.sections().size() + 1;
if (Shnum >= SHN_LORESERVE)		if (Shnum >= SHN_LORESERVE)
Ehdr.e_shnum = 0;		Ehdr.e_shnum = 0;
else		else
		rupprecht (Not Done): As a tiny optimization, call Records.reserve(Lines.size()) once you know how many lines there are.
Ehdr.e_shnum = Shnum;		Ehdr.e_shnum = Shnum;
// """		// """
// If the section name string table section index is greater than or equal		// If the section name string table section index is greater than or equal
// to SHN_LORESERVE (0xff00), this member has the value SHN_XINDEX (0xffff)		// to SHN_LORESERVE (0xff00), this member has the value SHN_XINDEX (0xffff)
// and the actual index of the section name string table section is		// and the actual index of the section name string table section is
// contained in the sh_link field of the section header at index 0.		// contained in the sh_link field of the section header at index 0.
// """		// """
if (Obj.SectionNames->Index >= SHN_LORESERVE)		if (Obj.SectionNames->Index >= SHN_LORESERVE)
Ehdr.e_shstrndx = SHN_XINDEX;		Ehdr.e_shstrndx = SHN_XINDEX;
else		else
Ehdr.e_shstrndx = Obj.SectionNames->Index;		Ehdr.e_shstrndx = Obj.SectionNames->Index;
} else {		} else {
Ehdr.e_shentsize = 0;		Ehdr.e_shentsize = 0;
Ehdr.e_shoff = 0;		Ehdr.e_shoff = 0;
Ehdr.e_shnum = 0;		Ehdr.e_shnum = 0;
Ehdr.e_shstrndx = 0;		Ehdr.e_shstrndx = 0;
}		}
}		}

template <class ELFT> void ELFWriter<ELFT>::writePhdrs() {		template <class ELFT> void ELFWriter<ELFT>::writePhdrs() {
for (auto &Seg : Obj.segments())		for (auto &Seg : Obj.segments())
writePhdr(Seg);		writePhdr(Seg);
		rupprecht (Not Done): Once we've validated it, can we convert the whole hex string to separate ArrayRef<uint8/16_t> fields for each record, so we don't have to worry about it being valid everywhere (i.e. using checkedGetHex)?
		evgeny777 (Author, Done): It's possible, but I don't see a straightforward way to do this without dynamic memory allocation. Since we check the string with `checkChars`, we shouldn't really hit a conversion error unless something really weird happens.
}		}

template <class ELFT> void ELFWriter<ELFT>::writeShdrs() {		template <class ELFT> void ELFWriter<ELFT>::writeShdrs() {
uint8_t *B = Buf.getBufferStart() + Obj.SHOffset;		uint8_t *B = Buf.getBufferStart() + Obj.SHOffset;
// This reference serves to write the dummy section header at the beginning		// This reference serves to write the dummy section header at the beginning
		rupprecht (Not Done): How about creating a static method to convert a line into an Expected<IHexRecord>, so we can return an error if it's invalid instead of making the user call getChecksum/checkRecord?
// of the file. It is not used for anything else		// of the file. It is not used for anything else
Elf_Shdr &Shdr = *reinterpret_cast<Elf_Shdr *>(B);		Elf_Shdr &Shdr = *reinterpret_cast<Elf_Shdr *>(B);
Shdr.sh_name = 0;		Shdr.sh_name = 0;
Shdr.sh_type = SHT_NULL;		Shdr.sh_type = SHT_NULL;
Shdr.sh_flags = 0;		Shdr.sh_flags = 0;
Shdr.sh_addr = 0;		Shdr.sh_addr = 0;
Shdr.sh_offset = 0;		Shdr.sh_offset = 0;
// See writeEhdr for why we do this.		// See writeEhdr for why we do this.
[439 lines not shown]	Error BinaryWriter::finalize() {
}		}

if (Error E = Buf.allocate(TotalSize))		if (Error E = Buf.allocate(TotalSize))
return E;		return E;
SecWriter = llvm::make_unique<BinarySectionWriter>(Buf);		SecWriter = llvm::make_unique<BinarySectionWriter>(Buf);
return Error::success();		return Error::success();
}		}

		bool IHexWriter::SectionCompare::operator()(const SectionBase *Lhs,
		const SectionBase *Rhs) const {
		return (sectionPhysicalAddr(Lhs) & 0xFFFFFFFFU) <
		(sectionPhysicalAddr(Rhs) & 0xFFFFFFFFU);
		}

		uint64_t IHexWriter::writeEntryPointRecord(uint8_t *Buf) {
		IHexLineData HexData;
		uint8_t Data[4] = {};
		if (Obj.Entry <= 0xFFFFFU) {
		Data[0] = ((Obj.Entry & 0xF0000U) >> 12) & 0xFF;
		support::endian::write(&Data[2], static_cast<uint16_t>(Obj.Entry),
		support::big);
		HexData = IHexRecord::getLine(IHexRecord::StartAddr80x86, 0, Data);
		} else {
		support::endian::write(Data, static_cast<uint32_t>(Obj.Entry),
		support::big);
		HexData = IHexRecord::getLine(IHexRecord::StartAddr, 0, Data);
		}
		memcpy(Buf, HexData.data(), HexData.size());
		return HexData.size();
		}
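For entry points that fit in 20 bits, `writeEntryPointRecord` above emits a `StartAddr80x86` record whose four data bytes are a real-mode CS:IP pair. The following sketch shows the same split as plain arithmetic. `splitEntry80x86` is a hypothetical name used only for illustration and assumes `Entry <= 0xFFFFF`.

```cpp
#include <cstdint>
#include <utility>

// Hypothetical helper, not taken from the patch.
// Splits a real-mode entry point (at most 0xFFFFF) into the CS:IP pair
// stored big-endian in the record: bits 16..19 go into the top nibble of CS,
// and the low 16 bits become IP, so that CS * 16 + IP == Entry.
inline std::pair<uint16_t, uint16_t> splitEntry80x86(uint32_t Entry) {
  uint16_t CS = static_cast<uint16_t>((Entry & 0xF0000U) >> 4);
  uint16_t IP = static_cast<uint16_t>(Entry & 0xFFFFU);
  return {CS, IP};
}
```

For `Entry = 0x12345` this gives CS = 0x1000 and IP = 0x2345. That agrees with the patch, where `Data[0]` becomes 0x10, `Data[1]` stays 0, and `Data[2..3]` hold 0x2345.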

		uint64_t IHexWriter::writeEndOfFileRecord(uint8_t *Buf) {
		IHexLineData HexData = IHexRecord::getLine(IHexRecord::EndOfFile, 0, {});
		memcpy(Buf, HexData.data(), HexData.size());
		return HexData.size();
		}
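Both record writers above delegate to `IHexRecord::getLine`, whose body is not shown in this part of the diff. As a rough illustration of the Intel HEX line layout (`:`, byte count, 16-bit address, record type, data, checksum), here is a standalone sketch. `makeIHexLine` is a hypothetical name. It omits the line terminator and assumes at most 255 data bytes, so it should not be read as the patch's implementation.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper, not the patch's IHexRecord::getLine.
// Builds ":LLAAAATT<data>CC" in upper-case hex without a line terminator.
// Assumes Data holds at most 255 bytes.
inline std::string makeIHexLine(uint8_t Type, uint16_t Addr,
                                const std::vector<uint8_t> &Data) {
  static const char Digits[] = "0123456789ABCDEF";
  std::vector<uint8_t> Bytes = {static_cast<uint8_t>(Data.size()),
                                static_cast<uint8_t>(Addr >> 8),
                                static_cast<uint8_t>(Addr & 0xFF), Type};
  Bytes.insert(Bytes.end(), Data.begin(), Data.end());

  // Checksum: two's complement of the low byte of the sum of all bytes,
  // so that the sum of every byte in the record is 0 modulo 256.
  unsigned Sum = 0;
  for (uint8_t B : Bytes)
    Sum += B;
  Bytes.push_back(static_cast<uint8_t>(0x100 - (Sum & 0xFF)));

  std::string Line = ":";
  for (uint8_t B : Bytes) {
    Line += Digits[B >> 4];
    Line += Digits[B & 0xF];
  }
  return Line;
}
```

Calling `makeIHexLine(0x01, 0, {})` yields `:00000001FF`, the usual End Of File record, which is the content `writeEndOfFileRecord` is expected to produce (plus whatever line ending the patch uses).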

		Error IHexWriter::write() {
		IHexSectionWriter Writer(Buf);
		// Write sections.
		for (const SectionBase *Sec : Sections)
		Sec->accept(Writer);

		uint64_t Offset = Writer.getBufferOffset();
		// Write entry point address.
		Offset += writeEntryPointRecord(Buf.getBufferStart() + Offset);
		// Write EOF.
		Offset += writeEndOfFileRecord(Buf.getBufferStart() + Offset);
		assert(Offset == TotalSize);
		return Buf.commit();
		}

		Error IHexWriter::checkSection(const SectionBase &Sec) {
		uint64_t Addr = sectionPhysicalAddr(&Sec);
		if (addressOverflows32bit(Addr) || addressOverflows32bit(Addr + Sec.Size - 1))
		return createStringError(
		errc::invalid_argument,
		"Section '%s' address range [%p, %p] is not 32 bit", Sec.Name.c_str(),
		Addr, Addr + Sec.Size - 1);
		return Error::success();
		}

		Error IHexWriter::finalize() {
		bool UseSegments = false;
		auto ShouldWrite = [](const SectionBase &Sec) {
		return (Sec.Flags & ELF::SHF_ALLOC) && (Sec.Type != ELF::SHT_NOBITS);
		};
		auto IsInPtLoad = [](const SectionBase &Sec) {
		return Sec.ParentSegment && Sec.ParentSegment->Type == ELF::PT_LOAD;
		};

		// We can't write 64-bit addresses.
		if (addressOverflows32bit(Obj.Entry))
		return createStringError(errc::invalid_argument,
		"Entry point address %p overflows 32 bits.",
		Obj.Entry);

		// If any section we're to write has segment then we
		// switch to using physical addresses. Otherwise we
		// use section virtual address.
		for (auto &Section : Obj.sections())
		if (ShouldWrite(Section) && IsInPtLoad(Section)) {
		UseSegments = true;
		break;
		}

		for (auto &Section : Obj.sections())
		if (ShouldWrite(Section) && (!UseSegments || IsInPtLoad(Section))) {
		if (Error E = checkSection(Section))
		return E;
		Sections.insert(&Section);
		}

		IHexSectionWriterBase LengthCalc(Buf);
		for (const SectionBase *Sec : Sections)
		Sec->accept(LengthCalc);

		// We need space to write section records + StartAddress record +
		// EndOfFile record.
		TotalSize = LengthCalc.getBufferOffset() + IHexRecord::getLineLength(4) +
		IHexRecord::getLineLength(0);
		if (Error E = Buf.allocate(TotalSize))
		return E;
		return Error::success();
		}

template class ELFBuilder<ELF64LE>;		template class ELFBuilder<ELF64LE>;
template class ELFBuilder<ELF64BE>;		template class ELFBuilder<ELF64BE>;
template class ELFBuilder<ELF32LE>;		template class ELFBuilder<ELF32LE>;
template class ELFBuilder<ELF32BE>;		template class ELFBuilder<ELF32BE>;

template class ELFWriter<ELF64LE>;		template class ELFWriter<ELF64LE>;
template class ELFWriter<ELF64BE>;		template class ELFWriter<ELF64BE>;
template class ELFWriter<ELF32LE>;		template class ELFWriter<ELF32LE>;
template class ELFWriter<ELF32BE>;		template class ELFWriter<ELF32BE>;

} // end namespace elf		} // end namespace elf
} // end namespace objcopy		} // end namespace objcopy
} // end namespace llvm		} // end namespace llvm