This is an archive of the discontinued LLVM Phabricator instance.

I ran this on some internal ihex files we have, and it seems to be dropping one of the sections, so I'll have to poke around to see what the bug is. But some ihex support is better than none, so it's not necessarily blocking :)

test/tools/llvm-objcopy/ELF/ihex-reader.test
2–4	The only ihex reading test (besides the error cases) is just consuming ihex that llvm-objcopy produces with -O ihex. Could you add a .hex test file for a more stable test? And then it would be easier to verify, e.g., llvm-objcopy -I ihex -O binary <test> is identical to objcopy -I ihex -O binary <test>
tools/llvm-objcopy/ELF/ELFObjcopy.cpp
638	A more succinct way (and same in executeObjcopyOnRawBinary): const ElfType OutputElfType = getOutputElfType( Config.OutputArch.getValueOr(Config.BinaryArch));
tools/llvm-objcopy/ELF/Object.cpp
159–160	`Fail` is unused in release builds, so you need to add a `(void)Fail;` to silence the error/warning in release builds.
1050	RecAddr should be defined in the loop, where it is used
1450–1460	WDYT about just using llvm::Regex here instead of this method? It may be easier to read code if it just attempts to match ":[0-9A-F]+". It would produce less precise error messages, though.
1451	I think this will crash (or be UB) on an empty line?
1469–1470	I think there should be validation (somewhere) that there are no more records after this
1510	as a tiny optimization, call Records.reserve(Lines.size()) once you know how many lines there are.
1525–1532	Once we've validated it, can we convert the whole hex string to separate ArrayRef<uint8/16_t> fields for each record, so we don't have to worry about it being valid everywhere (i.e. using checkedGetHex)?
1534–1537	How about creating a static method to convert a line into an Expected<IHexRecord>, so we can return an error if it's invalid instead of making the user call getChecksum/checkRecord?
tools/llvm-objcopy/ELF/Object.h
195–198	Why not just uint16_t fields?
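
The suggestion above, turning a line into either a record or an error in one step, can be sketched roughly as follows. This is only an illustration: `HexRecord`, `ParseResult`, and `parseHexLine` are hypothetical names, and a plain struct stands in for `Expected<IHexRecord>` so the example does not depend on LLVM headers. The error strings are shortened versions of the ones checked in the reader test.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical record type; field names are illustrative only.
struct HexRecord {
  uint8_t Type = 0;
  uint16_t Addr = 0;
  std::vector<uint8_t> Data;
};

// Stand-in for Expected<IHexRecord>: either Ok with a record, or an error text.
struct ParseResult {
  bool Ok;
  HexRecord Record;
  std::string Error;
};

static int hexDigit(char C) {
  if (C >= '0' && C <= '9') return C - '0';
  if (C >= 'A' && C <= 'F') return C - 'A' + 10;
  if (C >= 'a' && C <= 'f') return C - 'a' + 10;
  return -1;
}

ParseResult parseHexLine(const std::string &Line) {
  // Shortest valid line: ':' + count + address + type + checksum = 11 chars.
  // This also rejects an empty line before Line[0] is touched.
  if (Line.size() < 11)
    return {false, HexRecord(), "line is too short"};
  if (Line[0] != ':')
    return {false, HexRecord(), "missing ':' at the beginning of line"};
  if (Line.size() % 2 == 0)
    return {false, HexRecord(), "invalid line length"};

  // Decode every pair of hex digits after the colon into one byte.
  std::vector<uint8_t> Bytes;
  for (std::size_t I = 1; I < Line.size(); I += 2) {
    int Hi = hexDigit(Line[I]), Lo = hexDigit(Line[I + 1]);
    if (Hi < 0 || Lo < 0)
      return {false, HexRecord(), "invalid character"};
    Bytes.push_back(static_cast<uint8_t>(Hi * 16 + Lo));
  }

  // Layout: count, address high, address low, type, data..., checksum.
  if (Bytes.size() != static_cast<std::size_t>(Bytes[0]) + 5)
    return {false, HexRecord(), "invalid line length"};

  // All bytes including the checksum must sum to zero modulo 256.
  uint8_t Sum = 0;
  for (uint8_t B : Bytes)
    Sum = static_cast<uint8_t>(Sum + B);
  if (Sum != 0)
    return {false, HexRecord(), "incorrect checksum"};

  HexRecord R;
  R.Addr = static_cast<uint16_t>((Bytes[1] << 8) | Bytes[2]);
  R.Type = Bytes[3];
  R.Data.assign(Bytes.begin() + 4, Bytes.end() - 1);
  return {true, R, ""};
}
```

With this shape the caller never sees a half-validated record: `parseHexLine(":00000001FF")` yields an end-of-file record, while `parseHexLine(":00000001FA")` yields only an error.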

evgeny777 marked 4 inline comments as done. Apr 5 2019, 10:36 AM

evgeny777 added inline comments.

tools/llvm-objcopy/ELF/Object.cpp
1450–1460	I think that a precise error message is more important. In some cases it can be hard to spot the wrong character, e.g. `I` instead of `1`, `O` instead of `0`, Russian `A` instead of English `A`, and so on.
1451	Line is checked for minimal valid length earlier in the code. Though, it makes sense to assert here on `!Line.empty()`
1469–1470	I think that an `EndOfFile` record should unconditionally cancel further processing. This allows moving the EOF record within a file to temporarily prevent some of the records from loading, which can be useful for testing. Also, it seems GNU objcopy behaves this way.
1525–1532	It's possible, but I don't see a straightforward way to do this without dynamic memory allocation. As we're checking the string with `checkChars`, we shouldn't really hit a conversion error unless something really weird happens.
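
To illustrate the point about precise error messages, here is a minimal sketch of a character check that reports where the first bad character sits. `firstInvalidHexCharPos` is a hypothetical helper, not the patch's `checkChars`. The 1-based position is chosen to match the reader test's "invalid character at position 10" message for `:01000000xF`, and accepting lowercase digits is an assumption.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Returns the 1-based position of the first character after the leading ':'
// that is not a hexadecimal digit, or 0 if every character is valid.
// Hypothetical helper; accepting lowercase digits is an assumption.
std::size_t firstInvalidHexCharPos(const std::string &Line) {
  for (std::size_t I = 1; I < Line.size(); ++I)
    if (!std::isxdigit(static_cast<unsigned char>(Line[I])))
      return I + 1;
  return 0;
}
```

A regex match on the whole line could only say that the line is malformed, whereas this form points at the exact column, which helps with look-alike characters such as `I` and `1`.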

Line parser moved to IHexRecord::parse
Better test case for reader
Addressed some of review comments

Ping

In D60270#1463807, @evgeny777 wrote:

Ping

FYI, most of the other llvm-objcopy developers are still away following Euro LLVM, so it may not be until next week that you get any comments from @rupprecht. I don't currently have time to read up on ihex, I'm afraid, but if you don't get any feedback next week, I'll try to look into it if I have time.

FYI, most of the other llvm-objcopy developers are still away following Euro LLVM

Ah, I see. Ok, there is no rush.

In D60270#1464005, @evgeny777 wrote:

FYI, most of the other llvm-objcopy developers are still away following Euro LLVM

Ah, I see. Ok, there is no rush.

Sorry, I should have mentioned earlier that I was going to be busy last week. (In advance: I'm here this week, but I'll be out next week).
Hopefully I'll get to this one today, or if not, then tomorrow.

btw, some people at euro llvm also requested srec support, which seems extremely similar to ihex -- so it might be good to think about how generic this handling can be, e.g. maybe most of it should just be a "record" parser which is shared between ihex and srec. I don't think it should be prematurely generalized beyond what is needed, but just don't do anything that would be hostile towards refactoring it :)

btw, some people at euro llvm also requested srec support, which seems extremely similar to ihex

For me it doesn't look extremely similar to IHEX, except both formats use hexadecimal byte representation.
There are no such things as segment and extended addresses in SREC and even checksum calculation is different.

I think if we implement SREC then part of section builder functionality from IHexELFBuilder::addDataSections can be moved to a common base class,
also it seems SREC would have similar record structure (Type, Address, Data).

Still I expect writer and parser to be completely separate.
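
As a concrete illustration of the checksum difference mentioned above: an Intel HEX checksum is the two's complement of the byte sum, while an S-record checksum is the ones' complement of the byte sum (taken over the count, address, and data bytes). The sketch below uses hypothetical function names and takes the already-decoded bytes that precede the checksum.

```cpp
#include <cstdint>
#include <vector>

// Sum of all bytes modulo 256.
static uint8_t byteSum(const std::vector<uint8_t> &Bytes) {
  uint8_t Sum = 0;
  for (uint8_t B : Bytes)
    Sum = static_cast<uint8_t>(Sum + B);
  return Sum;
}

// Intel HEX: two's complement of the sum of count, address, type and data.
uint8_t ihexChecksum(const std::vector<uint8_t> &Bytes) {
  return static_cast<uint8_t>(0x100 - byteSum(Bytes));
}

// Motorola S-record: ones' complement of the sum of count, address and data.
uint8_t srecChecksum(const std::vector<uint8_t> &Bytes) {
  return static_cast<uint8_t>(0xFF - byteSum(Bytes));
}
```

For example, the IHEX end-of-file record `:00000001FF` has bytes `00 00 00 01` and checksum `FF`, while the S-record `S9030000FC` has bytes `03 00 00` and checksum `FC`.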

There's a lot of code to review here. I'll keep reviewing it every day, but this is going to take a while. Any help splitting this up into smaller chunks would be appreciated: splitting reading and writing into two separate patches, and removing features that we can add later.

include/llvm/Support/Error.h
1180 ↗	(On Diff #193910)	size_t here and below is kind of confusing, can we use uint32_t?
tools/llvm-objcopy/ELF/Object.cpp
157	Unless a check that generates an error always precedes this, I think it's best to return an error in this case rather than fail an assertion. It would be better to roll this into a function returning Expected in that case anyway, I think.
197	Maybe a raw_ostream would be useful here. We've generally avoided them, but this format seems to lend itself to streams, whereas my opinion was the opposite before. You wouldn't need utohexstr, since those formatting options are already supplied by the library, I believe.
209	This is a very generic name with no comment. In general your comments have been awesome. I'd like to have an idea what this function does without reading the contents.
298	Does this ever make sense if there is no segment?
310	Masking that like this seems redundant, in general the number of places we're converting from 64 to 32 in an unchecked way is really shocking. I'd feel a lot more comfortable if we encapsulated these checks more and made them more clear.
313	Maybe we could split support for extended records out into a separate patch and error out here for now?
tools/llvm-objcopy/ELF/Object.h
266	What's the point of splitting this into two classes? Also, does inheriting from BinarySectionWriter make sense? The same visitors will need to be implemented, and I would suppose the offsets and everything else would be very different.
296	In general I think it might be worth considering whether there is a need to use sections at all. Originally with the binary writer we only used program headers. It turns out that people did a lot of stuff in a really odd way with GNU objcopy when using -O binary that required that we use sections as the primary basis for output. I would imagine that ihex users would not be doing the same sorts of odd tricks and that you could write the output straight from program headers. This would simplify the implementation greatly, I think, and harden it against all sorts of odd corner cases.
494–503	Why do we need this to output a new format?
928	We can probably split this into two changes to make things smaller, one for reading, and one for writing yeah?

There's a lot of code to review here.

I've responded to some of the comments, meanwhile I'm splitting patch into writer (will go first) and reader (will go next). Will update the review soon

tools/llvm-objcopy/ELF/Object.cpp
157	This function can't actually return error, because string has been previously validated (see `checkChars` for example). IMO, it's bad practice to implement runtime checks for one's own logical errors.
197	This function was optimized to avoid dynamic allocation (IHexLineData is actually a SmallVector): each line contains only 16 bytes of section data, so there can be a really huge number of lines. What are the benefits of using raw_ostream?
298	It's a helper function which returns section VA if there is no segment. Any suggestion for better name?
310	There is a checkSections method which checks all sections to detect whether any of them has a 64-bit address. Bear in mind that the implementation also supports sign-extended 32-bit addresses, i.e. 0xFFFFFFFF80000000 is a valid address, but 0x100000000 is not.
313	I suggest splitting reader and writer. To me it looks like a more logical split compared to removal of certain record types.
tools/llvm-objcopy/ELF/Object.h
266	I inherited IHexSectionWriter from BinarySectionWriter in order to reuse the visitors for RelocationSection, GnuDebugLinkSection, etc., which will never go to IHEX or to binary output. Are you suggesting duplication?
296	AFAIU one can't do this with IHEX because, unlike binary, IHEX is not a contiguous blob, e.g. you can have a gap between sections which won't go to the output file.
494–503	This function takes a portion of hexadecimal data from IHexRecord and appends it in binary form to internal vector of OwnedDataSection
928	I think so.
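
The address rule described in the reply on line 310 (plain 32-bit addresses and sign-extended 32-bit addresses are accepted, anything else is rejected) can be sketched as below. `fitsSignExtended32` is a hypothetical name and this is not necessarily how checkSections is written. The second test relies on unsigned wraparound, which is well defined for `uint64_t`.

```cpp
#include <cstdint>

// True if Addr is either a plain 32-bit value (0 .. 0xFFFFFFFF) or a
// sign-extended negative 32-bit value (0xFFFFFFFF80000000 .. UINT64_MAX).
// Hypothetical helper for illustration.
bool fitsSignExtended32(uint64_t Addr) {
  const uint64_t Max32 = 0xFFFFFFFFu;
  // Adding 0x80000000 wraps sign-extended values around to 0 .. 0x7FFFFFFF.
  return Addr <= Max32 || Addr + 0x80000000u <= Max32;
}
```

With this check, `0xFFFFFFFF80000000` is accepted and `0x100000000` is rejected, matching the two examples given in the reply.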

Split the IHEX patch into reader and writer. The diff now contains the "writer" part.

Any comments on this?

Ping

Looked mostly at the test for now, going to take a pass over the code today.

test/tools/llvm-objcopy/ELF/Inputs/ihex-elf-segments.yaml
21	I think this should be .data1? Command Output (stderr): -- error: Unknown section referenced: '.data1' by program header. (I think this is a recent validation added by rL359663)
test/tools/llvm-objcopy/ELF/ihex-writer.test
2	All these hex outputs have diffs when compared to what GNU objcopy produces... is that expected? I haven't yet debugged exactly why.
4	`cat X | FileCheck` should be replaced with `FileCheck --input-file=X` everywhere
21	When I run GNU objcopy on this test case, I get an error: `address 0xffffffff80001000 out of range for Intel Hex file`. Maybe we shouldn't be supporting it? Are we able to handle it correctly somehow even though GNU objcopy can't?
tools/llvm-objcopy/ELF/Object.cpp
154	Isn't relying on `Addr + 0x80000000` to wrap around UB? Could this just directly check `(Addr & 0xffffffff80000000) == 0xffffffff80000000` instead?

Some insight into the differences:

test/tools/llvm-objcopy/ELF/ihex-writer.test
11	It looks like the addresses in this file don't match up, but I don't have a specific suggestion yet
61	It looks like this should only be printed when the address is not zero: https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=blob;f=bfd/ihex.c;h=101e0a76155fc48f95312c08307739cf9c1ee5eb;hb=HEAD#l880
tools/llvm-objcopy/ELF/Object.cpp
204	It looks like ihex uses `\r\n` line endings 😦 https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=blob;f=bfd/ihex.c;h=101e0a76155fc48f95312c08307739cf9c1ee5eb;hb=HEAD#l752 It seems weird for me to request this, but I think we should write `\r\n`, as this seems like a strange detail that people might need when consuming these files. I don't actually have any examples of this, however.

MaskRay added inline comments. May 22 2019, 6:37 PM

test/tools/llvm-objcopy/ELF/ihex-writer.test
10	The two RUN lines can be written as: `llvm-objcopy -O ihex %t-segs - | FileCheck --check-prefix=SEGMENTS %s` if the output `%t2-segs.hex` isn't used elsewhere.

Fixed error in section name in one of the tests
Removed cat | FileCheck from test case
Zero start address is no longer written to IHEX
Switched to windows line endings.
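
For reference, the start-address behaviour after this update can be summarised in a small sketch. It assumes only what the update list and the writer test comments state: no start record for a zero entry point, a '03' record for entry points below 0x100000, and a '05' record otherwise. `startRecordType` is a hypothetical name, and the entry point is assumed to have already passed the 32-bit range check.

```cpp
#include <cstdint>

// Returns the IHEX record type used for the entry point, or 0 if no start
// address record is written at all. Hypothetical helper for illustration.
int startRecordType(uint64_t Entry) {
  if (Entry == 0)
    return 0; // zero start address: nothing is emitted
  if (Entry < 0x100000)
    return 3; // fits in 20 bits: 80x86 start address record
  return 5;   // otherwise: 32-bit start address record
}
```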

test/tools/llvm-objcopy/ELF/ihex-writer.test
2	After I stopped emitting the '03' record for a zero start address, the output for `ihex-elf-sections.yaml` is identical to that of GNU objcopy. However, if the input ELF file contains segments, the situation is different: GNU objcopy seems to ignore segments completely and always uses the section virtual address. This doesn't seem logical to me, and it is also inconsistent with the way we currently generate binary output in llvm-objcopy.
21	Probably the problem is in the version of objcopy you're using. On my machine 2.30 fails, but 2.32.51.20190227 works fine
tools/llvm-objcopy/ELF/Object.cpp
154	As far as I know unsigned overflows (unlike signed) are not UB. Addr is uint64_t.
204	Yes, I've seen this too. The IHEX spec says nothing about line endings. Wikipedia says: "Programs that create HEX records typically use line termination characters that conform to the conventions of their operating systems." Probably the easiest thing to do is to stick to GNU behavior. I'll update the patch.
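
A rough sketch of what emitting one record with `\r\n` line endings looks like is given below. `formatRecord` is a hypothetical function, not the patch's IHexRecord::getLine, and it uses `std::string` for brevity rather than the allocation-free buffer discussed earlier in the review.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Formats a single Intel HEX record, terminated with "\r\n" to follow what
// GNU objcopy writes. Hypothetical helper for illustration.
std::string formatRecord(uint8_t Type, uint16_t Addr,
                         const std::vector<uint8_t> &Data) {
  assert(Data.size() <= 255 && "byte count must fit in one byte");

  // Record bytes in file order: count, address (big-endian), type, data.
  std::vector<uint8_t> Bytes;
  Bytes.push_back(static_cast<uint8_t>(Data.size()));
  Bytes.push_back(static_cast<uint8_t>(Addr >> 8));
  Bytes.push_back(static_cast<uint8_t>(Addr & 0xFF));
  Bytes.push_back(Type);
  Bytes.insert(Bytes.end(), Data.begin(), Data.end());

  // Checksum: two's complement of the sum of all preceding bytes.
  uint8_t Sum = 0;
  for (uint8_t B : Bytes)
    Sum = static_cast<uint8_t>(Sum + B);
  Bytes.push_back(static_cast<uint8_t>(0x100 - Sum));

  static const char Digits[] = "0123456789ABCDEF";
  std::string Line = ":";
  for (uint8_t B : Bytes) {
    Line += Digits[B >> 4];
    Line += Digits[B & 0xF];
  }
  Line += "\r\n";
  return Line;
}
```

Calling `formatRecord(0, 0x0010, {0x10, 0x11, 0x12, 0x13, 0x14})` gives `:05001000101112131491` followed by `\r\n`, which is the second data line expected by the writer test.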

seiya added a subscriber: seiya. May 27 2019, 1:42 AM

I ran a few internal tests and this produces identical ihex output for every file I checked! \ o /

test/tools/llvm-objcopy/ELF/ihex-writer.test
21	Yep, that was it, I'm no longer seeing it with GNU objcopy from trunk.

This revision is now accepted and ready to land. May 28 2019, 2:19 PM

evgeny777 retitled this revision from [llvm-objcopy] Add support for Intel HEX input/output format to [llvm-objcopy] Add support for Intel HEX output format. May 29 2019, 3:11 AM

Closed by commit rL361949: [llvm-objcopy] Implement IHEX writer (authored by evgeny777). May 29 2019, 4:37 AM

This revision was automatically updated to reflect the committed changes.

Herald added a project: Restricted Project. May 29 2019, 4:37 AM

Herald added a subscriber: kristina.

evgeny777 mentioned this in D62583: [llvm-objcopy] Implement IHEX reader. May 29 2019, 6:12 AM

simon_tatham mentioned this in D132541: [llvm-objcopy] Introduce 'ihex-flat' output format. Aug 24 2022, 2:46 AM

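Revision Contents

Path	Size
test/tools/llvm-objcopy/ELF/Inputs/ihex-elf-pt-null.yaml	20 lines
test/tools/llvm-objcopy/ELF/Inputs/ihex-elf-sections.yaml	60 lines
test/tools/llvm-objcopy/ELF/Inputs/ihex-elf-sections2.yaml	39 lines
test/tools/llvm-objcopy/ELF/Inputs/ihex-elf-segments.yaml	60 lines
test/tools/llvm-objcopy/ELF/ihex-reader.test	129 lines
test/tools/llvm-objcopy/ELF/ihex-writer.test	92 lines
tools/llvm-objcopy/CopyConfig.cpp	3 lines
tools/llvm-objcopy/ELF/ and later entries (file names not shown)	2 lines, 52 lines, 187 lines, 429 lines, 2 lines, 18 lines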

Diff 193725

test/tools/llvm-objcopy/ELF/Inputs/ihex-elf-pt-null.yaml

				!ELF
				FileHeader:
				  Class: ELFCLASS64
				  Data: ELFDATA2LSB
				  Type: ET_EXEC
				  Machine: EM_X86_64
				Sections:
				  - Name: .text
				    Type: SHT_PROGBITS
				    Flags: [ SHF_ALLOC, SHF_EXECINSTR ]
				    Address: 0x0
				    AddressAlign: 0x8
				    Content: "0001020304"
				ProgramHeaders:
				  - Type: PT_NULL
				    Flags: [ PF_X, PF_R ]
				    VAddr: 0xF00000000
				    PAddr: 0x100000
				    Sections:
				      - Section: .text

test/tools/llvm-objcopy/ELF/Inputs/ihex-elf-sections.yaml

				!ELF
				FileHeader:
				  Class: ELFCLASS64
				  Data: ELFDATA2LSB
				  Type: ET_EXEC
				  Machine: EM_X86_64
				Sections:
				  - Name: .text
				    # This section's contents exceed the default IHex line length of
				    # 16 bytes, so we expect two lines created for it.
				    Type: SHT_PROGBITS
				    Flags: [ SHF_ALLOC, SHF_EXECINSTR ]
				    Address: 0x0
				    AddressAlign: 0x8
				    Content: "000102030405060708090A0B0C0D0E0F1011121314"
				  - Name: .data
				    # This section overlaps a 16-bit segment boundary, so we expect
				    # an additional 'SegmentAddr' record of type '02'
				    Type: SHT_PROGBITS
				    Flags: [ SHF_ALLOC ]
				    Content: "3031323334353637383940"
				    Address: 0xFFF8
				    AddressAlign: 0x8
				  - Name: .data2
				    # Previous section '.data' should have forced creation of
				    # 'SegmentAddr'(02) record with segment address of 0x10000,
				    # so this section should have address of 0x100.
				    Type: SHT_PROGBITS
				    Flags: [ SHF_ALLOC ]
				    Content: "40414243"
				    Address: 0x10100
				    AddressAlign: 0x8
				  - Name: .data3
				    # The last section not only overlaps a segment boundary, but
				    # also has a linear address which doesn't fit 20 bits. The
				    # following records should be created:
				    # 'SegmentAddr'(02) record with address 0x0
				    # 'ExtendedAddr'(04) record with address 0x100000
				    # 'Data'(00) record with 8 bytes of section data
				    # 'SegmentAddr'(02) record with address 0x10000
				    # 'Data'(00) record with remaining 3 bytes of data.
				    Type: SHT_PROGBITS
				    Flags: [ SHF_ALLOC ]
				    Content: "5051525354555657585960"
				    Address: 0x10FFF8
				    AddressAlign: 0x8
				  - Name: .bss
				    # NOBITS sections are not written to IHex
				    Type: SHT_NOBITS
				    Flags: [ SHF_ALLOC ]
				    Address: 0x10100
				    Size: 0x1000
				    AddressAlign: 0x8
				  - Name: .dummy
				    # Non-allocatable sections are not written to IHex
				    Type: SHT_PROGBITS
				    Flags: [ ]
				    Address: 0x20FFF8
				    Size: 65536
				    AddressAlign: 0x8
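
The comments in ihex-elf-sections.yaml describe how section data is cut into records: at most 16 bytes per line, and a new record whenever the data crosses a 64 KiB boundary. A minimal sketch of just that splitting step follows. `Chunk` and `splitIntoChunks` are hypothetical names, and the address-record bookkeeping ('02' and '04' records) is deliberately left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One data record's worth of bytes: where it starts and how long it is.
struct Chunk {
  uint64_t Addr;
  std::size_t Size;
};

// Splits [Addr, Addr + Size) into pieces of at most LineLen bytes that never
// cross a 64 KiB boundary. Hypothetical helper for illustration.
std::vector<Chunk> splitIntoChunks(uint64_t Addr, std::size_t Size,
                                   std::size_t LineLen = 16) {
  assert(LineLen > 0 && "line length must be positive");
  std::vector<Chunk> Out;
  while (Size > 0) {
    // Bytes left before the 16-bit record offset would overflow.
    uint64_t ToBoundary = 0x10000 - (Addr & 0xFFFF);
    uint64_t Limit = std::min<uint64_t>(LineLen, ToBoundary);
    std::size_t N = static_cast<std::size_t>(std::min<uint64_t>(Size, Limit));
    Out.push_back({Addr, N});
    Addr += N;
    Size -= N;
  }
  return Out;
}
```

For the `.data` section above (11 bytes at 0xFFF8) this yields 8 bytes at 0xFFF8 and 3 bytes at 0x10000, and for `.text` (21 bytes at 0x0) it yields 16 bytes followed by 5, which lines up with the data records expected by ihex-writer.test.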

test/tools/llvm-objcopy/ELF/Inputs/ihex-elf-sections2.yaml

				!ELF
				FileHeader:
				  Class: ELFCLASS64
				  Data: ELFDATA2LSB
				  Type: ET_EXEC
				  Machine: EM_X86_64
				Sections:
				  - Name: .text
				    # Zero length sections are not exported to IHex
				    # 'SegmentAddr' and 'ExtendedAddr' records aren't
				    # created either.
				    Type: SHT_PROGBITS
				    Flags: [ SHF_ALLOC, SHF_EXECINSTR ]
				    Address: 0x7FFFFFFF
				    AddressAlign: 0x8
				    Size: 0
				  - Name: .text1
				    # Section address is sign-extended 32-bit address
				    # Data fits 32-bit range
				    Type: SHT_PROGBITS
				    Flags: [ SHF_ALLOC, SHF_EXECINSTR ]
				    Address: 0xFFFFFFFF80001000
				    AddressAlign: 0x8
				    Content: "0001020304"
				  - Name: .text2
				    # Part of section data is in 32-bit address range
				    # and part isn't.
				    Type: SHT_PROGBITS
				    Flags: [ SHF_ALLOC, SHF_EXECINSTR ]
				    Address: 0xFFFFFFF8
				    AddressAlign: 0x8
				    Content: "000102030405060708"
				  - Name: .text3
				    # Entire section is outside of 32-bit range
				    Type: SHT_PROGBITS
				    Flags: [ SHF_ALLOC, SHF_EXECINSTR ]
				    Address: 0xFFFFFFFF0
				    AddressAlign: 0x8
				    Content: "0001020304"

test/tools/llvm-objcopy/ELF/Inputs/ihex-elf-segments.yaml

				# Here we use yaml from ihex-elf-sections.yaml, but add single load
				# segment containing all exported sections. In such case we should
				# use physical address of a section instead of virtual address. Physical
				# addresses start from 0x100000, so we create two additional 'ExtendedAddr'
				# (03) record in the beginning of IHex file with that physical address
				!ELF
				FileHeader:
				  Class: ELFCLASS64
				  Data: ELFDATA2LSB
				  Type: ET_EXEC
				  Machine: EM_X86_64
				  Entry: 0x100000
				Sections:
				  - Name: .text
				    Type: SHT_PROGBITS
				    Flags: [ SHF_ALLOC, SHF_EXECINSTR ]
				    Address: 0x0
				    AddressAlign: 0x8
				    Content: "000102030405060708090A0B0C0D0E0F1011121314"
				  - Name: .data
				    Type: SHT_PROGBITS
				    Flags: [ SHF_ALLOC ]
				    Content: "3031323334353637383940"
				    Address: 0xFFF8
				    AddressAlign: 0x8
				  - Name: .data2
				    Type: SHT_PROGBITS
				    Flags: [ SHF_ALLOC ]
				    Content: "40414243"
				    Address: 0x10100
				    AddressAlign: 0x8
				  - Name: .data3
				    Type: SHT_PROGBITS
				    Flags: [ SHF_ALLOC ]
				    Content: "5051525354555657585960"
				    Address: 0x10FFF8
				    AddressAlign: 0x8
				  - Name: .bss
				    Type: SHT_NOBITS
				    Flags: [ SHF_ALLOC ]
				    Address: 0x10100
				    Size: 0x1000
				    AddressAlign: 0x8
				  - Name: .dummy
				    Type: SHT_PROGBITS
				    Flags: [ ]
				    Address: 0x20FFF8
				    Size: 65536
				    AddressAlign: 0x8
				ProgramHeaders:
				  - Type: PT_LOAD
				    Flags: [ PF_X, PF_R ]
				    VAddr: 0xF00000000
				    PAddr: 0x100000
				    Sections:
				      - Section: .text
				      - Section: .data1
				      - Section: .data2
				      - Section: .data3
				      - Section: .bss

test/tools/llvm-objcopy/ELF/ihex-reader.test

				# Check section headers when converting from hex to ELF
				# RUN: yaml2obj %p/Inputs/ihex-elf-sections.yaml -o %t
				# RUN: llvm-objcopy -O ihex %t %t.hex
				# RUN: llvm-objcopy -I ihex %t.hex %t2
				# RUN: llvm-readobj -section-headers %t2 | FileCheck %s

				# Check section contents
				# RUN: llvm-objcopy -O binary --only-section=.text %t %t.text
				# RUN: llvm-objcopy -O binary --only-section=.sec1 %t2 %t2.sec1
				# RUN: cmp %t.text %t2.sec1
				# RUN: llvm-objcopy -O binary --only-section=.data %t %t.data
				# RUN: llvm-objcopy -O binary --only-section=.sec2 %t2 %t2.sec2
				# RUN: cmp %t.data %t2.sec2
				# RUN: llvm-objcopy -O binary --only-section=.data2 %t %t.data2
				# RUN: llvm-objcopy -O binary --only-section=.sec3 %t2 %t2.sec3
				# RUN: cmp %t.data2 %t2.sec3
				# RUN: llvm-objcopy -O binary --only-section=.data3 %t %t.data3
				# RUN: llvm-objcopy -O binary --only-section=.sec4 %t2 %t2.sec4
				# RUN: cmp %t.data3 %t2.sec4

				# Check for various parsing errors
				# 1. String too short
				# RUN: echo "01000000FF" > %t-bad.hex
				# RUN: not llvm-objcopy -I ihex %t-bad.hex %t-none 2>&1 | FileCheck %s --check-prefix=BAD_LENGTH
				# 2. missing ':'
				# RUN: echo "0100000000FF" > %t-bad2.hex
				# RUN: not llvm-objcopy -I ihex %t-bad2.hex %t-none 2>&1 | FileCheck %s --check-prefix=MISSING_COLON
				# 3. invalid character
				# RUN: echo ":01000000xF" > %t-bad3.hex
				# RUN: not llvm-objcopy -I ihex %t-bad3.hex %t-none 2>&1 | FileCheck %s --check-prefix=BAD_CHAR
				# 4. incorrect string length
				# RUN: echo ":010000000000000F" > %t-bad4.hex
				# RUN: not llvm-objcopy -I ihex %t-bad4.hex %t-none 2>&1 | FileCheck %s --check-prefix=BAD_LENGTH2
				# 5. invalid type (06)
				# RUN: echo ":00000006FA" > %t-bad5.hex
				# RUN: not llvm-objcopy -I ihex %t-bad5.hex %t-none 2>&1 | FileCheck %s --check-prefix=BAD_TYPE
				# 6. invalid checksum
				# RUN: echo ":00000001FA" > %t-bad6.hex
				# RUN: not llvm-objcopy -I ihex %t-bad6.hex %t-none 2>&1 | FileCheck %s --check-prefix=BAD_CKSUM
				# 7. zero data length
				# RUN: echo ":00010000FF" > %t-bad7.hex
				# RUN: not llvm-objcopy -I ihex %t-bad7.hex %t-none 2>&1 | FileCheck %s --check-prefix=ZERO_DATA_LEN
				# 8. Bad data length for '02' (SegmentAddr) record
				# RUN: echo ":03000002000000FB" > %t-bad8.hex
				# RUN: not llvm-objcopy -I ihex %t-bad8.hex %t-none 2>&1 | FileCheck %s --check-prefix=BAD_SEGADDR_LEN
				# 9. Bad data length for '03' (StartAddr80x86) record
				# RUN: echo ":03000003000000FA" > %t-bad9.hex
				# RUN: not llvm-objcopy -I ihex %t-bad9.hex %t-none 2>&1 | FileCheck %s --check-prefix=BAD_STARTADDR_LEN
				# 10. Bad data length for '05' (StartAddr) record
				# RUN: echo ":03000005000000F8" > %t-bad10.hex
				# RUN: not llvm-objcopy -I ihex %t-bad10.hex %t-none 2>&1 | FileCheck %s --check-prefix=BAD_STARTADDR_LEN
				# 11. Address value for 'StartAddr80x86' is greater than 0xFFFFFU
				# RUN: echo ":04000003FFFFFFFFFD" > %t-bad11.hex
				# RUN: not llvm-objcopy -I ihex %t-bad11.hex %t-none 2>&1 | FileCheck %s --check-prefix=BAD_STARTADDR
				# 12. Invalid extended address data size
				# RUN: echo ":04000004FFFFFFFFFC" > %t-bad12.hex
				# RUN: not llvm-objcopy -I ihex %t-bad12.hex %t-none 2>&1 | FileCheck %s --check-prefix=BAD_EXTADDR_LEN

				# CHECK: Index: 1
				# CHECK-NEXT: Name: .sec1 (35)
				# CHECK-NEXT: Type: SHT_PROGBITS (0x1)
				# CHECK-NEXT: Flags [ (0x3)
				# CHECK-NEXT: SHF_ALLOC (0x2)
				# CHECK-NEXT: SHF_WRITE (0x1)
				# CHECK-NEXT: ]
				# CHECK-NEXT: Address: 0x0
				# CHECK-NEXT: Offset: 0x34
				# CHECK-NEXT: Size: 21
				# CHECK-NEXT: Link: 0
				# CHECK-NEXT: Info: 0
				# CHECK-NEXT: AddressAlignment: 1
				# CHECK-NEXT: EntrySize: 0

				# CHECK: Index: 2
				# CHECK-NEXT: Name: .sec2 (29)
				# CHECK-NEXT: Type: SHT_PROGBITS (0x1)
				# CHECK-NEXT: Flags [ (0x3)
				# CHECK-NEXT: SHF_ALLOC (0x2)
				# CHECK-NEXT: SHF_WRITE (0x1)
				# CHECK-NEXT: ]
				# CHECK-NEXT: Address: 0xFFF8
				# CHECK-NEXT: Offset: 0x49
				# CHECK-NEXT: Size: 11
				# CHECK-NEXT: Link: 0
				# CHECK-NEXT: Info: 0
				# CHECK-NEXT: AddressAlignment: 1
				# CHECK-NEXT: EntrySize: 0

				# CHECK: Index: 3
				# CHECK-NEXT: Name: .sec3 (23)
				# CHECK-NEXT: Type: SHT_PROGBITS (0x1)
				# CHECK-NEXT: Flags [ (0x3)
				# CHECK-NEXT: SHF_ALLOC (0x2)
				# CHECK-NEXT: SHF_WRITE (0x1)
				# CHECK-NEXT: ]
				# CHECK-NEXT: Address: 0x10100
				# CHECK-NEXT: Offset: 0x54
				# CHECK-NEXT: Size: 4
				# CHECK-NEXT: Link: 0
				# CHECK-NEXT: Info: 0
				# CHECK-NEXT: AddressAlignment: 1
				# CHECK-NEXT: EntrySize: 0

				# CHECK: Index: 4
				# CHECK-NEXT: Name: .sec4 (17)
				# CHECK-NEXT: Type: SHT_PROGBITS (0x1)
				# CHECK-NEXT: Flags [ (0x3)
				# CHECK-NEXT: SHF_ALLOC (0x2)
				# CHECK-NEXT: SHF_WRITE (0x1)
				# CHECK-NEXT: ]
				# CHECK-NEXT: Address: 0x10FFF8
				# CHECK-NEXT: Offset: 0x58
				# CHECK-NEXT: Size: 11
				# CHECK-NEXT: Link: 0
				# CHECK-NEXT: Info: 0
				# CHECK-NEXT: AddressAlignment: 1
				# CHECK-NEXT: EntrySize: 0

				# BAD_LENGTH: error: '{{.*}}.hex': line 1: line is too short: 10 chars
				# MISSING_COLON: error: '{{.*}}.hex': line 1: missing ':' in the beginning of line
				# BAD_CHAR: error: '{{.*}}.hex': line 1: invalid character at position 10
				# BAD_LENGTH2: error: '{{.*}}.hex': line 1: invalid line length 17 (should be 13)
				# BAD_TYPE: error: '{{.*}}.hex': line 1: unknown record type: 6
				# BAD_CKSUM: error: '{{.*}}.hex': line 1: incorrect checksum
				# ZERO_DATA_LEN: error: '{{.*}}.hex': line 1: zero data length is not allowed for data records
				# BAD_SEGADDR_LEN: error: '{{.*}}.hex': line 1: segment address data should be 2 bytes in size
				# BAD_STARTADDR_LEN: error: '{{.*}}.hex': line 1: start address data should be 4 bytes in size
				# BAD_STARTADDR: error: '{{.*}}.hex': line 1: start address exceeds 20 bit for 80x86
				# BAD_EXTADDR_LEN: error: '{{.*}}.hex': line 1: extended address data should be 2 bytes in size

test/tools/llvm-objcopy/ELF/ihex-writer.test

				# RUN: yaml2obj %p/Inputs/ihex-elf-sections.yaml -o %t
				# RUN: llvm-objcopy -O ihex %t %t2.hex
				# RUN: cat %t2.hex | FileCheck %s

				# Check ihex output, when we have segments in ELF file
				# In such case only sections in PT_LOAD segments will
				# be exported and their physical addresses will be used
				# RUN: yaml2obj %p/Inputs/ihex-elf-segments.yaml -o %t-segs
				# RUN: llvm-objcopy -O ihex %t-segs %t2-segs.hex
				# RUN: cat %t2-segs.hex | FileCheck %s --check-prefix=SEGMENTS

				# Check that non-load segments are ignored:
				# RUN: yaml2obj %p/Inputs/ihex-elf-pt-null.yaml -o %t2-segs
				# RUN: llvm-objcopy -O ihex %t2-segs %t3-segs.hex
				# RUN: cat %t3-segs.hex | FileCheck %s --check-prefix=PT_NULL

				# Check that sign-extended 32-bit section addresses are processed
				# correctly
				# RUN: yaml2obj %p/Inputs/ihex-elf-sections2.yaml -o %t-sec2
				# RUN: llvm-objcopy -O ihex --only-section=.text1 %t-sec2 %t-sec2.hex
				# RUN: cat %t-sec2.hex | FileCheck %s --check-prefix=SIGN_EXTENDED

				# Check that section address range overlapping 32 bit range
				# triggers an error
				# RUN: not llvm-objcopy -O ihex --only-section=.text2 %t-sec2 %t-sec2-2.hex 2>&1 | FileCheck %s --check-prefix=BAD-ADDR
				# RUN: not llvm-objcopy -O ihex --only-section=.text3 %t-sec2 %t-sec2-3.hex 2>&1 | FileCheck %s --check-prefix=BAD-ADDR2

				# Check that zero length section is not written
				# RUN: llvm-objcopy -O ihex --only-section=.text %t-sec2 %t-sec2-4.hex
				# RUN: cat %t-sec2-4.hex | FileCheck %s --check-prefix=ZERO_SIZE_SEC

				# Check 80x86 start address record. It is created for start
				# addresses less than 0x100000
				# RUN: llvm-objcopy -O ihex --set-start=0xFFFF %t %t3.hex
				# RUN: cat %t3.hex | FileCheck %s --check-prefix=START1

				# Check i386 start address record (05). It is created for
				# start addresses which don't fit 20 bits
				# RUN: llvm-objcopy -O ihex --set-start=0x100000 %t %t4.hex
				# RUN: cat %t4.hex | FileCheck %s --check-prefix=START2

				# We allow sign extended 32 bit start addresses as well.
				# RUN: llvm-objcopy -O ihex --set-start=0xFFFFFFFF80001000 %t %t5.hex
				# RUN: cat %t5.hex | FileCheck %s --check-prefix=START3

				# Start address which exceeds 32 bit range triggers an error
				# RUN: not llvm-objcopy -O ihex --set-start=0xF00000000 %t %t6.hex 2>&1 | FileCheck %s --check-prefix=BAD-START

				# CHECK: :10000000000102030405060708090A0B0C0D0E0F78
				# CHECK-NEXT: :05001000101112131491
				# CHECK-NEXT: :08FFF800303132333435363765
				# CHECK-NEXT: :020000021000EC
				# CHECK-NEXT: :030000003839404C
				# CHECK-NEXT: :0401000040414243F5
				# CHECK-NEXT: :020000020000FC
				# CHECK-NEXT: :020000040010EA
				# CHECK-NEXT: :08FFF800505152535455565765
				# CHECK-NEXT: :020000040011E9
				# CHECK-NEXT: :03000000585960EC
				# CHECK-NEXT: :0400000300000000F9
				# CHECK-NEXT: :00000001FF

				# SEGMENTS: :020000040010EA
				# SEGMENTS-NEXT: :1002F800000102030405060708090A0B0C0D0E0F7E
				# SEGMENTS-NEXT: :05030800101112131496
				# SEGMENTS-NEXT: :0B031000303132333435363738394095
				# SEGMENTS-NEXT: :0403200040414243D3
				# SEGMENTS-NEXT: :0B03280050515253545556575859601D
				# SEGMENTS-NEXT: :0400000500100000E7
				# SEGMENTS-NEXT: :00000001FF

				# 'ExtendedAddr' (04) record shouldn't be created
				# PT_NULL-NOT: :02000004

				# SIGN_EXTENDED: :0200000480007A
				# SIGN_EXTENDED-NEXT: :051000000001020304E1
				# SIGN_EXTENDED-NEXT: :0400000300000000F9
				# SIGN_EXTENDED-NEXT: :00000001FF

				# BAD-ADDR: error: Section '.text2' address range [0xfffffff8, 0x100000000] is not 32 bit
				# BAD-ADDR2: error: Section '.text3' address range [0xffffffff0, 0xffffffff4] is not 32 bit

				# There shouldn't be 'ExtendedAddr' nor 'Data' records
				# ZERO_SIZE_SEC-NOT: :02000004
				# ZERO_SIZE_SEC-NOT: :00FFFF00
				# ZERO_SIZE_SEC: :0400000300000000F9
				# ZERO_SIZE_SEC-NEXT: :00000001FF

				# START1: :040000030000FFFFFB
				# START2: :0400000500100000E7
				# START3: :040000058000100067
				# BAD-START: error: Entry point address 0xf00000000 overflows 32 bits
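
The START1, START2, START3 and BAD-START expectations above can be reproduced with a small standalone sketch. This is not the patch's code: `StartRecord` and `makeStartRecord` are hypothetical names, and the CS:IP split used for entries below 0x100000 (bits 16..19 moved into the segment value) is an assumption that happens to agree with the expected output for 0xFFFF.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical container for a start-address record payload.
struct StartRecord {
  uint8_t Type;                // 3 = 80x86 CS:IP start, 5 = 32-bit linear start
  std::array<uint8_t, 4> Data; // big-endian payload bytes
};

// Returns std::nullopt when the entry point does not fit in 32 bits.
// Sign-extended 32-bit values (e.g. 0xFFFFFFFF80001000) are accepted.
std::optional<StartRecord> makeStartRecord(uint64_t Entry) {
  bool Overflows = Entry > UINT32_MAX && Entry + 0x80000000ULL > UINT32_MAX;
  if (Overflows)
    return std::nullopt;

  uint32_t E = static_cast<uint32_t>(Entry);
  if (Entry <= 0xFFFFF) {
    // Assumed split: segment carries bits 16..19, IP carries the low 16 bits.
    uint16_t CS = static_cast<uint16_t>((E & 0xF0000) >> 4);
    uint16_t IP = static_cast<uint16_t>(E);
    return StartRecord{3,
                       {uint8_t(CS >> 8), uint8_t(CS), uint8_t(IP >> 8),
                        uint8_t(IP)}};
  }
  return StartRecord{
      5, {uint8_t(E >> 24), uint8_t(E >> 16), uint8_t(E >> 8), uint8_t(E)}};
}
```

With this sketch, 0xFFFF yields a type 03 record with payload `0000FFFF`, 0x100000 yields a type 05 record with payload `00100000`, and 0xF00000000 is rejected.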

tools/llvm-objcopy/CopyConfig.cpp

Show First 20 Lines • Show All 453 Lines • ▼ Show 20 Lines	if (BinaryArch.empty())
return createStringError(		return createStringError(
errc::invalid_argument,		errc::invalid_argument,
"Specified binary input without specifiying an architecture");		"Specified binary input without specifiying an architecture");
Expected<const MachineInfo &> MI = getMachineInfo(BinaryArch);		Expected<const MachineInfo &> MI = getMachineInfo(BinaryArch);
if (!MI)		if (!MI)
return MI.takeError();		return MI.takeError();
Config.BinaryArch = *MI;		Config.BinaryArch = *MI;
}		}
if (!Config.OutputFormat.empty() && Config.OutputFormat != "binary") {		if (!Config.OutputFormat.empty() && Config.OutputFormat != "binary" &&
		Config.OutputFormat != "ihex") {
Expected<const MachineInfo &> MI =		Expected<const MachineInfo &> MI =
getOutputFormatMachineInfo(Config.OutputFormat);		getOutputFormatMachineInfo(Config.OutputFormat);
if (!MI)		if (!MI)
return MI.takeError();		return MI.takeError();
Config.OutputArch = *MI;		Config.OutputArch = *MI;
}		}

if (auto Arg = InputArgs.getLastArg(OBJCOPY_compress_debug_sections,		if (auto Arg = InputArgs.getLastArg(OBJCOPY_compress_debug_sections,

tools/llvm-objcopy/ELF/ELFObjcopy.h

	class ELFObjectFileBase;			class ELFObjectFileBase;
	} // end namespace object			} // end namespace object

	namespace objcopy {			namespace objcopy {
	struct CopyConfig;			struct CopyConfig;
	class Buffer;			class Buffer;

	namespace elf {			namespace elf {
				Error executeObjcopyOnIHex(const CopyConfig &Config, MemoryBuffer &In,
				Buffer &Out);
	Error executeObjcopyOnRawBinary(const CopyConfig &Config, MemoryBuffer &In,			Error executeObjcopyOnRawBinary(const CopyConfig &Config, MemoryBuffer &In,
	Buffer &Out);			Buffer &Out);
	Error executeObjcopyOnBinary(const CopyConfig &Config,			Error executeObjcopyOnBinary(const CopyConfig &Config,
	object::ELFObjectFileBase &In, Buffer &Out);			object::ELFObjectFileBase &In, Buffer &Out);

	} // end namespace elf			} // end namespace elf
	} // end namespace objcopy			} // end namespace objcopy
	} // end namespace llvm			} // end namespace llvm

	#endif // LLVM_TOOLS_OBJCOPY_ELFOBJCOPY_H			#endif // LLVM_TOOLS_OBJCOPY_ELFOBJCOPY_H

tools/llvm-objcopy/ELF/ELFObjcopy.cpp

static ElfType getOutputElfType(const MachineInfo &MI) {		static ElfType getOutputElfType(const MachineInfo &MI) {
// Infer output ELF type from the binary arch specified		// Infer output ELF type from the binary arch specified
if (MI.Is64Bit)		if (MI.Is64Bit)
return MI.IsLittleEndian ? ELFT_ELF64LE : ELFT_ELF64BE;		return MI.IsLittleEndian ? ELFT_ELF64LE : ELFT_ELF64BE;
else		else
return MI.IsLittleEndian ? ELFT_ELF32LE : ELFT_ELF32BE;		return MI.IsLittleEndian ? ELFT_ELF32LE : ELFT_ELF32BE;
}		}

static std::unique_ptr<Writer> createWriter(const CopyConfig &Config,		static std::unique_ptr<Writer> createELFWriter(const CopyConfig &Config,
Object &Obj, Buffer &Buf,		Object &Obj, Buffer &Buf,
ElfType OutputElfType) {		ElfType OutputElfType) {
if (Config.OutputFormat == "binary") {
return llvm::make_unique<BinaryWriter>(Obj, Buf);
}
// Depending on the initial ELFT and OutputFormat we need a different Writer.		// Depending on the initial ELFT and OutputFormat we need a different Writer.
switch (OutputElfType) {		switch (OutputElfType) {
case ELFT_ELF32LE:		case ELFT_ELF32LE:
return llvm::make_unique<ELFWriter<ELF32LE>>(Obj, Buf,		return llvm::make_unique<ELFWriter<ELF32LE>>(Obj, Buf,
!Config.StripSections);		!Config.StripSections);
case ELFT_ELF64LE:		case ELFT_ELF64LE:
return llvm::make_unique<ELFWriter<ELF64LE>>(Obj, Buf,		return llvm::make_unique<ELFWriter<ELF64LE>>(Obj, Buf,
!Config.StripSections);		!Config.StripSections);
case ELFT_ELF32BE:		case ELFT_ELF32BE:
return llvm::make_unique<ELFWriter<ELF32BE>>(Obj, Buf,		return llvm::make_unique<ELFWriter<ELF32BE>>(Obj, Buf,
!Config.StripSections);		!Config.StripSections);
case ELFT_ELF64BE:		case ELFT_ELF64BE:
return llvm::make_unique<ELFWriter<ELF64BE>>(Obj, Buf,		return llvm::make_unique<ELFWriter<ELF64BE>>(Obj, Buf,
!Config.StripSections);		!Config.StripSections);
}		}
llvm_unreachable("Invalid output format");		llvm_unreachable("Invalid output format");
}		}

		static std::unique_ptr<Writer> createWriter(const CopyConfig &Config,
		Object &Obj, Buffer &Buf,
		ElfType OutputElfType) {
		using Functor = std::function<std::unique_ptr<Writer>()>;
		return StringSwitch<Functor>(Config.OutputFormat)
		.Case("binary", [&] { return llvm::make_unique<BinaryWriter>(Obj, Buf); })
		.Case("ihex", [&] { return llvm::make_unique<IHexWriter>(Obj, Buf); })
		.Default(
		[&] { return createELFWriter(Config, Obj, Buf, OutputElfType); })();
		}
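
The new `createWriter` selects a factory by output format name and calls it, so only the chosen writer is constructed. A minimal standalone sketch of the same idea follows. It uses `std::unordered_map` instead of `llvm::StringSwitch`, and the `Writer` classes and `kind()` method here are simplified stand-ins invented for illustration, not the classes from the patch.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <unordered_map>

// Simplified stand-ins for the writer hierarchy (hypothetical).
struct Writer {
  virtual ~Writer() = default;
  virtual std::string kind() const = 0;
};
struct BinaryWriter : Writer {
  std::string kind() const override { return "binary"; }
};
struct IHexWriter : Writer {
  std::string kind() const override { return "ihex"; }
};
struct ELFWriter : Writer {
  std::string kind() const override { return "elf"; }
};

// Looks up a factory by format name; anything unknown falls back to ELF,
// mirroring the .Default(...) branch in the diff.
std::unique_ptr<Writer> createWriter(const std::string &OutputFormat) {
  using Factory = std::function<std::unique_ptr<Writer>()>;
  static const std::unordered_map<std::string, Factory> Factories = {
      {"binary", [] { return std::make_unique<BinaryWriter>(); }},
      {"ihex", [] { return std::make_unique<IHexWriter>(); }},
  };
  auto It = Factories.find(OutputFormat);
  if (It != Factories.end())
    return It->second();
  return std::make_unique<ELFWriter>();
}
```

Storing callables rather than constructed objects matters because each branch returns a different concrete type; the common `std::function` return type performs the conversion to `std::unique_ptr<Writer>`.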

template <class ELFT>		template <class ELFT>
static Expected<ArrayRef<uint8_t>>		static Expected<ArrayRef<uint8_t>>
findBuildID(const object::ELFFile<ELFT> &In) {		findBuildID(const object::ELFFile<ELFT> &In) {
for (const auto &Phdr : unwrapOrError(In.program_headers())) {		for (const auto &Phdr : unwrapOrError(In.program_headers())) {
if (Phdr.p_type != PT_NOTE)		if (Phdr.p_type != PT_NOTE)
continue;		continue;
Error Err = Error::success();		Error Err = Error::success();
for (const auto &Note : In.notes(Phdr, Err))		for (const auto &Note : In.notes(Phdr, Err))
▲ Show 20 Lines • Show All 471 Lines • ▼ Show 20 Lines	Obj.SymbolTable->addSymbol(
Sec ? (uint16_t)SYMBOL_SIMPLE_INDEX : (uint16_t)SHN_ABS, 0);		Sec ? (uint16_t)SYMBOL_SIMPLE_INDEX : (uint16_t)SHN_ABS, 0);
}		}

if (Config.EntryExpr)		if (Config.EntryExpr)
Obj.Entry = Config.EntryExpr(Obj.Entry);		Obj.Entry = Config.EntryExpr(Obj.Entry);
return Error::success();		return Error::success();
}		}

		static Error writeOutput(const CopyConfig &Config, Object &Obj, Buffer &Out,
		ElfType OutputElfType) {
		std::unique_ptr<Writer> Writer =
		createWriter(Config, Obj, Out, OutputElfType);
		if (Error E = Writer->finalize())
		return E;
		return Writer->write();
		}

		Error executeObjcopyOnIHex(const CopyConfig &Config, MemoryBuffer &In,
		Buffer &Out) {
		IHexReader Reader(&In);
		std::unique_ptr<Object> Obj = Reader.create();
		const ElfType OutputElfType = getOutputElfType(
		Config.OutputArch ? Config.OutputArch.getValue() : Config.BinaryArch);
		rupprecht: A more succinct way (and same in executeObjcopyOnRawBinary): `const ElfType OutputElfType = getOutputElfType(Config.OutputArch.getValueOr(Config.BinaryArch));`
		if (Error E = handleArgs(Config, *Obj, Reader, OutputElfType))
		return E;
		return writeOutput(Config, *Obj, Out, OutputElfType);
		}
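
The fallback the reviewer suggests (prefer the output arch, otherwise use the binary arch) can be illustrated with `std::optional::value_or`, which plays the role `getValueOr` plays on `llvm::Optional`. This is only an analogy sketch: the `MachineInfo` struct below is a trimmed, hypothetical stand-in, and `pickArch` is not a function in the patch.

```cpp
#include <cassert>
#include <optional>

// Trimmed stand-in for the real MachineInfo (hypothetical).
struct MachineInfo {
  bool Is64Bit;
  bool IsLittleEndian;
};

// Prefer OutputArch when it is set, otherwise fall back to BinaryArch.
MachineInfo pickArch(const std::optional<MachineInfo> &OutputArch,
                     const MachineInfo &BinaryArch) {
  return OutputArch.value_or(BinaryArch);
}
```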

Error executeObjcopyOnRawBinary(const CopyConfig &Config, MemoryBuffer &In,		Error executeObjcopyOnRawBinary(const CopyConfig &Config, MemoryBuffer &In,
Buffer &Out) {		Buffer &Out) {
BinaryReader Reader(Config.BinaryArch, &In);		BinaryReader Reader(Config.BinaryArch, &In);
std::unique_ptr<Object> Obj = Reader.create();		std::unique_ptr<Object> Obj = Reader.create();

// Prefer OutputArch (-O<format>) if set, otherwise fallback to BinaryArch		// Prefer OutputArch (-O<format>) if set, otherwise fallback to BinaryArch
// (-B<arch>).		// (-B<arch>).
const ElfType OutputElfType = getOutputElfType(		const ElfType OutputElfType = getOutputElfType(
Config.OutputArch ? Config.OutputArch.getValue() : Config.BinaryArch);		Config.OutputArch ? Config.OutputArch.getValue() : Config.BinaryArch);
if (Error E = handleArgs(Config, *Obj, Reader, OutputElfType))		if (Error E = handleArgs(Config, *Obj, Reader, OutputElfType))
return E;		return E;
std::unique_ptr<Writer> Writer =		return writeOutput(Config, *Obj, Out, OutputElfType);
createWriter(Config, *Obj, Out, OutputElfType);
if (Error E = Writer->finalize())
return E;
return Writer->write();
}		}

Error executeObjcopyOnBinary(const CopyConfig &Config,		Error executeObjcopyOnBinary(const CopyConfig &Config,
object::ELFObjectFileBase &In, Buffer &Out) {		object::ELFObjectFileBase &In, Buffer &Out) {
ELFReader Reader(&In);		ELFReader Reader(&In);
std::unique_ptr<Object> Obj = Reader.create();		std::unique_ptr<Object> Obj = Reader.create();
// Prefer OutputArch (-O<format>) if set, otherwise infer it from the input.		// Prefer OutputArch (-O<format>) if set, otherwise infer it from the input.
const ElfType OutputElfType =		const ElfType OutputElfType =
Show All 13 Lines	Error executeObjcopyOnBinary(const CopyConfig &Config,
if (!Config.BuildIdLinkDir.empty() && Config.BuildIdLinkInput)		if (!Config.BuildIdLinkDir.empty() && Config.BuildIdLinkInput)
if (Error E =		if (Error E =
linkToBuildIdDir(Config, Config.InputFilename,		linkToBuildIdDir(Config, Config.InputFilename,
Config.BuildIdLinkInput.getValue(), BuildIdBytes))		Config.BuildIdLinkInput.getValue(), BuildIdBytes))
return E;		return E;

if (Error E = handleArgs(Config, *Obj, Reader, OutputElfType))		if (Error E = handleArgs(Config, *Obj, Reader, OutputElfType))
return E;		return E;
std::unique_ptr<Writer> Writer =		if (Error E = writeOutput(Config, *Obj, Out, OutputElfType))
createWriter(Config, *Obj, Out, OutputElfType);
if (Error E = Writer->finalize())
return E;
if (Error E = Writer->write())
return E;		return E;
if (!Config.BuildIdLinkDir.empty() && Config.BuildIdLinkOutput)		if (!Config.BuildIdLinkDir.empty() && Config.BuildIdLinkOutput)
if (Error E =		if (Error E =
linkToBuildIdDir(Config, Config.OutputFilename,		linkToBuildIdDir(Config, Config.OutputFilename,
Config.BuildIdLinkOutput.getValue(), BuildIdBytes))		Config.BuildIdLinkOutput.getValue(), BuildIdBytes))
return E;		return E;

return Error::success();		return Error::success();
}		}

} // end namespace elf		} // end namespace elf
} // end namespace objcopy		} // end namespace objcopy
} // end namespace llvm		} // end namespace llvm

tools/llvm-objcopy/ELF/Object.h

#include "Buffer.h"		#include "Buffer.h"
#include "CopyConfig.h"		#include "CopyConfig.h"
#include "llvm/ADT/ArrayRef.h"		#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/StringRef.h"		#include "llvm/ADT/StringRef.h"
#include "llvm/ADT/Twine.h"		#include "llvm/ADT/Twine.h"
#include "llvm/BinaryFormat/ELF.h"		#include "llvm/BinaryFormat/ELF.h"
#include "llvm/MC/StringTableBuilder.h"		#include "llvm/MC/StringTableBuilder.h"
#include "llvm/Object/ELFObjectFile.h"		#include "llvm/Object/ELFObjectFile.h"
		#include "llvm/Support/Errc.h"
#include "llvm/Support/FileOutputBuffer.h"		#include "llvm/Support/FileOutputBuffer.h"
#include "llvm/Support/JamCRC.h"		#include "llvm/Support/JamCRC.h"
#include <cstddef>		#include <cstddef>
#include <cstdint>		#include <cstdint>
#include <functional>		#include <functional>
#include <memory>		#include <memory>
#include <set>		#include <set>
#include <vector>		#include <vector>
▲ Show 20 Lines • Show All 135 Lines • ▼ Show 20 Lines	public:
void visit(GroupSection &Sec) override;		void visit(GroupSection &Sec) override;
void visit(SectionIndexSection &Sec) override;		void visit(SectionIndexSection &Sec) override;
void visit(CompressedSection &Sec) override;		void visit(CompressedSection &Sec) override;
void visit(DecompressedSection &Sec) override;		void visit(DecompressedSection &Sec) override;
};		};

#define MAKE_SEC_WRITER_FRIEND \		#define MAKE_SEC_WRITER_FRIEND \
friend class SectionWriter; \		friend class SectionWriter; \
		friend class IHexSectionWriterBase; \
		friend class IHexSectionWriter; \
template <class ELFT> friend class ELFSectionWriter; \		template <class ELFT> friend class ELFSectionWriter; \
template <class ELFT> friend class ELFSectionSizer;		template <class ELFT> friend class ELFSectionSizer;

class BinarySectionWriter : public SectionWriter {		class BinarySectionWriter : public SectionWriter {
public:		public:
virtual ~BinarySectionWriter() {}		virtual ~BinarySectionWriter() {}

void visit(const SymbolTableSection &Sec) override;		void visit(const SymbolTableSection &Sec) override;
void visit(const RelocationSection &Sec) override;		void visit(const RelocationSection &Sec) override;
void visit(const GnuDebugLinkSection &Sec) override;		void visit(const GnuDebugLinkSection &Sec) override;
void visit(const GroupSection &Sec) override;		void visit(const GroupSection &Sec) override;
void visit(const SectionIndexSection &Sec) override;		void visit(const SectionIndexSection &Sec) override;
void visit(const CompressedSection &Sec) override;		void visit(const CompressedSection &Sec) override;
void visit(const DecompressedSection &Sec) override;		void visit(const DecompressedSection &Sec) override;

explicit BinarySectionWriter(Buffer &Buf) : SectionWriter(Buf) {}		explicit BinarySectionWriter(Buffer &Buf) : SectionWriter(Buf) {}
};		};

		using IHexLineData = SmallVector<char, 64>;

		struct IHexRecord {
		// Memory address of the record.
		uint32_t Addr : 16;
		// Record type (see below).
		uint32_t Type : 16;
		rupprecht: Why not just uint16_t fields?
		// Record data in hexadecimal form.
		StringRef HexData;

		// Helper method to get file length of the record
		// including newline character
		static size_t getLength(size_t DataSize) {
		// :LLAAAATT[DD...DD]CC'
		return DataSize * 2 + 11;
		}

		// Gets length of line in a file (getLength + CR).
		static size_t getLineLength(size_t DataSize) {
		return getLength(DataSize) + 1;
		}

		// Given type, address and data returns line which can
		// be written to output file.
		static IHexLineData getLine(uint8_t Type, uint16_t Addr,
		ArrayRef<uint8_t> Data);

		// Calculates checksum of stringified record representation
		// S must NOT contain leading ':' and trailing whitespace
		// characters
		static uint8_t getChecksum(StringRef S);

		enum Type {
		// Contains data and a 16-bit starting address for the data.
		// The byte count specifies number of data bytes in the record.
		Data = 0,
		// Must occur exactly once per file in the last line of the file.
		// The data field is empty (thus byte count is 00) and the address
		// field is typically 0000.
		EndOfFile = 1,
		// The data field contains a 16-bit segment base address (thus byte
		// count is always 02) compatible with 80x86 real mode addressing.
		// The address field (typically 0000) is ignored. The segment address
		// from the most recent 02 record is multiplied by 16 and added to each
		// subsequent data record address to form the physical starting address
		// for the data. This allows addressing up to one megabyte of address
		// space.
		SegmentAddr = 2,
		// For 80x86 processors, specifies the initial content of the CS:IP
		// registers. The address field is 0000, the byte count is always 04,
		// the first two data bytes are the CS value, the latter two are the
		// IP value.
		StartAddr80x86 = 3,
		// Allows for 32 bit addressing (up to 4GiB). The record's address field
		// is ignored (typically 0000) and its byte count is always 02. The two
		// data bytes (big endian) specify the upper 16 bits of the 32 bit
		// absolute address for all subsequent type 00 records
		ExtendedAddr = 4,
		// The address field is 0000 (not used) and the byte count is always 04.
		// The four data bytes represent a 32-bit address value. In the case of
		// 80386 and higher CPUs, this address is loaded into the EIP register.
		StartAddr = 5,
		// We have no other valid types
		InvalidType = 6
		};
		};
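
The comments on `SegmentAddr` (02) and `ExtendedAddr` (04) describe how a data record's 16-bit address field combines with the most recent base record. A minimal sketch of that arithmetic, with hypothetical function names that do not appear in the patch:

```cpp
#include <cassert>
#include <cstdint>

// Type 02: the segment value is multiplied by 16 and added to the
// 16-bit address field of each following data record.
uint32_t segmentedAddress(uint16_t Segment, uint16_t Offset) {
  return (static_cast<uint32_t>(Segment) << 4) + Offset;
}

// Type 04: the two data bytes supply the upper 16 bits of a 32-bit
// address; the data record's address field supplies the lower 16 bits.
uint32_t linearAddress(uint16_t Upper, uint16_t Offset) {
  return (static_cast<uint32_t>(Upper) << 16) | Offset;
}
```

For example, the test's `:020000021000EC` record (segment 0x1000) places a following data record at offset 0x0000 at physical address 0x10000, and `:020000040010EA` (upper half 0x0010) places a data record at offset 0xFFF8 at 0x10FFF8.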

		// Base class for IHexSectionWriter. This class implements writing algorithm,
		// but doesn't actually write records. It is used for output buffer size
		// calculation in IHexWriter::finalize.
		class IHexSectionWriterBase : public BinarySectionWriter {
		// 20-bit segment address
		uint32_t SegmentAddr = 0;
		// Extended linear address
		uint32_t BaseAddr = 0;
		jakehehrlich: What's the point in splitting this into two classes? Also, does inheriting from BinarySectionWriter make sense? The same visitors will need to be implemented; I would suppose the offsets and everything would be very different.
		evgeny777 (author): I inherited IHexSectionWriter from BinarySectionWriter in order to reuse the visitors for RelocationSection, GnuDebugLinkSection, etc., which will never go to IHEX or to binary output. Are you suggesting duplication?

		// Write segment address corresponding to 'Addr'
		uint64_t writeSegmentAddr(uint64_t Addr);
		// Write extended linear (base) address corresponding to 'Addr'
		uint64_t writeBaseAddr(uint64_t Addr);

		protected:
		// Offset in the output buffer
		uint64_t Offset = 0;

		void writeSection(const SectionBase *Sec, ArrayRef<uint8_t> Data);
		virtual void writeData(uint8_t Type, uint16_t Addr, ArrayRef<uint8_t> Data);

		public:
		explicit IHexSectionWriterBase(Buffer &Buf) : BinarySectionWriter(Buf) {}

		uint64_t getBufferOffset() const { return Offset; }
		void visit(const Section &Sec) final;
		void visit(const OwnedDataSection &Sec) final;
		void visit(const StringTableSection &Sec) override;
		void visit(const DynamicRelocationSection &Sec) final;
		using BinarySectionWriter::visit;
		};

		// Real IHEX section writer
		class IHexSectionWriter : public IHexSectionWriterBase {
		public:
		IHexSectionWriter(Buffer &Buf) : IHexSectionWriterBase(Buf) {}

		void writeData(uint8_t Type, uint16_t Addr, ArrayRef<uint8_t> Data) override;
		jakehehrlich: In general I think it might be worth considering whether there is a need to use sections at all. Originally with the binary writer we only used program headers. It turns out that people did a lot of stuff in a really odd way with GNU objcopy when using -O binary that required that we use sections as the primary basis for output. I would imagine that ihex users would not be doing the same sorts of odd tricks and that you could write the output straight from program headers. This would simplify the implementation greatly, I think, and harden it against all sorts of odd corner cases.
		evgeny777 (author): AFAIU one can't do this with IHEX, because unlike binary, IHEX is not a contiguous blob, e.g. you can have a gap between sections which won't go to the output file.
		void visit(const StringTableSection &Sec) override;
		};

class Writer {		class Writer {
protected:		protected:
Object &Obj;		Object &Obj;
Buffer &Buf;		Buffer &Buf;

public:		public:
virtual ~Writer();		virtual ~Writer();
virtual Error finalize() = 0;		virtual Error finalize() = 0;
▲ Show 20 Lines • Show All 44 Lines • ▼ Show 20 Lines

public:		public:
~BinaryWriter() {}		~BinaryWriter() {}
Error finalize() override;		Error finalize() override;
Error write() override;		Error write() override;
BinaryWriter(Object &Obj, Buffer &Buf) : Writer(Obj, Buf) {}		BinaryWriter(Object &Obj, Buffer &Buf) : Writer(Obj, Buf) {}
};		};

		class IHexWriter : public Writer {
		struct SectionCompare {
		bool operator()(const SectionBase *Lhs, const SectionBase *Rhs) const;
		};

		std::set<const SectionBase *, SectionCompare> Sections;
		size_t TotalSize;

		Error checkSection(const SectionBase &Sec);
		uint64_t writeEntryPointRecord(uint8_t *Buf);
		uint64_t writeEndOfFileRecord(uint8_t *Buf);

		public:
		~IHexWriter() {}
		Error finalize() override;
		Error write() override;
		IHexWriter(Object &Obj, Buffer &Buf) : Writer(Obj, Buf) {}
		};

class SectionBase {		class SectionBase {
public:		public:
std::string Name;		std::string Name;
Segment *ParentSegment = nullptr;		Segment *ParentSegment = nullptr;
uint64_t HeaderOffset;		uint64_t HeaderOffset;
uint64_t OriginalOffset = std::numeric_limits<uint64_t>::max();		uint64_t OriginalOffset = std::numeric_limits<uint64_t>::max();
uint32_t Index;		uint32_t Index;
bool HasSymbol = false;		bool HasSymbol = false;
▲ Show 20 Lines • Show All 99 Lines • ▼ Show 20 Lines	public:
OwnedDataSection(StringRef SecName, ArrayRef<uint8_t> Data)		OwnedDataSection(StringRef SecName, ArrayRef<uint8_t> Data)
: Data(std::begin(Data), std::end(Data)) {		: Data(std::begin(Data), std::end(Data)) {
Name = SecName.str();		Name = SecName.str();
Type = ELF::SHT_PROGBITS;		Type = ELF::SHT_PROGBITS;
Size = Data.size();		Size = Data.size();
OriginalOffset = std::numeric_limits<uint64_t>::max();		OriginalOffset = std::numeric_limits<uint64_t>::max();
}		}

		OwnedDataSection(const Twine &SecName, uint64_t SecAddr, uint64_t SecFlags,
		uint64_t SecOff) {
		Name = SecName.str();
		Type = ELF::SHT_PROGBITS;
		Addr = SecAddr;
		Flags = SecFlags;
		OriginalOffset = SecOff;
		}

		void appendHexData(StringRef HexData);
		jakehehrlich: Why do we need this to output a new format?
		evgeny777 (author): This function takes a portion of hexadecimal data from an IHexRecord and appends it in binary form to the internal vector of OwnedDataSection.
void accept(SectionVisitor &Sec) const override;		void accept(SectionVisitor &Sec) const override;
void accept(MutableSectionVisitor &Visitor) override;		void accept(MutableSectionVisitor &Visitor) override;
};		};

class CompressedSection : public SectionBase {		class CompressedSection : public SectionBase {
MAKE_SEC_WRITER_FRIEND		MAKE_SEC_WRITER_FRIEND

DebugCompressionType CompressionType;		DebugCompressionType CompressionType;
▲ Show 20 Lines • Show All 336 Lines • ▼ Show 20 Lines	public:
virtual std::unique_ptr<Object> create() const = 0;		virtual std::unique_ptr<Object> create() const = 0;
};		};

using object::Binary;		using object::Binary;
using object::ELFFile;		using object::ELFFile;
using object::ELFObjectFile;		using object::ELFObjectFile;
using object::OwningBinary;		using object::OwningBinary;

class BinaryELFBuilder {		class BasicELFBuilder {
		protected:
uint16_t EMachine;		uint16_t EMachine;
MemoryBuffer *MemBuf;
std::unique_ptr<Object> Obj;		std::unique_ptr<Object> Obj;

void initFileHeader();		void initFileHeader();
void initHeaderSegment();		void initHeaderSegment();
StringTableSection *addStrTab();		StringTableSection *addStrTab();
SymbolTableSection *addSymTab(StringTableSection *StrTab);		SymbolTableSection *addSymTab(StringTableSection *StrTab);
void addData(SymbolTableSection *SymTab);
void initSections();		void initSections();

public:		public:
		BasicELFBuilder(uint16_t EM)
		: EMachine(EM), Obj(llvm::make_unique<Object>()) {}
		};

		class BinaryELFBuilder : public BasicELFBuilder {
		MemoryBuffer *MemBuf;
		void addData(SymbolTableSection *SymTab);

		public:
BinaryELFBuilder(uint16_t EM, MemoryBuffer *MB)		BinaryELFBuilder(uint16_t EM, MemoryBuffer *MB)
: EMachine(EM), MemBuf(MB), Obj(llvm::make_unique<Object>()) {}		: BasicELFBuilder(EM), MemBuf(MB) {}

		std::unique_ptr<Object> build();
		};

		class IHexELFBuilder : public BasicELFBuilder {
		const std::vector<IHexRecord> &Records;

		void addDataSections();

		public:
		IHexELFBuilder(const std::vector<IHexRecord> &Records)
		: BasicELFBuilder(ELF::EM_386), Records(Records) {}

std::unique_ptr<Object> build();		std::unique_ptr<Object> build();
};		};

template <class ELFT> class ELFBuilder {		template <class ELFT> class ELFBuilder {
private:		private:
using Elf_Addr = typename ELFT::Addr;		using Elf_Addr = typename ELFT::Addr;
using Elf_Shdr = typename ELFT::Shdr;		using Elf_Shdr = typename ELFT::Shdr;
Show All 21 Lines	class BinaryReader : public Reader {
MemoryBuffer *MemBuf;		MemoryBuffer *MemBuf;

public:		public:
BinaryReader(const MachineInfo &MI, MemoryBuffer *MB)		BinaryReader(const MachineInfo &MI, MemoryBuffer *MB)
: MInfo(MI), MemBuf(MB) {}		: MInfo(MI), MemBuf(MB) {}
std::unique_ptr<Object> create() const override;		std::unique_ptr<Object> create() const override;
};		};

		class IHexReader : public Reader {
		jakehehrlich: We can probably split this into two changes to make things smaller, one for reading and one for writing, yeah?
		evgeny777 (author): I think so.
		MemoryBuffer *MemBuf;

		Error checkChars(StringRef S, size_t LineNo) const;
		Error checkRecord(const IHexRecord &R, size_t LineNo) const;
		Expected<std::vector<IHexRecord>> parse() const;
		template <typename... Ts>
		Error parseError(char const *Fmt, const Ts &... Vals) const {
		return createFileError(
		MemBuf->getBufferIdentifier(),
		createStringError(errc::invalid_argument, Fmt, Vals...));
		}

		public:
		IHexReader(MemoryBuffer *MB) : MemBuf(MB) {}

		std::unique_ptr<Object> create() const override;
		};

class ELFReader : public Reader {		class ELFReader : public Reader {
Binary *Bin;		Binary *Bin;

public:		public:
std::unique_ptr<Object> create() const override;		std::unique_ptr<Object> create() const override;
explicit ELFReader(Binary *B) : Bin(B) {}		explicit ELFReader(Binary *B) : Bin(B) {}
};		};


tools/llvm-objcopy/ELF/Object.cpp

#include "llvm/ADT/STLExtras.h"		#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/StringRef.h"		#include "llvm/ADT/StringRef.h"
#include "llvm/ADT/Twine.h"		#include "llvm/ADT/Twine.h"
#include "llvm/ADT/iterator_range.h"		#include "llvm/ADT/iterator_range.h"
#include "llvm/BinaryFormat/ELF.h"		#include "llvm/BinaryFormat/ELF.h"
#include "llvm/MC/MCTargetOptions.h"		#include "llvm/MC/MCTargetOptions.h"
#include "llvm/Object/ELFObjectFile.h"		#include "llvm/Object/ELFObjectFile.h"
#include "llvm/Support/Compression.h"		#include "llvm/Support/Compression.h"
#include "llvm/Support/Errc.h"		#include "llvm/Support/Endian.h"
#include "llvm/Support/ErrorHandling.h"		#include "llvm/Support/ErrorHandling.h"
#include "llvm/Support/FileOutputBuffer.h"		#include "llvm/Support/FileOutputBuffer.h"
#include "llvm/Support/Path.h"		#include "llvm/Support/Path.h"
#include <algorithm>		#include <algorithm>
#include <cstddef>		#include <cstddef>
#include <cstdint>		#include <cstdint>
#include <iterator>		#include <iterator>
#include <unordered_set>		#include <unordered_set>
▲ Show 20 Lines • Show All 115 Lines • ▼ Show 20 Lines

void SectionWriter::visit(const Section &Sec) {		void SectionWriter::visit(const Section &Sec) {
if (Sec.Type == SHT_NOBITS)		if (Sec.Type == SHT_NOBITS)
return;		return;
uint8_t *Buf = Out.getBufferStart() + Sec.Offset;		uint8_t *Buf = Out.getBufferStart() + Sec.Offset;
llvm::copy(Sec.Contents, Buf);		llvm::copy(Sec.Contents, Buf);
}		}

		static bool addressOverflows32bit(uint64_t Addr) {
		// Sign extended 32 bit addresses (e.g 0xFFFFFFFF80000000) are ok
		return Addr > UINT32_MAX && Addr + 0x80000000 > UINT32_MAX;
		rupprecht: Isn't relying on `Addr + 0x80000000` to loop around UB? Could this just directly check `Addr & 0xffffffff80000000 == 0xffffffff80000000` instead?
		evgeny777 (author): As far as I know unsigned overflows (unlike signed) are not UB. Addr is uint64_t.
		}
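
The two formulations discussed in this thread can be compared side by side. Unsigned arithmetic in C++ wraps modulo 2^64, so the sum-based check is well defined. The mask-based form needs explicit parentheses, because `==` binds more tightly than `&`. The function names below are hypothetical and the sketch is standalone, not taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Same logic as the patch: relies on well-defined unsigned wraparound.
bool overflowsBySum(uint64_t Addr) {
  return Addr > UINT32_MAX && Addr + 0x80000000ULL > UINT32_MAX;
}

// Mask-based variant along the lines of the reviewer's suggestion.
// An address fits if it is a plain 32-bit value or if its upper 33 bits
// are all ones (a sign-extended negative 32-bit value).
bool overflowsByMask(uint64_t Addr) {
  const uint64_t SignBits = 0xFFFFFFFF80000000ULL;
  bool Fits = Addr <= UINT32_MAX || (Addr & SignBits) == SignBits;
  return !Fits;
}
```

Both functions accept 0xFFFFFFFF and 0xFFFFFFFF80001000 and reject 0x100000000 and 0xF00000000.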

		template <class T> static T checkedGetHex(StringRef S) {
		jakehehrlich: Unless a check that generates an error always precedes this, I think it's best to return an error in this case, not assert fail. It would be better to roll this into an Expected function in that case, I think, anyway.
		evgeny777 (author): This function can't actually return an error, because the string has been previously validated (see `checkChars` for example). IMO, it's bad practice to implement runtime checks for one's own logical errors.
		T Value;
		bool Fail = S.getAsInteger(16, Value);
		assert(!Fail);
		rupprecht: `Fail` is unused in release builds, so you need to add a `(void)Fail;` to silence the error/warning in release builds.
		return Value;
		}
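
The alternative raised in the thread above is to report malformed input to the caller rather than assert. A minimal sketch of that shape follows, using `std::optional` as a stand-in for `llvm::Expected` so that it compiles without LLVM. `parseHex` is a hypothetical name, and the 8-digit limit is an assumption made to keep the result within `uint32_t`.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string_view>

// Parses up to 8 hexadecimal digits. Returns std::nullopt on empty input,
// on input that is too long, or on any non-hex character.
std::optional<uint32_t> parseHex(std::string_view S) {
  if (S.empty() || S.size() > 8)
    return std::nullopt;
  uint32_t Value = 0;
  for (char C : S) {
    uint32_t Digit;
    if (C >= '0' && C <= '9')
      Digit = static_cast<uint32_t>(C - '0');
    else if (C >= 'A' && C <= 'F')
      Digit = static_cast<uint32_t>(C - 'A' + 10);
    else if (C >= 'a' && C <= 'f')
      Digit = static_cast<uint32_t>(C - 'a' + 10);
    else
      return std::nullopt; // caller decides how to report the error
    Value = (Value << 4) | Digit;
  }
  return Value;
}
```

The trade-off is the one debated in the comments: a checked return value costs a branch at every call site, while an assertion assumes an earlier validation pass has already rejected bad input.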

		// Fills exactly Len bytes of buffer with hexadecimal characters
		// representing value 'X'
		template <class T, class Iterator>
		static Iterator utohexstr(T X, Iterator It, size_t Len) {
		// Fill range with '0'
		std::fill(It, It + Len, '0');

		for (long I = Len - 1; I >= 0; --I) {
		unsigned char Mod = static_cast<unsigned char>(X) & 15;
		*(It + I) = hexdigit(Mod, false);
		X >>= 4;
		}
		assert(X == 0);
		return It + Len;
		}

		uint8_t IHexRecord::getChecksum(StringRef S) {
		assert((S.size() & 1) == 0);
		uint8_t Checksum = 0;
		while (!S.empty()) {
		Checksum += checkedGetHex<uint8_t>(S.take_front(2));
		S = S.drop_front(2);
		}
		return -Checksum;
		}
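
`getChecksum` returns the two's complement of the byte sum, which means a complete record (checksum byte included) sums to zero modulo 256. A standalone sketch of both directions, working on decoded bytes rather than hex text; the function names are hypothetical:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Two's complement of the 8-bit sum of the record bytes
// (byte count, address, type, data).
uint8_t ihexChecksum(const std::vector<uint8_t> &Bytes) {
  uint8_t Sum = 0;
  for (uint8_t B : Bytes)
    Sum = static_cast<uint8_t>(Sum + B);
  return static_cast<uint8_t>(0u - Sum);
}

// A record that already includes its checksum byte is valid when all of
// its bytes sum to zero modulo 256.
bool recordIsValid(const std::vector<uint8_t> &BytesWithChecksum) {
  uint8_t Sum = 0;
  for (uint8_t B : BytesWithChecksum)
    Sum = static_cast<uint8_t>(Sum + B);
  return Sum == 0;
}
```

For the end-of-file record `:00000001FF` the bytes `00 00 00 01` sum to 1, so the checksum is 0xFF.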

		IHexLineData IHexRecord::getLine(uint8_t Type, uint16_t Addr,
		ArrayRef<uint8_t> Data) {
		IHexLineData Line(getLineLength(Data.size()));
		assert(Line.size());
		auto Iter = Line.begin();
		*Iter++ = ':';
		Iter = utohexstr(Data.size(), Iter, 2);
		Iter = utohexstr(Addr, Iter, 4);
		jakehehrlich: Maybe a raw_ostream would be useful here. We've generally avoided them, but this format seems to lend itself to streams, whereas my opinion was the opposite before. You wouldn't need utohexstr, since those formatting options are already supplied by the library, I believe.
		evgeny777 (author): This function was optimized to avoid any dynamic allocation (IHexLineData is actually a SmallVector), because each line contains only 16 bytes of section data, so it's possible to have a really huge number of lines. What are the benefits of using raw_ostream?
		Iter = utohexstr(Type, Iter, 2);
		for (uint8_t X : Data)
		Iter = utohexstr(X, Iter, 2);
		StringRef S(Line.data() + 1, std::distance(Line.begin() + 1, Iter));
		Iter = utohexstr(getChecksum(S), Iter, 2);
		*Iter++ = '\n';
		assert(Iter == Line.end());
		rupprecht: It looks like ihex uses `\r\n` line endings 😦 https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=blob;f=bfd/ihex.c;h=101e0a76155fc48f95312c08307739cf9c1ee5eb;hb=HEAD#l752 It seems weird for me to request this, but I think we should write `\r\n`, as this seems like a strange detail that people might need when consuming these files. I don't actually have any examples of this, however.
		evgeny777 (author): Yes, I've seen this too. The IHEX spec says nothing about line endings. Wikipedia says: "Programs that create HEX records typically use line termination characters that conform to the conventions of their operating systems." Probably the easiest thing to do is to stick to GNU behavior. I'll update the patch.
		return Line;
		}
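
To make the record layout and the `\r\n` point above concrete, here is a standalone sketch of building one line as `:LLAAAATT<data>CC` followed by `\r\n`. It is not the patch's getLine: `formatIHexLine` is a hypothetical name, and it uses std::string and std::vector for brevity, so it allocates dynamically, which the patch deliberately avoids.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not part of the patch): formats one Intel HEX record
// and terminates it with "\r\n", matching GNU objcopy output.
std::string formatIHexLine(uint8_t Type, uint16_t Addr,
                           const std::vector<uint8_t> &Data) {
  assert(Data.size() <= 255 && "record length is a single byte");
  static const char Digits[] = "0123456789ABCDEF";

  // Raw record bytes: length, address (big-endian), type, then the data.
  std::vector<uint8_t> Bytes = {static_cast<uint8_t>(Data.size()),
                                static_cast<uint8_t>(Addr >> 8),
                                static_cast<uint8_t>(Addr & 0xFF), Type};
  Bytes.insert(Bytes.end(), Data.begin(), Data.end());

  // Checksum: two's complement of the byte sum, appended as the last byte.
  uint8_t Sum = 0;
  for (uint8_t B : Bytes)
    Sum = static_cast<uint8_t>(Sum + B);
  Bytes.push_back(static_cast<uint8_t>(0u - Sum));

  std::string Line = ":";
  for (uint8_t B : Bytes) {
    Line += Digits[B >> 4];
    Line += Digits[B & 15];
  }
  Line += "\r\n";
  return Line;
}
```

For example, `formatIHexLine(0x01, 0, {})` yields the end-of-file record `:00000001FF` plus the line terminator.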

		static uint64_t sectionPhysicalAddr(const SectionBase *Sec) {
		Segment *Seg = Sec->ParentSegment;
		jakehehrlich: This is a very generic name with no comment. In general your comments have been awesome. I'd like to have an idea what this function does without reading the contents.
		if (Seg && Seg->Type != ELF::PT_LOAD)
		Seg = nullptr;
		return Seg ? Seg->PAddr + Sec->OriginalOffset - Seg->OriginalOffset
		: Sec->Addr;
		}
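
The behavior of sectionPhysicalAddr can be illustrated with simplified stand-in types. This is a sketch only: `SimpleSegment`, `SimpleSection`, and `physicalAddrOrVA` are hypothetical names, not the objcopy classes. It shows the rule in the function above: a section inside a PT_LOAD segment gets the segment's physical address plus the section's file-offset distance from the segment start, and any other section falls back to its own virtual address.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-ins for the objcopy Segment/SectionBase.
struct SimpleSegment {
  bool IsLoad;             // true when the segment type is PT_LOAD
  uint64_t PAddr;          // physical address of the segment
  uint64_t OriginalOffset; // file offset of the segment
};

struct SimpleSection {
  const SimpleSegment *Parent; // may be null
  uint64_t Addr;               // section virtual address
  uint64_t OriginalOffset;     // file offset of the section
};

uint64_t physicalAddrOrVA(const SimpleSection &Sec) {
  const SimpleSegment *Seg = Sec.Parent;
  if (Seg && Seg->IsLoad) {
    assert(Sec.OriginalOffset >= Seg->OriginalOffset);
    return Seg->PAddr + (Sec.OriginalOffset - Seg->OriginalOffset);
  }
  // No segment, or a non-loadable one: use the section's own address.
  return Sec.Addr;
}
```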

		void IHexSectionWriterBase::writeSection(const SectionBase *Sec,
		ArrayRef<uint8_t> Data) {
		assert(Data.size() == Sec->Size);
		const uint32_t ChunkSize = 16;
		uint32_t Addr = sectionPhysicalAddr(Sec) & 0xFFFFFFFFU;
		while (!Data.empty()) {
		uint64_t DataSize = std::min<uint64_t>(Data.size(), ChunkSize);
		if (Addr > SegmentAddr + BaseAddr + 0xFFFFU) {
		if (Addr > 0xFFFFFU) {
		// Write extended address record, zeroing segment address
		// if needed.
		if (SegmentAddr != 0)
		SegmentAddr = writeSegmentAddr(0U);
		BaseAddr = writeBaseAddr(Addr);
		} else {
		// We can still remain 16-bit
		SegmentAddr = writeSegmentAddr(Addr);
		}
		}
		uint64_t SegOffset = Addr - BaseAddr - SegmentAddr;
		assert(SegOffset <= 0xFFFFU);
		DataSize = std::min(DataSize, 0x10000U - SegOffset);
		writeData(0, SegOffset, Data.take_front(DataSize));
		Addr += DataSize;
		Data = Data.drop_front(DataSize);
		}
		}

		uint64_t IHexSectionWriterBase::writeSegmentAddr(uint64_t Addr) {
		assert(Addr <= 0xFFFFFU);
		uint8_t Data[] = {static_cast<uint8_t>((Addr & 0xF0000U) >> 12), 0};
		writeData(2, 0, Data);
		return Addr & 0xF0000U;
		}

		uint64_t IHexSectionWriterBase::writeBaseAddr(uint64_t Addr) {
		assert(Addr <= 0xFFFFFFFFU);
		uint64_t Base = Addr & 0xFFFF0000U;
		uint8_t Data[] = {static_cast<uint8_t>(Base >> 24),
		static_cast<uint8_t>((Base >> 16) & 0xFF)};
		writeData(4, 0, Data);
		return Base;
		}
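
The arithmetic in writeSegmentAddr and writeBaseAddr can be checked in isolation. The sketch below uses hypothetical names (`AddrRecord`, `splitAddress`) and only covers the case where an address record has to be emitted. Following the code above, addresses up to 0xFFFFF get a type 02 record whose 16-bit value is the segment base divided by 16, and larger addresses get a type 04 record carrying the upper 16 bits. The remainder is the 16-bit offset written in the data record.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type (not part of the patch).
struct AddrRecord {
  uint8_t Type;    // 2 = extended segment address, 4 = extended linear address
  uint8_t Data[2]; // big-endian record payload
  uint16_t Offset; // offset to use in the following data record
};

AddrRecord splitAddress(uint32_t Addr) {
  AddrRecord R;
  if (Addr <= 0xFFFFFu) {
    // 20-bit address: keep bits 16-19 in the segment, stored as base / 16.
    uint32_t Base = Addr & 0xF0000u;
    R.Type = 2;
    R.Data[0] = static_cast<uint8_t>(Base >> 12);
    R.Data[1] = 0;
    R.Offset = static_cast<uint16_t>(Addr - Base);
  } else {
    // 32-bit address: the upper 16 bits go into the linear base record.
    uint32_t Base = Addr & 0xFFFF0000u;
    R.Type = 4;
    R.Data[0] = static_cast<uint8_t>(Base >> 24);
    R.Data[1] = static_cast<uint8_t>((Base >> 16) & 0xFF);
    R.Offset = static_cast<uint16_t>(Addr - Base);
  }
  return R;
}
```

For 0x12345 this gives a type 02 record with payload 10 00 and offset 0x2345; for 0x12345678 it gives a type 04 record with payload 12 34 and offset 0x5678.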

		void IHexSectionWriterBase::writeData(uint8_t Type, uint16_t Addr,
		ArrayRef<uint8_t> Data) {
		Offset += IHexRecord::getLineLength(Data.size());
		}

		void IHexSectionWriterBase::visit(const Section &Sec) {
		writeSection(&Sec, Sec.Contents);
		}

		void IHexSectionWriterBase::visit(const OwnedDataSection &Sec) {
		writeSection(&Sec, Sec.Data);
		}

		void IHexSectionWriterBase::visit(const StringTableSection &Sec) {
		// Check that sizer has already done its work
		assert(Sec.Size == Sec.StrTabBuilder.getSize());
		// We are free to pass an invalid pointer to writeSection as long
		// as we don't actually write any data. The real writer class has
		// to override this method .
		writeSection(&Sec, {nullptr, Sec.Size});
		}

		void IHexSectionWriterBase::visit(const DynamicRelocationSection &Sec) {
		writeSection(&Sec, Sec.Contents);
		}

		void IHexSectionWriter::writeData(uint8_t Type, uint16_t Addr,
		ArrayRef<uint8_t> Data) {
		IHexLineData HexData = IHexRecord::getLine(Type, Addr, Data);
		memcpy(Out.getBufferStart() + Offset, HexData.data(), HexData.size());
		Offset += HexData.size();
		}

		void IHexSectionWriter::visit(const StringTableSection &Sec) {
		assert(Sec.Size == Sec.StrTabBuilder.getSize());
		std::vector<uint8_t> Data(Sec.Size);
		Sec.StrTabBuilder.write(Data.data());
		writeSection(&Sec, Data);
		}
		jakehehrlich: Does this ever make sense if there is no segment?
		evgeny777 (author): It's a helper function which returns the section VA if there is no segment. Any suggestions for a better name?

void Section::accept(SectionVisitor &Visitor) const { Visitor.visit(*this); }		void Section::accept(SectionVisitor &Visitor) const { Visitor.visit(*this); }

void Section::accept(MutableSectionVisitor &Visitor) { Visitor.visit(*this); }		void Section::accept(MutableSectionVisitor &Visitor) { Visitor.visit(*this); }

void SectionWriter::visit(const OwnedDataSection &Sec) {		void SectionWriter::visit(const OwnedDataSection &Sec) {
uint8_t *Buf = Out.getBufferStart() + Sec.Offset;		uint8_t *Buf = Out.getBufferStart() + Sec.Offset;
llvm::copy(Sec.Data, Buf);		llvm::copy(Sec.Data, Buf);
}		}

static const std::vector<uint8_t> ZlibGnuMagic = {'Z', 'L', 'I', 'B'};		static const std::vector<uint8_t> ZlibGnuMagic = {'Z', 'L', 'I', 'B'};

		jakehehrlich: Masking like this seems redundant. In general, the number of places where we're converting from 64 to 32 bits in an unchecked way is really shocking. I'd feel a lot more comfortable if we encapsulated these checks more and made them clearer.
		evgeny777 (author): There is a checkSections method which checks all sections to detect whether any of them has a 64-bit address. Bear in mind that the implementation also supports sign-extended 32-bit addresses, i.e. 0xFFFFFFFF80000000 is a valid address, but 0x100000000 is not.
static bool isDataGnuCompressed(ArrayRef<uint8_t> Data) {		static bool isDataGnuCompressed(ArrayRef<uint8_t> Data) {
return Data.size() > ZlibGnuMagic.size() &&		return Data.size() > ZlibGnuMagic.size() &&
std::equal(ZlibGnuMagic.begin(), ZlibGnuMagic.end(), Data.data());		std::equal(ZlibGnuMagic.begin(), ZlibGnuMagic.end(), Data.data());
		jakehehrlich: Maybe we could split support for extended records out into a separate patch and error out here for now?
		evgeny777 (author): I suggest splitting reader and writer. To me it looks like a more logical split than removing certain record types.
}		}

template <class ELFT>		template <class ELFT>
static std::tuple<uint64_t, uint64_t>		static std::tuple<uint64_t, uint64_t>
getDecompressedSizeAndAlignment(ArrayRef<uint8_t> Data) {		getDecompressedSizeAndAlignment(ArrayRef<uint8_t> Data) {
const bool IsGnuDebug = isDataGnuCompressed(Data);		const bool IsGnuDebug = isDataGnuCompressed(Data);
const uint64_t DecompressedSize =		const uint64_t DecompressedSize =
IsGnuDebug		IsGnuDebug
[42 lines not shown]
void OwnedDataSection::accept(SectionVisitor &Visitor) const {		void OwnedDataSection::accept(SectionVisitor &Visitor) const {
Visitor.visit(*this);		Visitor.visit(*this);
}		}

void OwnedDataSection::accept(MutableSectionVisitor &Visitor) {		void OwnedDataSection::accept(MutableSectionVisitor &Visitor) {
Visitor.visit(*this);		Visitor.visit(*this);
}		}

		void OwnedDataSection::appendHexData(StringRef HexData) {
		assert((HexData.size() & 1) == 0);
		while (!HexData.empty()) {
		Data.push_back(checkedGetHex<uint8_t>(HexData.take_front(2)));
		HexData = HexData.drop_front(2);
		}
		Size = Data.size();
		}

void BinarySectionWriter::visit(const CompressedSection &Sec) {		void BinarySectionWriter::visit(const CompressedSection &Sec) {
error("Cannot write compressed section '" + Sec.Name + "' ");		error("Cannot write compressed section '" + Sec.Name + "' ");
}		}

template <class ELFT>		template <class ELFT>
void ELFSectionWriter<ELFT>::visit(const CompressedSection &Sec) {		void ELFSectionWriter<ELFT>::visit(const CompressedSection &Sec) {
uint8_t *Buf = Out.getBufferStart();		uint8_t *Buf = Out.getBufferStart();
Buf += Sec.Offset;		Buf += Sec.Offset;
[577 lines not shown]
static bool compareSegmentsByPAddr(const Segment *A, const Segment *B) {		static bool compareSegmentsByPAddr(const Segment *A, const Segment *B) {
if (A->PAddr < B->PAddr)		if (A->PAddr < B->PAddr)
return true;		return true;
if (A->PAddr > B->PAddr)		if (A->PAddr > B->PAddr)
return false;		return false;
return A->Index < B->Index;		return A->Index < B->Index;
}		}

void BinaryELFBuilder::initFileHeader() {		void BasicELFBuilder::initFileHeader() {
Obj->Flags = 0x0;		Obj->Flags = 0x0;
Obj->Type = ET_REL;		Obj->Type = ET_REL;
Obj->OSABI = ELFOSABI_NONE;		Obj->OSABI = ELFOSABI_NONE;
Obj->ABIVersion = 0;		Obj->ABIVersion = 0;
Obj->Entry = 0x0;		Obj->Entry = 0x0;
Obj->Machine = EMachine;		Obj->Machine = EMachine;
Obj->Version = 1;		Obj->Version = 1;
}		}

void BinaryELFBuilder::initHeaderSegment() { Obj->ElfHdrSegment.Index = 0; }		void BasicELFBuilder::initHeaderSegment() { Obj->ElfHdrSegment.Index = 0; }

StringTableSection *BinaryELFBuilder::addStrTab() {		StringTableSection *BasicELFBuilder::addStrTab() {
auto &StrTab = Obj->addSection<StringTableSection>();		auto &StrTab = Obj->addSection<StringTableSection>();
StrTab.Name = ".strtab";		StrTab.Name = ".strtab";

Obj->SectionNames = &StrTab;		Obj->SectionNames = &StrTab;
return &StrTab;		return &StrTab;
}		}

SymbolTableSection *BinaryELFBuilder::addSymTab(StringTableSection *StrTab) {		SymbolTableSection *BasicELFBuilder::addSymTab(StringTableSection *StrTab) {
auto &SymTab = Obj->addSection<SymbolTableSection>();		auto &SymTab = Obj->addSection<SymbolTableSection>();

SymTab.Name = ".symtab";		SymTab.Name = ".symtab";
SymTab.Link = StrTab->Index;		SymTab.Link = StrTab->Index;

// The symbol table always needs a null symbol		// The symbol table always needs a null symbol
SymTab.addSymbol("", 0, 0, nullptr, 0, 0, 0, 0);		SymTab.addSymbol("", 0, 0, nullptr, 0, 0, 0, 0);

Obj->SymbolTable = &SymTab;		Obj->SymbolTable = &SymTab;
return &SymTab;		return &SymTab;
}		}

		void BasicELFBuilder::initSections() {
		for (auto &Section : Obj->sections())
		Section.initialize(Obj->sections());
		}

void BinaryELFBuilder::addData(SymbolTableSection *SymTab) {		void BinaryELFBuilder::addData(SymbolTableSection *SymTab) {
auto Data = ArrayRef<uint8_t>(		auto Data = ArrayRef<uint8_t>(
reinterpret_cast<const uint8_t *>(MemBuf->getBufferStart()),		reinterpret_cast<const uint8_t *>(MemBuf->getBufferStart()),
MemBuf->getBufferSize());		MemBuf->getBufferSize());
auto &DataSection = Obj->addSection<Section>(Data);		auto &DataSection = Obj->addSection<Section>(Data);
DataSection.Name = ".data";		DataSection.Name = ".data";
DataSection.Type = ELF::SHT_PROGBITS;		DataSection.Type = ELF::SHT_PROGBITS;
DataSection.Size = Data.size();		DataSection.Size = Data.size();
DataSection.Flags = ELF::SHF_ALLOC | ELF::SHF_WRITE;		DataSection.Flags = ELF::SHF_ALLOC | ELF::SHF_WRITE;

std::string SanitizedFilename = MemBuf->getBufferIdentifier().str();		std::string SanitizedFilename = MemBuf->getBufferIdentifier().str();
std::replace_if(std::begin(SanitizedFilename), std::end(SanitizedFilename),		std::replace_if(std::begin(SanitizedFilename), std::end(SanitizedFilename),
[](char C) { return !isalnum(C); }, '_');		[](char C) { return !isalnum(C); }, '_');
Twine Prefix = Twine("_binary_") + SanitizedFilename;		Twine Prefix = Twine("_binary_") + SanitizedFilename;

SymTab->addSymbol(Prefix + "_start", STB_GLOBAL, STT_NOTYPE, &DataSection,		SymTab->addSymbol(Prefix + "_start", STB_GLOBAL, STT_NOTYPE, &DataSection,
/*Value=*/0, STV_DEFAULT, 0, 0);		/*Value=*/0, STV_DEFAULT, 0, 0);
SymTab->addSymbol(Prefix + "_end", STB_GLOBAL, STT_NOTYPE, &DataSection,		SymTab->addSymbol(Prefix + "_end", STB_GLOBAL, STT_NOTYPE, &DataSection,
/*Value=*/DataSection.Size, STV_DEFAULT, 0, 0);		/*Value=*/DataSection.Size, STV_DEFAULT, 0, 0);
SymTab->addSymbol(Prefix + "_size", STB_GLOBAL, STT_NOTYPE, nullptr,		SymTab->addSymbol(Prefix + "_size", STB_GLOBAL, STT_NOTYPE, nullptr,
/*Value=*/DataSection.Size, STV_DEFAULT, SHN_ABS, 0);		/*Value=*/DataSection.Size, STV_DEFAULT, SHN_ABS, 0);
}		}

void BinaryELFBuilder::initSections() {
for (auto &Section : Obj->sections()) {
Section.initialize(Obj->sections());
}
}

std::unique_ptr<Object> BinaryELFBuilder::build() {		std::unique_ptr<Object> BinaryELFBuilder::build() {
initFileHeader();		initFileHeader();
initHeaderSegment();		initHeaderSegment();
StringTableSection *StrTab = addStrTab();		StringTableSection *StrTab = addStrTab();
SymbolTableSection *SymTab = addSymTab(StrTab);		SymbolTableSection *SymTab = addSymTab(StrTab);
initSections();		initSections();
addData(SymTab);		addData(SymTab);

return std::move(Obj);		return std::move(Obj);
}		}

		// Adds sections from IHEX data file. Data should have been
		// fully validated by this time.
		void IHexELFBuilder::addDataSections() {
		OwnedDataSection *Section = nullptr;
		uint64_t RecAddr, SegmentAddr = 0, BaseAddr = 0;
		rupprecht: RecAddr should be defined in the loop, where it is used.
		uint32_t SecNo = 1;

		for (const IHexRecord &R : Records) {
		switch (R.Type) {
		case IHexRecord::Data:
		// Ignore empty data records
		if (R.HexData.empty())
		continue;
		RecAddr = R.Addr + SegmentAddr + BaseAddr;
		if (!Section || Section->Addr + Section->Size != RecAddr)
		// OriginalOffset field is only used to sort section properly, so
		// instead of keeping track of real offset in IHEX file, we use
		// section number.
		Section = &Obj->addSection<OwnedDataSection>(
		".sec" + std::to_string(SecNo++), RecAddr,
		ELF::SHF_ALLOC | ELF::SHF_WRITE, SecNo);
		Section->appendHexData(R.HexData);
		break;
		case IHexRecord::EndOfFile:
		break;
		case IHexRecord::SegmentAddr:
		// 20-bit segment address.
		SegmentAddr = checkedGetHex<uint16_t>(R.HexData) << 4;
		break;
		case IHexRecord::StartAddr80x86:
		case IHexRecord::StartAddr:
		Obj->Entry = checkedGetHex<uint32_t>(R.HexData);
		assert(Obj->Entry <= 0xFFFFFU);
		break;
		case IHexRecord::ExtendedAddr:
		// 16-31 bits of linear base address
		BaseAddr = checkedGetHex<uint16_t>(R.HexData) << 16;
		break;
		default:
		llvm_unreachable("unknown record type");
		}
		}
		}
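
The address bookkeeping in addDataSections can be sketched separately from section creation. The names below (`AddrState`, `applySegmentRecord`, `applyExtendedRecord`, `recordAddress`) are hypothetical. As in the loop above, a type 02 record sets a base of value << 4, a type 04 record sets a base of value << 16, and a data record's load address is its 16-bit offset plus both bases. Widening to uint32_t before shifting is this sketch's own choice, so the shift is done in an unsigned 32-bit type; the patch shifts the uint16_t result directly.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical state carried between records (not part of the patch).
struct AddrState {
  uint32_t SegmentAddr = 0; // last type 02 value, already shifted left by 4
  uint32_t BaseAddr = 0;    // last type 04 value, already shifted left by 16
};

void applySegmentRecord(AddrState &S, uint16_t Value) {
  S.SegmentAddr = static_cast<uint32_t>(Value) << 4;
}

void applyExtendedRecord(AddrState &S, uint16_t Value) {
  S.BaseAddr = static_cast<uint32_t>(Value) << 16;
}

// Load address of a data record whose address field is Offset.
uint64_t recordAddress(const AddrState &S, uint16_t Offset) {
  return static_cast<uint64_t>(Offset) + S.SegmentAddr + S.BaseAddr;
}
```

A new section is started whenever this address is not equal to the end of the section currently being filled, which is how contiguous records are merged into one section.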

		std::unique_ptr<Object> IHexELFBuilder::build() {
		initFileHeader();
		initHeaderSegment();
		StringTableSection *StrTab = addStrTab();
		addSymTab(StrTab);
		initSections();
		addDataSections();

		return std::move(Obj);
		}

template <class ELFT> void ELFBuilder<ELFT>::setParentSegment(Segment &Child) {		template <class ELFT> void ELFBuilder<ELFT>::setParentSegment(Segment &Child) {
for (auto &Parent : Obj.segments()) {		for (auto &Parent : Obj.segments()) {
// Every segment will overlap with itself but we don't want a segment to		// Every segment will overlap with itself but we don't want a segment to
// be it's own parent so we avoid that situation.		// be it's own parent so we avoid that situation.
if (&Child != &Parent && segmentOverlapsSegment(Child, Parent)) {		if (&Child != &Parent && segmentOverlapsSegment(Child, Parent)) {
// We want a canonical "most parental" segment but this requires		// We want a canonical "most parental" segment but this requires
// inspecting the ParentSegment.		// inspecting the ParentSegment.
if (compareSegmentsByOffset(&Parent, &Child))		if (compareSegmentsByOffset(&Parent, &Child))
[333 lines not shown]
Writer::~Writer() {}		Writer::~Writer() {}

Reader::~Reader() {}		Reader::~Reader() {}

std::unique_ptr<Object> BinaryReader::create() const {		std::unique_ptr<Object> BinaryReader::create() const {
return BinaryELFBuilder(MInfo.EMachine, MemBuf).build();		return BinaryELFBuilder(MInfo.EMachine, MemBuf).build();
}		}

		Error IHexReader::checkChars(StringRef Line, size_t LineNo) const {
		if (Line[0] != ':')
		rupprecht: I think this will crash (or be UB) on an empty line?
		evgeny777 (author): Line is checked for minimal valid length earlier in the code. Though, it makes sense to assert here on `!Line.empty()`.
		return parseError("line %zu: missing ':' in the beginning of line.",
		LineNo);

		for (size_t Pos = 1; Pos < Line.size(); ++Pos)
		if (hexDigitValue(Line[Pos]) == -1U)
		return parseError("line %zu: invalid character at position %zu.", LineNo,
		Pos + 1);
		return Error::success();
		}
		rupprecht: WDYT about just using llvm::Regex here instead of this method? It may be easier to read code if it just attempts to match ":[0-9A-F]+". It would produce less precise error messages, though.
		evgeny777 (author): I think that a precise error message is more important. It might be hard in some cases to identify the wrong character, e.g. `I` instead of `1`, `O` instead of `0`, Cyrillic `А` instead of Latin `A`, and so on.

		Error IHexReader::checkRecord(const IHexRecord &R, size_t LineNo) const {
		switch (R.Type) {
		case IHexRecord::Data:
		if (R.HexData.size() == 0)
		return parseError(
		"line %zu: zero data length is not allowed for data records", LineNo);
		break;
		case IHexRecord::EndOfFile:
		break;
		rupprecht: I think there should be validation (somewhere) that there are no more records after this.
		evgeny777 (author): I think that an `EndOfFile` record should unconditionally cancel further processing. This allows moving the EOF record within a file to temporarily prevent some of the records from loading, which can be useful for testing. It also seems that GNU objcopy behaves this way.
		case IHexRecord::SegmentAddr:
		// 20-bit segment address. Data length must be 2 bytes
		// (4 bytes in hex)
		if (R.HexData.size() != 4)
		return parseError(
		"line %zu: segment address data should be 2 bytes in size", LineNo);
		break;
		case IHexRecord::StartAddr80x86:
		case IHexRecord::StartAddr:
		if (R.HexData.size() != 8)
		return parseError(
		"line %zu: start address data should be 4 bytes in size", LineNo);
		// According to Intel HEX specification '03' record
		// only specifies the code address within the 20-bit
		// segmented address space of the 8086/80186. This
		// means 12 high order bits should be zeroes.
		if (R.Type == IHexRecord::StartAddr80x86 &&
		R.HexData.take_front(3) != "000")
		return parseError("line %zu: start address exceeds 20 bit for 80x86",
		LineNo);
		break;
		case IHexRecord::ExtendedAddr:
		// 16-31 bits of linear base address
		if (R.HexData.size() != 4)
		return parseError(
		"line %zu: extended address data should be 2 bytes in size", LineNo);
		break;
		default:
		// Unknown record type
		return parseError("line %zu: unknown record type: %u", LineNo,
		static_cast<unsigned>(R.Type));
		}
		return Error::success();
		}

		Expected<std::vector<IHexRecord>> IHexReader::parse() const {
		SmallVector<StringRef, 16> Lines;
		std::vector<IHexRecord> Records;

		MemBuf->getBuffer().split(Lines, '\n');
		rupprecht: As a tiny optimization, call Records.reserve(Lines.size()) once you know how many lines there are.
		for (size_t LineNo = 1; LineNo <= Lines.size(); ++LineNo) {
		StringRef Line = Lines[LineNo - 1].trim();
		if (Line.empty())
		continue;

		// ':' + Length + Address + Type + Checksum with empty data ':LLAAAATTCC'
		if (Line.size() < 11)
		return parseError("line %zu: line is too short: %zu chars.", LineNo,
		Line.size());

		if (Error E = checkChars(Line, LineNo))
		return std::move(E);

		IHexRecord Rec;
		size_t DataLen = checkedGetHex<uint8_t>(Line.substr(1, 2));
		if (Line.size() != IHexRecord::getLength(DataLen))
		return parseError("line %zu: invalid line length %zu (should be %zu)",
		LineNo, Line.size(), IHexRecord::getLength(DataLen));

		Rec.Addr = checkedGetHex<uint16_t>(Line.substr(3, 4));
		Rec.Type = checkedGetHex<uint8_t>(Line.substr(7, 2));
		Rec.HexData = Line.substr(9, DataLen * 2);
		rupprecht: Once we've validated it, can we convert the whole hex string to separate ArrayRef<uint8/16_t> fields for each record, so we don't have to worry about it being valid everywhere (i.e. using checkedGetHex)?
		evgeny777 (author): It's possible, but I don't see a straightforward way to do this without dynamic memory allocation. As we're checking the string with `checkChars`, we shouldn't really hit a conversion error unless something really weird happens.

		if (IHexRecord::getChecksum(Line.drop_front(1)) != 0)
		return parseError("line %zu: incorrect checksum.", LineNo);
		if (Error E = checkRecord(Rec, LineNo))
		return std::move(E);
		rupprecht: How about creating a static method to convert a line into an Expected<IHexRecord>, so we can return an error if it's invalid instead of making the user call getChecksum/checkRecord?
		Records.push_back(Rec);
		}
		return std::move(Records);
		}
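
The reviewer's idea of a single "line in, record or error out" function can be sketched without LLVM types. Everything here is hypothetical (`ParsedRecord`, `parseIHexLine`, the error strings), and a bool plus an out-parameter stands in for Expected<IHexRecord>. It performs the same checks as parse() above: minimum length, leading ':', hex digits with a precise position, length consistency, and a checksum that sums to zero. It decodes into a std::vector, so it allocates, which is the trade-off the author raises.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical decoded record (not the patch's IHexRecord).
struct ParsedRecord {
  uint16_t Addr = 0;
  uint8_t Type = 0;
  std::vector<uint8_t> Data;
};

static int hexValue(char C) {
  if (C >= '0' && C <= '9') return C - '0';
  if (C >= 'A' && C <= 'F') return C - 'A' + 10;
  if (C >= 'a' && C <= 'f') return C - 'a' + 10;
  return -1;
}

// Returns true and fills Out on success; returns false and fills Err otherwise.
bool parseIHexLine(const std::string &Line, ParsedRecord &Out,
                   std::string &Err) {
  // ':' + length + address + type + checksum is 11 characters at minimum.
  if (Line.size() < 11) { Err = "line is too short"; return false; }
  if (Line[0] != ':') { Err = "missing ':' at the beginning of line"; return false; }
  if ((Line.size() - 1) % 2 != 0) { Err = "odd number of hex digits"; return false; }

  std::vector<uint8_t> Bytes;
  for (size_t Pos = 1; Pos < Line.size(); Pos += 2) {
    int Hi = hexValue(Line[Pos]), Lo = hexValue(Line[Pos + 1]);
    if (Hi < 0 || Lo < 0) {
      // Report a 1-based character position, as the patch does.
      Err = "invalid character at position " +
            std::to_string((Hi < 0 ? Pos : Pos + 1) + 1);
      return false;
    }
    Bytes.push_back(static_cast<uint8_t>(Hi * 16 + Lo));
  }

  // Layout: length, address high, address low, type, data..., checksum.
  if (Bytes.size() != Bytes[0] + 5u) { Err = "invalid line length"; return false; }
  uint8_t Sum = 0;
  for (uint8_t B : Bytes)
    Sum = static_cast<uint8_t>(Sum + B);
  if (Sum != 0) { Err = "incorrect checksum"; return false; }

  Out.Addr = static_cast<uint16_t>((Bytes[1] << 8) | Bytes[2]);
  Out.Type = Bytes[3];
  Out.Data.assign(Bytes.begin() + 4, Bytes.end() - 1);
  return true;
}
```

Per-type validation (the checkRecord switch) would follow the same pattern and is left out to keep the sketch short.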

		std::unique_ptr<Object> IHexReader::create() const {
		std::vector<IHexRecord> Records = unwrapOrError(parse());
		return IHexELFBuilder(Records).build();
		}

std::unique_ptr<Object> ELFReader::create() const {		std::unique_ptr<Object> ELFReader::create() const {
auto Obj = llvm::make_unique<Object>();		auto Obj = llvm::make_unique<Object>();
if (auto *O = dyn_cast<ELFObjectFile<ELF32LE>>(Bin)) {		if (auto *O = dyn_cast<ELFObjectFile<ELF32LE>>(Bin)) {
ELFBuilder<ELF32LE> Builder(O, Obj);		ELFBuilder<ELF32LE> Builder(O, Obj);
Builder.build();		Builder.build();
return Obj;		return Obj;
} else if (auto *O = dyn_cast<ELFObjectFile<ELF64LE>>(Bin)) {		} else if (auto *O = dyn_cast<ELFObjectFile<ELF64LE>>(Bin)) {
ELFBuilder<ELF64LE> Builder(O, Obj);		ELFBuilder<ELF64LE> Builder(O, Obj);
[519 lines not shown]	Error BinaryWriter::finalize() {
}		}

if (Error E = Buf.allocate(TotalSize))		if (Error E = Buf.allocate(TotalSize))
return E;		return E;
SecWriter = llvm::make_unique<BinarySectionWriter>(Buf);		SecWriter = llvm::make_unique<BinarySectionWriter>(Buf);
return Error::success();		return Error::success();
}		}

		bool IHexWriter::SectionCompare::operator()(const SectionBase *Lhs,
		const SectionBase *Rhs) const {
		return (sectionPhysicalAddr(Lhs) & 0xFFFFFFFFU) <
		(sectionPhysicalAddr(Rhs) & 0xFFFFFFFFU);
		}

		uint64_t IHexWriter::writeEntryPointRecord(uint8_t *Buf) {
		IHexLineData HexData;
		uint8_t Data[4] = {};
		if (Obj.Entry <= 0xFFFFFU) {
		Data[0] = ((Obj.Entry & 0xF0000U) >> 12) & 0xFF;
		support::endian::write(&Data[2], static_cast<uint16_t>(Obj.Entry),
		support::big);
		HexData = IHexRecord::getLine(IHexRecord::StartAddr80x86, 0, Data);
		} else {
		support::endian::write(Data, static_cast<uint32_t>(Obj.Entry),
		support::big);
		HexData = IHexRecord::getLine(IHexRecord::StartAddr, 0, Data);
		}
		memcpy(Buf, HexData.data(), HexData.size());
		return HexData.size();
		}

		uint64_t IHexWriter::writeEndOfFileRecord(uint8_t *Buf) {
		IHexLineData HexData = IHexRecord::getLine(IHexRecord::EndOfFile, 0, {});
		memcpy(Buf, HexData.data(), HexData.size());
		return HexData.size();
		}

		Error IHexWriter::write() {
		IHexSectionWriter Writer(Buf);
		// Write sections.
		for (const SectionBase *Sec : Sections)
		Sec->accept(Writer);

		uint64_t Offset = Writer.getBufferOffset();
		// Write entry point address.
		Offset += writeEntryPointRecord(Buf.getBufferStart() + Offset);
		// Write EOF.
		Offset += writeEndOfFileRecord(Buf.getBufferStart() + Offset);
		assert(Offset == TotalSize);
		return Buf.commit();
		}

		Error IHexWriter::checkSection(const SectionBase &Sec) {
		uint64_t Addr = sectionPhysicalAddr(&Sec);
		if (addressOverflows32bit(Addr) || addressOverflows32bit(Addr + Sec.Size - 1))
		return createStringError(
		errc::invalid_argument,
		"Section '%s' address range [%p, %p] is not 32 bit", Sec.Name.c_str(),
		Addr, Addr + Sec.Size - 1);
		return Error::success();
		}
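
The definition of addressOverflows32bit is not visible in this part of the diff, so the following is only a sketch of a check that matches the author's description: plain 32-bit addresses and sign-extended 32-bit addresses (such as 0xFFFFFFFF80000000) are accepted, while 0x100000000 is rejected. `overflows32` and `truncateTo32` are hypothetical names. The second function shows one way to keep the 64-to-32-bit narrowing in a single checked place, which is the concern raised in the review.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical check (not necessarily the patch's addressOverflows32bit).
// An address fits if it is at most UINT32_MAX, or if it is the sign
// extension of a 32-bit value with the top bit set.
bool overflows32(uint64_t Addr) {
  // For sign-extended values, adding 2^31 wraps around to a small number.
  return Addr > UINT32_MAX && Addr + 0x80000000ull > UINT32_MAX;
}

// Narrowing helper that makes the precondition explicit.
uint32_t truncateTo32(uint64_t Addr) {
  assert(!overflows32(Addr));
  return static_cast<uint32_t>(Addr);
}
```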

		Error IHexWriter::finalize() {
		bool UseSegments = false;
		auto ShouldWrite = [](const SectionBase &Sec) {
		return (Sec.Flags & ELF::SHF_ALLOC) && (Sec.Type != ELF::SHT_NOBITS);
		};
		auto IsInPtLoad = [](const SectionBase &Sec) {
		return Sec.ParentSegment && Sec.ParentSegment->Type == ELF::PT_LOAD;
		};

		// We can't write 64-bit addresses.
		if (addressOverflows32bit(Obj.Entry))
		return createStringError(errc::invalid_argument,
		"Entry point address %p overflows 32 bits.",
		Obj.Entry);

		// If any section we're to write has segment then we
		// switch to using physical addresses. Otherwise we
		// use section virtual address.
		for (auto &Section : Obj.sections())
		if (ShouldWrite(Section) && IsInPtLoad(Section)) {
		UseSegments = true;
		break;
		}

		for (auto &Section : Obj.sections())
		if (ShouldWrite(Section) && (!UseSegments || IsInPtLoad(Section))) {
		if (Error E = checkSection(Section))
		return E;
		Sections.insert(&Section);
		}

		IHexSectionWriterBase LengthCalc(Buf);
		for (const SectionBase *Sec : Sections)
		Sec->accept(LengthCalc);

		// We need space to write section records + StartAddress record +
		// EndOfFile record.
		TotalSize = LengthCalc.getBufferOffset() + IHexRecord::getLineLength(4) +
		IHexRecord::getLineLength(0);
		if (Error E = Buf.allocate(TotalSize))
		return E;
		return Error::success();
		}

template class ELFBuilder<ELF64LE>;		template class ELFBuilder<ELF64LE>;
template class ELFBuilder<ELF64BE>;		template class ELFBuilder<ELF64BE>;
template class ELFBuilder<ELF32LE>;		template class ELFBuilder<ELF32LE>;
template class ELFBuilder<ELF32BE>;		template class ELFBuilder<ELF32BE>;

template class ELFWriter<ELF64LE>;		template class ELFWriter<ELF64LE>;
template class ELFWriter<ELF64BE>;		template class ELFWriter<ELF64BE>;
template class ELFWriter<ELF32LE>;		template class ELFWriter<ELF32LE>;
template class ELFWriter<ELF32BE>;		template class ELFWriter<ELF32BE>;

} // end namespace elf		} // end namespace elf
} // end namespace objcopy		} // end namespace objcopy
} // end namespace llvm		} // end namespace llvm

tools/llvm-objcopy/llvm-objcopy.h

	[22 lines not shown]
	LLVM_ATTRIBUTE_NORETURN extern void reportError(StringRef File, Error E);			LLVM_ATTRIBUTE_NORETURN extern void reportError(StringRef File, Error E);
	LLVM_ATTRIBUTE_NORETURN extern void reportError(StringRef File,			LLVM_ATTRIBUTE_NORETURN extern void reportError(StringRef File,
	std::error_code EC);			std::error_code EC);

	// This is taken from llvm-readobj.			// This is taken from llvm-readobj.
	// [see here](llvm/tools/llvm-readobj/llvm-readobj.h:38)			// [see here](llvm/tools/llvm-readobj/llvm-readobj.h:38)
	template <class T> T unwrapOrError(Expected<T> EO) {			template <class T> T unwrapOrError(Expected<T> EO) {
	if (EO)			if (EO)
	return *EO;			return std::move(*EO);
	std::string Buf;			std::string Buf;
	raw_string_ostream OS(Buf);			raw_string_ostream OS(Buf);
	logAllUnhandledErrors(EO.takeError(), OS);			logAllUnhandledErrors(EO.takeError(), OS);
	OS.flush();			OS.flush();
	error(Buf);			error(Buf);
	}			}

	} // end namespace objcopy			} // end namespace objcopy
	} // end namespace llvm			} // end namespace llvm

	#endif // LLVM_TOOLS_OBJCOPY_OBJCOPY_H			#endif // LLVM_TOOLS_OBJCOPY_OBJCOPY_H

tools/llvm-objcopy/llvm-objcopy.cpp

[117 lines not shown]	for (const NewArchiveMember &Member : NewMembers) {
std::copy(Member.Buf->getBufferStart(), Member.Buf->getBufferEnd(),		std::copy(Member.Buf->getBufferStart(), Member.Buf->getBufferEnd(),
FB.getBufferStart());		FB.getBufferStart());
if (Error E = FB.commit())		if (Error E = FB.commit())
return E;		return E;
}		}
return Error::success();		return Error::success();
}		}

		/// The function executeObjcopyOnIHex does the dispatch based on the format
		/// of the output specified by the command line options.
		static Error executeObjcopyOnIHex(const CopyConfig &Config, MemoryBuffer &In,
		Buffer &Out) {
		// TODO: support output formats other than ELF.
		return elf::executeObjcopyOnIHex(Config, In, Out);
		}

/// The function executeObjcopyOnRawBinary does the dispatch based on the format		/// The function executeObjcopyOnRawBinary does the dispatch based on the format
/// of the output specified by the command line options.		/// of the output specified by the command line options.
static Error executeObjcopyOnRawBinary(const CopyConfig &Config,		static Error executeObjcopyOnRawBinary(const CopyConfig &Config,
MemoryBuffer &In, Buffer &Out) {		MemoryBuffer &In, Buffer &Out) {
// TODO: llvm-objcopy should parse CopyConfig.OutputFormat to recognize		// TODO: llvm-objcopy should parse CopyConfig.OutputFormat to recognize
// formats other than ELF / "binary" and invoke		// formats other than ELF / "binary" and invoke
// elf::executeObjcopyOnRawBinary, macho::executeObjcopyOnRawBinary or		// elf::executeObjcopyOnRawBinary, macho::executeObjcopyOnRawBinary or
// coff::executeObjcopyOnRawBinary accordingly.		// coff::executeObjcopyOnRawBinary accordingly.
[71 lines not shown]
/// of input (raw binary, archive or single object file) and takes care of the		/// of input (raw binary, archive or single object file) and takes care of the
/// format-agnostic modifications, i.e. preserving dates.		/// format-agnostic modifications, i.e. preserving dates.
static Error executeObjcopy(const CopyConfig &Config) {		static Error executeObjcopy(const CopyConfig &Config) {
sys::fs::file_status Stat;		sys::fs::file_status Stat;
if (Config.PreserveDates)		if (Config.PreserveDates)
if (auto EC = sys::fs::status(Config.InputFilename, Stat))		if (auto EC = sys::fs::status(Config.InputFilename, Stat))
return createFileError(Config.InputFilename, EC);		return createFileError(Config.InputFilename, EC);

if (Config.InputFormat == "binary") {		typedef Error (*ProcessRawFn)(const CopyConfig &, MemoryBuffer &, Buffer &);
		auto ProcessRaw = StringSwitch<ProcessRawFn>(Config.InputFormat)
		.Case("binary", executeObjcopyOnRawBinary)
		.Case("ihex", executeObjcopyOnIHex)
		.Default(nullptr);

		if (ProcessRaw) {
auto BufOrErr = MemoryBuffer::getFile(Config.InputFilename);		auto BufOrErr = MemoryBuffer::getFile(Config.InputFilename);
if (!BufOrErr)		if (!BufOrErr)
return createFileError(Config.InputFilename, BufOrErr.getError());		return createFileError(Config.InputFilename, BufOrErr.getError());
FileBuffer FB(Config.OutputFilename);		FileBuffer FB(Config.OutputFilename);
if (Error E = executeObjcopyOnRawBinary(Config, *BufOrErr->get(), FB))		if (Error E = ProcessRaw(Config, *BufOrErr->get(), FB))
return E;		return E;
} else {		} else {
Expected<OwningBinary<llvm::object::Binary>> BinaryOrErr =		Expected<OwningBinary<llvm::object::Binary>> BinaryOrErr =
createBinary(Config.InputFilename);		createBinary(Config.InputFilename);
if (!BinaryOrErr)		if (!BinaryOrErr)
return createFileError(Config.InputFilename, BinaryOrErr.takeError());		return createFileError(Config.InputFilename, BinaryOrErr.takeError());

if (Archive *Ar = dyn_cast<Archive>(BinaryOrErr.get().getBinary())) {		if (Archive *Ar = dyn_cast<Archive>(BinaryOrErr.get().getBinary())) {
[42 lines not shown]