This is an archive of the discontinued LLVM Phabricator instance.

Differential D36558

[llvm-objcopy] Add support for nested and overlapping segments
Closed, Public

Authored by jakehehrlich on Aug 9 2017, 4:24 PM.

Details

Reviewers

phosek
jhenderson

Commits

rGd246b0a2843d: Reland "[llvm-objcopy] Add support for nested and overlapping segments"
rG0a84b1ac8046: [llvm-objcopy] Add support for nested and overlapping segments
rL313682: Reland "[llvm-objcopy] Add support for nested and overlapping segments"
rL313656: [llvm-objcopy] Add support for nested and overlapping segments

Summary

This change adds support for nested and even overlapping segments. This means that PT_PHDR, PT_GNU_RELRO, PT_TLS, and PT_DYNAMIC can be supported properly.

Diff Detail

Repository: rL LLVM

Event Timeline

jakehehrlich created this revision. Aug 9 2017, 4:24 PM

jakehehrlich added a reviewer: jhenderson. Aug 29 2017, 2:30 PM

jakehehrlich added a subscriber: llvm-commits.

Lots of small nits, and one or two slightly larger issues to address here. I like the overall approach though.

I'm not all that happy with the level of test coverage available, but at the same time, I'm not sure that there's much that can be done without unit-testing a lot of this, or creating large numbers of virtually identical Lit tests, which seems a little excessive.

tools/llvm-objcopy/Object.cpp
170	Nit: full stop.
171	segmentOverlappsSegment -> segmentOverlapsSegment
174–175	I don't think this is quite right - in the case of two adjacent segments (i.e. where the end of one and the start of the next are the same), the second ends up being treated as a child of the first. I think the second clause should be strictly greater than.
206–207	Couple of typos in this comment: 1) though -> through; 2) match up nested segments up. -> match up segments.
210	"will be a child of itself" should probably be "will overlap with itself", to match the function name.
213	cononical -> canonical
214	Nit: full stop.
222	I think there's one case missing here - we don't want the case where two segments with identical offsets and file sizes to be parents of each other. In this situation, we also can't simply say something like "if (Parent->ParentSegment == Child) do nothing", you could end up with segment 1 being parent of segment 2 which is parent of segment 3 which is parent of segment 1. I think in that situation, the segment that is first in the table should be treated as the parent of the other two.
232	Nit: unnecessary blank line.
401	properlly -> properly
433	parrent -> parent
435	Full stop.
437	Could you rename PSeg to Parent, please?
438	Should this be: `Segment->Offset = PSeg->Offset + Segment->OriginalOffset - PSeg->OriginalOffset;` ?
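
The relation suggested in the last inline comment can be sketched as follows. This is a minimal, self-contained illustration and not the code from the patch: `Seg` and `layoutChild` are hypothetical names, and only the `Offset` and `OriginalOffset` fields mirror the ones discussed above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the Segment class under review; only the
// fields needed for the illustration are modelled.
struct Seg {
  uint64_t OriginalOffset; // offset in the input file
  uint64_t Offset;         // offset in the output file
};

// Places a child so that its distance from the parent's start is the same
// in the output file as it was in the input file.
inline void layoutChild(Seg &Child, const Seg &Parent) {
  // A child never starts before its parent in the input file.
  assert(Child.OriginalOffset >= Parent.OriginalOffset);
  Child.Offset = Parent.Offset + (Child.OriginalOffset - Parent.OriginalOffset);
}
```

Taking the difference of the original offsets matters because the parent's `Offset` may already have been updated by layout while the child's has not, so a difference of the current `Offset` values would not generally give the original distance.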

In D36558#856377, @jhenderson wrote:

Lots of small nits, and one or two slightly larger issues to address here. I like the overall approach though.

I'm not all that happy with the level of test coverage available, but at the same time, I'm not sure that there's much that can be done without unit-testing a lot of this, or creating large numbers of virtually identical Lit tests, which seems a little excessive.

I'm not that happy with test coverage either. You caught a mistake that wasn't just a little mess-up on my part (Segment->OriginalOffset - PSeg->OriginalOffset vs Segment->Offset - PSeg->Offset). I think it's pretty clear that the tests are not enough here. There are likely lots of branches in this code that aren't being checked either. I don't know of a way to add unit tests in LLVM.

Are you aware of some way of adding unit tests in llvm?

In D36558#857165, @jakehehrlich wrote:

Are you aware of some way of adding unit tests in llvm?

We have unittests in llvm/unittests. We generally prefer regression tests for various reasons, but the option exists if you need it.

Switched to using Index instead of FileSize for parent determination. FileSize was only being used to disambiguate the case where two segments had the same offset. Index does that better for reasons James pointed out.
Fixed bug where Offset was being used instead of OriginalOffset
Fixed lots of typos
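
One way to picture parent selection with Index as the tie-breaker is sketched below. This is an illustration under stated assumptions, not the patch itself: `Seg`, `betterParent`, `overlaps` and `assignParents` are hypothetical names, and the rule that only a segment ordering before the child may become its parent is an assumption added here to rule out the cycle described in the inline comment on line 222.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the Segment class under review.
struct Seg {
  uint32_t Index;          // position in the program header table
  uint64_t OriginalOffset; // offset in the input file
  uint64_t FileSize;
  Seg *ParentSegment;
};

// True if A should be preferred over B as a parent: lower original offset
// first, then lower index in the program header table.
inline bool betterParent(const Seg &A, const Seg &B) {
  if (A.OriginalOffset != B.OriginalOffset)
    return A.OriginalOffset < B.OriginalOffset;
  return A.Index < B.Index;
}

// True if Child starts inside Parent's half-open file range.
inline bool overlaps(const Seg &Child, const Seg &Parent) {
  assert(Parent.FileSize <= UINT64_MAX - Parent.OriginalOffset);
  return Parent.OriginalOffset <= Child.OriginalOffset &&
         Parent.OriginalOffset + Parent.FileSize > Child.OriginalOffset;
}

// O(n^2) pass that gives every segment its "most parental" overlapping
// segment, or leaves ParentSegment as nullptr if there is none.
inline void assignParents(std::vector<Seg> &Segs) {
  for (Seg &Child : Segs) {
    for (Seg &Parent : Segs) {
      if (&Child == &Parent || !overlaps(Child, Parent))
        continue;
      // Assumption: a parent must order before its child, so two identical
      // segments can never end up as parents of each other.
      if (!betterParent(Parent, Child))
        continue;
      if (!Child.ParentSegment || betterParent(Parent, *Child.ParentSegment))
        Child.ParentSegment = &Parent;
    }
  }
}
```

With three identical segments, this sketch leaves the first one in the table without a parent and makes it the parent of the other two, which is the outcome James asked for.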

In D36558#857165, @jakehehrlich wrote:

Are you aware of some way of adding unit tests in llvm?

As @efriedma mentioned, there are plenty of unit tests in llvm/unittests (I don't know if you're developing on Windows or Linux, but at least in the Visual Studio solution, most of the projects under the "Test" group are unit test projects). I have a little experience with adding unit tests to existing tests, but no experience with setting up a whole new set of unit tests for a new program, such as llvm-objcopy. I do believe that it would require moving all the code that we want to test this way into a separate support library. In general, my experience has been that writing unit tests has been easier to generate sufficient coverage than regression tests, but that was with an internal non-LLVM product, so the principle may not so readily apply here.

If the alternative is large numbers of pre-built binaries, I think we have to unit test this if we want sensible coverage. However, I also think this is a broader problem with llvm-objcopy, so it's not part of this change. I do think it's something that should be looked at sooner rather than later (from experience, the bigger a project grows, the harder it is to unit test, because certain design decisions that make unit testing hard get harder to undo). This all might be something to ask about on the mailing list (i.e. how to start unit testing the program), as I haven't discovered any documentation on it anywhere.

I'm happy with the code as is now, but I'm not sure how we can exercise the parts of the code where the bugs were from a Lit perspective until we add the ability to actually do things to the ELF to llvm-objcopy (e.g. strip sections). I'd therefore like the unit testing discussion/investigation to happen before I accept this review (although if you don't think it's worth it, I could probably be persuaded otherwise).

Added condition to check for PT_GNU_STACK in layout because the new layout algorithm threw an error in that case
Added 3 tests to try to get as much coverage as possible of the O(n^2) loop over program headers. I tried to make sure every branch was covered.

Turning this into a library isn't really an option right now. Petr Hosek talked with Rui about the possibility of that in future. Namely LLD's synthetic sections and my sections look a lot alike so I think deduplicating those and merging them into a library makes a lot of sense at some point in the future but not right now. If this stuff gets turned into a library I think that's the form it will take.

I think we should probably settle for adding a few tests (possibly more, please recommend more if you can think of any) here and then wait to add some stripping capability next. After stripping is implemented we can write better tests. I plan on adding some stripping capabilities right after dynamic stuff has been submitted. I'll start writing code for it soon and put it up for review as soon as possible.

What kind of stripping would be most useful for testing at first? I think removing a section by name and then removing a symbol by name would be best. That way we can tailor tests to see what happens when specific sections/symbols are removed. After that we can implement more useful types of stripping that remove multiple sections and symbols at once.

So my proposal is the following:

Compromise between adding a bunch of very similar .test files and unit testing by adding a few more .test files (please recommend more) and waiting for better tests to come up after we have section and symbol removal
Write section removal by name next.
Write a collection of tests using section removal to get more test coverage.
Write symbol removal by name after that.
Write a collection of tests using symbol removal to get better coverage.
Add more advanced kinds of stripping like --strip-all and --strip-debug

In D36558#862704, @jakehehrlich wrote:

I think we should probably settle for adding a few tests (possibly more, please recommend more if you can think of any) here and then wait to add some stripping capability next. After stripping is implemented we can write better tests. I plan on adding some stripping capabilities right after dynamic stuff has been submitted. I'll start writing code for it soon and put it up for review as soon as possible.

What kind of stripping would be most useful for testing at first? I think removing a section by name and then removing a symbol by name would be best. That way we can tailor tests to see what happens when specific sections/symbols are removed. After that we can implement more useful types of stripping that remove multiple sections and symbols at once.

So my proposal is the following:

Compromise between adding a bunch of very similar .test files and unit testing by adding a few more .test files (please recommend more) and waiting for better tests to come up after we have section and symbol removal

Write section removal by name next.

Write a collection of tests using section removal to get more test coverage.

Write symbol removal by name after that.

Write a collection of tests using symbol removal to get better coverage.

Add more advanced kinds of stripping like --strip-all and --strip-debug

I think what you've proposed here sounds reasonable. Section stripping is something I've made use of in the past, but not symbol stripping, so I think section stripping should be first, definitely. It would also allow us to test this area (nested segments) more directly as well.

As for suggested tests, I think you've done a pretty good job of covering the cases I can think of. One I didn't see, so might be missing, was two adjacent segments (i.e. the end of one is the same value as the start of the next), perhaps with different alignments, so that the first can move but not the second (or they can both move, but are not tied together, so one moves further than the other). That would test the second half of "segmentOverlapsSegment" I think. It might just be part of one of the other tests. As there are several tests now that test similar, but slightly different cases, I think you should add a comment in each test to describe what exactly is being tested (e.g. adjacent segments, segments that are identical, chains of segments that partially overlap), essentially describing why that particular case is interesting. For example, in the test I've suggested, the comment would read something like "Check the case where two non-empty segments are adjacent in the file, i.e. the end of one is the start of the next. In this case, the two should move independently of each other."
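
The adjacent-segment case can be made concrete with a small sketch. It assumes a half-open file range for each segment, which is what the strictly-greater-than comparison amounts to; `Range` and `startsInside` are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical file range of a segment: [Offset, Offset + FileSize).
struct Range {
  uint64_t Offset;
  uint64_t FileSize;
};

// True if Child starts inside Parent's half-open range. A segment that
// starts exactly where Parent ends is adjacent, not nested.
inline bool startsInside(const Range &Child, const Range &Parent) {
  assert(Parent.FileSize <= UINT64_MAX - Parent.Offset);
  return Parent.Offset <= Child.Offset &&
         Parent.Offset + Parent.FileSize > Child.Offset;
}
```

For two 0x1000-byte segments at 0x1000 and 0x2000, the second does not start inside the first, so under this predicate the two would be laid out independently.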

test/tools/llvm-objcopy/same-segment.test
1	I'd call this test identical-segment.test (same-segment implies that two different things refer to one segment).
test/tools/llvm-objcopy/tripple-overlap.test
1	tripple-overlap.test -> triple-overlap.test
tools/llvm-objcopy/Object.cpp
432	Why is PT_GNU_STACK special?

Added test and fixed test names

tools/llvm-objcopy/Object.cpp
432	For the purposes of skipping it here, it is special because it has no alignment and its offset must be zero. In general, segments should have non-zero alignment. It's more generally special, however: all of its fields are zero (including address, offset, and size) except for the flags. It isn't nested in another loadable segment, but it also doesn't cover any section. I skip it here because it needs to maintain a zero offset and it causes align to fail (when I originally wrote this diff, the new layout algorithm wasn't a part of it, so I didn't have this issue).

One of the renamed tests needs another tweak to the name, and another of the tests still has a typo in its name.

test/tools/llvm-objcopy/identical-segment.test
1 ↗	(On Diff #114430)	Sorry, should be identical-segments.test (plural).
2 ↗	(On Diff #114430)	based in -> based on.
test/tools/llvm-objcopy/tripple-overlap.test
4	the case which -> the case where
tools/llvm-objcopy/Object.cpp
432	LLD can emit other segments with zero in every field. I have a linker script, for example, that requests a PT_INTERP segment, but LLD does not assign anything to it, so the segment is empty. Every field apart from the type is zero. Empty segments don't need an alignment or address, so I think the check should be for any empty segments at offset zero.

Fixed typos and test names
Made recommended change to skipping certain segments. I used MemSize to check that the size is zero because FileSize could be zero while MemSize would not be. If a segment covers only SHT_NOBITS read-only sections then, I believe, its offset is technically free to be whatever and its FileSize would have to be zero. In practice it will be something more sensible, but it seemed right to cover this case.

tripple-overlap.test still needs renaming!

I think it may be possible to have segments with zero MemSize but non-zero FileSize. I think it's unlikely to happen that they appear at offset zero, but I wouldn't want to guarantee it. As such, these should probably be considered in the algorithm as well, rather than being skipped. If they overlap another later segment, then the latter must stay relative to the former. We should have tests for these two cases as well, possibly independent of the completely empty segment.

Sorry about not getting that changed faster. That's just not ok. I apologize.

As for the FileSize > MemSize issue. The standard states "The file size may not be larger than the memory size" here: http://www.sco.com/developers/gabi/2012-12-31/ch5.pheader.html

I haven't seen anything producing any such binaries. Also LLD considers it a bug when it happens so that's at least one linker that promises to never do it. I think I'd rather throw an error on read in if this happens. Would that be ok?

In case you're cool with just throwing an error in the case that p_filesz > p_memsz, I've gone ahead and added that as well. Adding a test is tricky for two reasons: 1) it's tricky to make a binary to trigger this case and 2) I would have to upload a binary to do so.

In D36558#868749, @jakehehrlich wrote:

Sorry about not getting that changed faster. That's just not ok. I apologize.

As for the FileSize > MemSize issue. The standard states "The file size may not be larger than the memory size" here: http://www.sco.com/developers/gabi/2012-12-31/ch5.pheader.html

I haven't seen anything producing any such binaries. Also LLD considers it a bug when it happens so that's at least one linker that promises to never do it. I think I'd rather throw an error on read in if this happens. Would that be ok?

No problem, it happens. Sorry for getting back to you slowly - I've been on annual leave for a few days.

That standard quote only refers to PT_LOAD segments. Other segment types are not constrained by this, so the new error will spuriously catch such cases. We cannot rely on the linker preventing this case for other segment types, because it is entirely reasonable for specific targets to have segments that are not loaded on the target, so don't need address or memory size allocated - see the "NOLOAD" linker script directive, for example. I could also imagine a target which does not assign addresses for the ELF header/program header table, but theoretically they could be assigned to a (non-loaded) segment.

Ultimately though, I think this is all irrelevant - the actual problem here is the use of alignTo with zero alignment, but according to the ELF spec, a value of 0 or 1 means no alignment, so if we see an align of zero, we should align as if aligning to 1, I think.
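
A minimal sketch of that idea, assuming a plain round-up helper rather than LLVM's own `alignTo` (which, as far as I know, asserts on a zero alignment); `alignToOrNoAlign` is a hypothetical name:

```cpp
#include <cassert>
#include <cstdint>

// Rounds Value up to the next multiple of Align. An alignment of 0 is
// treated like 1, i.e. no alignment constraint.
inline uint64_t alignToOrNoAlign(uint64_t Value, uint64_t Align) {
  if (Align == 0)
    Align = 1;
  // Guard against wrap-around in the round-up arithmetic below.
  assert(Value <= UINT64_MAX - (Align - 1));
  return (Value + Align - 1) / Align * Align;
}
```

With this, a segment such as PT_GNU_STACK with an alignment of zero keeps whatever offset it is given instead of tripping the alignment code.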

Right, the alignment trick makes perfect sense. Also I removed the error because, you're right, it only makes sense for PT_LOAD. I still think there might be an issue with some special nearly "all zero" sections but it's kind of hard to figure out how exactly they should be handled. Anything that has an offset of zero and a file size of zero will currently be handled correctly so I can't think of such a case that isn't handled correctly.

LGTM.

This revision is now accepted and ready to land. Sep 19 2017, 1:25 AM

Closed by commit rL313656: [llvm-objcopy] Add support for nested and overlapping segments (authored by jakehehrlich). Sep 19 2017, 11:15 AM

This revision was automatically updated to reflect the committed changes.

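Revision Contents

Path	Size
test/tools/llvm-objcopy/overlap-chain.test	112 lines
test/tools/llvm-objcopy/pt-phdr.test	68 lines
test/tools/llvm-objcopy/same-segment.test	78 lines
test/tools/llvm-objcopy/tripple-overlap.test	118 lines
tools/llvm-objcopy/Object.h	1 line
tools/llvm-objcopy/Object.cpp	74 lines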

Diff 114075

test/tools/llvm-objcopy/overlap-chain.test

This file was added.

				# RUN: yaml2obj %s -o %t
				# RUN: llvm-objcopy %t %t2
				# RUN: llvm-readobj -program-headers %t2 | FileCheck %s

				!ELF
				FileHeader:
				  Class: ELFCLASS64
				  Data: ELFDATA2LSB
				  Type: ET_EXEC
				  Machine: EM_X86_64
				Sections:
				  - Name: .text
				    Type: SHT_PROGBITS
				    Flags: [ SHF_ALLOC, SHF_EXECINSTR ]
				    AddressAlign: 0x1000
				    Size: 4096
				  - Name: .text2
				    Type: SHT_PROGBITS
				    Flags: [ SHF_ALLOC, SHF_EXECINSTR ]
				    AddressAlign: 0x1000
				    Size: 4096
				  - Name: .text3
				    Type: SHT_PROGBITS
				    Flags: [ SHF_ALLOC, SHF_EXECINSTR ]
				    AddressAlign: 0x1000
				    Size: 4096
				  - Name: .text4
				    Type: SHT_PROGBITS
				    Flags: [ SHF_ALLOC, SHF_EXECINSTR ]
				    AddressAlign: 0x1000
				    Size: 4096
				  - Name: .text5
				    Type: SHT_PROGBITS
				    Flags: [ SHF_ALLOC, SHF_EXECINSTR ]
				    AddressAlign: 0x1000
				    Size: 4096
				ProgramHeaders:
				  - Type: PT_LOAD
				    Flags: [ PF_X, PF_R ]
				    Sections:
				      - Section: .text
				      - Section: .text2
				  - Type: PT_LOAD
				    Flags: [ PF_X, PF_R ]
				    Sections:
				      - Section: .text4
				      - Section: .text5
				  - Type: PT_LOAD
				    Flags: [ PF_X, PF_R ]
				    Sections:
				      - Section: .text3
				      - Section: .text4
				  - Type: PT_LOAD
				    Flags: [ PF_X, PF_R ]
				    Sections:
				      - Section: .text2
				      - Section: .text3

				#CHECK: ProgramHeaders [
				#CHECK-NEXT: ProgramHeader {
				#CHECK-NEXT: Type: PT_LOAD (0x1)
				#CHECK-NEXT: Offset: 0x1000
				#CHECK-NEXT: VirtualAddress: 0x0
				#CHECK-NEXT: PhysicalAddress: 0x0
				#CHECK-NEXT: FileSize: 8192
				#CHECK-NEXT: MemSize: 8192
				#CHECK-NEXT: Flags [ (0x5)
				#CHECK-NEXT: PF_R (0x4)
				#CHECK-NEXT: PF_X (0x1)
				#CHECK-NEXT: ]
				#CHECK-NEXT: Alignment: 4096
				#CHECK-NEXT: }
				#CHECK-NEXT: ProgramHeader {
				#CHECK-NEXT: Type: PT_LOAD (0x1)
				#CHECK-NEXT: Offset: 0x4000
				#CHECK-NEXT: VirtualAddress: 0x0
				#CHECK-NEXT: PhysicalAddress: 0x0
				#CHECK-NEXT: FileSize: 8192
				#CHECK-NEXT: MemSize: 8192
				#CHECK-NEXT: Flags [ (0x5)
				#CHECK-NEXT: PF_R (0x4)
				#CHECK-NEXT: PF_X (0x1)
				#CHECK-NEXT: ]
				#CHECK-NEXT: Alignment: 4096
				#CHECK-NEXT: }
				#CHECK-NEXT: ProgramHeader {
				#CHECK-NEXT: Type: PT_LOAD (0x1)
				#CHECK-NEXT: Offset: 0x3000
				#CHECK-NEXT: VirtualAddress: 0x0
				#CHECK-NEXT: PhysicalAddress: 0x0
				#CHECK-NEXT: FileSize: 8192
				#CHECK-NEXT: MemSize: 8192
				#CHECK-NEXT: Flags [ (0x5)
				#CHECK-NEXT: PF_R (0x4)
				#CHECK-NEXT: PF_X (0x1)
				#CHECK-NEXT: ]
				#CHECK-NEXT: Alignment: 4096
				#CHECK-NEXT: }
				#CHECK-NEXT: ProgramHeader {
				#CHECK-NEXT: Type: PT_LOAD (0x1)
				#CHECK-NEXT: Offset: 0x2000
				#CHECK-NEXT: VirtualAddress: 0x0
				#CHECK-NEXT: PhysicalAddress: 0x0
				#CHECK-NEXT: FileSize: 8192
				#CHECK-NEXT: MemSize: 8192
				#CHECK-NEXT: Flags [ (0x5)
				#CHECK-NEXT: PF_R (0x4)
				#CHECK-NEXT: PF_X (0x1)
				#CHECK-NEXT: ]
				#CHECK-NEXT: Alignment: 4096
				#CHECK-NEXT: }
				#CHECK-NEXT:]

test/tools/llvm-objcopy/pt-phdr.test

This file was added.

				# RUN: llvm-objcopy %p/Inputs/pt-phdr.elf %t
				# RUN: llvm-readobj -program-headers %t | FileCheck %s

				#CHECK: ProgramHeaders [
				#CHECK-NEXT: ProgramHeader {
				#CHECK-NEXT: Type: PT_PHDR
				#CHECK-NEXT: Offset: 0x40
				#CHECK-NEXT: VirtualAddress: 0x200040
				#CHECK-NEXT: PhysicalAddress: 0x200040
				#CHECK-NEXT: FileSize: 280
				#CHECK-NEXT: MemSize: 280
				#CHECK-NEXT: Flags [
				#CHECK-NEXT: PF_R
				#CHECK-NEXT: ]
				#CHECK-NEXT: Alignment: 8
				#CHECK-NEXT: }
				#CHECK-NEXT: ProgramHeader {
				#CHECK-NEXT: Type: PT_LOAD
				#CHECK-NEXT: Offset: 0x0
				#CHECK-NEXT: VirtualAddress: 0x200000
				#CHECK-NEXT: PhysicalAddress: 0x200000
				#CHECK-NEXT: FileSize: 344
				#CHECK-NEXT: MemSize: 344
				#CHECK-NEXT: Flags [
				#CHECK-NEXT: PF_R
				#CHECK-NEXT: ]
				#CHECK-NEXT: Alignment: 4096
				#CHECK-NEXT: }
				#CHECK-NEXT: ProgramHeader {
				#CHECK-NEXT: Type: PT_LOAD
				#CHECK-NEXT: Offset: 0x1000
				#CHECK-NEXT: VirtualAddress: 0x201000
				#CHECK-NEXT: PhysicalAddress: 0x201000
				#CHECK-NEXT: FileSize: 1
				#CHECK-NEXT: MemSize: 1
				#CHECK-NEXT: Flags [
				#CHECK-NEXT: PF_R
				#CHECK-NEXT: PF_X
				#CHECK-NEXT: ]
				#CHECK-NEXT: Alignment: 4096
				#CHECK-NEXT: }
				#CHECK-NEXT: ProgramHeader {
				#CHECK-NEXT: Type: PT_LOAD
				#CHECK-NEXT: Offset: 0x2000
				#CHECK-NEXT: VirtualAddress: 0x202000
				#CHECK-NEXT: PhysicalAddress: 0x202000
				#CHECK-NEXT: FileSize: 14
				#CHECK-NEXT: MemSize: 14
				#CHECK-NEXT: Flags [
				#CHECK-NEXT: PF_R
				#CHECK-NEXT: PF_W
				#CHECK-NEXT: ]
				#CHECK-NEXT: Alignment: 4096
				#CHECK-NEXT: }
				#CHECK-NEXT: ProgramHeader {
				#CHECK-NEXT: Type: PT_GNU_STACK (0x6474E551)
				#CHECK-NEXT: Offset: 0x0
				#CHECK-NEXT: VirtualAddress: 0x0
				#CHECK-NEXT: PhysicalAddress: 0x0
				#CHECK-NEXT: FileSize: 0
				#CHECK-NEXT: MemSize: 0
				#CHECK-NEXT: Flags [
				#CHECK-NEXT: PF_R
				#CHECK-NEXT: PF_W
				#CHECK-NEXT: ]
				#CHECK-NEXT: Alignment: 0
				#CHECK-NEXT: }
				#CHECK-NEXT:]

test/tools/llvm-objcopy/same-segment.test

This file was added.

				# RUN: yaml2obj %s -o %t
				# RUN: llvm-objcopy %t %t2
				# RUN: llvm-readobj --program-headers %t2 | FileCheck %s

				!ELF
				FileHeader:
				  Class: ELFCLASS64
				  Data: ELFDATA2LSB
				  Type: ET_EXEC
				  Machine: EM_X86_64
				Sections:
				  - Name: .text
				    Type: SHT_PROGBITS
				    Flags: [ SHF_ALLOC, SHF_EXECINSTR ]
				    AddressAlign: 0x1000
				    Size: 4096
				  - Name: .text2
				    Type: SHT_PROGBITS
				    Flags: [ SHF_ALLOC, SHF_EXECINSTR ]
				    AddressAlign: 0x1000
				    Size: 4096
				ProgramHeaders:
				  - Type: PT_LOAD
				    Flags: [ PF_X, PF_R ]
				    Sections:
				      - Section: .text2
				  - Type: PT_LOAD
				    Flags: [ PF_X, PF_R ]
				    Sections:
				      - Section: .text
				      - Section: .text2
				  - Type: PT_LOAD
				    Flags: [ PF_X, PF_R ]
				    Sections:
				      - Section: .text
				      - Section: .text2

				#CHECK: ProgramHeaders [
				#CHECK-NEXT: ProgramHeader {
				#CHECK-NEXT: Type: PT_LOAD (0x1)
				#CHECK-NEXT: Offset: 0x2000
				#CHECK-NEXT: VirtualAddress: 0x0
				#CHECK-NEXT: PhysicalAddress: 0x0
				#CHECK-NEXT: FileSize: 4096
				#CHECK-NEXT: MemSize: 4096
				#CHECK-NEXT: Flags [ (0x5)
				#CHECK-NEXT: PF_R (0x4)
				#CHECK-NEXT: PF_X (0x1)
				#CHECK-NEXT: ]
				#CHECK-NEXT: Alignment: 4096
				#CHECK-NEXT: }
				#CHECK-NEXT: ProgramHeader {
				#CHECK-NEXT: Type: PT_LOAD (0x1)
				#CHECK-NEXT: Offset: 0x1000
				#CHECK-NEXT: VirtualAddress: 0x0
				#CHECK-NEXT: PhysicalAddress: 0x0
				#CHECK-NEXT: FileSize: 8192
				#CHECK-NEXT: MemSize: 8192
				#CHECK-NEXT: Flags [ (0x5)
				#CHECK-NEXT: PF_R (0x4)
				#CHECK-NEXT: PF_X (0x1)
				#CHECK-NEXT: ]
				#CHECK-NEXT: Alignment: 4096
				#CHECK-NEXT: }
				#CHECK-NEXT: ProgramHeader {
				#CHECK-NEXT: Type: PT_LOAD (0x1)
				#CHECK-NEXT: Offset: 0x1000
				#CHECK-NEXT: VirtualAddress: 0x0
				#CHECK-NEXT: PhysicalAddress: 0x0
				#CHECK-NEXT: FileSize: 8192
				#CHECK-NEXT: MemSize: 8192
				#CHECK-NEXT: Flags [ (0x5)
				#CHECK-NEXT: PF_R (0x4)
				#CHECK-NEXT: PF_X (0x1)
				#CHECK-NEXT: ]
				#CHECK-NEXT: Alignment: 4096
				#CHECK-NEXT: }
				#CHECK-NEXT:]

test/tools/llvm-objcopy/tripple-overlap.test

This file was added.

				# RUN: yaml2obj %s -o %t
				# RUN: llvm-objcopy %t %t2
				# RUN: llvm-readobj --program-headers %t2 | FileCheck %s

				!ELF
				FileHeader:
				  Class: ELFCLASS64
				  Data: ELFDATA2LSB
				  Type: ET_EXEC
				  Machine: EM_X86_64
				Sections:
				  - Name: .text
				    Type: SHT_PROGBITS
				    Flags: [ SHF_ALLOC, SHF_EXECINSTR ]
				    AddressAlign: 0x1000
				    Size: 4096
				  - Name: .text2
				    Type: SHT_PROGBITS
				    Flags: [ SHF_ALLOC, SHF_EXECINSTR ]
				    AddressAlign: 0x1000
				    Size: 4096
				  - Name: .text3
				    Type: SHT_PROGBITS
				    Flags: [ SHF_ALLOC, SHF_EXECINSTR ]
				    AddressAlign: 0x1000
				    Size: 4096
				  - Name: .text4
				    Type: SHT_PROGBITS
				    Flags: [ SHF_ALLOC, SHF_EXECINSTR ]
				    AddressAlign: 0x1000
				    Size: 4096
				  - Name: .text5
				    Type: SHT_PROGBITS
				    Flags: [ SHF_ALLOC, SHF_EXECINSTR ]
				    AddressAlign: 0x1000
				    Size: 4096
				ProgramHeaders:
				  - Type: PT_LOAD
				    Flags: [ PF_X, PF_R ]
				    Sections:
				      - Section: .text4
				      - Section: .text5
				  - Type: PT_LOAD
				    Flags: [ PF_X, PF_R ]
				    Sections:
				      - Section: .text3
				      - Section: .text4
				      - Section: .text5
				  - Type: PT_LOAD
				    Flags: [ PF_X, PF_R ]
				    Sections:
				      - Section: .text
				      - Section: .text2
				      - Section: .text3
				      - Section: .text4
				      - Section: .text5
				  - Type: PT_LOAD
				    Flags: [ PF_X, PF_R ]
				    Sections:
				      - Section: .text2
				      - Section: .text3
				      - Section: .text4
				      - Section: .text5

				#CHECK: ProgramHeaders [
				#CHECK-NEXT: ProgramHeader {
				#CHECK-NEXT: Type: PT_LOAD (0x1)
				#CHECK-NEXT: Offset: 0x4000
				#CHECK-NEXT: VirtualAddress: 0x0
				#CHECK-NEXT: PhysicalAddress: 0x0
				#CHECK-NEXT: FileSize: 8192
				#CHECK-NEXT: MemSize: 8192
				#CHECK-NEXT: Flags [ (0x5)
				#CHECK-NEXT: PF_R (0x4)
				#CHECK-NEXT: PF_X (0x1)
				#CHECK-NEXT: ]
				#CHECK-NEXT: Alignment: 4096
				#CHECK-NEXT: }
				#CHECK-NEXT: ProgramHeader {
				#CHECK-NEXT: Type: PT_LOAD (0x1)
				#CHECK-NEXT: Offset: 0x3000
				#CHECK-NEXT: VirtualAddress: 0x0
				#CHECK-NEXT: PhysicalAddress: 0x0
				#CHECK-NEXT: FileSize: 12288
				#CHECK-NEXT: MemSize: 12288
				#CHECK-NEXT: Flags [ (0x5)
				#CHECK-NEXT: PF_R (0x4)
				#CHECK-NEXT: PF_X (0x1)
				#CHECK-NEXT: ]
				#CHECK-NEXT: Alignment: 4096
				#CHECK-NEXT: }
				#CHECK-NEXT: ProgramHeader {
				#CHECK-NEXT: Type: PT_LOAD (0x1)
				#CHECK-NEXT: Offset: 0x1000
				#CHECK-NEXT: VirtualAddress: 0x0
				#CHECK-NEXT: PhysicalAddress: 0x0
				#CHECK-NEXT: FileSize: 20480
				#CHECK-NEXT: MemSize: 20480
				#CHECK-NEXT: Flags [ (0x5)
				#CHECK-NEXT: PF_R (0x4)
				#CHECK-NEXT: PF_X (0x1)
				#CHECK-NEXT: ]
				#CHECK-NEXT: Alignment: 4096
				#CHECK-NEXT: }
				#CHECK-NEXT: ProgramHeader {
				#CHECK-NEXT: Type: PT_LOAD (0x1)
				#CHECK-NEXT: Offset: 0x2000
				#CHECK-NEXT: VirtualAddress: 0x0
				#CHECK-NEXT: PhysicalAddress: 0x0
				#CHECK-NEXT: FileSize: 16384
				#CHECK-NEXT: MemSize: 16384
				#CHECK-NEXT: Flags [ (0x5)
				#CHECK-NEXT: PF_R (0x4)
				#CHECK-NEXT: PF_X (0x1)
				#CHECK-NEXT: ]
				#CHECK-NEXT: Alignment: 4096
				#CHECK-NEXT: }
				#CHECK-NEXT:]

tools/llvm-objcopy/Object.h

(66 lines not shown; context: public:)
   uint32_t Index;
   uint64_t MemSize;
   uint64_t Offset;
   uint64_t PAddr;
   uint64_t Type;
   uint64_t VAddr;

   uint64_t OriginalOffset;
+  Segment *ParentSegment;

   Segment(llvm::ArrayRef<uint8_t> Data) : Contents(Data) {}
   void finalize();
   const SectionBase *firstSection() const {
     if (!Sections.empty())
       return *Sections.begin();
     return nullptr;
   }
(143 lines not shown)

tools/llvm-objcopy/Object.cpp

Show First 20 Lines • Show All 160 Lines • ▼ Show 20 Lines	static bool sectionWithinSegment(const SectionBase &Section,
// to clarify the case when an empty section lies on a boundary between two		// to clarify the case when an empty section lies on a boundary between two
// segments and ensures that the section "belongs" to the second segment and		// segments and ensures that the section "belongs" to the second segment and
// not the first.		// not the first.
uint64_t SecSize = Section.Size ? Section.Size : 1;		uint64_t SecSize = Section.Size ? Section.Size : 1;
return Segment.Offset <= Section.OriginalOffset &&		return Segment.Offset <= Section.OriginalOffset &&
Segment.Offset + Segment.FileSize >= Section.OriginalOffset + SecSize;		Segment.Offset + Segment.FileSize >= Section.OriginalOffset + SecSize;
}		}

		// Returns true IFF a segment's original offset is inside of another segment's
		// range.
		jhendersonUnsubmitted Not Done Reply Inline Actions Nit: full stop. jhenderson: Nit: full stop.
		static bool segmentOverlapsSegment(const Segment &Child,
		jhendersonUnsubmitted Not Done Reply Inline Actions segmentOverlappsSegment -> segmentOverlapsSegment jhenderson: segmentOverlappsSegment -> segmentOverlapsSegment
		const Segment &Parent) {

		return Parent.OriginalOffset <= Child.OriginalOffset &&
		Parent.OriginalOffset + Parent.FileSize > Child.OriginalOffset;
		jhendersonUnsubmitted Not Done Reply Inline Actions I don't think this is quite right - in the case of two adjacent segments (i.e. where the end of one and the start of the next are the same), the second ends up being treated as a child of the first. I think the second clause should be strictly greater than. jhenderson: I don't think this is quite right - in the case of two adjacent segments (i.e. where the end of…
		}

template <class ELFT>		template <class ELFT>
void Object<ELFT>::readProgramHeaders(const ELFFile<ELFT> &ElfFile) {		void Object<ELFT>::readProgramHeaders(const ELFFile<ELFT> &ElfFile) {
uint32_t Index = 0;		uint32_t Index = 0;
for (const auto &Phdr : unwrapOrError(ElfFile.program_headers())) {		for (const auto &Phdr : unwrapOrError(ElfFile.program_headers())) {
ArrayRef<uint8_t> Data{ElfFile.base() + Phdr.p_offset,		ArrayRef<uint8_t> Data{ElfFile.base() + Phdr.p_offset,
(size_t)Phdr.p_filesz};		(size_t)Phdr.p_filesz};
Segments.emplace_back(llvm::make_unique<Segment>(Data));		Segments.emplace_back(llvm::make_unique<Segment>(Data));
Segment &Seg = *Segments.back();		Segment &Seg = *Segments.back();
Show All 12 Lines	for (auto &Section : Sections) {
Seg.addSection(&*Section);		Seg.addSection(&*Section);
if (!Section->ParentSegment \|\|		if (!Section->ParentSegment \|\|
Section->ParentSegment->Offset > Seg.Offset) {		Section->ParentSegment->Offset > Seg.Offset) {
Section->ParentSegment = &Seg;		Section->ParentSegment = &Seg;
}		}
}		}
}		}
}		}
		// Now we do an O(n^2) loop through the segments in order to match up
		// segments.
		jhendersonUnsubmitted Not Done Reply Inline Actions Couple of typos in this comment: 1) though -> through; 2) match up nested segments up. -> match up segments. jhenderson: Couple of typos in this comment: 1) though -> through; 2) match up nested segments up. -> match…
		for (auto &Child : Segments) {
		for (auto &Parent : Segments) {
		// Every segment will overlap with itself but we don't want a segment to
		jhendersonUnsubmitted Not Done Reply Inline Actions "will be a child of itself" should probably be "will overlap with itself", to match the function name. jhenderson: "will be a child of itself" should probably be "will overlap with itself", to match the…
		// be its own parent so we avoid that situation.
		if (&Child != &Parent && segmentOverlapsSegment(*Child, *Parent)) {
		// We want a canonical "most parental" segment but this requires
		jhenderson: cononical -> canonical
		// inspecting the ParentSegment.
		jhenderson: Nit: full stop.
		if (Child->ParentSegment != nullptr) {
		if (Child->ParentSegment->OriginalOffset > Parent->OriginalOffset) {
		Child->ParentSegment = Parent.get();
		} else if (Child->ParentSegment->Index > Parent->Index) {
		// They must have equal OriginalOffsets so we need to disambiguate.
		// To decide which is the parent we'll choose the one with the
		// lower index.
		Child->ParentSegment = Parent.get();
		jhenderson: I think there's one case missing here - we don't want two segments with identical offsets and file sizes to be parents of each other. In this situation, we also can't simply say something like "if (Parent->ParentSegment == Child) do nothing": you could end up with segment 1 being parent of segment 2, which is parent of segment 3, which is parent of segment 1. I think in that situation, the segment that is first in the table should be treated as the parent of the other two.
		}
		} else {
		Child->ParentSegment = Parent.get();
		}
		}
		}
		}
}		}
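One way to address the cycle concern from the review comment above is sketched here, under stated assumptions. `Seg`, `startsInside`, `ordersBefore`, and `assignParents` are hypothetical names, not the patch's code. The idea is to accept a candidate parent only if it comes strictly before the child in the order (OriginalOffset, Index). Parent links then always point to an earlier element of a total order, so no cycle can form, and segments with identical ranges all end up under the one that is first in the table.

```cpp
#include <cassert>
#include <cstdint>
#include <tuple>
#include <vector>

// Hypothetical, simplified stand-in for the patch's Segment class.
struct Seg {
  uint32_t Index;
  uint64_t OriginalOffset;
  uint64_t FileSize;
  const Seg *ParentSegment;
};

// True if Child starts inside [Parent start, Parent end).
bool startsInside(const Seg &Child, const Seg &Parent) {
  return Parent.OriginalOffset <= Child.OriginalOffset &&
         Parent.OriginalOffset + Parent.FileSize > Child.OriginalOffset;
}

// Strict total order on (OriginalOffset, Index).
bool ordersBefore(const Seg &A, const Seg &B) {
  return std::tie(A.OriginalOffset, A.Index) <
         std::tie(B.OriginalOffset, B.Index);
}

// O(n^2) matching: each child gets the earliest-ordered segment it
// starts inside of, and only segments ordered before it are considered.
void assignParents(std::vector<Seg> &Segments) {
  for (Seg &Child : Segments) {
    for (const Seg &Parent : Segments) {
      if (!ordersBefore(Parent, Child) || !startsInside(Child, Parent))
        continue;
      if (!Child.ParentSegment || ordersBefore(Parent, *Child.ParentSegment))
        Child.ParentSegment = &Parent;
    }
    // A segment never orders before itself, so it cannot be its own parent.
    assert(Child.ParentSegment != &Child);
  }
}
```

For three segments with the same offset and file size, this gives segment 0 no parent and makes it the parent of segments 1 and 2, which matches the outcome the reviewer suggests.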

template <class ELFT>		template <class ELFT>
		jhenderson: Nit: unnecessary blank line.
void Object<ELFT>::initSymbolTable(const llvm::object::ELFFile<ELFT> &ElfFile,		void Object<ELFT>::initSymbolTable(const llvm::object::ELFFile<ELFT> &ElfFile,
SymbolTableSection *SymTab) {		SymbolTableSection *SymTab) {

SymTab->Size = 0;		SymTab->Size = 0;
if (SymbolTable->Link - 1 >= Sections.size())		if (SymbolTable->Link - 1 >= Sections.size())
error("Symbol table has link index of " + Twine(SymbolTable->Link) +		error("Symbol table has link index of " + Twine(SymbolTable->Link) +
" which is not a valid index");		" which is not a valid index");

▲ Show 20 Lines • Show All 150 Lines • ▼ Show 20 Lines	template <class ELFT> void ELFObject<ELFT>::sortSections() {
auto CompareSections = [](const SecPtr &A, const SecPtr &B) {		auto CompareSections = [](const SecPtr &A, const SecPtr &B) {
return A->OriginalOffset < B->OriginalOffset;		return A->OriginalOffset < B->OriginalOffset;
};		};
std::stable_sort(std::begin(this->Sections), std::end(this->Sections),		std::stable_sort(std::begin(this->Sections), std::end(this->Sections),
CompareSections);		CompareSections);
}		}

template <class ELFT> void ELFObject<ELFT>::assignOffsets() {		template <class ELFT> void ELFObject<ELFT>::assignOffsets() {
		// We need a temporary list of segments that has a special order to it
		// so that we know that anytime ->ParentSegment is set that segment has
		// already had its offset properly set.
		jhenderson: properlly -> properly
		std::vector<Segment *> OrderedSegments;
		for (auto &Segment : this->Segments)
		OrderedSegments.push_back(Segment.get());
		auto CompareSegments = [](const Segment *A, const Segment *B) {
		// Any segment without a parent segment should come before a segment
		// that has a parent segment.
		if (A->OriginalOffset < B->OriginalOffset)
		return true;
		if (A->OriginalOffset > B->OriginalOffset)
		return false;
		return A->Index < B->Index;
		};
		std::stable_sort(std::begin(OrderedSegments), std::end(OrderedSegments),
		CompareSegments);
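The ordering built above can be sketched in isolation as follows. `SegInfo`, `compareSegments`, and `orderSegments` are hypothetical names used for illustration. The sketch sorts pointers by (OriginalOffset, Index), so any parent that starts at a lower offset, or at the same offset with a lower index, is visited before its children.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <tuple>
#include <vector>

// Hypothetical, simplified stand-in for the patch's Segment class.
struct SegInfo {
  uint32_t Index;
  uint64_t OriginalOffset;
};

// Lexicographic comparison on (OriginalOffset, Index).
bool compareSegments(const SegInfo *A, const SegInfo *B) {
  return std::tie(A->OriginalOffset, A->Index) <
         std::tie(B->OriginalOffset, B->Index);
}

// Builds a temporary list of pointers in layout order without
// disturbing the order of the owning container.
std::vector<const SegInfo *>
orderSegments(const std::vector<SegInfo> &Segments) {
  std::vector<const SegInfo *> Ordered;
  for (const SegInfo &S : Segments)
    Ordered.push_back(&S);
  std::stable_sort(Ordered.begin(), Ordered.end(), compareSegments);
  assert(std::is_sorted(Ordered.begin(), Ordered.end(), compareSegments));
  return Ordered;
}
```

Sorting a separate vector of pointers leaves the program header table order untouched, which matters because `Index` refers to a segment's position in that table.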
// The size of ELF + program headers will not change so it is ok to assume		// The size of ELF + program headers will not change so it is ok to assume
// that the first offset of the first segment is a good place to start		// that the first offset of the first segment is a good place to start
// outputting sections. This covers both the standard case and the PT_PHDR		// outputting sections. This covers both the standard case and the PT_PHDR
// case.		// case.
uint64_t Offset;		uint64_t Offset;
if (!this->Segments.empty()) {		if (!OrderedSegments.empty()) {
Offset = this->Segments[0]->Offset;		Offset = OrderedSegments[0]->Offset;
} else {		} else {
Offset = sizeof(Elf_Ehdr);		Offset = sizeof(Elf_Ehdr);
}		}
// The only way a segment should move is if a section was between two		// The only way a segment should move is if a section was between two
// segments and that section was removed. If that section isn't in a segment		// segments and that section was removed. If that section isn't in a segment
// then it's acceptable, but not ideal, to simply move it to after the		// then it's acceptable, but not ideal, to simply move it to after the
// segments. So we can simply layout segments one after the other accounting		// segments. So we can simply layout segments one after the other accounting
// for alignment.		// for alignment.
for (auto &Segment : this->Segments) {		for (auto &Segment : OrderedSegments) {
		if (Segment->Type == PT_GNU_STACK)
		jhenderson: Why is PT_GNU_STACK special?
		jakehehrlich (Author): For the purposes of skipping it here it is special because it has no alignment and its offset must be zero. In general segments should have non-zero alignment. It's more generally special, however: all of its fields are zero (including address, offset, and size) except for the flags. It isn't nested in another loadable segment but it also doesn't cover any section. I skip it here because it needs to maintain a zero offset and it causes align to fail (when I originally wrote this diff, the new layout algorithm wasn't a part of it so I didn't have this issue).
		jhenderson: LLD can emit other segments with zero in every field. I have a linker script, for example, that requests a PT_INTERP segment, but LLD does not assign anything to it, so the segment is empty. Every field apart from the type is zero. Empty segments don't need an alignment or address, so I think the check should be for any empty segments at offset zero.
		continue;
		jhenderson: parrent -> parent
		// We assume that segments have been ordered by OriginalOffset and Index
		// such that a parent segment will always come before a child segment in
		jhenderson: Full stop.
		// OrderedSegments. This means that the Offset of the ParentSegment should
		// already be set and we can set our offset relative to it.
		jhenderson: Could you rename PSeg to Parent, please?
		if (Segment->ParentSegment != nullptr) {
		jhenderson: Should this be: `Segment->Offset = PSeg->Offset + Segment->OriginalOffset - PSeg->OriginalOffset;` ?
		auto Parent = Segment->ParentSegment;
		Segment->Offset =
		Parent->Offset + Segment->OriginalOffset - Parent->OriginalOffset;
		} else {
Offset = alignTo(Offset, Segment->Align);		Offset = alignTo(Offset, Segment->Align);
Segment->Offset = Offset;		Segment->Offset = Offset;
Offset += Segment->FileSize;		Offset += Segment->FileSize;
}		}
		}
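The layout loop above, combined with the reviewer's suggestion to skip any empty segment at offset zero rather than only PT_GNU_STACK, might look roughly like the sketch below. `LaySeg`, `alignUp`, `isEmptyAtZero`, and `layoutSegments` are hypothetical names, and treating an alignment of zero as "no alignment" is an assumption made here to keep the helper total; this is not the patch's code.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for the patch's Segment class.
struct LaySeg {
  uint64_t OriginalOffset, Offset, FileSize, Align;
  LaySeg *ParentSegment;
};

// Rounds Value up to a multiple of Align. An Align of zero is treated
// as "no alignment" (an assumption of this sketch).
uint64_t alignUp(uint64_t Value, uint64_t Align) {
  return Align ? (Value + Align - 1) / Align * Align : Value;
}

// The reviewer's suggested condition: an empty segment at offset zero.
bool isEmptyAtZero(const LaySeg &S) {
  return S.OriginalOffset == 0 && S.FileSize == 0;
}

// Ordered must list parents before their children. Returns the first
// file offset after the last top-level segment.
uint64_t layoutSegments(const std::vector<LaySeg *> &Ordered,
                        uint64_t Offset) {
  for (LaySeg *S : Ordered) {
    if (isEmptyAtZero(*S))
      continue; // keeps its zero offset and needs no alignment
    if (LaySeg *Parent = S->ParentSegment) {
      // A child keeps its original distance from the start of its parent.
      assert(S->OriginalOffset >= Parent->OriginalOffset);
      S->Offset = Parent->Offset + S->OriginalOffset - Parent->OriginalOffset;
    } else {
      Offset = alignUp(Offset, S->Align);
      S->Offset = Offset;
      Offset += S->FileSize;
    }
  }
  return Offset;
}
```

As a worked case: a parent originally at 0x1000 that is placed at 0x2000 moves a child originally at 0x1100 to 0x2000 + 0x1100 - 0x1000 = 0x2100, so the child stays 0x100 bytes into its parent.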
// Now the offset of every segment has been set we can assign the offsets		// Now the offset of every segment has been set we can assign the offsets
// of each section. For sections that are covered by a segment we should use		// of each section. For sections that are covered by a segment we should use
// the segment's original offset and the section's original offset to compute		// the segment's original offset and the section's original offset to compute
// the offset from the start of the segment. Using the offset from the start		// the offset from the start of the segment. Using the offset from the start
// of the segment we can assign a new offset to the section. For sections not		// of the segment we can assign a new offset to the section. For sections not
// covered by segments we can just bump Offset to the next valid location.		// covered by segments we can just bump Offset to the next valid location.
uint32_t Index = 1;		uint32_t Index = 1;
for (auto &Section : this->Sections) {		for (auto &Section : this->Sections) {