This is an archive of the discontinued LLVM Phabricator instance.

Differential D78628

[obj2yaml] - Program headers: simplify the computation of p_filesz.
ClosedPublic

Authored by grimar on Apr 22 2020, 5:02 AM.

Details

Reviewers

jhenderson
MaskRay
• espindola

Commits

rG9f9a08e19c4b: [obj2yaml] - Program headers: simplify the computation of p_filesz.

Summary

Currently we have computations of p_filesz and p_memsz mixed together
with the use of a loop over fragments. After recent changes it is possible to
avoid using a loop for the computation of p_filesz, since we know that fragments
are sorted by their file offsets.

The main benefit of this change is that it splits the computation of p_filesz
and p_memsz, which is simpler and allows us to fix the computation of
p_memsz independently (D78005 shows the issue that we currently have).

It also fixes a bug in program-header-size-offset.yaml.

Depends on D78627.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

grimar created this revision. Apr 22 2020, 5:02 AM

Herald added a reviewer: • espindola. Apr 22 2020, 5:02 AM

Herald added subscribers: hiraditya, emaste.

MaskRay added inline comments. Apr 22 2020, 10:03 AM

llvm/test/tools/yaml2obj/ELF/program-header-size-offset.yaml
33–34	Why is this changed?

jhenderson added inline comments. Apr 23 2020, 12:46 AM

llvm/lib/ObjectYAML/ELFEmitter.cpp
778–780	This is a little subtle, and probably deserves a comment to explain why we pay attention to the offset but not the size of a single trailing SHT_NOBITS section.
787–793	Probably tangential to this change, but if I'm reading this right, this will unconditionally cause all non-empty segments to have a memory size, even if there are no allocatable sections in them. This doesn't seem right to me, especially coming from an environment where segments with non-alloc sections (and therefore no address) are quite normal.

grimar marked an inline comment as done. Apr 23 2020, 12:48 AM

grimar added inline comments.

llvm/test/tools/yaml2obj/ELF/program-header-size-offset.yaml

33–34

I believe this patch fixes a bug. The program header is:

# Program header with 2 SHT_NOBITS sections.
- Type:     0x6abcdef0
  Offset:   0x2004
  Sections:
    - Section: .data
    - Section: .nobits1
    - Section: .nobits2

The layout is:

Section Headers:
  [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            0000000000000000 000000 000000 00     0   0  0
  [ 1] .text             PROGBITS        0000000000000000 001000 000004 00     0   0 4096
  [ 2] .rodata           PROGBITS        0000000000000000 002000 000004 00     0   0 4096
  [ 3] .data             PROGBITS        0000000000000000 002004 000004 00     0   0  0
  [ 4] .nobits1          NOBITS          0000000000000000 002008 000001 00     0   0  0
  [ 5] .nobits2          NOBITS          0000000000000000 002009 000001 00     0   0  0
  [ 6] .strtab           STRTAB          0000000000000000 002008 000001 00     0   0  1
  [ 7] .shstrtab         STRTAB          0000000000000000 002009 000039 00     0   0  1

0x2009 - 0x2004 == 0x5, not 0x4
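
The arithmetic above can be checked with a small standalone sketch. This is an illustration only, not code from the patch: `Fragment` is a simplified, hypothetical stand-in for the struct in ELFEmitter.cpp, the `IsNoBits` flag replaces the `Type != SHT_NOBITS` check, and the function names `fileSizeOld` and `fileSizeNew` are invented for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for the Fragment struct in ELFEmitter.cpp.
struct Fragment {
  uint64_t Offset; // file offset of the section
  uint64_t Size;   // section size
  bool IsNoBits;   // stands in for Type == SHT_NOBITS
};

// Old approach: the largest end offset over all non-SHT_NOBITS fragments,
// found with a loop, minus the segment offset.
uint64_t fileSizeOld(const std::vector<Fragment> &Frags, uint64_t POffset) {
  uint64_t FileOffset = POffset;
  for (const Fragment &F : Frags)
    if (!F.IsNoBits)
      FileOffset = std::max(FileOffset, F.Offset + F.Size);
  return FileOffset - POffset;
}

// New approach: only the last fragment matters, because the fragments are
// assumed to be sorted by file offset. The size of a trailing SHT_NOBITS
// fragment is ignored, but its offset is not.
uint64_t fileSizeNew(const std::vector<Fragment> &Frags, uint64_t POffset) {
  if (Frags.empty())
    return 0;
  assert(std::is_sorted(Frags.begin(), Frags.end(),
                        [](const Fragment &A, const Fragment &B) {
                          return A.Offset < B.Offset;
                        }) &&
         "fragments must be sorted by file offset");
  uint64_t FileSize = Frags.back().Offset - POffset;
  if (!Frags.back().IsNoBits)
    FileSize += Frags.back().Size;
  return FileSize;
}
```

With the three fragments from the layout above (.data at 0x2004 with size 4, .nobits1 at 0x2008 and .nobits2 at 0x2009, both SHT_NOBITS with size 1) and a segment offset of 0x2004, `fileSizeOld` yields 4 and `fileSizeNew` yields 5.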

grimar marked an inline comment as done. Apr 23 2020, 12:53 AM

grimar added inline comments.

llvm/lib/ObjectYAML/ELFEmitter.cpp
787–793	My intention is to land this patch (for `p_filesz`), and then I'll be able to update and rebase D78005 to make it focus only on the `p_memsz` calculation. I'll try to address this comment there.

grimar mentioned this in D78627: [obj2yaml] - Zero initialize program headers. NFCI. Apr 23 2020, 1:45 AM

grimar marked an inline comment as done. Apr 23 2020, 4:00 AM

grimar added inline comments.

llvm/lib/ObjectYAML/ELFEmitter.cpp
778–780	I was about to place the following comment:

// SHT_NOBITS sections occupy no physical space in a file, we should not
// take their sizes into account when calculating the file size of a
// segment.

But then I remembered one thing and started to doubt the usage of its offset. It was mentioned in D69192 that the sh_offset of a SHT_NOBITS section can be "larger than the file size with some usage of objcopy". I was unable to find a proper test case committed for this, and I do not think I know a way to achieve it. (`elf-disassemble-bss.test` (https://github.com/llvm/llvm-project/blob/master/llvm/test/tools/llvm-objdump/X86/elf-disassemble-bss.test) has a non-trailing .bss section that has a too-large sh_offset, though.)

The ELF spec says about sh_offset: "This member's value gives the byte offset from the beginning of the file to the first byte in the section. One section type, SHT_NOBITS described below, occupies no space in the file, and its sh_offset member locates the conceptual placement in the file."

Does "sh_offset member locates the conceptual placement in the file" imply that the `sh_offset` has to be less than the file size? I am not sure that "in the file" == "can be outside of the file"; that would be weird. So can we use `sh_offset` like I do here?

grimar marked an inline comment as done. Apr 23 2020, 4:07 AM

grimar added inline comments.

llvm/lib/ObjectYAML/ELFEmitter.cpp

778–780

FTR, the approach of this patch is the same as what LLD does:

template <class ELFT> void Writer<ELFT>::setPhdrs(Partition &part) {
  for (PhdrEntry *p : part.phdrs) {
    OutputSection *first = p->firstSec;
    OutputSection *last = p->lastSec;

    if (first) {
      p->p_filesz = last->offset - first->offset;
      if (last->type != SHT_NOBITS)
        p->p_filesz += last->size;
...

MaskRay added inline comments. Apr 23 2020, 9:11 AM

llvm/lib/ObjectYAML/ELFEmitter.cpp
778–780	llvm-objcopy cannot make sh_offset larger than the file size (if it does, I will assuredly consider it a bug). GNU objcopy can do so in some cases. See the variable `off_adjust` in `binutils-gdb/bfd/elf.c:assign_file_positions_for_load_sections`. A few bug reports are related (see https://sourceware.org/bugzilla/show_bug.cgi?id=25662). This is basically a size-saving optimization, but it is rather error-prone. In llvm-objcopy --only-keep-debug, we simply made `sh_offset` monotonically increasing to avoid such complexity.
llvm/test/tools/yaml2obj/ELF/program-header-size-offset.yaml
33–34	I think it is hard to say that this is a bug. Conceptually, the sh_offset of a SHT_NOBITS section can be ignored. Usually, the ELF writer should set the sh_offset field of `.nobits2` to 0x2008 because there is no need to leave a one-byte gap. I don't think this trivia matters much though; handling it either way is OK. If doing it one way helps simplify our overall logic, let's choose that way.

MaskRay added inline comments. Apr 23 2020, 9:12 AM

llvm/test/tools/yaml2obj/ELF/program-header-size-offset.yaml
32	Leave a comment on how `FileSize` is computed.

grimar marked an inline comment as done. Apr 24 2020, 12:46 AM

grimar added inline comments.

llvm/lib/ObjectYAML/ELFEmitter.cpp
778–780	Thanks for the information!

Addressed review comments.

jhenderson added inline comments. Apr 24 2020, 1:09 AM

llvm/test/tools/yaml2obj/ELF/program-header-size-offset.yaml
33–34	Is there a risk that leaving the FileSize of the segment too high might result in it going outside the range of the file? In other words, does yaml2obj pay any attention to the segment sizes when it lays things out? In other words, in this example, if .nobits1 was say size 0xFFFF0000, would it cause the program header to be referencing data outside the file? Knowing that llvm-objcopy reads segment data based on the segment file size property, I think we need to be careful about what the FileSize is for segments. It cannot go beyond the file's end.
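
The requirement that a segment's FileSize must not reach past the end of the file can be stated as a simple range check. The sketch below is hypothetical: `segmentFitsInFile` is an invented name, not an existing yaml2obj or llvm-objcopy function, and nothing in this review says that such a check is performed. It is written so that adding the offset and the size cannot overflow.

```cpp
#include <cstdint>

// Hypothetical helper: returns true if the byte range
// [POffset, POffset + PFileSz) lies entirely within a file of FileSize bytes.
// The comparison is arranged to avoid computing POffset + PFileSz, which
// could wrap around for very large values.
constexpr bool segmentFitsInFile(uint64_t POffset, uint64_t PFileSz,
                                 uint64_t FileSize) {
  return POffset <= FileSize && PFileSz <= FileSize - POffset;
}

// Illustrative values only: a 5-byte segment at 0x2004 fits in a 0x3000-byte
// file, while a 0xFFFF0000-byte segment at the same offset does not.
static_assert(segmentFitsInFile(0x2004, 5, 0x3000), "range is inside the file");
static_assert(!segmentFitsInFile(0x2004, 0xFFFF0000, 0x3000),
              "range extends past the end of the file");
```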

grimar marked an inline comment as done. Apr 24 2020, 1:43 AM

grimar added inline comments.

llvm/test/tools/yaml2obj/ELF/program-header-size-offset.yaml
33–34	> if .nobits1 was say size 0xFFFF0000, would it cause the program header to be referencing data outside the file?

It is possible with the use of `ShOffset`. But since `ShOffset` is itself a thing for overriding the offset and is used mostly for creating invalid objects, this is probably not a problem?

grimar marked an inline comment as done. Apr 24 2020, 2:12 AM

grimar added inline comments.

llvm/test/tools/yaml2obj/ELF/program-header-size-offset.yaml
33–34	See.

> Is there a risk that leaving the FileSize of the segment too high might result in it going outside the range of the file?

The way to achieve it is to use `ShOffset`, which is probably not an issue. yaml2obj places all sections in order and can't assign an offset that is outside of the file by itself (without using `ShOffset`).

> In other words, does yaml2obj pay any attention to the segment sizes when it lays things out?

The layout of sections is unrelated to segments. First we do the layout for sections and then create headers independently, based on the section layout.

> In other words, in this example, if .nobits1 was say size 0xFFFF0000, would it cause the program header to be referencing data outside the file?

Let me answer this again. I read `size` as `offset` previously. The size of a SHT_NOBITS section is not taken into account. It might be possible to have a regular section with a broken size (with the use of `ShSize`), and then the segment file size will be broken. But that is about broken objects again.

jhenderson added inline comments. Apr 24 2020, 2:29 AM

llvm/test/tools/yaml2obj/ELF/program-header-size-offset.yaml
33–34	Oh, I think I see. I agree that Sh* values can be ignored for this in general (and indeed, I think we agreed elsewhere that they shouldn't really impact the layout of the segments anyway). Just to confirm my understanding, the FileSize increase is because ShOffset of .nobits2 is affecting the size of the segment, and not because the size of .nobits1 is 1?

grimar marked an inline comment as done. Apr 24 2020, 2:47 AM

grimar added inline comments.

llvm/test/tools/yaml2obj/ELF/program-header-size-offset.yaml
33–34	> Just to confirm my understanding, the FileSize increase is because ShOffset of .nobits2 is affecting the size of the segment, and not because the size of .nobits1 is 1?

Right.

LGTM.

This revision is now accepted and ready to land. Apr 24 2020, 3:51 AM

Closed by commit rG9f9a08e19c4b: [obj2yaml] - Program headers: simplify the computation of p_filesz. (authored by grimar). Apr 24 2020, 5:54 AM

This revision was automatically updated to reflect the committed changes.

Herald added a project: Restricted Project. Apr 24 2020, 5:54 AM

Revision Contents

Path                                                            Size
llvm/lib/ObjectYAML/ELFEmitter.cpp                              29 lines
llvm/test/tools/yaml2obj/ELF/program-header-size-offset.yaml    3 lines

Diff 259874

llvm/lib/ObjectYAML/ELFEmitter.cpp

(765 preceding lines not shown; enclosing scope: if (YamlPhdr.Offset) {)

         " must be less than or equal to the minimum file offset of "
         "all included sections (0x" +
         Twine::utohexstr(Fragments.front().Offset) + ")");
   PHeader.p_offset = *YamlPhdr.Offset;
 } else if (!Fragments.empty()) {
   PHeader.p_offset = Fragments.front().Offset;
 }

-// Find the maximum offset of the end of a section in order to set p_filesz
-// and p_memsz. When setting p_filesz, trailing SHT_NOBITS sections are not
-// counted.
-uint64_t FileOffset = PHeader.p_offset, MemOffset = PHeader.p_offset;
-for (const Fragment &F : Fragments) {
-  uint64_t End = F.Offset + F.Size;
-  MemOffset = std::max(MemOffset, End);
-  if (F.Type != llvm::ELF::SHT_NOBITS)
-    FileOffset = std::max(FileOffset, End);
+// Set the file size if not set explicitly.
+if (YamlPhdr.FileSize) {
+  PHeader.p_filesz = *YamlPhdr.FileSize;
+} else if (!Fragments.empty()) {
+  uint64_t FileSize = Fragments.back().Offset - PHeader.p_offset;
+  // SHT_NOBITS sections occupy no physical space in a file, we should not
+  // take their sizes into account when calculating the file size of a
+  // segment.
+  if (Fragments.back().Type != llvm::ELF::SHT_NOBITS)
+    FileSize += Fragments.back().Size;
+  PHeader.p_filesz = FileSize;
 }

-// Set the file size and the memory size if not set explicitly.
-PHeader.p_filesz = YamlPhdr.FileSize ? uint64_t(*YamlPhdr.FileSize)
-                                     : FileOffset - PHeader.p_offset;
+// Find the maximum offset of the end of a section in order to set p_memsz.
+uint64_t MemOffset = PHeader.p_offset;
+for (const Fragment &F : Fragments)
+  MemOffset = std::max(MemOffset, F.Offset + F.Size);
+// Set the memory size if not set explicitly.
 PHeader.p_memsz = YamlPhdr.MemSize ? uint64_t(*YamlPhdr.MemSize)
                                    : MemOffset - PHeader.p_offset;

 if (YamlPhdr.Align) {
   PHeader.p_align = *YamlPhdr.Align;
 } else {
   // Set the alignment of the segment to be the maximum alignment of the
   // sections so that by default the segment has a valid and sensible
   // alignment.
   PHeader.p_align = 1;

(738 following lines not shown)

llvm/test/tools/yaml2obj/ELF/program-header-size-offset.yaml

(23 preceding lines not shown)

 # CHECK: Offset: 0xFFE
 # CHECK: FileSize: 7
 # CHECK: MemSize: 9

 # CHECK: Offset: 0x3000
 # CHECK: FileSize: 3
 # CHECK: MemSize: 2

 # CHECK: Offset: 0x2004
-# CHECK: FileSize: 4
+## Offset of .nobits2 (0x2009) - offset of .data (0x2004) == 0x5.
+# CHECK: FileSize: 5
 # CHECK: MemSize: 6
 # CHECK: ]

 --- !ELF
 FileHeader:
   Class: ELFCLASS64
   Data: ELFDATA2LSB
   Type: ET_EXEC

(175 following lines not shown)
