This is an archive of the discontinued LLVM Phabricator instance.

[yaml2obj] - Reimplement how tool calculates memory sizes of segments.
AcceptedPublic

Authored by grimar on Apr 13 2020, 12:12 AM.

Details

Summary

We do not calculate the memory size properly, e.g when we have 2 or more
SHT_NOBITS sections at the end of a segment or when we mix SHT_NOBITS
and regular sections.

The simple example is:

ProgramHeaders:
  - Type:  PT_LOAD
    Flags: [ PF_W, PF_R ]
    Sections:
      - Section: .foo1
      - Section: .foo2
    VAddr: 0x1000
    Offset: 0x0
Sections:
  - Name:    .foo1
    Type:    SHT_NOBITS
    Flags:   [ SHF_WRITE, SHF_ALLOC ]
    Size:    0x100
    Address: 0x1000
  - Name:    .foo2
    Type:    SHT_NOBITS
    Flags:   [ SHF_WRITE, SHF_ALLOC ]
    Size:    0x88

In this case the memory size of the segment should be
0x100 + 0x88 + file offset of the segment (0x78) = 0x200.
But without this patch yaml2obj sets the memory size to 0x178,
because it calculates the memory size as 0x78 + max(0x100, 0x88).

This patch changes the implementation and refines the test case.

Diff Detail

Event Timeline

grimar created this revision.Apr 13 2020, 12:12 AM
jhenderson added inline comments.Apr 16 2020, 2:21 AM
llvm/lib/ObjectYAML/ELFEmitter.cpp
1042–1048

Perhaps worth splitting this into it's own patch. I think it's useful, but it seems to be unrelated to fixing address calculations.

Also "by their file offsets"

1044

I'm concerned that if there are multiple sections with the same offset but different sizes, this doesn't enforce a particular ordering. I'm thinking multiple trailing SHT_NOBITS sections here again. There needs to be one or more fallbacks, probably the section address at least.

1055–1058

This doesn't look right to me. What if there are multiple SHT_NOBITS sections at the end of a segment?

1059–1073

This logic seems awfully complicated for what it's doing. If we enforce an address order as well as offset order (see above), then it can just be the relative offset of the last section within the segment plus the section size, i.e. p_offset - sh_offset + sh_size.

By the way, p_memsz should only be set if the segment contains allocatable segments. If it doesn't, I think it should be 0. That might be slightly tangential to this patch though, so could be a different one.

llvm/test/Object/invalid.test
488 ↗(On Diff #256933)

Why has this test changed?

llvm/test/tools/llvm-readobj/ELF/gnu-notes.test
168 ↗(On Diff #256933)

Why has this test changed?

llvm/test/tools/obj2yaml/program-headers.yaml
450 ↗(On Diff #256933)

Double "the".

452 ↗(On Diff #256933)

"Document" implies to me that we are just showing the current behaviour and that it doesn't matter if it changes. I don't think that's right. It should be "Check that". Also "such a case".

llvm/test/tools/yaml2obj/ELF/program-header-nobits.yaml
6–55

This sets more than just the file offset of the first segment. It sets it for all segments. I think you need to modify the comment.

Also, missing trailing full stop.

15

I think this comment is redundant. It's evident that we are checking the section and program headers.

19

You could probably make things simpler by deleting the last 5 columns (keeping Address, Offset and Size).

23

chuck -> chunk

29

Here and below, no need for "from".

30

Similar to above: the flags and alignment seem unrelated to the test and can probably be deleted.

35

+ the .filler chunk

38

includes the first

91

The -> A

also below in all similar comments.

117

has a -> and a

Here and below

142

with -> with a

151

No need for "included"

167

of the -> of

177

at the -> in the

I'd delete "of teh segment" from here and elsewhere, since there are multiple segments.

grimar marked 3 inline comments as done.Apr 16 2020, 3:20 AM
grimar added inline comments.Apr 16 2020, 3:20 AM
llvm/lib/ObjectYAML/ELFEmitter.cpp
1055–1058

I have this test case below, it works fine, why it shouldn't?

## We have two SHT_NOBITS section at the end of the segment.
## It is important to have at least two of them because we had a bug
## related to an incorrect computation of the memory size of a segment when
## there were multiple SHT_NOBITS sections at the end.
  - Name:    .nobits3
    Type:    SHT_NOBITS
    Flags:   [ SHF_WRITE, SHF_ALLOC ]
    Size:    0x1D0
  - Name:    .nobits4
    Type:    SHT_NOBITS
    Flags:   [ SHF_WRITE, SHF_ALLOC ]
    Size:    0x100
1059–1073

If we enforce an address order as well

We should not. The intention of yaml2obj is to produce any kinds of object including broken objects,
I do not think we should enforce anything. Enforcing of file offsets was natural, we already have it at fact,
it can be broken with ShOffset because we set it too early, but that is all what we should do.

llvm/test/Object/invalid.test
488 ↗(On Diff #256933)

Because my code uses the Offset which is misused here. We use it here like it is ShOffset, i.e. to override the value,
but it is not what Offset field should do. I think it should be used to set value that is <= the offset of the first section in the segment.

We might probably want to add ShOffset.

grimar marked an inline comment as done.Apr 16 2020, 8:33 AM
grimar added inline comments.
llvm/test/Object/invalid.test
488 ↗(On Diff #256933)

I've prepared the patch for this: D78304

MaskRay added inline comments.Apr 16 2020, 11:09 AM
llvm/lib/ObjectYAML/ELFEmitter.cpp
1042–1047

llvm::is_sorted

1059–1073

How about lld's approach? (consistent with GNU ld in my observations)

p->p_filesz = last->offset - first->offset;
if (last->type != SHT_NOBITS)
  p->p_filesz += last->size;

p->p_memsz = last->addr + last->size - first->addr;

When there are more than one SHT_NOBITS, the trailing SHT_NOBITS sections have the same sh_size. The algorithm will not overly increase p_filesz.

In this case the memory size of the segment should be 0x100 + 0x88 + file offset of the segment (0x78) = 0x200. But without this patch yaml2obj sets the memory size to 0x178, because it calculates the memory size as 0x78 + max(0x100, 0x88).

A better example may be a PT_LOAD which is not the first. (0x78 being counted may look strange to some people). In linkers, the first PT_LOAD may or may not include the ELF header and program headers. Another PT_LOAD can avoid this complexity.

grimar marked an inline comment as done.Apr 17 2020, 12:38 AM
grimar added inline comments.
llvm/lib/ObjectYAML/ELFEmitter.cpp
1059–1073

p->p_filesz = last->offset - first->offset;
if (last->type != SHT_NOBITS)

p->p_filesz += last->size;

This is the same what this patch does for p_filesz.

When there are more than one SHT_NOBITS, the trailing SHT_NOBITS sections have the same sh_size.

Why? In my sample below used in the test case it is not true and yaml2obj should be able to handle any layout given.

## We have two SHT_NOBITS section at the end of the segment.
## It is important to have at least two of them because we had a bug
## related to an incorrect computation of the memory size of a segment when
## there were multiple SHT_NOBITS sections at the end.
  - Name:    .nobits3
    Type:    SHT_NOBITS
    Flags:   [ SHF_WRITE, SHF_ALLOC ]
    Size:    0x1D0
  - Name:    .nobits4
    Type:    SHT_NOBITS
    Flags:   [ SHF_WRITE, SHF_ALLOC ]
    Size:    0x100
grimar marked an inline comment as done.Apr 17 2020, 12:52 AM
grimar added inline comments.
llvm/lib/ObjectYAML/ELFEmitter.cpp
1059–1073

p->p_memsz = last->addr + last->size - first->addr;

Also this can't be used because depends on addressed that must be sorted, this is not our case here.

grimar planned changes to this revision.Apr 17 2020, 2:00 AM

I'll update it after landing patches that I am going to split/have splitted already.

grimar marked an inline comment as done.Apr 17 2020, 5:40 AM
grimar added inline comments.
llvm/test/tools/obj2yaml/program-headers.yaml
452 ↗(On Diff #256933)

I used "Document" because I am not sure it is not a bug: I am not sure that Sh* properties should affect on how do we create program headers.
"Sh*" properties were initially introduced to override values. E.g. ShSize described as "This can be used to override the sh_size field. It does not affect the content written.". Perhaps we set ShOffset too early and should not report an error for this case. Hence "Document"..

This test is a part of D78361 now.

grimar updated this revision to Diff 259881.Apr 24 2020, 7:10 AM
grimar marked 24 inline comments as done.
  • Rebased.
  • Addressed review comments.

Sharing my experiments just in case. A possible example from real life is:

test.s:

.section .foo.nobits, "aw", @nobits
.quad 0

.section .bar1.nobits, "aw", @nobits
.quad 0
.quad 0

.section .bar2.nobits, "aw", @nobits
.quad 0

.section .zed, "aw"
.quad 0
.quad 0
.quad 0

test.script:

SECTIONS { 
 . = 0x1000;
 .foo.nobits : { *(.foo.nobits) }
 . = 0x1100;
 .zed : { *(.zed) }
 .bar1.nobits : { *(.bar.nobits) }
 .bar2.nobits : { *(.bar.nobits) }
}

The output from bfd:

Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .foo.nobits       NOBITS           0000000000001000  00001000
       0000000000000008  0000000000000000  WA       0     0     1
  [ 2] .zed              PROGBITS         0000000000001100  00001100
       0000000000000018  0000000000000000  WA       0     0     1
  [ 3] .bar1.nobits      NOBITS           0000000000001118  00001118
       0000000000000010  0000000000000000  WA       0     0     1
  [ 4] .bar2.nobits      NOBITS           0000000000001128  00001118
       0000000000000008  0000000000000000  WA       0     0     1

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x0000000000001000 0x0000000000001000 0x0000000000001000
                 0x0000000000000118 0x0000000000000130  RW     0x1000

With this patch we should be able to describe such objects properly.

grimar edited the summary of this revision. (Show Details)Apr 24 2020, 7:13 AM
grimar edited the summary of this revision. (Show Details)
grimar retitled this revision from [yaml2obj] - Reimplement how tool calculates file and memory sizes of segments. to [yaml2obj] - Reimplement how tool calculates memory sizes of segments..Apr 26 2020, 11:19 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 26 2020, 11:19 PM
jhenderson added inline comments.Apr 28 2020, 1:02 AM
llvm/lib/ObjectYAML/ELFEmitter.cpp
1074

Fragments.size() -> E?

1077–1078

when the alignment gap present -> when there is a gap due to alignment

1079–1081

I'd change this whole paragraph into a continuation of the previous sentence:

", or when we have 2 or more trailing SHT_NOBITS sections with the same offset. In the latter case, we should take all trailing SHT_NOBITS section sizes into account."

It reads better to me, but also helps fix one or two ambiguities.

llvm/test/tools/yaml2obj/ELF/program-header-nobits.yaml
1–4

sections -> sections in segments

19

I was initially concerned that .progbits1 and .nobits1 share the same file offset. In such a case, the loader would not allocate the bytes for .nobits1 in the right place for any segment where it wasn't the only section in it. However, this is yaml2obj, not the linker, so it's probably fine to leave as-is, if there's some way of adding padding between the two sections without changing the address of .progbits1.

21–23

Same issue here. I'd naturally expect .nobits3 to have an offset of 0x220, since the filler bytes (which might need to be set to some non-zero value), need to appear after the bytes for .nobits2, rather than sharing the same location.

29–30

Something's not right here. Why is the memory size 0x500/0x3a8?

The highest end address of any section in the segment is 0x2000003b0, according to the section dump above, which would mean a size of 0x508/0x3b0 depending on which offset is used as the start point.

This might be because you've specified section addresses which conflict with the program header address, causing confusion - the program header with Offset 0 can't have the same VAddr as its first section, since the first section doesn't start at offset 0.

94

Unnecessary noise. Flags aren't needed for this test.

151

Either "This segment has only" or "A segment with only"

187

section -> sections

grimar marked an inline comment as done.Apr 28 2020, 1:10 AM
grimar added inline comments.
llvm/test/tools/yaml2obj/ELF/program-header-nobits.yaml
19

if there's some way of adding padding between the two sections without changing the address of .progbits1.

The best way I know is D78927.

grimar updated this revision to Diff 260590.Apr 28 2020, 4:26 AM
grimar marked 11 inline comments as done.
  • Addressed all review comments except one, which I think better to be addressed after landing the D78927.

(i.e. I expect to update this diff at least one more time).

llvm/test/tools/yaml2obj/ELF/program-header-nobits.yaml
19

So, I'd hold this patch a bit and update this and the place below after landing the D78927.

29–30

I've adjusted the VAddr.

jhenderson added inline comments.Apr 29 2020, 1:16 AM
llvm/lib/ObjectYAML/ELFEmitter.cpp
1075–1076

The comment flowing is now a bit off, since "be" could obviously be on the previous line. Could you reflow, please?

llvm/test/tools/yaml2obj/ELF/program-header-nobits.yaml
29–30

I don't think it's possible to share things the way you are. All your VAddr adjustment has done is to reduce the values slightly. The cause of the problem is still there.

For the offset 0 case, according to the section header table the first section is at address "0x00000001fffffea8". Meanwhile, the segmetns have the same address. However, the two do not share the same offset. Therefore, you end up with an input layout that looks like the following according to the two tables:

1fffffea8
| Sec1 | Sec2 | ... |
| Seg1               |
...

whereas in offset terms you have:

0                                  158
| Elf Header | Phdr table | Sec1 | Sec2 | ... |
| Seg1                                                   |

which means the address of Sec1 will effectively be calculated by the loader etc as 0x00000001fffffea8 + 0x158, which isn't what the section's claimed address is.

You won't be able to share the section's addresses in this way between the two modes, because the relative offset of the first section in the segments is going to be different between the two cases.

29–30

Urrggh, my examples got messed up a little (the 158 should line up with Sec1, not Sec2), but hopefully my point should be clear.

115

I just misread the "-" as a dash rather than "minus", so I'd explicitly use the word "minus" here.

149

Please keep the "a" (i.e. "A segment with only a SHT_NOBITS section.")

grimar updated this revision to Diff 265200.May 20 2020, 4:18 AM
grimar marked 4 inline comments as done.
  • Addressed review comments.
llvm/test/tools/yaml2obj/ELF/program-header-nobits.yaml
29–30

OK, thanks for detailed explanation. I think I just misunderstood what exactly you expect to see from the test that time. Starting from today, we can use the "Offset" field for both regular sections and Fills (rGbaf322598721), and I've this new feature to layout things properly (hopefully). I am also setting the different VAs for segments when changing their offsets now.

Not sure where, but my analysis of the test case suggests there's still a bug in the calculation.

llvm/test/tools/yaml2obj/ELF/program-header-nobits.yaml
30

I might be failing at counting, but shouldn't MemSiz here be 0x4f0? (i.e. 0x2000003f0 + 0x100 - 0x200000000 == Final section address + final section size - base address).

31

(Same here, but for 0x398 rather than 0x3b8).

33–34

Same off by 0x20 problem here.

36–37

FileSize here looks like it should be 0x220, matching the memory size (which looks to be correct). It looks like the .nobits2 section is not being allocated file space, even though there's something non-nobits appearing after it. Please correct my understanding if I've misunderstood something. Same problem with the OFFSET case (off by 0x20 in file size).

38

I was slightly surprised to not see a case for sections 1 to 4 or 1 to 3, but I assume that's because the 1 to 6 and 1 to 5 are essentially equivalent?

192

Am I right in thinking that this is unnecessary now? The fill chunk should cause this section to be placed after it, and if my counting is correct, that'll result in it appearing at 0x220 naturally.

grimar marked 3 inline comments as done.May 22 2020, 2:27 AM
grimar added inline comments.
llvm/test/tools/yaml2obj/ELF/program-header-nobits.yaml
36–37

It looks like the .nobits2 section is not being allocated file space, even though there's something non-nobits appearing after it.

Yes, yaml2obj does not allocate a file space in such case. In fact the following piece is the all we do for SHT_NOBITS:

} else if (auto S = dyn_cast<ELFYAML::NoBitsSection>(Sec)) {
  // SHT_NOBITS sections do not have any content to write.
  SHeader.sh_entsize = 0;
  SHeader.sh_size = S->Size;
}

I.e. it never allocates a file space for SHT_NOBITS. Seems to be the root of the problem.

So, should we teach yaml2obj to allocate a file space for a SHT_NOBITS section if there is any no-nobits chunk after it?
How does this behavior sound for you?

38

Honestly I do not remember why I did not include these initially. but I guess I tried to avoid having similar cases. Though now I think this test would benefit in readability from having all possibile cases. I am inclined to add them.

192

In addition to my comment above, and to clarify: so 0x220 naturally could happen here if yaml2obj allocated a file space for .nobits2.
Then I would not need to use "Offset".

jhenderson added inline comments.May 22 2020, 3:33 AM
llvm/test/tools/yaml2obj/ELF/program-header-nobits.yaml
36–37

I think that would make sense. As far as I know, when a SHT_NOBITS section appears in the middle of a segment (i.e. with other non-nobits sections afterwards), there's no other way of representing it correctly without allocating file space - if you include it in the address space, the later progbits sections will have an incorrect address to content mapping at runtime.

What do linkers do with a linker script where a nobits section is followed by a progbits section in the same segment?

38

Sounds reasonable to me.

192

Right. I misread something and thought yaml2obj was already allocating space.

grimar marked an inline comment as done.May 22 2020, 5:41 AM
grimar added inline comments.
llvm/test/tools/yaml2obj/ELF/program-header-nobits.yaml
36–37

I've tested GNU linkers. The behavior is similar. They allocate a space then:

.section .foo.nobits, "aw", @nobits
.quad 0

.section .bar1.nobits, "aw", @nobits
.quad 0
.quad 0

.section .bar2.nobits, "aw", @nobits
.quad 0

.section .zed, "aw"
.quad 0
.quad 0
.quad 0
SECTIONS { 
 . = 0x1000;
 .foo.nobits : { *(.foo.nobits) }
 .zed : { *(.zed) }
 .bar1.nobits : { *(.bar.nobits) }
 .bar2.nobits : { *(.bar.nobits) }
}

bfd output:

Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .foo.nobits       NOBITS           0000000000001000  00001000
       0000000000000008  0000000000000000  WA       0     0     1
  [ 2] .zed              PROGBITS         0000000000001008  00001008
       0000000000000018  0000000000000000  WA       0     0     1
  [ 3] .bar1.nobits      NOBITS           0000000000001020  00001020
       0000000000000010  0000000000000000  WA       0     0     1
  [ 4] .bar2.nobits      NOBITS           0000000000001030  00001020
       0000000000000008  0000000000000000  WA       0     0     1
  [ 5] .symtab           SYMTAB           0000000000000000  00001020
       0000000000000078  0000000000000018           6     5     8
  [ 6] .strtab           STRTAB           0000000000000000  00001098
       0000000000000001  0000000000000000           0     0     1
  [ 7] .shstrtab         STRTAB           0000000000000000  00001099
       0000000000000046  0000000000000000           0     0     1
Key to Flags:
...

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x0000000000001000 0x0000000000001000 0x0000000000001000
                 0x0000000000000020 0x0000000000000038  RW     0x1000

 Section to Segment mapping:
  Segment Sections...
   00     .foo.nobits .zed .bar1.nobits .bar2.nobits

Note that there is a single PT_LOAD and the file space is allocated for .foo.nobits.
But when I provide a script which creates 2 segments:

SECTIONS { 
 . = 0x1000;
 .foo.nobits : { *(.foo.nobits) }
 . = 0x2000;
 .zed : { *(.zed) }
 .bar1.nobits : { *(.bar.nobits) }
 .bar2.nobits : { *(.bar.nobits) }
}

then no allocation happens:

Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .foo.nobits       NOBITS           0000000000001000  00001000
       0000000000000008  0000000000000000  WA       0     0     1
  [ 2] .zed              PROGBITS         0000000000002000  00001000
       0000000000000018  0000000000000000  WA       0     0     1
  [ 3] .bar1.nobits      NOBITS           0000000000002018  00001018
       0000000000000010  0000000000000000  WA       0     0     1
  [ 4] .bar2.nobits      NOBITS           0000000000002028  00001018
       0000000000000008  0000000000000000  WA       0     0     1
  [ 5] .symtab           SYMTAB           0000000000000000  00001018
       0000000000000078  0000000000000018           6     5     8
  [ 6] .strtab           STRTAB           0000000000000000  00001090
       0000000000000001  0000000000000000           0     0     1
  [ 7] .shstrtab         STRTAB           0000000000000000  00001091
       0000000000000046  0000000000000000           0     0     1
...

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x0000000000001000 0x0000000000001000 0x0000000000001000
                 0x0000000000000000 0x0000000000000008  RW     0x1000
  LOAD           0x0000000000001000 0x0000000000002000 0x0000000000002000
                 0x0000000000000018 0x0000000000000030  RW     0x1000

I'll try to improve yaml2obj in this area.

grimar updated this revision to Diff 268127.Jun 3 2020, 3:47 AM
grimar marked 11 inline comments as done.
  • Addressed review comments.
llvm/test/tools/yaml2obj/ELF/program-header-nobits.yaml
30

Yes. Now, after D80629 was landed, this and other calculations are fixed.

38

I've did it. Unfortunately offsets changed again because of adding 2 more program headers :(

But before adding those, I've checked that they were fixed after landing D80629 (i.e. they matched values you expected to see).
And the new values also look right to me now.

jhenderson accepted this revision.Jun 4 2020, 1:13 AM

LGTM.

llvm/test/tools/yaml2obj/ELF/program-header-nobits.yaml
188–189

Just to make sure I'm clear - the file space in these last cases is due to the file header and program header table? It might be worth mentioning this somewhere in the comments.

This revision is now accepted and ready to land.Jun 4 2020, 1:13 AM
grimar planned changes to this revision.Jun 4 2020, 1:59 AM
grimar marked 2 inline comments as done.
grimar added inline comments.
llvm/test/tools/yaml2obj/ELF/program-header-nobits.yaml
188–189

the file space in these last cases is due to the file header and program header table?

Nah :(

It's because after adding 2 program headers in the last update, sections offset have changed.
The offset of the first section was previously 0x158, but now it is 0x1c8.

The following piece still assumes it is 0x158.

## Set file offsets of segments to the offset of the first section to show
## that with a non-zero offset we still calculate memory and file sizes properly.
# RUN: yaml2obj %s --docnum=3 -DOFFSET=0x158 -DVADDR=0x200000158 -o %t6
# RUN: llvm-readelf --sections --segments %t6 | FileCheck %s --check-prefixes=COMMON,OFFSET

And 0x1c8 - 0x158 = 0x70.

So techinally all values are calculated correctly, but it does not do what comment says now.
Going to fix it again.

grimar updated this revision to Diff 270700.Jun 15 2020, 3:22 AM
grimar marked an inline comment as done.
  • Adjusted values slightly to have a case that matches the comment "Set file offsets of segments to the offset of the first section to show...." properly.
This revision is now accepted and ready to land.Jun 15 2020, 3:22 AM
grimar requested review of this revision.Jun 15 2020, 3:24 AM

PTAL. Sorry for moving this back and forth so many times.

MaskRay added a comment.EditedJun 15 2020, 10:59 AM
for (size_t I = 0, E = Fragments.size(); I != E; ++I) {
  const Fragment &Cur = Fragments[I];
  const Fragment *Next = (I + 1 < E) ? &Fragments[I + 1] : nullptr;

  if (Next)
    MemSize += std::max(Next->Offset - Cur.Offset, Cur.Size); ///////////// [1]
  else
    MemSize += Cur.Size;
}

[1] is probably less than ideal. If a middle section has erroneous large sh_size (for example, due to ShSize:), we may want to ignore it when computing p_memsz. If we take Next->Offset - Cur.Offset contribution from one section and Cur.Size contribution from another, the sum will represent a value which is hard to explain.

We probably should ignore sh_offset and use sh_addr when computing p_memsz:

p->p_memsz = last->addr + last->size - first->addr;
jhenderson accepted this revision.Jun 16 2020, 1:55 AM

LGTM, but I'm happy to dicuss @MaskRay's suggestion too.

for (size_t I = 0, E = Fragments.size(); I != E; ++I) {
  const Fragment &Cur = Fragments[I];
  const Fragment *Next = (I + 1 < E) ? &Fragments[I + 1] : nullptr;

  if (Next)
    MemSize += std::max(Next->Offset - Cur.Offset, Cur.Size); ///////////// [1]
  else
    MemSize += Cur.Size;
}

[1] is probably less than ideal. If a middle section has erroneous large sh_size (for example, due to ShSize:), we may want to ignore it when computing p_memsz. If we take Next->Offset - Cur.Offset contribution from one section and Cur.Size contribution from another, the sum will represent a value which is hard to explain.

We probably should ignore sh_offset and use sh_addr when computing p_memsz:

p->p_memsz = last->addr + last->size - first->addr;

My personal opinion is that we should not pay any attention to ShSize (or indeed any similar properties like ShOffset etc) for any calculation except for the value that is finally written to the section header table. Thus, I don't think we need to worry about the case you're suggesting (i.e. with an erroneous size).

This revision is now accepted and ready to land.Jun 16 2020, 1:55 AM