This is an archive of the discontinued LLVM Phabricator instance.

[ELF] Merge PT_LOAD segments in case they have different attributes and reside in the same memory page
AbandonedPublic

Authored by evgeny777 on Jun 28 2016, 9:27 AM.
Tokens
"Like" token, awarded by pelikan.

Details

Reviewers
ruiu
ikudrin
Summary

Currently linker script support in lld is almost useless due to following reasons:

  1. First PT_LO`D segment always has zero virtual address (see http://reviews.llvm.org/D21750)
  2. PT_LOAD segments are very often not aligned correctly, because section grouping is not yet implemented

Consider the following script:

SECTIONS  {
   . = 0x10000000;
   .bss : { *(.bss) }
   .data : { *(.data) }
   . = 0x11000000;
   .text : { *(.text) }
 }

And the following C code:

int main(void) { return 0; }

Compiling and linking it using lld and script provided will result in ELF having following segments:

LOAD  0x002000 0x0000000011000000 0x0000000011000000 0x182 0x182 R E
LOAD  0x002184 0x0000000011000184 0x0000000011000184 0x1d0 0x1d0 R
LOAD  0x002354 0x0000000011000354 0x0000000011000354 0x04c 0x04c R E 
LOAD  0x0023a0 0x00000000110003a0 0x00000000110003a0 0x188 0x188 RW

Linux kernel will map them one by one to virtual address 0x11000000 (page-aligned), with map size of 0x1000 (page rounded)
and different protection attributes. Finally one will have region 0x11000000 - 0x11001000 mapped with RW attributes. Jumping to
entry point 0x11000000 will immediately cause protection fault (SIGSEGV). Not a single user instruction will be executed

To (temporary) overcome this problem and generate working image I propose to merge PT_LOAD segments ORing protection attributes
(this is, to my beleif, gold linker does). After merge one will have a single PT_LOAD segment:

LOAD  0x002000 0x0000000011000000 0x0000000011000000 0x528 0x528 RWE

As now segment has RWE protection attributes user instructions can be executed normally.

Diff Detail

Event Timeline

evgeny777 updated this revision to Diff 62104.Jun 28 2016, 9:27 AM
evgeny777 retitled this revision from to [ELF] Merge PT_LOAD segments in case they have different attributes and reside in the same memory page.
evgeny777 updated this object.
evgeny777 added reviewers: ruiu, ikudrin.
evgeny777 set the repository for this revision to rL LLVM.
evgeny777 added a project: lld.
ruiu edited edge metadata.Jun 28 2016, 11:55 PM

Does merging attributes produce a desired result? It seems to me that it would silently produce writable text segments or something like that, which seems even dangerous. I'd think what we should do instead is to group input sections by attributes and put sections with different attributes into different pages.

Grouping sections doesn't mean you won't have this problem. Consider this script

SECTIONS  {
   . = 0x10000000;
   .text : { *(.text) }
   . = 0x10001000;
    .data : { *(.data) }
}

The gap between .text and .data may be too small to put .plt, .fini and other executable sections, so you can't group them together and have to put
them to the end of the file with other sections (like gold does)

I tried starting sections from new page in case there is an attribute conflict, but it breaks a lot of existing tests. It can probably break something else.
Do you think this is a better approach?

A small addition:

Have two segments:

LOAD  0x002354 0x0000000011000354 0x0000000011000354 0x04c 0x04c R E 
LOAD  0x0023a0 0x00000000110003a0 0x00000000110003a0 0x188 0x188 RWE

Will also make segment 0x10000000 - 0x10001000 writeable. So protection violation already exists

ruiu added a comment.Jun 29 2016, 3:04 AM

So the reason why we end up with a page that needs to have merged page attributes is because we layout sections without any padding in between, right?

Usually, when you give a section script to enforce layout, your section layout usually do not try to put different sections into the same page (and I think it's a good thing and we should report an error otherwise.)

However, if a linker script does not specify a layout for *all* sections, we place remaining sections at end of file without any padding. I think it is the problem. I believe we need to add padding in that case.

Like I said, this was the first thing I tried, but it breaks about 7 unit tests (though all are related to scripting).
How to handle this? Update exisiting test cases?

ruiu added a comment.Jun 29 2016, 3:51 AM

Please update unit tests. They are not golden tests, but they just reflect our current understanding how the linker should behave.

evgeny777 updated this revision to Diff 62227.Jun 29 2016, 8:58 AM
evgeny777 edited edge metadata.
evgeny777 removed rL LLVM as the repository for this revision.

Diff updated.

Instead of checking if any two sections with different attributes share the same memory page, I check if two consecutive sections reside in different PT_LOAD segments.
If so I do page aligning for a section. The desired behavior (padding) is achieved due to createPhdrs() creates new PT_LOAD segment if section attributes are different from those of previous one.

The next thing I plan to implement is support of PHDRS directive and section-to-segment assignments. In such case sections with different attributes may reside in a single PT_LOAD segment (which has its own attributes specified inside PHDRS block). The algorithm described above will work correctly in this case as well.

Any comments on this?

ruiu added inline comments.Jun 30 2016, 11:50 PM
ELF/LinkerScript.cpp
247

You have two spaces between * and PrevSec.

271–273

You want to place a section to a new pace if the section's attribute is different from the previous one, right?

You can have a previous attribute as a local variable in this function instead of defining a set of functions for Anchor sections, no?

evgeny777 added inline comments.Jul 1 2016, 12:37 AM
ELF/LinkerScript.cpp
271–273

Not exactly. I want to place section to a new page if it is located in a different PT_LOAD segment with previous one.
Current version creates new PT_LOAD segment for a section if it has different attributes with previous one (see createPhdrs).
So comparing PT_LOAD segments has exactly the same effect as comparing section attributes

But consider you have this linker script

PHDRS {
    myheader : PT_LOAD FLAGS 7; / * 7 means PF_R | PF_W | PF_X */
}

SECTIONS {
  .text : { *(.text) } : myheader
  .data : { *(.text) } : myheader
}

This linker script forces .text (RX) and .data (RW) to be placed to a same PT_LOAD with RWX attributes. Placing .data section to a new page will simply waste memory and space in ELF image.

ruiu added a comment.Jul 1 2016, 1:11 AM

We do not support FLAGS, so we don't need to care about that at this moment.

Yes, I know :).
But the reason, I'm doing this is that I want to implement script support.

ruiu added a comment.Jul 1 2016, 1:23 AM

Well, in general, I prefer simple code that is enough to support a feature than more complex code that "prepares" for future enhancements. You can always add code when you need it.

evgeny777 updated this revision to Diff 62489.Jul 1 2016, 6:33 AM

Diff updated.
Now it updates section VAs in case they share the same memory page and have conflicting attributes.
This is done only in case they are not mentioned in the linker script (so called orphaned sections).

ruiu added inline comments.Jul 6 2016, 5:04 PM
ELF/LinkerScript.cpp
211

Likewise, IsOrphan.

216

Why do you have to distinguish orphaned sections and non-orphaned sections? I think we reserve the right to order orphaned sections in any way (our current layout is different from gold or bfd), and I don't see a reason to handle them specially.

ELF/LinkerScript.h
49

Let's name this IsOrphan.

ELF/Target.h
109–111

I think this is not generic enough to put it in this header because you are using it in only one file. I'd move this to LinkerScript.cpp (and rename getPageNumber).

Please rename address -> Address (but I'd name Addr). Parameter names should start with lowercase letters.

evgeny777 added inline comments.Jul 6 2016, 10:21 PM
ELF/LinkerScript.cpp
216

I think that it's up to user to specify correct alignment of sections listed in linker script, so I don't do anything for them (garbage in - garbage out)
Orphaned sections are not listed in linker script, so (I think) I can fix their alignment
You thoughts?

ruiu added inline comments.Jul 8 2016, 3:26 PM
ELF/LinkerScript.cpp
216

Is there any problem if you unconditionally align a section to a new page boundary if attributes of the section is different from attributes of the previous section? I mean, if a user specifies impossible layout (by trying to putting different kind of sections to the same page), I think it is okay to do whatever we think sane.

grimar added a subscriber: grimar.Jul 13 2016, 2:58 AM

Eugene, please update this and others patches you posted to have llvm-commits in subscribers.
Otherwise probably nobody except reviewers you assigned are able to notice them. Thanks !

grimar added inline comments.Jul 13 2016, 3:08 AM
ELF/LinkerScript.cpp
215

btw, may be it is worth to change getAlignment to return unsigned ?
(not relative to this patch, but can help to remove static cast below)

218

variable names should be uppercase.

evgeny777 abandoned this revision.Oct 19 2016, 8:17 AM

Not needed any longer