This is an archive of the discontinued LLVM Phabricator instance.

[ELF] - Linkrscript: change the way of creating PT_LOADs segments when LS is used.
AbandonedPublic

Authored by grimar on Aug 15 2016, 5:34 AM.

Details

Reviewers
ruiu
Summary

I noticed that logic of PT_LOADs creation when script is used in gnu linekrs is different from what we have now.
For example ld seems to create 2 PT_LOADs. In first one it puts all sections before first writable, it is text load,
and also it has data load where all sections after and including first writable are placed.

FreeBSD script seems to rely on this logic, and also it seems to be clear and simple, so patch implements the same.

Alternative to this can be solution to force align to memory page boundary of sections with different flags.
That is the same we do for non-LS case. This method also can help to avoid unaligned segments.

Diff Detail

Event Timeline

grimar updated this revision to Diff 68022.Aug 15 2016, 5:34 AM
grimar retitled this revision from to [ELF] - Linkrscript: shange the way of creating PT_LOADs segments when LS is used..
grimar updated this object.
grimar added a reviewer: ruiu.
grimar added subscribers: llvm-commits, grimar, davide, evgeny777.
grimar retitled this revision from [ELF] - Linkrscript: shange the way of creating PT_LOADs segments when LS is used. to [ELF] - Linkrscript: change the way of creating PT_LOADs segments when LS is used..
grimar updated this object.Aug 15 2016, 6:00 AM
ruiu edited edge metadata.Aug 15 2016, 3:32 PM

How did you know that FreeBSD's linker scripts depend on this behavior?

If you think the new behavior you are implementing in this patch simpler than the current one, why don't you just replace it with the new one, instead of keeping two different functions?

ELF/Writer.cpp
968

What if the current section is writable and NewFlags are not writable? This code seems to merge them, but is it the right thing to do?

In D23505#515896, @ruiu wrote:

How did you know that FreeBSD's linker scripts depend on this behavior?

ld output of FreeBSD kernel is 2 PT_LOADS:

LOAD           0x0000000000000000 0xffffffff80200000 0xffffffff80200000
                 0x0000000000fe2de8 0x0000000000fe2de8  R E    200000
LOAD           0x0000000000fe3000 0xffffffff813e3000 0xffffffff813e3000
                 0x0000000000129430 0x00000000003313c0  RW     200000

lld output currently is:

LOAD           0x0000000000000000 0xffffffff80001000 0xffffffff80001000
                0x00000000000e410e 0x00000000000e410e  R      1000
LOAD           0x00000000000e4110 0xffffffff800e5110 0xffffffff800e5110
                0x0000000000ba2b0c 0x0000000000ba2b0c  R E    1000
LOAD           0x0000000000c86c20 0xffffffff80c87c20 0xffffffff80c87c20
                0x000000000038ab28 0x000000000038ab28  R      1000
LOAD           0x0000000001012000 0xffffffff81013000 0xffffffff81013000
                0x000000000012ae30 0x0000000000331e30  RW     1000

2 and 3 lld loads are not aligned to memory page. Script does not so that, assuming that
alignment for first section and for last (writable) is enough because 2 loads will be created.

So fix for that can be:

  1. Changing script (not always acceptable).
  2. Forced alignment of sections when flags are changed (should work, but not so obvious as 3 + will create more loads and waste more VA space).
  3. This patch's solution (create 2 PT_LOADS just like ld do).

If you think the new behavior you are implementing in this patch simpler than the current one, why don't you just replace it with the new one, instead of keeping two different functions?

This patch changes the way we create PT_LOADS when use linkerscript. I was unsure if I should change the non-linkerscript case as well.
Though I think it will work fine either. This will require fix for many testcases I guess, but this is in the nature of the case.

grimar added inline comments.Aug 16 2016, 12:48 AM
ELF/Writer.cpp
968

That is what ld do. I think it is fine because writable sections go after all other ones.
At least it is assumed they should.

In D23505#515896, @ruiu wrote:

If you think the new behavior you are implementing in this patch simpler than the current one, why don't you just replace it with the new one, instead of keeping two different functions?

I just checked what will happen if I leave only this implementation for all (for non-scripted case) and had 187 testcases fails.

Current patch changes logic only for scripted case and already had to modify 20 testcases. We probably can land it first and check if it satisfies our needs
generally for scripted path before trying to change all 187 testcases for non-scripted one.

2 and 3 lld loads are not aligned to memory page. Script does not so that, assuming that

I think you mean these ones:

LOAD 0x00000000000e4110 0xffffffff800e5110 0xffffffff800e5110
LOAD 0x0000000000c86c20 0xffffffff80c87c20 0xffffffff80c87c20

What I see is that file offset and VA are properly aligned.

(0x00000000000e4110 mod 0x1000) == (0xffffffff800e5110 mod 0x1000) == 0x110

and

(0x0000000000c86c20 mod 0x1000) == (0xffffffff80c87c20 mod 0x1000) == 0xc20

If it were just some ELF binary it would work fine.
Does FreeBSD bootloader require segments to start on page boundary? Did you check this?

If it were just some ELF binary it would work fine.
Does FreeBSD bootloader require segments to start on page boundary? Did you check this?

Nope. Why some binary would work fine ?

Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
LOAD           0x00000000000e4110 0xffffffff800e5110 0xffffffff800e5110
                0x0000000000ba2b0c 0x0000000000ba2b0c  R E    1000
LOAD           0x0000000000c86c20 0xffffffff80c87c20 0xffffffff80c87c20
                0x000000000038ab28 0x000000000038ab28  R      1000

End of first load is :
0xffffffff800e5110 + 0x0000000000ba2b0c = FFFFFFFF80C87C1C

It shares the same memory page with the second load at 0xffffffff80c87c20,
though they have different access attributes.

Should this really work fine ?

Nope. Why some binary would work fine ?

Well at least it will be loaded by the kernel. I didn't mean actual result of execution.

End of first load is :

That's the problem, yes, but initially you talked about "aligning to memory page", not overlapping. May be just adding one page to
3rd segment VA will fix the problem, no?

Nope. Why some binary would work fine ?

Well at least it will be loaded by the kernel. I didn't mean actual result of execution.

I guess since this patch looks the same with ld, it should solve whole problem, but not the part.

End of first load is :

That's the problem, yes, but initially you talked about "aligning to memory page", not overlapping. May be just adding one page to
3rd segment VA will fix the problem, no?

That option is mentioned in the end of description of the patch. And in one of my comments above ("will create more loads and waste more VA space").
Looks we might need 2M memory page size (D23352) and creating tons of loads does not look a good option for me.

I see, but you're suggesting merging segment attributes which is a potential security hole. May be for FreeBSD kernel this is just fine, but doing this always?

I see, but you're suggesting merging segment attributes which is a potential security hole. May be for FreeBSD kernel this is just fine, but doing this always?

I think security hole is mostly about when segment is Writable + Executable. This patch does not do that.
I do not think that combine of Readable + Executable is critical. At least looks like ld do that many years.

ruiu added inline comments.Aug 17 2016, 2:44 PM
ELF/Writer.cpp
968

I do not understand. Why it is guaranteed that writable sections are written after all other sections? If you are using linker scripts, you have a full control over the section layout, so you can put read-only sections before writable sections.

In addition to that, probably needless to say, but doing the same thing as ld does is not the ultimate goal. It's an important goal, but the ultimate goal should be to implement a logic that we understand and can justify. The last part is important. (So, we don't want to say "we don't know why, but this is what ld does, so do we.")

grimar added inline comments.Aug 17 2016, 11:53 PM
ELF/Writer.cpp
968

I do not understand. Why it is guaranteed that writable sections are written after all other sections? If you are using linker scripts, you have a full >control over the section layout, so you can put read-only sections before writable sections.

It is not quaranteed. Using linkerscript you can do whatever you want, even useless things. At fact ld looks always create 2 loads which are text and data. At least that means that:

  1. This should met everybodies needs as programs in the wild that use ld works with this behavior.
  2. It is probably makes no sence to mix or reorder sections in the way when writable are not after readable, just because it was impossible with ld, what I guess means nobody wants and use that.
  3. This approach of 2 loads instead N simplifies things.
  4. If you want full control over headers you always can use PHDRS command.

In addition to that, probably needless to say, but doing the same thing as ld does is not the ultimate goal. It's an important goal, but the ultimate >goal should be to implement a logic that we understand and can justify. The last part is important. (So, we don't want to say "we don't know why, >but this is what ld does, so do we.")

Sure. My point is ld behavior is just simpler and looks is enough for users, so see no reasons why not to use it. Spec says "The linker will create reasonable program headers by default". From what we see when linking such programs like FreeBSD kernel, lld creates 4 PT_LOADS when ld 2. Question is what is more reasonable and why we create something that looks not really needed ?

ruiu added inline comments.Aug 18 2016, 1:42 PM
ELF/Writer.cpp
968

I don't think none of your arguments justify the behavior you are implementing here. How would you describe to the user why we merge section A and B if A is read-only and B is writable but don't merge if A is writable and B is read-only?

If we don't want to merge read-only sections with writable sections, we always put them in separate segments. If not, we should merge them. The proposed behavior seems inconsistent and hard to justify.

grimar added inline comments.Aug 19 2016, 4:57 AM
ELF/Writer.cpp
968

I don't think none of your arguments justify the behavior you are implementing here. How would you describe to the user why we merge section > A and B if A is read-only and B is writable but don't merge if A is writable and B is read-only?

This patch merges all sections before first writable to first segment, we can call it "text segment". All other following sections no matter of their flags are placed to another segment ("data segment"). There is no any inconsistency here, behavior is always to do that.
It is much easier than have multiple segments and care about proper memory page alignment for all of them in script.

If we don't want to merge read-only sections with writable sections, we always put them in separate segments. If not, we should merge them. The proposed behavior seems inconsistent and hard to justify.

ruiu added inline comments.Aug 19 2016, 8:59 AM
ELF/Writer.cpp
968

I think I still do not understand why you want it. What's the motivation to implement this logic? Do you have a program that doesn't work without this change?

grimar abandoned this revision.Aug 19 2016, 9:12 AM
grimar added inline comments.
ELF/Writer.cpp
968

Nope. Initially I thought that FreeBSD kernel needs this, but looks it lives fine without.
I am almost sure there are such programms but I do not know any, so I think I have no solid motivation (except except my faith in idea of 2 loads). I'll abandon this for now.