This is an archive of the discontinued LLVM Phabricator instance.

[ELF] - Implemented -N (-omagic)
ClosedPublic

Authored by grimar on Nov 19 2016, 8:48 AM.

Details

Summary

-omagic should can be useful for FreeBSD boot loaders.

-N (-omagic): Set the text and data sections to be readable and writable. Also, do not page-align the data segment.

Diff Detail

Event Timeline

grimar updated this revision to Diff 78627.Nov 19 2016, 8:48 AM
grimar retitled this revision from to [ELF] - Implemented -N, --norosegment flags..
grimar updated this object.
grimar added reviewers: ruiu, rafael, emaste.
grimar added subscribers: llvm-commits, grimar, evgeny777.
ruiu added inline comments.Nov 19 2016, 10:21 AM
ELF/Writer.cpp
1106–1108

I really don't want to mess up this function more than this. It's already complicated. Do you really need -N?

grimar added inline comments.Nov 19 2016, 10:44 AM
ELF/Writer.cpp
1106–1108

I am still digging the FreeBSD stuff here, as far I know they tried to use linkerscript instead but had to revert the change as it was buggy for them. From what I know it is a desired feature to have anyways at least for now.
Ed might tell more I think, as I mentioned, I am still investigating that all.

Just in case,
I separated and uploaded version that implements only --no-rosegment here: https://reviews.llvm.org/D26889

ruiu edited edge metadata.Nov 19 2016, 3:46 PM

https://reviews.llvm.org/D26889 contains the exact code that I said it's adding more complexity to the already complicated function.

brad added a subscriber: brad.Nov 20 2016, 12:25 PM
brad added inline comments.
ELF/Writer.cpp
1106–1108

We need -N for OpenBSD's boot blocks/loaders as well.

grimar updated this revision to Diff 79383.Nov 28 2016, 2:54 AM
grimar edited edge metadata.
  • Simplified.
grimar retitled this revision from [ELF] - Implemented -N, --norosegment flags. to [ELF] - Implemented -N..Nov 28 2016, 2:54 AM
grimar updated this object.
grimar updated this revision to Diff 79384.Nov 28 2016, 2:56 AM
  • Sort.
emaste edited edge metadata.Nov 28 2016, 7:11 AM

-N changes a few things:

  1. Makes the text and data sections readable and writeable.
  2. Does not page align data.
  3. Uses the OMAGIC magic number, if applicable to the output format.

For FreeBSD we don't care about these per se, but we rely on -N allowing us to generate exceptionally small binaries, such as the 512-byte master boot record (MBR) used for BIOS booting.

For example, consider this trivial program:

#include <unistd.h>
void
_start(void)
{
        _exit(var_data);
}

Linking normally with GNU ld we see:

% cc -nostdlib example.c /usr/lib/libc.a; strip a.out
% readelf -l a.out                                   

Elf file type is EXEC (Executable file)
Entry point 0x400120
There are 4 program headers, starting at offset 64

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x0000000000000000 0x0000000000400000 0x0000000000400000
                 0x0000000000000250 0x0000000000000250  R E    200000
  LOAD           0x0000000000000250 0x0000000000600250 0x0000000000600250
                 0x0000000000000010 0x0000000000000018  RW     200000
  GNU_EH_FRAME   0x000000000000018c 0x000000000040018c 0x000000000040018c
                 0x000000000000002c 0x000000000000002c  R      4
  GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000  RW     8

 Section to Segment mapping:
  Segment Sections...
   00     .text .eh_frame_hdr .eh_frame 
   01     .data .bss 
   02     .eh_frame_hdr 
   03

and with -N:

% cc -Wl,-N -nostdlib example.c /usr/lib/libc.a; strip a.out
% readelf -l a.out                                          

Elf file type is EXEC (Executable file)
Entry point 0x4000f0
There are 3 program headers, starting at offset 64

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x00000000000000f0 0x00000000004000f0 0x00000000004000f0
                 0x0000000000000140 0x0000000000000148  RWE    10
  GNU_EH_FRAME   0x000000000000015c 0x000000000040015c 0x000000000040015c
                 0x000000000000002c 0x000000000000002c  R      4
  GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000  RW     8

 Section to Segment mapping:
  Segment Sections...
   00     .text .eh_frame_hdr .eh_frame .data .bss 
   01     .eh_frame_hdr 
   02

Excluding the eh_frame_hdr In the first case we have two pages, one RE and one RW, and in the 2ond case with -N we have one RWE page.

In FreeBSD we tried switching to a linker script in r305353 (FreeBSD review D7409 has some comments). We found LLVM PR 30415 when testing with the linker script + LLD.

The linker script change broke one of the loaders when still using ld.bfd though, and the change was reverted in r307954. The linker script change will likely be fixed and reapplied at some point.

Also, the actual boot0 (mbr) Makefile and source is here, if a concrete example is useful: https://svnweb.freebsd.org/base/head/sys/boot/i386/boot0/

ruiu accepted this revision.Nov 28 2016, 10:07 AM
ruiu edited edge metadata.

LGTM

ELF/Options.td
267

For one letter option, please use Flag<["-"], "N"> instead of F. We don't want to accept --F.

This revision is now accepted and ready to land.Nov 28 2016, 10:07 AM
ruiu added a comment.Nov 28 2016, 10:08 AM

Please update the commit message to say that you are implementing --omagic (I don't remember what -N is, and I guess that's true for most people.)

Minor clarification: my comments above are more background on how & why we use -N in FreeBSD, not an example of anything wrong with the patch.

Please update the commit message to say that you are implementing --omagic (I don't remember what -N is, and I guess that's true for most people.)

One interesting aspect of this change in LLD is that the attribute responsible for its name (that is, the magic number) is not relevant for ELF and doesn't offer any immediate clue as to how LLD's output will change under the option. But it's much easier to find useful information in a search for "--omagic" compared to "-N". In an eventual LLD man page we'll presumably need a note explaining that the long form of the option comes from a pre-ELF object file format.

I'm testing now

ELF/Options.td
166

extra whitespace on this line

emaste accepted this revision.Nov 28 2016, 3:59 PM
emaste edited edge metadata.

This gives me a working hello world and reasonable looking phdr:

% cat hello_data.c
#include <stdio.h>

int bss_var;
int data_var = 3;

int
main(int argc, char *argv[])
{
        printf("hello bss %d data %d\n", bss_var, data_var);
        return 0;
}
% cc -fuse-ld=lld -Wl,-N,-z,norelro hello_data.c
% ./a.out
hello bss 0 data 3
% readelf -l a.out

Elf file type is EXEC (Executable file)
Entry point 0x2004d0
There are 7 program headers, starting at offset 64

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  PHDR           0x0000000000000040 0x0000000000200040 0x0000000000200040
                 0x0000000000000188 0x0000000000000188  R      8
  INTERP         0x00000000000001c8 0x00000000002001c8 0x00000000002001c8
                 0x0000000000000015 0x0000000000000015  R      1
      [Requesting program interpreter: /libexec/ld-elf.so.1]
  LOAD           0x0000000000000000 0x0000000000200000 0x0000000000200000
                 0x00000000000009c0 0x00000000000009cc  RWE    1000
  DYNAMIC        0x0000000000000868 0x0000000000200868 0x0000000000200868
                 0x0000000000000120 0x0000000000000120  RW     8
  GNU_EH_FRAME   0x00000000000004a0 0x00000000002004a0 0x00000000002004a0
                 0x0000000000000024 0x0000000000000024  R      1
  GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000  RW     0
  NOTE           0x00000000000001e0 0x00000000002001e0 0x00000000002001e0
                 0x0000000000000030 0x0000000000000030  R      4

 Section to Segment mapping:
  Segment Sections...
   00     
   01     .interp 
   02     .interp .note.tag .rodata .dynsym .gnu.version .gnu.version_r .gnu.hash .hash .dynstr .eh_frame .rela.plt .eh_frame_hdr .text .init .fini .plt .data .ctors .dtors .jcr .dynamic .got.plt .bss 
   03     .dynamic 
   04     .eh_frame_hdr 
   05     
   06     .note.tag

The RO GNU_EH_FRAME may cause trouble in some applications, but FreeBSD's rtld didn't seem to care (and I don't think it should matter for the actual expected use cases for -N).

I did have to use -z norelro though: maybe we should fail if -N is used with relro or force norelro? It's arguable that relro is fundamentally incompatible with -N. Either way, I think this patch should go in as is first.

ELF/Options.td
165

Maybe "Combine text and data into one readable and writable segment"? We should definitely refer to segment here not section because the shdr still has .text .data .bss with different perms.

Also see PR 31196 for a bug found while testing this patch.

grimar retitled this revision from [ELF] - Implemented -N. to [ELF] - Implemented -N (-omagic).Nov 29 2016, 1:51 AM
grimar updated this object.
grimar edited edge metadata.
This revision was automatically updated to reflect the committed changes.

I did have to use -z norelro though: maybe we should fail if -N is used with relro or force norelro? It's arguable that relro is fundamentally incompatible with -N. Either way, I think this patch should go in as is first.

It should probably be possible to implement relro for -N. But I guess there is no users of that.
I would just turn it off if -N used.
Just for record - N also implies static output, though I did not find the reasons for that and removed from last diff.