This is an archive of the discontinued LLVM Phabricator instance.

[ELF] - Do not create 4gb output when -obinary -Ttext and -omagic used together.
AbandonedPublic

Authored by grimar on Nov 29 2016, 4:23 AM.

Details

Reviewers
ruiu
rafael
Summary

This fixed PR31196,

problem is that ElfHeader address was taken in account when we calculated offsets:

static uintX_t getFileAlignment(uintX_t Off, OutputSectionBase *Sec) {
...
  return First->Offset + Sec->Addr - First->Addr;

Sec->Addr is 600 here.
First->Addr is 0x10000 (First here is Out<ELFT>::ElfHeader->Addr == ImageBase).

Binary output does not have headers, we do not need to rely on their address to
calculate file offset.

Diff Detail

Event Timeline

grimar updated this revision to Diff 79536.Nov 29 2016, 4:23 AM
grimar retitled this revision from to [ELF] - Do not create 4gb output when -obinary -Ttext and -omagic used together. .
grimar updated this object.
grimar added reviewers: ruiu, rafael.
grimar added subscribers: llvm-commits, grimar, evgeny777, emaste.

For me at least this approach (avoiding adding an empty PT_LOAD) seems to be nicer than D27201, although I find the ternary operator and different condition on the following line somewhat puzzling. I'm curious what @ruiu or @rafael think.

ELF/Writer.cpp
1120

Ah. Why do we add Load for the !ScriptConfig->HasSections case?

For me at least this seems to make it more clear:

Phdr *Load = nullptr;
if (!ScriptConfig->HasSections && !Config->OFormatBinary) {
    Load = AddHdr(PT_LOAD, Flags);
    Load->add(Out<ELFT>::ElfHeader);
    Load->add(Out<ELFT>::ProgramHeaders);
}
ruiu edited edge metadata.Nov 29 2016, 7:55 AM

Why are we creating program headers for --oformat=binary in the first place?

In D27200#607883, @ruiu wrote:

Why are we creating program headers for --oformat=binary in the first place?

I think we always create them.
That true for obinary or case when we exclude them later with script:

Out<ELFT>::ElfHeader = make<OutputSectionBase>("", 0, SHF_ALLOC);
Out<ELFT>::ElfHeader->Size = sizeof(Elf_Ehdr);
Out<ELFT>::ProgramHeaders = make<OutputSectionBase>("", 0, SHF_ALLOC);
Out<ELFT>::ProgramHeaders->updateAlignment(sizeof(uintX_t));

We just do not always add them to output sections.

In D27200#607883, @ruiu wrote:

Why are we creating program headers for --oformat=binary in the first place?

And that patch (and D27201) as variation stops adding these headers as output sections,
like it supposed to be I think.

ruiu added a comment.Nov 29 2016, 8:07 AM

The way how this patch fixes the issue is a bit puzzling and too subtle (the important condition is buried inside createPhdrs which later affects file offset computation in an nonobvious). I believe there's a better way. Let me take a closer look.

In D27200#607902, @ruiu wrote:

The way how this patch fixes the issue is a bit puzzling and too subtle (the important condition is buried inside createPhdrs which later affects file offset computation in an nonobvious). I believe there's a better way. Let me take a closer look.

The fact we are setting offsets basing on First section in a load actually does not seems too subtle for me:

static uintX_t getFileAlignment(uintX_t Off, OutputSectionBase *Sec) {
...
  return First->Offset + Sec->Addr - First->Addr;

-Ttext in this case sets address of .text below the 0x10000 what is ImageBase (to 0x600).
We set image base address to headers:

uintX_t BaseVA = Config->ImageBase;
Out<ELFT>::ElfHeader->Addr = BaseVA;
Out<ELFT>::ProgramHeaders->Addr = BaseVA + Out<ELFT>::ElfHeader->Size;

one of solutions probably could be set some different BaseVA here in that case (0x600 - X),
but I found that much more puzling and wrong to rely on headers addresses for offsets for binary output,
since they should absent and even PT_LOAD can be excessive and probably also should not be created (like this patch do).

ruiu added inline comments.Nov 30 2016, 1:51 PM
ELF/Writer.cpp
1120

Agreed. But furthermore, can we avoid calling createPhdrs() at all if OFormatBinary is true? If it's true, all data created in this function will be just ignored later.

emaste added inline comments.Nov 30 2016, 2:00 PM
ELF/Writer.cpp
1120

That would be even more clear :)

grimar added inline comments.Dec 1 2016, 2:06 AM
ELF/Writer.cpp
1120

Well, if we will avoid calling createPhdrs() that means that we will be not able to pagealign .data/.text.
Because page aligning as perfomed basing on loads logic (what is intentional):

for (const Phdr &P : Phdrs)
  if (P.H.p_type == PT_LOAD)
    P.First->PageAlign = true;

As a result our output will be different from ld.

Though looking on PR31196, it creates single load. Also I suppose binary output is used mostly with linkerscripts
to control internal layout and much probably do not need auto pagealign we provide for regular link.
So far I suppose that should be probably OK.

But that also affects on file offsets calculation. Which is currently performed as:

OutputSectionBase<ELFT> *First = Sec->FirstInPtLoad;
// If two sections share the same PT_LOAD the file offset is calculated using
// this formula: Off2 = Off1 + (VA2 - VA1).
if (Sec == First)
  return alignTo(Off, Target->MaxPageSize, Sec->Addr);
return First->Offset + Sec->Addr - First->Addr;

(Introduced in D25014).

where FirstInPtLoad set from:

template <class ELFT> void PhdrEntry<ELFT>::add(OutputSectionBase *Sec) {
..
  if (H.p_type == PT_LOAD)
    Sec->FirstInPtLoad = First;
}

And I think we want to keep the effect from this logic.

So I guess technically we can stop calling createPhdrs(), but that involves much more changes than this patch do, do we want it ?

grimar added inline comments.Dec 1 2016, 5:32 AM
ELF/Writer.cpp
1120

Ah. Why do we add Load for the !ScriptConfig->HasSections case?

It is used for elf and program headers. I investigated if it possible to do suggested change.
At first it looks good since allows to remove the next code:

if (!FirstPTLoad->First) {
  // Sometimes the very first PT_LOAD segment can be empty.
  // This happens if (all conditions met):
  //  - Linker script is used
  //  - First section in ELF image is not RO
  //  - Not enough space for program headers.
  // The code below removes empty PT_LOAD segment and updates
  // program headers size.
  Phdrs.erase(FirstPTLoad);
  Out<ELFT>::ProgramHeaders->Size =
      sizeof(typename ELFT::Phdr) * Phdrs.size();
}

But that breaks linkerscript\at.s test and logic.

# RUN: echo "SECTIONS { \
# RUN:  . = 0x1000; \
# RUN:  .aaa : AT(0x2000) \
# RUN:  { \
# RUN:   *(.aaa) \
# RUN:  } \

It relies on the fact that we create new LOAD for each section that specified AT.
With suggested change headers are placed to load containing .aaa,
that breaks calculation of LMA finally:

if (!P.HasLMA)
  H.p_paddr = First->getLMA();

So I think it is easier to leave them in created load for now than handle such specific case separatelly.

grimar updated this revision to Diff 80057.Dec 2 2016, 4:26 AM
grimar edited edge metadata.
  • Updated.
grimar abandoned this revision.Dec 2 2016, 5:39 AM

I'll post different solution.