This is an archive of the discontinued LLVM Phabricator instance.

[ELF] Start a new PT_LOAD if LMA region is different
ClosedPublic

Authored by MaskRay on Feb 9 2020, 9:07 AM.

Details

Summary

GNU ld has a counterintuitive lang_propagate_lma_regions rule.

// .foo's LMA region is propagated to .bar because their VMA region is the same,
// and .bar does not have an explicit output section address (addr_tree).
.foo : { *(.foo) } >RAM AT> FLASH
.bar : { *(.bar) } >RAM

// An explicit output section address disables propagation.
.foo : { *(.foo) } >RAM AT> FLASH
.bar . : { *(.bar) } >RAM

In both cases, lld thinks .foo's LMA region is propagated and
places .bar in the same PT_LOAD, so lld diverges from GNU ld w.r.t. the
second case (lma-align.test).

This patch changes Writer<ELFT>::createPhdrs to disable propagation
(start a new PT_LOAD). A user of the first case can make linker scripts
portable by explicitly specifying AT>. By contrast, there was no
workaround for the old behavior.

This change uncovers another LMA related bug in assignOffsets() where
ctx->lmaOffset = 0; was omitted. It caused a spurious "load address
range overlaps" error for at2.test

The new PT_LOAD rule is complex. For convenience, I listed the origins of some subexpressions:

  • rL323449: sec->memRegion == load->firstSec->memRegion; linkerscript/at3.test
  • D43284: load->lastSec == Out::programHeaders (don't start a new PT_LOAD after program headers); linkerscript/at4.test
  • D58892: sec != relroEnd (start a new PT_LOAD after PT_GNU_RELRO)

Diff Detail

Event Timeline

MaskRay created this revision.Feb 9 2020, 9:07 AM
psmith added a subscriber: psmith.Feb 10 2020, 3:26 AM

Change looks good to me. Some small nits.

lld/ELF/Writer.cpp
2118

I 'm struggling to read that boolean expression. Would it be possible to move some of the bits around so that the && is at the end or maybe calculate the more complex ones and name them?

A reorder (untested)

if (!load ||
    sec->memRegion != load->firstSec->memRegion ||
    flags != newFlags ||
    sec == relroEnd ||
    (load->lastSec != Out::programHeaders &&
     (sec->lmaExpr ||
      !sec->lmaRegion != !lmaRegion ||
      (sec->lmaRegion && sec->lmaRegion != lmaRegion))
     )
    );

Is there any way to rewrite this in the positive form, we have quite a lot of negatives.
!sec->lmaRegion != !lmaRegion
For example set->lmaRegion == nullptr != lmaRegion == nullptr writing it in that form made it easier to read.

MaskRay updated this revision to Diff 243602.Feb 10 2020, 9:37 AM

negative forms -> positive forms

MaskRay updated this revision to Diff 243604.Feb 10 2020, 9:47 AM
MaskRay edited the summary of this revision. (Show Details)
MaskRay removed a subscriber: psmith.

List origins of some subexpressions

psmith accepted this revision.Feb 11 2020, 7:01 AM

Thanks for the update, looks good to me.
Some suggestions for the expression, not sure if they are better or not, so by all means keep the original if you prefer.

Suggestion 1, rename propagateLMA to sameLMARegion, my brain tended to lump the program header case and the same memory region under propagateLMA

bool sameLMARegion =
    load && !sec->lmaExpr && sec->lmaRegion == load->firstSec->lmaRegion;
if (!(load && newFlags == flags && sec != relroEnd &&
      sec->memRegion == load->firstSec->memRegion &&
      (sameLMARegion || load->lastSec == Out::programHeaders)))

Suggestion 2, add || load->lastSec == Out::programHeaders into propagateLMA

bool propagateLMA =
    (load && !sec->lmaExpr && sec->lmaRegion == load->firstSec->lmaRegion) || load->lastSec == Out::programHeaders;
if (!(load && newFlags == flags && sec != relroEnd &&
      sec->memRegion == load->firstSec->memRegion &&
      propagateLMA))
This revision is now accepted and ready to land.Feb 11 2020, 7:01 AM
MaskRay updated this revision to Diff 243988.EditedFeb 11 2020, 1:42 PM

Adopt peter.smith's Suggestion 1

Thanks for the update, looks good to me.
Some suggestions for the expression, not sure if they are better or not, so by all means keep the original if you prefer.

Suggestion 1, rename propagateLMA to sameLMARegion, my brain tended to lump the program header case and the same memory region under propagateLMA

bool sameLMARegion =
    load && !sec->lmaExpr && sec->lmaRegion == load->firstSec->lmaRegion;
if (!(load && newFlags == flags && sec != relroEnd &&
      sec->memRegion == load->firstSec->memRegion &&
      (sameLMARegion || load->lastSec == Out::programHeaders)))

Suggestion 2, add || load->lastSec == Out::programHeaders into propagateLMA

bool propagateLMA =
    (load && !sec->lmaExpr && sec->lmaRegion == load->firstSec->lmaRegion) || load->lastSec == Out::programHeaders;
if (!(load && newFlags == flags && sec != relroEnd &&
      sec->memRegion == load->firstSec->memRegion &&
      propagateLMA))

Thanks for the suggestions.

Adopted Suggestion 1.

For Suggestion 2: load->lastSec == Out::programHeaders is a special case from D43284. It is not the same LMA region, so I will not place the condition into sameLMARegion.

This revision was automatically updated to reflect the committed changes.