This is an archive of the discontinued LLVM Phabricator instance.

[ELF] Propagate LMA offset to sections with neither AT() nor AT>
Closed, Public

Authored by MaskRay on Mar 28 2020, 1:35 PM.

Details

Summary

Fixes https://bugs.llvm.org/show_bug.cgi?id=45313
Also fixes linkerscript/{at4.s,overlay.test} LMA address issues exposed by
011b785505b1f6d315a93fd0a0409576ad8d1805.
Related: D74297

This patch improves emulation of GNU ld's heuristics on the difference
between the LMA and the VMA:
https://sourceware.org/binutils/docs/ld/Output-Section-LMA.html#Output-Section-LMA

New test linkerscript/lma-offset.s (based on at4.s) demonstrates some behaviors.

Event Timeline

MaskRay created this revision. (Mar 28 2020, 1:35 PM)
MaskRay updated this revision to Diff 253377. (Mar 28 2020, 1:48 PM)

Improve docs/ELF/linker_script.rst

This looks fine to me. I'd suggest to wait for Peter's opinion regarding this.

I think there may be one case from ld that we aren't handling, and I've made a few suggestions for the documentation, otherwise looks like a good step in the right direction.

Apologies in advance for being slow to respond to reviews this week.

lld/ELF/LinkerScript.cpp
866–867

At a glance, it looks like we might not handle this particular case from the GNU ld documentation. Is this intentional?

  • If the section has a specific VMA address, then this is used as the LMA address as well.

I think this is the case where there is an addrExpr but no lmaExpr; there we should set ctx->lmaOffset = 0;
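
To make the suggestion concrete, here is a minimal sketch of how an LMA offset could be carried from one output section to the next, with the proposed reset behind a flag. It is a toy model, not lld's code: `Sec`, `State`, `assignLma`, `lmaOfLast` and `resetOnAddr` are hypothetical names, and only `addrExpr`, `lmaExpr` and `lmaOffset` echo identifiers mentioned in this comment. Alignment and memory regions are ignored.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for an output section description (not lld's class).
struct Sec {
  std::optional<uint64_t> addrExpr; // explicit VMA, e.g. `.c 0x3000 : { ... }`
  std::optional<uint64_t> lmaExpr;  // explicit LMA from AT(lma)
  uint64_t size;
};

// Hypothetical stand-in for the address-assignment state.
struct State {
  uint64_t dot = 0;       // current VMA
  uint64_t lmaOffset = 0; // LMA minus VMA, carried between sections
};

// Assigns an LMA to `s` and advances `dot` past the section.
// `resetOnAddr` enables the rule proposed above: an explicit VMA
// without AT(lma) makes the LMA equal to the VMA.
uint64_t assignLma(State &st, const Sec &s, bool resetOnAddr) {
  if (s.addrExpr)
    st.dot = *s.addrExpr;
  if (s.lmaExpr)
    st.lmaOffset = *s.lmaExpr - st.dot; // new offset from AT(lma)
  else if (s.addrExpr && resetOnAddr)
    st.lmaOffset = 0;                   // proposed reset
  // Otherwise the previous offset is propagated unchanged.
  uint64_t lma = st.dot + st.lmaOffset;
  st.dot += s.size;
  return lma;
}

// Lays out `secs` in order starting at VMA `start`; returns the last LMA.
uint64_t lmaOfLast(const std::vector<Sec> &secs, uint64_t start,
                   bool resetOnAddr) {
  State st;
  st.dot = start;
  uint64_t lma = 0;
  for (const Sec &s : secs)
    lma = assignLma(st, s, resetOnAddr);
  return lma;
}
```

With made-up sections `{no address, size 0x10}`, `{AT(0x8000), size 0x10}` and `{address 0x3000, size 0x10}` laid out from 0x1000, the last section gets LMA 0x9ff0 when the offset is propagated and 0x3000 when the reset is enabled.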

lld/docs/ELF/linker_script.rst
64
The lack of ``AT>lma_region``
  means the default region is used. Note, GNU ld propagates the previous LMA
  memory region when ``address`` is not specified.

My understanding of the sentence is that when there is no AT>lma_region, LLD does not behave like GNU ld: LLD always uses the default region, whereas GNU ld uses the previous OutputSection's lma_region when address is not specified. If I'm right, is this an intentional change or a current limitation? I think we should be explicit, whichever way it is. For example:

If the OutputSection has no ``AT>lma_region`` then LLD will always use the default memory region. Note that this is in contrast with GNU ld which will propagate the lma_region of the previous OutputSection when ``address`` is not specified. For a script that is compatible with
GNU ld and LLD you must set ``AT>lma_region`` explicitly and not rely on propagation.
67
In the absence of a PHDRS command, if the LMA region is different from the previous one, this section will start a new PT_LOAD segment.

Suggest an alternative:

If the linker script does not have a PHDRS command then if the ``lma_region`` for an OutputSection is different to the ``lma_region`` for the previous OutputSection a new loadable segment will be generated.
70

Suggest a new line or paragraph before "If neither ``AT(lma)`` nor ``AT>lma_region`` is specified".

MaskRay updated this revision to Diff 253630. (Mar 30 2020, 10:14 AM)
MaskRay marked 4 inline comments as done.

Incorporate documentation suggestions

lld/ELF/LinkerScript.cpp
866–867

See below. This case and "whereas GNU ld uses the previous OutputSection's lma_region when address is not specified" both need more thought.

lld/docs/ELF/linker_script.rst
64

whereas GNU ld uses the previous OutputSections lma_region when address is not specified.

Yes. See the summary of D74297. That patch was made when I did not have a good understanding of LMA (I still don't, but things are starting to become clearer). We can revisit the decision and fix this discrepancy as well if it is not difficult to implement.

psmith accepted this revision. (Apr 1 2020, 1:51 AM)

LGTM thanks for the update. I think it would be good to fix the discrepancies between LLD and BFD in follow up patches if it is reasonable to do so.

This revision is now accepted and ready to land. (Apr 1 2020, 1:51 AM)
MaskRay updated this revision to Diff 254210. (Apr 1 2020, 8:16 AM)

Rename lma-difference.s to lma-offset.s

MaskRay edited the summary of this revision. (Apr 1 2020, 8:17 AM)
This revision was automatically updated to reflect the committed changes.
MaskRay added a comment. (Edited, Apr 1 2020, 11:12 AM)

LGTM thanks for the update. I think it would be good to fix the discrepancies between LLD and BFD in follow up patches if it is reasonable to do so.

It looks like our current behavior is close enough to GNU ld. In fact, GNU ld's actual behavior differs from its documentation:

% cat a.s
SECTIONS {
  . = 0x1000;
  .a : { *(.a) }
  .b : AT(0x2005) { *(.b) }
/* "If the section has a specific VMA address, then this is used as the LMA address as well." is not obeyed for .c */
  .c 0x3006 : { *(.c) }
}
Program Headers:
  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  LOAD           0x001000 0x0000000000001000 0x0000000000001000 0x000002 0x000002 R E 0x1000
  LOAD           0x001002 0x0000000000001002 0x0000000000002005 0x000001 0x000001 R   0x1000
  LOAD           0x001006 0x0000000000003006 0x0000000000004009 0x000002 0x000002 R   0x1000
  LOAD           0x001008 0x0000000000003008 0x0000000000004010 0x000001 0x000001 RW  0x1000

 Section to Segment mapping:
  Segment Sections...
   00     .text .a 
   01     .b 
   02     .c .d 
   03     .data

Asked on https://sourceware.org/pipermail/binutils/2020-April/110503.html
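
The PhysAddr of .c in the output above is consistent with GNU ld carrying the LMA-minus-VMA difference of .b forward instead of resetting it. That reading is an inference from the numbers, not something confirmed in this thread. A small compile-time check of the arithmetic, with every constant copied from the script and the program headers above (the variable names are made up for illustration):

```cpp
#include <cstdint>

// Values copied from the linker script and the program headers above.
constexpr uint64_t bVma = 0x1002;         // VirtAddr of the LOAD holding .b
constexpr uint64_t bLma = 0x2005;         // .b : AT(0x2005)
constexpr uint64_t cVma = 0x3006;         // .c 0x3006 : { ... }
constexpr uint64_t cLmaObserved = 0x4009; // PhysAddr of the LOAD holding .c

// Difference between LMA and VMA established by AT(0x2005).
constexpr uint64_t lmaOffset = bLma - bVma; // 0x1003

// If the documented rule applied, the LMA of .c would equal cVma (0x3006).
// The observed value matches the propagated offset instead.
static_assert(cVma + lmaOffset == cLmaObserved,
              ".c keeps the LMA offset established by .b");
```

The check only covers the .b and .c rows; the last LOAD (.data) is not modelled here.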
