This is an archive of the discontinued LLVM Phabricator instance.

[LLD] Reserve memory in implicitly set LMARegions
AcceptedPublic

Authored by kschwarz on Aug 9 2018, 12:28 AM.

Details

Summary

If an output section is loaded into an implicitly set LMARegion, we need to reserve space for it. Otherwise, the following output sections will use invalid region counters in their computation of offsets and assign bogus addresses.
The rules for assigning implicit LMARegions are documented here: https://sourceware.org/binutils/docs-2.20/ld/Output-Section-LMA.html

Newer versions of ld.bfd have more complicated rules (https://sourceware.org/binutils/docs-2.30/ld/Output-Section-LMA.html#Output-Section-LMA)

Diff Detail

Repository
rLLD LLVM Linker

Event Timeline

kschwarz created this revision.Aug 9 2018, 12:28 AM
grimar added a comment.Aug 9 2018, 1:05 AM

This looks good to me. Few comments below.

ELF/LinkerScript.cpp
765

addr -> Addr

It feels that findContainingSection is a wrong name. You are finding and returning the memory region.
I would suggest findRegion or something alike.

823

According to LLD code style, you do not need adding braces for a single line. (Here and below).

test/ELF/linkerscript/Inputs/at10.s
1 ↗(On Diff #159864)

I would reuse at8.s input. It contains the same sections.
We sometimes reusing the inputs from other tests when it is convenient, I believe it is the case here.

test/ELF/linkerscript/Inputs/at9.s
1 ↗(On Diff #159864)

The same.

test/ELF/linkerscript/at10.test
21

I think this comment (and the comment in the test below) has the following "issue":
You are referring to the PhysAddr and LMAOffset, which are the details of the implementation
and perhaps should not be mentioned here as might change in theory.

What I would suggest is to avoid using names of variables in test cases and use just regular words
to describe the problem.

kschwarz updated this revision to Diff 159873.Aug 9 2018, 1:51 AM

Address review comments. Thanks!

ruiu added a comment.Aug 9 2018, 1:50 PM

I believe this code makes sense, but this is honestly a bit complicated (that's not your fault but by the nature of the linker script itself). I'm curious what program you are trying to link with this patch. Or, are you just fixing it because it's there?

I agree it makes the code more complicated. Linker scripts require a lot of corner cases to be handled specially.
This patch right now does not take into account different memory attributes, which would forbid the placement of some output sections into those regions, but would also complicate the handling even more. In this case, we currently expect the user to set a valid LMARegion explicitly.
However, if we want to be compatible with GNU linker scripts, we need to handle these cases.

These features are used heavily in embedded systems packaging several memories with different properties (flash, ram, tightly coupled memories, etc.). In order to program these systems efficiently, the user needs precise control over where the code and date is placed.
An example can be found here: https://github.com/ChibiOS/ChibiOS/blob/trunk/os/common/startup/ARMCMx/compilers/LLVM/ld/STM32F401xE.ld
My patches from last week fix bugs found and worked-around in this script. This patch fixes problems discovered also in existing GNU linker scripts.

It feels to me that we could try to improve the situation a bit if will split the LinkerScript::assignOffsets to move
out the code relative to assigning LMAOffset to some helper.

kschwarz updated this revision to Diff 160098.Aug 10 2018, 6:39 AM

The search for a suitable LMARegion is now extracted into its own function. I updated this logic to the most recent bfd spec, now also taking into account different memory types.

There was also a problem in the previous version of the patch: finding the LMARegion which contains the explicitly set LMA address is not enough, we need to update the current head of the LMARegion to get valid load addresses (done in findAndAdvanceRegion).
If an explicit LMA address is used to move the location counter of a memory region backwards, this might lead to virtual/load address overlaps detected later in checkSections.

grimar added inline comments.Aug 11 2018, 4:09 AM
ELF/LinkerScript.cpp
784

I tried to simplify the conditions in this method and ended up with the following version: D50602.
What do you think?

kschwarz added inline comments.Aug 13 2018, 12:44 AM
ELF/LinkerScript.cpp
784

Your version is also fine for me, and has better reusability. I added minor comments in the separate review.

grimar added inline comments.Aug 13 2018, 1:55 AM
ELF/LinkerScript.cpp
784

Please feel free to use it whole or partially for this patch. Its intention was to show/suggest the slightly different approach.

ikudrin added inline comments.
ELF/LinkerScript.cpp
778

There are neither existing nor new test cases which require this function and the corresponding condition. You may want to add the tests or, better, extract the check into a separate patch.

kschwarz updated this revision to Diff 161712.Aug 21 2018, 7:22 AM

I updated the patch with parts of https://reviews.llvm.org/D50602
Additionally, two new test cases now exercise the added conditions in isLMARegionCompatible (thanks @ikudrin)
We are now issuing a warning if the LMA region has incompatible flags, which can lead to overlapping addresses. Both bfd and gold do not issue the warning, but fail similarly. Having a warning seems more user friendly to me.

This patch now also fixes https://bugs.llvm.org/show_bug.cgi?id=38624

kschwarz marked 4 inline comments as done.Aug 21 2018, 7:26 AM

Looks good. Few style comments below.

ELF/LinkerScript.cpp
770

You do not need to wrap return M in curly brackets here.

789

You do not need curly brackets here.
Also, we do not use else after return.
So, it can be something like:

if (Sec->LMARegion)
  return Sec->LMARegion;

if (Sec->LMAExpr) {
  uint64_t Addr = Sec->LMAExpr().getValue();
  MemoryRegion *M = findRegion(Addr);
  if (M)
    M->CurPos = Addr;
  return M;
}
808

The same.

812

I think this empty line is excessive.

814

Excessive empty line.

816

I think:

  1. LMA perhaps should be uppercase in the error message.
  2. Would be nice to include memory region name to this warning.
kschwarz updated this revision to Diff 161786.Aug 21 2018, 11:51 AM

In the last path one required line got lost, now all tests pass.
Also updated style and wording of warning. Thanks!

kschwarz added inline comments.Aug 22 2018, 12:03 AM
ELF/LinkerScript.cpp
816

After a second thought, I'm not sure if this warning really makes sense.
Initialized data will be copied to RAM at runtime, but has to be loaded to read-only memory. In that case, the flags will obviously not be compatible, and the warning will be confusing.

grimar added inline comments.Aug 22 2018, 1:39 AM
ELF/LinkerScript.cpp
816

I think its ok to remove it. Patch already do much, we can add/polish the additional warnings/error messages in followups.

819

Seems it is needed for vma.test only. What do you think about splitting this change to separate followup patch (with vma.test)?
I would not introduce moveTo in this patch because this seems to be more or less isolated/independent change and
since we already have setDot, I am not sure it is good to introduce one more method that moves the Dot. I would probably
try to inline it, but first of all, I think it can be discussed/reviewed separately.

kschwarz updated this revision to Diff 161905.Aug 22 2018, 2:24 AM

Removed the warning and moveTo function, which will go into its own patch.

grimar accepted this revision.Aug 22 2018, 2:26 AM

LGTM, thanks! Please wait for Rui's opinion too.

This revision is now accepted and ready to land.Aug 22 2018, 2:26 AM

@ruiu, are you okay with this change?

ruiu added inline comments.Aug 27 2018, 12:35 AM
ELF/LinkerScript.cpp
723–724

Remove extraneous parentheses. I.e.

return M && (!M->Flags || (M->Flags & Sec->Flags)) &&
        (M->NegFlags & Sec->Flags) == 0;

But perhaps it is better to deconstruct it a bit by using more lines:

if (!M)
  return false;
if (M->Flags && !(M->Flags & Sec->Flags)
  return false;
return (M->NegFlags & Sec->Flags) == 0;

Now, my question is, can really M be a nullptr?

782

Add a full stop at end of a sentence.

785

Add a blank line before a comment.

786

Ditto.

794

Blank line.

800

Ditto.

830

If you add a comment to else, add {} and write the comment inside {}.

kschwarz added inline comments.Aug 28 2018, 4:57 AM
ELF/LinkerScript.cpp
723–724

Sure, I can simplify it a bit.

Currently this function can be called with M being a nullptr (from findLMARegion). We can add a check for nullptr there and eliminate the check in areFlagsCompatible, which seems more reasonable.

kschwarz updated this revision to Diff 163269.Aug 30 2018, 12:04 AM

Addressed review comments.

@ruiu, does this look good now?

ruiu added inline comments.Sep 19 2018, 2:30 PM
ELF/LinkerScript.cpp
722

What is the meaning of the expressions in the function? I'd write a brief function comment.

774

Needs a brief function comment.

Herald added a project: Restricted Project. · View Herald TranscriptMar 28 2019, 8:51 AM
Herald added a subscriber: jdoerfert. · View Herald Transcript