This is an archive of the discontinued LLVM Phabricator instance.

[ELF] Support INSERT [AFTER|BEFORE] for orphan sections

Authored by MaskRay on Feb 10 2020, 6:45 PM.



D43468+D44380 added INSERT [AFTER|BEFORE] for non-orphan sections. This patch
makes INSERT work for orphan sections as well.

SECTIONS {...} INSERT [AFTER|BEFORE] .foo does not set hasSectionCommands, so the result
will be similar to a regular link without a linker script. The differences when hasSectionCommands is set include:

  • image base is different
  • -z noseparate-code/-z noseparate-loadable-segments are unavailable
  • some special symbols such as _end _etext _edata are not defined

The behavior is similar to GNU ld:
INSERT is not considered an external linker script.

This feature makes the section layout more flexible. It can be used to:

  • Place .nv_fatbin before other readonly SHT_PROGBITS sections to mitigate relocation overflows.
  • Disturb the layout to expose address sensitive application bugs.

Diff Detail

Event Timeline

MaskRay created this revision.Feb 10 2020, 6:45 PM
MaskRay edited the summary of this revision. (Show Details)Feb 10 2020, 6:46 PM
MaskRay marked an inline comment as done.Feb 10 2020, 6:48 PM
MaskRay added inline comments.

In the future, we should stop enforcing --no-rosegment for hasSectionsCommand so that features can be more orthogonal. (I've mentioned this many times to Android/FreeBSD folks...)

grimar added inline comments.Feb 11 2020, 1:03 AM

Seems the comment needs an update to mention "BEFORE".


Don't use auto then?


I am not sure this is a clear comment. I had to read the code to understand what you do.
Perhaps it should say something like "we do not handle an INSERT command for an output section that ...."


The using of curly bracers violates the LLD style a bit. Across it's code we usually added an additional
brackets for each case where a branch contained more than a single line. I.e.:

for (auto it = sectionCommands.begin(), e = sectionCommands.end(); it != e;
     ++it) {
  if (auto *to = dyn_cast<OutputSection>(*it)) {
    if (to->name == where) {
      if (isAfter)
      sectionCommands.insert(it, os);
      found = true;

To clarify: here you have bracers in a internal branch, hence parent branches expected to use them too.
Actually I probably like this rule, but I observed many times that in LLVM people not always follow it.
And seems there is nothing like that in LLVM coding standart document. So seems it was something internal to LLD,
but perhaps it is better to follow for consistency with the rest od the code?


What do you think about adding a named struct instead of using the tuple (see PhdrsCommand/AddressState above).




Please clarify/expand what is "without making more layout changes". I think usually the word "layout" is commonly used
to describe the layout of sections. But here the order of them is the same, the difference is in segments created, though the first
comment says nothing about that, it is a bit hard to understand what is actually the difference I think.

Perhaps you could also combine the testing of section names to emphasise the difference (i.e. use something like --check-prefixes=COMMON,PART1 --check-prefixes=COMMON,PART2). Up to you though.


Lets add a general description of the test case?


Could you expand this comment and below ones please.

Thanks for implementing this so quickly! Looks good to my untrained eyes :)


I find this a bit difficult to read because of the nested statements. Maybe we can flatten it using llvm::find?

auto insertPos = llvm::find_if(sectionCommands, [where](BaseCommand *command) {
  auto *to = dyn_cast<OutputSection>(*it);
  return to != nullptr && to->name == where;

if (insertPos == sectionCommands.end())
  error("unable to insert " + os->name +
        Twine(isAfter ? " after " : " before ") + where);

if (isAfter)

sectionCommands.insert(insertPos, os);

Up to you. Feel free to leave as is.


[nit]: Move declaration to initialization?

StringRef where = next();
psmith added a subscriber: psmith.Feb 11 2020, 5:22 AM
psmith added inline comments.

FWIW I'd tend towards using a named struct here. We have to look down to the bit of code to find out what the fields are.

OutputSection *os;
bool isAfter;
StringRef where;
std::tie(os, isAfter, where) = insert;

Perhaps the intention can be added to the release notes for 10.0. For example in LLD 11.0 we intend to make the following changes:

  • a linker script no longer implies --no-rosegment

This might be some way towards informing users of interface changes so that they don't get surprised. Given that few read the release notes, we might also consider a warning, maybe an optional one if someone uses a feature that will have its default changed.

MaskRay updated this revision to Diff 243900.Feb 11 2020, 9:11 AM
MaskRay marked 12 inline comments as done.

Address comments

MaskRay edited the summary of this revision. (Show Details)Feb 11 2020, 9:11 AM
MaskRay marked 3 inline comments as done.Feb 11 2020, 9:20 AM
MaskRay added inline comments.

Thanks. I agree find_if makes the code easier to read.


I also prefer fewer braces. (Excess braces harm readability to me...). In Writer.cpp, we have such code:

// Writer.cpp
    for (Symbol *sym : symtab->symbols())
      if (sym->isUndefined() && !sym->isWeak())
        if (auto *f = dyn_cast_or_null<SharedFile>(sym->file))
          if (f->allNeededIsKnown)
            error(toString(f) + ": undefined reference to " + toString(*sym));

I know that in some places, we added braces just to make it harder to make misleading indentation errors. However, -Wmisleading-indentation (GCC 6, clang 10) makes such protection more or less unnecessary...


Address/offset assignments are different with a SECTIONS command. They can be seen from the CHECK/CHECK2 lines.

I added a sentence Address/offset assignments are different with a main linker script. to emphasize this fact.

Thanks! LGTM

MaskRay edited the summary of this revision. (Show Details)Feb 11 2020, 9:30 AM

Thanks for the update, I'm fine with this as well.

grimar accepted this revision.Feb 11 2020, 11:33 PM

LGTM too.

This revision is now accepted and ready to land.Feb 11 2020, 11:33 PM
This revision was automatically updated to reflect the committed changes.
MaskRay marked an inline comment as done.
MaskRay marked an inline comment as done.Feb 12 2020, 8:25 AM
MaskRay added subscribers: nickdesaulniers, srhines.
MaskRay added inline comments.

@srhines @nickdesaulniers I want to delete config->singleRoRx = true; in the future to be more consistant with other parts of lld. (It also matches ld.bfd -z separate-code)

Can you please test whether any project needs adaptation with this change?


Sorry, what is the ask? I don't see separate-code being used in the Linux kernel. Or are you asking about all of Android (userspace)?

MaskRay marked an inline comment as done.Feb 12 2020, 10:26 AM
MaskRay added inline comments.

I'm asking about all of Android userspace that may use the linker script SECTIONS command.

Currently, due to config->singleRoRx = true; (like enforced --no-rosegment), there is no R PT_LOAD segment with a SECTIONS command.

After deleting config->singleRoRx = true;, an executable with a SECTIONS command will get one R PT_LOAD segment and one RX PT_LOAD segment, like a regular executable without a linker script.

I don't expect it will affect Linux kernel. We can easily add --no-rosegment if it does.

srhines added inline comments.Feb 14 2020, 7:55 AM

We'll likely be able to check this out with our next update. If we hit problems, we can follow up. Thanks for letting me know.