This is an archive of the discontinued LLVM Phabricator instance.

[ELF] Allow mixed SHF_LINK_ORDER & non-SHF_LINK_ORDER sections and sort within InputSectionDescription

Authored by MaskRay on Jul 16 2020, 8:54 PM.



LLD currently does not allow non-contiguous SHF_LINK_ORDER components in an
output section. This makes it infeasible to add SHF_LINK_ORDER to an existing
metadata section if backward compatibility with older object files are

We did not allow mixed components (like GNU ld) and D77007 relaxed to allow
non-contiguous SHF_LINK_ORDER components. This patch allows arbitrary mix, with
sorting performed within an InputSectionDescription. For example,
.rodata : {*( *(}, has two InputSectionDescription's.
If there is at least one SHF_LINK_ORDER and at least one non-SHF_LINK_ORDER in, they are ordered within *( we arbitrarily place
SHF_LINK_ORDER components before non-SHF_LINK_ORDER components (like Solaris ld).

*( is ordered similarly, but the two InputSectionDescription's
don't interact. It can be argued that this is more reasonable than the previous
behavior where written order was not respected.

It would be nice if the two different semantics (ordering requirement & garbage
collection) were not overloaded on one section flag, however, it is probably
difficult to obtain a generic flag at this point
"SHF_LINK_ORDER's original semantics make upgrade difficult").

(Actually, without the GC semantics, SHF_LINK_ORDER would still have the
sh_link!=0 & sh_link=0 issue. It is just that people find the GC semantics more
useful and tend to use the feature more often.)

GNU ld feature request:

Diff Detail

Event Timeline

MaskRay created this revision.Jul 16 2020, 8:54 PM

I'll take a look over the weekend, also at D72904, apologies won't be able to get to it today.

This is definitely an improvement from my point of view, so in general +1 from me.

FWIW, this doesn't help my specific use-case because all my debug sections have the same name, so can't be distinguished in linker scripts, but the ordering needs to be maintained the same as the input ordering, so I end up with a mix of ordered and unordered sections, which with this change would be allowed, but would end up with all the ordered ones first (and thus the underlying assumption would be broken). Example:

.debug_info [common1]
.debug_info [referencing .text.1]
.debug_info [common2]
.debug_info [referencing .text.2]

will end up as:

.debug_info [.text.1]
.debug_info [.text.2]
.debug_info [common1]
.debug_info [common2]

which will not be a valid structure. Anyway, I don't think there's anything that can be done here without disabling the sorting entirely.


You don't use the section header dump here. Can you get rid of -S?


I'm either missing something, or something odd is going on here. Where are the cccccc coming from? There doesn't appear to be a .text section with any contents - it has the _start symbol in, if I follow correctly, but should be empty; nothing specifies alignment requirements either here.


This section doesn't seem to be being used?


It seems to me like you need test cases for sh_link=0 for SHF_LINK_ORDER sections here too?

grimar added inline comments.Jul 17 2020, 4:15 AM

This comment needs updating.


Consider reordering to reduce nesting and to remove clear() calls:

// Link order may be distributed across several InputSectionDescriptions.
// Sorting is performed separately.
for (BaseCommand *base : sec->sectionCommands) {
  auto *isd = dyn_cast<InputSectionDescription>(base);
  if (!isd)

  bool hasLinkOrder = false;
  std::vector<InputSection **> scriptSections;
  std::vector<InputSection *> sections;
  for (InputSection *&isec : isd->sections) {
    if (isec->flags & SHF_LINK_ORDER) {
      InputSection *link = isec->getLinkOrderDep();
      if (link && !link->getParent())
        error(toString(isec) + ": sh_link points to discarded section " +
      hasLinkOrder = true;

  if (!hasLinkOrder || errorCount())

  llvm::stable_sort(sections, compareByFilePosition);
  for (int i = 0, n = sections.size(); i < n; ++i)
    *scriptSections[i] = sections[i];
grimar added inline comments.Jul 17 2020, 4:19 AM

Aside: we can replace constructions like:

for (BaseCommand *base : sec->sectionCommands) {
  auto *isd = dyn_cast<InputSectionDescription>(base);
  if (!isd)

with just

for (auto *isd: filter<InputSectionDescription>(sec->sectionCommands)) {

here and in many other places in LLD if we land D45166 + D45490.

psmith added inline comments.Jul 18 2020, 1:52 PM

I think we need to be careful here. This is not a problem for .ARM.exidx as we handle it as a special case, but if we followed this pattern it could cause problems. For example a typical ld.bfd linker script has:

.ARM.exidx : { *(.arm.exidx) *(.arm.exidx.*) }

The intent is to form a single table with the sorting order considered all at once. The script above would create two InputSectionDescriptions and if sorted independently could be in the wrong order. There may be other similar cases that follow the .name and .name.suffix pattern.

A possible heuristic is that if all the input section descriptions have SHF_LINK_ORDER then consider them all at once. If there is a mix of SHF_LINK_ORDER and non SHF_LINK_ORDER then it is unlikely that the ordering requirements are being used.

pcc added inline comments.Jul 21 2020, 12:35 PM
.ARM.exidx : { *(.arm.exidx) *(.arm.exidx.*) }

I think this unambiguously means to sort the groups separately. If the intent was to sort them together, it should have been written as

.ARM.exidx : { *(.arm.exidx .arm.exidx.*) }

But it doesn't look like we can expect linker scripts to be doing this in the first place though? Looking at binutils I see things in the default linker scripts like

ld/testsuite/ld-arm/arm.ld:  .ARM.exidx : { *(.ARM.exidx*) }

which would lead to the correct behavior. And looking at the history of that file it's been that way since the beginning. Presumably if people are copy/pasting from binutils into their linker scripts then they would have a similar clause to this which would be handled correctly.

A possible heuristic is that if all the input section descriptions have SHF_LINK_ORDER then consider them all at once. If there is a mix of SHF_LINK_ORDER and non SHF_LINK_ORDER then it is unlikely that the ordering requirements are being used.

I don't think we should work around broken linker scripts like this, at least not without more evidence that it is a problem, since it will complicate cases involving symbol assignments. For example, in this case:

.foo {
 start_foo = .;
  stop_foo = .;
  start_bar = .;
  stop_bar = .;

start_foo/stop_foo and start_bar/stop_bar would need to cover the correct parts of .foo even if .foo or .bar have SHF_LINK_ORDER sections.

MaskRay updated this revision to Diff 282803.Aug 3 2020, 11:17 PM
MaskRay marked 2 inline comments as done.



sh_addralign(.text)=4. lld fills the gap with traps.


linkorder-mixed.test tested sh_link=0 for SHF_LINK_ORDER sections

MaskRay updated this revision to Diff 283485.Aug 5 2020, 9:50 PM

Rebase after D72904 was pushed

MaskRay edited the summary of this revision. (Show Details)Aug 11 2020, 11:25 AM
psmith accepted this revision.Aug 14 2020, 5:47 AM

Given that it is possible to write a linker script that does work with *(pattern1 pattern2) then I don't have any objections. I've tentatively approved as it looks like was the most cautious here.

This revision is now accepted and ready to land.Aug 14 2020, 5:47 AM
MaskRay marked 4 inline comments as done.Aug 14 2020, 10:28 AM
MaskRay added inline comments.

This was a leftover from a previous patch where I probably wanted to test an orphan section. It is unrelated to this patch so I will not deal with it. It may find new uses if we take another think about the organization of tests.

MaskRay updated this revision to Diff 285687.Aug 14 2020, 10:29 AM

Address comments

This revision was landed with ongoing or failed builds.Aug 17 2020, 11:29 AM
This revision was automatically updated to reflect the committed changes.

Pushed after offline confirmation with pcc.