This is an archive of the discontinued LLVM Phabricator instance.

ELF: Don't merge SHF_LINK_ORDER sections for different output sections in relocatable links.
ClosedPublic

Authored by pcc on Sep 26 2019, 11:36 AM.

Details

Summary

Merging SHF_LINK_ORDER sections can affect semantics if the sh_link
fields point to different sections.

Specifically, for SHF_LINK_ORDER sections, the sh_link field acts as a reverse
dependency from the linked section, causing the SHF_LINK_ORDER section to
be included if the linked section is included. Merging sections with different
sh_link fields will cause the entire contents of the SHF_LINK_ORDER section
to be associated with a single (arbitrarily chosen) output section, whereas the
correct semantics are for the individual pieces of the SHF_LINK_ORDER section
to be associated with their linked output sections. As a result we can end up
incorrectly dropping SHF_LINK_ORDER section contents or including the wrong
section contents, depending on which linked sections were chosen.

Diff Detail

Repository
rL LLVM

Event Timeline

pcc created this revision.Sep 26 2019, 11:36 AM
Herald added a project: Restricted Project. · View Herald Transcript

At a conference right now, so I've not had a chance to test out what happens with the patch. Apologies if I've jumped to a conclusion. Assuming the .text sections that are the targets of .ARM.exidx are merged, I'm not sure that this will be sufficient for .ARM.exidx. If we are also merging the .text sections. We'll end up with:

.text
.ARM.exidx.text.f1 -> .text
.ARM.exidx.text.f2 -> .text

There is no way for .ARM.exidx.text.f1 and .ARM.exidx.text.f2 to be reordered as they both have the same key (.text section address). If we happen to get unlucky, and the order of .ARM.exidx.text.f1 isn't the right one then we produce an incorrectly ordered table.

I think that there are two possible solutions given that there is an expectation that there is one .ARM.exidx section per .text section. Option 1 is to not merge sections with SHF_LINK_ORDER and not merge the sections that they have a sh_link to. This is pessimistic as on a C++ application it would pretty much disable merging on Arm as almost all .text sections have .ARM.exidx. Option 2 is to selectively merge and reorder the .ARM.exidx sections so that there is one .ARM.exidx per .text section.

.text.f1 <- .ARM.exidx.text.f1
.text.f2 <- .ARM.exidx.text.f2

After ld -r

.text <- .ARM.exidx

As long as the .ARM.exidx is ordered to match the order of .text.f1 and .text.f2 in the merged .text then everything will work.

Will try to take a closer look when I get some spare time. Will be back in the office on Monday.

I've got a test case that certainly doesn't work well when doing a -r, and then relinking the corresponding object.

exidx.s

	.section .text.f1,"ax",%progbits
	.globl f1
	.type f1, %function
f1:     .fnstart
	.cantunwind
	bx lr
	.fnend

	.section .text.f2,"ax",%progbits
	.globl f2
	.type f2, %function
f2:     .fnstart
	bx lr
        .save {r7, lr}
        .setfp r7, sp, #0
	.fnend

exidx2.s

	.section .text.f1,"ax",%progbits
	.globl f3
	.type f3, %function
f3:     .fnstart
        .save {r7, lr}
        .setfp r7, sp, #0
	bx lr
	.fnend

	.section .text.f2,"ax",%progbits
	.globl f4
	.type f2, %function
f4:     .fnstart
	.cantunwind
	bx lr
	.fnend

	.text
	.globl __aeabi_unwind_cpp_pr0
__aeabi_unwind_cpp_pr0:	bx lr

exidx.lds

SECTIONS {
  .foo : { *(.text.f2) *(.text.f1) }
}

Commands

clang --target=armv7a-linux-gnu exidx.s exidx2.s -c
ld.lld exidx.o exidx2.o -r -o exidxr.o -T exidx.lds 
ld.lld exidxr.o -o exidx.exe

The creation of the relocatable object produces a single output section with 4 .ARM.exidx sections with a SHF_LINK_ORDER dependency on it. The linker script scrambles the order of the .text.f* functions so that it doesn't match the order of the functions in the output executable section. This will break a linker using SHF_LINK_ORDER to construct the table.

In lld it is worse as there is custom code that is expecting a 1:1 mapping between .ARM.exidx and executable section. Looking at the ARMExidxSyntheticSection::finalizeContents() the code frequently goes from the executable section back to its SHF_LINK_ORDER section (first .ARM.exidx in the dependent sections). This doesn't work well when there are multiple .ARM.exidx sections on the dependent sections list.

Out of interest is this patch motivated by .ARM.exidx or other uses of SHF_LINK_ORDER? If it is the latter, then this may be good enough, and I'll need to fix up .ARM.exidx afterwards.

pcc added a comment.EditedSep 26 2019, 6:06 PM

I think you're right, we can't just prevent merging because this will cause the creation of multiple .ARM.exidx sections (and possibly other SHF_LINK_ORDER sections depending on ordering) and there would then be nothing preventing the sections from being reordered.

I'm uploading a new version of the patch that will merge SHF_LINK_ORDER sections that end up referring to the same output section. It looks like it should work; I just need to test it against the original reproduction scenario that uncovered this bug.

pcc added a comment.EditedSep 26 2019, 6:08 PM

Out of interest is this patch motivated by .ARM.exidx or other uses of SHF_LINK_ORDER? If it is the latter, then this may be good enough, and I'll need to fix up .ARM.exidx afterwards.

The latter (specifically, the hwasan_globals metadata section used by HWASAN global buffer overflow detection).

pcc updated this revision to Diff 222067.Sep 26 2019, 6:09 PM
  • Address review comments
pcc retitled this revision from ELF: Don't merge SHF_LINK_ORDER sections in relocatable links. to ELF: Don't merge SHF_LINK_ORDER sections for different output sections in relocatable links..Sep 26 2019, 6:12 PM

Thanks for the update, will try and take a look tomorrow.

MaskRay added a comment.EditedSep 26 2019, 10:53 PM

For the record, the original approach does not work for the ld.lld exidx.o exidx2.o -r -o exidxr.o -T exidx.lds example at https://reviews.llvm.org/D68094?id=221994#1685161:

[ 1] .foo              PROGBITS        00000000 000034 000010 00  AX  0   0  1
[ 2] .text             PROGBITS        00000000 000044 000004 00  AX  0   0  4
[ 3] .ARM.exidx.text.f1 ARM_EXIDX       00000000 000048 000008 00  AL  1   0  4               
[ 4] .rel.ARM.exidx.text.f1 REL             00000000 000050 000008 08   I 13   3  4                
[ 5] .ARM.exidx.text.f2 ARM_EXIDX       00000000 000058 000008 00  AL  1   0  4
[ 6] .rel.ARM.exidx.text.f2 REL             00000000 000060 000010 08   I 13   5  4
[ 7] .ARM.attributes   ARM_ATTRIBUTES  00000000 000070 00001e 00      0   0  1
[ 8] .ARM.exidx.text.f1 ARM_EXIDX       00000000 000090 000008 00  AL  1   0  4
[ 9] .rel.ARM.exidx.text.f1 REL             00000000 000098 000010 08   I 13   8  4
[10] .ARM.exidx.text.f2 ARM_EXIDX       00000000 0000a8 000008 00  AL  1   0  4
[11] .rel.ARM.exidx.text.f2 REL             00000000 0000b0 000008 08   I 13  10  4

The new approach can cause a segfault. The following 3 sections are consecutive in inputSections:

.text.f1               # `add` returned early because parent(.text.f1) is set (it is specified by a SECTIONS command)
.ARM.exidx.text.f1     # skipped because it has the SHF_LINK_ORDER flag
.rel.ARM.exidx.text.f1 # SIGSEGV when dereferencing `OutputSection *out = sec->getRelocatedSection()->getOutputSection();` because out is null.

One probable fix is:

std::function<void(InputSectionBase *)> add;
add = [&](InputSectionBase *s) {
  if (s->isLive() && s->parent == nullptr) {
    StringRef name = getOutputSectionName(s);

    if (config->orphanHandling == OrphanHandlingPolicy::Error)
      error(toString(s) + " is being placed in '" + name + "'");
    else if (config->orphanHandling == OrphanHandlingPolicy::Warn)
      warn(toString(s) + " is being placed in '" + name + "'");

    if (OutputSection *sec = findByName(sectionCommands, name)) {
      sec->recordSection(s);
    } else {
      if (OutputSection *os = addInputSec(map, s, name))
        v.push_back(os);
      assert(isa<MergeInputSection>(s) ||
             s->getOutputSection()->sectionIndex == UINT32_MAX);
    }
  }

  if (config->relocatable)
    for (InputSectionBase *depSec : s->dependentSections)
      if (depSec->flags & SHF_LINK_ORDER)
        add(depSec);
};

We may need a proper topological sort when the ordering requirement gets more complicated.

Before this patch, or with the fixed patch, we get the same section layout:

[ 1] .foo              PROGBITS        00000000 000034 000010 00  AX  0   0  1
[ 2] .text             PROGBITS        00000000 000044 000004 00  AX  0   0  4
[ 3] .ARM.exidx.text.f1 ARM_EXIDX       00000000 000048 000010 00  AL  1   0  4
[ 4] .rel.ARM.exidx.text.f1 REL             00000000 000058 000018 08   I  9   3  4
[ 5] .ARM.exidx.text.f2 ARM_EXIDX       00000000 000070 000010 00  AL  1   0  4
[ 6] .rel.ARM.exidx.text.f2 REL             00000000 000080 000018 08   I  9   5  4

It may still be less than idea because we would expect .ARM.exidx.text.f2 to be placed before .ARM.exidx.text.f1. It is an existing issue though.


For the new test relocatable-linkorder.s, ld.bfd can combine the two foo sections (the behavior before this patch):

[ 1] .text             PROGBITS        0000000000000000 000040 000000 00  AX  0   0  4
[ 2] .text.f1          PROGBITS        0000000000000000 000040 000001 00  AX  0   0  1
[ 3] .text.f2          PROGBITS        0000000000000000 000041 000001 00  AX  0   0  1
[ 4] foo               PROGBITS        0000000000000000 000042 000010 00  AL  2   0  1

It may be worth sending a feature request. I have sent 3 requests based on LLD's treatment of SHF_LINK_ORDER.

Merging SHF_LINK_ORDER sections can affect semantics if the sh_link

fields point to different sections.

The summary can be a bit more detailed about how this affects semantics.

pcc updated this revision to Diff 222232.Sep 27 2019, 1:02 PM
  • Address review comments
pcc added a comment.Sep 27 2019, 1:03 PM

For the record, the original approach does not work for the ld.lld exidx.o exidx2.o -r -o exidxr.o -T exidx.lds example at https://reviews.llvm.org/D68094?id=221994#1685161:

[ 1] .foo              PROGBITS        00000000 000034 000010 00  AX  0   0  1
[ 2] .text             PROGBITS        00000000 000044 000004 00  AX  0   0  4
[ 3] .ARM.exidx.text.f1 ARM_EXIDX       00000000 000048 000008 00  AL  1   0  4               
[ 4] .rel.ARM.exidx.text.f1 REL             00000000 000050 000008 08   I 13   3  4                
[ 5] .ARM.exidx.text.f2 ARM_EXIDX       00000000 000058 000008 00  AL  1   0  4
[ 6] .rel.ARM.exidx.text.f2 REL             00000000 000060 000010 08   I 13   5  4
[ 7] .ARM.attributes   ARM_ATTRIBUTES  00000000 000070 00001e 00      0   0  1
[ 8] .ARM.exidx.text.f1 ARM_EXIDX       00000000 000090 000008 00  AL  1   0  4
[ 9] .rel.ARM.exidx.text.f1 REL             00000000 000098 000010 08   I 13   8  4
[10] .ARM.exidx.text.f2 ARM_EXIDX       00000000 0000a8 000008 00  AL  1   0  4
[11] .rel.ARM.exidx.text.f2 REL             00000000 0000b0 000008 08   I 13  10  4

The new approach can cause a segfault.

Yes, or the SHF_LINK_ORDER section can be silently discarded. Thanks for catching that, I've adopted your suggested fix and extended my test to cover it.

Before this patch, or with the fixed patch, we get the same section layout:

[ 1] .foo              PROGBITS        00000000 000034 000010 00  AX  0   0  1
[ 2] .text             PROGBITS        00000000 000044 000004 00  AX  0   0  4
[ 3] .ARM.exidx.text.f1 ARM_EXIDX       00000000 000048 000010 00  AL  1   0  4
[ 4] .rel.ARM.exidx.text.f1 REL             00000000 000058 000018 08   I  9   3  4
[ 5] .ARM.exidx.text.f2 ARM_EXIDX       00000000 000070 000010 00  AL  1   0  4
[ 6] .rel.ARM.exidx.text.f2 REL             00000000 000080 000018 08   I  9   5  4

It may still be less than idea because we would expect .ARM.exidx.text.f2 to be placed before .ARM.exidx.text.f1. It is an existing issue though.

Yes, this is a tricky case because we would need to know the section merging rules used in the final link in order to handle it correctly. Peter's option 1 from his first comment would handle this case but I'm not sure whether it makes sense to do that in the presence of linker scripts.

For the new test relocatable-linkorder.s, ld.bfd can combine the two foo sections (the behavior before this patch):

[ 1] .text             PROGBITS        0000000000000000 000040 000000 00  AX  0   0  4
[ 2] .text.f1          PROGBITS        0000000000000000 000040 000001 00  AX  0   0  1
[ 3] .text.f2          PROGBITS        0000000000000000 000041 000001 00  AX  0   0  1
[ 4] foo               PROGBITS        0000000000000000 000042 000010 00  AL  2   0  1

It may be worth sending a feature request. I have sent 3 requests based on LLD's treatment of SHF_LINK_ORDER.

I can see if I can find some time to file a feature request.

Merging SHF_LINK_ORDER sections can affect semantics if the sh_link

fields point to different sections.

The summary can be a bit more detailed about how this affects semantics.

Okay, I'll add more detail to the commit message.

pcc edited the summary of this revision. (Show Details)Sep 27 2019, 1:22 PM

Apologies, not going to get a chance to look in detail today, will get back to the office on Monday. For an OutputSection containing InputSections with a SHF_LINK_ORDER dependency it should be possible to sort these sections based on the outSecOff of the InputSections within the OutputSection that there is a link order dependency on. This relative order would be stable under the final SHF_LINK_ORDER search in the final link. I agree that this is most likely only needed when a linker script is used as the default layout from the compiler will usually produce the right order.

For Arm, I think that ld -r with a linker script is usually used in the context of a linux kernel module which would discard exception tables.

MaskRay accepted this revision.Sep 27 2019, 6:31 PM

LGTM, but let's wait for @peter.smith to take a look in details.

For Arm, I think that ld -r with a linker script is usually used in the context of a linux kernel module which would discard exception tables.

I agree the -r + linker scripts combination is rare. @grimar also stated "It is perhaps uncommon to use a script with -r, but we faced it in the Linux kernel" in D53864. If we either discard SHF_LINK_ORDER sections or keep their order, we don't need to do more change. Though, I suggest enhancing the test to be prepared for that.

lld/test/ELF/relocatable-linkorder.s
12 ↗(On Diff #222232)

An optional suggestion. Recent patches use different comment markers for comments, e.g. /// for comments when // is used for RUN and CHECK lines, or
## for comments when # is used for RUN and CHECK lines.

27 ↗(On Diff #222232)

The llvm-readelf -S RUN line can be enhanced a bit by changing to llvm-readelf -x foo %t to verify that 1 and 2 are in order.

This revision is now accepted and ready to land.Sep 27 2019, 6:31 PM
peter.smith accepted this revision.Sep 30 2019, 1:43 AM

LGTM too. I think that this will work for the same use cases as ld.bfd, and is a definite improvement over what we had before.

This revision was automatically updated to reflect the committed changes.
pcc marked 2 inline comments as done.