This is an archive of the discontinued LLVM Phabricator instance.

[ELF] Don't combine SHF_LINK_ORDER sections linking different output sections
AbandonedPublic

Authored by MaskRay on Mar 18 2020, 10:53 PM.

Details

Summary

This generalizes D68094 (relocatable link) to non-relocatable links.

sh_link(foo[1]) = .text.f1
sh_link(foo[2]) = .text.f2

If .text.f1 and .text.f2 and placed in different output sections, we
currently produce one output foo with the sh_link field arbitrarily set
to the output section index of first input foo.

This patch combines SHF_LINK_ORDER sections linking different output
sections. The idea is similar to D66717 (do not ICF two sections with
different output sections).

If an output section description combines generic SHF_LINK_ORDER
sections, we have to respect the user requirements. Due to the "by name"
nature of section matching, SHF_LINK_ORDER sections linking different
output sections may be combined.

The use of start_/stop_ symbols has an expectation that the pair of
symbols delimiter the section contents (D65770 __start_hwasan_globals is
a use case). Don't create multiple output sections.

Note, .ARM.exidx (singleton synthetic section) is not affected by this change.
D76409 improved the test.

Diff Detail

Event Timeline

MaskRay created this revision.Mar 18 2020, 10:53 PM

I'm happy with this idea in principle, but does it work for users who have .stack_sizes grouped in linker scripts explicitly? (I'm betting not, and yes, such linker scripts do exist).

grimar added inline comments.Mar 19 2020, 2:10 AM
lld/test/ELF/linkorder-output-section.s
43

Perhaps:

# DIFFERENT-COUNT-2: foo
# DIFFERENT-NOT:     foo
lld/test/ELF/partition-move-to-main-startstop.s
17

I think this kind of breaks the logic of this test?

See the output before this patch:

  [ 4] .dynstr           STRTAB           00000000002002b0  000002b0
       0000000000000039  0000000000000000   A       0     0     1 
  [ 5] has_startstop     PROGBITS         00000000002002e9  000002e9
       0000000000000010  0000000000000000  AL       8     0     1
  [ 6] no_startstop      PROGBITS         00000000002002f9  000002f9
       0000000000000008  0000000000000000  AL       8     0     1
...
  [10] part1             LOOS+0xfff4c06   0000000000203000  00001000
       0000000000000040  0000000000000000   A       0     0     1

Symbol table '.symtab' contains 6 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
...
     2: 00000000002002e9     0 NOTYPE  GLOBAL PROTECTED    5 __start_has_startstop
     3: 00000000002002f9     0 NOTYPE  GLOBAL PROTECTED    5 __stop_has_startstop

And after:

 [ 5] has_startstop     PROGBITS         00000000002002e9  000002e9
       0000000000000008  0000000000000000  AL       9     0     1
  [ 6] has_startstop     PROGBITS         00000000002002f1  000002f1
       0000000000000008  0000000000000000  AL      18     0     1
  [ 7] no_startstop      PROGBITS         00000000002002f9  000002f9
       0000000000000008  0000000000000000  AL       9     0     1
  [ 8] .rodata           PROGBITS         0000000000200304  00000304
       000000000000000c  0000000000000000   A       0     0     4
...
  [11] part1             LOOS+0xfff4c06   0000000000203000  00001000
       0000000000000040  0000000000000000   A       0     0     1

Symbol table '.symtab' contains 6 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
...
     2: 00000000002002e9     0 NOTYPE  GLOBAL PROTECTED    5 __start_has_startstop
     3: 00000000002002f1     0 NOTYPE  GLOBAL PROTECTED    5 __stop_has_startstop

__stop_has_startstop points to the begining of the second has_startstop section now.

Given the somewhat loose definition in ELF

This flag adds special ordering requirements for link editors. The requirements apply if the sh_link field of this section's header references another section (the linked-to section). If this section is combined with other sections in the output file, it must appear in the same relative order with respect to those sections, as the linked-to section appears with respect to sections the linked-to section is combined with.
NOTE: A typical use of this flag is to build a table that references text or data sections in address order.

In the text below, a generic SHF_LINK_ORDER section is just a section we haven't identified as needing any special requirements and only demands what it is written in the ELF spec. A non-generic SHF_LINK_ORDER section like .ARM.exidx has additional requirements on top of generic ELF, for example from the ARM ABI documentation.

There are no requirements for a generic SHF_LINK_ORDER section to be combined together in one section or split apart depending on whether the linked-to sections are in different OutputSections. So not merging SHF_LINK_ORDER sections is as good as merging them. Other linkers may choose to handle them differently. So in principle no objections, however I'm not sure we should consider the .stack_size sections as generic SHF_LINK_ORDER sections as from the PR they seem to rely on having a single .stack_size OutputSection per sh_link to OutputSection they describe, this is an opposite requirement for .ARM.exidx, which requires a single contiguous table (normally implemented as a single OutputSection), but as it is processed at run-time there are requirements on the section level view. Can we make an arbitrary decision on all future SHF_LINK_ORDER sections for the benefit of .stack_size?

For the linker script case for generic SHF_LINK_ORDER sections I think we have to respect what is written in the linker script on the grounds that the user knows their own requirements and wrote the linker script accordingly. I think that this might make writing a linker script using .stack_size sections and multiple output sections pretty much impractical for some cases. Perhaps another reason to treat .stack_size sections separately like .ARM.exidx, although .ARM.exidx is pretty easy to write a linker script for.

So in principle no objection, but I recommend handling .stack_sizes as a special case, especially for linker scripts if they have special requirements above what is written in ELF.

I've created D76425 to remove the check in llvm-readobj for fully linked output. This would work as an alternative to this linker change, to fix the PR (although there is one minor caveat - see the bug comment/that review), but I don't think this linker change hurts still. The llvm-readobj patch solves the issue for the case where stack_sizes sections have all been concatenated together via linker script.

MaskRay updated this revision to Diff 251400.Mar 19 2020, 8:55 AM
MaskRay marked 2 inline comments as done.
MaskRay edited the summary of this revision. (Show Details)

Delete mention of .stack_sizes
Improve tests

lld/test/ELF/partition-move-to-main-startstop.s
17

This demonstrates that __start_/__stop_ symbols should not be used with SHF_LINK_ORDER sections.
Added a comment.

grimar added a comment.EditedMar 20 2020, 1:23 AM

The logic looks reasonable to me. 2 more comments about comments are inlined.
(I think it worth to hold this for a day or too to see if there will be any other comments).

lld/test/ELF/partition-move-to-main-startstop.s
6–8

This should be rewritten too. "there is only one section" is not true, and
"We can't let the has_startstop section be split by partition" sounds like we should not split has_startstop at all.

11

And this one looks like it is fine to split "no_startstop", but it assumes it is not true for "has_startstop"

I don't have any more comments. I think the change is reasonable for orphan sections, happy for George to approve if he is happy his last comments are addressed.

grimar accepted this revision.EditedMar 20 2020, 8:11 AM

LGTM, just please update comments accordingly.

This revision is now accepted and ready to land.Mar 20 2020, 8:11 AM
MaskRay updated this revision to Diff 251802.Mar 20 2020, 4:48 PM

Fix a comment

MaskRay marked 2 inline comments as done.Mar 20 2020, 4:48 PM
pcc requested changes to this revision.Mar 20 2020, 5:06 PM
pcc added inline comments.
lld/test/ELF/partition-move-to-main-startstop.s
17

__start_/__stop_ are absolutely being used with SHF_LINK_ORDER sections right now. See e.g. hwasan_globals which may be linked to .data/.data.rel.ro/.rodata/.bss. We need a solution for such use cases before making this change, if it is to be made at all.

This revision now requires changes to proceed.Mar 20 2020, 5:06 PM
MaskRay updated this revision to Diff 251849.Mar 21 2020, 10:37 AM
MaskRay edited the summary of this revision. (Show Details)

Make start_/stop_ work

pcc added a comment.Mar 23 2020, 1:10 PM

Having given this some more thought, I'm still not in favour of this change even with start/stop symbols excluded, since there are other cases where enumeration of SHF_LINK_ORDER sections may be possible/desirable. To give one example, imagine that you have .init_array sections linked to the globals that they initialize (this may be possible if you can prove that the constructor has no side effects other than initializing the object). With this change we may for example have multiple .init_array sections for .bss and .data, which would need to be accounted for when computing DT_INIT_ARRAY/DT_INIT_ARRAY_SZ. There are various ways that you could deal with this sort of situation, but it seems like the simplest one would be to stick with one output section per file.

Given that the tool that deals with .stack_size sections has apparently been fixed, is there a need to make this change?

In D76410#1937683, @pcc wrote:

Having given this some more thought, I'm still not in favour of this change even with start/stop symbols excluded, since there are other cases where enumeration of SHF_LINK_ORDER sections may be possible/desirable. To give one example, imagine that you have .init_array sections linked to the globals that they initialize (this may be possible if you can prove that the constructor has no side effects other than initializing the object). With this change we may for example have multiple .init_array sections for .bss and .data, which would need to be accounted for when computing DT_INIT_ARRAY/DT_INIT_ARRAY_SZ. There are various ways that you could deal with this sort of situation, but it seems like the simplest one would be to stick with one output section per file.

I created the patch because I am not so fond of the relocatable check in D68094. The case you mentioned (SHF_LINK_ORDER .init_array) and .ARM.exidx are examples of a global singleton section. In such a use case, there is a global monolithic section that will be iterated by entry with a pair of symbols or a pair of dynamic tags. However, there is another set of non-singleton use cases. .stack_sizes is one of them.

Given that the tool that deals with .stack_size sections has apparently been fixed, is there a need to make this change?

The immediate use case of this patch has been worked around. This patch is put up to discuss the most reasonable behavior.

Thinking about this a bit more, this is my opinion.

I think anyone that designs a section type with special ordering or collation requirements that wants it to be portable across as many linkers, without custom processing, will need to make it drop out of standard ELF and existing convention. From memory .ARM.exidx was designed to follow the ELF Rules for Linking Unrecognized Sections quoted at the end, albeit with a couple of caveats:

  • Some assemblers didn't support having multiple sections with the same name so .ARM.exidx.<section name> was needed when each function was output in its own section.
  • .ARM.exidx is a processor specific type rather than OS-specific type.

By following the the Unrecognized Sections and .ARM.exidx and .ARM.exidx.* naming convention it made it fairly easy to collate all the sections in one place. Having one section makes a derived start and stop unique.

If we are apply these rules to LLVM stack-size then it should have its own OS-specific type or flag so that the rules for linking unrecognized sections apply, and it should have a distinct name derived from the orphan name of the code section it has a sh_link to. This will also produce uniquely named derived start and stop symbols that can be referred to, and it permits linker scripts to be written to match the names. If that sounds fragile then I agree. I'm not sure that it is realistic to design a SHF_LINK_ORDER section that relies on the sh_link field in the executable without needing special linker support.

From this I think we don't need to treat SHF_LINK_ORDER in non-relocatable links specially. If something like LLVM stack size section comes along we should treat it specially as I don't think it can be easily handled generically.

I think relocatable links mess this argument up as they really are a special case not really covered by ELF at all. If they were to be purely relocatable the linker would do no collation at all, but kernel module require collation as they are using them as a form of dynamic linking. So I think our current behaviour for these is fine.

Rules for Linking Unrecognized Sections
If a link editor encounters sections whose headers contain OS-specific values it does not recognize in the sh_type or sh_flags fields, the link editor should combine those sections as described below.
If the section's sh_flags bits include the attribute SHF_OS_NONCONFORMING, then the section requires special knowledge to be correctly processed, and the link editor should reject the object containing the section with an error.

Unrecognized sections that do not have the SHF_OS_NONCONFORMING attribute, are combined in a two-phase process. As the link editor combines sections using this process, it must honor the alignment constraints of the input sections (asserted by the sh_addralign field), padding between sections with zero bytes, if necessary, and producing a combination with the maximum alignment constraint of its component input sections.

- In the first phase, input sections that match in name, type and attribute flags should be concatenated into single sections. The concatenation order should satisfy the requirements of any known input section attributes (e.g, SHF_MERGE and SHF_LINK_ORDER). When not otherwise constrained, sections should be emitted in input order.
- In the second phase, sections should be assigned to segments or other units based on their attribute flags. Sections of each particular unrecognized type should be assigned to the same unit unless prevented by incompatible flags, and within a unit, sections of the same unrecognized type should be placed together if possible.

Non OS-specific processing (e.g. relocation) should be applied to unrecognized section types. An output section header table, if present, should contain entries for unknown sections. Any unrecognized section attribute flags should be removed.

NOTE: It is recommended that link editors follow the same two-phase ordering approach described above when linking sections of known types. Padding between such sections may have values different from zero, where appropriate.
MaskRay added a comment.EditedMar 24 2020, 12:04 PM

I actually started a generic-abi thread just after I created the patch. https://groups.google.com/forum/#!topic/generic-abi/avQCrdIALKE

I actually started a generic-abi thread just after I created the patch. https://groups.google.com/g/generic-abi/c/avQCrdIALKE

Shows Error 404 for me..

I actually started a generic-abi thread just after I created the patch. https://groups.google.com/g/generic-abi/c/avQCrdIALKE

Shows Error 404 for me..

Updated to a link I found in incognito mode...

I actually started a generic-abi thread just after I created the patch. https://groups.google.com/g/generic-abi/c/avQCrdIALKE

Shows Error 404 for me..

Updated to a link I found in incognito mode...

It now opens fine, thanks. FWIW the original link didn't (and doesn't) work for me in incognito mode.

Reading the comprehensive feedback, I think we should make this change. There are probably really two difference uses of SHF_LINK_ORDER: one is monolithic (__start_/DT_INIT_ARRAY/...) and the other is not (__patchable_function_entries/.stack_sizes). If pcc has future monolithic use cases, we can cross the bridge when we come to it.

Reading the comprehensive feedback, I think we should make this change. There are probably really two difference uses of SHF_LINK_ORDER: one is monolithic (__start_/DT_INIT_ARRAY/...) and the other is not (__patchable_function_entries/.stack_sizes). If pcc has future monolithic use cases, we can cross the bridge when we come to it.

I'm less convinced that this is the right thing to do. Looking at the thread the last response is important:

SHF_LINK_ORDER says nothing about separate output sections. If you
need the metadata sections for .text.foo to be in a separate output
section from those for .text.bar, you should name the metadata
sections accordingly, too.

If the metadata sections are appropriately named then the separate OutputSections drop out without us needing to make any changes. Splitting up identically named OutputSections that would normally be collated as one risks breaking them. I'm personally inclined to view the use of SHF_LINK_ORDER outside of constructing a single table as the special case given the origin story of SHF_LINK_ORDER. I think we'll have to agree to disagree on this one. Having said that we can handle the special cases when needed so I needn't be a blocker if the consensus is the other way.

MaskRay abandoned this revision.EditedMar 30 2020, 2:29 PM

Reading the comprehensive feedback, I think we should make this change. There are probably really two difference uses of SHF_LINK_ORDER: one is monolithic (__start_/DT_INIT_ARRAY/...) and the other is not (__patchable_function_entries/.stack_sizes). If pcc has future monolithic use cases, we can cross the bridge when we come to it.

I'm less convinced that this is the right thing to do. Looking at the thread the last response is important:

SHF_LINK_ORDER says nothing about separate output sections. If you
need the metadata sections for .text.foo to be in a separate output
section from those for .text.bar, you should name the metadata
sections accordingly, too.

If the metadata sections are appropriately named then the separate OutputSections drop out without us needing to make any changes. Splitting up identically named OutputSections that would normally be collated as one risks breaking them. I'm personally inclined to view the use of SHF_LINK_ORDER outside of constructing a single table as the special case given the origin story of SHF_LINK_ORDER. I think we'll have to agree to disagree on this one. Having said that we can handle the special cases when needed so I needn't be a blocker if the consensus is the other way.

I am also less convinced after I created D77007 and put more thoughts on this change. However, my viewpoint is more from

  • We can't (not impossible but not motivated enough) extend the name based linker script input section descriptions.
  • The implementation consistency

but less from the existing use cases of SHF_LINK_ORDER because I am not convinced that we should necessarily restrict SHF_LINK_ORDER just by the current (admittedly dominating) single table use cases.

We don't label outSecOff in sortSections() for -r. This leads to the following rule in addOrphanSections.

// In -r links, SHF_LINK_ORDER sections are added while adding their parent
// sections because we need to know the parent's output section before we
// can select an output section for the SHF_LINK_ORDER section.
if (config->relocatable && (isec->flags & SHF_LINK_ORDER))

This is related to the rule this patch intends to simplify. Simplifying one place is not worthwhile.

I am abandoning this patch but I'll improve the test case.

lld/test/ELF/partition-move-to-main-startstop.s