Page MenuHomePhabricator

[ELF] Resolve defined symbols before undefined symbols
ClosedPublic

Authored by MaskRay on Feb 3 2021, 3:50 PM.

Details

Summary

When parsing an object file, LLD interleaves undefined symbol resolution (which
may recursively fetch other lazy objects) with defined symbol resolution.

This may lead to surprising results, e.g. if an object file defines currently
undefined symbols and references another lazy symbol, we may interleave defined
symbols with the lazy fetch, potentially leading to the defined symbols
resolving to different files.

As an example, if both a.a(a.o) and a.a(b.o) define foo (not in COMDAT
group, or in different COMDAT groups) and __profd_foo (in COMDAT group
__profd_foo). LLD may resolve foo to a.a(a.o) and __profd_foo to
b.a(b.o), i.e. different files.

parse ArchiveFile a.a
  entry fetches a.a(a.o)
  parse ObjectFile a.o
    define entry
    define foo
    reference b
    b fetches a.a(b.o)
    parse ObjectFile b.o
      define prevailing __profd_foo
    define (ignored) non-prevailing __profd_foo

Assuming a set of interconnected symbols are defined all or none in several lazy
objects. Arguably making them resolve to the same file is preferable than making
them resolve to different files (some are lazy objects).

The main argument favoring the new behavior is the stability. The relative order
between a defined symbol and an undefined symbol does not change the symbol
resolution behavior. Only the relative order between two undefined symbols can
affect fetching behaviors.


The real world case is reduced from a Fuchsia PGO usage: a.a(a.o) has a
constructor within COMDAT group C5 while a.a(b.o) has a constructor within
COMDAT group C2. Because they use different group signatures, they are not
de-duplicated. It is not entirely whether Clang behavior is entirely conforming.

LLD selects the PGO counter section (__profd_*) from a.a(b.o) and the
constructor section from a.a(a.o). The __profd_* is a SHF_LINK_ORDER section
linking to its own non-prevailing constructor section, so LLD errors
sh_link points to discarded section. This patch fixes the error.

Diff Detail

Event Timeline

MaskRay created this revision.Feb 3 2021, 3:50 PM
MaskRay requested review of this revision.Feb 3 2021, 3:50 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 3 2021, 3:50 PM
MaskRay updated this revision to Diff 321280.Feb 3 2021, 5:56 PM
MaskRay edited the summary of this revision. (Show Details)

Add test

MaskRay edited the summary of this revision. (Show Details)
MaskRay edited the summary of this revision. (Show Details)Feb 3 2021, 6:07 PM
MaskRay edited the summary of this revision. (Show Details)Feb 3 2021, 6:17 PM

I've subscribed others from my team for visibility. My first thought was that changing symbol resolution order sounds a little scary. That's not to say this is undesirable though - I need to spend some time thinking about this. It might be helpful to have a graph drawn showing the different references that are relevant here and which definitions are in which groups, as I'm struggling to follow the words (I can probably piece it together, but it will require more effort).

The real world case is reduced from a Fuchsia PGO usage: a.a(a.o) has a constructor within COMDAT group C5
while a.a(b.o) has a constructor within COMDAT group C2. Because they use
different group signatures, they are not de-duplicated.

LLD selects the PGO counter section (profd_*) from a.a(b.o) and the
constructor section from a.a(a.o). The
profd_* is a SHF_LINK_ORDER section
linking to its own non-prevailing constructor section, so LLD errors
sh_link points to discarded section. This patch fixes the error.

Can you write out the example in full, I'm not sure I got it. I would expect that if C5 and C2 have different signatures they should both be loaded as to the linker they are entirely separate entities. If the __profd_* sections have link order dependencies into group sections are not in the same group I think we have a problem with ELF file and not with the linker. Have I misunderstood the example?

Will need to have a think. I don't have any particular attachment to the way that we resolve undefined symbols today so I'm not going to object if there are other benefits.

OK. Had a chance to go through this in more detail. To make sure I understand; if the undefined symbols are in object files/libraries yet to be processed then there is no difference, but if there is a lazy symbol for an undefined then the object will be fetched immediately on seeing

// Handle global undefined symbols.
   if (eSym.st_shndx == SHN_UNDEF) {

Resulting in globals being loaded from a recursive chain of lazy loads before the globals from this object file have been loaded.

If I've got that right then I agree that this is a worthwhile change to make, if only because it makes library selection a bit more predictable/less-surprising.

MaskRay added a comment.EditedFeb 4 2021, 12:10 PM

The real world case is reduced from a Fuchsia PGO usage: a.a(a.o) has a constructor within COMDAT group C5
while a.a(b.o) has a constructor within COMDAT group C2. Because they use
different group signatures, they are not de-duplicated.

LLD selects the PGO counter section (profd_*) from a.a(b.o) and the
constructor section from a.a(a.o). The
profd_* is a SHF_LINK_ORDER section
linking to its own non-prevailing constructor section, so LLD errors
sh_link points to discarded section. This patch fixes the error.

Can you write out the example in full, I'm not sure I got it. I would expect that if C5 and C2 have different signatures they should both be loaded as to the linker they are entirely separate entities. If the __profd_* sections have link order dependencies into group sections are not in the same group I think we have a problem with ELF file and not with the linker. Have I misunderstood the example?

Will need to have a think. I don't have any particular attachment to the way that we resolve undefined symbols today so I'm not going to object if there are other benefits.

parse ArchiveFile a.a
  entry fetches a.a(a.o)
  parse ObjectFile a.o
    define entry
    define foo
    reference b
    b fetches a.a(b.o)
    parse ObjectFile b.o
      define prevailing __profd_foo
    define (ignored) non-prevailing __profd_foo

Is the reduced example. The original case is (folks may not want to read it):

parse ArchiveFile libunwindstack.a
  _ZN11unwindstack8Unwinder6UnwindEPKNSt3__26vectorINS1_12basic_stringIcNS1_11char_traitsIcEENS1_9allocatorIcEEEENS6_IS8_EEEESC_ fetches libunwindstack.Unwinder.cpp.o
  parse ObjectFile libunwindstack.Unwinder.cpp.o
    _ZN11unwindstack3Elf12GetLastErrorEPNS_9ErrorDataE fetches libunwindstack.Elf.cpp.o
    parse ObjectFile libunwindstack.Elf.cpp.o
      _ZN11unwindstack12ElfInterface11GetLoadBiasI10Elf32_Ehdr10Elf32_PhdrEEmPNS_6MemoryE fetches libunwindstack.ElfInterface.cpp.o
        parse ObjectFile libunwindstack.ElfInterface.cpp.o
          _ZN11unwindstack19DwarfEhFrameWithHdrIjE12GetFdeFromPcEm fetches libunwindstack.DwarfEhFrameWithHdr.cpp.o
          parse ObjectFile libunwindstack.DwarfEhFrameWithHdr.cpp.o
            defines _ZN11unwindstack19DwarfEhFrameWithHdrIjEC2EPNS_6MemoryE
            references __profd__ZN11unwindstack19DwarfEhFrameWithHdrIjEC2EPNS_6MemoryE
              libunwindstack.ElfInterface.cpp.o defines __profd__ZN11unwindstack19DwarfEhFrameWithHdrIjEC2EPNS_6MemoryE

Let me abbreviate the long symbol to `xxx6MemoryE`.

If we let the ancestor parsing frame (libunwindstack.ElfInterface.cpp.o; simplified as b.o in my example) to define `__profd_xxx6MemoryE`, `xxx6MemoryE` and `__profd_xxx6MemoryE` will be defined by the same file.

xxx6MemoryE is not de-duplicated in libunwindstack.ElfInterface.cpp.o and libunwindstack.DwarfEhFrameWithHdr.cpp.o because the two object files use different group signature (C2 vs C5).

libunwindstack.DwarfEhFrameWithHdr.cpp.o
## C5 comdat
COMDAT group section [    3] `.group' [_ZN11unwindstack19DwarfEhFrameWithHdrIjEC5EPNS_6MemoryE] contains 2 sections:
   [Index]    Name
   [    4]   .text._ZN11unwindstack19DwarfEhFrameWithHdrIjEC2EPNS_6MemoryE
   [    5]   .rela.text._ZN11unwindstack19DwarfEhFrameWithHdrIjEC2EPNS_6MemoryE

libunwindstack.ElfInterface.cpp.o
## C2 comdat
COMDAT group section [    9] `.group' [_ZN11unwindstack19DwarfEhFrameWithHdrIjEC2EPNS_6MemoryE] contains 2 sections:
   [Index]    Name
   [   10]   .text._ZN11unwindstack19DwarfEhFrameWithHdrIjEC2EPNS_6MemoryE
   [   11]   .rela.text._ZN11unwindstack19DwarfEhFrameWithHdrIjEC2EPNS_6MemoryE

OK. Had a chance to go through this in more detail. To make sure I understand; if the undefined symbols are in object files/libraries yet to be processed then there is no difference, but if there is a lazy symbol for an undefined then the object will be fetched immediately on seeing

// Handle global undefined symbols.
   if (eSym.st_shndx == SHN_UNDEF) {

Resulting in globals being loaded from a recursive chain of lazy loads before the globals from this object file have been loaded.

If I've got that right then I agree that this is a worthwhile change to make, if only because it makes library selection a bit more predictable/less-surprising.

Yes. The idea is to postpone Undefined resolution (which may fetch lazy symbols) to the end. Previously, if we swap the order of a defined and an undefined in an object file, it can potentially change the symbol resolution result.
The new behavior is immune to defined/undefined shuffles.

Undefined shuffles (due to lazy symbols) can still result to different symbol resolution results.

peter.smith accepted this revision.Feb 8 2021, 7:47 AM

I've put my approval down, as I think it is a worthwhile change to make regardless of whether the test case that had problems is strictly legal.

I think it will be worth waiting till Wednesday as that will have given people a week to object.

lld/ELF/InputFiles.cpp
1141

I've seen quite a few objects with more than 8 undefined symbols, particularly older projects. Is it worth running a quick experiment to see what the distribution is like? For example the undefined symbols per object in a clang/llvm build.

This revision is now accepted and ready to land.Feb 8 2021, 7:47 AM
phosek accepted this revision.Feb 8 2021, 11:41 AM

LGTM as well, I agree that this is valuable from the perspective of making the symbol resolution more predictable.

MaskRay added inline comments.Feb 8 2021, 3:15 PM
lld/ELF/InputFiles.cpp
1141

Linking clang:

  • This function is called 2287 times.
  • 228 elements have <= 8 elements.
  • 465 elements have <= 16 elements.
  • 842 elements have <= 32 elements.
  • 1565 elements have <= 64 elements.

In general, I think this looks okay. I've spent some time investigating the behaviour of our propietary linker in comparison to this, and it looks look the modified LLD matches our linker's behaviour, so this seems like a good thing for improving compatibility.

That being said, a colleague and I have been looking at whether there are any potential problemas this behaviour change might cause. We should be able to conclude that investigation tomorrow. Could you hold on until end of UK workday on Thursday before pushing this, please?

lld/ELF/InputFiles.cpp
1208
MaskRay updated this revision to Diff 322742.Feb 10 2021, 10:35 AM

Increase inline element number for unds
Improve comment

MaskRay marked 2 inline comments as done.Feb 10 2021, 10:35 AM
jhenderson accepted this revision.Feb 11 2021, 2:15 AM

A colleague has just finished running a linker with this patch through our internal codebase and it seems to have been working fine. I haven't been able to come up with any realistic cases where this change will have made things worse, so LGTM.

MaskRay edited the summary of this revision. (Show Details)Feb 11 2021, 9:39 AM
This revision was automatically updated to reflect the committed changes.