This is an archive of the discontinued LLVM Phabricator instance.

[ELF] A shared object is needed if any of its occurrences is needed
ClosedPublic

Authored by MaskRay on Dec 26 2018, 1:40 PM.

Details

Summary

If a DSO appears more than once with and without --as-needed, ld.bfd and gold consider --no-as-needed to takes precedence over --as-needed. lld didn't and this patch makes it do so.

This makes it a bit away from the position-dependent behavior (how
different occurrences of the same DSO interact) and protects us from
some mysterious runtime errors: if some interceptor libraries add their
own --no-as-needed dependencies (e.g. librt.so), and the user
application specifies -Wl,--as-needed -lrt , the absence of the
DT_NEEDED entry would make dlsym(RTLD_NEXT, "clock_gettime") return NULL
and would break at runtime.

Diff Detail

Repository
rL LLVM

Event Timeline

MaskRay created this revision.Dec 26 2018, 1:40 PM
ruiu added a comment.Dec 26 2018, 1:47 PM

This is a tricky case, and arguably the current behavior should be considered correct, although there's no definition of what's right and what's wrong. Could you share the background of this change so that I can understand why you want this?

ELF/SymbolTable.cpp
97 ↗(On Diff #179520)

Please replace auto with the actual type.

MaskRay marked an inline comment as done.Dec 26 2018, 2:18 PM

The test case suggests its usage: ld.lld %t.o %tb.so --as-needed %ta.so --no-as-needed %ta.so -o %t

The behavior I observed: both ld.bfd and gold consider a DSO needed if any occurrence of it is not wrapped by --as-needed. However, lld doesn't do so: we uniquify DSOs by their SONAME, and may drop a DT_NEEDED entry if the first occurrence is wrapped by --as-needed.

This patch makes it consistent with ld.bfd/gold in this aspect and moves a bit away from the not-so-good position-dependent behavior.
(An as-needed DSO resembles an archive file to some extent. I understand it is difficult to make it consistent with ld.bfd/gold in every case (gold diverges from ld.bfd in some aspects). But this behavior is what I think reasonable)

ELF/SymbolTable.cpp
97 ↗(On Diff #179520)

The actual type of R is very long std::pair<llvm::DenseMapIterator<llvm::StringRef, lld::elf::InputFile *, llvm::DenseMapInfo<llvm::StringRef>, llvm::detail::DenseMapPair<llvm::StringRef, lld::elf::InputFile *>, false>, bool>

It is a pair consisting of an iterator to the element, and a bool denoting whether the insertion took place (the same as https://en.cppreference.com/w/cpp/container/map/emplace).

ruiu added a comment.Dec 26 2018, 2:24 PM

Although I understand how your change works, I don't understand why you need this in the first place. What is the use case in which you want to specify the same DSO more than once in the same command line, with and without --as-needed?

Although I understand how your change works, I don't understand why you need this in the first place. What is the use case in which you want to specify the same DSO more than once in the same command line, with and without --as-needed?

The AsNeeded status of a.so can also come from the linker script command AS_NEEDED. If an executable links in the script b.so (AS_NEEDED(a.so)), but really wants a.so to be loaded, it may link in a.so explicitly. Without the patch (making it consistent with ld.bfd/gold), the DT_NEEDED entry of a.so will be dropped.

Now the question is why we want to ensure some DSOs in DT_NEEDED entries. It can be related to static constructors and DSO loading order problems (honestly I think many of them are just user error...).

ruiu added a comment.Dec 26 2018, 3:16 PM

What program are you actually trying to fix? Honestly I feel no one should depends on such a corner case behavior of GNU linkers.

https://bugs.llvm.org/show_bug.cgi?id=15823#c0 (clang -fuse-ld=lld -fsanitize=address test.c -Wl,--as-needed -ldl -lrt) is one real world case.

That example does not work today because libclang_rt.asan-x86_64.a has other references (shm_open) to librt.so so librt.so is needed.

But the idea may apply on other interceptor libraries: they may have their own --no-as-needed dependencies (librt.so), if the user application specifies -Wl,--as-needed -lrt , it will break (dlsym(RTLD_NEXT, "clock_gettime")) at runtime.

ruiu added a comment.Dec 27 2018, 10:12 AM

I'm still a bit hesitant about doing this, but perhaps it is better to accept this because without this a mysterious runtime error could happen as you explained. But I wonder if you can simplify it even more. Can you do something like this? I didn't test this but I think we don't need a hash table.

https://gist.github.com/rui314/92fdae6c0678528e76ad1a5e8818a57f

MaskRay updated this revision to Diff 179565.Dec 27 2018, 10:32 AM

Add some comments

MaskRay added a comment.EditedDec 27 2018, 10:36 AM

I'm still a bit hesitant about doing this, but perhaps it is better to accept this because without this a mysterious runtime error could happen as you explained. But I wonder if you can simplify it even more. Can you do something like this? I didn't test this but I think we don't need a hash table.

https://gist.github.com/rui314/92fdae6c0678528e76ad1a5e8818a57f

Thank you for considering this. The actual code addition is one line (minus comments and newline), the benefit is the fix of some potential mysterious runtime errors. In addition, it fixes one position-dependent behavior. There are two introduced by --as-needed

  • --as-needed influences DSOs coming after it. This is inherent and unfixable
  • How the multiple occurrences of the same DSO interact. It was position-dependent and now independent.

I've read your gist, but I think the quadratic behavior may be undesired. The number of DT_NEEDED entries is usually small (100), but some of our internal executables can have thousands of dependencies. By keeping the hash table as is, the increase of memory usage is small but the performance improvement is large.

ruiu added a comment.Dec 27 2018, 11:24 AM

Quadratic behavior should be fine as a square of a few thousand is still small. We don't need to worry too much about it. That said, if you don't like it, it is also fine as long as the code is as easy to understand as the more dumb-but-simple code. I'd think that the current comment is not suffice because it only explains what that code does. Please copy my comment which explains why we want it.

Also please avoid try_emplace as the code looks a bit too smart. The actual values of R.first->second and R->second are not obvious. I'd write something like this.

F->parseSoName();
if (errorCount())
  return;

// DSO are uniquified by soname.
InputFile *&Existing = SoNames[F->SoName];
if (Existing) {
  if (F->IsNeeded)
    cast<SharedFile<ELFT>>(Existing)->IsNeeded = true;
  return;
}
Existing = F;
MaskRay updated this revision to Diff 179568.Dec 27 2018, 11:32 AM

Spell out the two components of the try_emplace return value

Quadratic behavior should be fine as a square of a few thousand is still small. We don't need to worry too much about it. That said, if you don't like it, it is also fine as long as the code is as easy to understand as the more dumb-but-simple code. I'd think that the current comment is not suffice because it only explains what that code does. Please copy my comment which explains why we want it.

Also please avoid try_emplace as the code looks a bit too smart. The actual values of R.first->second and R->second are not obvious. I'd write something like this.

F->parseSoName();
if (errorCount())
  return;

// DSO are uniquified by soname.
InputFile *&Existing = SoNames[F->SoName];
if (Existing) {
  if (F->IsNeeded)
    cast<SharedFile<ELFT>>(Existing)->IsNeeded = true;
  return;
}
Existing = F;

DenseMap::try_emplace is a C++17 flavored function name that powers insert, which has been used in 30+ llvm files and 20+ clang files. I know lld uses a slightly different style, so I have spelled out the two components. WasInserted is picked according to the existing name convention in another place of lld.

ruiu accepted this revision.Dec 27 2018, 2:13 PM

LGTM

ELF/SymbolTable.cpp
100 ↗(On Diff #179568)

I still prefer to add

if (errorCount())
  return;

here because there's no reason to group WasInserted and errorCount() in terms of semantics.

This revision is now accepted and ready to land.Dec 27 2018, 2:13 PM
MaskRay updated this revision to Diff 179582.Dec 27 2018, 2:26 PM
MaskRay edited the summary of this revision. (Show Details)

Check `if (errorCount) early

This revision was automatically updated to reflect the committed changes.