This is an archive of the discontinued LLVM Phabricator instance.

[libcxx] Use AS_NEEDED command for linker script inputs
Needs ReviewPublic

Authored by phosek on Oct 27 2018, 5:07 PM.

Details

Summary

This ensures that ELF shared libraries are only added when they are
actually needed, so when using C++ driver for linking, ELF outputs
that don't use any C++ standard library symbols won't end up with
unnecessary DT_NEEDED entries in the dynamic section.

Diff Detail

Repository
rCXX libc++

Event Timeline

phosek created this revision.Oct 27 2018, 5:07 PM
MaskRay added inline comments.
libcxx/utils/gen_link_script.py
73

I don't understand the intention here.

If the user passes -lc++ without --as-needed before, the user expects to see a DT_NEEDED entry for libc++.so.1. But with this change. the DT_NEEDED entry will not appear.

(In lld, libc++abi.so.1 resolves undefined symbols in libc++.so.1 so it will always be needed (SharedFile::IsNeeded == true).)

phosek added inline comments.Oct 27 2018, 7:30 PM
libcxx/utils/gen_link_script.py
73

Passing -lc++ doesn't imply DT_NEEDED because -lc++ may refer to different things, for example if you only have static library (i.e. libc++.a), passing -lc++ won't create any DT_NEEDED entries.

Implementation of the library gets to decide whether its ABI requires DT_NEEDED for users that do not link in any symbols. This is a worthwhile optimization since the implementation does have any initializer-time side-effects or weak symbol overrides that are intended to affect the behavior of a program that doesn't have an undefined reference to a libc++ or libc++abi.

MaskRay added inline comments.Oct 28 2018, 12:36 AM
libcxx/utils/gen_link_script.py
73

Sorry if I was not clear.

Say we have libstdc++.so (ELF) libc++.so (linker script INPUT(...)). The current behavior:

  • -lstdc++ => DT_NEEDED libstdc++.so.6
  • --as-needed -lstdc++ => DT_NEEDED tag is as needed
  • -lc++ => DT_NEEDED libc++.so.1 & libc++abi.so.1
  • --as-needed -lc++ => DT_NEEDED libc++.so.1 is as needed; (in lld (different from ld.bfd and gold), libc++abi.so.1 will always exist)

With this patch,

-lc++ will essentially become --as-needed -lc++. I think we need convincing arguments why we should diverge.

With lld, you can do this if you really need as-needed tag: -fuse-ld=lld -stdlib=libc++ -Wl,--as-needed -lc++ -Wl,--no-as-needed
(-stdlib=libc++ transforms to a second -lc++, but .so with the same soname is just ignored. This observed behavior is different from ld.bfd/gold)

If you want as-needed DT_NEEDED with ld.bfd/gold, I guess you'll have to use -nostdlib++ and link libc++ manually.

If this divergent behavior seems reasonable enough to be hard-coded in clang for some particular toolchain: change clang::driver::toolchains::*::AddCXXStdlibLibArgs to add -Wl,--push-state -Wl,--as-needed -lc++ -Wl,--end-state

MaskRay added inline comments.Oct 28 2018, 12:40 AM
libcxx/utils/gen_link_script.py
73

typo. -Wl,--end-state -> -Wl,--pop-state

ldionne resigned from this revision.Oct 31 2018, 4:02 PM
ldionne added a reviewer: dexonsmith.

I don't feel like I'm qualified to review this change. Adding Duncan who knows more about linkers. I'll avidly watch to learn though :-)

MaskRay removed a subscriber: MaskRay.
dexonsmith resigned from this revision.Oct 31 2018, 7:18 PM

I don't feel like I'm qualified to review this change. Adding Duncan who knows more about linkers. I'll avidly watch to learn though :-)

If this is ELF then I'm out of my depth here.

Is this patch still needed now that Fuchsia uses:

CmdArgs.push_back("--push-state");
CmdArgs.push_back("--as-needed");
if (OnlyLibstdcxxStatic)
  CmdArgs.push_back("-Bstatic");
ToolChain.AddCXXStdlibLibArgs(Args, CmdArgs);
if (OnlyLibstdcxxStatic)
  CmdArgs.push_back("-Bdynamic");
CmdArgs.push_back("-lm");
CmdArgs.push_back("--pop-state");

Is this patch still needed now that Fuchsia uses:

CmdArgs.push_back("--push-state");
CmdArgs.push_back("--as-needed");
if (OnlyLibstdcxxStatic)
  CmdArgs.push_back("-Bstatic");
ToolChain.AddCXXStdlibLibArgs(Args, CmdArgs);
if (OnlyLibstdcxxStatic)
  CmdArgs.push_back("-Bdynamic");
CmdArgs.push_back("-lm");
CmdArgs.push_back("--pop-state");

We chose to address this in the driver for Fuchsia specifically so as not to be blocked on the resolution of the change here.
To be pedantic, I advocate both changes but for slightly different reasons:

  1. Using --as-needed -lc++ in the compiler driver is an assertion about the API contract between the compiler and the -lc++ implementation. It asserts that the sole purpose of -lc++ is to satisfy link-time symbol references generated by the compiler. If there are no symbols to be resolved, the compiler is not obliged to include -lc++ in the link. This assertion is wrong if we decide instead to say that the API contract between the compiler and the -lc++ implementation is that if the user enabled the standard library then -lc++ should be included in the link. That makes it possible e.g. to have the -lc++ implementation decide to include a shared library initializer that *must* run in a program that "enabled the standard library" even if there are no link-time symbol references. It's my humble opinion in general, and my authoritative assertion for the Fuchsia-specific API contract between compiler and -lc++ implementation, that an API contract compatible with --as-needed link-time behavior is the preferable API contract there. I think other compiler configurations should make that choice of API contract and use optimal linker switches accordingly, but that is up to them.
  1. Having libc++.so be a linker script provided by the libc++ implementation means that the linker script is responsible for meeting the API contract with the compiler and translating that into the ABI contract with the libc++.so.N shared library implementation (and future implementations that should drop in and be ABI-compatible). Whether the compiler decides to eagerly use -lc++ or not, libc++ can make the choice for its own ABI that there will never be a future ABI-compatible implementation where there is any need to load libc++.so.N at runtime (to run its initializer or anything else) *except* as needed to satisfy link-time symbol references. I think that's a sensible and desirable choice for libc++ to make independent of whether the API contract with the compiler constrains it to do so or not. In actual practice, there are many compilers and compiler versions and configurations around that might be used to link against -lc++ as provided by tomorrow's libc++, so IMHO it best serves all those users to be optimal here even if it's redundant with some compiler configurations.

I am mostly convinced by the ABI contract argument :) I really enjoy the interpretation but am willing to learn more. @mcgrathr Should other libfoo.so behave so (on-demand DT_NEEDED tag) in an ideal world?

@phosek would you mind amending the description with Roland's arguments?

MaskRay resigned from this revision.Jul 27 2020, 11:46 PM

libc++.a as a linker script may be a good solution (https://bugs.llvm.org/show_bug.cgi?id=46321)