diff --git a/lld/docs/ELF/warn_backrefs.rst b/lld/docs/ELF/warn_backrefs.rst
new file mode 100644
--- /dev/null
+++ b/lld/docs/ELF/warn_backrefs.rst
@@ -0,0 +1,79 @@
+--warn-backrefs
+===============
+
+Linkers process input files from left to right and maintain the current set of
+undefined symbols. If an archive member (or an object file surrounded by
+``--start-lib`` and ``--end-lib``) does not satisfy any undefined symbol, a
+single-pass linker such as GNU ld drops it, and the link fails with an
+``undefined reference`` error if a later input file needs it:
+
+    ld def.a ref.o
+
+LLD's archive selection semantics are more relaxed (commutative to some extent).
+The link succeeds even if an archive (``def.a``) is redundant at the time it is
+processed but a later archive or object file (``ref.o``) needs it.
+
+``--warn-backrefs`` identifies such invocations, which may be incompatible
+with GNU ld:
+
+    % ld.lld --warn-backrefs ... -lB -lA
+    ld.lld: warning: backward reference detected: system in A.a(a.o) refers to B.a(b.o)
+
+    % ld.lld --warn-backrefs ... --start-lib B/b.o --end-lib --start-lib A/a.o --end-lib
+    ld.lld: warning: backward reference detected: system in A/a.o refers to B/b.o
+
+    # To suppress the warning, you can specify --warn-backrefs-exclude= to match B/b.o or B.a(b.o)
+
+A single-pass linker such as GNU ld has a useful property: it acts as a
+layering check that enforces a topological order of libraries.
+``--warn-backrefs`` recovers that advantage with more actionable feedback: it
+names the library that defines the symbol. The diagnostic above indicates that
+there is a missing dependency ``A -> B``. There are two main cases and one rare
+case:
+
+* If adding the dependency does not form a cycle: conceptually ``A`` is a
+  higher-level library while ``B`` is at a lower level.
+  When you are developing an application ``P`` that depends on ``A`` but not
+  directly on ``B``, your link may fail unexpectedly with ``undefined symbol:
+  symbol_defined_in_B`` if the used/linked part of ``A`` happens to need some
+  components of ``B``. It is inappropriate for ``P`` to add a dependency on
+  ``B`` since ``P`` does not use ``B`` directly.
+* If adding the dependency forms a cycle, e.g. ``B->C->A ~> B``: ``A`` is
+  supposed to be at the lowest level while ``B`` is supposed to be at the
+  highest level. When you are developing ``C_test`` testing ``C``, your link may
+  fail unexpectedly with ``undefined symbol`` if there is somehow a dependency on
+  some components of ``B``. You could fix the issue by adding the missing
+  dependency (``B``), but then every test (``A_test``, ``B_test``,
+  ``C_test``) will link against every library. This defeats the purpose of
+  splitting ``B``, ``C`` and ``A`` into separate libraries and makes binaries
+  unnecessarily large. Moreover, the layering violation makes lower-level
+  libraries (e.g. ``A``) vulnerable to changes to higher-level libraries (e.g.
+  ``B``, ``C``).
+* (Rare) ``A.a B A2.so``: GNU ld picks the definition from ``A2.so`` while LLD
+  picks the definition from ``A.a``. ``A`` may be an interceptor (e.g. it
+  provides some optimized libc functions and ``A2`` is libc), ``B`` does not need
+  to know about ``A``, and ``A`` may be pulled into the link by another part of
+  the program. In this case, add ``-Wl,--warn-backrefs-exclude=B/b.o``.
+
+Resolution:
+
+* Add a dependency from ``A`` to ``B``.
+* The reference may be unintended and can be removed.
+* The dependency may be intentionally omitted because there are multiple libraries like ``B``.
+  Consider linking ``B`` with object semantics instead of archive semantics.
+* In the case of a circular dependency, merging the libraries is sometimes the best option.
+
+There is a variant of ``A.a B A2.so``: ``A.a B A2.a``.
+I call this the "linking sandwich problem".
+
+* ``A2.a`` may be a replica of ``A.a``. This is redundant but benign. In some
+  cases ``A.a`` and ``B`` should be surrounded by a pair of ``--start-group``
+  and ``--end-group``. This is especially common among system libraries (e.g.
+  ``-lc __isnanl references -lm``, ``-lc _IO_funlockfile references
+  -lpthread``, ``-lc __gcc_personality_v0 references -lgcc_eh``, and
+  ``-lpthread _Unwind_GetCFA references -lunwind``).
+* ``A2.a`` provides a different definition. In a C++ case this is likely a
+  violation of the One Definition Rule.
+
+``--warn-backrefs`` does not check for this problem. We probably need a
+dedicated option for this kind of ODR violation checking.
diff --git a/lld/docs/index.rst b/lld/docs/index.rst
--- a/lld/docs/index.rst
+++ b/lld/docs/index.rst
@@ -177,3 +177,4 @@
    Partitions
    ReleaseNotes
    ELF/linker_script
+   ELF/warn_backrefs