Since rL295240 we add a dynamic symbol table to statically linked binaries.
This also causes a .dynamic section, the _DYNAMIC symbol and a PT_DYNAMIC
header to be added to the output file. This causes problems, for example
when trying to run such a binary on FreeBSD MIPS. All the operating system
tools and the ELF loader treat this binary as a dynamically linked one instead
and therefore try to load it with RTLD, which will crash at runtime.
Details
- Reviewers
ruiu ed • espindola MaskRay
Diff Detail
- Repository
- rLLD LLVM Linker
Event Timeline
This effectively reverts rL295240, right? If so, I'd personally oppose this change.
The downside of statically linked executables is that by default, they don't contain a dynsym section. This means that no symbol table is mapped in memory, so such processes are not able to print their own symbol names (e.g., in backtraces due to uncaught exceptions). In rL295240, we added support for passing in --export-dynamic to force the creation of dynsym, PT_DYNAMIC, etc. to overcome this limitation. That this deviates from GNU ld is not a bug.
Does this only affect FreeBSD/mips or any architecture supported by FreeBSD? If it's just MIPS, why?
The dynamic symbol table is still included, it's only the PT_DYNAMIC, .dynamic and the _DYNAMIC symbol that are excluded.
I noticed this while testing for MIPS/CHERI but it might also affect other architectures. Looking at the code, there are a lot of cases where code checks for _DYNAMIC == NULL to determine whether the executable is statically or dynamically linked:
libc/gen/auxv.c, libc/csu/mips/crt1.c (this one also exists for amd64, i386, riscv, sparc, aarch64, etc.), libc/net/nsdispatch.c. The common init code in csu/common/ignore_init.c also skips the complete initialization logic for preinit_array and init_array if _DYNAMIC exists.
For this case it would be enough to not add the _DYNAMIC symbol, and keep PT_DYNAMIC and .dynamic. What do you need the dynamic section for?
However, the file tool prints "dynamically linked" for all executables that contain a PT_DYNAMIC header.
But within a running process, you need PT_DYNAMIC to be able to find the location of dynsym, right? I suspect that without it, dynsym will be nothing more than an unreferenced section.
Omitting .dynamic and _DYNAMIC sounds perfectly fine to me!
I believe you need DT_SYMTAB to find it within a process, so I guess I can't remove PT_DYNAMIC or .dynsym. I think the real problem is _DYNAMIC not resolving to null, so that is fine by me.
It is slightly annoying that file will print that the executable is dynamically linked even though it isn't, but I think that should probably be fixed in the file source code rather than worked around here.
@arichardson
If the MIPS problem was similar to https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=236165 ,
moving away from &_DYNAMIC will be a more reliable approach.
To check if an executable is dynamically linked, inspecting PT_INTERP is a better choice.
Checking if a weak undefined symbol has zero address is unreliable.
Some compilers may produce a GOT-generating relocation, some may produce an absolute relocation.
After linking, you may see the relocation resolved to a static 0, or see a dynamic relocation (if at runtime there is some module providing the dynamic symbol, the weak reference will resolve to non-zero).
Quoting http://www.sco.com/developers/gabi/latest/ch4.symtab.html
The behavior of weak symbols in areas not specified by this document is implementation defined. Weak symbols are intended primarily for use in system software. Applications using weak symbols are unreliable since changes in the runtime environment might cause the execution to fail.
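For application-level code that runs after startup, the PT_INTERP test suggested above could look roughly like the sketch below. It assumes Linux with glibc (getauxval from <sys/auxv.h> and the ElfW macro from <link.h>); main_program_has_segment is a hypothetical helper name, not an existing API. It walks the program headers of the main executable, so it is heavier than a single address comparison.

```c
#include <assert.h>
#include <elf.h>       /* PT_* and AT_* constants */
#include <link.h>      /* ElfW() */
#include <stddef.h>
#include <sys/auxv.h>  /* getauxval() */

/*
 * Hypothetical helper: returns 1 if the main executable has a program
 * header of the given type, 0 if it does not, and -1 if the kernel did
 * not pass AT_PHDR in the auxiliary vector.
 */
int main_program_has_segment(ElfW(Word) type)
{
    const ElfW(Phdr) *phdr = (const ElfW(Phdr) *)getauxval(AT_PHDR);
    unsigned long phnum = getauxval(AT_PHNUM);

    /* PT_NULL marks unused slots; asking for it is meaningless. */
    assert(type != PT_NULL);

    if (phdr == NULL)
        return -1;
    for (unsigned long i = 0; i < phnum; i++) {
        if (phdr[i].p_type == type)
            return 1;
    }
    return 0;
}
```

Calling main_program_has_segment(PT_INTERP) then answers "was an interpreter requested for this executable", independent of whether _DYNAMIC is defined.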
Regarding this patch: -Bstatic (a synonym of -static in ld.bfd and lld) just means "don't look for libfoo.so when a -lfoo is seen, until the next -Bdynamic". I think it is weird to use it to decide whether we should emit .dynamic. (In the compiler drivers (gcc/clang/etc.), -static means static linking, but that is different from -Bstatic/-static in ld.bfd/lld.)
This change neither brings lld closer to ld.bfd nor makes lld's behavior more reasonable on its own terms (the internals of lld are very different from ld.bfd, so some behaviors of ld.bfd may not suit lld). The logic to emit .dynamic, .dynsym, .dynstr, etc. in the three linkers:
lld: has_dso || --shared || --pie || --export-dynamic
gold: has_dso || --shared || --pie
bfd: (--shared || --pie) || ((not -r) && info->nointerp && (info->export_dynamic || info->dynamic)) && some (almost always true) conditions
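The three conditions can be written out as predicates to compare them side by side. This is only a sketch: struct link_opts and the function names are hypothetical, not structures from any of the linkers, and the "almost always true" extra conditions of ld.bfd are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of a link invocation; not a real linker type. */
struct link_opts {
    bool has_dso;        /* at least one shared object on the command line */
    bool shared;         /* --shared */
    bool pie;            /* --pie */
    bool export_dynamic; /* --export-dynamic */
    bool relocatable;    /* -r */
    bool nointerp;       /* --no-dynamic-linker (bfd's info->nointerp) */
    bool dynamic;        /* bfd's info->dynamic */
};

bool lld_emits_dynamic(const struct link_opts *o)
{
    assert(o != NULL);
    return o->has_dso || o->shared || o->pie || o->export_dynamic;
}

bool gold_emits_dynamic(const struct link_opts *o)
{
    assert(o != NULL);
    return o->has_dso || o->shared || o->pie;
}

/* Transcribed from the condition quoted above, minus the extra conditions. */
bool bfd_emits_dynamic(const struct link_opts *o)
{
    assert(o != NULL);
    return (o->shared || o->pie) ||
           (!o->relocatable && o->nointerp &&
            (o->export_dynamic || o->dynamic));
}
```

With only export_dynamic set, the lld predicate is true while the other two are false; adding nointerp flips the ld.bfd predicate to true.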
@ed I want to know more about your motivation to add --export-dynamic to the condition in D29982. Why do you need .dynamic in a position dependent executable for CloudABI, which has no shared object dependency on the linker command line?
Perhaps first you should define what features of static binaries you depend on. In fact PT_INTERP is not absolutely needed to have any of the dynamic features; in principle rtld.c can be linked from crt1.o. And if we (FreeBSD) decide to support dlopen(3) from static binaries, this is what would happen. We already have a small dynamic linker in crt1.o to support ifunc relocations.
Absence of the DYNAMIC segment is a rather good indication that a lot of dynamic features are indeed not used, so &_DYNAMIC == 0 is probably quite a good check, except that it is broken.
From the description
This also causes a .dynamic section, the _DYNAMIC symbol and a PT_DYNAMIC header to be added to the output file. This causes problems for example when trying to run such a binary on FreeBSD MIPS.
What I know is just that the presence of _DYNAMIC caused a problem, but I don't have more information on why it caused the problem. Without more information I can only conjecture. My intuition says it is more likely to be a problem if the dynamic section is absent in some scenarios; I don't understand how its presence (though probably unexpected by you) caused a problem.
Perhaps first you should define what features of static binaries you depend on. In fact PT_INTERP is not absolutely needed to have any of the dynamic features; in principle rtld.c can be linked from crt1.o. And if we (FreeBSD) decide to support dlopen(3) from static binaries, this is what would happen. We already have a small dynamic linker in crt1.o to support ifunc relocations.
Absence of the DYNAMIC segment is a rather good indication that a lot of dynamic features are indeed not used, so &_DYNAMIC == 0 is probably quite a good check, except that it is broken.
@kib So to answer your question, I need more information.
And if we (FreeBSD) decide to support dlopen(3) from static binaries
And if you decide to support static pie, you also need _DYNAMIC.
The C runtime (crt1.o together with libc.so/libc.a) contains a lot of code like if (&_DYNAMIC != NULL) used as a test for the static/dynamic situation. Look at https://github.com/freebsd/freebsd/blob/master/lib/csu/amd64/crt1.c#L62; there are many more. I believe it was mentioned many times in the discussion.
And if you decide to support static pie, you also need _DYNAMIC.
Are you sure about this? For static PIE, as I understand it, we are missing some kind of relocator in csu. For the relocator to work, all we need is to find the relocation section's boundaries. See for instance https://github.com/freebsd/freebsd/blob/master/lib/csu/common/ignore_init.c#L52 for how we find iplt relocations in static binaries. I would expect that a similar approach works for non-iplt relocations.
Yes. In the FreeBSD case I believe almost all uses of &_DYNAMIC are either in lib/csu or lib/libc. However, I'm pretty sure I saw some application level code use the check to determine if they are dynamically linked.
Also I just had a look in llvm-project.git and it seems that compiler-rt/lib/sanitizer_common/sanitizer_linux.cc uses the following: level = &_DYNAMIC == nullptr ? AndroidDetectApiLevelStatic(). I think this may be a common way of checking for the presence of a dynamic linker so emitting it for static binaries could be problematic.
Regarding this patch: -Bstatic (a synonym of -static in ld.bfd and lld) just means "don't look for libfoo.so when a -lfoo is seen, until the next -Bdynamic". I think it is weird to use it to decide whether we should emit .dynamic. (In the compiler drivers (gcc/clang/etc.), -static means static linking, but that is different from -Bstatic/-static in ld.bfd/lld.)
I agree the check for Config->Static is wrong. It just happens to have the side effect that worked for us (and also matched ld.bfd because -static results in a binary without any dynamic libraries) but is the wrong check.
I think it needs to be more like the check in needsInterpSection(): Something like Config->Pic || !SharedFiles.empty() || !Config->DynamicLinker.empty() || Script->needsInterpSection();
I don't think that would work, for two reasons (at least not for every architecture). First, lld does not emit a symbol pointing to the start of the relocation section. Second, with PIE we don't know the real address prior to relocation. It might work for architectures with pc-relative addressing, but some others would need to process the relative relocations first.
Therefore you need to look at the .dynamic contents to find REL/RELA.
Regarding this patch: I don't think _DYNAMIC would be useful for -static-pie since we would need to relocate it first so _DYNAMIC could still be undefined and used to check for dynamic linker presence.
I had a quick look at glibc's static-pie code and it seems like glibc loads the first GOT entry for most architectures to get the address of the dynamic section.
For CloudABI it's useful to have when you want to create binaries that are able to print their own stack traces with symbol names in them. The regular symbol table isn't mapped into the address space of the application, whereas the dynamic symbol table is.
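As a rough illustration of what such a process can do with the mapped dynamic symbol table, the sketch below resolves an address to a function name. It assumes a 64-bit ELF target with <elf.h> available and that the caller has already located the symbol array, its length, and the string table (in a real process these come from the DT_SYMTAB and DT_STRTAB entries, with the symbol count derived from the hash table). symbolize is a hypothetical name, and for a position-independent image the load base would have to be subtracted from addr first.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: return the name of the defined function symbol
 * whose [st_value, st_value + st_size) range contains addr, or NULL.
 */
const char *symbolize(const Elf64_Sym *syms, size_t nsyms,
                      const char *strtab, uint64_t addr)
{
    assert(syms != NULL && strtab != NULL);

    for (size_t i = 0; i < nsyms; i++) {
        const Elf64_Sym *s = &syms[i];

        /* Skip undefined symbols and anything that is not a function. */
        if (s->st_shndx == SHN_UNDEF ||
            ELF64_ST_TYPE(s->st_info) != STT_FUNC)
            continue;
        if (addr >= s->st_value && addr - s->st_value < s->st_size)
            return strtab + s->st_name;
    }
    return NULL;
}
```

A linear scan is enough for a backtrace printer; a sorted copy of the table would be the obvious refinement if lookups were frequent.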
And if you decide to support static pie, you also need _DYNAMIC.
Are you sure about this? For static PIE, as I understand it, we are missing some kind of relocator in csu.
On the linker side, static pie is -static -pie --no-dynamic-linker. The -pie causes lld to create dynamic sections.
void
_start(char **ap, void (*cleanup)(void))
{
    int argc;
    char **argv;
    char **env;

    argc = *(long *)(void *)ap;
    argv = ap + 1;
    env = ap + 2 + argc;
    handle_argv(argc, argv, env);
    if (&_DYNAMIC != NULL) {
        atexit(cleanup);
    } else {
        process_irelocs();
        _init_tls();
    }
If FreeBSD used to differentiate dynamically/statically linked programs with &_DYNAMIC != NULL, I think it is probably time to revisit.
We have two ways to have dynamic sections in a statically linked program, --export-dynamic (https://reviews.llvm.org/rL295240) or -pie (static pie). If the distinction of dynamic/static is whether the program is interpreted by PT_INTERP, isn't testing this property directly more straightforward?
However, I'm pretty sure I saw some application level code use the check to determine if they are dynamically linked.
I think that is extremely rare, but they should also be updated to make static pie work.
Also I just had a look in llvm-project.git and it seems that compiler-rt/lib/sanitizer_common/sanitizer_linux.cc uses the following: level = &_DYNAMIC == nullptr ? AndroidDetectApiLevelStatic(). I think this may be a common way of checking for the presence of a dynamic linker so emitting it for static binaries could be problematic.
It is used by AndroidDetectApiLevelStatic(). If Android ever supports static pie, that piece of code has to change.
No, it is not. PT_INTERP requires the program (csu) to find the aux vector, parse it to find the binary base, and then parse the ELF header and program headers, all while the binary itself is not yet relocated. This is significant bloat for the crt and added complexity, which also means that every binary (not only static ones) now carries code which we cannot fix, because it is statically linked even into dynamic binaries.
Also, as I noted above, PT_INTERP absence does not really mean that the binary is static.
Regarding this patch: I don't think _DYNAMIC would be useful for -static-pie since we would need to relocate it first so _DYNAMIC could still be undefined and used to check for dynamic linker presence.
You need _DYNAMIC for -static-pie. You need _DYNAMIC to find DT_REL* tags, then perform relocations.
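Finding the DT_REL* tags amounts to walking the array that _DYNAMIC points at until DT_NULL. The sketch below shows this for the RELA flavor only; it assumes a 64-bit ELF target with <elf.h>, and struct rela_range and find_rela are hypothetical names. Applying the relocations afterwards is not shown.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type: where the RELA table is and how to step it. */
struct rela_range {
    uint64_t addr;    /* DT_RELA: link-time address of the table */
    uint64_t size;    /* DT_RELASZ: total size in bytes */
    uint64_t entsize; /* DT_RELAENT: size of one entry */
};

/* Returns 1 if all three tags were found, 0 otherwise. */
int find_rela(const Elf64_Dyn *dyn, struct rela_range *out)
{
    int seen = 0;

    assert(dyn != NULL && out != NULL);

    /* The dynamic array is terminated by a DT_NULL entry. */
    for (; dyn->d_tag != DT_NULL; dyn++) {
        switch (dyn->d_tag) {
        case DT_RELA:
            out->addr = dyn->d_un.d_ptr;
            seen |= 1;
            break;
        case DT_RELASZ:
            out->size = dyn->d_un.d_val;
            seen |= 2;
            break;
        case DT_RELAENT:
            out->entsize = dyn->d_un.d_val;
            seen |= 4;
            break;
        default:
            break;
        }
    }
    return seen == 7;
}
```

In a static PIE the address stored in DT_RELA is a link-time value, so the startup code still has to add the load base before dereferencing it.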
I had a quick look at glibc's static-pie code and it seems like glibc loads the first GOT entry for most architectures to get the address of the dynamic section.
It is one thing I don't like about glibc (I don't know of any other program that needs this; it is written in the x86 psABI, but probably not in other psABIs): the first entry of .got.plt or .got holds the link-time address of _DYNAMIC. At runtime, a pcrel load of _GLOBAL_OFFSET_TABLE_ gets the link-time address and subtracts it from the runtime address of _DYNAMIC, which gives the load base. glibc could parse the program headers to get p_vaddr of PT_DYNAMIC instead.
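The program-header alternative mentioned at the end could be sketched as follows. It assumes a 64-bit ELF target with <elf.h>, that the caller already knows the runtime address of the program header table (for example from AT_PHDR), and that an image without PT_PHDR is loaded at its link-time addresses; dynamic_runtime_addr is a hypothetical name.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: compute the runtime address of the dynamic
 * section from the program headers, or return 0 if there is no
 * PT_DYNAMIC. The load base is the difference between where the
 * program header table actually is and where PT_PHDR says it was
 * linked.
 */
uint64_t dynamic_runtime_addr(const Elf64_Phdr *phdr, size_t phnum,
                              uint64_t phdr_runtime_addr)
{
    uint64_t load_base = 0; /* assumption: no PT_PHDR means base 0 */
    const Elf64_Phdr *dynamic = NULL;

    assert(phdr != NULL);

    for (size_t i = 0; i < phnum; i++) {
        if (phdr[i].p_type == PT_PHDR)
            load_base = phdr_runtime_addr - phdr[i].p_vaddr;
        else if (phdr[i].p_type == PT_DYNAMIC)
            dynamic = &phdr[i];
    }
    return dynamic != NULL ? load_base + dynamic->p_vaddr : 0;
}
```

This needs no knowledge of the GOT layout, at the cost of a loop over the program headers before relocation has been done.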
Also, as I noted above, PT_INTERP absence does not really mean that the binary is static.
@kib I'm happy to know about FreeBSD crt internals but I'm afraid we've digressed from the topic, or I fail to understand how you justify this patch.
I tried not using the term dynamically/statically linked program (the file utility may say a -static-pie is "dynamic" while glibc ldd says it is "static"). I'm sorry if I failed to do that.
I'm not happy with some decisions made in the GNU toolchain: -static does not compose with -pie (it overrides -pie) so they added -static-pie, ld.bfd has default PT_INTERP path for various platforms so it needs --no-dynamic-linker, -static seems to override -export-dynamic, etc.
@ed's explanation of -export-dynamic makes sense to me. I don't understand why this patch wants to override part of the semantics of -export-dynamic with -static, making the overall logic harder to reason about.
A -static-pie needs _DYNAMIC. When FreeBSD gets to -static-pie, if the expectation is that it resembles more a -static than a dynamic (finalizers called in the exe, not in ld.so/libc.so), the simplistic check &_DYNAMIC != NULL will not work.
Or when you decide to support dlopen() from a static program, it also needs _DYNAMIC. If the current status (&_DYNAMIC != NULL is true) makes applications crash, you need to figure out a more reliable approach.
[Perhaps, pcrel load gets runtime address, not important.]
The content of the first GOT entry (&_DYNAMIC) is specified in the x86_64 psABI document. Moreover, the language there is explicit that this algorithm should be used to find the relocs. In fact it is quite a relief compared with the previous suggestion to find aux, then parse it to find the program headers, then find PT_INTERP and PT_DYNAMIC. It is two or three instructions in crt1 vs. several hundred, which must be linked into each binary.
@kib I'm happy to know about FreeBSD crt internals but I'm afraid we've digressed from the topic, or I fail to understand how you justify this patch.
If linker decisions make crt operation hard or impossible, it is important.
Or when you decide to support dlopen() from a static program, it also needs _DYNAMIC. If the current status (&_DYNAMIC != NULL is true) makes applications crash, you need to figure out a more reliable approach.
I am fine with whatever approach does not require me to link a significant portion of rtld into each binary just to detect the presence of rtld and libc in the process.
&_DYNAMIC == NULL is good enough from that point of view. If the new world order requires changes due to linker evolution, I am fine with that, but the method should be equally streamlined and boil down to a very simple runtime check, feasible to do in the pre-relocated state.
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=236165 was raised again recently.
I'll reiterate that I think lld's current behavior (after rL295240) is quite reasonable.
lld: has_dso || --shared || --pie || --export-dynamic
gold: has_dso || --shared || --pie
bfd: (--shared || --pie) || ((not -r) && info->nointerp && (info->export_dynamic || info->dynamic)) && some (almost always true) conditions
The following example demonstrates why ld.bfd's behavior is unreasonable.
% cat <<e >> a.s
.globl _start
_start:
e
% as a.s -o a.o
% ld.bfd a.o --export-dynamic -o a; readelf -S a | grep -c dynsym
0
% ld.bfd a.o --no-dynamic-linker --export-dynamic -o a; readelf -S a | grep -c dynsym
1
--no-dynamic-linker was invented to fix its historical mistake: hard-coded ld.so path (e.g. /lib/ld64.so.1 on x86-64). The fallout caused trouble to static pie (so a new -static-pie was invented).
But how? Again, finding and parsing phdrs in crt1 is not a solution. We need something comparable to the check for &_DYNAMIC != NULL in conciseness that simultaneously works for all three linkers. OK, I am fine even with having separate tests for lld and the GNU linkers, or-ed together.
Apparently this needs comment.