Mapping symbols are required by ARM/AArch64 ELF ABI. They help to disassemble files correctly and are also used in linkers. Nonetheless, for executable files, the symbols can be stripped to better resemble the behavior of GNU's objcopy.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
I have played with aarch64-linux-gnu-strip a bit. It does seem that whether mapping symbols are removed is related to ET_REL or not.
Does the patch improve compatibility? Seems so to me.
In --strip-unneeded mode, GNU strip retains mapping symbols for ET_REL but not for non-ET_REL.
They help to disassemble files correctly and are also used in linkers.
ld.lld doesn't have special rules for mapping symbols.
Disassembling is an interesting angle. But removing unneeded local symbols (e.g. internal linkage functions) can make the disassembly results worse, too.
It is unclear to me how to want to preserve some local symbols but not others.
That said, the documentation for --discard-locals (in llvm-objcopy and GNU objcopy) mentions compiler-generated local symbols, and I may put mapping symbols into this category as well, but it seems to need some clarification.
llvm/tools/llvm-objcopy/ELF/ELFObjcopy.cpp | ||
---|---|---|
231 | aaelf64 says All mapping symbols have type STT_NOTYPE and binding STB_LOCAL. The Sym.getShndx() == SHN_UNDEF condition shouldn't appear in practice but seems fine to keep (an undefined local is assuredly an error). |
From https://sourceware.org/binutils/docs/binutils/objcopy.html
-x --discard-all Do not copy non-global symbols from the source file. -X --discard-locals Do not copy compiler-generated local symbols. (These usually start with ‘L’ or ‘.’.) --strip-unneeded Remove all symbols that are not needed for relocation processing in addition to debugging symbols and sections stripped by --strip-debug.
Empirically with arm-none-eabi-objcopy from binutils 2.35 and aarch64-linux-gnu-objcopy from binutils 2.34 I get the same results:
filetype/option | --discard-locals | --discard-all | --strip-unneeded |
object | mapping symbols preserved | mapping symbols preserved | mapping symbols preserved |
executable | mapping symbols preserved | mapping symbols removed | all non dynamic symbols removed |
I would expect mapping symbols to be preserved in local objects when doing these transformations. They are required in AArch32 for the following things:
- Denoting Arm/Thumb changes mid-function (assembly) and constant pools (needed for disassembly)
- Linker patching (effectively linker needs to disassemble the code accurately)
- big-endian output (linker needs to endian revers instructions but not constant pools). This is not supported in LLD right now.
LLD uses mapping symbols in https://github.com/llvm/llvm-project/blob/main/lld/ELF/ARMErrataFix.cpp#L316 for linker patching.
In AArch64 in practice mapping symbols are only needed if someone has used constant pools in assembly language. In LLD we account for that case in linker patching for https://github.com/llvm/llvm-project/blob/main/lld/ELF/AArch64ErrataFix.cpp#L430
For executables they are only used for disassembly. I guess --discard-locals could still keep them based on the GNU documentation entry.
Not had a chance to look at the implementation here. I'm broadly in favour with matching the GNU tools behaviour.
The patch aims to improve the compatibility, right.
So, do we have a consensus? Or the patch has to be improved somehow?
llvm-nm -j --special-syms %t1.o | FileCheck %s --match-full-lines