This is an archive of the discontinued LLVM Phabricator instance.

[llvm-objcopy] Preserve ARM and AArch64 mapping symbols
ClosedPublic

Authored by ikudrin on Jan 13 2022, 9:01 AM.

Details

Summary

Mapping symbols are required by ARM/AArch64 ELF ABI. They help to disassemble files correctly and are also used in linkers. Nonetheless, for executable files, the symbols can be stripped to better resemble the behavior of GNU's objcopy.

Diff Detail

Event Timeline

ikudrin created this revision.Jan 13 2022, 9:01 AM
ikudrin requested review of this revision.Jan 13 2022, 9:01 AM

(@DavidSpickett @peter.smith)

I have played with aarch64-linux-gnu-strip a bit. It does seem that whether mapping symbols are removed is related to ET_REL or not.
Does the patch improve compatibility? Seems so to me.

In --strip-unneeded mode, GNU strip retains mapping symbols for ET_REL but not for non-ET_REL.

They help to disassemble files correctly and are also used in linkers.

ld.lld doesn't have special rules for mapping symbols.

Disassembling is an interesting angle. But removing unneeded local symbols (e.g. internal linkage functions) can make the disassembly results worse, too.
It is unclear to me how to want to preserve some local symbols but not others.
That said, the documentation for --discard-locals (in llvm-objcopy and GNU objcopy) mentions compiler-generated local symbols, and I may put mapping symbols into this category as well, but it seems to need some clarification.

llvm/tools/llvm-objcopy/ELF/ELFObjcopy.cpp
231

aaelf64 says All mapping symbols have type STT_NOTYPE and binding STB_LOCAL.

The Sym.getShndx() == SHN_UNDEF condition shouldn't appear in practice but seems fine to keep (an undefined local is assuredly an error).

From https://sourceware.org/binutils/docs/binutils/objcopy.html

-x
--discard-all
Do not copy non-global symbols from the source file.

-X
--discard-locals
Do not copy compiler-generated local symbols. (These usually start with ‘L’ or ‘.’.)
--strip-unneeded
Remove all symbols that are not needed for relocation processing in addition to debugging symbols and sections stripped by --strip-debug.

Empirically with arm-none-eabi-objcopy from binutils 2.35 and aarch64-linux-gnu-objcopy from binutils 2.34 I get the same results:

filetype/option--discard-locals--discard-all--strip-unneeded
objectmapping symbols preservedmapping symbols preservedmapping symbols preserved
executablemapping symbols preservedmapping symbols removedall non dynamic symbols removed

I would expect mapping symbols to be preserved in local objects when doing these transformations. They are required in AArch32 for the following things:

  • Denoting Arm/Thumb changes mid-function (assembly) and constant pools (needed for disassembly)
  • Linker patching (effectively linker needs to disassemble the code accurately)
  • big-endian output (linker needs to endian revers instructions but not constant pools). This is not supported in LLD right now.

LLD uses mapping symbols in https://github.com/llvm/llvm-project/blob/main/lld/ELF/ARMErrataFix.cpp#L316 for linker patching.

In AArch64 in practice mapping symbols are only needed if someone has used constant pools in assembly language. In LLD we account for that case in linker patching for https://github.com/llvm/llvm-project/blob/main/lld/ELF/AArch64ErrataFix.cpp#L430

For executables they are only used for disassembly. I guess --discard-locals could still keep them based on the GNU documentation entry.

Not had a chance to look at the implementation here. I'm broadly in favour with matching the GNU tools behaviour.

The patch aims to improve the compatibility, right.

So, do we have a consensus? Or the patch has to be improved somehow?

FWIW I'm happy with the proposed change.

MaskRay accepted this revision.Jan 18 2022, 1:14 PM
MaskRay added inline comments.
llvm/test/tools/llvm-objcopy/ELF/strip-unneeded-arm.test
5

llvm-nm -j --special-syms %t1.o | FileCheck %s --match-full-lines

llvm/tools/llvm-objcopy/ELF/ELFObjcopy.cpp
256
This revision is now accepted and ready to land.Jan 18 2022, 1:14 PM
ikudrin marked 2 inline comments as done.Jan 18 2022, 11:24 PM

Thanks for the suggestions!

This revision was automatically updated to reflect the committed changes.