This is an archive of the discontinued LLVM Phabricator instance.

[LLD][ARM] ARM IRELATIVE relocations are in rel.dyn
Abandoned (Public)

Authored by peter.smith on Oct 27 2016, 5:31 AM.

Details

Reviewers
ruiu
rafael
Summary

The existing code for IRELATIVE relocations in ARM uses the built-in support by setting Target->IRelativeRel to R_ARM_IRELATIVE.
Unfortunately, when trying to run an lld-produced executable with an IRELATIVE relocation using the glibc dynamic loader ld-linux-armhf.so.3, I get an "unexpected PLT reloc type" error message.

Looking into sysdeps/arm/dl-machine.h, it seems that R_ARM_IRELATIVE is only supported for relocations in .rel.dyn, not for those in .rel.plt. This is probably a legacy of ARM having a single .got section.

This change does the following for ARM targets:

  • Puts the got entry used by the IRELATIVE relocation into .got rather than .got.plt
  • Puts the IRELATIVE relocation into .rel.dyn rather than .rel.plt
  • Defines __rel_iplt_start and __rel_iplt_end around .rel.dyn rather than .rel.plt
  • Always adds .rel.dyn when there are IRELATIVE relocations even when static linking. This follows the same pattern as .rel.plt
  • Passes the GotVA of the symbol to the PLT generation code rather than the GotPltVA for an ARM IRelative relocation
  • If there is no .got.plt section (static linking) then use a dummy zero address for the PLT header as there is no .got.plt address.
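
The first two bullets amount to a per-target choice of where the GOT entry and the IRELATIVE relocation live. A minimal sketch of that choice follows; the names Arch, IRelativePlacement and placeIRelative are hypothetical and are not lld's real data structures. It assumes the usual REL/RELA split (ARM and i386 use REL, AArch64 and x86_64 use RELA).

```cpp
#include <string>

// Hypothetical model of the placement decision; not lld's real API.
enum class Arch { ARM, AArch64, X86_64, I386 };

struct IRelativePlacement {
  std::string GotSection;   // section holding the GOT entry for the ifunc
  std::string RelocSection; // section holding the IRELATIVE relocation
};

// Returns where an IRELATIVE relocation and its GOT entry would be placed.
// ARM is the special case described in this patch; the other targets keep
// the existing .got.plt / .rel(a).plt placement.
IRelativePlacement placeIRelative(Arch A) {
  switch (A) {
  case Arch::ARM:
    return {".got", ".rel.dyn"};
  case Arch::AArch64:
  case Arch::X86_64:
    return {".got.plt", ".rela.plt"};
  case Arch::I386:
    return {".got.plt", ".rel.plt"};
  }
  return {}; // unreachable for valid Arch values
}
```

Keeping the decision in one function is one way the special cases could be abstracted away from ARM, in the spirit of the Target->IRelInPLT idea raised in the review questions.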

Review questions:

  • This patch is a bit messy, but I've not got many ideas that would make it much cleaner. It is possible that some of the tests could be moved into functions and they could be abstracted away from ARM, for example if (Target->IRelInPLT) ...
  • On ARM GNU ld can use an auxiliary PLT OutputSection called .iplt exclusively for ifuncs. Following this implementation choice might make the implementation a little cleaner in some places but it would probably mean more code-changes overall.
  • It could be possible to not output the PLT header if there are only IRELATIVE entries; this would mean changing some code that assumes there is always a PLT header.

Unfortunately the static glibc has an IRELATIVE relocation for memcpy on ARM so anyone trying to do static linking will hit the error message.

dl-machine.h reference:
https://fossies.org/dox/glibc-2.24/arm_2dl-machine_8h_source.html
The .rel.plt code is elf_machine_lazy_rel() at the bottom of the file. The .rel.dyn code is elf_machine_rela().

x86_64, x86_32 and AArch64 have IRELATIVE support in both functions; ppc64, like ARM, only has support in .rel.dyn. The lld support for ppc doesn't support IRELATIVE, so there isn't anything broken there.

Event Timeline

peter.smith retitled this revision from to [LLD][ARM] ARM IRELATIVE relocations are in rel.dyn.
peter.smith updated this object.
peter.smith added reviewers: ruiu, rafael.
peter.smith added a subscriber: llvm-commits.
ruiu edited edge metadata. Oct 28 2016, 3:41 PM

Hmm, it is indeed not your fault, but it is a bit messy because the problem it is handling is an arbitrary ABI limitation. (Is there any effort going on to fix it on the loader side?)

I'm wondering if it can be solved by merging .got.plt into .got. Once the conversion of linker-synthesized sections is done, you'll get .got.plt as an input section, so you can put it into any output section, even into .got. Does it solve the issue?

As far as I know there isn't any existing effort to fix the loader, as the loader will happily work with gold and bfd as they are today. Even if the loader is modified, there will be a delay until Linux distributions pick it up, as the glibc release cycle can be quite long (ld.so is part of glibc) and many existing installations will remain on the old library. I think it is worth at least reporting to glibc, but practically speaking I think we'll have to follow the existing conventions.

I think merging the .got and .got.plt solves one of the cases, whether to put the .got entry into .got.plt or .got, but it doesn't solve the problem that the relocation will need to go into .rel.dyn and not .rel.plt, and that is the majority of the mess. Unfortunately we can't merge .rel.plt and .rel.dyn so most of the special cases will still exist.

I can have a go at adding some helper function to see if I can clear up the main code path.

Turning the problem on its head, I did consider whether it would be worth making all the other Targets use .rel.dyn rather than .rel.plt; this would unify the implementations using the ARM way. However, it seems that other linkers such as ld don't do this on Targets like x86 and try to avoid doing so where possible, so I'm reluctant to do this given how poorly specified all this is.

As an aside, while looking into ifunc support in other linkers I came across https://sourceware.org/bugzilla/show_bug.cgi?id=13302, which makes sure IRELATIVE relocations are always sorted to appear after non-IRELATIVE relocations. This avoids corner cases where the IFUNC resolver refers to a PLT entry that hasn't been relocated yet. I don't think that lld does this at the moment, so we would presumably be vulnerable to the same corner case.
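
The ordering described in that bug report can be sketched as a stable partition over the dynamic relocations. This is only an illustration: DynReloc and moveIRelativeLast are hypothetical names, not lld or binutils code, and a real linker may impose other ordering constraints that are ignored here. The relocation type numbers are the ARM ELF values.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// ARM dynamic relocation type numbers.
constexpr uint32_t kArmJumpSlot = 22;   // R_ARM_JUMP_SLOT
constexpr uint32_t kArmRelative = 23;   // R_ARM_RELATIVE
constexpr uint32_t kArmIRelative = 160; // R_ARM_IRELATIVE

// Hypothetical, simplified dynamic relocation record.
struct DynReloc {
  uint32_t Type;
  uint64_t Offset;
};

// Moves every IRELATIVE relocation behind all other relocations while
// preserving the relative order inside each group, so that an ifunc
// resolver only runs after the ordinary relocations have been applied.
void moveIRelativeLast(std::vector<DynReloc> &Relocs) {
  std::stable_partition(
      Relocs.begin(), Relocs.end(),
      [](const DynReloc &R) { return R.Type != kArmIRelative; });
}
```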

On reflection, I think that it could be possible to clean this up if the .rel.dyn and .rel.plt OutputSections are rewritten in terms of InputSections. The linker in Relocation.cpp would create the R_<TARGET>_IRELATIVE relocations in an InputSection called .rel.iplt or .rela.iplt. On targets such as AArch64 and X86, .rela.iplt would be allocated to the .rela.plt OutputSection; on ARM and PPC (if the PPC lld Target supported it), .rel.iplt would go into the .rel.dyn OutputSection.

The PLT and GOT + Relocs as InputSection may be some way off, is there any chance of tidying the code up enough or does it need to wait for the InputSection work to complete? The static glibc uses IFunc for memcpy which is used during initialisation so it will block all static linking on ARM.

ruiu added a comment. Oct 31 2016, 1:17 PM

In an attempt to understand the problem better, I tried to read GNU libc's code, but couldn't even figure out where it distinguishes .rel.dyn from .rel.plt. Could you point it out for me?

I'm no glibc expert, unfortunately, but to the best of my knowledge from grepping through the code, a good place to start is do-rel.h in the elf sub-directory of glibc:
https://fossies.org/dox/glibc-2.24/do-rel_8h_source.html
This has elf_dynamic_do_Rel which is the code that iterates over the Relocations in .rel.plt or .rel.dyn.

Another useful file is dynamic-link.h also in the elf sub-directory https://fossies.org/dox/glibc-2.24/dynamic-link_8h_source.html
This contains the _ELF_DYNAMIC_DO_RELOC which gets the range information for .rel.plt and .rel.dyn from the .dynamic segment/section tags such as DT_REL and DT_JMPREL. The appropriate REL or RELA version of elf_dynamic_do_Rel is called from here with the relocation information from either .rel.plt or .rel.dyn (Line 160).
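
As a rough illustration of how the two ranges are told apart, the sketch below scans a simplified dynamic array for the REL-flavoured tags. It is not glibc code: DynEntry, RelocRanges and findRelocRanges are hypothetical names, and only the tag values are taken from the ELF specification.

```cpp
#include <cstdint>
#include <vector>

// Dynamic tag values from the ELF gABI.
enum DynTag : int64_t {
  TagNull = 0,     // DT_NULL, terminates the dynamic array
  TagPltRelSz = 2, // DT_PLTRELSZ, size of the PLT relocations
  TagRel = 17,     // DT_REL, address of the .rel.dyn relocations
  TagRelSz = 18,   // DT_RELSZ, size of the .rel.dyn relocations
  TagJmpRel = 23,  // DT_JMPREL, address of the PLT relocations
};

// Hypothetical, simplified stand-ins for the loader's view of .dynamic.
struct DynEntry {
  int64_t Tag;
  uint64_t Val;
};

struct RelocRange {
  uint64_t Start = 0;
  uint64_t Size = 0;
};

struct RelocRanges {
  RelocRange Dyn; // described by DT_REL / DT_RELSZ (.rel.dyn)
  RelocRange Plt; // described by DT_JMPREL / DT_PLTRELSZ (.rel.plt)
};

// Collects the two relocation ranges from the dynamic array.
RelocRanges findRelocRanges(const std::vector<DynEntry> &Dynamic) {
  RelocRanges R;
  for (const DynEntry &E : Dynamic) {
    if (E.Tag == TagNull)
      break;
    switch (E.Tag) {
    case TagRel:      R.Dyn.Start = E.Val; break;
    case TagRelSz:    R.Dyn.Size = E.Val;  break;
    case TagJmpRel:   R.Plt.Start = E.Val; break;
    case TagPltRelSz: R.Plt.Size = E.Val;  break;
    default:          break; // other tags are irrelevant here
    }
  }
  return R;
}
```

The loader never looks at section names; a relocation counts as a ".rel.plt" relocation only because it falls inside the DT_JMPREL range, which is why the linker's choice of output section matters.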

The definitions of elf_machine_lazy_rel (PLT) and elf_machine_rel (GOT) are in sysdeps/<TARGET>/dl-machine.h

At this stage elf_machine_lazy_rel won't do the relocation for relocations such as R_*_JUMP_SLOT, but will instead set up the .got.plt to point to the resolver. Interestingly for R_*_IRELATIVE the ifunc resolver is called so there is no lazy evaluation of the resolution.
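
The difference between the two styles of lazy handler can be modelled as below. This is a simplified sketch of the behaviour described in this thread, not glibc's code: the function names, the Outcome enum and the use of an exception in place of the loader's fatal error are all assumptions made for illustration.

```cpp
#include <cstdint>
#include <stdexcept>

constexpr uint32_t kJumpSlot = 22;   // R_ARM_JUMP_SLOT
constexpr uint32_t kIRelative = 160; // R_ARM_IRELATIVE

using Resolver = uintptr_t (*)(); // an ifunc resolver returns the target address

enum class Outcome { DeferredToLazyResolver, ResolvedEagerly };

// Model of an ARM-style lazy handler: it only knows about JUMP_SLOT, so an
// IRELATIVE relocation placed in .rel.plt is rejected.
Outcome lazyRelArmStyle(uint32_t Type) {
  if (Type == kJumpSlot)
    return Outcome::DeferredToLazyResolver; // .got.plt keeps pointing at the resolver stub
  throw std::runtime_error("unexpected PLT reloc type");
}

// Model of a lazy handler that also accepts IRELATIVE: the ifunc resolver is
// called immediately, so there is no lazy evaluation for that entry.
Outcome lazyRelWithIRelative(uint32_t Type, uintptr_t &GotSlot, Resolver R) {
  if (Type == kJumpSlot)
    return Outcome::DeferredToLazyResolver;
  if (Type == kIRelative) {
    GotSlot = R();
    return Outcome::ResolvedEagerly;
  }
  throw std::runtime_error("unexpected PLT reloc type");
}
```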

Hope this is of some use.

emaste added a subscriber: emaste. Nov 30 2016, 11:54 AM
peter.smith abandoned this revision. Jun 22 2017, 7:57 AM

This is no longer required; it was implemented as part of D27406.