This is an archive of the discontinued LLVM Phabricator instance.

[LLD][ARM][AArch64] ARM and AArch64 undefined weak reference values for relative relocations
ClosedPublic

Authored by peter.smith on Nov 2 2016, 5:16 AM.

Details

Summary

The ARM 32 and 64-bit ABI does not use 0 for undefined weak references that are used in PC relative relocations. In particular:

  • A branch relocation to an undefined weak resolves to the next instruction.
  • In all other cases the symbol resolves to the place so that S + A - P resolves to A.

The branch relocation resolving to the next instruction is the most important as it is a common idiom in statically linked ARM and AArch64 code to use branches to weak references to initialization functions of optional parts of the program. More importantly it can also cause spurious relocation out of range errors as the (S + A) - P of the branch relocation evaluates to (0 + A) - P which is often a large negative number.

References:

Diff Detail

Event Timeline

peter.smith updated this revision to Diff 76692.Nov 2 2016, 5:16 AM
peter.smith retitled this revision from to [LLD][ARM][AArch64] ARM and AArch64 undefined weak reference values for relative relocations.
peter.smith updated this object.
peter.smith added reviewers: ruiu, rafael.
peter.smith added a subscriber: llvm-commits.
ruiu added inline comments.Nov 2 2016, 12:09 PM
ELF/InputSection.cpp
261–264

You can remove template <class ELFT> and use uint32_t instad of ELFT::uint or uint64_t, because we know it is ARM32.

284–287

Ditto. ELF::uint -> uint64_t.

416

Doesn't isWeak() imply !isLocal()? I assume that all weak symbols are not weak.

421

nit: Since the last one ends with return, remove else.

428

Ditto.

peter.smith updated this revision to Diff 76852.Nov 3 2016, 6:22 AM

Thank you for the comments. I've update the diff with all but the comment about Weak and Local. I've put an inline comment about the assertion in Symbols.h SymbolBody::symbol() that prevents !IsLocal() from being removed.

ruiu edited edge metadata.Nov 3 2016, 11:37 AM

I think you forgot to include Symbols.h into this patch.

ELF/InputSection.cpp
261

uint64_t -> uint32_t?

peter.smith edited edge metadata.

I've made the suggested change to elf32_t.

Apologies I'm a bit confused about Symbols.h as I haven't made any changes to it, I was just pointing out that the existing implementation of symbol has an assert for !isLocal(). It is possible to put a test for isWeak() into SymbolBody, that returns false if the SymbolBody is a local and only forwards on to Symbol if it isn't. Would you like me to add that?

Existing implementation:
inline Symbol *SymbolBody::symbol() {

assert(!isLocal());
return reinterpret_cast<Symbol *>(reinterpret_cast<char *>(this) -
                                  offsetof(Symbol, Body));

}

ruiu accepted this revision.Nov 8 2016, 1:49 PM
ruiu edited edge metadata.

LGTM

This revision is now accepted and ready to land.Nov 8 2016, 1:49 PM
This revision was automatically updated to reflect the committed changes.
grimar added a subscriber: grimar.Aug 2 2018, 6:35 AM
grimar added inline comments.
lld/trunk/ELF/InputSection.cpp
266 ↗(On Diff #77327)

I just noticed that we have now the next code here:

case R_ARM_THM_JUMP11:
  return P + 2 + A;

And it seems it is uncovered by any test case we have now, in 2018.
Reporting just in case. (I do not know ARM enough to add a test, unfortunately)

peter.smith added inline comments.Aug 2 2018, 7:19 AM
lld/trunk/ELF/InputSection.cpp
266 ↗(On Diff #77327)

I should be able to come up with something. In theory it would need something like:

.thumb
b.n undefined_weak_symbol

There is the possibility that llvm-mc relaxation may decide to intervene and widen the branch though. I'll see what I can come up with.

I've proposed a test case in https://reviews.llvm.org/D50234 it does need a binary file as llvm-mc does indeed relax the b.n into a b.w.