This is an archive of the discontinued LLVM Phabricator instance.

[ELF] Fix link failure with Android compressed relocation support.
ClosedPublic

Authored by efriedma on Oct 8 2018, 3:58 PM.

Details

Summary

Android uses a compressed relocation format, which means the size of the relocation section isn't predictable based on the number of relocations, and can vary if the layout changes in any way. To deal with this, the linker normally runs multiple passes until the layout converges.

The layout should converge if the size of the compressed relocation section increases monotonically: if the size of an encoded offset increases by one byte, the larget value which can be encoded is multiplied by 128, so the representable offsets grow much faster than the size of the section itself.

The problem here is that there is no code to ensure the size of the section doesn't decrease. If the size of the relocation section decreases, the relative offsets can increase due to alignment restrictions, so that can force the size of the relocation section to increase again. The end result is an infinite loop; the loop gets cut off after 10 iterations with the message "thunk creation not converged".

To avoid this issue, this patch adds padding to the end of the relocation section if its size would decrease. I think the extra padding is harmless because of the way the format is defined: decoding stops after it reaches the number of relocations specified in the section's header. Not completely sure about that, though.

Diff Detail

Repository
rLLD LLVM Linker

Event Timeline

efriedma created this revision.Oct 8 2018, 3:58 PM
ruiu added a comment.Oct 8 2018, 4:12 PM

How did you find it? If you find it in the wild, how large is the padding you added to workaround the issue for your program?

Found as a build failure testing a new compiler on Android.

For the library that failed to build, this patch adds one byte "extra" padding to rela.dyn.

Adding zero padding to the end of the SHT_ANDROID_REL[A] section shouldn't be a problem. It looks like the original Android relocation_packer tool left up to a page of zero bytes at the end of the packed section, and the decoders in Bionic and LLVM would ignore the trailing bytes. relocation_packer doesn't loop stabilizing SLEB sizes, but I think it didn't need to. When it shrunk a .rel[a].dyn section, it apparently only shrunk the section by a multiple of the page size. It left the virtual addresses of the sections following .rel[a].dyn unchanged, and I don't think it adjusted the offset/addend values of the relocations.

FWIW: I fixed a similar issue in the LLVM assembler with the exception table: https://reviews.llvm.org/D42722

I wonder if a non-contrived input could require more than 10 rounds to converge. It seems unlikely to happen, at least.

This change seems fine to me.

test/ELF/pack-dyn-relocs-loop.s
13

s/abd/and/ ?

FWIW: I fixed a similar issue in the LLVM assembler with the exception table: https://reviews.llvm.org/D42722

I wonder if a non-contrived input could require more than 10 rounds to converge. It seems unlikely to happen, at least.

This change seems fine to me.

I think that this change makes sense as well. The Thunks code has to be careful with Short branch selection in order to not oscillate between short and long branches. One thing that may be worth doing (not necessarily in this patch) is to make the error message less specific to Thunks, and potentially extract it from the Thunks code. At the moment all the Targets supporting packed relocations also support Thunks but this might always be the case.

I had originally chosen 10 for the pass limit for Thunks based on the relative sizes of the Thunks and the Thumb 1 branch range, and the practical experience of never seeing a working program go past 5 passes. Anything that had reached 10 had always been a bug. I suspect that a similar calculation could be done for the relocation packer if we were worried that 10 might not be enough.

ruiu added a comment.Oct 9 2018, 6:33 AM

Looking good, but I'd like to wait for pcc's response for a few days if he's okay to have a padding at the end of the section.

ruiu accepted this revision.Oct 10 2018, 11:26 AM

LGTM

Please submit.

This revision is now accepted and ready to land.Oct 10 2018, 11:26 AM
This revision was automatically updated to reflect the committed changes.