This is an archive of the discontinued LLVM Phabricator instance.

ELF: Add support for emitting dynamic relocations in the Android relocation packing format.
ClosedPublic

Authored by pcc on Oct 20 2017, 6:45 PM.

Details

Summary

The Android relocation packing format is a more compact
format for dynamic relocations in executables and DSOs
that is based on delta encoding and SLEBs. An overview
of the format can be found in the Android source code:
https://android.googlesource.com/platform/bionic/+/refs/heads/master/tools/relocation_packer/src/delta_encoder.h

This patch implements relocation packing using that format.

This implementation uses a more intelligent algorithm for compressing
relative relocations than Android's own relocation packer. As a
result it can generally create smaller relocation sections than
that packer. If I link Chromium for Android targeting ARM32 I get a
.rel.dyn of size 174693 bytes, as compared to 371832 bytes with gold
and the Android packer.

Depends on D34692

Diff Detail

Repository
rL LLVM

Event Timeline

pcc created this revision.Oct 20 2017, 6:45 PM
peter.smith edited edge metadata.Oct 22 2017, 1:38 AM

I think that this is worthwhile putting into the linker as relocations can be sorted to optimise packing and make it easier to account for address changes to the rest of the ELF file.

Some overall thoughts and suggestions:

  • Is it likely that other packing formats may come along? In this case we may be better to have an interface like --relocation-packing=(none | android | ... )
  • It would be good to have a tool to expand and display textually the packed relocations, for example llvm-objdump -r could naturally display the packed relocation sections as if they were normal relocations.
  • Is it worth adding an option to sort the relocations for packing, but not do the packing. This would make it easier to write tests, for example we could check that the relocations were sorted in the order that we would expect, with a suitable tool to unpack the encoded form into human readable text we could also check the encoding process. A secondary benefit would be that people using an external relocation packer could still benefit from the sorting.
lld/ELF/Options.td
15 ↗(On Diff #119733)

Would it be better to have --pack-dyn-relocs=<packing format>. Where we have none or Android as the initial parameters. I'm just thinking of other OSs or even Android having more than one format.

lld/ELF/SyntheticSections.cpp
1109 ↗(On Diff #119733)

Just checking to see if the new line was intentional? I've no objections to adding it.

1246 ↗(On Diff #119733)

I think that this is converting DynamicReloc to an ELFT::Rela? Would it be better to use a more specific name?

1251 ↗(On Diff #119733)

I don't think that this comment reads well and I'm not sure I understand it? I know it is just a move from another section but is it worth fixing it up?

1319 ↗(On Diff #119733)

In the final version I think it will be worth expanding the necessary details inline. I'm thinking of the stability of the link here. It also doesn't mention the LEB encoding either.

1330 ↗(On Diff #119733)

I guess this is some kind of header?

1340 ↗(On Diff #119733)

Is it worth using similarly structured names for the two partitions, for example RelativeRelocs and NonRelativeRelocs?

1353 ↗(On Diff #119733)

Is it worth moving the sort to just before the OtherRelocs are processed (near the end of the function)?

1355 ↗(On Diff #119733)

The -zcombreloc sorting order tries to order relocations by the destination symbol first so that the dynamic loader (at least on Linux) can reuse the symbol lookup result. Is there a trade off between packing efficiency and load time here? This may be handled differently by the Android dynamic linker?

1364 ↗(On Diff #119733)

Is it worth having a similarly structured name such as UngroupedRelatives and GroupedRelatives?

1369 ↗(On Diff #119733)

I found rewriting this as *I - *(I - 1) == Config->Wordsize made it easier to understand, but this may not be universal.

1375 ↗(On Diff #119733)

Should J have a type of size_t?

lld/ELF/SyntheticSections.h
405 ↗(On Diff #119733)

Is it worth having Packed as part of the name, either PackedRelocationSection or the slightly overlong AndroidPackedRelocationSection. I think it will make it more obvious what the section does from the name.

lld/ELF/Writer.cpp
1381 ↗(On Diff #119733)

Do we need a call to Script->assignAddresses() before updateAllocSize()? I'm thinking of the case where createThunks() affects the r_offsets in RelaDyn->updateAllocSize() that affects the size of RelaDyn, that affects the addresses in Thunks. I think that for correctness this all works out as we can't exit the loop unless there are no changes, but it may take longer to converge in some situations.

llvm/include/llvm/BinaryFormat/ELF.h
733 ↗(On Diff #119733)

Would be good to have a reference to where these are defined (public specification?), similarly for the DT tags.

pcc added a comment.Oct 23 2017, 2:14 PM

I think that this is worthwhile putting into the linker as relocations can be sorted to optimise packing and make it easier to account for address changes to the rest of the ELF file.

Some overall thoughts and suggestions:

  • Is it likely that other packing formats may come along? In this case we may be better to have an interface like --relocation-packing=(none | android | ... )

I think that makes sense. There's been some talk of a new packing format on the gnu-gabi lists, so it would be good to be able to accommodate whatever comes out of that.

  • It would be good to have a tool to expand and display textually the packed relocations, for example llvm-objdump -r could naturally display the packed relocation sections as if they were normal relocations.

Right, I was planning to teach llvm-objdump about this format as part of writing the test cases for this patch.

  • Is it worth adding an option to sort the relocations for packing, but not do the packing. This would make it easier to write tests, for example we could check that the relocations were sorted in the order that we would expect, with a suitable tool to unpack the encoded form into human readable text we could also check the encoding process. A secondary benefit would be that people using an external relocation packer could still benefit from the sorting.

It doesn't really seem worth it to me, as long as we have llvm-objdump support for testing purposes. I also don't see much value in making the output compatible with external relocation packers, as that cannot be done soundly in general.

lld/ELF/Options.td
15 ↗(On Diff #119733)

Will do.

lld/ELF/SyntheticSections.cpp
1109 ↗(On Diff #119733)

No, I'll remove it.

1246 ↗(On Diff #119733)

encodeDynamicReloc maybe?

1251 ↗(On Diff #119733)

I don't understand it either :)

I'll see if I can figure out what it is trying to say and fix it.

1319 ↗(On Diff #119733)

That makes sense. I'll try to write something up.

1330 ↗(On Diff #119733)

Right.

1340 ↗(On Diff #119733)

Makes sense, I'll do that.

1353 ↗(On Diff #119733)

Sounds fine to me.

1355 ↗(On Diff #119733)

The Android dynamic loader doesn't appear to have any optimization for this case. It looks like it just performs a symbol lookup for each relocation.

https://android.googlesource.com/platform/bionic/+/refs/heads/master/linker/linker.cpp#2507

1364 ↗(On Diff #119733)

I kind of prefer the name RelativeGroups over GroupedRelatives because the latter implies that the vector contains relocations as opposed to groups, when in fact it contains groups. But I don't feel too strongly about it.

1369 ↗(On Diff #119733)

I found the form I wrote it in easier to understand, but again don't feel too strongly about it.

1375 ↗(On Diff #119733)

It probably doesn't matter because J can never be larger than 7.

lld/ELF/SyntheticSections.h
405 ↗(On Diff #119733)

AndroidPacked seems fine to me.

lld/ELF/Writer.cpp
1381 ↗(On Diff #119733)

That wouldn't seem to be a concern for REL (because of delta encoding, the section (in general) will only contain relative offsets within the r/w segment (with one exception for the initial delta into the r/w segment), and r/w won't contain anything whose size will change, so I'd expect there to be at most two convergent steps), but it may be a concern for RELA (because it will also end up containing deltas within text and other sections). I'll have to experiment with this once I have RELA implemented.

llvm/include/llvm/BinaryFormat/ELF.h
733 ↗(On Diff #119733)

I haven't seen them formally defined anywhere. Probably the best we can do is link to the Android source code.

pcc updated this revision to Diff 119994.Oct 23 2017, 7:04 PM
pcc marked 9 inline comments as done.
  • Address review comments
pcc marked an inline comment as done.Oct 23 2017, 7:05 PM

Thanks for the update, I don't have any particularly strong views about the detailed comments so unless anyone else strongly agrees, I'm happy to keep with what you have. I don't have any further suggestions at the moment.

grimar added a subscriber: grimar.Oct 25 2017, 5:37 AM
pcc updated this revision to Diff 120307.Oct 25 2017, 1:41 PM
  • Add rela support
  • Add test
  • Use correct sh_link value in section header
pcc retitled this revision from [wip] ELF: Add support for emitting dynamic relocations in the Android relocation packing format. to ELF: Add support for emitting dynamic relocations in the Android relocation packing format..Oct 25 2017, 1:43 PM
pcc edited the summary of this revision. (Show Details)
ruiu added inline comments.Oct 25 2017, 5:27 PM
lld/ELF/Config.h
167 ↗(On Diff #120307)

Can this just be a boolean flag AndroidPackDynRelocs?

lld/ELF/SyntheticSections.cpp
1245 ↗(On Diff #120307)

I first thought that "*" is to emphasize a word and a closing "*" were missing.

1357 ↗(On Diff #120307)

I'd also mention that how many bytes could be saved by this encoding compared to the default dynamic relocation table to give an insight why this format was invented and is used.

1379 ↗(On Diff #120307)

I believe you can use Rel.Type insetad of R.getType(Config->IsMips64EL).

1486 ↗(On Diff #120307)

This needs comment.

lld/ELF/SyntheticSections.h
415 ↗(On Diff #120307)

std::copy should work, but for consistency with other writeTo functions, memcpy would be better.

lld/ELF/Writer.cpp
373–374 ↗(On Diff #120307)

What's wrong with doing this for packed relocations?

pcc updated this revision to Diff 120351.Oct 25 2017, 7:28 PM
pcc marked 5 inline comments as done.
  • Address review comments
lld/ELF/Config.h
167 ↗(On Diff #120307)

It can for now, I suppose.

lld/ELF/SyntheticSections.cpp
1245 ↗(On Diff #120307)

Note that this is just moving code around (see line 1287 in the old code). But seems reasonable to change it to R_*_RELATIVE.

1379 ↗(On Diff #120307)

I don't think that will work because the type is stored in the r_info field.

lld/ELF/Writer.cpp
373–374 ↗(On Diff #120307)

That would cause a section type mismatch. I've added a comment with a few more details.

ruiu added inline comments.Oct 25 2017, 7:41 PM
lld/ELF/SyntheticSections.cpp
1342 ↗(On Diff #120351)

That is huge savings! I'd put commas to the numbers because at first it looked like a reduction from 29 to 17. :)

pcc updated this revision to Diff 120353.Oct 25 2017, 7:54 PM
pcc marked an inline comment as done.
  • Add commas

Minor nits.

lld/ELF/Driver.cpp
768 ↗(On Diff #120353)

Since you always initialize Config->AndroidPackDynRelocs to false, this could be:

StringRef S = Arg->getValue();
if (S == "android")
  Config->AndroidPackDynRelocs = true;
if (S != "none")
  error("unknown -pack-dyn-relocs format: " + S);
lld/ELF/SyntheticSections.cpp
1378 ↗(On Diff #120353)

Perhaps instead of clear and push_backs better would be to

RelocData = {'A', 'P', 'S', '2'};

?

ruiu accepted this revision.Oct 26 2017, 2:16 PM

LGTM

This revision is now accepted and ready to land.Oct 26 2017, 2:16 PM
pcc updated this revision to Diff 120499.Oct 26 2017, 3:34 PM
pcc marked 2 inline comments as done.
  • Simplify
In D39152#908640, @pcc wrote:
  • Simplify

Thanks !

This revision was automatically updated to reflect the committed changes.