This is an archive of the discontinued LLVM Phabricator instance.

Cache getSymVA
Needs ReviewPublic

Authored by espindola on Apr 18 2018, 6:20 PM.

Details

Summary

It is common for getSymVA to be called at least twice for each symbol. Once for the symbol table, once for a relocation using it.

The operation is fairly expensive, so it is probably worth caching it (I will attach the benchmark results).

One issue with this is that with thunks getVA is called multiple times and produces different results as thunks are added. It is easy to just disable this for architectures that need thunks, but I am curious if someone more familiar with thunks thinks we can do better.

BTW, we should probably add testcase with thunks to the benchmarks we use. Peter, do you have a chromium build for aarch64 we could use?

Diff Detail

Event Timeline

espindola created this revision.Apr 18 2018, 6:20 PM
pcc added a comment.Apr 18 2018, 7:08 PM

BTW, we should probably add testcase with thunks to the benchmarks we use. Peter, do you have a chromium build for aarch64 we could use?

The AArch64 version of chromium isn't large enough to need thunks. The ARM32 version has plenty of thunks, though.

For a build you should be able to follow the instructions here:
https://chromium.googlesource.com/chromium/src/+/master/docs/android_build_instructions.md
If you use the GN flags target_os="android" is_official_build=true and build the target libmonochrome.so that should give you something reasonably close to what we ship.

If that's too much trouble I can try to create a build myself but it will take a while.

As well as Thunks and errata patching, I'm guessing that any future generic (not just Android) uses of packed dynamic relocs might also need symbol addresses to be recalculated (https://sourceware.org/ml/gnu-gabi/2017-q2/msg00000.html)

Some thoughts:

  • Rather than disabling for the architectures that need thunks, I think it would be better to invalidate the cache after content was changed as this might help prevent bugs if other features are added later that change addresses.
  • Adding new content can only change the address of defined non-absolute symbols, it might be possible to cache everything else but these ones. Whether this makes a significant difference in link time or not I don't know.
  • assignAddresses() could update the symbol cache instead of invalidating it, there would only be a displacement for defined non-absolute symbols.

I guess we'll need to benchmark to make sure if any of those are worth the additional complexity.