Extend lowering to support POT-indirect address computation.
When pagerando is enabled, all calls between different bins (executable
segments) must be lowered using page offset table (POT) indirect address
computation. This patch adds that indirection to ARMISelLowering. The
ARMPagerandoOptimizer pass then optimizes intra-bin calls back to direct calls,
since references within the same bin are guaranteed to have a fixed PC-relative
offset.
For example, the direct call to foo:

  BL <ga:@foo>

will be selected as the following, assuming foo is placed in a randomly located
bin:

  cp#0: foo(potoff)
  cp#1: foo(binoff)

  %foo_potoff = LDRi12 <cp#0>, 0
  %foo_bin_addr = LDRrs %R9, %foo_potoff
  %foo_binoff = LDRi12 <cp#1>, 0
  %foo_addr = ADDrr %foo_bin_addr, %foo_binoff
  BLX %foo_addr

Constant pool entry #0 is the offset into the POT of the entry for the bin
containing foo, and entry #1 is the offset of foo from the beginning of that
bin. Together these constant pool entries allow us to index the POT and add an
offset to compute the dynamic address of foo.
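
The arithmetic performed by that sequence can be modelled in a few lines of C++. This is only an illustrative sketch, not code from the patch: the names `Addr`, `CalleeRef` and `resolveCallee` are hypothetical, the POT is modelled as a vector of 32-bit bin base addresses, and `potOff` is assumed to be a byte offset into a table of 4-byte entries.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of a 32-bit ARM address.
using Addr = std::uint32_t;

// The two constant pool entries emitted for a cross-bin callee.
struct CalleeRef {
  std::uint32_t potOff; // foo(potoff): byte offset of the bin's POT entry (assumed)
  std::uint32_t binOff; // foo(binoff): offset of the callee within its bin
};

// Mirrors the lowered sequence:
//   %foo_bin_addr = LDRrs %R9, %foo_potoff   -> load the bin base from the POT
//   %foo_addr     = ADDrr bin_addr, binoff   -> add the in-bin offset
Addr resolveCallee(const std::vector<Addr> &pot, CalleeRef ref) {
  // Assumption: each POT entry is 4 bytes wide.
  const std::size_t index = ref.potOff / sizeof(Addr);
  const Addr binBase = pot.at(index); // bounds-checked in this model only
  return binBase + ref.binOff;
}
```

With a POT whose third entry holds 0x5c810000, a callee described by potOff = 8 and binOff = 0x124 resolves to 0x5c810124 in this model, wherever the loader happened to place the bin.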
Inside pagerando bins, global addresses not in the GOT are computed with
approximately the following instruction sequence:

  cp#0: global(gotoff)

  %got_addr = LDRi12 %R9, 0
  %got_off_global = LDRi12 <cp#0>, 0
  %global_addr = ADDrr %got_addr, %got_off_global

This sequence loads the address of the GOT into %got_addr from the first POT
entry. The GOT-relative offset of @global, loaded from the constant pool, is
then added to %got_addr, leaving the dynamic address of @global in
%global_addr.
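
The same computation, again as a hedged C++ model rather than code from the patch: `Addr` and `resolveGlobal` are hypothetical names, the POT is a vector of 32-bit addresses whose first entry is the GOT address, and the GOT-relative offset is assumed to be a signed 32-bit value, so the addition wraps modulo 2^32 as the ADD instruction would.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical model of a 32-bit ARM address.
using Addr = std::uint32_t;

// Mirrors the lowered sequence:
//   %got_addr    = LDRi12 %R9, 0             -> first POT entry is the GOT address
//   %global_addr = ADDrr got_addr, gotoff    -> add the GOT-relative offset
Addr resolveGlobal(const std::vector<Addr> &pot, std::int32_t gotOff) {
  const Addr gotAddr = pot.at(0);
  // Unsigned addition wraps modulo 2^32, so a negative offset (assumed
  // possible here) lands below the GOT address.
  return gotAddr + static_cast<Addr>(gotOff);
}
```

For a GOT at 0x40020000, an offset of 0x40 yields 0x40020040 and an offset of -0x100 yields 0x4001ff00 in this model.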
Global addresses found in the GOT are loaded in the conventional way (using a
got_brel relocation on a constant pool entry), once the GOT address is loaded
from the first POT entry.
This patch set (D37580, D37581, D37582, D37583, D37584, D37585, D37586, D37587)
is a first draft of the pagerando implementation described in
http://lists.llvm.org/pipermail/llvm-dev/2017-June/113794.html.
What happens if LTO is not used? Is it just that only the current module gets put into bins and the rest doesn't, which is sub-optimal but correct, or is it fatal? If it is fatal, is there any way of asserting that LTO is required? If not, can the comment be expanded?