This is an archive of the discontinued LLVM Phabricator instance.

ARM/ELF: Better codegen for global variable addresses.
ClosedPublic

Authored by pcc on Oct 12 2015, 7:02 AM.

Details

Summary

In PIC mode we were previously computing global variable addresses (or GOT
entry addresses) by adding the PC, the PC-relative GOT displacement and
the GOT-relative symbol/GOT entry displacement. Because the latter two
displacements are fixed, we ended up performing one more addition than
necessary.

This change causes us to compute addresses using a single PC-relative
displacement, resulting in a shorter code sequence. This reduces code size
by about 4% in a recent build of Chromium for Android.

As a result of this change we no longer need to compute the GOT base address
in the ARM backend, which allows us to remove the Global Base Reg pass and
SDAG lowering for the GOT.

We also now no longer use the GOT when addressing a symbol which is known
to be defined in the same linkage unit. Specifically, the symbol must have either
hidden visibility or a strong definition in the current module in order to not use the
the GOT.

This is a change from the previous behaviour where we would use the
GOT to address externally visible symbols defined in the same module.
I think the only cases where this could matter are cases involving symbol
interposition, but we don't really support that well anyway.

Diff Detail

Repository
rL LLVM

Event Timeline

pcc updated this revision to Diff 37105.Oct 12 2015, 7:02 AM
pcc retitled this revision from to ARM/ELF: Better codegen for global variable addresses..
pcc updated this object.
pcc added reviewers: jmolloy, t.p.northover.
pcc added a subscriber: llvm-commits.

Hi Peter,

Overall, I think this is a good idea. It makes a pass unnecessary, generate the expected code directly and reduces the size considerably.

However, I'm worried about the impact of this on Thumb1 code. It seems all tests are assuming Thumb2, and Thumb1's restrictions for literal pools are a lot more restrictive. I'm adding a few people that used to care about Thumb1. Feel free to add more.

I'm also concerned with potential edge cases, so if you could run the the test-suite on Thumb2 and ARM, that would be great.

I'm still trying to understand the difference between the old and the new, but the new code looks ok so far. I'll let you know as I progress.

Thanks!
--renato

lib/Target/ARM/ARMFastISel.cpp
2943 ↗(On Diff #37105)

Can you elaborate on this change?

2947 ↗(On Diff #37105)

Is it 8 for thumb1, too?

2958 ↗(On Diff #37105)

Inconsistent isThumb with t2 instruction. Also, would be good to get all checks against a consistent ISA. I believe this can be used for all three.

pcc updated this object.Oct 15 2015, 11:42 AM
pcc updated this revision to Diff 37504.Oct 15 2015, 11:49 AM
  • Address review comments
pcc added a comment.Oct 15 2015, 11:51 AM

However, I'm worried about the impact of this on Thumb1 code. It seems all tests are assuming Thumb2, and Thumb1's restrictions for literal pools are a lot more restrictive. I'm adding a few people that used to care about Thumb1. Feel free to add more.

I believe that test/CodeGen/ARM/load-global.ll is a Thumb1 test case (it targets thumbv6-linux-gnueabi).

While investigating this I found that fast ISel isn't even being used for Thumb{1,2} on ELF targets. There's a flag -arm-force-fast-isel to force it on, and if I do so I find that generated code is badly broken in Thumb1 mode (e.g. it uses Thumb2 encodings). I've modified the fast-isel-pic.ll test to force fast ISel for Thumb2, which seems to be correct for loads at least.

I'm also concerned with potential edge cases, so if you could run the the test-suite on Thumb2 and ARM, that would be great.

I'll see if I can get access to a machine to do this on, the only ones I have access to at the moment are Android devices.

lib/Target/ARM/ARMFastISel.cpp
2943 ↗(On Diff #37105)

I've modified the commit message to call out this change.

2947 ↗(On Diff #37105)

It is, done (but see main reply).

2958 ↗(On Diff #37105)

I've done what I think is the right thing for t1 once it gets fixed properly.

In D13650#268144, @pcc wrote:

I believe that test/CodeGen/ARM/load-global.ll is a Thumb1 test case (it targets thumbv6-linux-gnueabi).

Hum, that could be, yes.

While investigating this I found that fast ISel isn't even being used for Thumb{1,2} on ELF targets.

Ah, indeed! My mistake again. FastISel is disable in Thumb mode. Though, the original code had the alignment correct, so keeping is good. We don't want to make it worse.

I'm also concerned with potential edge cases, so if you could run the the test-suite on Thumb2 and ARM, that would be great.

I'll see if I can get access to a machine to do this on, the only ones I have access to at the moment are Android devices.

I'll try to build it locally myself, too. I'll let you know as soon as I have something, so we don't duplicate efforts.

cheers,
--renato

rengolin accepted this revision.Oct 23 2015, 12:20 PM
rengolin added a reviewer: rengolin.

Hi Peter,

Sorry for the delay. Things look good on my end. LGTM. Thanks!

This revision is now accepted and ready to land.Oct 23 2015, 12:20 PM
This revision was automatically updated to reflect the committed changes.