This is an archive of the discontinued LLVM Phabricator instance.

[Thumb-1] Synthesize TBB/TBH instructions to make use of compressed jump tables
Closed, Public

Authored by jmolloy on Oct 14 2016, 6:43 AM.

Details

Summary

The TBB and TBH instructions in Thumb-2 allow jump tables to be compressed into sequences of bytes or shorts respectively. These instructions do not exist in Thumb-1; however, it is possible to synthesize them from a sequence of other instructions.

It turns out this sequence is so short that it's almost never a loss for performance and is ALWAYS a significant win for code size.

TBB example:

Before: lsls r0, r0, #2    After: add  r0, pc
        adr  r1, .LJTI0_0         ldrb r0, [r0, #6]
        ldr  r0, [r0, r1]         lsls r0, r0, #1
        mov  pc, r0               add  pc, r0
  => No change in prologue code size or dynamic instruction count. Jump table shrunk by a factor of 4.
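The size claim can be checked with simple arithmetic. The following is a minimal sketch, not code from the patch: the names `DISPATCH_BYTES`, `ENTRY_BYTES`, `table_bytes` and `shrink_factor` are hypothetical, and alignment padding around the table is assumed to be zero. It relies only on the fact that every instruction in the sequences shown is a 16-bit Thumb-1 encoding.

```python
# Illustrative model only; none of these names come from LLVM.

# Dispatch sequence size in bytes: 2 bytes per 16-bit Thumb-1 instruction.
DISPATCH_BYTES = {
    "word": 4 * 2,  # lsls, adr, ldr, mov pc
    "tbb": 4 * 2,   # add, ldrb, lsls, add pc
}

# Width of one jump table entry in bytes.
ENTRY_BYTES = {"word": 4, "tbh": 2, "tbb": 1}


def table_bytes(num_cases, kind):
    """Jump table size in bytes, ignoring any alignment padding (assumption)."""
    return num_cases * ENTRY_BYTES[kind]


def shrink_factor(num_cases, kind):
    """How many times smaller the compressed table is than the word table."""
    return table_bytes(num_cases, "word") / table_bytes(num_cases, kind)


if __name__ == "__main__":
    cases = 20  # arbitrary example switch with 20 cases
    for kind in ("word", "tbh", "tbb"):
        print(kind, table_bytes(cases, kind), "bytes")
    print("tbb shrink factor:", shrink_factor(cases, "tbb"))
```

For a switch with 20 cases, this model gives an 80-byte word table against a 20-byte byte table, with the dispatch sequence staying at 8 bytes in both forms.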

The only case that can increase dynamic instruction count is the TBH case:

Before: lsls r0, r4, #2    After: lsls r4, r4, #1
        adr  r1, .LJTI0_0         add  r4, pc
        ldr  r0, [r0, r1]         ldrh r4, [r4, #6]
        mov  pc, r0               lsls r4, r4, #1
                                  add  pc, r4
=> 1 more instruction in prologue. Jump table shrunk by a factor of 2.
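In both synthesized sequences the loaded entry is shifted left by one (`lsls ..., #1`) before being added to `pc`, and `ldrb`/`ldrh` zero-extend, so each entry holds half of a forward byte offset. A rough sketch of how the entry width could be chosen from that constraint follows. The function name `pick_entry_kind` is hypothetical, and the offsets are assumed to be measured from whatever base the final `add pc` uses; the exact base is not modelled here, and this is not the selection logic from the patch.

```python
def pick_entry_kind(byte_offsets):
    """Return the narrowest table form that can encode the given offsets.

    byte_offsets: branch target offsets in bytes, relative to the (unmodelled)
    base used by the final `add pc`. Returns "tbb", "tbh" or "word".
    """
    # Zero-extended, halved entries cannot express backward or odd offsets.
    if any(off < 0 or off % 2 != 0 for off in byte_offsets):
        return "word"
    largest_entry = max(byte_offsets, default=0) // 2
    if largest_entry <= 0xFF:      # fits in one byte: ldrb form
        return "tbb"
    if largest_entry <= 0xFFFF:    # fits in a halfword: ldrh form
        return "tbh"
    return "word"                  # fall back to the uncompressed table


if __name__ == "__main__":
    print(pick_entry_kind([4, 40, 510]))   # all targets within 510 bytes
    print(pick_entry_kind([4, 40, 512]))   # one target just past byte range
```

Under this model a byte table reaches targets up to 510 bytes ahead and a halfword table up to 131070 bytes ahead, which is consistent with TBHs being needed only for unusually large switch bodies.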

So there is an argument that this should be disabled when optimizing for performance and a TBH would need to be generated. I'm not so sure about that in practice, because on small Thumb-1 cores, performance is often tied to code size. But I'm willing to turn it off when optimizing for performance if people want. (Also note that TBHs are fairly rare in practice!)

Diff Detail

Repository
rL LLVM

Event Timeline

jmolloy updated this revision to Diff 74676. Oct 14 2016, 6:43 AM
jmolloy retitled this revision to [Thumb-1] Synthesize TBB/TBH instructions to make use of compressed jump tables.
jmolloy updated this object.
jmolloy set the repository for this revision to rL LLVM.
jmolloy added a subscriber: llvm-commits.
rengolin edited edge metadata. Oct 14 2016, 10:58 AM

Hi James,

Interesting idea! Though, the change is smaller than I was expecting. :)

I haven't looked in great detail, but here are some nits and questions to get you busy while I look at the rest. :)

cheers,
--renato

lib/Target/ARM/ARMConstantIslandPass.cpp
1987

rename?

lib/Target/ARM/ARMInstrThumb.td
1318

Why t2Pseudo?

test/CodeGen/ARM/jump-table-tbh.ll
30

I'm worried about the label naming here... Could it change later on?

The only test you really need here is that the first item is the same as the last. Maybe.

jmolloy updated this revision to Diff 74830. Oct 17 2016, 5:39 AM
jmolloy edited edge metadata.

Thanks Renato! Nice catch on the t2Pseudo - that was a copy-paste error.

Cheers,

James

rengolin accepted this revision. Oct 19 2016, 4:10 AM
rengolin edited edge metadata.

Hi James,

The code looks good to me, and I think the idea is a good one, including for the performance of Thumb-1 cores.

LGTM. Thanks!

lib/Target/ARM/ARMAsmPrinter.cpp
1774

Changing the table format/alignment looks like a code-size optimisation, so can be done later.

This revision is now accepted and ready to land. Oct 19 2016, 4:10 AM
jmolloy closed this revision. Oct 19 2016, 5:16 AM

Thanks Renato, committed in r284580!