This is an archive of the discontinued LLVM Phabricator instance.

Add tail call optimizations to thumb1-only targets

Authored by Langohr on Jan 14 2015, 3:06 PM.



What the patch is meant to do:

For tail calls identified during DAG generation, the target address is
loaded into a register from the constant pool.
If R3 is used for argument passing, the target address is forced
into the hard register R12 to work around limitations of the Thumb1 register
allocator with respect to the upper registers.

I decided to fetch the target address into a register by a constant pool
lookup because, when analyzing the code, I found that the mechanisms are also prepared
for situations where parameters are passed both in registers and on the stack. This would not be
possible with a BL // pop {pc} sequence within the epilogue, since that would change
the stack offsets.
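The three argument situations described above can be illustrated with a small C sketch. The function names are hypothetical and not taken from the patch. The sketch assumes the AAPCS convention that the first four word-sized arguments are passed in R0-R3 and any further ones on the stack.

```c
/* Hypothetical callees, for illustration only. */
int sum2(int a, int b) { return a + b; }
int sum4(int a, int b, int c, int d) { return a + b + c + d; }
int sum6(int a, int b, int c, int d, int e, int f)
{
    return a + b + c + d + e + f;
}

/* Two arguments occupy R0 and R1, so R3 is not needed for argument
 * passing. */
int tail2(int a, int b)
{
    return sum2(b, a); /* call in tail position */
}

/* Four arguments occupy R0-R3. This is the case in which the description
 * says the target address is forced into R12. */
int tail4(int a, int b, int c, int d)
{
    return sum4(d, c, b, a); /* call in tail position */
}

/* Six arguments: four in R0-R3 and two on the stack, i.e. parameters
 * passed both in registers and on the stack. */
int tail6(int a, int b, int c, int d, int e, int f)
{
    return sum6(f, e, d, c, b, a); /* call in tail position */
}
```

Whether a given compiler actually turns these calls into jumps depends on the target and the optimization level. One way to inspect the result, assuming a Clang build with the ARM backend, is to compile with `clang --target=thumbv6m-none-eabi -O2 -S` and read the generated assembly.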

During epilogue generation, the spilled registers are restored within
the emit epilogue function.

If LR happens to be spilled on the stack by the prologue, it is restored
by use of a scratch register.
If R3 is available as a scratch register (in which case the target address will be in R0, R1 or R2),
LR is reloaded by a pop into the scratch register followed by a mov to LR, after the other callee-saved registers have been popped.

If R3 is not available, LR is restored prior to restoring the registers (tPOP), and the stack
pointer is re-adjusted before the tail jump instruction.
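The ordering of the two LR-restore variants can be summarized in a short C sketch. It is not code from the patch: the enum, its names and the function are hypothetical, and it only models the order of the steps for the case where LR was spilled by the prologue.

```c
#include <stddef.h>

/* Hypothetical step names; they describe actions, not real LLVM opcodes. */
enum epilogue_step {
    STEP_POP_CALLEE_SAVED,   /* pop the other callee-saved registers */
    STEP_POP_LR_TO_SCRATCH,  /* pop the saved LR value into the scratch register */
    STEP_MOV_SCRATCH_TO_LR,  /* mov LR, scratch */
    STEP_RESTORE_LR_FIRST,   /* restore LR before the register pop; the
                                description does not say how this is done */
    STEP_READJUST_SP,        /* re-adjust the stack pointer */
    STEP_TAIL_JUMP           /* jump to the tail-call target */
};

/* Writes the step order into out (room for 4 entries) and returns the
 * number of steps. r3_free is nonzero when R3 is available as scratch. */
size_t plan_tail_epilogue(int r3_free, enum epilogue_step out[4])
{
    size_t n = 0;
    if (r3_free) {
        out[n++] = STEP_POP_CALLEE_SAVED;
        out[n++] = STEP_POP_LR_TO_SCRATCH;
        out[n++] = STEP_MOV_SCRATCH_TO_LR;
    } else {
        out[n++] = STEP_RESTORE_LR_FIRST;
        out[n++] = STEP_POP_CALLEE_SAVED;
        out[n++] = STEP_READJUST_SP;
    }
    out[n++] = STEP_TAIL_JUMP;
    return n;
}
```

In both variants the jump comes last. The difference is whether LR is reloaded after the register pop (R3 free) or before it (R3 occupied), with the stack pointer fixed up afterwards in the second case.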

So far I have tested the code by hand, analyzing the generated assembly
for a number of test cases and running some execution tests in QEMU.

In the lit test suite I get 4 failures, which on first analysis I
attribute to the fact that the generated code for tail calls
produces different output that no longer matches the expectation strings.

For the previous version of the patch, there was some discussion on the llvm-dev mailing list, mainly concerning the
optimization goal. With respect to code size and speed, tail call optimization will not be an improvement in many situations. The main benefit I see is that this patch makes LLVM for Thumb1 targets usable for "continuation style"
code as well, where a long chain of tail calls would otherwise eat up the precious stack space.
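As a minimal illustration of such a chain, consider two mutually recursive C functions in which every call is in tail position. The example is hypothetical and not part of the patch.

```c
int is_odd(unsigned n);

/* Each function ends by calling the other one in tail position. */
int is_even(unsigned n)
{
    return n == 0 ? 1 : is_odd(n - 1);
}

int is_odd(unsigned n)
{
    return n == 0 ? 0 : is_even(n - 1);
}
```

Without tail call optimization, `is_even(n)` keeps one stack frame alive for every step of the chain, so stack use grows linearly with `n`. When the calls are emitted as jumps, the chain runs in constant stack space. Note that C does not guarantee this transformation; it depends on the compiler, the target and the optimization level.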

Personally, I would make it part of the default options, but I would not mind enabling it only via a special compile switch.

Yours Björn

Event Timeline

Langohr updated this revision to Diff 18187. Jan 14 2015, 3:06 PM
Langohr retitled this revision from to Add tail call optimizations to thumb1-only targets.
Langohr updated this object.
Langohr edited the test plan for this revision.
Langohr added a subscriber: Unknown Object (MLST).
Langohr abandoned this revision. Jan 15 2015, 12:01 PM