Commit a994185

committed Feb 3, 2017
[ARM] Change TCReturn to tBL if tailcall optimization fails.
Summary:
The tail call optimisation is performed before register allocation, so at that point we don't know if LR is being spilt or not. If LR was spilt to the stack, then we cannot do a tail call optimisation. That would involve popping back into LR which is not possible in Thumb1 code.

Reviewers: rengolin, jmolloy, rovka, olista01

Reviewed By: olista01

Subscribers: llvm-commits, aemerson

Differential Revision: https://reviews.llvm.org/D29020

llvm-svn: 294000
1 parent 57b63d6 commit a994185

3 files changed, +39 -6 lines changed

llvm/lib/Target/ARM/ARMSubtarget.cpp

+6 -6
@@ -202,12 +202,12 @@ void ARMSubtarget::initSubtargetFeatures(StringRef CPU, StringRef FS) {
   // support in the assembler and linker to be used. This would need to be
   // fixed to fully support tail calls in Thumb1.
   //
-  // Doing this is tricky, since the LDM/POP instruction on Thumb doesn't take
-  // LR. This means if we need to reload LR, it takes an extra instructions,
-  // which outweighs the value of the tail call; but here we don't know yet
-  // whether LR is going to be used. Probably the right approach is to
-  // generate the tail call here and turn it back into CALL/RET in
-  // emitEpilogue if LR is used.
+  // For ARMv8-M, we /do/ implement tail calls. Doing this is tricky for v8-M
+  // baseline, since the LDM/POP instruction on Thumb doesn't take LR. This
+  // means if we need to reload LR, it takes extra instructions, which outweighs
+  // the value of the tail call; but here we don't know yet whether LR is going
+  // to be used. We generate the tail call here and turn it back into CALL/RET
+  // in emitEpilogue if LR is used.

   // Thumb1 PIC calls to external symbols use BX, so they can be tail calls,
   // but we need to make sure there are enough registers; the only valid

llvm/lib/Target/ARM/Thumb1FrameLowering.cpp

+10
@@ -888,6 +888,16 @@ restoreCalleeSavedRegisters(MachineBasicBlock &MBB,
   // ARMv4T requires BX, see emitEpilogue
   if (!STI.hasV5TOps())
     continue;
+  // Tailcall optimization failed; change TCRETURN to a tBL
+  if (MI->getOpcode() == ARM::TCRETURNdi ||
+      MI->getOpcode() == ARM::TCRETURNri) {
+    unsigned Opcode = MI->getOpcode() == ARM::TCRETURNdi
+                      ? ARM::tBL : ARM::tBLXr;
+    MachineInstrBuilder BL = BuildMI(MF, DL, TII.get(Opcode));
+    BL.add(predOps(ARMCC::AL));
+    BL.add(MI->getOperand(0));
+    MBB.insert(MI, &*BL);
+  }
   Reg = ARM::PC;
   (*MIB).setDesc(TII.get(ARM::tPOP_RET));
   if (MI != MBB.end())
+23
@@ -0,0 +1,23 @@
+; RUN: llc %s -o - -mtriple=thumbv8m.base | FileCheck %s
+
+define void @test() {
+; CHECK-LABEL: test:
+entry:
+  %call = tail call i32 @foo()
+  %tail = tail call i32 @foo()
+  ret void
+; CHECK: bl foo
+; CHECK: bl foo
+; CHECK-NOT: b foo
+}
+
+define void @test2() {
+; CHECK-LABEL: test2:
+entry:
+  %tail = tail call i32 @foo()
+  ret void
+; CHECK: b foo
+; CHECK-NOT: bl foo
+}
+
+declare i32 @foo()
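
The fallback in the Thumb1FrameLowering.cpp hunk can be read as a small decision table: if LR was spilt, the epilogue returns by popping into PC, and a pending tail call has to become an ordinary call first. Below is a minimal, self-contained C++ sketch of that table. It is not LLVM code: `Opcode`, `EpiloguePlan`, and `planEpilogue` are hypothetical names invented for illustration, and the model deliberately ignores the ARMv4T (`!STI.hasV5TOps()`) path and everything else the real function does.

```cpp
// Simplified model only; none of these names are LLVM APIs.
enum class Opcode { TCRETURNdi, TCRETURNri, tBL, tBLXr, Other };

struct EpiloguePlan {
  bool emitCall;      // true if the tail call is turned back into a call
  Opcode callOpcode;  // meaningful only when emitCall is true
  bool popIntoPC;     // true if the function returns via a pop into PC
};

// Terminator:   opcode of the instruction ending the block (assumed input).
// LRWasSpilled: whether LR ended up in the callee-saved spill list.
constexpr EpiloguePlan planEpilogue(Opcode Terminator, bool LRWasSpilled) {
  if (!LRWasSpilled)
    return {false, Opcode::Other, false};  // nothing to undo; tail call stays
  if (Terminator == Opcode::TCRETURNdi)
    return {true, Opcode::tBL, true};      // direct tail call -> tBL
  if (Terminator == Opcode::TCRETURNri)
    return {true, Opcode::tBLXr, true};    // indirect tail call -> tBLXr
  return {false, Opcode::Other, true};     // plain return via pop into PC
}

// Compile-time checks of the table above.
static_assert(planEpilogue(Opcode::TCRETURNdi, true).callOpcode == Opcode::tBL,
              "direct tail call falls back to tBL");
static_assert(!planEpilogue(Opcode::TCRETURNdi, false).emitCall,
              "tail call is kept when LR is not spilt");
```

Under this model, `@test` in the new test file would correspond to `planEpilogue(Opcode::TCRETURNdi, true)`, presumably because the first call to `@foo` clobbers LR and forces it to be spilt, while `@test2` would correspond to `planEpilogue(Opcode::TCRETURNdi, false)`.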
