This is an archive of the discontinued LLVM Phabricator instance.

Differential D38143

Dynamic stack alignment for Thumb1
Closed, Public

Authored by chill on Sep 21 2017, 10:31 AM.

Details

Reviewers

mcrosier
asl
tyomitch
rengolin
t.p.northover
olista01
john.brawn

Commits

rGd6a4ab3d4917: [ARM] Dynamic stack alignment for 16-bit Thumb
rL316289: [ARM] Dynamic stack alignment for 16-bit Thumb

Summary

This patch adds dynamic stack alignment for Thumb1.

The motivating issue is miscompilation of the following code when targeting a
CPU that implements only Thumb-1, such as cortex-m0.

struct foo {
    alignas(16) char buf[12];
    int i;
    int *ip;
    foo() : ip(&i) {}
};

extern void g(foo &);
void f() {
    foo myFoo;
    g(myFoo);
}

When initialising the ip member, the address of i is calculated using bitwise OR:

push    {r7, lr}
.pad    #40
sub     sp, #40
movs    r1, #12
mov     r0, sp
orrs    r1, r0
str     r1, [sp, #16]

which is incorrect when the starting address of that object (and hence the
stack pointer) happens to be aligned only to an 8- or 4-byte boundary.
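
For illustration only (this is not part of the patch), the difference between the OR-based and the ADD-based address computation can be sketched in plain C++. The function names and the stack addresses below are made up for the example; the offset 12 is the offset of `i` used in the assembly above.

```cpp
#include <cassert>
#include <cstdint>

// What the miscompiled code computes: base | offset. This matches
// base + offset only when the bits of `base` covered by `offset` are zero.
constexpr std::uint32_t memberAddrOr(std::uint32_t base, std::uint32_t offset) {
    return base | offset;
}

// What the address should be.
constexpr std::uint32_t memberAddrAdd(std::uint32_t base, std::uint32_t offset) {
    return base + offset;
}

// Hypothetical stack addresses, chosen only to show the two cases.
inline void checkOrVersusAdd() {
    // 16-byte aligned base: the low four bits are zero, so OR equals ADD.
    assert(memberAddrOr(0x20001000u, 12) == memberAddrAdd(0x20001000u, 12));
    // 8-byte aligned base: bit 3 is already set, so OR drops the carry
    // and yields an address 8 bytes too low.
    assert(memberAddrOr(0x20001008u, 12) == 0x2000100Cu);
    assert(memberAddrAdd(0x20001008u, 12) == 0x20001014u);
}
```

The OR form is a valid strength reduction only under the assumption that the object really is 16-byte aligned, which is what the missing stack realignment breaks.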

When compiling for ARM or Thumb2, the stack is realigned and the problem does
not occur.

Event Timeline

chill created this revision. Sep 21 2017, 10:31 AM

Herald added subscribers: kristof.beyls, javed.absar, aemerson. Sep 21 2017, 10:31 AM

chill added reviewers: rengolin, t.p.northover. Sep 22 2017, 5:50 AM

rengolin added inline comments. Sep 22 2017, 8:01 AM

lib/Target/ARM/Thumb1FrameLowering.cpp
357	Hard-coding r4 here is bound to create problems. You need to make sure it's saved earlier (and popped later). You also add four instructions to the prologue, which in Thumb1 is not great. It's better than bad codegen, of course, but you need to check how often the realignment will hit (given that it used to be fatal, I'm guessing not often), or whether there's another way to do this (I can't think of anything). Welcoming comments from people with more Thumb1 experience than myself.

asl added inline comments. Sep 22 2017, 8:03 AM

lib/Target/ARM/Thumb1FrameLowering.cpp
357	Do we have the register scavenger available here, so we could scavenge a spare register? Or is it too early to do this, before the frame is set up?

rogfer01 added a subscriber: rogfer01. Sep 22 2017, 8:25 AM

chill added inline comments. Sep 24 2017, 1:23 PM

lib/Target/ARM/Thumb1FrameLowering.cpp
357	Thanks for pointing that out. I've fixed the patch to add R4 to the set of callee-saved registers. Unfortunately, I don't know of a better way to do the alignment in Thumb1. The only other alternative I can think of (mov + bic) would need one more scratch register and would have a more limited range of possible alignments.
357	I've experimented with using a virtual scratch register and having the register scavenger later find a suitable hard register for it. It works ... kinda. The register R4 is used as a scratch anyway in the function epilogue to restore the SP from the frame pointer, so using some other available register does not buy us anything. There's the option of using a virtual scratch register in the epilogue too, but (besides triggering some bugs) `llvm::emitThumbRegPlusImmediate` does not emit great code if the destination register is not a low register, e.g. we may get:

movs r1, #-8
rsbs r1, r1, #0
add r1, r7
mov sp, r1

Also, this approach of using R4 as scratch is also employed by the ARM/Thumb2 stack align code and I think this should be addressed by a separate patch.

chill marked an inline comment as done. Sep 24 2017, 1:24 PM

Thanks for the comments, much appreciated.
The next revision of the patch makes sure R4 is added to the set of callee saved registers if the function requires stack realignment, too.

Does this work correctly with "inreg" arguments? Just checking.

In D38143#879967, @asl wrote:

Does this work correctly with "inreg" arguments? Just checking.

Not sure. I couldn't produce a testcase where the presence or absence of "inreg" would make any difference.

Ping?

rengolin added subscribers: compnerd, joerg. Oct 4 2017, 1:37 AM

rengolin added inline comments.

lib/Target/ARM/Thumb1FrameLowering.cpp
357	Right, the fix works for me (rather, I can't think of anything better). @asl @compnerd @joerg?

Ping?

chill added reviewers: olista01, john.brawn. Oct 16 2017, 3:17 AM

Ping?

LGTM

This revision is now accepted and ready to land. Oct 18 2017, 7:58 AM

Thanks for the reviews and the approval.

Closed by commit rL316289: [ARM] Dynamic stack alignment for 16-bit Thumb (authored by chill). Oct 22 2017, 4:57 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path                                          Size
lib/Target/ARM/ARMBaseRegisterInfo.cpp        6 lines
lib/Target/ARM/ARMFrameLowering.cpp           14 lines
lib/Target/ARM/Thumb1FrameLowering.cpp        34 lines
test/CodeGen/ARM/thumb1_return_sequence.ll    10 lines
test/CodeGen/Thumb/large-stack.ll             4 lines
test/CodeGen/Thumb/long.ll                    9 lines
test/CodeGen/Thumb/stack-align.ll             19 lines

Diff 116491

lib/Target/ARM/ARMBaseRegisterInfo.cpp

@@ if (AFI->isThumbFunction() && MFI.hasVarSizedObjects()) {
     return true;
   }

   return false;
 }

 bool ARMBaseRegisterInfo::canRealignStack(const MachineFunction &MF) const {
   const MachineRegisterInfo *MRI = &MF.getRegInfo();
-  const ARMFunctionInfo *AFI = MF.getInfo<ARMFunctionInfo>();
   const ARMFrameLowering *TFI = getFrameLowering(MF);
   // We can't realign the stack if:
   // 1. Dynamic stack realignment is explicitly disabled,
-  // 2. This is a Thumb1 function (it's not useful, so we don't bother), or
-  // 3. There are VLAs in the function and the base pointer is disabled.
+  // 2. There are VLAs in the function and the base pointer is disabled.
   if (!TargetRegisterInfo::canRealignStack(MF))
     return false;
-  if (AFI->isThumb1OnlyFunction())
-    return false;
   // Stack realignment requires a frame pointer. If we already started
   // register allocation with frame pointer elimination, it is too late now.
   if (!MRI->canReserveReg(getFramePointerReg(MF.getSubtarget<ARMSubtarget>())))
     return false;
   // We may also need a base pointer if there are dynamic allocas or stack
   // pointer adjustments around calls.
   if (TFI->hasReservedCallFrame(MF))
     return true;

lib/Target/ARM/ARMFrameLowering.cpp

@@ if (AFI->isThumb2Function() &&
       (MFI.hasVarSizedObjects() || RegInfo->needsStackRealignment(MF)))
     SavedRegs.set(ARM::R4);

   if (AFI->isThumb1OnlyFunction()) {
     // Spill LR if Thumb1 function uses variable length argument lists.
     if (AFI->getArgRegsSaveSize() > 0)
       SavedRegs.set(ARM::LR);

-    // Spill R4 if Thumb1 epilogue has to restore SP from FP. We don't know
-    // for sure what the stack size will be, but for this, an estimate is good
-    // enough. If there anything changes it, it'll be a spill, which implies
-    // we've used all the registers and so R4 is already used, so not marking
-    // it here will be OK.
+    // Spill R4 if Thumb1 epilogue has to restore SP from FP or the function
+    // requires stack alignment. We don't know for sure what the stack size
+    // will be, but for this, an estimate is good enough. If there anything
+    // changes it, it'll be a spill, which implies we've used all the registers
+    // and so R4 is already used, so not marking it here will be OK.
     // FIXME: It will be better just to find spare register here.
-    unsigned StackSize = MFI.estimateStackSize(MF);
-    if (MFI.hasVarSizedObjects() || StackSize > 508)
+    if (MFI.hasVarSizedObjects() || RegInfo->needsStackRealignment(MF) ||
+        MFI.estimateStackSize(MF) > 508)
       SavedRegs.set(ARM::R4);
   }

   // See if we can spill vector registers to aligned stack.
   checkNumAlignedDPRCS2Regs(MF, SavedRegs);

   // Spill the BasePtr if it's used.
   if (RegInfo->hasBasePointer(MF))
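
As a reading aid (not LLVM code), the change to the Thumb1 R4 spill condition in this hunk can be restated as two small predicates. `Thumb1FrameFacts` and both function names are hypothetical stand-ins for the values the real code queries from `MFI` and `RegInfo`; only the 508 threshold and the three conditions come from the diff.

```cpp
#include <cassert>

// Hypothetical summary of the frame properties the hunk inspects.
struct Thumb1FrameFacts {
    bool hasVarSizedObjects;
    bool needsStackRealignment;
    unsigned estimatedStackSize; // bytes
};

// Condition before the patch: realignment is not considered.
constexpr bool spillsR4Before(const Thumb1FrameFacts &f) {
    return f.hasVarSizedObjects || f.estimatedStackSize > 508;
}

// Condition after the patch: realignment also forces R4 to be saved.
constexpr bool spillsR4After(const Thumb1FrameFacts &f) {
    return f.hasVarSizedObjects || f.needsStackRealignment ||
           f.estimatedStackSize > 508;
}

inline void checkSpillConditions() {
    // Small frame that only needs realignment: newly covered by the patch.
    const Thumb1FrameFacts realignOnly{false, true, 16};
    assert(!spillsR4Before(realignOnly));
    assert(spillsR4After(realignOnly));
}
```

The only frames whose answer changes are those that need realignment but have neither variable-sized objects nor an estimated size above 508 bytes.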

lib/Target/ARM/Thumb1FrameLowering.cpp

@@ void Thumb1FrameLowering::emitPrologue(MachineFunction &MF,
   if (STI.isTargetELF() && HasFP)
     MFI.setOffsetAdjustment(MFI.getOffsetAdjustment() -
                             AFI->getFramePtrSpillOffset());

   AFI->setGPRCalleeSavedArea1Size(GPRCS1Size);
   AFI->setGPRCalleeSavedArea2Size(GPRCS2Size);
   AFI->setDPRCalleeSavedAreaSize(DPRCSSize);

-  // Thumb1 does not currently support dynamic stack realignment. Report a
-  // fatal error rather then silently generate bad code.
-  if (RegInfo->needsStackRealignment(MF))
-    report_fatal_error("Dynamic stack realignment not supported for thumb1.");
+  if (RegInfo->needsStackRealignment(MF)) {
+    const unsigned NrBitsToZero = countTrailingZeros(MFI.getMaxAlignment());
+    // Emit the following sequence, using R4 as a temporary, since we cannot use
+    // SP as a source or destination register for the shifts:
+    // mov r4, sp
+    // lsrs r4, r4, #NrBitsToZero
+    // lsls r4, r4, #NrBitsToZero
+    // mov sp, r4
+    BuildMI(MBB, MBBI, dl, TII.get(ARM::tMOVr), ARM::R4)
+        .addReg(ARM::SP, RegState::Kill)
+        .add(predOps(ARMCC::AL));
+
+    BuildMI(MBB, MBBI, dl, TII.get(ARM::tLSRri), ARM::R4)
+        .addDef(ARM::CPSR)
+        .addReg(ARM::R4, RegState::Kill)
+        .addImm(NrBitsToZero)
+        .add(predOps(ARMCC::AL));
+
+    BuildMI(MBB, MBBI, dl, TII.get(ARM::tLSLri), ARM::R4)
+        .addDef(ARM::CPSR)
+        .addReg(ARM::R4, RegState::Kill)
+        .addImm(NrBitsToZero)
+        .add(predOps(ARMCC::AL));
+
+    BuildMI(MBB, MBBI, dl, TII.get(ARM::tMOVr), ARM::SP)
+        .addReg(ARM::R4, RegState::Kill)
+        .add(predOps(ARMCC::AL));
+
+    AFI->setShouldRestoreSPFromFP(true);
+  }

   // If we need a base pointer, set it up here. It's whatever the value
   // of the stack pointer is at this point. Any variable size objects
   // will be allocated after this, so we can still use the base pointer
   // to reference locals.
   if (RegInfo->hasBasePointer(MF))
     BuildMI(MBB, MBBI, dl, TII.get(ARM::tMOVr), BasePtr)
         .addReg(ARM::SP)
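
To make the arithmetic of the emitted prologue sequence concrete, here is a small host-side sketch in C++ (not LLVM code). It assumes the maximum alignment is a nonzero power of two; `trailingZeros` and `realignSp` are hypothetical names, with `trailingZeros` standing in for `countTrailingZeros`, and the stack addresses in the checks are made up.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for countTrailingZeros; assumes `align` is a nonzero power of two.
inline unsigned trailingZeros(std::uint32_t align) {
    assert(align != 0);
    unsigned n = 0;
    while ((align & 1u) == 0) {
        align >>= 1;
        ++n;
    }
    return n;
}

// Mirrors the four-instruction sequence: shifting right and then left by the
// same amount clears the low bits, rounding sp down to a multiple of maxAlign.
inline std::uint32_t realignSp(std::uint32_t sp, std::uint32_t maxAlign) {
    const unsigned nrBitsToZero = trailingZeros(maxAlign);
    std::uint32_t r4 = sp;  // mov  r4, sp
    r4 >>= nrBitsToZero;    // lsrs r4, r4, #NrBitsToZero
    r4 <<= nrBitsToZero;    // lsls r4, r4, #NrBitsToZero
    return r4;              // mov  sp, r4
}

inline void checkRealign() {
    assert(realignSp(0x20001FF8u, 16) == 0x20001FF0u); // 8-aligned in, 16-aligned out
    assert(realignSp(0x20001FF0u, 16) == 0x20001FF0u); // already aligned: unchanged
}
```

The result is the same as `sp & ~(maxAlign - 1)`; the shift pair avoids needing a second register to hold the mask, which is the trade-off discussed in the inline comments on this hunk.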

test/CodeGen/ARM/thumb1_return_sequence.ll

 ; RUN: llc -mtriple=thumbv4t-none--eabi < %s | FileCheck %s --check-prefix=CHECK-V4T
 ; RUN: llc -mtriple=thumbv5t-none--eabi < %s | FileCheck %s --check-prefix=CHECK-V5T

 ; CHECK-V4T-LABEL: clobberframe
 ; CHECK-V5T-LABEL: clobberframe
 define <4 x i32> @clobberframe(<6 x i32>* %p) #0 {
 entry:
 ; Prologue
 ; --------
 ; CHECK-V4T: push {[[SAVED:(r[4567](, )?)+]], lr}
 ; CHECK-V4T: sub sp,
+; Stack is realigned because of the <6 x i32> type
+; CHECK-V4T: mov sp, r4
 ; CHECK-V5T: push {[[SAVED:(r[4567](, )?)+]], lr}

   %b = alloca <6 x i32>, align 16
   %a = alloca <4 x i32>, align 16
   %stuff = load <6 x i32>, <6 x i32>* %p, align 16
   store <6 x i32> %stuff, <6 x i32>* %b, align 16
   store <4 x i32> <i32 0, i32 1, i32 2, i32 3>, <4 x i32>* %a, align 16
   %0 = load <4 x i32>, <4 x i32>* %a, align 16
   ret <4 x i32> %0

 ; Epilogue
 ; --------
-; CHECK-V4T: add sp,
+; Stack realignment means sp is restored from frame pointer
+; CHECK-V4T: mov sp
 ; CHECK-V4T-NEXT: pop {[[SAVED]]}
 ; The ISA for v4 does not support pop pc, so make sure we do not emit
 ; one even when we do not need to update SP.
 ; CHECK-V4T-NOT: pop {pc}
 ; We may only use lo register to pop, but in that case, all the scratch
 ; ones are used.
 ; r12 is the only register we are allowed to clobber for AAPCS.
 ; Use it to save a lo register.
[32 unchanged lines not shown]
 ; --------
 ; CHECK-V4T: pop {[[SAVED]]}
 ; CHECK-V4T-NEXT: mov r12, [[POP_REG:r[0-7]]]
 ; CHECK-V4T-NEXT: pop {[[POP_REG]]}
 ; CHECK-V4T-NEXT: add sp,
 ; CHECK-V4T-NEXT: mov lr, [[POP_REG]]
 ; CHECK-V4T-NEXT: mov [[POP_REG]], r12
 ; CHECK-V4T: bx lr
-; CHECK-V5T: add sp,
-; CHECK-V5T-NEXT: pop {[[SAVED]]}
+; CHECK-V5T: lsls r4
+; CHECK-V5T-NEXT: mov sp, r4
+; CHECK-V5T: pop {[[SAVED]]}
 ; CHECK-V5T-NEXT: mov r12, [[POP_REG:r[0-7]]]
 ; CHECK-V5T-NEXT: pop {[[POP_REG]]}
 ; CHECK-V5T-NEXT: add sp,
 ; CHECK-V5T-NEXT: mov lr, [[POP_REG]]
 ; CHECK-V5T-NEXT: mov [[POP_REG]], r12
 ; CHECK-V5T-NEXT: bx lr
 }

test/CodeGen/Thumb/large-stack.ll

 ; CHECK: ldr [[TEMP:r[0-7]]],
 ; CHECK: add sp, [[TEMP]]
 ; CHECK: ldr [[TEMP2:r[0-7]]],
 ; CHECK: add [[TEMP2]], sp
 ; CHECK: ldr [[TEMP3:r[0-7]]],
 ; CHECK: add sp, [[TEMP3]]
   %retval = alloca i32, align 4
   %tmp = alloca i32, align 4
-  %a = alloca [805306369 x i8], align 16
+  %a = alloca [805306369 x i8], align 4
   store i32 0, i32* %tmp
   %tmp1 = load i32, i32* %tmp
   ret i32 %tmp1
 }

 define i32 @test3_nofpelim() "no-frame-pointer-elim"="true" {
 ; CHECK-LABEL: test3_nofpelim:
 ; CHECK: ldr [[TEMP:r[0-7]]],
 ; CHECK: add sp, [[TEMP]]
 ; CHECK: ldr [[TEMP2:r[0-7]]],
 ; CHECK: add [[TEMP2]], sp
 ; CHECK: subs r4, r7,
 ; CHECK: mov sp, r4
   %retval = alloca i32, align 4
   %tmp = alloca i32, align 4
-  %a = alloca [805306369 x i8], align 16
+  %a = alloca [805306369 x i8], align 8
   store i32 0, i32* %tmp
   %tmp1 = load i32, i32* %tmp
   ret i32 %tmp1
 }

 ; Here, the adds get optimized out because they are dead, but the calculation
 ; of the address of stack_a is dead but not optimized out. When the address
 ; calculation gets expanded to two instructions, we need to avoid reading a

test/CodeGen/Thumb/long.ll

-; RUN: llc -mtriple=thumb-eabi %s -verify-machineinstrs -o - | FileCheck %s
+; RUN: llc -mtriple=thumb-eabi %s -verify-machineinstrs -o - | \
+; RUN: FileCheck %s -check-prefix CHECK --check-prefix CHECK-EABI
 ; RUN: llc -mtriple=thumb-apple-darwin %s -verify-machineinstrs -o - | \
 ; RUN: FileCheck %s -check-prefix CHECK -check-prefix CHECK-DARWIN

 define i64 @f1() {
 entry:
   ret i64 0
 ; CHECK-LABEL: f1:
 ; CHECK: movs r0, #0
[157 unchanged lines not shown]
 }

 define i64 @f10() {
 entry:
   %a = alloca i64, align 8 ; <i64*> [#uses=1]
   %retval = load i64, i64* %a ; <i64> [#uses=1]
   ret i64 %retval
 ; CHECK-LABEL: f10:
-; CHECK: sub sp, #8
+; CHECK-EABI: sub sp, #8
+; CHECK-DARWIN: add r7, sp, #4
 ; CHECK: ldr r0, [sp]
 ; CHECK: ldr r1, [sp, #4]
-; CHECK: add sp, #8
+; CHECK-EABI: add sp, #8
+; CHECK-DARWIN: mov sp, r4
 }

 define i64 @f11(i64 %x, i64 %y) {
 entry:
   %tmp1 = add i64 -1000, %y
   %tmp2 = add i64 %tmp1, -1000
   ret i64 %tmp2
 ; CHECK-LABEL: f11:

test/CodeGen/Thumb/stack-align.ll

This file was added.

+; RUN: llc -mtriple=thumb-eabi < %s -o - | FileCheck %s
+
+define void @f() local_unnamed_addr #0 {
+entry:
+; Check stack is realigned to 16 byte boundary
+  %i = alloca i32, align 16
+; CHECK: mov r4, sp
+; CHECK-NEXT: lsrs r4, r4, #4
+; CHECK-NEXT: lsls r4, r4, #4
+; CHECK-NEXT: mov sp, r4
+  store i32 0, i32* %i, align 16
+  call void @g(i32* nonnull %i)
+  ret void
+; Check stack is restored from frame pointer (using r4 as scratch)
+; CHECK: mov sp, r4
+; CHECK-NEXT: pop
+}
+
+declare void @g(i32*) local_unnamed_addr #2
