Differential D18573
[X86] Enable call frame optimization ("mov to push") not only for optsize (PR26325)
ClosedPublic · Authored by hans on Mar 29 2016, 1:43 PM

Details

Summary

The size savings are significant, and from what I can tell, both ICC and GCC do this [1] [2]. Please let me know what you think.

[1] https://godbolt.org/g/OtrQRa
[2] https://godbolt.org/g/5We0Zx
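(Editor's note: a rough illustration of where the size savings come from, not taken from the patch or its tests; the instruction sequences and byte counts are approximate and depend on the operands and calling convention.)

```c
/* Illustrative sketch only -- not from the patch. A 32-bit x86 caller
 * passing two small immediate arguments. */
void callee(int a, int b);

void caller(void) {
  callee(1, 2);
  /* Reserved call frame ("mov" form), with %esp adjusted once in the
   * prologue/epilogue:
   *   movl  $2, 4(%esp)     # 8 bytes
   *   movl  $1, (%esp)      # 7 bytes
   *   calll callee
   *
   * "mov to push" form:
   *   pushl $2              # 2 bytes (push imm8)
   *   pushl $1              # 2 bytes
   *   calll callee
   *   addl  $8, %esp        # 3 bytes of cleanup
   *
   * Even with the cleanup, the push form is less than half the size
   * for this call site. */
}
```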
Event Timeline

joerg added inline comments.
rnk edited edge metadata.

lgtm. I think this is production ready. We *mostly* build chromium with -Os, and we've fixed some bugs in this code.
This revision is now accepted and ready to land. (Mar 30 2016, 4:17 PM)

Closed by commit rL264966: [X86] Enable call frame optimization ("mov to push") not only for optsize… (authored by hans). (Mar 30 2016, 4:43 PM)

This revision was automatically updated to reflect the committed changes.
Thanks a lot, Hans!
Revision Contents
Diff 52155

llvm/trunk/lib/Target/X86/X86CallFrameOptimization.cpp
llvm/trunk/test/CodeGen/X86/2006-05-02-InstrSched1.ll
llvm/trunk/test/CodeGen/X86/2006-11-12-CSRetCC.ll
llvm/trunk/test/CodeGen/X86/atom-lea-sp.ll
llvm/trunk/test/CodeGen/X86/avx-intel-ocl.ll
llvm/trunk/test/CodeGen/X86/avx512-intel-ocl.ll
llvm/trunk/test/CodeGen/X86/call-push.ll
llvm/trunk/test/CodeGen/X86/cmpxchg-clobber-flags.ll
llvm/trunk/test/CodeGen/X86/coalescer-commute3.ll
llvm/trunk/test/CodeGen/X86/hipe-prologue.ll
llvm/trunk/test/CodeGen/X86/i386-shrink-wrapping.ll
llvm/trunk/test/CodeGen/X86/libcall-sret.ll
llvm/trunk/test/CodeGen/X86/localescape.ll
llvm/trunk/test/CodeGen/X86/mcu-abi.ll
llvm/trunk/test/CodeGen/X86/memset-2.ll
llvm/trunk/test/CodeGen/X86/mingw-alloca.ll
llvm/trunk/test/CodeGen/X86/movtopush.ll
llvm/trunk/test/CodeGen/X86/phys-reg-local-regalloc.ll
llvm/trunk/test/CodeGen/X86/segmented-stacks.ll
llvm/trunk/test/CodeGen/X86/seh-catch-all-win32.ll
llvm/trunk/test/CodeGen/X86/seh-stack-realign.ll
llvm/trunk/test/CodeGen/X86/shrink-wrap-chkstk.ll
llvm/trunk/test/CodeGen/X86/sse-intel-ocl.ll
llvm/trunk/test/CodeGen/X86/tailcall-stackalign.ll
llvm/trunk/test/CodeGen/X86/twoaddr-coalesce.ll
llvm/trunk/test/CodeGen/X86/vararg-callee-cleanup.ll
llvm/trunk/test/CodeGen/X86/win-catchpad-csrs.ll
llvm/trunk/test/CodeGen/X86/win-catchpad.ll
llvm/trunk/test/CodeGen/X86/win-cleanuppad.ll
llvm/trunk/test/CodeGen/X86/win32-eh-states.ll
llvm/trunk/test/CodeGen/X86/win32-seh-catchpad.ll
llvm/trunk/test/CodeGen/X86/win32-seh-nested-finally.ll
llvm/trunk/test/CodeGen/X86/win32_sret.ll
llvm/trunk/test/CodeGen/X86/xmulo.ll
llvm/trunk/test/CodeGen/X86/zext-fold.ll
Two things here for the updated patch. If the stack alignment requirement is only 32-bit, OR if the pushes have realigned the stack correctly (not sure if we care about the second part), the addls can be deferred to the end of the BB.
It's also cheaper to use a pop into some scratch register, if one is available.
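(Editor's note: a hedged sketch of the two suggestions above, not code from the patch; the function names and exact sequences are hypothetical.)

```c
/* Two calls in one basic block on 32-bit x86; sizes are for the
 * common short encodings. */
void f(int x);

void two_calls(void) {
  f(1);
  f(2);
  /* Cleaning up after every call:
   *   pushl $1
   *   calll f
   *   addl  $4, %esp        # 3 bytes
   *   pushl $2
   *   calll f
   *   addl  $4, %esp        # 3 bytes
   *
   * If only 4-byte stack alignment is required (or the pushes keep the
   * stack correctly aligned), the adjustments can be deferred and
   * merged at the end of the BB:
   *   pushl $1
   *   calll f
   *   pushl $2
   *   calll f
   *   addl  $8, %esp        # one 3-byte cleanup
   *
   * And when a scratch register is dead, each 4-byte adjustment can be
   * a 1-byte pop instead:
   *   popl  %ecx
   *   popl  %ecx            # 2 bytes total vs. 3 for addl $8, %esp
   */
}
```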