This is an archive of the discontinued LLVM Phabricator instance.

Fix i386 stack alignment for parameter type with breakdowns
Needs Review · Public

Authored by wxiao3 on Apr 27 2019, 8:22 AM.

Details

Reviewers

annita.zhang
LuoYuanke
smaslov
craig.topper
rnk
hjl.tools

Summary

Parameters are broken down into register-sized types for value passing. E.g.,
float128 is broken down into 4 x i32. But the current implementation doesn't
take the alignment of the original type into account when allocating
stack space for the first piece of the broken-down parameter. E.g., a
float128 parameter is passed 4-byte aligned instead of 16-byte
aligned, which is inconsistent with the i386 ABI. This causes
runtime failures when the generated code calls into libraries built by
other compilers, such as GCC, that follow the i386 ABI.

This patch fixes the bug by taking the original alignment into account.
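
The rule described in the summary can be illustrated with a small sketch. This is not the patch's code: `alignTo` and `assignPieces` are hypothetical names introduced here, and the sketch assumes that only the first piece of a broken-down argument is raised to the original type's alignment while the remaining pieces follow at the slot alignment.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Round `offset` up to the next multiple of `align` (a power of two).
constexpr uint32_t alignTo(uint32_t offset, uint32_t align) {
    return (offset + align - 1) & ~(align - 1);
}

// Hypothetical helper (illustration only): assign stack offsets to the
// pieces of one broken-down argument. The first piece uses
// max(origAlign, slotAlign); later pieces use slotAlign and therefore
// stay contiguous with the first one.
inline std::vector<uint32_t> assignPieces(uint32_t &nextOffset,
                                          unsigned numPieces,
                                          uint32_t slotSize,
                                          uint32_t slotAlign,
                                          uint32_t origAlign) {
    std::vector<uint32_t> offsets;
    for (unsigned i = 0; i < numPieces; ++i) {
        uint32_t align = (i == 0) ? std::max(origAlign, slotAlign) : slotAlign;
        nextOffset = alignTo(nextOffset, align);
        offsets.push_back(nextOffset);
        nextOffset += slotSize;
    }
    return offsets;
}
```

Starting at offset 4 (as if one `int` argument had already been placed), four 4-byte pieces with an original alignment of 16 land at offsets 16, 20, 24, 28. With an original alignment of 4, which models ignoring the original type, they land at 4, 8, 12, 16.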

Event Timeline

wxiao3 created this revision. Apr 27 2019, 8:22 AM

Herald added a project: Restricted Project. Apr 27 2019, 8:22 AM

Herald added a subscriber: llvm-commits.

wxiao3 edited the summary of this revision. Apr 27 2019, 8:24 AM

Hi all,

Any comments?

IIUC, this bug is only present when SSE2 is disabled, and your change only has any effect on tests that don't enable SSE2. The normal calling convention rules for vectors pass them in XMM0-2, and then in 16 byte aligned stack memory. So, in most normal operation, LLVM does the right thing.

The fix itself is pretty obscure. You're making a change to the way all i32 and f32 arguments passed in memory are aligned, but I think you probably want the same fix for v2f64, which is probably broken down into two doubles, and handled by this 32-bit f64 rule:

// Doubles get 8-byte slots that are 4-byte aligned.
CCIfType<[f64], CCAssignToStack<8, 4>>,

No, in most normal operation for x86_32, LLVM does the wrong thing. A simple example to show the ABI bug:

$ cat test.c
void callee(int, __float128);
void caller() {
  callee(1, 2.q);
}
$ clang -m32 -S test.c -o -
caller:                                 # @caller
# %bb.0:                                # %entry
        pushl   %ebp
        movl    %esp, %ebp
        subl    $24, %esp
        movl    %esp, %eax
        movl    $1073741824, 16(%eax)   # imm = 0x40000000
        movl    $0, 12(%eax)
        movl    $0, 8(%eax)
        movl    $0, 4(%eax)
        movl    $1, (%eax)
        calll   callee
        addl    $24, %esp
        popl    %ebp
        retl

$ gcc -m32 -S test.c -o -
caller:
.LFB0:
        .cfi_startproc
        pushl   %ebp
        .cfi_def_cfa_offset 8
        .cfi_offset 5, -8
        movl    %esp, %ebp
        .cfi_def_cfa_register 5
        subl    $8, %esp
        subl    $16, %esp
        movl    $0, (%esp)
        movl    $0, 4(%esp)
        movl    $0, 8(%esp)
        movl    $1073741824, 12(%esp)
        subl    $12, %esp
        pushl   $1
        call    callee
        addl    $32, %esp
        nop
        leave

For 32-bit targets, float128 is always passed in memory, even with SSE/AVX2/AVX-512 enabled.
And float128 is broken down into 4 x i32 on x86_32.
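
As a reading aid for the two listings above, the following sketch shows how the value `2.q` splits into four 32-bit words. It assumes the IEEE 754 binary128 layout (1 sign bit, 15 exponent bits with bias 16383, 112 fraction bits) and little-endian word order. `powerOfTwoAsWords` is a hypothetical helper that only handles exact powers of two.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper (illustration only): the four little-endian 32-bit
// words of the binary128 encoding of 2^exp, for a positive power of two.
// The fraction is zero, so only the top word is nonzero.
inline std::array<uint32_t, 4> powerOfTwoAsWords(int exp) {
    // Biased exponent; fits in the 15-bit exponent field for normal values.
    uint32_t biased = static_cast<uint32_t>(exp + 16383);
    // The exponent field occupies bits 112..126 of the 128-bit value,
    // which is bits 16..30 of the most significant 32-bit word.
    std::array<uint32_t, 4> words = {{0u, 0u, 0u, biased << 16}};
    return words;
}
```

For `2.q` (`exp = 1`) the top word is `0x40000000`, i.e. 1073741824, and the other three words are zero, which matches the immediates stored in both listings. Clang writes these words at offsets 4 to 16 from the outgoing `%esp`, directly after the `int`. GCC writes them at 0 to 12 and then lowers `%esp` by a further 16 bytes (`subl $12` plus `pushl $1`), so they end up at offsets 16 to 28 from the `int` argument.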

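Revision Contents

Path                                        Size
include/llvm/Target/TargetCallingConv.td    7 lines
lib/Target/X86/X86CallingConv.td            2 lines
test/CodeGen/X86/
    (name not shown)                        15 lines
    (name not shown)                        40 lines
    (name not shown)                        6 lines
    (name not shown)                        57 lines
    (name not shown)                        11 lines
    (name not shown)                        51 lines
    (name not shown)                        184 lines
    masked_gather_scatter.ll                36 lines
    (name not shown)                        2 lines
    (name not shown)                        10 lines
    (name not shown)                        42 lines
    (name not shown)                        104 lines
    (name not shown)                        52 lines
    (name not shown)                        60 lines
    (name not shown)                        104 lines
    (name not shown)                        42 lines
    (name not shown)                        35 lines
    (name not shown)                        60 lines
    (name not shown)                        35 lines
    win32-pic-jumptable.ll                  2 lines
utils/TableGen/CallingConvEmitter.cpp       27 lines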

Diff 196970

include/llvm/Target/TargetCallingConv.td

	... (109 lines hidden) ...
	/// stack slot of the specified size and alignment on the stack. If size is			/// stack slot of the specified size and alignment on the stack. If size is
	/// zero then the ABI size is used; if align is zero then the ABI alignment			/// zero then the ABI size is used; if align is zero then the ABI alignment
	/// is used - these may depend on the target or subtarget.			/// is used - these may depend on the target or subtarget.
	class CCAssignToStack<int size, int align> : CCAction {			class CCAssignToStack<int size, int align> : CCAction {
	int Size = size;			int Size = size;
	int Align = align;			int Align = align;
	}			}

				/// CCAssignToStackWithOrigAlign - Same as CCAssignToStack, but alignment takes
				/// original alignment into account, i.e, max(OriginAlign, Align) is used.
				class CCAssignToStackWithOrigAlign<int size, int align> : CCAction {
				int Size = size;
				int Align = align;
				}

	/// CCAssignToStackWithShadow - Same as CCAssignToStack, but with a list of			/// CCAssignToStackWithShadow - Same as CCAssignToStack, but with a list of
	/// registers to be shadowed. Note that, unlike CCAssignToRegWithShadow, this			/// registers to be shadowed. Note that, unlike CCAssignToRegWithShadow, this
	/// shadows ALL of the registers in shadowList.			/// shadows ALL of the registers in shadowList.
	class CCAssignToStackWithShadow<int size,			class CCAssignToStackWithShadow<int size,
	int align,			int align,
	list<Register> shadowList> : CCAction {			list<Register> shadowList> : CCAction {
	int Size = size;			int Size = size;
	int Align = align;			int Align = align;
	... (75 lines hidden) ...

lib/Target/X86/X86CallingConv.td

... (785 lines hidden) ...	def CC_X86_32_Common : CallingConv<[

// The first 3 __m64 vector arguments are passed in mmx registers if the		// The first 3 __m64 vector arguments are passed in mmx registers if the
// call is not a vararg call.		// call is not a vararg call.
CCIfNotVarArg<CCIfType<[x86mmx],		CCIfNotVarArg<CCIfType<[x86mmx],
CCAssignToReg<[MM0, MM1, MM2]>>>,		CCAssignToReg<[MM0, MM1, MM2]>>>,

// Integer/Float values get stored in stack slots that are 4 bytes in		// Integer/Float values get stored in stack slots that are 4 bytes in
// size and 4-byte aligned.		// size and 4-byte aligned.
CCIfType<[i32, f32], CCAssignToStack<4, 4>>,		CCIfType<[i32, f32], CCAssignToStackWithOrigAlign<4, 4>>,

// Doubles get 8-byte slots that are 4-byte aligned.		// Doubles get 8-byte slots that are 4-byte aligned.
CCIfType<[f64], CCAssignToStack<8, 4>>,		CCIfType<[f64], CCAssignToStack<8, 4>>,

// Long doubles get slots whose size depends on the subtarget.		// Long doubles get slots whose size depends on the subtarget.
CCIfType<[f80], CCAssignToStack<0, 4>>,		CCIfType<[f80], CCAssignToStack<0, 4>>,

// Boolean vectors of AVX-512 are passed in SIMD registers.		// Boolean vectors of AVX-512 are passed in SIMD registers.
... (343 lines hidden) ...

test/CodeGen/X86/add.ll

... (403 lines hidden) ...	; X64-WIN32-NEXT: retq
%nota = xor i32 %a, -1		%nota = xor i32 %a, -1
%r = add i32 %nota, 1		%r = add i32 %nota, 1
ret i32 %r		ret i32 %r
}		}

define <4 x i32> @inc_not_vec(<4 x i32> %a) nounwind {		define <4 x i32> @inc_not_vec(<4 x i32> %a) nounwind {
; X32-LABEL: inc_not_vec:		; X32-LABEL: inc_not_vec:
; X32: # %bb.0:		; X32: # %bb.0:
		; X32-NEXT: pushl %ebp
		; X32-NEXT: movl %esp, %ebp
; X32-NEXT: pushl %edi		; X32-NEXT: pushl %edi
; X32-NEXT: pushl %esi		; X32-NEXT: pushl %esi
; X32-NEXT: movl {{[0-9]+}}(%esp), %eax		; X32-NEXT: andl $-16, %esp
		; X32-NEXT: movl 8(%ebp), %eax
; X32-NEXT: xorl %ecx, %ecx		; X32-NEXT: xorl %ecx, %ecx
; X32-NEXT: xorl %edx, %edx		; X32-NEXT: xorl %edx, %edx
; X32-NEXT: subl {{[0-9]+}}(%esp), %edx		; X32-NEXT: subl 24(%ebp), %edx
; X32-NEXT: xorl %esi, %esi		; X32-NEXT: xorl %esi, %esi
; X32-NEXT: subl {{[0-9]+}}(%esp), %esi		; X32-NEXT: subl 28(%ebp), %esi
; X32-NEXT: xorl %edi, %edi		; X32-NEXT: xorl %edi, %edi
; X32-NEXT: subl {{[0-9]+}}(%esp), %edi		; X32-NEXT: subl 32(%ebp), %edi
; X32-NEXT: subl {{[0-9]+}}(%esp), %ecx		; X32-NEXT: subl 36(%ebp), %ecx
; X32-NEXT: movl %ecx, 12(%eax)		; X32-NEXT: movl %ecx, 12(%eax)
; X32-NEXT: movl %edi, 8(%eax)		; X32-NEXT: movl %edi, 8(%eax)
; X32-NEXT: movl %esi, 4(%eax)		; X32-NEXT: movl %esi, 4(%eax)
; X32-NEXT: movl %edx, (%eax)		; X32-NEXT: movl %edx, (%eax)
		; X32-NEXT: leal -8(%ebp), %esp
; X32-NEXT: popl %esi		; X32-NEXT: popl %esi
; X32-NEXT: popl %edi		; X32-NEXT: popl %edi
		; X32-NEXT: popl %ebp
; X32-NEXT: retl $4		; X32-NEXT: retl $4
;		;
; X64-LINUX-LABEL: inc_not_vec:		; X64-LINUX-LABEL: inc_not_vec:
; X64-LINUX: # %bb.0:		; X64-LINUX: # %bb.0:
; X64-LINUX-NEXT: pxor %xmm1, %xmm1		; X64-LINUX-NEXT: pxor %xmm1, %xmm1
; X64-LINUX-NEXT: psubd %xmm0, %xmm1		; X64-LINUX-NEXT: psubd %xmm0, %xmm1
; X64-LINUX-NEXT: movdqa %xmm1, %xmm0		; X64-LINUX-NEXT: movdqa %xmm1, %xmm0
; X64-LINUX-NEXT: retq		; X64-LINUX-NEXT: retq
... (220 lines hidden) ...

test/CodeGen/X86/bool-vector.ll

; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py		; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
; RUN: llc < %s -mtriple=i686-unknown-unknown -mattr=-sse2 | FileCheck %s --check-prefix=X32		; RUN: llc < %s -mtriple=i686-unknown-unknown -mattr=-sse2 | FileCheck %s --check-prefix=X32
; RUN: llc < %s -mtriple=i686-unknown-unknown -mattr=+sse2 | FileCheck %s --check-prefix=X32-SSE2		; RUN: llc < %s -mtriple=i686-unknown-unknown -mattr=+sse2 | FileCheck %s --check-prefix=X32-SSE2
; RUN: llc < %s -mtriple=i686-unknown-unknown -mattr=+avx2 | FileCheck %s --check-prefix=X32-AVX2		; RUN: llc < %s -mtriple=i686-unknown-unknown -mattr=+avx2 | FileCheck %s --check-prefix=X32-AVX2
; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=-sse2 | FileCheck %s --check-prefix=X64		; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=-sse2 | FileCheck %s --check-prefix=X64
; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=+sse2 | FileCheck %s --check-prefix=X64-SSE2		; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=+sse2 | FileCheck %s --check-prefix=X64-SSE2
; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=+avx2 | FileCheck %s --check-prefix=X64-AVX2		; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=+avx2 | FileCheck %s --check-prefix=X64-AVX2

define i32 @PR15215_bad(<4 x i32> %input) {		define i32 @PR15215_bad(<4 x i32> %input) {
; X32-LABEL: PR15215_bad:		; X32-LABEL: PR15215_bad:
; X32: # %bb.0: # %entry		; X32: # %bb.0: # %entry
; X32-NEXT: movb {{[0-9]+}}(%esp), %al		; X32-NEXT: pushl %ebp
; X32-NEXT: movb {{[0-9]+}}(%esp), %cl		; X32-NEXT: .cfi_def_cfa_offset 8
; X32-NEXT: movb {{[0-9]+}}(%esp), %dl		; X32-NEXT: .cfi_offset %ebp, -8
; X32-NEXT: movb {{[0-9]+}}(%esp), %ah		; X32-NEXT: movl %esp, %ebp
		; X32-NEXT: .cfi_def_cfa_register %ebp
		; X32-NEXT: andl $-16, %esp
		; X32-NEXT: subl $16, %esp
		; X32-NEXT: movb 8(%ebp), %al
		; X32-NEXT: movb 12(%ebp), %cl
		; X32-NEXT: movb 16(%ebp), %dl
		; X32-NEXT: movb 20(%ebp), %ah
; X32-NEXT: addb %ah, %ah		; X32-NEXT: addb %ah, %ah
; X32-NEXT: andb $1, %dl		; X32-NEXT: andb $1, %dl
; X32-NEXT: orb %ah, %dl		; X32-NEXT: orb %ah, %dl
; X32-NEXT: shlb $2, %dl		; X32-NEXT: shlb $2, %dl
; X32-NEXT: addb %cl, %cl		; X32-NEXT: addb %cl, %cl
; X32-NEXT: andb $1, %al		; X32-NEXT: andb $1, %al
; X32-NEXT: orb %cl, %al		; X32-NEXT: orb %cl, %al
; X32-NEXT: andb $3, %al		; X32-NEXT: andb $3, %al
; X32-NEXT: orb %dl, %al		; X32-NEXT: orb %dl, %al
; X32-NEXT: movzbl %al, %eax		; X32-NEXT: movzbl %al, %eax
; X32-NEXT: andl $15, %eax		; X32-NEXT: andl $15, %eax
		; X32-NEXT: movl %ebp, %esp
		; X32-NEXT: popl %ebp
		; X32-NEXT: .cfi_def_cfa %esp, 4
; X32-NEXT: retl		; X32-NEXT: retl
;		;
; X32-SSE2-LABEL: PR15215_bad:		; X32-SSE2-LABEL: PR15215_bad:
; X32-SSE2: # %bb.0: # %entry		; X32-SSE2: # %bb.0: # %entry
; X32-SSE2-NEXT: pslld $31, %xmm0		; X32-SSE2-NEXT: pslld $31, %xmm0
; X32-SSE2-NEXT: movmskps %xmm0, %eax		; X32-SSE2-NEXT: movmskps %xmm0, %eax
; X32-SSE2-NEXT: retl		; X32-SSE2-NEXT: retl
;		;
... (34 lines hidden) ...	entry:
%1 = bitcast <4 x i1> %0 to i4		%1 = bitcast <4 x i1> %0 to i4
%2 = zext i4 %1 to i32		%2 = zext i4 %1 to i32
ret i32 %2		ret i32 %2
}		}

define i32 @PR15215_good(<4 x i32> %input) {		define i32 @PR15215_good(<4 x i32> %input) {
; X32-LABEL: PR15215_good:		; X32-LABEL: PR15215_good:
; X32: # %bb.0: # %entry		; X32: # %bb.0: # %entry
; X32-NEXT: pushl %esi		; X32-NEXT: pushl %ebp
; X32-NEXT: .cfi_def_cfa_offset 8		; X32-NEXT: .cfi_def_cfa_offset 8
; X32-NEXT: .cfi_offset %esi, -8		; X32-NEXT: .cfi_offset %ebp, -8
; X32-NEXT: movl {{[0-9]+}}(%esp), %eax		; X32-NEXT: movl %esp, %ebp
; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx		; X32-NEXT: .cfi_def_cfa_register %ebp
; X32-NEXT: movl {{[0-9]+}}(%esp), %edx		; X32-NEXT: pushl %esi
; X32-NEXT: movl {{[0-9]+}}(%esp), %esi		; X32-NEXT: andl $-16, %esp
		; X32-NEXT: subl $16, %esp
		; X32-NEXT: .cfi_offset %esi, -12
		; X32-NEXT: movl 20(%ebp), %eax
		; X32-NEXT: movl 16(%ebp), %ecx
		; X32-NEXT: movl 12(%ebp), %edx
		; X32-NEXT: movl 8(%ebp), %esi
; X32-NEXT: andl $1, %esi		; X32-NEXT: andl $1, %esi
; X32-NEXT: andl $1, %edx		; X32-NEXT: andl $1, %edx
; X32-NEXT: andl $1, %ecx		; X32-NEXT: andl $1, %ecx
; X32-NEXT: andl $1, %eax		; X32-NEXT: andl $1, %eax
; X32-NEXT: leal (%esi,%edx,2), %edx		; X32-NEXT: leal (%esi,%edx,2), %edx
; X32-NEXT: leal (%edx,%ecx,4), %ecx		; X32-NEXT: leal (%edx,%ecx,4), %ecx
; X32-NEXT: leal (%ecx,%eax,8), %eax		; X32-NEXT: leal (%ecx,%eax,8), %eax
		; X32-NEXT: leal -4(%ebp), %esp
; X32-NEXT: popl %esi		; X32-NEXT: popl %esi
; X32-NEXT: .cfi_def_cfa_offset 4		; X32-NEXT: popl %ebp
		; X32-NEXT: .cfi_def_cfa %esp, 4
; X32-NEXT: retl		; X32-NEXT: retl
;		;
; X32-SSE2-LABEL: PR15215_good:		; X32-SSE2-LABEL: PR15215_good:
; X32-SSE2: # %bb.0: # %entry		; X32-SSE2: # %bb.0: # %entry
; X32-SSE2-NEXT: pushl %esi		; X32-SSE2-NEXT: pushl %esi
; X32-SSE2-NEXT: .cfi_def_cfa_offset 8		; X32-SSE2-NEXT: .cfi_def_cfa_offset 8
; X32-SSE2-NEXT: .cfi_offset %esi, -8		; X32-SSE2-NEXT: .cfi_offset %esi, -8
; X32-SSE2-NEXT: movd %xmm0, %eax		; X32-SSE2-NEXT: movd %xmm0, %eax
... (99 lines hidden) ...

test/CodeGen/X86/cmovcmov.ll

	... (153 lines hidden) ...
	; CMOV-NEXT: .LBB4_3: # %entry			; CMOV-NEXT: .LBB4_3: # %entry
	; CMOV-NEXT: movaps %xmm3, %xmm0			; CMOV-NEXT: movaps %xmm3, %xmm0
	; CMOV-NEXT: retq			; CMOV-NEXT: retq
	;			;
	; NOCMOV-LABEL: test_select_fcmp_oeq_v4i32:			; NOCMOV-LABEL: test_select_fcmp_oeq_v4i32:
	; NOCMOV: # %bb.0: # %entry			; NOCMOV: # %bb.0: # %entry
	; NOCMOV-NEXT: pushl %edi			; NOCMOV-NEXT: pushl %edi
	; NOCMOV-NEXT: pushl %esi			; NOCMOV-NEXT: pushl %esi
				; NOCMOV-NEXT: pushl %eax
	; NOCMOV-NEXT: flds {{[0-9]+}}(%esp)			; NOCMOV-NEXT: flds {{[0-9]+}}(%esp)
	; NOCMOV-NEXT: flds {{[0-9]+}}(%esp)			; NOCMOV-NEXT: flds {{[0-9]+}}(%esp)
	; NOCMOV-NEXT: fucompp			; NOCMOV-NEXT: fucompp
	; NOCMOV-NEXT: fnstsw %ax			; NOCMOV-NEXT: fnstsw %ax
	; NOCMOV-NEXT: # kill: def $ah killed $ah killed $ax			; NOCMOV-NEXT: # kill: def $ah killed $ah killed $ax
	; NOCMOV-NEXT: sahf			; NOCMOV-NEXT: sahf
	; NOCMOV-NEXT: leal {{[0-9]+}}(%esp), %eax			; NOCMOV-NEXT: leal {{[0-9]+}}(%esp), %eax
	; NOCMOV-NEXT: jne .LBB4_3			; NOCMOV-NEXT: jne .LBB4_3
	... (27 lines hidden) ...
	; NOCMOV-NEXT: # %bb.11: # %entry			; NOCMOV-NEXT: # %bb.11: # %entry
	; NOCMOV-NEXT: leal {{[0-9]+}}(%esp), %edi			; NOCMOV-NEXT: leal {{[0-9]+}}(%esp), %edi
	; NOCMOV-NEXT: .LBB4_12: # %entry			; NOCMOV-NEXT: .LBB4_12: # %entry
	; NOCMOV-NEXT: movl (%edi), %edi			; NOCMOV-NEXT: movl (%edi), %edi
	; NOCMOV-NEXT: movl %edi, 12(%eax)			; NOCMOV-NEXT: movl %edi, 12(%eax)
	; NOCMOV-NEXT: movl %esi, 8(%eax)			; NOCMOV-NEXT: movl %esi, 8(%eax)
	; NOCMOV-NEXT: movl %edx, 4(%eax)			; NOCMOV-NEXT: movl %edx, 4(%eax)
	; NOCMOV-NEXT: movl %ecx, (%eax)			; NOCMOV-NEXT: movl %ecx, (%eax)
				; NOCMOV-NEXT: addl $4, %esp
	; NOCMOV-NEXT: popl %esi			; NOCMOV-NEXT: popl %esi
	; NOCMOV-NEXT: popl %edi			; NOCMOV-NEXT: popl %edi
	; NOCMOV-NEXT: retl $4			; NOCMOV-NEXT: retl $4
	entry:			entry:
	%cmp = fcmp oeq float %a, %b			%cmp = fcmp oeq float %a, %b
	%r = select i1 %cmp, <4 x i32> %c, <4 x i32> %d			%r = select i1 %cmp, <4 x i32> %c, <4 x i32> %d
	ret <4 x i32> %r			ret <4 x i32> %r
	}			}
	... (24 lines hidden) ...
	; NOCMOV-NEXT: fld1			; NOCMOV-NEXT: fld1
	; NOCMOV-NEXT: fldz			; NOCMOV-NEXT: fldz
	; NOCMOV-NEXT: jne .LBB5_1			; NOCMOV-NEXT: jne .LBB5_1
	; NOCMOV-NEXT: # %bb.2: # %entry			; NOCMOV-NEXT: # %bb.2: # %entry
	; NOCMOV-NEXT: jp .LBB5_5			; NOCMOV-NEXT: jp .LBB5_5
	; NOCMOV-NEXT: # %bb.3: # %entry			; NOCMOV-NEXT: # %bb.3: # %entry
	; NOCMOV-NEXT: fstp %st(1)			; NOCMOV-NEXT: fstp %st(1)
	; NOCMOV-NEXT: jmp .LBB5_4			; NOCMOV-NEXT: jmp .LBB5_4
	; NOCMOV-NEXT: .LBB5_1:			; NOCMOV-NEXT: .LBB5_1: # %entry
	; NOCMOV-NEXT: fstp %st(0)			; NOCMOV-NEXT: fstp %st(0)
	; NOCMOV-NEXT: .LBB5_4: # %entry			; NOCMOV-NEXT: .LBB5_4: # %entry
	; NOCMOV-NEXT: fldz			; NOCMOV-NEXT: fldz
	; NOCMOV-NEXT: .LBB5_5: # %entry			; NOCMOV-NEXT: .LBB5_5: # %entry
	; NOCMOV-NEXT: fstp %st(0)			; NOCMOV-NEXT: fstp %st(0)
	; NOCMOV-NEXT: retl			; NOCMOV-NEXT: retl
	entry:			entry:
	%cmp = fcmp une float %a, %b			%cmp = fcmp une float %a, %b
	... (26 lines hidden) ...
	; NOCMOV-NEXT: fldz			; NOCMOV-NEXT: fldz
	; NOCMOV-NEXT: fld1			; NOCMOV-NEXT: fld1
	; NOCMOV-NEXT: jne .LBB6_1			; NOCMOV-NEXT: jne .LBB6_1
	; NOCMOV-NEXT: # %bb.2: # %entry			; NOCMOV-NEXT: # %bb.2: # %entry
	; NOCMOV-NEXT: jp .LBB6_5			; NOCMOV-NEXT: jp .LBB6_5
	; NOCMOV-NEXT: # %bb.3: # %entry			; NOCMOV-NEXT: # %bb.3: # %entry
	; NOCMOV-NEXT: fstp %st(1)			; NOCMOV-NEXT: fstp %st(1)
	; NOCMOV-NEXT: jmp .LBB6_4			; NOCMOV-NEXT: jmp .LBB6_4
	; NOCMOV-NEXT: .LBB6_1:			; NOCMOV-NEXT: .LBB6_1: # %entry
	; NOCMOV-NEXT: fstp %st(0)			; NOCMOV-NEXT: fstp %st(0)
	; NOCMOV-NEXT: .LBB6_4: # %entry			; NOCMOV-NEXT: .LBB6_4: # %entry
	; NOCMOV-NEXT: fldz			; NOCMOV-NEXT: fldz
	; NOCMOV-NEXT: .LBB6_5: # %entry			; NOCMOV-NEXT: .LBB6_5: # %entry
	; NOCMOV-NEXT: fstp %st(0)			; NOCMOV-NEXT: fstp %st(0)
	; NOCMOV-NEXT: retl			; NOCMOV-NEXT: retl
	entry:			entry:
	%cmp = fcmp oeq float %a, %b			%cmp = fcmp oeq float %a, %b
	... (76 lines hidden) ...

test/CodeGen/X86/extract-store.ll

... (506 lines hidden) ...	; AVX-X64-NEXT: retq
%vecext = extractelement <2 x double> %foo, i32 1		%vecext = extractelement <2 x double> %foo, i32 1
store double %vecext, double* %dst, align 1		store double %vecext, double* %dst, align 1
ret void		ret void
}		}

define void @extract_f128_0(fp128* nocapture %dst, <2 x fp128> %foo) nounwind {		define void @extract_f128_0(fp128* nocapture %dst, <2 x fp128> %foo) nounwind {
; SSE-X32-LABEL: extract_f128_0:		; SSE-X32-LABEL: extract_f128_0:
; SSE-X32: # %bb.0:		; SSE-X32: # %bb.0:
		; SSE-X32-NEXT: pushl %ebp
		; SSE-X32-NEXT: movl %esp, %ebp
; SSE-X32-NEXT: pushl %edi		; SSE-X32-NEXT: pushl %edi
; SSE-X32-NEXT: pushl %esi		; SSE-X32-NEXT: pushl %esi
; SSE-X32-NEXT: movl {{[0-9]+}}(%esp), %eax		; SSE-X32-NEXT: andl $-32, %esp
; SSE-X32-NEXT: movl {{[0-9]+}}(%esp), %ecx		; SSE-X32-NEXT: subl $32, %esp
; SSE-X32-NEXT: movl {{[0-9]+}}(%esp), %edx		; SSE-X32-NEXT: movl 40(%ebp), %eax
; SSE-X32-NEXT: movl {{[0-9]+}}(%esp), %esi		; SSE-X32-NEXT: movl 44(%ebp), %ecx
; SSE-X32-NEXT: movl {{[0-9]+}}(%esp), %edi		; SSE-X32-NEXT: movl 48(%ebp), %edx
		; SSE-X32-NEXT: movl 52(%ebp), %esi
		; SSE-X32-NEXT: movl 8(%ebp), %edi
; SSE-X32-NEXT: movl %esi, 12(%edi)		; SSE-X32-NEXT: movl %esi, 12(%edi)
; SSE-X32-NEXT: movl %edx, 8(%edi)		; SSE-X32-NEXT: movl %edx, 8(%edi)
; SSE-X32-NEXT: movl %ecx, 4(%edi)		; SSE-X32-NEXT: movl %ecx, 4(%edi)
; SSE-X32-NEXT: movl %eax, (%edi)		; SSE-X32-NEXT: movl %eax, (%edi)
		; SSE-X32-NEXT: leal -8(%ebp), %esp
; SSE-X32-NEXT: popl %esi		; SSE-X32-NEXT: popl %esi
; SSE-X32-NEXT: popl %edi		; SSE-X32-NEXT: popl %edi
		; SSE-X32-NEXT: popl %ebp
; SSE-X32-NEXT: retl		; SSE-X32-NEXT: retl
;		;
; SSE2-X64-LABEL: extract_f128_0:		; SSE2-X64-LABEL: extract_f128_0:
; SSE2-X64: # %bb.0:		; SSE2-X64: # %bb.0:
; SSE2-X64-NEXT: movq %rdx, 8(%rdi)		; SSE2-X64-NEXT: movq %rdx, 8(%rdi)
; SSE2-X64-NEXT: movq %rsi, (%rdi)		; SSE2-X64-NEXT: movq %rsi, (%rdi)
; SSE2-X64-NEXT: retq		; SSE2-X64-NEXT: retq
;		;
; SSE41-X64-LABEL: extract_f128_0:		; SSE41-X64-LABEL: extract_f128_0:
; SSE41-X64: # %bb.0:		; SSE41-X64: # %bb.0:
; SSE41-X64-NEXT: movq %rdx, 8(%rdi)		; SSE41-X64-NEXT: movq %rdx, 8(%rdi)
; SSE41-X64-NEXT: movq %rsi, (%rdi)		; SSE41-X64-NEXT: movq %rsi, (%rdi)
; SSE41-X64-NEXT: retq		; SSE41-X64-NEXT: retq
;		;
; AVX-X32-LABEL: extract_f128_0:		; AVX-X32-LABEL: extract_f128_0:
; AVX-X32: # %bb.0:		; AVX-X32: # %bb.0:
; AVX-X32-NEXT: vmovups {{[0-9]+}}(%esp), %xmm0		; AVX-X32-NEXT: pushl %ebp
; AVX-X32-NEXT: movl {{[0-9]+}}(%esp), %eax		; AVX-X32-NEXT: movl %esp, %ebp
		; AVX-X32-NEXT: andl $-32, %esp
		; AVX-X32-NEXT: subl $32, %esp
		; AVX-X32-NEXT: vmovups 40(%ebp), %xmm0
		; AVX-X32-NEXT: movl 8(%ebp), %eax
; AVX-X32-NEXT: vmovups %xmm0, (%eax)		; AVX-X32-NEXT: vmovups %xmm0, (%eax)
		; AVX-X32-NEXT: movl %ebp, %esp
		; AVX-X32-NEXT: popl %ebp
; AVX-X32-NEXT: retl		; AVX-X32-NEXT: retl
;		;
; AVX-X64-LABEL: extract_f128_0:		; AVX-X64-LABEL: extract_f128_0:
; AVX-X64: # %bb.0:		; AVX-X64: # %bb.0:
; AVX-X64-NEXT: movq %rdx, 8(%rdi)		; AVX-X64-NEXT: movq %rdx, 8(%rdi)
; AVX-X64-NEXT: movq %rsi, (%rdi)		; AVX-X64-NEXT: movq %rsi, (%rdi)
; AVX-X64-NEXT: retq		; AVX-X64-NEXT: retq
;		;
; SSE-F128-LABEL: extract_f128_0:		; SSE-F128-LABEL: extract_f128_0:
; SSE-F128: # %bb.0:		; SSE-F128: # %bb.0:
; SSE-F128-NEXT: movups %xmm0, (%rdi)		; SSE-F128-NEXT: movups %xmm0, (%rdi)
; SSE-F128-NEXT: retq		; SSE-F128-NEXT: retq
%vecext = extractelement <2 x fp128> %foo, i32 0		%vecext = extractelement <2 x fp128> %foo, i32 0
store fp128 %vecext, fp128* %dst, align 1		store fp128 %vecext, fp128* %dst, align 1
ret void		ret void
}		}

define void @extract_f128_1(fp128* nocapture %dst, <2 x fp128> %foo) nounwind {		define void @extract_f128_1(fp128* nocapture %dst, <2 x fp128> %foo) nounwind {
; SSE-X32-LABEL: extract_f128_1:		; SSE-X32-LABEL: extract_f128_1:
; SSE-X32: # %bb.0:		; SSE-X32: # %bb.0:
		; SSE-X32-NEXT: pushl %ebp
		; SSE-X32-NEXT: movl %esp, %ebp
; SSE-X32-NEXT: pushl %edi		; SSE-X32-NEXT: pushl %edi
; SSE-X32-NEXT: pushl %esi		; SSE-X32-NEXT: pushl %esi
; SSE-X32-NEXT: movl {{[0-9]+}}(%esp), %eax		; SSE-X32-NEXT: andl $-32, %esp
; SSE-X32-NEXT: movl {{[0-9]+}}(%esp), %ecx		; SSE-X32-NEXT: subl $32, %esp
; SSE-X32-NEXT: movl {{[0-9]+}}(%esp), %edx		; SSE-X32-NEXT: movl 56(%ebp), %eax
; SSE-X32-NEXT: movl {{[0-9]+}}(%esp), %esi		; SSE-X32-NEXT: movl 60(%ebp), %ecx
; SSE-X32-NEXT: movl {{[0-9]+}}(%esp), %edi		; SSE-X32-NEXT: movl 64(%ebp), %edx
		; SSE-X32-NEXT: movl 68(%ebp), %esi
		; SSE-X32-NEXT: movl 8(%ebp), %edi
; SSE-X32-NEXT: movl %esi, 12(%edi)		; SSE-X32-NEXT: movl %esi, 12(%edi)
; SSE-X32-NEXT: movl %edx, 8(%edi)		; SSE-X32-NEXT: movl %edx, 8(%edi)
; SSE-X32-NEXT: movl %ecx, 4(%edi)		; SSE-X32-NEXT: movl %ecx, 4(%edi)
; SSE-X32-NEXT: movl %eax, (%edi)		; SSE-X32-NEXT: movl %eax, (%edi)
		; SSE-X32-NEXT: leal -8(%ebp), %esp
; SSE-X32-NEXT: popl %esi		; SSE-X32-NEXT: popl %esi
; SSE-X32-NEXT: popl %edi		; SSE-X32-NEXT: popl %edi
		; SSE-X32-NEXT: popl %ebp
; SSE-X32-NEXT: retl		; SSE-X32-NEXT: retl
;		;
; SSE2-X64-LABEL: extract_f128_1:		; SSE2-X64-LABEL: extract_f128_1:
; SSE2-X64: # %bb.0:		; SSE2-X64: # %bb.0:
; SSE2-X64-NEXT: movq %r8, 8(%rdi)		; SSE2-X64-NEXT: movq %r8, 8(%rdi)
; SSE2-X64-NEXT: movq %rcx, (%rdi)		; SSE2-X64-NEXT: movq %rcx, (%rdi)
; SSE2-X64-NEXT: retq		; SSE2-X64-NEXT: retq
;		;
; SSE41-X64-LABEL: extract_f128_1:		; SSE41-X64-LABEL: extract_f128_1:
; SSE41-X64: # %bb.0:		; SSE41-X64: # %bb.0:
; SSE41-X64-NEXT: movq %r8, 8(%rdi)		; SSE41-X64-NEXT: movq %r8, 8(%rdi)
; SSE41-X64-NEXT: movq %rcx, (%rdi)		; SSE41-X64-NEXT: movq %rcx, (%rdi)
; SSE41-X64-NEXT: retq		; SSE41-X64-NEXT: retq
;		;
; AVX-X32-LABEL: extract_f128_1:		; AVX-X32-LABEL: extract_f128_1:
; AVX-X32: # %bb.0:		; AVX-X32: # %bb.0:
; AVX-X32-NEXT: vmovups {{[0-9]+}}(%esp), %xmm0		; AVX-X32-NEXT: pushl %ebp
; AVX-X32-NEXT: movl {{[0-9]+}}(%esp), %eax		; AVX-X32-NEXT: movl %esp, %ebp
		; AVX-X32-NEXT: andl $-32, %esp
		; AVX-X32-NEXT: subl $32, %esp
		; AVX-X32-NEXT: vmovups 56(%ebp), %xmm0
		; AVX-X32-NEXT: movl 8(%ebp), %eax
; AVX-X32-NEXT: vmovups %xmm0, (%eax)		; AVX-X32-NEXT: vmovups %xmm0, (%eax)
		; AVX-X32-NEXT: movl %ebp, %esp
		; AVX-X32-NEXT: popl %ebp
; AVX-X32-NEXT: retl		; AVX-X32-NEXT: retl
;		;
; AVX-X64-LABEL: extract_f128_1:		; AVX-X64-LABEL: extract_f128_1:
; AVX-X64: # %bb.0:		; AVX-X64: # %bb.0:
; AVX-X64-NEXT: movq %r8, 8(%rdi)		; AVX-X64-NEXT: movq %r8, 8(%rdi)
; AVX-X64-NEXT: movq %rcx, (%rdi)		; AVX-X64-NEXT: movq %rcx, (%rdi)
; AVX-X64-NEXT: retq		; AVX-X64-NEXT: retq
;		;
... (82 lines hidden) ...	; X64-NEXT: retq
%vecext = extractelement <2 x double> %foo, i32 2 ; undef		%vecext = extractelement <2 x double> %foo, i32 2 ; undef
store double %vecext, double* %dst, align 1		store double %vecext, double* %dst, align 1
ret void		ret void
}		}

define void @extract_f128_undef(fp128* nocapture %dst, <2 x fp128> %foo) nounwind {		define void @extract_f128_undef(fp128* nocapture %dst, <2 x fp128> %foo) nounwind {
; X32-LABEL: extract_f128_undef:		; X32-LABEL: extract_f128_undef:
; X32: # %bb.0:		; X32: # %bb.0:
		; X32-NEXT: pushl %ebp
		; X32-NEXT: movl %esp, %ebp
		; X32-NEXT: andl $-32, %esp
		; X32-NEXT: movl %ebp, %esp
		; X32-NEXT: popl %ebp
; X32-NEXT: retl		; X32-NEXT: retl
;		;
; X64-LABEL: extract_f128_undef:		; X64-LABEL: extract_f128_undef:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: retq		; X64-NEXT: retq
%vecext = extractelement <2 x fp128> %foo, i32 2 ; undef		%vecext = extractelement <2 x fp128> %foo, i32 2 ; undef
store fp128 %vecext, fp128* %dst, align 1		store fp128 %vecext, fp128* %dst, align 1
ret void		ret void
}		}

test/CodeGen/X86/gather-addresses.ll

	... (216 lines hidden) ...
	; WIN-SSE4-NEXT: punpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0]			; WIN-SSE4-NEXT: punpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0]
	; WIN-SSE4-NEXT: movq %rdx, %xmm2			; WIN-SSE4-NEXT: movq %rdx, %xmm2
	; WIN-SSE4-NEXT: movq %r8, %xmm1			; WIN-SSE4-NEXT: movq %r8, %xmm1
	; WIN-SSE4-NEXT: punpcklqdq {{.*#+}} xmm1 = xmm1[0],xmm2[0]			; WIN-SSE4-NEXT: punpcklqdq {{.*#+}} xmm1 = xmm1[0],xmm2[0]
	; WIN-SSE4-NEXT: retq			; WIN-SSE4-NEXT: retq
	;			;
	; LIN32-LABEL: old:			; LIN32-LABEL: old:
	; LIN32: # %bb.0:			; LIN32: # %bb.0:
				; LIN32-NEXT: pushl %ebp
				; LIN32-NEXT: movl %esp, %ebp
	; LIN32-NEXT: pushl %edi			; LIN32-NEXT: pushl %edi
	; LIN32-NEXT: pushl %esi			; LIN32-NEXT: pushl %esi
	; LIN32-NEXT: movl {{[0-9]+}}(%esp), %eax			; LIN32-NEXT: andl $-8, %esp
	; LIN32-NEXT: movl {{[0-9]+}}(%esp), %ecx			; LIN32-NEXT: movl 24(%ebp), %eax
	; LIN32-NEXT: movl {{[0-9]+}}(%esp), %edx			; LIN32-NEXT: movl 16(%ebp), %ecx
				; LIN32-NEXT: movl 12(%ebp), %edx
	; LIN32-NEXT: movdqa (%edx), %xmm0			; LIN32-NEXT: movdqa (%edx), %xmm0
	; LIN32-NEXT: pand (%ecx), %xmm0			; LIN32-NEXT: pand (%ecx), %xmm0
	; LIN32-NEXT: movd %xmm0, %ecx			; LIN32-NEXT: movd %xmm0, %ecx
	; LIN32-NEXT: pextrd $1, %xmm0, %edx			; LIN32-NEXT: pextrd $1, %xmm0, %edx
	; LIN32-NEXT: pextrd $2, %xmm0, %esi			; LIN32-NEXT: pextrd $2, %xmm0, %esi
	; LIN32-NEXT: pextrd $3, %xmm0, %edi			; LIN32-NEXT: pextrd $3, %xmm0, %edi
	; LIN32-NEXT: andl %eax, %ecx			; LIN32-NEXT: andl %eax, %ecx
	; LIN32-NEXT: andl %eax, %edx			; LIN32-NEXT: andl %eax, %edx
	; LIN32-NEXT: andl %eax, %esi			; LIN32-NEXT: andl %eax, %esi
	; LIN32-NEXT: andl %eax, %edi			; LIN32-NEXT: andl %eax, %edi
	; LIN32-NEXT: movd %edx, %xmm1			; LIN32-NEXT: movd %edx, %xmm1
	; LIN32-NEXT: movd %ecx, %xmm0			; LIN32-NEXT: movd %ecx, %xmm0
	; LIN32-NEXT: punpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0]			; LIN32-NEXT: punpcklqdq {{.*#+}} xmm0 = xmm0[0],xmm1[0]
	; LIN32-NEXT: movd %edi, %xmm2			; LIN32-NEXT: movd %edi, %xmm2
	; LIN32-NEXT: movd %esi, %xmm1			; LIN32-NEXT: movd %esi, %xmm1
	; LIN32-NEXT: punpcklqdq {{.*#+}} xmm1 = xmm1[0],xmm2[0]			; LIN32-NEXT: punpcklqdq {{.*#+}} xmm1 = xmm1[0],xmm2[0]
				; LIN32-NEXT: leal -8(%ebp), %esp
	; LIN32-NEXT: popl %esi			; LIN32-NEXT: popl %esi
	; LIN32-NEXT: popl %edi			; LIN32-NEXT: popl %edi
				; LIN32-NEXT: popl %ebp
	; LIN32-NEXT: retl			; LIN32-NEXT: retl
	%a = load <4 x i32>, <4 x i32>* %i			%a = load <4 x i32>, <4 x i32>* %i
	%b = load <4 x i32>, <4 x i32>* %h			%b = load <4 x i32>, <4 x i32>* %h
	%j = and <4 x i32> %a, %b			%j = and <4 x i32> %a, %b
	%d0 = extractelement <4 x i32> %j, i32 0			%d0 = extractelement <4 x i32> %j, i32 0
	%d1 = extractelement <4 x i32> %j, i32 1			%d1 = extractelement <4 x i32> %j, i32 1
	%d2 = extractelement <4 x i32> %j, i32 2			%d2 = extractelement <4 x i32> %j, i32 2
	%d3 = extractelement <4 x i32> %j, i32 3			%d3 = extractelement <4 x i32> %j, i32 3
	... (14 lines hidden) ...

test/CodeGen/X86/legalize-shift-64.ll

	... (71 lines hidden) ...
	}			}

	; PR14668			; PR14668
	define <2 x i64> @test5(<2 x i64> %A, <2 x i64> %B) {			define <2 x i64> @test5(<2 x i64> %A, <2 x i64> %B) {
	; CHECK-LABEL: test5:			; CHECK-LABEL: test5:
	; CHECK: # %bb.0:			; CHECK: # %bb.0:
	; CHECK-NEXT: pushl %ebp			; CHECK-NEXT: pushl %ebp
	; CHECK-NEXT: .cfi_def_cfa_offset 8			; CHECK-NEXT: .cfi_def_cfa_offset 8
				; CHECK-NEXT: .cfi_offset %ebp, -8
				; CHECK-NEXT: movl %esp, %ebp
				; CHECK-NEXT: .cfi_def_cfa_register %ebp
	; CHECK-NEXT: pushl %ebx			; CHECK-NEXT: pushl %ebx
	; CHECK-NEXT: .cfi_def_cfa_offset 12
	; CHECK-NEXT: pushl %edi			; CHECK-NEXT: pushl %edi
	; CHECK-NEXT: .cfi_def_cfa_offset 16
	; CHECK-NEXT: pushl %esi			; CHECK-NEXT: pushl %esi
	; CHECK-NEXT: .cfi_def_cfa_offset 20			; CHECK-NEXT: andl $-16, %esp
				; CHECK-NEXT: subl $16, %esp
	; CHECK-NEXT: .cfi_offset %esi, -20			; CHECK-NEXT: .cfi_offset %esi, -20
	; CHECK-NEXT: .cfi_offset %edi, -16			; CHECK-NEXT: .cfi_offset %edi, -16
	; CHECK-NEXT: .cfi_offset %ebx, -12			; CHECK-NEXT: .cfi_offset %ebx, -12
	; CHECK-NEXT: .cfi_offset %ebp, -8			; CHECK-NEXT: movb 48(%ebp), %ch
	; CHECK-NEXT: movl {{[0-9]+}}(%esp), %eax			; CHECK-NEXT: movl 32(%ebp), %eax
	; CHECK-NEXT: movb {{[0-9]+}}(%esp), %ch			; CHECK-NEXT: movb 40(%ebp), %cl
	; CHECK-NEXT: movl {{[0-9]+}}(%esp), %edx			; CHECK-NEXT: movl 24(%ebp), %edx
	; CHECK-NEXT: movb {{[0-9]+}}(%esp), %cl			; CHECK-NEXT: movl %edx, %esi
	; CHECK-NEXT: movl {{[0-9]+}}(%esp), %ebx			; CHECK-NEXT: shll %cl, %esi
	; CHECK-NEXT: movl {{[0-9]+}}(%esp), %esi			; CHECK-NEXT: movl 28(%ebp), %edi
	; CHECK-NEXT: movl %ebx, %edi			; CHECK-NEXT: shldl %cl, %edx, %edi
	; CHECK-NEXT: shll %cl, %edi
	; CHECK-NEXT: shldl %cl, %ebx, %esi
	; CHECK-NEXT: testb $32, %cl			; CHECK-NEXT: testb $32, %cl
	; CHECK-NEXT: movl {{[0-9]+}}(%esp), %ebp			; CHECK-NEXT: movl 36(%ebp), %edx
	; CHECK-NEXT: je .LBB4_2			; CHECK-NEXT: je .LBB4_2
	; CHECK-NEXT: # %bb.1:			; CHECK-NEXT: # %bb.1:
	; CHECK-NEXT: movl %edi, %esi			; CHECK-NEXT: movl %esi, %edi
	; CHECK-NEXT: xorl %edi, %edi			; CHECK-NEXT: xorl %esi, %esi
	; CHECK-NEXT: .LBB4_2:			; CHECK-NEXT: .LBB4_2:
	; CHECK-NEXT: movl %edx, %ebx			; CHECK-NEXT: movl %eax, %ebx
	; CHECK-NEXT: movb %ch, %cl			; CHECK-NEXT: movb %ch, %cl
	; CHECK-NEXT: shll %cl, %ebx			; CHECK-NEXT: shll %cl, %ebx
	; CHECK-NEXT: shldl %cl, %edx, %ebp			; CHECK-NEXT: shldl %cl, %eax, %edx
	; CHECK-NEXT: testb $32, %ch			; CHECK-NEXT: testb $32, %ch
	; CHECK-NEXT: je .LBB4_4			; CHECK-NEXT: je .LBB4_4
	; CHECK-NEXT: # %bb.3:			; CHECK-NEXT: # %bb.3:
	; CHECK-NEXT: movl %ebx, %ebp			; CHECK-NEXT: movl %ebx, %edx
	; CHECK-NEXT: xorl %ebx, %ebx			; CHECK-NEXT: xorl %ebx, %ebx
	; CHECK-NEXT: .LBB4_4:			; CHECK-NEXT: .LBB4_4:
	; CHECK-NEXT: movl %ebp, 12(%eax)			; CHECK-NEXT: movl 8(%ebp), %eax
				; CHECK-NEXT: movl %edx, 12(%eax)
	; CHECK-NEXT: movl %ebx, 8(%eax)			; CHECK-NEXT: movl %ebx, 8(%eax)
	; CHECK-NEXT: movl %esi, 4(%eax)			; CHECK-NEXT: movl %edi, 4(%eax)
	; CHECK-NEXT: movl %edi, (%eax)			; CHECK-NEXT: movl %esi, (%eax)
				; CHECK-NEXT: leal -12(%ebp), %esp
	; CHECK-NEXT: popl %esi			; CHECK-NEXT: popl %esi
	; CHECK-NEXT: .cfi_def_cfa_offset 16
	; CHECK-NEXT: popl %edi			; CHECK-NEXT: popl %edi
	; CHECK-NEXT: .cfi_def_cfa_offset 12
	; CHECK-NEXT: popl %ebx			; CHECK-NEXT: popl %ebx
	; CHECK-NEXT: .cfi_def_cfa_offset 8
	; CHECK-NEXT: popl %ebp			; CHECK-NEXT: popl %ebp
	; CHECK-NEXT: .cfi_def_cfa_offset 4			; CHECK-NEXT: .cfi_def_cfa %esp, 4
	; CHECK-NEXT: retl $4			; CHECK-NEXT: retl $4
	%shl = shl <2 x i64> %A, %B			%shl = shl <2 x i64> %A, %B
	ret <2 x i64> %shl			ret <2 x i64> %shl
	}			}

	; PR16108			; PR16108
	define i32 @test6() {			define i32 @test6() {
	; CHECK-LABEL: test6:			; CHECK-LABEL: test6:

test/CodeGen/X86/legalize-shl-vec.ll

; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py		; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
; RUN: llc < %s -mtriple=i686-unknown-unknown | FileCheck %s --check-prefix=CHECK --check-prefix=X32		; RUN: llc < %s -mtriple=i686-unknown-unknown | FileCheck %s --check-prefix=CHECK --check-prefix=X32
; RUN: llc < %s -mtriple=x86_64-unknown-unknown | FileCheck %s --check-prefix=CHECK --check-prefix=X64		; RUN: llc < %s -mtriple=x86_64-unknown-unknown | FileCheck %s --check-prefix=CHECK --check-prefix=X64

define <2 x i256> @test_shl(<2 x i256> %In) {		define <2 x i256> @test_shl(<2 x i256> %In) {
; X32-LABEL: test_shl:		; X32-LABEL: test_shl:
; X32: # %bb.0:		; X32: # %bb.0:
; X32-NEXT: movl {{[0-9]+}}(%esp), %eax		; X32-NEXT: pushl %ebp
; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx		; X32-NEXT: .cfi_def_cfa_offset 8
; X32-NEXT: movl {{[0-9]+}}(%esp), %edx		; X32-NEXT: .cfi_offset %ebp, -8
		; X32-NEXT: movl %esp, %ebp
		; X32-NEXT: .cfi_def_cfa_register %ebp
		; X32-NEXT: andl $-64, %esp
		; X32-NEXT: subl $64, %esp
		; X32-NEXT: movl 8(%ebp), %eax
		; X32-NEXT: movl 132(%ebp), %ecx
		; X32-NEXT: movl 128(%ebp), %edx
; X32-NEXT: shldl $2, %edx, %ecx		; X32-NEXT: shldl $2, %edx, %ecx
; X32-NEXT: movl %ecx, 60(%eax)		; X32-NEXT: movl %ecx, 60(%eax)
; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx		; X32-NEXT: movl 124(%ebp), %ecx
; X32-NEXT: shldl $2, %ecx, %edx		; X32-NEXT: shldl $2, %ecx, %edx
; X32-NEXT: movl %edx, 56(%eax)		; X32-NEXT: movl %edx, 56(%eax)
; X32-NEXT: movl {{[0-9]+}}(%esp), %edx		; X32-NEXT: movl 120(%ebp), %edx
; X32-NEXT: shldl $2, %edx, %ecx		; X32-NEXT: shldl $2, %edx, %ecx
; X32-NEXT: movl %ecx, 52(%eax)		; X32-NEXT: movl %ecx, 52(%eax)
; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx		; X32-NEXT: movl 116(%ebp), %ecx
; X32-NEXT: shldl $2, %ecx, %edx		; X32-NEXT: shldl $2, %ecx, %edx
; X32-NEXT: movl %edx, 48(%eax)		; X32-NEXT: movl %edx, 48(%eax)
; X32-NEXT: movl {{[0-9]+}}(%esp), %edx		; X32-NEXT: movl 112(%ebp), %edx
; X32-NEXT: shldl $2, %edx, %ecx		; X32-NEXT: shldl $2, %edx, %ecx
; X32-NEXT: movl %ecx, 44(%eax)		; X32-NEXT: movl %ecx, 44(%eax)
; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx		; X32-NEXT: movl 108(%ebp), %ecx
; X32-NEXT: shldl $2, %ecx, %edx		; X32-NEXT: shldl $2, %ecx, %edx
; X32-NEXT: movl %edx, 40(%eax)		; X32-NEXT: movl %edx, 40(%eax)
; X32-NEXT: movl {{[0-9]+}}(%esp), %edx		; X32-NEXT: movl 104(%ebp), %edx
; X32-NEXT: shldl $2, %edx, %ecx		; X32-NEXT: shldl $2, %edx, %ecx
; X32-NEXT: movl %ecx, 36(%eax)		; X32-NEXT: movl %ecx, 36(%eax)
; X32-NEXT: shll $2, %edx		; X32-NEXT: shll $2, %edx
; X32-NEXT: movl %edx, 32(%eax)		; X32-NEXT: movl %edx, 32(%eax)
; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx		; X32-NEXT: movl 72(%ebp), %ecx
; X32-NEXT: shll $31, %ecx		; X32-NEXT: shll $31, %ecx
; X32-NEXT: movl %ecx, 28(%eax)		; X32-NEXT: movl %ecx, 28(%eax)
; X32-NEXT: movl $0, 24(%eax)		; X32-NEXT: movl $0, 24(%eax)
; X32-NEXT: movl $0, 20(%eax)		; X32-NEXT: movl $0, 20(%eax)
; X32-NEXT: movl $0, 16(%eax)		; X32-NEXT: movl $0, 16(%eax)
; X32-NEXT: movl $0, 12(%eax)		; X32-NEXT: movl $0, 12(%eax)
; X32-NEXT: movl $0, 8(%eax)		; X32-NEXT: movl $0, 8(%eax)
; X32-NEXT: movl $0, 4(%eax)		; X32-NEXT: movl $0, 4(%eax)
; X32-NEXT: movl $0, (%eax)		; X32-NEXT: movl $0, (%eax)
		; X32-NEXT: movl %ebp, %esp
		; X32-NEXT: popl %ebp
		; X32-NEXT: .cfi_def_cfa %esp, 4
; X32-NEXT: retl $4		; X32-NEXT: retl $4
;		;
; X64-LABEL: test_shl:		; X64-LABEL: test_shl:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: movq %rdi, %rax		; X64-NEXT: movq %rdi, %rax
; X64-NEXT: movq {{[0-9]+}}(%rsp), %rcx		; X64-NEXT: movq {{[0-9]+}}(%rsp), %rcx
; X64-NEXT: movq {{[0-9]+}}(%rsp), %rdx		; X64-NEXT: movq {{[0-9]+}}(%rsp), %rdx
; X64-NEXT: movq {{[0-9]+}}(%rsp), %rdi		; X64-NEXT: movq {{[0-9]+}}(%rsp), %rdi
; X64-NEXT: retq		; X64-NEXT: retq
ret <2 x i256> %Out		ret <2 x i256> %Out
}		}

define <2 x i256> @test_srl(<2 x i256> %In) {		define <2 x i256> @test_srl(<2 x i256> %In) {
; X32-LABEL: test_srl:		; X32-LABEL: test_srl:
; X32: # %bb.0:		; X32: # %bb.0:
; X32-NEXT: pushl %ebp		; X32-NEXT: pushl %ebp
; X32-NEXT: .cfi_def_cfa_offset 8		; X32-NEXT: .cfi_def_cfa_offset 8
		; X32-NEXT: .cfi_offset %ebp, -8
		; X32-NEXT: movl %esp, %ebp
		; X32-NEXT: .cfi_def_cfa_register %ebp
; X32-NEXT: pushl %ebx		; X32-NEXT: pushl %ebx
; X32-NEXT: .cfi_def_cfa_offset 12
; X32-NEXT: pushl %edi		; X32-NEXT: pushl %edi
; X32-NEXT: .cfi_def_cfa_offset 16
; X32-NEXT: pushl %esi		; X32-NEXT: pushl %esi
; X32-NEXT: .cfi_def_cfa_offset 20		; X32-NEXT: andl $-64, %esp
; X32-NEXT: subl $8, %esp		; X32-NEXT: subl $64, %esp
; X32-NEXT: .cfi_def_cfa_offset 28
; X32-NEXT: .cfi_offset %esi, -20		; X32-NEXT: .cfi_offset %esi, -20
; X32-NEXT: .cfi_offset %edi, -16		; X32-NEXT: .cfi_offset %edi, -16
; X32-NEXT: .cfi_offset %ebx, -12		; X32-NEXT: .cfi_offset %ebx, -12
; X32-NEXT: .cfi_offset %ebp, -8		; X32-NEXT: movl 132(%ebp), %ebx
; X32-NEXT: movl {{[0-9]+}}(%esp), %edx		; X32-NEXT: movl 128(%ebp), %ecx
; X32-NEXT: movl {{[0-9]+}}(%esp), %eax		; X32-NEXT: movl 124(%ebp), %eax
; X32-NEXT: movl {{[0-9]+}}(%esp), %esi		; X32-NEXT: movl 120(%ebp), %edi
; X32-NEXT: movl {{[0-9]+}}(%esp), %edi		; X32-NEXT: movl %ebx, %edx
; X32-NEXT: movl {{[0-9]+}}(%esp), %ebx		; X32-NEXT: shldl $28, %ecx, %edx
; X32-NEXT: movl {{[0-9]+}}(%esp), %ebp		; X32-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
; X32-NEXT: movl %edx, %ecx
; X32-NEXT: shldl $28, %eax, %ecx		; X32-NEXT: shldl $28, %eax, %ecx
; X32-NEXT: movl %ecx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill		; X32-NEXT: movl %ecx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
; X32-NEXT: shldl $28, %esi, %eax		; X32-NEXT: shldl $28, %edi, %eax
; X32-NEXT: movl %eax, (%esp) # 4-byte Spill		; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
; X32-NEXT: shldl $28, %edi, %esi		; X32-NEXT: movl 116(%ebp), %edx
; X32-NEXT: shldl $28, %ebx, %edi		; X32-NEXT: shldl $28, %edx, %edi
; X32-NEXT: shldl $28, %ebp, %ebx		; X32-NEXT: movl 112(%ebp), %ecx
; X32-NEXT: movl {{[0-9]+}}(%esp), %eax		; X32-NEXT: shldl $28, %ecx, %edx
; X32-NEXT: shldl $28, %eax, %ebp		; X32-NEXT: movl 108(%ebp), %eax
; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx		; X32-NEXT: shldl $28, %eax, %ecx
; X32-NEXT: shrdl $4, %eax, %ecx		; X32-NEXT: movl 104(%ebp), %esi
; X32-NEXT: shrl $4, %edx		; X32-NEXT: shrdl $4, %eax, %esi
; X32-NEXT: movl {{[0-9]+}}(%esp), %eax		; X32-NEXT: shrl $4, %ebx
; X32-NEXT: movl %edx, 60(%eax)		; X32-NEXT: movl 8(%ebp), %eax
; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %edx # 4-byte Reload		; X32-NEXT: movl %ebx, 60(%eax)
; X32-NEXT: movl %edx, 56(%eax)		; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ebx # 4-byte Reload
; X32-NEXT: movl (%esp), %edx # 4-byte Reload		; X32-NEXT: movl %ebx, 56(%eax)
; X32-NEXT: movl %edx, 52(%eax)		; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ebx # 4-byte Reload
; X32-NEXT: movl %esi, 48(%eax)		; X32-NEXT: movl %ebx, 52(%eax)
		; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ebx # 4-byte Reload
		; X32-NEXT: movl %ebx, 48(%eax)
; X32-NEXT: movl %edi, 44(%eax)		; X32-NEXT: movl %edi, 44(%eax)
; X32-NEXT: movl %ebx, 40(%eax)		; X32-NEXT: movl %edx, 40(%eax)
; X32-NEXT: movl %ebp, 36(%eax)		; X32-NEXT: movl %ecx, 36(%eax)
; X32-NEXT: movl %ecx, 32(%eax)		; X32-NEXT: movl %esi, 32(%eax)
; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx		; X32-NEXT: movl 100(%ebp), %ecx
; X32-NEXT: shrl $31, %ecx		; X32-NEXT: shrl $31, %ecx
; X32-NEXT: movl %ecx, (%eax)		; X32-NEXT: movl %ecx, (%eax)
; X32-NEXT: movl $0, 28(%eax)		; X32-NEXT: movl $0, 28(%eax)
; X32-NEXT: movl $0, 24(%eax)		; X32-NEXT: movl $0, 24(%eax)
; X32-NEXT: movl $0, 20(%eax)		; X32-NEXT: movl $0, 20(%eax)
; X32-NEXT: movl $0, 16(%eax)		; X32-NEXT: movl $0, 16(%eax)
; X32-NEXT: movl $0, 12(%eax)		; X32-NEXT: movl $0, 12(%eax)
; X32-NEXT: movl $0, 8(%eax)		; X32-NEXT: movl $0, 8(%eax)
; X32-NEXT: movl $0, 4(%eax)		; X32-NEXT: movl $0, 4(%eax)
; X32-NEXT: addl $8, %esp		; X32-NEXT: leal -12(%ebp), %esp
; X32-NEXT: .cfi_def_cfa_offset 20
; X32-NEXT: popl %esi		; X32-NEXT: popl %esi
; X32-NEXT: .cfi_def_cfa_offset 16
; X32-NEXT: popl %edi		; X32-NEXT: popl %edi
; X32-NEXT: .cfi_def_cfa_offset 12
; X32-NEXT: popl %ebx		; X32-NEXT: popl %ebx
; X32-NEXT: .cfi_def_cfa_offset 8
; X32-NEXT: popl %ebp		; X32-NEXT: popl %ebp
; X32-NEXT: .cfi_def_cfa_offset 4		; X32-NEXT: .cfi_def_cfa %esp, 4
; X32-NEXT: retl $4		; X32-NEXT: retl $4
;		;
; X64-LABEL: test_srl:		; X64-LABEL: test_srl:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: movq %rdi, %rax		; X64-NEXT: movq %rdi, %rax
; X64-NEXT: movq {{[0-9]+}}(%rsp), %rcx		; X64-NEXT: movq {{[0-9]+}}(%rsp), %rcx
; X64-NEXT: movq {{[0-9]+}}(%rsp), %rdx		; X64-NEXT: movq {{[0-9]+}}(%rsp), %rdx
; X64-NEXT: movq {{[0-9]+}}(%rsp), %rsi		; X64-NEXT: movq {{[0-9]+}}(%rsp), %rsi
; X64-NEXT: retq		; X64-NEXT: retq
ret <2 x i256> %Out		ret <2 x i256> %Out
}		}

define <2 x i256> @test_sra(<2 x i256> %In) {		define <2 x i256> @test_sra(<2 x i256> %In) {
; X32-LABEL: test_sra:		; X32-LABEL: test_sra:
; X32: # %bb.0:		; X32: # %bb.0:
; X32-NEXT: pushl %ebp		; X32-NEXT: pushl %ebp
; X32-NEXT: .cfi_def_cfa_offset 8		; X32-NEXT: .cfi_def_cfa_offset 8
		; X32-NEXT: .cfi_offset %ebp, -8
		; X32-NEXT: movl %esp, %ebp
		; X32-NEXT: .cfi_def_cfa_register %ebp
; X32-NEXT: pushl %ebx		; X32-NEXT: pushl %ebx
; X32-NEXT: .cfi_def_cfa_offset 12
; X32-NEXT: pushl %edi		; X32-NEXT: pushl %edi
; X32-NEXT: .cfi_def_cfa_offset 16
; X32-NEXT: pushl %esi		; X32-NEXT: pushl %esi
; X32-NEXT: .cfi_def_cfa_offset 20		; X32-NEXT: andl $-64, %esp
; X32-NEXT: subl $8, %esp		; X32-NEXT: subl $64, %esp
; X32-NEXT: .cfi_def_cfa_offset 28
; X32-NEXT: .cfi_offset %esi, -20		; X32-NEXT: .cfi_offset %esi, -20
; X32-NEXT: .cfi_offset %edi, -16		; X32-NEXT: .cfi_offset %edi, -16
; X32-NEXT: .cfi_offset %ebx, -12		; X32-NEXT: .cfi_offset %ebx, -12
; X32-NEXT: .cfi_offset %ebp, -8		; X32-NEXT: movl 132(%ebp), %ebx
; X32-NEXT: movl {{[0-9]+}}(%esp), %edx		; X32-NEXT: movl 128(%ebp), %ecx
; X32-NEXT: movl {{[0-9]+}}(%esp), %eax		; X32-NEXT: movl 124(%ebp), %eax
; X32-NEXT: movl {{[0-9]+}}(%esp), %esi		; X32-NEXT: movl 120(%ebp), %edi
; X32-NEXT: movl {{[0-9]+}}(%esp), %edi		; X32-NEXT: movl %ebx, %edx
; X32-NEXT: movl {{[0-9]+}}(%esp), %ebx		; X32-NEXT: shldl $26, %ecx, %edx
; X32-NEXT: movl {{[0-9]+}}(%esp), %ebp		; X32-NEXT: movl %edx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
; X32-NEXT: movl %edx, %ecx
; X32-NEXT: shldl $26, %eax, %ecx		; X32-NEXT: shldl $26, %eax, %ecx
; X32-NEXT: movl %ecx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill		; X32-NEXT: movl %ecx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
; X32-NEXT: shldl $26, %esi, %eax		; X32-NEXT: shldl $26, %edi, %eax
; X32-NEXT: movl %eax, (%esp) # 4-byte Spill		; X32-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
; X32-NEXT: shldl $26, %edi, %esi		; X32-NEXT: movl 116(%ebp), %edx
; X32-NEXT: shldl $26, %ebx, %edi		; X32-NEXT: shldl $26, %edx, %edi
; X32-NEXT: shldl $26, %ebp, %ebx		; X32-NEXT: movl 112(%ebp), %ecx
; X32-NEXT: movl {{[0-9]+}}(%esp), %eax		; X32-NEXT: shldl $26, %ecx, %edx
; X32-NEXT: shldl $26, %eax, %ebp		; X32-NEXT: movl 108(%ebp), %eax
; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx		; X32-NEXT: shldl $26, %eax, %ecx
; X32-NEXT: shrdl $6, %eax, %ecx		; X32-NEXT: movl 104(%ebp), %esi
; X32-NEXT: sarl $6, %edx		; X32-NEXT: shrdl $6, %eax, %esi
; X32-NEXT: movl {{[0-9]+}}(%esp), %eax		; X32-NEXT: sarl $6, %ebx
; X32-NEXT: movl %edx, 60(%eax)		; X32-NEXT: movl 8(%ebp), %eax
; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %edx # 4-byte Reload		; X32-NEXT: movl %ebx, 60(%eax)
; X32-NEXT: movl %edx, 56(%eax)		; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ebx # 4-byte Reload
; X32-NEXT: movl (%esp), %edx # 4-byte Reload		; X32-NEXT: movl %ebx, 56(%eax)
; X32-NEXT: movl %edx, 52(%eax)		; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ebx # 4-byte Reload
; X32-NEXT: movl %esi, 48(%eax)		; X32-NEXT: movl %ebx, 52(%eax)
		; X32-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ebx # 4-byte Reload
		; X32-NEXT: movl %ebx, 48(%eax)
; X32-NEXT: movl %edi, 44(%eax)		; X32-NEXT: movl %edi, 44(%eax)
; X32-NEXT: movl %ebx, 40(%eax)		; X32-NEXT: movl %edx, 40(%eax)
; X32-NEXT: movl %ebp, 36(%eax)		; X32-NEXT: movl %ecx, 36(%eax)
; X32-NEXT: movl %ecx, 32(%eax)		; X32-NEXT: movl %esi, 32(%eax)
; X32-NEXT: movl {{[0-9]+}}(%esp), %ecx		; X32-NEXT: movl 100(%ebp), %ecx
; X32-NEXT: sarl $31, %ecx		; X32-NEXT: sarl $31, %ecx
; X32-NEXT: movl %ecx, 28(%eax)		; X32-NEXT: movl %ecx, 28(%eax)
; X32-NEXT: movl %ecx, 24(%eax)		; X32-NEXT: movl %ecx, 24(%eax)
; X32-NEXT: movl %ecx, 20(%eax)		; X32-NEXT: movl %ecx, 20(%eax)
; X32-NEXT: movl %ecx, 16(%eax)		; X32-NEXT: movl %ecx, 16(%eax)
; X32-NEXT: movl %ecx, 12(%eax)		; X32-NEXT: movl %ecx, 12(%eax)
; X32-NEXT: movl %ecx, 8(%eax)		; X32-NEXT: movl %ecx, 8(%eax)
; X32-NEXT: movl %ecx, 4(%eax)		; X32-NEXT: movl %ecx, 4(%eax)
; X32-NEXT: movl %ecx, (%eax)		; X32-NEXT: movl %ecx, (%eax)
; X32-NEXT: addl $8, %esp		; X32-NEXT: leal -12(%ebp), %esp
; X32-NEXT: .cfi_def_cfa_offset 20
; X32-NEXT: popl %esi		; X32-NEXT: popl %esi
; X32-NEXT: .cfi_def_cfa_offset 16
; X32-NEXT: popl %edi		; X32-NEXT: popl %edi
; X32-NEXT: .cfi_def_cfa_offset 12
; X32-NEXT: popl %ebx		; X32-NEXT: popl %ebx
; X32-NEXT: .cfi_def_cfa_offset 8
; X32-NEXT: popl %ebp		; X32-NEXT: popl %ebp
; X32-NEXT: .cfi_def_cfa_offset 4		; X32-NEXT: .cfi_def_cfa %esp, 4
; X32-NEXT: retl $4		; X32-NEXT: retl $4
;		;
; X64-LABEL: test_sra:		; X64-LABEL: test_sra:
; X64: # %bb.0:		; X64: # %bb.0:
; X64-NEXT: movq %rdi, %rax		; X64-NEXT: movq %rdi, %rax
; X64-NEXT: movq {{[0-9]+}}(%rsp), %rcx		; X64-NEXT: movq {{[0-9]+}}(%rsp), %rcx
; X64-NEXT: movq {{[0-9]+}}(%rsp), %rdx		; X64-NEXT: movq {{[0-9]+}}(%rsp), %rdx
; X64-NEXT: movq {{[0-9]+}}(%rsp), %rsi		; X64-NEXT: movq {{[0-9]+}}(%rsp), %rsi

test/CodeGen/X86/masked_gather_scatter.ll

	; KNL_64-NEXT: vpunpcklqdq {{.*#+}} xmm0 = xmm2[0],xmm0[0]			; KNL_64-NEXT: vpunpcklqdq {{.*#+}} xmm0 = xmm2[0],xmm0[0]
	; KNL_64-NEXT: vgatherqps (%rdi,%zmm0,4), %ymm1 {%k1}			; KNL_64-NEXT: vgatherqps (%rdi,%zmm0,4), %ymm1 {%k1}
	; KNL_64-NEXT: vmovaps %xmm1, %xmm0			; KNL_64-NEXT: vmovaps %xmm1, %xmm0
	; KNL_64-NEXT: vzeroupper			; KNL_64-NEXT: vzeroupper
	; KNL_64-NEXT: retq			; KNL_64-NEXT: retq
	;			;
	; KNL_32-LABEL: large_index:			; KNL_32-LABEL: large_index:
	; KNL_32: # %bb.0:			; KNL_32: # %bb.0:
				; KNL_32-NEXT: pushl %ebp
				; KNL_32-NEXT: .cfi_def_cfa_offset 8
				; KNL_32-NEXT: .cfi_offset %ebp, -8
				; KNL_32-NEXT: movl %esp, %ebp
				; KNL_32-NEXT: .cfi_def_cfa_register %ebp
				; KNL_32-NEXT: andl $-32, %esp
				; KNL_32-NEXT: subl $32, %esp
	; KNL_32-NEXT: # kill: def $xmm1 killed $xmm1 def $ymm1			; KNL_32-NEXT: # kill: def $xmm1 killed $xmm1 def $ymm1
	; KNL_32-NEXT: vpsllq $63, %xmm0, %xmm0			; KNL_32-NEXT: vpsllq $63, %xmm0, %xmm0
	; KNL_32-NEXT: vptestmq %zmm0, %zmm0, %k0			; KNL_32-NEXT: vptestmq %zmm0, %zmm0, %k0
	; KNL_32-NEXT: kshiftlw $14, %k0, %k0			; KNL_32-NEXT: kshiftlw $14, %k0, %k0
	; KNL_32-NEXT: kshiftrw $14, %k0, %k1			; KNL_32-NEXT: kshiftrw $14, %k0, %k1
	; KNL_32-NEXT: movl {{[0-9]+}}(%esp), %eax			; KNL_32-NEXT: movl 8(%ebp), %eax
	; KNL_32-NEXT: vmovd {{.*#+}} xmm0 = mem[0],zero,zero,zero			; KNL_32-NEXT: vmovd {{.*#+}} xmm0 = mem[0],zero,zero,zero
	; KNL_32-NEXT: vpinsrd $1, {{[0-9]+}}(%esp), %xmm0, %xmm0			; KNL_32-NEXT: vpinsrd $1, 44(%ebp), %xmm0, %xmm0
	; KNL_32-NEXT: vpinsrd $2, {{[0-9]+}}(%esp), %xmm0, %xmm0			; KNL_32-NEXT: vpinsrd $2, 56(%ebp), %xmm0, %xmm0
	; KNL_32-NEXT: vpinsrd $3, {{[0-9]+}}(%esp), %xmm0, %xmm0			; KNL_32-NEXT: vpinsrd $3, 60(%ebp), %xmm0, %xmm0
	; KNL_32-NEXT: vgatherqps (%eax,%zmm0,4), %ymm1 {%k1}			; KNL_32-NEXT: vgatherqps (%eax,%zmm0,4), %ymm1 {%k1}
	; KNL_32-NEXT: vmovaps %xmm1, %xmm0			; KNL_32-NEXT: vmovaps %xmm1, %xmm0
				; KNL_32-NEXT: movl %ebp, %esp
				; KNL_32-NEXT: popl %ebp
				; KNL_32-NEXT: .cfi_def_cfa %esp, 4
	; KNL_32-NEXT: vzeroupper			; KNL_32-NEXT: vzeroupper
	; KNL_32-NEXT: retl			; KNL_32-NEXT: retl
	;			;
	; SKX-LABEL: large_index:			; SKX-LABEL: large_index:
	; SKX: # %bb.0:			; SKX: # %bb.0:
	; SKX-NEXT: vpsllq $63, %xmm0, %xmm0			; SKX-NEXT: vpsllq $63, %xmm0, %xmm0
	; SKX-NEXT: vpmovq2m %xmm0, %k1			; SKX-NEXT: vpmovq2m %xmm0, %k1
	; SKX-NEXT: vmovq %rcx, %xmm0			; SKX-NEXT: vmovq %rcx, %xmm0
	; SKX-NEXT: vmovq %rsi, %xmm2			; SKX-NEXT: vmovq %rsi, %xmm2
	; SKX-NEXT: vpunpcklqdq {{.*#+}} xmm0 = xmm2[0],xmm0[0]			; SKX-NEXT: vpunpcklqdq {{.*#+}} xmm0 = xmm2[0],xmm0[0]
	; SKX-NEXT: vgatherqps (%rdi,%xmm0,4), %xmm1 {%k1}			; SKX-NEXT: vgatherqps (%rdi,%xmm0,4), %xmm1 {%k1}
	; SKX-NEXT: vmovaps %xmm1, %xmm0			; SKX-NEXT: vmovaps %xmm1, %xmm0
	; SKX-NEXT: retq			; SKX-NEXT: retq
	;			;
	; SKX_32-LABEL: large_index:			; SKX_32-LABEL: large_index:
	; SKX_32: # %bb.0:			; SKX_32: # %bb.0:
				; SKX_32-NEXT: pushl %ebp
				; SKX_32-NEXT: .cfi_def_cfa_offset 8
				; SKX_32-NEXT: .cfi_offset %ebp, -8
				; SKX_32-NEXT: movl %esp, %ebp
				; SKX_32-NEXT: .cfi_def_cfa_register %ebp
				; SKX_32-NEXT: andl $-32, %esp
				; SKX_32-NEXT: subl $32, %esp
	; SKX_32-NEXT: vpsllq $63, %xmm0, %xmm0			; SKX_32-NEXT: vpsllq $63, %xmm0, %xmm0
	; SKX_32-NEXT: vpmovq2m %xmm0, %k1			; SKX_32-NEXT: vpmovq2m %xmm0, %k1
	; SKX_32-NEXT: movl {{[0-9]+}}(%esp), %eax			; SKX_32-NEXT: movl 8(%ebp), %eax
	; SKX_32-NEXT: vmovd {{.*#+}} xmm0 = mem[0],zero,zero,zero			; SKX_32-NEXT: vmovd {{.*#+}} xmm0 = mem[0],zero,zero,zero
	; SKX_32-NEXT: vpinsrd $1, {{[0-9]+}}(%esp), %xmm0, %xmm0			; SKX_32-NEXT: vpinsrd $1, 44(%ebp), %xmm0, %xmm0
	; SKX_32-NEXT: vpinsrd $2, {{[0-9]+}}(%esp), %xmm0, %xmm0			; SKX_32-NEXT: vpinsrd $2, 56(%ebp), %xmm0, %xmm0
	; SKX_32-NEXT: vpinsrd $3, {{[0-9]+}}(%esp), %xmm0, %xmm0			; SKX_32-NEXT: vpinsrd $3, 60(%ebp), %xmm0, %xmm0
	; SKX_32-NEXT: vgatherqps (%eax,%xmm0,4), %xmm1 {%k1}			; SKX_32-NEXT: vgatherqps (%eax,%xmm0,4), %xmm1 {%k1}
	; SKX_32-NEXT: vmovaps %xmm1, %xmm0			; SKX_32-NEXT: vmovaps %xmm1, %xmm0
				; SKX_32-NEXT: movl %ebp, %esp
				; SKX_32-NEXT: popl %ebp
				; SKX_32-NEXT: .cfi_def_cfa %esp, 4
	; SKX_32-NEXT: retl			; SKX_32-NEXT: retl
	%gep.random = getelementptr float, float* %base, <2 x i128> %ind			%gep.random = getelementptr float, float* %base, <2 x i128> %ind
	%res = call <2 x float> @llvm.masked.gather.v2f32.v2p0f32(<2 x float*> %gep.random, i32 4, <2 x i1> %mask, <2 x float> %src0)			%res = call <2 x float> @llvm.masked.gather.v2f32.v2p0f32(<2 x float*> %gep.random, i32 4, <2 x i1> %mask, <2 x float> %src0)
	ret <2 x float>%res			ret <2 x float>%res
	}			}

	; Make sure we allow index to be sign extended from a smaller than i32 element size.			; Make sure we allow index to be sign extended from a smaller than i32 element size.
	define <16 x float> @sext_i8_index(float* %base, <16 x i8> %ind) {			define <16 x float> @sext_i8_index(float* %base, <16 x i8> %ind) {

test/CodeGen/X86/mmx-arg-passing.ll

; X86-64-NEXT: retq		; X86-64-NEXT: retq
ret void		ret void
}		}

@u2 = external global x86_mmx		@u2 = external global x86_mmx

define void @t2(<1 x i64> %v1) nounwind {		define void @t2(<1 x i64> %v1) nounwind {
; X86-32-LABEL: t2:		; X86-32-LABEL: t2:
; X86-32: ## %bb.0:		; X86-32: ## %bb.0:
		; X86-32-NEXT: pushl %eax
; X86-32-NEXT: movl {{[0-9]+}}(%esp), %eax		; X86-32-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-32-NEXT: movl {{[0-9]+}}(%esp), %ecx		; X86-32-NEXT: movl {{[0-9]+}}(%esp), %ecx
; X86-32-NEXT: movl L_u2$non_lazy_ptr, %edx		; X86-32-NEXT: movl L_u2$non_lazy_ptr, %edx
; X86-32-NEXT: movl %ecx, 4(%edx)		; X86-32-NEXT: movl %ecx, 4(%edx)
; X86-32-NEXT: movl %eax, (%edx)		; X86-32-NEXT: movl %eax, (%edx)
		; X86-32-NEXT: popl %eax
; X86-32-NEXT: retl		; X86-32-NEXT: retl
;		;
; X86-64-LABEL: t2:		; X86-64-LABEL: t2:
; X86-64: ## %bb.0:		; X86-64: ## %bb.0:
; X86-64-NEXT: movq _u2@{{.*}}(%rip), %rax		; X86-64-NEXT: movq _u2@{{.*}}(%rip), %rax
; X86-64-NEXT: movq %rdi, (%rax)		; X86-64-NEXT: movq %rdi, (%rax)
; X86-64-NEXT: retq		; X86-64-NEXT: retq
%tmp = bitcast <1 x i64> %v1 to x86_mmx		%tmp = bitcast <1 x i64> %v1 to x86_mmx
store x86_mmx %tmp, x86_mmx* @u2, align 8		store x86_mmx %tmp, x86_mmx* @u2, align 8
ret void		ret void
}		}

test/CodeGen/X86/movtopush.ll

	; NOPUSH-NEXT: addl $32, %esp			; NOPUSH-NEXT: addl $32, %esp
	define void @pr34863_32(i32 %x) minsize nounwind {			define void @pr34863_32(i32 %x) minsize nounwind {
	entry:			entry:
	tail call void @eightparams(i32 %x, i32 %x, i32 %x, i32 %x, i32 %x, i32 %x, i32 0, i32 -1)			tail call void @eightparams(i32 %x, i32 %x, i32 %x, i32 %x, i32 %x, i32 %x, i32 0, i32 -1)
	ret void			ret void
	}			}

	; NORMAL-LABEL: pr34863_64			; NORMAL-LABEL: pr34863_64
	; NORMAL: movl 4(%esp), %eax			; NORMAL: movl 8(%ebp), %eax
	; NORMAL-NEXT: movl 8(%esp), %ecx			; NORMAL-NEXT: movl 12(%ebp), %ecx
	; NORMAL-NEXT: pushl $-1			; NORMAL-NEXT: pushl $-1
	; NORMAL-NEXT: pushl $-1			; NORMAL-NEXT: pushl $-1
	; NORMAL-NEXT: pushl $0			; NORMAL-NEXT: pushl $0
	; NORMAL-NEXT: pushl $0			; NORMAL-NEXT: pushl $0
	; NORMAL-NEXT: pushl %ecx			; NORMAL-NEXT: pushl %ecx
	; NORMAL-NEXT: pushl %eax			; NORMAL-NEXT: pushl %eax
	; NORMAL-NEXT: pushl %ecx			; NORMAL-NEXT: pushl %ecx
	; NORMAL-NEXT: pushl %eax			; NORMAL-NEXT: pushl %eax
	; NORMAL-NEXT: pushl %ecx			; NORMAL-NEXT: pushl %ecx
	; NORMAL-NEXT: pushl %eax			; NORMAL-NEXT: pushl %eax
	; NORMAL-NEXT: pushl %ecx			; NORMAL-NEXT: pushl %ecx
	; NORMAL-NEXT: pushl %eax			; NORMAL-NEXT: pushl %eax
	; NORMAL-NEXT: pushl %ecx			; NORMAL-NEXT: pushl %ecx
	; NORMAL-NEXT: pushl %eax			; NORMAL-NEXT: pushl %eax
	; NORMAL-NEXT: pushl %ecx			; NORMAL-NEXT: pushl %ecx
	; NORMAL-NEXT: pushl %eax			; NORMAL-NEXT: pushl %eax
	; NORMAL-NEXT: calll _eightparams64			; NORMAL-NEXT: calll _eightparams64
	; NORMAL-NEXT: addl $64, %esp			; NORMAL-NEXT: addl $64, %esp
	;			;
	; NOPUSH-LABEL: pr34863_64			; NOPUSH-LABEL: pr34863_64
	; NOPUSH: subl $64, %esp			; NOPUSH: subl $64, %esp
	; NOPUSH-NEXT: movl 68(%esp), %eax			; NOPUSH-NEXT: movl 8(%ebp), %eax
	; NOPUSH-NEXT: movl 72(%esp), %ecx			; NOPUSH-NEXT: movl 12(%ebp), %ecx
	; NOPUSH-NEXT: movl %ecx, 44(%esp)			; NOPUSH-NEXT: movl %ecx, 44(%esp)
	; NOPUSH-NEXT: movl %eax, 40(%esp)			; NOPUSH-NEXT: movl %eax, 40(%esp)
	; NOPUSH-NEXT: movl %ecx, 36(%esp)			; NOPUSH-NEXT: movl %ecx, 36(%esp)
	; NOPUSH-NEXT: movl %eax, 32(%esp)			; NOPUSH-NEXT: movl %eax, 32(%esp)
	; NOPUSH-NEXT: movl %ecx, 28(%esp)			; NOPUSH-NEXT: movl %ecx, 28(%esp)
	; NOPUSH-NEXT: movl %eax, 24(%esp)			; NOPUSH-NEXT: movl %eax, 24(%esp)
	; NOPUSH-NEXT: movl %ecx, 20(%esp)			; NOPUSH-NEXT: movl %ecx, 20(%esp)
	; NOPUSH-NEXT: movl %eax, 16(%esp)			; NOPUSH-NEXT: movl %eax, 16(%esp)
	; NOPUSH-NEXT: movl %ecx, 12(%esp)			; NOPUSH-NEXT: movl %ecx, 12(%esp)
	; NOPUSH-NEXT: movl %eax, 8(%esp)			; NOPUSH-NEXT: movl %eax, 8(%esp)
	; NOPUSH-NEXT: movl %ecx, 4(%esp)			; NOPUSH-NEXT: movl %ecx, 4(%esp)
	; NOPUSH-NEXT: movl %eax, (%esp)			; NOPUSH-NEXT: movl %eax, (%esp)
	; NOPUSH-NEXT: orl $-1, 60(%esp)			; NOPUSH-NEXT: orl $-1, 60(%esp)
	; NOPUSH-NEXT: orl $-1, 56(%esp)			; NOPUSH-NEXT: orl $-1, 56(%esp)
	; NOPUSH-NEXT: andl $0, 52(%esp)			; NOPUSH-NEXT: andl $0, 52(%esp)
	; NOPUSH-NEXT: andl $0, 48(%esp)			; NOPUSH-NEXT: andl $0, 48(%esp)
	; NOPUSH-NEXT: calll _eightparams64			; NOPUSH-NEXT: calll _eightparams64
	; NOPUSH-NEXT: addl $64, %esp			; NOPUSH-NEXT: movl %ebp, %esp
	define void @pr34863_64(i64 %x) minsize nounwind {			define void @pr34863_64(i64 %x) minsize nounwind {
	entry:			entry:
	tail call void @eightparams64(i64 %x, i64 %x, i64 %x, i64 %x, i64 %x, i64 %x, i64 0, i64 -1)			tail call void @eightparams64(i64 %x, i64 %x, i64 %x, i64 %x, i64 %x, i64 %x, i64 0, i64 -1)
	ret void			ret void
	}			}

test/CodeGen/X86/sadd_sat.ll

; X64-NEXT: retq		; X64-NEXT: retq
%tmp = call i4 @llvm.sadd.sat.i4(i4 %x, i4 %y);		%tmp = call i4 @llvm.sadd.sat.i4(i4 %x, i4 %y);
ret i4 %tmp;		ret i4 %tmp;
}		}

define <4 x i32> @vec(<4 x i32> %x, <4 x i32> %y) nounwind {		define <4 x i32> @vec(<4 x i32> %x, <4 x i32> %y) nounwind {
; X86-LABEL: vec:		; X86-LABEL: vec:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: pushl %ebp		; X86-NEXT: pushl %ebp
		; X86-NEXT: movl %esp, %ebp
; X86-NEXT: pushl %ebx		; X86-NEXT: pushl %ebx
; X86-NEXT: pushl %edi		; X86-NEXT: pushl %edi
; X86-NEXT: pushl %esi		; X86-NEXT: pushl %esi
; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx		; X86-NEXT: andl $-16, %esp
; X86-NEXT: movl {{[0-9]+}}(%esp), %edx		; X86-NEXT: subl $16, %esp
		; X86-NEXT: movl 36(%ebp), %ecx
		; X86-NEXT: movl 52(%ebp), %edx
; X86-NEXT: xorl %eax, %eax		; X86-NEXT: xorl %eax, %eax
; X86-NEXT: movl %ecx, %esi		; X86-NEXT: movl %ecx, %esi
; X86-NEXT: addl %edx, %esi		; X86-NEXT: addl %edx, %esi
; X86-NEXT: setns %al		; X86-NEXT: setns %al
; X86-NEXT: addl $2147483647, %eax # imm = 0x7FFFFFFF		; X86-NEXT: addl $2147483647, %eax # imm = 0x7FFFFFFF
; X86-NEXT: addl %edx, %ecx		; X86-NEXT: addl %edx, %ecx
; X86-NEXT: movl {{[0-9]+}}(%esp), %edx		; X86-NEXT: movl 32(%ebp), %edx
; X86-NEXT: cmovol %eax, %ecx		; X86-NEXT: cmovol %eax, %ecx
; X86-NEXT: movl {{[0-9]+}}(%esp), %esi		; X86-NEXT: movl %ecx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
		; X86-NEXT: movl 48(%ebp), %esi
; X86-NEXT: xorl %eax, %eax		; X86-NEXT: xorl %eax, %eax
; X86-NEXT: movl %edx, %edi		; X86-NEXT: movl %edx, %edi
; X86-NEXT: addl %esi, %edi		; X86-NEXT: addl %esi, %edi
; X86-NEXT: setns %al		; X86-NEXT: setns %al
; X86-NEXT: addl $2147483647, %eax # imm = 0x7FFFFFFF		; X86-NEXT: addl $2147483647, %eax # imm = 0x7FFFFFFF
; X86-NEXT: addl %esi, %edx		; X86-NEXT: addl %esi, %edx
; X86-NEXT: movl {{[0-9]+}}(%esp), %esi		; X86-NEXT: movl 28(%ebp), %esi
; X86-NEXT: cmovol %eax, %edx		; X86-NEXT: cmovol %eax, %edx
; X86-NEXT: movl {{[0-9]+}}(%esp), %edi		; X86-NEXT: movl 44(%ebp), %edi
; X86-NEXT: xorl %eax, %eax		; X86-NEXT: xorl %eax, %eax
; X86-NEXT: movl %esi, %ebx		; X86-NEXT: movl %esi, %ebx
; X86-NEXT: addl %edi, %ebx		; X86-NEXT: addl %edi, %ebx
; X86-NEXT: setns %al		; X86-NEXT: setns %al
; X86-NEXT: addl $2147483647, %eax # imm = 0x7FFFFFFF		; X86-NEXT: addl $2147483647, %eax # imm = 0x7FFFFFFF
; X86-NEXT: addl %edi, %esi		; X86-NEXT: addl %edi, %esi
; X86-NEXT: movl {{[0-9]+}}(%esp), %edi		; X86-NEXT: movl 24(%ebp), %ecx
; X86-NEXT: cmovol %eax, %esi		; X86-NEXT: cmovol %eax, %esi
; X86-NEXT: movl {{[0-9]+}}(%esp), %eax		; X86-NEXT: movl 40(%ebp), %ebx
; X86-NEXT: xorl %ebx, %ebx		; X86-NEXT: xorl %eax, %eax
; X86-NEXT: movl %edi, %ebp		; X86-NEXT: movl %ecx, %edi
; X86-NEXT: addl %eax, %ebp		; X86-NEXT: addl %ebx, %edi
; X86-NEXT: setns %bl		; X86-NEXT: setns %al
; X86-NEXT: addl $2147483647, %ebx # imm = 0x7FFFFFFF		; X86-NEXT: addl $2147483647, %eax # imm = 0x7FFFFFFF
; X86-NEXT: addl %eax, %edi		; X86-NEXT: addl %ebx, %ecx
; X86-NEXT: movl {{[0-9]+}}(%esp), %eax		; X86-NEXT: cmovol %eax, %ecx
; X86-NEXT: cmovol %ebx, %edi		; X86-NEXT: movl 8(%ebp), %eax
; X86-NEXT: movl %ecx, 12(%eax)		; X86-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %edi # 4-byte Reload
		; X86-NEXT: movl %edi, 12(%eax)
; X86-NEXT: movl %edx, 8(%eax)		; X86-NEXT: movl %edx, 8(%eax)
; X86-NEXT: movl %esi, 4(%eax)		; X86-NEXT: movl %esi, 4(%eax)
; X86-NEXT: movl %edi, (%eax)		; X86-NEXT: movl %ecx, (%eax)
		; X86-NEXT: leal -12(%ebp), %esp
; X86-NEXT: popl %esi		; X86-NEXT: popl %esi
; X86-NEXT: popl %edi		; X86-NEXT: popl %edi
; X86-NEXT: popl %ebx		; X86-NEXT: popl %ebx
; X86-NEXT: popl %ebp		; X86-NEXT: popl %ebp
; X86-NEXT: retl $4		; X86-NEXT: retl $4
;		;
; X64-LABEL: vec:		; X64-LABEL: vec:
; X64: # %bb.0:		; X64: # %bb.0:

test/CodeGen/X86/scalar-fp-to-i64.ll

	; X87_LIN-NEXT: retl			; X87_LIN-NEXT: retl
	%r = fptosi x86_fp80 %a to i64			%r = fptosi x86_fp80 %a to i64
	ret i64 %r			ret i64 %r
	}			}

	define i64 @t_to_u64(fp128 %a) nounwind {			define i64 @t_to_u64(fp128 %a) nounwind {
	; AVX512_32_WIN-LABEL: t_to_u64:			; AVX512_32_WIN-LABEL: t_to_u64:
	; AVX512_32_WIN: # %bb.0:			; AVX512_32_WIN: # %bb.0:
	; AVX512_32_WIN-NEXT: subl $16, %esp			; AVX512_32_WIN-NEXT: pushl %ebp
	; AVX512_32_WIN-NEXT: vmovups {{[0-9]+}}(%esp), %xmm0			; AVX512_32_WIN-NEXT: movl %esp, %ebp
				; AVX512_32_WIN-NEXT: andl $-16, %esp
				; AVX512_32_WIN-NEXT: subl $32, %esp
				; AVX512_32_WIN-NEXT: vmovups 8(%ebp), %xmm0
	; AVX512_32_WIN-NEXT: vmovups %xmm0, (%esp)			; AVX512_32_WIN-NEXT: vmovups %xmm0, (%esp)
	; AVX512_32_WIN-NEXT: calll ___fixunstfdi			; AVX512_32_WIN-NEXT: calll ___fixunstfdi
	; AVX512_32_WIN-NEXT: addl $16, %esp			; AVX512_32_WIN-NEXT: movl %ebp, %esp
				; AVX512_32_WIN-NEXT: popl %ebp
	; AVX512_32_WIN-NEXT: retl			; AVX512_32_WIN-NEXT: retl
	;			;
	; AVX512_32_LIN-LABEL: t_to_u64:			; AVX512_32_LIN-LABEL: t_to_u64:
	; AVX512_32_LIN: # %bb.0:			; AVX512_32_LIN: # %bb.0:
	; AVX512_32_LIN-NEXT: subl $28, %esp			; AVX512_32_LIN-NEXT: subl $28, %esp
	; AVX512_32_LIN-NEXT: vmovaps {{[0-9]+}}(%esp), %xmm0			; AVX512_32_LIN-NEXT: vmovaps {{[0-9]+}}(%esp), %xmm0
	; AVX512_32_LIN-NEXT: vmovups %xmm0, (%esp)			; AVX512_32_LIN-NEXT: vmovups %xmm0, (%esp)
	; AVX512_32_LIN-NEXT: calll __fixunstfdi			; AVX512_32_LIN-NEXT: calll __fixunstfdi
	; AVX512_64_LIN: # %bb.0:			; AVX512_64_LIN: # %bb.0:
	; AVX512_64_LIN-NEXT: pushq %rax			; AVX512_64_LIN-NEXT: pushq %rax
	; AVX512_64_LIN-NEXT: callq __fixunstfdi			; AVX512_64_LIN-NEXT: callq __fixunstfdi
	; AVX512_64_LIN-NEXT: popq %rcx			; AVX512_64_LIN-NEXT: popq %rcx
	; AVX512_64_LIN-NEXT: retq			; AVX512_64_LIN-NEXT: retq
	;			;
	; SSE3_32_WIN-LABEL: t_to_u64:			; SSE3_32_WIN-LABEL: t_to_u64:
	; SSE3_32_WIN: # %bb.0:			; SSE3_32_WIN: # %bb.0:
	; SSE3_32_WIN-NEXT: pushl {{[0-9]+}}(%esp)			; SSE3_32_WIN-NEXT: pushl %ebp
	; SSE3_32_WIN-NEXT: pushl {{[0-9]+}}(%esp)			; SSE3_32_WIN-NEXT: movl %esp, %ebp
	; SSE3_32_WIN-NEXT: pushl {{[0-9]+}}(%esp)			; SSE3_32_WIN-NEXT: andl $-16, %esp
	; SSE3_32_WIN-NEXT: pushl {{[0-9]+}}(%esp)			; SSE3_32_WIN-NEXT: subl $16, %esp
				; SSE3_32_WIN-NEXT: pushl 20(%ebp)
				; SSE3_32_WIN-NEXT: pushl 16(%ebp)
				; SSE3_32_WIN-NEXT: pushl 12(%ebp)
				; SSE3_32_WIN-NEXT: pushl 8(%ebp)
	; SSE3_32_WIN-NEXT: calll ___fixunstfdi			; SSE3_32_WIN-NEXT: calll ___fixunstfdi
	; SSE3_32_WIN-NEXT: addl $16, %esp			; SSE3_32_WIN-NEXT: addl $16, %esp
				; SSE3_32_WIN-NEXT: movl %ebp, %esp
				; SSE3_32_WIN-NEXT: popl %ebp
	; SSE3_32_WIN-NEXT: retl			; SSE3_32_WIN-NEXT: retl
	;			;
	; SSE3_32_LIN-LABEL: t_to_u64:			; SSE3_32_LIN-LABEL: t_to_u64:
	; SSE3_32_LIN: # %bb.0:			; SSE3_32_LIN: # %bb.0:
	; SSE3_32_LIN-NEXT: subl $12, %esp			; SSE3_32_LIN-NEXT: subl $12, %esp
	; SSE3_32_LIN-NEXT: pushl {{[0-9]+}}(%esp)			; SSE3_32_LIN-NEXT: pushl {{[0-9]+}}(%esp)
	; SSE3_32_LIN-NEXT: pushl {{[0-9]+}}(%esp)			; SSE3_32_LIN-NEXT: pushl {{[0-9]+}}(%esp)
	; SSE3_32_LIN-NEXT: pushl {{[0-9]+}}(%esp)			; SSE3_32_LIN-NEXT: pushl {{[0-9]+}}(%esp)
	; SSE3_64_LIN: # %bb.0:			; SSE3_64_LIN: # %bb.0:
	; SSE3_64_LIN-NEXT: pushq %rax			; SSE3_64_LIN-NEXT: pushq %rax
	; SSE3_64_LIN-NEXT: callq __fixunstfdi			; SSE3_64_LIN-NEXT: callq __fixunstfdi
	; SSE3_64_LIN-NEXT: popq %rcx			; SSE3_64_LIN-NEXT: popq %rcx
	; SSE3_64_LIN-NEXT: retq			; SSE3_64_LIN-NEXT: retq
	;			;
	; SSE2_32_WIN-LABEL: t_to_u64:			; SSE2_32_WIN-LABEL: t_to_u64:
	; SSE2_32_WIN: # %bb.0:			; SSE2_32_WIN: # %bb.0:
	; SSE2_32_WIN-NEXT: pushl {{[0-9]+}}(%esp)			; SSE2_32_WIN-NEXT: pushl %ebp
	; SSE2_32_WIN-NEXT: pushl {{[0-9]+}}(%esp)			; SSE2_32_WIN-NEXT: movl %esp, %ebp
	; SSE2_32_WIN-NEXT: pushl {{[0-9]+}}(%esp)			; SSE2_32_WIN-NEXT: andl $-16, %esp
	; SSE2_32_WIN-NEXT: pushl {{[0-9]+}}(%esp)			; SSE2_32_WIN-NEXT: subl $16, %esp
				; SSE2_32_WIN-NEXT: pushl 20(%ebp)
				; SSE2_32_WIN-NEXT: pushl 16(%ebp)
				; SSE2_32_WIN-NEXT: pushl 12(%ebp)
				; SSE2_32_WIN-NEXT: pushl 8(%ebp)
	; SSE2_32_WIN-NEXT: calll ___fixunstfdi			; SSE2_32_WIN-NEXT: calll ___fixunstfdi
	; SSE2_32_WIN-NEXT: addl $16, %esp			; SSE2_32_WIN-NEXT: addl $16, %esp
				; SSE2_32_WIN-NEXT: movl %ebp, %esp
				; SSE2_32_WIN-NEXT: popl %ebp
	; SSE2_32_WIN-NEXT: retl			; SSE2_32_WIN-NEXT: retl
	;			;
	; SSE2_32_LIN-LABEL: t_to_u64:			; SSE2_32_LIN-LABEL: t_to_u64:
	; SSE2_32_LIN: # %bb.0:			; SSE2_32_LIN: # %bb.0:
	; SSE2_32_LIN-NEXT: subl $12, %esp			; SSE2_32_LIN-NEXT: subl $12, %esp
	; SSE2_32_LIN-NEXT: pushl {{[0-9]+}}(%esp)			; SSE2_32_LIN-NEXT: pushl {{[0-9]+}}(%esp)
	; SSE2_32_LIN-NEXT: pushl {{[0-9]+}}(%esp)			; SSE2_32_LIN-NEXT: pushl {{[0-9]+}}(%esp)
	; SSE2_32_LIN-NEXT: pushl {{[0-9]+}}(%esp)			; SSE2_32_LIN-NEXT: pushl {{[0-9]+}}(%esp)
	[13 lines not shown]
	; SSE2_64_LIN: # %bb.0:			; SSE2_64_LIN: # %bb.0:
	; SSE2_64_LIN-NEXT: pushq %rax			; SSE2_64_LIN-NEXT: pushq %rax
	; SSE2_64_LIN-NEXT: callq __fixunstfdi			; SSE2_64_LIN-NEXT: callq __fixunstfdi
	; SSE2_64_LIN-NEXT: popq %rcx			; SSE2_64_LIN-NEXT: popq %rcx
	; SSE2_64_LIN-NEXT: retq			; SSE2_64_LIN-NEXT: retq
	;			;
	; X87_WIN-LABEL: t_to_u64:			; X87_WIN-LABEL: t_to_u64:
	; X87_WIN: # %bb.0:			; X87_WIN: # %bb.0:
	; X87_WIN-NEXT: pushl {{[0-9]+}}(%esp)			; X87_WIN-NEXT: pushl %ebp
	; X87_WIN-NEXT: pushl {{[0-9]+}}(%esp)			; X87_WIN-NEXT: movl %esp, %ebp
	; X87_WIN-NEXT: pushl {{[0-9]+}}(%esp)			; X87_WIN-NEXT: andl $-16, %esp
	; X87_WIN-NEXT: pushl {{[0-9]+}}(%esp)			; X87_WIN-NEXT: subl $16, %esp
				; X87_WIN-NEXT: pushl 20(%ebp)
				; X87_WIN-NEXT: pushl 16(%ebp)
				; X87_WIN-NEXT: pushl 12(%ebp)
				; X87_WIN-NEXT: pushl 8(%ebp)
	; X87_WIN-NEXT: calll ___fixunstfdi			; X87_WIN-NEXT: calll ___fixunstfdi
	; X87_WIN-NEXT: addl $16, %esp			; X87_WIN-NEXT: addl $16, %esp
				; X87_WIN-NEXT: movl %ebp, %esp
				; X87_WIN-NEXT: popl %ebp
	; X87_WIN-NEXT: retl			; X87_WIN-NEXT: retl
	;			;
	; X87_LIN-LABEL: t_to_u64:			; X87_LIN-LABEL: t_to_u64:
	; X87_LIN: # %bb.0:			; X87_LIN: # %bb.0:
	; X87_LIN-NEXT: subl $12, %esp			; X87_LIN-NEXT: subl $12, %esp
	; X87_LIN-NEXT: pushl {{[0-9]+}}(%esp)			; X87_LIN-NEXT: pushl {{[0-9]+}}(%esp)
	; X87_LIN-NEXT: pushl {{[0-9]+}}(%esp)			; X87_LIN-NEXT: pushl {{[0-9]+}}(%esp)
	; X87_LIN-NEXT: pushl {{[0-9]+}}(%esp)			; X87_LIN-NEXT: pushl {{[0-9]+}}(%esp)
	; X87_LIN-NEXT: pushl {{[0-9]+}}(%esp)			; X87_LIN-NEXT: pushl {{[0-9]+}}(%esp)
	; X87_LIN-NEXT: calll __fixunstfdi			; X87_LIN-NEXT: calll __fixunstfdi
	; X87_LIN-NEXT: addl $28, %esp			; X87_LIN-NEXT: addl $28, %esp
	; X87_LIN-NEXT: retl			; X87_LIN-NEXT: retl
	%r = fptoui fp128 %a to i64			%r = fptoui fp128 %a to i64
	ret i64 %r			ret i64 %r
	}			}

	define i64 @t_to_s64(fp128 %a) nounwind {			define i64 @t_to_s64(fp128 %a) nounwind {
	; AVX512_32_WIN-LABEL: t_to_s64:			; AVX512_32_WIN-LABEL: t_to_s64:
	; AVX512_32_WIN: # %bb.0:			; AVX512_32_WIN: # %bb.0:
	; AVX512_32_WIN-NEXT: subl $16, %esp			; AVX512_32_WIN-NEXT: pushl %ebp
	; AVX512_32_WIN-NEXT: vmovups {{[0-9]+}}(%esp), %xmm0			; AVX512_32_WIN-NEXT: movl %esp, %ebp
				; AVX512_32_WIN-NEXT: andl $-16, %esp
				; AVX512_32_WIN-NEXT: subl $32, %esp
				; AVX512_32_WIN-NEXT: vmovups 8(%ebp), %xmm0
	; AVX512_32_WIN-NEXT: vmovups %xmm0, (%esp)			; AVX512_32_WIN-NEXT: vmovups %xmm0, (%esp)
	; AVX512_32_WIN-NEXT: calll ___fixtfdi			; AVX512_32_WIN-NEXT: calll ___fixtfdi
	; AVX512_32_WIN-NEXT: addl $16, %esp			; AVX512_32_WIN-NEXT: movl %ebp, %esp
				; AVX512_32_WIN-NEXT: popl %ebp
	; AVX512_32_WIN-NEXT: retl			; AVX512_32_WIN-NEXT: retl
	;			;
	; AVX512_32_LIN-LABEL: t_to_s64:			; AVX512_32_LIN-LABEL: t_to_s64:
	; AVX512_32_LIN: # %bb.0:			; AVX512_32_LIN: # %bb.0:
	; AVX512_32_LIN-NEXT: subl $28, %esp			; AVX512_32_LIN-NEXT: subl $28, %esp
	; AVX512_32_LIN-NEXT: vmovaps {{[0-9]+}}(%esp), %xmm0			; AVX512_32_LIN-NEXT: vmovaps {{[0-9]+}}(%esp), %xmm0
	; AVX512_32_LIN-NEXT: vmovups %xmm0, (%esp)			; AVX512_32_LIN-NEXT: vmovups %xmm0, (%esp)
	; AVX512_32_LIN-NEXT: calll __fixtfdi			; AVX512_32_LIN-NEXT: calll __fixtfdi
	[11 lines not shown]
	; AVX512_64_LIN: # %bb.0:			; AVX512_64_LIN: # %bb.0:
	; AVX512_64_LIN-NEXT: pushq %rax			; AVX512_64_LIN-NEXT: pushq %rax
	; AVX512_64_LIN-NEXT: callq __fixtfdi			; AVX512_64_LIN-NEXT: callq __fixtfdi
	; AVX512_64_LIN-NEXT: popq %rcx			; AVX512_64_LIN-NEXT: popq %rcx
	; AVX512_64_LIN-NEXT: retq			; AVX512_64_LIN-NEXT: retq
	;			;
	; SSE3_32_WIN-LABEL: t_to_s64:			; SSE3_32_WIN-LABEL: t_to_s64:
	; SSE3_32_WIN: # %bb.0:			; SSE3_32_WIN: # %bb.0:
	; SSE3_32_WIN-NEXT: pushl {{[0-9]+}}(%esp)			; SSE3_32_WIN-NEXT: pushl %ebp
	; SSE3_32_WIN-NEXT: pushl {{[0-9]+}}(%esp)			; SSE3_32_WIN-NEXT: movl %esp, %ebp
	; SSE3_32_WIN-NEXT: pushl {{[0-9]+}}(%esp)			; SSE3_32_WIN-NEXT: andl $-16, %esp
	; SSE3_32_WIN-NEXT: pushl {{[0-9]+}}(%esp)			; SSE3_32_WIN-NEXT: subl $16, %esp
				; SSE3_32_WIN-NEXT: pushl 20(%ebp)
				; SSE3_32_WIN-NEXT: pushl 16(%ebp)
				; SSE3_32_WIN-NEXT: pushl 12(%ebp)
				; SSE3_32_WIN-NEXT: pushl 8(%ebp)
	; SSE3_32_WIN-NEXT: calll ___fixtfdi			; SSE3_32_WIN-NEXT: calll ___fixtfdi
	; SSE3_32_WIN-NEXT: addl $16, %esp			; SSE3_32_WIN-NEXT: addl $16, %esp
				; SSE3_32_WIN-NEXT: movl %ebp, %esp
				; SSE3_32_WIN-NEXT: popl %ebp
	; SSE3_32_WIN-NEXT: retl			; SSE3_32_WIN-NEXT: retl
	;			;
	; SSE3_32_LIN-LABEL: t_to_s64:			; SSE3_32_LIN-LABEL: t_to_s64:
	; SSE3_32_LIN: # %bb.0:			; SSE3_32_LIN: # %bb.0:
	; SSE3_32_LIN-NEXT: subl $12, %esp			; SSE3_32_LIN-NEXT: subl $12, %esp
	; SSE3_32_LIN-NEXT: pushl {{[0-9]+}}(%esp)			; SSE3_32_LIN-NEXT: pushl {{[0-9]+}}(%esp)
	; SSE3_32_LIN-NEXT: pushl {{[0-9]+}}(%esp)			; SSE3_32_LIN-NEXT: pushl {{[0-9]+}}(%esp)
	; SSE3_32_LIN-NEXT: pushl {{[0-9]+}}(%esp)			; SSE3_32_LIN-NEXT: pushl {{[0-9]+}}(%esp)
	[13 lines not shown]
	; SSE3_64_LIN: # %bb.0:			; SSE3_64_LIN: # %bb.0:
	; SSE3_64_LIN-NEXT: pushq %rax			; SSE3_64_LIN-NEXT: pushq %rax
	; SSE3_64_LIN-NEXT: callq __fixtfdi			; SSE3_64_LIN-NEXT: callq __fixtfdi
	; SSE3_64_LIN-NEXT: popq %rcx			; SSE3_64_LIN-NEXT: popq %rcx
	; SSE3_64_LIN-NEXT: retq			; SSE3_64_LIN-NEXT: retq
	;			;
	; SSE2_32_WIN-LABEL: t_to_s64:			; SSE2_32_WIN-LABEL: t_to_s64:
	; SSE2_32_WIN: # %bb.0:			; SSE2_32_WIN: # %bb.0:
	; SSE2_32_WIN-NEXT: pushl {{[0-9]+}}(%esp)			; SSE2_32_WIN-NEXT: pushl %ebp
	; SSE2_32_WIN-NEXT: pushl {{[0-9]+}}(%esp)			; SSE2_32_WIN-NEXT: movl %esp, %ebp
	; SSE2_32_WIN-NEXT: pushl {{[0-9]+}}(%esp)			; SSE2_32_WIN-NEXT: andl $-16, %esp
	; SSE2_32_WIN-NEXT: pushl {{[0-9]+}}(%esp)			; SSE2_32_WIN-NEXT: subl $16, %esp
				; SSE2_32_WIN-NEXT: pushl 20(%ebp)
				; SSE2_32_WIN-NEXT: pushl 16(%ebp)
				; SSE2_32_WIN-NEXT: pushl 12(%ebp)
				; SSE2_32_WIN-NEXT: pushl 8(%ebp)
	; SSE2_32_WIN-NEXT: calll ___fixtfdi			; SSE2_32_WIN-NEXT: calll ___fixtfdi
	; SSE2_32_WIN-NEXT: addl $16, %esp			; SSE2_32_WIN-NEXT: addl $16, %esp
				; SSE2_32_WIN-NEXT: movl %ebp, %esp
				; SSE2_32_WIN-NEXT: popl %ebp
	; SSE2_32_WIN-NEXT: retl			; SSE2_32_WIN-NEXT: retl
	;			;
	; SSE2_32_LIN-LABEL: t_to_s64:			; SSE2_32_LIN-LABEL: t_to_s64:
	; SSE2_32_LIN: # %bb.0:			; SSE2_32_LIN: # %bb.0:
	; SSE2_32_LIN-NEXT: subl $12, %esp			; SSE2_32_LIN-NEXT: subl $12, %esp
	; SSE2_32_LIN-NEXT: pushl {{[0-9]+}}(%esp)			; SSE2_32_LIN-NEXT: pushl {{[0-9]+}}(%esp)
	; SSE2_32_LIN-NEXT: pushl {{[0-9]+}}(%esp)			; SSE2_32_LIN-NEXT: pushl {{[0-9]+}}(%esp)
	; SSE2_32_LIN-NEXT: pushl {{[0-9]+}}(%esp)			; SSE2_32_LIN-NEXT: pushl {{[0-9]+}}(%esp)
	[13 lines not shown]
	; SSE2_64_LIN: # %bb.0:			; SSE2_64_LIN: # %bb.0:
	; SSE2_64_LIN-NEXT: pushq %rax			; SSE2_64_LIN-NEXT: pushq %rax
	; SSE2_64_LIN-NEXT: callq __fixtfdi			; SSE2_64_LIN-NEXT: callq __fixtfdi
	; SSE2_64_LIN-NEXT: popq %rcx			; SSE2_64_LIN-NEXT: popq %rcx
	; SSE2_64_LIN-NEXT: retq			; SSE2_64_LIN-NEXT: retq
	;			;
	; X87_WIN-LABEL: t_to_s64:			; X87_WIN-LABEL: t_to_s64:
	; X87_WIN: # %bb.0:			; X87_WIN: # %bb.0:
	; X87_WIN-NEXT: pushl {{[0-9]+}}(%esp)			; X87_WIN-NEXT: pushl %ebp
	; X87_WIN-NEXT: pushl {{[0-9]+}}(%esp)			; X87_WIN-NEXT: movl %esp, %ebp
	; X87_WIN-NEXT: pushl {{[0-9]+}}(%esp)			; X87_WIN-NEXT: andl $-16, %esp
	; X87_WIN-NEXT: pushl {{[0-9]+}}(%esp)			; X87_WIN-NEXT: subl $16, %esp
				; X87_WIN-NEXT: pushl 20(%ebp)
				; X87_WIN-NEXT: pushl 16(%ebp)
				; X87_WIN-NEXT: pushl 12(%ebp)
				; X87_WIN-NEXT: pushl 8(%ebp)
	; X87_WIN-NEXT: calll ___fixtfdi			; X87_WIN-NEXT: calll ___fixtfdi
	; X87_WIN-NEXT: addl $16, %esp			; X87_WIN-NEXT: addl $16, %esp
				; X87_WIN-NEXT: movl %ebp, %esp
				; X87_WIN-NEXT: popl %ebp
	; X87_WIN-NEXT: retl			; X87_WIN-NEXT: retl
	;			;
	; X87_LIN-LABEL: t_to_s64:			; X87_LIN-LABEL: t_to_s64:
	; X87_LIN: # %bb.0:			; X87_LIN: # %bb.0:
	; X87_LIN-NEXT: subl $12, %esp			; X87_LIN-NEXT: subl $12, %esp
	; X87_LIN-NEXT: pushl {{[0-9]+}}(%esp)			; X87_LIN-NEXT: pushl {{[0-9]+}}(%esp)
	; X87_LIN-NEXT: pushl {{[0-9]+}}(%esp)			; X87_LIN-NEXT: pushl {{[0-9]+}}(%esp)
	; X87_LIN-NEXT: pushl {{[0-9]+}}(%esp)			; X87_LIN-NEXT: pushl {{[0-9]+}}(%esp)
	; X87_LIN-NEXT: pushl {{[0-9]+}}(%esp)			; X87_LIN-NEXT: pushl {{[0-9]+}}(%esp)
	; X87_LIN-NEXT: calll __fixtfdi			; X87_LIN-NEXT: calll __fixtfdi
	; X87_LIN-NEXT: addl $28, %esp			; X87_LIN-NEXT: addl $28, %esp
	; X87_LIN-NEXT: retl			; X87_LIN-NEXT: retl
	%r = fptosi fp128 %a to i64			%r = fptosi fp128 %a to i64
	ret i64 %r			ret i64 %r
	}			}

test/CodeGen/X86/select.ll

	[489 lines not shown]
	; ATOM-NEXT: paddd %xmm2, %xmm1			; ATOM-NEXT: paddd %xmm2, %xmm1
	; ATOM-NEXT: movq %xmm1, 16(%rsi)			; ATOM-NEXT: movq %xmm1, 16(%rsi)
	; ATOM-NEXT: movdqa %xmm0, (%rsi)			; ATOM-NEXT: movdqa %xmm0, (%rsi)
	; ATOM-NEXT: retq			; ATOM-NEXT: retq
	;			;
	; ATHLON-LABEL: test8:			; ATHLON-LABEL: test8:
	; ATHLON: ## %bb.0:			; ATHLON: ## %bb.0:
	; ATHLON-NEXT: pushl %ebp			; ATHLON-NEXT: pushl %ebp
				; ATHLON-NEXT: movl %esp, %ebp
	; ATHLON-NEXT: pushl %ebx			; ATHLON-NEXT: pushl %ebx
	; ATHLON-NEXT: pushl %edi			; ATHLON-NEXT: pushl %edi
	; ATHLON-NEXT: pushl %esi			; ATHLON-NEXT: pushl %esi
	; ATHLON-NEXT: testb $1, {{[0-9]+}}(%esp)			; ATHLON-NEXT: andl $-32, %esp
	; ATHLON-NEXT: leal {{[0-9]+}}(%esp), %eax			; ATHLON-NEXT: subl $32, %esp
	; ATHLON-NEXT: leal {{[0-9]+}}(%esp), %ecx			; ATHLON-NEXT: testb $1, 8(%ebp)
				; ATHLON-NEXT: leal 60(%ebp), %eax
				; ATHLON-NEXT: leal 92(%ebp), %ecx
	; ATHLON-NEXT: cmovnel %eax, %ecx			; ATHLON-NEXT: cmovnel %eax, %ecx
	; ATHLON-NEXT: leal {{[0-9]+}}(%esp), %eax			; ATHLON-NEXT: movl %ecx, {{[-0-9]+}}(%e{{[sb]}}p) ## 4-byte Spill
	; ATHLON-NEXT: leal {{[0-9]+}}(%esp), %edx			; ATHLON-NEXT: leal 56(%ebp), %eax
				; ATHLON-NEXT: leal 88(%ebp), %edx
	; ATHLON-NEXT: cmovnel %eax, %edx			; ATHLON-NEXT: cmovnel %eax, %edx
	; ATHLON-NEXT: leal {{[0-9]+}}(%esp), %eax			; ATHLON-NEXT: leal 52(%ebp), %eax
	; ATHLON-NEXT: leal {{[0-9]+}}(%esp), %esi			; ATHLON-NEXT: leal 84(%ebp), %esi
	; ATHLON-NEXT: cmovnel %eax, %esi			; ATHLON-NEXT: cmovnel %eax, %esi
	; ATHLON-NEXT: leal {{[0-9]+}}(%esp), %eax			; ATHLON-NEXT: leal 48(%ebp), %eax
	; ATHLON-NEXT: leal {{[0-9]+}}(%esp), %edi			; ATHLON-NEXT: leal 80(%ebp), %edi
	; ATHLON-NEXT: cmovnel %eax, %edi			; ATHLON-NEXT: cmovnel %eax, %edi
	; ATHLON-NEXT: leal {{[0-9]+}}(%esp), %eax			; ATHLON-NEXT: leal 44(%ebp), %eax
	; ATHLON-NEXT: leal {{[0-9]+}}(%esp), %ebx			; ATHLON-NEXT: leal 76(%ebp), %ebx
	; ATHLON-NEXT: cmovnel %eax, %ebx			; ATHLON-NEXT: cmovnel %eax, %ebx
	; ATHLON-NEXT: leal {{[0-9]+}}(%esp), %eax			; ATHLON-NEXT: leal 72(%ebp), %eax
	; ATHLON-NEXT: leal {{[0-9]+}}(%esp), %ebp			; ATHLON-NEXT: leal 40(%ebp), %ecx
	; ATHLON-NEXT: cmovnel %eax, %ebp			; ATHLON-NEXT: cmovnel %ecx, %eax
	; ATHLON-NEXT: movl {{[0-9]+}}(%esp), %eax			; ATHLON-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ecx ## 4-byte Reload
	; ATHLON-NEXT: movl (%ecx), %ecx			; ATHLON-NEXT: movl (%ecx), %ecx
	; ATHLON-NEXT: movl (%edx), %edx			; ATHLON-NEXT: movl (%edx), %edx
	; ATHLON-NEXT: movl (%esi), %esi			; ATHLON-NEXT: movl (%esi), %esi
	; ATHLON-NEXT: movl (%edi), %edi			; ATHLON-NEXT: movl (%edi), %edi
	; ATHLON-NEXT: movl (%ebx), %ebx			; ATHLON-NEXT: movl (%ebx), %ebx
	; ATHLON-NEXT: movl (%ebp), %ebp			; ATHLON-NEXT: movl (%eax), %eax
				; ATHLON-NEXT: movl %eax, {{[-0-9]+}}(%e{{[sb]}}p) ## 4-byte Spill
	; ATHLON-NEXT: decl %ecx			; ATHLON-NEXT: decl %ecx
				; ATHLON-NEXT: movl 12(%ebp), %eax
	; ATHLON-NEXT: movl %ecx, 20(%eax)			; ATHLON-NEXT: movl %ecx, 20(%eax)
	; ATHLON-NEXT: decl %edx			; ATHLON-NEXT: decl %edx
	; ATHLON-NEXT: movl %edx, 16(%eax)			; ATHLON-NEXT: movl %edx, 16(%eax)
	; ATHLON-NEXT: decl %esi			; ATHLON-NEXT: decl %esi
	; ATHLON-NEXT: movl %esi, 12(%eax)			; ATHLON-NEXT: movl %esi, 12(%eax)
	; ATHLON-NEXT: decl %edi			; ATHLON-NEXT: decl %edi
	; ATHLON-NEXT: movl %edi, 8(%eax)			; ATHLON-NEXT: movl %edi, 8(%eax)
	; ATHLON-NEXT: decl %ebx			; ATHLON-NEXT: decl %ebx
	; ATHLON-NEXT: movl %ebx, 4(%eax)			; ATHLON-NEXT: movl %ebx, 4(%eax)
	; ATHLON-NEXT: decl %ebp			; ATHLON-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %ecx ## 4-byte Reload
	; ATHLON-NEXT: movl %ebp, (%eax)			; ATHLON-NEXT: decl %ecx
				; ATHLON-NEXT: movl %ecx, (%eax)
				; ATHLON-NEXT: leal -12(%ebp), %esp
	; ATHLON-NEXT: popl %esi			; ATHLON-NEXT: popl %esi
	; ATHLON-NEXT: popl %edi			; ATHLON-NEXT: popl %edi
	; ATHLON-NEXT: popl %ebx			; ATHLON-NEXT: popl %ebx
	; ATHLON-NEXT: popl %ebp			; ATHLON-NEXT: popl %ebp
	; ATHLON-NEXT: retl			; ATHLON-NEXT: retl
	;			;
	; MCU-LABEL: test8:			; MCU-LABEL: test8:
	; MCU: # %bb.0:			; MCU: # %bb.0:
	[762 lines not shown]
	; ATHLON-NEXT: cmovel %ecx, %eax			; ATHLON-NEXT: cmovel %ecx, %eax
	; ATHLON-NEXT: ## kill: def $ax killed $ax killed $eax			; ATHLON-NEXT: ## kill: def $ax killed $ax killed $eax
	; ATHLON-NEXT: retl			; ATHLON-NEXT: retl
	;			;
	; MCU-LABEL: select_xor_1b:			; MCU-LABEL: select_xor_1b:
	; MCU: # %bb.0: # %entry			; MCU: # %bb.0: # %entry
	; MCU-NEXT: testb $1, %dl			; MCU-NEXT: testb $1, %dl
	; MCU-NEXT: je .LBB24_2			; MCU-NEXT: je .LBB24_2
	; MCU-NEXT: # %bb.1:			; MCU-NEXT: # %bb.1: # %entry
	; MCU-NEXT: xorl $43, %eax			; MCU-NEXT: xorl $43, %eax
	; MCU-NEXT: .LBB24_2: # %entry			; MCU-NEXT: .LBB24_2: # %entry
	; MCU-NEXT: # kill: def $ax killed $ax killed $eax			; MCU-NEXT: # kill: def $ax killed $ax killed $eax
	; MCU-NEXT: retl			; MCU-NEXT: retl
	entry:			entry:
	%and = and i8 %cond, 1			%and = and i8 %cond, 1
	%cmp10 = icmp ne i8 %and, 1			%cmp10 = icmp ne i8 %and, 1
	%0 = xor i16 %A, 43			%0 = xor i16 %A, 43
	[53 lines not shown]
	; ATHLON-NEXT: testb $1, {{[0-9]+}}(%esp)			; ATHLON-NEXT: testb $1, {{[0-9]+}}(%esp)
	; ATHLON-NEXT: cmovel %ecx, %eax			; ATHLON-NEXT: cmovel %ecx, %eax
	; ATHLON-NEXT: retl			; ATHLON-NEXT: retl
	;			;
	; MCU-LABEL: select_xor_2b:			; MCU-LABEL: select_xor_2b:
	; MCU: # %bb.0: # %entry			; MCU: # %bb.0: # %entry
	; MCU-NEXT: testb $1, %cl			; MCU-NEXT: testb $1, %cl
	; MCU-NEXT: je .LBB26_2			; MCU-NEXT: je .LBB26_2
	; MCU-NEXT: # %bb.1:			; MCU-NEXT: # %bb.1: # %entry
	; MCU-NEXT: xorl %edx, %eax			; MCU-NEXT: xorl %edx, %eax
	; MCU-NEXT: .LBB26_2: # %entry			; MCU-NEXT: .LBB26_2: # %entry
	; MCU-NEXT: retl			; MCU-NEXT: retl
	entry:			entry:
	%and = and i8 %cond, 1			%and = and i8 %cond, 1
	%cmp10 = icmp ne i8 %and, 1			%cmp10 = icmp ne i8 %and, 1
	%0 = xor i32 %B, %A			%0 = xor i32 %B, %A
	%1 = select i1 %cmp10, i32 %A, i32 %0			%1 = select i1 %cmp10, i32 %A, i32 %0
	[52 lines not shown]
	; ATHLON-NEXT: testb $1, {{[0-9]+}}(%esp)			; ATHLON-NEXT: testb $1, {{[0-9]+}}(%esp)
	; ATHLON-NEXT: cmovel %ecx, %eax			; ATHLON-NEXT: cmovel %ecx, %eax
	; ATHLON-NEXT: retl			; ATHLON-NEXT: retl
	;			;
	; MCU-LABEL: select_or_b:			; MCU-LABEL: select_or_b:
	; MCU: # %bb.0: # %entry			; MCU: # %bb.0: # %entry
	; MCU-NEXT: testb $1, %cl			; MCU-NEXT: testb $1, %cl
	; MCU-NEXT: je .LBB28_2			; MCU-NEXT: je .LBB28_2
	; MCU-NEXT: # %bb.1:			; MCU-NEXT: # %bb.1: # %entry
	; MCU-NEXT: orl %edx, %eax			; MCU-NEXT: orl %edx, %eax
	; MCU-NEXT: .LBB28_2: # %entry			; MCU-NEXT: .LBB28_2: # %entry
	; MCU-NEXT: retl			; MCU-NEXT: retl
	entry:			entry:
	%and = and i8 %cond, 1			%and = and i8 %cond, 1
	%cmp10 = icmp ne i8 %and, 1			%cmp10 = icmp ne i8 %and, 1
	%0 = or i32 %B, %A			%0 = or i32 %B, %A
	%1 = select i1 %cmp10, i32 %A, i32 %0			%1 = select i1 %cmp10, i32 %A, i32 %0
	[52 lines not shown]
	; ATHLON-NEXT: testb $1, {{[0-9]+}}(%esp)			; ATHLON-NEXT: testb $1, {{[0-9]+}}(%esp)
	; ATHLON-NEXT: cmovel %ecx, %eax			; ATHLON-NEXT: cmovel %ecx, %eax
	; ATHLON-NEXT: retl			; ATHLON-NEXT: retl
	;			;
	; MCU-LABEL: select_or_1b:			; MCU-LABEL: select_or_1b:
	; MCU: # %bb.0: # %entry			; MCU: # %bb.0: # %entry
	; MCU-NEXT: testb $1, %cl			; MCU-NEXT: testb $1, %cl
	; MCU-NEXT: je .LBB30_2			; MCU-NEXT: je .LBB30_2
	; MCU-NEXT: # %bb.1:			; MCU-NEXT: # %bb.1: # %entry
	; MCU-NEXT: orl %edx, %eax			; MCU-NEXT: orl %edx, %eax
	; MCU-NEXT: .LBB30_2: # %entry			; MCU-NEXT: .LBB30_2: # %entry
	; MCU-NEXT: retl			; MCU-NEXT: retl
	entry:			entry:
	%and = and i32 %cond, 1			%and = and i32 %cond, 1
	%cmp10 = icmp ne i32 %and, 1			%cmp10 = icmp ne i32 %and, 1
	%0 = or i32 %B, %A			%0 = or i32 %B, %A
	%1 = select i1 %cmp10, i32 %A, i32 %0			%1 = select i1 %cmp10, i32 %A, i32 %0
	ret i32 %1			ret i32 %1
	}			}

test/CodeGen/X86/smul_fix.ll

	[155 lines not shown]
	; X64-NEXT: punpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1]			; X64-NEXT: punpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1]
	; X64-NEXT: psrld $2, %xmm0			; X64-NEXT: psrld $2, %xmm0
	; X64-NEXT: por %xmm4, %xmm0			; X64-NEXT: por %xmm4, %xmm0
	; X64-NEXT: retq			; X64-NEXT: retq
	;			;
	; X86-LABEL: vec:			; X86-LABEL: vec:
	; X86: # %bb.0:			; X86: # %bb.0:
	; X86-NEXT: pushl %ebp			; X86-NEXT: pushl %ebp
				; X86-NEXT: movl %esp, %ebp
	; X86-NEXT: pushl %ebx			; X86-NEXT: pushl %ebx
	; X86-NEXT: pushl %edi			; X86-NEXT: pushl %edi
	; X86-NEXT: pushl %esi			; X86-NEXT: pushl %esi
	; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx			; X86-NEXT: andl $-16, %esp
	; X86-NEXT: movl {{[0-9]+}}(%esp), %esi			; X86-NEXT: subl $16, %esp
	; X86-NEXT: movl {{[0-9]+}}(%esp), %edi			; X86-NEXT: movl 36(%ebp), %ecx
	; X86-NEXT: movl {{[0-9]+}}(%esp), %ebx			; X86-NEXT: movl 32(%ebp), %esi
	; X86-NEXT: movl {{[0-9]+}}(%esp), %eax			; X86-NEXT: movl 28(%ebp), %edi
	; X86-NEXT: imull {{[0-9]+}}(%esp)			; X86-NEXT: movl 24(%ebp), %eax
	; X86-NEXT: movl %edx, %ebp			; X86-NEXT: imull 40(%ebp)
	; X86-NEXT: shldl $30, %eax, %ebp
	; X86-NEXT: movl %ebx, %eax
	; X86-NEXT: imull {{[0-9]+}}(%esp)
	; X86-NEXT: movl %edx, %ebx			; X86-NEXT: movl %edx, %ebx
	; X86-NEXT: shldl $30, %eax, %ebx			; X86-NEXT: shldl $30, %eax, %ebx
	; X86-NEXT: movl %edi, %eax			; X86-NEXT: movl %edi, %eax
	; X86-NEXT: imull {{[0-9]+}}(%esp)			; X86-NEXT: imull 44(%ebp)
	; X86-NEXT: movl %edx, %edi			; X86-NEXT: movl %edx, %edi
	; X86-NEXT: shldl $30, %eax, %edi			; X86-NEXT: shldl $30, %eax, %edi
	; X86-NEXT: movl %esi, %eax			; X86-NEXT: movl %esi, %eax
	; X86-NEXT: imull {{[0-9]+}}(%esp)			; X86-NEXT: imull 48(%ebp)
	; X86-NEXT: shldl $30, %eax, %edx			; X86-NEXT: movl %edx, %esi
	; X86-NEXT: movl %edx, 12(%ecx)			; X86-NEXT: shldl $30, %eax, %esi
	; X86-NEXT: movl %edi, 8(%ecx)
	; X86-NEXT: movl %ebx, 4(%ecx)
	; X86-NEXT: movl %ebp, (%ecx)
	; X86-NEXT: movl %ecx, %eax			; X86-NEXT: movl %ecx, %eax
				; X86-NEXT: imull 52(%ebp)
				; X86-NEXT: shldl $30, %eax, %edx
				; X86-NEXT: movl 8(%ebp), %eax
				; X86-NEXT: movl %edx, 12(%eax)
				; X86-NEXT: movl %esi, 8(%eax)
				; X86-NEXT: movl %edi, 4(%eax)
				; X86-NEXT: movl %ebx, (%eax)
				; X86-NEXT: leal -12(%ebp), %esp
	; X86-NEXT: popl %esi			; X86-NEXT: popl %esi
	; X86-NEXT: popl %edi			; X86-NEXT: popl %edi
	; X86-NEXT: popl %ebx			; X86-NEXT: popl %ebx
	; X86-NEXT: popl %ebp			; X86-NEXT: popl %ebp
	; X86-NEXT: retl $4			; X86-NEXT: retl $4
	%tmp = call <4 x i32> @llvm.smul.fix.v4i32(<4 x i32> %x, <4 x i32> %y, i32 2);			%tmp = call <4 x i32> @llvm.smul.fix.v4i32(<4 x i32> %x, <4 x i32> %y, i32 2);
	ret <4 x i32> %tmp;			ret <4 x i32> %tmp;
	}			}
	[77 lines not shown]
	; X64-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3]			; X64-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3]
	; X64-NEXT: pmuludq %xmm2, %xmm1			; X64-NEXT: pmuludq %xmm2, %xmm1
	; X64-NEXT: pshufd {{.*#+}} xmm1 = xmm1[0,2,2,3]			; X64-NEXT: pshufd {{.*#+}} xmm1 = xmm1[0,2,2,3]
	; X64-NEXT: punpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1]			; X64-NEXT: punpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1]
	; X64-NEXT: retq			; X64-NEXT: retq
	;			;
	; X86-LABEL: vec2:			; X86-LABEL: vec2:
	; X86: # %bb.0:			; X86: # %bb.0:
				; X86-NEXT: pushl %ebp
				; X86-NEXT: movl %esp, %ebp
	; X86-NEXT: pushl %edi			; X86-NEXT: pushl %edi
	; X86-NEXT: pushl %esi			; X86-NEXT: pushl %esi
	; X86-NEXT: movl {{[0-9]+}}(%esp), %eax			; X86-NEXT: andl $-16, %esp
	; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx			; X86-NEXT: movl 8(%ebp), %eax
	; X86-NEXT: movl {{[0-9]+}}(%esp), %edx			; X86-NEXT: movl 36(%ebp), %ecx
	; X86-NEXT: movl {{[0-9]+}}(%esp), %esi			; X86-NEXT: movl 32(%ebp), %edx
	; X86-NEXT: movl {{[0-9]+}}(%esp), %edi			; X86-NEXT: movl 28(%ebp), %esi
	; X86-NEXT: imull {{[0-9]+}}(%esp), %edi			; X86-NEXT: movl 24(%ebp), %edi
	; X86-NEXT: imull {{[0-9]+}}(%esp), %esi			; X86-NEXT: imull 40(%ebp), %edi
	; X86-NEXT: imull {{[0-9]+}}(%esp), %edx			; X86-NEXT: imull 44(%ebp), %esi
	; X86-NEXT: imull {{[0-9]+}}(%esp), %ecx			; X86-NEXT: imull 48(%ebp), %edx
				; X86-NEXT: imull 52(%ebp), %ecx
	; X86-NEXT: movl %ecx, 12(%eax)			; X86-NEXT: movl %ecx, 12(%eax)
	; X86-NEXT: movl %edx, 8(%eax)			; X86-NEXT: movl %edx, 8(%eax)
	; X86-NEXT: movl %esi, 4(%eax)			; X86-NEXT: movl %esi, 4(%eax)
	; X86-NEXT: movl %edi, (%eax)			; X86-NEXT: movl %edi, (%eax)
				; X86-NEXT: leal -8(%ebp), %esp
	; X86-NEXT: popl %esi			; X86-NEXT: popl %esi
	; X86-NEXT: popl %edi			; X86-NEXT: popl %edi
				; X86-NEXT: popl %ebp
	; X86-NEXT: retl $4			; X86-NEXT: retl $4
	%tmp = call <4 x i32> @llvm.smul.fix.v4i32(<4 x i32> %x, <4 x i32> %y, i32 0);			%tmp = call <4 x i32> @llvm.smul.fix.v4i32(<4 x i32> %x, <4 x i32> %y, i32 0);
	ret <4 x i32> %tmp;			ret <4 x i32> %tmp;
	}			}

	define i64 @func7(i64 %x, i64 %y) nounwind {			define i64 @func7(i64 %x, i64 %y) nounwind {
	; X64-LABEL: func7:			; X64-LABEL: func7:
	; X64: # %bb.0:			; X64: # %bb.0:
	[106 lines not shown]

test/CodeGen/X86/sse1.ll

	[38 lines not shown]
	; vselect. With SSE1 v4f32 is a legal type but v4i1 (or any vector integer type)			; vselect. With SSE1 v4f32 is a legal type but v4i1 (or any vector integer type)
	; is not. We used to ping pong between splitting the vselect for the v4i			; is not. We used to ping pong between splitting the vselect for the v4i
	; condition operand and widening the resulting vselect for the v4f32 result.			; condition operand and widening the resulting vselect for the v4f32 result.
	; PR18036			; PR18036

	define <4 x float> @vselect(<4 x float>*%p, <4 x i32> %q) {			define <4 x float> @vselect(<4 x float>*%p, <4 x i32> %q) {
	; X86-LABEL: vselect:			; X86-LABEL: vselect:
	; X86: # %bb.0: # %entry			; X86: # %bb.0: # %entry
	; X86-NEXT: cmpl $0, {{[0-9]+}}(%esp)			; X86-NEXT: pushl %ebp
				; X86-NEXT: .cfi_def_cfa_offset 8
				; X86-NEXT: .cfi_offset %ebp, -8
				; X86-NEXT: movl %esp, %ebp
				; X86-NEXT: .cfi_def_cfa_register %ebp
				; X86-NEXT: andl $-16, %esp
				; X86-NEXT: subl $16, %esp
				; X86-NEXT: cmpl $0, 28(%ebp)
	; X86-NEXT: xorps %xmm0, %xmm0			; X86-NEXT: xorps %xmm0, %xmm0
	; X86-NEXT: je .LBB1_1			; X86-NEXT: je .LBB1_1
	; X86-NEXT: # %bb.2: # %entry			; X86-NEXT: # %bb.2: # %entry
	; X86-NEXT: xorps %xmm1, %xmm1			; X86-NEXT: xorps %xmm1, %xmm1
	; X86-NEXT: cmpl $0, {{[0-9]+}}(%esp)			; X86-NEXT: cmpl $0, 32(%ebp)
	; X86-NEXT: jne .LBB1_5			; X86-NEXT: jne .LBB1_5
	; X86-NEXT: .LBB1_4:			; X86-NEXT: .LBB1_4: # %entry
	; X86-NEXT: movss {{.*#+}} xmm2 = mem[0],zero,zero,zero			; X86-NEXT: movss {{.*#+}} xmm2 = mem[0],zero,zero,zero
	; X86-NEXT: cmpl $0, {{[0-9]+}}(%esp)			; X86-NEXT: cmpl $0, 36(%ebp)
	; X86-NEXT: jne .LBB1_8			; X86-NEXT: jne .LBB1_8
	; X86-NEXT: .LBB1_7:			; X86-NEXT: .LBB1_7: # %entry
	; X86-NEXT: movss {{.*#+}} xmm3 = mem[0],zero,zero,zero			; X86-NEXT: movss {{.*#+}} xmm3 = mem[0],zero,zero,zero
	; X86-NEXT: unpcklps {{.*#+}} xmm2 = xmm2[0],xmm3[0],xmm2[1],xmm3[1]			; X86-NEXT: unpcklps {{.*#+}} xmm2 = xmm2[0],xmm3[0],xmm2[1],xmm3[1]
	; X86-NEXT: cmpl $0, {{[0-9]+}}(%esp)			; X86-NEXT: cmpl $0, 24(%ebp)
	; X86-NEXT: je .LBB1_10			; X86-NEXT: je .LBB1_10
	; X86-NEXT: jmp .LBB1_11			; X86-NEXT: jmp .LBB1_11
	; X86-NEXT: .LBB1_1:			; X86-NEXT: .LBB1_1: # %entry
	; X86-NEXT: movss {{.*#+}} xmm1 = mem[0],zero,zero,zero			; X86-NEXT: movss {{.*#+}} xmm1 = mem[0],zero,zero,zero
	; X86-NEXT: cmpl $0, {{[0-9]+}}(%esp)			; X86-NEXT: cmpl $0, 32(%ebp)
	; X86-NEXT: je .LBB1_4			; X86-NEXT: je .LBB1_4
	; X86-NEXT: .LBB1_5: # %entry			; X86-NEXT: .LBB1_5: # %entry
	; X86-NEXT: xorps %xmm2, %xmm2			; X86-NEXT: xorps %xmm2, %xmm2
	; X86-NEXT: cmpl $0, {{[0-9]+}}(%esp)			; X86-NEXT: cmpl $0, 36(%ebp)
	; X86-NEXT: je .LBB1_7			; X86-NEXT: je .LBB1_7
	; X86-NEXT: .LBB1_8: # %entry			; X86-NEXT: .LBB1_8: # %entry
	; X86-NEXT: xorps %xmm3, %xmm3			; X86-NEXT: xorps %xmm3, %xmm3
	; X86-NEXT: unpcklps {{.*#+}} xmm2 = xmm2[0],xmm3[0],xmm2[1],xmm3[1]			; X86-NEXT: unpcklps {{.*#+}} xmm2 = xmm2[0],xmm3[0],xmm2[1],xmm3[1]
	; X86-NEXT: cmpl $0, {{[0-9]+}}(%esp)			; X86-NEXT: cmpl $0, 24(%ebp)
	; X86-NEXT: jne .LBB1_11			; X86-NEXT: jne .LBB1_11
	; X86-NEXT: .LBB1_10:			; X86-NEXT: .LBB1_10: # %entry
	; X86-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero			; X86-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
	; X86-NEXT: .LBB1_11: # %entry			; X86-NEXT: .LBB1_11: # %entry
	; X86-NEXT: unpcklps {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1]			; X86-NEXT: unpcklps {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1]
	; X86-NEXT: movlhps {{.*#+}} xmm0 = xmm0[0],xmm2[0]			; X86-NEXT: movlhps {{.*#+}} xmm0 = xmm0[0],xmm2[0]
				; X86-NEXT: movl %ebp, %esp
				; X86-NEXT: popl %ebp
				; X86-NEXT: .cfi_def_cfa %esp, 4
	; X86-NEXT: retl			; X86-NEXT: retl
	;			;
	; X64-LABEL: vselect:			; X64-LABEL: vselect:
	; X64: # %bb.0: # %entry			; X64: # %bb.0: # %entry
	; X64-NEXT: testl %edx, %edx			; X64-NEXT: testl %edx, %edx
	; X64-NEXT: xorps %xmm0, %xmm0			; X64-NEXT: xorps %xmm0, %xmm0
	; X64-NEXT: je .LBB1_1			; X64-NEXT: je .LBB1_1
	; X64-NEXT: # %bb.2: # %entry			; X64-NEXT: # %bb.2: # %entry
	; X64-NEXT: xorps %xmm1, %xmm1			; X64-NEXT: xorps %xmm1, %xmm1
	; X64-NEXT: testl %ecx, %ecx			; X64-NEXT: testl %ecx, %ecx
	; X64-NEXT: jne .LBB1_5			; X64-NEXT: jne .LBB1_5
	; X64-NEXT: .LBB1_4:			; X64-NEXT: .LBB1_4: # %entry
	; X64-NEXT: movss {{.*#+}} xmm2 = mem[0],zero,zero,zero			; X64-NEXT: movss {{.*#+}} xmm2 = mem[0],zero,zero,zero
	; X64-NEXT: testl %r8d, %r8d			; X64-NEXT: testl %r8d, %r8d
	; X64-NEXT: jne .LBB1_8			; X64-NEXT: jne .LBB1_8
	; X64-NEXT: .LBB1_7:			; X64-NEXT: .LBB1_7: # %entry
	; X64-NEXT: movss {{.*#+}} xmm3 = mem[0],zero,zero,zero			; X64-NEXT: movss {{.*#+}} xmm3 = mem[0],zero,zero,zero
	; X64-NEXT: unpcklps {{.*#+}} xmm2 = xmm2[0],xmm3[0],xmm2[1],xmm3[1]			; X64-NEXT: unpcklps {{.*#+}} xmm2 = xmm2[0],xmm3[0],xmm2[1],xmm3[1]
	; X64-NEXT: testl %esi, %esi			; X64-NEXT: testl %esi, %esi
	; X64-NEXT: je .LBB1_10			; X64-NEXT: je .LBB1_10
	; X64-NEXT: jmp .LBB1_11			; X64-NEXT: jmp .LBB1_11
	; X64-NEXT: .LBB1_1:			; X64-NEXT: .LBB1_1: # %entry
	; X64-NEXT: movss {{.*#+}} xmm1 = mem[0],zero,zero,zero			; X64-NEXT: movss {{.*#+}} xmm1 = mem[0],zero,zero,zero
	; X64-NEXT: testl %ecx, %ecx			; X64-NEXT: testl %ecx, %ecx
	; X64-NEXT: je .LBB1_4			; X64-NEXT: je .LBB1_4
	; X64-NEXT: .LBB1_5: # %entry			; X64-NEXT: .LBB1_5: # %entry
	; X64-NEXT: xorps %xmm2, %xmm2			; X64-NEXT: xorps %xmm2, %xmm2
	; X64-NEXT: testl %r8d, %r8d			; X64-NEXT: testl %r8d, %r8d
	; X64-NEXT: je .LBB1_7			; X64-NEXT: je .LBB1_7
	; X64-NEXT: .LBB1_8: # %entry			; X64-NEXT: .LBB1_8: # %entry
	; X64-NEXT: xorps %xmm3, %xmm3			; X64-NEXT: xorps %xmm3, %xmm3
	; X64-NEXT: unpcklps {{.*#+}} xmm2 = xmm2[0],xmm3[0],xmm2[1],xmm3[1]			; X64-NEXT: unpcklps {{.*#+}} xmm2 = xmm2[0],xmm3[0],xmm2[1],xmm3[1]
	; X64-NEXT: testl %esi, %esi			; X64-NEXT: testl %esi, %esi
	; X64-NEXT: jne .LBB1_11			; X64-NEXT: jne .LBB1_11
	; X64-NEXT: .LBB1_10:			; X64-NEXT: .LBB1_10: # %entry
	; X64-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero			; X64-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
	; X64-NEXT: .LBB1_11: # %entry			; X64-NEXT: .LBB1_11: # %entry
	; X64-NEXT: unpcklps {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1]			; X64-NEXT: unpcklps {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1]
	; X64-NEXT: movlhps {{.*#+}} xmm0 = xmm0[0],xmm2[0]			; X64-NEXT: movlhps {{.*#+}} xmm0 = xmm0[0],xmm2[0]
	; X64-NEXT: retq			; X64-NEXT: retq
	entry:			entry:
	%a1 = icmp eq <4 x i32> %q, zeroinitializer			%a1 = icmp eq <4 x i32> %q, zeroinitializer
	%a14 = select <4 x i1> %a1, <4 x float> <float 1.000000e+00, float 2.000000e+00, float 3.000000e+00, float 4.000000e+0> , <4 x float> zeroinitializer			%a14 = select <4 x i1> %a1, <4 x float> <float 1.000000e+00, float 2.000000e+00, float 3.000000e+00, float 4.000000e+0> , <4 x float> zeroinitializer
	[14 lines not shown]
	}			}

	; Don't crash trying to do the impossible: an integer vector comparison doesn't exist, so we must scalarize.			; Don't crash trying to do the impossible: an integer vector comparison doesn't exist, so we must scalarize.
	; https://llvm.org/bugs/show_bug.cgi?id=30512			; https://llvm.org/bugs/show_bug.cgi?id=30512

	define <4 x i32> @PR30512(<4 x i32> %x, <4 x i32> %y) nounwind {			define <4 x i32> @PR30512(<4 x i32> %x, <4 x i32> %y) nounwind {
	; X86-LABEL: PR30512:			; X86-LABEL: PR30512:
	; X86: # %bb.0:			; X86: # %bb.0:
	; X86-NEXT: pushl %ebx			; X86-NEXT: pushl %ebp
	; X86-NEXT: pushl %edi			; X86-NEXT: movl %esp, %ebp
	; X86-NEXT: pushl %esi			; X86-NEXT: andl $-16, %esp
	; X86-NEXT: subl $16, %esp			; X86-NEXT: subl $32, %esp
	; X86-NEXT: movl {{[0-9]+}}(%esp), %eax			; X86-NEXT: movl 36(%ebp), %eax
	; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx			; X86-NEXT: xorl %ecx, %ecx
	; X86-NEXT: movl {{[0-9]+}}(%esp), %edx			; X86-NEXT: cmpl 52(%ebp), %eax
	; X86-NEXT: movl {{[0-9]+}}(%esp), %esi			; X86-NEXT: movl 32(%ebp), %eax
	; X86-NEXT: movl {{[0-9]+}}(%esp), %edi			; X86-NEXT: sete %cl
	; X86-NEXT: xorl %ebx, %ebx			; X86-NEXT: negl %ecx
	; X86-NEXT: cmpl {{[0-9]+}}(%esp), %edi			; X86-NEXT: movl %ecx, {{[0-9]+}}(%esp)
	; X86-NEXT: sete %bl			; X86-NEXT: xorl %ecx, %ecx
	; X86-NEXT: negl %ebx			; X86-NEXT: cmpl 48(%ebp), %eax
	; X86-NEXT: movl %ebx, {{[0-9]+}}(%esp)			; X86-NEXT: movl 28(%ebp), %eax
	; X86-NEXT: xorl %ebx, %ebx			; X86-NEXT: sete %cl
	; X86-NEXT: cmpl {{[0-9]+}}(%esp), %esi			; X86-NEXT: negl %ecx
	; X86-NEXT: sete %bl			; X86-NEXT: movl %ecx, {{[0-9]+}}(%esp)
	; X86-NEXT: negl %ebx			; X86-NEXT: xorl %ecx, %ecx
	; X86-NEXT: movl %ebx, {{[0-9]+}}(%esp)			; X86-NEXT: cmpl 44(%ebp), %eax
	; X86-NEXT: xorl %ebx, %ebx			; X86-NEXT: movl 24(%ebp), %eax
	; X86-NEXT: cmpl {{[0-9]+}}(%esp), %edx			; X86-NEXT: sete %cl
	; X86-NEXT: sete %bl			; X86-NEXT: negl %ecx
	; X86-NEXT: negl %ebx			; X86-NEXT: movl %ecx, {{[0-9]+}}(%esp)
	; X86-NEXT: movl %ebx, {{[0-9]+}}(%esp)			; X86-NEXT: xorl %ecx, %ecx
	; X86-NEXT: xorl %edx, %edx			; X86-NEXT: cmpl 40(%ebp), %eax
	; X86-NEXT: cmpl {{[0-9]+}}(%esp), %ecx			; X86-NEXT: sete %cl
	; X86-NEXT: sete %dl			; X86-NEXT: negl %ecx
	; X86-NEXT: negl %edx			; X86-NEXT: movl %ecx, {{[0-9]+}}(%esp)
	; X86-NEXT: movl %edx, (%esp)			; X86-NEXT: movl 8(%ebp), %eax
	; X86-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero			; X86-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
	; X86-NEXT: movss {{.*#+}} xmm1 = mem[0],zero,zero,zero			; X86-NEXT: movss {{.*#+}} xmm1 = mem[0],zero,zero,zero
	; X86-NEXT: unpcklps {{.*#+}} xmm1 = xmm1[0],xmm0[0],xmm1[1],xmm0[1]			; X86-NEXT: unpcklps {{.*#+}} xmm1 = xmm1[0],xmm0[0],xmm1[1],xmm0[1]
	; X86-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero			; X86-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
	; X86-NEXT: movss {{.*#+}} xmm2 = mem[0],zero,zero,zero			; X86-NEXT: movss {{.*#+}} xmm2 = mem[0],zero,zero,zero
	; X86-NEXT: unpcklps {{.*#+}} xmm2 = xmm2[0],xmm0[0],xmm2[1],xmm0[1]			; X86-NEXT: unpcklps {{.*#+}} xmm2 = xmm2[0],xmm0[0],xmm2[1],xmm0[1]
	; X86-NEXT: movlhps {{.*#+}} xmm2 = xmm2[0],xmm1[0]			; X86-NEXT: movlhps {{.*#+}} xmm2 = xmm2[0],xmm1[0]
	; X86-NEXT: andps {{\.LCPI.*}}, %xmm2			; X86-NEXT: andps {{\.LCPI.*}}, %xmm2
	; X86-NEXT: movaps %xmm2, (%eax)			; X86-NEXT: movaps %xmm2, (%eax)
	; X86-NEXT: addl $16, %esp			; X86-NEXT: movl %ebp, %esp
	; X86-NEXT: popl %esi			; X86-NEXT: popl %ebp
	; X86-NEXT: popl %edi
	; X86-NEXT: popl %ebx
	; X86-NEXT: retl $4			; X86-NEXT: retl $4
	;			;
	; X64-LABEL: PR30512:			; X64-LABEL: PR30512:
	; X64: # %bb.0:			; X64: # %bb.0:
	; X64-NEXT: movq %rdi, %rax			; X64-NEXT: movq %rdi, %rax
	; X64-NEXT: xorl %edi, %edi			; X64-NEXT: xorl %edi, %edi
	; X64-NEXT: cmpl {{[0-9]+}}(%rsp), %r8d			; X64-NEXT: cmpl {{[0-9]+}}(%rsp), %r8d
	; X64-NEXT: sete %dil			; X64-NEXT: sete %dil
	(59 lines not shown)

test/CodeGen/X86/ssub_sat.ll

(125 lines not shown)
; X64-NEXT: retq
%tmp = call i4 @llvm.ssub.sat.i4(i4 %x, i4 %y);		%tmp = call i4 @llvm.ssub.sat.i4(i4 %x, i4 %y);
ret i4 %tmp;		ret i4 %tmp;
}		}

define <4 x i32> @vec(<4 x i32> %x, <4 x i32> %y) nounwind {		define <4 x i32> @vec(<4 x i32> %x, <4 x i32> %y) nounwind {
; X86-LABEL: vec:		; X86-LABEL: vec:
; X86: # %bb.0:		; X86: # %bb.0:
; X86-NEXT: pushl %ebp		; X86-NEXT: pushl %ebp
		; X86-NEXT: movl %esp, %ebp
; X86-NEXT: pushl %ebx		; X86-NEXT: pushl %ebx
; X86-NEXT: pushl %edi		; X86-NEXT: pushl %edi
; X86-NEXT: pushl %esi		; X86-NEXT: pushl %esi
; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx		; X86-NEXT: andl $-16, %esp
; X86-NEXT: movl {{[0-9]+}}(%esp), %edx		; X86-NEXT: subl $16, %esp
		; X86-NEXT: movl 36(%ebp), %ecx
		; X86-NEXT: movl 52(%ebp), %edx
; X86-NEXT: xorl %eax, %eax		; X86-NEXT: xorl %eax, %eax
; X86-NEXT: movl %ecx, %esi		; X86-NEXT: movl %ecx, %esi
; X86-NEXT: subl %edx, %esi		; X86-NEXT: subl %edx, %esi
; X86-NEXT: setns %al		; X86-NEXT: setns %al
; X86-NEXT: addl $2147483647, %eax # imm = 0x7FFFFFFF		; X86-NEXT: addl $2147483647, %eax # imm = 0x7FFFFFFF
; X86-NEXT: subl %edx, %ecx		; X86-NEXT: subl %edx, %ecx
; X86-NEXT: movl {{[0-9]+}}(%esp), %edx		; X86-NEXT: movl 32(%ebp), %edx
; X86-NEXT: cmovol %eax, %ecx		; X86-NEXT: cmovol %eax, %ecx
; X86-NEXT: movl {{[0-9]+}}(%esp), %esi		; X86-NEXT: movl %ecx, {{[-0-9]+}}(%e{{[sb]}}p) # 4-byte Spill
		; X86-NEXT: movl 48(%ebp), %esi
; X86-NEXT: xorl %eax, %eax		; X86-NEXT: xorl %eax, %eax
; X86-NEXT: movl %edx, %edi		; X86-NEXT: movl %edx, %edi
; X86-NEXT: subl %esi, %edi		; X86-NEXT: subl %esi, %edi
; X86-NEXT: setns %al		; X86-NEXT: setns %al
; X86-NEXT: addl $2147483647, %eax # imm = 0x7FFFFFFF		; X86-NEXT: addl $2147483647, %eax # imm = 0x7FFFFFFF
; X86-NEXT: subl %esi, %edx		; X86-NEXT: subl %esi, %edx
; X86-NEXT: movl {{[0-9]+}}(%esp), %esi		; X86-NEXT: movl 28(%ebp), %esi
; X86-NEXT: cmovol %eax, %edx		; X86-NEXT: cmovol %eax, %edx
; X86-NEXT: movl {{[0-9]+}}(%esp), %edi		; X86-NEXT: movl 44(%ebp), %edi
; X86-NEXT: xorl %eax, %eax		; X86-NEXT: xorl %eax, %eax
; X86-NEXT: movl %esi, %ebx		; X86-NEXT: movl %esi, %ebx
; X86-NEXT: subl %edi, %ebx		; X86-NEXT: subl %edi, %ebx
; X86-NEXT: setns %al		; X86-NEXT: setns %al
; X86-NEXT: addl $2147483647, %eax # imm = 0x7FFFFFFF		; X86-NEXT: addl $2147483647, %eax # imm = 0x7FFFFFFF
; X86-NEXT: subl %edi, %esi		; X86-NEXT: subl %edi, %esi
; X86-NEXT: movl {{[0-9]+}}(%esp), %edi		; X86-NEXT: movl 24(%ebp), %ecx
; X86-NEXT: cmovol %eax, %esi		; X86-NEXT: cmovol %eax, %esi
; X86-NEXT: movl {{[0-9]+}}(%esp), %eax		; X86-NEXT: movl 40(%ebp), %ebx
; X86-NEXT: xorl %ebx, %ebx		; X86-NEXT: xorl %eax, %eax
; X86-NEXT: movl %edi, %ebp		; X86-NEXT: movl %ecx, %edi
; X86-NEXT: subl %eax, %ebp		; X86-NEXT: subl %ebx, %edi
; X86-NEXT: setns %bl		; X86-NEXT: setns %al
; X86-NEXT: addl $2147483647, %ebx # imm = 0x7FFFFFFF		; X86-NEXT: addl $2147483647, %eax # imm = 0x7FFFFFFF
; X86-NEXT: subl %eax, %edi		; X86-NEXT: subl %ebx, %ecx
; X86-NEXT: movl {{[0-9]+}}(%esp), %eax		; X86-NEXT: cmovol %eax, %ecx
; X86-NEXT: cmovol %ebx, %edi		; X86-NEXT: movl 8(%ebp), %eax
; X86-NEXT: movl %ecx, 12(%eax)		; X86-NEXT: movl {{[-0-9]+}}(%e{{[sb]}}p), %edi # 4-byte Reload
		; X86-NEXT: movl %edi, 12(%eax)
; X86-NEXT: movl %edx, 8(%eax)		; X86-NEXT: movl %edx, 8(%eax)
; X86-NEXT: movl %esi, 4(%eax)		; X86-NEXT: movl %esi, 4(%eax)
; X86-NEXT: movl %edi, (%eax)		; X86-NEXT: movl %ecx, (%eax)
		; X86-NEXT: leal -12(%ebp), %esp
; X86-NEXT: popl %esi		; X86-NEXT: popl %esi
; X86-NEXT: popl %edi		; X86-NEXT: popl %edi
; X86-NEXT: popl %ebx		; X86-NEXT: popl %ebx
; X86-NEXT: popl %ebp		; X86-NEXT: popl %ebp
; X86-NEXT: retl $4		; X86-NEXT: retl $4
;		;
; X64-LABEL: vec:		; X64-LABEL: vec:
; X64: # %bb.0:		; X64: # %bb.0:
(28 lines not shown)

test/CodeGen/X86/uadd_sat.ll

	(75 lines not shown)
	; X64-NEXT: retq			; X64-NEXT: retq
	%tmp = call i4 @llvm.uadd.sat.i4(i4 %x, i4 %y);			%tmp = call i4 @llvm.uadd.sat.i4(i4 %x, i4 %y);
	ret i4 %tmp;			ret i4 %tmp;
	}			}

	define <4 x i32> @vec(<4 x i32> %x, <4 x i32> %y) nounwind {			define <4 x i32> @vec(<4 x i32> %x, <4 x i32> %y) nounwind {
	; X86-LABEL: vec:			; X86-LABEL: vec:
	; X86: # %bb.0:			; X86: # %bb.0:
	; X86-NEXT: pushl %ebx			; X86-NEXT: pushl %ebp
				; X86-NEXT: movl %esp, %ebp
	; X86-NEXT: pushl %edi			; X86-NEXT: pushl %edi
	; X86-NEXT: pushl %esi			; X86-NEXT: pushl %esi
	; X86-NEXT: movl {{[0-9]+}}(%esp), %eax			; X86-NEXT: andl $-16, %esp
	; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx			; X86-NEXT: movl 36(%ebp), %ecx
	; X86-NEXT: movl {{[0-9]+}}(%esp), %edx			; X86-NEXT: movl 32(%ebp), %edx
	; X86-NEXT: movl {{[0-9]+}}(%esp), %esi			; X86-NEXT: movl 28(%ebp), %esi
	; X86-NEXT: movl {{[0-9]+}}(%esp), %edi			; X86-NEXT: movl 24(%ebp), %edi
	; X86-NEXT: addl {{[0-9]+}}(%esp), %edi			; X86-NEXT: addl 40(%ebp), %edi
	; X86-NEXT: movl $-1, %ebx			; X86-NEXT: movl $-1, %eax
	; X86-NEXT: cmovbl %ebx, %edi			; X86-NEXT: cmovbl %eax, %edi
	; X86-NEXT: addl {{[0-9]+}}(%esp), %esi			; X86-NEXT: addl 44(%ebp), %esi
	; X86-NEXT: cmovbl %ebx, %esi			; X86-NEXT: cmovbl %eax, %esi
	; X86-NEXT: addl {{[0-9]+}}(%esp), %edx			; X86-NEXT: addl 48(%ebp), %edx
	; X86-NEXT: cmovbl %ebx, %edx			; X86-NEXT: cmovbl %eax, %edx
	; X86-NEXT: addl {{[0-9]+}}(%esp), %ecx			; X86-NEXT: addl 52(%ebp), %ecx
	; X86-NEXT: cmovbl %ebx, %ecx			; X86-NEXT: cmovbl %eax, %ecx
				; X86-NEXT: movl 8(%ebp), %eax
	; X86-NEXT: movl %ecx, 12(%eax)			; X86-NEXT: movl %ecx, 12(%eax)
	; X86-NEXT: movl %edx, 8(%eax)			; X86-NEXT: movl %edx, 8(%eax)
	; X86-NEXT: movl %esi, 4(%eax)			; X86-NEXT: movl %esi, 4(%eax)
	; X86-NEXT: movl %edi, (%eax)			; X86-NEXT: movl %edi, (%eax)
				; X86-NEXT: leal -8(%ebp), %esp
	; X86-NEXT: popl %esi			; X86-NEXT: popl %esi
	; X86-NEXT: popl %edi			; X86-NEXT: popl %edi
	; X86-NEXT: popl %ebx			; X86-NEXT: popl %ebp
	; X86-NEXT: retl $4			; X86-NEXT: retl $4
	;			;
	; X64-LABEL: vec:			; X64-LABEL: vec:
	; X64: # %bb.0:			; X64: # %bb.0:
	; X64-NEXT: movdqa {{.*#+}} xmm2 = [2147483648,2147483648,2147483648,2147483648]			; X64-NEXT: movdqa {{.*#+}} xmm2 = [2147483648,2147483648,2147483648,2147483648]
	; X64-NEXT: paddd %xmm0, %xmm1			; X64-NEXT: paddd %xmm0, %xmm1
	; X64-NEXT: pxor %xmm2, %xmm0			; X64-NEXT: pxor %xmm2, %xmm0
	; X64-NEXT: pxor %xmm1, %xmm2			; X64-NEXT: pxor %xmm1, %xmm2
	; X64-NEXT: pcmpgtd %xmm2, %xmm0			; X64-NEXT: pcmpgtd %xmm2, %xmm0
	; X64-NEXT: por %xmm1, %xmm0			; X64-NEXT: por %xmm1, %xmm0
	; X64-NEXT: retq			; X64-NEXT: retq
	%tmp = call <4 x i32> @llvm.uadd.sat.v4i32(<4 x i32> %x, <4 x i32> %y);			%tmp = call <4 x i32> @llvm.uadd.sat.v4i32(<4 x i32> %x, <4 x i32> %y);
	ret <4 x i32> %tmp;			ret <4 x i32> %tmp;
	}			}

test/CodeGen/X86/umul_fix.ll

	(114 lines not shown)
	; X64-NEXT: punpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1]			; X64-NEXT: punpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1]
	; X64-NEXT: pslld $30, %xmm0			; X64-NEXT: pslld $30, %xmm0
	; X64-NEXT: por %xmm3, %xmm0			; X64-NEXT: por %xmm3, %xmm0
	; X64-NEXT: retq			; X64-NEXT: retq
	;			;
	; X86-LABEL: vec:			; X86-LABEL: vec:
	; X86: # %bb.0:			; X86: # %bb.0:
	; X86-NEXT: pushl %ebp			; X86-NEXT: pushl %ebp
				; X86-NEXT: movl %esp, %ebp
	; X86-NEXT: pushl %ebx			; X86-NEXT: pushl %ebx
	; X86-NEXT: pushl %edi			; X86-NEXT: pushl %edi
	; X86-NEXT: pushl %esi			; X86-NEXT: pushl %esi
	; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx			; X86-NEXT: andl $-16, %esp
	; X86-NEXT: movl {{[0-9]+}}(%esp), %esi			; X86-NEXT: subl $16, %esp
	; X86-NEXT: movl {{[0-9]+}}(%esp), %edi			; X86-NEXT: movl 36(%ebp), %ecx
	; X86-NEXT: movl {{[0-9]+}}(%esp), %ebx			; X86-NEXT: movl 32(%ebp), %esi
	; X86-NEXT: movl {{[0-9]+}}(%esp), %eax			; X86-NEXT: movl 28(%ebp), %edi
	; X86-NEXT: mull {{[0-9]+}}(%esp)			; X86-NEXT: movl 24(%ebp), %eax
	; X86-NEXT: movl %edx, %ebp			; X86-NEXT: mull 40(%ebp)
	; X86-NEXT: shldl $30, %eax, %ebp
	; X86-NEXT: movl %ebx, %eax
	; X86-NEXT: mull {{[0-9]+}}(%esp)
	; X86-NEXT: movl %edx, %ebx			; X86-NEXT: movl %edx, %ebx
	; X86-NEXT: shldl $30, %eax, %ebx			; X86-NEXT: shldl $30, %eax, %ebx
	; X86-NEXT: movl %edi, %eax			; X86-NEXT: movl %edi, %eax
	; X86-NEXT: mull {{[0-9]+}}(%esp)			; X86-NEXT: mull 44(%ebp)
	; X86-NEXT: movl %edx, %edi			; X86-NEXT: movl %edx, %edi
	; X86-NEXT: shldl $30, %eax, %edi			; X86-NEXT: shldl $30, %eax, %edi
	; X86-NEXT: movl %esi, %eax			; X86-NEXT: movl %esi, %eax
	; X86-NEXT: mull {{[0-9]+}}(%esp)			; X86-NEXT: mull 48(%ebp)
	; X86-NEXT: shldl $30, %eax, %edx			; X86-NEXT: movl %edx, %esi
	; X86-NEXT: movl %edx, 12(%ecx)			; X86-NEXT: shldl $30, %eax, %esi
	; X86-NEXT: movl %edi, 8(%ecx)
	; X86-NEXT: movl %ebx, 4(%ecx)
	; X86-NEXT: movl %ebp, (%ecx)
	; X86-NEXT: movl %ecx, %eax			; X86-NEXT: movl %ecx, %eax
				; X86-NEXT: mull 52(%ebp)
				; X86-NEXT: shldl $30, %eax, %edx
				; X86-NEXT: movl 8(%ebp), %eax
				; X86-NEXT: movl %edx, 12(%eax)
				; X86-NEXT: movl %esi, 8(%eax)
				; X86-NEXT: movl %edi, 4(%eax)
				; X86-NEXT: movl %ebx, (%eax)
				; X86-NEXT: leal -12(%ebp), %esp
	; X86-NEXT: popl %esi			; X86-NEXT: popl %esi
	; X86-NEXT: popl %edi			; X86-NEXT: popl %edi
	; X86-NEXT: popl %ebx			; X86-NEXT: popl %ebx
	; X86-NEXT: popl %ebp			; X86-NEXT: popl %ebp
	; X86-NEXT: retl $4			; X86-NEXT: retl $4
	%tmp = call <4 x i32> @llvm.umul.fix.v4i32(<4 x i32> %x, <4 x i32> %y, i32 2);			%tmp = call <4 x i32> @llvm.umul.fix.v4i32(<4 x i32> %x, <4 x i32> %y, i32 2);
	ret <4 x i32> %tmp;			ret <4 x i32> %tmp;
	}			}
	(70 lines not shown)
	; X64-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3]			; X64-NEXT: pshufd {{.*#+}} xmm1 = xmm1[1,1,3,3]
	; X64-NEXT: pmuludq %xmm2, %xmm1			; X64-NEXT: pmuludq %xmm2, %xmm1
	; X64-NEXT: pshufd {{.*#+}} xmm1 = xmm1[0,2,2,3]			; X64-NEXT: pshufd {{.*#+}} xmm1 = xmm1[0,2,2,3]
	; X64-NEXT: punpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1]			; X64-NEXT: punpckldq {{.*#+}} xmm0 = xmm0[0],xmm1[0],xmm0[1],xmm1[1]
	; X64-NEXT: retq			; X64-NEXT: retq
	;			;
	; X86-LABEL: vec2:			; X86-LABEL: vec2:
	; X86: # %bb.0:			; X86: # %bb.0:
				; X86-NEXT: pushl %ebp
				; X86-NEXT: movl %esp, %ebp
	; X86-NEXT: pushl %edi			; X86-NEXT: pushl %edi
	; X86-NEXT: pushl %esi			; X86-NEXT: pushl %esi
	; X86-NEXT: movl {{[0-9]+}}(%esp), %eax			; X86-NEXT: andl $-16, %esp
	; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx			; X86-NEXT: movl 8(%ebp), %eax
	; X86-NEXT: movl {{[0-9]+}}(%esp), %edx			; X86-NEXT: movl 36(%ebp), %ecx
	; X86-NEXT: movl {{[0-9]+}}(%esp), %esi			; X86-NEXT: movl 32(%ebp), %edx
	; X86-NEXT: movl {{[0-9]+}}(%esp), %edi			; X86-NEXT: movl 28(%ebp), %esi
	; X86-NEXT: imull {{[0-9]+}}(%esp), %edi			; X86-NEXT: movl 24(%ebp), %edi
	; X86-NEXT: imull {{[0-9]+}}(%esp), %esi			; X86-NEXT: imull 40(%ebp), %edi
	; X86-NEXT: imull {{[0-9]+}}(%esp), %edx			; X86-NEXT: imull 44(%ebp), %esi
	; X86-NEXT: imull {{[0-9]+}}(%esp), %ecx			; X86-NEXT: imull 48(%ebp), %edx
				; X86-NEXT: imull 52(%ebp), %ecx
	; X86-NEXT: movl %ecx, 12(%eax)			; X86-NEXT: movl %ecx, 12(%eax)
	; X86-NEXT: movl %edx, 8(%eax)			; X86-NEXT: movl %edx, 8(%eax)
	; X86-NEXT: movl %esi, 4(%eax)			; X86-NEXT: movl %esi, 4(%eax)
	; X86-NEXT: movl %edi, (%eax)			; X86-NEXT: movl %edi, (%eax)
				; X86-NEXT: leal -8(%ebp), %esp
	; X86-NEXT: popl %esi			; X86-NEXT: popl %esi
	; X86-NEXT: popl %edi			; X86-NEXT: popl %edi
				; X86-NEXT: popl %ebp
	; X86-NEXT: retl $4			; X86-NEXT: retl $4
	%tmp = call <4 x i32> @llvm.umul.fix.v4i32(<4 x i32> %x, <4 x i32> %y, i32 0);			%tmp = call <4 x i32> @llvm.umul.fix.v4i32(<4 x i32> %x, <4 x i32> %y, i32 0);
	ret <4 x i32> %tmp;			ret <4 x i32> %tmp;
	}			}

	define i64 @func7(i64 %x, i64 %y) nounwind {			define i64 @func7(i64 %x, i64 %y) nounwind {
	; X64-LABEL: func7:			; X64-LABEL: func7:
	; X64: # %bb.0:			; X64: # %bb.0:
	(133 lines not shown)

test/CodeGen/X86/usub_sat.ll

	(75 lines not shown)
	; X64-NEXT: retq			; X64-NEXT: retq
	%tmp = call i4 @llvm.usub.sat.i4(i4 %x, i4 %y);			%tmp = call i4 @llvm.usub.sat.i4(i4 %x, i4 %y);
	ret i4 %tmp;			ret i4 %tmp;
	}			}

	define <4 x i32> @vec(<4 x i32> %x, <4 x i32> %y) nounwind {			define <4 x i32> @vec(<4 x i32> %x, <4 x i32> %y) nounwind {
	; X86-LABEL: vec:			; X86-LABEL: vec:
	; X86: # %bb.0:			; X86: # %bb.0:
	; X86-NEXT: pushl %ebx			; X86-NEXT: pushl %ebp
				; X86-NEXT: movl %esp, %ebp
	; X86-NEXT: pushl %edi			; X86-NEXT: pushl %edi
	; X86-NEXT: pushl %esi			; X86-NEXT: pushl %esi
	; X86-NEXT: movl {{[0-9]+}}(%esp), %eax			; X86-NEXT: andl $-16, %esp
	; X86-NEXT: movl {{[0-9]+}}(%esp), %ecx			; X86-NEXT: movl 36(%ebp), %ecx
	; X86-NEXT: movl {{[0-9]+}}(%esp), %edx			; X86-NEXT: movl 32(%ebp), %edx
	; X86-NEXT: movl {{[0-9]+}}(%esp), %esi			; X86-NEXT: movl 28(%ebp), %esi
	; X86-NEXT: movl {{[0-9]+}}(%esp), %edi			; X86-NEXT: movl 24(%ebp), %edi
	; X86-NEXT: xorl %ebx, %ebx			; X86-NEXT: xorl %eax, %eax
	; X86-NEXT: subl {{[0-9]+}}(%esp), %edi			; X86-NEXT: subl 40(%ebp), %edi
	; X86-NEXT: cmovbl %ebx, %edi			; X86-NEXT: cmovbl %eax, %edi
	; X86-NEXT: subl {{[0-9]+}}(%esp), %esi			; X86-NEXT: subl 44(%ebp), %esi
	; X86-NEXT: cmovbl %ebx, %esi			; X86-NEXT: cmovbl %eax, %esi
	; X86-NEXT: subl {{[0-9]+}}(%esp), %edx			; X86-NEXT: subl 48(%ebp), %edx
	; X86-NEXT: cmovbl %ebx, %edx			; X86-NEXT: cmovbl %eax, %edx
	; X86-NEXT: subl {{[0-9]+}}(%esp), %ecx			; X86-NEXT: subl 52(%ebp), %ecx
	; X86-NEXT: cmovbl %ebx, %ecx			; X86-NEXT: cmovbl %eax, %ecx
				; X86-NEXT: movl 8(%ebp), %eax
	; X86-NEXT: movl %ecx, 12(%eax)			; X86-NEXT: movl %ecx, 12(%eax)
	; X86-NEXT: movl %edx, 8(%eax)			; X86-NEXT: movl %edx, 8(%eax)
	; X86-NEXT: movl %esi, 4(%eax)			; X86-NEXT: movl %esi, 4(%eax)
	; X86-NEXT: movl %edi, (%eax)			; X86-NEXT: movl %edi, (%eax)
				; X86-NEXT: leal -8(%ebp), %esp
	; X86-NEXT: popl %esi			; X86-NEXT: popl %esi
	; X86-NEXT: popl %edi			; X86-NEXT: popl %edi
	; X86-NEXT: popl %ebx			; X86-NEXT: popl %ebp
	; X86-NEXT: retl $4			; X86-NEXT: retl $4
	;			;
	; X64-LABEL: vec:			; X64-LABEL: vec:
	; X64: # %bb.0:			; X64: # %bb.0:
	; X64-NEXT: movdqa {{.*#+}} xmm2 = [2147483648,2147483648,2147483648,2147483648]			; X64-NEXT: movdqa {{.*#+}} xmm2 = [2147483648,2147483648,2147483648,2147483648]
	; X64-NEXT: movdqa %xmm1, %xmm3			; X64-NEXT: movdqa %xmm1, %xmm3
	; X64-NEXT: pxor %xmm2, %xmm3			; X64-NEXT: pxor %xmm2, %xmm3
	; X64-NEXT: pxor %xmm0, %xmm2			; X64-NEXT: pxor %xmm0, %xmm2
	; X64-NEXT: pcmpgtd %xmm3, %xmm2			; X64-NEXT: pcmpgtd %xmm3, %xmm2
	; X64-NEXT: psubd %xmm1, %xmm0			; X64-NEXT: psubd %xmm1, %xmm0
	; X64-NEXT: pand %xmm2, %xmm0			; X64-NEXT: pand %xmm2, %xmm0
	; X64-NEXT: retq			; X64-NEXT: retq
	%tmp = call <4 x i32> @llvm.usub.sat.v4i32(<4 x i32> %x, <4 x i32> %y);			%tmp = call <4 x i32> @llvm.usub.sat.v4i32(<4 x i32> %x, <4 x i32> %y);
	ret <4 x i32> %tmp;			ret <4 x i32> %tmp;
	}			}
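The updated X86 check lines in these tests share one prologue and epilogue shape: %ebp is set up as a frame pointer, %esp is rounded down with andl $-16, arguments are read at fixed offsets from %ebp, and leal -N(%ebp), %esp restores the stack pointer before the pops. What follows is only a minimal C++ sketch of that arithmetic. The names alignDown16 and espBeforePops are hypothetical and do not come from LLVM, and the 4-bytes-per-push rule is an assumption read off the check lines above.

```cpp
#include <cassert>
#include <cstdint>

// Models `andl $-16, %esp`: clearing the low four bits rounds the stack
// pointer down to a 16-byte boundary.
std::uint32_t alignDown16(std::uint32_t Esp) {
  return Esp & ~static_cast<std::uint32_t>(15);
}

// Models `leal -N(%ebp), %esp` in the epilogue, assuming N is 4 bytes per
// callee-saved register pushed after `movl %esp, %ebp`.
std::uint32_t espBeforePops(std::uint32_t Ebp, unsigned NumPushedAfterEbp) {
  return Ebp - 4u * NumPushedAfterEbp;
}
```

With two registers pushed after %ebp (%edi and %esi in uadd_sat.ll and usub_sat.ll) the displacement is 8, matching leal -8(%ebp), %esp. With three (%ebx, %edi, %esi in ssub_sat.ll and in vec of umul_fix.ll) it is 12. Because the amount removed by the andl is not known statically, the new code addresses its arguments through %ebp rather than %esp.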

test/CodeGen/X86/win32-pic-jumptable.ll

	; RUN: llc < %s -relocation-model=pic \| FileCheck %s			; RUN: llc < %s -relocation-model=pic \| FileCheck %s

	; CHECK: calll L0$pb			; CHECK: calll L0$pb
	; CHECK-NEXT: .cfi_adjust_cfa_offset 4
	; CHECK-NEXT: L0$pb:			; CHECK-NEXT: L0$pb:
	; CHECK-NEXT: popl %eax			; CHECK-NEXT: popl %eax
	; CHECK-NEXT: .cfi_adjust_cfa_offset -4
	; CHECK-NEXT: addl LJTI0_0(,%ecx,4), %eax			; CHECK-NEXT: addl LJTI0_0(,%ecx,4), %eax
	; CHECK-NEXT: jmpl *%eax			; CHECK-NEXT: jmpl *%eax

	; CHECK: LJTI0_0:			; CHECK: LJTI0_0:
	; CHECK-NEXT: .long LBB0_2-L0$pb			; CHECK-NEXT: .long LBB0_2-L0$pb
	; CHECK-NEXT: .long LBB0_3-L0$pb			; CHECK-NEXT: .long LBB0_3-L0$pb
	; CHECK-NEXT: .long LBB0_4-L0$pb			; CHECK-NEXT: .long LBB0_4-L0$pb
	; CHECK-NEXT: .long LBB0_5-L0$pb			; CHECK-NEXT: .long LBB0_5-L0$pb
	(23 lines not shown)

utils/TableGen/CallingConvEmitter.cpp

(201 lines not shown)
if (Action->isSubClassOf("CCDelegateTo")) {
O << "\n" << IndentStr		O << "\n" << IndentStr
<< " State.getMachineFunction().getDataLayout()."		<< " State.getMachineFunction().getDataLayout()."
"getABITypeAlignment(EVT(LocVT).getTypeForEVT(State.getContext()"		"getABITypeAlignment(EVT(LocVT).getTypeForEVT(State.getContext()"
"))";		"))";
O << ");\n" << IndentStr		O << ");\n" << IndentStr
<< "State.addLoc(CCValAssign::getMem(ValNo, ValVT, Offset"		<< "State.addLoc(CCValAssign::getMem(ValNo, ValVT, Offset"
<< Counter << ", LocVT, LocInfo));\n";		<< Counter << ", LocVT, LocInfo));\n";
O << IndentStr << "return false;\n";		O << IndentStr << "return false;\n";
		} else if (Action->isSubClassOf("CCAssignToStackWithOrigAlign")) {
		int Size = Action->getValueAsInt("Size");
		int Align = Action->getValueAsInt("Align");

		O << IndentStr << "unsigned Offset" << ++Counter
		<< " = State.AllocateStack(";
		if (Size)
		O << Size << ", ";
		else
		O << "\n" << IndentStr
		<< " State.getMachineFunction().getDataLayout()."
		"getTypeAllocSize(EVT(LocVT).getTypeForEVT(State.getContext())),"
		" ";
		O << "std::max(ArgFlags.getOrigAlign(), (unsigned int)";
		if (Align)
		O << Align;
		else
		O << "\n" << IndentStr
		<< " State.getMachineFunction().getDataLayout()."
		"getABITypeAlignment(EVT(LocVT).getTypeForEVT(State.getContext()"
		"))";
		O << "));\n" << IndentStr
		<< "State.addLoc(CCValAssign::getMem(ValNo, ValVT, Offset"
		<< Counter << ", LocVT, LocInfo));\n";
		O << IndentStr << "return false;\n";
} else if (Action->isSubClassOf("CCAssignToStackWithShadow")) {		} else if (Action->isSubClassOf("CCAssignToStackWithShadow")) {
int Size = Action->getValueAsInt("Size");		int Size = Action->getValueAsInt("Size");
int Align = Action->getValueAsInt("Align");		int Align = Action->getValueAsInt("Align");
ListInit *ShadowRegList = Action->getValueAsListInit("ShadowRegList");		ListInit *ShadowRegList = Action->getValueAsListInit("ShadowRegList");

unsigned ShadowRegListNumber = ++Counter;		unsigned ShadowRegListNumber = ++Counter;

O << IndentStr << "static const MCPhysReg ShadowRegList"		O << IndentStr << "static const MCPhysReg ShadowRegList"
<< ShadowRegListNumber << "[] = {\n";		<< ShadowRegListNumber << "[] = {\n";
(79 lines not shown)
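The new CCAssignToStackWithOrigAlign branch emits a State.AllocateStack call whose alignment is std::max(ArgFlags.getOrigAlign(), Align). A minimal C++ sketch of how that choice could produce the offsets seen in the updated tests is given below. StackState, allocateWithOrigAlign, allocateV4I32, and ebpDisplacement are hypothetical names, not LLVM APIs. Following the revision summary, the sketch assumes that only the first piece of a broken-down argument is allocated with the original alignment and that the remaining pieces use the ordinary 4-byte rule. The X86CallingConv.td rule that selects the action is not shown in this part of the diff.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for the offset bookkeeping behind
// State.AllocateStack(Size, Align); this is not the real llvm::CCState.
struct StackState {
  unsigned NextOffset = 0;

  // Round the running offset up to Align (a power of two), reserve Size
  // bytes, and return the offset of the reserved slot.
  unsigned allocate(unsigned Size, unsigned Align) {
    assert(Align != 0 && (Align & (Align - 1)) == 0);
    NextOffset = (NextOffset + Align - 1) & ~(Align - 1);
    unsigned Offset = NextOffset;
    NextOffset += Size;
    return Offset;
  }
};

// Same alignment choice as the emitted code:
// std::max(ArgFlags.getOrigAlign(), (unsigned int)Align).
unsigned allocateWithOrigAlign(StackState &State, unsigned Size,
                               unsigned SlotAlign, unsigned OrigAlign) {
  return State.allocate(Size, std::max(OrigAlign, SlotAlign));
}

// Assumption: a <4 x i32> argument is split into four i32 pieces and only
// the first piece is allocated with the original 16-byte alignment.
unsigned allocateV4I32(StackState &State) {
  unsigned First = allocateWithOrigAlign(State, 4, 4, 16);
  for (int I = 0; I < 3; ++I)
    State.allocate(4, 4);
  return First;
}

// Assumption: after `pushl %ebp; movl %esp, %ebp`, the saved %ebp and the
// return address place the first stack argument at 8(%ebp).
unsigned ebpDisplacement(unsigned ArgOffset) { return ArgOffset + 8; }
```

Allocating a 4-byte slot (assumed here to be the hidden result pointer implied by retl $4) and then two <4 x i32> arguments this way gives displacements 8, 24, and 40. That is consistent with the 8(%ebp), 24(%ebp) to 36(%ebp), and 40(%ebp) to 52(%ebp) operands in the updated uadd_sat.ll checks.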


Revision Contents

Diff 196970

include/llvm/Target/TargetCallingConv.td

lib/Target/X86/X86CallingConv.td

test/CodeGen/X86/add.ll

test/CodeGen/X86/bool-vector.ll

test/CodeGen/X86/cmovcmov.ll

test/CodeGen/X86/extract-store.ll

test/CodeGen/X86/gather-addresses.ll

test/CodeGen/X86/legalize-shift-64.ll

test/CodeGen/X86/legalize-shl-vec.ll

test/CodeGen/X86/masked_gather_scatter.ll

test/CodeGen/X86/mmx-arg-passing.ll

test/CodeGen/X86/movtopush.ll

test/CodeGen/X86/sadd_sat.ll

test/CodeGen/X86/scalar-fp-to-i64.ll

test/CodeGen/X86/select.ll

test/CodeGen/X86/smul_fix.ll

test/CodeGen/X86/sse1.ll

test/CodeGen/X86/ssub_sat.ll

test/CodeGen/X86/uadd_sat.ll

test/CodeGen/X86/umul_fix.ll

test/CodeGen/X86/usub_sat.ll

test/CodeGen/X86/win32-pic-jumptable.ll

utils/TableGen/CallingConvEmitter.cpp
