This is an archive of the discontinued LLVM Phabricator instance.

[DAG] fold FP binops with undef operands to NaN
ClosedPublic

Authored by spatel on May 17 2018, 12:46 PM.


Details

Reviewers

efriedma
RKSimon
arsenm
jlebar
mcberg2017
craig.topper
javed.absar
tstellar
FarhanaAleen
nhaehnle
rampitec

Commits

rG17a870f07cf4: [DAG] fold FP binops with undef operands to NaN
rL332920: [DAG] fold FP binops with undef operands to NaN

Summary

This is the FP sibling of D43141 with the corresponding IR change in rL327212.

We can't propagate undef here because if a variable operand is a NaN, these binops must propagate NaN. Neither global nor node-level fast-math makes a difference. If we have 'nnan', I think later folds can turn the NaN into undef.
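The NaN-propagation argument can be checked with a small standalone C++ sketch. This is not LLVM code: `FPBinop`, `applyBinop`, and `nanOperandAlwaysGivesNaN` are hypothetical names used only for illustration, and the sketch assumes IEEE-754 floats compiled without fast-math options. It shows that each of the five binops yields NaN whenever one operand is NaN, whatever the other operand is. Since undef may be chosen to be NaN, NaN is always a legal result for the fold, while undef is not, because a NaN variable operand forces a NaN result.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical enum naming the five FP binops covered by the patch.
enum class FPBinop { FAdd, FSub, FMul, FDiv, FRem };

// Evaluate one binop on two floats using the host's IEEE-754 arithmetic.
inline float applyBinop(FPBinop Op, float A, float B) {
  switch (Op) {
  case FPBinop::FAdd: return A + B;
  case FPBinop::FSub: return A - B;
  case FPBinop::FMul: return A * B;
  case FPBinop::FDiv: return A / B;
  case FPBinop::FRem: return std::fmod(A, B);
  }
  return 0.0f; // not reached; all enumerators are handled above
}

// Returns true if every binop produces NaN when either operand is NaN
// and the remaining operand is Other.
inline bool nanOperandAlwaysGivesNaN(float Other) {
  const float NaN = std::numeric_limits<float>::quiet_NaN();
  const FPBinop Ops[] = {FPBinop::FAdd, FPBinop::FSub, FPBinop::FMul,
                         FPBinop::FDiv, FPBinop::FRem};
  for (FPBinop Op : Ops) {
    if (!std::isnan(applyBinop(Op, NaN, Other)))
      return false;
    if (!std::isnan(applyBinop(Op, Other, NaN)))
      return false;
  }
  return true;
}
```

Calling `nanOperandAlwaysGivesNaN` with an ordinary value, zero, or infinity should return true in each case under the stated assumptions.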

The tests in X86/fp-undef.ll are meant to be the definitive verification for these folds - everything reduces identically now.

The other test changes are collateral damage that I wasn't sure what to do with (see the many test changes I committed in the last day for attempts to preserve functionality independently of this change).

Let me describe those test diffs, and someone with a better understanding may be able to fix those tests:
AArch64/fcvt_combine.ll - we constant folded the fmul, not sure why the expectation was different
AMDGPU/mad-mix-lo.ll - don't know anything about what's happening here
NVPTX/implicit-def.ll - this isn't testing what it intended to test - delete the file.
X86/pr23103.ll - this isn't testing what it intended to test, but I don't know how to do that...could just delete the file?
X86/vector-reduce-fadd.ll and X86/vector-reduce-fmul.ll - I think all of these diffs are the unexpected case where the accumulator param is supposed to be used because it's a strict reduction, but we're passing in undef. Just verifies that we don't crash?
http://llvm.org/docs/LangRef.html#llvm-experimental-vector-reduce-fadd-intrinsic
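For context on the vector-reduce diffs: in a strict (ordered) fadd reduction the accumulator is the starting value and the elements are added in order, so the accumulator always participates in the result. A minimal standalone C++ sketch of that sequential semantics follows. The name `strictReduceFAdd` is hypothetical and this is not the intrinsic's actual lowering; the NaN accumulator stands in for what an undef accumulator may effectively become, assuming the reduction is expanded into FADD nodes.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

// Hypothetical model of an ordered reduction: ((Acc + V[0]) + V[1]) + ...
// The evaluation order is fixed; nothing is reassociated.
inline float strictReduceFAdd(float Acc, const std::vector<float> &V) {
  float Result = Acc;
  for (float Elt : V)
    Result += Elt;
  return Result;
}
```

With a finite accumulator the result is the ordinary running sum; with a NaN accumulator every partial sum is NaN, so the whole reduction is NaN regardless of the vector contents.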

Diff Detail

Repository: rL LLVM

Event Timeline

spatel created this revision. May 17 2018, 12:46 PM

Herald added a reviewer: javed.absar. May 17 2018, 12:46 PM

Herald added subscribers: kristof.beyls, tpr, nhaehnle and 3 others.

jlebar added a comment.

NVPTX/implicit-def.ll - this isn't testing what it intended to test, but I don't know how to do that...could just delete the file?

This test is from 2013 and hasn't had a meaningful change since then. It's not totally clear to me what it's testing, but it looks like it's checking for a crasher.

I suspect that in the years since then we're a-ok on covering this. I support deleting it.

(Or if you want to fix up the test like you have it here, that's also OK with me.)

spatel mentioned this in D46973: Extending undef support for float arithmetic to isFast IR flags. May 17 2018, 1:02 PM

In D47026#1103526, @jlebar wrote:

NVPTX/implicit-def.ll - this isn't testing what it intended to test, but I don't know how to do that...could just delete the file?

This test is from 2013 and hasn't had a meaningful change since then. It's not totally clear to me what it's testing, but it looks like it's checking for a crasher.

I suspect that in the years since then we're a-ok on covering this. I support deleting it.

(Or if you want to fix up the test like you have it here, that's also OK with me.)

I think it's better to remove it since there is no implicit def possibility if we fold the undef to NaN. Keeping it around will just confuse others about the intent if it needs changing in the future.

Patch updated:
Removed NVPTX test since it's no longer testing the implicit def scenario that it was trying to check.

spatel edited the summary of this revision. May 18 2018, 7:00 AM

mcberg2017 added a comment.

This looks good to me. Up to you Sanjay if you want a second set of eyes to confirm.

This revision is now accepted and ready to land. May 18 2018, 2:30 PM

In D47026#1105115, @mcberg2017 wrote:

This looks good to me. Up to you Sanjay if you want a second set of eyes to confirm.

Thanks. I'll let this sit until Monday at least, so others have a chance to reply if they'd like. Let me also add some more potential AMDGPU folks for that test diff in particular.

Closed by commit rL332920: [DAG] fold FP binops with undef operands to NaN (authored by spatel). May 21 2018, 4:58 PM

This revision was automatically updated to reflect the committed changes.

Revision Contents

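Path - Size

llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp - 23 lines
llvm/trunk/test/CodeGen/AArch64/fcvt_combine.ll - 3 lines
llvm/trunk/test/CodeGen/AMDGPU/mad-mix-lo.ll - 7 lines
llvm/trunk/test/CodeGen/NVPTX/implicit-def.ll - 9 lines
llvm/trunk/test/CodeGen/X86/fp-undef.ll - 466 lines
llvm/trunk/test/CodeGen/X86/pr23103.ll - 13 lines
llvm/trunk/test/CodeGen/X86/vector-reduce-fadd.ll - 145 lines
llvm/trunk/test/CodeGen/X86/vector-reduce-fmul.ll - 145 lines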

Diff 147915

llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp


[4,761 lines not shown; enclosing context: if (Opcode == ISD::FP_ROUND) {]
       // This can return overflow, underflow, or inexact; we don't care.
       // FIXME need to be more flexible about rounding mode.
       (void)V.convert(EVTToAPFloatSemantics(VT),
                       APFloat::rmNearestTiesToEven, &ignored);
       return getConstantFP(V, DL, VT);
     }
   }

+  // Any FP binop with an undef operand is folded to NaN. This matches the
+  // behavior of the IR optimizer.
+  switch (Opcode) {
+  case ISD::FADD:
+  case ISD::FSUB:
+  case ISD::FMUL:
+  case ISD::FDIV:
+  case ISD::FREM:
+    if (N1.isUndef() || N2.isUndef())
+      return getConstantFP(APFloat::getNaN(EVTToAPFloatSemantics(VT)), DL, VT);
+  }
+
   // Canonicalize an UNDEF to the RHS, even over a constant.
   if (N1.isUndef()) {
     if (TLI->isCommutativeBinOp(Opcode)) {
       std::swap(N1, N2);
     } else {
       switch (Opcode) {
       case ISD::FP_ROUND_INREG:
       case ISD::SIGN_EXTEND_INREG:
       case ISD::SUB:
-      case ISD::FSUB:
-      case ISD::FDIV:
-      case ISD::FREM:
         return getUNDEF(VT); // fold op(undef, arg2) -> undef
       case ISD::UDIV:
       case ISD::SDIV:
       case ISD::UREM:
       case ISD::SREM:
       case ISD::SRA:
       case ISD::SRL:
       case ISD::SHL:
[18 lines not shown; enclosing context: if (N2.isUndef()) {]
     case ISD::UDIV:
     case ISD::SDIV:
     case ISD::UREM:
     case ISD::SREM:
     case ISD::SRA:
     case ISD::SRL:
     case ISD::SHL:
       return getUNDEF(VT); // fold op(arg1, undef) -> undef
-    case ISD::FADD:
-    case ISD::FSUB:
-    case ISD::FMUL:
-    case ISD::FDIV:
-    case ISD::FREM:
-      if (getTarget().Options.UnsafeFPMath)
-        return N2;
-      break;
     case ISD::MUL:
     case ISD::AND:
       return getConstant(0, DL, VT); // fold op(arg1, undef) -> 0
     case ISD::OR:
       return getAllOnesConstant(DL, VT);
     }
   }

[3,878 lines not shown]
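The new early check in the hunk above can be modeled outside of LLVM with a short C++17 sketch. Everything here is a stand-in: `Operand` uses `std::nullopt` to represent an undef operand, and `foldFPBinopWithUndef` is a hypothetical name, not a SelectionDAG API. The sketch only mirrors the shape of the added `switch`: if either operand is undef, the result is a NaN constant, and otherwise this fold does not apply.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <optional>

// Hypothetical stand-in for an operand: std::nullopt models undef.
using Operand = std::optional<float>;

// Models only the added check for FADD/FSUB/FMUL/FDIV/FREM.
// Returns a NaN constant when either operand is undef, and std::nullopt
// ("no fold here") when both operands are defined.
inline std::optional<float> foldFPBinopWithUndef(const Operand &N1,
                                                 const Operand &N2) {
  if (!N1.has_value() || !N2.has_value())
    return std::numeric_limits<float>::quiet_NaN();
  return std::nullopt;
}
```

Note that the model ignores fast-math flags and the unsafe-FP-math option on purpose, matching the summary's statement that neither makes a difference for this fold.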

llvm/trunk/test/CodeGen/AArch64/fcvt_combine.ll

[94 lines not shown]
 ; CHECK: fcvtzu.2s v0, v0
 ; CHECK: ret
 define <2 x i32> @test9(<2 x float> %f) {
   %mul.i = fmul <2 x float> %f, <float 16.000000e+00, float 8.000000e+00>
   %vcvt.i = fptoui <2 x float> %mul.i to <2 x i32>
   ret <2 x i32> %vcvt.i
 }

-; Don't combine all undefs.
+; Combine all undefs.
 ; CHECK-LABEL: test10
-; CHECK: fmul.2s v{{[0-9]+}}, v{{[0-9]+}}, v{{[0-9]+}}
 ; CHECK: fcvtzu.2s v{{[0-9]+}}, v{{[0-9]+}}
 ; CHECK: ret
 define <2 x i32> @test10(<2 x float> %f) {
   %mul.i = fmul <2 x float> %f, <float undef, float undef>
   %vcvt.i = fptoui <2 x float> %mul.i to <2 x i32>
   ret <2 x i32> %vcvt.i
 }

[49 lines not shown]

llvm/trunk/test/CodeGen/AMDGPU/mad-mix-lo.ll

[139 lines not shown; enclosing context: define <2 x half> @v_mad_mix_v2f32_clamp_postcvt(<2 x half> %src0, <2 x half> %src1, <2 x half> %src2) #0 {]
   %max = call <2 x half> @llvm.maxnum.v2f16(<2 x half> %cvt.result, <2 x half> zeroinitializer)
   %clamp = call <2 x half> @llvm.minnum.v2f16(<2 x half> %max, <2 x half> <half 1.0, half 1.0>)
   ret <2 x half> %clamp
 }

 ; FIXME: Should be packed into 2 registers per argument?
 ; GCN-LABEL: {{^}}v_mad_mix_v3f32_clamp_postcvt:
 ; GCN: s_waitcnt
-; GFX9-NEXT: v_mad_mixlo_f16 v2, v2, v5, v8 op_sel_hi:[1,1,1] clamp
-; GFX9-NEXT: v_mad_mixhi_f16 v2, v0, v0, v0 clamp
+; GFX9-NEXT: v_mad_mixlo_f16 v2, v2, v5, v8 op_sel_hi:[1,1,1]
 ; GFX9-NEXT: v_mad_mixlo_f16 v0, v0, v3, v6 op_sel_hi:[1,1,1] clamp
+; GFX9-NEXT: s_movk_i32 s6, 0x7e00
+; GFX9-NEXT: v_and_b32_e32 v2, 0xffff, v2
+; GFX9-NEXT: v_lshl_or_b32 v2, s6, 16, v2
 ; GFX9-NEXT: v_mad_mixhi_f16 v0, v1, v4, v7 op_sel_hi:[1,1,1] clamp
+; GFX9-NEXT: v_pk_max_f16 v2, v2, v2 clamp
 ; GFX9-NEXT: v_lshrrev_b32_e32 v1, 16, v0
 ; GFX9-NEXT: s_setpc_b64
 define <3 x half> @v_mad_mix_v3f32_clamp_postcvt(<3 x half> %src0, <3 x half> %src1, <3 x half> %src2) #0 {
   %src0.ext = fpext <3 x half> %src0 to <3 x float>
   %src1.ext = fpext <3 x half> %src1 to <3 x float>
   %src2.ext = fpext <3 x half> %src2 to <3 x float>
   %result = tail call <3 x float> @llvm.fmuladd.v3f32(<3 x float> %src0.ext, <3 x float> %src1.ext, <3 x float> %src2.ext)
   %cvt.result = fptrunc <3 x float> %result to <3 x half>
[148 lines not shown]
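One observation about the new GFX9 checks: the immediate `0x7e00` is the IEEE-754 binary16 bit pattern of a quiet NaN. That is consistent with an undef half lane now being folded to NaN and materialized into the high 16 bits of `v2`, although this reading is an interpretation and is not stated in the review. The standalone C++ sketch below decodes the bit pattern; `isHalfNaN` and `isHalfQuietNaN` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>

// A binary16 value is NaN when its exponent field (bits 14..10) is all
// ones and its mantissa field (bits 9..0) is nonzero.
inline bool isHalfNaN(std::uint16_t Bits) {
  const std::uint16_t Exponent = (Bits >> 10) & 0x1F;
  const std::uint16_t Mantissa = Bits & 0x3FF;
  return Exponent == 0x1F && Mantissa != 0;
}

// A quiet NaN additionally has the most significant mantissa bit (bit 9) set.
inline bool isHalfQuietNaN(std::uint16_t Bits) {
  return isHalfNaN(Bits) && (Bits & 0x200) != 0;
}
```

For comparison, `0x7c00` (positive infinity) and `0x3c00` (1.0) are not classified as NaN by these helpers.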

llvm/trunk/test/CodeGen/NVPTX/implicit-def.ll

-; RUN: llc < %s -O0 -march=nvptx -mcpu=sm_20 -asm-verbose=1 | FileCheck %s
-
-; CHECK: // implicit-def: %f[[F0:[0-9]+]]
-; CHECK: add.rn.f32 %f{{[0-9]+}}, %f{{[0-9]+}}, %f[[F0]];
-define float @foo(float %a) {
-  %ret = fadd float %a, undef
-  ret float %ret
-}

llvm/trunk/test/CodeGen/X86/fp-undef.ll

 ; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
 ; RUN: llc < %s -mtriple=x86_64-unknown-unknown | FileCheck %s --check-prefixes=ANY,STRICT
 ; RUN: llc < %s -mtriple=x86_64-unknown-unknown -enable-unsafe-fp-math | FileCheck %s --check-prefixes=ANY,UNSAFE

 ; This is duplicated from tests for InstSimplify. If you're
 ; adding something here, you should probably add it there too.

 define float @fadd_undef_op0(float %x) {
-; STRICT-LABEL: fadd_undef_op0:
-; STRICT: # %bb.0:
-; STRICT-NEXT: addss %xmm0, %xmm0
-; STRICT-NEXT: retq
-;
-; UNSAFE-LABEL: fadd_undef_op0:
-; UNSAFE: # %bb.0:
-; UNSAFE-NEXT: retq
+; ANY-LABEL: fadd_undef_op0:
+; ANY: # %bb.0:
+; ANY-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
+; ANY-NEXT: retq
   %r = fadd float undef, %x
   ret float %r
 }

 define float @fadd_undef_op1(float %x) {
-; STRICT-LABEL: fadd_undef_op1:
-; STRICT: # %bb.0:
-; STRICT-NEXT: addss %xmm0, %xmm0
-; STRICT-NEXT: retq
-;
-; UNSAFE-LABEL: fadd_undef_op1:
-; UNSAFE: # %bb.0:
-; UNSAFE-NEXT: retq
+; ANY-LABEL: fadd_undef_op1:
+; ANY: # %bb.0:
+; ANY-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
+; ANY-NEXT: retq
   %r = fadd float %x, undef
   ret float %r
 }

 define float @fsub_undef_op0(float %x) {
 ; ANY-LABEL: fsub_undef_op0:
 ; ANY: # %bb.0:
+; ANY-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
 ; ANY-NEXT: retq
   %r = fsub float undef, %x
   ret float %r
 }

 define float @fsub_undef_op1(float %x) {
-; STRICT-LABEL: fsub_undef_op1:
-; STRICT: # %bb.0:
-; STRICT-NEXT: subss %xmm0, %xmm0
-; STRICT-NEXT: retq
-;
-; UNSAFE-LABEL: fsub_undef_op1:
-; UNSAFE: # %bb.0:
-; UNSAFE-NEXT: retq
+; ANY-LABEL: fsub_undef_op1:
+; ANY: # %bb.0:
+; ANY-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
+; ANY-NEXT: retq
   %r = fsub float %x, undef
   ret float %r
 }

 define float @fmul_undef_op0(float %x) {
-; STRICT-LABEL: fmul_undef_op0:
-; STRICT: # %bb.0:
-; STRICT-NEXT: mulss %xmm0, %xmm0
-; STRICT-NEXT: retq
-;
-; UNSAFE-LABEL: fmul_undef_op0:
-; UNSAFE: # %bb.0:
-; UNSAFE-NEXT: retq
+; ANY-LABEL: fmul_undef_op0:
+; ANY: # %bb.0:
+; ANY-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
+; ANY-NEXT: retq
   %r = fmul float undef, %x
   ret float %r
 }

 define float @fmul_undef_op1(float %x) {
-; STRICT-LABEL: fmul_undef_op1:
-; STRICT: # %bb.0:
-; STRICT-NEXT: mulss %xmm0, %xmm0
-; STRICT-NEXT: retq
-;
-; UNSAFE-LABEL: fmul_undef_op1:
-; UNSAFE: # %bb.0:
-; UNSAFE-NEXT: retq
+; ANY-LABEL: fmul_undef_op1:
+; ANY: # %bb.0:
+; ANY-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
+; ANY-NEXT: retq
   %r = fmul float %x, undef
   ret float %r
 }

 define float @fdiv_undef_op0(float %x) {
 ; ANY-LABEL: fdiv_undef_op0:
 ; ANY: # %bb.0:
+; ANY-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
 ; ANY-NEXT: retq
   %r = fdiv float undef, %x
   ret float %r
 }

 define float @fdiv_undef_op1(float %x) {
-; STRICT-LABEL: fdiv_undef_op1:
-; STRICT: # %bb.0:
-; STRICT-NEXT: divss %xmm0, %xmm0
-; STRICT-NEXT: retq
-;
-; UNSAFE-LABEL: fdiv_undef_op1:
-; UNSAFE: # %bb.0:
-; UNSAFE-NEXT: retq
+; ANY-LABEL: fdiv_undef_op1:
+; ANY: # %bb.0:
+; ANY-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
+; ANY-NEXT: retq
   %r = fdiv float %x, undef
   ret float %r
 }

 define float @frem_undef_op0(float %x) {
 ; ANY-LABEL: frem_undef_op0:
 ; ANY: # %bb.0:
+; ANY-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
 ; ANY-NEXT: retq
   %r = frem float undef, %x
   ret float %r
 }

 define float @frem_undef_op1(float %x) {
-; STRICT-LABEL: frem_undef_op1:
-; STRICT: # %bb.0:
-; STRICT-NEXT: jmp fmodf # TAILCALL
-;
-; UNSAFE-LABEL: frem_undef_op1:
-; UNSAFE: # %bb.0:
-; UNSAFE-NEXT: retq
+; ANY-LABEL: frem_undef_op1:
+; ANY: # %bb.0:
+; ANY-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
+; ANY-NEXT: retq
   %r = frem float %x, undef
   ret float %r
 }

 ; Repeat all tests with fast-math-flags. Alternate 'nnan' and 'fast' for more coverage.

 define float @fadd_undef_op0_nnan(float %x) {
-; STRICT-LABEL: fadd_undef_op0_nnan:
-; STRICT: # %bb.0:
-; STRICT-NEXT: addss %xmm0, %xmm0
-; STRICT-NEXT: retq
-;
-; UNSAFE-LABEL: fadd_undef_op0_nnan:
-; UNSAFE: # %bb.0:
-; UNSAFE-NEXT: retq
+; ANY-LABEL: fadd_undef_op0_nnan:
+; ANY: # %bb.0:
+; ANY-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
+; ANY-NEXT: retq
   %r = fadd nnan float undef, %x
   ret float %r
 }

 define float @fadd_undef_op1_fast(float %x) {
-; STRICT-LABEL: fadd_undef_op1_fast:
-; STRICT: # %bb.0:
-; STRICT-NEXT: addss %xmm0, %xmm0
-; STRICT-NEXT: retq
-;
-; UNSAFE-LABEL: fadd_undef_op1_fast:
-; UNSAFE: # %bb.0:
-; UNSAFE-NEXT: retq
+; ANY-LABEL: fadd_undef_op1_fast:
+; ANY: # %bb.0:
+; ANY-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
+; ANY-NEXT: retq
   %r = fadd fast float %x, undef
   ret float %r
 }

 define float @fsub_undef_op0_fast(float %x) {
 ; ANY-LABEL: fsub_undef_op0_fast:
 ; ANY: # %bb.0:
+; ANY-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
 ; ANY-NEXT: retq
   %r = fsub fast float undef, %x
   ret float %r
 }

 define float @fsub_undef_op1_nnan(float %x) {
-; STRICT-LABEL: fsub_undef_op1_nnan:
-; STRICT: # %bb.0:
-; STRICT-NEXT: subss %xmm0, %xmm0
-; STRICT-NEXT: retq
-;
-; UNSAFE-LABEL: fsub_undef_op1_nnan:
-; UNSAFE: # %bb.0:
-; UNSAFE-NEXT: retq
+; ANY-LABEL: fsub_undef_op1_nnan:
+; ANY: # %bb.0:
+; ANY-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
+; ANY-NEXT: retq
   %r = fsub nnan float %x, undef
   ret float %r
 }

 define float @fmul_undef_op0_nnan(float %x) {
-; STRICT-LABEL: fmul_undef_op0_nnan:
-; STRICT: # %bb.0:
-; STRICT-NEXT: mulss %xmm0, %xmm0
-; STRICT-NEXT: retq
-;
-; UNSAFE-LABEL: fmul_undef_op0_nnan:
-; UNSAFE: # %bb.0:
-; UNSAFE-NEXT: retq
+; ANY-LABEL: fmul_undef_op0_nnan:
+; ANY: # %bb.0:
+; ANY-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
+; ANY-NEXT: retq
   %r = fmul nnan float undef, %x
   ret float %r
 }

 define float @fmul_undef_op1_fast(float %x) {
-; STRICT-LABEL: fmul_undef_op1_fast:
-; STRICT: # %bb.0:
-; STRICT-NEXT: mulss %xmm0, %xmm0
-; STRICT-NEXT: retq
-;
-; UNSAFE-LABEL: fmul_undef_op1_fast:
-; UNSAFE: # %bb.0:
-; UNSAFE-NEXT: retq
+; ANY-LABEL: fmul_undef_op1_fast:
+; ANY: # %bb.0:
+; ANY-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
+; ANY-NEXT: retq
   %r = fmul fast float %x, undef
   ret float %r
 }

 define float @fdiv_undef_op0_fast(float %x) {
 ; ANY-LABEL: fdiv_undef_op0_fast:
 ; ANY: # %bb.0:
+; ANY-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
 ; ANY-NEXT: retq
   %r = fdiv fast float undef, %x
   ret float %r
 }

 define float @fdiv_undef_op1_nnan(float %x) {
-; STRICT-LABEL: fdiv_undef_op1_nnan:
-; STRICT: # %bb.0:
-; STRICT-NEXT: divss %xmm0, %xmm0
-; STRICT-NEXT: retq
-;
-; UNSAFE-LABEL: fdiv_undef_op1_nnan:
-; UNSAFE: # %bb.0:
-; UNSAFE-NEXT: retq
+; ANY-LABEL: fdiv_undef_op1_nnan:
+; ANY: # %bb.0:
+; ANY-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
+; ANY-NEXT: retq
   %r = fdiv nnan float %x, undef
   ret float %r
 }

 define float @frem_undef_op0_nnan(float %x) {
 ; ANY-LABEL: frem_undef_op0_nnan:
 ; ANY: # %bb.0:
+; ANY-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
 ; ANY-NEXT: retq
   %r = frem nnan float undef, %x
   ret float %r
 }

 define float @frem_undef_op1_fast(float %x) {
-; STRICT-LABEL: frem_undef_op1_fast:
-; STRICT: # %bb.0:
-; STRICT-NEXT: jmp fmodf # TAILCALL
-;
-; UNSAFE-LABEL: frem_undef_op1_fast:
-; UNSAFE: # %bb.0:
-; UNSAFE-NEXT: retq
+; ANY-LABEL: frem_undef_op1_fast:
+; ANY: # %bb.0:
+; ANY-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
+; ANY-NEXT: retq
   %r = frem fast float %x, undef
   ret float %r
 }

 ; Constant folding - undef undef.

 define double @fadd_undef_undef(double %x) {
-; STRICT-LABEL: fadd_undef_undef:
-; STRICT: # %bb.0:
-; STRICT-NEXT: addsd %xmm0, %xmm0
-; STRICT-NEXT: retq
-;
-; UNSAFE-LABEL: fadd_undef_undef:
-; UNSAFE: # %bb.0:
-; UNSAFE-NEXT: retq
+; ANY-LABEL: fadd_undef_undef:
+; ANY: # %bb.0:
+; ANY-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
+; ANY-NEXT: retq
   %r = fadd double undef, undef
   ret double %r
 }

 define double @fsub_undef_undef(double %x) {
 ; ANY-LABEL: fsub_undef_undef:
 ; ANY: # %bb.0:
+; ANY-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
 ; ANY-NEXT: retq
   %r = fsub double undef, undef
   ret double %r
 }

 define double @fmul_undef_undef(double %x) {
-; STRICT-LABEL: fmul_undef_undef:
-; STRICT: # %bb.0:
-; STRICT-NEXT: mulsd %xmm0, %xmm0
-; STRICT-NEXT: retq
-;
-; UNSAFE-LABEL: fmul_undef_undef:
-; UNSAFE: # %bb.0:
-; UNSAFE-NEXT: retq
+; ANY-LABEL: fmul_undef_undef:
+; ANY: # %bb.0:
+; ANY-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
+; ANY-NEXT: retq
   %r = fmul double undef, undef
   ret double %r
 }

 define double @fdiv_undef_undef(double %x) {
 ; ANY-LABEL: fdiv_undef_undef:
 ; ANY: # %bb.0:
+; ANY-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
 ; ANY-NEXT: retq
   %r = fdiv double undef, undef
   ret double %r
 }

 define double @frem_undef_undef(double %x) {
 ; ANY-LABEL: frem_undef_undef:
 ; ANY: # %bb.0:
+; ANY-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
 ; ANY-NEXT: retq
   %r = frem double undef, undef
   ret double %r
 }

 ; Constant folding.

 define float @fadd_undef_op0_nnan_constant(float %x) {
-; STRICT-LABEL: fadd_undef_op0_nnan_constant:
-; STRICT: # %bb.0:
-; STRICT-NEXT: addss {{.*}}(%rip), %xmm0
-; STRICT-NEXT: retq
-;
-; UNSAFE-LABEL: fadd_undef_op0_nnan_constant:
-; UNSAFE: # %bb.0:
-; UNSAFE-NEXT: retq
+; ANY-LABEL: fadd_undef_op0_nnan_constant:
+; ANY: # %bb.0:
+; ANY-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
+; ANY-NEXT: retq
   %r = fadd nnan float undef, 1.0
   ret float %r
 }

 define float @fadd_undef_op1_constant(float %x) {
-; STRICT-LABEL: fadd_undef_op1_constant:
-; STRICT: # %bb.0:
-; STRICT-NEXT: addss {{.*}}(%rip), %xmm0
-; STRICT-NEXT: retq
-;
-; UNSAFE-LABEL: fadd_undef_op1_constant:
-; UNSAFE: # %bb.0:
-; UNSAFE-NEXT: retq
+; ANY-LABEL: fadd_undef_op1_constant:
+; ANY: # %bb.0:
+; ANY-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
+; ANY-NEXT: retq
   %r = fadd float 2.0, undef
   ret float %r
 }

 define float @fsub_undef_op0_fast_constant(float %x) {
 ; ANY-LABEL: fsub_undef_op0_fast_constant:
 ; ANY: # %bb.0:
+; ANY-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
 ; ANY-NEXT: retq
   %r = fsub fast float undef, 3.0
   ret float %r
 }

 define float @fsub_undef_op1_constant(float %x) {
-; STRICT-LABEL: fsub_undef_op1_constant:
-; STRICT: # %bb.0:
-; STRICT-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
-; STRICT-NEXT: subss %xmm0, %xmm0
-; STRICT-NEXT: retq
-;
-; UNSAFE-LABEL: fsub_undef_op1_constant:
-; UNSAFE: # %bb.0:
-; UNSAFE-NEXT: retq
+; ANY-LABEL: fsub_undef_op1_constant:
+; ANY: # %bb.0:
+; ANY-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
+; ANY-NEXT: retq
   %r = fsub float 4.0, undef
   ret float %r
 }

 define float @fmul_undef_op0_nnan_constant(float %x) {
-; STRICT-LABEL: fmul_undef_op0_nnan_constant:
-; STRICT: # %bb.0:
-; STRICT-NEXT: mulss {{.*}}(%rip), %xmm0
-; STRICT-NEXT: retq
-;
-; UNSAFE-LABEL: fmul_undef_op0_nnan_constant:
-; UNSAFE: # %bb.0:
-; UNSAFE-NEXT: retq
+; ANY-LABEL: fmul_undef_op0_nnan_constant:
+; ANY: # %bb.0:
+; ANY-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
+; ANY-NEXT: retq
   %r = fmul nnan float undef, 5.0
   ret float %r
 }

 define float @fmul_undef_op1_constant(float %x) {
-; STRICT-LABEL: fmul_undef_op1_constant:
-; STRICT: # %bb.0:
-; STRICT-NEXT: mulss {{.*}}(%rip), %xmm0
-; STRICT-NEXT: retq
-;
-; UNSAFE-LABEL: fmul_undef_op1_constant:
-; UNSAFE: # %bb.0:
-; UNSAFE-NEXT: retq
+; ANY-LABEL: fmul_undef_op1_constant:
+; ANY: # %bb.0:
+; ANY-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
+; ANY-NEXT: retq
   %r = fmul float 6.0, undef
   ret float %r
 }

 define float @fdiv_undef_op0_fast_constant(float %x) {
 ; ANY-LABEL: fdiv_undef_op0_fast_constant:
 ; ANY: # %bb.0:
+; ANY-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
 ; ANY-NEXT: retq
   %r = fdiv fast float undef, 7.0
   ret float %r
 }

 define float @fdiv_undef_op1_constant(float %x) {
-; STRICT-LABEL: fdiv_undef_op1_constant:
-; STRICT: # %bb.0:
-; STRICT-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
-; STRICT-NEXT: divss %xmm0, %xmm0
-; STRICT-NEXT: retq
-;
-; UNSAFE-LABEL: fdiv_undef_op1_constant:
-; UNSAFE: # %bb.0:
-; UNSAFE-NEXT: retq
+; ANY-LABEL: fdiv_undef_op1_constant:
+; ANY: # %bb.0:
+; ANY-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
+; ANY-NEXT: retq
   %r = fdiv float 8.0, undef
   ret float %r
 }

 define float @frem_undef_op0_nnan_constant(float %x) {
 ; ANY-LABEL: frem_undef_op0_nnan_constant:
 ; ANY: # %bb.0:
+; ANY-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
 ; ANY-NEXT: retq
   %r = frem nnan float undef, 9.0
   ret float %r
 }

 define float @frem_undef_op1_constant(float %x) {
-; STRICT-LABEL: frem_undef_op1_constant:
-; STRICT: # %bb.0:
-; STRICT-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
-; STRICT-NEXT: jmp fmodf # TAILCALL
-;
-; UNSAFE-LABEL: frem_undef_op1_constant:
-; UNSAFE: # %bb.0:
-; UNSAFE-NEXT: retq
+; ANY-LABEL: frem_undef_op1_constant:
+; ANY: # %bb.0:
+; ANY-NEXT: movss {{.*#+}} xmm0 = mem[0],zero,zero,zero
+; ANY-NEXT: retq
   %r = frem float 10.0, undef
   ret float %r
 }

 ; Constant folding - special constants: NaN.

 define double @fadd_undef_op0_constant_nan(double %x) {
-; STRICT-LABEL: fadd_undef_op0_constant_nan:
-; STRICT: # %bb.0:
-; STRICT-NEXT: addsd {{.*}}(%rip), %xmm0
-; STRICT-NEXT: retq
-;
-; UNSAFE-LABEL: fadd_undef_op0_constant_nan:
-; UNSAFE: # %bb.0:
-; UNSAFE-NEXT: retq
+; ANY-LABEL: fadd_undef_op0_constant_nan:
+; ANY: # %bb.0:
+; ANY-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
+; ANY-NEXT: retq
   %r = fadd double undef, 0x7FF8000000000000
   ret double %r
 }

 define double @fadd_undef_op1_fast_constant_nan(double %x) {
-; STRICT-LABEL: fadd_undef_op1_fast_constant_nan:
-; STRICT: # %bb.0:
-; STRICT-NEXT: addsd {{.*}}(%rip), %xmm0
-; STRICT-NEXT: retq
-;
-; UNSAFE-LABEL: fadd_undef_op1_fast_constant_nan:
-; UNSAFE: # %bb.0:
-; UNSAFE-NEXT: retq
+; ANY-LABEL: fadd_undef_op1_fast_constant_nan:
+; ANY: # %bb.0:
+; ANY-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
+; ANY-NEXT: retq
   %r = fadd fast double 0xFFF0000000000001, undef
   ret double %r
 }

 define double @fsub_undef_op0_constant_nan(double %x) {
 ; ANY-LABEL: fsub_undef_op0_constant_nan:
 ; ANY: # %bb.0:
+; ANY-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
 ; ANY-NEXT: retq
   %r = fsub double undef, 0xFFF8000000000010
   ret double %r
 }

 define double @fsub_undef_op1_nnan_constant_nan(double %x) {
-; STRICT-LABEL: fsub_undef_op1_nnan_constant_nan:
-; STRICT: # %bb.0:
-; STRICT-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
-; STRICT-NEXT: subsd %xmm0, %xmm0
-; STRICT-NEXT: retq
-;
-; UNSAFE-LABEL: fsub_undef_op1_nnan_constant_nan:
-; UNSAFE: # %bb.0:
-; UNSAFE-NEXT: retq
+; ANY-LABEL: fsub_undef_op1_nnan_constant_nan:
+; ANY: # %bb.0:
+; ANY-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
+; ANY-NEXT: retq
   %r = fsub nnan double 0x7FF0000000000011, undef
   ret double %r
 }

 define double @fmul_undef_op0_constant_nan(double %x) {
-; STRICT-LABEL: fmul_undef_op0_constant_nan:
-; STRICT: # %bb.0:
-; STRICT-NEXT: mulsd {{.*}}(%rip), %xmm0
-; STRICT-NEXT: retq
-;
-; UNSAFE-LABEL: fmul_undef_op0_constant_nan:
-; UNSAFE: # %bb.0:
-; UNSAFE-NEXT: retq
+; ANY-LABEL: fmul_undef_op0_constant_nan:
+; ANY: # %bb.0:
+; ANY-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
+; ANY-NEXT: retq
   %r = fmul double undef, 0x7FF8000000000100
   ret double %r
 }

 define double @fmul_undef_op1_fast_constant_nan(double %x) {
-; STRICT-LABEL: fmul_undef_op1_fast_constant_nan:
-; STRICT: # %bb.0:
-; STRICT-NEXT: mulsd {{.*}}(%rip), %xmm0
-; STRICT-NEXT: retq
-;
-; UNSAFE-LABEL: fmul_undef_op1_fast_constant_nan:
-; UNSAFE: # %bb.0:
-; UNSAFE-NEXT: retq
+; ANY-LABEL: fmul_undef_op1_fast_constant_nan:
+; ANY: # %bb.0:
+; ANY-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
+; ANY-NEXT: retq
   %r = fmul fast double 0xFFF0000000000101, undef
   ret double %r
 }

 define double @fdiv_undef_op0_constant_nan(double %x) {
 ; ANY-LABEL: fdiv_undef_op0_constant_nan:
 ; ANY: # %bb.0:
+; ANY-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
 ; ANY-NEXT: retq
   %r = fdiv double undef, 0xFFF8000000000110
   ret double %r
 }

	define double @fdiv_undef_op1_nnan_constant_nan(double %x) {			define double @fdiv_undef_op1_nnan_constant_nan(double %x) {
	; STRICT-LABEL: fdiv_undef_op1_nnan_constant_nan:			; ANY-LABEL: fdiv_undef_op1_nnan_constant_nan:
	; STRICT: # %bb.0:			; ANY: # %bb.0:
	; STRICT-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero			; ANY-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
	; STRICT-NEXT: divsd %xmm0, %xmm0			; ANY-NEXT: retq
	; STRICT-NEXT: retq
	;
	; UNSAFE-LABEL: fdiv_undef_op1_nnan_constant_nan:
	; UNSAFE: # %bb.0:
	; UNSAFE-NEXT: retq
	%r = fdiv nnan double 0x7FF0000000000111, undef			%r = fdiv nnan double 0x7FF0000000000111, undef
	ret double %r			ret double %r
	}			}

	define double @frem_undef_op0_constant_nan(double %x) {			define double @frem_undef_op0_constant_nan(double %x) {
	; ANY-LABEL: frem_undef_op0_constant_nan:			; ANY-LABEL: frem_undef_op0_constant_nan:
	; ANY: # %bb.0:			; ANY: # %bb.0:
				; ANY-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
	; ANY-NEXT: retq			; ANY-NEXT: retq
	%r = frem double undef, 0x7FF8000000001000			%r = frem double undef, 0x7FF8000000001000
	ret double %r			ret double %r
	}			}

	define double @frem_undef_op1_fast_constant_nan(double %x) {			define double @frem_undef_op1_fast_constant_nan(double %x) {
	; STRICT-LABEL: frem_undef_op1_fast_constant_nan:			; ANY-LABEL: frem_undef_op1_fast_constant_nan:
	; STRICT: # %bb.0:			; ANY: # %bb.0:
	; STRICT-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero			; ANY-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
	; STRICT-NEXT: jmp fmod # TAILCALL			; ANY-NEXT: retq
	;
	; UNSAFE-LABEL: frem_undef_op1_fast_constant_nan:
	; UNSAFE: # %bb.0:
	; UNSAFE-NEXT: retq
	%r = frem fast double 0xFFF0000000001001, undef			%r = frem fast double 0xFFF0000000001001, undef
	ret double %r			ret double %r
	}			}

	; Constant folding - special constants: Inf.			; Constant folding - special constants: Inf.

	define double @fadd_undef_op0_constant_inf(double %x) {			define double @fadd_undef_op0_constant_inf(double %x) {
	; STRICT-LABEL: fadd_undef_op0_constant_inf:			; ANY-LABEL: fadd_undef_op0_constant_inf:
	; STRICT: # %bb.0:			; ANY: # %bb.0:
	; STRICT-NEXT: addsd {{.*}}(%rip), %xmm0			; ANY-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
	; STRICT-NEXT: retq			; ANY-NEXT: retq
	;
	; UNSAFE-LABEL: fadd_undef_op0_constant_inf:
	; UNSAFE: # %bb.0:
	; UNSAFE-NEXT: retq
	%r = fadd double undef, 0x7FF0000000000000			%r = fadd double undef, 0x7FF0000000000000
	ret double %r			ret double %r
	}			}

	define double @fadd_undef_op1_fast_constant_inf(double %x) {			define double @fadd_undef_op1_fast_constant_inf(double %x) {
	; STRICT-LABEL: fadd_undef_op1_fast_constant_inf:			; ANY-LABEL: fadd_undef_op1_fast_constant_inf:
	; STRICT: # %bb.0:			; ANY: # %bb.0:
	; STRICT-NEXT: addsd {{.*}}(%rip), %xmm0			; ANY-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
	; STRICT-NEXT: retq			; ANY-NEXT: retq
	;
	; UNSAFE-LABEL: fadd_undef_op1_fast_constant_inf:
	; UNSAFE: # %bb.0:
	; UNSAFE-NEXT: retq
	%r = fadd fast double 0xFFF0000000000000, undef			%r = fadd fast double 0xFFF0000000000000, undef
	ret double %r			ret double %r
	}			}

	define double @fsub_undef_op0_constant_inf(double %x) {			define double @fsub_undef_op0_constant_inf(double %x) {
	; ANY-LABEL: fsub_undef_op0_constant_inf:			; ANY-LABEL: fsub_undef_op0_constant_inf:
	; ANY: # %bb.0:			; ANY: # %bb.0:
				; ANY-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
	; ANY-NEXT: retq			; ANY-NEXT: retq
	%r = fsub double undef, 0xFFF0000000000000			%r = fsub double undef, 0xFFF0000000000000
	ret double %r			ret double %r
	}			}

	define double @fsub_undef_op1_ninf_constant_inf(double %x) {			define double @fsub_undef_op1_ninf_constant_inf(double %x) {
	; STRICT-LABEL: fsub_undef_op1_ninf_constant_inf:			; ANY-LABEL: fsub_undef_op1_ninf_constant_inf:
	; STRICT: # %bb.0:			; ANY: # %bb.0:
	; STRICT-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero			; ANY-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
	; STRICT-NEXT: subsd %xmm0, %xmm0			; ANY-NEXT: retq
	; STRICT-NEXT: retq
	;
	; UNSAFE-LABEL: fsub_undef_op1_ninf_constant_inf:
	; UNSAFE: # %bb.0:
	; UNSAFE-NEXT: retq
	%r = fsub ninf double 0x7FF0000000000000, undef			%r = fsub ninf double 0x7FF0000000000000, undef
	ret double %r			ret double %r
	}			}

	define double @fmul_undef_op0_constant_inf(double %x) {			define double @fmul_undef_op0_constant_inf(double %x) {
	; STRICT-LABEL: fmul_undef_op0_constant_inf:			; ANY-LABEL: fmul_undef_op0_constant_inf:
	; STRICT: # %bb.0:			; ANY: # %bb.0:
	; STRICT-NEXT: mulsd {{.*}}(%rip), %xmm0			; ANY-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
	; STRICT-NEXT: retq			; ANY-NEXT: retq
	;
	; UNSAFE-LABEL: fmul_undef_op0_constant_inf:
	; UNSAFE: # %bb.0:
	; UNSAFE-NEXT: retq
	%r = fmul double undef, 0x7FF0000000000000			%r = fmul double undef, 0x7FF0000000000000
	ret double %r			ret double %r
	}			}

	define double @fmul_undef_op1_fast_constant_inf(double %x) {			define double @fmul_undef_op1_fast_constant_inf(double %x) {
	; STRICT-LABEL: fmul_undef_op1_fast_constant_inf:			; ANY-LABEL: fmul_undef_op1_fast_constant_inf:
	; STRICT: # %bb.0:			; ANY: # %bb.0:
	; STRICT-NEXT: mulsd {{.*}}(%rip), %xmm0			; ANY-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
	; STRICT-NEXT: retq			; ANY-NEXT: retq
	;
	; UNSAFE-LABEL: fmul_undef_op1_fast_constant_inf:
	; UNSAFE: # %bb.0:
	; UNSAFE-NEXT: retq
	%r = fmul fast double 0xFFF0000000000000, undef			%r = fmul fast double 0xFFF0000000000000, undef
	ret double %r			ret double %r
	}			}

	define double @fdiv_undef_op0_constant_inf(double %x) {			define double @fdiv_undef_op0_constant_inf(double %x) {
	; ANY-LABEL: fdiv_undef_op0_constant_inf:			; ANY-LABEL: fdiv_undef_op0_constant_inf:
	; ANY: # %bb.0:			; ANY: # %bb.0:
				; ANY-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
	; ANY-NEXT: retq			; ANY-NEXT: retq
	%r = fdiv double undef, 0xFFF0000000000000			%r = fdiv double undef, 0xFFF0000000000000
	ret double %r			ret double %r
	}			}

	define double @fdiv_undef_op1_ninf_constant_inf(double %x) {			define double @fdiv_undef_op1_ninf_constant_inf(double %x) {
	; STRICT-LABEL: fdiv_undef_op1_ninf_constant_inf:			; ANY-LABEL: fdiv_undef_op1_ninf_constant_inf:
	; STRICT: # %bb.0:			; ANY: # %bb.0:
	; STRICT-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero			; ANY-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
	; STRICT-NEXT: divsd %xmm0, %xmm0			; ANY-NEXT: retq
	; STRICT-NEXT: retq
	;
	; UNSAFE-LABEL: fdiv_undef_op1_ninf_constant_inf:
	; UNSAFE: # %bb.0:
	; UNSAFE-NEXT: retq
	%r = fdiv ninf double 0x7FF0000000000000, undef			%r = fdiv ninf double 0x7FF0000000000000, undef
	ret double %r			ret double %r
	}			}

	define double @frem_undef_op0_constant_inf(double %x) {			define double @frem_undef_op0_constant_inf(double %x) {
	; ANY-LABEL: frem_undef_op0_constant_inf:			; ANY-LABEL: frem_undef_op0_constant_inf:
	; ANY: # %bb.0:			; ANY: # %bb.0:
				; ANY-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
	; ANY-NEXT: retq			; ANY-NEXT: retq
	%r = frem double undef, 0x7FF0000000000000			%r = frem double undef, 0x7FF0000000000000
	ret double %r			ret double %r
	}			}

	define double @frem_undef_op1_fast_constant_inf(double %x) {			define double @frem_undef_op1_fast_constant_inf(double %x) {
	; STRICT-LABEL: frem_undef_op1_fast_constant_inf:			; ANY-LABEL: frem_undef_op1_fast_constant_inf:
	; STRICT: # %bb.0:			; ANY: # %bb.0:
	; STRICT-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero			; ANY-NEXT: movsd {{.*#+}} xmm0 = mem[0],zero
	; STRICT-NEXT: jmp fmod # TAILCALL			; ANY-NEXT: retq
	;
	; UNSAFE-LABEL: frem_undef_op1_fast_constant_inf:
	; UNSAFE: # %bb.0:
	; UNSAFE-NEXT: retq
	%r = frem fast double 0xFFF0000000000000, undef			%r = frem fast double 0xFFF0000000000000, undef
	ret double %r			ret double %r
	}			}
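
Note on the constants used in the tests above: the hex literals are raw IEEE-754 double bit patterns. The following Python sketch is not part of the patch; it classifies a few of them by exponent and mantissa bits, and checks the NaN-propagation rule that the summary cites as the reason undef cannot simply be propagated. The helper name `classify` is made up for this illustration.

```python
import math

def classify(bits: int) -> str:
    """Classify a 64-bit IEEE-754 double bit pattern (hypothetical helper)."""
    exponent = (bits >> 52) & 0x7FF          # 11 exponent bits
    mantissa = bits & ((1 << 52) - 1)        # 52 mantissa bits
    if exponent != 0x7FF:
        return "finite"
    if mantissa == 0:
        return "inf"
    # Top mantissa bit set means quiet NaN; clear means signaling NaN.
    return "qnan" if (mantissa >> 51) else "snan"

# Bit patterns copied from the test functions above.
CONSTANTS = [
    0x7FF8000000000000,  # fadd_undef_op0_constant_nan
    0xFFF0000000000001,  # fadd_undef_op1_fast_constant_nan
    0x7FF0000000000011,  # fsub_undef_op1_nnan_constant_nan
    0x7FF0000000000000,  # fadd_undef_op0_constant_inf
    0xFFF0000000000000,  # fadd_undef_op1_fast_constant_inf
]

for bits in CONSTANTS:
    print(f"{bits:#018x}: {classify(bits)}")

# If undef is chosen to be NaN, a binop with any other operand yields NaN.
nan = float("nan")
for other in (1.0, math.inf, -math.inf):
    assert math.isnan(nan + other)
    assert math.isnan(nan * other)
```

The "constant_nan" tests therefore cover both quiet and signaling patterns, and the "constant_inf" tests cover both signs of infinity.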

llvm/trunk/test/CodeGen/X86/pr23103.ll

				; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
	; RUN: llc -verify-machineinstrs -mtriple=x86_64-unknown-unknown -mcpu=generic -mattr=+avx < %s | FileCheck %s			; RUN: llc -verify-machineinstrs -mtriple=x86_64-unknown-unknown -mcpu=generic -mattr=+avx < %s | FileCheck %s

	; When commuting a VADDSDrr instruction, verify that the 'IsUndef' flag is			; When commuting a VADDSDrr instruction, verify that the 'IsUndef' flag is
	; correctly propagated to the operands of the resulting instruction.			; correctly propagated to the operands of the resulting instruction.
	; Test for PR23103;			; Test for PR23103;

	declare zeroext i1 @foo(<1 x double>)			declare zeroext i1 @foo(<1 x double>)

	define <1 x double> @pr23103(<1 x double>* align 8 %Vp) {			define <1 x double> @pr23103(<1 x double>* align 8 %Vp) {
	; CHECK-LABEL: pr23103:			; CHECK-LABEL: pr23103:
	; CHECK: vmovsd (%rdi), %xmm0			; CHECK: # %bb.0: # %entry
	; CHECK-NEXT: vmovsd %xmm0, {{.*}}(%rsp) {{.*#+}} 8-byte Spill			; CHECK-NEXT: pushq %rax
				; CHECK-NEXT: .cfi_def_cfa_offset 16
				; CHECK-NEXT: vmovsd {{.*#+}} xmm0 = mem[0],zero
	; CHECK-NEXT: callq foo			; CHECK-NEXT: callq foo
	; CHECK-NEXT: vaddsd {{.*}}(%rsp), %xmm0, %xmm0 {{.*#+}} 8-byte Folded Reload			; CHECK-NEXT: vmovsd {{.*#+}} xmm0 = mem[0],zero
	; CHECK: retq			; CHECK-NEXT: popq %rax
				; CHECK-NEXT: .cfi_def_cfa_offset 8
				; CHECK-NEXT: retq
	entry:			entry:
	%V = load <1 x double>, <1 x double>* %Vp, align 8			%V = load <1 x double>, <1 x double>* %Vp, align 8
	%call = call zeroext i1 @foo(<1 x double> %V)			%call = call zeroext i1 @foo(<1 x double> %V)
	%fadd = fadd <1 x double> %V, undef			%fadd = fadd <1 x double> %V, undef
	ret <1 x double> %fadd			ret <1 x double> %fadd
	}			}

llvm/trunk/test/CodeGen/X86/vector-reduce-fadd.ll

	[... 749 lines not shown ...]

	;			;
	; vXf32 (undef)			; vXf32 (undef)
	;			;

	define float @test_v2f32_undef(<2 x float> %a0) {			define float @test_v2f32_undef(<2 x float> %a0) {
	; SSE2-LABEL: test_v2f32_undef:			; SSE2-LABEL: test_v2f32_undef:
	; SSE2: # %bb.0:			; SSE2: # %bb.0:
	; SSE2-NEXT: movaps %xmm0, %xmm1
	; SSE2-NEXT: addss %xmm0, %xmm1
	; SSE2-NEXT: shufps {{.*#+}} xmm0 = xmm0[1,1,2,3]			; SSE2-NEXT: shufps {{.*#+}} xmm0 = xmm0[1,1,2,3]
	; SSE2-NEXT: addss %xmm1, %xmm0			; SSE2-NEXT: addss {{.*}}(%rip), %xmm0
	; SSE2-NEXT: retq			; SSE2-NEXT: retq
	;			;
	; SSE41-LABEL: test_v2f32_undef:			; SSE41-LABEL: test_v2f32_undef:
	; SSE41: # %bb.0:			; SSE41: # %bb.0:
	; SSE41-NEXT: movshdup {{.*#+}} xmm1 = xmm0[1,1,3,3]			; SSE41-NEXT: movshdup {{.*#+}} xmm0 = xmm0[1,1,3,3]
	; SSE41-NEXT: addss %xmm0, %xmm0			; SSE41-NEXT: addss {{.*}}(%rip), %xmm0
	; SSE41-NEXT: addss %xmm1, %xmm0
	; SSE41-NEXT: retq			; SSE41-NEXT: retq
	;			;
	; AVX-LABEL: test_v2f32_undef:			; AVX-LABEL: test_v2f32_undef:
	; AVX: # %bb.0:			; AVX: # %bb.0:
	; AVX-NEXT: vaddss %xmm0, %xmm0, %xmm1
	; AVX-NEXT: vmovshdup {{.*#+}} xmm0 = xmm0[1,1,3,3]			; AVX-NEXT: vmovshdup {{.*#+}} xmm0 = xmm0[1,1,3,3]
	; AVX-NEXT: vaddss %xmm0, %xmm1, %xmm0			; AVX-NEXT: vaddss {{.*}}(%rip), %xmm0, %xmm0
	; AVX-NEXT: retq			; AVX-NEXT: retq
	;			;
	; AVX512-LABEL: test_v2f32_undef:			; AVX512-LABEL: test_v2f32_undef:
	; AVX512: # %bb.0:			; AVX512: # %bb.0:
	; AVX512-NEXT: vaddss %xmm0, %xmm0, %xmm1
	; AVX512-NEXT: vmovshdup {{.*#+}} xmm0 = xmm0[1,1,3,3]			; AVX512-NEXT: vmovshdup {{.*#+}} xmm0 = xmm0[1,1,3,3]
	; AVX512-NEXT: vaddss %xmm0, %xmm1, %xmm0			; AVX512-NEXT: vaddss {{.*}}(%rip), %xmm0, %xmm0
	; AVX512-NEXT: retq			; AVX512-NEXT: retq
	%1 = call float @llvm.experimental.vector.reduce.fadd.f32.f32.v2f32(float undef, <2 x float> %a0)			%1 = call float @llvm.experimental.vector.reduce.fadd.f32.f32.v2f32(float undef, <2 x float> %a0)
	ret float %1			ret float %1
	}			}
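
Note on the diff above: the summary describes these as strict reductions, where the accumulator parameter is used and the additions happen in order. The Python sketch below is a model of that ordering only, not the intrinsic's implementation; `strict_reduce_fadd` is a hypothetical name, and representing the undef accumulator as NaN is an assumption that mirrors the fold in this patch.

```python
import math
from typing import Sequence

def strict_reduce_fadd(acc: float, vec: Sequence[float]) -> float:
    """Ordered reduction: ((acc + vec[0]) + vec[1]) + ... (illustrative model)."""
    result = acc
    for x in vec:
        result = result + x   # each step depends on the previous one
    return result

# Assumption: the undef accumulator is modelled as NaN, as the fold does.
undef_as_nan = float("nan")

# The first step (acc + vec[0]) is already NaN, so it can be folded to a
# constant; every later step then propagates that NaN.
first_step = undef_as_nan + 1.0
assert math.isnan(first_step)
assert math.isnan(strict_reduce_fadd(undef_as_nan, [1.0, 2.0]))

# With a defined accumulator the same model gives an ordinary sum.
assert strict_reduce_fadd(0.0, [1.0, 2.0]) == 3.0
```

This appears consistent with the new check lines, where the initial register-register add is gone and the first remaining add takes a constant operand from memory.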

	define float @test_v4f32_undef(<4 x float> %a0) {			define float @test_v4f32_undef(<4 x float> %a0) {
	; SSE2-LABEL: test_v4f32_undef:			; SSE2-LABEL: test_v4f32_undef:
	; SSE2: # %bb.0:			; SSE2: # %bb.0:
	; SSE2-NEXT: movaps %xmm0, %xmm1			; SSE2-NEXT: movaps %xmm0, %xmm1
	; SSE2-NEXT: addss %xmm0, %xmm1			; SSE2-NEXT: shufps {{.*#+}} xmm1 = xmm1[1,1],xmm0[2,3]
				; SSE2-NEXT: addss {{.*}}(%rip), %xmm1
	; SSE2-NEXT: movaps %xmm0, %xmm2			; SSE2-NEXT: movaps %xmm0, %xmm2
	; SSE2-NEXT: shufps {{.*#+}} xmm2 = xmm2[1,1],xmm0[2,3]			; SSE2-NEXT: movhlps {{.*#+}} xmm2 = xmm0[1],xmm2[1]
	; SSE2-NEXT: addss %xmm1, %xmm2			; SSE2-NEXT: addss %xmm1, %xmm2
	; SSE2-NEXT: movaps %xmm0, %xmm1
	; SSE2-NEXT: movhlps {{.*#+}} xmm1 = xmm0[1],xmm1[1]
	; SSE2-NEXT: addss %xmm2, %xmm1
	; SSE2-NEXT: shufps {{.*#+}} xmm0 = xmm0[3,1,2,3]			; SSE2-NEXT: shufps {{.*#+}} xmm0 = xmm0[3,1,2,3]
	; SSE2-NEXT: addss %xmm1, %xmm0			; SSE2-NEXT: addss %xmm2, %xmm0
	; SSE2-NEXT: retq			; SSE2-NEXT: retq
	;			;
	; SSE41-LABEL: test_v4f32_undef:			; SSE41-LABEL: test_v4f32_undef:
	; SSE41: # %bb.0:			; SSE41: # %bb.0:
	; SSE41-NEXT: movaps %xmm0, %xmm1			; SSE41-NEXT: movshdup {{.*#+}} xmm1 = xmm0[1,1,3,3]
	; SSE41-NEXT: addss %xmm0, %xmm1			; SSE41-NEXT: addss {{.*}}(%rip), %xmm1
	; SSE41-NEXT: movshdup {{.*#+}} xmm2 = xmm0[1,1,3,3]
	; SSE41-NEXT: addss %xmm2, %xmm1
	; SSE41-NEXT: movaps %xmm0, %xmm2			; SSE41-NEXT: movaps %xmm0, %xmm2
	; SSE41-NEXT: movhlps {{.*#+}} xmm2 = xmm0[1],xmm2[1]			; SSE41-NEXT: movhlps {{.*#+}} xmm2 = xmm0[1],xmm2[1]
	; SSE41-NEXT: addss %xmm1, %xmm2			; SSE41-NEXT: addss %xmm1, %xmm2
	; SSE41-NEXT: shufps {{.*#+}} xmm0 = xmm0[3,1,2,3]			; SSE41-NEXT: shufps {{.*#+}} xmm0 = xmm0[3,1,2,3]
	; SSE41-NEXT: addss %xmm2, %xmm0			; SSE41-NEXT: addss %xmm2, %xmm0
	; SSE41-NEXT: retq			; SSE41-NEXT: retq
	;			;
	; AVX-LABEL: test_v4f32_undef:			; AVX-LABEL: test_v4f32_undef:
	; AVX: # %bb.0:			; AVX: # %bb.0:
	; AVX-NEXT: vaddss %xmm0, %xmm0, %xmm1			; AVX-NEXT: vmovshdup {{.*#+}} xmm1 = xmm0[1,1,3,3]
	; AVX-NEXT: vmovshdup {{.*#+}} xmm2 = xmm0[1,1,3,3]			; AVX-NEXT: vaddss {{.*}}(%rip), %xmm1, %xmm1
	; AVX-NEXT: vaddss %xmm2, %xmm1, %xmm1
	; AVX-NEXT: vpermilpd {{.*#+}} xmm2 = xmm0[1,0]			; AVX-NEXT: vpermilpd {{.*#+}} xmm2 = xmm0[1,0]
	; AVX-NEXT: vaddss %xmm2, %xmm1, %xmm1			; AVX-NEXT: vaddss %xmm2, %xmm1, %xmm1
	; AVX-NEXT: vpermilps {{.*#+}} xmm0 = xmm0[3,1,2,3]			; AVX-NEXT: vpermilps {{.*#+}} xmm0 = xmm0[3,1,2,3]
	; AVX-NEXT: vaddss %xmm0, %xmm1, %xmm0			; AVX-NEXT: vaddss %xmm0, %xmm1, %xmm0
	; AVX-NEXT: retq			; AVX-NEXT: retq
	;			;
	; AVX512-LABEL: test_v4f32_undef:			; AVX512-LABEL: test_v4f32_undef:
	; AVX512: # %bb.0:			; AVX512: # %bb.0:
	; AVX512-NEXT: vaddss %xmm0, %xmm0, %xmm1			; AVX512-NEXT: vmovshdup {{.*#+}} xmm1 = xmm0[1,1,3,3]
	; AVX512-NEXT: vmovshdup {{.*#+}} xmm2 = xmm0[1,1,3,3]			; AVX512-NEXT: vaddss {{.*}}(%rip), %xmm1, %xmm1
	; AVX512-NEXT: vaddss %xmm2, %xmm1, %xmm1
	; AVX512-NEXT: vpermilpd {{.*#+}} xmm2 = xmm0[1,0]			; AVX512-NEXT: vpermilpd {{.*#+}} xmm2 = xmm0[1,0]
	; AVX512-NEXT: vaddss %xmm2, %xmm1, %xmm1			; AVX512-NEXT: vaddss %xmm2, %xmm1, %xmm1
	; AVX512-NEXT: vpermilps {{.*#+}} xmm0 = xmm0[3,1,2,3]			; AVX512-NEXT: vpermilps {{.*#+}} xmm0 = xmm0[3,1,2,3]
	; AVX512-NEXT: vaddss %xmm0, %xmm1, %xmm0			; AVX512-NEXT: vaddss %xmm0, %xmm1, %xmm0
	; AVX512-NEXT: retq			; AVX512-NEXT: retq
	%1 = call float @llvm.experimental.vector.reduce.fadd.f32.f32.v4f32(float undef, <4 x float> %a0)			%1 = call float @llvm.experimental.vector.reduce.fadd.f32.f32.v4f32(float undef, <4 x float> %a0)
	ret float %1			ret float %1
	}			}

	define float @test_v8f32_undef(<8 x float> %a0) {			define float @test_v8f32_undef(<8 x float> %a0) {
	; SSE2-LABEL: test_v8f32_undef:			; SSE2-LABEL: test_v8f32_undef:
	; SSE2: # %bb.0:			; SSE2: # %bb.0:
	; SSE2-NEXT: movaps %xmm0, %xmm2			; SSE2-NEXT: movaps %xmm0, %xmm2
	; SSE2-NEXT: addss %xmm0, %xmm2			; SSE2-NEXT: shufps {{.*#+}} xmm2 = xmm2[1,1],xmm0[2,3]
				; SSE2-NEXT: addss {{.*}}(%rip), %xmm2
	; SSE2-NEXT: movaps %xmm0, %xmm3			; SSE2-NEXT: movaps %xmm0, %xmm3
	; SSE2-NEXT: shufps {{.*#+}} xmm3 = xmm3[1,1],xmm0[2,3]			; SSE2-NEXT: movhlps {{.*#+}} xmm3 = xmm0[1],xmm3[1]
	; SSE2-NEXT: addss %xmm2, %xmm3			; SSE2-NEXT: addss %xmm2, %xmm3
	; SSE2-NEXT: movaps %xmm0, %xmm2
	; SSE2-NEXT: movhlps {{.*#+}} xmm2 = xmm0[1],xmm2[1]
	; SSE2-NEXT: addss %xmm3, %xmm2
	; SSE2-NEXT: shufps {{.*#+}} xmm0 = xmm0[3,1,2,3]			; SSE2-NEXT: shufps {{.*#+}} xmm0 = xmm0[3,1,2,3]
	; SSE2-NEXT: addss %xmm2, %xmm0			; SSE2-NEXT: addss %xmm3, %xmm0
	; SSE2-NEXT: addss %xmm1, %xmm0			; SSE2-NEXT: addss %xmm1, %xmm0
	; SSE2-NEXT: movaps %xmm1, %xmm2			; SSE2-NEXT: movaps %xmm1, %xmm2
	; SSE2-NEXT: shufps {{.*#+}} xmm2 = xmm2[1,1],xmm1[2,3]			; SSE2-NEXT: shufps {{.*#+}} xmm2 = xmm2[1,1],xmm1[2,3]
	; SSE2-NEXT: addss %xmm2, %xmm0			; SSE2-NEXT: addss %xmm2, %xmm0
	; SSE2-NEXT: movaps %xmm1, %xmm2			; SSE2-NEXT: movaps %xmm1, %xmm2
	; SSE2-NEXT: movhlps {{.*#+}} xmm2 = xmm1[1],xmm2[1]			; SSE2-NEXT: movhlps {{.*#+}} xmm2 = xmm1[1],xmm2[1]
	; SSE2-NEXT: addss %xmm2, %xmm0			; SSE2-NEXT: addss %xmm2, %xmm0
	; SSE2-NEXT: shufps {{.*#+}} xmm1 = xmm1[3,1,2,3]			; SSE2-NEXT: shufps {{.*#+}} xmm1 = xmm1[3,1,2,3]
	; SSE2-NEXT: addss %xmm1, %xmm0			; SSE2-NEXT: addss %xmm1, %xmm0
	; SSE2-NEXT: retq			; SSE2-NEXT: retq
	;			;
	; SSE41-LABEL: test_v8f32_undef:			; SSE41-LABEL: test_v8f32_undef:
	; SSE41: # %bb.0:			; SSE41: # %bb.0:
	; SSE41-NEXT: movaps %xmm0, %xmm2			; SSE41-NEXT: movshdup {{.*#+}} xmm2 = xmm0[1,1,3,3]
	; SSE41-NEXT: addss %xmm0, %xmm2			; SSE41-NEXT: addss {{.*}}(%rip), %xmm2
	; SSE41-NEXT: movshdup {{.*#+}} xmm3 = xmm0[1,1,3,3]
	; SSE41-NEXT: addss %xmm3, %xmm2
	; SSE41-NEXT: movaps %xmm0, %xmm3			; SSE41-NEXT: movaps %xmm0, %xmm3
	; SSE41-NEXT: movhlps {{.*#+}} xmm3 = xmm0[1],xmm3[1]			; SSE41-NEXT: movhlps {{.*#+}} xmm3 = xmm0[1],xmm3[1]
	; SSE41-NEXT: addss %xmm2, %xmm3			; SSE41-NEXT: addss %xmm2, %xmm3
	; SSE41-NEXT: shufps {{.*#+}} xmm0 = xmm0[3,1,2,3]			; SSE41-NEXT: shufps {{.*#+}} xmm0 = xmm0[3,1,2,3]
	; SSE41-NEXT: addss %xmm3, %xmm0			; SSE41-NEXT: addss %xmm3, %xmm0
	; SSE41-NEXT: addss %xmm1, %xmm0			; SSE41-NEXT: addss %xmm1, %xmm0
	; SSE41-NEXT: movshdup {{.*#+}} xmm2 = xmm1[1,1,3,3]			; SSE41-NEXT: movshdup {{.*#+}} xmm2 = xmm1[1,1,3,3]
	; SSE41-NEXT: addss %xmm2, %xmm0			; SSE41-NEXT: addss %xmm2, %xmm0
	; SSE41-NEXT: movaps %xmm1, %xmm2			; SSE41-NEXT: movaps %xmm1, %xmm2
	; SSE41-NEXT: movhlps {{.*#+}} xmm2 = xmm1[1],xmm2[1]			; SSE41-NEXT: movhlps {{.*#+}} xmm2 = xmm1[1],xmm2[1]
	; SSE41-NEXT: addss %xmm2, %xmm0			; SSE41-NEXT: addss %xmm2, %xmm0
	; SSE41-NEXT: shufps {{.*#+}} xmm1 = xmm1[3,1,2,3]			; SSE41-NEXT: shufps {{.*#+}} xmm1 = xmm1[3,1,2,3]
	; SSE41-NEXT: addss %xmm1, %xmm0			; SSE41-NEXT: addss %xmm1, %xmm0
	; SSE41-NEXT: retq			; SSE41-NEXT: retq
	;			;
	; AVX-LABEL: test_v8f32_undef:			; AVX-LABEL: test_v8f32_undef:
	; AVX: # %bb.0:			; AVX: # %bb.0:
	; AVX-NEXT: vaddss %xmm0, %xmm0, %xmm1			; AVX-NEXT: vmovshdup {{.*#+}} xmm1 = xmm0[1,1,3,3]
	; AVX-NEXT: vmovshdup {{.*#+}} xmm2 = xmm0[1,1,3,3]			; AVX-NEXT: vaddss {{.*}}(%rip), %xmm1, %xmm1
	; AVX-NEXT: vaddss %xmm2, %xmm1, %xmm1
	; AVX-NEXT: vpermilpd {{.*#+}} xmm2 = xmm0[1,0]			; AVX-NEXT: vpermilpd {{.*#+}} xmm2 = xmm0[1,0]
	; AVX-NEXT: vaddss %xmm2, %xmm1, %xmm1			; AVX-NEXT: vaddss %xmm2, %xmm1, %xmm1
	; AVX-NEXT: vpermilps {{.*#+}} xmm2 = xmm0[3,1,2,3]			; AVX-NEXT: vpermilps {{.*#+}} xmm2 = xmm0[3,1,2,3]
	; AVX-NEXT: vaddss %xmm2, %xmm1, %xmm1			; AVX-NEXT: vaddss %xmm2, %xmm1, %xmm1
	; AVX-NEXT: vextractf128 $1, %ymm0, %xmm0			; AVX-NEXT: vextractf128 $1, %ymm0, %xmm0
	; AVX-NEXT: vaddss %xmm0, %xmm1, %xmm1			; AVX-NEXT: vaddss %xmm0, %xmm1, %xmm1
	; AVX-NEXT: vmovshdup {{.*#+}} xmm2 = xmm0[1,1,3,3]			; AVX-NEXT: vmovshdup {{.*#+}} xmm2 = xmm0[1,1,3,3]
	; AVX-NEXT: vaddss %xmm2, %xmm1, %xmm1			; AVX-NEXT: vaddss %xmm2, %xmm1, %xmm1
	; AVX-NEXT: vpermilpd {{.*#+}} xmm2 = xmm0[1,0]			; AVX-NEXT: vpermilpd {{.*#+}} xmm2 = xmm0[1,0]
	; AVX-NEXT: vaddss %xmm2, %xmm1, %xmm1			; AVX-NEXT: vaddss %xmm2, %xmm1, %xmm1
	; AVX-NEXT: vpermilps {{.*#+}} xmm0 = xmm0[3,1,2,3]			; AVX-NEXT: vpermilps {{.*#+}} xmm0 = xmm0[3,1,2,3]
	; AVX-NEXT: vaddss %xmm0, %xmm1, %xmm0			; AVX-NEXT: vaddss %xmm0, %xmm1, %xmm0
	; AVX-NEXT: vzeroupper			; AVX-NEXT: vzeroupper
	; AVX-NEXT: retq			; AVX-NEXT: retq
	;			;
	; AVX512-LABEL: test_v8f32_undef:			; AVX512-LABEL: test_v8f32_undef:
	; AVX512: # %bb.0:			; AVX512: # %bb.0:
	; AVX512-NEXT: vaddss %xmm0, %xmm0, %xmm1			; AVX512-NEXT: vmovshdup {{.*#+}} xmm1 = xmm0[1,1,3,3]
	; AVX512-NEXT: vmovshdup {{.*#+}} xmm2 = xmm0[1,1,3,3]			; AVX512-NEXT: vaddss {{.*}}(%rip), %xmm1, %xmm1
	; AVX512-NEXT: vaddss %xmm2, %xmm1, %xmm1
	; AVX512-NEXT: vpermilpd {{.*#+}} xmm2 = xmm0[1,0]			; AVX512-NEXT: vpermilpd {{.*#+}} xmm2 = xmm0[1,0]
	; AVX512-NEXT: vaddss %xmm2, %xmm1, %xmm1			; AVX512-NEXT: vaddss %xmm2, %xmm1, %xmm1
	; AVX512-NEXT: vpermilps {{.*#+}} xmm2 = xmm0[3,1,2,3]			; AVX512-NEXT: vpermilps {{.*#+}} xmm2 = xmm0[3,1,2,3]
	; AVX512-NEXT: vaddss %xmm2, %xmm1, %xmm1			; AVX512-NEXT: vaddss %xmm2, %xmm1, %xmm1
	; AVX512-NEXT: vextractf128 $1, %ymm0, %xmm0			; AVX512-NEXT: vextractf128 $1, %ymm0, %xmm0
	; AVX512-NEXT: vaddss %xmm0, %xmm1, %xmm1			; AVX512-NEXT: vaddss %xmm0, %xmm1, %xmm1
	; AVX512-NEXT: vmovshdup {{.*#+}} xmm2 = xmm0[1,1,3,3]			; AVX512-NEXT: vmovshdup {{.*#+}} xmm2 = xmm0[1,1,3,3]
	; AVX512-NEXT: vaddss %xmm2, %xmm1, %xmm1			; AVX512-NEXT: vaddss %xmm2, %xmm1, %xmm1
	; AVX512-NEXT: vpermilpd {{.*#+}} xmm2 = xmm0[1,0]			; AVX512-NEXT: vpermilpd {{.*#+}} xmm2 = xmm0[1,0]
	; AVX512-NEXT: vaddss %xmm2, %xmm1, %xmm1			; AVX512-NEXT: vaddss %xmm2, %xmm1, %xmm1
	; AVX512-NEXT: vpermilps {{.*#+}} xmm0 = xmm0[3,1,2,3]			; AVX512-NEXT: vpermilps {{.*#+}} xmm0 = xmm0[3,1,2,3]
	; AVX512-NEXT: vaddss %xmm0, %xmm1, %xmm0			; AVX512-NEXT: vaddss %xmm0, %xmm1, %xmm0
	; AVX512-NEXT: vzeroupper			; AVX512-NEXT: vzeroupper
	; AVX512-NEXT: retq			; AVX512-NEXT: retq
	%1 = call float @llvm.experimental.vector.reduce.fadd.f32.f32.v8f32(float undef, <8 x float> %a0)			%1 = call float @llvm.experimental.vector.reduce.fadd.f32.f32.v8f32(float undef, <8 x float> %a0)
	ret float %1			ret float %1
	}			}

	define float @test_v16f32_undef(<16 x float> %a0) {			define float @test_v16f32_undef(<16 x float> %a0) {
	; SSE2-LABEL: test_v16f32_undef:			; SSE2-LABEL: test_v16f32_undef:
	; SSE2: # %bb.0:			; SSE2: # %bb.0:
	; SSE2-NEXT: movaps %xmm0, %xmm4			; SSE2-NEXT: movaps %xmm0, %xmm4
	; SSE2-NEXT: addss %xmm0, %xmm4			; SSE2-NEXT: shufps {{.*#+}} xmm4 = xmm4[1,1],xmm0[2,3]
				; SSE2-NEXT: addss {{.*}}(%rip), %xmm4
	; SSE2-NEXT: movaps %xmm0, %xmm5			; SSE2-NEXT: movaps %xmm0, %xmm5
	; SSE2-NEXT: shufps {{.*#+}} xmm5 = xmm5[1,1],xmm0[2,3]			; SSE2-NEXT: movhlps {{.*#+}} xmm5 = xmm0[1],xmm5[1]
	; SSE2-NEXT: addss %xmm4, %xmm5			; SSE2-NEXT: addss %xmm4, %xmm5
	; SSE2-NEXT: movaps %xmm0, %xmm4
	; SSE2-NEXT: movhlps {{.*#+}} xmm4 = xmm0[1],xmm4[1]
	; SSE2-NEXT: addss %xmm5, %xmm4
	; SSE2-NEXT: shufps {{.*#+}} xmm0 = xmm0[3,1,2,3]			; SSE2-NEXT: shufps {{.*#+}} xmm0 = xmm0[3,1,2,3]
	; SSE2-NEXT: addss %xmm4, %xmm0			; SSE2-NEXT: addss %xmm5, %xmm0
	; SSE2-NEXT: addss %xmm1, %xmm0			; SSE2-NEXT: addss %xmm1, %xmm0
	; SSE2-NEXT: movaps %xmm1, %xmm4			; SSE2-NEXT: movaps %xmm1, %xmm4
	; SSE2-NEXT: shufps {{.*#+}} xmm4 = xmm4[1,1],xmm1[2,3]			; SSE2-NEXT: shufps {{.*#+}} xmm4 = xmm4[1,1],xmm1[2,3]
	; SSE2-NEXT: addss %xmm4, %xmm0			; SSE2-NEXT: addss %xmm4, %xmm0
	; SSE2-NEXT: movaps %xmm1, %xmm4			; SSE2-NEXT: movaps %xmm1, %xmm4
	; SSE2-NEXT: movhlps {{.*#+}} xmm4 = xmm1[1],xmm4[1]			; SSE2-NEXT: movhlps {{.*#+}} xmm4 = xmm1[1],xmm4[1]
	; SSE2-NEXT: addss %xmm4, %xmm0			; SSE2-NEXT: addss %xmm4, %xmm0
	; SSE2-NEXT: shufps {{.*#+}} xmm1 = xmm1[3,1,2,3]			; SSE2-NEXT: shufps {{.*#+}} xmm1 = xmm1[3,1,2,3]
	[... 15 lines not shown ...]
	; SSE2-NEXT: movhlps {{.*#+}} xmm1 = xmm3[1],xmm1[1]			; SSE2-NEXT: movhlps {{.*#+}} xmm1 = xmm3[1],xmm1[1]
	; SSE2-NEXT: addss %xmm1, %xmm0			; SSE2-NEXT: addss %xmm1, %xmm0
	; SSE2-NEXT: shufps {{.*#+}} xmm3 = xmm3[3,1,2,3]			; SSE2-NEXT: shufps {{.*#+}} xmm3 = xmm3[3,1,2,3]
	; SSE2-NEXT: addss %xmm3, %xmm0			; SSE2-NEXT: addss %xmm3, %xmm0
	; SSE2-NEXT: retq			; SSE2-NEXT: retq
	;			;
	; SSE41-LABEL: test_v16f32_undef:			; SSE41-LABEL: test_v16f32_undef:
	; SSE41: # %bb.0:			; SSE41: # %bb.0:
	; SSE41-NEXT: movaps %xmm0, %xmm4			; SSE41-NEXT: movshdup {{.*#+}} xmm4 = xmm0[1,1,3,3]
	; SSE41-NEXT: addss %xmm0, %xmm4			; SSE41-NEXT: addss {{.*}}(%rip), %xmm4
	; SSE41-NEXT: movshdup {{.*#+}} xmm5 = xmm0[1,1,3,3]
	; SSE41-NEXT: addss %xmm5, %xmm4
	; SSE41-NEXT: movaps %xmm0, %xmm5			; SSE41-NEXT: movaps %xmm0, %xmm5
	; SSE41-NEXT: movhlps {{.*#+}} xmm5 = xmm0[1],xmm5[1]			; SSE41-NEXT: movhlps {{.*#+}} xmm5 = xmm0[1],xmm5[1]
	; SSE41-NEXT: addss %xmm4, %xmm5			; SSE41-NEXT: addss %xmm4, %xmm5
	; SSE41-NEXT: shufps {{.*#+}} xmm0 = xmm0[3,1,2,3]			; SSE41-NEXT: shufps {{.*#+}} xmm0 = xmm0[3,1,2,3]
	; SSE41-NEXT: addss %xmm5, %xmm0			; SSE41-NEXT: addss %xmm5, %xmm0
	; SSE41-NEXT: addss %xmm1, %xmm0			; SSE41-NEXT: addss %xmm1, %xmm0
	; SSE41-NEXT: movshdup {{.*#+}} xmm4 = xmm1[1,1,3,3]			; SSE41-NEXT: movshdup {{.*#+}} xmm4 = xmm1[1,1,3,3]
	; SSE41-NEXT: addss %xmm4, %xmm0			; SSE41-NEXT: addss %xmm4, %xmm0
	[... 17 lines not shown ...]
	; SSE41-NEXT: movhlps {{.*#+}} xmm1 = xmm3[1],xmm1[1]			; SSE41-NEXT: movhlps {{.*#+}} xmm1 = xmm3[1],xmm1[1]
	; SSE41-NEXT: addss %xmm1, %xmm0			; SSE41-NEXT: addss %xmm1, %xmm0
	; SSE41-NEXT: shufps {{.*#+}} xmm3 = xmm3[3,1,2,3]			; SSE41-NEXT: shufps {{.*#+}} xmm3 = xmm3[3,1,2,3]
	; SSE41-NEXT: addss %xmm3, %xmm0			; SSE41-NEXT: addss %xmm3, %xmm0
	; SSE41-NEXT: retq			; SSE41-NEXT: retq
	;			;
	; AVX-LABEL: test_v16f32_undef:			; AVX-LABEL: test_v16f32_undef:
	; AVX: # %bb.0:			; AVX: # %bb.0:
	; AVX-NEXT: vaddss %xmm0, %xmm0, %xmm2			; AVX-NEXT: vmovshdup {{.*#+}} xmm2 = xmm0[1,1,3,3]
	; AVX-NEXT: vmovshdup {{.*#+}} xmm3 = xmm0[1,1,3,3]			; AVX-NEXT: vaddss {{.*}}(%rip), %xmm2, %xmm2
	; AVX-NEXT: vaddss %xmm3, %xmm2, %xmm2
	; AVX-NEXT: vpermilpd {{.*#+}} xmm3 = xmm0[1,0]			; AVX-NEXT: vpermilpd {{.*#+}} xmm3 = xmm0[1,0]
	; AVX-NEXT: vaddss %xmm3, %xmm2, %xmm2			; AVX-NEXT: vaddss %xmm3, %xmm2, %xmm2
	; AVX-NEXT: vpermilps {{.*#+}} xmm3 = xmm0[3,1,2,3]			; AVX-NEXT: vpermilps {{.*#+}} xmm3 = xmm0[3,1,2,3]
	; AVX-NEXT: vaddss %xmm3, %xmm2, %xmm2			; AVX-NEXT: vaddss %xmm3, %xmm2, %xmm2
	; AVX-NEXT: vextractf128 $1, %ymm0, %xmm0			; AVX-NEXT: vextractf128 $1, %ymm0, %xmm0
	; AVX-NEXT: vaddss %xmm0, %xmm2, %xmm2			; AVX-NEXT: vaddss %xmm0, %xmm2, %xmm2
	; AVX-NEXT: vmovshdup {{.*#+}} xmm3 = xmm0[1,1,3,3]			; AVX-NEXT: vmovshdup {{.*#+}} xmm3 = xmm0[1,1,3,3]
	; AVX-NEXT: vaddss %xmm3, %xmm2, %xmm2			; AVX-NEXT: vaddss %xmm3, %xmm2, %xmm2
	[... 16 lines not shown ...]
	; AVX-NEXT: vaddss %xmm2, %xmm0, %xmm0			; AVX-NEXT: vaddss %xmm2, %xmm0, %xmm0
	; AVX-NEXT: vpermilps {{.*#+}} xmm1 = xmm1[3,1,2,3]			; AVX-NEXT: vpermilps {{.*#+}} xmm1 = xmm1[3,1,2,3]
	; AVX-NEXT: vaddss %xmm1, %xmm0, %xmm0			; AVX-NEXT: vaddss %xmm1, %xmm0, %xmm0
	; AVX-NEXT: vzeroupper			; AVX-NEXT: vzeroupper
	; AVX-NEXT: retq			; AVX-NEXT: retq
	;			;
	; AVX512-LABEL: test_v16f32_undef:			; AVX512-LABEL: test_v16f32_undef:
	; AVX512: # %bb.0:			; AVX512: # %bb.0:
	; AVX512-NEXT: vaddss %xmm0, %xmm0, %xmm1			; AVX512-NEXT: vmovshdup {{.*#+}} xmm1 = xmm0[1,1,3,3]
	; AVX512-NEXT: vmovshdup {{.*#+}} xmm2 = xmm0[1,1,3,3]			; AVX512-NEXT: vaddss {{.*}}(%rip), %xmm1, %xmm1
	; AVX512-NEXT: vaddss %xmm2, %xmm1, %xmm1
	; AVX512-NEXT: vpermilpd {{.*#+}} xmm2 = xmm0[1,0]			; AVX512-NEXT: vpermilpd {{.*#+}} xmm2 = xmm0[1,0]
	; AVX512-NEXT: vaddss %xmm2, %xmm1, %xmm1			; AVX512-NEXT: vaddss %xmm2, %xmm1, %xmm1
	; AVX512-NEXT: vpermilps {{.*#+}} xmm2 = xmm0[3,1,2,3]			; AVX512-NEXT: vpermilps {{.*#+}} xmm2 = xmm0[3,1,2,3]
	; AVX512-NEXT: vaddss %xmm2, %xmm1, %xmm1			; AVX512-NEXT: vaddss %xmm2, %xmm1, %xmm1
	; AVX512-NEXT: vextractf128 $1, %ymm0, %xmm2			; AVX512-NEXT: vextractf128 $1, %ymm0, %xmm2
	; AVX512-NEXT: vaddss %xmm2, %xmm1, %xmm1			; AVX512-NEXT: vaddss %xmm2, %xmm1, %xmm1
	; AVX512-NEXT: vmovshdup {{.*#+}} xmm3 = xmm2[1,1,3,3]			; AVX512-NEXT: vmovshdup {{.*#+}} xmm3 = xmm2[1,1,3,3]
	; AVX512-NEXT: vaddss %xmm3, %xmm1, %xmm1			; AVX512-NEXT: vaddss %xmm3, %xmm1, %xmm1
	[... 568 lines not shown ...]

	;			;
	; vXf64 (undef)			; vXf64 (undef)
	;			;

	define double @test_v2f64_undef(<2 x double> %a0) {			define double @test_v2f64_undef(<2 x double> %a0) {
	; SSE-LABEL: test_v2f64_undef:			; SSE-LABEL: test_v2f64_undef:
	; SSE: # %bb.0:			; SSE: # %bb.0:
	; SSE-NEXT: movapd %xmm0, %xmm1
	; SSE-NEXT: addsd %xmm0, %xmm1
	; SSE-NEXT: movhlps {{.*#+}} xmm0 = xmm0[1,1]			; SSE-NEXT: movhlps {{.*#+}} xmm0 = xmm0[1,1]
	; SSE-NEXT: addsd %xmm1, %xmm0			; SSE-NEXT: addsd {{.*}}(%rip), %xmm0
	; SSE-NEXT: retq			; SSE-NEXT: retq
	;			;
	; AVX-LABEL: test_v2f64_undef:			; AVX-LABEL: test_v2f64_undef:
	; AVX: # %bb.0:			; AVX: # %bb.0:
	; AVX-NEXT: vaddsd %xmm0, %xmm0, %xmm1
	; AVX-NEXT: vpermilpd {{.*#+}} xmm0 = xmm0[1,0]			; AVX-NEXT: vpermilpd {{.*#+}} xmm0 = xmm0[1,0]
	; AVX-NEXT: vaddsd %xmm0, %xmm1, %xmm0			; AVX-NEXT: vaddsd {{.*}}(%rip), %xmm0, %xmm0
	; AVX-NEXT: retq			; AVX-NEXT: retq
	;			;
	; AVX512-LABEL: test_v2f64_undef:			; AVX512-LABEL: test_v2f64_undef:
	; AVX512: # %bb.0:			; AVX512: # %bb.0:
	; AVX512-NEXT: vaddsd %xmm0, %xmm0, %xmm1
	; AVX512-NEXT: vpermilpd {{.*#+}} xmm0 = xmm0[1,0]			; AVX512-NEXT: vpermilpd {{.*#+}} xmm0 = xmm0[1,0]
	; AVX512-NEXT: vaddsd %xmm0, %xmm1, %xmm0			; AVX512-NEXT: vaddsd {{.*}}(%rip), %xmm0, %xmm0
	; AVX512-NEXT: retq			; AVX512-NEXT: retq
	%1 = call double @llvm.experimental.vector.reduce.fadd.f64.f64.v2f64(double undef, <2 x double> %a0)			%1 = call double @llvm.experimental.vector.reduce.fadd.f64.f64.v2f64(double undef, <2 x double> %a0)
	ret double %1			ret double %1
	}			}

	define double @test_v4f64_undef(<4 x double> %a0) {			define double @test_v4f64_undef(<4 x double> %a0) {
	; SSE-LABEL: test_v4f64_undef:			; SSE-LABEL: test_v4f64_undef:
	; SSE: # %bb.0:			; SSE: # %bb.0:
	; SSE-NEXT: movapd %xmm0, %xmm2
	; SSE-NEXT: addsd %xmm0, %xmm2
	; SSE-NEXT: movhlps {{.*#+}} xmm0 = xmm0[1,1]			; SSE-NEXT: movhlps {{.*#+}} xmm0 = xmm0[1,1]
	; SSE-NEXT: addsd %xmm2, %xmm0			; SSE-NEXT: addsd {{.*}}(%rip), %xmm0
	; SSE-NEXT: addsd %xmm1, %xmm0			; SSE-NEXT: addsd %xmm1, %xmm0
	; SSE-NEXT: movhlps {{.*#+}} xmm1 = xmm1[1,1]			; SSE-NEXT: movhlps {{.*#+}} xmm1 = xmm1[1,1]
	; SSE-NEXT: addsd %xmm1, %xmm0			; SSE-NEXT: addsd %xmm1, %xmm0
	; SSE-NEXT: retq			; SSE-NEXT: retq
	;			;
	; AVX-LABEL: test_v4f64_undef:			; AVX-LABEL: test_v4f64_undef:
	; AVX: # %bb.0:			; AVX: # %bb.0:
	; AVX-NEXT: vaddsd %xmm0, %xmm0, %xmm1			; AVX-NEXT: vpermilpd {{.*#+}} xmm1 = xmm0[1,0]
	; AVX-NEXT: vpermilpd {{.*#+}} xmm2 = xmm0[1,0]			; AVX-NEXT: vaddsd {{.*}}(%rip), %xmm1, %xmm1
	; AVX-NEXT: vaddsd %xmm2, %xmm1, %xmm1
	; AVX-NEXT: vextractf128 $1, %ymm0, %xmm0			; AVX-NEXT: vextractf128 $1, %ymm0, %xmm0
	; AVX-NEXT: vaddsd %xmm0, %xmm1, %xmm1			; AVX-NEXT: vaddsd %xmm0, %xmm1, %xmm1
	; AVX-NEXT: vpermilpd {{.*#+}} xmm0 = xmm0[1,0]			; AVX-NEXT: vpermilpd {{.*#+}} xmm0 = xmm0[1,0]
	; AVX-NEXT: vaddsd %xmm0, %xmm1, %xmm0			; AVX-NEXT: vaddsd %xmm0, %xmm1, %xmm0
	; AVX-NEXT: vzeroupper			; AVX-NEXT: vzeroupper
	; AVX-NEXT: retq			; AVX-NEXT: retq
	;			;
	; AVX512-LABEL: test_v4f64_undef:			; AVX512-LABEL: test_v4f64_undef:
	; AVX512: # %bb.0:			; AVX512: # %bb.0:
	; AVX512-NEXT: vaddsd %xmm0, %xmm0, %xmm1			; AVX512-NEXT: vpermilpd {{.*#+}} xmm1 = xmm0[1,0]
	; AVX512-NEXT: vpermilpd {{.*#+}} xmm2 = xmm0[1,0]			; AVX512-NEXT: vaddsd {{.*}}(%rip), %xmm1, %xmm1
	; AVX512-NEXT: vaddsd %xmm2, %xmm1, %xmm1
	; AVX512-NEXT: vextractf128 $1, %ymm0, %xmm0			; AVX512-NEXT: vextractf128 $1, %ymm0, %xmm0
	; AVX512-NEXT: vaddsd %xmm0, %xmm1, %xmm1			; AVX512-NEXT: vaddsd %xmm0, %xmm1, %xmm1
	; AVX512-NEXT: vpermilpd {{.*#+}} xmm0 = xmm0[1,0]			; AVX512-NEXT: vpermilpd {{.*#+}} xmm0 = xmm0[1,0]
	; AVX512-NEXT: vaddsd %xmm0, %xmm1, %xmm0			; AVX512-NEXT: vaddsd %xmm0, %xmm1, %xmm0
	; AVX512-NEXT: vzeroupper			; AVX512-NEXT: vzeroupper
	; AVX512-NEXT: retq			; AVX512-NEXT: retq
	%1 = call double @llvm.experimental.vector.reduce.fadd.f64.f64.v4f64(double undef, <4 x double> %a0)			%1 = call double @llvm.experimental.vector.reduce.fadd.f64.f64.v4f64(double undef, <4 x double> %a0)
	ret double %1			ret double %1
	}			}

	define double @test_v8f64_undef(<8 x double> %a0) {			define double @test_v8f64_undef(<8 x double> %a0) {
	; SSE-LABEL: test_v8f64_undef:			; SSE-LABEL: test_v8f64_undef:
	; SSE: # %bb.0:			; SSE: # %bb.0:
	; SSE-NEXT: movapd %xmm0, %xmm4
	; SSE-NEXT: addsd %xmm0, %xmm4
	; SSE-NEXT: movhlps {{.*#+}} xmm0 = xmm0[1,1]			; SSE-NEXT: movhlps {{.*#+}} xmm0 = xmm0[1,1]
	; SSE-NEXT: addsd %xmm4, %xmm0			; SSE-NEXT: addsd {{.*}}(%rip), %xmm0
	; SSE-NEXT: addsd %xmm1, %xmm0			; SSE-NEXT: addsd %xmm1, %xmm0
	; SSE-NEXT: movhlps {{.*#+}} xmm1 = xmm1[1,1]			; SSE-NEXT: movhlps {{.*#+}} xmm1 = xmm1[1,1]
	; SSE-NEXT: addsd %xmm1, %xmm0			; SSE-NEXT: addsd %xmm1, %xmm0
	; SSE-NEXT: addsd %xmm2, %xmm0			; SSE-NEXT: addsd %xmm2, %xmm0
	; SSE-NEXT: movhlps {{.*#+}} xmm2 = xmm2[1,1]			; SSE-NEXT: movhlps {{.*#+}} xmm2 = xmm2[1,1]
	; SSE-NEXT: addsd %xmm2, %xmm0			; SSE-NEXT: addsd %xmm2, %xmm0
	; SSE-NEXT: addsd %xmm3, %xmm0			; SSE-NEXT: addsd %xmm3, %xmm0
	; SSE-NEXT: movhlps {{.*#+}} xmm3 = xmm3[1,1]			; SSE-NEXT: movhlps {{.*#+}} xmm3 = xmm3[1,1]
	; SSE-NEXT: addsd %xmm3, %xmm0			; SSE-NEXT: addsd %xmm3, %xmm0
	; SSE-NEXT: retq			; SSE-NEXT: retq
	;			;
	; AVX-LABEL: test_v8f64_undef:			; AVX-LABEL: test_v8f64_undef:
	; AVX: # %bb.0:			; AVX: # %bb.0:
	; AVX-NEXT: vaddsd %xmm0, %xmm0, %xmm2			; AVX-NEXT: vpermilpd {{.*#+}} xmm2 = xmm0[1,0]
	; AVX-NEXT: vpermilpd {{.*#+}} xmm3 = xmm0[1,0]			; AVX-NEXT: vaddsd {{.*}}(%rip), %xmm2, %xmm2
	; AVX-NEXT: vaddsd %xmm3, %xmm2, %xmm2
	; AVX-NEXT: vextractf128 $1, %ymm0, %xmm0			; AVX-NEXT: vextractf128 $1, %ymm0, %xmm0
	; AVX-NEXT: vaddsd %xmm0, %xmm2, %xmm2			; AVX-NEXT: vaddsd %xmm0, %xmm2, %xmm2
	; AVX-NEXT: vpermilpd {{.*#+}} xmm0 = xmm0[1,0]			; AVX-NEXT: vpermilpd {{.*#+}} xmm0 = xmm0[1,0]
	; AVX-NEXT: vaddsd %xmm0, %xmm2, %xmm0			; AVX-NEXT: vaddsd %xmm0, %xmm2, %xmm0
	; AVX-NEXT: vaddsd %xmm1, %xmm0, %xmm0			; AVX-NEXT: vaddsd %xmm1, %xmm0, %xmm0
	; AVX-NEXT: vpermilpd {{.*#+}} xmm2 = xmm1[1,0]			; AVX-NEXT: vpermilpd {{.*#+}} xmm2 = xmm1[1,0]
	; AVX-NEXT: vaddsd %xmm2, %xmm0, %xmm0			; AVX-NEXT: vaddsd %xmm2, %xmm0, %xmm0
	; AVX-NEXT: vextractf128 $1, %ymm1, %xmm1			; AVX-NEXT: vextractf128 $1, %ymm1, %xmm1
	; AVX-NEXT: vaddsd %xmm1, %xmm0, %xmm0			; AVX-NEXT: vaddsd %xmm1, %xmm0, %xmm0
	; AVX-NEXT: vpermilpd {{.*#+}} xmm1 = xmm1[1,0]			; AVX-NEXT: vpermilpd {{.*#+}} xmm1 = xmm1[1,0]
	; AVX-NEXT: vaddsd %xmm1, %xmm0, %xmm0			; AVX-NEXT: vaddsd %xmm1, %xmm0, %xmm0
	; AVX-NEXT: vzeroupper			; AVX-NEXT: vzeroupper
	; AVX-NEXT: retq			; AVX-NEXT: retq
	;			;
	; AVX512-LABEL: test_v8f64_undef:			; AVX512-LABEL: test_v8f64_undef:
	; AVX512: # %bb.0:			; AVX512: # %bb.0:
	; AVX512-NEXT: vaddsd %xmm0, %xmm0, %xmm1			; AVX512-NEXT: vpermilpd {{.*#+}} xmm1 = xmm0[1,0]
	; AVX512-NEXT: vpermilpd {{.*#+}} xmm2 = xmm0[1,0]			; AVX512-NEXT: vaddsd {{.*}}(%rip), %xmm1, %xmm1
	; AVX512-NEXT: vaddsd %xmm2, %xmm1, %xmm1
	; AVX512-NEXT: vextractf128 $1, %ymm0, %xmm2			; AVX512-NEXT: vextractf128 $1, %ymm0, %xmm2
	; AVX512-NEXT: vaddsd %xmm2, %xmm1, %xmm1			; AVX512-NEXT: vaddsd %xmm2, %xmm1, %xmm1
	; AVX512-NEXT: vpermilpd {{.*#+}} xmm2 = xmm2[1,0]			; AVX512-NEXT: vpermilpd {{.*#+}} xmm2 = xmm2[1,0]
	; AVX512-NEXT: vaddsd %xmm2, %xmm1, %xmm1			; AVX512-NEXT: vaddsd %xmm2, %xmm1, %xmm1
	; AVX512-NEXT: vextractf32x4 $2, %zmm0, %xmm2			; AVX512-NEXT: vextractf32x4 $2, %zmm0, %xmm2
	; AVX512-NEXT: vaddsd %xmm2, %xmm1, %xmm1			; AVX512-NEXT: vaddsd %xmm2, %xmm1, %xmm1
	; AVX512-NEXT: vpermilpd {{.*#+}} xmm2 = xmm2[1,0]			; AVX512-NEXT: vpermilpd {{.*#+}} xmm2 = xmm2[1,0]
	; AVX512-NEXT: vaddsd %xmm2, %xmm1, %xmm1			; AVX512-NEXT: vaddsd %xmm2, %xmm1, %xmm1
	; AVX512-NEXT: vextractf32x4 $3, %zmm0, %xmm0			; AVX512-NEXT: vextractf32x4 $3, %zmm0, %xmm0
	; AVX512-NEXT: vaddsd %xmm0, %xmm1, %xmm1			; AVX512-NEXT: vaddsd %xmm0, %xmm1, %xmm1
	; AVX512-NEXT: vpermilpd {{.*#+}} xmm0 = xmm0[1,0]			; AVX512-NEXT: vpermilpd {{.*#+}} xmm0 = xmm0[1,0]
	; AVX512-NEXT: vaddsd %xmm0, %xmm1, %xmm0			; AVX512-NEXT: vaddsd %xmm0, %xmm1, %xmm0
	; AVX512-NEXT: vzeroupper			; AVX512-NEXT: vzeroupper
	; AVX512-NEXT: retq			; AVX512-NEXT: retq
	%1 = call double @llvm.experimental.vector.reduce.fadd.f64.f64.v8f64(double undef, <8 x double> %a0)			%1 = call double @llvm.experimental.vector.reduce.fadd.f64.f64.v8f64(double undef, <8 x double> %a0)
	ret double %1			ret double %1
	}			}

	define double @test_v16f64_undef(<16 x double> %a0) {			define double @test_v16f64_undef(<16 x double> %a0) {
	; SSE-LABEL: test_v16f64_undef:			; SSE-LABEL: test_v16f64_undef:
	; SSE: # %bb.0:			; SSE: # %bb.0:
	; SSE-NEXT: movapd %xmm0, %xmm8
	; SSE-NEXT: addsd %xmm0, %xmm8
	; SSE-NEXT: movhlps {{.*#+}} xmm0 = xmm0[1,1]			; SSE-NEXT: movhlps {{.*#+}} xmm0 = xmm0[1,1]
	; SSE-NEXT: addsd %xmm8, %xmm0			; SSE-NEXT: addsd {{.*}}(%rip), %xmm0
	; SSE-NEXT: addsd %xmm1, %xmm0			; SSE-NEXT: addsd %xmm1, %xmm0
	; SSE-NEXT: movhlps {{.*#+}} xmm1 = xmm1[1,1]			; SSE-NEXT: movhlps {{.*#+}} xmm1 = xmm1[1,1]
	; SSE-NEXT: addsd %xmm1, %xmm0			; SSE-NEXT: addsd %xmm1, %xmm0
	; SSE-NEXT: addsd %xmm2, %xmm0			; SSE-NEXT: addsd %xmm2, %xmm0
	; SSE-NEXT: movhlps {{.*#+}} xmm2 = xmm2[1,1]			; SSE-NEXT: movhlps {{.*#+}} xmm2 = xmm2[1,1]
	; SSE-NEXT: addsd %xmm2, %xmm0			; SSE-NEXT: addsd %xmm2, %xmm0
	; SSE-NEXT: addsd %xmm3, %xmm0			; SSE-NEXT: addsd %xmm3, %xmm0
	; SSE-NEXT: movhlps {{.*#+}} xmm3 = xmm3[1,1]			; SSE-NEXT: movhlps {{.*#+}} xmm3 = xmm3[1,1]
	[... 9 lines not shown ...]
	; SSE-NEXT: addsd %xmm6, %xmm0			; SSE-NEXT: addsd %xmm6, %xmm0
	; SSE-NEXT: addsd %xmm7, %xmm0			; SSE-NEXT: addsd %xmm7, %xmm0
	; SSE-NEXT: movhlps {{.*#+}} xmm7 = xmm7[1,1]			; SSE-NEXT: movhlps {{.*#+}} xmm7 = xmm7[1,1]
	; SSE-NEXT: addsd %xmm7, %xmm0			; SSE-NEXT: addsd %xmm7, %xmm0
	; SSE-NEXT: retq			; SSE-NEXT: retq
	;			;
	; AVX-LABEL: test_v16f64_undef:			; AVX-LABEL: test_v16f64_undef:
	; AVX: # %bb.0:			; AVX: # %bb.0:
	; AVX-NEXT: vaddsd %xmm0, %xmm0, %xmm4			; AVX-NEXT: vpermilpd {{.*#+}} xmm4 = xmm0[1,0]
	; AVX-NEXT: vpermilpd {{.*#+}} xmm5 = xmm0[1,0]			; AVX-NEXT: vaddsd {{.*}}(%rip), %xmm4, %xmm4
	; AVX-NEXT: vaddsd %xmm5, %xmm4, %xmm4
	; AVX-NEXT: vextractf128 $1, %ymm0, %xmm0			; AVX-NEXT: vextractf128 $1, %ymm0, %xmm0
	; AVX-NEXT: vaddsd %xmm0, %xmm4, %xmm4			; AVX-NEXT: vaddsd %xmm0, %xmm4, %xmm4
	; AVX-NEXT: vpermilpd {{.*#+}} xmm0 = xmm0[1,0]			; AVX-NEXT: vpermilpd {{.*#+}} xmm0 = xmm0[1,0]
	; AVX-NEXT: vaddsd %xmm0, %xmm4, %xmm0			; AVX-NEXT: vaddsd %xmm0, %xmm4, %xmm0
	; AVX-NEXT: vaddsd %xmm1, %xmm0, %xmm0			; AVX-NEXT: vaddsd %xmm1, %xmm0, %xmm0
	; AVX-NEXT: vpermilpd {{.*#+}} xmm4 = xmm1[1,0]			; AVX-NEXT: vpermilpd {{.*#+}} xmm4 = xmm1[1,0]
	; AVX-NEXT: vaddsd %xmm4, %xmm0, %xmm0			; AVX-NEXT: vaddsd %xmm4, %xmm0, %xmm0
	; AVX-NEXT: vextractf128 $1, %ymm1, %xmm1			; AVX-NEXT: vextractf128 $1, %ymm1, %xmm1
	[... 14 lines not shown ...]
	; AVX-NEXT: vaddsd %xmm1, %xmm0, %xmm0			; AVX-NEXT: vaddsd %xmm1, %xmm0, %xmm0
	; AVX-NEXT: vpermilpd {{.*#+}} xmm1 = xmm1[1,0]			; AVX-NEXT: vpermilpd {{.*#+}} xmm1 = xmm1[1,0]
	; AVX-NEXT: vaddsd %xmm1, %xmm0, %xmm0			; AVX-NEXT: vaddsd %xmm1, %xmm0, %xmm0
	; AVX-NEXT: vzeroupper			; AVX-NEXT: vzeroupper
	; AVX-NEXT: retq			; AVX-NEXT: retq
	;			;
	; AVX512-LABEL: test_v16f64_undef:			; AVX512-LABEL: test_v16f64_undef:
	; AVX512: # %bb.0:			; AVX512: # %bb.0:
	; AVX512-NEXT: vaddsd %xmm0, %xmm0, %xmm2			; AVX512-NEXT: vpermilpd {{.*#+}} xmm2 = xmm0[1,0]
	; AVX512-NEXT: vpermilpd {{.*#+}} xmm3 = xmm0[1,0]			; AVX512-NEXT: vaddsd {{.*}}(%rip), %xmm2, %xmm2
	; AVX512-NEXT: vaddsd %xmm3, %xmm2, %xmm2
	; AVX512-NEXT: vextractf128 $1, %ymm0, %xmm3			; AVX512-NEXT: vextractf128 $1, %ymm0, %xmm3
	; AVX512-NEXT: vaddsd %xmm3, %xmm2, %xmm2			; AVX512-NEXT: vaddsd %xmm3, %xmm2, %xmm2
	; AVX512-NEXT: vpermilpd {{.*#+}} xmm3 = xmm3[1,0]			; AVX512-NEXT: vpermilpd {{.*#+}} xmm3 = xmm3[1,0]
	; AVX512-NEXT: vaddsd %xmm3, %xmm2, %xmm2			; AVX512-NEXT: vaddsd %xmm3, %xmm2, %xmm2
	; AVX512-NEXT: vextractf32x4 $2, %zmm0, %xmm3			; AVX512-NEXT: vextractf32x4 $2, %zmm0, %xmm3
	; AVX512-NEXT: vaddsd %xmm3, %xmm2, %xmm2			; AVX512-NEXT: vaddsd %xmm3, %xmm2, %xmm2
	; AVX512-NEXT: vpermilpd {{.*#+}} xmm3 = xmm3[1,0]			; AVX512-NEXT: vpermilpd {{.*#+}} xmm3 = xmm3[1,0]
	; AVX512-NEXT: vaddsd %xmm3, %xmm2, %xmm2			; AVX512-NEXT: vaddsd %xmm3, %xmm2, %xmm2
	[... 34 lines not shown ...]

llvm/trunk/test/CodeGen/X86/vector-reduce-fmul.ll

	[... 641 lines not shown ...]

	;			;
	; vXf32 (undef)			; vXf32 (undef)
	;			;

	define float @test_v2f32_undef(<2 x float> %a0) {			define float @test_v2f32_undef(<2 x float> %a0) {
	; SSE2-LABEL: test_v2f32_undef:			; SSE2-LABEL: test_v2f32_undef:
	; SSE2: # %bb.0:			; SSE2: # %bb.0:
	; SSE2-NEXT: movaps %xmm0, %xmm1
	; SSE2-NEXT: mulss %xmm0, %xmm1
	; SSE2-NEXT: shufps {{.*#+}} xmm0 = xmm0[1,1,2,3]			; SSE2-NEXT: shufps {{.*#+}} xmm0 = xmm0[1,1,2,3]
	; SSE2-NEXT: mulss %xmm1, %xmm0			; SSE2-NEXT: mulss {{.*}}(%rip), %xmm0
	; SSE2-NEXT: retq			; SSE2-NEXT: retq
	;			;
	; SSE41-LABEL: test_v2f32_undef:			; SSE41-LABEL: test_v2f32_undef:
	; SSE41: # %bb.0:			; SSE41: # %bb.0:
	; SSE41-NEXT: movshdup {{.*#+}} xmm1 = xmm0[1,1,3,3]			; SSE41-NEXT: movshdup {{.*#+}} xmm0 = xmm0[1,1,3,3]
	; SSE41-NEXT: mulss %xmm0, %xmm0			; SSE41-NEXT: mulss {{.*}}(%rip), %xmm0
	; SSE41-NEXT: mulss %xmm1, %xmm0
	; SSE41-NEXT: retq			; SSE41-NEXT: retq
	;			;
	; AVX-LABEL: test_v2f32_undef:			; AVX-LABEL: test_v2f32_undef:
	; AVX: # %bb.0:			; AVX: # %bb.0:
	; AVX-NEXT: vmulss %xmm0, %xmm0, %xmm1
	; AVX-NEXT: vmovshdup {{.*#+}} xmm0 = xmm0[1,1,3,3]			; AVX-NEXT: vmovshdup {{.*#+}} xmm0 = xmm0[1,1,3,3]
	; AVX-NEXT: vmulss %xmm0, %xmm1, %xmm0			; AVX-NEXT: vmulss {{.*}}(%rip), %xmm0, %xmm0
	; AVX-NEXT: retq			; AVX-NEXT: retq
	;			;
	; AVX512-LABEL: test_v2f32_undef:			; AVX512-LABEL: test_v2f32_undef:
	; AVX512: # %bb.0:			; AVX512: # %bb.0:
	; AVX512-NEXT: vmulss %xmm0, %xmm0, %xmm1
	; AVX512-NEXT: vmovshdup {{.*#+}} xmm0 = xmm0[1,1,3,3]			; AVX512-NEXT: vmovshdup {{.*#+}} xmm0 = xmm0[1,1,3,3]
	; AVX512-NEXT: vmulss %xmm0, %xmm1, %xmm0			; AVX512-NEXT: vmulss {{.*}}(%rip), %xmm0, %xmm0
	; AVX512-NEXT: retq			; AVX512-NEXT: retq
	%1 = call float @llvm.experimental.vector.reduce.fmul.f32.f32.v2f32(float undef, <2 x float> %a0)			%1 = call float @llvm.experimental.vector.reduce.fmul.f32.f32.v2f32(float undef, <2 x float> %a0)
	ret float %1			ret float %1
	}			}

	define float @test_v4f32_undef(<4 x float> %a0) {			define float @test_v4f32_undef(<4 x float> %a0) {
	; SSE2-LABEL: test_v4f32_undef:			; SSE2-LABEL: test_v4f32_undef:
	; SSE2: # %bb.0:			; SSE2: # %bb.0:
	; SSE2-NEXT: movaps %xmm0, %xmm1			; SSE2-NEXT: movaps %xmm0, %xmm1
	; SSE2-NEXT: mulss %xmm0, %xmm1			; SSE2-NEXT: shufps {{.*#+}} xmm1 = xmm1[1,1],xmm0[2,3]
				; SSE2-NEXT: mulss {{.*}}(%rip), %xmm1
	; SSE2-NEXT: movaps %xmm0, %xmm2			; SSE2-NEXT: movaps %xmm0, %xmm2
	; SSE2-NEXT: shufps {{.*#+}} xmm2 = xmm2[1,1],xmm0[2,3]			; SSE2-NEXT: movhlps {{.*#+}} xmm2 = xmm0[1],xmm2[1]
	; SSE2-NEXT: mulss %xmm1, %xmm2			; SSE2-NEXT: mulss %xmm1, %xmm2
	; SSE2-NEXT: movaps %xmm0, %xmm1
	; SSE2-NEXT: movhlps {{.*#+}} xmm1 = xmm0[1],xmm1[1]
	; SSE2-NEXT: mulss %xmm2, %xmm1
	; SSE2-NEXT: shufps {{.*#+}} xmm0 = xmm0[3,1,2,3]			; SSE2-NEXT: shufps {{.*#+}} xmm0 = xmm0[3,1,2,3]
	; SSE2-NEXT: mulss %xmm1, %xmm0			; SSE2-NEXT: mulss %xmm2, %xmm0
	; SSE2-NEXT: retq			; SSE2-NEXT: retq
	;			;
	; SSE41-LABEL: test_v4f32_undef:			; SSE41-LABEL: test_v4f32_undef:
	; SSE41: # %bb.0:			; SSE41: # %bb.0:
	; SSE41-NEXT: movaps %xmm0, %xmm1			; SSE41-NEXT: movshdup {{.*#+}} xmm1 = xmm0[1,1,3,3]
	; SSE41-NEXT: mulss %xmm0, %xmm1			; SSE41-NEXT: mulss {{.*}}(%rip), %xmm1
	; SSE41-NEXT: movshdup {{.*#+}} xmm2 = xmm0[1,1,3,3]
	; SSE41-NEXT: mulss %xmm2, %xmm1
	; SSE41-NEXT: movaps %xmm0, %xmm2			; SSE41-NEXT: movaps %xmm0, %xmm2
	; SSE41-NEXT: movhlps {{.*#+}} xmm2 = xmm0[1],xmm2[1]			; SSE41-NEXT: movhlps {{.*#+}} xmm2 = xmm0[1],xmm2[1]
	; SSE41-NEXT: mulss %xmm1, %xmm2			; SSE41-NEXT: mulss %xmm1, %xmm2
	; SSE41-NEXT: shufps {{.*#+}} xmm0 = xmm0[3,1,2,3]			; SSE41-NEXT: shufps {{.*#+}} xmm0 = xmm0[3,1,2,3]
	; SSE41-NEXT: mulss %xmm2, %xmm0			; SSE41-NEXT: mulss %xmm2, %xmm0
	; SSE41-NEXT: retq			; SSE41-NEXT: retq
	;			;
	; AVX-LABEL: test_v4f32_undef:			; AVX-LABEL: test_v4f32_undef:
	; AVX: # %bb.0:			; AVX: # %bb.0:
	; AVX-NEXT: vmulss %xmm0, %xmm0, %xmm1			; AVX-NEXT: vmovshdup {{.*#+}} xmm1 = xmm0[1,1,3,3]
	; AVX-NEXT: vmovshdup {{.*#+}} xmm2 = xmm0[1,1,3,3]			; AVX-NEXT: vmulss {{.*}}(%rip), %xmm1, %xmm1
	; AVX-NEXT: vmulss %xmm2, %xmm1, %xmm1
	; AVX-NEXT: vpermilpd {{.*#+}} xmm2 = xmm0[1,0]			; AVX-NEXT: vpermilpd {{.*#+}} xmm2 = xmm0[1,0]
	; AVX-NEXT: vmulss %xmm2, %xmm1, %xmm1			; AVX-NEXT: vmulss %xmm2, %xmm1, %xmm1
	; AVX-NEXT: vpermilps {{.*#+}} xmm0 = xmm0[3,1,2,3]			; AVX-NEXT: vpermilps {{.*#+}} xmm0 = xmm0[3,1,2,3]
	; AVX-NEXT: vmulss %xmm0, %xmm1, %xmm0			; AVX-NEXT: vmulss %xmm0, %xmm1, %xmm0
	; AVX-NEXT: retq			; AVX-NEXT: retq
	;			;
	; AVX512-LABEL: test_v4f32_undef:			; AVX512-LABEL: test_v4f32_undef:
	; AVX512: # %bb.0:			; AVX512: # %bb.0:
	; AVX512-NEXT: vmulss %xmm0, %xmm0, %xmm1			; AVX512-NEXT: vmovshdup {{.*#+}} xmm1 = xmm0[1,1,3,3]
	; AVX512-NEXT: vmovshdup {{.*#+}} xmm2 = xmm0[1,1,3,3]			; AVX512-NEXT: vmulss {{.*}}(%rip), %xmm1, %xmm1
	; AVX512-NEXT: vmulss %xmm2, %xmm1, %xmm1
	; AVX512-NEXT: vpermilpd {{.*#+}} xmm2 = xmm0[1,0]			; AVX512-NEXT: vpermilpd {{.*#+}} xmm2 = xmm0[1,0]
	; AVX512-NEXT: vmulss %xmm2, %xmm1, %xmm1			; AVX512-NEXT: vmulss %xmm2, %xmm1, %xmm1
	; AVX512-NEXT: vpermilps {{.*#+}} xmm0 = xmm0[3,1,2,3]			; AVX512-NEXT: vpermilps {{.*#+}} xmm0 = xmm0[3,1,2,3]
	; AVX512-NEXT: vmulss %xmm0, %xmm1, %xmm0			; AVX512-NEXT: vmulss %xmm0, %xmm1, %xmm0
	; AVX512-NEXT: retq			; AVX512-NEXT: retq
	%1 = call float @llvm.experimental.vector.reduce.fmul.f32.f32.v4f32(float undef, <4 x float> %a0)			%1 = call float @llvm.experimental.vector.reduce.fmul.f32.f32.v4f32(float undef, <4 x float> %a0)
	ret float %1			ret float %1
	}			}

	define float @test_v8f32_undef(<8 x float> %a0) {			define float @test_v8f32_undef(<8 x float> %a0) {
	; SSE2-LABEL: test_v8f32_undef:			; SSE2-LABEL: test_v8f32_undef:
	; SSE2: # %bb.0:			; SSE2: # %bb.0:
	; SSE2-NEXT: movaps %xmm0, %xmm2			; SSE2-NEXT: movaps %xmm0, %xmm2
	; SSE2-NEXT: mulss %xmm0, %xmm2			; SSE2-NEXT: shufps {{.*#+}} xmm2 = xmm2[1,1],xmm0[2,3]
				; SSE2-NEXT: mulss {{.*}}(%rip), %xmm2
	; SSE2-NEXT: movaps %xmm0, %xmm3			; SSE2-NEXT: movaps %xmm0, %xmm3
	; SSE2-NEXT: shufps {{.*#+}} xmm3 = xmm3[1,1],xmm0[2,3]			; SSE2-NEXT: movhlps {{.*#+}} xmm3 = xmm0[1],xmm3[1]
	; SSE2-NEXT: mulss %xmm2, %xmm3			; SSE2-NEXT: mulss %xmm2, %xmm3
	; SSE2-NEXT: movaps %xmm0, %xmm2
	; SSE2-NEXT: movhlps {{.*#+}} xmm2 = xmm0[1],xmm2[1]
	; SSE2-NEXT: mulss %xmm3, %xmm2
	; SSE2-NEXT: shufps {{.*#+}} xmm0 = xmm0[3,1,2,3]			; SSE2-NEXT: shufps {{.*#+}} xmm0 = xmm0[3,1,2,3]
	; SSE2-NEXT: mulss %xmm2, %xmm0			; SSE2-NEXT: mulss %xmm3, %xmm0
	; SSE2-NEXT: mulss %xmm1, %xmm0			; SSE2-NEXT: mulss %xmm1, %xmm0
	; SSE2-NEXT: movaps %xmm1, %xmm2			; SSE2-NEXT: movaps %xmm1, %xmm2
	; SSE2-NEXT: shufps {{.*#+}} xmm2 = xmm2[1,1],xmm1[2,3]			; SSE2-NEXT: shufps {{.*#+}} xmm2 = xmm2[1,1],xmm1[2,3]
	; SSE2-NEXT: mulss %xmm2, %xmm0			; SSE2-NEXT: mulss %xmm2, %xmm0
	; SSE2-NEXT: movaps %xmm1, %xmm2			; SSE2-NEXT: movaps %xmm1, %xmm2
	; SSE2-NEXT: movhlps {{.*#+}} xmm2 = xmm1[1],xmm2[1]			; SSE2-NEXT: movhlps {{.*#+}} xmm2 = xmm1[1],xmm2[1]
	; SSE2-NEXT: mulss %xmm2, %xmm0			; SSE2-NEXT: mulss %xmm2, %xmm0
	; SSE2-NEXT: shufps {{.*#+}} xmm1 = xmm1[3,1,2,3]			; SSE2-NEXT: shufps {{.*#+}} xmm1 = xmm1[3,1,2,3]
	; SSE2-NEXT: mulss %xmm1, %xmm0			; SSE2-NEXT: mulss %xmm1, %xmm0
	; SSE2-NEXT: retq			; SSE2-NEXT: retq
	;			;
	; SSE41-LABEL: test_v8f32_undef:			; SSE41-LABEL: test_v8f32_undef:
	; SSE41: # %bb.0:			; SSE41: # %bb.0:
	; SSE41-NEXT: movaps %xmm0, %xmm2			; SSE41-NEXT: movshdup {{.*#+}} xmm2 = xmm0[1,1,3,3]
	; SSE41-NEXT: mulss %xmm0, %xmm2			; SSE41-NEXT: mulss {{.*}}(%rip), %xmm2
	; SSE41-NEXT: movshdup {{.*#+}} xmm3 = xmm0[1,1,3,3]
	; SSE41-NEXT: mulss %xmm3, %xmm2
	; SSE41-NEXT: movaps %xmm0, %xmm3			; SSE41-NEXT: movaps %xmm0, %xmm3
	; SSE41-NEXT: movhlps {{.*#+}} xmm3 = xmm0[1],xmm3[1]			; SSE41-NEXT: movhlps {{.*#+}} xmm3 = xmm0[1],xmm3[1]
	; SSE41-NEXT: mulss %xmm2, %xmm3			; SSE41-NEXT: mulss %xmm2, %xmm3
	; SSE41-NEXT: shufps {{.*#+}} xmm0 = xmm0[3,1,2,3]			; SSE41-NEXT: shufps {{.*#+}} xmm0 = xmm0[3,1,2,3]
	; SSE41-NEXT: mulss %xmm3, %xmm0			; SSE41-NEXT: mulss %xmm3, %xmm0
	; SSE41-NEXT: mulss %xmm1, %xmm0			; SSE41-NEXT: mulss %xmm1, %xmm0
	; SSE41-NEXT: movshdup {{.*#+}} xmm2 = xmm1[1,1,3,3]			; SSE41-NEXT: movshdup {{.*#+}} xmm2 = xmm1[1,1,3,3]
	; SSE41-NEXT: mulss %xmm2, %xmm0			; SSE41-NEXT: mulss %xmm2, %xmm0
	; SSE41-NEXT: movaps %xmm1, %xmm2			; SSE41-NEXT: movaps %xmm1, %xmm2
	; SSE41-NEXT: movhlps {{.*#+}} xmm2 = xmm1[1],xmm2[1]			; SSE41-NEXT: movhlps {{.*#+}} xmm2 = xmm1[1],xmm2[1]
	; SSE41-NEXT: mulss %xmm2, %xmm0			; SSE41-NEXT: mulss %xmm2, %xmm0
	; SSE41-NEXT: shufps {{.*#+}} xmm1 = xmm1[3,1,2,3]			; SSE41-NEXT: shufps {{.*#+}} xmm1 = xmm1[3,1,2,3]
	; SSE41-NEXT: mulss %xmm1, %xmm0			; SSE41-NEXT: mulss %xmm1, %xmm0
	; SSE41-NEXT: retq			; SSE41-NEXT: retq
	;			;
	; AVX-LABEL: test_v8f32_undef:			; AVX-LABEL: test_v8f32_undef:
	; AVX: # %bb.0:			; AVX: # %bb.0:
	; AVX-NEXT: vmulss %xmm0, %xmm0, %xmm1			; AVX-NEXT: vmovshdup {{.*#+}} xmm1 = xmm0[1,1,3,3]
	; AVX-NEXT: vmovshdup {{.*#+}} xmm2 = xmm0[1,1,3,3]			; AVX-NEXT: vmulss {{.*}}(%rip), %xmm1, %xmm1
	; AVX-NEXT: vmulss %xmm2, %xmm1, %xmm1
	; AVX-NEXT: vpermilpd {{.*#+}} xmm2 = xmm0[1,0]			; AVX-NEXT: vpermilpd {{.*#+}} xmm2 = xmm0[1,0]
	; AVX-NEXT: vmulss %xmm2, %xmm1, %xmm1			; AVX-NEXT: vmulss %xmm2, %xmm1, %xmm1
	; AVX-NEXT: vpermilps {{.*#+}} xmm2 = xmm0[3,1,2,3]			; AVX-NEXT: vpermilps {{.*#+}} xmm2 = xmm0[3,1,2,3]
	; AVX-NEXT: vmulss %xmm2, %xmm1, %xmm1			; AVX-NEXT: vmulss %xmm2, %xmm1, %xmm1
	; AVX-NEXT: vextractf128 $1, %ymm0, %xmm0			; AVX-NEXT: vextractf128 $1, %ymm0, %xmm0
	; AVX-NEXT: vmulss %xmm0, %xmm1, %xmm1			; AVX-NEXT: vmulss %xmm0, %xmm1, %xmm1
	; AVX-NEXT: vmovshdup {{.*#+}} xmm2 = xmm0[1,1,3,3]			; AVX-NEXT: vmovshdup {{.*#+}} xmm2 = xmm0[1,1,3,3]
	; AVX-NEXT: vmulss %xmm2, %xmm1, %xmm1			; AVX-NEXT: vmulss %xmm2, %xmm1, %xmm1
	; AVX-NEXT: vpermilpd {{.*#+}} xmm2 = xmm0[1,0]			; AVX-NEXT: vpermilpd {{.*#+}} xmm2 = xmm0[1,0]
	; AVX-NEXT: vmulss %xmm2, %xmm1, %xmm1			; AVX-NEXT: vmulss %xmm2, %xmm1, %xmm1
	; AVX-NEXT: vpermilps {{.*#+}} xmm0 = xmm0[3,1,2,3]			; AVX-NEXT: vpermilps {{.*#+}} xmm0 = xmm0[3,1,2,3]
	; AVX-NEXT: vmulss %xmm0, %xmm1, %xmm0			; AVX-NEXT: vmulss %xmm0, %xmm1, %xmm0
	; AVX-NEXT: vzeroupper			; AVX-NEXT: vzeroupper
	; AVX-NEXT: retq			; AVX-NEXT: retq
	;			;
	; AVX512-LABEL: test_v8f32_undef:			; AVX512-LABEL: test_v8f32_undef:
	; AVX512: # %bb.0:			; AVX512: # %bb.0:
	; AVX512-NEXT: vmulss %xmm0, %xmm0, %xmm1			; AVX512-NEXT: vmovshdup {{.*#+}} xmm1 = xmm0[1,1,3,3]
	; AVX512-NEXT: vmovshdup {{.*#+}} xmm2 = xmm0[1,1,3,3]			; AVX512-NEXT: vmulss {{.*}}(%rip), %xmm1, %xmm1
	; AVX512-NEXT: vmulss %xmm2, %xmm1, %xmm1
	; AVX512-NEXT: vpermilpd {{.*#+}} xmm2 = xmm0[1,0]			; AVX512-NEXT: vpermilpd {{.*#+}} xmm2 = xmm0[1,0]
	; AVX512-NEXT: vmulss %xmm2, %xmm1, %xmm1			; AVX512-NEXT: vmulss %xmm2, %xmm1, %xmm1
	; AVX512-NEXT: vpermilps {{.*#+}} xmm2 = xmm0[3,1,2,3]			; AVX512-NEXT: vpermilps {{.*#+}} xmm2 = xmm0[3,1,2,3]
	; AVX512-NEXT: vmulss %xmm2, %xmm1, %xmm1			; AVX512-NEXT: vmulss %xmm2, %xmm1, %xmm1
	; AVX512-NEXT: vextractf128 $1, %ymm0, %xmm0			; AVX512-NEXT: vextractf128 $1, %ymm0, %xmm0
	; AVX512-NEXT: vmulss %xmm0, %xmm1, %xmm1			; AVX512-NEXT: vmulss %xmm0, %xmm1, %xmm1
	; AVX512-NEXT: vmovshdup {{.*#+}} xmm2 = xmm0[1,1,3,3]			; AVX512-NEXT: vmovshdup {{.*#+}} xmm2 = xmm0[1,1,3,3]
	; AVX512-NEXT: vmulss %xmm2, %xmm1, %xmm1			; AVX512-NEXT: vmulss %xmm2, %xmm1, %xmm1
	; AVX512-NEXT: vpermilpd {{.*#+}} xmm2 = xmm0[1,0]			; AVX512-NEXT: vpermilpd {{.*#+}} xmm2 = xmm0[1,0]
	; AVX512-NEXT: vmulss %xmm2, %xmm1, %xmm1			; AVX512-NEXT: vmulss %xmm2, %xmm1, %xmm1
	; AVX512-NEXT: vpermilps {{.*#+}} xmm0 = xmm0[3,1,2,3]			; AVX512-NEXT: vpermilps {{.*#+}} xmm0 = xmm0[3,1,2,3]
	; AVX512-NEXT: vmulss %xmm0, %xmm1, %xmm0			; AVX512-NEXT: vmulss %xmm0, %xmm1, %xmm0
	; AVX512-NEXT: vzeroupper			; AVX512-NEXT: vzeroupper
	; AVX512-NEXT: retq			; AVX512-NEXT: retq
	%1 = call float @llvm.experimental.vector.reduce.fmul.f32.f32.v8f32(float undef, <8 x float> %a0)			%1 = call float @llvm.experimental.vector.reduce.fmul.f32.f32.v8f32(float undef, <8 x float> %a0)
	ret float %1			ret float %1
	}			}

	define float @test_v16f32_undef(<16 x float> %a0) {			define float @test_v16f32_undef(<16 x float> %a0) {
	; SSE2-LABEL: test_v16f32_undef:			; SSE2-LABEL: test_v16f32_undef:
	; SSE2: # %bb.0:			; SSE2: # %bb.0:
	; SSE2-NEXT: movaps %xmm0, %xmm4			; SSE2-NEXT: movaps %xmm0, %xmm4
	; SSE2-NEXT: mulss %xmm0, %xmm4			; SSE2-NEXT: shufps {{.*#+}} xmm4 = xmm4[1,1],xmm0[2,3]
				; SSE2-NEXT: mulss {{.*}}(%rip), %xmm4
	; SSE2-NEXT: movaps %xmm0, %xmm5			; SSE2-NEXT: movaps %xmm0, %xmm5
	; SSE2-NEXT: shufps {{.*#+}} xmm5 = xmm5[1,1],xmm0[2,3]			; SSE2-NEXT: movhlps {{.*#+}} xmm5 = xmm0[1],xmm5[1]
	; SSE2-NEXT: mulss %xmm4, %xmm5			; SSE2-NEXT: mulss %xmm4, %xmm5
	; SSE2-NEXT: movaps %xmm0, %xmm4
	; SSE2-NEXT: movhlps {{.*#+}} xmm4 = xmm0[1],xmm4[1]
	; SSE2-NEXT: mulss %xmm5, %xmm4
	; SSE2-NEXT: shufps {{.*#+}} xmm0 = xmm0[3,1,2,3]			; SSE2-NEXT: shufps {{.*#+}} xmm0 = xmm0[3,1,2,3]
	; SSE2-NEXT: mulss %xmm4, %xmm0			; SSE2-NEXT: mulss %xmm5, %xmm0
	; SSE2-NEXT: mulss %xmm1, %xmm0			; SSE2-NEXT: mulss %xmm1, %xmm0
	; SSE2-NEXT: movaps %xmm1, %xmm4			; SSE2-NEXT: movaps %xmm1, %xmm4
	; SSE2-NEXT: shufps {{.*#+}} xmm4 = xmm4[1,1],xmm1[2,3]			; SSE2-NEXT: shufps {{.*#+}} xmm4 = xmm4[1,1],xmm1[2,3]
	; SSE2-NEXT: mulss %xmm4, %xmm0			; SSE2-NEXT: mulss %xmm4, %xmm0
	; SSE2-NEXT: movaps %xmm1, %xmm4			; SSE2-NEXT: movaps %xmm1, %xmm4
	; SSE2-NEXT: movhlps {{.*#+}} xmm4 = xmm1[1],xmm4[1]			; SSE2-NEXT: movhlps {{.*#+}} xmm4 = xmm1[1],xmm4[1]
	; SSE2-NEXT: mulss %xmm4, %xmm0			; SSE2-NEXT: mulss %xmm4, %xmm0
	; SSE2-NEXT: shufps {{.*#+}} xmm1 = xmm1[3,1,2,3]			; SSE2-NEXT: shufps {{.*#+}} xmm1 = xmm1[3,1,2,3]
	[... 15 lines not shown ...]
	; SSE2-NEXT: movhlps {{.*#+}} xmm1 = xmm3[1],xmm1[1]			; SSE2-NEXT: movhlps {{.*#+}} xmm1 = xmm3[1],xmm1[1]
	; SSE2-NEXT: mulss %xmm1, %xmm0			; SSE2-NEXT: mulss %xmm1, %xmm0
	; SSE2-NEXT: shufps {{.*#+}} xmm3 = xmm3[3,1,2,3]			; SSE2-NEXT: shufps {{.*#+}} xmm3 = xmm3[3,1,2,3]
	; SSE2-NEXT: mulss %xmm3, %xmm0			; SSE2-NEXT: mulss %xmm3, %xmm0
	; SSE2-NEXT: retq			; SSE2-NEXT: retq
	;			;
	; SSE41-LABEL: test_v16f32_undef:			; SSE41-LABEL: test_v16f32_undef:
	; SSE41: # %bb.0:			; SSE41: # %bb.0:
	; SSE41-NEXT: movaps %xmm0, %xmm4			; SSE41-NEXT: movshdup {{.*#+}} xmm4 = xmm0[1,1,3,3]
	; SSE41-NEXT: mulss %xmm0, %xmm4			; SSE41-NEXT: mulss {{.*}}(%rip), %xmm4
	; SSE41-NEXT: movshdup {{.*#+}} xmm5 = xmm0[1,1,3,3]
	; SSE41-NEXT: mulss %xmm5, %xmm4
	; SSE41-NEXT: movaps %xmm0, %xmm5			; SSE41-NEXT: movaps %xmm0, %xmm5
	; SSE41-NEXT: movhlps {{.*#+}} xmm5 = xmm0[1],xmm5[1]			; SSE41-NEXT: movhlps {{.*#+}} xmm5 = xmm0[1],xmm5[1]
	; SSE41-NEXT: mulss %xmm4, %xmm5			; SSE41-NEXT: mulss %xmm4, %xmm5
	; SSE41-NEXT: shufps {{.*#+}} xmm0 = xmm0[3,1,2,3]			; SSE41-NEXT: shufps {{.*#+}} xmm0 = xmm0[3,1,2,3]
	; SSE41-NEXT: mulss %xmm5, %xmm0			; SSE41-NEXT: mulss %xmm5, %xmm0
	; SSE41-NEXT: mulss %xmm1, %xmm0			; SSE41-NEXT: mulss %xmm1, %xmm0
	; SSE41-NEXT: movshdup {{.*#+}} xmm4 = xmm1[1,1,3,3]			; SSE41-NEXT: movshdup {{.*#+}} xmm4 = xmm1[1,1,3,3]
	; SSE41-NEXT: mulss %xmm4, %xmm0			; SSE41-NEXT: mulss %xmm4, %xmm0
	[... 17 lines not shown ...]
	; SSE41-NEXT: movhlps {{.*#+}} xmm1 = xmm3[1],xmm1[1]			; SSE41-NEXT: movhlps {{.*#+}} xmm1 = xmm3[1],xmm1[1]
	; SSE41-NEXT: mulss %xmm1, %xmm0			; SSE41-NEXT: mulss %xmm1, %xmm0
	; SSE41-NEXT: shufps {{.*#+}} xmm3 = xmm3[3,1,2,3]			; SSE41-NEXT: shufps {{.*#+}} xmm3 = xmm3[3,1,2,3]
	; SSE41-NEXT: mulss %xmm3, %xmm0			; SSE41-NEXT: mulss %xmm3, %xmm0
	; SSE41-NEXT: retq			; SSE41-NEXT: retq
	;			;
	; AVX-LABEL: test_v16f32_undef:			; AVX-LABEL: test_v16f32_undef:
	; AVX: # %bb.0:			; AVX: # %bb.0:
	; AVX-NEXT: vmulss %xmm0, %xmm0, %xmm2			; AVX-NEXT: vmovshdup {{.*#+}} xmm2 = xmm0[1,1,3,3]
	; AVX-NEXT: vmovshdup {{.*#+}} xmm3 = xmm0[1,1,3,3]			; AVX-NEXT: vmulss {{.*}}(%rip), %xmm2, %xmm2
	; AVX-NEXT: vmulss %xmm3, %xmm2, %xmm2
	; AVX-NEXT: vpermilpd {{.*#+}} xmm3 = xmm0[1,0]			; AVX-NEXT: vpermilpd {{.*#+}} xmm3 = xmm0[1,0]
	; AVX-NEXT: vmulss %xmm3, %xmm2, %xmm2			; AVX-NEXT: vmulss %xmm3, %xmm2, %xmm2
	; AVX-NEXT: vpermilps {{.*#+}} xmm3 = xmm0[3,1,2,3]			; AVX-NEXT: vpermilps {{.*#+}} xmm3 = xmm0[3,1,2,3]
	; AVX-NEXT: vmulss %xmm3, %xmm2, %xmm2			; AVX-NEXT: vmulss %xmm3, %xmm2, %xmm2
	; AVX-NEXT: vextractf128 $1, %ymm0, %xmm0			; AVX-NEXT: vextractf128 $1, %ymm0, %xmm0
	; AVX-NEXT: vmulss %xmm0, %xmm2, %xmm2			; AVX-NEXT: vmulss %xmm0, %xmm2, %xmm2
	; AVX-NEXT: vmovshdup {{.*#+}} xmm3 = xmm0[1,1,3,3]			; AVX-NEXT: vmovshdup {{.*#+}} xmm3 = xmm0[1,1,3,3]
	; AVX-NEXT: vmulss %xmm3, %xmm2, %xmm2			; AVX-NEXT: vmulss %xmm3, %xmm2, %xmm2
	[... 16 lines not shown ...]
	; AVX-NEXT: vmulss %xmm2, %xmm0, %xmm0			; AVX-NEXT: vmulss %xmm2, %xmm0, %xmm0
	; AVX-NEXT: vpermilps {{.*#+}} xmm1 = xmm1[3,1,2,3]			; AVX-NEXT: vpermilps {{.*#+}} xmm1 = xmm1[3,1,2,3]
	; AVX-NEXT: vmulss %xmm1, %xmm0, %xmm0			; AVX-NEXT: vmulss %xmm1, %xmm0, %xmm0
	; AVX-NEXT: vzeroupper			; AVX-NEXT: vzeroupper
	; AVX-NEXT: retq			; AVX-NEXT: retq
	;			;
	; AVX512-LABEL: test_v16f32_undef:			; AVX512-LABEL: test_v16f32_undef:
	; AVX512: # %bb.0:			; AVX512: # %bb.0:
	; AVX512-NEXT: vmulss %xmm0, %xmm0, %xmm1			; AVX512-NEXT: vmovshdup {{.*#+}} xmm1 = xmm0[1,1,3,3]
	; AVX512-NEXT: vmovshdup {{.*#+}} xmm2 = xmm0[1,1,3,3]			; AVX512-NEXT: vmulss {{.*}}(%rip), %xmm1, %xmm1
	; AVX512-NEXT: vmulss %xmm2, %xmm1, %xmm1
	; AVX512-NEXT: vpermilpd {{.*#+}} xmm2 = xmm0[1,0]			; AVX512-NEXT: vpermilpd {{.*#+}} xmm2 = xmm0[1,0]
	; AVX512-NEXT: vmulss %xmm2, %xmm1, %xmm1			; AVX512-NEXT: vmulss %xmm2, %xmm1, %xmm1
	; AVX512-NEXT: vpermilps {{.*#+}} xmm2 = xmm0[3,1,2,3]			; AVX512-NEXT: vpermilps {{.*#+}} xmm2 = xmm0[3,1,2,3]
	; AVX512-NEXT: vmulss %xmm2, %xmm1, %xmm1			; AVX512-NEXT: vmulss %xmm2, %xmm1, %xmm1
	; AVX512-NEXT: vextractf128 $1, %ymm0, %xmm2			; AVX512-NEXT: vextractf128 $1, %ymm0, %xmm2
	; AVX512-NEXT: vmulss %xmm2, %xmm1, %xmm1			; AVX512-NEXT: vmulss %xmm2, %xmm1, %xmm1
	; AVX512-NEXT: vmovshdup {{.*#+}} xmm3 = xmm2[1,1,3,3]			; AVX512-NEXT: vmovshdup {{.*#+}} xmm3 = xmm2[1,1,3,3]
	; AVX512-NEXT: vmulss %xmm3, %xmm1, %xmm1			; AVX512-NEXT: vmulss %xmm3, %xmm1, %xmm1
	[... 473 lines not shown ...]

	;			;
	; vXf64 (undef)			; vXf64 (undef)
	;			;

	define double @test_v2f64_undef(<2 x double> %a0) {			define double @test_v2f64_undef(<2 x double> %a0) {
	; SSE-LABEL: test_v2f64_undef:			; SSE-LABEL: test_v2f64_undef:
	; SSE: # %bb.0:			; SSE: # %bb.0:
	; SSE-NEXT: movapd %xmm0, %xmm1
	; SSE-NEXT: mulsd %xmm0, %xmm1
	; SSE-NEXT: movhlps {{.*#+}} xmm0 = xmm0[1,1]			; SSE-NEXT: movhlps {{.*#+}} xmm0 = xmm0[1,1]
	; SSE-NEXT: mulsd %xmm1, %xmm0			; SSE-NEXT: mulsd {{.*}}(%rip), %xmm0
	; SSE-NEXT: retq			; SSE-NEXT: retq
	;			;
	; AVX-LABEL: test_v2f64_undef:			; AVX-LABEL: test_v2f64_undef:
	; AVX: # %bb.0:			; AVX: # %bb.0:
	; AVX-NEXT: vmulsd %xmm0, %xmm0, %xmm1
	; AVX-NEXT: vpermilpd {{.*#+}} xmm0 = xmm0[1,0]			; AVX-NEXT: vpermilpd {{.*#+}} xmm0 = xmm0[1,0]
	; AVX-NEXT: vmulsd %xmm0, %xmm1, %xmm0			; AVX-NEXT: vmulsd {{.*}}(%rip), %xmm0, %xmm0
	; AVX-NEXT: retq			; AVX-NEXT: retq
	;			;
	; AVX512-LABEL: test_v2f64_undef:			; AVX512-LABEL: test_v2f64_undef:
	; AVX512: # %bb.0:			; AVX512: # %bb.0:
	; AVX512-NEXT: vmulsd %xmm0, %xmm0, %xmm1
	; AVX512-NEXT: vpermilpd {{.*#+}} xmm0 = xmm0[1,0]			; AVX512-NEXT: vpermilpd {{.*#+}} xmm0 = xmm0[1,0]
	; AVX512-NEXT: vmulsd %xmm0, %xmm1, %xmm0			; AVX512-NEXT: vmulsd {{.*}}(%rip), %xmm0, %xmm0
	; AVX512-NEXT: retq			; AVX512-NEXT: retq
	%1 = call double @llvm.experimental.vector.reduce.fmul.f64.f64.v2f64(double undef, <2 x double> %a0)			%1 = call double @llvm.experimental.vector.reduce.fmul.f64.f64.v2f64(double undef, <2 x double> %a0)
	ret double %1			ret double %1
	}			}

	define double @test_v4f64_undef(<4 x double> %a0) {			define double @test_v4f64_undef(<4 x double> %a0) {
	; SSE-LABEL: test_v4f64_undef:			; SSE-LABEL: test_v4f64_undef:
	; SSE: # %bb.0:			; SSE: # %bb.0:
	; SSE-NEXT: movapd %xmm0, %xmm2
	; SSE-NEXT: mulsd %xmm0, %xmm2
	; SSE-NEXT: movhlps {{.*#+}} xmm0 = xmm0[1,1]			; SSE-NEXT: movhlps {{.*#+}} xmm0 = xmm0[1,1]
	; SSE-NEXT: mulsd %xmm2, %xmm0			; SSE-NEXT: mulsd {{.*}}(%rip), %xmm0
	; SSE-NEXT: mulsd %xmm1, %xmm0			; SSE-NEXT: mulsd %xmm1, %xmm0
	; SSE-NEXT: movhlps {{.*#+}} xmm1 = xmm1[1,1]			; SSE-NEXT: movhlps {{.*#+}} xmm1 = xmm1[1,1]
	; SSE-NEXT: mulsd %xmm1, %xmm0			; SSE-NEXT: mulsd %xmm1, %xmm0
	; SSE-NEXT: retq			; SSE-NEXT: retq
	;			;
	; AVX-LABEL: test_v4f64_undef:			; AVX-LABEL: test_v4f64_undef:
	; AVX: # %bb.0:			; AVX: # %bb.0:
	; AVX-NEXT: vmulsd %xmm0, %xmm0, %xmm1			; AVX-NEXT: vpermilpd {{.*#+}} xmm1 = xmm0[1,0]
	; AVX-NEXT: vpermilpd {{.*#+}} xmm2 = xmm0[1,0]			; AVX-NEXT: vmulsd {{.*}}(%rip), %xmm1, %xmm1
	; AVX-NEXT: vmulsd %xmm2, %xmm1, %xmm1
	; AVX-NEXT: vextractf128 $1, %ymm0, %xmm0			; AVX-NEXT: vextractf128 $1, %ymm0, %xmm0
	; AVX-NEXT: vmulsd %xmm0, %xmm1, %xmm1			; AVX-NEXT: vmulsd %xmm0, %xmm1, %xmm1
	; AVX-NEXT: vpermilpd {{.*#+}} xmm0 = xmm0[1,0]			; AVX-NEXT: vpermilpd {{.*#+}} xmm0 = xmm0[1,0]
	; AVX-NEXT: vmulsd %xmm0, %xmm1, %xmm0			; AVX-NEXT: vmulsd %xmm0, %xmm1, %xmm0
	; AVX-NEXT: vzeroupper			; AVX-NEXT: vzeroupper
	; AVX-NEXT: retq			; AVX-NEXT: retq
	;			;
	; AVX512-LABEL: test_v4f64_undef:			; AVX512-LABEL: test_v4f64_undef:
	; AVX512: # %bb.0:			; AVX512: # %bb.0:
	; AVX512-NEXT: vmulsd %xmm0, %xmm0, %xmm1			; AVX512-NEXT: vpermilpd {{.*#+}} xmm1 = xmm0[1,0]
	; AVX512-NEXT: vpermilpd {{.*#+}} xmm2 = xmm0[1,0]			; AVX512-NEXT: vmulsd {{.*}}(%rip), %xmm1, %xmm1
	; AVX512-NEXT: vmulsd %xmm2, %xmm1, %xmm1
	; AVX512-NEXT: vextractf128 $1, %ymm0, %xmm0			; AVX512-NEXT: vextractf128 $1, %ymm0, %xmm0
	; AVX512-NEXT: vmulsd %xmm0, %xmm1, %xmm1			; AVX512-NEXT: vmulsd %xmm0, %xmm1, %xmm1
	; AVX512-NEXT: vpermilpd {{.*#+}} xmm0 = xmm0[1,0]			; AVX512-NEXT: vpermilpd {{.*#+}} xmm0 = xmm0[1,0]
	; AVX512-NEXT: vmulsd %xmm0, %xmm1, %xmm0			; AVX512-NEXT: vmulsd %xmm0, %xmm1, %xmm0
	; AVX512-NEXT: vzeroupper			; AVX512-NEXT: vzeroupper
	; AVX512-NEXT: retq			; AVX512-NEXT: retq
	%1 = call double @llvm.experimental.vector.reduce.fmul.f64.f64.v4f64(double undef, <4 x double> %a0)			%1 = call double @llvm.experimental.vector.reduce.fmul.f64.f64.v4f64(double undef, <4 x double> %a0)
	ret double %1			ret double %1
	}			}

	define double @test_v8f64_undef(<8 x double> %a0) {			define double @test_v8f64_undef(<8 x double> %a0) {
	; SSE-LABEL: test_v8f64_undef:			; SSE-LABEL: test_v8f64_undef:
	; SSE: # %bb.0:			; SSE: # %bb.0:
	; SSE-NEXT: movapd %xmm0, %xmm4
	; SSE-NEXT: mulsd %xmm0, %xmm4
	; SSE-NEXT: movhlps {{.*#+}} xmm0 = xmm0[1,1]			; SSE-NEXT: movhlps {{.*#+}} xmm0 = xmm0[1,1]
	; SSE-NEXT: mulsd %xmm4, %xmm0			; SSE-NEXT: mulsd {{.*}}(%rip), %xmm0
	; SSE-NEXT: mulsd %xmm1, %xmm0			; SSE-NEXT: mulsd %xmm1, %xmm0
	; SSE-NEXT: movhlps {{.*#+}} xmm1 = xmm1[1,1]			; SSE-NEXT: movhlps {{.*#+}} xmm1 = xmm1[1,1]
	; SSE-NEXT: mulsd %xmm1, %xmm0			; SSE-NEXT: mulsd %xmm1, %xmm0
	; SSE-NEXT: mulsd %xmm2, %xmm0			; SSE-NEXT: mulsd %xmm2, %xmm0
	; SSE-NEXT: movhlps {{.*#+}} xmm2 = xmm2[1,1]			; SSE-NEXT: movhlps {{.*#+}} xmm2 = xmm2[1,1]
	; SSE-NEXT: mulsd %xmm2, %xmm0			; SSE-NEXT: mulsd %xmm2, %xmm0
	; SSE-NEXT: mulsd %xmm3, %xmm0			; SSE-NEXT: mulsd %xmm3, %xmm0
	; SSE-NEXT: movhlps {{.*#+}} xmm3 = xmm3[1,1]			; SSE-NEXT: movhlps {{.*#+}} xmm3 = xmm3[1,1]
	; SSE-NEXT: mulsd %xmm3, %xmm0			; SSE-NEXT: mulsd %xmm3, %xmm0
	; SSE-NEXT: retq			; SSE-NEXT: retq
	;			;
	; AVX-LABEL: test_v8f64_undef:			; AVX-LABEL: test_v8f64_undef:
	; AVX: # %bb.0:			; AVX: # %bb.0:
	; AVX-NEXT: vmulsd %xmm0, %xmm0, %xmm2			; AVX-NEXT: vpermilpd {{.*#+}} xmm2 = xmm0[1,0]
	; AVX-NEXT: vpermilpd {{.*#+}} xmm3 = xmm0[1,0]			; AVX-NEXT: vmulsd {{.*}}(%rip), %xmm2, %xmm2
	; AVX-NEXT: vmulsd %xmm3, %xmm2, %xmm2
	; AVX-NEXT: vextractf128 $1, %ymm0, %xmm0			; AVX-NEXT: vextractf128 $1, %ymm0, %xmm0
	; AVX-NEXT: vmulsd %xmm0, %xmm2, %xmm2			; AVX-NEXT: vmulsd %xmm0, %xmm2, %xmm2
	; AVX-NEXT: vpermilpd {{.*#+}} xmm0 = xmm0[1,0]			; AVX-NEXT: vpermilpd {{.*#+}} xmm0 = xmm0[1,0]
	; AVX-NEXT: vmulsd %xmm0, %xmm2, %xmm0			; AVX-NEXT: vmulsd %xmm0, %xmm2, %xmm0
	; AVX-NEXT: vmulsd %xmm1, %xmm0, %xmm0			; AVX-NEXT: vmulsd %xmm1, %xmm0, %xmm0
	; AVX-NEXT: vpermilpd {{.*#+}} xmm2 = xmm1[1,0]			; AVX-NEXT: vpermilpd {{.*#+}} xmm2 = xmm1[1,0]
	; AVX-NEXT: vmulsd %xmm2, %xmm0, %xmm0			; AVX-NEXT: vmulsd %xmm2, %xmm0, %xmm0
	; AVX-NEXT: vextractf128 $1, %ymm1, %xmm1			; AVX-NEXT: vextractf128 $1, %ymm1, %xmm1
	; AVX-NEXT: vmulsd %xmm1, %xmm0, %xmm0			; AVX-NEXT: vmulsd %xmm1, %xmm0, %xmm0
	; AVX-NEXT: vpermilpd {{.*#+}} xmm1 = xmm1[1,0]			; AVX-NEXT: vpermilpd {{.*#+}} xmm1 = xmm1[1,0]
	; AVX-NEXT: vmulsd %xmm1, %xmm0, %xmm0			; AVX-NEXT: vmulsd %xmm1, %xmm0, %xmm0
	; AVX-NEXT: vzeroupper			; AVX-NEXT: vzeroupper
	; AVX-NEXT: retq			; AVX-NEXT: retq
	;			;
	; AVX512-LABEL: test_v8f64_undef:			; AVX512-LABEL: test_v8f64_undef:
	; AVX512: # %bb.0:			; AVX512: # %bb.0:
	; AVX512-NEXT: vmulsd %xmm0, %xmm0, %xmm1			; AVX512-NEXT: vpermilpd {{.*#+}} xmm1 = xmm0[1,0]
	; AVX512-NEXT: vpermilpd {{.*#+}} xmm2 = xmm0[1,0]			; AVX512-NEXT: vmulsd {{.*}}(%rip), %xmm1, %xmm1
	; AVX512-NEXT: vmulsd %xmm2, %xmm1, %xmm1
	; AVX512-NEXT: vextractf128 $1, %ymm0, %xmm2			; AVX512-NEXT: vextractf128 $1, %ymm0, %xmm2
	; AVX512-NEXT: vmulsd %xmm2, %xmm1, %xmm1			; AVX512-NEXT: vmulsd %xmm2, %xmm1, %xmm1
	; AVX512-NEXT: vpermilpd {{.*#+}} xmm2 = xmm2[1,0]			; AVX512-NEXT: vpermilpd {{.*#+}} xmm2 = xmm2[1,0]
	; AVX512-NEXT: vmulsd %xmm2, %xmm1, %xmm1			; AVX512-NEXT: vmulsd %xmm2, %xmm1, %xmm1
	; AVX512-NEXT: vextractf32x4 $2, %zmm0, %xmm2			; AVX512-NEXT: vextractf32x4 $2, %zmm0, %xmm2
	; AVX512-NEXT: vmulsd %xmm2, %xmm1, %xmm1			; AVX512-NEXT: vmulsd %xmm2, %xmm1, %xmm1
	; AVX512-NEXT: vpermilpd {{.*#+}} xmm2 = xmm2[1,0]			; AVX512-NEXT: vpermilpd {{.*#+}} xmm2 = xmm2[1,0]
	; AVX512-NEXT: vmulsd %xmm2, %xmm1, %xmm1			; AVX512-NEXT: vmulsd %xmm2, %xmm1, %xmm1
	; AVX512-NEXT: vextractf32x4 $3, %zmm0, %xmm0			; AVX512-NEXT: vextractf32x4 $3, %zmm0, %xmm0
	; AVX512-NEXT: vmulsd %xmm0, %xmm1, %xmm1			; AVX512-NEXT: vmulsd %xmm0, %xmm1, %xmm1
	; AVX512-NEXT: vpermilpd {{.*#+}} xmm0 = xmm0[1,0]			; AVX512-NEXT: vpermilpd {{.*#+}} xmm0 = xmm0[1,0]
	; AVX512-NEXT: vmulsd %xmm0, %xmm1, %xmm0			; AVX512-NEXT: vmulsd %xmm0, %xmm1, %xmm0
	; AVX512-NEXT: vzeroupper			; AVX512-NEXT: vzeroupper
	; AVX512-NEXT: retq			; AVX512-NEXT: retq
	%1 = call double @llvm.experimental.vector.reduce.fmul.f64.f64.v8f64(double undef, <8 x double> %a0)			%1 = call double @llvm.experimental.vector.reduce.fmul.f64.f64.v8f64(double undef, <8 x double> %a0)
	ret double %1			ret double %1
	}			}

	define double @test_v16f64_undef(<16 x double> %a0) {			define double @test_v16f64_undef(<16 x double> %a0) {
	; SSE-LABEL: test_v16f64_undef:			; SSE-LABEL: test_v16f64_undef:
	; SSE: # %bb.0:			; SSE: # %bb.0:
	; SSE-NEXT: movapd %xmm0, %xmm8
	; SSE-NEXT: mulsd %xmm0, %xmm8
	; SSE-NEXT: movhlps {{.*#+}} xmm0 = xmm0[1,1]			; SSE-NEXT: movhlps {{.*#+}} xmm0 = xmm0[1,1]
	; SSE-NEXT: mulsd %xmm8, %xmm0			; SSE-NEXT: mulsd {{.*}}(%rip), %xmm0
	; SSE-NEXT: mulsd %xmm1, %xmm0			; SSE-NEXT: mulsd %xmm1, %xmm0
	; SSE-NEXT: movhlps {{.*#+}} xmm1 = xmm1[1,1]			; SSE-NEXT: movhlps {{.*#+}} xmm1 = xmm1[1,1]
	; SSE-NEXT: mulsd %xmm1, %xmm0			; SSE-NEXT: mulsd %xmm1, %xmm0
	; SSE-NEXT: mulsd %xmm2, %xmm0			; SSE-NEXT: mulsd %xmm2, %xmm0
	; SSE-NEXT: movhlps {{.*#+}} xmm2 = xmm2[1,1]			; SSE-NEXT: movhlps {{.*#+}} xmm2 = xmm2[1,1]
	; SSE-NEXT: mulsd %xmm2, %xmm0			; SSE-NEXT: mulsd %xmm2, %xmm0
	; SSE-NEXT: mulsd %xmm3, %xmm0			; SSE-NEXT: mulsd %xmm3, %xmm0
	; SSE-NEXT: movhlps {{.*#+}} xmm3 = xmm3[1,1]			; SSE-NEXT: movhlps {{.*#+}} xmm3 = xmm3[1,1]
	[9 unchanged lines not shown]
	; SSE-NEXT: mulsd %xmm6, %xmm0			; SSE-NEXT: mulsd %xmm6, %xmm0
	; SSE-NEXT: mulsd %xmm7, %xmm0			; SSE-NEXT: mulsd %xmm7, %xmm0
	; SSE-NEXT: movhlps {{.*#+}} xmm7 = xmm7[1,1]			; SSE-NEXT: movhlps {{.*#+}} xmm7 = xmm7[1,1]
	; SSE-NEXT: mulsd %xmm7, %xmm0			; SSE-NEXT: mulsd %xmm7, %xmm0
	; SSE-NEXT: retq			; SSE-NEXT: retq
	;			;
	; AVX-LABEL: test_v16f64_undef:			; AVX-LABEL: test_v16f64_undef:
	; AVX: # %bb.0:			; AVX: # %bb.0:
	; AVX-NEXT: vmulsd %xmm0, %xmm0, %xmm4			; AVX-NEXT: vpermilpd {{.*#+}} xmm4 = xmm0[1,0]
	; AVX-NEXT: vpermilpd {{.*#+}} xmm5 = xmm0[1,0]			; AVX-NEXT: vmulsd {{.*}}(%rip), %xmm4, %xmm4
	; AVX-NEXT: vmulsd %xmm5, %xmm4, %xmm4
	; AVX-NEXT: vextractf128 $1, %ymm0, %xmm0			; AVX-NEXT: vextractf128 $1, %ymm0, %xmm0
	; AVX-NEXT: vmulsd %xmm0, %xmm4, %xmm4			; AVX-NEXT: vmulsd %xmm0, %xmm4, %xmm4
	; AVX-NEXT: vpermilpd {{.*#+}} xmm0 = xmm0[1,0]			; AVX-NEXT: vpermilpd {{.*#+}} xmm0 = xmm0[1,0]
	; AVX-NEXT: vmulsd %xmm0, %xmm4, %xmm0			; AVX-NEXT: vmulsd %xmm0, %xmm4, %xmm0
	; AVX-NEXT: vmulsd %xmm1, %xmm0, %xmm0			; AVX-NEXT: vmulsd %xmm1, %xmm0, %xmm0
	; AVX-NEXT: vpermilpd {{.*#+}} xmm4 = xmm1[1,0]			; AVX-NEXT: vpermilpd {{.*#+}} xmm4 = xmm1[1,0]
	; AVX-NEXT: vmulsd %xmm4, %xmm0, %xmm0			; AVX-NEXT: vmulsd %xmm4, %xmm0, %xmm0
	; AVX-NEXT: vextractf128 $1, %ymm1, %xmm1			; AVX-NEXT: vextractf128 $1, %ymm1, %xmm1
	[14 unchanged lines not shown]
	; AVX-NEXT: vmulsd %xmm1, %xmm0, %xmm0			; AVX-NEXT: vmulsd %xmm1, %xmm0, %xmm0
	; AVX-NEXT: vpermilpd {{.*#+}} xmm1 = xmm1[1,0]			; AVX-NEXT: vpermilpd {{.*#+}} xmm1 = xmm1[1,0]
	; AVX-NEXT: vmulsd %xmm1, %xmm0, %xmm0			; AVX-NEXT: vmulsd %xmm1, %xmm0, %xmm0
	; AVX-NEXT: vzeroupper			; AVX-NEXT: vzeroupper
	; AVX-NEXT: retq			; AVX-NEXT: retq
	;			;
	; AVX512-LABEL: test_v16f64_undef:			; AVX512-LABEL: test_v16f64_undef:
	; AVX512: # %bb.0:			; AVX512: # %bb.0:
	; AVX512-NEXT: vmulsd %xmm0, %xmm0, %xmm2			; AVX512-NEXT: vpermilpd {{.*#+}} xmm2 = xmm0[1,0]
	; AVX512-NEXT: vpermilpd {{.*#+}} xmm3 = xmm0[1,0]			; AVX512-NEXT: vmulsd {{.*}}(%rip), %xmm2, %xmm2
	; AVX512-NEXT: vmulsd %xmm3, %xmm2, %xmm2
	; AVX512-NEXT: vextractf128 $1, %ymm0, %xmm3			; AVX512-NEXT: vextractf128 $1, %ymm0, %xmm3
	; AVX512-NEXT: vmulsd %xmm3, %xmm2, %xmm2			; AVX512-NEXT: vmulsd %xmm3, %xmm2, %xmm2
	; AVX512-NEXT: vpermilpd {{.*#+}} xmm3 = xmm3[1,0]			; AVX512-NEXT: vpermilpd {{.*#+}} xmm3 = xmm3[1,0]
	; AVX512-NEXT: vmulsd %xmm3, %xmm2, %xmm2			; AVX512-NEXT: vmulsd %xmm3, %xmm2, %xmm2
	; AVX512-NEXT: vextractf32x4 $2, %zmm0, %xmm3			; AVX512-NEXT: vextractf32x4 $2, %zmm0, %xmm3
	; AVX512-NEXT: vmulsd %xmm3, %xmm2, %xmm2			; AVX512-NEXT: vmulsd %xmm3, %xmm2, %xmm2
	; AVX512-NEXT: vpermilpd {{.*#+}} xmm3 = xmm3[1,0]			; AVX512-NEXT: vpermilpd {{.*#+}} xmm3 = xmm3[1,0]
	; AVX512-NEXT: vmulsd %xmm3, %xmm2, %xmm2			; AVX512-NEXT: vmulsd %xmm3, %xmm2, %xmm2
	[34 unchanged lines not shown]