This is an archive of the discontinued LLVM Phabricator instance.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
10451	It would seem that all we need to do is change this condition and the one below so that `PPCISD::VECINSERT` is not emitted for 64-bit element widths (`v2i64`, `v2f64`). Why do we need to disable this lowering on 32-bit targets altogether?

ZarkoCA added inline comments. Apr 28 2021, 6:52 AM

llvm/lib/Target/PowerPC/PPCISelLowering.cpp

10451

It looks like all of the pattern matches for VINS* in PPCInstrPrefix.td hardcode i64, e.g.:

  def : Pat<(v16i8 (PPCvecinsertelt v16i8:$vDi, i32:$rA, i64:$rB)),
            (VINSBLX $vDi, InsertEltShift.Sub32Left0, $rA)>;
...
 foreach i = [0, 1] in
    def : Pat<(v2i64 (PPCvecinsertelt v2i64:$vDi, i64:$rA, (i64 i))),
              (VINSD $vDi, !mul(i, 8), $rA)>;
}

So we can't safely emit the VECINSERT in 32-bit mode because of this.

nemanjai added inline comments. Apr 28 2021, 8:32 AM

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
10451	Sure, so those won't match. You might be able to change `i64` to `iPTR` (I'm not sure about that) or provide patterns with `i32` instead of `i64`.
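
For readers following this thread, the gating that the diff further down shows in `LowerINSERT_VECTOR_ELT` can be modelled outside of LLVM. The following is a minimal standalone sketch, not the actual LLVM code: the `VT`, `Action`, and `InsertQuery` types and the `classifyInsert` function are hypothetical names invented for illustration, the `C` variable from the diff is represented by `ConstIndex`, and everything after the part of the function visible in the diff is collapsed into `FallThrough`.

```cpp
// Hypothetical stand-ins for the value types named in the diff.
enum class VT { v16i8, v8i16, v4i32, v2i64, v4f32, v2f64 };

// Hypothetical outcome labels for the visible part of the function.
enum class Action {
  KeepNode,      // `return Op;` in the diff
  ReturnEmpty,   // `return SDValue();` in the diff
  EmitVecInsert, // `DAG.getNode(PPCISD::VECINSERT, ...)` in the diff
  FallThrough    // code beyond what the diff shows (not modelled here)
};

struct InsertQuery {
  VT Type;            // vector type being inserted into
  bool ConstIndex;    // true when the insert index is a constant (`C`)
  bool ElementIsLoad; // true when the inserted element comes from a load
};

inline Action classifyInsert(const InsertQuery &Q, bool IsISA31,
                             bool IsPPC64) {
  // Constant-index v2f64 inserts are returned unchanged.
  if (Q.Type == VT::v2f64 && Q.ConstIndex)
    return Action::KeepNode;

  if (IsISA31) {
    // 64-bit element widths are not given a VECINSERT node on 32-bit targets.
    const bool Is64BitElt = Q.Type == VT::v2i64 || Q.Type == VT::v2f64;
    if (Is64BitElt && !IsPPC64)
      return Action::ReturnEmpty;

    // Integer vectors: constant and variable indices are both handled.
    if (Q.Type == VT::v16i8 || Q.Type == VT::v8i16 || Q.Type == VT::v4i32 ||
        Q.Type == VT::v2i64)
      return Action::EmitVecInsert;

    // Floating-point vectors: variable indices, plus v4f32 elements that are
    // loaded from memory.
    if (Q.Type == VT::v4f32 || Q.Type == VT::v2f64) {
      if (!Q.ConstIndex || (Q.Type == VT::v4f32 && Q.ElementIsLoad))
        return Action::EmitVecInsert;
    }
  }
  return Action::FallThrough;
}
```

Under this model, `classifyInsert({VT::v2i64, false, false}, true, false)` yields `Action::ReturnEmpty`, mirroring the early return for 64-bit element types on 32-bit targets, while the same query with `IsPPC64` set to `true` yields `Action::EmitVecInsert`.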

lebedev.ri retitled this revision from Disable vinsw, vinsd, and vins[wd][lr]x P10 instructions in P10 to [PowerPC] Disable vinsw, vinsd, and vins[wd][lr]x P10 instructions in P10. Apr 28 2021, 8:34 AM

Herald added a subscriber: shchenz. Apr 28 2021, 8:34 AM

Enable safe for 32bit vins p10 instructions

ZarkoCA retitled this revision from [PowerPC] Disable vinsw, vinsd, and vins[wd][lr]x P10 instructions in P10 to [PowerPC] Enable safe for 32bit vins* P10 instructions. May 6 2021, 10:27 AM

ZarkoCA edited the summary of this revision.

ZarkoCA marked 2 inline comments as done. May 6 2021, 10:30 AM

ZarkoCA added inline comments.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
10451	Thanks for the suggestion and help. It's much better to emit these when we can in 32bit mode.
llvm/lib/Target/PowerPC/PPCInstrPrefix.td
2758	I preferred to split the 32/64bit implementations mainly to keep 64bit as is. I noticed that there were no other predicate definitions in this file and they can be moved if that's preferred.

Harbormaster completed remote builds in B103032: Diff 343455. May 6 2021, 11:25 AM

nemanjai added inline comments. May 7 2021, 6:32 AM

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
1250	I don't really understand how we are custom lowering this on 32-bit targets now since you've added this. Where are the PPC-specific insert nodes coming from?
llvm/lib/Target/PowerPC/PPCInstrPrefix.td
2758	This is fine.
2771	Nit: line too long (here and elsewhere).

ZarkoCA marked an inline comment as done. May 7 2021, 8:03 AM

ZarkoCA added inline comments.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
1250	We don't need this now, I don't think. Forgot to remove it in the previous diff.
llvm/lib/Target/PowerPC/PPCInstrPrefix.td
2758	Thanks.
2771	I noticed that but also there are several lines in `IsISA3_1, HasVSX, IsLittleEndian` and `IsISA3_1, HasVSX, IsBigEndian, IsPPC64` that were too long as well. Thought it may be ok but I fixed it now for this case.

Removing unnecessary isPPC64 check
fix formatting

Harbormaster completed remote builds in B103202: Diff 343684. May 7 2021, 8:58 AM

LGTM other than the nit that can be addressed on the commit.

llvm/lib/Target/PowerPC/PPCInstrPrefix.td
2769–2770	The operand should be lined up with the first operand of the node it belongs to: `def : Pat<(v4f32 (PPCvecinsertelt v4f32:$vDi, (f32 (load iaddr:$rA)), i32:$rB)),` The same applies to the other similar lines.
2771	We haven't done a great job of keeping lines in target description files to 80 columns, but we should still try to do so in new code.

This revision is now accepted and ready to land. May 10 2021, 4:20 AM

Closed by commit rG0c41f77857fc: [PowerPC] Enable safe for 32bit vins* P10 instructions (authored by ZarkoCA). May 10 2021, 7:13 AM

This revision was automatically updated to reflect the committed changes.

ZarkoCA added a commit: rG0c41f77857fc: [PowerPC] Enable safe for 32bit vins* P10 instructions.

Revision Contents

Path	Size
llvm/lib/Target/PowerPC/PPCISelLowering.cpp	4 lines
llvm/lib/Target/PowerPC/PPCInstrPrefix.td	36 lines
llvm/test/CodeGen/PowerPC/aix-vec_insert_elt.ll	444 lines

Diff 343455

llvm/lib/Target/PowerPC/PPCISelLowering.cpp

Show First 20 Lines • Show All 1,241 Lines • ▼ Show 20 Lines	if (Subtarget.hasP9Altivec()) {
setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::v4i16, Legal);		setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::v4i16, Legal);
setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::v4i32, Legal);		setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::v4i32, Legal);
setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::v2i8, Legal);		setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::v2i8, Legal);
setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::v2i16, Legal);		setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::v2i16, Legal);
setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::v2i32, Legal);		setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::v2i32, Legal);
setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::v2i64, Legal);		setOperationAction(ISD::SIGN_EXTEND_INREG, MVT::v2i64, Legal);
}		}

if (Subtarget.isISA3_1())		if (Subtarget.isISA3_1() && Subtarget.isPPC64())
setOperationAction(ISD::INSERT_VECTOR_ELT, MVT::v2i64, Custom);		setOperationAction(ISD::INSERT_VECTOR_ELT, MVT::v2i64, Custom);
}		}

if (Subtarget.pairedVectorMemops()) {		if (Subtarget.pairedVectorMemops()) {
addRegisterClass(MVT::v256i1, &PPC::VSRpRCRegClass);		addRegisterClass(MVT::v256i1, &PPC::VSRpRCRegClass);
setOperationAction(ISD::LOAD, MVT::v256i1, Custom);		setOperationAction(ISD::LOAD, MVT::v256i1, Custom);
setOperationAction(ISD::STORE, MVT::v256i1, Custom);		setOperationAction(ISD::STORE, MVT::v256i1, Custom);
}		}
▲ Show 20 Lines • Show All 9,180 Lines • ▼ Show 20 Lines	SDValue PPCTargetLowering::LowerINSERT_VECTOR_ELT(SDValue Op,
SDValue V1 = Op.getOperand(0);		SDValue V1 = Op.getOperand(0);
SDValue V2 = Op.getOperand(1);		SDValue V2 = Op.getOperand(1);
SDValue V3 = Op.getOperand(2);		SDValue V3 = Op.getOperand(2);

if (VT == MVT::v2f64 && C)		if (VT == MVT::v2f64 && C)
return Op;		return Op;

if (Subtarget.isISA3_1()) {		if (Subtarget.isISA3_1()) {
		if ((VT == MVT::v2i64 || VT == MVT::v2f64) && !Subtarget.isPPC64())
		return SDValue();
// On P10, we have legal lowering for constant and variable indices for		// On P10, we have legal lowering for constant and variable indices for
// integer vectors.		// integer vectors.
if (VT == MVT::v16i8 || VT == MVT::v8i16 || VT == MVT::v4i32 ||		if (VT == MVT::v16i8 || VT == MVT::v8i16 || VT == MVT::v4i32 ||
VT == MVT::v2i64)		VT == MVT::v2i64)
return DAG.getNode(PPCISD::VECINSERT, dl, VT, V1, V2, V3);		return DAG.getNode(PPCISD::VECINSERT, dl, VT, V1, V2, V3);
// For f32 and f64 vectors, we have legal lowering for variable indices.		// For f32 and f64 vectors, we have legal lowering for variable indices.
// For f32 we also have legal lowering when the element is loaded from		// For f32 we also have legal lowering when the element is loaded from
// memory.		// memory.
if (VT == MVT::v4f32 || VT == MVT::v2f64) {		if (VT == MVT::v4f32 || VT == MVT::v2f64) {
if (!C || (VT == MVT::v4f32 && dyn_cast<LoadSDNode>(V2)))		if (!C || (VT == MVT::v4f32 && dyn_cast<LoadSDNode>(V2)))
return DAG.getNode(PPCISD::VECINSERT, dl, VT, V1, V2, V3);		return DAG.getNode(PPCISD::VECINSERT, dl, VT, V1, V2, V3);
▲ Show 20 Lines • Show All 6,815 Lines • Show Last 20 Lines

llvm/lib/Target/PowerPC/PPCInstrPrefix.td

		//-------------------------- Predicate definitions ---------------------------//
		def IsPPC32 : Predicate<"!Subtarget->isPPC64()">;

// Mask immediates for MMA instructions (2, 4 and 8 bits).		// Mask immediates for MMA instructions (2, 4 and 8 bits).
def Msk2Imm : ImmLeaf<i32, [{ return isUInt<2>(Imm); }]>;		def Msk2Imm : ImmLeaf<i32, [{ return isUInt<2>(Imm); }]>;
def Msk4Imm : ImmLeaf<i32, [{ return isUInt<4>(Imm); }]>;		def Msk4Imm : ImmLeaf<i32, [{ return isUInt<4>(Imm); }]>;
def Msk8Imm : ImmLeaf<i32, [{ return isUInt<8>(Imm); }]>;		def Msk8Imm : ImmLeaf<i32, [{ return isUInt<8>(Imm); }]>;

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// PowerPC ISA 3.1 specific type constraints.		// PowerPC ISA 3.1 specific type constraints.
//		//
▲ Show 20 Lines • Show All 2,738 Lines • ▼ Show 20 Lines	foreach i = [0, 1, 2, 3] in {
def : Pat<(v4f32 (PPCvecinsertelt v4f32:$vDi, (f32 (load xaddr:$rA)), (i64 i))),		def : Pat<(v4f32 (PPCvecinsertelt v4f32:$vDi, (f32 (load xaddr:$rA)), (i64 i))),
(VINSW $vDi, !mul(!sub(3, i), 4), (LWZX memrr:$rA))>;		(VINSW $vDi, !mul(!sub(3, i), 4), (LWZX memrr:$rA))>;
}		}
foreach i = [0, 1] in		foreach i = [0, 1] in
def : Pat<(v2i64 (PPCvecinsertelt v2i64:$vDi, i64:$rA, (i64 i))),		def : Pat<(v2i64 (PPCvecinsertelt v2i64:$vDi, i64:$rA, (i64 i))),
(VINSD $vDi, !mul(!sub(1, i), 8), $rA)>;		(VINSD $vDi, !mul(!sub(1, i), 8), $rA)>;
}		}

let Predicates = [IsISA3_1, HasVSX, IsBigEndian] in {		let Predicates = [IsISA3_1, HasVSX, IsBigEndian, IsPPC32] in {
		// Indexed vector insert element
		def : Pat<(v16i8 (PPCvecinsertelt v16i8:$vDi, i32:$rA, i32:$rB)),
		(VINSBLX $vDi, $rB, $rA)>;
		def : Pat<(v8i16 (PPCvecinsertelt v8i16:$vDi, i32:$rA, i32:$rB)),
		(VINSHLX $vDi, $rB, $rA)>;
		def : Pat<(v4i32 (PPCvecinsertelt v4i32:$vDi, i32:$rA, i32:$rB)),
		(VINSWLX $vDi, $rB, $rA)>;

		def : Pat<(v4f32 (PPCvecinsertelt v4f32:$vDi, f32:$A, i32:$rB)),
		(VINSWLX $vDi, $rB, Bitcast.FltToInt)>;
		def : Pat<(v4f32 (PPCvecinsertelt v4f32:$vDi, (f32 (load iaddr:$rA)), i32:$rB)),
		(VINSWLX $vDi, $rB, (LWZ memri:$rA))>;
		def : Pat<(v4f32 (PPCvecinsertelt v4f32:$vDi, (f32 (load iaddrX34:$rA)), i32:$rB)),
		(VINSWLX $vDi, $rB, (PLWZ memri34:$rA))>;
		def : Pat<(v4f32 (PPCvecinsertelt v4f32:$vDi, (f32 (load xaddr:$rA)), i32:$rB)),
		(VINSWLX $vDi, $rB, (LWZX memrr:$rA))>;

		// Immediate vector insert element
		foreach i = [0, 1, 2, 3] in {
		def : Pat<(v4i32 (PPCvecinsertelt v4i32:$vDi, i32:$rA, (i32 i))),
		(VINSW $vDi, !mul(i, 4), $rA)>;
		def : Pat<(v4f32 (PPCvecinsertelt v4f32:$vDi, (f32 (load iaddr:$rA)), (i32 i))),
		(VINSW $vDi, !mul(i, 4), (LWZ memri:$rA))>;
		def : Pat<(v4f32 (PPCvecinsertelt v4f32:$vDi, (f32 (load iaddrX34:$rA)), (i32 i))),
		(VINSW $vDi, !mul(i, 4), (PLWZ memri34:$rA))>;
		def : Pat<(v4f32 (PPCvecinsertelt v4f32:$vDi, (f32 (load xaddr:$rA)), (i32 i))),
		(VINSW $vDi, !mul(i, 4), (LWZX memrr:$rA))>;
		}
		}

		let Predicates = [IsISA3_1, HasVSX, IsBigEndian, IsPPC64] in {
// Indexed vector insert element		// Indexed vector insert element
def : Pat<(v16i8 (PPCvecinsertelt v16i8:$vDi, i32:$rA, i64:$rB)),		def : Pat<(v16i8 (PPCvecinsertelt v16i8:$vDi, i32:$rA, i64:$rB)),
(VINSBLX $vDi, InsertEltShift.Sub32Left0, $rA)>;		(VINSBLX $vDi, InsertEltShift.Sub32Left0, $rA)>;
def : Pat<(v8i16 (PPCvecinsertelt v8i16:$vDi, i32:$rA, i64:$rB)),		def : Pat<(v8i16 (PPCvecinsertelt v8i16:$vDi, i32:$rA, i64:$rB)),
(VINSHLX $vDi, InsertEltShift.Sub32Left1, $rA)>;		(VINSHLX $vDi, InsertEltShift.Sub32Left1, $rA)>;
def : Pat<(v4i32 (PPCvecinsertelt v4i32:$vDi, i32:$rA, i64:$rB)),		def : Pat<(v4i32 (PPCvecinsertelt v4i32:$vDi, i32:$rA, i64:$rB)),
(VINSWLX $vDi, InsertEltShift.Sub32Left2, $rA)>;		(VINSWLX $vDi, InsertEltShift.Sub32Left2, $rA)>;
def : Pat<(v2i64 (PPCvecinsertelt v2i64:$vDi, i64:$rA, i64:$rB)),		def : Pat<(v2i64 (PPCvecinsertelt v2i64:$vDi, i64:$rA, i64:$rB)),
Show All 35 Lines
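
The immediate operands in the `VINSW`/`VINSD` patterns above, and in the P10 CHECK lines of the test file below, follow a simple byte-offset rule. Here is a small sketch of that arithmetic, written as standalone C++ rather than TableGen. The helper names `insertByteOffset` and `splitDoublewordBE` are hypothetical, and reading the `!mul(!sub(N-1, i), size)` form as the little-endian case is an assumption based on its position before the big-endian block.

```cpp
// Hypothetical helper (not an LLVM API): byte offset used as the immediate of
// vinsw/vinsd when inserting element Idx into a 16-byte vector.
//   big-endian:                    Idx * EltBytes                  (!mul(i, 4))
//   little-endian (assumed here):  (NumElts - 1 - Idx) * EltBytes  (!mul(!sub(3, i), 4))
constexpr unsigned insertByteOffset(unsigned Idx, unsigned NumElts,
                                    unsigned EltBytes, bool BigEndian) {
  return (BigEndian ? Idx : NumElts - 1 - Idx) * EltBytes;
}

// Offsets of the two word inserts that replace one doubleword insert.
struct WordOffsets {
  unsigned First;
  unsigned Second;
};

// In the 32-bit P10 CHECK lines, a constant-index v2i64 insert shows up as two
// vinsw instructions; doubleword Idx covers words 2*Idx and 2*Idx + 1.
constexpr WordOffsets splitDoublewordBE(unsigned Idx) {
  return WordOffsets{insertByteOffset(2 * Idx, 4, 4, true),
                     insertByteOffset(2 * Idx + 1, 4, 4, true)};
}

// Values taken from the CHECK-64-P10 / CHECK-32-P10 lines of the test file.
static_assert(insertByteOffset(1, 4, 4, true) == 4, "testWordImm, index 1");
static_assert(insertByteOffset(3, 4, 4, true) == 12, "testWordImm, index 3");
static_assert(insertByteOffset(1, 2, 8, true) == 8, "testDoublewordImm, 64-bit");
static_assert(splitDoublewordBE(1).First == 8 &&
                  splitDoublewordBE(1).Second == 12,
              "testDoublewordImm, 32-bit");
```

The `static_assert` lines tie the sketch to the expected assembly: for example, `testDoublewordImm` inserts at index 1 and the 32-bit P10 output uses `vinsw` with offsets 8 and 12.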

llvm/test/CodeGen/PowerPC/aix-vec_insert_elt.ll

	; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
	; RUN: llc -verify-machineinstrs -mtriple=powerpc64-ibm-aix-xcoff -vec-extabi -mcpu=pwr9 < %s | FileCheck %s -check-prefix=CHECK-64			; RUN: llc -verify-machineinstrs -mtriple=powerpc64-ibm-aix-xcoff -vec-extabi -mcpu=pwr9 < %s | FileCheck %s -check-prefix=CHECK-64
	; RUN: llc -verify-machineinstrs -mtriple=powerpc-ibm-aix-xcoff -vec-extabi -mcpu=pwr9 < %s | FileCheck %s -check-prefix=CHECK-32			; RUN: llc -verify-machineinstrs -mtriple=powerpc-ibm-aix-xcoff -vec-extabi -mcpu=pwr9 < %s | FileCheck %s -check-prefix=CHECK-32
				; RUN: llc -verify-machineinstrs -mtriple=powerpc64-ibm-aix-xcoff -vec-extabi -mcpu=pwr10 < %s | FileCheck %s -check-prefix=CHECK-64-P10
				; RUN: llc -verify-machineinstrs -mtriple=powerpc-ibm-aix-xcoff -vec-extabi -mcpu=pwr10 < %s | FileCheck %s -check-prefix=CHECK-32-P10

	; Byte indexed			; Byte indexed

	define <16 x i8> @testByte(<16 x i8> %a, i64 %b, i64 %idx) {			define <16 x i8> @testByte(<16 x i8> %a, i64 %b, i64 %idx) {
	; CHECK-64-LABEL: testByte:			; CHECK-64-LABEL: testByte:
	; CHECK-64: # %bb.0: # %entry			; CHECK-64: # %bb.0: # %entry
	; CHECK-64-NEXT: addi 5, 1, -16			; CHECK-64-NEXT: addi 5, 1, -16
	; CHECK-64-NEXT: clrldi 4, 4, 60			; CHECK-64-NEXT: clrldi 4, 4, 60
	; CHECK-64-NEXT: stxv 34, -16(1)			; CHECK-64-NEXT: stxv 34, -16(1)
	; CHECK-64-NEXT: stbx 3, 5, 4			; CHECK-64-NEXT: stbx 3, 5, 4
	; CHECK-64-NEXT: lxv 34, -16(1)			; CHECK-64-NEXT: lxv 34, -16(1)
	; CHECK-64-NEXT: blr			; CHECK-64-NEXT: blr
	;			;
	; CHECK-32-LABEL: testByte:			; CHECK-32-LABEL: testByte:
	; CHECK-32: # %bb.0: # %entry			; CHECK-32: # %bb.0: # %entry
	; CHECK-32-NEXT: addi 5, 1, -16			; CHECK-32-NEXT: addi 5, 1, -16
	; CHECK-32-NEXT: clrlwi 3, 6, 28			; CHECK-32-NEXT: clrlwi 3, 6, 28
	; CHECK-32-NEXT: stxv 34, -16(1)			; CHECK-32-NEXT: stxv 34, -16(1)
	; CHECK-32-NEXT: stbx 4, 5, 3			; CHECK-32-NEXT: stbx 4, 5, 3
	; CHECK-32-NEXT: lxv 34, -16(1)			; CHECK-32-NEXT: lxv 34, -16(1)
	; CHECK-32-NEXT: blr			; CHECK-32-NEXT: blr
				;
				; CHECK-64-P10-LABEL: testByte:
				; CHECK-64-P10: # %bb.0: # %entry
				; CHECK-64-P10-NEXT: vinsblx 2, 4, 3
				; CHECK-64-P10-NEXT: blr
				;
				; CHECK-32-P10-LABEL: testByte:
				; CHECK-32-P10: # %bb.0: # %entry
				; CHECK-32-P10-NEXT: vinsblx 2, 6, 4
				; CHECK-32-P10-NEXT: blr
	entry:			entry:
	%conv = trunc i64 %b to i8			%conv = trunc i64 %b to i8
	%vecins = insertelement <16 x i8> %a, i8 %conv, i64 %idx			%vecins = insertelement <16 x i8> %a, i8 %conv, i64 %idx
	ret <16 x i8> %vecins			ret <16 x i8> %vecins
	}			}

	; Halfword indexed			; Halfword indexed

	Show All 10 Lines
	; CHECK-32-LABEL: testHalf:			; CHECK-32-LABEL: testHalf:
	; CHECK-32: # %bb.0: # %entry			; CHECK-32: # %bb.0: # %entry
	; CHECK-32-NEXT: addi 5, 1, -16			; CHECK-32-NEXT: addi 5, 1, -16
	; CHECK-32-NEXT: rlwinm 3, 6, 1, 28, 30			; CHECK-32-NEXT: rlwinm 3, 6, 1, 28, 30
	; CHECK-32-NEXT: stxv 34, -16(1)			; CHECK-32-NEXT: stxv 34, -16(1)
	; CHECK-32-NEXT: sthx 4, 5, 3			; CHECK-32-NEXT: sthx 4, 5, 3
	; CHECK-32-NEXT: lxv 34, -16(1)			; CHECK-32-NEXT: lxv 34, -16(1)
	; CHECK-32-NEXT: blr			; CHECK-32-NEXT: blr
				;
				; CHECK-64-P10-LABEL: testHalf:
				; CHECK-64-P10: # %bb.0: # %entry
				; CHECK-64-P10-NEXT: slwi 4, 4, 1
				; CHECK-64-P10-NEXT: vinshlx 2, 4, 3
				; CHECK-64-P10-NEXT: blr
				;
				; CHECK-32-P10-LABEL: testHalf:
				; CHECK-32-P10: # %bb.0: # %entry
				; CHECK-32-P10-NEXT: vinshlx 2, 6, 4
				; CHECK-32-P10-NEXT: blr
	entry:			entry:
	%conv = trunc i64 %b to i16			%conv = trunc i64 %b to i16
	%vecins = insertelement <8 x i16> %a, i16 %conv, i64 %idx			%vecins = insertelement <8 x i16> %a, i16 %conv, i64 %idx
	ret <8 x i16> %vecins			ret <8 x i16> %vecins
	}			}

	; Word indexed			; Word indexed

	Show All 10 Lines
	; CHECK-32-LABEL: testWord:			; CHECK-32-LABEL: testWord:
	; CHECK-32: # %bb.0: # %entry			; CHECK-32: # %bb.0: # %entry
	; CHECK-32-NEXT: addi 5, 1, -16			; CHECK-32-NEXT: addi 5, 1, -16
	; CHECK-32-NEXT: rlwinm 3, 6, 2, 28, 29			; CHECK-32-NEXT: rlwinm 3, 6, 2, 28, 29
	; CHECK-32-NEXT: stxv 34, -16(1)			; CHECK-32-NEXT: stxv 34, -16(1)
	; CHECK-32-NEXT: stwx 4, 5, 3			; CHECK-32-NEXT: stwx 4, 5, 3
	; CHECK-32-NEXT: lxv 34, -16(1)			; CHECK-32-NEXT: lxv 34, -16(1)
	; CHECK-32-NEXT: blr			; CHECK-32-NEXT: blr
				;
				; CHECK-64-P10-LABEL: testWord:
				; CHECK-64-P10: # %bb.0: # %entry
				; CHECK-64-P10-NEXT: slwi 4, 4, 2
				; CHECK-64-P10-NEXT: vinswlx 2, 4, 3
				; CHECK-64-P10-NEXT: blr
				;
				; CHECK-32-P10-LABEL: testWord:
				; CHECK-32-P10: # %bb.0: # %entry
				; CHECK-32-P10-NEXT: vinswlx 2, 6, 4
				; CHECK-32-P10-NEXT: blr
	entry:			entry:
	%conv = trunc i64 %b to i32			%conv = trunc i64 %b to i32
	%vecins = insertelement <4 x i32> %a, i32 %conv, i64 %idx			%vecins = insertelement <4 x i32> %a, i32 %conv, i64 %idx
	ret <4 x i32> %vecins			ret <4 x i32> %vecins
	}			}

	; Word immediate			; Word immediate

	define <4 x i32> @testWordImm(<4 x i32> %a, i64 %b) {			define <4 x i32> @testWordImm(<4 x i32> %a, i64 %b) {
	; CHECK-64-LABEL: testWordImm:			; CHECK-64-LABEL: testWordImm:
	; CHECK-64: # %bb.0: # %entry			; CHECK-64: # %bb.0: # %entry
	; CHECK-64-NEXT: mtfprwz 0, 3			; CHECK-64-NEXT: mtfprwz 0, 3
	; CHECK-64-NEXT: xxinsertw 34, 0, 4			; CHECK-64-NEXT: xxinsertw 34, 0, 4
	; CHECK-64-NEXT: xxinsertw 34, 0, 12			; CHECK-64-NEXT: xxinsertw 34, 0, 12
	; CHECK-64-NEXT: blr			; CHECK-64-NEXT: blr
	;			;
	; CHECK-32-LABEL: testWordImm:			; CHECK-32-LABEL: testWordImm:
	; CHECK-32: # %bb.0: # %entry			; CHECK-32: # %bb.0: # %entry
	; CHECK-32-NEXT: mtfprwz 0, 4			; CHECK-32-NEXT: mtfprwz 0, 4
	; CHECK-32-NEXT: xxinsertw 34, 0, 4			; CHECK-32-NEXT: xxinsertw 34, 0, 4
	; CHECK-32-NEXT: xxinsertw 34, 0, 12			; CHECK-32-NEXT: xxinsertw 34, 0, 12
	; CHECK-32-NEXT: blr			; CHECK-32-NEXT: blr
				;
				; CHECK-64-P10-LABEL: testWordImm:
				; CHECK-64-P10: # %bb.0: # %entry
				; CHECK-64-P10-NEXT: vinsw 2, 3, 4
				; CHECK-64-P10-NEXT: vinsw 2, 3, 12
				; CHECK-64-P10-NEXT: blr
				;
				; CHECK-32-P10-LABEL: testWordImm:
				; CHECK-32-P10: # %bb.0: # %entry
				; CHECK-32-P10-NEXT: vinsw 2, 4, 4
				; CHECK-32-P10-NEXT: vinsw 2, 4, 12
				; CHECK-32-P10-NEXT: blr
	entry:			entry:
	%conv = trunc i64 %b to i32			%conv = trunc i64 %b to i32
	%vecins = insertelement <4 x i32> %a, i32 %conv, i32 1			%vecins = insertelement <4 x i32> %a, i32 %conv, i32 1
	%vecins2 = insertelement <4 x i32> %vecins, i32 %conv, i32 3			%vecins2 = insertelement <4 x i32> %vecins, i32 %conv, i32 3
	ret <4 x i32> %vecins2			ret <4 x i32> %vecins2
	}			}

	; Doubleword indexed			; Doubleword indexed
	Show All 18 Lines
	; CHECK-32-NEXT: addi 3, 5, 1			; CHECK-32-NEXT: addi 3, 5, 1
	; CHECK-32-NEXT: addi 5, 1, -16			; CHECK-32-NEXT: addi 5, 1, -16
	; CHECK-32-NEXT: lxv 0, -32(1)			; CHECK-32-NEXT: lxv 0, -32(1)
	; CHECK-32-NEXT: rlwinm 3, 3, 2, 28, 29			; CHECK-32-NEXT: rlwinm 3, 3, 2, 28, 29
	; CHECK-32-NEXT: stxv 0, -16(1)			; CHECK-32-NEXT: stxv 0, -16(1)
	; CHECK-32-NEXT: stwx 4, 5, 3			; CHECK-32-NEXT: stwx 4, 5, 3
	; CHECK-32-NEXT: lxv 34, -16(1)			; CHECK-32-NEXT: lxv 34, -16(1)
	; CHECK-32-NEXT: blr			; CHECK-32-NEXT: blr
				;
				; CHECK-64-P10-LABEL: testDoubleword:
				; CHECK-64-P10: # %bb.0: # %entry
				; CHECK-64-P10-NEXT: rlwinm 4, 4, 3, 0, 28
				; CHECK-64-P10-NEXT: vinsdlx 2, 4, 3
				; CHECK-64-P10-NEXT: blr
				;
				; CHECK-32-P10-LABEL: testDoubleword:
				; CHECK-32-P10: # %bb.0: # %entry
				; CHECK-32-P10-NEXT: add 5, 6, 6
				; CHECK-32-P10-NEXT: vinswlx 2, 5, 3
				; CHECK-32-P10-NEXT: addi 3, 5, 1
				; CHECK-32-P10-NEXT: vinswlx 2, 3, 4
				; CHECK-32-P10-NEXT: blr
	entry:			entry:
	%vecins = insertelement <2 x i64> %a, i64 %b, i64 %idx			%vecins = insertelement <2 x i64> %a, i64 %b, i64 %idx
	ret <2 x i64> %vecins			ret <2 x i64> %vecins
	}			}

	; Doubleword immediate			; Doubleword immediate

	define <2 x i64> @testDoublewordImm(<2 x i64> %a, i64 %b) {			define <2 x i64> @testDoublewordImm(<2 x i64> %a, i64 %b) {
	; CHECK-64-LABEL: testDoublewordImm:			; CHECK-64-LABEL: testDoublewordImm:
	; CHECK-64: # %bb.0: # %entry			; CHECK-64: # %bb.0: # %entry
	; CHECK-64-NEXT: mtfprd 0, 3			; CHECK-64-NEXT: mtfprd 0, 3
	; CHECK-64-NEXT: xxmrghd 34, 34, 0			; CHECK-64-NEXT: xxmrghd 34, 34, 0
	; CHECK-64-NEXT: blr			; CHECK-64-NEXT: blr
	;			;
	; CHECK-32-LABEL: testDoublewordImm:			; CHECK-32-LABEL: testDoublewordImm:
	; CHECK-32: # %bb.0: # %entry			; CHECK-32: # %bb.0: # %entry
	; CHECK-32-NEXT: mtfprwz 0, 3			; CHECK-32-NEXT: mtfprwz 0, 3
	; CHECK-32-NEXT: xxinsertw 34, 0, 8			; CHECK-32-NEXT: xxinsertw 34, 0, 8
	; CHECK-32-NEXT: mtfprwz 0, 4			; CHECK-32-NEXT: mtfprwz 0, 4
	; CHECK-32-NEXT: xxinsertw 34, 0, 12			; CHECK-32-NEXT: xxinsertw 34, 0, 12
	; CHECK-32-NEXT: blr			; CHECK-32-NEXT: blr
				;
				; CHECK-64-P10-LABEL: testDoublewordImm:
				; CHECK-64-P10: # %bb.0: # %entry
				; CHECK-64-P10-NEXT: vinsd 2, 3, 8
				; CHECK-64-P10-NEXT: blr
				;
				; CHECK-32-P10-LABEL: testDoublewordImm:
				; CHECK-32-P10: # %bb.0: # %entry
				; CHECK-32-P10-NEXT: vinsw 2, 3, 8
				; CHECK-32-P10-NEXT: vinsw 2, 4, 12
				; CHECK-32-P10-NEXT: blr
	entry:			entry:
	%vecins = insertelement <2 x i64> %a, i64 %b, i32 1			%vecins = insertelement <2 x i64> %a, i64 %b, i32 1
	ret <2 x i64> %vecins			ret <2 x i64> %vecins
	}			}

	define <2 x i64> @testDoublewordImm2(<2 x i64> %a, i64 %b) {			define <2 x i64> @testDoublewordImm2(<2 x i64> %a, i64 %b) {
	; CHECK-64-LABEL: testDoublewordImm2:			; CHECK-64-LABEL: testDoublewordImm2:
	; CHECK-64: # %bb.0: # %entry			; CHECK-64: # %bb.0: # %entry
	; CHECK-64-NEXT: mtfprd 0, 3			; CHECK-64-NEXT: mtfprd 0, 3
	; CHECK-64-NEXT: xxpermdi 34, 0, 34, 1			; CHECK-64-NEXT: xxpermdi 34, 0, 34, 1
	; CHECK-64-NEXT: blr			; CHECK-64-NEXT: blr
	;			;
	; CHECK-32-LABEL: testDoublewordImm2:			; CHECK-32-LABEL: testDoublewordImm2:
	; CHECK-32: # %bb.0: # %entry			; CHECK-32: # %bb.0: # %entry
	; CHECK-32-NEXT: mtfprwz 0, 3			; CHECK-32-NEXT: mtfprwz 0, 3
	; CHECK-32-NEXT: xxinsertw 34, 0, 0			; CHECK-32-NEXT: xxinsertw 34, 0, 0
	; CHECK-32-NEXT: mtfprwz 0, 4			; CHECK-32-NEXT: mtfprwz 0, 4
	; CHECK-32-NEXT: xxinsertw 34, 0, 4			; CHECK-32-NEXT: xxinsertw 34, 0, 4
	; CHECK-32-NEXT: blr			; CHECK-32-NEXT: blr
				;
				; CHECK-64-P10-LABEL: testDoublewordImm2:
				; CHECK-64-P10: # %bb.0: # %entry
				; CHECK-64-P10-NEXT: vinsd 2, 3, 0
				; CHECK-64-P10-NEXT: blr
				;
				; CHECK-32-P10-LABEL: testDoublewordImm2:
				; CHECK-32-P10: # %bb.0: # %entry
				; CHECK-32-P10-NEXT: vinsw 2, 3, 0
				; CHECK-32-P10-NEXT: vinsw 2, 4, 4
				; CHECK-32-P10-NEXT: blr
	entry:			entry:
	%vecins = insertelement <2 x i64> %a, i64 %b, i32 0			%vecins = insertelement <2 x i64> %a, i64 %b, i32 0
	ret <2 x i64> %vecins			ret <2 x i64> %vecins
	}			}

	; Float indexed			; Float indexed

	define <4 x float> @testFloat1(<4 x float> %a, float %b, i32 zeroext %idx1) {			define <4 x float> @testFloat1(<4 x float> %a, float %b, i32 zeroext %idx1) {
	Show All 9 Lines
	; CHECK-32-LABEL: testFloat1:			; CHECK-32-LABEL: testFloat1:
	; CHECK-32: # %bb.0: # %entry			; CHECK-32: # %bb.0: # %entry
	; CHECK-32-NEXT: rlwinm 3, 4, 2, 28, 29			; CHECK-32-NEXT: rlwinm 3, 4, 2, 28, 29
	; CHECK-32-NEXT: addi 4, 1, -16			; CHECK-32-NEXT: addi 4, 1, -16
	; CHECK-32-NEXT: stxv 34, -16(1)			; CHECK-32-NEXT: stxv 34, -16(1)
	; CHECK-32-NEXT: stfsx 1, 4, 3			; CHECK-32-NEXT: stfsx 1, 4, 3
	; CHECK-32-NEXT: lxv 34, -16(1)			; CHECK-32-NEXT: lxv 34, -16(1)
	; CHECK-32-NEXT: blr			; CHECK-32-NEXT: blr
				;
				; CHECK-64-P10-LABEL: testFloat1:
				; CHECK-64-P10: # %bb.0: # %entry
				; CHECK-64-P10-NEXT: xscvdpspn 0, 1
				; CHECK-64-P10-NEXT: extsw 3, 4
				; CHECK-64-P10-NEXT: slwi 3, 3, 2
				; CHECK-64-P10-NEXT: xxsldwi 0, 0, 0, 3
				; CHECK-64-P10-NEXT: mffprwz 4, 0
				; CHECK-64-P10-NEXT: vinswlx 2, 3, 4
				; CHECK-64-P10-NEXT: blr
				;
				; CHECK-32-P10-LABEL: testFloat1:
				; CHECK-32-P10: # %bb.0: # %entry
				; CHECK-32-P10-NEXT: xscvdpspn 0, 1
				; CHECK-32-P10-NEXT: xxsldwi 0, 0, 0, 3
				; CHECK-32-P10-NEXT: mffprwz 3, 0
				; CHECK-32-P10-NEXT: vinswlx 2, 4, 3
				; CHECK-32-P10-NEXT: blr
	entry:			entry:
	%vecins = insertelement <4 x float> %a, float %b, i32 %idx1			%vecins = insertelement <4 x float> %a, float %b, i32 %idx1
	ret <4 x float> %vecins			ret <4 x float> %vecins
	}			}

	define <4 x float> @testFloat2(<4 x float> %a, i8* %b, i32 zeroext %idx1, i32 zeroext %idx2) {			define <4 x float> @testFloat2(<4 x float> %a, i8* %b, i32 zeroext %idx1, i32 zeroext %idx2) {
	; CHECK-64-LABEL: testFloat2:			; CHECK-64-LABEL: testFloat2:
	; CHECK-64: # %bb.0: # %entry			; CHECK-64: # %bb.0: # %entry
	; CHECK-64-DAG: lwz 6, 0(3)			; CHECK-64-NEXT: lwz 6, 0(3)
	; CHECK-64-DAG: rlwinm 4, 4, 2, 28, 29			; CHECK-64-NEXT: rlwinm 4, 4, 2, 28, 29
	; CHECK-64-DAG: addi 7, 1, -32			; CHECK-64-NEXT: addi 7, 1, -32
	; CHECK-64-DAG: stxv 34, -32(1)			; CHECK-64-NEXT: stxv 34, -32(1)
	; CHECK-64-DAG: stwx 6, 7, 4			; CHECK-64-NEXT: stwx 6, 7, 4
	; CHECK-64-DAG: rlwinm 4, 5, 2, 28, 29			; CHECK-64-NEXT: rlwinm 4, 5, 2, 28, 29
	; CHECK-64-DAG: addi 5, 1, -16			; CHECK-64-NEXT: addi 5, 1, -16
	; CHECK-64-DAG: lxv 0, -32(1)			; CHECK-64-NEXT: lxv 0, -32(1)
	; CHECK-64-DAG: lwz 3, 1(3)			; CHECK-64-NEXT: lwz 3, 1(3)
	; CHECK-64-DAG: stxv 0, -16(1)			; CHECK-64-NEXT: stxv 0, -16(1)
	; CHECK-64-DAG: stwx 3, 5, 4			; CHECK-64-NEXT: stwx 3, 5, 4
	; CHECK-64-DAG: lxv 34, -16(1)			; CHECK-64-NEXT: lxv 34, -16(1)
	; CHECK-64-NEXT: blr			; CHECK-64-NEXT: blr
	;			;
	; CHECK-32-LABEL: testFloat2:			; CHECK-32-LABEL: testFloat2:
	; CHECK-32: # %bb.0: # %entry			; CHECK-32: # %bb.0: # %entry
	; CHECK-32-NEXT: lwz 6, 0(3)			; CHECK-32-NEXT: lwz 6, 0(3)
	; CHECK-32-NEXT: addi 7, 1, -32			; CHECK-32-NEXT: addi 7, 1, -32
	; CHECK-32-NEXT: rlwinm 4, 4, 2, 28, 29			; CHECK-32-NEXT: rlwinm 4, 4, 2, 28, 29
	; CHECK-32-NEXT: stxv 34, -32(1)			; CHECK-32-NEXT: stxv 34, -32(1)
	; CHECK-32-NEXT: rlwinm 5, 5, 2, 28, 29			; CHECK-32-NEXT: rlwinm 5, 5, 2, 28, 29
	; CHECK-32-NEXT: stwx 6, 7, 4			; CHECK-32-NEXT: stwx 6, 7, 4
	; CHECK-32-NEXT: addi 4, 1, -16			; CHECK-32-NEXT: addi 4, 1, -16
	; CHECK-32-NEXT: lxv 0, -32(1)			; CHECK-32-NEXT: lxv 0, -32(1)
	; CHECK-32-NEXT: lwz 3, 1(3)			; CHECK-32-NEXT: lwz 3, 1(3)
	; CHECK-32-NEXT: stxv 0, -16(1)			; CHECK-32-NEXT: stxv 0, -16(1)
	; CHECK-32-NEXT: stwx 3, 4, 5			; CHECK-32-NEXT: stwx 3, 4, 5
	; CHECK-32-NEXT: lxv 34, -16(1)			; CHECK-32-NEXT: lxv 34, -16(1)
	; CHECK-32-NEXT: blr			; CHECK-32-NEXT: blr
				;
				; CHECK-64-P10-LABEL: testFloat2:
				; CHECK-64-P10: # %bb.0: # %entry
				; CHECK-64-P10-NEXT: lwz 6, 0(3)
				; CHECK-64-P10-NEXT: extsw 4, 4
				; CHECK-64-P10-NEXT: lwz 3, 1(3)
				; CHECK-64-P10-NEXT: slwi 4, 4, 2
				; CHECK-64-P10-NEXT: vinswlx 2, 4, 6
				; CHECK-64-P10-NEXT: extsw 4, 5
				; CHECK-64-P10-NEXT: slwi 4, 4, 2
				; CHECK-64-P10-NEXT: vinswlx 2, 4, 3
				; CHECK-64-P10-NEXT: blr
				;
				; CHECK-32-P10-LABEL: testFloat2:
				; CHECK-32-P10: # %bb.0: # %entry
				; CHECK-32-P10-NEXT: lwz 6, 0(3)
				; CHECK-32-P10-NEXT: lwz 3, 1(3)
				; CHECK-32-P10-NEXT: vinswlx 2, 4, 6
				; CHECK-32-P10-NEXT: vinswlx 2, 5, 3
				; CHECK-32-P10-NEXT: blr
	entry:			entry:
	%0 = bitcast i8* %b to float*			%0 = bitcast i8* %b to float*
	%add.ptr1 = getelementptr inbounds i8, i8* %b, i64 1			%add.ptr1 = getelementptr inbounds i8, i8* %b, i64 1
	%1 = bitcast i8* %add.ptr1 to float*			%1 = bitcast i8* %add.ptr1 to float*
	%2 = load float, float* %0, align 4			%2 = load float, float* %0, align 4
	%vecins = insertelement <4 x float> %a, float %2, i32 %idx1			%vecins = insertelement <4 x float> %a, float %2, i32 %idx1
	%3 = load float, float* %1, align 4			%3 = load float, float* %1, align 4
	%vecins2 = insertelement <4 x float> %vecins, float %3, i32 %idx2			%vecins2 = insertelement <4 x float> %vecins, float %3, i32 %idx2
	ret <4 x float> %vecins2			ret <4 x float> %vecins2
	}			}

	define <4 x float> @testFloat3(<4 x float> %a, i8* %b, i32 zeroext %idx1, i32 zeroext %idx2) {			define <4 x float> @testFloat3(<4 x float> %a, i8* %b, i32 zeroext %idx1, i32 zeroext %idx2) {
	; CHECK-64-LABEL: testFloat3:			; CHECK-64-LABEL: testFloat3:
	; CHECK-64: # %bb.0: # %entry			; CHECK-64: # %bb.0: # %entry
	; CHECK-64-DAG: lis 6, 1			; CHECK-64-NEXT: lis 6, 1
	; CHECK-64-DAG: rlwinm 4, 4, 2, 28, 29			; CHECK-64-NEXT: rlwinm 4, 4, 2, 28, 29
	; CHECK-64-DAG: addi 7, 1, -32			; CHECK-64-NEXT: addi 7, 1, -32
	; CHECK-64-DAG: lwzx 6, 3, 6			; CHECK-64-NEXT: lwzx 6, 3, 6
	; CHECK-64-DAG: stxv 34, -32(1)			; CHECK-64-NEXT: stxv 34, -32(1)
	; CHECK-64-DAG: stwx 6, 7, 4			; CHECK-64-NEXT: stwx 6, 7, 4
	; CHECK-64-DAG: li 4, 1			; CHECK-64-NEXT: li 4, 1
	; CHECK-64-DAG: lxv 0, -32(1)			; CHECK-64-NEXT: lxv 0, -32(1)
	; CHECK-64-DAG: rldic 4, 4, 36, 27			; CHECK-64-NEXT: rldic 4, 4, 36, 27
	; CHECK-64-DAG: lwzx 3, 3, 4			; CHECK-64-NEXT: lwzx 3, 3, 4
	; CHECK-64-DAG: rlwinm 4, 5, 2, 28, 29			; CHECK-64-NEXT: rlwinm 4, 5, 2, 28, 29
	; CHECK-64-DAG: addi 5, 1, -16			; CHECK-64-NEXT: addi 5, 1, -16
	; CHECK-64-DAG: stxv 0, -16(1)			; CHECK-64-NEXT: stxv 0, -16(1)
	; CHECK-64-DAG: stwx 3, 5, 4			; CHECK-64-NEXT: stwx 3, 5, 4
	; CHECK-64-DAG: lxv 34, -16(1)			; CHECK-64-NEXT: lxv 34, -16(1)
	; CHECK-64-NEXT: blr			; CHECK-64-NEXT: blr
	;			;
	; CHECK-32-LABEL: testFloat3:			; CHECK-32-LABEL: testFloat3:
	; CHECK-32: # %bb.0: # %entry			; CHECK-32: # %bb.0: # %entry
	; CHECK-32-NEXT: lis 6, 1			; CHECK-32-NEXT: lis 6, 1
	; CHECK-32-NEXT: addi 7, 1, -32			; CHECK-32-NEXT: addi 7, 1, -32
	; CHECK-32-NEXT: rlwinm 4, 4, 2, 28, 29			; CHECK-32-NEXT: rlwinm 4, 4, 2, 28, 29
	; CHECK-32-NEXT: rlwinm 5, 5, 2, 28, 29			; CHECK-32-NEXT: rlwinm 5, 5, 2, 28, 29
	; CHECK-32-NEXT: lwzx 6, 3, 6			; CHECK-32-NEXT: lwzx 6, 3, 6
	; CHECK-32-NEXT: stxv 34, -32(1)			; CHECK-32-NEXT: stxv 34, -32(1)
	; CHECK-32-NEXT: stwx 6, 7, 4			; CHECK-32-NEXT: stwx 6, 7, 4
	; CHECK-32-NEXT: addi 4, 1, -16			; CHECK-32-NEXT: addi 4, 1, -16
	; CHECK-32-NEXT: lxv 0, -32(1)			; CHECK-32-NEXT: lxv 0, -32(1)
	; CHECK-32-NEXT: lwz 3, 0(3)			; CHECK-32-NEXT: lwz 3, 0(3)
	; CHECK-32-NEXT: stxv 0, -16(1)			; CHECK-32-NEXT: stxv 0, -16(1)
	; CHECK-32-NEXT: stwx 3, 4, 5			; CHECK-32-NEXT: stwx 3, 4, 5
	; CHECK-32-NEXT: lxv 34, -16(1)			; CHECK-32-NEXT: lxv 34, -16(1)
	; CHECK-32-NEXT: blr			; CHECK-32-NEXT: blr
				;
				; CHECK-64-P10-LABEL: testFloat3:
				; CHECK-64-P10: # %bb.0: # %entry
				; CHECK-64-P10-NEXT: plwz 6, 65536(3), 0
				; CHECK-64-P10-NEXT: extsw 4, 4
				; CHECK-64-P10-NEXT: slwi 4, 4, 2
				; CHECK-64-P10-NEXT: vinswlx 2, 4, 6
				; CHECK-64-P10-NEXT: li 4, 1
				; CHECK-64-P10-NEXT: rldic 4, 4, 36, 27
				; CHECK-64-P10-NEXT: lwzx 3, 3, 4
				; CHECK-64-P10-NEXT: extsw 4, 5
				; CHECK-64-P10-NEXT: slwi 4, 4, 2
				; CHECK-64-P10-NEXT: vinswlx 2, 4, 3
				; CHECK-64-P10-NEXT: blr
				;
				; CHECK-32-P10-LABEL: testFloat3:
				; CHECK-32-P10: # %bb.0: # %entry
				; CHECK-32-P10-NEXT: lis 6, 1
				; CHECK-32-P10-NEXT: lwzx 6, 3, 6
				; CHECK-32-P10-NEXT: lwz 3, 0(3)
				; CHECK-32-P10-NEXT: vinswlx 2, 4, 6
				; CHECK-32-P10-NEXT: vinswlx 2, 5, 3
				; CHECK-32-P10-NEXT: blr
	entry:			entry:
	%add.ptr = getelementptr inbounds i8, i8* %b, i64 65536			%add.ptr = getelementptr inbounds i8, i8* %b, i64 65536
	%0 = bitcast i8* %add.ptr to float*			%0 = bitcast i8* %add.ptr to float*
	%add.ptr1 = getelementptr inbounds i8, i8* %b, i64 68719476736			%add.ptr1 = getelementptr inbounds i8, i8* %b, i64 68719476736
	%1 = bitcast i8* %add.ptr1 to float*			%1 = bitcast i8* %add.ptr1 to float*
	%2 = load float, float* %0, align 4			%2 = load float, float* %0, align 4
	%vecins = insertelement <4 x float> %a, float %2, i32 %idx1			%vecins = insertelement <4 x float> %a, float %2, i32 %idx1
	%3 = load float, float* %1, align 4			%3 = load float, float* %1, align 4
	[... 14 lines not shown ...]
	;			;
	; CHECK-32-LABEL: testFloatImm1:			; CHECK-32-LABEL: testFloatImm1:
	; CHECK-32: # %bb.0: # %entry			; CHECK-32: # %bb.0: # %entry
	; CHECK-32-NEXT: xscvdpspn 0, 1			; CHECK-32-NEXT: xscvdpspn 0, 1
	; CHECK-32-NEXT: xxsldwi 0, 0, 0, 3			; CHECK-32-NEXT: xxsldwi 0, 0, 0, 3
	; CHECK-32-NEXT: xxinsertw 34, 0, 0			; CHECK-32-NEXT: xxinsertw 34, 0, 0
	; CHECK-32-NEXT: xxinsertw 34, 0, 8			; CHECK-32-NEXT: xxinsertw 34, 0, 8
	; CHECK-32-NEXT: blr			; CHECK-32-NEXT: blr
				;
				; CHECK-64-P10-LABEL: testFloatImm1:
				; CHECK-64-P10: # %bb.0: # %entry
				; CHECK-64-P10-NEXT: xscvdpspn 0, 1
				; CHECK-64-P10-NEXT: xxsldwi 0, 0, 0, 3
				; CHECK-64-P10-NEXT: xxinsertw 34, 0, 0
				; CHECK-64-P10-NEXT: xxinsertw 34, 0, 8
				; CHECK-64-P10-NEXT: blr
				;
				; CHECK-32-P10-LABEL: testFloatImm1:
				; CHECK-32-P10: # %bb.0: # %entry
				; CHECK-32-P10-NEXT: xscvdpspn 0, 1
				; CHECK-32-P10-NEXT: xxsldwi 0, 0, 0, 3
				; CHECK-32-P10-NEXT: xxinsertw 34, 0, 0
				; CHECK-32-P10-NEXT: xxinsertw 34, 0, 8
				; CHECK-32-P10-NEXT: blr
	entry:			entry:
	%vecins = insertelement <4 x float> %a, float %b, i32 0			%vecins = insertelement <4 x float> %a, float %b, i32 0
	%vecins1 = insertelement <4 x float> %vecins, float %b, i32 2			%vecins1 = insertelement <4 x float> %vecins, float %b, i32 2
	ret <4 x float> %vecins1			ret <4 x float> %vecins1
	}			}

	define <4 x float> @testFloatImm2(<4 x float> %a, i32* %b) {			define <4 x float> @testFloatImm2(<4 x float> %a, i32* %b) {
	; CHECK-64-LABEL: testFloatImm2:			; CHECK-64-LABEL: testFloatImm2:
	[... 14 lines not shown ...]
	; CHECK-32-NEXT: xscvdpspn 0, 0			; CHECK-32-NEXT: xscvdpspn 0, 0
	; CHECK-32-NEXT: xxsldwi 0, 0, 0, 3			; CHECK-32-NEXT: xxsldwi 0, 0, 0, 3
	; CHECK-32-NEXT: xxinsertw 34, 0, 0			; CHECK-32-NEXT: xxinsertw 34, 0, 0
	; CHECK-32-NEXT: lfs 0, 4(3)			; CHECK-32-NEXT: lfs 0, 4(3)
	; CHECK-32-NEXT: xscvdpspn 0, 0			; CHECK-32-NEXT: xscvdpspn 0, 0
	; CHECK-32-NEXT: xxsldwi 0, 0, 0, 3			; CHECK-32-NEXT: xxsldwi 0, 0, 0, 3
	; CHECK-32-NEXT: xxinsertw 34, 0, 8			; CHECK-32-NEXT: xxinsertw 34, 0, 8
	; CHECK-32-NEXT: blr			; CHECK-32-NEXT: blr
				;
				; CHECK-64-P10-LABEL: testFloatImm2:
				; CHECK-64-P10: # %bb.0: # %entry
				; CHECK-64-P10-NEXT: lwz 4, 0(3)
				; CHECK-64-P10-NEXT: lwz 3, 4(3)
				; CHECK-64-P10-NEXT: vinsw 2, 4, 0
				; CHECK-64-P10-NEXT: vinsw 2, 3, 8
				; CHECK-64-P10-NEXT: blr
				;
				; CHECK-32-P10-LABEL: testFloatImm2:
				; CHECK-32-P10: # %bb.0: # %entry
				; CHECK-32-P10-NEXT: lwz 4, 0(3)
				; CHECK-32-P10-NEXT: lwz 3, 4(3)
				; CHECK-32-P10-NEXT: vinsw 2, 4, 0
				; CHECK-32-P10-NEXT: vinsw 2, 3, 8
				; CHECK-32-P10-NEXT: blr
	entry:			entry:
	%0 = bitcast i32* %b to float*			%0 = bitcast i32* %b to float*
	%add.ptr1 = getelementptr inbounds i32, i32* %b, i64 1			%add.ptr1 = getelementptr inbounds i32, i32* %b, i64 1
	%1 = bitcast i32* %add.ptr1 to float*			%1 = bitcast i32* %add.ptr1 to float*
	%2 = load float, float* %0, align 4			%2 = load float, float* %0, align 4
	%vecins = insertelement <4 x float> %a, float %2, i32 0			%vecins = insertelement <4 x float> %a, float %2, i32 0
	%3 = load float, float* %1, align 4			%3 = load float, float* %1, align 4
	%vecins2 = insertelement <4 x float> %vecins, float %3, i32 2			%vecins2 = insertelement <4 x float> %vecins, float %3, i32 2
	[... 23 lines not shown ...]
	; CHECK-32-NEXT: xscvdpspn 0, 0			; CHECK-32-NEXT: xscvdpspn 0, 0
	; CHECK-32-NEXT: xxsldwi 0, 0, 0, 3			; CHECK-32-NEXT: xxsldwi 0, 0, 0, 3
	; CHECK-32-NEXT: xxinsertw 34, 0, 0			; CHECK-32-NEXT: xxinsertw 34, 0, 0
	; CHECK-32-NEXT: lfs 0, 0(3)			; CHECK-32-NEXT: lfs 0, 0(3)
	; CHECK-32-NEXT: xscvdpspn 0, 0			; CHECK-32-NEXT: xscvdpspn 0, 0
	; CHECK-32-NEXT: xxsldwi 0, 0, 0, 3			; CHECK-32-NEXT: xxsldwi 0, 0, 0, 3
	; CHECK-32-NEXT: xxinsertw 34, 0, 8			; CHECK-32-NEXT: xxinsertw 34, 0, 8
	; CHECK-32-NEXT: blr			; CHECK-32-NEXT: blr
				;
				; CHECK-64-P10-LABEL: testFloatImm3:
				; CHECK-64-P10: # %bb.0: # %entry
				; CHECK-64-P10-NEXT: plwz 4, 262144(3), 0
				; CHECK-64-P10-NEXT: vinsw 2, 4, 0
				; CHECK-64-P10-NEXT: li 4, 1
				; CHECK-64-P10-NEXT: rldic 4, 4, 38, 25
				; CHECK-64-P10-NEXT: lwzx 3, 3, 4
				; CHECK-64-P10-NEXT: vinsw 2, 3, 8
				; CHECK-64-P10-NEXT: blr
				;
				; CHECK-32-P10-LABEL: testFloatImm3:
				; CHECK-32-P10: # %bb.0: # %entry
				; CHECK-32-P10-NEXT: lis 4, 4
				; CHECK-32-P10-NEXT: lwzx 4, 3, 4
				; CHECK-32-P10-NEXT: lwz 3, 0(3)
				; CHECK-32-P10-NEXT: vinsw 2, 4, 0
				; CHECK-32-P10-NEXT: vinsw 2, 3, 8
				; CHECK-32-P10-NEXT: blr
	entry:			entry:
	%add.ptr = getelementptr inbounds i32, i32* %b, i64 65536			%add.ptr = getelementptr inbounds i32, i32* %b, i64 65536
	%0 = bitcast i32* %add.ptr to float*			%0 = bitcast i32* %add.ptr to float*
	%add.ptr1 = getelementptr inbounds i32, i32* %b, i64 68719476736			%add.ptr1 = getelementptr inbounds i32, i32* %b, i64 68719476736
	%1 = bitcast i32* %add.ptr1 to float*			%1 = bitcast i32* %add.ptr1 to float*
	%2 = load float, float* %0, align 4			%2 = load float, float* %0, align 4
	%vecins = insertelement <4 x float> %a, float %2, i32 0			%vecins = insertelement <4 x float> %a, float %2, i32 0
	%3 = load float, float* %1, align 4			%3 = load float, float* %1, align 4
	[... 16 lines not shown ...]
	; CHECK-32-LABEL: testDouble1:			; CHECK-32-LABEL: testDouble1:
	; CHECK-32: # %bb.0: # %entry			; CHECK-32: # %bb.0: # %entry
	; CHECK-32-NEXT: addi 4, 1, -16			; CHECK-32-NEXT: addi 4, 1, -16
	; CHECK-32-NEXT: rlwinm 3, 5, 3, 28, 28			; CHECK-32-NEXT: rlwinm 3, 5, 3, 28, 28
	; CHECK-32-NEXT: stxv 34, -16(1)			; CHECK-32-NEXT: stxv 34, -16(1)
	; CHECK-32-NEXT: stfdx 1, 4, 3			; CHECK-32-NEXT: stfdx 1, 4, 3
	; CHECK-32-NEXT: lxv 34, -16(1)			; CHECK-32-NEXT: lxv 34, -16(1)
	; CHECK-32-NEXT: blr			; CHECK-32-NEXT: blr
				;
				; CHECK-64-P10-LABEL: testDouble1:
				; CHECK-64-P10: # %bb.0: # %entry
				; CHECK-64-P10-NEXT: extsw 4, 4
				; CHECK-64-P10-NEXT: mffprd 3, 1
				; CHECK-64-P10-NEXT: rlwinm 4, 4, 3, 0, 28
				; CHECK-64-P10-NEXT: vinsdlx 2, 4, 3
				; CHECK-64-P10-NEXT: blr
				;
				; CHECK-32-P10-LABEL: testDouble1:
				; CHECK-32-P10: # %bb.0: # %entry
				; CHECK-32-P10-NEXT: addi 4, 1, -16
				; CHECK-32-P10-NEXT: rlwinm 3, 5, 3, 28, 28
				; CHECK-32-P10-NEXT: stxv 34, -16(1)
				; CHECK-32-P10-NEXT: stfdx 1, 4, 3
				; CHECK-32-P10-NEXT: lxv 34, -16(1)
				; CHECK-32-P10-NEXT: blr
	entry:			entry:
	%vecins = insertelement <2 x double> %a, double %b, i32 %idx1			%vecins = insertelement <2 x double> %a, double %b, i32 %idx1
	ret <2 x double> %vecins			ret <2 x double> %vecins
	}			}

	define <2 x double> @testDouble2(<2 x double> %a, i8* %b, i32 zeroext %idx1, i32 zeroext %idx2) {			define <2 x double> @testDouble2(<2 x double> %a, i8* %b, i32 zeroext %idx1, i32 zeroext %idx2) {
	; CHECK-64-LABEL: testDouble2:			; CHECK-64-LABEL: testDouble2:
	; CHECK-64: # %bb.0: # %entry			; CHECK-64: # %bb.0: # %entry
	; CHECK-64-DAG: ld 6, 0(3)			; CHECK-64-NEXT: ld 6, 0(3)
	; CHECK-64-DAG: rlwinm 4, 4, 3, 28, 28			; CHECK-64-NEXT: rlwinm 4, 4, 3, 28, 28
	; CHECK-64-DAG: addi 7, 1, -32			; CHECK-64-NEXT: addi 7, 1, -32
	; CHECK-64-DAG: stxv 34, -32(1)			; CHECK-64-NEXT: stxv 34, -32(1)
	; CHECK-64-DAG: stdx 6, 7, 4			; CHECK-64-NEXT: stdx 6, 7, 4
	; CHECK-64-DAG: li 4, 1			; CHECK-64-NEXT: li 4, 1
	; CHECK-64-DAG: lxv 0, -32(1)			; CHECK-64-NEXT: lxv 0, -32(1)
	; CHECK-64-DAG: ldx 3, 3, 4			; CHECK-64-NEXT: ldx 3, 3, 4
	; CHECK-64-DAG: rlwinm 4, 5, 3, 28, 28			; CHECK-64-NEXT: rlwinm 4, 5, 3, 28, 28
	; CHECK-64-DAG: addi 5, 1, -16			; CHECK-64-NEXT: addi 5, 1, -16
	; CHECK-64-DAG: stxv 0, -16(1)			; CHECK-64-NEXT: stxv 0, -16(1)
	; CHECK-64-DAG: stdx 3, 5, 4			; CHECK-64-NEXT: stdx 3, 5, 4
	; CHECK-64-DAG: lxv 34, -16(1)			; CHECK-64-NEXT: lxv 34, -16(1)
	; CHECK-64-NEXT: blr			; CHECK-64-NEXT: blr
	;			;
	; CHECK-32-LABEL: testDouble2:			; CHECK-32-LABEL: testDouble2:
	; CHECK-32: # %bb.0: # %entry			; CHECK-32: # %bb.0: # %entry
	; CHECK-32-NEXT: lfd 0, 0(3)			; CHECK-32-NEXT: lfd 0, 0(3)
	; CHECK-32-NEXT: addi 6, 1, -32			; CHECK-32-NEXT: addi 6, 1, -32
	; CHECK-32-NEXT: rlwinm 4, 4, 3, 28, 28			; CHECK-32-NEXT: rlwinm 4, 4, 3, 28, 28
	; CHECK-32-NEXT: stxv 34, -32(1)			; CHECK-32-NEXT: stxv 34, -32(1)
	; CHECK-32-NEXT: rlwinm 5, 5, 3, 28, 28			; CHECK-32-NEXT: rlwinm 5, 5, 3, 28, 28
	; CHECK-32-NEXT: stfdx 0, 6, 4			; CHECK-32-NEXT: stfdx 0, 6, 4
	; CHECK-32-NEXT: lxv 0, -32(1)			; CHECK-32-NEXT: lxv 0, -32(1)
	; CHECK-32-NEXT: lfd 1, 1(3)			; CHECK-32-NEXT: lfd 1, 1(3)
	; CHECK-32-NEXT: addi 3, 1, -16			; CHECK-32-NEXT: addi 3, 1, -16
	; CHECK-32-NEXT: stxv 0, -16(1)			; CHECK-32-NEXT: stxv 0, -16(1)
	; CHECK-32-NEXT: stfdx 1, 3, 5			; CHECK-32-NEXT: stfdx 1, 3, 5
	; CHECK-32-NEXT: lxv 34, -16(1)			; CHECK-32-NEXT: lxv 34, -16(1)
	; CHECK-32-NEXT: blr			; CHECK-32-NEXT: blr
				;
				; CHECK-64-P10-LABEL: testDouble2:
				; CHECK-64-P10: # %bb.0: # %entry
				; CHECK-64-P10-NEXT: ld 6, 0(3)
				; CHECK-64-P10-NEXT: extsw 4, 4
				; CHECK-64-P10-NEXT: pld 3, 1(3), 0
				; CHECK-64-P10-NEXT: rlwinm 4, 4, 3, 0, 28
				; CHECK-64-P10-NEXT: vinsdlx 2, 4, 6
				; CHECK-64-P10-NEXT: extsw 4, 5
				; CHECK-64-P10-NEXT: rlwinm 4, 4, 3, 0, 28
				; CHECK-64-P10-NEXT: vinsdlx 2, 4, 3
				; CHECK-64-P10-NEXT: blr
				;
				; CHECK-32-P10-LABEL: testDouble2:
				; CHECK-32-P10: # %bb.0: # %entry
				; CHECK-32-P10-NEXT: lfd 0, 0(3)
				; CHECK-32-P10-NEXT: addi 6, 1, -32
				; CHECK-32-P10-NEXT: rlwinm 4, 4, 3, 28, 28
				; CHECK-32-P10-NEXT: stxv 34, -32(1)
				; CHECK-32-P10-NEXT: rlwinm 5, 5, 3, 28, 28
				; CHECK-32-P10-NEXT: stfdx 0, 6, 4
				; CHECK-32-P10-NEXT: lxv 0, -32(1)
				; CHECK-32-P10-NEXT: lfd 1, 1(3)
				; CHECK-32-P10-NEXT: addi 3, 1, -16
				; CHECK-32-P10-NEXT: stxv 0, -16(1)
				; CHECK-32-P10-NEXT: stfdx 1, 3, 5
				; CHECK-32-P10-NEXT: lxv 34, -16(1)
				; CHECK-32-P10-NEXT: blr
	entry:			entry:
	%0 = bitcast i8* %b to double*			%0 = bitcast i8* %b to double*
	%add.ptr1 = getelementptr inbounds i8, i8* %b, i64 1			%add.ptr1 = getelementptr inbounds i8, i8* %b, i64 1
	%1 = bitcast i8* %add.ptr1 to double*			%1 = bitcast i8* %add.ptr1 to double*
	%2 = load double, double* %0, align 8			%2 = load double, double* %0, align 8
	%vecins = insertelement <2 x double> %a, double %2, i32 %idx1			%vecins = insertelement <2 x double> %a, double %2, i32 %idx1
	%3 = load double, double* %1, align 8			%3 = load double, double* %1, align 8
	%vecins2 = insertelement <2 x double> %vecins, double %3, i32 %idx2			%vecins2 = insertelement <2 x double> %vecins, double %3, i32 %idx2
	ret <2 x double> %vecins2			ret <2 x double> %vecins2
	}			}

	define <2 x double> @testDouble3(<2 x double> %a, i8* %b, i32 zeroext %idx1, i32 zeroext %idx2) {			define <2 x double> @testDouble3(<2 x double> %a, i8* %b, i32 zeroext %idx1, i32 zeroext %idx2) {
	; CHECK-64-LABEL: testDouble3:			; CHECK-64-LABEL: testDouble3:
	; CHECK-64: # %bb.0: # %entry			; CHECK-64: # %bb.0: # %entry
	; CHECK-64-DAG: lis 6, 1			; CHECK-64-NEXT: lis 6, 1
	; CHECK-64-DAG: rlwinm 4, 4, 3, 28, 28			; CHECK-64-NEXT: rlwinm 4, 4, 3, 28, 28
	; CHECK-64-DAG: addi 7, 1, -32			; CHECK-64-NEXT: addi 7, 1, -32
	; CHECK-64-DAG: ldx 6, 3, 6			; CHECK-64-NEXT: ldx 6, 3, 6
	; CHECK-64-DAG: stxv 34, -32(1)			; CHECK-64-NEXT: stxv 34, -32(1)
	; CHECK-64-DAG: stdx 6, 7, 4			; CHECK-64-NEXT: stdx 6, 7, 4
	; CHECK-64-DAG: li 4, 1			; CHECK-64-NEXT: li 4, 1
	; CHECK-64-DAG: lxv 0, -32(1)			; CHECK-64-NEXT: lxv 0, -32(1)
	; CHECK-64-DAG: rldic 4, 4, 36, 27			; CHECK-64-NEXT: rldic 4, 4, 36, 27
	; CHECK-64-DAG: ldx 3, 3, 4			; CHECK-64-NEXT: ldx 3, 3, 4
	; CHECK-64-DAG: rlwinm 4, 5, 3, 28, 28			; CHECK-64-NEXT: rlwinm 4, 5, 3, 28, 28
	; CHECK-64-DAG: addi 5, 1, -16			; CHECK-64-NEXT: addi 5, 1, -16
	; CHECK-64-DAG: stxv 0, -16(1)			; CHECK-64-NEXT: stxv 0, -16(1)
	; CHECK-64-DAG: stdx 3, 5, 4			; CHECK-64-NEXT: stdx 3, 5, 4
	; CHECK-64-DAG: lxv 34, -16(1)			; CHECK-64-NEXT: lxv 34, -16(1)
	; CHECK-64-NEXT: blr			; CHECK-64-NEXT: blr
	;			;
	; CHECK-32-LABEL: testDouble3:			; CHECK-32-LABEL: testDouble3:
	; CHECK-32: # %bb.0: # %entry			; CHECK-32: # %bb.0: # %entry
	; CHECK-32-NEXT: lis 6, 1			; CHECK-32-NEXT: lis 6, 1
	; CHECK-32-NEXT: rlwinm 4, 4, 3, 28, 28			; CHECK-32-NEXT: rlwinm 4, 4, 3, 28, 28
	; CHECK-32-NEXT: rlwinm 5, 5, 3, 28, 28			; CHECK-32-NEXT: rlwinm 5, 5, 3, 28, 28
	; CHECK-32-NEXT: lfdx 0, 3, 6			; CHECK-32-NEXT: lfdx 0, 3, 6
	; CHECK-32-NEXT: addi 6, 1, -32			; CHECK-32-NEXT: addi 6, 1, -32
	; CHECK-32-NEXT: stxv 34, -32(1)			; CHECK-32-NEXT: stxv 34, -32(1)
	; CHECK-32-NEXT: stfdx 0, 6, 4			; CHECK-32-NEXT: stfdx 0, 6, 4
	; CHECK-32-NEXT: lxv 0, -32(1)			; CHECK-32-NEXT: lxv 0, -32(1)
	; CHECK-32-NEXT: lfd 1, 0(3)			; CHECK-32-NEXT: lfd 1, 0(3)
	; CHECK-32-NEXT: addi 3, 1, -16			; CHECK-32-NEXT: addi 3, 1, -16
	; CHECK-32-NEXT: stxv 0, -16(1)			; CHECK-32-NEXT: stxv 0, -16(1)
	; CHECK-32-NEXT: stfdx 1, 3, 5			; CHECK-32-NEXT: stfdx 1, 3, 5
	; CHECK-32-NEXT: lxv 34, -16(1)			; CHECK-32-NEXT: lxv 34, -16(1)
	; CHECK-32-NEXT: blr			; CHECK-32-NEXT: blr
				;
				; CHECK-64-P10-LABEL: testDouble3:
				; CHECK-64-P10: # %bb.0: # %entry
				; CHECK-64-P10-NEXT: pld 6, 65536(3), 0
				; CHECK-64-P10-NEXT: extsw 4, 4
				; CHECK-64-P10-NEXT: rlwinm 4, 4, 3, 0, 28
				; CHECK-64-P10-NEXT: vinsdlx 2, 4, 6
				; CHECK-64-P10-NEXT: li 4, 1
				; CHECK-64-P10-NEXT: rldic 4, 4, 36, 27
				; CHECK-64-P10-NEXT: ldx 3, 3, 4
				; CHECK-64-P10-NEXT: extsw 4, 5
				; CHECK-64-P10-NEXT: rlwinm 4, 4, 3, 0, 28
				; CHECK-64-P10-NEXT: vinsdlx 2, 4, 3
				; CHECK-64-P10-NEXT: blr
				;
				; CHECK-32-P10-LABEL: testDouble3:
				; CHECK-32-P10: # %bb.0: # %entry
				; CHECK-32-P10-NEXT: lis 6, 1
				; CHECK-32-P10-NEXT: rlwinm 4, 4, 3, 28, 28
				; CHECK-32-P10-NEXT: rlwinm 5, 5, 3, 28, 28
				; CHECK-32-P10-NEXT: lfdx 0, 3, 6
				; CHECK-32-P10-NEXT: addi 6, 1, -32
				; CHECK-32-P10-NEXT: stxv 34, -32(1)
				; CHECK-32-P10-NEXT: stfdx 0, 6, 4
				; CHECK-32-P10-NEXT: lxv 0, -32(1)
				; CHECK-32-P10-NEXT: lfd 1, 0(3)
				; CHECK-32-P10-NEXT: addi 3, 1, -16
				; CHECK-32-P10-NEXT: stxv 0, -16(1)
				; CHECK-32-P10-NEXT: stfdx 1, 3, 5
				; CHECK-32-P10-NEXT: lxv 34, -16(1)
				; CHECK-32-P10-NEXT: blr
	entry:			entry:
	%add.ptr = getelementptr inbounds i8, i8* %b, i64 65536			%add.ptr = getelementptr inbounds i8, i8* %b, i64 65536
	%0 = bitcast i8* %add.ptr to double*			%0 = bitcast i8* %add.ptr to double*
	%add.ptr1 = getelementptr inbounds i8, i8* %b, i64 68719476736			%add.ptr1 = getelementptr inbounds i8, i8* %b, i64 68719476736
	%1 = bitcast i8* %add.ptr1 to double*			%1 = bitcast i8* %add.ptr1 to double*
	%2 = load double, double* %0, align 8			%2 = load double, double* %0, align 8
	%vecins = insertelement <2 x double> %a, double %2, i32 %idx1			%vecins = insertelement <2 x double> %a, double %2, i32 %idx1
	%3 = load double, double* %1, align 8			%3 = load double, double* %1, align 8
	[... 10 lines not shown ...]
	; CHECK-64-NEXT: xxpermdi 34, 1, 34, 1			; CHECK-64-NEXT: xxpermdi 34, 1, 34, 1
	; CHECK-64-NEXT: blr			; CHECK-64-NEXT: blr
	;			;
	; CHECK-32-LABEL: testDoubleImm1:			; CHECK-32-LABEL: testDoubleImm1:
	; CHECK-32: # %bb.0: # %entry			; CHECK-32: # %bb.0: # %entry
	; CHECK-32-NEXT: # kill: def $f1 killed $f1 def $vsl1			; CHECK-32-NEXT: # kill: def $f1 killed $f1 def $vsl1
	; CHECK-32-NEXT: xxpermdi 34, 1, 34, 1			; CHECK-32-NEXT: xxpermdi 34, 1, 34, 1
	; CHECK-32-NEXT: blr			; CHECK-32-NEXT: blr
				;
				; CHECK-64-P10-LABEL: testDoubleImm1:
				; CHECK-64-P10: # %bb.0: # %entry
				; CHECK-64-P10-NEXT: # kill: def $f1 killed $f1 def $vsl1
				; CHECK-64-P10-NEXT: xxpermdi 34, 1, 34, 1
				; CHECK-64-P10-NEXT: blr
				;
				; CHECK-32-P10-LABEL: testDoubleImm1:
				; CHECK-32-P10: # %bb.0: # %entry
				; CHECK-32-P10-NEXT: # kill: def $f1 killed $f1 def $vsl1
				; CHECK-32-P10-NEXT: xxpermdi 34, 1, 34, 1
				; CHECK-32-P10-NEXT: blr
	entry:			entry:
	%vecins = insertelement <2 x double> %a, double %b, i32 0			%vecins = insertelement <2 x double> %a, double %b, i32 0
	ret <2 x double> %vecins			ret <2 x double> %vecins
	}			}

	define <2 x double> @testDoubleImm2(<2 x double> %a, i32* %b) {			define <2 x double> @testDoubleImm2(<2 x double> %a, i32* %b) {
	; CHECK-64-LABEL: testDoubleImm2:			; CHECK-64-LABEL: testDoubleImm2:
	; CHECK-64: # %bb.0: # %entry			; CHECK-64: # %bb.0: # %entry
	; CHECK-64-NEXT: lfd 0, 0(3)			; CHECK-64-NEXT: lfd 0, 0(3)
	; CHECK-64-NEXT: xxpermdi 34, 0, 34, 1			; CHECK-64-NEXT: xxpermdi 34, 0, 34, 1
	; CHECK-64-NEXT: blr			; CHECK-64-NEXT: blr
	;			;
	; CHECK-32-LABEL: testDoubleImm2:			; CHECK-32-LABEL: testDoubleImm2:
	; CHECK-32: # %bb.0: # %entry			; CHECK-32: # %bb.0: # %entry
	; CHECK-32-NEXT: lfd 0, 0(3)			; CHECK-32-NEXT: lfd 0, 0(3)
	; CHECK-32-NEXT: xxpermdi 34, 0, 34, 1			; CHECK-32-NEXT: xxpermdi 34, 0, 34, 1
	; CHECK-32-NEXT: blr			; CHECK-32-NEXT: blr
				;
				; CHECK-64-P10-LABEL: testDoubleImm2:
				; CHECK-64-P10: # %bb.0: # %entry
				; CHECK-64-P10-NEXT: lfd 0, 0(3)
				; CHECK-64-P10-NEXT: xxpermdi 34, 0, 34, 1
				; CHECK-64-P10-NEXT: blr
				;
				; CHECK-32-P10-LABEL: testDoubleImm2:
				; CHECK-32-P10: # %bb.0: # %entry
				; CHECK-32-P10-NEXT: lfd 0, 0(3)
				; CHECK-32-P10-NEXT: xxpermdi 34, 0, 34, 1
				; CHECK-32-P10-NEXT: blr
	entry:			entry:
	%0 = bitcast i32* %b to double*			%0 = bitcast i32* %b to double*
	%1 = load double, double* %0, align 8			%1 = load double, double* %0, align 8
	%vecins = insertelement <2 x double> %a, double %1, i32 0			%vecins = insertelement <2 x double> %a, double %1, i32 0
	ret <2 x double> %vecins			ret <2 x double> %vecins
	}			}

	define <2 x double> @testDoubleImm3(<2 x double> %a, i32* %b) {			define <2 x double> @testDoubleImm3(<2 x double> %a, i32* %b) {
	; CHECK-64-LABEL: testDoubleImm3:			; CHECK-64-LABEL: testDoubleImm3:
	; CHECK-64: # %bb.0: # %entry			; CHECK-64: # %bb.0: # %entry
	; CHECK-64-NEXT: lfd 0, 4(3)			; CHECK-64-NEXT: lfd 0, 4(3)
	; CHECK-64-NEXT: xxpermdi 34, 0, 34, 1			; CHECK-64-NEXT: xxpermdi 34, 0, 34, 1
	; CHECK-64-NEXT: blr			; CHECK-64-NEXT: blr
	;			;
	; CHECK-32-LABEL: testDoubleImm3:			; CHECK-32-LABEL: testDoubleImm3:
	; CHECK-32: # %bb.0: # %entry			; CHECK-32: # %bb.0: # %entry
	; CHECK-32-NEXT: lfd 0, 4(3)			; CHECK-32-NEXT: lfd 0, 4(3)
	; CHECK-32-NEXT: xxpermdi 34, 0, 34, 1			; CHECK-32-NEXT: xxpermdi 34, 0, 34, 1
	; CHECK-32-NEXT: blr			; CHECK-32-NEXT: blr
				;
				; CHECK-64-P10-LABEL: testDoubleImm3:
				; CHECK-64-P10: # %bb.0: # %entry
				; CHECK-64-P10-NEXT: lfd 0, 4(3)
				; CHECK-64-P10-NEXT: xxpermdi 34, 0, 34, 1
				; CHECK-64-P10-NEXT: blr
				;
				; CHECK-32-P10-LABEL: testDoubleImm3:
				; CHECK-32-P10: # %bb.0: # %entry
				; CHECK-32-P10-NEXT: lfd 0, 4(3)
				; CHECK-32-P10-NEXT: xxpermdi 34, 0, 34, 1
				; CHECK-32-P10-NEXT: blr
	entry:			entry:
	%add.ptr = getelementptr inbounds i32, i32* %b, i64 1			%add.ptr = getelementptr inbounds i32, i32* %b, i64 1
	%0 = bitcast i32* %add.ptr to double*			%0 = bitcast i32* %add.ptr to double*
	%1 = load double, double* %0, align 8			%1 = load double, double* %0, align 8
	%vecins = insertelement <2 x double> %a, double %1, i32 0			%vecins = insertelement <2 x double> %a, double %1, i32 0
	ret <2 x double> %vecins			ret <2 x double> %vecins
	}			}

	define <2 x double> @testDoubleImm4(<2 x double> %a, i32* %b) {			define <2 x double> @testDoubleImm4(<2 x double> %a, i32* %b) {
	; CHECK-64-LABEL: testDoubleImm4:			; CHECK-64-LABEL: testDoubleImm4:
	; CHECK-64: # %bb.0: # %entry			; CHECK-64: # %bb.0: # %entry
	; CHECK-64-NEXT: lis 4, 4			; CHECK-64-NEXT: lis 4, 4
	; CHECK-64-NEXT: lfdx 0, 3, 4			; CHECK-64-NEXT: lfdx 0, 3, 4
	; CHECK-64-NEXT: xxpermdi 34, 0, 34, 1			; CHECK-64-NEXT: xxpermdi 34, 0, 34, 1
	; CHECK-64-NEXT: blr			; CHECK-64-NEXT: blr
	;			;
	; CHECK-32-LABEL: testDoubleImm4:			; CHECK-32-LABEL: testDoubleImm4:
	; CHECK-32: # %bb.0: # %entry			; CHECK-32: # %bb.0: # %entry
	; CHECK-32-NEXT: lis 4, 4			; CHECK-32-NEXT: lis 4, 4
	; CHECK-32-NEXT: lfdx 0, 3, 4			; CHECK-32-NEXT: lfdx 0, 3, 4
	; CHECK-32-NEXT: xxpermdi 34, 0, 34, 1			; CHECK-32-NEXT: xxpermdi 34, 0, 34, 1
	; CHECK-32-NEXT: blr			; CHECK-32-NEXT: blr
				;
				; CHECK-64-P10-LABEL: testDoubleImm4:
				; CHECK-64-P10: # %bb.0: # %entry
				; CHECK-64-P10-NEXT: lis 4, 4
				; CHECK-64-P10-NEXT: lfdx 0, 3, 4
				; CHECK-64-P10-NEXT: xxpermdi 34, 0, 34, 1
				; CHECK-64-P10-NEXT: blr
				;
				; CHECK-32-P10-LABEL: testDoubleImm4:
				; CHECK-32-P10: # %bb.0: # %entry
				; CHECK-32-P10-NEXT: lis 4, 4
				; CHECK-32-P10-NEXT: lfdx 0, 3, 4
				; CHECK-32-P10-NEXT: xxpermdi 34, 0, 34, 1
				; CHECK-32-P10-NEXT: blr
	entry:			entry:
	%add.ptr = getelementptr inbounds i32, i32* %b, i64 65536			%add.ptr = getelementptr inbounds i32, i32* %b, i64 65536
	%0 = bitcast i32* %add.ptr to double*			%0 = bitcast i32* %add.ptr to double*
	%1 = load double, double* %0, align 8			%1 = load double, double* %0, align 8
	%vecins = insertelement <2 x double> %a, double %1, i32 0			%vecins = insertelement <2 x double> %a, double %1, i32 0
	ret <2 x double> %vecins			ret <2 x double> %vecins
	}			}

	define <2 x double> @testDoubleImm5(<2 x double> %a, i32* %b) {			define <2 x double> @testDoubleImm5(<2 x double> %a, i32* %b) {
	; CHECK-64-LABEL: testDoubleImm5:			; CHECK-64-LABEL: testDoubleImm5:
	; CHECK-64: # %bb.0: # %entry			; CHECK-64: # %bb.0: # %entry
	; CHECK-64-NEXT: li 4, 1			; CHECK-64-NEXT: li 4, 1
	; CHECK-64-NEXT: rldic 4, 4, 38, 25			; CHECK-64-NEXT: rldic 4, 4, 38, 25
	; CHECK-64-NEXT: lfdx 0, 3, 4			; CHECK-64-NEXT: lfdx 0, 3, 4
	; CHECK-64-NEXT: xxpermdi 34, 0, 34, 1			; CHECK-64-NEXT: xxpermdi 34, 0, 34, 1
	; CHECK-64-NEXT: blr			; CHECK-64-NEXT: blr
	;			;
	; CHECK-32-LABEL: testDoubleImm5:			; CHECK-32-LABEL: testDoubleImm5:
	; CHECK-32: # %bb.0: # %entry			; CHECK-32: # %bb.0: # %entry
	; CHECK-32-NEXT: lfd 0, 0(3)			; CHECK-32-NEXT: lfd 0, 0(3)
	; CHECK-32-NEXT: xxpermdi 34, 0, 34, 1			; CHECK-32-NEXT: xxpermdi 34, 0, 34, 1
	; CHECK-32-NEXT: blr			; CHECK-32-NEXT: blr
				;
				; CHECK-64-P10-LABEL: testDoubleImm5:
				; CHECK-64-P10: # %bb.0: # %entry
				; CHECK-64-P10-NEXT: li 4, 1
				; CHECK-64-P10-NEXT: rldic 4, 4, 38, 25
				; CHECK-64-P10-NEXT: lfdx 0, 3, 4
				; CHECK-64-P10-NEXT: xxpermdi 34, 0, 34, 1
				; CHECK-64-P10-NEXT: blr
				;
				; CHECK-32-P10-LABEL: testDoubleImm5:
				; CHECK-32-P10: # %bb.0: # %entry
				; CHECK-32-P10-NEXT: lfd 0, 0(3)
				; CHECK-32-P10-NEXT: xxpermdi 34, 0, 34, 1
				; CHECK-32-P10-NEXT: blr
	entry:			entry:
	%add.ptr = getelementptr inbounds i32, i32* %b, i64 68719476736			%add.ptr = getelementptr inbounds i32, i32* %b, i64 68719476736
	%0 = bitcast i32* %add.ptr to double*			%0 = bitcast i32* %add.ptr to double*
	%1 = load double, double* %0, align 8			%1 = load double, double* %0, align 8
	%vecins = insertelement <2 x double> %a, double %1, i32 0			%vecins = insertelement <2 x double> %a, double %1, i32 0
	ret <2 x double> %vecins			ret <2 x double> %vecins
	}			}
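
A note on the constants in the checks above, since the 32-bit and 64-bit expectations differ for the same IR. The `getelementptr` index 68719476736 is 2^36; scaled by the element size it still fits in a 64-bit pointer (hence `li 4, 1` followed by `rldic 4, 4, 38, 25` for the `i32*` case), but on a 32-bit target the byte offset wraps to 0, which would explain `lwz 3, 0(3)` and `lfd 0, 0(3)` in the CHECK-32 lines. Likewise, the immediate on `vinsw` appears to be the element index scaled to a byte offset (index 2 gives 8). The sketch below only models that arithmetic; the helper names are hypothetical and are not part of LLVM, and treating offsets as unsigned values modulo the pointer width is a simplifying assumption that holds for the constants used in this test.

```python
# Hypothetical helpers: they model the offset arithmetic seen in the CHECK
# lines, not any LLVM API.

def gep_byte_offset(index: int, elem_size: int, ptr_bits: int) -> int:
    """Byte offset of a getelementptr, wrapped to the target pointer width.

    Assumption: the offset is treated as unsigned modulo 2**ptr_bits, which
    is sufficient for the non-negative constants in this test.
    """
    return (index * elem_size) % (1 << ptr_bits)


def insert_byte_offset(elem_index: int, elem_size: int) -> int:
    """Byte offset of a vector element, as used for the vinsw immediate."""
    return elem_index * elem_size


if __name__ == "__main__":
    big = 68719476736  # 2**36, the large GEP index used by the tests

    # i32* base: 2**36 * 4 == 2**38 on 64-bit, wraps to 0 on 32-bit.
    print(gep_byte_offset(big, 4, 64) == 1 << 38)
    print(gep_byte_offset(big, 4, 32))

    # i32* base, index 65536: 262144 bytes, i.e. 4 << 16 as built by `lis 4, 4`.
    print(gep_byte_offset(65536, 4, 64))

    # Word element 2 of a <4 x float>: byte offset 8, as in `vinsw 2, 3, 8`.
    print(insert_byte_offset(2, 4))
```

The same wrap-around applies to the `i8*` tests: 2^36 bytes is already a multiple of 2^32, so the second load in the 32-bit `testFloat3` and `testDouble3` checks also uses offset 0.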

[PowerPC] Enable safe for 32bit vins* P10 instructions (Closed, Public)

Diff 343455

llvm/lib/Target/PowerPC/PPCISelLowering.cpp

llvm/lib/Target/PowerPC/PPCInstrPrefix.td

llvm/test/CodeGen/PowerPC/aix-vec_insert_elt.ll
