This is an archive of the discontinued LLVM Phabricator instance.

Differential D50121

[PowerPC] Do not round values prior to converting to integer
ClosedPublic

Authored by nemanjai on Jul 31 2018, 5:40 PM.

Details

Reviewers

cuviper
tstellar
hfinkel
kbarton

Commits

rG63740db57a3c: Merging r338658: --------------------------------------------------------------…
rGe1a525ed06e1: [PowerPC] Do not round values prior to converting to integer
rL338678: Merging r338658:
rL338658: [PowerPC] Do not round values prior to converting to integer

Summary

As pointed out in https://bugs.llvm.org/show_bug.cgi?id=38342, adding the FP_ROUND prior to converting from double precision to 4-byte integers causes loss of precision. This patch fixes that bug while ensuring that we still do not need to scalarize these conversions.
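The kind of precision loss described in the summary can be reproduced in plain C++. The following is a minimal sketch, not taken from the patch or the bug report: the helper names and the test value are hypothetical, and the `static_cast<float>` only stands in for the effect of an FP_ROUND under the default round-to-nearest mode.

```cpp
#include <cstdint>

// Convert a double straight to a 32-bit integer (truncation toward zero).
inline std::int32_t convertDirect(double x) {
  return static_cast<std::int32_t>(x);
}

// Round to single precision first, then convert. This models the effect of
// an FP_ROUND placed in front of the conversion (hypothetical helper).
inline std::int32_t convertViaFloat(double x) {
  float rounded = static_cast<float>(x);
  return static_cast<std::int32_t>(rounded);
}

// 2^24 + 1 fits in an i32 and is exact as a double, but a float has only a
// 24-bit significand, so the value cannot be represented in single precision.
constexpr double kNotRepresentableAsFloat = 16777217.0;
```

With these definitions, `convertDirect(kNotRepresentableAsFloat)` yields 16777217, while `convertViaFloat(kNotRepresentableAsFloat)` yields 16777216 because the intermediate float rounds to the nearest representable value.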

Diff Detail

Repository: rL LLVM

Event Timeline

nemanjai created this revision. Jul 31 2018, 5:40 PM

Herald added a subscriber: kbarton. Jul 31 2018, 5:40 PM

nemanjai added a reviewer: kbarton. Jul 31 2018, 5:41 PM

hfinkel added inline comments. Jul 31 2018, 6:07 PM

lib/Target/PowerPC/PPCISelLowering.cpp
11702	Can you move this comment into the Is32Bit if block below? You say, "if we made it here, ...", but that actually only applies if Is32Bit is true.

One question I had after looking into this bug was why does PPCTargetLowering::LowerFP_TO_INTForReuse() extend f32 values to f64 before creating the FCT* instructions?

Place comment into the respective if block.

In D50121#1183765, @tstellar wrote:

One question I had after looking into this bug was why does PPCTargetLowering::LowerFP_TO_INTForReuse() extend f32 values to f64 before creating the FCT* instructions?

This is just a convenient canonicalization. On PPC hardware, both single and double precision scalar values have the same double precision representation in registers. There are instructions that round from double precision to single precision (essentially clearing bits that a single precision operand cannot have set and raising any exceptions that should result from the conversion). Furthermore, there are arithmetic instructions that operate on single precision values and produce values that would be unchanged by the aforementioned rounding. But an extend from single precision to double precision is a no-op. So adding an FP_EXTEND to keep everything neatly in f64 values makes sense.
Note that this only applies to floating point scalar values in registers. Vector floating point single precision values actually occupy a 32-bit element each and, of course, single precision values are stored in memory as 32-bit entities. So converting between single and double precision vectors of floating point also involves changing the in-register size of the operand.
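The asymmetry between extending and rounding can be checked in portable C++. This is only a sketch of the numeric property, not of the PPC register behaviour: the function names are hypothetical, and the casts stand in for FP_EXTEND and FP_ROUND respectively.

```cpp
// Extending f32 to f64 and rounding back returns the original value, so the
// extension never changes the number being converted.
inline bool extendIsLossless(float x) {
  double wide = static_cast<double>(x);    // stands in for FP_EXTEND
  float narrow = static_cast<float>(wide); // stands in for FP_ROUND
  return narrow == x;
}

// The reverse direction is not safe in general: rounding f64 to f32 and
// extending again may produce a different value.
inline bool roundIsLossless(double x) {
  float narrow = static_cast<float>(x);    // stands in for FP_ROUND
  return static_cast<double>(narrow) == x;
}
```

For example, `extendIsLossless(0.1f)` holds, and `roundIsLossless(0.5)` holds because 0.5 is exactly representable in both formats, whereas `roundIsLossless(0.1)` does not hold.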

LGTM

This revision is now accepted and ready to land. Aug 1 2018, 8:40 AM

Closed by commit rL338658: [PowerPC] Do not round values prior to converting to integer (authored by nemanjai). Aug 1 2018, 5:03 PM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path                                          Size
lib/Target/PowerPC/PPCISelLowering.cpp        22 lines
lib/Target/PowerPC/PPCInstrVSX.td             86 lines
test/CodeGen/PowerPC/build-vector-tests.ll    357 lines

Diff 158431

lib/Target/PowerPC/PPCISelLowering.cpp

Show First 20 Lines • Show All 11,626 Lines • ▼ Show 20 Lines	SDValue PPCTargetLowering::DAGCombineExtBoolTrunc(SDNode *N,
SDValue ShiftCst =		SDValue ShiftCst =
DAG.getConstant(N->getValueSizeInBits(0) - PromBits, dl, ShiftAmountTy);		DAG.getConstant(N->getValueSizeInBits(0) - PromBits, dl, ShiftAmountTy);
return DAG.getNode(		return DAG.getNode(
ISD::SRA, dl, N->getValueType(0),		ISD::SRA, dl, N->getValueType(0),
DAG.getNode(ISD::SHL, dl, N->getValueType(0), N->getOperand(0), ShiftCst),		DAG.getNode(ISD::SHL, dl, N->getValueType(0), N->getOperand(0), ShiftCst),
ShiftCst);		ShiftCst);
}		}

		// Is this an extending load from an f32 to an f64?
		static bool isFPExtLoad(SDValue Op) {
		if (LoadSDNode *LD = dyn_cast<LoadSDNode>(Op.getNode()))
		return LD->getExtensionType() == ISD::EXTLOAD &&
		Op.getValueType() == MVT::f64;
		return false;
		}

/// Reduces the number of fp-to-int conversion when building a vector.		/// Reduces the number of fp-to-int conversion when building a vector.
///		///
/// If this vector is built out of floating to integer conversions,		/// If this vector is built out of floating to integer conversions,
/// transform it to a vector built out of floating point values followed by a		/// transform it to a vector built out of floating point values followed by a
/// single floating to integer conversion of the vector.		/// single floating to integer conversion of the vector.
/// Namely (build_vector (fptosi $A), (fptosi $B), ...)		/// Namely (build_vector (fptosi $A), (fptosi $B), ...)
/// becomes (fptosi (build_vector ($A, $B, ...)))		/// becomes (fptosi (build_vector ($A, $B, ...)))
SDValue PPCTargetLowering::		SDValue PPCTargetLowering::
Show All 18 Lines	if (FirstConversion == PPCISD::FCTIDZ ||
FirstConversion == PPCISD::FCTIWUZ) {		FirstConversion == PPCISD::FCTIWUZ) {
bool IsSplat = true;		bool IsSplat = true;
bool Is32Bit = FirstConversion == PPCISD::FCTIWZ ||		bool Is32Bit = FirstConversion == PPCISD::FCTIWZ ||
FirstConversion == PPCISD::FCTIWUZ;		FirstConversion == PPCISD::FCTIWUZ;
EVT SrcVT = FirstInput.getOperand(0).getValueType();		EVT SrcVT = FirstInput.getOperand(0).getValueType();
SmallVector<SDValue, 4> Ops;		SmallVector<SDValue, 4> Ops;
EVT TargetVT = N->getValueType(0);		EVT TargetVT = N->getValueType(0);
for (int i = 0, e = N->getNumOperands(); i < e; ++i) {		for (int i = 0, e = N->getNumOperands(); i < e; ++i) {
if (N->getOperand(i).getOpcode() != PPCISD::MFVSR)		SDValue NextOp = N->getOperand(i);
		if (NextOp.getOpcode() != PPCISD::MFVSR)
return SDValue();		return SDValue();
unsigned NextConversion = N->getOperand(i).getOperand(0).getOpcode();		unsigned NextConversion = NextOp.getOperand(0).getOpcode();
if (NextConversion != FirstConversion)		if (NextConversion != FirstConversion)
return SDValue();		return SDValue();
		// If we are converting to 32-bit integers, we need to add an FP_ROUND.
		// This is not valid if the input was originally double precision. It is
		// also not profitable to do unless this is an extending load in which
		// case doing this combine will allow us to combine consecutive loads.
		if (Is32Bit && !isFPExtLoad(NextOp.getOperand(0).getOperand(0)))
		return SDValue();
if (N->getOperand(i) != FirstInput)		if (N->getOperand(i) != FirstInput)
IsSplat = false;		IsSplat = false;
}		}

// If this is a splat, we leave it as-is since there will be only a single		// If this is a splat, we leave it as-is since there will be only a single
// fp-to-int conversion followed by a splat of the integer. This is better		// fp-to-int conversion followed by a splat of the integer. This is better
// for 32-bit and smaller ints and neutral for 64-bit ints.		// for 32-bit and smaller ints and neutral for 64-bit ints.
if (IsSplat)		if (IsSplat)
return SDValue();		return SDValue();

// Now that we know we have the right type of node, get its operands		// Now that we know we have the right type of node, get its operands
for (int i = 0, e = N->getNumOperands(); i < e; ++i) {		for (int i = 0, e = N->getNumOperands(); i < e; ++i) {
SDValue In = N->getOperand(i).getOperand(0);		SDValue In = N->getOperand(i).getOperand(0);
// For 32-bit values, we need to add an FP_ROUND node.		// For 32-bit values, we need to add an FP_ROUND node (if we made it here,
		// we know that all inputs came from an extending load so this is safe).
if (Is32Bit) {		if (Is32Bit) {
if (In.isUndef())		if (In.isUndef())
Ops.push_back(DAG.getUNDEF(SrcVT));		Ops.push_back(DAG.getUNDEF(SrcVT));
else {		else {
SDValue Trunc = DAG.getNode(ISD::FP_ROUND, dl,		SDValue Trunc = DAG.getNode(ISD::FP_ROUND, dl,
MVT::f32, In.getOperand(0),		MVT::f32, In.getOperand(0),
DAG.getIntPtrConstant(1, dl));		DAG.getIntPtrConstant(1, dl));
Ops.push_back(Trunc);		Ops.push_back(Trunc);
▲ Show 20 Lines • Show All 2,342 Lines • Show Last 20 Lines

lib/Target/PowerPC/PPCInstrVSX.td

Show First 20 Lines • Show All 3,392 Lines • ▼ Show 20 Lines

def DblToFlt {		def DblToFlt {
dag A0 = (f32 (fpround (f64 (extractelt v2f64:$A, 0))));		dag A0 = (f32 (fpround (f64 (extractelt v2f64:$A, 0))));
dag A1 = (f32 (fpround (f64 (extractelt v2f64:$A, 1))));		dag A1 = (f32 (fpround (f64 (extractelt v2f64:$A, 1))));
dag B0 = (f32 (fpround (f64 (extractelt v2f64:$B, 0))));		dag B0 = (f32 (fpround (f64 (extractelt v2f64:$B, 0))));
dag B1 = (f32 (fpround (f64 (extractelt v2f64:$B, 1))));		dag B1 = (f32 (fpround (f64 (extractelt v2f64:$B, 1))));
}		}

		def ExtDbl {
		dag A0S = (i32 (PPCmfvsr (f64 (PPCfctiwz (f64 (extractelt v2f64:$A, 0))))));
		dag A1S = (i32 (PPCmfvsr (f64 (PPCfctiwz (f64 (extractelt v2f64:$A, 1))))));
		dag B0S = (i32 (PPCmfvsr (f64 (PPCfctiwz (f64 (extractelt v2f64:$B, 0))))));
		dag B1S = (i32 (PPCmfvsr (f64 (PPCfctiwz (f64 (extractelt v2f64:$B, 1))))));
		dag A0U = (i32 (PPCmfvsr (f64 (PPCfctiwuz (f64 (extractelt v2f64:$A, 0))))));
		dag A1U = (i32 (PPCmfvsr (f64 (PPCfctiwuz (f64 (extractelt v2f64:$A, 1))))));
		dag B0U = (i32 (PPCmfvsr (f64 (PPCfctiwuz (f64 (extractelt v2f64:$B, 0))))));
		dag B1U = (i32 (PPCmfvsr (f64 (PPCfctiwuz (f64 (extractelt v2f64:$B, 1))))));
		}

def ByteToWord {		def ByteToWord {
dag LE_A0 = (i32 (sext_inreg (i32 (vector_extract v16i8:$A, 0)), i8));		dag LE_A0 = (i32 (sext_inreg (i32 (vector_extract v16i8:$A, 0)), i8));
dag LE_A1 = (i32 (sext_inreg (i32 (vector_extract v16i8:$A, 4)), i8));		dag LE_A1 = (i32 (sext_inreg (i32 (vector_extract v16i8:$A, 4)), i8));
dag LE_A2 = (i32 (sext_inreg (i32 (vector_extract v16i8:$A, 8)), i8));		dag LE_A2 = (i32 (sext_inreg (i32 (vector_extract v16i8:$A, 8)), i8));
dag LE_A3 = (i32 (sext_inreg (i32 (vector_extract v16i8:$A, 12)), i8));		dag LE_A3 = (i32 (sext_inreg (i32 (vector_extract v16i8:$A, 12)), i8));
dag BE_A0 = (i32 (sext_inreg (i32 (vector_extract v16i8:$A, 3)), i8));		dag BE_A0 = (i32 (sext_inreg (i32 (vector_extract v16i8:$A, 3)), i8));
dag BE_A1 = (i32 (sext_inreg (i32 (vector_extract v16i8:$A, 7)), i8));		dag BE_A1 = (i32 (sext_inreg (i32 (vector_extract v16i8:$A, 7)), i8));
dag BE_A2 = (i32 (sext_inreg (i32 (vector_extract v16i8:$A, 11)), i8));		dag BE_A2 = (i32 (sext_inreg (i32 (vector_extract v16i8:$A, 11)), i8));
▲ Show 20 Lines • Show All 61 Lines • ▼ Show 20 Lines
def FltToLong {		def FltToLong {
dag A = (i64 (PPCmfvsr (f64 (PPCfctidz (fpextend f32:$A)))));		dag A = (i64 (PPCmfvsr (f64 (PPCfctidz (fpextend f32:$A)))));
}		}
def FltToULong {		def FltToULong {
dag A = (i64 (PPCmfvsr (f64 (PPCfctiduz (fpextend f32:$A)))));		dag A = (i64 (PPCmfvsr (f64 (PPCfctiduz (fpextend f32:$A)))));
}		}
def DblToInt {		def DblToInt {
dag A = (i32 (PPCmfvsr (f64 (PPCfctiwz f64:$A))));		dag A = (i32 (PPCmfvsr (f64 (PPCfctiwz f64:$A))));
		dag B = (i32 (PPCmfvsr (f64 (PPCfctiwz f64:$B))));
		dag C = (i32 (PPCmfvsr (f64 (PPCfctiwz f64:$C))));
		dag D = (i32 (PPCmfvsr (f64 (PPCfctiwz f64:$D))));
}		}
def DblToUInt {		def DblToUInt {
dag A = (i32 (PPCmfvsr (f64 (PPCfctiwuz f64:$A))));		dag A = (i32 (PPCmfvsr (f64 (PPCfctiwuz f64:$A))));
		dag B = (i32 (PPCmfvsr (f64 (PPCfctiwuz f64:$B))));
		dag C = (i32 (PPCmfvsr (f64 (PPCfctiwuz f64:$C))));
		dag D = (i32 (PPCmfvsr (f64 (PPCfctiwuz f64:$D))));
}		}
def DblToLong {		def DblToLong {
dag A = (i64 (PPCmfvsr (f64 (PPCfctidz f64:$A))));		dag A = (i64 (PPCmfvsr (f64 (PPCfctidz f64:$A))));
}		}
def DblToULong {		def DblToULong {
dag A = (i64 (PPCmfvsr (f64 (PPCfctiduz f64:$A))));		dag A = (i64 (PPCmfvsr (f64 (PPCfctiduz f64:$A))));
}		}
def DblToIntLoad {		def DblToIntLoad {
Show All 22 Lines	def MrgFP {
dag BD = (XVCVDPSP (XXPERMDI (COPY_TO_REGCLASS $B, VSRC),		dag BD = (XVCVDPSP (XXPERMDI (COPY_TO_REGCLASS $B, VSRC),
(COPY_TO_REGCLASS $D, VSRC), 0));		(COPY_TO_REGCLASS $D, VSRC), 0));
dag ABhToFlt = (XVCVDPSP (XXPERMDI $A, $B, 0));		dag ABhToFlt = (XVCVDPSP (XXPERMDI $A, $B, 0));
dag ABlToFlt = (XVCVDPSP (XXPERMDI $A, $B, 3));		dag ABlToFlt = (XVCVDPSP (XXPERMDI $A, $B, 3));
dag BAhToFlt = (XVCVDPSP (XXPERMDI $B, $A, 0));		dag BAhToFlt = (XVCVDPSP (XXPERMDI $B, $A, 0));
dag BAlToFlt = (XVCVDPSP (XXPERMDI $B, $A, 3));		dag BAlToFlt = (XVCVDPSP (XXPERMDI $B, $A, 3));
}		}

		// Word-element merge dags - conversions from f64 to i32 merged into vectors.
		def MrgWords {
		// For big endian, we merge low and hi doublewords (A, B).
		dag A0B0 = (v2f64 (XXPERMDI v2f64:$A, v2f64:$B, 0));
		dag A1B1 = (v2f64 (XXPERMDI v2f64:$A, v2f64:$B, 3));
		dag CVA1B1S = (v4i32 (XVCVDPSXWS A1B1));
		dag CVA0B0S = (v4i32 (XVCVDPSXWS A0B0));
		dag CVA1B1U = (v4i32 (XVCVDPUXWS A1B1));
		dag CVA0B0U = (v4i32 (XVCVDPUXWS A0B0));

		// For little endian, we merge low and hi doublewords (B, A).
		dag B1A1 = (v2f64 (XXPERMDI v2f64:$B, v2f64:$A, 0));
		dag B0A0 = (v2f64 (XXPERMDI v2f64:$B, v2f64:$A, 3));
		dag CVB1A1S = (v4i32 (XVCVDPSXWS B1A1));
		dag CVB0A0S = (v4i32 (XVCVDPSXWS B0A0));
		dag CVB1A1U = (v4i32 (XVCVDPUXWS B1A1));
		dag CVB0A0U = (v4i32 (XVCVDPUXWS B0A0));

		// For big endian, we merge hi doublewords of (A, C) and (B, D), convert
		// then merge.
		dag AC = (v2f64 (XXPERMDI (COPY_TO_REGCLASS f64:$A, VSRC),
		(COPY_TO_REGCLASS f64:$C, VSRC), 0));
		dag BD = (v2f64 (XXPERMDI (COPY_TO_REGCLASS f64:$B, VSRC),
		(COPY_TO_REGCLASS f64:$D, VSRC), 0));
		dag CVACS = (v4i32 (XVCVDPSXWS AC));
		dag CVBDS = (v4i32 (XVCVDPSXWS BD));
		dag CVACU = (v4i32 (XVCVDPUXWS AC));
		dag CVBDU = (v4i32 (XVCVDPUXWS BD));

		// For little endian, we merge hi doublewords of (D, B) and (C, A), convert
		// then merge.
		dag DB = (v2f64 (XXPERMDI (COPY_TO_REGCLASS f64:$D, VSRC),
		(COPY_TO_REGCLASS f64:$B, VSRC), 0));
		dag CA = (v2f64 (XXPERMDI (COPY_TO_REGCLASS f64:$C, VSRC),
		(COPY_TO_REGCLASS f64:$A, VSRC), 0));
		dag CVDBS = (v4i32 (XVCVDPSXWS DB));
		dag CVCAS = (v4i32 (XVCVDPSXWS CA));
		dag CVDBU = (v4i32 (XVCVDPUXWS DB));
		dag CVCAU = (v4i32 (XVCVDPUXWS CA));
		}

// Patterns for BUILD_VECTOR nodes.		// Patterns for BUILD_VECTOR nodes.
let AddedComplexity = 400 in {		let AddedComplexity = 400 in {

let Predicates = [HasVSX] in {		let Predicates = [HasVSX] in {
// Build vectors of floating point converted to i32.		// Build vectors of floating point converted to i32.
def : Pat<(v4i32 (build_vector DblToInt.A, DblToInt.A,		def : Pat<(v4i32 (build_vector DblToInt.A, DblToInt.A,
DblToInt.A, DblToInt.A)),		DblToInt.A, DblToInt.A)),
(v4i32 (XXSPLTW (COPY_TO_REGCLASS (XSCVDPSXWS $A), VSRC), 1))>;		(v4i32 (XXSPLTW (COPY_TO_REGCLASS (XSCVDPSXWS $A), VSRC), 1))>;
▲ Show 20 Lines • Show All 51 Lines • ▼ Show 20 Lines	def : Pat<(v2f64 (build_vector f64:$A, f64:$B)),
(COPY_TO_REGCLASS $A, VSRC),		(COPY_TO_REGCLASS $A, VSRC),
(COPY_TO_REGCLASS $B, VSRC), 0))>;		(COPY_TO_REGCLASS $B, VSRC), 0))>;

def : Pat<(v4f32 (build_vector f32:$A, f32:$B, f32:$C, f32:$D)),		def : Pat<(v4f32 (build_vector f32:$A, f32:$B, f32:$C, f32:$D)),
(VMRGEW MrgFP.AC, MrgFP.BD)>;		(VMRGEW MrgFP.AC, MrgFP.BD)>;
def : Pat<(v4f32 (build_vector DblToFlt.A0, DblToFlt.A1,		def : Pat<(v4f32 (build_vector DblToFlt.A0, DblToFlt.A1,
DblToFlt.B0, DblToFlt.B1)),		DblToFlt.B0, DblToFlt.B1)),
(v4f32 (VMRGEW MrgFP.ABhToFlt, MrgFP.ABlToFlt))>;		(v4f32 (VMRGEW MrgFP.ABhToFlt, MrgFP.ABlToFlt))>;

		// Convert 4 doubles to a vector of ints.
		def : Pat<(v4i32 (build_vector DblToInt.A, DblToInt.B,
		DblToInt.C, DblToInt.D)),
		(v4i32 (VMRGEW MrgWords.CVACS, MrgWords.CVBDS))>;
		def : Pat<(v4i32 (build_vector DblToUInt.A, DblToUInt.B,
		DblToUInt.C, DblToUInt.D)),
		(v4i32 (VMRGEW MrgWords.CVACU, MrgWords.CVBDU))>;
		def : Pat<(v4i32 (build_vector ExtDbl.A0S, ExtDbl.A1S,
		ExtDbl.B0S, ExtDbl.B1S)),
		(v4i32 (VMRGEW MrgWords.CVA0B0S, MrgWords.CVA1B1S))>;
		def : Pat<(v4i32 (build_vector ExtDbl.A0U, ExtDbl.A1U,
		ExtDbl.B0U, ExtDbl.B1U)),
		(v4i32 (VMRGEW MrgWords.CVA0B0U, MrgWords.CVA1B1U))>;
}		}

let Predicates = [IsLittleEndian, HasVSX] in {		let Predicates = [IsLittleEndian, HasVSX] in {
// Little endian, available on all targets with VSX		// Little endian, available on all targets with VSX
def : Pat<(v2f64 (build_vector f64:$A, f64:$B)),		def : Pat<(v2f64 (build_vector f64:$A, f64:$B)),
(v2f64 (XXPERMDI		(v2f64 (XXPERMDI
(COPY_TO_REGCLASS $B, VSRC),		(COPY_TO_REGCLASS $B, VSRC),
(COPY_TO_REGCLASS $A, VSRC), 0))>;		(COPY_TO_REGCLASS $A, VSRC), 0))>;

def : Pat<(v4f32 (build_vector f32:$D, f32:$C, f32:$B, f32:$A)),		def : Pat<(v4f32 (build_vector f32:$D, f32:$C, f32:$B, f32:$A)),
(VMRGEW MrgFP.AC, MrgFP.BD)>;		(VMRGEW MrgFP.AC, MrgFP.BD)>;
def : Pat<(v4f32 (build_vector DblToFlt.A0, DblToFlt.A1,		def : Pat<(v4f32 (build_vector DblToFlt.A0, DblToFlt.A1,
DblToFlt.B0, DblToFlt.B1)),		DblToFlt.B0, DblToFlt.B1)),
(v4f32 (VMRGEW MrgFP.BAhToFlt, MrgFP.BAlToFlt))>;		(v4f32 (VMRGEW MrgFP.BAhToFlt, MrgFP.BAlToFlt))>;

		// Convert 4 doubles to a vector of ints.
		def : Pat<(v4i32 (build_vector DblToInt.A, DblToInt.B,
		DblToInt.C, DblToInt.D)),
		(v4i32 (VMRGEW MrgWords.CVDBS, MrgWords.CVCAS))>;
		def : Pat<(v4i32 (build_vector DblToUInt.A, DblToUInt.B,
		DblToUInt.C, DblToUInt.D)),
		(v4i32 (VMRGEW MrgWords.CVDBU, MrgWords.CVCAU))>;
		def : Pat<(v4i32 (build_vector ExtDbl.A0S, ExtDbl.A1S,
		ExtDbl.B0S, ExtDbl.B1S)),
		(v4i32 (VMRGEW MrgWords.CVB1A1S, MrgWords.CVB0A0S))>;
		def : Pat<(v4i32 (build_vector ExtDbl.A0U, ExtDbl.A1U,
		ExtDbl.B0U, ExtDbl.B1U)),
		(v4i32 (VMRGEW MrgWords.CVB1A1U, MrgWords.CVB0A0U))>;
}		}

let Predicates = [HasDirectMove] in {		let Predicates = [HasDirectMove] in {
// Endianness-neutral constant splat on P8 and newer targets. The reason		// Endianness-neutral constant splat on P8 and newer targets. The reason
// for this pattern is that on targets with direct moves, we don't expand		// for this pattern is that on targets with direct moves, we don't expand
// BUILD_VECTOR nodes for v4i32.		// BUILD_VECTOR nodes for v4i32.
def : Pat<(v4i32 (build_vector immSExt5NonZero:$A, immSExt5NonZero:$A,		def : Pat<(v4i32 (build_vector immSExt5NonZero:$A, immSExt5NonZero:$A,
immSExt5NonZero:$A, immSExt5NonZero:$A)),		immSExt5NonZero:$A, immSExt5NonZero:$A)),
▲ Show 20 Lines • Show All 145 Lines • Show Last 20 Lines
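The big endian "Convert 4 doubles to a vector of ints" pattern above can be read as a lane-by-lane computation. The sketch below models it in scalar C++ under stated assumptions: the type aliases and function names are hypothetical, the unused words 1 and 3 of the conversion result are simply set to zero, and out-of-range inputs are not modelled. It is meant to show the lane order only, not to stand in for the instruction definitions.

```cpp
#include <array>
#include <cstdint>

using V2F64 = std::array<double, 2>;
using V4I32 = std::array<std::int32_t, 4>;

// Models (XXPERMDI X, Y, 0) on two scalars copied into vector registers:
// the high doubleword of each input ends up side by side.
inline V2F64 mergeHighDoublewords(double x, double y) { return {x, y}; }

// Models XVCVDPSXWS: each double is converted directly to a signed word with
// truncation, with no intermediate rounding to single precision. The results
// are placed in words 0 and 2; words 1 and 3 are zeroed here (assumption).
inline V4I32 convertDoublesToWords(const V2F64 &v) {
  return {static_cast<std::int32_t>(v[0]), 0,
          static_cast<std::int32_t>(v[1]), 0};
}

// Models VMRGEW: interleave the even-numbered words of both inputs.
inline V4I32 mergeEvenWords(const V4I32 &a, const V4I32 &b) {
  return {a[0], b[0], a[2], b[2]};
}

// Big endian pattern: VMRGEW MrgWords.CVACS, MrgWords.CVBDS.
inline V4I32 buildVectorBigEndian(double a, double b, double c, double d) {
  V4I32 ac = convertDoublesToWords(mergeHighDoublewords(a, c));
  V4I32 bd = convertDoublesToWords(mergeHighDoublewords(b, d));
  return mergeEvenWords(ac, bd);
}
```

Calling `buildVectorBigEndian(a, b, c, d)` returns the four converted values in the order a, b, c, d, which is why the patterns pair (A, C) and (B, D) before the merge rather than (A, B) and (C, D).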

test/CodeGen/PowerPC/build-vector-tests.ll

Show First 20 Lines • Show All 113 Lines • ▼ Show 20 Lines
;vector int spltMemVali(int *ptr) { //		;vector int spltMemVali(int *ptr) { //
; return (vector int)*ptr; //		; return (vector int)*ptr; //
;} //		;} //
;// P8: vspltisw //		;// P8: vspltisw //
;// P9: vspltisw //		;// P9: vspltisw //
;vector int spltCnstConvftoi() { //		;vector int spltCnstConvftoi() { //
; return (vector int) 4.74f; //		; return (vector int) 4.74f; //
;} //		;} //
;// P8: 2 x xxmrghd, 2 x xvcvdpsp, vmrgew, xvcvspsxws //		;// P8: 2 x xxmrghd, 2 x xvcvspsxws, vmrgew //
;// P9: 2 x xxmrghd, 2 x xvcvdpsp, vmrgew, xvcvdpsxws //		;// P9: 2 x xxmrghd, 2 x xvcvspsxws, vmrgew //
;vector int fromRegsConvftoi(float a, float b, float c, float d) { //		;vector int fromRegsConvftoi(float a, float b, float c, float d) { //
; return (vector int) { a, b, c, d }; //		; return (vector int) { a, b, c, d }; //
;} //		;} //
;// P8: lxvd2x, xxswapd //		;// P8: lxvd2x, xxswapd //
;// P9: lxvx (even lxv) //		;// P9: lxvx (even lxv) //
;vector int fromDiffConstsConvftoi() { //		;vector int fromDiffConstsConvftoi() { //
; return (vector int) { 24.46f, 234.f, 988.19f, 422.39f }; //		; return (vector int) { 24.46f, 234.f, 988.19f, 422.39f }; //
;} //		;} //
;// P8: lxvd2x, xxswapd, xvcvspsxws //		;// P8: lxvd2x, xxswapd, xvcvspsxws //
;// P9: lxvx, xvcvspsxws //		;// P9: lxvx, xvcvspsxws //
;vector int fromDiffMemConsAConvftoi(float *ptr) { //		;vector int fromDiffMemConsAConvftoi(float *ptr) { //
; return (vector int) { ptr[0], ptr[1], ptr[2], ptr[3] }; //		; return (vector int) { ptr[0], ptr[1], ptr[2], ptr[3] }; //
;} //		;} //
;// P8: 2 x lxvd2x, 2 x xxswapd, vperm, xvcvspsxws //		;// P8: 2 x lxvd2x, 2 x xxswapd, vperm, xvcvspsxws //
;// P9: 2 x lxvx, vperm, xvcvspsxws //		;// P9: 2 x lxvx, vperm, xvcvspsxws //
;vector int fromDiffMemConsDConvftoi(float *ptr) { //		;vector int fromDiffMemConsDConvftoi(float *ptr) { //
; return (vector int) { ptr[3], ptr[2], ptr[1], ptr[0] }; //		; return (vector int) { ptr[3], ptr[2], ptr[1], ptr[0] }; //
;} //		;} //
;// P8: 4 x lxsspx, 2 x xxmrghd, 2 x xvcvdpsp, vmrgew, xvcvspsxws //		;// P8: 4 x lxsspx, 2 x xxmrghd, 2 x xvcvspsxws, vmrgew //
;// P9: 4 x lxssp, 2 x xxmrghd, 2 x xvcvdpsp, vmrgew, xvcvspsxws //		;// P9: 4 x lxssp, 2 x xxmrghd, 2 x xvcvspsxws, vmrgew //
;// Note: if the consecutive loads learns to handle pre-inc, this can be: //		;// Note: if the consecutive loads learns to handle pre-inc, this can be: //
;// sldi 2, load, xvcvspuxws //		;// sldi 2, load, xvcvspuxws //
;vector int fromDiffMemVarAConvftoi(float *arr, int elem) { //		;vector int fromDiffMemVarAConvftoi(float *arr, int elem) { //
; return (vector int) { arr[elem], arr[elem+1], arr[elem+2], arr[elem+3] }; //		; return (vector int) { arr[elem], arr[elem+1], arr[elem+2], arr[elem+3] }; //
;} //		;} //
;// P8: 4 x lxsspx, 2 x xxmrghd, 2 x xvcvdpsp, vmrgew, xvcvspsxws //		;// P8: 4 x lxsspx, 2 x xxmrghd, 2 x xvcvspsxws, vmrgew //
;// P9: 4 x lxssp, 2 x xxmrghd, 2 x xvcvdpsp, vmrgew, xvcvspsxws //		;// P9: 4 x lxssp, 2 x xxmrghd, 2 x xvcvspsxws, vmrgew //
;// Note: if the consecutive loads learns to handle pre-inc, this can be: //		;// Note: if the consecutive loads learns to handle pre-inc, this can be: //
;// sldi 2, 2 x load, vperm, xvcvspuxws //		;// sldi 2, 2 x load, vperm, xvcvspuxws //
;vector int fromDiffMemVarDConvftoi(float *arr, int elem) { //		;vector int fromDiffMemVarDConvftoi(float *arr, int elem) { //
; return (vector int) { arr[elem], arr[elem-1], arr[elem-2], arr[elem-3] }; //		; return (vector int) { arr[elem], arr[elem-1], arr[elem-2], arr[elem-3] }; //
;} //		;} //
;// P8: xscvdpsxws, xxspltw //		;// P8: xscvdpsxws, xxspltw //
;// P9: xscvdpsxws, xxspltw //		;// P9: xscvdpsxws, xxspltw //
;vector int spltRegValConvftoi(float val) { //		;vector int spltRegValConvftoi(float val) { //
; return (vector int) val; //		; return (vector int) val; //
;} //		;} //
;// P8: lxsspx, xscvdpsxws, xxspltw //		;// P8: lxsspx, xscvdpsxws, xxspltw //
;// P9: lxvwsx, xvcvspsxws //		;// P9: lxvwsx, xvcvspsxws //
;vector int spltMemValConvftoi(float *ptr) { //		;vector int spltMemValConvftoi(float *ptr) { //
; return (vector int)*ptr; //		; return (vector int)*ptr; //
;} //		;} //
;// P8: vspltisw //		;// P8: vspltisw //
;// P9: vspltisw //		;// P9: vspltisw //
;vector int spltCnstConvdtoi() { //		;vector int spltCnstConvdtoi() { //
; return (vector int) 4.74; //		; return (vector int) 4.74; //
;} //		;} //
;// P8: 2 x xxmrghd, 2 x xvcvdpsp, vmrgew, xvcvspsxws //		;// P8: 2 x xxmrghd, 2 x xvcvspsxws, vmrgew //
;// P9: 2 x xxmrghd, 2 x xvcvdpsp, vmrgew, xvcvspsxws //		;// P9: 2 x xxmrghd, 2 x xvcvspsxws, vmrgew //
;vector int fromRegsConvdtoi(double a, double b, double c, double d) { //		;vector int fromRegsConvdtoi(double a, double b, double c, double d) { //
; return (vector int) { a, b, c, d }; //		; return (vector int) { a, b, c, d }; //
;} //		;} //
;// P8: lxvd2x, xxswapd //		;// P8: lxvd2x, xxswapd //
;// P9: lxvx (even lxv) //		;// P9: lxvx (even lxv) //
;vector int fromDiffConstsConvdtoi() { //		;vector int fromDiffConstsConvdtoi() { //
; return (vector int) { 24.46, 234., 988.19, 422.39 }; //		; return (vector int) { 24.46, 234., 988.19, 422.39 }; //
;} //		;} //
;// P8: 2 x lxvd2x, 2 x xxswapd, xxmrgld, xxmrghd, 2 x xvcvdpsp, vmrgew, //		;// P8: 2 x lxvd2x, 2 x xxswapd, xxmrgld, xxmrghd, 2 x xvcvspsxws, vmrgew //
;// xvcvspsxws //		;// P9: 2 x lxvx, 2 x xxswapd, xxmrgld, xxmrghd, 2 x xvcvspsxws, vmrgew //
;// P9: 2 x lxvx, 2 x xxswapd, xxmrgld, xxmrghd, 2 x xvcvdpsp, vmrgew, //
;// xvcvspsxws //
;vector int fromDiffMemConsAConvdtoi(double *ptr) { //		;vector int fromDiffMemConsAConvdtoi(double *ptr) { //
; return (vector int) { ptr[0], ptr[1], ptr[2], ptr[3] }; //		; return (vector int) { ptr[0], ptr[1], ptr[2], ptr[3] }; //
;} //		;} //
;// P8: 4 x lxsdx, 2 x xxmrghd, 2 x xvcvdpsp, vmrgew, xvcvspsxws //		;// P8: 4 x lxsdx, 2 x xxmrghd, 2 x xvcvspsxws, vmrgew //
;// P9: 4 x lfd, 2 x xxmrghd, 2 x xvcvdpsp, vmrgew, xvcvspsxws //		;// P9: 4 x lfd, 2 x xxmrghd, 2 x xvcvspsxws, vmrgew //
;vector int fromDiffMemConsDConvdtoi(double *ptr) { //		;vector int fromDiffMemConsDConvdtoi(double *ptr) { //
; return (vector int) { ptr[3], ptr[2], ptr[1], ptr[0] }; //		; return (vector int) { ptr[3], ptr[2], ptr[1], ptr[0] }; //
;} //		;} //
;// P8: lfdux, 3 x lxsdx, 2 x xxmrghd, 2 x xvcvdpsp, vmrgew, xvcvspsxws //		;// P8: lfdux, 3 x lxsdx, 2 x xxmrghd, 2 x xvcvspsxws, vmrgew //
;// P9: lfdux, 3 x lfd, 2 x xxmrghd, 2 x xvcvdpsp, vmrgew, xvcvspsxws //		;// P9: lfdux, 3 x lfd, 2 x xxmrghd, 2 x xvcvspsxws, vmrgew //
;vector int fromDiffMemVarAConvdtoi(double *arr, int elem) { //		;vector int fromDiffMemVarAConvdtoi(double *arr, int elem) { //
; return (vector int) { arr[elem], arr[elem+1], arr[elem+2], arr[elem+3] }; //		; return (vector int) { arr[elem], arr[elem+1], arr[elem+2], arr[elem+3] }; //
;} //		;} //
;// P8: lfdux, 3 x lxsdx, 2 x xxmrghd, 2 x xvcvdpsp, vmrgew, xvcvspsxws //		;// P8: lfdux, 3 x lxsdx, 2 x xxmrghd, 2 x xvcvspsxws, vmrgew //
;// P9: lfdux, 3 x lfd, 2 x xxmrghd, 2 x xvcvdpsp, vmrgew, xvcvspsxws //		;// P9: lfdux, 3 x lfd, 2 x xxmrghd, 2 x xvcvspsxws, vmrgew //
;vector int fromDiffMemVarDConvdtoi(double *arr, int elem) { //		;vector int fromDiffMemVarDConvdtoi(double *arr, int elem) { //
; return (vector int) { arr[elem], arr[elem-1], arr[elem-2], arr[elem-3] }; //		; return (vector int) { arr[elem], arr[elem-1], arr[elem-2], arr[elem-3] }; //
;} //		;} //
;// P8: xscvdpsxws, xxspltw //		;// P8: xscvdpsxws, xxspltw //
;// P9: xscvdpsxws, xxspltw //		;// P9: xscvdpsxws, xxspltw //
;vector int spltRegValConvdtoi(double val) { //		;vector int spltRegValConvdtoi(double val) { //
; return (vector int) val; //		; return (vector int) val; //
;} //		;} //
▲ Show 20 Lines • Show All 83 Lines • ▼ Show 20 Lines
;vector unsigned int spltMemValui(unsigned int *ptr) { //		;vector unsigned int spltMemValui(unsigned int *ptr) { //
; return (vector unsigned int)*ptr; //		; return (vector unsigned int)*ptr; //
;} //		;} //
;// P8: vspltisw //		;// P8: vspltisw //
;// P9: vspltisw //		;// P9: vspltisw //
;vector unsigned int spltCnstConvftoui() { //		;vector unsigned int spltCnstConvftoui() { //
; return (vector unsigned int) 4.74f; //		; return (vector unsigned int) 4.74f; //
;} //		;} //
;// P8: 2 x xxmrghd, 2 x xvcvdpsp, vmrgew, xvcvspuxws //		;// P8: 2 x xxmrghd, 2 x xvcvspuxws, vmrgew //
;// P9: 2 x xxmrghd, 2 x xvcvdpsp, vmrgew, xvcvspuxws //		;// P9: 2 x xxmrghd, 2 x xvcvspuxws, vmrgew //
;vector unsigned int fromRegsConvftoui(float a, float b, float c, float d) { //		;vector unsigned int fromRegsConvftoui(float a, float b, float c, float d) { //
; return (vector unsigned int) { a, b, c, d }; //		; return (vector unsigned int) { a, b, c, d }; //
;} //		;} //
;// P8: lxvd2x, xxswapd //		;// P8: lxvd2x, xxswapd //
;// P9: lxvx (even lxv) //		;// P9: lxvx (even lxv) //
;vector unsigned int fromDiffConstsConvftoui() { //		;vector unsigned int fromDiffConstsConvftoui() { //
; return (vector unsigned int) { 24.46f, 234.f, 988.19f, 422.39f }; //		; return (vector unsigned int) { 24.46f, 234.f, 988.19f, 422.39f }; //
;} //		;} //
;// P8: lxvd2x, xxswapd, xvcvspuxws //		;// P8: lxvd2x, xxswapd, xvcvspuxws //
;// P9: lxvx, xvcvspuxws //		;// P9: lxvx, xvcvspuxws //
;vector unsigned int fromDiffMemConsAConvftoui(float *ptr) { //		;vector unsigned int fromDiffMemConsAConvftoui(float *ptr) { //
; return (vector unsigned int) { ptr[0], ptr[1], ptr[2], ptr[3] }; //		; return (vector unsigned int) { ptr[0], ptr[1], ptr[2], ptr[3] }; //
;} //		;} //
;// P8: 2 x lxvd2x, 2 x xxswapd, vperm, xvcvspuxws //		;// P8: 2 x lxvd2x, 2 x xxswapd, vperm, xvcvspuxws //
;// P9: 2 x lxvx, vperm, xvcvspuxws //		;// P9: 2 x lxvx, vperm, xvcvspuxws //
;vector unsigned int fromDiffMemConsDConvftoui(float *ptr) { //		;vector unsigned int fromDiffMemConsDConvftoui(float *ptr) { //
; return (vector unsigned int) { ptr[3], ptr[2], ptr[1], ptr[0] }; //		; return (vector unsigned int) { ptr[3], ptr[2], ptr[1], ptr[0] }; //
;} //		;} //
;// P8: lfsux, 3 x lxsspx, 2 x xxmrghd, 2 x xvcvdpsp, vmrgew, xvcvspuxws //		;// P8: lfsux, 3 x lxsspx, 2 x xxmrghd, 2 x xvcvspuxws, vmrgew //
;// P9: lfsux, 3 x lfs, 2 x xxmrghd, 2 x xvcvdpsp, vmrgew, xvcvspuxws //		;// P9: lfsux, 3 x lfs, 2 x xxmrghd, 2 x xvcvspuxws, vmrgew //
;// Note: if the consecutive loads learns to handle pre-inc, this can be: //		;// Note: if the consecutive loads learns to handle pre-inc, this can be: //
;// sldi 2, load, xvcvspuxws //		;// sldi 2, load, xvcvspuxws //
;vector unsigned int fromDiffMemVarAConvftoui(float *arr, int elem) { //		;vector unsigned int fromDiffMemVarAConvftoui(float *arr, int elem) { //
; return (vector unsigned int) { arr[elem], arr[elem+1], //		; return (vector unsigned int) { arr[elem], arr[elem+1], //
; arr[elem+2], arr[elem+3] }; //		; arr[elem+2], arr[elem+3] }; //
;} //		;} //
;// P8: lfsux, 3 x lxsspx, 2 x xxmrghd, 2 x xvcvdpsp, vmrgew, xvcvspuxws //		;// P8: lfsux, 3 x lxsspx, 2 x xxmrghd, 2 x xvcvspuxws, vmrgew //
;// P9: lfsux, 3 x lfs, 2 x xxmrghd, 2 x xvcvdpsp, vmrgew, xvcvspuxws //		;// P9: lfsux, 3 x lfs, 2 x xxmrghd, 2 x xvcvspuxws, vmrgew //
;// Note: if the consecutive loads learns to handle pre-inc, this can be: //		;// Note: if the consecutive loads learns to handle pre-inc, this can be: //
;// sldi 2, 2 x load, vperm, xvcvspuxws //		;// sldi 2, 2 x load, vperm, xvcvspuxws //
;vector unsigned int fromDiffMemVarDConvftoui(float *arr, int elem) { //		;vector unsigned int fromDiffMemVarDConvftoui(float *arr, int elem) { //
; return (vector unsigned int) { arr[elem], arr[elem-1], //		; return (vector unsigned int) { arr[elem], arr[elem-1], //
; arr[elem-2], arr[elem-3] }; //		; arr[elem-2], arr[elem-3] }; //
;} //		;} //
;// P8: xscvdpuxws, xxspltw //		;// P8: xscvdpuxws, xxspltw //
;// P9: xscvdpuxws, xxspltw //		;// P9: xscvdpuxws, xxspltw //
;vector unsigned int spltRegValConvftoui(float val) { //		;vector unsigned int spltRegValConvftoui(float val) { //
; return (vector unsigned int) val; //		; return (vector unsigned int) val; //
;} //		;} //
;// P8: lxsspx, xscvdpuxws, xxspltw //		;// P8: lxsspx, xscvdpuxws, xxspltw //
;// P9: lxvwsx, xvcvspuxws //		;// P9: lxvwsx, xvcvspuxws //
;vector unsigned int spltMemValConvftoui(float *ptr) { //		;vector unsigned int spltMemValConvftoui(float *ptr) { //
; return (vector unsigned int)*ptr; //		; return (vector unsigned int)*ptr; //
;} //		;} //
;// P8: vspltisw //		;// P8: vspltisw //
;// P9: vspltisw //		;// P9: vspltisw //
;vector unsigned int spltCnstConvdtoui() { //		;vector unsigned int spltCnstConvdtoui() { //
; return (vector unsigned int) 4.74; //		; return (vector unsigned int) 4.74; //
;} //		;} //
;// P8: 2 x xxmrghd, 2 x xvcvdpsp, vmrgew, xvcvspuxws //		;// P8: 2 x xxmrghd, 2 x xvcvspuxws, vmrgew //
;// P9: 2 x xxmrghd, 2 x xvcvdpsp, vmrgew, xvcvspuxws //		;// P9: 2 x xxmrghd, 2 x xvcvspuxws, vmrgew //
;vector unsigned int fromRegsConvdtoui(double a, double b, //		;vector unsigned int fromRegsConvdtoui(double a, double b, //
; double c, double d) { //		; double c, double d) { //
; return (vector unsigned int) { a, b, c, d }; //		; return (vector unsigned int) { a, b, c, d }; //
;} //		;} //
;// P8: lxvd2x, xxswapd //		;// P8: lxvd2x, xxswapd //
;// P9: lxvx (even lxv) //		;// P9: lxvx (even lxv) //
;vector unsigned int fromDiffConstsConvdtoui() { //		;vector unsigned int fromDiffConstsConvdtoui() { //
; return (vector unsigned int) { 24.46, 234., 988.19, 422.39 }; //		; return (vector unsigned int) { 24.46, 234., 988.19, 422.39 }; //
;} //		;} //
;// P8: 2 x lxvd2x, 2 x xxswapd, xxmrgld, xxmrghd, 2 x xvcvdpsp, vmrgew, //		;// P8: 2 x lxvd2x, 2 x xxswapd, xxmrgld, xxmrghd, 2 x xvcvdpuxws, vmrgew //
;// xvcvspuxws //		;// P9: 2 x lxvx, xxmrgld, xxmrghd, 2 x xvcvdpuxws, vmrgew //
;// P9: 2 x lxvx, xxmrgld, xxmrghd, 2 x xvcvdpsp, vmrgew, xvcvspuxws //
;vector unsigned int fromDiffMemConsAConvdtoui(double *ptr) { //		;vector unsigned int fromDiffMemConsAConvdtoui(double *ptr) { //
; return (vector unsigned int) { ptr[0], ptr[1], ptr[2], ptr[3] }; //		; return (vector unsigned int) { ptr[0], ptr[1], ptr[2], ptr[3] }; //
;} //		;} //
;// P8: 4 x lxsdx, 2 x xxmrghd, 2 x xvcvdpsp, vmrgew, xvcvspuxws //		;// P8: 4 x lxsdx, 2 x xxmrghd, 2 x xvcvdpuxws, vmrgew //
;// P9: 4 x lfd, 2 x xxmrghd, 2 x xvcvdpsp, vmrgew, xvcvspuxws //		;// P9: 4 x lfd, 2 x xxmrghd, 2 x xvcvdpuxws, vmrgew //
;vector unsigned int fromDiffMemConsDConvdtoui(double *ptr) { //		;vector unsigned int fromDiffMemConsDConvdtoui(double *ptr) { //
; return (vector unsigned int) { ptr[3], ptr[2], ptr[1], ptr[0] }; //		; return (vector unsigned int) { ptr[3], ptr[2], ptr[1], ptr[0] }; //
;} //		;} //
;// P8: lfdux, 3 x lxsdx, 2 x xxmrghd, 2 x xvcvdpsp, vmrgew, xvcvspuxws //		;// P8: lfdux, 3 x lxsdx, 2 x xxmrghd, 2 x xvcvdpuxws, vmrgew //
;// P9: lfdux, 3 x lfd, 2 x xxmrghd, 2 x xvcvdpsp, vmrgew, xvcvspuxws //		;// P9: lfdux, 3 x lfd, 2 x xxmrghd, 2 x xvcvdpuxws, vmrgew //
;vector unsigned int fromDiffMemVarAConvdtoui(double *arr, int elem) { //		;vector unsigned int fromDiffMemVarAConvdtoui(double *arr, int elem) { //
; return (vector unsigned int) { arr[elem], arr[elem+1], //		; return (vector unsigned int) { arr[elem], arr[elem+1], //
; arr[elem+2], arr[elem+3] }; //		; arr[elem+2], arr[elem+3] }; //
;} //		;} //
;// P8: lfdux, 3 x lxsdx, 2 x xxmrghd, 2 x xvcvdpsp, vmrgew, xvcvspuxws //		;// P8: lfdux, 3 x lxsdx, 2 x xxmrghd, 2 x xvcvdpuxws, vmrgew //
;// P9: lfdux, 3 x lfd, 2 x xxmrghd, 2 x xvcvdpsp, vmrgew, xvcvspuxws //		;// P9: lfdux, 3 x lfd, 2 x xxmrghd, 2 x xvcvdpuxws, vmrgew //
;vector unsigned int fromDiffMemVarDConvdtoui(double *arr, int elem) { //		;vector unsigned int fromDiffMemVarDConvdtoui(double *arr, int elem) { //
; return (vector unsigned int) { arr[elem], arr[elem-1], //		; return (vector unsigned int) { arr[elem], arr[elem-1], //
; arr[elem-2], arr[elem-3] }; //		; arr[elem-2], arr[elem-3] }; //
;} //		;} //
;// P8: xscvdpuxws, xxspltw //		;// P8: xscvdpuxws, xxspltw //
;// P9: xscvdpuxws, xxspltw //		;// P9: xscvdpuxws, xxspltw //
;vector unsigned int spltRegValConvdtoui(double val) { //		;vector unsigned int spltRegValConvdtoui(double val) { //
; return (vector unsigned int) val; //		; return (vector unsigned int) val; //
%vecinit6 = insertelement <4 x i32> %vecinit4, i32 %conv5, i32 3		%vecinit6 = insertelement <4 x i32> %vecinit4, i32 %conv5, i32 3
ret <4 x i32> %vecinit6		ret <4 x i32> %vecinit6
; P9BE-LABEL: fromRegsConvftoi		; P9BE-LABEL: fromRegsConvftoi
; P9LE-LABEL: fromRegsConvftoi		; P9LE-LABEL: fromRegsConvftoi
; P8BE-LABEL: fromRegsConvftoi		; P8BE-LABEL: fromRegsConvftoi
; P8LE-LABEL: fromRegsConvftoi		; P8LE-LABEL: fromRegsConvftoi
; P9BE-DAG: xxmrghd {{[vs]+}}[[REG1:[0-9]+]], vs1, vs3		; P9BE-DAG: xxmrghd {{[vs]+}}[[REG1:[0-9]+]], vs1, vs3
; P9BE-DAG: xxmrghd {{[vs]+}}[[REG2:[0-9]+]], vs2, vs4		; P9BE-DAG: xxmrghd {{[vs]+}}[[REG2:[0-9]+]], vs2, vs4
; P9BE-DAG: xvcvdpsp [[REG3:v[0-9]+]], {{[vs]+}}[[REG1]]		; P9BE-DAG: xvcvdpsxws [[REG3:v[0-9]+]], {{[vs]+}}[[REG1]]
; P9BE-DAG: xvcvdpsp [[REG4:v[0-9]+]], {{[vs]+}}[[REG2]]		; P9BE-DAG: xvcvdpsxws [[REG4:v[0-9]+]], {{[vs]+}}[[REG2]]
; P9BE: vmrgew v2, [[REG3]], [[REG4]]		; P9BE: vmrgew v2, [[REG3]], [[REG4]]
; P9BE: xvcvspsxws v2, v2
; P9LE-DAG: xxmrghd {{[vs]+}}[[REG1:[0-9]+]], vs3, vs1		; P9LE-DAG: xxmrghd {{[vs]+}}[[REG1:[0-9]+]], vs3, vs1
; P9LE-DAG: xxmrghd {{[vs]+}}[[REG2:[0-9]+]], vs4, vs2		; P9LE-DAG: xxmrghd {{[vs]+}}[[REG2:[0-9]+]], vs4, vs2
; P9LE-DAG: xvcvdpsp [[REG3:v[0-9]+]], {{[vs]+}}[[REG1]]		; P9LE-DAG: xvcvdpsxws [[REG3:v[0-9]+]], {{[vs]+}}[[REG1]]
; P9LE-DAG: xvcvdpsp [[REG4:v[0-9]+]], {{[vs]+}}[[REG2]]		; P9LE-DAG: xvcvdpsxws [[REG4:v[0-9]+]], {{[vs]+}}[[REG2]]
; P9LE: vmrgew v2, [[REG4]], [[REG3]]		; P9LE: vmrgew v2, [[REG4]], [[REG3]]
; P9LE: xvcvspsxws v2, v2
; P8BE-DAG: xxmrghd {{[vs]+}}[[REG1:[0-9]+]], vs1, vs3		; P8BE-DAG: xxmrghd {{[vs]+}}[[REG1:[0-9]+]], vs1, vs3
; P8BE-DAG: xxmrghd {{[vs]+}}[[REG2:[0-9]+]], vs2, vs4		; P8BE-DAG: xxmrghd {{[vs]+}}[[REG2:[0-9]+]], vs2, vs4
; P8BE-DAG: xvcvdpsp [[REG3:v[0-9]+]], {{[vs]+}}[[REG1]]		; P8BE-DAG: xvcvdpsxws [[REG3:v[0-9]+]], {{[vs]+}}[[REG1]]
; P8BE-DAG: xvcvdpsp [[REG4:v[0-9]+]], {{[vs]+}}[[REG2]]		; P8BE-DAG: xvcvdpsxws [[REG4:v[0-9]+]], {{[vs]+}}[[REG2]]
; P8BE: vmrgew v2, [[REG3]], [[REG4]]		; P8BE: vmrgew v2, [[REG3]], [[REG4]]
; P8BE: xvcvspsxws v2, v2
; P8LE-DAG: xxmrghd {{[vs]+}}[[REG1:[0-9]+]], vs3, vs1		; P8LE-DAG: xxmrghd {{[vs]+}}[[REG1:[0-9]+]], vs3, vs1
; P8LE-DAG: xxmrghd {{[vs]+}}[[REG2:[0-9]+]], vs4, vs2		; P8LE-DAG: xxmrghd {{[vs]+}}[[REG2:[0-9]+]], vs4, vs2
; P8LE-DAG: xvcvdpsp [[REG3:v[0-9]+]], {{[vs]+}}[[REG1]]		; P8LE-DAG: xvcvdpsxws [[REG3:v[0-9]+]], {{[vs]+}}[[REG1]]
; P8LE-DAG: xvcvdpsp [[REG4:v[0-9]+]], {{[vs]+}}[[REG2]]		; P8LE-DAG: xvcvdpsxws [[REG4:v[0-9]+]], {{[vs]+}}[[REG2]]
; P8LE: vmrgew v2, [[REG4]], [[REG3]]		; P8LE: vmrgew v2, [[REG4]], [[REG3]]
; P8LE: xvcvspsxws v2, v2
}		}
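The checks for fromRegsConvftoi above now convert straight from the double-precision register form (xvcvdpsxws) even though the source values are floats. A minimal C sketch of why that should be safe, assuming IEEE 754 float and double: widening a float to double is exact, so converting the widened value gives the same integer as converting the float directly. The helper names `ftoi_direct` and `ftoi_via_double` are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Hypothetical helper: convert a float lane to a 32-bit integer directly. */
static int32_t ftoi_direct(float f) {
  return (int32_t)f; /* C truncates toward zero */
}

/* Hypothetical helper: widen to double first, then convert.
 * Every float value is exactly representable as a double, so the
 * widening step cannot change the value being converted. */
static int32_t ftoi_via_double(float f) {
  double widened = (double)f;
  return (int32_t)widened;
}
```

For in-range inputs such as 24.46f or -988.19f, both helpers return the same truncated integer.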

; Function Attrs: norecurse nounwind readnone		; Function Attrs: norecurse nounwind readnone
define <4 x i32> @fromDiffConstsConvftoi() {		define <4 x i32> @fromDiffConstsConvftoi() {
entry:		entry:
ret <4 x i32> <i32 24, i32 234, i32 988, i32 422>		ret <4 x i32> <i32 24, i32 234, i32 988, i32 422>
; P9BE-LABEL: fromDiffConstsConvftoi		; P9BE-LABEL: fromDiffConstsConvftoi
; P9LE-LABEL: fromDiffConstsConvftoi		; P9LE-LABEL: fromDiffConstsConvftoi
%vecinit6 = insertelement <4 x i32> %vecinit4, i32 %conv5, i32 3		%vecinit6 = insertelement <4 x i32> %vecinit4, i32 %conv5, i32 3
ret <4 x i32> %vecinit6		ret <4 x i32> %vecinit6
; P9BE-LABEL: fromRegsConvdtoi		; P9BE-LABEL: fromRegsConvdtoi
; P9LE-LABEL: fromRegsConvdtoi		; P9LE-LABEL: fromRegsConvdtoi
; P8BE-LABEL: fromRegsConvdtoi		; P8BE-LABEL: fromRegsConvdtoi
; P8LE-LABEL: fromRegsConvdtoi		; P8LE-LABEL: fromRegsConvdtoi
; P9BE-DAG: xxmrghd {{[vs]+}}[[REG1:[0-9]+]], vs1, vs3		; P9BE-DAG: xxmrghd {{[vs]+}}[[REG1:[0-9]+]], vs1, vs3
; P9BE-DAG: xxmrghd {{[vs]+}}[[REG2:[0-9]+]], vs2, vs4		; P9BE-DAG: xxmrghd {{[vs]+}}[[REG2:[0-9]+]], vs2, vs4
; P9BE-DAG: xvcvdpsp [[REG3:v[0-9]+]], {{[vs]+}}[[REG1]]		; P9BE-DAG: xvcvdpsxws [[REG3:v[0-9]+]], {{[vs]+}}[[REG1]]
; P9BE-DAG: xvcvdpsp [[REG4:v[0-9]+]], {{[vs]+}}[[REG2]]		; P9BE-DAG: xvcvdpsxws [[REG4:v[0-9]+]], {{[vs]+}}[[REG2]]
; P9BE: vmrgew v2, [[REG3]], [[REG4]]		; P9BE: vmrgew v2, [[REG3]], [[REG4]]
; P9BE: xvcvspsxws v2, v2
; P9LE-DAG: xxmrghd {{[vs]+}}[[REG1:[0-9]+]], vs3, vs1		; P9LE-DAG: xxmrghd {{[vs]+}}[[REG1:[0-9]+]], vs3, vs1
; P9LE-DAG: xxmrghd {{[vs]+}}[[REG2:[0-9]+]], vs4, vs2		; P9LE-DAG: xxmrghd {{[vs]+}}[[REG2:[0-9]+]], vs4, vs2
; P9LE-DAG: xvcvdpsp [[REG3:v[0-9]+]], {{[vs]+}}[[REG1]]		; P9LE-DAG: xvcvdpsxws [[REG3:v[0-9]+]], {{[vs]+}}[[REG1]]
; P9LE-DAG: xvcvdpsp [[REG4:v[0-9]+]], {{[vs]+}}[[REG2]]		; P9LE-DAG: xvcvdpsxws [[REG4:v[0-9]+]], {{[vs]+}}[[REG2]]
; P9LE: vmrgew v2, [[REG4]], [[REG3]]		; P9LE: vmrgew v2, [[REG4]], [[REG3]]
; P9LE: xvcvspsxws v2, v2
; P8BE-DAG: xxmrghd {{[vs]+}}[[REG1:[0-9]+]], vs1, vs3		; P8BE-DAG: xxmrghd {{[vs]+}}[[REG1:[0-9]+]], vs1, vs3
; P8BE-DAG: xxmrghd {{[vs]+}}[[REG2:[0-9]+]], vs2, vs4		; P8BE-DAG: xxmrghd {{[vs]+}}[[REG2:[0-9]+]], vs2, vs4
; P8BE-DAG: xvcvdpsp [[REG3:v[0-9]+]], {{[vs]+}}[[REG1]]		; P8BE-DAG: xvcvdpsxws [[REG3:v[0-9]+]], {{[vs]+}}[[REG1]]
; P8BE-DAG: xvcvdpsp [[REG4:v[0-9]+]], {{[vs]+}}[[REG2]]		; P8BE-DAG: xvcvdpsxws [[REG4:v[0-9]+]], {{[vs]+}}[[REG2]]
; P8BE: vmrgew v2, [[REG3]], [[REG4]]		; P8BE: vmrgew v2, [[REG3]], [[REG4]]
; P8BE: xvcvspsxws v2, v2
; P8LE-DAG: xxmrghd {{[vs]+}}[[REG1:[0-9]+]], vs3, vs1		; P8LE-DAG: xxmrghd {{[vs]+}}[[REG1:[0-9]+]], vs3, vs1
; P8LE-DAG: xxmrghd {{[vs]+}}[[REG2:[0-9]+]], vs4, vs2		; P8LE-DAG: xxmrghd {{[vs]+}}[[REG2:[0-9]+]], vs4, vs2
; P8LE-DAG: xvcvdpsp [[REG3:v[0-9]+]], {{[vs]+}}[[REG1]]		; P8LE-DAG: xvcvdpsxws [[REG3:v[0-9]+]], {{[vs]+}}[[REG1]]
; P8LE-DAG: xvcvdpsp [[REG4:v[0-9]+]], {{[vs]+}}[[REG2]]		; P8LE-DAG: xvcvdpsxws [[REG4:v[0-9]+]], {{[vs]+}}[[REG2]]
; P8LE: vmrgew v2, [[REG4]], [[REG3]]		; P8LE: vmrgew v2, [[REG4]], [[REG3]]
; P8LE: xvcvspsxws v2, v2
}		}
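The old checks for fromRegsConvdtoi rounded each double to single precision (xvcvdpsp) before the integer conversion (xvcvspsxws); the new checks convert the doubles directly (xvcvdpsxws). A minimal C sketch of the difference, assuming IEEE 754 types and the default round-to-nearest mode: a float carries only 24 significand bits, so the intermediate rounding can change the integer result. The helper names `dtoi_direct` and `dtoi_via_float` are hypothetical, and the sketch models the arithmetic only, not the vector lane layout.

```c
#include <stdint.h>

/* Hypothetical helper: what the fixed lowering computes per element. */
static int32_t dtoi_direct(double d) {
  return (int32_t)d; /* truncates toward zero */
}

/* Hypothetical helper: what the old lowering effectively computed.
 * The volatile store forces the value to be rounded to float
 * precision before the integer conversion. */
static int32_t dtoi_via_float(double d) {
  volatile float narrowed = (float)d;
  return (int32_t)narrowed;
}
```

With d = 0.99999999 the direct conversion truncates to 0, while the float round trip first rounds the value up to 1.0f and so yields 1. With d = 33554433.0 (2^25 + 1) the direct result is 33554433, but the nearest float is 33554432.0f.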

; Function Attrs: norecurse nounwind readnone		; Function Attrs: norecurse nounwind readnone
define <4 x i32> @fromDiffConstsConvdtoi() {		define <4 x i32> @fromDiffConstsConvdtoi() {
entry:		entry:
ret <4 x i32> <i32 24, i32 234, i32 988, i32 422>		ret <4 x i32> <i32 24, i32 234, i32 988, i32 422>
; P9BE-LABEL: fromDiffConstsConvdtoi		; P9BE-LABEL: fromDiffConstsConvdtoi
; P9LE-LABEL: fromDiffConstsConvdtoi		; P9LE-LABEL: fromDiffConstsConvdtoi
; P9BE-LABEL: fromDiffMemConsAConvdtoi		; P9BE-LABEL: fromDiffMemConsAConvdtoi
; P9LE-LABEL: fromDiffMemConsAConvdtoi		; P9LE-LABEL: fromDiffMemConsAConvdtoi
; P8BE-LABEL: fromDiffMemConsAConvdtoi		; P8BE-LABEL: fromDiffMemConsAConvdtoi
; P8LE-LABEL: fromDiffMemConsAConvdtoi		; P8LE-LABEL: fromDiffMemConsAConvdtoi
; P9BE-DAG: lxv [[REG1:[vs0-9]+]], 0(r3)		; P9BE-DAG: lxv [[REG1:[vs0-9]+]], 0(r3)
; P9BE-DAG: lxv [[REG2:[vs0-9]+]], 16(r3)		; P9BE-DAG: lxv [[REG2:[vs0-9]+]], 16(r3)
; P9BE-DAG: xxmrgld [[REG3:[vs0-9]+]], [[REG1]], [[REG2]]		; P9BE-DAG: xxmrgld [[REG3:[vs0-9]+]], [[REG1]], [[REG2]]
; P9BE-DAG: xxmrghd [[REG4:[vs0-9]+]], [[REG1]], [[REG2]]		; P9BE-DAG: xxmrghd [[REG4:[vs0-9]+]], [[REG1]], [[REG2]]
; P9BE-DAG: xvcvdpsp [[REG5:[vs0-9]+]], [[REG3]]		; P9BE-DAG: xvcvdpsxws [[REG5:[vs0-9]+]], [[REG3]]
; P9BE-DAG: xvcvdpsp [[REG6:[vs0-9]+]], [[REG4]]		; P9BE-DAG: xvcvdpsxws [[REG6:[vs0-9]+]], [[REG4]]
; P9BE: vmrgew v2, [[REG6]], [[REG5]]		; P9BE: vmrgew v2, [[REG6]], [[REG5]]
; P9BE: xvcvspsxws v2, v2
; P9LE-DAG: lxv [[REG1:[vs0-9]+]], 0(r3)		; P9LE-DAG: lxv [[REG1:[vs0-9]+]], 0(r3)
; P9LE-DAG: lxv [[REG2:[vs0-9]+]], 16(r3)		; P9LE-DAG: lxv [[REG2:[vs0-9]+]], 16(r3)
; P9LE-DAG: xxmrgld [[REG3:[vs0-9]+]], [[REG2]], [[REG1]]		; P9LE-DAG: xxmrgld [[REG3:[vs0-9]+]], [[REG2]], [[REG1]]
; P9LE-DAG: xxmrghd [[REG4:[vs0-9]+]], [[REG2]], [[REG1]]		; P9LE-DAG: xxmrghd [[REG4:[vs0-9]+]], [[REG2]], [[REG1]]
; P9LE-DAG: xvcvdpsp [[REG5:[vs0-9]+]], [[REG3]]		; P9LE-DAG: xvcvdpsxws [[REG5:[vs0-9]+]], [[REG3]]
; P9LE-DAG: xvcvdpsp [[REG6:[vs0-9]+]], [[REG4]]		; P9LE-DAG: xvcvdpsxws [[REG6:[vs0-9]+]], [[REG4]]
; P9LE: vmrgew v2, [[REG6]], [[REG5]]		; P9LE: vmrgew v2, [[REG6]], [[REG5]]
; P9LE: xvcvspsxws v2, v2
; P8BE: lxvd2x [[REG1:[vs0-9]+]], 0, r3		; P8BE: lxvd2x [[REG1:[vs0-9]+]], 0, r3
; P8BE: lxvd2x [[REG2:[vs0-9]+]], r3, r4		; P8BE: lxvd2x [[REG2:[vs0-9]+]], r3, r4
; P8BE-DAG: xxmrgld [[REG3:[vs0-9]+]], [[REG1]], [[REG2]]		; P8BE-DAG: xxmrgld [[REG3:[vs0-9]+]], [[REG1]], [[REG2]]
; P8BE-DAG: xxmrghd [[REG4:[vs0-9]+]], [[REG1]], [[REG2]]		; P8BE-DAG: xxmrghd [[REG4:[vs0-9]+]], [[REG1]], [[REG2]]
; P8BE-DAG: xvcvdpsp [[REG5:[vs0-9]+]], [[REG3]]		; P8BE-DAG: xvcvdpsxws [[REG5:[vs0-9]+]], [[REG3]]
; P8BE-DAG: xvcvdpsp [[REG6:[vs0-9]+]], [[REG4]]		; P8BE-DAG: xvcvdpsxws [[REG6:[vs0-9]+]], [[REG4]]
; P8BE: vmrgew v2, [[REG6]], [[REG5]]		; P8BE: vmrgew v2, [[REG6]], [[REG5]]
; P8BE: xvcvspsxws v2, v2
; P8LE: lxvd2x [[REG1:[vs0-9]+]], 0, r3		; P8LE: lxvd2x [[REG1:[vs0-9]+]], 0, r3
; P8LE: lxvd2x [[REG2:[vs0-9]+]], r3, r4		; P8LE: lxvd2x [[REG2:[vs0-9]+]], r3, r4
; P8LE-DAG: xxswapd [[REG3:[vs0-9]+]], [[REG1]]		; P8LE-DAG: xxswapd [[REG3:[vs0-9]+]], [[REG1]]
; P8LE-DAG: xxswapd [[REG4:[vs0-9]+]], [[REG2]]		; P8LE-DAG: xxswapd [[REG4:[vs0-9]+]], [[REG2]]
; P8LE-DAG: xxmrgld [[REG5:[vs0-9]+]], [[REG4]], [[REG3]]		; P8LE-DAG: xxmrgld [[REG5:[vs0-9]+]], [[REG4]], [[REG3]]
; P8LE-DAG: xxmrghd [[REG6:[vs0-9]+]], [[REG4]], [[REG3]]		; P8LE-DAG: xxmrghd [[REG6:[vs0-9]+]], [[REG4]], [[REG3]]
; P8LE-DAG: xvcvdpsp [[REG7:[vs0-9]+]], [[REG5]]		; P8LE-DAG: xvcvdpsxws [[REG7:[vs0-9]+]], [[REG5]]
; P8LE-DAG: xvcvdpsp [[REG8:[vs0-9]+]], [[REG6]]		; P8LE-DAG: xvcvdpsxws [[REG8:[vs0-9]+]], [[REG6]]
; P8LE: vmrgew v2, [[REG8]], [[REG7]]		; P8LE: vmrgew v2, [[REG8]], [[REG7]]
; P8LE: xvcvspsxws v2, v2
}		}

; Function Attrs: norecurse nounwind readonly		; Function Attrs: norecurse nounwind readonly
define <4 x i32> @fromDiffMemConsDConvdtoi(double* nocapture readonly %ptr) {		define <4 x i32> @fromDiffMemConsDConvdtoi(double* nocapture readonly %ptr) {
entry:		entry:
%arrayidx = getelementptr inbounds double, double* %ptr, i64 3		%arrayidx = getelementptr inbounds double, double* %ptr, i64 3
%0 = load double, double* %arrayidx, align 8		%0 = load double, double* %arrayidx, align 8
%conv = fptosi double %0 to i32		%conv = fptosi double %0 to i32
; P8BE-LABEL: fromDiffMemConsDConvdtoi		; P8BE-LABEL: fromDiffMemConsDConvdtoi
; P8LE-LABEL: fromDiffMemConsDConvdtoi		; P8LE-LABEL: fromDiffMemConsDConvdtoi
; P9BE: lfd		; P9BE: lfd
; P9BE: lfd		; P9BE: lfd
; P9BE: lfd		; P9BE: lfd
; P9BE: lfd		; P9BE: lfd
; P9BE: xxmrghd		; P9BE: xxmrghd
; P9BE: xxmrghd		; P9BE: xxmrghd
; P9BE: xvcvdpsp		; P9BE: xvcvdpsxws
; P9BE: xvcvdpsp		; P9BE: xvcvdpsxws
; P9BE: vmrgew		; P9BE: vmrgew v2
; P9BE: xvcvspsxws v2
; P9LE: lfd		; P9LE: lfd
; P9LE: lfd		; P9LE: lfd
; P9LE: lfd		; P9LE: lfd
; P9LE: lfd		; P9LE: lfd
; P9LE: xxmrghd		; P9LE: xxmrghd
; P9LE: xxmrghd		; P9LE: xxmrghd
; P9LE: xvcvdpsp		; P9LE: xvcvdpsxws
; P9LE: xvcvdpsp		; P9LE: xvcvdpsxws
; P9LE: vmrgew		; P9LE: vmrgew v2
; P9LE: xvcvspsxws v2
; P8BE: lfdx		; P8BE: lfdx
; P8BE: lfd		; P8BE: lfd
; P8BE: lfd		; P8BE: lfd
; P8BE: lfd		; P8BE: lfd
; P8BE: xxmrghd		; P8BE: xxmrghd
; P8BE: xxmrghd		; P8BE: xxmrghd
; P8BE: xvcvdpsp		; P8BE: xvcvdpsxws
; P8BE: xvcvdpsp		; P8BE: xvcvdpsxws
; P8BE: vmrgew		; P8BE: vmrgew v2
; P8BE: xvcvspsxws v2
; P8LE: lfdx		; P8LE: lfdx
; P8LE: lfd		; P8LE: lfd
; P8LE: lfd		; P8LE: lfd
; P8LE: lfd		; P8LE: lfd
; P8LE: xxmrghd		; P8LE: xxmrghd
; P8LE: xxmrghd		; P8LE: xxmrghd
; P8LE: xvcvdpsp		; P8LE: xvcvdpsxws
; P8LE: xvcvdpsp		; P8LE: xvcvdpsxws
; P8LE: vmrgew		; P8LE: vmrgew v2
; P8LE: xvcvspsxws v2
}		}

; Function Attrs: norecurse nounwind readonly		; Function Attrs: norecurse nounwind readonly
define <4 x i32> @fromDiffMemVarAConvdtoi(double* nocapture readonly %arr, i32 signext %elem) {		define <4 x i32> @fromDiffMemVarAConvdtoi(double* nocapture readonly %arr, i32 signext %elem) {
entry:		entry:
%idxprom = sext i32 %elem to i64		%idxprom = sext i32 %elem to i64
%arrayidx = getelementptr inbounds double, double* %arr, i64 %idxprom		%arrayidx = getelementptr inbounds double, double* %arr, i64 %idxprom
%0 = load double, double* %arrayidx, align 8		%0 = load double, double* %arrayidx, align 8
; P8BE-LABEL: fromDiffMemVarAConvdtoi		; P8BE-LABEL: fromDiffMemVarAConvdtoi
; P8LE-LABEL: fromDiffMemVarAConvdtoi		; P8LE-LABEL: fromDiffMemVarAConvdtoi
; P9BE: lfdux		; P9BE: lfdux
; P9BE: lfd		; P9BE: lfd
; P9BE: lfd		; P9BE: lfd
; P9BE: lfd		; P9BE: lfd
; P9BE: xxmrghd		; P9BE: xxmrghd
; P9BE: xxmrghd		; P9BE: xxmrghd
; P9BE: xvcvdpsp		; P9BE: xvcvdpsxws
; P9BE: xvcvdpsp		; P9BE: xvcvdpsxws
; P9BE: vmrgew		; P9BE: vmrgew v2
; P9BE: xvcvspsxws v2
; P9LE: lfdux		; P9LE: lfdux
; P9LE: lfd		; P9LE: lfd
; P9LE: lfd		; P9LE: lfd
; P9LE: lfd		; P9LE: lfd
; P9LE: xxmrghd		; P9LE: xxmrghd
; P9LE: xxmrghd		; P9LE: xxmrghd
; P9LE: xvcvdpsp		; P9LE: xvcvdpsxws
; P9LE: xvcvdpsp		; P9LE: xvcvdpsxws
; P9LE: vmrgew		; P9LE: vmrgew v2
; P9LE: xvcvspsxws v2
; P8BE: lfdux		; P8BE: lfdux
; P8BE: lfd		; P8BE: lfd
; P8BE: lfd		; P8BE: lfd
; P8BE: lfd		; P8BE: lfd
; P8BE: xxmrghd		; P8BE: xxmrghd
; P8BE: xxmrghd		; P8BE: xxmrghd
; P8BE: xvcvdpsp		; P8BE: xvcvdpsxws
; P8BE: xvcvdpsp		; P8BE: xvcvdpsxws
; P8BE: vmrgew		; P8BE: vmrgew v2
; P8BE: xvcvspsxws v2
; P8LE: lfdux		; P8LE: lfdux
; P8LE: lfd		; P8LE: lfd
; P8LE: lfd		; P8LE: lfd
; P8LE: lfd		; P8LE: lfd
; P8LE: xxmrghd		; P8LE: xxmrghd
; P8LE: xxmrghd		; P8LE: xxmrghd
; P8LE: xvcvdpsp		; P8LE: xvcvdpsxws
; P8LE: xvcvdpsp		; P8LE: xvcvdpsxws
; P8LE: vmrgew		; P8LE: vmrgew v2
; P8LE: xvcvspsxws v2
}		}

; Function Attrs: norecurse nounwind readonly		; Function Attrs: norecurse nounwind readonly
define <4 x i32> @fromDiffMemVarDConvdtoi(double* nocapture readonly %arr, i32 signext %elem) {		define <4 x i32> @fromDiffMemVarDConvdtoi(double* nocapture readonly %arr, i32 signext %elem) {
entry:		entry:
%idxprom = sext i32 %elem to i64		%idxprom = sext i32 %elem to i64
%arrayidx = getelementptr inbounds double, double* %arr, i64 %idxprom		%arrayidx = getelementptr inbounds double, double* %arr, i64 %idxprom
%0 = load double, double* %arrayidx, align 8		%0 = load double, double* %arrayidx, align 8
; P8BE-LABEL: fromDiffMemVarDConvdtoi		; P8BE-LABEL: fromDiffMemVarDConvdtoi
; P8LE-LABEL: fromDiffMemVarDConvdtoi		; P8LE-LABEL: fromDiffMemVarDConvdtoi
; P9BE: lfdux		; P9BE: lfdux
; P9BE: lfd		; P9BE: lfd
; P9BE: lfd		; P9BE: lfd
; P9BE: lfd		; P9BE: lfd
; P9BE: xxmrghd		; P9BE: xxmrghd
; P9BE: xxmrghd		; P9BE: xxmrghd
; P9BE: xvcvdpsp		; P9BE: xvcvdpsxws
; P9BE: xvcvdpsp		; P9BE: xvcvdpsxws
; P9BE: vmrgew		; P9BE: vmrgew v2
; P9BE: xvcvspsxws v2
; P9LE: lfdux		; P9LE: lfdux
; P9LE: lfd		; P9LE: lfd
; P9LE: lfd		; P9LE: lfd
; P9LE: lfd		; P9LE: lfd
; P9LE: xxmrghd		; P9LE: xxmrghd
; P9LE: xxmrghd		; P9LE: xxmrghd
; P9LE: xvcvdpsp		; P9LE: xvcvdpsxws
; P9LE: xvcvdpsp		; P9LE: xvcvdpsxws
; P9LE: vmrgew		; P9LE: vmrgew v2
; P9LE: xvcvspsxws v2
; P8BE: lfdux		; P8BE: lfdux
; P8BE: lfd		; P8BE: lfd
; P8BE: lfd		; P8BE: lfd
; P8BE: lfd		; P8BE: lfd
; P8BE: xxmrghd		; P8BE: xxmrghd
; P8BE: xxmrghd		; P8BE: xxmrghd
; P8BE: xvcvdpsp		; P8BE: xvcvdpsxws
; P8BE: xvcvdpsp		; P8BE: xvcvdpsxws
; P8BE: vmrgew		; P8BE: vmrgew v2
; P8BE: xvcvspsxws v2
; P8LE: lfdux		; P8LE: lfdux
; P8LE: lfd		; P8LE: lfd
; P8LE: lfd		; P8LE: lfd
; P8LE: lfd		; P8LE: lfd
; P8LE: xxmrghd		; P8LE: xxmrghd
; P8LE: xxmrghd		; P8LE: xxmrghd
; P8LE: xvcvdpsp		; P8LE: xvcvdpsxws
; P8LE: xvcvdpsp		; P8LE: xvcvdpsxws
; P8LE: vmrgew		; P8LE: vmrgew v2
; P8LE: xvcvspsxws v2
}		}

; Function Attrs: norecurse nounwind readnone		; Function Attrs: norecurse nounwind readnone
define <4 x i32> @spltRegValConvdtoi(double %val) {		define <4 x i32> @spltRegValConvdtoi(double %val) {
entry:		entry:
%conv = fptosi double %val to i32		%conv = fptosi double %val to i32
%splat.splatinsert = insertelement <4 x i32> undef, i32 %conv, i32 0		%splat.splatinsert = insertelement <4 x i32> undef, i32 %conv, i32 0
%splat.splat = shufflevector <4 x i32> %splat.splatinsert, <4 x i32> undef, <4 x i32> zeroinitializer		%splat.splat = shufflevector <4 x i32> %splat.splatinsert, <4 x i32> undef, <4 x i32> zeroinitializer
%vecinit6 = insertelement <4 x i32> %vecinit4, i32 %conv5, i32 3		%vecinit6 = insertelement <4 x i32> %vecinit4, i32 %conv5, i32 3
ret <4 x i32> %vecinit6		ret <4 x i32> %vecinit6
; P9BE-LABEL: fromRegsConvftoui		; P9BE-LABEL: fromRegsConvftoui
; P9LE-LABEL: fromRegsConvftoui		; P9LE-LABEL: fromRegsConvftoui
; P8BE-LABEL: fromRegsConvftoui		; P8BE-LABEL: fromRegsConvftoui
; P8LE-LABEL: fromRegsConvftoui		; P8LE-LABEL: fromRegsConvftoui
; P9BE-DAG: xxmrghd {{[vs]+}}[[REG1:[0-9]+]], vs1, vs3		; P9BE-DAG: xxmrghd {{[vs]+}}[[REG1:[0-9]+]], vs1, vs3
; P9BE-DAG: xxmrghd {{[vs]+}}[[REG2:[0-9]+]], vs2, vs4		; P9BE-DAG: xxmrghd {{[vs]+}}[[REG2:[0-9]+]], vs2, vs4
; P9BE-DAG: xvcvdpsp [[REG3:v[0-9]+]], {{[vs]+}}[[REG1]]		; P9BE-DAG: xvcvdpuxws [[REG3:v[0-9]+]], {{[vs]+}}[[REG1]]
; P9BE-DAG: xvcvdpsp [[REG4:v[0-9]+]], {{[vs]+}}[[REG2]]		; P9BE-DAG: xvcvdpuxws [[REG4:v[0-9]+]], {{[vs]+}}[[REG2]]
; P9BE: vmrgew v2, [[REG3]], [[REG4]]		; P9BE: vmrgew v2, [[REG3]], [[REG4]]
; P9BE: xvcvspuxws v2, v2
; P9LE-DAG: xxmrghd {{[vs]+}}[[REG1:[0-9]+]], vs3, vs1		; P9LE-DAG: xxmrghd {{[vs]+}}[[REG1:[0-9]+]], vs3, vs1
; P9LE-DAG: xxmrghd {{[vs]+}}[[REG2:[0-9]+]], vs4, vs2		; P9LE-DAG: xxmrghd {{[vs]+}}[[REG2:[0-9]+]], vs4, vs2
; P9LE-DAG: xvcvdpsp [[REG3:v[0-9]+]], {{[vs]+}}[[REG1]]		; P9LE-DAG: xvcvdpuxws [[REG3:v[0-9]+]], {{[vs]+}}[[REG1]]
; P9LE-DAG: xvcvdpsp [[REG4:v[0-9]+]], {{[vs]+}}[[REG2]]		; P9LE-DAG: xvcvdpuxws [[REG4:v[0-9]+]], {{[vs]+}}[[REG2]]
; P9LE: vmrgew v2, [[REG4]], [[REG3]]		; P9LE: vmrgew v2, [[REG4]], [[REG3]]
; P9LE: xvcvspuxws v2, v2
; P8BE-DAG: xxmrghd {{[vs]+}}[[REG1:[0-9]+]], vs1, vs3		; P8BE-DAG: xxmrghd {{[vs]+}}[[REG1:[0-9]+]], vs1, vs3
; P8BE-DAG: xxmrghd {{[vs]+}}[[REG2:[0-9]+]], vs2, vs4		; P8BE-DAG: xxmrghd {{[vs]+}}[[REG2:[0-9]+]], vs2, vs4
; P8BE-DAG: xvcvdpsp [[REG3:v[0-9]+]], {{[vs]+}}[[REG1]]		; P8BE-DAG: xvcvdpuxws [[REG3:v[0-9]+]], {{[vs]+}}[[REG1]]
; P8BE-DAG: xvcvdpsp [[REG4:v[0-9]+]], {{[vs]+}}[[REG2]]		; P8BE-DAG: xvcvdpuxws [[REG4:v[0-9]+]], {{[vs]+}}[[REG2]]
; P8BE: vmrgew v2, [[REG3]], [[REG4]]		; P8BE: vmrgew v2, [[REG3]], [[REG4]]
; P8BE: xvcvspuxws v2, v2
; P8LE-DAG: xxmrghd {{[vs]+}}[[REG1:[0-9]+]], vs3, vs1		; P8LE-DAG: xxmrghd {{[vs]+}}[[REG1:[0-9]+]], vs3, vs1
; P8LE-DAG: xxmrghd {{[vs]+}}[[REG2:[0-9]+]], vs4, vs2		; P8LE-DAG: xxmrghd {{[vs]+}}[[REG2:[0-9]+]], vs4, vs2
; P8LE-DAG: xvcvdpsp [[REG3:v[0-9]+]], {{[vs]+}}[[REG1]]		; P8LE-DAG: xvcvdpuxws [[REG3:v[0-9]+]], {{[vs]+}}[[REG1]]
; P8LE-DAG: xvcvdpsp [[REG4:v[0-9]+]], {{[vs]+}}[[REG2]]		; P8LE-DAG: xvcvdpuxws [[REG4:v[0-9]+]], {{[vs]+}}[[REG2]]
; P8LE: vmrgew v2, [[REG4]], [[REG3]]		; P8LE: vmrgew v2, [[REG4]], [[REG3]]
; P8LE: xvcvspuxws v2, v2
}		}

; Function Attrs: norecurse nounwind readnone		; Function Attrs: norecurse nounwind readnone
define <4 x i32> @fromDiffConstsConvftoui() {		define <4 x i32> @fromDiffConstsConvftoui() {
entry:		entry:
ret <4 x i32> <i32 24, i32 234, i32 988, i32 422>		ret <4 x i32> <i32 24, i32 234, i32 988, i32 422>
; P9BE-LABEL: fromDiffConstsConvftoui		; P9BE-LABEL: fromDiffConstsConvftoui
; P9LE-LABEL: fromDiffConstsConvftoui		; P9LE-LABEL: fromDiffConstsConvftoui
%vecinit6 = insertelement <4 x i32> %vecinit4, i32 %conv5, i32 3		%vecinit6 = insertelement <4 x i32> %vecinit4, i32 %conv5, i32 3
ret <4 x i32> %vecinit6		ret <4 x i32> %vecinit6
; P9BE-LABEL: fromRegsConvdtoui		; P9BE-LABEL: fromRegsConvdtoui
; P9LE-LABEL: fromRegsConvdtoui		; P9LE-LABEL: fromRegsConvdtoui
; P8BE-LABEL: fromRegsConvdtoui		; P8BE-LABEL: fromRegsConvdtoui
; P8LE-LABEL: fromRegsConvdtoui		; P8LE-LABEL: fromRegsConvdtoui
; P9BE-DAG: xxmrghd {{[vs]+}}[[REG1:[0-9]+]], vs1, vs3		; P9BE-DAG: xxmrghd {{[vs]+}}[[REG1:[0-9]+]], vs1, vs3
; P9BE-DAG: xxmrghd {{[vs]+}}[[REG2:[0-9]+]], vs2, vs4		; P9BE-DAG: xxmrghd {{[vs]+}}[[REG2:[0-9]+]], vs2, vs4
; P9BE-DAG: xvcvdpsp [[REG3:v[0-9]+]], {{[vs]+}}[[REG1]]		; P9BE-DAG: xvcvdpuxws [[REG3:v[0-9]+]], {{[vs]+}}[[REG1]]
; P9BE-DAG: xvcvdpsp [[REG4:v[0-9]+]], {{[vs]+}}[[REG2]]		; P9BE-DAG: xvcvdpuxws [[REG4:v[0-9]+]], {{[vs]+}}[[REG2]]
; P9BE: vmrgew v2, [[REG3]], [[REG4]]		; P9BE: vmrgew v2, [[REG3]], [[REG4]]
; P9BE: xvcvspuxws v2, v2
; P9LE-DAG: xxmrghd {{[vs]+}}[[REG1:[0-9]+]], vs3, vs1		; P9LE-DAG: xxmrghd {{[vs]+}}[[REG1:[0-9]+]], vs3, vs1
; P9LE-DAG: xxmrghd {{[vs]+}}[[REG2:[0-9]+]], vs4, vs2		; P9LE-DAG: xxmrghd {{[vs]+}}[[REG2:[0-9]+]], vs4, vs2
; P9LE-DAG: xvcvdpsp [[REG3:v[0-9]+]], {{[vs]+}}[[REG1]]		; P9LE-DAG: xvcvdpuxws [[REG3:v[0-9]+]], {{[vs]+}}[[REG1]]
; P9LE-DAG: xvcvdpsp [[REG4:v[0-9]+]], {{[vs]+}}[[REG2]]		; P9LE-DAG: xvcvdpuxws [[REG4:v[0-9]+]], {{[vs]+}}[[REG2]]
; P9LE: vmrgew v2, [[REG4]], [[REG3]]		; P9LE: vmrgew v2, [[REG4]], [[REG3]]
; P9LE: xvcvspuxws v2, v2
; P8BE-DAG: xxmrghd {{[vs]+}}[[REG1:[0-9]+]], vs1, vs3		; P8BE-DAG: xxmrghd {{[vs]+}}[[REG1:[0-9]+]], vs1, vs3
; P8BE-DAG: xxmrghd {{[vs]+}}[[REG2:[0-9]+]], vs2, vs4		; P8BE-DAG: xxmrghd {{[vs]+}}[[REG2:[0-9]+]], vs2, vs4
; P8BE-DAG: xvcvdpsp [[REG3:v[0-9]+]], {{[vs]+}}[[REG1]]		; P8BE-DAG: xvcvdpuxws [[REG3:v[0-9]+]], {{[vs]+}}[[REG1]]
; P8BE-DAG: xvcvdpsp [[REG4:v[0-9]+]], {{[vs]+}}[[REG2]]		; P8BE-DAG: xvcvdpuxws [[REG4:v[0-9]+]], {{[vs]+}}[[REG2]]
; P8BE: vmrgew v2, [[REG3]], [[REG4]]		; P8BE: vmrgew v2, [[REG3]], [[REG4]]
; P8BE: xvcvspuxws v2, v2
; P8LE-DAG: xxmrghd {{[vs]+}}[[REG1:[0-9]+]], vs3, vs1		; P8LE-DAG: xxmrghd {{[vs]+}}[[REG1:[0-9]+]], vs3, vs1
; P8LE-DAG: xxmrghd {{[vs]+}}[[REG2:[0-9]+]], vs4, vs2		; P8LE-DAG: xxmrghd {{[vs]+}}[[REG2:[0-9]+]], vs4, vs2
; P8LE-DAG: xvcvdpsp [[REG3:v[0-9]+]], {{[vs]+}}[[REG1]]		; P8LE-DAG: xvcvdpuxws [[REG3:v[0-9]+]], {{[vs]+}}[[REG1]]
; P8LE-DAG: xvcvdpsp [[REG4:v[0-9]+]], {{[vs]+}}[[REG2]]		; P8LE-DAG: xvcvdpuxws [[REG4:v[0-9]+]], {{[vs]+}}[[REG2]]
; P8LE: vmrgew v2, [[REG4]], [[REG3]]		; P8LE: vmrgew v2, [[REG4]], [[REG3]]
; P8LE: xvcvspuxws v2, v2
}		}
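The unsigned checks for fromRegsConvdtoui follow the same pattern, with xvcvdpuxws replacing the xvcvdpsp plus xvcvspuxws pair. A short C sketch, again assuming IEEE 754 types and round-to-nearest, shows the effect for values above INT32_MAX, where adjacent floats are 256 apart. The helper names are hypothetical. The inputs are kept inside the uint32_t range because C leaves out-of-range floating-point to integer conversions undefined.

```c
#include <stdint.h>

/* Hypothetical helper: direct double to unsigned 32-bit conversion. */
static uint32_t dtoui_direct(double d) {
  return (uint32_t)d; /* truncates toward zero */
}

/* Hypothetical helper: round to float first, as the old lowering did.
 * The volatile store forces the rounding to float precision. */
static uint32_t dtoui_via_float(double d) {
  volatile float narrowed = (float)d;
  return (uint32_t)narrowed;
}
```

For d = 4000000001.0 the direct conversion returns 4000000001, while the float round trip returns 4000000000, the nearest multiple of 256.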

; Function Attrs: norecurse nounwind readnone		; Function Attrs: norecurse nounwind readnone
define <4 x i32> @fromDiffConstsConvdtoui() {		define <4 x i32> @fromDiffConstsConvdtoui() {
entry:		entry:
ret <4 x i32> <i32 24, i32 234, i32 988, i32 422>		ret <4 x i32> <i32 24, i32 234, i32 988, i32 422>
; P9BE-LABEL: fromDiffConstsConvdtoui		; P9BE-LABEL: fromDiffConstsConvdtoui
; P9LE-LABEL: fromDiffConstsConvdtoui		; P9LE-LABEL: fromDiffConstsConvdtoui
; P9BE-LABEL: fromDiffMemConsAConvdtoui		; P9BE-LABEL: fromDiffMemConsAConvdtoui
; P9LE-LABEL: fromDiffMemConsAConvdtoui		; P9LE-LABEL: fromDiffMemConsAConvdtoui
; P8BE-LABEL: fromDiffMemConsAConvdtoui		; P8BE-LABEL: fromDiffMemConsAConvdtoui
; P8LE-LABEL: fromDiffMemConsAConvdtoui		; P8LE-LABEL: fromDiffMemConsAConvdtoui
; P9BE-DAG: lxv [[REG1:[vs0-9]+]], 0(r3)		; P9BE-DAG: lxv [[REG1:[vs0-9]+]], 0(r3)
; P9BE-DAG: lxv [[REG2:[vs0-9]+]], 16(r3)		; P9BE-DAG: lxv [[REG2:[vs0-9]+]], 16(r3)
; P9BE-DAG: xxmrgld [[REG3:[vs0-9]+]], [[REG1]], [[REG2]]		; P9BE-DAG: xxmrgld [[REG3:[vs0-9]+]], [[REG1]], [[REG2]]
; P9BE-DAG: xxmrghd [[REG4:[vs0-9]+]], [[REG1]], [[REG2]]		; P9BE-DAG: xxmrghd [[REG4:[vs0-9]+]], [[REG1]], [[REG2]]
; P9BE-DAG: xvcvdpsp [[REG5:[vs0-9]+]], [[REG3]]		; P9BE-DAG: xvcvdpuxws [[REG5:[vs0-9]+]], [[REG3]]
; P9BE-DAG: xvcvdpsp [[REG6:[vs0-9]+]], [[REG4]]		; P9BE-DAG: xvcvdpuxws [[REG6:[vs0-9]+]], [[REG4]]
; P9BE: vmrgew v2, [[REG6]], [[REG5]]		; P9BE: vmrgew v2, [[REG6]], [[REG5]]
; P9BE: xvcvspuxws v2, v2
; P9LE-DAG: lxv [[REG1:[vs0-9]+]], 0(r3)		; P9LE-DAG: lxv [[REG1:[vs0-9]+]], 0(r3)
; P9LE-DAG: lxv [[REG2:[vs0-9]+]], 16(r3)		; P9LE-DAG: lxv [[REG2:[vs0-9]+]], 16(r3)
; P9LE-DAG: xxmrgld [[REG3:[vs0-9]+]], [[REG2]], [[REG1]]
; P9LE-DAG: xxmrghd [[REG4:[vs0-9]+]], [[REG2]], [[REG1]]		; P9LE-DAG: xxmrghd [[REG4:[vs0-9]+]], [[REG2]], [[REG1]]
; P9LE-DAG: xvcvdpsp [[REG5:[vs0-9]+]], [[REG3]]		; P9LE-DAG: xxmrgld [[REG3:[vs0-9]+]], [[REG2]], [[REG1]]
; P9LE-DAG: xvcvdpsp [[REG6:[vs0-9]+]], [[REG4]]		; P9LE-DAG: xvcvdpuxws [[REG5:[vs0-9]+]], [[REG3]]
		; P9LE-DAG: xvcvdpuxws [[REG6:[vs0-9]+]], [[REG4]]
; P9LE: vmrgew v2, [[REG6]], [[REG5]]		; P9LE: vmrgew v2, [[REG6]], [[REG5]]
; P9LE: xvcvspuxws v2, v2
; P8BE: lxvd2x [[REG1:[vs0-9]+]], 0, r3		; P8BE: lxvd2x [[REG1:[vs0-9]+]], 0, r3
; P8BE: lxvd2x [[REG2:[vs0-9]+]], r3, r4		; P8BE: lxvd2x [[REG2:[vs0-9]+]], r3, r4
; P8BE-DAG: xxmrgld [[REG3:[vs0-9]+]], [[REG1]], [[REG2]]		; P8BE-DAG: xxmrgld [[REG3:[vs0-9]+]], [[REG1]], [[REG2]]
; P8BE-DAG: xxmrghd [[REG4:[vs0-9]+]], [[REG1]], [[REG2]]		; P8BE-DAG: xxmrghd [[REG4:[vs0-9]+]], [[REG1]], [[REG2]]
; P8BE-DAG: xvcvdpsp [[REG5:[vs0-9]+]], [[REG3]]		; P8BE-DAG: xvcvdpuxws [[REG5:[vs0-9]+]], [[REG3]]
; P8BE-DAG: xvcvdpsp [[REG6:[vs0-9]+]], [[REG4]]		; P8BE-DAG: xvcvdpuxws [[REG6:[vs0-9]+]], [[REG4]]
; P8BE: vmrgew v2, [[REG6]], [[REG5]]		; P8BE: vmrgew v2, [[REG6]], [[REG5]]
; P8BE: xvcvspuxws v2, v2
; P8LE: lxvd2x [[REG1:[vs0-9]+]], 0, r3		; P8LE: lxvd2x [[REG1:[vs0-9]+]], 0, r3
; P8LE: lxvd2x [[REG2:[vs0-9]+]], r3, r4		; P8LE: lxvd2x [[REG2:[vs0-9]+]], r3, r4
; P8LE-DAG: xxswapd [[REG3:[vs0-9]+]], [[REG1]]		; P8LE-DAG: xxswapd [[REG3:[vs0-9]+]], [[REG1]]
; P8LE-DAG: xxswapd [[REG4:[vs0-9]+]], [[REG2]]		; P8LE-DAG: xxswapd [[REG4:[vs0-9]+]], [[REG2]]
; P8LE-DAG: xxmrgld [[REG5:[vs0-9]+]], [[REG4]], [[REG3]]		; P8LE-DAG: xxmrgld [[REG5:[vs0-9]+]], [[REG4]], [[REG3]]
; P8LE-DAG: xxmrghd [[REG6:[vs0-9]+]], [[REG4]], [[REG3]]		; P8LE-DAG: xxmrghd [[REG6:[vs0-9]+]], [[REG4]], [[REG3]]
; P8LE-DAG: xvcvdpsp [[REG7:[vs0-9]+]], [[REG5]]		; P8LE-DAG: xvcvdpuxws [[REG7:[vs0-9]+]], [[REG5]]
; P8LE-DAG: xvcvdpsp [[REG8:[vs0-9]+]], [[REG6]]		; P8LE-DAG: xvcvdpuxws [[REG8:[vs0-9]+]], [[REG6]]
; P8LE: vmrgew v2, [[REG8]], [[REG7]]		; P8LE: vmrgew v2, [[REG8]], [[REG7]]
; P8LE: xvcvspuxws v2, v2
}		}

; Function Attrs: norecurse nounwind readonly		; Function Attrs: norecurse nounwind readonly
define <4 x i32> @fromDiffMemConsDConvdtoui(double* nocapture readonly %ptr) {		define <4 x i32> @fromDiffMemConsDConvdtoui(double* nocapture readonly %ptr) {
entry:		entry:
%arrayidx = getelementptr inbounds double, double* %ptr, i64 3		%arrayidx = getelementptr inbounds double, double* %ptr, i64 3
%0 = load double, double* %arrayidx, align 8		%0 = load double, double* %arrayidx, align 8
%conv = fptoui double %0 to i32		%conv = fptoui double %0 to i32
; P8BE-LABEL: fromDiffMemConsDConvdtoui		; P8BE-LABEL: fromDiffMemConsDConvdtoui
; P8LE-LABEL: fromDiffMemConsDConvdtoui		; P8LE-LABEL: fromDiffMemConsDConvdtoui
; P9BE: lfd		; P9BE: lfd
; P9BE: lfd		; P9BE: lfd
; P9BE: lfd		; P9BE: lfd
; P9BE: lfd		; P9BE: lfd
; P9BE: xxmrghd		; P9BE: xxmrghd
; P9BE: xxmrghd		; P9BE: xxmrghd
; P9BE: xvcvdpsp		; P9BE: xvcvdpuxws
; P9BE: xvcvdpsp		; P9BE: xvcvdpuxws
; P9BE: vmrgew		; P9BE: vmrgew v2
; P9BE: xvcvspuxws v2
; P9LE: lfd		; P9LE: lfd
; P9LE: lfd		; P9LE: lfd
; P9LE: lfd		; P9LE: lfd
; P9LE: lfd		; P9LE: lfd
; P9LE: xxmrghd		; P9LE: xxmrghd
; P9LE: xxmrghd		; P9LE: xxmrghd
; P9LE: xvcvdpsp		; P9LE: xvcvdpuxws
; P9LE: xvcvdpsp		; P9LE: xvcvdpuxws
; P9LE: vmrgew		; P9LE: vmrgew v2
; P9LE: xvcvspuxws v2
; P8BE: lfdx		; P8BE: lfdx
; P8BE: lfd		; P8BE: lfd
; P8BE: lfd		; P8BE: lfd
; P8BE: lfd		; P8BE: lfd
; P8BE: xxmrghd		; P8BE: xxmrghd
; P8BE: xxmrghd		; P8BE: xxmrghd
; P8BE: xvcvdpsp		; P8BE: xvcvdpuxws
; P8BE: xvcvdpsp		; P8BE: xvcvdpuxws
; P8BE: vmrgew		; P8BE: vmrgew v2
; P8BE: xvcvspuxws v2
; P8LE: lfdx		; P8LE: lfdx
; P8LE: lfd		; P8LE: lfd
; P8LE: lfd		; P8LE: lfd
; P8LE: lfd		; P8LE: lfd
; P8LE: xxmrghd		; P8LE: xxmrghd
; P8LE: xxmrghd		; P8LE: xxmrghd
; P8LE: xvcvdpsp		; P8LE: xvcvdpuxws
; P8LE: xvcvdpsp		; P8LE: xvcvdpuxws
; P8LE: vmrgew		; P8LE: vmrgew v2
; P8LE: xvcvspuxws v2
}		}

; Function Attrs: norecurse nounwind readonly		; Function Attrs: norecurse nounwind readonly
define <4 x i32> @fromDiffMemVarAConvdtoui(double* nocapture readonly %arr, i32 signext %elem) {		define <4 x i32> @fromDiffMemVarAConvdtoui(double* nocapture readonly %arr, i32 signext %elem) {
entry:		entry:
%idxprom = sext i32 %elem to i64		%idxprom = sext i32 %elem to i64
%arrayidx = getelementptr inbounds double, double* %arr, i64 %idxprom		%arrayidx = getelementptr inbounds double, double* %arr, i64 %idxprom
%0 = load double, double* %arrayidx, align 8		%0 = load double, double* %arrayidx, align 8
; P8BE-LABEL: fromDiffMemVarAConvdtoui		; P8BE-LABEL: fromDiffMemVarAConvdtoui
; P8LE-LABEL: fromDiffMemVarAConvdtoui		; P8LE-LABEL: fromDiffMemVarAConvdtoui
; P9BE: lfdux		; P9BE: lfdux
; P9BE: lfd		; P9BE: lfd
; P9BE: lfd		; P9BE: lfd
; P9BE: lfd		; P9BE: lfd
; P9BE: xxmrghd		; P9BE: xxmrghd
; P9BE: xxmrghd		; P9BE: xxmrghd
; P9BE: xvcvdpsp		; P9BE: xvcvdpuxws
; P9BE: xvcvdpsp		; P9BE: xvcvdpuxws
; P9BE: vmrgew		; P9BE: vmrgew v2
; P9BE: xvcvspuxws v2
; P9LE: lfdux		; P9LE: lfdux
; P9LE: lfd		; P9LE: lfd
; P9LE: lfd		; P9LE: lfd
; P9LE: lfd		; P9LE: lfd
; P9LE: xxmrghd		; P9LE: xxmrghd
; P9LE: xxmrghd		; P9LE: xxmrghd
; P9LE: xvcvdpsp		; P9LE: xvcvdpuxws
; P9LE: xvcvdpsp		; P9LE: xvcvdpuxws
; P9LE: vmrgew		; P9LE: vmrgew v2
; P9LE: xvcvspuxws v2
; P8BE: lfdux		; P8BE: lfdux
; P8BE: lfd		; P8BE: lfd
; P8BE: lfd		; P8BE: lfd
; P8BE: lfd		; P8BE: lfd
; P8BE: xxmrghd		; P8BE: xxmrghd
; P8BE: xxmrghd		; P8BE: xxmrghd
; P8BE: xvcvdpsp		; P8BE: xvcvdpuxws
; P8BE: xvcvdpsp		; P8BE: xvcvdpuxws
; P8BE: vmrgew		; P8BE: vmrgew v2
; P8BE: xvcvspuxws v2
; P8LE: lfdux		; P8LE: lfdux
; P8LE: lfd		; P8LE: lfd
; P8LE: lfd		; P8LE: lfd
; P8LE: lfd		; P8LE: lfd
; P8LE: xxmrghd		; P8LE: xxmrghd
; P8LE: xxmrghd		; P8LE: xxmrghd
; P8LE: xvcvdpsp		; P8LE: xvcvdpuxws
; P8LE: xvcvdpsp		; P8LE: xvcvdpuxws
; P8LE: vmrgew		; P8LE: vmrgew v2
; P8LE: xvcvspuxws v2
}		}

; Function Attrs: norecurse nounwind readonly		; Function Attrs: norecurse nounwind readonly
define <4 x i32> @fromDiffMemVarDConvdtoui(double* nocapture readonly %arr, i32 signext %elem) {		define <4 x i32> @fromDiffMemVarDConvdtoui(double* nocapture readonly %arr, i32 signext %elem) {
entry:		entry:
%idxprom = sext i32 %elem to i64		%idxprom = sext i32 %elem to i64
%arrayidx = getelementptr inbounds double, double* %arr, i64 %idxprom		%arrayidx = getelementptr inbounds double, double* %arr, i64 %idxprom
%0 = load double, double* %arrayidx, align 8		%0 = load double, double* %arrayidx, align 8
; ... (23 lines not shown) ...
; P8BE-LABEL: fromDiffMemVarDConvdtoui		; P8BE-LABEL: fromDiffMemVarDConvdtoui
; P8LE-LABEL: fromDiffMemVarDConvdtoui		; P8LE-LABEL: fromDiffMemVarDConvdtoui
; P9BE: lfdux		; P9BE: lfdux
; P9BE: lfd		; P9BE: lfd
; P9BE: lfd		; P9BE: lfd
; P9BE: lfd		; P9BE: lfd
; P9BE: xxmrghd		; P9BE: xxmrghd
; P9BE: xxmrghd		; P9BE: xxmrghd
; P9BE: xvcvdpsp		; P9BE: xvcvdpuxws
; P9BE: xvcvdpsp		; P9BE: xvcvdpuxws
; P9BE: vmrgew		; P9BE: vmrgew v2
; P9BE: xvcvspuxws v2
; P9LE: lfdux		; P9LE: lfdux
; P9LE: lfd		; P9LE: lfd
; P9LE: lfd		; P9LE: lfd
; P9LE: lfd		; P9LE: lfd
; P9LE: xxmrghd		; P9LE: xxmrghd
; P9LE: xxmrghd		; P9LE: xxmrghd
; P9LE: xvcvdpsp		; P9LE: xvcvdpuxws
; P9LE: xvcvdpsp		; P9LE: xvcvdpuxws
; P9LE: vmrgew		; P9LE: vmrgew v2
; P9LE: xvcvspuxws v2
; P8BE: lfdux		; P8BE: lfdux
; P8BE: lfd		; P8BE: lfd
; P8BE: lfd		; P8BE: lfd
; P8BE: lfd		; P8BE: lfd
; P8BE: xxmrghd		; P8BE: xxmrghd
; P8BE: xxmrghd		; P8BE: xxmrghd
; P8BE: xvcvdpsp		; P8BE: xvcvdpuxws
; P8BE: xvcvdpsp		; P8BE: xvcvdpuxws
; P8BE: vmrgew		; P8BE: vmrgew v2
; P8BE: xvcvspuxws v2
; P8LE: lfdux		; P8LE: lfdux
; P8LE: lfd		; P8LE: lfd
; P8LE: lfd		; P8LE: lfd
; P8LE: lfd		; P8LE: lfd
; P8LE: xxmrghd		; P8LE: xxmrghd
; P8LE: xxmrghd		; P8LE: xxmrghd
; P8LE: xvcvdpsp		; P8LE: xvcvdpuxws
; P8LE: xvcvdpsp		; P8LE: xvcvdpuxws
; P8LE: vmrgew		; P8LE: vmrgew v2
; P8LE: xvcvspuxws v2
}		}

; Function Attrs: norecurse nounwind readnone		; Function Attrs: norecurse nounwind readnone
define <4 x i32> @spltRegValConvdtoui(double %val) {		define <4 x i32> @spltRegValConvdtoui(double %val) {
entry:		entry:
%conv = fptoui double %val to i32		%conv = fptoui double %val to i32
%splat.splatinsert = insertelement <4 x i32> undef, i32 %conv, i32 0		%splat.splatinsert = insertelement <4 x i32> undef, i32 %conv, i32 0
%splat.splat = shufflevector <4 x i32> %splat.splatinsert, <4 x i32> undef, <4 x i32> zeroinitializer		%splat.splat = shufflevector <4 x i32> %splat.splatinsert, <4 x i32> undef, <4 x i32> zeroinitializer