This is an archive of the discontinued LLVM Phabricator instance.

[SelectionDAG] Add a signed integer absolute ISD node
ClosedPublic

Authored by RKSimon on Feb 7 2017, 7:07 AM.

Details

Reviewers

spatel
delena
andreadb
jmolloy
mkuper
jlebar
hfinkel
efriedma

Commits

rGcf2da96c82e6: [SelectionDAG] Add a signed integer absolute ISD node
rL297780: [SelectionDAG] Add a signed integer absolute ISD node

Summary

Reduced version of D26357 - based on the discussion on llvm-dev about canonicalization of UMIN/UMAX/SMIN/SMAX as well as ABS I've reduced that patch to just the ABS ISD node (with x86/sse support) to improve basic combines and lowering.

AFAICT ARM/AArch64, PowerPC and NVPTX all have similar instructions, so I'd like to make this a generic opcode and move us away from the hard-coded tablegen patterns, which make it tricky to match more complex patterns.

At the moment this patch doesn't attempt legalization (and I only create an ABS node if it's legal/custom) - I can add the legalization code from D26357 if people think it useful at this stage - it will let us do some extra combines, knownbits handling etc.

Diff Detail

Repository: rL LLVM

Event Timeline

RKSimon created this revision. Feb 7 2017, 7:07 AM

Herald added subscribers: igorb, aemerson. Feb 7 2017, 7:07 AM

I think this is a good idea given the usual trade-off (we lose the opportunity to combine the xor with other logic, simplify it, etc.), especially since we only introduce this post-legalization.
But I'd like to hear what other people think too.

include/llvm/CodeGen/ISDOpcodes.h
343 ↗	(On Diff #87426)	Is this the right behavior for a target-independent intrinsic? That is, do the other (non-x86) platforms that have an ABS instruction have the same behavior?
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
5502 ↗	(On Diff #87426)	:-( (I was really surprised by this, but looks like we have this pattern everywhere.)
lib/Target/X86/X86ISelLowering.cpp
20853 ↗	(On Diff #87426)	Is there a reason not to use ISD::ABS explicitly here? I think it'd be clearer.

Thanks Michael - does anyone from a non-X86 target have any comments?

include/llvm/CodeGen/ISDOpcodes.h
343 ↗	(On Diff #87426)	AFAICT this is the norm and matches what we do for constant folding of the std::abs() function as well, although technically it's undefined if the result isn't representable. ARM NEON has the same functionality with its basic vabs_* intrinsic, and has a saturating version vqabs_* as well. PowerPC has vec_abs / vec_abss. Not sure about the other targets? Saturating patterns can probably best be handled by abs(smax(v, INT_MIN + 1)) style canonicalizations? But that would still have to be target specific.
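To illustrate the abs(smax(v, INT_MIN + 1)) idea mentioned above, here is a minimal scalar sketch in plain C++. The helper name saturatingAbs is hypothetical and not part of the patch; it only models the arithmetic on a 32-bit lane, not any DAG node.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>

// Hypothetical helper (not from the patch): saturating absolute value built
// from a signed-max clamp followed by an ordinary abs.
constexpr int32_t saturatingAbs(int32_t V) {
  // smax(v, INT_MIN + 1): the only input whose negation is unrepresentable
  // is INT_MIN, so clamp it up by one first.
  int32_t Clamped = std::max(V, std::numeric_limits<int32_t>::min() + 1);
  // abs: after the clamp the negation can no longer overflow.
  return Clamped < 0 ? -Clamped : Clamped;
}

static_assert(saturatingAbs(-5) == 5, "ordinary negative input");
static_assert(saturatingAbs(std::numeric_limits<int32_t>::min()) ==
                  std::numeric_limits<int32_t>::max(),
              "INT_MIN saturates to INT_MAX");
```

The clamp changes only the INT_MIN input, which is why the non-saturating node can stay generic while the saturating form remains a target-specific pattern.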
lib/Target/X86/X86ISelLowering.cpp
20853 ↗	(On Diff #87426)	That's me trying to be too generic - will fix.

rebased and tidied up X86 LowerABS

@hfinkel @jmolloy Is this approach compatible with PowerPC and ARM? Cheers.

Hi,

Having asked around: The way we define this is that the VABS instruction takes a signed integer and outputs an unsigned integer, getting around this problem.

However, I believe the output of VABS(INT_MIN) is indeed bit-identical to INT_MIN.

+ @jgreenhalgh to confirm I haven't mangled his explanation.

James

In D29639#682183, @jmolloy wrote:

Hi,

Having asked around: The way we define this is that the VABS instruction takes a signed integer and outputs an unsigned integer, getting around this problem.

However, I believe the output of VABS(INT_MIN) is indeed bit-identical to INT_MIN.

+ @jgreenhalgh to confirm I haven't mangled his explanation.

Only a minor tweak on a pedantic point - the instruction doesn't really have an idea of signed/unsigned - just bits. The effect is as if you were calculating in an unsigned value of the same number of bits as the input.

That doesn't hold for the intrinsics which are all signed -> signed. But the behaviour there is ABS(INT_MIN) -> INT_MIN, so the behaviour in this patch looks fine for ARM/AArch64 Advanced SIMD.
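The "as if calculated in an unsigned value of the same number of bits" behaviour described above can be sketched in plain C++ as follows. The helper wrappingAbsBits is a hypothetical name used only for illustration; it works on the raw bit pattern so that the INT_MIN case wraps rather than hitting signed-overflow undefined behaviour.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper (not from the patch): absolute value computed in
// unsigned arithmetic of the same width as the input.
constexpr uint32_t wrappingAbsBits(int32_t X) {
  uint32_t U = static_cast<uint32_t>(X);
  // Unsigned negation wraps modulo 2^32, so it is well defined for every X.
  return X < 0 ? (0u - U) : U;
}

constexpr uint32_t IntMinBits =
    static_cast<uint32_t>(std::numeric_limits<int32_t>::min());

static_assert(wrappingAbsBits(-5) == 5u, "ordinary negative input");
// abs(INT_MIN) is bit-identical to INT_MIN: no saturation is performed.
static_assert(wrappingAbsBits(std::numeric_limits<int32_t>::min()) == IntMinBits,
              "INT_MIN maps to the INT_MIN bit pattern");
```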

Rebased and made it explicit in the ISDOpcodes.h that the input and result of ISD::ABS have the same bitwidth.

Does anyone have any other comments? What about legalization - should I add it as part of this patch or just keep to the legal cases for now?

ping?

ping^2

Would it make sense to use ISD::ABS over combineIntegerAbs in X86ISelLowering.cpp?

lib/CodeGen/SelectionDAG/DAGCombiner.cpp
5514 ↗	(On Diff #89414)	There are a lot of other folds we could do here (e.g. abs(-x) -> abs(x)), but maybe not worth worrying about?

In D29639#699976, @efriedma wrote:

Would it make sense to use ISD::ABS over combineIntegerAbs in X86ISelLowering.cpp?

Yes, although probably as a followup once the legalization support is in place.

lib/CodeGen/SelectionDAG/DAGCombiner.cpp
5514 ↗	(On Diff #89414)	Definitely scope for further combines, but I'd rather see them in the wild before adding them.
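For what it's worth, the abs(-x) -> abs(x) fold suggested above can be sanity-checked exhaustively on 8-bit values under the wrapping semantics (abs(INT_MIN) == INT_MIN). This is only a sketch in plain C++ with hypothetical helper names; it is not code from the patch and says nothing about whether the fold is worth adding.

```cpp
#include <cstdint>

// Hypothetical helpers modelling i8 values as raw bit patterns.
constexpr uint8_t negBits(uint8_t X) { return static_cast<uint8_t>(0u - X); }

// Wrapping abs: negate when the sign bit is set, otherwise pass through.
constexpr uint8_t absBits(uint8_t X) { return (X & 0x80u) ? negBits(X) : X; }

// Check abs(-x) == abs(x) for all 256 i8 bit patterns.
constexpr bool absOfNegFoldHoldsForAllI8() {
  for (unsigned V = 0; V < 256; ++V) {
    uint8_t X = static_cast<uint8_t>(V);
    if (absBits(negBits(X)) != absBits(X))
      return false;
  }
  return true;
}

static_assert(absOfNegFoldHoldsForAllI8(), "abs(-x) == abs(x) on i8");
```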

In D29639#678538, @RKSimon wrote:

Thanks Michael - does anyone from a non-X86 target have any comments?

Hexagon has integer ABS for 32- and 64-bit integers, and for both of them abs(min_int) == min_int, so this works for us. FWIW, we also have a saturating abs for 32-bit integers.

LGTM.

It looks like we have consensus that this is generally useful, and the implementation looks fine.

This revision is now accepted and ready to land. Mar 14 2017, 11:10 AM

Closed by commit rL297780: [SelectionDAG] Add a signed integer absolute ISD node (authored by RKSimon). Mar 14 2017, 2:39 PM

This revision was automatically updated to reflect the committed changes.

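Revision Contents

Path	Size
llvm/trunk/include/llvm/CodeGen/ISDOpcodes.h	6 lines
llvm/trunk/include/llvm/Target/TargetSelectionDAG.td	1 line
llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp	29 lines
llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp	10 lines
llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp	1 line
llvm/trunk/lib/CodeGen/TargetLoweringBase.cpp	1 line
llvm/trunk/lib/Target/X86/X86ISelLowering.h	3 lines
llvm/trunk/lib/Target/X86/X86ISelLowering.cpp	29 lines
llvm/trunk/lib/Target/X86/X86InstrAVX512.td	61 lines
llvm/trunk/lib/Target/X86/X86InstrFragmentsSIMD.td	1 line
llvm/trunk/lib/Target/X86/X86InstrSSE.td	78 lines
llvm/trunk/lib/Target/X86/X86IntrinsicsInfo.h	36 lines
llvm/trunk/test/CodeGen/X86/combine-abs.ll	8 lines
llvm/trunk/test/CodeGen/X86/viabs.ll	142 lines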

Diff 91774

llvm/trunk/include/llvm/CodeGen/ISDOpcodes.h

(333 lines not shown) enum NodeType {

     /// [US]{MIN/MAX} - Binary minimum or maximum or signed or unsigned
     /// integers.
     SMIN, SMAX, UMIN, UMAX,

     /// Bitwise operators - logical and, logical or, logical xor.
     AND, OR, XOR,

+    /// ABS - Determine the unsigned absolute value of a signed integer value of
+    /// the same bitwidth.
+    /// Note: A value of INT_MIN will return INT_MIN, no saturation or overflow
+    /// is performed.
+    ABS,

     /// Shift and rotation operations. After legalization, the type of the
     /// shift amount is known to be TLI.getShiftAmountTy(). Before legalization
     /// the shift amount can be any type, but care must be taken to ensure it is
     /// large enough. TLI.getShiftAmountTy() is i8 on some targets, but before
     /// legalization, types like i1024 can occur and i8 doesn't have enough bits
     /// to represent the shift amount.
     /// When the 1st operand is a vector, the shift amount must be in the same
     /// type. (TLI.getShiftAmountTy() will return the same type when the input
(579 lines not shown)

llvm/trunk/include/llvm/Target/TargetSelectionDAG.td

(407 lines not shown) def umin : SDNode<"ISD::UMIN" , SDTIntBinOp,
                               [SDNPCommutative, SDNPAssociative]>;
 def umax : SDNode<"ISD::UMAX" , SDTIntBinOp,
                               [SDNPCommutative, SDNPAssociative]>;

 def sext_inreg : SDNode<"ISD::SIGN_EXTEND_INREG", SDTExtInreg>;
 def sext_invec : SDNode<"ISD::SIGN_EXTEND_VECTOR_INREG", SDTExtInvec>;
 def zext_invec : SDNode<"ISD::ZERO_EXTEND_VECTOR_INREG", SDTExtInvec>;

+def abs : SDNode<"ISD::ABS" , SDTIntUnaryOp>;
 def bitreverse : SDNode<"ISD::BITREVERSE" , SDTIntUnaryOp>;
 def bswap : SDNode<"ISD::BSWAP" , SDTIntUnaryOp>;
 def ctlz : SDNode<"ISD::CTLZ" , SDTIntUnaryOp>;
 def cttz : SDNode<"ISD::CTTZ" , SDTIntUnaryOp>;
 def ctpop : SDNode<"ISD::CTPOP" , SDTIntUnaryOp>;
 def ctlz_zero_undef : SDNode<"ISD::CTLZ_ZERO_UNDEF", SDTIntUnaryOp>;
 def cttz_zero_undef : SDNode<"ISD::CTTZ_ZERO_UNDEF", SDTIntUnaryOp>;
 def sext : SDNode<"ISD::SIGN_EXTEND", SDTIntExtendOp>;
(709 lines not shown)

llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp

(256 lines not shown) private:
     SDValue visitOR(SDNode *N);
     SDValue visitORLike(SDValue N0, SDValue N1, SDNode *LocReference);
     SDValue visitXOR(SDNode *N);
     SDValue SimplifyVBinOp(SDNode *N);
     SDValue visitSHL(SDNode *N);
     SDValue visitSRA(SDNode *N);
     SDValue visitSRL(SDNode *N);
     SDValue visitRotate(SDNode *N);
+    SDValue visitABS(SDNode *N);
     SDValue visitBSWAP(SDNode *N);
     SDValue visitBITREVERSE(SDNode *N);
     SDValue visitCTLZ(SDNode *N);
     SDValue visitCTLZ_ZERO_UNDEF(SDNode *N);
     SDValue visitCTTZ(SDNode *N);
     SDValue visitCTTZ_ZERO_UNDEF(SDNode *N);
     SDValue visitCTPOP(SDNode *N);
     SDValue visitSELECT(SDNode *N);
(1,145 lines not shown) SDValue DAGCombiner::visit(SDNode *N) {
   case ISD::AND: return visitAND(N);
   case ISD::OR: return visitOR(N);
   case ISD::XOR: return visitXOR(N);
   case ISD::SHL: return visitSHL(N);
   case ISD::SRA: return visitSRA(N);
   case ISD::SRL: return visitSRL(N);
   case ISD::ROTR:
   case ISD::ROTL: return visitRotate(N);
+  case ISD::ABS: return visitABS(N);
   case ISD::BSWAP: return visitBSWAP(N);
   case ISD::BITREVERSE: return visitBITREVERSE(N);
   case ISD::CTLZ: return visitCTLZ(N);
   case ISD::CTLZ_ZERO_UNDEF: return visitCTLZ_ZERO_UNDEF(N);
   case ISD::CTTZ: return visitCTTZ(N);
   case ISD::CTTZ_ZERO_UNDEF: return visitCTTZ_ZERO_UNDEF(N);
   case ISD::CTPOP: return visitCTPOP(N);
   case ISD::SELECT: return visitSELECT(N);
(3,565 lines not shown) if (N1C && N0.getOpcode() == ISD::XOR) {
     }
     if (const ConstantSDNode *N01C = getAsNonOpaqueConstant(N0.getOperand(1))) {
       SDLoc DL(N);
       return DAG.getNode(ISD::XOR, DL, VT, N0.getOperand(0),
                          DAG.getConstant(N1C->getAPIntValue() ^
                                          N01C->getAPIntValue(), DL, VT));
     }
   }

+  // fold Y = sra (X, size(X)-1); xor (add (X, Y), Y) -> (abs X)
+  unsigned OpSizeInBits = VT.getScalarSizeInBits();
+  if (N0.getOpcode() == ISD::ADD && N0.getOperand(1) == N1 &&
+      N1.getOpcode() == ISD::SRA && N1.getOperand(0) == N0.getOperand(0) &&
+      TLI.isOperationLegalOrCustom(ISD::ABS, VT)) {
+    if (ConstantSDNode *C = isConstOrConstSplat(N1.getOperand(1)))
+      if (C->getAPIntValue() == (OpSizeInBits - 1))
+        return DAG.getNode(ISD::ABS, SDLoc(N), VT, N0.getOperand(0));
+  }

   // fold (xor x, x) -> 0
   if (N0 == N1)
     return tryFoldToZero(SDLoc(N), TLI, VT, DAG, LegalOperations, LegalTypes);

   // fold (xor (shl 1, x), -1) -> (rotl ~1, x)
   // Here is a concrete example of this equivalence:
   // i16 x == 14
   // i16 shl == 1 << 14 == 16384 == 0b0100000000000000
(726 lines not shown) else if (Use->getOpcode() == ISD::TRUNCATE && Use->hasOneUse()) {
         if (Use->getOpcode() == ISD::BRCOND)
           AddToWorklist(Use);
       }
     }

   return SDValue();
 }

+SDValue DAGCombiner::visitABS(SDNode *N) {
+  SDValue N0 = N->getOperand(0);
+  EVT VT = N->getValueType(0);
+
+  // fold (abs c1) -> c2
+  if (DAG.isConstantIntBuildVectorOrConstantInt(N0))
+    return DAG.getNode(ISD::ABS, SDLoc(N), VT, N0);
+  // fold (abs (abs x)) -> (abs x)
+  if (N0.getOpcode() == ISD::ABS)
+    return N0;
+  // fold (abs x) -> x iff not-negative
+  if (DAG.SignBitIsZero(N0))
+    return N0;
+  return SDValue();
+}

 SDValue DAGCombiner::visitBSWAP(SDNode *N) {
   SDValue N0 = N->getOperand(0);
   EVT VT = N->getValueType(0);

   // fold (bswap c1) -> c2
   if (DAG.isConstantIntBuildVectorOrConstantInt(N0))
     return DAG.getNode(ISD::BSWAP, SDLoc(N), VT, N0);
   // fold (bswap (bswap x)) -> x
(10,554 lines not shown)
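As a sanity check on the xor/add/sra fold in the DAGCombiner.cpp change above, the identity can be verified exhaustively for 8-bit values. This is a sketch in plain C++ with hypothetical helper names, not code from the patch. To avoid depending on how C++ shifts negative values, sra(X, 7) is modelled directly as the sign-fill mask it produces (all ones for negative X, zero otherwise).

```cpp
#include <cstdint>

// Models Y = sra(X, size(X)-1) on an i8 bit pattern: 0xFF if the sign bit is
// set, 0x00 otherwise. (Assumption: this is what an arithmetic shift yields.)
constexpr uint8_t sraSignFill(uint8_t X) { return (X & 0x80u) ? 0xFFu : 0x00u; }

// The pattern being matched: xor(add(X, Y), Y).
constexpr uint8_t absViaXorAdd(uint8_t X) {
  uint8_t Y = sraSignFill(X);
  uint8_t Sum = static_cast<uint8_t>(X + Y); // add wraps modulo 2^8
  return static_cast<uint8_t>(Sum ^ Y);
}

// Reference wrapping abs: abs(INT8_MIN) stays INT8_MIN (0x80).
constexpr uint8_t absReference(uint8_t X) {
  return (X & 0x80u) ? static_cast<uint8_t>(0u - X) : X;
}

// Compare the two forms for all 256 i8 bit patterns.
constexpr bool foldMatchesForAllI8() {
  for (unsigned V = 0; V < 256; ++V) {
    uint8_t X = static_cast<uint8_t>(V);
    if (absViaXorAdd(X) != absReference(X))
      return false;
  }
  return true;
}

static_assert(foldMatchesForAllI8(), "xor(add(X, Y), Y) == abs(X) on i8");
```

Note that the 0x80 input maps to 0x80 in both forms, which is consistent with the "INT_MIN will return INT_MIN" note on the new opcode.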

llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp

(3,330 lines not shown) case ISD::BITCAST:
         return getConstantFP(APFloat(APFloat::IEEEhalf(), Val), DL, VT);
       if (VT == MVT::f32 && C->getValueType(0) == MVT::i32)
         return getConstantFP(APFloat(APFloat::IEEEsingle(), Val), DL, VT);
       if (VT == MVT::f64 && C->getValueType(0) == MVT::i64)
         return getConstantFP(APFloat(APFloat::IEEEdouble(), Val), DL, VT);
       if (VT == MVT::f128 && C->getValueType(0) == MVT::i128)
         return getConstantFP(APFloat(APFloat::IEEEquad(), Val), DL, VT);
       break;
+    case ISD::ABS:
+      return getConstant(Val.abs(), DL, VT, C->isTargetOpcode(),
+                         C->isOpaque());
     case ISD::BITREVERSE:
       return getConstant(Val.reverseBits(), DL, VT, C->isTargetOpcode(),
                          C->isOpaque());
     case ISD::BSWAP:
       return getConstant(Val.byteSwap(), DL, VT, C->isTargetOpcode(),
                          C->isOpaque());
     case ISD::CTPOP:
       return getConstant(Val.countPopulation(), DL, VT, C->isTargetOpcode(),
(103 lines not shown) if (BV->isConstant()) {
       case ISD::FTRUNC:
       case ISD::FFLOOR:
       case ISD::FP_EXTEND:
       case ISD::FP_TO_SINT:
       case ISD::FP_TO_UINT:
       case ISD::TRUNCATE:
       case ISD::UINT_TO_FP:
       case ISD::SINT_TO_FP:
+      case ISD::ABS:
       case ISD::BITREVERSE:
       case ISD::BSWAP:
       case ISD::CTLZ:
       case ISD::CTLZ_ZERO_UNDEF:
       case ISD::CTTZ:
       case ISD::CTTZ_ZERO_UNDEF:
       case ISD::CTPOP: {
         SDValue Ops = { Operand };
(101 lines not shown) if (OpOpcode == ISD::ZERO_EXTEND || OpOpcode == ISD::SIGN_EXTEND ||
               .bitsLT(VT.getScalarType()))
         return getNode(OpOpcode, DL, VT, Operand.getNode()->getOperand(0));
       if (Operand.getNode()->getOperand(0).getValueType().bitsGT(VT))
         return getNode(ISD::TRUNCATE, DL, VT, Operand.getNode()->getOperand(0));
       return Operand.getNode()->getOperand(0);
     }
     if (OpOpcode == ISD::UNDEF)
       return getUNDEF(VT);
     break;
+  case ISD::ABS:
+    assert(VT.isInteger() && VT == Operand.getValueType() &&
+           "Invalid ABS!");
+    if (OpOpcode == ISD::UNDEF)
+      return getUNDEF(VT);
+    break;
   case ISD::BSWAP:
     assert(VT.isInteger() && VT == Operand.getValueType() &&
            "Invalid BSWAP!");
     assert((VT.getScalarSizeInBits() % 16 == 0) &&
            "BSWAP types must be a multiple of 16 bits!");
     if (OpOpcode == ISD::UNDEF)
       return getUNDEF(VT);
     break;
(4,191 lines not shown)

llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp

(294 lines not shown) #endif
   case ISD::DEBUGTRAP: return "debugtrap";
   case ISD::LIFETIME_START: return "lifetime.start";
   case ISD::LIFETIME_END: return "lifetime.end";
   case ISD::GC_TRANSITION_START: return "gc_transition.start";
   case ISD::GC_TRANSITION_END: return "gc_transition.end";
   case ISD::GET_DYNAMIC_AREA_OFFSET: return "get.dynamic.area.offset";

   // Bit manipulation
+  case ISD::ABS: return "abs";
   case ISD::BITREVERSE: return "bitreverse";
   case ISD::BSWAP: return "bswap";
   case ISD::CTPOP: return "ctpop";
   case ISD::CTTZ: return "cttz";
   case ISD::CTTZ_ZERO_UNDEF: return "cttz_zero_undef";
   case ISD::CTLZ: return "ctlz";
   case ISD::CTLZ_ZERO_UNDEF: return "ctlz_zero_undef";

(405 lines not shown)

llvm/trunk/lib/CodeGen/TargetLoweringBase.cpp

(894 lines not shown) for (MVT VT : MVT::all_valuetypes()) {
     setOperationAction(ISD::FMAXNUM, VT, Expand);
     setOperationAction(ISD::FMINNAN, VT, Expand);
     setOperationAction(ISD::FMAXNAN, VT, Expand);
     setOperationAction(ISD::FMAD, VT, Expand);
     setOperationAction(ISD::SMIN, VT, Expand);
     setOperationAction(ISD::SMAX, VT, Expand);
     setOperationAction(ISD::UMIN, VT, Expand);
     setOperationAction(ISD::UMAX, VT, Expand);
+    setOperationAction(ISD::ABS, VT, Expand);

     // Overflow operations default to expand
     setOperationAction(ISD::SADDO, VT, Expand);
     setOperationAction(ISD::SSUBO, VT, Expand);
     setOperationAction(ISD::UADDO, VT, Expand);
     setOperationAction(ISD::USUBO, VT, Expand);
     setOperationAction(ISD::SMULO, VT, Expand);
     setOperationAction(ISD::UMULO, VT, Expand);
(1,184 lines not shown)

llvm/trunk/lib/Target/X86/X86ISelLowering.h

(233 lines not shown) enum NodeType : unsigned {
       /// Integer horizontal add/sub.
       HADD,
       HSUB,

       /// Floating point horizontal add/sub.
       FHADD,
       FHSUB,

-      // Integer absolute value
-      ABS,

       // Detect Conflicts Within a Vector
       CONFLICT,

       /// Floating point max and min.
       FMAX, FMIN,

       /// Commutative FMIN and FMAX.
       FMAXC, FMINC,
(1,142 lines not shown)

llvm/trunk/lib/Target/X86/X86ISelLowering.cpp

This file is larger than 256 KB, so syntax highlighting is disabled by default.

Show First 20 Lines • Show All 890 Lines • ▼ Show 20 Lines	if (!Subtarget.useSoftFloat() && Subtarget.hasSSE2()) {
for (auto VT : { MVT::v4i32, MVT::v2i64 }) {		for (auto VT : { MVT::v4i32, MVT::v2i64 }) {
setOperationAction(ISD::SRL, VT, Custom);		setOperationAction(ISD::SRL, VT, Custom);
setOperationAction(ISD::SHL, VT, Custom);		setOperationAction(ISD::SHL, VT, Custom);
setOperationAction(ISD::SRA, VT, Custom);		setOperationAction(ISD::SRA, VT, Custom);
}		}
}		}

if (!Subtarget.useSoftFloat() && Subtarget.hasSSSE3()) {		if (!Subtarget.useSoftFloat() && Subtarget.hasSSSE3()) {
		setOperationAction(ISD::ABS, MVT::v16i8, Legal);
		setOperationAction(ISD::ABS, MVT::v8i16, Legal);
		setOperationAction(ISD::ABS, MVT::v4i32, Legal);
setOperationAction(ISD::BITREVERSE, MVT::v16i8, Custom);		setOperationAction(ISD::BITREVERSE, MVT::v16i8, Custom);
setOperationAction(ISD::CTLZ, MVT::v16i8, Custom);		setOperationAction(ISD::CTLZ, MVT::v16i8, Custom);
setOperationAction(ISD::CTLZ, MVT::v8i16, Custom);		setOperationAction(ISD::CTLZ, MVT::v8i16, Custom);
setOperationAction(ISD::CTLZ, MVT::v4i32, Custom);		setOperationAction(ISD::CTLZ, MVT::v4i32, Custom);
setOperationAction(ISD::CTLZ, MVT::v2i64, Custom);		setOperationAction(ISD::CTLZ, MVT::v2i64, Custom);
}		}

if (!Subtarget.useSoftFloat() && Subtarget.hasSSE41()) {		if (!Subtarget.useSoftFloat() && Subtarget.hasSSE41()) {
▲ Show 20 Lines • Show All 169 Lines • ▼ Show 20 Lines	if (!Subtarget.useSoftFloat() && Subtarget.hasFp256()) {
setOperationAction(ISD::SMUL_LOHI, MVT::v8i32, Custom);		setOperationAction(ISD::SMUL_LOHI, MVT::v8i32, Custom);

setOperationAction(ISD::MULHU, MVT::v16i16, HasInt256 ? Legal : Custom);		setOperationAction(ISD::MULHU, MVT::v16i16, HasInt256 ? Legal : Custom);
setOperationAction(ISD::MULHS, MVT::v16i16, HasInt256 ? Legal : Custom);		setOperationAction(ISD::MULHS, MVT::v16i16, HasInt256 ? Legal : Custom);
setOperationAction(ISD::MULHU, MVT::v32i8, Custom);		setOperationAction(ISD::MULHU, MVT::v32i8, Custom);
setOperationAction(ISD::MULHS, MVT::v32i8, Custom);		setOperationAction(ISD::MULHS, MVT::v32i8, Custom);

for (auto VT : { MVT::v32i8, MVT::v16i16, MVT::v8i32 }) {		for (auto VT : { MVT::v32i8, MVT::v16i16, MVT::v8i32 }) {
		setOperationAction(ISD::ABS, VT, HasInt256 ? Legal : Custom);
setOperationAction(ISD::SMAX, VT, HasInt256 ? Legal : Custom);		setOperationAction(ISD::SMAX, VT, HasInt256 ? Legal : Custom);
setOperationAction(ISD::UMAX, VT, HasInt256 ? Legal : Custom);		setOperationAction(ISD::UMAX, VT, HasInt256 ? Legal : Custom);
setOperationAction(ISD::SMIN, VT, HasInt256 ? Legal : Custom);		setOperationAction(ISD::SMIN, VT, HasInt256 ? Legal : Custom);
setOperationAction(ISD::UMIN, VT, HasInt256 ? Legal : Custom);		setOperationAction(ISD::UMIN, VT, HasInt256 ? Legal : Custom);
}		}

if (HasInt256) {		if (HasInt256) {
setOperationAction(ISD::SIGN_EXTEND_VECTOR_INREG, MVT::v4i64, Custom);		setOperationAction(ISD::SIGN_EXTEND_VECTOR_INREG, MVT::v4i64, Custom);
▲ Show 20 Lines • Show All 190 Lines • ▼ Show 20 Lines	if (Subtarget.hasDQI()) {
if (Subtarget.hasVLX()) {		if (Subtarget.hasVLX()) {
// Fast v2f32 SINT_TO_FP( v2i32 ) custom conversion.		// Fast v2f32 SINT_TO_FP( v2i32 ) custom conversion.
setOperationAction(ISD::SINT_TO_FP, MVT::v2f32, Custom);		setOperationAction(ISD::SINT_TO_FP, MVT::v2f32, Custom);
setOperationAction(ISD::FP_TO_SINT, MVT::v2f32, Custom);		setOperationAction(ISD::FP_TO_SINT, MVT::v2f32, Custom);
setOperationAction(ISD::FP_TO_UINT, MVT::v2f32, Custom);		setOperationAction(ISD::FP_TO_UINT, MVT::v2f32, Custom);
}		}
}		}
if (Subtarget.hasVLX()) {		if (Subtarget.hasVLX()) {
		setOperationAction(ISD::ABS, MVT::v4i64, Legal);
		setOperationAction(ISD::ABS, MVT::v2i64, Legal);
setOperationAction(ISD::SINT_TO_FP, MVT::v8i32, Legal);		setOperationAction(ISD::SINT_TO_FP, MVT::v8i32, Legal);
setOperationAction(ISD::UINT_TO_FP, MVT::v8i32, Legal);		setOperationAction(ISD::UINT_TO_FP, MVT::v8i32, Legal);
setOperationAction(ISD::FP_TO_SINT, MVT::v8i32, Legal);		setOperationAction(ISD::FP_TO_SINT, MVT::v8i32, Legal);
setOperationAction(ISD::FP_TO_UINT, MVT::v8i32, Legal);		setOperationAction(ISD::FP_TO_UINT, MVT::v8i32, Legal);
setOperationAction(ISD::SINT_TO_FP, MVT::v4i32, Legal);		setOperationAction(ISD::SINT_TO_FP, MVT::v4i32, Legal);
setOperationAction(ISD::FP_TO_SINT, MVT::v4i32, Legal);		setOperationAction(ISD::FP_TO_SINT, MVT::v4i32, Legal);
setOperationAction(ISD::FP_TO_UINT, MVT::v4i32, Legal);		setOperationAction(ISD::FP_TO_UINT, MVT::v4i32, Legal);
setOperationAction(ISD::ZERO_EXTEND, MVT::v4i32, Custom);		setOperationAction(ISD::ZERO_EXTEND, MVT::v4i32, Custom);
▲ Show 20 Lines • Show All 80 Lines • ▼ Show 20 Lines	if (!Subtarget.useSoftFloat() && Subtarget.hasAVX512()) {
setOperationAction(ISD::SUB, MVT::v8i1, Custom);		setOperationAction(ISD::SUB, MVT::v8i1, Custom);
setOperationAction(ISD::SUB, MVT::v16i1, Custom);		setOperationAction(ISD::SUB, MVT::v16i1, Custom);
setOperationAction(ISD::MUL, MVT::v8i1, Custom);		setOperationAction(ISD::MUL, MVT::v8i1, Custom);
setOperationAction(ISD::MUL, MVT::v16i1, Custom);		setOperationAction(ISD::MUL, MVT::v16i1, Custom);

setOperationAction(ISD::MUL, MVT::v16i32, Legal);		setOperationAction(ISD::MUL, MVT::v16i32, Legal);

for (auto VT : { MVT::v16i32, MVT::v8i64 }) {		for (auto VT : { MVT::v16i32, MVT::v8i64 }) {
		setOperationAction(ISD::ABS, VT, Legal);
setOperationAction(ISD::SRL, VT, Custom);		setOperationAction(ISD::SRL, VT, Custom);
setOperationAction(ISD::SHL, VT, Custom);		setOperationAction(ISD::SHL, VT, Custom);
setOperationAction(ISD::SRA, VT, Custom);		setOperationAction(ISD::SRA, VT, Custom);
setOperationAction(ISD::CTPOP, VT, Custom);		setOperationAction(ISD::CTPOP, VT, Custom);
setOperationAction(ISD::CTTZ, VT, Custom);		setOperationAction(ISD::CTTZ, VT, Custom);
}		}

// Need to promote to 64-bit even though we have 32-bit masked instructions		// Need to promote to 64-bit even though we have 32-bit masked instructions
▲ Show 20 Lines • Show All 163 Lines • ▼ Show 20 Lines	if (!Subtarget.useSoftFloat() && Subtarget.hasBWI()) {
if (Subtarget.hasCDI()) {		if (Subtarget.hasCDI()) {
setOperationAction(ISD::CTLZ, MVT::v32i16, Custom);		setOperationAction(ISD::CTLZ, MVT::v32i16, Custom);
setOperationAction(ISD::CTLZ, MVT::v64i8, Custom);		setOperationAction(ISD::CTLZ, MVT::v64i8, Custom);
}		}

for (auto VT : { MVT::v64i8, MVT::v32i16 }) {		for (auto VT : { MVT::v64i8, MVT::v32i16 }) {
setOperationAction(ISD::BUILD_VECTOR, VT, Custom);		setOperationAction(ISD::BUILD_VECTOR, VT, Custom);
setOperationAction(ISD::VSELECT, VT, Legal);		setOperationAction(ISD::VSELECT, VT, Legal);
		setOperationAction(ISD::ABS, VT, Legal);
setOperationAction(ISD::SRL, VT, Custom);		setOperationAction(ISD::SRL, VT, Custom);
setOperationAction(ISD::SHL, VT, Custom);		setOperationAction(ISD::SHL, VT, Custom);
setOperationAction(ISD::SRA, VT, Custom);		setOperationAction(ISD::SRA, VT, Custom);
setOperationAction(ISD::MLOAD, VT, Legal);		setOperationAction(ISD::MLOAD, VT, Legal);
setOperationAction(ISD::MSTORE, VT, Legal);		setOperationAction(ISD::MSTORE, VT, Legal);
setOperationAction(ISD::CTPOP, VT, Custom);		setOperationAction(ISD::CTPOP, VT, Custom);
setOperationAction(ISD::CTTZ, VT, Custom);		setOperationAction(ISD::CTTZ, VT, Custom);

▲ Show 20 Lines • Show All 19,577 Lines • ▼ Show 20 Lines	if (VT.getScalarType() == MVT::i1)
return DAG.getNode(ISD::XOR, SDLoc(Op), VT,		return DAG.getNode(ISD::XOR, SDLoc(Op), VT,
Op.getOperand(0), Op.getOperand(1));		Op.getOperand(0), Op.getOperand(1));
assert(Op.getSimpleValueType().is256BitVector() &&		assert(Op.getSimpleValueType().is256BitVector() &&
Op.getSimpleValueType().isInteger() &&		Op.getSimpleValueType().isInteger() &&
"Only handle AVX 256-bit vector integer operation");		"Only handle AVX 256-bit vector integer operation");
return Lower256IntArith(Op, DAG);		return Lower256IntArith(Op, DAG);
}		}

		static SDValue LowerABS(SDValue Op, SelectionDAG &DAG) {
		assert(Op.getSimpleValueType().is256BitVector() &&
		Op.getSimpleValueType().isInteger() &&
		"Only handle AVX 256-bit vector integer operation");
		MVT VT = Op.getSimpleValueType();
		unsigned NumElems = VT.getVectorNumElements();

		SDLoc dl(Op);
		SDValue Src = Op.getOperand(0);
		SDValue Lo = extract128BitVector(Src, 0, DAG, dl);
		SDValue Hi = extract128BitVector(Src, NumElems / 2, DAG, dl);

		MVT EltVT = VT.getVectorElementType();
		MVT NewVT = MVT::getVectorVT(EltVT, NumElems / 2);
		return DAG.getNode(ISD::CONCAT_VECTORS, dl, VT,
		DAG.getNode(ISD::ABS, dl, NewVT, Lo),
		DAG.getNode(ISD::ABS, dl, NewVT, Hi));
		}
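As a rough illustration of what the new LowerABS does, the standalone sketch below models a 256-bit v8i32 value as a std::array and mirrors the same three steps: split into two 128-bit halves, take the absolute value of each half, and concatenate the results. The lane model and the names absLane, absHalf and lowerAbs256 are assumptions made for this example; they are not LLVM APIs, and the real code builds SelectionDAG nodes rather than computing values.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical lane model: a 128-bit half holds four i32 lanes,
// a 256-bit vector holds eight.
using Half = std::array<int32_t, 4>;
using Full = std::array<int32_t, 8>;

// Wrapping per-lane absolute value. INT32_MIN maps to itself, which is
// assumed here to match the behaviour of the pabsd instruction.
// The negation is done on uint32_t to avoid signed overflow.
inline int32_t absLane(int32_t x) {
  uint32_t u = static_cast<uint32_t>(x);
  return static_cast<int32_t>(x < 0 ? 0u - u : u);
}

// Stand-in for an ABS node on a legal 128-bit type.
inline Half absHalf(const Half &v) {
  Half r{};
  for (std::size_t i = 0; i < v.size(); ++i)
    r[i] = absLane(v[i]);
  return r;
}

// Mirrors the shape of LowerABS: extract Lo/Hi, ABS each, concatenate.
inline Full lowerAbs256(const Full &src) {
  Half lo{}, hi{};
  for (std::size_t i = 0; i < lo.size(); ++i) {
    lo[i] = src[i];             // lanes 0..3
    hi[i] = src[i + lo.size()]; // lanes 4..7 (NumElems / 2 onwards)
  }
  const Half absLo = absHalf(lo);
  const Half absHi = absHalf(hi);

  Full out{};
  for (std::size_t i = 0; i < absLo.size(); ++i) {
    out[i] = absLo[i];
    out[i + absLo.size()] = absHi[i];
  }
  return out;
}
```

This matches the shape of the AVX1 output in viabs.ll, where each 128-bit half gets its own vpabs instruction and the halves are rejoined with vinsertf128.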

static SDValue LowerMINMAX(SDValue Op, SelectionDAG &DAG) {		static SDValue LowerMINMAX(SDValue Op, SelectionDAG &DAG) {
assert(Op.getSimpleValueType().is256BitVector() &&		assert(Op.getSimpleValueType().is256BitVector() &&
Op.getSimpleValueType().isInteger() &&		Op.getSimpleValueType().isInteger() &&
"Only handle AVX 256-bit vector integer operation");		"Only handle AVX 256-bit vector integer operation");
return Lower256IntArith(Op, DAG);		return Lower256IntArith(Op, DAG);
}		}

static SDValue LowerMUL(SDValue Op, const X86Subtarget &Subtarget,		static SDValue LowerMUL(SDValue Op, const X86Subtarget &Subtarget,
SDValue X86TargetLowering::LowerOperation(SDValue Op, SelectionDAG &DAG) const {
case ISD::SUBC:		case ISD::SUBC:
case ISD::SUBE: return LowerADDC_ADDE_SUBC_SUBE(Op, DAG);		case ISD::SUBE: return LowerADDC_ADDE_SUBC_SUBE(Op, DAG);
case ISD::ADD:		case ISD::ADD:
case ISD::SUB: return LowerADD_SUB(Op, DAG);		case ISD::SUB: return LowerADD_SUB(Op, DAG);
case ISD::SMAX:		case ISD::SMAX:
case ISD::SMIN:		case ISD::SMIN:
case ISD::UMAX:		case ISD::UMAX:
case ISD::UMIN: return LowerMINMAX(Op, DAG);		case ISD::UMIN: return LowerMINMAX(Op, DAG);
		case ISD::ABS: return LowerABS(Op, DAG);
case ISD::FSINCOS: return LowerFSINCOS(Op, Subtarget, DAG);		case ISD::FSINCOS: return LowerFSINCOS(Op, Subtarget, DAG);
case ISD::MLOAD: return LowerMLOAD(Op, Subtarget, DAG);		case ISD::MLOAD: return LowerMLOAD(Op, Subtarget, DAG);
case ISD::MSTORE: return LowerMSTORE(Op, Subtarget, DAG);		case ISD::MSTORE: return LowerMSTORE(Op, Subtarget, DAG);
case ISD::MGATHER: return LowerMGATHER(Op, Subtarget, DAG);		case ISD::MGATHER: return LowerMGATHER(Op, Subtarget, DAG);
case ISD::MSCATTER: return LowerMSCATTER(Op, Subtarget, DAG);		case ISD::MSCATTER: return LowerMSCATTER(Op, Subtarget, DAG);
case ISD::GC_TRANSITION_START:		case ISD::GC_TRANSITION_START:
return LowerGC_TRANSITION_START(Op, DAG);		return LowerGC_TRANSITION_START(Op, DAG);
case ISD::GC_TRANSITION_END: return LowerGC_TRANSITION_END(Op, DAG);		case ISD::GC_TRANSITION_END: return LowerGC_TRANSITION_END(Op, DAG);
const char *X86TargetLowering::getTargetNodeName(unsigned Opcode) const {
case X86ISD::BLENDI: return "X86ISD::BLENDI";		case X86ISD::BLENDI: return "X86ISD::BLENDI";
case X86ISD::SHRUNKBLEND: return "X86ISD::SHRUNKBLEND";		case X86ISD::SHRUNKBLEND: return "X86ISD::SHRUNKBLEND";
case X86ISD::ADDUS: return "X86ISD::ADDUS";		case X86ISD::ADDUS: return "X86ISD::ADDUS";
case X86ISD::SUBUS: return "X86ISD::SUBUS";		case X86ISD::SUBUS: return "X86ISD::SUBUS";
case X86ISD::HADD: return "X86ISD::HADD";		case X86ISD::HADD: return "X86ISD::HADD";
case X86ISD::HSUB: return "X86ISD::HSUB";		case X86ISD::HSUB: return "X86ISD::HSUB";
case X86ISD::FHADD: return "X86ISD::FHADD";		case X86ISD::FHADD: return "X86ISD::FHADD";
case X86ISD::FHSUB: return "X86ISD::FHSUB";		case X86ISD::FHSUB: return "X86ISD::FHSUB";
case X86ISD::ABS: return "X86ISD::ABS";
case X86ISD::CONFLICT: return "X86ISD::CONFLICT";		case X86ISD::CONFLICT: return "X86ISD::CONFLICT";
case X86ISD::FMAX: return "X86ISD::FMAX";		case X86ISD::FMAX: return "X86ISD::FMAX";
case X86ISD::FMAXS: return "X86ISD::FMAXS";		case X86ISD::FMAXS: return "X86ISD::FMAXS";
case X86ISD::FMAX_RND: return "X86ISD::FMAX_RND";		case X86ISD::FMAX_RND: return "X86ISD::FMAX_RND";
case X86ISD::FMAXS_RND: return "X86ISD::FMAXS_RND";		case X86ISD::FMAXS_RND: return "X86ISD::FMAXS_RND";
case X86ISD::FMIN: return "X86ISD::FMIN";		case X86ISD::FMIN: return "X86ISD::FMIN";
case X86ISD::FMINS: return "X86ISD::FMINS";		case X86ISD::FMINS: return "X86ISD::FMINS";
case X86ISD::FMIN_RND: return "X86ISD::FMIN_RND";		case X86ISD::FMIN_RND: return "X86ISD::FMIN_RND";

llvm/trunk/lib/Target/X86/X86InstrAVX512.td


multiclass avx512_unary_rm_vl_all<bits<8> opc_b, bits<8> opc_w,
bits<8> opc_d, bits<8> opc_q,		bits<8> opc_d, bits<8> opc_q,
string OpcodeStr, SDNode OpNode> {		string OpcodeStr, SDNode OpNode> {
defm NAME : avx512_unary_rm_vl_dq<opc_d, opc_q, OpcodeStr, OpNode,		defm NAME : avx512_unary_rm_vl_dq<opc_d, opc_q, OpcodeStr, OpNode,
HasAVX512>,		HasAVX512>,
avx512_unary_rm_vl_bw<opc_b, opc_w, OpcodeStr, OpNode,		avx512_unary_rm_vl_bw<opc_b, opc_w, OpcodeStr, OpNode,
HasBWI>;		HasBWI>;
}		}

defm VPABS : avx512_unary_rm_vl_all<0x1C, 0x1D, 0x1E, 0x1F, "vpabs", X86Abs>;		defm VPABS : avx512_unary_rm_vl_all<0x1C, 0x1D, 0x1E, 0x1F, "vpabs", abs>;

def avx512_v16i1sextv16i8 : PatLeaf<(v16i8 (X86pcmpgt (bc_v16i8 (v4i32 immAllZerosV)),
VR128X:$src))>;
def avx512_v8i1sextv8i16 : PatLeaf<(v8i16 (X86vsrai VR128X:$src, (i8 15)))>;
def avx512_v4i1sextv4i32 : PatLeaf<(v4i32 (X86vsrai VR128X:$src, (i8 31)))>;
def avx512_v32i1sextv32i8 : PatLeaf<(v32i8 (X86pcmpgt (bc_v32i8 (v8i32 immAllZerosV)),
VR256X:$src))>;
def avx512_v16i1sextv16i16: PatLeaf<(v16i16 (X86vsrai VR256X:$src, (i8 15)))>;
def avx512_v8i1sextv8i32 : PatLeaf<(v8i32 (X86vsrai VR256X:$src, (i8 31)))>;

let Predicates = [HasBWI, HasVLX] in {
def : Pat<(xor
(bc_v2i64 (avx512_v16i1sextv16i8)),
(bc_v2i64 (add (v16i8 VR128X:$src), (avx512_v16i1sextv16i8)))),
(VPABSBZ128rr VR128X:$src)>;
def : Pat<(xor
(bc_v2i64 (avx512_v8i1sextv8i16)),
(bc_v2i64 (add (v8i16 VR128X:$src), (avx512_v8i1sextv8i16)))),
(VPABSWZ128rr VR128X:$src)>;
def : Pat<(xor
(bc_v4i64 (avx512_v32i1sextv32i8)),
(bc_v4i64 (add (v32i8 VR256X:$src), (avx512_v32i1sextv32i8)))),
(VPABSBZ256rr VR256X:$src)>;
def : Pat<(xor
(bc_v4i64 (avx512_v16i1sextv16i16)),
(bc_v4i64 (add (v16i16 VR256X:$src), (avx512_v16i1sextv16i16)))),
(VPABSWZ256rr VR256X:$src)>;
}
let Predicates = [HasAVX512, HasVLX] in {
def : Pat<(xor
(bc_v2i64 (avx512_v4i1sextv4i32)),
(bc_v2i64 (add (v4i32 VR128X:$src), (avx512_v4i1sextv4i32)))),
(VPABSDZ128rr VR128X:$src)>;
def : Pat<(xor
(bc_v4i64 (avx512_v8i1sextv8i32)),
(bc_v4i64 (add (v8i32 VR256X:$src), (avx512_v8i1sextv8i32)))),
(VPABSDZ256rr VR256X:$src)>;
}

let Predicates = [HasAVX512] in {
def : Pat<(xor
(bc_v8i64 (v16i1sextv16i32)),
(bc_v8i64 (add (v16i32 VR512:$src), (v16i1sextv16i32)))),
(VPABSDZrr VR512:$src)>;
def : Pat<(xor
(bc_v8i64 (v8i1sextv8i64)),
(bc_v8i64 (add (v8i64 VR512:$src), (v8i1sextv8i64)))),
(VPABSQZrr VR512:$src)>;
}
let Predicates = [HasBWI] in {
def : Pat<(xor
(bc_v8i64 (v64i1sextv64i8)),
(bc_v8i64 (add (v64i8 VR512:$src), (v64i1sextv64i8)))),
(VPABSBZrr VR512:$src)>;
def : Pat<(xor
(bc_v8i64 (v32i1sextv32i16)),
(bc_v8i64 (add (v32i16 VR512:$src), (v32i1sextv32i16)))),
(VPABSWZrr VR512:$src)>;
}
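The hard-coded patterns being removed here all match one branch-free idiom for signed absolute value: with s the sign mask of x (all ones when x is negative, zero otherwise), abs(x) equals (x + s) xor s. The i16 and i32 fragments get s from an arithmetic shift right, and the i8 fragments get it from a compare against zero. A minimal scalar sketch of one i32 lane follows; absViaSignMask is a name chosen for this example, and unsigned arithmetic is used so the add wraps the way the vector add does.

```cpp
#include <cstdint>

// One i32 lane of the (xor (add x, s), s) idiom, where s is the sign mask.
// Operates on the raw bit pattern so that overflow wraps instead of being
// undefined behaviour.
inline uint32_t absViaSignMask(uint32_t x) {
  // s is all ones when the sign bit of x is set and zero otherwise,
  // i.e. what an arithmetic shift right by 31 would produce.
  const uint32_t s = (x >> 31) ? 0xFFFFFFFFu : 0u;
  // For negative x: (x - 1) with all bits flipped is -x.
  // For non-negative x: s is zero and x passes through unchanged.
  return (x + s) ^ s;
}
```

Matching this expression tree in TableGen requires one pattern per type and predicate set, which is the duplication a generic abs node removes.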

multiclass avx512_ctlz<bits<8> opc, string OpcodeStr, Predicate prd>{		multiclass avx512_ctlz<bits<8> opc, string OpcodeStr, Predicate prd>{

defm NAME : avx512_unary_rm_vl_dq<opc, opc, OpcodeStr, ctlz, prd>;		defm NAME : avx512_unary_rm_vl_dq<opc, opc, OpcodeStr, ctlz, prd>;
}		}

defm VPLZCNT : avx512_ctlz<0x44, "vplzcnt", HasCDI>;		defm VPLZCNT : avx512_ctlz<0x44, "vplzcnt", HasCDI>;
defm VPCONFLICT : avx512_unary_rm_vl_dq<0xC4, 0xC4, "vpconflict", X86Conflict, HasCDI>;		defm VPCONFLICT : avx512_unary_rm_vl_dq<0xC4, 0xC4, "vpconflict", X86Conflict, HasCDI>;

llvm/trunk/lib/Target/X86/X86InstrFragmentsSIMD.td


	def SDTFmaRound : SDTypeProfile<1, 4, [SDTCisSameAs<0,1>,			def SDTFmaRound : SDTypeProfile<1, 4, [SDTCisSameAs<0,1>,
	SDTCisSameAs<1,2>, SDTCisSameAs<1,3>,			SDTCisSameAs<1,2>, SDTCisSameAs<1,3>,
	SDTCisFP<0>, SDTCisVT<4, i32>]>;			SDTCisFP<0>, SDTCisVT<4, i32>]>;

	def X86PAlignr : SDNode<"X86ISD::PALIGNR", SDTShuff3OpI>;			def X86PAlignr : SDNode<"X86ISD::PALIGNR", SDTShuff3OpI>;
	def X86VAlign : SDNode<"X86ISD::VALIGN", SDTShuff3OpI>;			def X86VAlign : SDNode<"X86ISD::VALIGN", SDTShuff3OpI>;

	def X86Abs : SDNode<"X86ISD::ABS", SDTIntUnaryOp>;
	def X86Conflict : SDNode<"X86ISD::CONFLICT", SDTIntUnaryOp>;			def X86Conflict : SDNode<"X86ISD::CONFLICT", SDTIntUnaryOp>;

	def X86PShufd : SDNode<"X86ISD::PSHUFD", SDTShuff2OpI>;			def X86PShufd : SDNode<"X86ISD::PSHUFD", SDTShuff2OpI>;
	def X86PShufhw : SDNode<"X86ISD::PSHUFHW", SDTShuff2OpI>;			def X86PShufhw : SDNode<"X86ISD::PSHUFHW", SDTShuff2OpI>;
	def X86PShuflw : SDNode<"X86ISD::PSHUFLW", SDTShuff2OpI>;			def X86PShuflw : SDNode<"X86ISD::PSHUFLW", SDTShuff2OpI>;

	def X86Shufp : SDNode<"X86ISD::SHUFP", SDTShuff3OpI>;			def X86Shufp : SDNode<"X86ISD::SHUFP", SDTShuff3OpI>;
	def X86Shuf128 : SDNode<"X86ISD::SHUF128", SDTShuff3OpI>;			def X86Shuf128 : SDNode<"X86ISD::SHUF128", SDTShuff3OpI>;

llvm/trunk/lib/Target/X86/X86InstrSSE.td


multiclass SS3I_unop_rm_y<bits<8> opc, string OpcodeStr, ValueType vt,
def Yrm : SS38I<opc, MRMSrcMem, (outs VR256:$dst),		def Yrm : SS38I<opc, MRMSrcMem, (outs VR256:$dst),
(ins i256mem:$src),		(ins i256mem:$src),
!strconcat(OpcodeStr, "\t{$src, $dst\|$dst, $src}"),		!strconcat(OpcodeStr, "\t{$src, $dst\|$dst, $src}"),
[(set VR256:$dst,		[(set VR256:$dst,
(vt (OpNode (bitconvert (loadv4i64 addr:$src)))))]>,		(vt (OpNode (bitconvert (loadv4i64 addr:$src)))))]>,
Sched<[WriteVecALULd]>;		Sched<[WriteVecALULd]>;
}		}

// Helper fragments to match sext vXi1 to vXiY.
def v16i1sextv16i8 : PatLeaf<(v16i8 (X86pcmpgt (bc_v16i8 (v4i32 immAllZerosV)),
VR128:$src))>;
def v8i1sextv8i16 : PatLeaf<(v8i16 (X86vsrai VR128:$src, (i8 15)))>;
def v4i1sextv4i32 : PatLeaf<(v4i32 (X86vsrai VR128:$src, (i8 31)))>;
def v32i1sextv32i8 : PatLeaf<(v32i8 (X86pcmpgt (bc_v32i8 (v8i32 immAllZerosV)),
VR256:$src))>;
def v16i1sextv16i16: PatLeaf<(v16i16 (X86vsrai VR256:$src, (i8 15)))>;
def v8i1sextv8i32 : PatLeaf<(v8i32 (X86vsrai VR256:$src, (i8 31)))>;

let Predicates = [HasAVX, NoVLX_Or_NoBWI] in {
defm VPABSB : SS3I_unop_rm<0x1C, "vpabsb", v16i8, X86Abs, loadv2i64>, VEX, VEX_WIG;
defm VPABSW : SS3I_unop_rm<0x1D, "vpabsw", v8i16, X86Abs, loadv2i64>, VEX, VEX_WIG;
}
let Predicates = [HasAVX, NoVLX] in {
defm VPABSD : SS3I_unop_rm<0x1E, "vpabsd", v4i32, X86Abs, loadv2i64>, VEX, VEX_WIG;
}

let Predicates = [HasAVX, NoVLX_Or_NoBWI] in {		let Predicates = [HasAVX, NoVLX_Or_NoBWI] in {
def : Pat<(xor		defm VPABSB : SS3I_unop_rm<0x1C, "vpabsb", v16i8, abs, loadv2i64>, VEX, VEX_WIG;
(bc_v2i64 (v16i1sextv16i8)),		defm VPABSW : SS3I_unop_rm<0x1D, "vpabsw", v8i16, abs, loadv2i64>, VEX, VEX_WIG;
(bc_v2i64 (add (v16i8 VR128:$src), (v16i1sextv16i8)))),
(VPABSBrr VR128:$src)>;
def : Pat<(xor
(bc_v2i64 (v8i1sextv8i16)),
(bc_v2i64 (add (v8i16 VR128:$src), (v8i1sextv8i16)))),
(VPABSWrr VR128:$src)>;
}		}
let Predicates = [HasAVX, NoVLX] in {		let Predicates = [HasAVX, NoVLX] in {
def : Pat<(xor		defm VPABSD : SS3I_unop_rm<0x1E, "vpabsd", v4i32, abs, loadv2i64>, VEX, VEX_WIG;
(bc_v2i64 (v4i1sextv4i32)),
(bc_v2i64 (add (v4i32 VR128:$src), (v4i1sextv4i32)))),
(VPABSDrr VR128:$src)>;
}		}

let Predicates = [HasAVX2, NoVLX_Or_NoBWI] in {		let Predicates = [HasAVX2, NoVLX_Or_NoBWI] in {
defm VPABSB : SS3I_unop_rm_y<0x1C, "vpabsb", v32i8, X86Abs>, VEX, VEX_L, VEX_WIG;		defm VPABSB : SS3I_unop_rm_y<0x1C, "vpabsb", v32i8, abs>, VEX, VEX_L, VEX_WIG;
defm VPABSW : SS3I_unop_rm_y<0x1D, "vpabsw", v16i16, X86Abs>, VEX, VEX_L, VEX_WIG;		defm VPABSW : SS3I_unop_rm_y<0x1D, "vpabsw", v16i16, abs>, VEX, VEX_L, VEX_WIG;
}		}
let Predicates = [HasAVX2, NoVLX] in {		let Predicates = [HasAVX2, NoVLX] in {
defm VPABSD : SS3I_unop_rm_y<0x1E, "vpabsd", v8i32, X86Abs>, VEX, VEX_L, VEX_WIG;		defm VPABSD : SS3I_unop_rm_y<0x1E, "vpabsd", v8i32, abs>, VEX, VEX_L, VEX_WIG;
}		}

let Predicates = [HasAVX2, NoVLX_Or_NoBWI] in {		defm PABSB : SS3I_unop_rm<0x1C, "pabsb", v16i8, abs, memopv2i64>;
def : Pat<(xor		defm PABSW : SS3I_unop_rm<0x1D, "pabsw", v8i16, abs, memopv2i64>;
(bc_v4i64 (v32i1sextv32i8)),		defm PABSD : SS3I_unop_rm<0x1E, "pabsd", v4i32, abs, memopv2i64>;
(bc_v4i64 (add (v32i8 VR256:$src), (v32i1sextv32i8)))),
(VPABSBYrr VR256:$src)>;
def : Pat<(xor
(bc_v4i64 (v16i1sextv16i16)),
(bc_v4i64 (add (v16i16 VR256:$src), (v16i1sextv16i16)))),
(VPABSWYrr VR256:$src)>;
}
let Predicates = [HasAVX2, NoVLX] in {
def : Pat<(xor
(bc_v4i64 (v8i1sextv8i32)),
(bc_v4i64 (add (v8i32 VR256:$src), (v8i1sextv8i32)))),
(VPABSDYrr VR256:$src)>;
}

defm PABSB : SS3I_unop_rm<0x1C, "pabsb", v16i8, X86Abs, memopv2i64>;
defm PABSW : SS3I_unop_rm<0x1D, "pabsw", v8i16, X86Abs, memopv2i64>;
defm PABSD : SS3I_unop_rm<0x1E, "pabsd", v4i32, X86Abs, memopv2i64>;

let Predicates = [UseSSSE3] in {
def : Pat<(xor
(bc_v2i64 (v16i1sextv16i8)),
(bc_v2i64 (add (v16i8 VR128:$src), (v16i1sextv16i8)))),
(PABSBrr VR128:$src)>;
def : Pat<(xor
(bc_v2i64 (v8i1sextv8i16)),
(bc_v2i64 (add (v8i16 VR128:$src), (v8i1sextv8i16)))),
(PABSWrr VR128:$src)>;
def : Pat<(xor
(bc_v2i64 (v4i1sextv4i32)),
(bc_v2i64 (add (v4i32 VR128:$src), (v4i1sextv4i32)))),
(PABSDrr VR128:$src)>;
}

//===---------------------------------------------------------------------===//		//===---------------------------------------------------------------------===//
// SSSE3 - Packed Binary Operator Instructions		// SSSE3 - Packed Binary Operator Instructions
//===---------------------------------------------------------------------===//		//===---------------------------------------------------------------------===//

let Sched = WriteVecALU in {		let Sched = WriteVecALU in {
def SSE_PHADDSUBD : OpndItins<		def SSE_PHADDSUBD : OpndItins<
IIC_SSE_PHADDSUBD_RR, IIC_SSE_PHADDSUBD_RM		IIC_SSE_PHADDSUBD_RR, IIC_SSE_PHADDSUBD_RM

llvm/trunk/lib/Target/X86/X86IntrinsicsInfo.h

static const IntrinsicData IntrinsicsWithoutChain[] = {
X86_INTRINSIC_DATA(avx_sqrt_ps_256, INTR_TYPE_1OP, ISD::FSQRT, 0),		X86_INTRINSIC_DATA(avx_sqrt_ps_256, INTR_TYPE_1OP, ISD::FSQRT, 0),
X86_INTRINSIC_DATA(avx_vperm2f128_pd_256, INTR_TYPE_3OP, X86ISD::VPERM2X128, 0),		X86_INTRINSIC_DATA(avx_vperm2f128_pd_256, INTR_TYPE_3OP, X86ISD::VPERM2X128, 0),
X86_INTRINSIC_DATA(avx_vperm2f128_ps_256, INTR_TYPE_3OP, X86ISD::VPERM2X128, 0),		X86_INTRINSIC_DATA(avx_vperm2f128_ps_256, INTR_TYPE_3OP, X86ISD::VPERM2X128, 0),
X86_INTRINSIC_DATA(avx_vperm2f128_si_256, INTR_TYPE_3OP, X86ISD::VPERM2X128, 0),		X86_INTRINSIC_DATA(avx_vperm2f128_si_256, INTR_TYPE_3OP, X86ISD::VPERM2X128, 0),
X86_INTRINSIC_DATA(avx_vpermilvar_pd, INTR_TYPE_2OP, X86ISD::VPERMILPV, 0),		X86_INTRINSIC_DATA(avx_vpermilvar_pd, INTR_TYPE_2OP, X86ISD::VPERMILPV, 0),
X86_INTRINSIC_DATA(avx_vpermilvar_pd_256, INTR_TYPE_2OP, X86ISD::VPERMILPV, 0),		X86_INTRINSIC_DATA(avx_vpermilvar_pd_256, INTR_TYPE_2OP, X86ISD::VPERMILPV, 0),
X86_INTRINSIC_DATA(avx_vpermilvar_ps, INTR_TYPE_2OP, X86ISD::VPERMILPV, 0),		X86_INTRINSIC_DATA(avx_vpermilvar_ps, INTR_TYPE_2OP, X86ISD::VPERMILPV, 0),
X86_INTRINSIC_DATA(avx_vpermilvar_ps_256, INTR_TYPE_2OP, X86ISD::VPERMILPV, 0),		X86_INTRINSIC_DATA(avx_vpermilvar_ps_256, INTR_TYPE_2OP, X86ISD::VPERMILPV, 0),
X86_INTRINSIC_DATA(avx2_pabs_b, INTR_TYPE_1OP, X86ISD::ABS, 0),		X86_INTRINSIC_DATA(avx2_pabs_b, INTR_TYPE_1OP, ISD::ABS, 0),
X86_INTRINSIC_DATA(avx2_pabs_d, INTR_TYPE_1OP, X86ISD::ABS, 0),		X86_INTRINSIC_DATA(avx2_pabs_d, INTR_TYPE_1OP, ISD::ABS, 0),
X86_INTRINSIC_DATA(avx2_pabs_w, INTR_TYPE_1OP, X86ISD::ABS, 0),		X86_INTRINSIC_DATA(avx2_pabs_w, INTR_TYPE_1OP, ISD::ABS, 0),
X86_INTRINSIC_DATA(avx2_packssdw, INTR_TYPE_2OP, X86ISD::PACKSS, 0),		X86_INTRINSIC_DATA(avx2_packssdw, INTR_TYPE_2OP, X86ISD::PACKSS, 0),
X86_INTRINSIC_DATA(avx2_packsswb, INTR_TYPE_2OP, X86ISD::PACKSS, 0),		X86_INTRINSIC_DATA(avx2_packsswb, INTR_TYPE_2OP, X86ISD::PACKSS, 0),
X86_INTRINSIC_DATA(avx2_packusdw, INTR_TYPE_2OP, X86ISD::PACKUS, 0),		X86_INTRINSIC_DATA(avx2_packusdw, INTR_TYPE_2OP, X86ISD::PACKUS, 0),
X86_INTRINSIC_DATA(avx2_packuswb, INTR_TYPE_2OP, X86ISD::PACKUS, 0),		X86_INTRINSIC_DATA(avx2_packuswb, INTR_TYPE_2OP, X86ISD::PACKUS, 0),
X86_INTRINSIC_DATA(avx2_padds_b, INTR_TYPE_2OP, X86ISD::ADDS, 0),		X86_INTRINSIC_DATA(avx2_padds_b, INTR_TYPE_2OP, X86ISD::ADDS, 0),
X86_INTRINSIC_DATA(avx2_padds_w, INTR_TYPE_2OP, X86ISD::ADDS, 0),		X86_INTRINSIC_DATA(avx2_padds_w, INTR_TYPE_2OP, X86ISD::ADDS, 0),
X86_INTRINSIC_DATA(avx2_paddus_b, INTR_TYPE_2OP, X86ISD::ADDUS, 0),		X86_INTRINSIC_DATA(avx2_paddus_b, INTR_TYPE_2OP, X86ISD::ADDUS, 0),
X86_INTRINSIC_DATA(avx2_paddus_w, INTR_TYPE_2OP, X86ISD::ADDUS, 0),		X86_INTRINSIC_DATA(avx2_paddus_w, INTR_TYPE_2OP, X86ISD::ADDUS, 0),
static const IntrinsicData IntrinsicsWithoutChain[] = {
X86_INTRINSIC_DATA(avx512_mask_mul_pd_512, INTR_TYPE_2OP_MASK, ISD::FMUL,		X86_INTRINSIC_DATA(avx512_mask_mul_pd_512, INTR_TYPE_2OP_MASK, ISD::FMUL,
X86ISD::FMUL_RND),		X86ISD::FMUL_RND),
X86_INTRINSIC_DATA(avx512_mask_mul_ps_512, INTR_TYPE_2OP_MASK, ISD::FMUL,		X86_INTRINSIC_DATA(avx512_mask_mul_ps_512, INTR_TYPE_2OP_MASK, ISD::FMUL,
X86ISD::FMUL_RND),		X86ISD::FMUL_RND),
X86_INTRINSIC_DATA(avx512_mask_mul_sd_round, INTR_TYPE_SCALAR_MASK_RM,		X86_INTRINSIC_DATA(avx512_mask_mul_sd_round, INTR_TYPE_SCALAR_MASK_RM,
X86ISD::FMULS_RND, 0),		X86ISD::FMULS_RND, 0),
X86_INTRINSIC_DATA(avx512_mask_mul_ss_round, INTR_TYPE_SCALAR_MASK_RM,		X86_INTRINSIC_DATA(avx512_mask_mul_ss_round, INTR_TYPE_SCALAR_MASK_RM,
X86ISD::FMULS_RND, 0),		X86ISD::FMULS_RND, 0),
X86_INTRINSIC_DATA(avx512_mask_pabs_b_128, INTR_TYPE_1OP_MASK, X86ISD::ABS, 0),		X86_INTRINSIC_DATA(avx512_mask_pabs_b_128, INTR_TYPE_1OP_MASK, ISD::ABS, 0),
X86_INTRINSIC_DATA(avx512_mask_pabs_b_256, INTR_TYPE_1OP_MASK, X86ISD::ABS, 0),		X86_INTRINSIC_DATA(avx512_mask_pabs_b_256, INTR_TYPE_1OP_MASK, ISD::ABS, 0),
X86_INTRINSIC_DATA(avx512_mask_pabs_b_512, INTR_TYPE_1OP_MASK, X86ISD::ABS, 0),		X86_INTRINSIC_DATA(avx512_mask_pabs_b_512, INTR_TYPE_1OP_MASK, ISD::ABS, 0),
X86_INTRINSIC_DATA(avx512_mask_pabs_d_128, INTR_TYPE_1OP_MASK, X86ISD::ABS, 0),		X86_INTRINSIC_DATA(avx512_mask_pabs_d_128, INTR_TYPE_1OP_MASK, ISD::ABS, 0),
X86_INTRINSIC_DATA(avx512_mask_pabs_d_256, INTR_TYPE_1OP_MASK, X86ISD::ABS, 0),		X86_INTRINSIC_DATA(avx512_mask_pabs_d_256, INTR_TYPE_1OP_MASK, ISD::ABS, 0),
X86_INTRINSIC_DATA(avx512_mask_pabs_d_512, INTR_TYPE_1OP_MASK, X86ISD::ABS, 0),		X86_INTRINSIC_DATA(avx512_mask_pabs_d_512, INTR_TYPE_1OP_MASK, ISD::ABS, 0),
X86_INTRINSIC_DATA(avx512_mask_pabs_q_128, INTR_TYPE_1OP_MASK, X86ISD::ABS, 0),		X86_INTRINSIC_DATA(avx512_mask_pabs_q_128, INTR_TYPE_1OP_MASK, ISD::ABS, 0),
X86_INTRINSIC_DATA(avx512_mask_pabs_q_256, INTR_TYPE_1OP_MASK, X86ISD::ABS, 0),		X86_INTRINSIC_DATA(avx512_mask_pabs_q_256, INTR_TYPE_1OP_MASK, ISD::ABS, 0),
X86_INTRINSIC_DATA(avx512_mask_pabs_q_512, INTR_TYPE_1OP_MASK, X86ISD::ABS, 0),		X86_INTRINSIC_DATA(avx512_mask_pabs_q_512, INTR_TYPE_1OP_MASK, ISD::ABS, 0),
X86_INTRINSIC_DATA(avx512_mask_pabs_w_128, INTR_TYPE_1OP_MASK, X86ISD::ABS, 0),		X86_INTRINSIC_DATA(avx512_mask_pabs_w_128, INTR_TYPE_1OP_MASK, ISD::ABS, 0),
X86_INTRINSIC_DATA(avx512_mask_pabs_w_256, INTR_TYPE_1OP_MASK, X86ISD::ABS, 0),		X86_INTRINSIC_DATA(avx512_mask_pabs_w_256, INTR_TYPE_1OP_MASK, ISD::ABS, 0),
X86_INTRINSIC_DATA(avx512_mask_pabs_w_512, INTR_TYPE_1OP_MASK, X86ISD::ABS, 0),		X86_INTRINSIC_DATA(avx512_mask_pabs_w_512, INTR_TYPE_1OP_MASK, ISD::ABS, 0),
X86_INTRINSIC_DATA(avx512_mask_padds_b_128, INTR_TYPE_2OP_MASK, X86ISD::ADDS, 0),		X86_INTRINSIC_DATA(avx512_mask_padds_b_128, INTR_TYPE_2OP_MASK, X86ISD::ADDS, 0),
X86_INTRINSIC_DATA(avx512_mask_padds_b_256, INTR_TYPE_2OP_MASK, X86ISD::ADDS, 0),		X86_INTRINSIC_DATA(avx512_mask_padds_b_256, INTR_TYPE_2OP_MASK, X86ISD::ADDS, 0),
X86_INTRINSIC_DATA(avx512_mask_padds_b_512, INTR_TYPE_2OP_MASK, X86ISD::ADDS, 0),		X86_INTRINSIC_DATA(avx512_mask_padds_b_512, INTR_TYPE_2OP_MASK, X86ISD::ADDS, 0),
X86_INTRINSIC_DATA(avx512_mask_padds_w_128, INTR_TYPE_2OP_MASK, X86ISD::ADDS, 0),		X86_INTRINSIC_DATA(avx512_mask_padds_w_128, INTR_TYPE_2OP_MASK, X86ISD::ADDS, 0),
X86_INTRINSIC_DATA(avx512_mask_padds_w_256, INTR_TYPE_2OP_MASK, X86ISD::ADDS, 0),		X86_INTRINSIC_DATA(avx512_mask_padds_w_256, INTR_TYPE_2OP_MASK, X86ISD::ADDS, 0),
X86_INTRINSIC_DATA(avx512_mask_padds_w_512, INTR_TYPE_2OP_MASK, X86ISD::ADDS, 0),		X86_INTRINSIC_DATA(avx512_mask_padds_w_512, INTR_TYPE_2OP_MASK, X86ISD::ADDS, 0),
X86_INTRINSIC_DATA(avx512_mask_paddus_b_128, INTR_TYPE_2OP_MASK, X86ISD::ADDUS, 0),		X86_INTRINSIC_DATA(avx512_mask_paddus_b_128, INTR_TYPE_2OP_MASK, X86ISD::ADDUS, 0),
X86_INTRINSIC_DATA(avx512_mask_paddus_b_256, INTR_TYPE_2OP_MASK, X86ISD::ADDUS, 0),		X86_INTRINSIC_DATA(avx512_mask_paddus_b_256, INTR_TYPE_2OP_MASK, X86ISD::ADDUS, 0),
X86_INTRINSIC_DATA(sse3_hadd_ps, INTR_TYPE_2OP, X86ISD::FHADD, 0),		X86_INTRINSIC_DATA(sse3_hadd_ps, INTR_TYPE_2OP, X86ISD::FHADD, 0),
X86_INTRINSIC_DATA(sse3_hsub_pd, INTR_TYPE_2OP, X86ISD::FHSUB, 0),		X86_INTRINSIC_DATA(sse3_hsub_pd, INTR_TYPE_2OP, X86ISD::FHSUB, 0),
X86_INTRINSIC_DATA(sse3_hsub_ps, INTR_TYPE_2OP, X86ISD::FHSUB, 0),		X86_INTRINSIC_DATA(sse3_hsub_ps, INTR_TYPE_2OP, X86ISD::FHSUB, 0),
X86_INTRINSIC_DATA(sse41_insertps, INTR_TYPE_3OP, X86ISD::INSERTPS, 0),		X86_INTRINSIC_DATA(sse41_insertps, INTR_TYPE_3OP, X86ISD::INSERTPS, 0),
X86_INTRINSIC_DATA(sse41_packusdw, INTR_TYPE_2OP, X86ISD::PACKUS, 0),		X86_INTRINSIC_DATA(sse41_packusdw, INTR_TYPE_2OP, X86ISD::PACKUS, 0),
X86_INTRINSIC_DATA(sse41_pmuldq, INTR_TYPE_2OP, X86ISD::PMULDQ, 0),		X86_INTRINSIC_DATA(sse41_pmuldq, INTR_TYPE_2OP, X86ISD::PMULDQ, 0),
X86_INTRINSIC_DATA(sse4a_extrqi, INTR_TYPE_3OP, X86ISD::EXTRQI, 0),		X86_INTRINSIC_DATA(sse4a_extrqi, INTR_TYPE_3OP, X86ISD::EXTRQI, 0),
X86_INTRINSIC_DATA(sse4a_insertqi, INTR_TYPE_4OP, X86ISD::INSERTQI, 0),		X86_INTRINSIC_DATA(sse4a_insertqi, INTR_TYPE_4OP, X86ISD::INSERTQI, 0),
X86_INTRINSIC_DATA(ssse3_pabs_b_128, INTR_TYPE_1OP, X86ISD::ABS, 0),		X86_INTRINSIC_DATA(ssse3_pabs_b_128, INTR_TYPE_1OP, ISD::ABS, 0),
X86_INTRINSIC_DATA(ssse3_pabs_d_128, INTR_TYPE_1OP, X86ISD::ABS, 0),		X86_INTRINSIC_DATA(ssse3_pabs_d_128, INTR_TYPE_1OP, ISD::ABS, 0),
X86_INTRINSIC_DATA(ssse3_pabs_w_128, INTR_TYPE_1OP, X86ISD::ABS, 0),		X86_INTRINSIC_DATA(ssse3_pabs_w_128, INTR_TYPE_1OP, ISD::ABS, 0),
X86_INTRINSIC_DATA(ssse3_phadd_d_128, INTR_TYPE_2OP, X86ISD::HADD, 0),		X86_INTRINSIC_DATA(ssse3_phadd_d_128, INTR_TYPE_2OP, X86ISD::HADD, 0),
X86_INTRINSIC_DATA(ssse3_phadd_w_128, INTR_TYPE_2OP, X86ISD::HADD, 0),		X86_INTRINSIC_DATA(ssse3_phadd_w_128, INTR_TYPE_2OP, X86ISD::HADD, 0),
X86_INTRINSIC_DATA(ssse3_phsub_d_128, INTR_TYPE_2OP, X86ISD::HSUB, 0),		X86_INTRINSIC_DATA(ssse3_phsub_d_128, INTR_TYPE_2OP, X86ISD::HSUB, 0),
X86_INTRINSIC_DATA(ssse3_phsub_w_128, INTR_TYPE_2OP, X86ISD::HSUB, 0),		X86_INTRINSIC_DATA(ssse3_phsub_w_128, INTR_TYPE_2OP, X86ISD::HSUB, 0),
X86_INTRINSIC_DATA(ssse3_pmadd_ub_sw_128, INTR_TYPE_2OP, X86ISD::VPMADDUBSW, 0),		X86_INTRINSIC_DATA(ssse3_pmadd_ub_sw_128, INTR_TYPE_2OP, X86ISD::VPMADDUBSW, 0),
X86_INTRINSIC_DATA(ssse3_pmul_hr_sw_128, INTR_TYPE_2OP, X86ISD::MULHRS, 0),		X86_INTRINSIC_DATA(ssse3_pmul_hr_sw_128, INTR_TYPE_2OP, X86ISD::MULHRS, 0),
X86_INTRINSIC_DATA(ssse3_pshuf_b_128, INTR_TYPE_2OP, X86ISD::PSHUFB, 0),		X86_INTRINSIC_DATA(ssse3_pshuf_b_128, INTR_TYPE_2OP, X86ISD::PSHUFB, 0),
X86_INTRINSIC_DATA(xop_vpcomb, INTR_TYPE_3OP, X86ISD::VPCOM, 0),		X86_INTRINSIC_DATA(xop_vpcomb, INTR_TYPE_3OP, X86ISD::VPCOM, 0),

llvm/trunk/test/CodeGen/X86/combine-abs.ll

	; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
	; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=+avx2 \| FileCheck %s			; RUN: llc < %s -mtriple=x86_64-unknown-unknown -mattr=+avx2 \| FileCheck %s

	; FIXME: Various missed opportunities to simplify integer absolute instructions.

	; fold (abs c1) -> c2			; fold (abs c1) -> c2
	define <4 x i32> @combine_v4i32_abs_constant() {			define <4 x i32> @combine_v4i32_abs_constant() {
	; CHECK-LABEL: combine_v4i32_abs_constant:			; CHECK-LABEL: combine_v4i32_abs_constant:
	; CHECK: # BB#0:			; CHECK: # BB#0:
	; CHECK-NEXT: vpabsd {{.*}}(%rip), %xmm0			; CHECK-NEXT: vmovaps {{.*#+}} xmm0 = [0,1,3,2147483648]
	; CHECK-NEXT: retq			; CHECK-NEXT: retq
	%1 = call <4 x i32> @llvm.x86.ssse3.pabs.d.128(<4 x i32> <i32 0, i32 -1, i32 3, i32 -2147483648>)			%1 = call <4 x i32> @llvm.x86.ssse3.pabs.d.128(<4 x i32> <i32 0, i32 -1, i32 3, i32 -2147483648>)
	ret <4 x i32> %1			ret <4 x i32> %1
	}			}
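The updated CHECK line shows the constant operand being folded lane by lane instead of being fed to vpabsd at run time. The last lane is the wraparound case: the absolute value of -2147483648 is not representable in i32, so the bit pattern is unchanged and prints as 2147483648 in the unsigned constant-pool comment. The sketch below reproduces those four values under the assumption of wrapping semantics; foldAbsConstant is a hypothetical helper for illustration, not the DAGCombiner code.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Folds abs over a constant v4i32, returning the raw lane bit patterns as
// unsigned values (the way the asm comment prints them).
inline std::array<uint32_t, 4>
foldAbsConstant(const std::array<int32_t, 4> &c) {
  std::array<uint32_t, 4> r{};
  for (std::size_t i = 0; i < c.size(); ++i) {
    const uint32_t u = static_cast<uint32_t>(c[i]);
    // Negate in unsigned arithmetic so INT32_MIN wraps to itself
    // rather than overflowing.
    r[i] = c[i] < 0 ? 0u - u : u;
  }
  return r;
}
```

Calling foldAbsConstant on the lanes 0, -1, 3, -2147483648 from the test yields 0, 1, 3, 2147483648, which agrees with the vmovaps constant in the new CHECK line.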

	define <16 x i16> @combine_v16i16_abs_constant() {			define <16 x i16> @combine_v16i16_abs_constant() {
	; CHECK-LABEL: combine_v16i16_abs_constant:			; CHECK-LABEL: combine_v16i16_abs_constant:
	; CHECK: # BB#0:			; CHECK: # BB#0:
	; CHECK-NEXT: vpabsw {{.*}}(%rip), %ymm0			; CHECK-NEXT: vmovaps {{.*#+}} ymm0 = [0,1,1,3,3,7,7,255,255,4096,4096,32767,32767,32768,32768,0]
	; CHECK-NEXT: retq			; CHECK-NEXT: retq
	%1 = call <16 x i16> @llvm.x86.avx2.pabs.w(<16 x i16> <i16 0, i16 1, i16 -1, i16 3, i16 -3, i16 7, i16 -7, i16 255, i16 -255, i16 4096, i16 -4096, i16 32767, i16 -32767, i16 -32768, i16 32768, i16 65536>)			%1 = call <16 x i16> @llvm.x86.avx2.pabs.w(<16 x i16> <i16 0, i16 1, i16 -1, i16 3, i16 -3, i16 7, i16 -7, i16 255, i16 -255, i16 4096, i16 -4096, i16 32767, i16 -32767, i16 -32768, i16 32768, i16 65536>)
	ret <16 x i16> %1			ret <16 x i16> %1
	}			}

	; fold (abs (abs x)) -> (abs x)			; fold (abs (abs x)) -> (abs x)
	define <8 x i16> @combine_v8i16_abs_abs(<8 x i16> %a) {			define <8 x i16> @combine_v8i16_abs_abs(<8 x i16> %a) {
	; CHECK-LABEL: combine_v8i16_abs_abs:			; CHECK-LABEL: combine_v8i16_abs_abs:
	; CHECK: # BB#0:			; CHECK: # BB#0:
	; CHECK-NEXT: vpabsw %xmm0, %xmm0			; CHECK-NEXT: vpabsw %xmm0, %xmm0
	; CHECK-NEXT: vpabsw %xmm0, %xmm0
	; CHECK-NEXT: retq			; CHECK-NEXT: retq
	%a1 = call <8 x i16> @llvm.x86.ssse3.pabs.w.128(<8 x i16> %a)			%a1 = call <8 x i16> @llvm.x86.ssse3.pabs.w.128(<8 x i16> %a)
	%n2 = sub <8 x i16> zeroinitializer, %a1			%n2 = sub <8 x i16> zeroinitializer, %a1
	%c2 = icmp slt <8 x i16> %a1, zeroinitializer			%c2 = icmp slt <8 x i16> %a1, zeroinitializer
	%a2 = select <8 x i1> %c2, <8 x i16> %n2, <8 x i16> %a1			%a2 = select <8 x i1> %c2, <8 x i16> %n2, <8 x i16> %a1
	ret <8 x i16> %a2			ret <8 x i16> %a2
	}			}

	define <32 x i8> @combine_v32i8_abs_abs(<32 x i8> %a) {			define <32 x i8> @combine_v32i8_abs_abs(<32 x i8> %a) {
	; CHECK-LABEL: combine_v32i8_abs_abs:			; CHECK-LABEL: combine_v32i8_abs_abs:
	; CHECK: # BB#0:			; CHECK: # BB#0:
	; CHECK-NEXT: vpabsb %ymm0, %ymm0			; CHECK-NEXT: vpabsb %ymm0, %ymm0
	; CHECK-NEXT: vpabsb %ymm0, %ymm0
	; CHECK-NEXT: retq			; CHECK-NEXT: retq
	%n1 = sub <32 x i8> zeroinitializer, %a			%n1 = sub <32 x i8> zeroinitializer, %a
	%b1 = icmp slt <32 x i8> %a, zeroinitializer			%b1 = icmp slt <32 x i8> %a, zeroinitializer
	%a1 = select <32 x i1> %b1, <32 x i8> %n1, <32 x i8> %a			%a1 = select <32 x i1> %b1, <32 x i8> %n1, <32 x i8> %a
	%a2 = call <32 x i8> @llvm.x86.avx2.pabs.b(<32 x i8> %a1)			%a2 = call <32 x i8> @llvm.x86.avx2.pabs.b(<32 x i8> %a1)
	ret <32 x i8> %a2			ret <32 x i8> %a2
	}			}
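The two abs_abs tests above drop the second vpabs instruction because absolute value is idempotent: applying it to a result that is already an absolute value changes nothing. That holds even for the minimum signed value, which wraps to itself. A small exhaustive check over all i8 bit patterns, assuming wrapping semantics, can be sketched as follows; absI8 and absIsIdempotentI8 are names invented for this example.

```cpp
#include <cstdint>

// Wrapping absolute value of one i8 lane, operating on the raw bit pattern.
// 0x80 (-128) has no positive counterpart and maps to itself.
inline uint8_t absI8(uint8_t x) {
  return (x & 0x80u) ? static_cast<uint8_t>(0u - x) : x;
}

// Returns true if abs(abs(x)) == abs(x) for every possible i8 value.
inline bool absIsIdempotentI8() {
  for (unsigned v = 0; v < 256; ++v) {
    const uint8_t once = absI8(static_cast<uint8_t>(v));
    if (absI8(once) != once)
      return false;
  }
  return true;
}
```

Note that the tests mix the intrinsic form with the select/icmp/sub form, so the fold only fires once both have been recognised as the same abs node.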


llvm/trunk/test/CodeGen/X86/viabs.ll

	; SSSE3-LABEL: test_abs_gt_v8i32:			; SSSE3-LABEL: test_abs_gt_v8i32:
	; SSSE3: # BB#0:			; SSSE3: # BB#0:
	; SSSE3-NEXT: pabsd %xmm0, %xmm0			; SSSE3-NEXT: pabsd %xmm0, %xmm0
	; SSSE3-NEXT: pabsd %xmm1, %xmm1			; SSSE3-NEXT: pabsd %xmm1, %xmm1
	; SSSE3-NEXT: retq			; SSSE3-NEXT: retq
	;			;
	; AVX1-LABEL: test_abs_gt_v8i32:			; AVX1-LABEL: test_abs_gt_v8i32:
	; AVX1: # BB#0:			; AVX1: # BB#0:
	; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm1			; AVX1-NEXT: vpabsd %xmm0, %xmm1
	; AVX1-NEXT: vpsrad $31, %xmm1, %xmm2			; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm0
	; AVX1-NEXT: vpaddd %xmm2, %xmm1, %xmm1			; AVX1-NEXT: vpabsd %xmm0, %xmm0
	; AVX1-NEXT: vpsrad $31, %xmm0, %xmm3			; AVX1-NEXT: vinsertf128 $1, %xmm0, %ymm1, %ymm0
	; AVX1-NEXT: vpaddd %xmm3, %xmm0, %xmm0
	; AVX1-NEXT: vinsertf128 $1, %xmm1, %ymm0, %ymm0
	; AVX1-NEXT: vinsertf128 $1, %xmm2, %ymm3, %ymm1
	; AVX1-NEXT: vxorps %ymm1, %ymm0, %ymm0
	; AVX1-NEXT: retq			; AVX1-NEXT: retq
	;			;
	; AVX2-LABEL: test_abs_gt_v8i32:			; AVX2-LABEL: test_abs_gt_v8i32:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vpabsd %ymm0, %ymm0			; AVX2-NEXT: vpabsd %ymm0, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512-LABEL: test_abs_gt_v8i32:			; AVX512-LABEL: test_abs_gt_v8i32:
	; SSSE3-LABEL: test_abs_ge_v8i32:			; SSSE3-LABEL: test_abs_ge_v8i32:
	; SSSE3: # BB#0:			; SSSE3: # BB#0:
	; SSSE3-NEXT: pabsd %xmm0, %xmm0			; SSSE3-NEXT: pabsd %xmm0, %xmm0
	; SSSE3-NEXT: pabsd %xmm1, %xmm1			; SSSE3-NEXT: pabsd %xmm1, %xmm1
	; SSSE3-NEXT: retq			; SSSE3-NEXT: retq
	;			;
	; AVX1-LABEL: test_abs_ge_v8i32:			; AVX1-LABEL: test_abs_ge_v8i32:
	; AVX1: # BB#0:			; AVX1: # BB#0:
	; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm1			; AVX1-NEXT: vpabsd %xmm0, %xmm1
	; AVX1-NEXT: vpsrad $31, %xmm1, %xmm2			; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm0
	; AVX1-NEXT: vpaddd %xmm2, %xmm1, %xmm1			; AVX1-NEXT: vpabsd %xmm0, %xmm0
	; AVX1-NEXT: vpsrad $31, %xmm0, %xmm3			; AVX1-NEXT: vinsertf128 $1, %xmm0, %ymm1, %ymm0
	; AVX1-NEXT: vpaddd %xmm3, %xmm0, %xmm0
	; AVX1-NEXT: vinsertf128 $1, %xmm1, %ymm0, %ymm0
	; AVX1-NEXT: vinsertf128 $1, %xmm2, %ymm3, %ymm1
	; AVX1-NEXT: vxorps %ymm1, %ymm0, %ymm0
	; AVX1-NEXT: retq			; AVX1-NEXT: retq
	;			;
	; AVX2-LABEL: test_abs_ge_v8i32:			; AVX2-LABEL: test_abs_ge_v8i32:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vpabsd %ymm0, %ymm0			; AVX2-NEXT: vpabsd %ymm0, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512-LABEL: test_abs_ge_v8i32:			; AVX512-LABEL: test_abs_ge_v8i32:
	; SSSE3-LABEL: test_abs_gt_v16i16:			; SSSE3-LABEL: test_abs_gt_v16i16:
	; SSSE3: # BB#0:			; SSSE3: # BB#0:
	; SSSE3-NEXT: pabsw %xmm0, %xmm0			; SSSE3-NEXT: pabsw %xmm0, %xmm0
	; SSSE3-NEXT: pabsw %xmm1, %xmm1			; SSSE3-NEXT: pabsw %xmm1, %xmm1
	; SSSE3-NEXT: retq			; SSSE3-NEXT: retq
	;			;
	; AVX1-LABEL: test_abs_gt_v16i16:			; AVX1-LABEL: test_abs_gt_v16i16:
	; AVX1: # BB#0:			; AVX1: # BB#0:
	; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm1			; AVX1-NEXT: vpabsw %xmm0, %xmm1
	; AVX1-NEXT: vpsraw $15, %xmm1, %xmm2			; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm0
	; AVX1-NEXT: vpaddw %xmm2, %xmm1, %xmm1			; AVX1-NEXT: vpabsw %xmm0, %xmm0
	; AVX1-NEXT: vpsraw $15, %xmm0, %xmm3			; AVX1-NEXT: vinsertf128 $1, %xmm0, %ymm1, %ymm0
	; AVX1-NEXT: vpaddw %xmm3, %xmm0, %xmm0
	; AVX1-NEXT: vinsertf128 $1, %xmm1, %ymm0, %ymm0
	; AVX1-NEXT: vinsertf128 $1, %xmm2, %ymm3, %ymm1
	; AVX1-NEXT: vxorps %ymm1, %ymm0, %ymm0
	; AVX1-NEXT: retq			; AVX1-NEXT: retq
	;			;
	; AVX2-LABEL: test_abs_gt_v16i16:			; AVX2-LABEL: test_abs_gt_v16i16:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vpabsw %ymm0, %ymm0			; AVX2-NEXT: vpabsw %ymm0, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512-LABEL: test_abs_gt_v16i16:			; AVX512-LABEL: test_abs_gt_v16i16:
	; SSSE3-LABEL: test_abs_lt_v32i8:			; SSSE3-LABEL: test_abs_lt_v32i8:
	; SSSE3: # BB#0:			; SSSE3: # BB#0:
	; SSSE3-NEXT: pabsb %xmm0, %xmm0			; SSSE3-NEXT: pabsb %xmm0, %xmm0
	; SSSE3-NEXT: pabsb %xmm1, %xmm1			; SSSE3-NEXT: pabsb %xmm1, %xmm1
	; SSSE3-NEXT: retq			; SSSE3-NEXT: retq
	;			;
	; AVX1-LABEL: test_abs_lt_v32i8:			; AVX1-LABEL: test_abs_lt_v32i8:
	; AVX1: # BB#0:			; AVX1: # BB#0:
	; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm1			; AVX1-NEXT: vpabsb %xmm0, %xmm1
	; AVX1-NEXT: vpxor %xmm2, %xmm2, %xmm2			; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm0
	; AVX1-NEXT: vpcmpgtb %xmm1, %xmm2, %xmm3			; AVX1-NEXT: vpabsb %xmm0, %xmm0
	; AVX1-NEXT: vpcmpgtb %xmm0, %xmm2, %xmm2			; AVX1-NEXT: vinsertf128 $1, %xmm0, %ymm1, %ymm0
	; AVX1-NEXT: vinsertf128 $1, %xmm3, %ymm2, %ymm4
	; AVX1-NEXT: vpaddb %xmm3, %xmm1, %xmm1
	; AVX1-NEXT: vpaddb %xmm2, %xmm0, %xmm0
	; AVX1-NEXT: vinsertf128 $1, %xmm1, %ymm0, %ymm0
	; AVX1-NEXT: vxorps %ymm4, %ymm0, %ymm0
	; AVX1-NEXT: retq			; AVX1-NEXT: retq
	;			;
	; AVX2-LABEL: test_abs_lt_v32i8:			; AVX2-LABEL: test_abs_lt_v32i8:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vpabsb %ymm0, %ymm0			; AVX2-NEXT: vpabsb %ymm0, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512-LABEL: test_abs_lt_v32i8:			; AVX512-LABEL: test_abs_lt_v32i8:
	Show All 22 Lines
	; SSSE3-LABEL: test_abs_le_v8i32:			; SSSE3-LABEL: test_abs_le_v8i32:
	; SSSE3: # BB#0:			; SSSE3: # BB#0:
	; SSSE3-NEXT: pabsd %xmm0, %xmm0			; SSSE3-NEXT: pabsd %xmm0, %xmm0
	; SSSE3-NEXT: pabsd %xmm1, %xmm1			; SSSE3-NEXT: pabsd %xmm1, %xmm1
	; SSSE3-NEXT: retq			; SSSE3-NEXT: retq
	;			;
	; AVX1-LABEL: test_abs_le_v8i32:			; AVX1-LABEL: test_abs_le_v8i32:
	; AVX1: # BB#0:			; AVX1: # BB#0:
	; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm1			; AVX1-NEXT: vpabsd %xmm0, %xmm1
	; AVX1-NEXT: vpsrad $31, %xmm1, %xmm2			; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm0
	; AVX1-NEXT: vpaddd %xmm2, %xmm1, %xmm1			; AVX1-NEXT: vpabsd %xmm0, %xmm0
	; AVX1-NEXT: vpsrad $31, %xmm0, %xmm3			; AVX1-NEXT: vinsertf128 $1, %xmm0, %ymm1, %ymm0
	; AVX1-NEXT: vpaddd %xmm3, %xmm0, %xmm0
	; AVX1-NEXT: vinsertf128 $1, %xmm1, %ymm0, %ymm0
	; AVX1-NEXT: vinsertf128 $1, %xmm2, %ymm3, %ymm1
	; AVX1-NEXT: vxorps %ymm1, %ymm0, %ymm0
	; AVX1-NEXT: retq			; AVX1-NEXT: retq
	;			;
	; AVX2-LABEL: test_abs_le_v8i32:			; AVX2-LABEL: test_abs_le_v8i32:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vpabsd %ymm0, %ymm0			; AVX2-NEXT: vpabsd %ymm0, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512-LABEL: test_abs_le_v8i32:			; AVX512-LABEL: test_abs_le_v8i32:
	Show All 32 Lines
	; SSSE3-NEXT: pabsd %xmm0, %xmm0			; SSSE3-NEXT: pabsd %xmm0, %xmm0
	; SSSE3-NEXT: pabsd %xmm1, %xmm1			; SSSE3-NEXT: pabsd %xmm1, %xmm1
	; SSSE3-NEXT: pabsd %xmm2, %xmm2			; SSSE3-NEXT: pabsd %xmm2, %xmm2
	; SSSE3-NEXT: pabsd %xmm3, %xmm3			; SSSE3-NEXT: pabsd %xmm3, %xmm3
	; SSSE3-NEXT: retq			; SSSE3-NEXT: retq
	;			;
	; AVX1-LABEL: test_abs_le_16i32:			; AVX1-LABEL: test_abs_le_16i32:
	; AVX1: # BB#0:			; AVX1: # BB#0:
	; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm2			; AVX1-NEXT: vpabsd %xmm0, %xmm2
	; AVX1-NEXT: vpsrad $31, %xmm2, %xmm3			; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm0
	; AVX1-NEXT: vpaddd %xmm3, %xmm2, %xmm2			; AVX1-NEXT: vpabsd %xmm0, %xmm0
	; AVX1-NEXT: vpsrad $31, %xmm0, %xmm4			; AVX1-NEXT: vinsertf128 $1, %xmm0, %ymm2, %ymm0
	; AVX1-NEXT: vpaddd %xmm4, %xmm0, %xmm0			; AVX1-NEXT: vpabsd %xmm1, %xmm2
	; AVX1-NEXT: vinsertf128 $1, %xmm2, %ymm0, %ymm0			; AVX1-NEXT: vextractf128 $1, %ymm1, %xmm1
	; AVX1-NEXT: vinsertf128 $1, %xmm3, %ymm4, %ymm2			; AVX1-NEXT: vpabsd %xmm1, %xmm1
	; AVX1-NEXT: vxorps %ymm2, %ymm0, %ymm0			; AVX1-NEXT: vinsertf128 $1, %xmm1, %ymm2, %ymm1
	; AVX1-NEXT: vextractf128 $1, %ymm1, %xmm2
	; AVX1-NEXT: vpsrad $31, %xmm2, %xmm3
	; AVX1-NEXT: vpaddd %xmm3, %xmm2, %xmm2
	; AVX1-NEXT: vpsrad $31, %xmm1, %xmm4
	; AVX1-NEXT: vpaddd %xmm4, %xmm1, %xmm1
	; AVX1-NEXT: vinsertf128 $1, %xmm2, %ymm1, %ymm1
	; AVX1-NEXT: vinsertf128 $1, %xmm3, %ymm4, %ymm2
	; AVX1-NEXT: vxorps %ymm2, %ymm1, %ymm1
	; AVX1-NEXT: retq			; AVX1-NEXT: retq
	;			;
	; AVX2-LABEL: test_abs_le_16i32:			; AVX2-LABEL: test_abs_le_16i32:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vpabsd %ymm0, %ymm0			; AVX2-NEXT: vpabsd %ymm0, %ymm0
	; AVX2-NEXT: vpabsd %ymm1, %ymm1			; AVX2-NEXT: vpabsd %ymm1, %ymm1
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	Show All 30 Lines
	; AVX2-NEXT: vpsrad $31, %xmm0, %xmm1			; AVX2-NEXT: vpsrad $31, %xmm0, %xmm1
	; AVX2-NEXT: vpshufd {{.*#+}} xmm1 = xmm1[1,1,3,3]			; AVX2-NEXT: vpshufd {{.*#+}} xmm1 = xmm1[1,1,3,3]
	; AVX2-NEXT: vpaddq %xmm1, %xmm0, %xmm0			; AVX2-NEXT: vpaddq %xmm1, %xmm0, %xmm0
	; AVX2-NEXT: vpxor %xmm1, %xmm0, %xmm0			; AVX2-NEXT: vpxor %xmm1, %xmm0, %xmm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512-LABEL: test_abs_ge_v2i64:			; AVX512-LABEL: test_abs_ge_v2i64:
	; AVX512: # BB#0:			; AVX512: # BB#0:
	; AVX512-NEXT: vpsraq $63, %xmm0, %xmm1			; AVX512-NEXT: vpabsq %xmm0, %xmm0
	; AVX512-NEXT: vpaddq %xmm1, %xmm0, %xmm0
	; AVX512-NEXT: vpxor %xmm1, %xmm0, %xmm0
	; AVX512-NEXT: retq			; AVX512-NEXT: retq
	%tmp1neg = sub <2 x i64> zeroinitializer, %a			%tmp1neg = sub <2 x i64> zeroinitializer, %a
	%b = icmp sge <2 x i64> %a, zeroinitializer			%b = icmp sge <2 x i64> %a, zeroinitializer
	%abs = select <2 x i1> %b, <2 x i64> %a, <2 x i64> %tmp1neg			%abs = select <2 x i1> %b, <2 x i64> %a, <2 x i64> %tmp1neg
	ret <2 x i64> %abs			ret <2 x i64> %abs
	}			}
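For reference, the AVX512 checks above for test_abs_ge_v2i64 replace a three-instruction expansion (vpsraq $63, vpaddq, vpxor) with a single vpabsq. The following is a minimal scalar sketch of why the two forms agree; it is not code from this patch, and the function names `abs_select` and `abs_sign_mask` are hypothetical, chosen only for illustration. It assumes two's complement wrapping and an arithmetic right shift for signed values (guaranteed from C++20, and the behaviour of mainstream compilers before that).

```cpp
#include <cassert>
#include <cstdint>

// Reference form, mirroring the IR in the test: select(a >= 0, a, 0 - a).
// The negation is done on unsigned values so that INT64_MIN wraps rather
// than triggering signed-overflow undefined behaviour, matching the
// wrapping `sub` in the IR.
int64_t abs_select(int64_t a) {
  uint64_t neg = 0 - static_cast<uint64_t>(a);
  return a >= 0 ? a : static_cast<int64_t>(neg);
}

// Expanded form, mirroring the old vpsraq/vpaddq/vpxor sequence:
//   mask   = a >> 63          (all ones if a is negative, else zero)
//   result = (a + mask) ^ mask
int64_t abs_sign_mask(int64_t a) {
  int64_t mask = a >> 63;  // assumed to be an arithmetic shift
  uint64_t sum = static_cast<uint64_t>(a) + static_cast<uint64_t>(mask);
  return static_cast<int64_t>(sum ^ static_cast<uint64_t>(mask));
}
```

Both forms return INT64_MIN unchanged for an INT64_MIN input, which is the wrapping behaviour this sketch assumes for the lanes of the vector instruction as well.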

	define <4 x i64> @test_abs_gt_v4i64(<4 x i64> %a) nounwind {			define <4 x i64> @test_abs_gt_v4i64(<4 x i64> %a) nounwind {
	Show All 30 Lines
	; AVX2-NEXT: vpsrad $31, %ymm0, %ymm1			; AVX2-NEXT: vpsrad $31, %ymm0, %ymm1
	; AVX2-NEXT: vpshufd {{.*#+}} ymm1 = ymm1[1,1,3,3,5,5,7,7]			; AVX2-NEXT: vpshufd {{.*#+}} ymm1 = ymm1[1,1,3,3,5,5,7,7]
	; AVX2-NEXT: vpaddq %ymm1, %ymm0, %ymm0			; AVX2-NEXT: vpaddq %ymm1, %ymm0, %ymm0
	; AVX2-NEXT: vpxor %ymm1, %ymm0, %ymm0			; AVX2-NEXT: vpxor %ymm1, %ymm0, %ymm0
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	; AVX512-LABEL: test_abs_gt_v4i64:			; AVX512-LABEL: test_abs_gt_v4i64:
	; AVX512: # BB#0:			; AVX512: # BB#0:
	; AVX512-NEXT: vpsraq $63, %ymm0, %ymm1			; AVX512-NEXT: vpabsq %ymm0, %ymm0
	; AVX512-NEXT: vpaddq %ymm1, %ymm0, %ymm0
	; AVX512-NEXT: vpxor %ymm1, %ymm0, %ymm0
	; AVX512-NEXT: retq			; AVX512-NEXT: retq
	%tmp1neg = sub <4 x i64> zeroinitializer, %a			%tmp1neg = sub <4 x i64> zeroinitializer, %a
	%b = icmp sgt <4 x i64> %a, <i64 -1, i64 -1, i64 -1, i64 -1>			%b = icmp sgt <4 x i64> %a, <i64 -1, i64 -1, i64 -1, i64 -1>
	%abs = select <4 x i1> %b, <4 x i64> %a, <4 x i64> %tmp1neg			%abs = select <4 x i1> %b, <4 x i64> %a, <4 x i64> %tmp1neg
	ret <4 x i64> %abs			ret <4 x i64> %abs
	}			}

	define <8 x i64> @test_abs_le_v8i64(<8 x i64> %a) nounwind {			define <8 x i64> @test_abs_le_v8i64(<8 x i64> %a) nounwind {
	▲ Show 20 Lines • Show All 173 Lines • ▼ Show 20 Lines
	; SSSE3-NEXT: pabsb %xmm0, %xmm0			; SSSE3-NEXT: pabsb %xmm0, %xmm0
	; SSSE3-NEXT: pabsb %xmm1, %xmm1			; SSSE3-NEXT: pabsb %xmm1, %xmm1
	; SSSE3-NEXT: pabsb %xmm2, %xmm2			; SSSE3-NEXT: pabsb %xmm2, %xmm2
	; SSSE3-NEXT: pabsb %xmm3, %xmm3			; SSSE3-NEXT: pabsb %xmm3, %xmm3
	; SSSE3-NEXT: retq			; SSSE3-NEXT: retq
	;			;
	; AVX1-LABEL: test_abs_lt_v64i8:			; AVX1-LABEL: test_abs_lt_v64i8:
	; AVX1: # BB#0:			; AVX1: # BB#0:
	; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm2			; AVX1-NEXT: vpabsb %xmm0, %xmm2
	; AVX1-NEXT: vpxor %xmm3, %xmm3, %xmm3			; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm0
	; AVX1-NEXT: vpcmpgtb %xmm2, %xmm3, %xmm4			; AVX1-NEXT: vpabsb %xmm0, %xmm0
	; AVX1-NEXT: vpcmpgtb %xmm0, %xmm3, %xmm5			; AVX1-NEXT: vinsertf128 $1, %xmm0, %ymm2, %ymm0
	; AVX1-NEXT: vinsertf128 $1, %xmm4, %ymm5, %ymm6			; AVX1-NEXT: vpabsb %xmm1, %xmm2
	; AVX1-NEXT: vpaddb %xmm4, %xmm2, %xmm2			; AVX1-NEXT: vextractf128 $1, %ymm1, %xmm1
	; AVX1-NEXT: vpaddb %xmm5, %xmm0, %xmm0			; AVX1-NEXT: vpabsb %xmm1, %xmm1
	; AVX1-NEXT: vinsertf128 $1, %xmm2, %ymm0, %ymm0			; AVX1-NEXT: vinsertf128 $1, %xmm1, %ymm2, %ymm1
	; AVX1-NEXT: vxorps %ymm6, %ymm0, %ymm0
	; AVX1-NEXT: vextractf128 $1, %ymm1, %xmm2
	; AVX1-NEXT: vpcmpgtb %xmm2, %xmm3, %xmm4
	; AVX1-NEXT: vpcmpgtb %xmm1, %xmm3, %xmm3
	; AVX1-NEXT: vinsertf128 $1, %xmm4, %ymm3, %ymm5
	; AVX1-NEXT: vpaddb %xmm4, %xmm2, %xmm2
	; AVX1-NEXT: vpaddb %xmm3, %xmm1, %xmm1
	; AVX1-NEXT: vinsertf128 $1, %xmm2, %ymm1, %ymm1
	; AVX1-NEXT: vxorps %ymm5, %ymm1, %ymm1
	; AVX1-NEXT: retq			; AVX1-NEXT: retq
	;			;
	; AVX2-LABEL: test_abs_lt_v64i8:			; AVX2-LABEL: test_abs_lt_v64i8:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vpabsb %ymm0, %ymm0			; AVX2-NEXT: vpabsb %ymm0, %ymm0
	; AVX2-NEXT: vpabsb %ymm1, %ymm1			; AVX2-NEXT: vpabsb %ymm1, %ymm1
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	Show All 39 Lines
	; SSSE3-NEXT: pabsw %xmm0, %xmm0			; SSSE3-NEXT: pabsw %xmm0, %xmm0
	; SSSE3-NEXT: pabsw %xmm1, %xmm1			; SSSE3-NEXT: pabsw %xmm1, %xmm1
	; SSSE3-NEXT: pabsw %xmm2, %xmm2			; SSSE3-NEXT: pabsw %xmm2, %xmm2
	; SSSE3-NEXT: pabsw %xmm3, %xmm3			; SSSE3-NEXT: pabsw %xmm3, %xmm3
	; SSSE3-NEXT: retq			; SSSE3-NEXT: retq
	;			;
	; AVX1-LABEL: test_abs_gt_v32i16:			; AVX1-LABEL: test_abs_gt_v32i16:
	; AVX1: # BB#0:			; AVX1: # BB#0:
	; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm2			; AVX1-NEXT: vpabsw %xmm0, %xmm2
	; AVX1-NEXT: vpsraw $15, %xmm2, %xmm3			; AVX1-NEXT: vextractf128 $1, %ymm0, %xmm0
	; AVX1-NEXT: vpaddw %xmm3, %xmm2, %xmm2			; AVX1-NEXT: vpabsw %xmm0, %xmm0
	; AVX1-NEXT: vpsraw $15, %xmm0, %xmm4			; AVX1-NEXT: vinsertf128 $1, %xmm0, %ymm2, %ymm0
	; AVX1-NEXT: vpaddw %xmm4, %xmm0, %xmm0			; AVX1-NEXT: vpabsw %xmm1, %xmm2
	; AVX1-NEXT: vinsertf128 $1, %xmm2, %ymm0, %ymm0			; AVX1-NEXT: vextractf128 $1, %ymm1, %xmm1
	; AVX1-NEXT: vinsertf128 $1, %xmm3, %ymm4, %ymm2			; AVX1-NEXT: vpabsw %xmm1, %xmm1
	; AVX1-NEXT: vxorps %ymm2, %ymm0, %ymm0			; AVX1-NEXT: vinsertf128 $1, %xmm1, %ymm2, %ymm1
	; AVX1-NEXT: vextractf128 $1, %ymm1, %xmm2
	; AVX1-NEXT: vpsraw $15, %xmm2, %xmm3
	; AVX1-NEXT: vpaddw %xmm3, %xmm2, %xmm2
	; AVX1-NEXT: vpsraw $15, %xmm1, %xmm4
	; AVX1-NEXT: vpaddw %xmm4, %xmm1, %xmm1
	; AVX1-NEXT: vinsertf128 $1, %xmm2, %ymm1, %ymm1
	; AVX1-NEXT: vinsertf128 $1, %xmm3, %ymm4, %ymm2
	; AVX1-NEXT: vxorps %ymm2, %ymm1, %ymm1
	; AVX1-NEXT: retq			; AVX1-NEXT: retq
	;			;
	; AVX2-LABEL: test_abs_gt_v32i16:			; AVX2-LABEL: test_abs_gt_v32i16:
	; AVX2: # BB#0:			; AVX2: # BB#0:
	; AVX2-NEXT: vpabsw %ymm0, %ymm0			; AVX2-NEXT: vpabsw %ymm0, %ymm0
	; AVX2-NEXT: vpabsw %ymm1, %ymm1			; AVX2-NEXT: vpabsw %ymm1, %ymm1
	; AVX2-NEXT: retq			; AVX2-NEXT: retq
	;			;
	Show All 15 Lines

Revision Contents

Diff 91774

llvm/trunk/include/llvm/CodeGen/ISDOpcodes.h
llvm/trunk/include/llvm/Target/TargetSelectionDAG.td
llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp
llvm/trunk/lib/CodeGen/TargetLoweringBase.cpp
llvm/trunk/lib/Target/X86/X86ISelLowering.h
llvm/trunk/lib/Target/X86/X86ISelLowering.cpp
llvm/trunk/lib/Target/X86/X86InstrAVX512.td
llvm/trunk/lib/Target/X86/X86InstrFragmentsSIMD.td
llvm/trunk/lib/Target/X86/X86InstrSSE.td
llvm/trunk/lib/Target/X86/X86IntrinsicsInfo.h
llvm/trunk/test/CodeGen/X86/combine-abs.ll
llvm/trunk/test/CodeGen/X86/viabs.ll