Details

Reviewers

t.p.northover
jmolloy
MatzeB
mcrosier

Commits

rG92431703d7a2: AArch64: Implement missed conditional compare sequences.
rL259387: AArch64: Implement missed conditional compare sequences.

Summary

This extends the existing implementation from r242436, which is
restricted to select inputs. This version fixes missed opportunities
in pr26084 by attempting to lower conditional compare sequences of
and/or trees with setcc leaves. It additionally handles the case
where a tree with a select input is not a conjunction-disjunction tree
but some of its subtrees are conjunction-disjunction trees.
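
To make the target of this lowering concrete, here is a minimal sketch that models `cmp`, `ccmp`, `cset` and `cinc` on NZCV flags and replays the instruction sequences expected by the `single_noselect` and `single_and_ext` tests in this revision. The helper names (`Flags`, `cmp`, `ccmp`, `sequencesAgree`, and so on) are hypothetical and exist only for illustration, and the C-level forms of the two tests are reconstructed from their IR, so treat them as assumptions rather than the original sources.

```cpp
#include <cstdint>

// NZCV condition flags as produced by AArch64 compare instructions.
struct Flags {
  bool n, z, c, v;
};

// cmp a, b: set flags from the 32-bit subtraction a - b.
Flags cmp(int32_t a, int32_t b) {
  uint32_t ua = static_cast<uint32_t>(a), ub = static_cast<uint32_t>(b);
  uint32_t res = ua - ub;                      // wraps modulo 2^32
  int64_t wide = static_cast<int64_t>(a) - b;  // exact signed result
  bool n = (res >> 31) != 0;
  bool v = wide < INT32_MIN || wide > INT32_MAX;  // signed overflow
  return {n, res == 0, ua >= ub, v};
}

// Unpack a 4-bit #nzcv immediate (N is bit 3, V is bit 0).
Flags fromNzcv(unsigned imm) {
  return {(imm & 8) != 0, (imm & 4) != 0, (imm & 2) != 0, (imm & 1) != 0};
}

bool lt(Flags f) { return f.n != f.v; }  // signed less than
bool ge(Flags f) { return f.n == f.v; }  // signed greater or equal

// ccmp a, #b, #nzcv, cond: do the compare if cond held on the incoming
// flags, otherwise load the immediate flags instead.
Flags ccmp(bool condHeld, int32_t a, int32_t b, unsigned nzcv) {
  return condHeld ? cmp(a, b) : fromNzcv(nzcv);
}

// Assumed source form of single_noselect: A < 1 || B < 1.
int singleNoselect(int32_t a, int32_t b) { return a < 1 || b < 1; }

// cmp w1, #1 ; ccmp w0, #1, #8, ge ; cset w0, lt
int singleNoselectCcmp(int32_t a, int32_t b) {
  Flags f = cmp(b, 1);
  f = ccmp(ge(f), a, 1, 8);
  return lt(f) ? 1 : 0;
}

// Assumed source form of single_and_ext: (A < 4 && B < 2) + C.
int singleAndExt(int32_t a, int32_t b, int32_t c) {
  return (a < 4 && b < 2) + c;
}

// cmp w1, #2 ; ccmp w0, #4, #0, lt ; cinc w0, w2, lt
int singleAndExtCcmp(int32_t a, int32_t b, int32_t c) {
  Flags f = cmp(b, 2);
  f = ccmp(lt(f), a, 4, 0);
  return lt(f) ? c + 1 : c;
}

// Compare the source forms against the modeled sequences on a small grid.
bool sequencesAgree(int32_t lo, int32_t hi) {
  for (int32_t a = lo; a <= hi; ++a)
    for (int32_t b = lo; b <= hi; ++b) {
      if (singleNoselect(a, b) != singleNoselectCcmp(a, b)) return false;
      if (singleAndExt(a, b, 7) != singleAndExtCcmp(a, b, 7)) return false;
    }
  return true;
}
```

Calling `sequencesAgree(-8, 8)` checks both sequences against their source forms. The `#nzcv` immediate is what makes the chain work: it is chosen so that, when the first comparison already decides the result, the final condition sees flags that encode that decision.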

Diff Detail

Repository: rL LLVM

Event Timeline

bmakam updated this revision to Diff 45171. Jan 18 2016, 5:14 AM

bmakam retitled this revision to AArch64: Implement missed conditional compare sequences.

bmakam updated this object.

bmakam added a reviewer: mcrosier.

Herald added subscribers: rengolin, aemerson. Jan 18 2016, 5:14 AM

mcrosier edited subscribers, added: gberry, mssimpso, haicheng, junbuml; removed: aemerson, rengolin. Jan 18 2016, 6:36 AM

I'd like to hear @MatzeB's comments on the patch. We should be able to handle things other than returns, but we might save that for the next step.

test/CodeGen/AArch64/arm64-ccmp.ll
404 ↗	(On Diff #45171)	Please add a test case for 'and' as well.

Perhaps a better place to catch this would be in performExtendCombine? You could look for (zext/sext/anyext (and/or i1)) there and transform it into a CSEL with optimized condition.
Here are a couple more test cases that I think this approach would catch:

int single_noselect_phi(int A, int B, int C)
{
    _Bool b;
    if (C) {
        b = A < 4 || B < 2;
    } else {
        b = A > 0 && B > 0;
    }
    return b;
}

int single_ext(int A, int B, int C)
{
    return (A < 4 || B < 2) + C;
}

The transformation done in LowerAND/LowerOR in this patch should always be correct, I think. However, it will obviously pessimize code when the result is fed back into a condition input of a node like BRCOND, CSEL, ... I think that instead of restricting this to Return inputs, this can indeed be used in more cases.
I think as LegalizeDAG works in reverse topological order, the only way we would see And i32, Or i32 (with setcc leaves) is when we previously had a zext i1->i32 or if we failed to construct a ccmp sequence because isConjunctionDisjunctionTree() was not satisfied. So I think it should be enough to test for isConjunctionDisjunctionTree() and do the transform if successful. That should also catch the cases gberry mentioned above, I think (and it would IMO be cleaner than introducing CSELs earlier in the pipeline, especially since we still have to know whether we can form a ccmp sequence at all, or the transformation will be a pessimisation).

Thanks for the feedback Matthias and Geoff,

The approach here is to catch the cases where LowerSELECT failed to construct ccmp sequences, so it had to be done at LowerOR/LowerAND. The current patch fails to catch the cases mentioned by Geoff because of the restriction to Return inputs. We should be able to handle inputs other than returns, and I was saving this for a follow-on patch because it breaks the select_noccmp1 test case in arm64-ccmp.ll:

define i64 @select_noccmp1(i64 %v1, i64 %v2, i64 %v3, i64 %r) {
  %c0 = icmp slt i64 %v1, 0
  %c1 = icmp sgt i64 %v1, 13
  %c2 = icmp slt i64 %v3, 2
  %c4 = icmp sgt i64 %v3, 4
  %and0 = and i1 %c0, %c1
  %and1 = and i1 %c2, %c4
  %or = or i1 %and0, %and1
  %sel = select i1 %or, i64 0, i64 %r
  ret i64 %sel
}

Matthias, can you please let me know why you added this test case? I think we fail to construct a ccmp sequence here because the tree at or i1 is not a conjunction-disjunction tree, as we cannot push the negate operation through the tree not (or (and x y) (and x z)). However, with the current approach of doing the transformation at LowerOR/LowerAND, we can look at the subtrees (and x y) and (and x z) respectively, and they can be transformed into ccmp sequences, thus breaking this test case. Is it reasonable to modify the test case if we agree on extending this to catch the cases Geoff mentioned?
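
The distinction drawn here, where the whole tree fails the check while its and subtrees pass, can be illustrated with a toy model. This is only a sketch under stated assumptions: `Node`, `isChain`, `formsChain` and `orOfAnds` are hypothetical names, and the rule used for or nodes (both children must accept a negate) is a simplification chosen to match the description in this thread. It is not the logic of isConjunctionDisjunctionTree() itself, which may well be more permissive.

```cpp
#include <memory>
#include <utility>

// Toy expression tree: a leaf stands for a setcc, inner nodes are and/or.
struct Node {
  enum Kind { SetCC, And, Or } kind;
  std::unique_ptr<Node> lhs, rhs;
};

std::unique_ptr<Node> leaf() {
  return std::unique_ptr<Node>(new Node{Node::SetCC, nullptr, nullptr});
}

std::unique_ptr<Node> inner(Node::Kind k, std::unique_ptr<Node> l,
                            std::unique_ptr<Node> r) {
  return std::unique_ptr<Node>(new Node{k, std::move(l), std::move(r)});
}

// Returns true if the tree can be emitted as one compare chain under the
// toy rules. canNegate reports whether the negated tree could be emitted too.
bool isChain(const Node &n, bool &canNegate) {
  if (n.kind == Node::SetCC) {
    canNegate = true;  // a comparison is negated by inverting its condition
    return true;
  }
  bool negL = false, negR = false;
  if (!isChain(*n.lhs, negL) || !isChain(*n.rhs, negR))
    return false;
  if (n.kind == Node::And) {
    // not (and x y) would become an or, so a negate cannot be pushed through.
    canNegate = false;
    return true;
  }
  // Toy rule (assumption): or x y is emitted as not (and (not x) (not y)),
  // which needs both children to accept a negate.
  canNegate = negL && negR;
  return negL && negR;
}

// Convenience wrapper that ignores the canNegate result.
bool formsChain(const Node &n) {
  bool ignored = false;
  return isChain(n, ignored);
}

// Shape of select_noccmp1: or (and c0 c1) (and c2 c4).
std::unique_ptr<Node> orOfAnds() {
  return inner(Node::Or, inner(Node::And, leaf(), leaf()),
               inner(Node::And, leaf(), leaf()));
}
```

Under these rules `formsChain(*orOfAnds())` is false, while each and subtree on its own passes, which is the situation that lets lowering at the and/or nodes pick up the subtrees even though the select-level check gave up.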

Addressed reviewers' comments/suggestions and removed the restriction to Return inputs.

Herald added a subscriber: mcrosier. Jan 21 2016, 5:03 AM

mcrosier added inline comments. Jan 21 2016, 6:48 AM

lib/Target/AArch64/AArch64ISelLowering.cpp
1566 ↗	(On Diff #45514)	No need for the extra parens around the call to isConjunctionDisjunctionTree.
1568 ↗	(On Diff #45514)	Same; no need for the extra parens around the call to isConjunctionDisjunctionTree.
1570 ↗	(On Diff #45514)	If possible, I would hoist this check above the calls to isConjunctionDisjunctionTree.
1702 ↗	(On Diff #45514)	Please remove the extra whitespace.
1710 ↗	(On Diff #45514)	Please remove the extra whitespace.

bmakam updated this revision to Diff 45543. Jan 21 2016, 9:27 AM

Addressed comments from Chad.

The code looks good to me. Now we need to collect performance results and investigate regressions, if any are exposed.

No correctness failures observed. Pending perf data. Just want to give a heads up that this new version is more invasive and impacts almost all of Spec2k/2k6.

This looks like the right direction now; however, I believe that you can shorten tryLowerToAArch64Cmp() as noted below. I'd also recommend rebasing llvm to get r258605 (I found a bug in my ccmp creation code while re-reading it during review).

lib/Target/AArch64/AArch64ISelLowering.cpp
1557–1571 ↗	(On Diff #45545)	I would assume that you can replace the first part of this function with: bool Dummy; if (!isConjunctionDisjunctionTree(Op, Dummy)) return SDValue(); That would be shorter and would let the function bail out on some subtrees that cannot be represented with ccmp sequences (not a correctness problem, but you would generate unnecessary csels below if you cannot use ccmp anyway).

Rebased and addressed Matthias's comments. Still pending full perf data; so far everything tested is within noise. There is a net reduction in the number of static instructions, which is a good thing.

MatzeB added inline comments. Jan 25 2016, 12:46 PM

lib/Target/AArch64/AArch64ISelLowering.cpp
1614–1615 ↗	(On Diff #45890)	I think the same check is performed inside isConjunctionDisjunctionTree.

Sorry for the late response. It took me some time to get the full performance data. I did not find any significant gains or regressions with this patch and there were net reductions in the number of static and dynamic instructions. Overall the data seems to be good.

Benchmark Diff (%)

spec2006/astar -0.0314288
spec2006/bzip2 0.00814573
spec2006/dealII 0.0193043
spec2006/gcc 0.130547
spec2006/gobmk 0.155664
spec2006/h264ref -0.0788196
spec2006/hmmer -0.126024
spec2006/lbm -0.244571
spec2006/libquantum 0.197392
spec2006/mcf -0.65466
spec2006/milc -0.170548
spec2006/namd -0.0334161
spec2006/omnetpp 2.86068
spec2006/perlbench -0.598455
spec2006/povray 1.73534
spec2006/sjeng 0.184844
spec2006/soplex 0.537032
spec2006/sphinx3 1.83868
spec2006/xalancbmk -0.774806

CINT2006_GEOMEAN 0.11%
CFP2006_GEOMEAN 0.53%
CPU2006_GEOMEAN 0.27%
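
The thread does not say how the summary rows were derived from the per-benchmark rows. As a rough cross-check, the sketch below assumes each Diff value is a percentage change and that a summary row is the geometric mean of the corresponding ratios. The function name `geomeanPercent` and the constants `kCint` and `kCfp` are hypothetical, and the grouping follows SPEC CPU2006 suite membership rather than anything stated in the review.

```cpp
#include <cmath>
#include <vector>

// Geometric mean of per-benchmark changes given in percent, returned in
// percent. Assumes a non-empty input and changes greater than -100%.
double geomeanPercent(const std::vector<double> &pct) {
  double logSum = 0.0;
  for (double p : pct)
    logSum += std::log1p(p / 100.0);  // log of the assumed new/old ratio
  return std::expm1(logSum / pct.size()) * 100.0;
}

// Integer benchmarks from the table: astar, bzip2, gcc, gobmk, h264ref,
// hmmer, libquantum, mcf, omnetpp, perlbench, sjeng, xalancbmk.
const std::vector<double> kCint = {
    -0.0314288, 0.00814573, 0.130547, 0.155664, -0.0788196, -0.126024,
    0.197392,   -0.65466,   2.86068,  -0.598455, 0.184844,  -0.774806};

// Floating-point benchmarks from the table: dealII, lbm, milc, namd,
// povray, soplex, sphinx3.
const std::vector<double> kCfp = {0.0193043, -0.244571, -0.170548, -0.0334161,
                                  1.73534,   0.537032,  1.83868};
```

Under these assumptions `geomeanPercent(kCint)` and `geomeanPercent(kCfp)` land within a couple of hundredths of a percentage point of the reported CINT2006 and CFP2006 rows. For changes this small the geometric and arithmetic means nearly coincide, so the table cannot settle which one was actually used.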

The only significant regressions were in xalancbmk (-0.77%) and perlbench (-0.59%), but they seemed transitory, as a rerun with the latest tip shows the data is noisy:
spec2006/xalancbmk 0.232571
spec2006/perlbench 0.164644

Thanks for working on this, LGTM

This revision is now accepted and ready to land. Feb 1 2016, 10:36 AM

Closed by commit rL259387: AArch64: Implement missed conditional compare sequences. (authored by bmakam). Feb 1 2016, 11:17 AM

This revision was automatically updated to reflect the committed changes.

Hi Marcello,

Without making it LegalOrCustom it will break these lit-tests:
CodeGen/AArch64/arm64-abi-varargs.ll
CodeGen/AArch64/arm64-aapcs.ll

Would it help if we add a check to let legalize expand this if it isn't a legal type yet during custom lowering?

Adding something like this in tryLowerToAArch64Cmp:

if (!DAG.getTargetLoweringInfo().isTypeLegal(VT))
  return SDValue();

Thanks,
Balaram

Diff 46561

llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp

Show First 20 Lines • Show All 1,715 Lines • ▼ Show 20 Lines	if (isa<ConstantSDNode>(N00) || isa<ConstantSDNode>(N10))
DAG.getNode(ISD::ADD, SDLoc(N0), VT, N00, N10),		DAG.getNode(ISD::ADD, SDLoc(N0), VT, N00, N10),
DAG.getNode(ISD::ADD, SDLoc(N1), VT, N01, N11));		DAG.getNode(ISD::ADD, SDLoc(N1), VT, N01, N11));
}		}

if (!VT.isVector() && SimplifyDemandedBits(SDValue(N, 0)))		if (!VT.isVector() && SimplifyDemandedBits(SDValue(N, 0)))
return SDValue(N, 0);		return SDValue(N, 0);

// fold (a+b) -> (a|b) iff a and b share no bits.		// fold (a+b) -> (a|b) iff a and b share no bits.
if ((!LegalOperations || TLI.isOperationLegal(ISD::OR, VT)) &&		if ((!LegalOperations || TLI.isOperationLegalOrCustom(ISD::OR, VT)) &&
VT.isInteger() && !VT.isVector() && DAG.haveNoCommonBitsSet(N0, N1))		VT.isInteger() && !VT.isVector() && DAG.haveNoCommonBitsSet(N0, N1))
return DAG.getNode(ISD::OR, SDLoc(N), VT, N0, N1);		return DAG.getNode(ISD::OR, SDLoc(N), VT, N0, N1);

// fold (add x, shl(0 - y, n)) -> sub(x, shl(y, n))		// fold (add x, shl(0 - y, n)) -> sub(x, shl(y, n))
if (N1.getOpcode() == ISD::SHL && N1.getOperand(0).getOpcode() == ISD::SUB &&		if (N1.getOpcode() == ISD::SHL && N1.getOperand(0).getOpcode() == ISD::SUB &&
isNullConstant(N1.getOperand(0).getOperand(0)))		isNullConstant(N1.getOperand(0).getOperand(0)))
return DAG.getNode(ISD::SUB, SDLoc(N), VT, N0,		return DAG.getNode(ISD::SUB, SDLoc(N), VT, N0,
DAG.getNode(ISD::SHL, SDLoc(N), VT,		DAG.getNode(ISD::SHL, SDLoc(N), VT,
▲ Show 20 Lines • Show All 4,625 Lines • ▼ Show 20 Lines	SDValue DAGCombiner::visitZERO_EXTEND(SDNode *N) {
// (and/or/xor (zextload x), (zext cst))		// (and/or/xor (zextload x), (zext cst))
// Unless (and (load x) cst) will match as a zextload already and has		// Unless (and (load x) cst) will match as a zextload already and has
// additional users.		// additional users.
if ((N0.getOpcode() == ISD::AND || N0.getOpcode() == ISD::OR ||		if ((N0.getOpcode() == ISD::AND || N0.getOpcode() == ISD::OR ||
N0.getOpcode() == ISD::XOR) &&		N0.getOpcode() == ISD::XOR) &&
isa<LoadSDNode>(N0.getOperand(0)) &&		isa<LoadSDNode>(N0.getOperand(0)) &&
N0.getOperand(1).getOpcode() == ISD::Constant &&		N0.getOperand(1).getOpcode() == ISD::Constant &&
TLI.isLoadExtLegal(ISD::ZEXTLOAD, VT, N0.getValueType()) &&		TLI.isLoadExtLegal(ISD::ZEXTLOAD, VT, N0.getValueType()) &&
(!LegalOperations && TLI.isOperationLegal(N0.getOpcode(), VT))) {		(!LegalOperations && TLI.isOperationLegalOrCustom(N0.getOpcode(), VT))) {
LoadSDNode *LN0 = cast<LoadSDNode>(N0.getOperand(0));		LoadSDNode *LN0 = cast<LoadSDNode>(N0.getOperand(0));
if (LN0->getExtensionType() != ISD::SEXTLOAD && LN0->isUnindexed()) {		if (LN0->getExtensionType() != ISD::SEXTLOAD && LN0->isUnindexed()) {
bool DoXform = true;		bool DoXform = true;
SmallVector<SDNode*, 4> SetCCs;		SmallVector<SDNode*, 4> SetCCs;
if (!N0.hasOneUse()) {		if (!N0.hasOneUse()) {
if (N0.getOpcode() == ISD::AND) {		if (N0.getOpcode() == ISD::AND) {
auto *AndC = cast<ConstantSDNode>(N0.getOperand(1));		auto *AndC = cast<ConstantSDNode>(N0.getOperand(1));
auto NarrowLoad = false;		auto NarrowLoad = false;
▲ Show 20 Lines • Show All 8,428 Lines • Show Last 20 Lines

llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.h

Show First 20 Lines • Show All 482 Lines • ▼ Show 20 Lines	private:
SDValue LowerEXTRACT_SUBVECTOR(SDValue Op, SelectionDAG &DAG) const;		SDValue LowerEXTRACT_SUBVECTOR(SDValue Op, SelectionDAG &DAG) const;
SDValue LowerVectorSRA_SRL_SHL(SDValue Op, SelectionDAG &DAG) const;		SDValue LowerVectorSRA_SRL_SHL(SDValue Op, SelectionDAG &DAG) const;
SDValue LowerShiftLeftParts(SDValue Op, SelectionDAG &DAG) const;		SDValue LowerShiftLeftParts(SDValue Op, SelectionDAG &DAG) const;
SDValue LowerShiftRightParts(SDValue Op, SelectionDAG &DAG) const;		SDValue LowerShiftRightParts(SDValue Op, SelectionDAG &DAG) const;
SDValue LowerVSETCC(SDValue Op, SelectionDAG &DAG) const;		SDValue LowerVSETCC(SDValue Op, SelectionDAG &DAG) const;
SDValue LowerCTPOP(SDValue Op, SelectionDAG &DAG) const;		SDValue LowerCTPOP(SDValue Op, SelectionDAG &DAG) const;
SDValue LowerF128Call(SDValue Op, SelectionDAG &DAG,		SDValue LowerF128Call(SDValue Op, SelectionDAG &DAG,
RTLIB::Libcall Call) const;		RTLIB::Libcall Call) const;
		SDValue LowerAND(SDValue Op, SelectionDAG &DAG) const;
		SDValue LowerOR(SDValue Op, SelectionDAG &DAG) const;
SDValue LowerFCOPYSIGN(SDValue Op, SelectionDAG &DAG) const;		SDValue LowerFCOPYSIGN(SDValue Op, SelectionDAG &DAG) const;
SDValue LowerFP_EXTEND(SDValue Op, SelectionDAG &DAG) const;		SDValue LowerFP_EXTEND(SDValue Op, SelectionDAG &DAG) const;
SDValue LowerFP_ROUND(SDValue Op, SelectionDAG &DAG) const;		SDValue LowerFP_ROUND(SDValue Op, SelectionDAG &DAG) const;
SDValue LowerFP_TO_INT(SDValue Op, SelectionDAG &DAG) const;		SDValue LowerFP_TO_INT(SDValue Op, SelectionDAG &DAG) const;
SDValue LowerINT_TO_FP(SDValue Op, SelectionDAG &DAG) const;		SDValue LowerINT_TO_FP(SDValue Op, SelectionDAG &DAG) const;
SDValue LowerVectorAND(SDValue Op, SelectionDAG &DAG) const;		SDValue LowerVectorAND(SDValue Op, SelectionDAG &DAG) const;
SDValue LowerVectorOR(SDValue Op, SelectionDAG &DAG) const;		SDValue LowerVectorOR(SDValue Op, SelectionDAG &DAG) const;
SDValue LowerCONCAT_VECTORS(SDValue Op, SelectionDAG &DAG) const;		SDValue LowerCONCAT_VECTORS(SDValue Op, SelectionDAG &DAG) const;
▲ Show 20 Lines • Show All 62 Lines • Show Last 20 Lines

llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp

Show First 20 Lines • Show All 138 Lines • ▼ Show 20 Lines	AArch64TargetLowering::AArch64TargetLowering(const TargetMachine &TM,
setOperationAction(ISD::FREM, MVT::f64, Expand);		setOperationAction(ISD::FREM, MVT::f64, Expand);
setOperationAction(ISD::FREM, MVT::f80, Expand);		setOperationAction(ISD::FREM, MVT::f80, Expand);

// Custom lowering hooks are needed for XOR		// Custom lowering hooks are needed for XOR
// to fold it into CSINC/CSINV.		// to fold it into CSINC/CSINV.
setOperationAction(ISD::XOR, MVT::i32, Custom);		setOperationAction(ISD::XOR, MVT::i32, Custom);
setOperationAction(ISD::XOR, MVT::i64, Custom);		setOperationAction(ISD::XOR, MVT::i64, Custom);

		// Custom lowering hooks are needed for OR
		// to fold it into CCMP.
		setOperationAction(ISD::OR, MVT::i32, Custom);
		setOperationAction(ISD::OR, MVT::i64, Custom);

		// Custom lowering hooks are needed for AND
		// to fold it into CCMP.
		setOperationAction(ISD::AND, MVT::i32, Custom);
		setOperationAction(ISD::AND, MVT::i64, Custom);

// Virtually no operation on f128 is legal, but LLVM can't expand them when		// Virtually no operation on f128 is legal, but LLVM can't expand them when
// there's a valid register class, so we need custom operations in most cases.		// there's a valid register class, so we need custom operations in most cases.
setOperationAction(ISD::FABS, MVT::f128, Expand);		setOperationAction(ISD::FABS, MVT::f128, Expand);
setOperationAction(ISD::FADD, MVT::f128, Custom);		setOperationAction(ISD::FADD, MVT::f128, Custom);
setOperationAction(ISD::FCOPYSIGN, MVT::f128, Expand);		setOperationAction(ISD::FCOPYSIGN, MVT::f128, Expand);
setOperationAction(ISD::FCOS, MVT::f128, Expand);		setOperationAction(ISD::FCOS, MVT::f128, Expand);
setOperationAction(ISD::FDIV, MVT::f128, Custom);		setOperationAction(ISD::FDIV, MVT::f128, Custom);
setOperationAction(ISD::FMA, MVT::f128, Expand);		setOperationAction(ISD::FMA, MVT::f128, Expand);
▲ Show 20 Lines • Show All 1,437 Lines • ▼ Show 20 Lines	static SDValue getAArch64Cmp(SDValue LHS, SDValue RHS, ISD::CondCode CC,
if (!Cmp) {		if (!Cmp) {
Cmp = emitComparison(LHS, RHS, CC, dl, DAG);		Cmp = emitComparison(LHS, RHS, CC, dl, DAG);
AArch64CC = changeIntCCToAArch64CC(CC);		AArch64CC = changeIntCCToAArch64CC(CC);
}		}
AArch64cc = DAG.getConstant(AArch64CC, dl, MVT_CC);		AArch64cc = DAG.getConstant(AArch64CC, dl, MVT_CC);
return Cmp;		return Cmp;
}		}

		// Attempt to form conditional compare sequences for and/or trees
		// with setcc leafs.
		static SDValue tryLowerToAArch64Cmp(SDValue Op, SelectionDAG &DAG) {
		SDValue LHS = Op.getOperand(0);
		SDValue RHS = Op.getOperand(1);
		if ((LHS.getOpcode() != ISD::SETCC) || (RHS.getOpcode() != ISD::SETCC))
		return Op;

		bool CanNegate;
		if (!isConjunctionDisjunctionTree(Op, CanNegate))
		return SDValue();

		EVT VT = Op.getValueType();
		SDLoc DL(Op);
		SDValue TVal = DAG.getConstant(1, DL, VT);
		SDValue FVal = DAG.getConstant(0, DL, VT);
		SDValue CCVal;
		SDValue Cmp = getAArch64Cmp(Op, FVal, ISD::SETEQ, CCVal, DAG, DL);
		return DAG.getNode(AArch64ISD::CSEL, DL, VT, FVal, TVal, CCVal, Cmp);
		}

static std::pair<SDValue, SDValue>		static std::pair<SDValue, SDValue>
getAArch64XALUOOp(AArch64CC::CondCode &CC, SDValue Op, SelectionDAG &DAG) {		getAArch64XALUOOp(AArch64CC::CondCode &CC, SDValue Op, SelectionDAG &DAG) {
assert((Op.getValueType() == MVT::i32 || Op.getValueType() == MVT::i64) &&		assert((Op.getValueType() == MVT::i32 || Op.getValueType() == MVT::i64) &&
"Unsupported value type");		"Unsupported value type");
SDValue Value, Overflow;		SDValue Value, Overflow;
SDLoc DL(Op);		SDLoc DL(Op);
SDValue LHS = Op.getOperand(0);		SDValue LHS = Op.getOperand(0);
SDValue RHS = Op.getOperand(1);		SDValue RHS = Op.getOperand(1);
▲ Show 20 Lines • Show All 105 Lines • ▼ Show 20 Lines
}		}

SDValue AArch64TargetLowering::LowerF128Call(SDValue Op, SelectionDAG &DAG,		SDValue AArch64TargetLowering::LowerF128Call(SDValue Op, SelectionDAG &DAG,
RTLIB::Libcall Call) const {		RTLIB::Libcall Call) const {
SmallVector<SDValue, 2> Ops(Op->op_begin(), Op->op_end());		SmallVector<SDValue, 2> Ops(Op->op_begin(), Op->op_end());
return makeLibCall(DAG, Call, MVT::f128, Ops, false, SDLoc(Op)).first;		return makeLibCall(DAG, Call, MVT::f128, Ops, false, SDLoc(Op)).first;
}		}

		SDValue AArch64TargetLowering::LowerAND(SDValue Op, SelectionDAG &DAG) const {
		if (Op.getValueType().isVector())
		return LowerVectorAND(Op, DAG);
		return tryLowerToAArch64Cmp(Op, DAG);
		}

		SDValue AArch64TargetLowering::LowerOR(SDValue Op, SelectionDAG &DAG) const {
		if (Op.getValueType().isVector())
		return LowerVectorOR(Op, DAG);
		return tryLowerToAArch64Cmp(Op, DAG);
		}

static SDValue LowerXOR(SDValue Op, SelectionDAG &DAG) {		static SDValue LowerXOR(SDValue Op, SelectionDAG &DAG) {
SDValue Sel = Op.getOperand(0);		SDValue Sel = Op.getOperand(0);
SDValue Other = Op.getOperand(1);		SDValue Other = Op.getOperand(1);

// If neither operand is a SELECT_CC, give up.		// If neither operand is a SELECT_CC, give up.
if (Sel.getOpcode() != ISD::SELECT_CC)		if (Sel.getOpcode() != ISD::SELECT_CC)
std::swap(Sel, Other);		std::swap(Sel, Other);
if (Sel.getOpcode() != ISD::SELECT_CC)		if (Sel.getOpcode() != ISD::SELECT_CC)
▲ Show 20 Lines • Show All 638 Lines • ▼ Show 20 Lines	SDValue AArch64TargetLowering::LowerOperation(SDValue Op,
case ISD::SRL_PARTS:		case ISD::SRL_PARTS:
case ISD::SRA_PARTS:		case ISD::SRA_PARTS:
return LowerShiftRightParts(Op, DAG);		return LowerShiftRightParts(Op, DAG);
case ISD::CTPOP:		case ISD::CTPOP:
return LowerCTPOP(Op, DAG);		return LowerCTPOP(Op, DAG);
case ISD::FCOPYSIGN:		case ISD::FCOPYSIGN:
return LowerFCOPYSIGN(Op, DAG);		return LowerFCOPYSIGN(Op, DAG);
case ISD::AND:		case ISD::AND:
return LowerVectorAND(Op, DAG);		return LowerAND(Op, DAG);
case ISD::OR:		case ISD::OR:
return LowerVectorOR(Op, DAG);		return LowerOR(Op, DAG);
case ISD::XOR:		case ISD::XOR:
return LowerXOR(Op, DAG);		return LowerXOR(Op, DAG);
case ISD::PREFETCH:		case ISD::PREFETCH:
return LowerPREFETCH(Op, DAG);		return LowerPREFETCH(Op, DAG);
case ISD::SINT_TO_FP:		case ISD::SINT_TO_FP:
case ISD::UINT_TO_FP:		case ISD::UINT_TO_FP:
return LowerINT_TO_FP(Op, DAG);		return LowerINT_TO_FP(Op, DAG);
case ISD::FP_TO_SINT:		case ISD::FP_TO_SINT:
▲ Show 20 Lines • Show All 7,839 Lines • Show Last 20 Lines

llvm/trunk/test/CodeGen/AArch64/arm64-ccmp.ll

Show First 20 Lines • Show All 365 Lines • ▼ Show 20 Lines	; CHECK-NEXT: ret
%c1 = icmp sge i32 %v2, %v3		%c1 = icmp sge i32 %v2, %v3
%c2 = icmp eq i32 %v1, 0		%c2 = icmp eq i32 %v1, 0
%or = or i1 %c2, %c1		%or = or i1 %c2, %c1
%and = and i1 %or, %c0		%and = and i1 %or, %c0
%sel = select i1 %and, i32 %v1, i32 %v2		%sel = select i1 %and, i32 %v1, i32 %v2
ret i32 %sel		ret i32 %sel
}		}

; CHECK-LABEL: select_noccmp1		; CHECK-LABEL: single_noselect
define i64 @select_noccmp1(i64 %v1, i64 %v2, i64 %v3, i64 %r) {		define i32 @single_noselect(i32 %A, i32 %B) #0 {
; CHECK: cmp x0, #0		; CHECK: cmp w1, #1
; CHECK-NEXT: cset [[REG0:w[0-9]+]], lt		; CHECK-NEXT: ccmp w0, #1, #8, ge
; CHECK-NEXT: cmp x0, #13		; CHECK-NEXT: cset w0, lt
; CHECK-NOT: ccmp		; CHECK-NEXT: ret
		%notlhs = icmp slt i32 %A, 1
		%notrhs = icmp slt i32 %B, 1
		%lnot = or i1 %notlhs, %notrhs
		%conv = zext i1 %lnot to i32
		ret i32 %conv
		}

		; CHECK-LABEL: single_and_ext
		define i32 @single_and_ext(i32 %A, i32 %B, i32 %C) #0 {
		; CHECK: cmp w1, #2
		; CHECK-NEXT: ccmp w0, #4, #0, lt
		; CHECK-NEXT: cinc w0, w2, lt
		; CHECK-NEXT: ret
		%cmp = icmp slt i32 %A, 4
		%cmp1 = icmp slt i32 %B, 2
		%and1 = and i1 %cmp, %cmp1
		%conv = zext i1 %and1 to i32
		%add = add nsw i32 %conv, %C
		ret i32 %add
		}

		; CHECK-LABEL: single_noselect_phi
		define i32 @single_noselect_phi(i32 %A, i32 %B, i32 %C) #0 {
		; CHECK: cmp w1, #0
		; CHECK-NEXT: ccmp w0, #0, #4, gt
; CHECK-NEXT: cset [[REG1:w[0-9]+]], gt		; CHECK-NEXT: cset [[REG1:w[0-9]+]], gt
; CHECK-NEXT: cmp x2, #2		; CHECK-NEXT: cmp w1, #2
		; CHECK-NEXT: ccmp w0, #4, #8, ge
; CHECK-NEXT: cset [[REG2:w[0-9]+]], lt		; CHECK-NEXT: cset [[REG2:w[0-9]+]], lt
		; CHECK-NEXT: cmp w2, #0
		; CHECK-NEXT: csel w0, [[REG1]], [[REG2]], eq
		; CHECK-NEXT: ret
		entry:
		%tobool = icmp eq i32 %C, 0
		br i1 %tobool, label %if.else, label %if.then

		if.then: ; preds = %entry
		%cmp = icmp slt i32 %A, 4
		%cmp1 = icmp slt i32 %B, 2
		%0 = or i1 %cmp, %cmp1
		br label %if.end

		if.else: ; preds = %entry
		%cmp2 = icmp sgt i32 %A, 0
		%cmp3 = icmp sgt i32 %B, 0
		%1 = and i1 %cmp2, %cmp3
		br label %if.end

		if.end: ; preds = %if.else, %if.then
		%b.0.in = phi i1 [ %0, %if.then ], [ %1, %if.else ]
		%conv = zext i1 %b.0.in to i32
		ret i32 %conv
		}

		; CHECK-LABEL: select_noccmp1
		define i64 @select_noccmp1(i64 %v1, i64 %v2, i64 %v3, i64 %r) {
		; CHECK: cmp x0, #13
		; CHECK-NEXT: ccmp x0, #0, #0, gt
		; CHECK-NEXT: cset [[REG1:w[0-9]+]], lt
; CHECK-NEXT: cmp x2, #4		; CHECK-NEXT: cmp x2, #4
; CHECK-NEXT: cset [[REG3:w[0-9]+]], gt		; CHECK-NEXT: ccmp x2, #2, #0, gt
; CHECK-NEXT: and [[REG4:w[0-9]+]], [[REG0]], [[REG1]]		; CHECK-NEXT: cset [[REG2:w[0-9]+]], lt
; CHECK-NEXT: and [[REG5:w[0-9]+]], [[REG2]], [[REG3]]		; CHECK-NEXT: orr [[REG3:w[0-9]+]], [[REG1]], [[REG2]]
; CHECK-NEXT: orr [[REG6:w[0-9]+]], [[REG4]], [[REG5]]		; CHECK-NEXT: cmp [[REG3]], #0
; CHECK-NEXT: cmp [[REG6]], #0
; CHECK-NEXT: csel x0, xzr, x3, ne		; CHECK-NEXT: csel x0, xzr, x3, ne
; CHECK-NEXT: ret		; CHECK-NEXT: ret
%c0 = icmp slt i64 %v1, 0		%c0 = icmp slt i64 %v1, 0
%c1 = icmp sgt i64 %v1, 13		%c1 = icmp sgt i64 %v1, 13
%c2 = icmp slt i64 %v3, 2		%c2 = icmp slt i64 %v3, 2
%c4 = icmp sgt i64 %v3, 4		%c4 = icmp sgt i64 %v3, 4
%and0 = and i1 %c0, %c1		%and0 = and i1 %c0, %c1
%and1 = and i1 %c2, %c4		%and1 = and i1 %c2, %c4
▲ Show 20 Lines • Show All 210 Lines • Show Last 20 Lines

This is an archive of the discontinued LLVM Phabricator instance.

AArch64: Implement missed conditional compare sequences.
ClosedPublic