This is an archive of the discontinued LLVM Phabricator instance.

Differential D53190

ARM: avoid infinite combining loop
Closed, Public

Authored by t.p.northover on Oct 12 2018, 4:46 AM.

Details

Reviewers

javed.absar
samparker

Summary

One of the transformations done by PerformCMOVCombine increases DAG complexity so that one of the output values can be shared with the compare operation (written as a SUBS). Since this is the reverse of normal combining, it can be collapsed back later in the worklist, leading to an infinite loop.

I think there was a little existing logic to avoid this (the LHS != RHS check), but it's fragile and nowhere near covers all possible combines; I could not extend it satisfactorily to cover this case and others.

So this patch changes the inserted ISD::SUB into a new ARMISD::OpaqueSUB, which has the same semantics but does not undergo combining, and so avoids the problem.
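
The loop described above can be sketched with a toy model. This is not LLVM code: the `Op` enum and the `expand`, `fold` and `runCombiner` helpers are hypothetical stand-ins, and the assumption that a generic combine is able to fold the subtraction back to zero is made purely for illustration.

```cpp
#include <cassert>

// Toy stand-ins for DAG opcodes; these are NOT LLVM's real enums.
enum class Op { Zero, Sub, OpaqueSub };

// Target combine (toy): replace the constant 0 with a subtraction so the
// value can be shared with the compare. SubKind selects which opcode to emit.
bool expand(Op &FalseVal, Op SubKind) {
  if (FalseVal != Op::Zero)
    return false;
  FalseVal = SubKind;
  return true;
}

// Generic combine (toy): assume it can prove the generic subtraction is zero.
// It only knows about Op::Sub, so Op::OpaqueSub is left alone.
bool fold(Op &FalseVal) {
  if (FalseVal != Op::Sub)
    return false;
  FalseVal = Op::Zero;
  return true;
}

// Apply both combines until nothing changes or Limit rewrites were made.
// Returns the number of rewrites performed.
int runCombiner(Op SubKind, int Limit) {
  assert(Limit > 0 && "need a positive rewrite budget");
  Op FalseVal = Op::Zero;
  int Steps = 0;
  while (Steps < Limit) {
    bool Changed = expand(FalseVal, SubKind);
    if (Changed)
      ++Steps;
    if (Steps < Limit && fold(FalseVal)) {
      ++Steps;
      Changed = true;
    }
    if (!Changed)
      break; // Stable: no combine wants to touch the value any more.
  }
  return Steps;
}
```

With `Op::Sub` the two rewrites keep undoing each other until the budget runs out; with `Op::OpaqueSub` the model settles after a single rewrite, which is the effect the patch is after.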

Diff Detail

Repository: rL LLVM

Event Timeline

t.p.northover created this revision. Oct 12 2018, 4:46 AM

Herald added a reviewer: javed.absar. Oct 12 2018, 4:46 AM

Herald added subscribers: chrib, hiraditya, kristof.beyls, mcrosier.

dmgreen added a subscriber: dmgreen. Oct 16 2018, 9:13 AM

Herald added a subscriber: nhaehnle. Oct 16 2018, 9:13 AM

nhaehnle removed a subscriber: nhaehnle. Oct 16 2018, 9:20 AM

Ping.

I guess you're mainly asking for review on this to get opinions on the approach of introducing new pseudo instructions just to avoid cycles during DAG combining and/or lowering?
From the patch, I furthermore guess that introducing a new pseudo leads to quite a bit of code duplication (violating the "don't repeat yourself" principle) in the instruction matching patterns?

With that, I wonder:

- It seems unlikely this is the only place where a problem with cycles during DAG combining/lowering pops up.
  - If it is the only place: what is so exceptional here that this really is the only place?
  - If it is not: what are the other approaches already taken elsewhere to avoid cycles? Why are alternative approaches not applicable here (or why are they an even worse solution)? Is there actually some guidance written up somewhere on which approaches exist and which to take to avoid cycles?
- Instead of producing a whole new pseudo, I guess an alternative would be to introduce a way to flag on ISD nodes "do not combine this node any further" as a more generic mechanism. I guess for that to work, some bit somewhere on the ISD node would be needed to store that info. I wonder if you considered such an approach too? I guess it has the advantage that it doesn't require violating the DRY principle in the instruction patterns?

> It seems unlikely this is the only place where a problem with cycles during DAG combining/lowering pops up.

I think it's pretty rare. Mostly combines genuinely do simplify the DAG, or replace the lot with a target node. I did consider the latter but decided getting equivalent CodeGen out of it would be just as bad as (or worse than) the duplicated patterns for the subtraction.

> I guess an alternative would be to introduce a way to flag on ISD nodes "do not combine this node any further" as a more generic mechanism. I guess for that to work, some bit somewhere on the ISD node would be needed to store that info.

I hadn't considered an approach like that, but I think obeying such a flag would be pretty difficult. You don't just have to check the root of a combine, but every node involved. That probably means checks throughout the DAGCombiner.
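
A minimal sketch of why a root-only check would not be enough, using a toy expression tree rather than LLVM's SelectionDAG. The `Node` struct, its `NoCombine` bit and `combineAdd` are all hypothetical; the fold `(add (sub a, b), b) -> a` is used only as a convenient example of a combine whose root is a different node from the one carrying the flag.

```cpp
#include <cassert>

// Toy expression node; Kind, NoCombine and combineAdd are hypothetical.
struct Node {
  enum Kind { Leaf, Add, Sub } K;
  const Node *L; // Left operand (null for leaves).
  const Node *R; // Right operand (null for leaves).
  bool NoCombine; // The proposed "do not combine this node" bit.
};

// Toy fold rooted at an Add: (add (sub a, b), b) -> a.
// With CheckOperands == false only the root's flag is honoured; with
// CheckOperands == true the flag on the matched operand is honoured too.
const Node *combineAdd(const Node *N, bool CheckOperands) {
  assert(N && "expected a node");
  if (N->K != Node::Add || N->NoCombine)
    return N;
  const Node *S = N->L; // Add nodes always have operands in this toy model.
  if (S->K != Node::Sub || S->R != N->R)
    return N;
  if (CheckOperands && S->NoCombine)
    return N;
  return S->L; // The flagged Sub has been folded away.
}
```

In this model a flagged `Sub` feeding an unflagged `Add` is still folded away unless the pattern itself looks at the operand's flag, so every pattern that looks through operands would need its own check.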

I thought the normal way to stop combining was to return the original node. Could you not manually replace N with Res and then return N?

> I thought the normal way to stop combining was to return the original node. Could you not manually replace N with Res and then return N?

You can sort of hide the new node from the worklist, but it's not usually a good idea: if the node gets back on the worklist some other way, you can end up with an infinite combining loop. The goal should be a "stable" DAG, which DAGCombine won't modify further.

Ok, fair point. If we are going to introduce a new node to fix this issue, could we have a SUBS node that can be glued to the CMOV?

> Ok, fair point. If we are going to introduce a new node to fix this issue, could we have a SUBS node that can be glued to the CMOV?

I believe later passes already tidy things up enough that we don't need to embellish the DAG phase. For example if I compile:

define i32 @foo(i32 %lhs, i32 %rhs, i32 %in) {
  %tst = icmp eq i32 %lhs, %rhs
  %res = select i1 %tst, i32 0, i32 %in
  ret i32 %res
}

with this patch, I still get the optimum CodeGen of a single cmp and cmov.

Ping.

My point was that, even with this extra work, and as you mentioned in the comments, the test case isn't generating optimum code. We could introduce a SUBS node to remove the unnecessary cmp, making the sub opaque and improving codegen. Unless there's a reason why we couldn't do this?

Ah, I see what you mean. It's not pretty, but this updated patch seems to do the trick.

Great to see those other test changes! LGTM with the few minor comments, no need to re-review. cheers!

llvm/lib/Target/ARM/ARMISelLowering.h
88	Maybe just SUBS now? And with an updated comment.
llvm/lib/Target/ARM/ARMInstrInfo.td
3626	whitespace.
llvm/lib/Target/ARM/ARMInstrThumb.td
1285	whitespace.
llvm/lib/Target/ARM/ARMInstrThumb2.td
2084	whitespace.

This revision is now accepted and ready to land. Nov 28 2018, 7:24 AM

> Great to see those other test changes!

Yes, definitely a pleasant surprise. Thanks for the reviews; I've committed it with your suggested changes as r348122.

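Revision Contents

Path	Size
llvm/lib/Target/ARM/ARMISelLowering.h	1 line
llvm/lib/Target/ARM/ARMISelLowering.cpp	13 lines
llvm/lib/Target/ARM/ARMInstrInfo.td	9 lines
llvm/lib/Target/ARM/ARMInstrThumb.td	5 lines
llvm/lib/Target/ARM/ARMInstrThumb2.td	8 lines
llvm/test/CodeGen/ARM/select.ll	14 lines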

Diff 169371

llvm/lib/Target/ARM/ARMISelLowering.h

[Lines elided. Context: enum NodeType : unsigned {]
   CMP, // ARM compare instructions.
   CMN, // ARM CMN instructions.
   CMPZ, // ARM compare that sets only Z flag.
   CMPFP, // ARM VFP compare instruction, sets FPSCR.
   CMPFPw0, // ARM VFP compare against zero instruction, sets FPSCR.
   FMSTAT, // ARM fmstat instruction.

   CMOV, // ARM conditional move instructions.
+  OpaqueSUB, // Subtract that DAG combiner should ignore.
[Inline comment, samparker: Maybe just SUBS now? And with an updated comment.]

   SSAT, // Signed saturation
   USAT, // Unsigned saturation

   BCC_i64,

   SRL_FLAG, // V,Flag = srl_flag X -> srl X, 1 + save carry out.
   SRA_FLAG, // V,Flag = sra_flag X -> sra X, 1 + save carry out.
[Remaining lines elided.]

llvm/lib/Target/ARM/ARMISelLowering.cpp

[Lines elided. Context: const char *ARMTargetLowering::getTargetNodeName(unsigned Opcode) const {]
   case ARMISD::CMN: return "ARMISD::CMN";
   case ARMISD::CMPZ: return "ARMISD::CMPZ";
   case ARMISD::CMPFP: return "ARMISD::CMPFP";
   case ARMISD::CMPFPw0: return "ARMISD::CMPFPw0";
   case ARMISD::BCC_i64: return "ARMISD::BCC_i64";
   case ARMISD::FMSTAT: return "ARMISD::FMSTAT";

   case ARMISD::CMOV: return "ARMISD::CMOV";
+  case ARMISD::OpaqueSUB: return "ARMISD::OpaqueSUB";

   case ARMISD::SSAT: return "ARMISD::SSAT";
   case ARMISD::USAT: return "ARMISD::USAT";

   case ARMISD::SRL_FLAG: return "ARMISD::SRL_FLAG";
   case ARMISD::SRA_FLAG: return "ARMISD::SRA_FLAG";
   case ARMISD::RRX: return "ARMISD::RRX";

[Lines elided. Context: if (CC == ARMCC::EQ && isOneConstant(TrueVal)) {]
         SDValue Neg = DAG.getNode(ISD::USUBO, dl, VTs, FalseVal, Sub);
         // ISD::SUBCARRY returns a borrow but we want the carry here
         // actually.
         SDValue Carry =
             DAG.getNode(ISD::SUB, dl, MVT::i32,
                         DAG.getConstant(1, dl, MVT::i32), Neg.getValue(1));
         Res = DAG.getNode(ISD::ADDCARRY, dl, VTs, Sub, Neg, Carry);
       }
-    } else if (CC == ARMCC::NE && LHS != RHS &&
+    } else if (CC == ARMCC::NE && !isNullConstant(RHS) &&
                (!Subtarget->isThumb1Only() || isPowerOf2Constant(TrueVal))) {
       // This seems pointless but will allow us to combine it further below.
       // CMOV 0, z, !=, (CMPZ x, y) -> CMOV (SUB x, y), z, !=, (CMPZ x, y)
-      SDValue Sub = DAG.getNode(ISD::SUB, dl, VT, LHS, RHS);
+      SDValue Sub = DAG.getNode(ARMISD::OpaqueSUB, dl, VT, LHS, RHS);
       Res = DAG.getNode(ARMISD::CMOV, dl, VT, Sub, TrueVal, ARMcc,
                         N->getOperand(3), Cmp);
     }
   } else if (isNullConstant(TrueVal)) {
-    if (CC == ARMCC::EQ && LHS != RHS &&
+    if (CC == ARMCC::EQ && !isNullConstant(RHS) &&
         (!Subtarget->isThumb1Only() || isPowerOf2Constant(FalseVal))) {
       // This seems pointless but will allow us to combine it further below
       // Note that we change == for != as this is the dual for the case above.
       // CMOV z, 0, ==, (CMPZ x, y) -> CMOV (SUB x, y), z, !=, (CMPZ x, y)
-      SDValue Sub = DAG.getNode(ISD::SUB, dl, VT, LHS, RHS);
+      SDValue Sub = DAG.getNode(ARMISD::OpaqueSUB, dl, VT, LHS, RHS);
       Res = DAG.getNode(ARMISD::CMOV, dl, VT, Sub, FalseVal,
                         DAG.getConstant(ARMCC::NE, dl, MVT::i32),
                         N->getOperand(3), Cmp);
     }
   }

   // On Thumb1, the DAG above may be further combined if z is a power of 2
   // (z == 2 ^ K).
   // CMOV (SUB x, y), z, !=, (CMPZ x, y) ->
   // merge t3, t4
   // where t1 = (SUBCARRY (SUB x, y), z, 0)
   // t2 = (SUBCARRY (SUB x, y), t1:0, t1:1)
   // t3 = if K != 0 then (SHL t2:0, K) else t2:0
   // t4 = (SUB 1, t2:1) [ we want a carry, not a borrow ]
   const APInt *TrueConst;
   if (Subtarget->isThumb1Only() && CC == ARMCC::NE &&
-      (FalseVal.getOpcode() == ISD::SUB) && (FalseVal.getOperand(0) == LHS) &&
-      (FalseVal.getOperand(1) == RHS) &&
+      (FalseVal.getOpcode() == ARMISD::OpaqueSUB) &&
+      (FalseVal.getOperand(0) == LHS) && (FalseVal.getOperand(1) == RHS) &&
       (TrueConst = isPowerOf2Constant(TrueVal))) {
     SDVTList VTs = DAG.getVTList(VT, MVT::i32);
     unsigned ShiftAmount = TrueConst->logBase2();
     if (ShiftAmount)
       TrueVal = DAG.getConstant(1, dl, VT);
     SDValue Subc = DAG.getNode(ISD::USUBO, dl, VTs, FalseVal, TrueVal);
     Res = DAG.getNode(ISD::SUBCARRY, dl, VTs, FalseVal, Subc, Subc.getValue(1));
     // Make it a carry, not a borrow.
[Remaining lines elided.]
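
The two rewrites in the comments above can be checked with ordinary 32-bit arithmetic. The sketch below is a plain C++ model, not LLVM code: `cmovNE`, `beforeCombine`, `afterCombine` and `thumb1Value` are hypothetical names, the operand order (false value first, then true value) follows the `CMOV F, T, cc, (CMPZ x, y)` notation used in the comments, and `thumb1Value` models only the value result of the visible USUBO/SUBCARRY lines (the carry fix-up that follows is elided in the hunk and is not modelled).

```cpp
#include <cassert>
#include <cstdint>

// CMOV F, T, !=, (CMPZ x, y): yields T when x != y, otherwise F.
uint32_t cmovNE(uint32_t F, uint32_t T, uint32_t X, uint32_t Y) {
  return X != Y ? T : F;
}

// CMOV 0, z, !=, (CMPZ x, y)
uint32_t beforeCombine(uint32_t X, uint32_t Y, uint32_t Z) {
  return cmovNE(0, Z, X, Y);
}

// CMOV (SUB x, y), z, !=, (CMPZ x, y): when x == y the subtraction is 0,
// so the false operand still evaluates to 0 exactly when it is selected.
uint32_t afterCombine(uint32_t X, uint32_t Y, uint32_t Z) {
  return cmovNE(X - Y, Z, X, Y);
}

// Thumb1 follow-up for z == 2^K, modelling the value result only:
//   Subc = USUBO (x - y), 1        -> value s - 1, borrow = (s == 0)
//   Res  = SUBCARRY s, Subc, borrow -> s - (s - 1) - borrow = (s != 0)
//   then shift left by K when K != 0.
uint32_t thumb1Value(uint32_t X, uint32_t Y, unsigned K) {
  assert(K < 32 && "shift amount must be smaller than the bit width");
  uint32_t S = X - Y;
  uint32_t Borrow = S < 1u ? 1u : 0u;
  uint32_t Subc = S - 1u;
  uint32_t Res = S - Subc - Borrow;
  return K ? Res << K : Res;
}
```

Under these assumptions `thumb1Value(x, y, K)` agrees with `afterCombine(x, y, 1u << K)`, i.e. it produces 2^K when x != y and 0 otherwise.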

llvm/lib/Target/ARM/ARMInstrInfo.td

[Lines elided. Context: def ARMcall_nolink : SDNode<"ARMISD::CALL_NOLINK", SDT_ARMcall,]
                           SDNPVariadic]>;

 def ARMretflag : SDNode<"ARMISD::RET_FLAG", SDTNone,
                         [SDNPHasChain, SDNPOptInGlue, SDNPVariadic]>;
 def ARMintretflag : SDNode<"ARMISD::INTRET_FLAG", SDT_ARMcall,
                            [SDNPHasChain, SDNPOptInGlue, SDNPVariadic]>;
 def ARMcmov : SDNode<"ARMISD::CMOV", SDT_ARMCMov,
                      [SDNPInGlue]>;
+def ARMopaquesub : SDNode<"ARMISD::OpaqueSUB", SDTIntBinOp>;

 def ARMssatnoshift : SDNode<"ARMISD::SSAT", SDTIntSatNoShOp, []>;

 def ARMusatnoshift : SDNode<"ARMISD::USAT", SDTIntSatNoShOp, []>;

 def ARMbrcond : SDNode<"ARMISD::BRCOND", SDT_ARMBrcond,
                        [SDNPHasChain, SDNPInGlue, SDNPOutGlue]>;

[Lines elided.]
 //

 let isAdd = 1 in
 defm ADD : AsI1_bin_irs<0b0100, "add",
                         IIC_iALUi, IIC_iALUr, IIC_iALUsr, add, 1>;
 defm SUB : AsI1_bin_irs<0b0010, "sub",
                         IIC_iALUi, IIC_iALUr, IIC_iALUsr, sub>;

+
[Inline comment, samparker: whitespace.]
+def : ARMPat<(ARMopaquesub GPR:$Rn, mod_imm:$imm), (SUBri $Rn, mod_imm:$imm)>;
+def : ARMPat<(ARMopaquesub GPR:$Rn, GPR:$Rm), (SUBrr $Rn, $Rm)>;
+def : ARMPat<(ARMopaquesub GPR:$Rn, so_reg_imm:$shift),
+             (SUBrsi $Rn, so_reg_imm:$shift)>;
+def : ARMPat<(ARMopaquesub GPR:$Rn, so_reg_reg:$shift),
+             (SUBrsr $Rn, so_reg_reg:$shift)>;
+
 // ADD and SUB with 's' bit set.
 //
 // Currently, ADDS/SUBS are pseudo opcodes that exist only in the
 // selection DAG. They are "lowered" to real ADD/SUB opcodes by
 // AdjustInstrPostInstrSelection where we determine whether or not to
 // set the "s" bit based on CPSR liveness.
 //
 // FIXME: Eliminate ADDS/SUBS pseudo opcodes after adding tablegen
[Remaining lines elided.]

llvm/lib/Target/ARM/ARMInstrThumb.td

[Lines elided. Context: def tSUBi3 : // A8.6.210 T1]
   T1sIGenEncodeImm<0b01111, (outs tGPR:$Rd), (ins tGPR:$Rm, imm0_7:$imm3),
                    IIC_iALUi,
                    "sub", "\t$Rd, $Rm, $imm3",
                    [(set tGPR:$Rd, (add tGPR:$Rm, imm0_7_neg:$imm3))]>,
                    Sched<[WriteALU]> {
   bits<3> imm3;
   let Inst{8-6} = imm3;
 }
+def : T1Pat<(ARMopaquesub tGPR:$Rn, imm0_7:$imm3),
+            (tSUBi3 $Rn, imm0_7:$imm3)>;
[Inline comment, samparker: whitespace.]

 def tSUBi8 : // A8.6.210 T2
   T1sItGenEncodeImm<{1,1,1,?,?}, (outs tGPR:$Rdn),
                     (ins tGPR:$Rn, imm0_255:$imm8), IIC_iALUi,
                     "sub", "\t$Rdn, $imm8",
                     [(set tGPR:$Rdn, (add tGPR:$Rn, imm8_255_neg:$imm8))]>,
                     Sched<[WriteALU]>;
+def : T1Pat<(ARMopaquesub tGPR:$Rn, imm0_255:$imm8),
+            (tSUBi8 $Rn, imm0_255:$imm8)>;

 def : tInstSubst<"add${s}${p} $rd, $rn, $imm",
                  (tSUBi3 tGPR:$rd, s_cc_out:$s, tGPR:$rn, mod_imm1_7_neg:$imm, pred:$p)>;


 def : tInstSubst<"add${s}${p} $rdn, $imm",
                  (tSUBi8 tGPR:$rdn, s_cc_out:$s, mod_imm8_255_neg:$imm, pred:$p)>;


 // Subtract register
 def tSUBrr : // A8.6.212
   T1sIGenEncode<0b01101, (outs tGPR:$Rd), (ins tGPR:$Rn, tGPR:$Rm),
                 IIC_iALUr,
                 "sub", "\t$Rd, $Rn, $Rm",
                 [(set tGPR:$Rd, (sub tGPR:$Rn, tGPR:$Rm))]>,
                 Sched<[WriteALU]>;
+def : T1Pat<(ARMopaquesub tGPR:$Rn, tGPR:$Rm), (tSUBrr $Rn, $Rm)>;

 def : tInstAlias <"sub${s}${p} $Rdn, $Rm",
                   (tSUBrr tGPR:$Rdn,s_cc_out:$s, tGPR:$Rdn, tGPR:$Rm, pred:$p)>;

 /// Similar to the above except these set the 's' bit so the
 /// instruction modifies the CPSR register.
 ///
 /// These opcodes will be converted to the real non-S opcodes by
[Remaining lines elided.]

llvm/lib/Target/ARM/ARMInstrThumb2.td

[Lines elided.]
 //===----------------------------------------------------------------------===//
 // Arithmetic Instructions.
 //

 let isAdd = 1 in
 defm t2ADD : T2I_bin_ii12rs<0b000, "add", add, 1>;
 defm t2SUB : T2I_bin_ii12rs<0b101, "sub", sub>;

+def : T2Pat<(ARMopaquesub GPRnopc:$Rn, t2_so_imm:$imm),
+            (t2SUBri $Rn, t2_so_imm:$imm)>;
[Inline comment, samparker: whitespace.]
+def : T2Pat<(ARMopaquesub GPRnopc:$Rn, imm0_4095:$imm),
+            (t2SUBri12 $Rn, imm0_4095:$imm)>;
+def : T2Pat<(ARMopaquesub GPRnopc:$Rn, rGPR:$Rm), (t2SUBrr $Rn, $Rm)>;
+def : T2Pat<(ARMopaquesub GPRnopc:$Rn, t2_so_reg:$ShiftedRm),
+            (t2SUBrs $Rn, t2_so_reg:$ShiftedRm)>;
+
 // ADD and SUB with 's' bit set. No 12-bit immediate (T4) variants.
 //
 // Currently, t2ADDS/t2SUBS are pseudo opcodes that exist only in the
 // selection DAG. They are "lowered" to real t2ADD/t2SUB opcodes by
 // AdjustInstrPostInstrSelection where we determine whether or not to
 // set the "s" bit based on CPSR liveness.
 //
 // FIXME: Eliminate t2ADDS/t2SUBS pseudo opcodes after adding tablegen
[Remaining lines elided.]

llvm/test/CodeGen/ARM/select.ll

[Lines elided.]
 ; CHECK-LABEL: f12:
 define float @f12(i32 %a, i32 %b) nounwind uwtable readnone ssp {
 ; CHECK-NOT: floatunsisf
   %1 = icmp eq i32 %a, %b
   %2 = uitofp i1 %1 to float
   ret float %2
 }

+; N.b. sub is redundant with cmp. Don't worry if peepholer realises this and
+; removes the cmp in favour of a subs.
+; CHECK-LABEL: test_overflow_recombine:
+define i1 @test_overflow_recombine(i32 %in) {
+; CHECK: smull [[LO:r[0-9]+]], [[HI:r[0-9]+]]
+; CHECK: sub [[ZERO:r[0-9]+]], [[HI]], [[LO]], asr #31
+; CHECK: cmp [[HI]], [[LO]], asr #31
+; CHECK: movne [[ZERO]], #1
+  %prod = call { i32, i1 } @llvm.smul.with.overflow.i32(i32 0, i32 %in)
+  %overflow = extractvalue { i32, i1 } %prod, 1
+  ret i1 %overflow
+}
+
+declare { i32, i1 } @llvm.smul.with.overflow.i32(i32, i32)
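
The CHECK lines in the new test compare the high half of the 64-bit product with the low half shifted arithmetically right by 31. A minimal C++ sketch of that overflow test follows; `smulOverflows` is a hypothetical helper written for illustration, and the sign fill is computed with a mask rather than a signed shift so the code does not depend on how a given compiler shifts negative values.

```cpp
#include <cstdint>

// Signed 32x32 multiply overflows exactly when the high 32 bits of the
// 64-bit product are not the sign extension of the low 32 bits
// (the "cmp HI, LO, asr #31" in the test above).
bool smulOverflows(int32_t A, int32_t B) {
  int64_t Prod = int64_t(A) * int64_t(B); // Cannot overflow in 64 bits.
  uint64_t Bits = uint64_t(Prod);
  uint32_t Lo = uint32_t(Bits);
  uint32_t Hi = uint32_t(Bits >> 32);
  // Equivalent of "Lo asr 31": all ones if Lo's sign bit is set, else zero.
  uint32_t LoSign = (Lo & 0x80000000u) ? 0xFFFFFFFFu : 0u;
  return Hi != LoSign;
}
```

Because the test multiplies by the constant 0, the product is always 0 and this check can never fire, which is presumably what gives the combiner room to simplify the subtraction again.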