This is an archive of the discontinued LLVM Phabricator instance.

Differential D30400

For Thumb1, lower ADDC/ADDE/SUBC/SUBE via the glueless ARMISD nodes, same as already done for ARM and Thumb2.
Closed, Public

Authored by tyomitch on Feb 27 2017, 3:41 AM.

Details

Reviewers

jmolloy
rogfer01
efriedma

Commits

rG0c93ceb5d85b: For Thumb1, lower ADDC/ADDE/SUBC/SUBE via the glueless ARMISD nodes, same as…
rL297443: For Thumb1, lower ADDC/ADDE/SUBC/SUBE via the glueless ARMISD nodes,

Summary

Unfortunately, due to the Thumb1 idiosyncrasy where the instructions
can be *either* flag-setting *or* conditional, this is not expressible
with TableGen patterns, so we have to go for the custom C++ lowering.

Diff Detail

Build Status

Buildable 4664
Build 4664: arc lint + arc unit

Event Timeline

tyomitch created this revision. Feb 27 2017, 3:41 AM

Harbormaster completed remote builds in B4315: Diff 89861. Feb 27 2017, 3:41 AM

Herald added subscribers: rengolin, aemerson. Feb 27 2017, 3:41 AM

tyomitch added a child revision: D30401: Refactor the multiply-accumulate combines to act on ARMISD::ADD[CE] nodes, instead of the generic ISD::ADD[CE]. Feb 27 2017, 4:14 AM

efriedma added a subscriber: efriedma. Feb 27 2017, 2:17 PM

efriedma added inline comments.

lib/Target/ARM/ARMISelDAGToDAG.cpp
3306 ↗	(On Diff #89861)	This assertion seems suspicious... why is it true in general?

Thanks Eli!
Indeed the assertion was wrong; this also shows how insufficient our tests for long adds/subtracts were.
Updating the patch to address both these points.

Updating D30400: For Thumb1, lower ADDC/ADDE/SUBC/SUBE via the glueless ARMISD nodes,

same as already done for ARM and Thumb2.

Few nits, but otherwise, looks good.

lib/Target/ARM/ARMISelDAGToDAG.cpp
3240 ↗	(On Diff #89993)	Why can't you leave this as an early break?
3272 ↗	(On Diff #89993)	use LLVM_FALLTHROUGH

tyomitch added inline comments. Feb 28 2017, 6:36 AM

lib/Target/ARM/ARMISelDAGToDAG.cpp
3240 ↗	(On Diff #89993)	Exactly because I want it to fall through to the next case, if the condition doesn't hold.

Ok, just adding LLVM_FALLTHROUGH should be fine for me. I'll let Eli have a final look and approve.

lib/Target/ARM/ARMISelDAGToDAG.cpp
3240 ↗	(On Diff #89993)	right, I thought as much.

Added LLVM_FALLTHROUGH

Updating D30400: For Thumb1, lower ADDC/ADDE/SUBC/SUBE via the glueless ARMISD nodes,

same as already done for ARM and Thumb2.

Harbormaster completed remote builds in B4364: Diff 90031. Feb 28 2017, 7:35 AM

Are you sure we can't use the same codepath we currently use for Thumb2/ARM here?

test/CodeGen/Thumb/long.ll
78	I'd also like to see some tests here for subtraction with an immediate amount. ("add i64 %y, -10" etc.)

Are you sure we can't use the same codepath we currently use for Thumb2/ARM here?

I don't think we can.
The existing codepath is itself quite hairy: quoting a comment in ARMInstrInfo.td,

// Currently, ADDS/SUBS are pseudo opcodes that exist only in the
// selection DAG. They are "lowered" to real ADD/SUB opcodes by
// AdjustInstrPostInstrSelection where we determine whether or not to
// set the "s" bit based on CPSR liveness.
//
// FIXME: Eliminate ADDS/SUBS pseudo opcodes after adding tablegen
// support for an optional CPSR definition that corresponds to the DAG
// node's second value. We can then eliminate the implicit def of CPSR.

For the Thumb1 instructions, we cannot choose "whether or not to set the "s" bit"; it's implicitly set iff the instruction isn't predicated.

For the Thumb1 instructions, we cannot choose "whether or not to set the "s" bit"; it's implicitly set iff the instruction isn't predicated.

I think it works out anyway; outside of Thumb1 mode, we want to avoid clobbering CPSR when we don't need to, but it's perfectly legal to produce a dead definition of CPSR.

Clobbering CPSR when we don't need to is the least of the problems. What we have in ARM and Thumb2 is that ADD and ADDS are defined separately, the former producing one result (to match an ADD node), and the latter producing two (to match an ADDC node). In Thumb1, we cannot define them separately, so tADD MIs are defined with an OptionalDef for CPSR. The ISel patterns won't let me match an MI with one result value (and an OptionalDef) to an ISD node producing two results. Redefining tADD to always produce two results doesn't work either, because many layers, including AsmParser / AsmPrinter, assume it still has the OptionalDef for CPSR; and the InstrEmitter won't let me have CPSR as both an OptionalDef and an actual result in the same MI.
Handwave handwave, I cannot really prove that it cannot be done, but I mean I had tried, and I couldn't.

test/CodeGen/Thumb/long.ll
78	Indeed, subtracting immediates wasn't handled well; I'll upload the updated patch.

Patch updated

Updating D30400: For Thumb1, lower ADDC/ADDE/SUBC/SUBE via the glueless ARMISD nodes,

same as already done for ARM and Thumb2.

In Thumb1, we cannot define them separately

Why not? "t2ADDSrr" is a pseudo-instruction, not an actual encoding.

lib/Target/ARM/ARMISelDAGToDAG.cpp
3318 ↗	(On Diff #90159)	The old patterns don't handle SUBC with an immediate. You can produce this situation with something like this:

    long long x(long long a, int b) { return a - (((long long)b << 32) | -1U); }

I think the handling here is correct, but please change it in a separate patch.

Why not? "t2ADDSrr" is a pseudo-instruction, not an actual encoding.

Right; but t2ADC / t2SBC are actual encodings (non-predicable, with non-optional def for CPSR), unlike tADC / tSBC (predicable, with an OptionalDef for CPSR).

It might be possible to do a hybrid implementation, using tPseudoInsts for tADDS / tSUBS, and custom C++ lowering for tADC / tSBC; although this feels like, out of two evils, choosing both.
It would also require duplicating a substantial portion of the code in ARMTargetLowering::AdjustInstrPostInstrSelection to take care of Thumb1 instructions separately, because their MIs have a different structure: in particular, the cc_out operand that AdjustInstrPostInstrSelection is adding must, in Thumb1 instructions, be not last but 1st (and MachineInstr doesn't even have an API to insert a new operand into the middle of an existing instruction).
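
The operand shuffle described above can be illustrated with a toy model. This is only a sketch: operands are plain strings in a `std::vector`, the function name `moveCcOutToSlotOne` is hypothetical, and the append-and-erase steps merely stand in for appending and removing operands on a real instruction. It assumes a pseudo with one output followed by its inputs, as with the tADDS / tSUBS pseudos discussed here.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model, not LLVM API: each operand is just a label.
// Ops is expected to hold one output followed by NumInputs inputs.
std::vector<std::string> moveCcOutToSlotOne(std::vector<std::string> Ops,
                                            unsigned NumInputs) {
  assert(Ops.size() == NumInputs + 1 && "expected one output plus the inputs");

  // Operands can only be appended, so cc_out starts out in the last slot.
  Ops.push_back("cc_out");

  // Re-append each input (always the operand at index 1) and erase the
  // original. After NumInputs rounds, cc_out sits directly after the output.
  for (unsigned I = 0; I < NumInputs; ++I) {
    std::string Moved = Ops[1];
    Ops.push_back(Moved);
    Ops.erase(Ops.begin() + 1);
  }

  // The two predicate operands come last.
  Ops.push_back("pred:AL");
  Ops.push_back("pred:noreg");
  return Ops;
}
```

For a three-operand pseudo such as `(Rd, Rn, Rm)` this yields `Rd, cc_out, Rn, Rm, pred:AL, pred:noreg`, so cc_out lands at index 1 without ever inserting into the middle of the list.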

lib/Target/ARM/ARMISelDAGToDAG.cpp
3318 ↗	(On Diff #90159)	The old patterns lower this code into:

    movs r3, #0
    mvns r3, r3
    subs r0, r0, r3
    sbcs r1, r2

on Thumb1, and into much more compact

    subs.w r0, r0, #-1
    sbcs r1, r2

on Thumb2. The new code lowers it into

    adds r0, r0, #1
    sbcs r1, r2

which is equivalent, and even a bit more compact. I don't really see what the problem is, either with the old patterns or with the new code.
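
The equivalence claimed for `subs r0, r0, #-1` and `adds r0, r0, #1` can be checked with a small model of the flag-setting semantics. A minimal sketch, assuming the usual ARM convention that a subtraction is computed as `A + ~B + 1` and that a set carry flag means "no borrow"; `FlagResult`, `adds`, `subs` and `sameAsNegatedAdd` are illustrative names, not LLVM code.

```cpp
#include <cstdint>

// Result and carry flag of a flag-setting 32-bit operation.
struct FlagResult {
  uint32_t Value;
  bool Carry;
};

// ADDS: the carry flag is the unsigned carry-out of A + B.
FlagResult adds(uint32_t A, uint32_t B) {
  uint64_t Wide = uint64_t(A) + B;
  return {uint32_t(Wide), (Wide >> 32) != 0};
}

// SUBS: computed as A + ~B + 1, so carry set means "no borrow".
FlagResult subs(uint32_t A, uint32_t B) {
  uint64_t Wide = uint64_t(A) + uint32_t(~B) + 1;
  return {uint32_t(Wide), (Wide >> 32) != 0};
}

// True when subtracting Imm and adding its negation agree on both the
// result and the carry flag that a following SBCS would consume.
bool sameAsNegatedAdd(uint32_t A, uint32_t Imm) {
  FlagResult Sub = subs(A, Imm);
  FlagResult Add = adds(A, 0u - Imm);
  return Sub.Value == Add.Value && Sub.Carry == Add.Carry;
}
```

In this model the two forms agree for every nonzero immediate, including the `#-1` case above. An immediate of 0 is the exception: subtracting 0 always sets the carry, while adding 0 never does.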

t2ADC / t2SBC are actual encodings (non-predicable

t2ADC should be predicable? At least, there isn't any restriction imposed by the architecture.

The rest makes sense; I'll stop pushing.

I don't really see what the problem is, either with the old patterns or with the new code.

I don't really want to mix multiple orthogonal changes, especially without any test coverage.

t2ADC should be predicable?

I'd think so too! As you see, long addition/subtraction is not the neatest part of the ARM backend :-)

I don't really want to mix multiple orthogonal changes, especially without any test coverage.

Right, now I see what you mean.

These changes are rather intertwined (it's easier to handle both ADDC and SUBC in the same branch of the switch block, than to separate them and faithfully replicate the old behaviour) but I will certainly add a test case for SUBC with immediate.

Added tests for SUBC with immediate

Updating D30400: For Thumb1, lower ADDC/ADDE/SUBC/SUBE via the glueless ARMISD nodes,

same as already done for ARM and Thumb2.

Select(RHS.getNode()) must be deferred until RHS has users; otherwise, if Select() converts RHS into a duplicate of an existing node, then the DAG automatically updates all uses of RHS to use the existing node instead, and deletes the RHS's own node.
If we call Select(RHS.getNode()) when RHS doesn't yet have any users, then nothing gets updated, RHS's node gets deleted, and we end up adding uses to a deleted node. Boom!

Updating D30400: For Thumb1, lower ADDC/ADDE/SUBC/SUBE via the glueless ARMISD nodes,

same as already done for ARM and Thumb2.

efriedma added inline comments. Mar 3 2017, 11:40 AM

lib/Target/ARM/ARMISelDAGToDAG.cpp
3299 ↗	(On Diff #90476)	Do you actually need to call Select() explicitly here? Instruction selection should pick it up automatically, I think.

tyomitch added inline comments. Mar 3 2017, 12:28 PM

lib/Target/ARM/ARMISelDAGToDAG.cpp
3299 ↗	(On Diff #90476)	No, it doesn't re-lower nodes created by `ARMDAGToDAGISel::Select()`: it is assumed to only output lowered nodes.

efriedma added inline comments. Mar 3 2017, 12:44 PM

lib/Target/ARM/ARMISelDAGToDAG.cpp
3299 ↗	(On Diff #90476)	Okay. The lowering for ISD::AND has some code which deals with a similar situation, but in a different way. Could you refactor to share the same code?

Copying the trick that the lowering for ISD::AND uses to create and lower a constant node

Updating D30400: For Thumb1, lower ADDC/ADDE/SUBC/SUBE via the glueless ARMISD nodes,

same as already done for ARM and Thumb2.

tyomitch added inline comments. Mar 7 2017, 3:00 PM

test/CodeGen/Thumb/long.ll
54	Now I see that lowering an `(ADDE x, y, (ADDC z, t))` into a chain of `(CopyFromReg CPSR, (tADD z, t)), (CopyFromReg CPSR, (tADC x, y, (CopyToReg CPSR)))`, with the CPSR-copying nodes glued to the arithmetic nodes, doesn't prevent LLVM from scheduling CPSR-clobbering operations in between the converted ADDC and the converted ADDE. This test case shows it: a flag-setting tMOVi8 is inserted in the middle. An ugly patch is certainly better than an incorrect one, so I decided to go back and finish the "hybrid implementation" using tPseudoInsts with two integer outputs each for tADDS / tSUBS, and custom C++ lowering for tADC / tSBC.

Hybrid implementation

Updating D30400: For Thumb1, lower ADDC/ADDE/SUBC/SUBE via the glueless ARMISD nodes,

same as already done for ARM and Thumb2.

Hmm... given that you've done most of the work of fixing AdjustInstrPostInstrSelection, how hard would it be to add tADCS/tSBCS pseudo-instructions and send them through AdjustInstrPostInstrSelection, as opposed to using custom selection code in C++? I'm sort of concerned you could run into the same scheduling problem for 128-bit addition.

Adding tADCS/tSBCS pseudo-instructions does indeed let us
simplify the custom selection code quite a bit, but
doesn't get rid of it entirely, as the negative-immediate
operand still needs a "recursive lowering" which cannot
be specified with ISel patterns. (This is similar to how
ISD::AND needs the custom lowering into a tBIC.)

Updating D30400: For Thumb1, lower ADDC/ADDE/SUBC/SUBE via the glueless ARMISD nodes,

same as already done for ARM and Thumb2.

but doesn't get rid of it entirely, as the negative-immediate operand still needs a "recursive lowering" which cannot be specified with ISel patterns.

Could you do this as a DAGCombine instead?

Lowering the negative-immediate operand as a DAGCombine instead

Updating D30400: For Thumb1, lower ADDC/ADDE/SUBC/SUBE via the glueless ARMISD nodes,

same as already done for ARM and Thumb2.
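
Why a negative-immediate with-carry operation can be flipped using the bitwise NOT of the immediate, rather than its negation, follows from the carry arithmetic. A minimal sketch, assuming the ARM definitions ADC = `A + B + C` and SBC = `A + ~B + C`; the helper names are illustrative, and this models only the arithmetic, not the DAG nodes.

```cpp
#include <cstdint>

// Result and carry flag of a flag-setting 32-bit operation.
struct FlagResult {
  uint32_t Value;
  bool Carry;
};

// ADC: A + B + carry-in; the carry-out is bit 32 of the wide sum.
FlagResult adc(uint32_t A, uint32_t B, bool CarryIn) {
  uint64_t Wide = uint64_t(A) + B + (CarryIn ? 1u : 0u);
  return {uint32_t(Wide), (Wide >> 32) != 0};
}

// SBC: A - B - !carry-in, evaluated as A + ~B + carry-in.
FlagResult sbc(uint32_t A, uint32_t B, bool CarryIn) {
  return adc(A, ~B, CarryIn);
}

// Checks both directions of the swap: ADC with Imm against SBC with ~Imm,
// and SBC with Imm against ADC with ~Imm.
bool swapIsExact(uint32_t A, int32_t Imm, bool CarryIn) {
  uint32_t Flipped = ~uint32_t(Imm);
  FlagResult AddOld = adc(A, uint32_t(Imm), CarryIn);
  FlagResult SubNew = sbc(A, Flipped, CarryIn);
  FlagResult SubOld = sbc(A, uint32_t(Imm), CarryIn);
  FlagResult AddNew = adc(A, Flipped, CarryIn);
  return AddOld.Value == SubNew.Value && AddOld.Carry == SubNew.Carry &&
         SubOld.Value == AddNew.Value && SubOld.Carry == AddNew.Carry;
}
```

Because SBC already inverts its second operand, replacing the immediate by its complement and swapping the opcode leaves both the result and the carry-out unchanged. For a negative immediate the complement is non-negative, for example `~(-10) == 9`.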

LGTM, with a few minor tweaks.

lib/Target/ARM/ARMISelLowering.cpp
9075	Check isThumb1Only() rather than MCID->getSize()?
9129	Check isThumb1Only() rather than MCID->getSize()?
test/CodeGen/Thumb/long.ll
3	Add -verify-machineinstrs to the RUN lines.

This revision is now accepted and ready to land. Mar 9 2017, 3:30 PM

The minor tweaks

Updating D30400: For Thumb1, lower ADDC/ADDE/SUBC/SUBE via the glueless ARMISD nodes,

same as already done for ARM and Thumb2.

Closed by commit rL297443: For Thumb1, lower ADDC/ADDE/SUBC/SUBE via the glueless ARMISD nodes, (authored by askrobov). Mar 9 2017, 11:52 PM

This revision was automatically updated to reflect the committed changes.

tyomitch mentioned this in D30401: Refactor the multiply-accumulate combines to act on ARMISD::ADD[CE] nodes, instead of the generic ISD::ADD[CE]. Mar 10 2017, 1:01 AM

tyomitch mentioned this in D31081: [ARM] ScheduleDAGRRList::DelayForLiveRegsBottomUp must consider OptionalDefs. Mar 17 2017, 10:47 AM

Diffusion mentioned this in rL301106: [ARM] ScheduleDAGRRList::DelayForLiveRegsBottomUp must consider OptionalDefs. Apr 23 2017, 12:11 AM

Revision Contents

Path

Size

lib/

Target/

ARM/

ARMBaseInstrInfo.cpp

10 lines

ARMISelLowering.cpp

75 lines

ARMInstrThumb.td

94 lines

test/

CodeGen/

Thumb/

long.ll

130 lines

Diff 91231

lib/Target/ARM/ARMBaseInstrInfo.cpp

[2,030 lines hidden; context: static const AddSubFlagsOpcodePair AddSubFlagsOpcodeMap[] = {]
{ARM::SUBSrr, ARM::SUBrr},		{ARM::SUBSrr, ARM::SUBrr},
{ARM::SUBSrsi, ARM::SUBrsi},		{ARM::SUBSrsi, ARM::SUBrsi},
{ARM::SUBSrsr, ARM::SUBrsr},		{ARM::SUBSrsr, ARM::SUBrsr},

{ARM::RSBSri, ARM::RSBri},		{ARM::RSBSri, ARM::RSBri},
{ARM::RSBSrsi, ARM::RSBrsi},		{ARM::RSBSrsi, ARM::RSBrsi},
{ARM::RSBSrsr, ARM::RSBrsr},		{ARM::RSBSrsr, ARM::RSBrsr},

		{ARM::tADDSi3, ARM::tADDi3},
		{ARM::tADDSi8, ARM::tADDi8},
		{ARM::tADDSrr, ARM::tADDrr},
		{ARM::tADCS, ARM::tADC},

		{ARM::tSUBSi3, ARM::tSUBi3},
		{ARM::tSUBSi8, ARM::tSUBi8},
		{ARM::tSUBSrr, ARM::tSUBrr},
		{ARM::tSBCS, ARM::tSBC},

{ARM::t2ADDSri, ARM::t2ADDri},		{ARM::t2ADDSri, ARM::t2ADDri},
{ARM::t2ADDSrr, ARM::t2ADDrr},		{ARM::t2ADDSrr, ARM::t2ADDrr},
{ARM::t2ADDSrs, ARM::t2ADDrs},		{ARM::t2ADDSrs, ARM::t2ADDrs},

{ARM::t2SUBSri, ARM::t2SUBri},		{ARM::t2SUBSri, ARM::t2SUBri},
{ARM::t2SUBSrr, ARM::t2SUBrr},		{ARM::t2SUBSrr, ARM::t2SUBrr},
{ARM::t2SUBSrs, ARM::t2SUBrs},		{ARM::t2SUBSrs, ARM::t2SUBrs},

[2,760 lines hidden]

lib/Target/ARM/ARMISelLowering.cpp

[820 lines hidden; context: if (Subtarget->isThumb1Only() || !Subtarget->hasV6Ops()]
setOperationAction(ISD::MULHS, MVT::i32, Expand);		setOperationAction(ISD::MULHS, MVT::i32, Expand);

setOperationAction(ISD::SHL_PARTS, MVT::i32, Custom);		setOperationAction(ISD::SHL_PARTS, MVT::i32, Custom);
setOperationAction(ISD::SRA_PARTS, MVT::i32, Custom);		setOperationAction(ISD::SRA_PARTS, MVT::i32, Custom);
setOperationAction(ISD::SRL_PARTS, MVT::i32, Custom);		setOperationAction(ISD::SRL_PARTS, MVT::i32, Custom);
setOperationAction(ISD::SRL, MVT::i64, Custom);		setOperationAction(ISD::SRL, MVT::i64, Custom);
setOperationAction(ISD::SRA, MVT::i64, Custom);		setOperationAction(ISD::SRA, MVT::i64, Custom);

if (!Subtarget->isThumb1Only()) {
// FIXME: We should do this for Thumb1 as well.
setOperationAction(ISD::ADDC, MVT::i32, Custom);		setOperationAction(ISD::ADDC, MVT::i32, Custom);
setOperationAction(ISD::ADDE, MVT::i32, Custom);		setOperationAction(ISD::ADDE, MVT::i32, Custom);
setOperationAction(ISD::SUBC, MVT::i32, Custom);		setOperationAction(ISD::SUBC, MVT::i32, Custom);
setOperationAction(ISD::SUBE, MVT::i32, Custom);		setOperationAction(ISD::SUBE, MVT::i32, Custom);
}

if (!Subtarget->isThumb1Only() && Subtarget->hasV6T2Ops())		if (!Subtarget->isThumb1Only() && Subtarget->hasV6T2Ops())
setOperationAction(ISD::BITREVERSE, MVT::i32, Legal);		setOperationAction(ISD::BITREVERSE, MVT::i32, Legal);

// ARM does not have ROTL.		// ARM does not have ROTL.
setOperationAction(ISD::ROTL, MVT::i32, Expand);		setOperationAction(ISD::ROTL, MVT::i32, Expand);
for (MVT VT : MVT::vector_valuetypes()) {		for (MVT VT : MVT::vector_valuetypes()) {
setOperationAction(ISD::ROTL, VT, Expand);		setOperationAction(ISD::ROTL, VT, Expand);
[8,210 lines hidden; context: void ARMTargetLowering::AdjustInstrPostInstrSelection(MachineInstr &MI,]
// RSC. Coming out of isel, they have an implicit CPSR def, but the optional		// RSC. Coming out of isel, they have an implicit CPSR def, but the optional
// operand is still set to noreg. If needed, set the optional operand's		// operand is still set to noreg. If needed, set the optional operand's
// register to CPSR, and remove the redundant implicit def.		// register to CPSR, and remove the redundant implicit def.
//		//
// e.g. ADCS (..., CPSR<imp-def>) -> ADC (... opt:CPSR<def>).		// e.g. ADCS (..., CPSR<imp-def>) -> ADC (... opt:CPSR<def>).

// Rename pseudo opcodes.		// Rename pseudo opcodes.
unsigned NewOpc = convertAddSubFlagsOpcode(MI.getOpcode());		unsigned NewOpc = convertAddSubFlagsOpcode(MI.getOpcode());
		unsigned ccOutIdx;
if (NewOpc) {		if (NewOpc) {
const ARMBaseInstrInfo *TII = Subtarget->getInstrInfo();		const ARMBaseInstrInfo *TII = Subtarget->getInstrInfo();
MCID = &TII->get(NewOpc);		MCID = &TII->get(NewOpc);

assert(MCID->getNumOperands() == MI.getDesc().getNumOperands() + 1 &&		assert(MCID->getNumOperands() ==
"converted opcode should be the same except for cc_out");		MI.getDesc().getNumOperands() + 5 - MI.getDesc().getSize()
		&& "converted opcode should be the same except for cc_out"
		" (and, on Thumb1, pred)");

MI.setDesc(*MCID);		MI.setDesc(*MCID);

// Add the optional cc_out operand		// Add the optional cc_out operand
MI.addOperand(MachineOperand::CreateReg(0, /*isDef=*/true));		MI.addOperand(MachineOperand::CreateReg(0, /*isDef=*/true));

		// On Thumb1, move all input operands to the end, then add the predicate
		if (Subtarget->isThumb1Only()) {
		for (unsigned c = MCID->getNumOperands() - 4; c--;) {
		MI.addOperand(MI.getOperand(1));
		MI.RemoveOperand(1);
}		}
unsigned ccOutIdx = MCID->getNumOperands() - 1;
		// Restore the ties
		for (unsigned i = MI.getNumOperands(); i--;) {
		const MachineOperand& op = MI.getOperand(i);
		if (op.isReg() && op.isUse()) {
		int DefIdx = MCID->getOperandConstraint(i, MCOI::TIED_TO);
		if (DefIdx != -1)
		MI.tieOperands(DefIdx, i);
		}
		}

		MI.addOperand(MachineOperand::CreateImm(ARMCC::AL));
		MI.addOperand(MachineOperand::CreateReg(0, /*isDef=*/false));
		ccOutIdx = 1;
		} else
		ccOutIdx = MCID->getNumOperands() - 1;
		} else
		ccOutIdx = MCID->getNumOperands() - 1;

// Any ARM instruction that sets the 's' bit should specify an optional		// Any ARM instruction that sets the 's' bit should specify an optional
// "cc_out" operand in the last operand position.		// "cc_out" operand in the last operand position.
if (!MI.hasOptionalDef() || !MCID->OpInfo[ccOutIdx].isOptionalDef()) {		if (!MI.hasOptionalDef() || !MCID->OpInfo[ccOutIdx].isOptionalDef()) {
assert(!NewOpc && "Optional cc_out operand required");		assert(!NewOpc && "Optional cc_out operand required");
return;		return;
}		}
// Look for an implicit def of CPSR added by MachineInstr ctor. Remove it		// Look for an implicit def of CPSR added by MachineInstr ctor. Remove it
[14 lines hidden; context: void ARMTargetLowering::AdjustInstrPostInstrSelection(MachineInstr &MI,]
if (!definesCPSR) {		if (!definesCPSR) {
assert(!NewOpc && "Optional cc_out operand required");		assert(!NewOpc && "Optional cc_out operand required");
return;		return;
}		}
assert(deadCPSR == !Node->hasAnyUseOfValue(1) && "inconsistent dead flag");		assert(deadCPSR == !Node->hasAnyUseOfValue(1) && "inconsistent dead flag");
if (deadCPSR) {		if (deadCPSR) {
assert(!MI.getOperand(ccOutIdx).getReg() &&		assert(!MI.getOperand(ccOutIdx).getReg() &&
"expect uninitialized optional cc_out operand");		"expect uninitialized optional cc_out operand");
		// Thumb1 instructions must have the S bit even if the CPSR is dead.
		if (!Subtarget->isThumb1Only())
return;		return;
}		}

// If this instruction was defined with an optional CPSR def and its dag node		// If this instruction was defined with an optional CPSR def and its dag node
// had a live implicit CPSR def, then activate the optional CPSR def.		// had a live implicit CPSR def, then activate the optional CPSR def.
MachineOperand &MO = MI.getOperand(ccOutIdx);		MachineOperand &MO = MI.getOperand(ccOutIdx);
MO.setReg(ARM::CPSR);		MO.setReg(ARM::CPSR);
MO.setIsDef(true);		MO.setIsDef(true);
}		}
[530 lines hidden; context: if ((AddeNode->getOperand(0).getNode() == Zero &&]
DAG.ReplaceAllUsesOfValueWith(SDValue(AddcNode, 0), SDValue(UMAAL.getNode(), 0));		DAG.ReplaceAllUsesOfValueWith(SDValue(AddcNode, 0), SDValue(UMAAL.getNode(), 0));

// Return original node to notify the driver to stop replacing.		// Return original node to notify the driver to stop replacing.
return SDValue(AddcNode, 0);		return SDValue(AddcNode, 0);
}		}
return SDValue();		return SDValue();
}		}

		static SDValue PerformAddeSubeCombine(SDNode *N, SelectionDAG &DAG,
		const ARMSubtarget *Subtarget) {
		if (Subtarget->isThumb1Only()) {
		SDValue RHS = N->getOperand(1);
		if (ConstantSDNode *C = dyn_cast<ConstantSDNode>(RHS)) {
		int64_t imm = C->getSExtValue();
		if (imm < 0) {
		SDLoc DL(N);

		// The with-carry-in form matches bitwise not instead of the negation.
		// Effectively, the inverse interpretation of the carry flag already
		// accounts for part of the negation.
		RHS = DAG.getConstant(~imm, DL, MVT::i32);

		unsigned Opcode = (N->getOpcode() == ARMISD::ADDE) ? ARMISD::SUBE
		: ARMISD::ADDE;
		return DAG.getNode(Opcode, DL, N->getVTList(),
		N->getOperand(0), RHS, N->getOperand(2));
		}
		}
		}
		return SDValue();
		}

/// PerformADDCCombine - Target-specific dag combine transform from		/// PerformADDCCombine - Target-specific dag combine transform from
/// ISD::ADDC, ISD::ADDE, and ISD::MUL_LOHI to MLAL or		/// ISD::ADDC, ISD::ADDE, and ISD::MUL_LOHI to MLAL or
/// ISD::ADDC, ISD::ADDE and ARMISD::UMLAL to ARMISD::UMAAL		/// ISD::ADDC, ISD::ADDE and ARMISD::UMLAL to ARMISD::UMAAL
static SDValue PerformADDCCombine(SDNode *N,		static SDValue PerformADDCCombine(SDNode *N,
TargetLowering::DAGCombinerInfo &DCI,		TargetLowering::DAGCombinerInfo &DCI,
const ARMSubtarget *Subtarget) {		const ARMSubtarget *Subtarget) {
if (Subtarget->isThumb1Only()) return SDValue();		if (Subtarget->isThumb1Only()) return SDValue();

[2,026 lines hidden; context: SDValue ARMTargetLowering::PerformDAGCombine(SDNode *N,]
default: break;		default: break;
case ISD::ADDC: return PerformADDCCombine(N, DCI, Subtarget);		case ISD::ADDC: return PerformADDCCombine(N, DCI, Subtarget);
case ISD::ADD: return PerformADDCombine(N, DCI, Subtarget);		case ISD::ADD: return PerformADDCombine(N, DCI, Subtarget);
case ISD::SUB: return PerformSUBCombine(N, DCI);		case ISD::SUB: return PerformSUBCombine(N, DCI);
case ISD::MUL: return PerformMULCombine(N, DCI, Subtarget);		case ISD::MUL: return PerformMULCombine(N, DCI, Subtarget);
case ISD::OR: return PerformORCombine(N, DCI, Subtarget);		case ISD::OR: return PerformORCombine(N, DCI, Subtarget);
case ISD::XOR: return PerformXORCombine(N, DCI, Subtarget);		case ISD::XOR: return PerformXORCombine(N, DCI, Subtarget);
case ISD::AND: return PerformANDCombine(N, DCI, Subtarget);		case ISD::AND: return PerformANDCombine(N, DCI, Subtarget);
		case ARMISD::ADDE:
		case ARMISD::SUBE: return PerformAddeSubeCombine(N, DCI.DAG, Subtarget);
case ARMISD::BFI: return PerformBFICombine(N, DCI);		case ARMISD::BFI: return PerformBFICombine(N, DCI);
case ARMISD::VMOVRRD: return PerformVMOVRRDCombine(N, DCI, Subtarget);		case ARMISD::VMOVRRD: return PerformVMOVRRDCombine(N, DCI, Subtarget);
case ARMISD::VMOVDRR: return PerformVMOVDRRCombine(N, DCI.DAG);		case ARMISD::VMOVDRR: return PerformVMOVDRRCombine(N, DCI.DAG);
case ISD::STORE: return PerformSTORECombine(N, DCI);		case ISD::STORE: return PerformSTORECombine(N, DCI);
case ISD::BUILD_VECTOR: return PerformBUILD_VECTORCombine(N, DCI, Subtarget);		case ISD::BUILD_VECTOR: return PerformBUILD_VECTORCombine(N, DCI, Subtarget);
case ISD::INSERT_VECTOR_ELT: return PerformInsertEltCombine(N, DCI);		case ISD::INSERT_VECTOR_ELT: return PerformInsertEltCombine(N, DCI);
case ISD::VECTOR_SHUFFLE: return PerformVECTOR_SHUFFLECombine(N, DCI.DAG);		case ISD::VECTOR_SHUFFLE: return PerformVECTOR_SHUFFLECombine(N, DCI.DAG);
case ARMISD::VDUPLANE: return PerformVDUPLANECombine(N, DCI);		case ARMISD::VDUPLANE: return PerformVDUPLANECombine(N, DCI);
[2,003 lines hidden]

lib/Target/ARM/ARMInstrThumb.td

[904 lines hidden]
}		}

let isAdd = 1 in {		let isAdd = 1 in {
// Add with carry register		// Add with carry register
let isCommutable = 1, Uses = [CPSR] in		let isCommutable = 1, Uses = [CPSR] in
def tADC : // A8.6.2		def tADC : // A8.6.2
T1sItDPEncode<0b0101, (outs tGPR:$Rdn), (ins tGPR:$Rn, tGPR:$Rm), IIC_iALUr,		T1sItDPEncode<0b0101, (outs tGPR:$Rdn), (ins tGPR:$Rn, tGPR:$Rm), IIC_iALUr,
"adc", "\t$Rdn, $Rm",		"adc", "\t$Rdn, $Rm",
[(set tGPR:$Rdn, (adde tGPR:$Rn, tGPR:$Rm))]>, Sched<[WriteALU]>;		[]>, Sched<[WriteALU]>;

// Add immediate		// Add immediate
def tADDi3 : // A8.6.4 T1		def tADDi3 : // A8.6.4 T1
T1sIGenEncodeImm<0b01110, (outs tGPR:$Rd), (ins tGPR:$Rm, imm0_7:$imm3),		T1sIGenEncodeImm<0b01110, (outs tGPR:$Rd), (ins tGPR:$Rm, imm0_7:$imm3),
IIC_iALUi,		IIC_iALUi,
"add", "\t$Rd, $Rm, $imm3",		"add", "\t$Rd, $Rm, $imm3",
[(set tGPR:$Rd, (add tGPR:$Rm, imm0_7:$imm3))]>,		[(set tGPR:$Rd, (add tGPR:$Rm, imm0_7:$imm3))]>,
Sched<[WriteALU]> {		Sched<[WriteALU]> {
[11 lines hidden; context: let isAdd = 1 in {]
// Add register		// Add register
let isCommutable = 1 in		let isCommutable = 1 in
def tADDrr : // A8.6.6 T1		def tADDrr : // A8.6.6 T1
T1sIGenEncode<0b01100, (outs tGPR:$Rd), (ins tGPR:$Rn, tGPR:$Rm),		T1sIGenEncode<0b01100, (outs tGPR:$Rd), (ins tGPR:$Rn, tGPR:$Rm),
IIC_iALUr,		IIC_iALUr,
"add", "\t$Rd, $Rn, $Rm",		"add", "\t$Rd, $Rn, $Rm",
[(set tGPR:$Rd, (add tGPR:$Rn, tGPR:$Rm))]>, Sched<[WriteALU]>;		[(set tGPR:$Rd, (add tGPR:$Rn, tGPR:$Rm))]>, Sched<[WriteALU]>;

		/// Similar to the above except these set the 's' bit so the
		/// instruction modifies the CPSR register.
		///
		/// These opcodes will be converted to the real non-S opcodes by
		/// AdjustInstrPostInstrSelection after giving then an optional CPSR operand.
		let hasPostISelHook = 1, Defs = [CPSR] in {
		let isCommutable = 1 in
		def tADCS : tPseudoInst<(outs tGPR:$Rdn), (ins tGPR:$Rn, tGPR:$Rm),
		2, IIC_iALUr,
		[(set tGPR:$Rdn, CPSR, (ARMadde tGPR:$Rn, tGPR:$Rm,
		CPSR))]>,
		Requires<[IsThumb1Only]>,
		Sched<[WriteALU]>;

		def tADDSi3 : tPseudoInst<(outs tGPR:$Rd), (ins tGPR:$Rm, imm0_7:$imm3),
		2, IIC_iALUi,
		[(set tGPR:$Rd, CPSR, (ARMaddc tGPR:$Rm,
		imm0_7:$imm3))]>,
		Requires<[IsThumb1Only]>,
		Sched<[WriteALU]>;

		def tADDSi8 : tPseudoInst<(outs tGPR:$Rdn), (ins tGPR:$Rn, imm0_255:$imm8),
		2, IIC_iALUi,
		[(set tGPR:$Rdn, CPSR, (ARMaddc tGPR:$Rn,
		imm8_255:$imm8))]>,
		Requires<[IsThumb1Only]>,
		Sched<[WriteALU]>;

		let isCommutable = 1 in
		def tADDSrr : tPseudoInst<(outs tGPR:$Rd), (ins tGPR:$Rn, tGPR:$Rm),
		2, IIC_iALUr,
		[(set tGPR:$Rd, CPSR, (ARMaddc tGPR:$Rn,
		tGPR:$Rm))]>,
		Requires<[IsThumb1Only]>,
		Sched<[WriteALU]>;
		}

let hasSideEffects = 0 in		let hasSideEffects = 0 in
def tADDhirr : T1pIt<(outs GPR:$Rdn), (ins GPR:$Rn, GPR:$Rm), IIC_iALUr,		def tADDhirr : T1pIt<(outs GPR:$Rdn), (ins GPR:$Rn, GPR:$Rm), IIC_iALUr,
"add", "\t$Rdn, $Rm", []>,		"add", "\t$Rdn, $Rm", []>,
T1Special<{0,0,?,?}>, Sched<[WriteALU]> {		T1Special<{0,0,?,?}>, Sched<[WriteALU]> {
// A8.6.6 T2		// A8.6.6 T2
bits<4> Rdn;		bits<4> Rdn;
bits<4> Rm;		bits<4> Rm;
let Inst{7} = Rdn{3};		let Inst{7} = Rdn{3};
[243 lines hidden; context: T1sIDPEncode<0b1001, (outs tGPR:$Rd), (ins tGPR:$Rn),]
[(set tGPR:$Rd, (ineg tGPR:$Rn))]>, Sched<[WriteALU]>;		[(set tGPR:$Rd, (ineg tGPR:$Rn))]>, Sched<[WriteALU]>;

// Subtract with carry register		// Subtract with carry register
let Uses = [CPSR] in		let Uses = [CPSR] in
def tSBC : // A8.6.151		def tSBC : // A8.6.151
T1sItDPEncode<0b0110, (outs tGPR:$Rdn), (ins tGPR:$Rn, tGPR:$Rm),		T1sItDPEncode<0b0110, (outs tGPR:$Rdn), (ins tGPR:$Rn, tGPR:$Rm),
IIC_iALUr,		IIC_iALUr,
"sbc", "\t$Rdn, $Rm",		"sbc", "\t$Rdn, $Rm",
[(set tGPR:$Rdn, (sube tGPR:$Rn, tGPR:$Rm))]>,		[]>,
Sched<[WriteALU]>;		Sched<[WriteALU]>;

// Subtract immediate		// Subtract immediate
def tSUBi3 : // A8.6.210 T1		def tSUBi3 : // A8.6.210 T1
T1sIGenEncodeImm<0b01111, (outs tGPR:$Rd), (ins tGPR:$Rm, imm0_7:$imm3),		T1sIGenEncodeImm<0b01111, (outs tGPR:$Rd), (ins tGPR:$Rm, imm0_7:$imm3),
IIC_iALUi,		IIC_iALUi,
"sub", "\t$Rd, $Rm, $imm3",		"sub", "\t$Rd, $Rm, $imm3",
[(set tGPR:$Rd, (add tGPR:$Rm, imm0_7_neg:$imm3))]>,		[(set tGPR:$Rd, (add tGPR:$Rm, imm0_7_neg:$imm3))]>,
[12 lines hidden]
// Subtract register		// Subtract register
def tSUBrr : // A8.6.212		def tSUBrr : // A8.6.212
T1sIGenEncode<0b01101, (outs tGPR:$Rd), (ins tGPR:$Rn, tGPR:$Rm),		T1sIGenEncode<0b01101, (outs tGPR:$Rd), (ins tGPR:$Rn, tGPR:$Rm),
IIC_iALUr,		IIC_iALUr,
"sub", "\t$Rd, $Rn, $Rm",		"sub", "\t$Rd, $Rn, $Rm",
[(set tGPR:$Rd, (sub tGPR:$Rn, tGPR:$Rm))]>,		[(set tGPR:$Rd, (sub tGPR:$Rn, tGPR:$Rm))]>,
Sched<[WriteALU]>;		Sched<[WriteALU]>;

		/// Similar to the above except these set the 's' bit so the
		/// instruction modifies the CPSR register.
		///
		/// These opcodes will be converted to the real non-S opcodes by
		/// AdjustInstrPostInstrSelection after giving then an optional CPSR operand.
		let hasPostISelHook = 1, Defs = [CPSR] in {
		def tSBCS : tPseudoInst<(outs tGPR:$Rdn), (ins tGPR:$Rn, tGPR:$Rm),
		2, IIC_iALUr,
		[(set tGPR:$Rdn, CPSR, (ARMsube tGPR:$Rn, tGPR:$Rm,
		CPSR))]>,
		Requires<[IsThumb1Only]>,
		Sched<[WriteALU]>;

		def tSUBSi3 : tPseudoInst<(outs tGPR:$Rd), (ins tGPR:$Rm, imm0_7:$imm3),
		2, IIC_iALUi,
		[(set tGPR:$Rd, CPSR, (ARMsubc tGPR:$Rm,
		imm0_7:$imm3))]>,
		Requires<[IsThumb1Only]>,
		Sched<[WriteALU]>;

		def tSUBSi8 : tPseudoInst<(outs tGPR:$Rdn), (ins tGPR:$Rn, imm0_255:$imm8),
		2, IIC_iALUi,
		[(set tGPR:$Rdn, CPSR, (ARMsubc tGPR:$Rn,
		imm8_255:$imm8))]>,
		Requires<[IsThumb1Only]>,
		Sched<[WriteALU]>;

		def tSUBSrr : tPseudoInst<(outs tGPR:$Rd), (ins tGPR:$Rn, tGPR:$Rm),
		2, IIC_iALUr,
		[(set tGPR:$Rd, CPSR, (ARMsubc tGPR:$Rn,
		tGPR:$Rm))]>,
		Requires<[IsThumb1Only]>,
		Sched<[WriteALU]>;
		}

// Sign-extend byte		// Sign-extend byte
def tSXTB : // A8.6.222		def tSXTB : // A8.6.222
T1pIMiscEncode<{0,0,1,0,0,1,?}, (outs tGPR:$Rd), (ins tGPR:$Rm),		T1pIMiscEncode<{0,0,1,0,0,1,?}, (outs tGPR:$Rd), (ins tGPR:$Rm),
IIC_iUNAr,		IIC_iUNAr,
"sxtb", "\t$Rd, $Rm",		"sxtb", "\t$Rd, $Rm",
[(set tGPR:$Rd, (sext_inreg tGPR:$Rm, i8))]>,		[(set tGPR:$Rd, (sext_inreg tGPR:$Rm, i8))]>,
Requires<[IsThumb, IsThumb1Only, HasV6]>,		Requires<[IsThumb, IsThumb1Only, HasV6]>,
Sched<[WriteALU]>;		Sched<[WriteALU]>;
[144 lines hidden]
//		//

// Comparisons		// Comparisons
def : T1Pat<(ARMcmpZ tGPR:$Rn, imm0_255:$imm8),		def : T1Pat<(ARMcmpZ tGPR:$Rn, imm0_255:$imm8),
(tCMPi8 tGPR:$Rn, imm0_255:$imm8)>;		(tCMPi8 tGPR:$Rn, imm0_255:$imm8)>;
def : T1Pat<(ARMcmpZ tGPR:$Rn, tGPR:$Rm),		def : T1Pat<(ARMcmpZ tGPR:$Rn, tGPR:$Rm),
(tCMPr tGPR:$Rn, tGPR:$Rm)>;		(tCMPr tGPR:$Rn, tGPR:$Rm)>;

// Add with carry
def : T1Pat<(addc tGPR:$lhs, imm0_7:$rhs),
(tADDi3 tGPR:$lhs, imm0_7:$rhs)>;
def : T1Pat<(addc tGPR:$lhs, imm8_255:$rhs),
(tADDi8 tGPR:$lhs, imm8_255:$rhs)>;
def : T1Pat<(addc tGPR:$lhs, tGPR:$rhs),
(tADDrr tGPR:$lhs, tGPR:$rhs)>;

// Subtract with carry		// Subtract with carry
def : T1Pat<(addc tGPR:$lhs, imm0_7_neg:$rhs),		def : T1Pat<(ARMaddc tGPR:$lhs, imm0_7_neg:$rhs),
(tSUBi3 tGPR:$lhs, imm0_7_neg:$rhs)>;		(tSUBSi3 tGPR:$lhs, imm0_7_neg:$rhs)>;
def : T1Pat<(addc tGPR:$lhs, imm8_255_neg:$rhs),		def : T1Pat<(ARMaddc tGPR:$lhs, imm8_255_neg:$rhs),
(tSUBi8 tGPR:$lhs, imm8_255_neg:$rhs)>;		(tSUBSi8 tGPR:$lhs, imm8_255_neg:$rhs)>;
def : T1Pat<(subc tGPR:$lhs, tGPR:$rhs),
(tSUBrr tGPR:$lhs, tGPR:$rhs)>;
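
The immediate patterns above choose an instruction form from the size and sign of the constant operand. A minimal sketch of that selection follows, written in C++ rather than TableGen. `pickCarryAddForm` and the returned labels are hypothetical names used only for illustration; they are not part of the patch. The ranges assume that `imm0_7` and `imm8_255` cover what their names suggest and that the `_neg` variants match the negated constant.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper (not LLVM code): given the constant operand of a
// carry-producing add, name the Thumb1 form the patterns would pick.
std::string pickCarryAddForm(int64_t imm) {
  if (imm >= 0 && imm <= 7)
    return "ADDS imm3";      // imm0_7
  if (imm >= 8 && imm <= 255)
    return "ADDS imm8";      // imm8_255
  if (imm >= -7 && imm <= -1)
    return "SUBS imm3";      // imm0_7_neg: subtract the negated constant
  if (imm >= -255 && imm <= -8)
    return "SUBS imm8";      // imm8_255_neg
  return "register form";    // constant must be materialized first
}

// Usage check mirroring constants that appear in test/CodeGen/Thumb/long.ll.
void checkCarryAddForms() {
  assert(pickCarryAddForm(1) == "ADDS imm3");
  assert(pickCarryAddForm(10) == "ADDS imm8");
  assert(pickCarryAddForm(-10) == "SUBS imm8");
  assert(pickCarryAddForm(1000) == "register form");
}
```

The constants in the usage check are the ones exercised by `f6`, `f6a`, `f9a` and `f6b` in the test file.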

// Bswap 16 with load/store		// Bswap 16 with load/store
def : T1Pat<(srl (bswap (extloadi16 t_addrmode_is2:$addr)), (i32 16)),		def : T1Pat<(srl (bswap (extloadi16 t_addrmode_is2:$addr)), (i32 16)),
(tREV16 (tLDRHi t_addrmode_is2:$addr))>;		(tREV16 (tLDRHi t_addrmode_is2:$addr))>;
def : T1Pat<(srl (bswap (extloadi16 t_addrmode_rr:$addr)), (i32 16)),		def : T1Pat<(srl (bswap (extloadi16 t_addrmode_rr:$addr)), (i32 16)),
(tREV16 (tLDRHr t_addrmode_rr:$addr))>;		(tREV16 (tLDRHr t_addrmode_rr:$addr))>;
def : T1Pat<(truncstorei16 (srl (bswap tGPR:$Rn), (i32 16)),		def : T1Pat<(truncstorei16 (srl (bswap tGPR:$Rn), (i32 16)),
t_addrmode_is2:$addr),		t_addrmode_is2:$addr),

test/CodeGen/Thumb/long.ll

	; RUN: llc -mtriple=thumb-eabi %s -o - \| FileCheck %s			; RUN: llc -mtriple=thumb-eabi %s -verify-machineinstrs -o - \| FileCheck %s
	; RUN: llc -mtriple=thumb-apple-darwin %s -o - \| \			; RUN: llc -mtriple=thumb-apple-darwin %s -verify-machineinstrs -o - \| \
	; RUN: FileCheck %s -check-prefix CHECK -check-prefix CHECK-DARWIN			; RUN: FileCheck %s -check-prefix CHECK -check-prefix CHECK-DARWIN
				efriedma: Add -verify-machineinstrs to the RUN lines.

	define i64 @f1() {			define i64 @f1() {
	entry:			entry:
	ret i64 0			ret i64 0
				; CHECK-LABEL: f1:
				; CHECK: movs r0, #0
				; CHECK: movs r1, r0
	}			}

	define i64 @f2() {			define i64 @f2() {
	entry:			entry:
	ret i64 1			ret i64 1
				; CHECK-LABEL: f2:
				; CHECK: movs r0, #1
				; CHECK: movs r1, #0
	}			}

	define i64 @f3() {			define i64 @f3() {
	entry:			entry:
	ret i64 2147483647			ret i64 2147483647
				; CHECK-LABEL: f3:
				; CHECK: ldr r0,
				; CHECK: movs r1, #0
	}			}

	define i64 @f4() {			define i64 @f4() {
	entry:			entry:
	ret i64 2147483648			ret i64 2147483648
				; CHECK-LABEL: f4:
				; CHECK: movs r0, #1
				; CHECK: lsls r0, r0, #31
				; CHECK: movs r1, #0
	}			}

	define i64 @f5() {			define i64 @f5() {
	entry:			entry:
	ret i64 9223372036854775807			ret i64 9223372036854775807
	; CHECK-LABEL: f5:			; CHECK-LABEL: f5:
	; CHECK: mvn			; CHECK: movs r0, #0
	; CHECK-NOT: mvn			; CHECK: mvns r0, r0
				; CHECK: ldr r1,
	}			}

	define i64 @f6(i64 %x, i64 %y) {			define i64 @f6(i64 %x, i64 %y) {
	entry:			entry:
	%tmp1 = add i64 %y, 1 ; <i64> [#uses=1]			%tmp1 = add i64 %y, 1 ; <i64> [#uses=1]
	ret i64 %tmp1			ret i64 %tmp1
	; CHECK-LABEL: f6:			; CHECK-LABEL: f6:
	; CHECK: adc			; CHECK: movs r1, #0
	; CHECK-NOT: adc			; CHECK: adds r0, r2, #1
				; CHECK: adcs r1, r3
				tyomitch (author): Now I see that lowering an `(ADDE x, y, (ADDC z, t))` into a chain of `(CopyFromReg CPSR, (tADD z, t)), (CopyFromReg CPSR, (tADC x, y, (CopyToReg CPSR)))`, with the CPSR-copying nodes glued to the arithmetic nodes, doesn't prevent LLVM from scheduling CPSR-clobbering operations between the converted ADDC and the converted ADDE. This test case shows the problem: a flag-setting tMOVi8 is inserted in the middle. An ugly patch is certainly better than an incorrect one, so I decided to go back and finish the "hybrid implementation", using tPseudoInsts with two integer outputs each for tADDS / tSUBS, and custom C++ lowering for tADC / tSBC.
				}
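
The comment above depends on the carry produced by the low-word add reaching the high-word add unchanged. A minimal C++ sketch of that ADDC/ADDE split is shown here; the function names are hypothetical and model only the arithmetic, not the SelectionDAG nodes or the scheduler.

```cpp
#include <cassert>
#include <cstdint>

// ADDC: add the low words and report the carry-out.
uint32_t addc(uint32_t a, uint32_t b, bool &carry) {
  uint32_t sum = a + b;          // wraps modulo 2^32
  carry = sum < a;               // unsigned overflow means carry-out
  return sum;
}

// ADDE: add the high words plus the incoming carry, update the carry.
uint32_t adde(uint32_t a, uint32_t b, bool &carry) {
  uint64_t wide = uint64_t(a) + b + (carry ? 1 : 0);
  carry = (wide >> 32) != 0;
  return uint32_t(wide);
}

// A 64-bit add expressed as the two 32-bit steps. Nothing may overwrite
// `carry` between the two calls, which is the property the review
// comment is concerned with.
uint64_t add64(uint64_t x, uint64_t y) {
  bool carry = false;
  uint32_t lo = addc(uint32_t(x), uint32_t(y), carry);
  uint32_t hi = adde(uint32_t(x >> 32), uint32_t(y >> 32), carry);
  return (uint64_t(hi) << 32) | lo;
}

// Usage check: a carry out of the low word must reach the high word.
void checkAdd64() {
  assert(add64(0xFFFFFFFFull, 1) == 0x100000000ull);
}
```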

				define i64 @f6a(i64 %x, i64 %y) {
				entry:
				%tmp1 = add i64 %y, 10
				ret i64 %tmp1
				; CHECK-LABEL: f6a:
				; CHECK: movs r1, #0
				; CHECK: adds r2, #10
				; CHECK: adcs r1, r3
				; CHECK: movs r0, r2
				}

				define i64 @f6b(i64 %x, i64 %y) {
				entry:
				%tmp1 = add i64 %y, 1000
				ret i64 %tmp1
				; CHECK-LABEL: f6b:
				; CHECK: movs r0, #125
				; CHECK: lsls r0, r0, #3
				; CHECK: movs r1, #0
				; CHECK: adds r0, r2, r0
				; CHECK: adcs r1, r3
	}			}
				efriedma: I'd also like to see some tests here for subtraction with an immediate amount ("add i64 %y, -10" etc.).
				tyomitch (author): Indeed, subtracting immediates wasn't handled well; I'll upload the updated patch.

	define void @f7() {			define void @f7() {
	entry:			entry:
	%tmp = call i64 @f8( ) ; <i64> [#uses=0]			%tmp = call i64 @f8( ) ; <i64> [#uses=0]
	ret void			ret void
				; CHECK-LABEL: f7:
				; CHECK: bl
	}			}

	declare i64 @f8()			declare i64 @f8()

	define i64 @f9(i64 %a, i64 %b) {			define i64 @f9(i64 %a, i64 %b) {
	entry:			entry:
	%tmp = sub i64 %a, %b ; <i64> [#uses=1]			%tmp = sub i64 %a, %b ; <i64> [#uses=1]
	ret i64 %tmp			ret i64 %tmp
	; CHECK-LABEL: f9:			; CHECK-LABEL: f9:
	; CHECK: sbc			; CHECK: subs r0, r0, r2
	; CHECK-NOT: sbc			; CHECK: sbcs r1, r3
				}

				define i64 @f9a(i64 %x, i64 %y) { ; ADDC with small negative imm => SUBS imm
				entry:
				%tmp1 = sub i64 %y, 10
				ret i64 %tmp1
				; CHECK-LABEL: f9a:
				; CHECK: movs r0, #0
				; CHECK: subs r2, #10
				; CHECK: sbcs r3, r0
				; CHECK: movs r0, r2
				; CHECK: movs r1, r3
				}
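
The `f9a` case turns an add of a small negative constant into `subs`/`sbcs`. The sketch below checks why that is safe: on ARM the carry flag after a subtraction is the inverse of the borrow, so `SUBS lo, #imm` leaves the same carry as `ADDS lo, #-imm`, and `SBCS hi, 0` then matches `ADCS hi, 0xFFFFFFFF`. All names are hypothetical; the code models only the flag arithmetic and assumes a nonzero immediate.

```cpp
#include <cassert>
#include <cstdint>

struct Flagged {
  uint32_t value;
  bool carry;
};

// ADDS: carry is set on unsigned overflow.
Flagged adds(uint32_t a, uint32_t b) {
  uint32_t s = a + b;
  return {s, s < a};
}

// SUBS: ARM sets carry when there is NO borrow.
Flagged subs(uint32_t a, uint32_t b) {
  return {a - b, a >= b};
}

// ADCS (result only): a + b + C.
uint32_t adcs(uint32_t a, uint32_t b, bool c) { return a + b + (c ? 1 : 0); }

// SBCS (result only): a - b - (1 - C).
uint32_t sbcs(uint32_t a, uint32_t b, bool c) { return a - b - (c ? 0 : 1); }

// Compare "64-bit add of -imm" against "64-bit subtract of imm".
bool negImmFormsAgree(uint32_t lo, uint32_t hi, uint32_t imm) {
  assert(imm != 0 && "the high word of -imm is all ones only for imm != 0");
  Flagged viaAdd = adds(lo, 0u - imm);                 // low word of -imm
  uint32_t hiAdd = adcs(hi, 0xFFFFFFFFu, viaAdd.carry); // high word of -imm
  Flagged viaSub = subs(lo, imm);
  uint32_t hiSub = sbcs(hi, 0, viaSub.carry);
  return viaAdd.value == viaSub.value && viaAdd.carry == viaSub.carry &&
         hiAdd == hiSub;
}
```

For example, `negImmFormsAgree(5, 0, 10)` exercises the borrowing case and `negImmFormsAgree(10, 7, 10)` the non-borrowing one.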

				define i64 @f9b(i64 %x, i64 %y) { ; ADDC with big negative imm => SUBS reg
				entry:
				%tmp1 = sub i64 1000, %y
				ret i64 %tmp1
				; CHECK-LABEL: f9b:
				; CHECK: movs r0, #125
				; CHECK: lsls r0, r0, #3
				; CHECK: movs r1, #0
				; CHECK: subs r0, r0, r2
				; CHECK: sbcs r1, r3
				}

				define i64 @f9c(i64 %x, i32 %y) { ; SUBS with small positive imm => SUBS imm
				entry:
				%conv = sext i32 %y to i64
				%shl = shl i64 %conv, 32
				%or = or i64 %shl, 1
				%sub = sub nsw i64 %x, %or
				ret i64 %sub
				; CHECK-LABEL: f9c:
				; CHECK: subs r0, r0, #1
				; CHECK: sbcs r1, r2
				}

				define i64 @f9d(i64 %x, i32 %y) { ; SUBS with small negative imm => SUBS reg
				; FIXME: this would be better lowered as an `ADDS imm`
				entry:
				%conv = sext i32 %y to i64
				%shl = shl i64 %conv, 32
				%or = or i64 %shl, 4294967295
				%sub = sub nsw i64 %x, %or
				ret i64 %sub
				; CHECK-LABEL: f9d:
				; CHECK: movs r3, #0
				; CHECK: mvns r3, r3
				; CHECK: subs r0, r0, r3
				; CHECK: sbcs r1, r2
	}			}

	define i64 @f(i32 %a, i32 %b) {			define i64 @f(i32 %a, i32 %b) {
	entry:			entry:
	%tmp = sext i32 %a to i64 ; <i64> [#uses=1]			%tmp = sext i32 %a to i64 ; <i64> [#uses=1]
	%tmp1 = sext i32 %b to i64 ; <i64> [#uses=1]			%tmp1 = sext i32 %b to i64 ; <i64> [#uses=1]
	%tmp2 = mul i64 %tmp1, %tmp ; <i64> [#uses=1]			%tmp2 = mul i64 %tmp1, %tmp ; <i64> [#uses=1]
	ret i64 %tmp2			ret i64 %tmp2
	; CHECK-LABEL: f:			; CHECK-LABEL: f:
				; CHECK-V6: bl __aeabi_lmul
	; CHECK-DARWIN: __muldi3			; CHECK-DARWIN: __muldi3
	}			}

	define i64 @g(i32 %a, i32 %b) {			define i64 @g(i32 %a, i32 %b) {
	entry:			entry:
	%tmp = zext i32 %a to i64 ; <i64> [#uses=1]			%tmp = zext i32 %a to i64 ; <i64> [#uses=1]
	%tmp1 = zext i32 %b to i64 ; <i64> [#uses=1]			%tmp1 = zext i32 %b to i64 ; <i64> [#uses=1]
	%tmp2 = mul i64 %tmp1, %tmp ; <i64> [#uses=1]			%tmp2 = mul i64 %tmp1, %tmp ; <i64> [#uses=1]
	ret i64 %tmp2			ret i64 %tmp2
	; CHECK-LABEL: g:			; CHECK-LABEL: g:
				; CHECK-V6: bl __aeabi_lmul
	; CHECK-DARWIN: __muldi3			; CHECK-DARWIN: __muldi3
	}			}

	define i64 @f10() {			define i64 @f10() {
	entry:			entry:
	%a = alloca i64, align 8 ; <i64*> [#uses=1]			%a = alloca i64, align 8 ; <i64*> [#uses=1]
	%retval = load i64, i64* %a ; <i64> [#uses=1]			%retval = load i64, i64* %a ; <i64> [#uses=1]
	ret i64 %retval			ret i64 %retval
				; CHECK-LABEL: f10:
				; CHECK: sub sp, #8
				; CHECK: ldr r0, [sp]
				; CHECK: ldr r1, [sp, #4]
				; CHECK: add sp, #8
	}			}

				define i64 @f11(i64 %x, i64 %y) {
				entry:
				%tmp1 = add i64 -1000, %y
				%tmp2 = add i64 %tmp1, -1000
				ret i64 %tmp2
				; CHECK-LABEL: f11:
				; CHECK: movs r1, #0
				; CHECK: ldr r0,
				; CHECK: adds r2, r2, r0
				; CHECK: sbcs r3, r1
				; CHECK: adds r0, r2, r0
				; CHECK: sbcs r3, r1
				; CHECK: movs r1, r3
				}

Revision Contents

Diff 91231

lib/Target/ARM/ARMBaseInstrInfo.cpp

lib/Target/ARM/ARMISelLowering.cpp

lib/Target/ARM/ARMInstrThumb.td

test/CodeGen/Thumb/long.ll
