This is an archive of the discontinued LLVM Phabricator instance.

[Thumb] Add support for tMUL in the compare instruction peephole optimizer
Closed, Public

Authored by SjoerdMeijer on Dec 20 2016, 7:41 AM.

Details

Reviewers

rovka
rengolin
t.p.northover
jmolloy
dexonsmith

Commits

rG2db2a947f64b: [Thumb] Add support for tMUL in the compare instruction peephole optimizer.
rL292608: [Thumb] Add support for tMUL in the compare instruction peephole optimizer.

Summary

We also want to optimise tests like this: return a*b == 0. The MULS instruction is flag setting, so we don't need the CMP instruction and can instead branch on the result of the MULS. The generated instruction sequence for this example was: MULS, MOVS, MOVS, CMP. The MOVS instructions load the boolean values resulting from the select instruction, but these MOVS instructions are flag setting and were thus preventing this optimisation. Now we first move the MULS to just before the CMP, generating the sequence MOVS, MOVS, MULS, CMP, so that the optimisation can still be triggered. Reordering the MULS and MOVS is safe because the MOVS instructions only set the CPSR register and don't use it, i.e. the CPSR is dead across them.
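
The reordering described above can be sketched on a toy instruction stream. This is only an illustration under stated assumptions: the `Op` enum and the `sinkMulToCmp` helper are hypothetical names, not LLVM APIs, and the real patch works on `MachineInstr` iterators rather than vector indices.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical opcodes for a toy instruction stream; not the LLVM enums.
enum class Op { MULS, MOVS, STR, CMP };

// If at least one instruction lies strictly between the MULS at index Mul and
// the CMP at index Cmp, and all of them are MOVS, move the MULS to just before
// the CMP and return true. Otherwise leave Seq untouched and return false.
bool sinkMulToCmp(std::vector<Op> &Seq, std::size_t Mul, std::size_t Cmp) {
  if (Mul >= Cmp || Cmp >= Seq.size())
    return false;
  if (Seq[Mul] != Op::MULS || Seq[Cmp] != Op::CMP)
    return false;
  if (Cmp - Mul < 2)
    return false; // Nothing in between, so nothing to reorder.
  for (std::size_t I = Mul + 1; I < Cmp; ++I)
    if (Seq[I] != Op::MOVS)
      return false; // Anything other than MOVS blocks the reordering.
  Seq.erase(Seq.begin() + Mul);
  // After the erase the CMP sits at index Cmp - 1; insert right before it.
  Seq.insert(Seq.begin() + (Cmp - 1), Op::MULS);
  return true;
}
```

Calling `sinkMulToCmp` on MULS, MOVS, MOVS, CMP with indices 0 and 3 rewrites the sequence to MOVS, MOVS, MULS, CMP, while a sequence such as MULS, STR, CMP is left unchanged.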

Diff Detail

Repository: rL LLVM

Event Timeline

SjoerdMeijer updated this revision to Diff 82108. Dec 20 2016, 7:41 AM

SjoerdMeijer retitled this revision from to [Thumb] Add support for tMUL in the compare instruction peephole optimizer.

SjoerdMeijer updated this object.

SjoerdMeijer added reviewers: jmolloy, rengolin, dexonsmith, t.p.northover, rovka.

SjoerdMeijer added a subscriber: llvm-commits.

Hi Sjoerd,

I'm not sure this is the best way to go, it looks a bit hacky. Why are we generating MOVS for those constants in the first place? It feels like MOV should do the job just as well, and wouldn't touch the flags.

Cheers,
Diana

test/CodeGen/ARM/mul-cmp-peephole.ll
1 ↗	(On Diff #82108)	I think this would work better as a .mir test (use -stop-before=peephole-opt).
28 ↗	(On Diff #82108)	Is all this alloca clutter relevant to the test?

Hi Diana,

Thanks a lot for the review! We are generating these constants and moves for an expression like "return A*B == C", where A and B are variables and C is a constant, because it is lowered to a select_cc. For example:

%mul = mul nsw i32 %b, %a
%cmp = icmp eq i32 %mul, 0
%conv = zext i1 %cmp to i32
ret i32 %conv

this IR gets instruction selected to:

t5: i32 = mul t4, t2

    t19: glue = ARMISD::CMPZ t5, Constant:i32<0>
  t20: i32 = ARMISD::CMOV Constant:i32<0>, Constant:i32<1>, Constant:i32<0>, Register:i32 %CPSR, t19
t11: ch,glue = CopyToReg t0, Register:i32 %R0, t20
t12: ch = ARMISD::RET_FLAG t11, Register:i32 %R0, t11:1

and then we end up with these MIs:

%vreg2<def,tied3>, %CPSR<def> = tMUL %vreg1, %vreg0<tied0>, pred:14, pred:%noreg; tGPR:%vreg2,%vreg1,%vreg0

%vreg3<def>, %CPSR<def> = tMOVi8 1, pred:14, pred:%noreg; tGPR:%vreg3
%vreg4<def>, %CPSR<def> = tMOVi8 0, pred:14, pred:%noreg; tGPR:%vreg4
tCMPi8 %vreg2<kill>, 0, pred:14, pred:%noreg, %CPSR<imp-def>; tGPR:%vreg2
%vreg5<def> = tMOVCCr_pseudo %vreg4<kill>, %vreg3<kill>, pred:0, pred:%CPSR; tGPR:%vreg5,%vreg4,%vreg3
%R0<def> = COPY %vreg5; tGPR:%vreg5
tBX_RET

And the issue is that the move immediate instruction for Thumb1 (V6M) updates the condition flags.
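
To make the flag problem concrete, here is a small model of the N and Z flags only. It is a sketch under assumptions: the `Cpu` struct and the three helper functions are hypothetical names and have nothing to do with how LLVM represents CPSR. It compares the Z flag seen by a following branch in the original order with the CMP kept, in the original order with the CMP dropped, and in the reordered sequence with the CMP dropped.

```cpp
#include <cassert>
#include <cstdint>

// Toy model of the N and Z flags; illustrative only.
struct Flags { bool N; bool Z; };

static Flags flagsFor(std::uint32_t V) { return Flags{(V >> 31) != 0, V == 0}; }

struct Cpu {
  std::uint32_t R[4] = {0, 0, 0, 0};
  Flags F{false, false};
  // MULS Rd, Rm: Rd = Rd * Rm, updates N and Z.
  void muls(int D, int M) { R[D] *= R[M]; F = flagsFor(R[D]); }
  // MOVS Rd, #imm8: also updates N and Z in Thumb1.
  void movs(int D, std::uint8_t Imm) { R[D] = Imm; F = flagsFor(R[D]); }
  // CMP Rn, #0: N and Z reflect Rn itself.
  void cmpZero(int N) { F = flagsFor(R[N]); }
};

// MULS, MOVS #1, MOVS #0, CMP: the reference behaviour.
bool zWithCmp(std::uint32_t A, std::uint32_t B) {
  Cpu C; C.R[0] = A; C.R[1] = B;
  C.muls(0, 1); C.movs(2, 1); C.movs(3, 0); C.cmpZero(0);
  return C.F.Z;
}

// MULS, MOVS #1, MOVS #0 with the CMP simply dropped: Z comes from MOVS #0.
bool zOriginalOrderNoCmp(std::uint32_t A, std::uint32_t B) {
  Cpu C; C.R[0] = A; C.R[1] = B;
  C.muls(0, 1); C.movs(2, 1); C.movs(3, 0);
  return C.F.Z;
}

// MOVS #1, MOVS #0, MULS with the CMP dropped: Z comes from the MULS.
bool zReorderedNoCmp(std::uint32_t A, std::uint32_t B) {
  Cpu C; C.R[0] = A; C.R[1] = B;
  C.movs(2, 1); C.movs(3, 0); C.muls(0, 1);
  return C.F.Z;
}
```

In this model, for a nonzero product such as 2 * 3 the reordered sequence agrees with the reference, whereas dropping the CMP without reordering leaves Z set by the trailing MOVS #0.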

The elegant solution would be a bit of (early) CFG restructuring with two exit blocks that return the true/false values, so that we end up with something similar to this:

MULS     r0,r1,r0
BEQ      |L0.8|
MOVS     r0,#0
BX       lr

L0.8

MOVS     r0,#1
BX       lr

But to achieve that at this point is difficult, hence my solution to simply reorder the constants and the operation.

Right, sorry, I was thinking about Thumb2, where MOV doesn't touch the flags.

I guess reordering isn't so bad in this case, but it will affect the registers that you have available. Before, if the operands for tMUL weren't needed elsewhere you could reuse their registers for the constants, now you might need extra regs for them. E.g. in the first example from the test, you end up using up to r3 instead of just up to r2. I think it would be useful to run some benchmarks with this change, to make sure we're not causing trouble in more complicated cases.

Thanks.

Thanks for the suggestion. I will look into possible correctness issues a bit more, which was also on my todo list. I ran tests and they didn't show problems for lnt, dhrystone, coremark, eembc, geekbench, and spec2k. spec2k6 might have found something, but I need to check whether that is noise or not.

I have verified correctness, and this patch does not show any problems in regression tests or benchmarks. This peephole works on vregs, so the code is still in SSA form. As a consequence, the MOVS instructions cannot redefine or kill the MUL operands, which is what would make this reordering illegal. I have added an extra check, though, that the MOVS instructions are not predicated. I have also changed the tests into .mir tests (and simplified the 2nd test).
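
The SSA argument can be illustrated with a toy register-dependence check. This is a sketch under assumptions: `Inst` and `canSinkPast` are hypothetical names, register numbers are arbitrary, and flags are deliberately left out because they are discussed separately in this review.

```cpp
#include <cassert>
#include <vector>

// Toy instruction: one defined register and a list of used registers.
// Illustrative only; these are not LLVM types.
struct Inst {
  int Def;
  std::vector<int> Uses;
};

// Sinking Moved below every instruction in Between is register-safe if none
// of them redefines an operand of Moved, redefines its result, or reads its
// result.
bool canSinkPast(const Inst &Moved, const std::vector<Inst> &Between) {
  for (const Inst &B : Between) {
    if (B.Def == Moved.Def)
      return false; // Result would be clobbered.
    for (int U : Moved.Uses)
      if (B.Def == U)
        return false; // Operand would be redefined before Moved reads it.
    for (int U : B.Uses)
      if (U == Moved.Def)
        return false; // B needs the result before Moved would produce it.
  }
  return true;
}
```

With SSA virtual registers every definition is unique and a move-immediate has no register uses, so the first two conditions cannot fire for the MOVS instructions in the sequence from this patch. A non-SSA stream where a move overwrites one of the multiply's operands would be rejected by this check.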

rovka added inline comments. Jan 11 2017, 7:46 AM

lib/Target/ARM/ARMBaseInstrInfo.cpp
2528 ↗	(On Diff #83042)	Typo: MOVS instructions (also on line 2531)
2537 ↗	(On Diff #83042)	Nitpick: I think CanReorder is a better name.
2541 ↗	(On Diff #83042)	Why not isPredicated?
2541 ↗	(On Diff #83042)	You should cover this in the tests (i.e. check that the optimization is not performed for a predicated MOV).
test/CodeGen/ARM/cmp1-peephole-thumb.mir
3 ↗	(On Diff #83042)	Nitpick: I'd move this closer to the MIR code. Plus, since the code is dealing with reordering, it's probably a good idea to also check for what you're expecting to see in the output (i.e. the MOVs followed by the MUL).

Thanks for reviewing again. I have fixed the typos in the comments, changed the variable name, and modified the test case. Regarding the predicated MOVs I have taken a slightly different approach. Since we only want to do this peephole optimisation for Thumb1, I first check that the instruction is actually a Thumb1 instruction. And because Thumb1 does not have conditional movs, we can safely omit the isPredicated check (testing shows no issues).

LGTM with a couple more nitpicks (you don't have to upload a new version here, just make the changes and commit).

lib/Target/ARM/ARMBaseInstrInfo.cpp
2570 ↗	(On Diff #84559)	Another typo: eliminate :D
test/CodeGen/ARM/cmp2-peephole-thumb.mir
8 ↗	(On Diff #84559)	Please move this closer to the MIR code (as you did for the other test), and also check for the compare.

This revision is now accepted and ready to land. Jan 19 2017, 4:44 AM

Thanks! Have fixed that, and will commit.

Closed by commit rL292608: [Thumb] Add support for tMUL in the compare instruction peephole optimizer. (authored by SjoerdMeijer). Jan 20 2017, 5:21 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path	Size
llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.cpp	377 lines
llvm/trunk/test/CodeGen/ARM/cmp1-peephole-thumb.mir	78 lines
llvm/trunk/test/CodeGen/ARM/cmp2-peephole-thumb.mir	108 lines

Diff 85125

llvm/trunk/lib/Target/ARM/ARMBaseInstrInfo.cpp

[2,429 lines hidden]	if ((CmpI->getOpcode() == ARM::CMPri ||
(OI->getOpcode() == ARM::SUBri ||		(OI->getOpcode() == ARM::SUBri ||
OI->getOpcode() == ARM::t2SUBri) &&		OI->getOpcode() == ARM::t2SUBri) &&
OI->getOperand(1).getReg() == SrcReg &&		OI->getOperand(1).getReg() == SrcReg &&
OI->getOperand(2).getImm() == ImmValue)		OI->getOperand(2).getImm() == ImmValue)
return true;		return true;
return false;		return false;
}		}

		static bool isOptimizeCompareCandidate(MachineInstr *MI, bool &IsThumb1) {
		switch (MI->getOpcode()) {
		default: return false;
		case ARM::tLSLri:
		case ARM::tLSRri:
		case ARM::tLSLrr:
		case ARM::tLSRrr:
		case ARM::tSUBrr:
		case ARM::tADDrr:
		case ARM::tADDi3:
		case ARM::tADDi8:
		case ARM::tSUBi3:
		case ARM::tSUBi8:
		case ARM::tMUL:
		IsThumb1 = true;
		LLVM_FALLTHROUGH;
		case ARM::RSBrr:
		case ARM::RSBri:
		case ARM::RSCrr:
		case ARM::RSCri:
		case ARM::ADDrr:
		case ARM::ADDri:
		case ARM::ADCrr:
		case ARM::ADCri:
		case ARM::SUBrr:
		case ARM::SUBri:
		case ARM::SBCrr:
		case ARM::SBCri:
		case ARM::t2RSBri:
		case ARM::t2ADDrr:
		case ARM::t2ADDri:
		case ARM::t2ADCrr:
		case ARM::t2ADCri:
		case ARM::t2SUBrr:
		case ARM::t2SUBri:
		case ARM::t2SBCrr:
		case ARM::t2SBCri:
		case ARM::ANDrr:
		case ARM::ANDri:
		case ARM::t2ANDrr:
		case ARM::t2ANDri:
		case ARM::ORRrr:
		case ARM::ORRri:
		case ARM::t2ORRrr:
		case ARM::t2ORRri:
		case ARM::EORrr:
		case ARM::EORri:
		case ARM::t2EORrr:
		case ARM::t2EORri:
		case ARM::t2LSRri:
		case ARM::t2LSRrr:
		case ARM::t2LSLri:
		case ARM::t2LSLrr:
		return true;
		}
		}

/// optimizeCompareInstr - Convert the instruction supplying the argument to the		/// optimizeCompareInstr - Convert the instruction supplying the argument to the
/// comparison into one that sets the zero bit in the flags register;		/// comparison into one that sets the zero bit in the flags register;
/// Remove a redundant Compare instruction if an earlier instruction can set the		/// Remove a redundant Compare instruction if an earlier instruction can set the
/// flags in the same way as Compare.		/// flags in the same way as Compare.
/// E.g. SUBrr(r1,r2) and CMPrr(r1,r2). We also handle the case where two		/// E.g. SUBrr(r1,r2) and CMPrr(r1,r2). We also handle the case where two
/// operands are swapped: SUBrr(r1,r2) and CMPrr(r2,r1), by updating the		/// operands are swapped: SUBrr(r1,r2) and CMPrr(r2,r1), by updating the
/// condition code of instructions which use the flags.		/// condition code of instructions which use the flags.
bool ARMBaseInstrInfo::optimizeCompareInstr(		bool ARMBaseInstrInfo::optimizeCompareInstr(
[45 lines hidden]	else if (MI->getParent() != CmpInstr.getParent() || CmpValue != 0) {
// Thus we cannot return here.		// Thus we cannot return here.
if (CmpInstr.getOpcode() == ARM::CMPri ||		if (CmpInstr.getOpcode() == ARM::CMPri ||
CmpInstr.getOpcode() == ARM::t2CMPri)		CmpInstr.getOpcode() == ARM::t2CMPri)
MI = nullptr;		MI = nullptr;
else		else
return false;		return false;
}		}

		bool IsThumb1 = false;
		if (MI && !isOptimizeCompareCandidate(MI, IsThumb1))
		return false;

		// We also want to do this peephole for cases like this: if (a*b == 0),
		// and optimise away the CMP instruction from the generated code sequence:
		// MULS, MOVS, MOVS, CMP. Here the MOVS instructions load the boolean values
		// resulting from the select instruction, but these MOVS instructions for
		// Thumb1 (V6M) are flag setting and are thus preventing this optimisation.
		// However, if we only have MOVS instructions in between the CMP and the
		// other instruction (the MULS in this example), then the CPSR is dead so we
		// can safely reorder the sequence into: MOVS, MOVS, MULS, CMP. We do this
		// reordering and then continue the analysis hoping we can eliminate the
		// CMP. This peephole works on the vregs, so is still in SSA form. As a
		// consequence, the movs won't redefine/kill the MUL operands which would
		// make this reordering illegal.
		if (MI && IsThumb1) {
		--I;
		bool CanReorder = true;
		const bool HasStmts = I != E;
		for (; I != E; --I) {
		if (I->getOpcode() != ARM::tMOVi8) {
		CanReorder = false;
		break;
		}
		}
		if (HasStmts && CanReorder) {
		MI = MI->removeFromParent();
		E = CmpInstr;
		CmpInstr.getParent()->insert(E, MI);
		}
		I = CmpInstr;
		E = MI;
		}

// Check that CPSR isn't set between the comparison instruction and the one we		// Check that CPSR isn't set between the comparison instruction and the one we
// want to change. At the same time, search for Sub.		// want to change. At the same time, search for Sub.
const TargetRegisterInfo *TRI = &getRegisterInfo();		const TargetRegisterInfo *TRI = &getRegisterInfo();
--I;		--I;
for (; I != E; --I) {		for (; I != E; --I) {
const MachineInstr &Instr = *I;		const MachineInstr &Instr = *I;

if (Instr.modifiesRegister(ARM::CPSR, TRI) ||		if (Instr.modifiesRegister(ARM::CPSR, TRI) ||
[19 lines hidden]	bool ARMBaseInstrInfo::optimizeCompareInstr(

// The single candidate is called MI.		// The single candidate is called MI.
if (!MI) MI = Sub;		if (!MI) MI = Sub;

// We can't use a predicated instruction - it doesn't always write the flags.		// We can't use a predicated instruction - it doesn't always write the flags.
if (isPredicated(*MI))		if (isPredicated(*MI))
return false;		return false;

bool IsThumb1 = false;
switch (MI->getOpcode()) {
default: break;
case ARM::tLSLri:
case ARM::tLSRri:
case ARM::tLSLrr:
case ARM::tLSRrr:
case ARM::tSUBrr:
case ARM::tADDrr:
case ARM::tADDi3:
case ARM::tADDi8:
case ARM::tSUBi3:
case ARM::tSUBi8:
IsThumb1 = true;
LLVM_FALLTHROUGH;
case ARM::RSBrr:
case ARM::RSBri:
case ARM::RSCrr:
case ARM::RSCri:
case ARM::ADDrr:
case ARM::ADDri:
case ARM::ADCrr:
case ARM::ADCri:
case ARM::SUBrr:
case ARM::SUBri:
case ARM::SBCrr:
case ARM::SBCri:
case ARM::t2RSBri:
case ARM::t2ADDrr:
case ARM::t2ADDri:
case ARM::t2ADCrr:
case ARM::t2ADCri:
case ARM::t2SUBrr:
case ARM::t2SUBri:
case ARM::t2SBCrr:
case ARM::t2SBCri:
case ARM::ANDrr:
case ARM::ANDri:
case ARM::t2ANDrr:
case ARM::t2ANDri:
case ARM::ORRrr:
case ARM::ORRri:
case ARM::t2ORRrr:
case ARM::t2ORRri:
case ARM::EORrr:
case ARM::EORri:
case ARM::t2EORrr:
case ARM::t2EORri:
case ARM::t2LSRri:
case ARM::t2LSRrr:
case ARM::t2LSLri:
case ARM::t2LSLrr: {
// Scan forward for the use of CPSR		// Scan forward for the use of CPSR
// When checking against MI: if it's a conditional code that requires		// When checking against MI: if it's a conditional code that requires
// checking of the V bit or C bit, then this is not safe to do.		// checking of the V bit or C bit, then this is not safe to do.
// It is safe to remove CmpInstr if CPSR is redefined or killed.		// It is safe to remove CmpInstr if CPSR is redefined or killed.
// If we are done with the basic block, we need to check whether CPSR is		// If we are done with the basic block, we need to check whether CPSR is
// live-out.		// live-out.
SmallVector<std::pair<MachineOperand*, ARMCC::CondCodes>, 4>		SmallVector<std::pair<MachineOperand*, ARMCC::CondCodes>, 4>
OperandsToUpdate;		OperandsToUpdate;
bool isSafe = false;		bool isSafe = false;
I = CmpInstr;		I = CmpInstr;
E = CmpInstr.getParent()->end();		E = CmpInstr.getParent()->end();
while (!isSafe && ++I != E) {		while (!isSafe && ++I != E) {
const MachineInstr &Instr = *I;		const MachineInstr &Instr = *I;
for (unsigned IO = 0, EO = Instr.getNumOperands();		for (unsigned IO = 0, EO = Instr.getNumOperands();
!isSafe && IO != EO; ++IO) {		!isSafe && IO != EO; ++IO) {
const MachineOperand &MO = Instr.getOperand(IO);		const MachineOperand &MO = Instr.getOperand(IO);
if (MO.isRegMask() && MO.clobbersPhysReg(ARM::CPSR)) {		if (MO.isRegMask() && MO.clobbersPhysReg(ARM::CPSR)) {
isSafe = true;		isSafe = true;
break;		break;
}		}
if (!MO.isReg() || MO.getReg() != ARM::CPSR)		if (!MO.isReg() || MO.getReg() != ARM::CPSR)
continue;		continue;
if (MO.isDef()) {		if (MO.isDef()) {
isSafe = true;		isSafe = true;
break;		break;
}		}
// Condition code is after the operand before CPSR except for VSELs.		// Condition code is after the operand before CPSR except for VSELs.
ARMCC::CondCodes CC;		ARMCC::CondCodes CC;
bool IsInstrVSel = true;		bool IsInstrVSel = true;
switch (Instr.getOpcode()) {		switch (Instr.getOpcode()) {
default:		default:
IsInstrVSel = false;		IsInstrVSel = false;
CC = (ARMCC::CondCodes)Instr.getOperand(IO - 1).getImm();		CC = (ARMCC::CondCodes)Instr.getOperand(IO - 1).getImm();
break;		break;
case ARM::VSELEQD:		case ARM::VSELEQD:
case ARM::VSELEQS:		case ARM::VSELEQS:
CC = ARMCC::EQ;		CC = ARMCC::EQ;
break;		break;
case ARM::VSELGTD:		case ARM::VSELGTD:
case ARM::VSELGTS:		case ARM::VSELGTS:
CC = ARMCC::GT;		CC = ARMCC::GT;
break;		break;
case ARM::VSELGED:		case ARM::VSELGED:
case ARM::VSELGES:		case ARM::VSELGES:
CC = ARMCC::GE;		CC = ARMCC::GE;
break;		break;
case ARM::VSELVSS:		case ARM::VSELVSS:
case ARM::VSELVSD:		case ARM::VSELVSD:
CC = ARMCC::VS;		CC = ARMCC::VS;
break;		break;
}		}

if (Sub) {		if (Sub) {
ARMCC::CondCodes NewCC = getSwappedCondition(CC);		ARMCC::CondCodes NewCC = getSwappedCondition(CC);
if (NewCC == ARMCC::AL)		if (NewCC == ARMCC::AL)
return false;		return false;
// If we have SUB(r1, r2) and CMP(r2, r1), the condition code based		// If we have SUB(r1, r2) and CMP(r2, r1), the condition code based
// on CMP needs to be updated to be based on SUB.		// on CMP needs to be updated to be based on SUB.
// Push the condition code operands to OperandsToUpdate.		// Push the condition code operands to OperandsToUpdate.
// If it is safe to remove CmpInstr, the condition code of these		// If it is safe to remove CmpInstr, the condition code of these
// operands will be modified.		// operands will be modified.
if (SrcReg2 != 0 && Sub->getOperand(1).getReg() == SrcReg2 &&		if (SrcReg2 != 0 && Sub->getOperand(1).getReg() == SrcReg2 &&
Sub->getOperand(2).getReg() == SrcReg) {		Sub->getOperand(2).getReg() == SrcReg) {
// VSel doesn't support condition code update.		// VSel doesn't support condition code update.
if (IsInstrVSel)		if (IsInstrVSel)
return false;		return false;
OperandsToUpdate.push_back(		OperandsToUpdate.push_back(
std::make_pair(&((*I).getOperand(IO - 1)), NewCC));		std::make_pair(&((*I).getOperand(IO - 1)), NewCC));
}		}
} else {		} else {
// No Sub, so this is x = <op> y, z; cmp x, 0.		// No Sub, so this is x = <op> y, z; cmp x, 0.
switch (CC) {		switch (CC) {
case ARMCC::EQ: // Z		case ARMCC::EQ: // Z
case ARMCC::NE: // Z		case ARMCC::NE: // Z
case ARMCC::MI: // N		case ARMCC::MI: // N
case ARMCC::PL: // N		case ARMCC::PL: // N
case ARMCC::AL: // none		case ARMCC::AL: // none
// CPSR can be used multiple times, we should continue.		// CPSR can be used multiple times, we should continue.
break;		break;
case ARMCC::HS: // C		case ARMCC::HS: // C
case ARMCC::LO: // C		case ARMCC::LO: // C
case ARMCC::VS: // V		case ARMCC::VS: // V
case ARMCC::VC: // V		case ARMCC::VC: // V
case ARMCC::HI: // C Z		case ARMCC::HI: // C Z
case ARMCC::LS: // C Z		case ARMCC::LS: // C Z
case ARMCC::GE: // N V		case ARMCC::GE: // N V
case ARMCC::LT: // N V		case ARMCC::LT: // N V
case ARMCC::GT: // Z N V		case ARMCC::GT: // Z N V
case ARMCC::LE: // Z N V		case ARMCC::LE: // Z N V
// The instruction uses the V bit or C bit which is not safe.		// The instruction uses the V bit or C bit which is not safe.
return false;		return false;
}		}
}		}
}		}
}		}

// If CPSR is not killed nor re-defined, we should check whether it is		// If CPSR is not killed nor re-defined, we should check whether it is
// live-out. If it is live-out, do not optimize.		// live-out. If it is live-out, do not optimize.
if (!isSafe) {		if (!isSafe) {
MachineBasicBlock *MBB = CmpInstr.getParent();		MachineBasicBlock *MBB = CmpInstr.getParent();
for (MachineBasicBlock::succ_iterator SI = MBB->succ_begin(),		for (MachineBasicBlock::succ_iterator SI = MBB->succ_begin(),
SE = MBB->succ_end(); SI != SE; ++SI)		SE = MBB->succ_end(); SI != SE; ++SI)
if ((*SI)->isLiveIn(ARM::CPSR))		if ((*SI)->isLiveIn(ARM::CPSR))
return false;		return false;
}		}

// Toggle the optional operand to CPSR (if it exists - in Thumb1 we always		// Toggle the optional operand to CPSR (if it exists - in Thumb1 we always
// set CPSR so this is represented as an explicit output)		// set CPSR so this is represented as an explicit output)
if (!IsThumb1) {		if (!IsThumb1) {
MI->getOperand(5).setReg(ARM::CPSR);		MI->getOperand(5).setReg(ARM::CPSR);
MI->getOperand(5).setIsDef(true);		MI->getOperand(5).setIsDef(true);
}		}
assert(!isPredicated(*MI) && "Can't use flags from predicated instruction");		assert(!isPredicated(*MI) && "Can't use flags from predicated instruction");
CmpInstr.eraseFromParent();		CmpInstr.eraseFromParent();

// Modify the condition code of operands in OperandsToUpdate.		// Modify the condition code of operands in OperandsToUpdate.
// Since we have SUB(r1, r2) and CMP(r2, r1), the condition code needs to		// Since we have SUB(r1, r2) and CMP(r2, r1), the condition code needs to
// be changed from r2 > r1 to r1 < r2, from r2 < r1 to r1 > r2, etc.		// be changed from r2 > r1 to r1 < r2, from r2 < r1 to r1 > r2, etc.
for (unsigned i = 0, e = OperandsToUpdate.size(); i < e; i++)		for (unsigned i = 0, e = OperandsToUpdate.size(); i < e; i++)
OperandsToUpdate[i].first->setImm(OperandsToUpdate[i].second);		OperandsToUpdate[i].first->setImm(OperandsToUpdate[i].second);
return true;
}
}

return false;		return true;
}		}

bool ARMBaseInstrInfo::FoldImmediate(MachineInstr &UseMI, MachineInstr &DefMI,		bool ARMBaseInstrInfo::FoldImmediate(MachineInstr &UseMI, MachineInstr &DefMI,
unsigned Reg,		unsigned Reg,
MachineRegisterInfo *MRI) const {		MachineRegisterInfo *MRI) const {
// Fold large immediates into add, sub, or, xor.		// Fold large immediates into add, sub, or, xor.
unsigned DefOpc = DefMI.getOpcode();		unsigned DefOpc = DefMI.getOpcode();
if (DefOpc != ARM::t2MOVi32imm && DefOpc != ARM::MOVi32imm)		if (DefOpc != ARM::t2MOVi32imm && DefOpc != ARM::MOVi32imm)
[2,025 lines hidden]

llvm/trunk/test/CodeGen/ARM/cmp1-peephole-thumb.mir

				# RUN: llc -run-pass=peephole-opt %s -o - | FileCheck %s

				--- |
				; ModuleID = '<stdin>'
				source_filename = "<stdin>"
				target datalayout = "e-m:e-p:32:32-i64:64-v128:64:128-a:0:32-n32-S64"
				target triple = "thumb-none--eabi"

				define i32 @f(i32 %a, i32 %b) {
				entry:
				%mul = mul nsw i32 %b, %a
				%cmp = icmp eq i32 %mul, 0
				%conv = zext i1 %cmp to i32
				ret i32 %conv
				}

				...
				---
				name: f
				# CHECK-LABEL: name: f
				alignment: 1
				exposesReturnsTwice: false
				legalized: false
				regBankSelected: false
				selected: false
				tracksRegLiveness: true
				registers:
				- { id: 0, class: tgpr }
				- { id: 1, class: tgpr }
				- { id: 2, class: tgpr }
				- { id: 3, class: tgpr }
				- { id: 4, class: tgpr }
				- { id: 5, class: tgpr }
				liveins:
				- { reg: '%r0', virtual-reg: '%0' }
				- { reg: '%r1', virtual-reg: '%1' }
				frameInfo:
				isFrameAddressTaken: false
				isReturnAddressTaken: false
				hasStackMap: false
				hasPatchPoint: false
				stackSize: 0
				offsetAdjustment: 0
				maxAlignment: 0
				adjustsStack: false
				hasCalls: false
				maxCallFrameSize: 0
				hasOpaqueSPAdjustment: false
				hasVAStart: false
				hasMustTailInVarArgFunc: false

				# CHECK: tMOVi8 1, 14, _
				# CHECK: tMOVi8 0, 14, _
				# CHECK: tMUL %1, %0, 14, _
				# CHECK-NOT: tCMPi8
				body: |
				bb.0.entry:
				successors: %bb.1.entry(0x40000000), %bb.2.entry(0x40000000)
				liveins: %r0, %r1

				%1 = COPY %r1
				%0 = COPY %r0
				%2, %cpsr = tMUL %1, %0, 14, _
				%3, %cpsr = tMOVi8 1, 14, _
				%4, %cpsr = tMOVi8 0, 14, _
				tCMPi8 killed %2, 0, 14, _, implicit-def %cpsr
				tBcc %bb.2.entry, 0, %cpsr

				bb.1.entry:
				successors: %bb.2.entry(0x80000000)


				bb.2.entry:
				%5 = PHI %4, %bb.1.entry, %3, %bb.0.entry
				%r0 = COPY %5
				tBX_RET 14, _, implicit %r0

				...

llvm/trunk/test/CodeGen/ARM/cmp2-peephole-thumb.mir

				# RUN: llc -run-pass=peephole-opt %s -o - | FileCheck %s

				# Here we check that the peephole cmp rewrite is not triggered, because
				# there is store instruction between the tMUL and tCMP, i.e. there are
				# no constants to reorder.

				--- |
				; ModuleID = 'cmp2-peephole-thumb.ll'
				source_filename = "<stdin>"
				target datalayout = "e-m:e-p:32:32-i64:64-v128:64:128-a:0:32-n32-S64"
				target triple = "thumb-none--eabi"

				define i32 @g(i32 %a, i32 %b) {
				entry:
				%retval = alloca i32, align 4
				%mul = alloca i32, align 4
				%mul1 = mul nsw i32 %a, %b
				store i32 %mul1, i32* %mul, align 4
				%0 = load i32, i32* %mul, align 4
				%cmp = icmp sle i32 %0, 0
				br i1 %cmp, label %if.then, label %if.end

				if.then: ; preds = %entry
				store i32 42, i32* %retval, align 4
				br label %return

				if.end: ; preds = %entry
				store i32 1, i32* %retval, align 4
				br label %return

				return: ; preds = %if.end, %if.then
				%1 = load i32, i32* %retval, align 4
				ret i32 %1
				}

				...
				---
				name: g
				# CHECK-LABEL: name: g
				alignment: 1
				exposesReturnsTwice: false
				legalized: false
				regBankSelected: false
				selected: false
				tracksRegLiveness: true
				registers:
				- { id: 0, class: tgpr }
				- { id: 1, class: tgpr }
				- { id: 2, class: tgpr }
				- { id: 3, class: tgpr }
				- { id: 4, class: tgpr }
				- { id: 5, class: tgpr }
				liveins:
				- { reg: '%r0', virtual-reg: '%0' }
				- { reg: '%r1', virtual-reg: '%1' }
				frameInfo:
				isFrameAddressTaken: false
				isReturnAddressTaken: false
				hasStackMap: false
				hasPatchPoint: false
				stackSize: 0
				offsetAdjustment: 0
				maxAlignment: 4
				adjustsStack: false
				hasCalls: false
				maxCallFrameSize: 0
				hasOpaqueSPAdjustment: false
				hasVAStart: false
				hasMustTailInVarArgFunc: false
				stack:
				- { id: 0, name: retval, offset: 0, size: 4, alignment: 4, local-offset: -4 }
				- { id: 1, name: mul, offset: 0, size: 4, alignment: 4, local-offset: -8 }

				# CHECK: tMUL
				# CHECK-NEXT: tSTRspi
				# CHECK-NEXT: tCMPi8
				body: |
				bb.0.entry:
				successors: %bb.1.if.then(0x40000000), %bb.2.if.end(0x40000000)
				liveins: %r0, %r1

				%1 = COPY %r1
				%0 = COPY %r0
				%2, %cpsr = tMUL %0, %1, 14, _
				tSTRspi %2, %stack.1.mul, 0, 14, _ :: (store 4 into %ir.mul)
				tCMPi8 %2, 0, 14, _, implicit-def %cpsr
				tBcc %bb.2.if.end, 12, %cpsr
				tB %bb.1.if.then, 14, _

				bb.1.if.then:
				successors: %bb.3.return(0x80000000)

				%4, %cpsr = tMOVi8 42, 14, _
				tSTRspi killed %4, %stack.0.retval, 0, 14, _ :: (store 4 into %ir.retval)
				tB %bb.3.return, 14, _

				bb.2.if.end:
				successors: %bb.3.return(0x80000000)

				%3, %cpsr = tMOVi8 1, 14, _
				tSTRspi killed %3, %stack.0.retval, 0, 14, _ :: (store 4 into %ir.retval)

				bb.3.return:
				%5 = tLDRspi %stack.0.retval, 0, 14, _ :: (dereferenceable load 4 from %ir.retval)
				%r0 = COPY %5
				tBX_RET 14, _, implicit %r0

				...