This is an archive of the discontinued LLVM Phabricator instance.

Differential D18838

[AArch64][CodeGen] Fix of incorrect peephole optimization in AArch64InstrInfo::optimizeCompareInstr
ClosedPublic

Authored by eastig on Apr 6 2016, 11:38 AM.

Details

Reviewers

t.p.northover
jmolloy

Commits

rGfd89fe0dd363: [AArch64][CodeGen] Fix of PR27158: incorrect peephole optimization in…
rL266969: [AArch64][CodeGen] Fix of PR27158: incorrect peephole optimization in…

Summary

AArch64InstrInfo::optimizeCompareInstr has bug PR27158, which causes incorrect code to be generated. Details can be found here: https://llvm.org/bugs/show_bug.cgi?id=27158

Fix:

First, the condition code used after CmpInstr and before the next modification of NZCV is found. The optimization is not applied if different condition codes are used, because it might be difficult to find a substitution candidate that satisfies all of them. I think the case of multiple used condition codes does not happen often.
Then 'canInstrSubstituteCmpInstr' checks whether the instruction that defines the register used by CmpInstr, or the S variant of that instruction, can produce the needed condition code. If it can, CmpInstr is removed.

A regression test is added.
A new test to check that SUBS is replaced by SUB is added.

Diff Detail

Repository: rL LLVM

Event Timeline

eastig updated this revision to Diff 52827. Apr 6 2016, 11:38 AM

eastig retitled this revision to [AArch64][CodeGen] Fix of incorrect peephole optimization in AArch64InstrInfo::optimizeCompareInstr.

eastig updated this object.

eastig added a reviewer: jmolloy.

eastig added a subscriber: llvm-commits.

Herald added subscribers: rengolin, aemerson. Apr 6 2016, 11:38 AM

eastig updated this object. Apr 6 2016, 11:39 AM

t.p.northover added a subscriber: t.p.northover. Apr 6 2016, 1:01 PM

t.p.northover added inline comments.

lib/Target/AArch64/AArch64InstrInfo.cpp
975–977 ↗	(On Diff #52827)	I think the documentation should call out the fact that it really does check whether all uses are equivalent.
981 ↗	(On Diff #52827)	Why only NE? At the very least NE/EQ are pretty equivalent.
985–988 ↗	(On Diff #52827)	I don't follow this logic. A pure ADD should never produce any CondCode, and I'm not sure why an FI operand is involved at all.
1007–1033 ↗	(On Diff #52827)	I think this is probably a bit too tricksy, and both less efficient and less clear than the naive loop. I also think that looking at this from the position of CondCodes is probably the wrong level. The real distinction is which bits of NZCV are used (in particular, ADDS sets C & V differently from SUBS, as I recall, so mixing tests that only use Z & N is fine). With both of those observations, I think this function reduces to something like:
    for (MI = CmpInstr; MI != instr_end; ++MI) {
      if (MI->readsRegister(NZCV))
        NZCVUsed |= findNZCVUsedByInstr(MI);
      if (MI->modifiesRegister(NZCV)) {
        // Possibly return 0b1111 if MI also reads.
        return NZCVUsed;
      }
    }
1034 ↗	(On Diff #52827)	I think we should also check (or assert, if also checked before) that NZCV doesn't escape the basic block.
1043 ↗	(On Diff #52827)	We definitely shouldn't be relying on specific ordering like this, even if it is alphabetical.
1068–1071 ↗	(On Diff #52827)	These are unconditional (i.e. work on all possible AArch64CC) aren't they? Anyway, I think (with the repurposing of findUsedCondCodeAfterCmp suggested above) that the NZCV part of this function is really:
    if (sameSForm(..., isADDS) || sameSForm(..., isSUBS))
      return true;
    auto NZCVUsed = findUsedNZCVAfterCMP(...);
    return !NZCVUsed.C && !NZCVUsed.V;
1094 ↗	(On Diff #52827)	This completely short-circuits the other walks you're adding doesn't it? (E.g. findUsedCondCodeAfterCmp will never find more than one use).
test/CodeGen/AArch64/arm64-regress-opt-cmp.ll
1 ↗	(On Diff #52827)	I think these tests are rather weak given how much code is changing and the complexity of the result. I'd probably suggest a MIR test for this one; they can be a bit temperamental to get started, but allow you to exercise many more edge cases quickly.
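
The loop suggested in the comment on lines 1007–1033 can be tried in isolation. The following is only a sketch under stated assumptions: `ToyInstr`, the `Flag*` bit values, and both function names are hypothetical stand-ins invented for illustration, not LLVM APIs. It ORs together the NZCV bits read after the compare, stops at the first redefinition of the flags, and treats the compare as removable only when neither C nor V is consumed.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a machine instruction; only its flag
// behaviour matters for this sketch.
struct ToyInstr {
  uint8_t FlagsRead;  // bitmask of NZCV bits the instruction reads
  bool DefinesFlags;  // true if the instruction writes NZCV
};

// Illustrative bit assignment (an assumption, not taken from LLVM).
enum : uint8_t { FlagN = 8, FlagZ = 4, FlagC = 2, FlagV = 1 };

// Collect every NZCV bit read after the compare. The walk stops at the
// first instruction that redefines the flags; if that instruction also
// reads them, its reads are counted before stopping.
uint8_t flagsUsedAfterCompare(const std::vector<ToyInstr> &AfterCmp) {
  uint8_t Used = 0;
  for (const ToyInstr &I : AfterCmp) {
    Used |= I.FlagsRead;
    if (I.DefinesFlags)
      break;
  }
  return Used;
}

// The compare can go only if neither C nor V is consumed afterwards.
bool compareIsRemovable(const std::vector<ToyInstr> &AfterCmp) {
  return (flagsUsedAfterCompare(AfterCmp) & (FlagC | FlagV)) == 0;
}
```

Reads that occur after the flags have been redefined do not count, which is why a later user of C does not block the optimization once another flag-setting instruction intervenes.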

Hi Tim,

Thank you for comments. They are very useful.
See my answers.

lib/Target/AArch64/AArch64InstrInfo.cpp
975–977 ↗	(On Diff #52827)	It seems this function name is too general and does not correspond to what the function does. It was based on two tests: CodeGen/AArch64/arm64-arm64-dead-def-elimination-flag.ll and CodeGen/AArch64/arm64-dead-def-frame-index.ll. The tests compare the result of an 'alloca' with null and expect the compare operation to be removed. The instruction sequence is ADDX+SUBSX+CSINCW. I overcomplicated things.
981 ↗	(On Diff #52827)	You are right.
985–988 ↗	(On Diff #52827)	The logic is wrong.
1007–1033 ↗	(On Diff #52827)	Yes, you are right.
1034 ↗	(On Diff #52827)	Do you mean to check if flags are alive in successors of the basic block? If yes, this is checked in substituteCmpInstr.
1043 ↗	(On Diff #52827)	It's a pity. Maybe such functions already exist?
1068–1071 ↗	(On Diff #52827)	You are right.
1094 ↗	(On Diff #52827)	It checks accesses (read, write) before Cmp and after MI. It was in the original code. I think the check was made stricter than necessary to keep the code simple, because such a situation is rare, if it ever occurs.
test/CodeGen/AArch64/arm64-regress-opt-cmp.ll
1 ↗	(On Diff #52827)	I fully agree with you. I tried to write a simpler test but I failed to write it in IR. How can I write in MIR?

t.p.northover added inline comments. Apr 7 2016, 4:19 PM

lib/Target/AArch64/AArch64InstrInfo.cpp
1034 ↗	(On Diff #52827)	Yep, that's what I meant (I noticed it later). I think an assertion is probably still a good idea somewhere in the function (as documentation that anyone reading it shouldn't bother worrying, basically).
1043 ↗	(On Diff #52827)	Not as far as I'm aware, but I expect an explicit "Opcode == A || Opcode == B || ..." to be just as efficient when compiled. More verbose, but less worrying.
1094 ↗	(On Diff #52827)	Yep, but to me it looks like your code already handles this properly, and would permit the optimization in more cases (albeit rare ones, as you say).
test/CodeGen/AArch64/arm64-regress-opt-cmp.ll
1 ↗	(On Diff #52827)	The hardest part last time I did it was making sure the pass was registered properly for it (hint: make sure initializeXYZPass gets called). Fortunately for you, it looks like this is already done properly for the peephole optimizer so you'd use "llc -run-pass=peephole-opts /path/to/file.mir" in the RUN line. Other than that, to get a basic MIR file to test you could either run "llc -stop-after=some-pass" (useful if you don't know quite how to write some construct) or just copy an existing one and modify it as needed. The format is less obvious and documented than LLVM IR so writing one from fresh is not a good idea (yet, at least).

Updated the patch according to Tim's comments.

Ping.

t.p.northover added inline comments. Apr 18 2016, 12:13 PM

lib/Target/AArch64/AArch64InstrInfo.cpp
993 ↗	(On Diff #53407)	Can you explicitly annotate this fall-through?
1005–1006 ↗	(On Diff #53407)	This shouldn't be a fallthrough if I'm reading the manual correctly (definition of `ConditionHolds` on page J1-5267 for example).
1044–1045 ↗	(On Diff #53407)	Do we ever check that they're comparing against the same value as MI? (Always 0 I believe).
1053–1055 ↗	(On Diff #53407)	What's allowed between MI and CmpInstr depends on what MI is: If it's an ADDS/SUBS already then any use of flags is permitted and only definitions are wrong. If it's an ADD/SUB then neither uses nor defs are allowed (of any flags). Uses imply NZCV is already live and we're going to clobber it; defs imply it would be clobbered before it reached CmpInstr.
1083–1084 ↗	(On Diff #53407)	I think this function should either return an NZCVUsed (to be checked by the caller) or check the forms of MI and CmpInstr itself. If they're SUBS/CMP or ADDS/CMN then the substitution is valid regardless of flags used.
test/CodeGen/AArch64/arm64-regress-opt-cmp.mir
1 ↗	(On Diff #53407)	This test has lots of extra cruft and only ends up checking one instance anyway. The point of MIR tests is that we can exercise more aspects of the logic than from IR alone.
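
The rule stated in the comment on lines 1053–1055 can be written down as a small predicate. This is a minimal sketch, assuming a hypothetical `FlagAccess` classification of the instructions strictly between MI and CmpInstr; neither the enum nor the function name exists in LLVM. If MI already sets flags (ADDS/SUBS), only definitions in between are rejected. If MI is a plain ADD/SUB, both uses and definitions are rejected.

```cpp
#include <cassert>
#include <vector>

// Hypothetical classification of how one instruction touches NZCV.
enum class FlagAccess { None, Read, Write, ReadWrite };

// Returns true if the instructions between the defining instruction and
// the compare allow the compare to be folded into the definition.
bool interveningAccessesAllowFold(bool DefAlreadySetsFlags,
                                  const std::vector<FlagAccess> &Between) {
  for (FlagAccess A : Between) {
    bool Writes = A == FlagAccess::Write || A == FlagAccess::ReadWrite;
    bool Reads = A == FlagAccess::Read || A == FlagAccess::ReadWrite;
    // A definition would clobber the flags before they reach the compare.
    if (Writes)
      return false;
    // A use means NZCV is live here; turning a plain ADD/SUB into its
    // S form would clobber the value that this use expects.
    if (Reads && !DefAlreadySetsFlags)
      return false;
  }
  return true;
}
```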

Hi Tim,

Thank you for comments. See my answers.
I am updating the patch. Need some time. Quite busy on armcc.

lib/Target/AArch64/AArch64InstrInfo.cpp
993 ↗	(On Diff #53407)	I will add comments to each case to show which flags are used.
1005–1006 ↗	(On Diff #53407)	Good catch. Thank you. It's a bug. A 'break' is missing.
1044–1045 ↗	(On Diff #53407)	I will rename 'substituteCmpInstr' to 'substituteCmpToZero' to make clear which case is handled. I don't think we need to check that MI and CmpInstr use the same value. I added these checks because 'optimizeCompareInstr' is called for instructions which are supported by 'analyzeCompare'. Currently they are ADDS, SUBS and ANDS. We cannot substitute ANDS because 'ANDS vreg, 0' always produces 0. We can substitute 'ANDS vreg, -1' but it's not a comparison with zero. Is this case worth supporting?
1053–1055 ↗	(On Diff #53407)	You are right. I would rewrite this as follows: If MI opcode is the S form there must be no defs of flags. If MI opcode is not the S form there must be neither defs of flags nor uses of flags.
1083–1084 ↗	(On Diff #53407)	We can only be sure for N and Z flags. SUBS/CMP or ADDS/CMN can produce different C and V flags, e.g.
    %vr2 = SUBS %vr1, 1 ; sets C to 0 when %vr1 == 0
    %cmp = SUBS %vr2, 0 ; sets C to 1
test/CodeGen/AArch64/arm64-regress-opt-cmp.mir
1 ↗	(On Diff #53407)	Yes, it looks cumbersome. I am reducing it as much as possible.
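
The SUBS example in the reply on lines 1083–1084 can be checked with a small flag model. This is a sketch rather than LLVM code: `Flags` and `subsFlags` are hypothetical names, and the model assumes the usual AArch64 semantics for a 32-bit SUBS, where C is set when the subtraction does not borrow and V is set on signed overflow.

```cpp
#include <cassert>
#include <cstdint>

struct Flags {
  bool N, Z, C, V;
};

// Flags produced by a 32-bit SUBS computing Result = A - B.
Flags subsFlags(uint32_t A, uint32_t B, uint32_t &Result) {
  Result = A - B;
  Flags F;
  F.N = (Result >> 31) != 0;  // result is negative
  F.Z = Result == 0;          // result is zero
  F.C = A >= B;               // no borrow occurred
  // Signed overflow: operands differ in sign and the result's sign
  // differs from A's.
  F.V = (((A ^ B) & (A ^ Result)) >> 31) != 0;
  return F;
}
```

With an input of 0, `SUBS x, 1` clears C while the following `SUBS result, 0` sets it, yet N and Z agree. With an input of 0x80000000 the V flags differ in the same way. This matches the conclusion in the thread that only N and Z can be relied on.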

t.p.northover added inline comments. Apr 19 2016, 1:10 PM

lib/Target/AArch64/AArch64InstrInfo.cpp
1044–1045 ↗	(On Diff #53407)	Sorry about that, I was talking nonsense. I forgot that SrcReg was the destination of the candidate instruction so comparison against 0 was automatic if flags were set. No need to support anything else.
1083–1084 ↗	(On Diff #53407)	Oh, of course. Same flawed reasoning from me as above I think.

Updated according to Tim's comments

Thanks for the updates. This looks good to me now.

Tim.

lib/Target/AArch64/AArch64InstrInfo.cpp
1058–1059 ↗	(On Diff #54365)	Nice implementation!

This revision is now accepted and ready to land. Apr 20 2016, 1:21 PM

Closed by commit rL266969: [AArch64][CodeGen] Fix of PR27158: incorrect peephole optimization in… (authored by eastig). Apr 21 2016, 1:59 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path	Size
llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.h	4 lines
llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.cpp	228 lines
llvm/trunk/test/CodeGen/AArch64/arm64-regress-opt-cmp.mir	41 lines
llvm/trunk/test/CodeGen/AArch64/subs-to-sub-opt.ll	23 lines

Diff 54468

llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.h

[200 lines not shown; context: public:]
   getSerializableDirectMachineOperandTargetFlags() const override;
   ArrayRef<std::pair<unsigned, const char *>>
   getSerializableBitmaskMachineOperandTargetFlags() const override;
 
 private:
   void instantiateCondBranch(MachineBasicBlock &MBB, DebugLoc DL,
                              MachineBasicBlock *TBB,
                              ArrayRef<MachineOperand> Cond) const;
-  bool substituteCmpInstr(MachineInstr *CmpInstr,
+  bool substituteCmpToZero(MachineInstr *CmpInstr,
       unsigned SrcReg, const MachineRegisterInfo *MRI) const;
 };
 
 /// emitFrameOffset - Emit instructions as needed to set DestReg to SrcReg
 /// plus Offset. This is intended to be used from within the prolog/epilog
 /// insertion (PEI) pass, where a virtual scratch register may be allocated
 /// if necessary, to be replaced by the scavenger at the end of PEI.
 void emitFrameOffset(MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI,
                      DebugLoc DL, unsigned DestReg, unsigned SrcReg, int Offset,
[61 lines not shown]

llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.cpp

[876 lines not shown; context: bool AArch64InstrInfo::optimizeCompareInstr(]
   assert((CmpValue == 0 || CmpValue == 1) && "CmpValue must be 0 or 1!");
   if (CmpValue != 0 || SrcReg2 != 0)
     return false;
 
   // CmpInstr is a Compare instruction if destination register is not used.
   if (!MRI->use_nodbg_empty(CmpInstr->getOperand(0).getReg()))
     return false;
 
-  return substituteCmpInstr(CmpInstr, SrcReg, MRI);
+  return substituteCmpToZero(CmpInstr, SrcReg, MRI);
 }
 
 /// Get opcode of S version of Instr.
 /// If Instr is S version its opcode is returned.
 /// AArch64::INSTRUCTION_LIST_END is returned if Instr does not have S version
 /// or we are not interested in it.
 static unsigned sForm(MachineInstr &Instr) {
   switch (Instr.getOpcode()) {
[30 lines not shown]
 /// Check if AArch64::NZCV should be alive in successors of MBB.
 static bool areCFlagsAliveInSuccessors(MachineBasicBlock *MBB) {
   for (auto *BB : MBB->successors())
     if (BB->isLiveIn(AArch64::NZCV))
       return true;
   return false;
 }
 
-/// Substitute CmpInstr with another instruction which produces a needed
-/// condition code.
-/// Return true on success.
-bool AArch64InstrInfo::substituteCmpInstr(MachineInstr *CmpInstr,
-    unsigned SrcReg, const MachineRegisterInfo *MRI) const {
-  // Get the unique definition of SrcReg.
-  MachineInstr *MI = MRI->getUniqueVRegDef(SrcReg);
-  if (!MI)
-    return false;
-
-  const TargetRegisterInfo *TRI = &getRegisterInfo();
-  if (areCFlagsAccessedBetweenInstrs(MI, CmpInstr, TRI))
-    return false;
+struct UsedNZCV {
+  bool N;
+  bool Z;
+  bool C;
+  bool V;
+  UsedNZCV(): N(false), Z(false), C(false), V(false) {}
+  UsedNZCV& operator |=(const UsedNZCV& UsedFlags) {
+    this->N |= UsedFlags.N;
+    this->Z |= UsedFlags.Z;
+    this->C |= UsedFlags.C;
+    this->V |= UsedFlags.V;
+    return *this;
+  }
+};
 
-  unsigned NewOpc = sForm(*MI);
-  if (NewOpc == AArch64::INSTRUCTION_LIST_END)
-    return false;
+/// Find a condition code used by the instruction.
+/// Returns AArch64CC::Invalid if either the instruction does not use condition
+/// codes or we don't optimize CmpInstr in the presence of such instructions.
+static AArch64CC::CondCode findCondCodeUsedByInstr(const MachineInstr &Instr) {
+  switch (Instr.getOpcode()) {
+  default:
+    return AArch64CC::Invalid;
 
-  // Scan forward for the use of NZCV.
-  // When checking against MI: if it's a conditional code requires
-  // checking of V bit, then this is not safe to do.
-  // It is safe to remove CmpInstr if NZCV is redefined or killed.
-  // If we are done with the basic block, we need to check whether NZCV is
-  // live-out.
-  bool IsSafe = false;
-  for (MachineBasicBlock::iterator I = CmpInstr,
-                                   E = CmpInstr->getParent()->end();
-       !IsSafe && ++I != E;) {
-    const MachineInstr &Instr = *I;
-    for (unsigned IO = 0, EO = Instr.getNumOperands(); !IsSafe && IO != EO;
-         ++IO) {
-      const MachineOperand &MO = Instr.getOperand(IO);
-      if (MO.isRegMask() && MO.clobbersPhysReg(AArch64::NZCV)) {
-        IsSafe = true;
-        break;
-      }
-      if (!MO.isReg() || MO.getReg() != AArch64::NZCV)
-        continue;
-      if (MO.isDef()) {
-        IsSafe = true;
-        break;
+  case AArch64::Bcc: {
+    int Idx = Instr.findRegisterUseOperandIdx(AArch64::NZCV);
+    assert(Idx >= 2);
+    return static_cast<AArch64CC::CondCode>(Instr.getOperand(Idx - 2).getImm());
   }
 
-      // Decode the condition code.
-      unsigned Opc = Instr.getOpcode();
-      AArch64CC::CondCode CC;
-      switch (Opc) {
-      default:
-        return false;
-      case AArch64::Bcc:
-        CC = (AArch64CC::CondCode)Instr.getOperand(IO - 2).getImm();
-        break;
   case AArch64::CSINVWr:
   case AArch64::CSINVXr:
   case AArch64::CSINCWr:
   case AArch64::CSINCXr:
   case AArch64::CSELWr:
   case AArch64::CSELXr:
   case AArch64::CSNEGWr:
   case AArch64::CSNEGXr:
   case AArch64::FCSELSrrr:
-      case AArch64::FCSELDrrr:
-        CC = (AArch64CC::CondCode)Instr.getOperand(IO - 1).getImm();
-        break;
+  case AArch64::FCSELDrrr: {
+    int Idx = Instr.findRegisterUseOperandIdx(AArch64::NZCV);
+    assert(Idx >= 1);
+    return static_cast<AArch64CC::CondCode>(Instr.getOperand(Idx - 1).getImm());
+  }
+  }
 }
 
-      // It is not safe to remove Compare instruction if Overflow(V) is used.
+static UsedNZCV getUsedNZCV(AArch64CC::CondCode CC) {
+  assert(CC != AArch64CC::Invalid);
+  UsedNZCV UsedFlags;
   switch (CC) {
   default:
-        // NZCV can be used multiple times, we should continue.
     break;
-      case AArch64CC::VS:
-      case AArch64CC::VC:
-      case AArch64CC::GE:
-      case AArch64CC::LT:
-      case AArch64CC::GT:
-      case AArch64CC::LE:
+
+  case AArch64CC::EQ: // Z set
+  case AArch64CC::NE: // Z clear
+    UsedFlags.Z = true;
+    break;
+
+  case AArch64CC::HI: // Z clear and C set
+  case AArch64CC::LS: // Z set or C clear
+    UsedFlags.Z = true;
+  case AArch64CC::HS: // C set
+  case AArch64CC::LO: // C clear
+    UsedFlags.C = true;
+    break;
+
+  case AArch64CC::MI: // N set
+  case AArch64CC::PL: // N clear
+    UsedFlags.N = true;
+    break;
+
+  case AArch64CC::VS: // V set
+  case AArch64CC::VC: // V clear
+    UsedFlags.V = true;
+    break;
+
+  case AArch64CC::GT: // Z clear, N and V the same
+  case AArch64CC::LE: // Z set, N and V differ
+    UsedFlags.Z = true;
+  case AArch64CC::GE: // N and V the same
+  case AArch64CC::LT: // N and V differ
+    UsedFlags.N = true;
+    UsedFlags.V = true;
+    break;
+  }
+  return UsedFlags;
+}
+
+static bool isADDSRegImm(unsigned Opcode) {
+  return Opcode == AArch64::ADDSWri || Opcode == AArch64::ADDSXri;
+}
+
+static bool isSUBSRegImm(unsigned Opcode) {
+  return Opcode == AArch64::SUBSWri || Opcode == AArch64::SUBSXri;
+}
+
+/// Check if CmpInstr can be substituted by MI.
+///
+/// CmpInstr can be substituted:
+/// - CmpInstr is either 'ADDS %vreg, 0' or 'SUBS %vreg, 0'
+/// - and, MI and CmpInstr are from the same MachineBB
+/// - and, condition flags are not alive in successors of the CmpInstr parent
+/// - and, if MI opcode is the S form there must be no defs of flags between
+///        MI and CmpInstr
+///        or if MI opcode is not the S form there must be neither defs of flags
+///        nor uses of flags between MI and CmpInstr.
+/// - and  C/V flags are not used after CmpInstr
+static bool canInstrSubstituteCmpInstr(MachineInstr *MI, MachineInstr *CmpInstr,
+    const TargetRegisterInfo *TRI) {
+  assert(MI);
+  assert(sForm(*MI) != AArch64::INSTRUCTION_LIST_END);
+  assert(CmpInstr);
+
+  const unsigned CmpOpcode = CmpInstr->getOpcode();
+  if (!isADDSRegImm(CmpOpcode) && !isSUBSRegImm(CmpOpcode))
+    return false;
+
+  if (MI->getParent() != CmpInstr->getParent())
+    return false;
+
+  if (areCFlagsAliveInSuccessors(CmpInstr->getParent()))
+    return false;
+
+  AccessKind AccessToCheck = AK_Write;
+  if (sForm(*MI) != MI->getOpcode())
+    AccessToCheck = AK_All;
+  if (areCFlagsAccessedBetweenInstrs(MI, CmpInstr, TRI, AccessToCheck))
+    return false;
+
+  UsedNZCV NZCVUsedAfterCmp;
+  for (auto I = std::next(CmpInstr->getIterator()), E = CmpInstr->getParent()->instr_end();
+       I != E; ++I) {
+    const MachineInstr &Instr = *I;
+    if (Instr.readsRegister(AArch64::NZCV, TRI)) {
+      AArch64CC::CondCode CC = findCondCodeUsedByInstr(Instr);
+      if (CC == AArch64CC::Invalid) // Unsupported conditional instruction
         return false;
+      NZCVUsedAfterCmp |= getUsedNZCV(CC);
     }
 
+    if (Instr.modifiesRegister(AArch64::NZCV, TRI))
+      break;
   }
 
+  return !NZCVUsedAfterCmp.C && !NZCVUsedAfterCmp.V;
 }
 
-  // If NZCV is not killed nor re-defined, we should check whether it is
-  // live-out. If it is live-out, do not optimize.
-  if (!IsSafe && areCFlagsAliveInSuccessors(CmpInstr->getParent()))
+/// Substitute an instruction comparing to zero with another instruction
+/// which produces needed condition flags.
+///
+/// Return true on success.
+bool AArch64InstrInfo::substituteCmpToZero(MachineInstr *CmpInstr,
+    unsigned SrcReg, const MachineRegisterInfo *MRI) const {
+  assert(CmpInstr);
+  assert(MRI);
+  // Get the unique definition of SrcReg.
+  MachineInstr *MI = MRI->getUniqueVRegDef(SrcReg);
+  if (!MI)
+    return false;
+
+  const TargetRegisterInfo *TRI = &getRegisterInfo();
+
+  unsigned NewOpc = sForm(*MI);
+  if (NewOpc == AArch64::INSTRUCTION_LIST_END)
+    return false;
+
+  if (!canInstrSubstituteCmpInstr(MI, CmpInstr, TRI))
     return false;
 
   // Update the instruction to set NZCV.
   MI->setDesc(get(NewOpc));
   CmpInstr->eraseFromParent();
   bool succeeded = UpdateOperandRegClass(MI);
   (void)succeeded;
   assert(succeeded && "Some operands reg class are incompatible!");
[2,334 lines not shown]

llvm/trunk/test/CodeGen/AArch64/arm64-regress-opt-cmp.mir

# RUN: llc -mtriple=aarch64-linux-gnu -run-pass peephole-opts %s 2>&1 | FileCheck %s
# CHECK: %1 = ANDWri {{.*}}
# CHECK-NEXT: %wzr = SUBSWri {{.*}}
--- |
  define i32 @test01() nounwind {
  entry:
    %0 = select i1 true, i32 1, i32 0
    %1 = and i32 %0, 65535
    %2 = icmp ugt i32 %1, 0
    br i1 %2, label %if.then, label %if.end

  if.then: ; preds = %entry
    ret i32 1

  if.end: ; preds = %entry
    ret i32 0
  }
...
---
name: test01
registers:
  - { id: 0, class: gpr32 }
  - { id: 1, class: gpr32common }
body: |
  bb.0.entry:
    successors: %bb.2.if.end, %bb.1.if.then

    %0 = MOVi32imm 1
    %1 = ANDWri killed %1, 15
    %wzr = SUBSWri killed %1, 0, 0, implicit-def %nzcv
    Bcc 9, %bb.2.if.end, implicit %nzcv

  bb.1.if.then:
    %w0 = MOVi32imm 1
    RET_ReallyLR implicit %w0

  bb.2.if.end:
    %w0 = MOVi32imm 0
    RET_ReallyLR implicit %w0

...
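
The branch in this test is `Bcc 9`, and condition encoding 9 is LS, which reads both Z and C. The sketch below restates the case analysis of getUsedNZCV over raw 4-bit condition encodings to show why the CHECK lines expect the SUBSWri to survive. It is an illustration only: `flagsReadByCond` and the `Flag*` bit values are hypothetical names, and the encodings assume the standard AArch64 numbering (0 = EQ through 13 = LE).

```cpp
#include <cassert>
#include <cstdint>

// Illustrative bit assignment (an assumption, not taken from LLVM).
enum : uint8_t { FlagN = 8, FlagZ = 4, FlagC = 2, FlagV = 1 };

// NZCV bits read by a condition code, given its 4-bit encoding.
// Encodings 14 and 15 (always) read no flags.
uint8_t flagsReadByCond(unsigned Encoding) {
  switch (Encoding) {
  case 0: case 1:   return FlagZ;                  // EQ, NE
  case 2: case 3:   return FlagC;                  // HS, LO
  case 4: case 5:   return FlagN;                  // MI, PL
  case 6: case 7:   return FlagV;                  // VS, VC
  case 8: case 9:   return FlagZ | FlagC;          // HI, LS
  case 10: case 11: return FlagN | FlagV;          // GE, LT
  case 12: case 13: return FlagZ | FlagN | FlagV;  // GT, LE
  default:          return 0;
  }
}

// A compare against zero may be folded away only when the later users
// read neither C nor V.
bool condAllowsFold(unsigned Encoding) {
  return (flagsReadByCond(Encoding) & (FlagC | FlagV)) == 0;
}
```

Under this model only EQ, NE, MI and PL leave the compare removable, so a branch on LS keeps it in place.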

llvm/trunk/test/CodeGen/AArch64/subs-to-sub-opt.ll

; RUN: llc -mtriple=aarch64-linux-gnu -O3 -o - %s | FileCheck %s

@a = external global i8, align 1
@b = external global i8, align 1

; Test that SUBS is replaced by SUB if condition flags are not used.
define i32 @test01() nounwind {
; CHECK: ldrb {{.*}}
; CHECK-NEXT: ldrb {{.*}}
; CHECK-NEXT: sub {{.*}}
; CHECK-NEXT: cmn {{.*}}
entry:
  %0 = load i8, i8* @a, align 1
  %conv = zext i8 %0 to i32
  %1 = load i8, i8* @b, align 1
  %conv1 = zext i8 %1 to i32
  %s = sub nsw i32 %conv1, %conv
  %cmp0 = icmp eq i32 %s, -1
  %cmp1 = sext i1 %cmp0 to i8
  store i8 %cmp1, i8* @a
  ret i32 0
}