This is an archive of the discontinued LLVM Phabricator instance.

Differential D140917

[AVR] Optimize away cpi instructions when possible
Needs Revision · Public

Authored by aykevl on Jan 3 2023, 12:35 PM.

Details

Reviewers

dylanmckay
benshi001

Summary

In many cases, the cpi instruction can be skipped because a previous instruction already sets the needed flags.

This saves around 1% in binary size.
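The reasoning in the summary can be illustrated with a small standalone model of the Z flag. This is only a sketch: `AluResult`, `andi`, `cpiZeroSetsZ` and `cpiIsRedundantForZ` are illustrative names that do not appear in the patch, and only the Z flag is modelled (the other SREG flags are ignored, which is why the patch restricts itself to branches that read Z only).

```cpp
#include <cstdint>

// Simplified model of the AVR Z (zero) flag. All names here are illustrative
// and are not part of the patch or of LLVM.
struct AluResult {
  std::uint8_t value; // new contents of Rd
  bool zero;          // Z flag after the instruction
};

// andi Rd, K: Rd <- Rd & K; Z is set iff the 8-bit result is zero.
inline AluResult andi(std::uint8_t rd, std::uint8_t k) {
  std::uint8_t result = static_cast<std::uint8_t>(rd & k);
  return {result, result == 0};
}

// cpi Rd, 0: computes Rd - 0 without storing it; Z is set iff Rd == 0.
inline bool cpiZeroSetsZ(std::uint8_t rd) { return rd == 0; }

// Returns true if, for every input, the Z flag left by andi equals the Z flag
// that a following "cpi Rd, 0" would produce, i.e. the cpi adds nothing for Z.
inline bool cpiIsRedundantForZ() {
  for (int rd = 0; rd < 256; ++rd) {
    for (int k = 0; k < 256; ++k) {
      AluResult r = andi(static_cast<std::uint8_t>(rd),
                         static_cast<std::uint8_t>(k));
      if (r.zero != cpiZeroSetsZ(r.value))
        return false;
    }
  }
  return true;
}
```

In this model the check succeeds for all 65,536 input pairs, which matches the claim that a following breq/brne can use the flags of the andi directly.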

Future improvements:

remove cp in sub r1, r2 and cp r1, r2
optimize andi + breq/brne to sbrs/sbrc + rjmp like avr-gcc does (this avoids clobbering a register and should therefore result in better generated code)
maybe do the same optimization for other flags too?

Event Timeline

aykevl created this revision. Jan 3 2023, 12:35 PM

Herald added a project: Restricted Project. Jan 3 2023, 12:35 PM

Herald added subscribers: Jim, hiraditya.

aykevl requested review of this revision. Jan 3 2023, 12:35 PM

Herald added a subscriber: llvm-commits.

Update the branch relaxation tests that were changed by this patch.

This patch was largely based on X86InstrInfo::optimizeCompareInstr in llvm/lib/Target/X86/X86InstrInfo.cpp. Of course, the X86 version is much, much larger.

aykevl edited the summary of this revision. Jan 3 2023, 1:09 PM

aykevl set the repository for this revision to rG LLVM Github Monorepo.

aykevl edited the summary of this revision.

Harbormaster completed remote builds in B205510: Diff 486052. Jan 3 2023, 2:37 PM

I ran the TinyGo tests with this patch and all tests still pass, while code size is reduced by 0.75%.

After this patch is merged I'd like to try implementing the sbic/sbis + rjmp optimization and see whether it indeed reduces code size (I think it will avoid some unnecessary copies and reduce register pressure).

benshi001 added inline comments. Jan 9 2023, 7:28 PM

llvm/lib/Target/AVR/AVRInstrInfo.cpp
202	It would be better to add a range check here. (1) `MI.getOperand(1).getImm()` should be in the range [0, 255], otherwise `llvm_unreachable()`. (2) `CmpMask` should be 255, since it is of type `int64_t`.
207	Can this be restructured so that the `default:` case comes first, carrying the comment `// TODO: Implement more compare instructions.` and `return false;`, followed by `case AVR::CPIRdK:`?
218	I think it would be better to use a `std::array` than a `switch`.
233	Why are `LSL` and `ROL` not in this list?
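The range check and the `std::array` lookup suggested in the comments above could look roughly like the following standalone sketch. It is not the patch's code: the `Opcode` enum is a hypothetical stand-in for the TableGen-generated `AVR::` opcodes, the opcode list is abbreviated, and the out-of-range case returns `false` here where the reviewer proposed `llvm_unreachable()`.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>

// Hypothetical stand-ins for a few AVR opcodes; the real enumerators are
// generated by TableGen and live in the AVR namespace.
enum class Opcode { ADCRdRr, ADDRdRr, ANDRdRr, ANDIRdK, CPIRdK, LDIRdK, SUBIRdK };

// Range-checked analysis of a cpi immediate. Rejects anything outside
// [0, 255] and reports an 8-bit mask instead of ~0.
// (Assumption: returning false replaces the suggested llvm_unreachable().)
inline bool analyzeCpiImmediate(std::int64_t imm, std::int64_t &cmpMask,
                                std::int64_t &cmpValue) {
  if (imm < 0 || imm > 255)
    return false;
  cmpMask = 0xFF;
  cmpValue = imm;
  return true;
}

// Table-driven variant of the opcode test (abbreviated list).
inline bool setsZeroFlagFromResult(Opcode op) {
  static const std::array<Opcode, 5> kOpcodes = {
      Opcode::ADCRdRr, Opcode::ADDRdRr, Opcode::ANDRdRr, Opcode::ANDIRdK,
      Opcode::SUBIRdK};
  return std::find(kOpcodes.begin(), kOpcodes.end(), op) != kOpcodes.end();
}
```

Whether a table reads better than the original `switch` is a matter of taste; a `switch` over an enum lets the compiler build the lookup, while the array keeps the list in one data object.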

benshi001 added inline comments. Jan 9 2023, 8:11 PM

llvm/lib/Target/AVR/AVRInstrInfo.cpp
214	Rename this function to `setsZeroFlagInstr`.
252	Can it be `if ((CmpMask & CmpValue) != 0) return false;`? This looks clearer.
267	`const MachineBasicBlock &`?
269	Use `auto From`?
273	This code, which searches the preceding instructions for `SREG` clobbers, could be isolated into a standalone `bool` function; it may be useful for the future optimizations you have planned.
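A standalone helper of the kind proposed in the last comment might be shaped like this. The sketch uses a toy `Instr` struct and index-based iteration in place of `MachineInstr` and `MachineBasicBlock::reverse_iterator`; the name `flagsSurviveUntilCompare` is hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Minimal stand-in for a machine instruction: only its effect on SREG matters
// here. This is not the LLVM MachineInstr API.
struct Instr {
  bool modifiesSREG;
};

// Walks backwards from the compare at index cmpIdx towards the defining
// instruction at index defIdx. Returns true only if the definition precedes
// the compare in the same block and no instruction strictly between the two
// modifies SREG.
inline bool flagsSurviveUntilCompare(const std::vector<Instr> &block,
                                     std::size_t defIdx, std::size_t cmpIdx) {
  if (cmpIdx >= block.size() || defIdx >= cmpIdx)
    return false; // definition not found before the compare in this block
  for (std::size_t i = cmpIdx; i-- > defIdx + 1;) {
    if (block[i].modifiesSREG)
      return false; // clobber between the definition and the compare
  }
  return true;
}
```

For a block modelled as andi, ldi, cpi the helper returns true (ldi leaves SREG alone); for andi, add, cpi it returns false, mirroring the `noclobber_between` and `clobber_between` test cases.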

Generally speaking

llvm/lib/Target/AVR/AVRInstrInfo.cpp
214	Change the parameter to `const MachineInstr *`.
251	Generally speaking, I suggest you build a skeleton first, then handle the special situations (the current `cpi` and the further ones you mentioned in your commit message) in standalone functions. This would make the code clearer.
256	`const auto *`?
268	`FoundDef` -> `FoundClobber`.

benshi001 requested changes to this revision. Jan 9 2023, 8:36 PM

This revision now requires changes to proceed. Jan 9 2023, 8:36 PM

benshi001 added inline comments. Jan 9 2023, 8:42 PM

llvm/test/CodeGen/AVR/branch-relaxation.ll
4	`CHECK-NOT: cpi`
llvm/test/CodeGen/AVR/fold-cmp.mir
13	It would be better to add `CHECK-NOT: cpi` in all your tests.

benshi001 added inline comments. Jan 9 2023, 11:35 PM

llvm/lib/Target/AVR/AVRInstrInfo.cpp
290	Can it be `if (Instr.getOpcode() == AVR::BRNEk || Instr.getOpcode() == AVR::BREQk)`? I do not think we need a `switch`, which implies more than two choices.
301	Is this comment line correct? Should it be `// This instruction might read other flags (than Z) which are set by CPI.`?

Does this pass happen before or after the expansion of pseudos? Do we have to consider pseudos?
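The forward check that comments 290 and 301 refer to can be sketched in isolation as follows. This is a simplified model under stated assumptions: `Kind`, `Instr` and `onlyZeroFlagIsRead` are hypothetical names, and the per-branch check of the target block's live-ins that the patch performs is folded into the single `sregLiveOut` parameter.

```cpp
#include <vector>

// Coarse classification of the instructions that follow the compare.
// These categories are an assumption of this sketch, not LLVM types.
enum class Kind {
  BranchOnZ,   // breq / brne: reads only the Z flag
  ReadsSREG,   // any other reader: may depend on flags other than Z
  DefinesSREG, // overwrites the flags, so nothing later sees the old ones
  Other        // does not touch SREG
};

struct Instr {
  Kind kind;
};

// Returns true if every reader of the flags after the compare looks at Z only.
// sregLiveOut says whether some successor block has SREG as a live-in.
inline bool onlyZeroFlagIsRead(const std::vector<Instr> &after,
                               bool sregLiveOut) {
  for (const Instr &instr : after) {
    switch (instr.kind) {
    case Kind::BranchOnZ:
      continue; // safe: only Z is consumed
    case Kind::ReadsSREG:
      return false; // might read flags that cpi would have set differently
    case Kind::DefinesSREG:
      return true; // flags are redefined before leaving the block
    case Kind::Other:
      break;
    }
  }
  // Reached the end of the block with the flags still live.
  return !sregLiveOut;
}
```

The `ReadsSREG` case corresponds to the wording the reviewer proposes: the concern is a reader of flags other than Z, since Z itself is the one flag the earlier instruction is known to set equivalently.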

benshi001 added inline comments. Jan 9 2023, 11:51 PM

llvm/test/CodeGen/AVR/fold-cmp.mir
84	How about adding more cases: (1) a standalone `CPI` (the `ANDI` may be in a preceding block), in which case the `CPI` is not deleted; (2) `ADIW` + `CPI`, where the `CPI` should not be deleted.
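One way to see why an `ADIW` + `CPI` pair must keep its `CPI`: `adiw` sets Z from the 16-bit result, while `cpi` on a single register looks at 8 bits only. The following sketch models just that difference; the function names are illustrative, and the assumption that the `cpi` compares the low byte of the pair is made for the sake of the example.

```cpp
#include <cstdint>

// Z flag after "adiw Rd+1:Rd, K": set iff the 16-bit result is zero.
// (The real instruction restricts K to 0..63; that is not enforced here.)
inline bool zAfterAdiw(std::uint16_t pair, std::uint8_t k) {
  return static_cast<std::uint16_t>(pair + k) == 0;
}

// Z flag after "cpi Rlow, 0" applied to the low byte of the same result.
inline bool zAfterCpiLowByte(std::uint16_t pair, std::uint8_t k) {
  std::uint8_t low = static_cast<std::uint8_t>((pair + k) & 0xFF);
  return low == 0;
}
```

With the pair holding 0x00FF and K = 1 the result is 0x0100: `adiw` leaves Z clear, but a `cpi` of the low byte against 0 would set it, so the two flags disagree and the compare cannot be dropped.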

Revision Contents

Path	Size
llvm/lib/Target/AVR/AVRInstrInfo.h	8 lines
llvm/lib/Target/AVR/AVRInstrInfo.cpp	141 lines
llvm/lib/Target/AVR/AVRInstrInfo.td	2 lines
llvm/test/CodeGen/AVR/branch-relaxation-long.ll	2 lines
llvm/test/CodeGen/AVR/branch-relaxation.ll	4 lines
llvm/test/CodeGen/AVR/fold-cmp.ll	28 lines
llvm/test/CodeGen/AVR/fold-cmp.mir	84 lines

Diff 486052

llvm/lib/Target/AVR/AVRInstrInfo.h

@@ ... @@ void loadRegFromStackSlot(MachineBasicBlock &MBB,
                             int FrameIndex, const TargetRegisterClass *RC,
                             const TargetRegisterInfo *TRI,
                             Register VReg) const override;
   unsigned isLoadFromStackSlot(const MachineInstr &MI,
                                int &FrameIndex) const override;
   unsigned isStoreToStackSlot(const MachineInstr &MI,
                               int &FrameIndex) const override;

+  bool analyzeCompare(const MachineInstr &MI, Register &SrcReg,
+                      Register &SrcReg2, int64_t &CmpMask,
+                      int64_t &CmpValue) const override;
+
+  bool optimizeCompareInstr(MachineInstr &CmpInstr, Register SrcReg,
+                            Register SrcReg2, int64_t CmpMask, int64_t CmpValue,
+                            const MachineRegisterInfo *MRI) const override;
+
   // Branch analysis.
   bool analyzeBranch(MachineBasicBlock &MBB, MachineBasicBlock *&TBB,
                      MachineBasicBlock *&FBB,
                      SmallVectorImpl<MachineOperand> &Cond,
                      bool AllowModify = false) const override;
   unsigned insertBranch(MachineBasicBlock &MBB, MachineBasicBlock *TBB,
                         MachineBasicBlock *FBB, ArrayRef<MachineOperand> Cond,
                         const DebugLoc &DL,

llvm/lib/Target/AVR/AVRInstrInfo.cpp

@@ ... @@ void AVRInstrInfo::loadRegFromStackSlot(MachineBasicBlock &MBB,
   }

   BuildMI(MBB, MI, DebugLoc(), get(Opcode), DestReg)
       .addFrameIndex(FrameIndex)
       .addImm(0)
       .addMemOperand(MMO);
 }

+// Analyze a compare instruction. This is mainly used in the peephole optimize
+// before calling optimizeCompareInstr.
+bool AVRInstrInfo::analyzeCompare(const MachineInstr &MI, Register &SrcReg,
+                                  Register &SrcReg2, int64_t &CmpMask,
+                                  int64_t &CmpValue) const {
+  switch (MI.getOpcode()) {
+  default:
+    break;
+  case AVR::CPIRdK:
+    SrcReg = MI.getOperand(0).getReg();
+    SrcReg2 = 0;
+    CmpMask = ~0;
+    CmpValue = MI.getOperand(1).getImm();
+    return true;
+  }
+
+  // There are more compare instructions, but they're not implemented yet.
+  return false;
+}

+// Returns true if this instruction has a single 8-bit output register and it
+// sets the zero flag in SREG according to this output.
+// (There are other instructions that affect the zero flag, but not in this
+// specific way).
+static bool setsZeroFlag(MachineInstr *Instr) {
+  switch (Instr->getOpcode()) {
+  default:
+    return false;
+  case AVR::ADCRdRr:
+  case AVR::ADDRdRr:
+  case AVR::ANDRdRr:
+  case AVR::ANDIRdK:
+  case AVR::ASRRd:
+  case AVR::COMRd:
+  case AVR::DECRd:
+  case AVR::EORRdRr:
+  case AVR::INCRd:
+  case AVR::LSRRd:
+  case AVR::NEGRd:
+  case AVR::ORRdRr:
+  case AVR::ORIRdK:
+  case AVR::RORRd:
+  case AVR::SUBRdRr:
+  case AVR::SUBIRdK:
+    return true;
+  }
+}

+// Optimize compare instructions. At the moment, it can only remove a comparison
+// against zero (cpi rN, 0x00) where only the Z flag of SREG is used and a
+// previous instruction (like andi) already sets this flag.
+bool AVRInstrInfo::optimizeCompareInstr(MachineInstr &CmpInstr, Register SrcReg,
+                                        Register SrcReg2, int64_t CmpMask,
+                                        int64_t CmpValue,
+                                        const MachineRegisterInfo *MRI) const {
+  // This function is pretty large. The first part checks all the preconditions
+  // for removing the compare instruction. Only at the very end (when all checks
+  // pass) is the compare instruction removed.
+
+  // Check whether this is an instruction of the form 'cpi rN, 0x00'.
+  if (CmpInstr.getOpcode() != AVR::CPIRdK)
+    return false; // only optimize CPI instructions
+  if (CmpMask != ~0 || CmpValue != 0)
+    return false; // currently we only support optimizing comparisons against 0
+
+  // Find the instruction that defines the register input for cpi.
+  MachineInstr *SrcRegDef = MRI->getVRegDef(SrcReg);
+
+  if (!setsZeroFlag(SrcRegDef))
+    // This is some other instruction, like ld.
+    return false;
+
+  // Iterate over the instructions before the compare instruction and check that
+  // none of them modify SREG.
+  // TODO: we could allow instructions that leave the Z flag unchanged, like
+  // sei.
+  const TargetRegisterInfo *TRI = &getRegisterInfo();
+  MachineBasicBlock &CmpMBB = *CmpInstr.getParent();
+  bool FoundDef = false;
+  MachineBasicBlock::reverse_iterator From =
+      std::next(MachineBasicBlock::reverse_iterator(CmpInstr));
+  for (MachineInstr &Inst : make_range(From, CmpMBB.rend())) {
+    if (&Inst == SrcRegDef) {
+      FoundDef = true;
+      break;
+    }
+    if (Inst.modifiesRegister(AVR::SREG, TRI))
+      return false;
+  }
+  if (!FoundDef)
+    // We arrived at the start of the basic block without a modification of
+    // SREG. This is likely uncommon so not worth traversing further.
+    return false;
+
+  // Check the instructions that follow the compare instruction. If they read
+  // the SREG register, they may only use the zero flag.
+  bool FlagsMayLiveOut = true;
+  MachineBasicBlock::iterator AfterCmpInstr =
+      std::next(MachineBasicBlock::iterator(CmpInstr));
+  for (MachineInstr &Instr : make_range(AfterCmpInstr, CmpMBB.end())) {
+    switch (Instr.getOpcode()) {
+    case AVR::BRNEk:
+    case AVR::BREQk:
+      // These instructions only use the zero flag.
+      if (Instr.getOperand(0).getMBB()->isLiveIn(AVR::SREG))
+        // Unlikely, but to be sure: check that the block to branch to doesn't
+        // use the current SREG value.
+        return false;
+      continue;
+    }
+    if (Instr.readsRegister(AVR::SREG, TRI))
+      // This instruction might read the Z flag.
+      return false;
+    if (Instr.definesRegister(AVR::SREG, TRI)) {
+      // The SREG register is updated, in theory also including the Z flag.
+      // Many instructions don't fully modify SREG without declaring they also
+      // read it, so this isn't strictly speaking safe, but I don't think the
+      // compiler makes use of this fact anywhere.
+      FlagsMayLiveOut = false;
+      break;
+    }
+  }
+
+  // One of the successor blocks uses the SREG register.
+  // This is unlikely, but make sure to correctly handle this case anyway.
+  if (FlagsMayLiveOut) {
+    for (MachineBasicBlock *Successor : CmpMBB.successors())
+      if (Successor->isLiveIn(AVR::SREG))
+        return false;
+  }
+
+  // We can safely remove the comparison instruction!
+  CmpInstr.eraseFromParent();
+
+  // The implicit SREG output operand may have been set as dead. After this
+  // transformation, it is not dead anymore.
+  SrcRegDef->findRegisterDefOperand(AVR::SREG)->setIsDead(false);
+
+  return true;
+}

 const MCInstrDesc &AVRInstrInfo::getBrCond(AVRCC::CondCodes CC) const {
   switch (CC) {
   default:
     llvm_unreachable("Unknown condition code!");
   case AVRCC::COND_EQ:
     return get(AVR::BREQk);
   case AVRCC::COND_NE:
     return get(AVR::BRNEk);

llvm/lib/Target/AVR/AVRInstrInfo.td

@@ ... @@ let isTerminator = 1, isReturn = 1, isBarrier = 1 in {
   def RET : F16<0b1001010100001000, (outs), (ins), "ret", [(AVRretflag)]>;

   def RETI : F16<0b1001010100011000, (outs), (ins), "reti", [(AVRretiflag)]>;
 }

 //===----------------------------------------------------------------------===//
 // Compare operations.
 //===----------------------------------------------------------------------===//
-let Defs = [SREG] in {
+let Defs = [SREG], isCompare = 1 in {
   // CPSE Rd, Rr
   // Compare Rd and Rr, skipping the next instruction if they are equal.
   let isBarrier = 1, isBranch = 1,
       isTerminator = 1 in def CPSE : FRdRr<0b0001, 0b00, (outs),
                                            (ins GPR8
                                             : $rd, GPR8
                                             : $rr),
                                            "cpse\t$rd, $rr", []>;

llvm/test/CodeGen/AVR/branch-relaxation-long.ll

 ; RUN: llc < %s -march=avr | FileCheck %s

 ; CHECK-LABEL: relax_to_jmp:
-; CHECK: cpi r{{[0-9]+}}, 0
+; CHECK: andi r{{[0-9]+}}, 1
 ; CHECK: brne [[BB1:.LBB[0-9]+_[0-9]+]]
 ; CHECK: jmp [[BB2:.LBB[0-9]+_[0-9]+]]
 ; CHECK: [[BB1]]:
 ; CHECK: nop
 ; CHECK: [[BB2]]:
 define i8 @relax_to_jmp(i1 %a) {
 entry-block:
   br i1 %a, label %hello, label %finished

llvm/test/CodeGen/AVR/branch-relaxation.ll

 ; RUN: llc < %s -march=avr | FileCheck %s

 ; CHECK-LABEL: relax_breq
-; CHECK: cpi r{{[0-9]+}}, 0
+; CHECK: andi r{{[0-9]+}}, 1
 ; CHECK: brne .LBB0_1
 ; CHECK: rjmp .LBB0_2
 ; .LBB0_1:

 define i8 @relax_breq(i1 %a) {
 entry-block:
   br i1 %a, label %hello, label %finished

@@ ... @@ hello:
   call void asm sideeffect "nop", ""()
   call void asm sideeffect "nop", ""()
   br label %finished
 finished:
   ret i8 3
 }

 ; CHECK-LABEL: no_relax_breq
-; CHECK: cpi r{{[0-9]+}}, 0
+; CHECK: andi r{{[0-9]+}}, 1
 ; CHECK: breq [[END_BB:.LBB[0-9]+_[0-9]+]]
 ; CHECK: nop
 ; ...
 ; .LBB0_1:
 define i8 @no_relax_breq(i1 %a) {
 entry-block:
   br i1 %a, label %hello, label %finished

llvm/test/CodeGen/AVR/fold-cmp.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
				; RUN: llc -mcpu=attiny85 < %s -mtriple=avr | FileCheck %s

				; This file checks whether cpi is optimized away. The full test is in
				; fold-cmp.mir. This just checks that it also works for the whole pipeline.

				; Check whether the SREG flags of the 'and' instruction are used directly in the
				; following conditional branch.
				define i8 @bitsAreZero(i8 %val) {
				; CHECK-LABEL: bitsAreZero:
				; CHECK: ; %bb.0: ; %entry
				; CHECK-NEXT: andi r24, 6
				; CHECK-NEXT: breq .LBB0_2
				; CHECK-NEXT: ; %bb.1: ; %else
				; CHECK-NEXT: ldi r24, 2
				; CHECK-NEXT: ret
				; CHECK-NEXT: .LBB0_2: ; %then
				; CHECK-NEXT: ldi r24, 1
				; CHECK-NEXT: ret
				entry:
				%bits = and i8 %val, 6
				%cmp = icmp eq i8 %bits, 0
				br i1 %cmp, label %then, label %else
				then:
				ret i8 1
				else:
				ret i8 2
				}

llvm/test/CodeGen/AVR/fold-cmp.mir

This file was added.

				# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
				# RUN: llc -O0 -run-pass=peephole-opt -mtriple=avr %s -o - | FileCheck %s

				---
				name: common_case
				body: |
				bb.0.entry:
				; CHECK-LABEL: name: common_case
				; CHECK: successors: %bb.0(0x80000000)
				; CHECK-NEXT: {{ $}}
				; CHECK-NEXT: [[COPY:%[0-9]+]]:ld8 = COPY $r0
				; CHECK-NEXT: [[ANDIRdK:%[0-9]+]]:ld8 = ANDIRdK [[COPY]], 3, implicit-def $sreg
				; CHECK-NEXT: BRNEk %bb.0, implicit $sreg
				; CHECK-NEXT: RET
				%0:ld8 = COPY $r0
				%1:ld8 = ANDIRdK %0:ld8, 3, implicit-def dead $sreg
				CPIRdK killed %1:ld8, 0, implicit-def $sreg
				BRNEk %bb.0.entry, implicit $sreg
				RET
				...
				---
				# A clobber between the andi and cpi instructions blocks the optimization.
				name: clobber_between
				body: |
				bb.0.entry:
				; CHECK-LABEL: name: clobber_between
				; CHECK: successors: %bb.0(0x80000000)
				; CHECK-NEXT: {{ $}}
				; CHECK-NEXT: [[COPY:%[0-9]+]]:ld8 = COPY $r0
				; CHECK-NEXT: [[ANDIRdK:%[0-9]+]]:ld8 = ANDIRdK [[COPY]], 3, implicit-def dead $sreg
				; CHECK-NEXT: [[ADDRdRr:%[0-9]+]]:ld8 = ADDRdRr [[COPY]], [[COPY]], implicit-def dead $sreg
				; CHECK-NEXT: CPIRdK killed [[ANDIRdK]], 0, implicit-def $sreg
				; CHECK-NEXT: BRNEk %bb.0, implicit $sreg
				; CHECK-NEXT: RET
				%0:ld8 = COPY $r0
				%1:ld8 = ANDIRdK %0:ld8, 3, implicit-def dead $sreg
				%2:ld8 = ADDRdRr %0:ld8, %0:ld8, implicit-def dead $sreg
				CPIRdK killed %1:ld8, 0, implicit-def $sreg
				BRNEk %bb.0.entry, implicit $sreg
				RET
				...
				---
				# Other instructions (like ldi) don't clobber SREG.
				name: noclobber_between
				body: |
				bb.0.entry:
				; CHECK-LABEL: name: noclobber_between
				; CHECK: successors: %bb.0(0x80000000)
				; CHECK-NEXT: {{ $}}
				; CHECK-NEXT: [[COPY:%[0-9]+]]:ld8 = COPY $r0
				; CHECK-NEXT: [[ANDIRdK:%[0-9]+]]:ld8 = ANDIRdK [[COPY]], 3, implicit-def $sreg
				; CHECK-NEXT: [[LDIRdK:%[0-9]+]]:ld8 = LDIRdK 5
				; CHECK-NEXT: [[LDIRdK1:%[0-9]+]]:ld8 = LDIRdK 6
				; CHECK-NEXT: BRNEk %bb.0, implicit $sreg
				; CHECK-NEXT: RET
				%0:ld8 = COPY $r0
				%1:ld8 = ANDIRdK %0:ld8, 3, implicit-def dead $sreg
				%2:ld8 = LDIRdK 5
				CPIRdK killed %1:ld8, 0, implicit-def $sreg
				%3:ld8 = LDIRdK 6
				BRNEk %bb.0.entry, implicit $sreg
				RET
				...
				---
				# SREG uses block the optimization (even though in this case BLD doesn't use the
				# zero bit).
				name: use_after
				body: |
				bb.0.entry:
				; CHECK-LABEL: name: use_after
				; CHECK: successors: %bb.0(0x80000000)
				; CHECK-NEXT: {{ $}}
				; CHECK-NEXT: [[COPY:%[0-9]+]]:ld8 = COPY $r0
				; CHECK-NEXT: [[ANDIRdK:%[0-9]+]]:ld8 = ANDIRdK [[COPY]], 3, implicit-def dead $sreg
				; CHECK-NEXT: CPIRdK killed [[ANDIRdK]], 0, implicit-def $sreg
				; CHECK-NEXT: [[BLD:%[0-9]+]]:ld8 = BLD [[ANDIRdK]], 3, implicit $sreg
				; CHECK-NEXT: BRNEk %bb.0, implicit $sreg
				; CHECK-NEXT: RET
				%0:ld8 = COPY $r0
				%1:ld8 = ANDIRdK %0:ld8, 3, implicit-def dead $sreg
				CPIRdK killed %1:ld8, 0, implicit-def $sreg
				%2:ld8 = BLD %1:ld8, 3, implicit $sreg
				BRNEk %bb.0.entry, implicit $sreg
				RET