This is an archive of the discontinued LLVM Phabricator instance.

Differential D46182

[RISCV] Set isReMaterializable on ADDI and LUI instructions
Closed, Public

Authored by asb on Apr 27 2018, 5:21 AM.

Details

Reviewers

apazos
mgrang
sabuasal
shiva0217

Commits

rG6a53023b4e12: [RISCV] Set isReMaterializable on ADDI and LUI instructions
rL332617: [RISCV] Set isReMaterializable on ADDI and LUI instructions

Summary

Although this patch is straightforward, I'm hoping that more eyes on it will help point out any additional rematerialisation-related hooks I may be missing.

The isReMaterializable flag is somewhat confusing: unlike most other instruction flags, it is currently interpreted as a hint (mightBeRematerializable would be a better name). While LUI is always rematerialisable, for an instruction like ADDI it depends on the operands. TargetInstrInfo::isTriviallyReMaterializable will call TargetInstrInfo::isReallyTriviallyReMaterializable, which in turn calls TargetInstrInfo::isReallyTriviallyReMaterializableGeneric. We rely on the logic in the latter to pick out the instances of ADDI that really are rematerialisable.
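To make the "flag as hint, then verify" idea concrete, here is a minimal standalone sketch. It is not LLVM code: the Operand and Instr types and the usesOnlyConstantInputs helper are hypothetical stand-ins, and the check is a heavily simplified approximation of what the generic hook looks at (the real logic also considers side effects, loads, and more). The X0 test mirrors the isConstantPhysReg implementation added by this patch.

```cpp
#include <cassert>
#include <vector>

// Toy stand-ins for LLVM's MachineOperand / MachineInstr. Every name in this
// sketch is hypothetical and exists only for illustration.
enum class OperandKind { Immediate, PhysReg, VirtReg };

struct Operand {
  OperandKind Kind;
  unsigned Reg; // register number; ignored for immediates
  bool IsDef;   // true if the instruction writes this operand
};

struct Instr {
  bool IsReMaterializableFlag; // models the TableGen isReMaterializable hint
  std::vector<Operand> Ops;
};

constexpr unsigned X0 = 0; // hard-wired zero register

// Same idea as the patch's RISCVRegisterInfo::isConstantPhysReg.
bool isConstantPhysReg(unsigned PhysReg) { return PhysReg == X0; }

// Simplified verification step: every input must be an immediate or a
// constant physical register, so recomputing the value is always safe.
bool usesOnlyConstantInputs(const Instr &MI) {
  for (const Operand &Op : MI.Ops) {
    if (Op.IsDef || Op.Kind == OperandKind::Immediate)
      continue;
    if (Op.Kind == OperandKind::VirtReg)
      return false; // value depends on another live register
    if (!isConstantPhysReg(Op.Reg))
      return false; // physical register whose value may change
  }
  return true;
}

// The flag alone is not enough; it only lets the instruction be considered.
bool isTriviallyReMaterializable(const Instr &MI) {
  assert(!MI.Ops.empty() && "expected at least a def operand");
  return MI.IsReMaterializableFlag && usesOnlyConstantInputs(MI);
}
```

Under this model, a flagged LUI (one def plus an immediate) and a flagged `addi rd, x0, 64` both pass, while a flagged `addi rd, rs1, -1` with a virtual-register input is rejected, as is any instruction without the flag.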

The isReMaterializable flag does make a difference on a variety of test programs. The recently committed remat.ll test case demonstrates how stack usage is reduced and unnecessary lw/sw pairs can be removed. Stack usage in the Proc0 function in dhrystone reduces from 192 bytes to 112 bytes.
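The frame sizes in the remat.ll CHECK lines (64 bytes before, 48 bytes after) can be reproduced with simple arithmetic. The sketch below assumes 4-byte slots on RV32 and a 16-byte stack alignment; the helper names are hypothetical. The register and spill counts are read off the test: ra plus s1 to s11 are saved (12 registers), and the old output additionally spills %hi(l), %hi(h) and the constant 64 (3 slots).

```cpp
#include <cstdint>

// Assumptions for this sketch: RV32 with 4-byte slots, 16-byte stack alignment.
constexpr std::uint32_t SlotBytes = 4;
constexpr std::uint32_t StackAlign = 16;

// Round Bytes up to the next multiple of Align.
constexpr std::uint32_t alignTo(std::uint32_t Bytes, std::uint32_t Align) {
  return (Bytes + Align - 1) / Align * Align;
}

// Frame size needed for the saved registers plus any spill slots.
constexpr std::uint32_t frameSize(std::uint32_t SavedRegs,
                                  std::uint32_t SpillSlots) {
  return alignTo((SavedRegs + SpillSlots) * SlotBytes, StackAlign);
}
```

With 12 saved registers, frameSize(12, 3) gives 64 and frameSize(12, 0) gives 48, matching the `addi sp, sp, -64` and `addi sp, sp, -48` lines in the test. The three removed spill slots only account for 12 bytes; the remaining 4 bytes come from alignment padding.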

For the sake of completeness, this patch also implements RISCVRegisterInfo::isConstantPhysReg. Although this is called from a number of places, it doesn't seem to result in different codegen for any programs I've thrown at it. However, it is called in the rematerialisation codepath and it seems sensible to implement something correct here.

@apazos, @mgrang, @sabuasal: could you please check the workloads that you previously reported as spilling trivial values, to confirm that this patch addresses them?

Diff Detail

Repository: rL LLVM

Event Timeline

asb created this revision. Apr 27 2018, 5:21 AM

Herald added subscribers: edward-jones, zzheng, kito-cheng and 5 others. Apr 27 2018, 5:21 AM

@asb Thanks for the patch. I tested this on our internal workload and it gave us ~44 bytes savings. However, if we instead mark all ALU insts as isReMaterializable, we get ~226 bytes savings:

let hasSideEffects = 0, isReMaterializable = 1, mayLoad = 0, mayStore = 0 in
class ALU_ri<bits<3> funct3, string opcodestr>
    : RVInstI<funct3, OPC_OP_IMM, (outs GPR:$rd), (ins GPR:$rs1, simm12:$imm12),
              opcodestr, "$rd, $rs1, $imm12">;

Note: Marking ALU insts in addition to ADDI and LUI as isReMaterializable still gave us only ~44 bytes savings.

In reply to @mgrang (D46182#1081488):

Hi Mandeep - thanks for taking a look. I'm a bit confused, which of the above 3 statements is true?

In reply to @asb (D46182#1081552):

Sorry for not being clear enough. This patch alone gives us 44 bytes. Internally, I had tried a patch which just marks ALU insts as isReMat and that gave us 226 bytes. But combining the two patches still gives us only 44 bytes.

So I dug a bit more, and it seems the degradation in code size comes from marking LUI as isReMaterializable. This results in repeated loads of zero in our workload:

lui x10, 0

So if I simply remove isReMat from LUI in your patch (and keep it on ADDI), I get 226 bytes, even without my ALU patch.

In reply to @mgrang (D46182#1081575):

Thanks for the datapoint. I'm not surprised there's no big code size win. I'm mostly seeing minor codegen changes, and where those changes are noticeable the win is in reducing stack size.

The use of LUI to zero a register is curious. I'll play around with those patch variants and see if I can replicate something similar. If there's a representative example that would be interesting, but I know from playing around with this series of patches that tweak codegen that it can be _really_ difficult to pull out a minimal example.

I haven't seen the selection of lui $reg, 0 at all in the wild. Can you take a closer look at where you're seeing this please? It implies something _very_ odd is going on with constant lowering.

LGTM. I didn't get a chance to look further into this. Since this patch helps with overall code size I think we can commit this now and we can report back once we have done more analysis.

This revision is now accepted and ready to land. May 10 2018, 8:00 AM

Closed by commit rL332617: [RISCV] Set isReMaterializable on ADDI and LUI instructions (authored by asb). May 17 2018, 8:55 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path (Size)

llvm/trunk/lib/Target/RISCV/RISCVInstrInfo.td (6 lines)
llvm/trunk/lib/Target/RISCV/RISCVRegisterInfo.h (2 lines)
llvm/trunk/lib/Target/RISCV/RISCVRegisterInfo.cpp (4 lines)
llvm/trunk/test/CodeGen/RISCV/remat.ll (67 lines)

Diff 147328

llvm/trunk/lib/Target/RISCV/RISCVInstrInfo.td

	class Priv<string opcodestr, bits<7> funct7>			class Priv<string opcodestr, bits<7> funct7>
	: RVInstR<funct7, 0b000, OPC_SYSTEM, (outs), (ins GPR:$rs1, GPR:$rs2),			: RVInstR<funct7, 0b000, OPC_SYSTEM, (outs), (ins GPR:$rs1, GPR:$rs2),
	opcodestr, "">;			opcodestr, "">;

	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	// Instructions			// Instructions
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	let hasSideEffects = 0, mayLoad = 0, mayStore = 0 in {			let hasSideEffects = 0, isReMaterializable = 1, mayLoad = 0, mayStore = 0 in {
	def LUI : RVInstU<OPC_LUI, (outs GPR:$rd), (ins uimm20:$imm20),			def LUI : RVInstU<OPC_LUI, (outs GPR:$rd), (ins uimm20:$imm20),
	"lui", "$rd, $imm20">;			"lui", "$rd, $imm20">;

	def AUIPC : RVInstU<OPC_AUIPC, (outs GPR:$rd), (ins uimm20:$imm20),			def AUIPC : RVInstU<OPC_AUIPC, (outs GPR:$rd), (ins uimm20:$imm20),
	"auipc", "$rd, $imm20">;			"auipc", "$rd, $imm20">;

	let isCall = 1 in			let isCall = 1 in
	def JAL : RVInstJ<OPC_JAL, (outs GPR:$rd), (ins simm21_lsb0:$imm20),			def JAL : RVInstJ<OPC_JAL, (outs GPR:$rd), (ins simm21_lsb0:$imm20),
	[17 lines not shown]
	def LW : Load_ri<0b010, "lw">;			def LW : Load_ri<0b010, "lw">;
	def LBU : Load_ri<0b100, "lbu">;			def LBU : Load_ri<0b100, "lbu">;
	def LHU : Load_ri<0b101, "lhu">;			def LHU : Load_ri<0b101, "lhu">;

	def SB : Store_rri<0b000, "sb">;			def SB : Store_rri<0b000, "sb">;
	def SH : Store_rri<0b001, "sh">;			def SH : Store_rri<0b001, "sh">;
	def SW : Store_rri<0b010, "sw">;			def SW : Store_rri<0b010, "sw">;

				// ADDI isn't always rematerializable, but isReMaterializable will be used as
				// a hint which is verified in isReallyTriviallyReMaterializable.
				let isReMaterializable = 1 in
	def ADDI : ALU_ri<0b000, "addi">;			def ADDI : ALU_ri<0b000, "addi">;

	def SLTI : ALU_ri<0b010, "slti">;			def SLTI : ALU_ri<0b010, "slti">;
	def SLTIU : ALU_ri<0b011, "sltiu">;			def SLTIU : ALU_ri<0b011, "sltiu">;
	def XORI : ALU_ri<0b100, "xori">;			def XORI : ALU_ri<0b100, "xori">;
	def ORI : ALU_ri<0b110, "ori">;			def ORI : ALU_ri<0b110, "ori">;
	def ANDI : ALU_ri<0b111, "andi">;			def ANDI : ALU_ri<0b111, "andi">;

	def SLLI : Shift_ri<0, 0b001, "slli">;			def SLLI : Shift_ri<0, 0b001, "slli">;
	def SRLI : Shift_ri<0, 0b101, "srli">;			def SRLI : Shift_ri<0, 0b101, "srli">;

llvm/trunk/lib/Target/RISCV/RISCVRegisterInfo.h

struct RISCVRegisterInfo : public RISCVGenRegisterInfo {

const uint32_t *getCallPreservedMask(const MachineFunction &MF,		const uint32_t *getCallPreservedMask(const MachineFunction &MF,
CallingConv::ID) const override;		CallingConv::ID) const override;

const MCPhysReg *getCalleeSavedRegs(const MachineFunction *MF) const override;		const MCPhysReg *getCalleeSavedRegs(const MachineFunction *MF) const override;

BitVector getReservedRegs(const MachineFunction &MF) const override;		BitVector getReservedRegs(const MachineFunction &MF) const override;

		bool isConstantPhysReg(unsigned PhysReg) const override;

const uint32_t *getNoPreservedMask() const override;		const uint32_t *getNoPreservedMask() const override;

void eliminateFrameIndex(MachineBasicBlock::iterator MI, int SPAdj,		void eliminateFrameIndex(MachineBasicBlock::iterator MI, int SPAdj,
unsigned FIOperandNum,		unsigned FIOperandNum,
RegScavenger *RS = nullptr) const override;		RegScavenger *RS = nullptr) const override;

unsigned getFrameRegister(const MachineFunction &MF) const override;		unsigned getFrameRegister(const MachineFunction &MF) const override;


llvm/trunk/lib/Target/RISCV/RISCVRegisterInfo.cpp

BitVector RISCVRegisterInfo::getReservedRegs(const MachineFunction &MF) const {
markSuperRegs(Reserved, RISCV::X2); // sp		markSuperRegs(Reserved, RISCV::X2); // sp
markSuperRegs(Reserved, RISCV::X3); // gp		markSuperRegs(Reserved, RISCV::X3); // gp
markSuperRegs(Reserved, RISCV::X4); // tp		markSuperRegs(Reserved, RISCV::X4); // tp
markSuperRegs(Reserved, RISCV::X8); // fp		markSuperRegs(Reserved, RISCV::X8); // fp
assert(checkAllSuperRegsMarked(Reserved));		assert(checkAllSuperRegsMarked(Reserved));
return Reserved;		return Reserved;
}		}

		bool RISCVRegisterInfo::isConstantPhysReg(unsigned PhysReg) const {
		return PhysReg == RISCV::X0;
		}

const uint32_t *RISCVRegisterInfo::getNoPreservedMask() const {		const uint32_t *RISCVRegisterInfo::getNoPreservedMask() const {
return CSR_NoRegs_RegMask;		return CSR_NoRegs_RegMask;
}		}

void RISCVRegisterInfo::eliminateFrameIndex(MachineBasicBlock::iterator II,		void RISCVRegisterInfo::eliminateFrameIndex(MachineBasicBlock::iterator II,
int SPAdj, unsigned FIOperandNum,		int SPAdj, unsigned FIOperandNum,
RegScavenger *RS) const {		RegScavenger *RS) const {
assert(SPAdj == 0 && "Unexpected non-zero SPAdj value");		assert(SPAdj == 0 && "Unexpected non-zero SPAdj value");

llvm/trunk/test/CodeGen/RISCV/remat.ll

	@i = common global i32 0, align 4			@i = common global i32 0, align 4
	@h = common global i32 0, align 4			@h = common global i32 0, align 4

	; This test case benefits from codegen recognising that some values are			; This test case benefits from codegen recognising that some values are
	; trivially rematerialisable, meaning they are recreated rather than saved to			; trivially rematerialisable, meaning they are recreated rather than saved to
	; the stack and restored. It creates high register pressure to force this			; the stack and restored. It creates high register pressure to force this
	; situation.			; situation.

	; TODO: it makes no sense to spill %hi(h), %hi(l) or the constant 64 to the
	; stack.

	define i32 @test() nounwind {			define i32 @test() nounwind {
	; RV32I-LABEL: test:			; RV32I-LABEL: test:
	; RV32I: # %bb.0: # %entry			; RV32I: # %bb.0: # %entry
	; RV32I-NEXT: addi sp, sp, -64			; RV32I-NEXT: addi sp, sp, -48
	; RV32I-NEXT: sw ra, 60(sp)			; RV32I-NEXT: sw ra, 44(sp)
	; RV32I-NEXT: sw s1, 56(sp)			; RV32I-NEXT: sw s1, 40(sp)
	; RV32I-NEXT: sw s2, 52(sp)			; RV32I-NEXT: sw s2, 36(sp)
	; RV32I-NEXT: sw s3, 48(sp)			; RV32I-NEXT: sw s3, 32(sp)
	; RV32I-NEXT: sw s4, 44(sp)			; RV32I-NEXT: sw s4, 28(sp)
	; RV32I-NEXT: sw s5, 40(sp)			; RV32I-NEXT: sw s5, 24(sp)
	; RV32I-NEXT: sw s6, 36(sp)			; RV32I-NEXT: sw s6, 20(sp)
	; RV32I-NEXT: sw s7, 32(sp)			; RV32I-NEXT: sw s7, 16(sp)
	; RV32I-NEXT: sw s8, 28(sp)			; RV32I-NEXT: sw s8, 12(sp)
	; RV32I-NEXT: sw s9, 24(sp)			; RV32I-NEXT: sw s9, 8(sp)
	; RV32I-NEXT: sw s10, 20(sp)			; RV32I-NEXT: sw s10, 4(sp)
	; RV32I-NEXT: sw s11, 16(sp)			; RV32I-NEXT: sw s11, 0(sp)
	; RV32I-NEXT: lui s3, %hi(a)			; RV32I-NEXT: lui s3, %hi(a)
	; RV32I-NEXT: lw a0, %lo(a)(s3)			; RV32I-NEXT: lw a0, %lo(a)(s3)
	; RV32I-NEXT: beqz a0, .LBB0_11			; RV32I-NEXT: beqz a0, .LBB0_11
	; RV32I-NEXT: # %bb.1: # %for.body.preheader			; RV32I-NEXT: # %bb.1: # %for.body.preheader
	; RV32I-NEXT: lui a1, %hi(l)
	; RV32I-NEXT: sw a1, 12(sp)
	; RV32I-NEXT: lui s5, %hi(k)			; RV32I-NEXT: lui s5, %hi(k)
	; RV32I-NEXT: lui s6, %hi(j)			; RV32I-NEXT: lui s6, %hi(j)
	; RV32I-NEXT: lui s7, %hi(i)			; RV32I-NEXT: lui s7, %hi(i)
	; RV32I-NEXT: lui a1, %hi(h)
	; RV32I-NEXT: sw a1, 8(sp)
	; RV32I-NEXT: lui s9, %hi(g)			; RV32I-NEXT: lui s9, %hi(g)
	; RV32I-NEXT: lui s10, %hi(f)			; RV32I-NEXT: lui s10, %hi(f)
	; RV32I-NEXT: lui s11, %hi(e)			; RV32I-NEXT: lui s11, %hi(e)
	; RV32I-NEXT: lui s8, %hi(d)			; RV32I-NEXT: lui s8, %hi(d)
	; RV32I-NEXT: addi s1, zero, 32			; RV32I-NEXT: addi s1, zero, 32
	; RV32I-NEXT: lui s2, %hi(c)			; RV32I-NEXT: lui s2, %hi(c)
	; RV32I-NEXT: lui s4, %hi(b)			; RV32I-NEXT: lui s4, %hi(b)
	; RV32I-NEXT: addi a1, zero, 64
	; RV32I-NEXT: sw a1, 4(sp)
	; RV32I-NEXT: .LBB0_2: # %for.body			; RV32I-NEXT: .LBB0_2: # %for.body
	; RV32I-NEXT: # =>This Inner Loop Header: Depth=1			; RV32I-NEXT: # =>This Inner Loop Header: Depth=1
	; RV32I-NEXT: lw a1, 12(sp)			; RV32I-NEXT: lui a1, %hi(l)
	; RV32I-NEXT: lw a1, %lo(l)(a1)			; RV32I-NEXT: lw a1, %lo(l)(a1)
	; RV32I-NEXT: beqz a1, .LBB0_4			; RV32I-NEXT: beqz a1, .LBB0_4
	; RV32I-NEXT: # %bb.3: # %if.then			; RV32I-NEXT: # %bb.3: # %if.then
	; RV32I-NEXT: # in Loop: Header=BB0_2 Depth=1			; RV32I-NEXT: # in Loop: Header=BB0_2 Depth=1
	; RV32I-NEXT: lw a4, %lo(e)(s11)			; RV32I-NEXT: lw a4, %lo(e)(s11)
	; RV32I-NEXT: lw a3, %lo(d)(s8)			; RV32I-NEXT: lw a3, %lo(d)(s8)
	; RV32I-NEXT: lw a2, %lo(c)(s2)			; RV32I-NEXT: lw a2, %lo(c)(s2)
	; RV32I-NEXT: lw a1, %lo(b)(s4)			; RV32I-NEXT: lw a1, %lo(b)(s4)
	; RV32I-NEXT: mv a5, s1			; RV32I-NEXT: mv a5, s1
	; RV32I-NEXT: call foo			; RV32I-NEXT: call foo
	; RV32I-NEXT: .LBB0_4: # %if.end			; RV32I-NEXT: .LBB0_4: # %if.end
	; RV32I-NEXT: # in Loop: Header=BB0_2 Depth=1			; RV32I-NEXT: # in Loop: Header=BB0_2 Depth=1
	; RV32I-NEXT: lw a0, %lo(k)(s5)			; RV32I-NEXT: lw a0, %lo(k)(s5)
	; RV32I-NEXT: beqz a0, .LBB0_6			; RV32I-NEXT: beqz a0, .LBB0_6
	; RV32I-NEXT: # %bb.5: # %if.then3			; RV32I-NEXT: # %bb.5: # %if.then3
	; RV32I-NEXT: # in Loop: Header=BB0_2 Depth=1			; RV32I-NEXT: # in Loop: Header=BB0_2 Depth=1
	; RV32I-NEXT: lw a4, %lo(f)(s10)			; RV32I-NEXT: lw a4, %lo(f)(s10)
	; RV32I-NEXT: lw a3, %lo(e)(s11)			; RV32I-NEXT: lw a3, %lo(e)(s11)
	; RV32I-NEXT: lw a2, %lo(d)(s8)			; RV32I-NEXT: lw a2, %lo(d)(s8)
	; RV32I-NEXT: lw a1, %lo(c)(s2)			; RV32I-NEXT: lw a1, %lo(c)(s2)
	; RV32I-NEXT: lw a0, %lo(b)(s4)			; RV32I-NEXT: lw a0, %lo(b)(s4)
	; RV32I-NEXT: lw a5, 4(sp)			; RV32I-NEXT: addi a5, zero, 64
	; RV32I-NEXT: call foo			; RV32I-NEXT: call foo
	; RV32I-NEXT: .LBB0_6: # %if.end5			; RV32I-NEXT: .LBB0_6: # %if.end5
	; RV32I-NEXT: # in Loop: Header=BB0_2 Depth=1			; RV32I-NEXT: # in Loop: Header=BB0_2 Depth=1
	; RV32I-NEXT: lw a0, %lo(j)(s6)			; RV32I-NEXT: lw a0, %lo(j)(s6)
	; RV32I-NEXT: beqz a0, .LBB0_8			; RV32I-NEXT: beqz a0, .LBB0_8
	; RV32I-NEXT: # %bb.7: # %if.then7			; RV32I-NEXT: # %bb.7: # %if.then7
	; RV32I-NEXT: # in Loop: Header=BB0_2 Depth=1			; RV32I-NEXT: # in Loop: Header=BB0_2 Depth=1
	; RV32I-NEXT: lw a4, %lo(g)(s9)			; RV32I-NEXT: lw a4, %lo(g)(s9)
	; RV32I-NEXT: lw a3, %lo(f)(s10)			; RV32I-NEXT: lw a3, %lo(f)(s10)
	; RV32I-NEXT: lw a2, %lo(e)(s11)			; RV32I-NEXT: lw a2, %lo(e)(s11)
	; RV32I-NEXT: lw a1, %lo(d)(s8)			; RV32I-NEXT: lw a1, %lo(d)(s8)
	; RV32I-NEXT: lw a0, %lo(c)(s2)			; RV32I-NEXT: lw a0, %lo(c)(s2)
	; RV32I-NEXT: mv a5, s1			; RV32I-NEXT: mv a5, s1
	; RV32I-NEXT: call foo			; RV32I-NEXT: call foo
	; RV32I-NEXT: .LBB0_8: # %if.end9			; RV32I-NEXT: .LBB0_8: # %if.end9
	; RV32I-NEXT: # in Loop: Header=BB0_2 Depth=1			; RV32I-NEXT: # in Loop: Header=BB0_2 Depth=1
	; RV32I-NEXT: lw a0, %lo(i)(s7)			; RV32I-NEXT: lw a0, %lo(i)(s7)
	; RV32I-NEXT: beqz a0, .LBB0_10			; RV32I-NEXT: beqz a0, .LBB0_10
	; RV32I-NEXT: # %bb.9: # %if.then11			; RV32I-NEXT: # %bb.9: # %if.then11
	; RV32I-NEXT: # in Loop: Header=BB0_2 Depth=1			; RV32I-NEXT: # in Loop: Header=BB0_2 Depth=1
	; RV32I-NEXT: lw a0, 8(sp)			; RV32I-NEXT: lui a0, %hi(h)
	; RV32I-NEXT: lw a4, %lo(h)(a0)			; RV32I-NEXT: lw a4, %lo(h)(a0)
	; RV32I-NEXT: lw a3, %lo(g)(s9)			; RV32I-NEXT: lw a3, %lo(g)(s9)
	; RV32I-NEXT: lw a2, %lo(f)(s10)			; RV32I-NEXT: lw a2, %lo(f)(s10)
	; RV32I-NEXT: lw a1, %lo(e)(s11)			; RV32I-NEXT: lw a1, %lo(e)(s11)
	; RV32I-NEXT: lw a0, %lo(d)(s8)			; RV32I-NEXT: lw a0, %lo(d)(s8)
	; RV32I-NEXT: mv a5, s1			; RV32I-NEXT: mv a5, s1
	; RV32I-NEXT: call foo			; RV32I-NEXT: call foo
	; RV32I-NEXT: .LBB0_10: # %for.inc			; RV32I-NEXT: .LBB0_10: # %for.inc
	; RV32I-NEXT: # in Loop: Header=BB0_2 Depth=1			; RV32I-NEXT: # in Loop: Header=BB0_2 Depth=1
	; RV32I-NEXT: lw a0, %lo(a)(s3)			; RV32I-NEXT: lw a0, %lo(a)(s3)
	; RV32I-NEXT: addi a0, a0, -1			; RV32I-NEXT: addi a0, a0, -1
	; RV32I-NEXT: sw a0, %lo(a)(s3)			; RV32I-NEXT: sw a0, %lo(a)(s3)
	; RV32I-NEXT: bnez a0, .LBB0_2			; RV32I-NEXT: bnez a0, .LBB0_2
	; RV32I-NEXT: .LBB0_11: # %for.end			; RV32I-NEXT: .LBB0_11: # %for.end
	; RV32I-NEXT: addi a0, zero, 1			; RV32I-NEXT: addi a0, zero, 1
	; RV32I-NEXT: lw s11, 16(sp)			; RV32I-NEXT: lw s11, 0(sp)
	; RV32I-NEXT: lw s10, 20(sp)			; RV32I-NEXT: lw s10, 4(sp)
	; RV32I-NEXT: lw s9, 24(sp)			; RV32I-NEXT: lw s9, 8(sp)
	; RV32I-NEXT: lw s8, 28(sp)			; RV32I-NEXT: lw s8, 12(sp)
	; RV32I-NEXT: lw s7, 32(sp)			; RV32I-NEXT: lw s7, 16(sp)
	; RV32I-NEXT: lw s6, 36(sp)			; RV32I-NEXT: lw s6, 20(sp)
	; RV32I-NEXT: lw s5, 40(sp)			; RV32I-NEXT: lw s5, 24(sp)
	; RV32I-NEXT: lw s4, 44(sp)			; RV32I-NEXT: lw s4, 28(sp)
	; RV32I-NEXT: lw s3, 48(sp)			; RV32I-NEXT: lw s3, 32(sp)
	; RV32I-NEXT: lw s2, 52(sp)			; RV32I-NEXT: lw s2, 36(sp)
	; RV32I-NEXT: lw s1, 56(sp)			; RV32I-NEXT: lw s1, 40(sp)
	; RV32I-NEXT: lw ra, 60(sp)			; RV32I-NEXT: lw ra, 44(sp)
	; RV32I-NEXT: addi sp, sp, 64			; RV32I-NEXT: addi sp, sp, 48
	; RV32I-NEXT: ret			; RV32I-NEXT: ret
	entry:			entry:
	%.pr = load i32, i32* @a, align 4			%.pr = load i32, i32* @a, align 4
	%tobool14 = icmp eq i32 %.pr, 0			%tobool14 = icmp eq i32 %.pr, 0
	br i1 %tobool14, label %for.end, label %for.body			br i1 %tobool14, label %for.end, label %for.body

	for.body: ; preds = %entry, %for.inc			for.body: ; preds = %entry, %for.inc
	%0 = phi i32 [ %dec, %for.inc ], [ %.pr, %entry ]			%0 = phi i32 [ %dec, %for.inc ], [ %.pr, %entry ]