
[RISCV] Add matching of codegen patterns to RISCV Bit Manipulation Zbb asm instructions

Authored by PaoloS on May 13 2020, 8:29 AM.



This patch provides optimization of bit manipulation operations by enabling
the +experimental-b target feature.
It adds matching of single block patterns of instructions to specific
bit-manip instructions from the base subset (zbb subextension) of the
experimental B extension of RISC-V.
It also adds the corresponding codegen tests.

This patch is based on Clifford Wolf's proposal for the bit manipulation
extension of RISC-V:

Diff Detail

Event Timeline

PaoloS created this revision.May 13 2020, 8:29 AM
asb added a comment.May 28 2020, 7:19 AM

Sorry for the delay on this - the lockdown situation is really hurting my review time, though it looks like my childcare situation will improve from the week after next.

Quite a lot of comments below, but I think this actually is almost ready to go with a few extra patterns or at least reworked test cases. Thanks again for your work.

I've added a few comments inline, beyond that:

  • For the test files, I'd probably just have RV32IB and RV64IB check lines. The RV32I-NOT/RV64I-NOT lines are helpful defence in depth, but they're not very compatible with update_llc_test_checks.py, which we prefer to use whenever possible. This is like the float-*.ll test files. Alternatively, we just let update_llc_test_checks generate code for the non-bitmanip targets (more like the mul.ll test).
  • Use update_llc_test_checks.py to generate and maintain the check lines
  • Missing slliu.w?
  • No immediate variants? e.g. sloi, sroi
  • Missing some W instructions such as SLOW and SROW
  • I'd suggest having one file called something like rvZbb.ll which contains tests that are relevant for both RV32 and RV64. Note this likely does include both i64 and i32 test cases - we want to ensure reasonable codegen for i32 values on RV64 and for i64 values on RV32, in both cases using hardware instructions when possible. If there are tests that really would just be noise for RV32, then put those in rv64Zbb.ll.

I think we do have flexibility to commit something that falls short of handling codegen for all Zbb instructions, but if doing so it would be helpful to note in e.g. the test file (ideally even with tests!) cases that aren't yet handled, so it's easy to return to. If it's easy enough to add the missing cases, that would be preferred.


clz on a zero is a well defined operation that will return XLEN. So shouldn't this just lower to clz and ret?
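To make that concrete, here is an illustrative model of the Zbb semantics (my sketch, not code from the patch): because clz is total, with clz(0) defined to return XLEN, no zero-guard branch is needed before emitting it.

```python
XLEN = 64

def clz(x):
    # Zbb-style count-leading-zeros: a total function with clz(0) == XLEN,
    # so lowering llvm.ctlz needs no separate compare-and-branch for the
    # zero case when targeting this instruction.
    assert 0 <= x < (1 << XLEN)
    return XLEN - x.bit_length()
```

So a plain `clz a0, a0; ret` sequence covers all inputs, including zero.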


Same comment as for clz above

PaoloS added a comment.EditedJun 25 2020, 5:04 AM

Sorry for the late answer.
I'm catching up with this now.

I agree on the reorganization of the tests. I'm fixing that.
I notice that the tests of the 64-bit instructions on 32-bit targets are quite noisy (especially for clz, ctz and pcnt). I'll upload a revision soon so that you can all see.

The immediate shifts with ones (sloi, sroi), on the other hand, have the problem that LLVM canonicalizes them: instead of DAG nodes resembling the straightforward operation:


~(~x << shamt)

it prefers to use a mask:

(x << shamt) | (~(-1 << shamt))

That means the DAG pattern contains a constant (the right operand of the 'or') whose value depends on the value of shamt and which the resulting instruction (sloi/sroi) won't use.
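The two forms are indeed equivalent on unsigned XLEN-bit values; a quick illustrative check (my own, not part of the patch — `slo_direct`/`slo_masked` are hypothetical names):

```python
XLEN = 64
MASK = (1 << XLEN) - 1  # truncate Python's unbounded ints to XLEN bits

def slo_direct(x, shamt):
    # the straightforward shift-left-ones form: ~(~x << shamt)
    return ~(((~x) & MASK) << shamt) & MASK

def slo_masked(x, shamt):
    # the canonicalized form: (x << shamt) | ~(-1 << shamt)
    return ((x << shamt) | (~(-1 << shamt) & MASK)) & MASK

# both forms agree for representative values and shift amounts
for x in (0, 1, 0xDEADBEEF, MASK):
    for shamt in (0, 1, 31, 63):
        assert slo_direct(x, shamt) == slo_masked(x, shamt)
```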
Of course we could just drop it since it isn't used, like this:

def : Pat<(or (shl GPR:$rs1, simm12:$shamt), mask),
        (SLOI GPR:$rs1, simm12:$shamt)>;

But that introduces an ambiguity: to be sure the operand really is the mask derived from shamt, we need to check that the two are related.
I'm trying now to see if a ComplexPattern can do the trick and select sloi and sroi for me while checking that the mask is correct.
I'm not sure though if it is a neat enough solution for upstream.

About slliu.w, the issue is quite different. As the documentation says, slliu.w is identical to slli apart from the fact that it zeroes bits (XLEN-1):32 before shifting.
LLVM, though, optimizes out such a zero-extension before getting to instruction selection. That makes it impossible to distinguish from a normal slli. Considering that the result is a single instruction (slli) in any case, I think it's better to leave it like that and let the user use the slliu.w instruction directly if needed.
A similar thing happens with ctzw.
While for clzw and pcntw LLVM doesn't optimize out the truncation, since dropping it could affect the result, for ctz it doesn't matter: counting trailing zeroes only inspects bits up to 31, so once LLVM sees that the lower 32 bits of the number are nonzero it processes the original 64-bit value normally.
Otherwise it returns 32.
That makes it impractical to tell apart from an RV64 ctz.
The outcome is that for llvm.cttz.i32 on RV64 I still get ctz instead of ctzw.
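To illustrate (my sketch, not part of the patch, with hypothetical helper names): when the i32 operand sits zero-extended in a 64-bit register, the 64-bit count already gives the i32 answer for any nonzero input, and OR-ing in bit 32 pins the zero case to 32 — which is why the two are hard to tell apart at selection time.

```python
def ctz(x, width):
    # count trailing zeros, defined to return `width` when x == 0
    x &= (1 << width) - 1
    if x == 0:
        return width
    return (x & -x).bit_length() - 1

def ctz32_via_ctz64(x):
    # 32-bit cttz computed with a 64-bit count: setting bit 32 makes the
    # zero case yield 32 instead of 64, matching llvm.cttz.i32 semantics
    return ctz((x & 0xFFFFFFFF) | (1 << 32), 64)
```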

PaoloS marked 4 inline comments as done.Jun 25 2020, 5:11 AM
PaoloS added inline comments.

I agree; unfortunately the code gets split into multiple basic blocks before selection, and only the block with the a0 != 0 condition has the ctlz operation in it. Since pattern matching can only look at one block at a time, that's what I could do from the backend.
I based the pattern matching of clz on the LLVM intrinsic llvm.ctlz.i32, which already relies on its own idiom recognition in the middle end. A solution could be to turn off the intrinsic and try to pattern match it directly from the backend; maybe we could simplify it. But the scope is limited.


Same as above

PaoloS updated this revision to Diff 276249.Jul 7 2020, 4:04 PM
PaoloS marked 2 inline comments as done.

Added missing pattern-matching for *w instructions.
Added codegen tests.
Added ComplexPattern instances that are crucial to pattern-match SLOI, SROI, SLOIW, SROIW and SLLIUW.
Both the 32- and 64-bit test files contain both 32- and 64-bit test cases of the instructions (where they exist).

PaoloS added a comment.Jul 7 2020, 4:12 PM

Just a clarification. I decided to split the tests into 32-bit and 64-bit files because 32-bit code compiled for RV64 commonly produces sign-extended IR, and that's when many *w instructions are selected. Keeping the tests in a single file would mean, on the one hand, having 32-bit IR with sign extension compiled for RV32 (harmless but redundant), and on the other hand, having i32 code with no explicit sign extension compiled for RV64. That is correct, but it might lead to misleading selections, like matching the IR of a 32-bit SLOI on RV64 with an RV64 SLOI instead of a SLOIW (the difference being that SLOIW ignores the upper 32 bits of its operand while the RV64 SLOI doesn't).
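A sketch of that difference, as I read the draft B spec (an illustrative Python model with hypothetical names, not the patch's code): SLOIW works on the low 32 bits and sign-extends its 32-bit result, while the RV64 SLOI shifts the full 64-bit register.

```python
MASK64 = (1 << 64) - 1

def sloi64(x, shamt):
    # RV64 SLOI: full 64-bit shift left, shifting ones into the low bits
    return ((x << shamt) & MASK64) | ((1 << shamt) - 1)

def sloiw(x, shamt):
    # RV64 SLOIW: shift only the low 32 bits, fill with ones, then
    # sign-extend the 32-bit result to 64 bits (the usual *W semantics)
    r = (((x & 0xFFFFFFFF) << shamt) & 0xFFFFFFFF) | ((1 << shamt) - 1)
    if r & 0x80000000:
        r |= MASK64 ^ 0xFFFFFFFF  # set bits 63:32
    return r
```

The two diverge as soon as bits cross bit 31: sloi64(0x40000000, 2) keeps bit 32 set, while sloiw(0x40000000, 2) discards it, so selecting the RV64 SLOI for an i32 pattern would be wrong.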

lewis-revill added inline comments.Jul 9 2020, 6:51 AM

Indentation within these Select functions is messed up, presumably due to a mix of tabs and spaces.


I'm not sure what convention other Select functions for W instructions follow, but perhaps an assert for IsRV64 should be added for completeness?


Can these W selects be guarded for 64 bit only?

PaoloS marked 2 inline comments as done.Jul 9 2020, 8:38 AM
PaoloS added inline comments.

Yes, I was trying to use only spaces in the end; I must have missed these.


Well, SLOIW exists only on RV64. I could add the assert, but I think it would be a bit redundant if I also guard the selects for RV64 only.
But yes, for completeness I probably should.

PaoloS marked an inline comment as done.Jul 9 2020, 7:24 PM
PaoloS added inline comments.

Not sure how to do it, they can't be enclosed in Predicates like the instruction patterns.

PaoloS updated this revision to Diff 276897.Jul 9 2020, 7:33 PM

Fixed indentation.
Added architecture type control for complex pattern matching of sloiw, sroiw and slliuw.

PaoloS updated this revision to Diff 277526.Jul 13 2020, 12:23 PM

Updated the tests:

  • the tests have been updated on top of all the sub-patches together, so that they are exactly the same as they would be if updated with the whole final patch.
  • labels specific to each sub-extension have been added alongside the generic RISCVIB label (which activates all the sub-extensions), so that we can see how the patterns are matched with a specific sub-extension versus with all of them together.
  • the tests will probably fail if run by checking out the commit of a single sub-extension, and if updated they'll change. These tests are designed to work with the final squashed patch.

Corrected the order of the patterns.

PaoloS edited the summary of this revision. (Show Details)Jul 14 2020, 6:13 AM
lewis-revill added inline comments.Jul 14 2020, 8:04 AM

Looks like you fixed this with the operand to ComplexPattern. Only nitpick is to get the : characters aligned vertically here.


Nitpick: for these tests on RV32 where no bitmanip instructions are selected (e.g. slo_i64, sro_i64, min_i64, etc.), perhaps it's worth either omitting them or, if the goal is to eventually support them, adding a quick comment?

I noticed the same in the 3rd and 5th patches too, for rol_i64 and fshl_i64.

lewis-revill accepted this revision.Jul 14 2020, 8:09 AM

Thanks Paolo, tests are all passing and apart from the nitpicks this is a green light from me, as with the rest in this series.

This revision is now accepted and ready to land.Jul 14 2020, 8:09 AM
PaoloS marked an inline comment as done.Jul 14 2020, 8:28 AM
PaoloS added inline comments.

I see what you mean. I just like the idea of staying consistent with the other tests while showing cases where there's still room for improvement. Much of this codegen pattern-matching work was also about looking for cases that could be optimized.
Things could also still change, considering that a new sub-extension is being drafted.
On the other hand, I understand that these tests can look wrong, since they don't show any particular change.

PaoloS marked 7 inline comments as done.Jul 14 2020, 10:25 AM
PaoloS added inline comments.

On it.


I'm commenting those.

PaoloS marked an inline comment as done.Jul 14 2020, 11:19 AM
PaoloS updated this revision to Diff 277919.Jul 14 2020, 11:37 AM

Aligned the declarations of the complex patterns.
Added comments to inefficient tests.

Thank you very much @lewis-revill, much appreciated.
I don't have commit access; can you @asb or someone else commit it for me?
Unless of course there's something else that needs immediate correction.

Many thanks all.

Sure, I'll land these. Apologies for the delay.

No worries.
Thank you @lewis-revill

This revision was automatically updated to reflect the committed changes.

It's a shame this just missed the creation of the llvm 11.0 branch, do we think it's worth trying to get this backported since it only just missed?

asb added a comment.Jul 21 2020, 6:51 AM

> It's a shame this just missed the creation of the llvm 11.0 branch, do we think it's worth trying to get this backported since it only just missed?

I wouldn't be opposed. As they're all guarded by experimental flags, the risk of issues in cherry-picking these patches is pretty minimal.