This is an archive of the discontinued LLVM Phabricator instance.

Differential D79870

[RISCV] Add matching of codegen patterns to RISCV Bit Manipulation Zbb asm instructions
ClosedPublic

Authored by PaoloS on May 13 2020, 8:29 AM.


Details

Reviewers

simoncook
edward-jones
asb
lewis-revill

Commits

rGf749d92f7a32: [RISCV] Add matching of codegen patterns to RISCV Bit Manipulation Zbb asm…
rGe2692f0ee7f3: [RISCV] Add matching of codegen patterns to RISCV Bit Manipulation Zbb asm…

Summary

This patch provides optimization of bit manipulation operations by enabling
the +experimental-b target feature.
It adds matching of single block patterns of instructions to specific
bit-manip instructions from the base subset (zbb subextension) of the
experimental B extension of RISC-V.
It also adds the corresponding codegen tests.

This patch is based on Clifford Wolf's proposal for the bit manipulation
extension of RISCV:
https://github.com/riscv/riscv-bitmanip/blob/master/bitmanip-0.92.pdf

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

PaoloS created this revision. May 13 2020, 8:29 AM

Herald added a project: Restricted Project. May 13 2020, 8:30 AM

Herald added subscribers: llvm-commits, evandro, luismarques and 25 others.

PaoloS added a child revision: D79871: [RISCV] Add matching of codegen patterns to RISCV Bit Manipulation Zbp asm instructions. May 13 2020, 8:42 AM

Harbormaster completed remote builds in B56603: Diff 263729. May 13 2020, 10:50 AM

Sorry for the delay on this - the lockdown situation is really hurting my review time, though it looks like my childcare situation will improve from the week after next.

Quite a lot of comments below, but I think this actually is almost ready to go with a few extra patterns or at least reworked test cases. Thanks again for your work.

I've added a few comments inline, beyond that:

- For the test files, I'd probably just have RV32IB and RV64IB check lines. The RV32I-NOT/RV64I-NOT lines are helpful defence in depth, but they're not very compatible with update_llc_test_checks.py which we prefer to use whenever possible. This is like the float-*.ll test files. Alternatively we just let update_llc_test_checks generate code for the non-bitmanip targets (more like the mul.ll test).
- Use update_llc_test_checks.py to generate and maintain the check lines.
- Missing slliu.w?
- No immediate variants? e.g. sloi, sroi
- Missing some W instructions such as SLOW and SROW
- I'd suggest having one file called something like rvZbb.ll which contains tests that are relevant for both RV32 and RV64. Note this likely does include both i64 and i32 test cases - we want to ensure reasonable codegen for i32 values on RV64 and for i64 values on RV32, in both cases using hardware instructions when possible. If there are tests that really would just be noise for RV32, then put those in rvZ64bb.ll.

I think we do have flexibility to commit something that falls short of handling codegen for all Zbb instructions, but if doing so it would be helpful to note in e.g. the test file (ideally even with tests!) cases that aren't yet handled, so it's easy to return to. If it's easy enough to add the missing cases, that would be preferred.

llvm/test/CodeGen/RISCV/rv32Zbb.ll
41	clz on a zero is a well defined operation that will return XLEN. So shouldn't this just lower to clz and ret?
61	Same comment as for clz above
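The point about clz on zero can be illustrated with a small model. This is only a sketch: `clz32` and `guardedClz32` are hypothetical helper names, and the model assumes, as the comment above states, that clz returns XLEN for a zero input.

```cpp
#include <cstdint>

// Hypothetical reference model of a 32-bit clz: counts leading zero bits and
// yields XLEN (32 here) when the input is zero, as the review comment states.
constexpr uint32_t clz32(uint32_t x, uint32_t count = 0) {
  return (count == 32 || (x & 0x80000000u) != 0) ? count
                                                 : clz32(x << 1, count + 1);
}

// Model of the code currently generated in the test: branch on zero,
// otherwise execute clz.
constexpr uint32_t guardedClz32(uint32_t x) {
  return x == 0 ? 32u : clz32(x);
}

// The guard never changes the result, so "clz; ret" alone would be enough.
static_assert(guardedClz32(0) == clz32(0), "zero input");
static_assert(guardedClz32(1) == clz32(1), "lowest bit set");
static_assert(guardedClz32(0x80000000u) == clz32(0x80000000u), "top bit set");
```

Under that assumption the `beqz` branch in the check lines is redundant, which is what the inline comment is asking about.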

Sorry for the late answer.
I'm catching up with this now.

I agree on the reorganization of the tests. I'm fixing that.
I notice that the tests of the 64 bit instructions on 32 bit are quite noisy (above all for clz, ctz and pcnt). I'll soon upload a revision so that you can all see.

The immediate shifts with ones (sloi, sroi), instead, have the problem that LLVM optimizes them, so that instead of DAG nodes resembling the straightforward operation:

(sloi)

~(~x << shamt)

it prefers to use a mask:

(x << shamt) | (~(-1 << shamt))

That means that the DAG pattern contains a constant (the right operand of the 'or') whose value depends on the shamt and which the selected instruction (sloi/sroi) won't use.
Of course we could just drop it since it isn't used, like this:

def : Pat<(or (shl GPR:$rs1, simm12:$shamt), mask),
        (SLOI GPR:$rs1, simm12:$shamt)>;

But that introduces an ambiguity.
To be sure the operand really is the mask derived from the shamt, we need to check its value against the shamt.
I'm now trying to see if a ComplexPattern can do the trick and select sloi and sroi for me while checking that the mask is correct.
I'm not sure, though, whether it is a neat enough solution for upstream.
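As a rough illustration of the two forms and of the check a selector would need, here is a minimal sketch on 32-bit values. The function names are hypothetical and this is not the ComplexPattern code from the patch; it only models the arithmetic, and it assumes a shift amount below 32.

```cpp
#include <cstdint>

// Straightforward definition of a 32-bit sloi: shift left, shifting in ones.
constexpr uint32_t sloiDirect(uint32_t x, unsigned shamt) {
  return ~(~x << shamt);
}

// The form the DAG ends up with: the shifted value OR'ed with a low-bit mask.
constexpr uint32_t sloiMasked(uint32_t x, unsigned shamt) {
  return (x << shamt) | ~(0xFFFFFFFFu << shamt);
}

// Hypothetical check a selector would have to perform before emitting sloi:
// the 'or' constant must be exactly the mask implied by the shift amount.
constexpr bool isSloiMask(uint32_t mask, unsigned shamt) {
  return shamt < 32 && mask == ((uint32_t{1} << shamt) - 1);
}

static_assert(sloiDirect(0x0000000Fu, 4) == sloiMasked(0x0000000Fu, 4),
              "both forms agree");
static_assert(isSloiMask(0x0000000Fu, 4), "0xF is the mask for shamt 4");
static_assert(!isSloiMask(0x00000007u, 4), "0x7 is not the mask for shamt 4");
```

Without something like `isSloiMask`, an `or` with an unrelated constant would also match the simple pattern, which is the ambiguity described above.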

The issue with slliu.w is quite different. As the documentation says, slliu.w is identical to slli except that it zeroes bits (xlen-1):32 of the source before shifting.
LLVM, though, optimizes that cast away before it gets to instruction selection, so it is not possible to distinguish it from a normal slli. Considering that the result is a single instruction (slli) in any case, I think it's better to leave it like that and let the user use the slliu.w instruction directly if needed.
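For reference, the difference between the two shifts can be modelled as follows. This is a sketch with hypothetical function names, assuming RV64 and the semantics described above (the upper 32 bits of the source are cleared before the shift).

```cpp
#include <cstdint>

// RV64 slli: plain logical shift left of the full 64-bit register.
constexpr uint64_t slli(uint64_t rs1, unsigned shamt) {
  return rs1 << shamt;
}

// slliu.w as described above: zero-extend the low 32 bits, then shift.
constexpr uint64_t slliuw(uint64_t rs1, unsigned shamt) {
  return (rs1 & 0xFFFFFFFFull) << shamt;
}

// The two only differ when the upper 32 bits of the source are not zero.
static_assert(slli(0x1ull, 4) == slliuw(0x1ull, 4), "upper bits already zero");
static_assert(slli(0xFFFFFFFF00000001ull, 4) != slliuw(0xFFFFFFFF00000001ull, 4),
              "upper bits set");
```

If the zero-extension is no longer visible by the time selection runs, the selector has no way to know which of the two models applies.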
A similar thing happens with ctzw.
For clzw and pcntw LLVM doesn't optimize out the truncation, as omitting it could affect the result; for ctz it doesn't matter. I guess that since it is counting trailing zeroes up to bit 31, once it sees that the lower 32 bits of the number are not 0 it processes the original 64 bit value normally.
Otherwise it returns 32.
That makes it impractical to tell it apart from an RV64 ctz.
The outcome is that for llvm.cttz.i32 on RV64 I still get ctz instead of ctzw.
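The reasoning about ctz can be checked with a small model. This is a sketch under assumptions: the helper names are hypothetical, and setting bit 32 before a 64-bit count is shown only as one way to obtain the "otherwise it returns 32" behaviour, not as a statement of what the compiler emits.

```cpp
#include <cstdint>

// Counts trailing zero bits of x, stopping at `width` (returned for x == 0).
constexpr uint32_t ctzBits(uint64_t x, uint32_t width, uint32_t count = 0) {
  return (count == width || (x & 1) != 0) ? count
                                          : ctzBits(x >> 1, width, count + 1);
}

// Model of the RV64 ctz: looks at all 64 bits.
constexpr uint32_t ctz64(uint64_t x) { return ctzBits(x, 64); }

// Model of ctzw: looks only at the low 32 bits and returns 32 if they are 0.
constexpr uint32_t ctzw(uint64_t x) { return ctzBits(x & 0xFFFFFFFFull, 32); }

// A 64-bit ctz with bit 32 forced to one gives the same answer as ctzw.
constexpr uint32_t ctzwViaCtz64(uint64_t x) {
  return ctz64(x | (uint64_t{1} << 32));
}

static_assert(ctz64(8) == ctzw(8), "low bits non-zero: identical results");
static_assert(ctzwViaCtz64(0) == ctzw(0), "zero input capped at 32");
```

When the low 32 bits are non-zero the 64-bit and 32-bit counts coincide, which is why the two instructions are hard to tell apart at selection time.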

PaoloS marked 4 inline comments as done. Jun 25 2020, 5:11 AM

PaoloS added inline comments.

llvm/test/CodeGen/RISCV/rv32Zbb.ll
41	I agree. Unfortunately the code gets split into multiple basic blocks before selection, and only the block for the condition a0 != 0 has the ctlz operation in it. Since pattern matching can only look at one block at a time, that is all I could do from the backend. I based the pattern matching of clz on the LLVM intrinsic llvm.ctlz.i32, which already relies on its own idiom recognition in the middle end. A solution could be to turn off the intrinsics and try to pattern match it directly from the backend; maybe we could simplify it. But the scope is limited.
61	Same as above

Added missing pattern-matching for *w instructions.
Added codegen tests.
Added ComplexPattern instances that are crucial to pattern-match SLOI, SROI, SLOIW, SROIW and SLLIUW.
Both the 32 and 64 bit test files have both 32 and 64 bit test cases of the instructions (where they exist).

Just a clarification. I decided to split the tests into 32bit and 64bit files because 32bit code compiled for RV64 commonly produces sign-extended IR, and that is when many *w instructions are selected. Keeping the tests in a single file would mean, on one hand, having 32 bit IR with sign-extension compiled for RV32 (harmless but redundant) and, on the other hand, having i32 code with no explicit sign-extension compiled for RV64. That is correct, but it might lead to misleading selections, like matching the IR of a 32bit SLOI on RV64 with an RV64 SLOI instead of a SLOIW (the difference is that SLOIW ignores the upper 32 bits of the result while the RV64 SLOI doesn't).
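To make the SLOI versus SLOIW distinction concrete, here is a minimal model. The names are hypothetical, and it assumes that SLOIW, like other *W instructions, computes on the low 32 bits and sign-extends the 32-bit result into the 64-bit register.

```cpp
#include <cstdint>

// RV64 sloi: shift-left-ones on the full 64-bit register.
constexpr uint64_t sloi64(uint64_t rs1, unsigned shamt) {
  return ~(~rs1 << shamt);
}

// 32-bit shift-left-ones on the low word only.
constexpr uint32_t slo32(uint32_t x, unsigned shamt) {
  return ~(~x << shamt);
}

// Sign-extends bit 31 of a 32-bit value into a 64-bit value.
constexpr uint64_t signExtend32(uint64_t lo) {
  return (lo ^ 0x80000000ull) - 0x80000000ull;
}

// Assumed model of sloiw: 32-bit operation, result sign-extended to 64 bits.
constexpr uint64_t sloiw(uint64_t rs1, unsigned shamt) {
  return signExtend32(slo32(static_cast<uint32_t>(rs1), shamt));
}

// Same low word, different upper word.
static_assert(sloi64(0x40000000ull, 1) == 0x0000000080000001ull, "RV64 sloi");
static_assert(sloiw(0x40000000ull, 1) == 0xFFFFFFFF80000001ull, "sloiw");
```

The low 32 bits agree in both cases, so whether the sign-extension is present in the IR decides which instruction is the right match.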

lewis-revill added inline comments. Jul 9 2020, 6:51 AM

llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
173 ↗	(On Diff #276249)	Indentation within these Select functions is messed up, presumably due to a mix of tabs and spaces.
262 ↗	(On Diff #276249)	I'm not sure what convention other select functions for W instructions follow, but perhaps an assert for IsRV64 should be added for completeness?
llvm/lib/Target/RISCV/RISCVInstrInfoB.td
641	Can these W selects be guarded for 64 bit only?

PaoloS marked 2 inline comments as done. Jul 9 2020, 8:38 AM

PaoloS added inline comments.

llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp
173 ↗	(On Diff #276249)	Yes, I was trying to use spaces only in the end. Must have missed these.
262 ↗	(On Diff #276249)	Well, SLOIW exists only on RV64. I could add it, but I think it would be a bit redundant if I guard the selects only for RV64. But yes, for completeness I probably should.

PaoloS marked an inline comment as done. Jul 9 2020, 7:24 PM

PaoloS added inline comments.

llvm/lib/Target/RISCV/RISCVInstrInfoB.td
641	Not sure how to do it, they can't be enclosed in Predicates like the instruction patterns.

Fixed indentation.
Added architecture type control for complex pattern matching of sloiw, sroiw and slliuw.

Updated the test:

The tests have been updated from the top of all the sub-patches together so that they are exactly the same as they would be if updated with the whole final patch.
Labels specific to the sub-extension have been added alongside the generic RISCVIB label (which activates all the sub-extensions) so that we can see how differently the patterns are matched with the specific subextension or with all of them together.
The tests will probably fail if run after checking out the commit of a single subextension, and they'll change if updated. They are designed to work with the final squashed patch.

Corrected the order of the patterns.

PaoloS edited the summary of this revision. Jul 14 2020, 6:13 AM

lewis-revill added inline comments. Jul 14 2020, 8:04 AM

llvm/lib/Target/RISCV/RISCVInstrInfoB.td
641	Looks like you fixed this with the operand to `ComplexPattern`. Only nitpick is to get the `:` characters aligned vertically here.
llvm/test/CodeGen/RISCV/rv32Zbb.ll
33	Nitpick: For these tests on RV32 where no bitmanip instructions are selected (EG: `slo_i64`, `sro_i64`, `min_i64` etc.) perhaps it's worth either omitting these, or if the goal is to eventually support them, just add a quick comment? I noticed the same in the 3rd and 5th patches too, for `rol_i64` and `fshl_i64`.

Thanks Paolo, tests are all passing and apart from the nitpicks this is a green light from me, as with the rest in this series.

This revision is now accepted and ready to land. Jul 14 2020, 8:09 AM

PaoloS marked an inline comment as done. Jul 14 2020, 8:28 AM

PaoloS added inline comments.

llvm/test/CodeGen/RISCV/rv32Zbb.ll
33	I see what you mean. I just like the idea of showing consistency with the other tests while showing cases where there's still room for improvement. Much of this codegen pattern-matching work was also about looking for cases that could be optimized. Also, things could still change considering that a new subextension is being drafted. On the other hand, I understand that these tests can look wrong, since they don't show any particular changes.

PaoloS marked 7 inline comments as done. Jul 14 2020, 10:25 AM

PaoloS added inline comments.

llvm/lib/Target/RISCV/RISCVInstrInfoB.td
641	On it.
llvm/test/CodeGen/RISCV/rv32Zbb.ll
33	I'm commenting those.

PaoloS marked an inline comment as done. Jul 14 2020, 11:19 AM

Aligned the declarations of the complex patterns.
Added comments to inefficient tests.

Thank you very much @lewis-revill, much appreciated.
I haven't got commit access, can you @asb or someone else commit it for me?
Unless of course there's something else that needs immediate correction.

Many thanks all.

Sure, I'll land these. Apologies for the delay.

No worries.
Thank you @lewis-revill

Closed by commit rGe2692f0ee7f3: [RISCV] Add matching of codegen patterns to RISCV Bit Manipulation Zbb asm… (authored by lewis-revill). Jul 15 2020, 4:20 AM

This revision was automatically updated to reflect the committed changes.

It's a shame this just missed the creation of the llvm 11.0 branch, do we think it's worth trying to get this backported since it only just missed?

In D79870#2153608, @simoncook wrote:

It's a shame this just missed the creation of the llvm 11.0 branch, do we think it's worth trying to get this backported since it only just missed?

I wouldn't be opposed. As they're all guarded by experimental flags, the risk of issues in cherry-picking these patches is pretty minimal.

Revision Contents

Path                                          Size
llvm/lib/Target/RISCV/RISCVISelLowering.cpp   9 lines
llvm/lib/Target/RISCV/RISCVInstrInfoB.td      52 lines
llvm/test/CodeGen/RISCV/rv32Zbb.ll            153 lines
llvm/test/CodeGen/RISCV/rv64Zbb.ll            213 lines

Diff 263729

llvm/lib/Target/RISCV/RISCVISelLowering.cpp

RISCVTargetLowering::RISCVTargetLowering(const TargetMachine &TM,

  setOperationAction(ISD::SHL_PARTS, XLenVT, Custom);
  setOperationAction(ISD::SRL_PARTS, XLenVT, Custom);
  setOperationAction(ISD::SRA_PARTS, XLenVT, Custom);

  setOperationAction(ISD::ROTL, XLenVT, Expand);
  setOperationAction(ISD::ROTR, XLenVT, Expand);
  setOperationAction(ISD::BSWAP, XLenVT, Expand);

+ if (!Subtarget.hasStdExtZbb()) {
    setOperationAction(ISD::CTTZ, XLenVT, Expand);
    setOperationAction(ISD::CTLZ, XLenVT, Expand);
    setOperationAction(ISD::CTPOP, XLenVT, Expand);
+ }

  ISD::CondCode FPCCToExtend[] = {
      ISD::SETOGT, ISD::SETOGE, ISD::SETONE, ISD::SETUEQ, ISD::SETUGT,
      ISD::SETUGE, ISD::SETULT, ISD::SETULE, ISD::SETUNE, ISD::SETGT,
      ISD::SETGE, ISD::SETNE};

  ISD::NodeType FPOpToExtend[] = {
      ISD::FSIN, ISD::FCOS, ISD::FSINCOS, ISD::FPOW, ISD::FREM, ISD::FP16_TO_FP,

llvm/lib/Target/RISCV/RISCVInstrInfoB.td

				def : CompressPat<(SUB GPRC:$rs1, X0, GPRC:$rs1),
				(C_NEG GPRC:$rs1)>;
				} // Predicates = [HasStdExtZbproposedc, HasStdExtC]

				let Predicates = [HasStdExtZbproposedc, HasStdExtZbbOrZbp, HasStdExtC, IsRV64] in {
				def : CompressPat<(PACK GPRC:$rs1, GPRC:$rs1, X0),
				(C_ZEXTW GPRC:$rs1)>;
				} // Predicates = [HasStdExtZbproposedc, HasStdExtC, IsRV64]

				//===----------------------------------------------------------------------===//
				// Codegen patterns
				//===----------------------------------------------------------------------===//

				let Predicates = [HasStdExtZbb] in {
				def : Pat<(xor (shl (xor GPR:$rs1, -1), GPR:$rs2), -1),
				(SLO GPR:$rs1, GPR:$rs2)>;
				def : Pat<(xor (srl (xor GPR:$rs1, -1), GPR:$rs2), -1),
				(SRO GPR:$rs1, GPR:$rs2)>;
				def : Pat<(ctlz GPR:$rs1), (CLZ GPR:$rs1)>;
				def : Pat<(cttz GPR:$rs1), (CTZ GPR:$rs1)>;
				def : Pat<(ctpop GPR:$rs1), (PCNT GPR:$rs1)>;
				} // Predicates = [HasStdExtZbb]

				let Predicates = [HasStdExtZbb, IsRV32] in
				def : Pat<(sra (shl GPR:$rs1, (i32 24)), (i32 24)), (SEXTB GPR:$rs1)>;
				let Predicates = [HasStdExtZbb, IsRV64] in
				def : Pat<(sra (shl GPR:$rs1, (i64 56)), (i64 56)), (SEXTB GPR:$rs1)>;

				let Predicates = [HasStdExtZbb, IsRV32] in
				def : Pat<(sra (shl GPR:$rs1, (i32 16)), (i32 16)), (SEXTH GPR:$rs1)>;
				let Predicates = [HasStdExtZbb, IsRV64] in
				def : Pat<(sra (shl GPR:$rs1, (i64 48)), (i64 48)), (SEXTH GPR:$rs1)>;

				let Predicates = [HasStdExtZbb] in {
				def : Pat<(smin GPR:$rs1, GPR:$rs2), (MIN GPR:$rs1, GPR:$rs2)>;
				def : Pat<(riscv_selectcc GPR:$rs1, GPR:$rs2, (XLenVT 20), GPR:$rs1, GPR:$rs2),
				(MIN GPR:$rs1, GPR:$rs2)>;
				def : Pat<(smax GPR:$rs1, GPR:$rs2), (MAX GPR:$rs1, GPR:$rs2)>;
				def : Pat<(riscv_selectcc GPR:$rs2, GPR:$rs1, (XLenVT 20), GPR:$rs1, GPR:$rs2),
				(MAX GPR:$rs1, GPR:$rs2)>;
				def : Pat<(umin GPR:$rs1, GPR:$rs2), (MINU GPR:$rs1, GPR:$rs2)>;
				def : Pat<(riscv_selectcc GPR:$rs1, GPR:$rs2, (XLenVT 12), GPR:$rs1, GPR:$rs2),
				(MINU GPR:$rs1, GPR:$rs2)>;
				def : Pat<(umax GPR:$rs1, GPR:$rs2), (MAXU GPR:$rs1, GPR:$rs2)>;
				def : Pat<(riscv_selectcc GPR:$rs2, GPR:$rs1, (XLenVT 12), GPR:$rs1, GPR:$rs2),
				(MAXU GPR:$rs1, GPR:$rs2)>;
				} // Predicates = [HasStdExtZbb]

				let Predicates = [HasStdExtZbb, IsRV64] in {
				def : Pat<(and (add GPR:$rs, simm12:$simm12), 0xFFFFFFFF),
				(ADDIWU GPR:$rs, simm12:$simm12)>;
				def : Pat<(and (add GPR:$rs1, GPR:$rs2), 0xFFFFFFFF),
				(ADDWU GPR:$rs1, GPR:$rs2)>;
				def : Pat<(and (sub GPR:$rs1, GPR:$rs2), 0xFFFFFFFF),
				(SUBWU GPR:$rs1, GPR:$rs2)>;
				def : Pat<(add GPR:$rs1, (and GPR:$rs2, 0xFFFFFFFF)),
				(ADDUW GPR:$rs1, GPR:$rs2)>;
				def : Pat<(sub GPR:$rs1, (and GPR:$rs2, 0xFFFFFFFF)),
				(SUBUW GPR:$rs1, GPR:$rs2)>;
				} // Predicates = [HasStdExtZbb, IsRV64]
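
The RV64-only patterns above can be read as the following arithmetic. This is a sketch derived directly from the DAG patterns, with hypothetical function names; it models what each pattern matches, not the instruction encodings.

```cpp
#include <cstdint>

constexpr uint64_t kLow32 = 0xFFFFFFFFull;

// (and (add rs1, rs2), 0xFFFFFFFF): add, then keep the low 32 bits.
constexpr uint64_t addwu(uint64_t rs1, uint64_t rs2) {
  return (rs1 + rs2) & kLow32;
}

// (and (sub rs1, rs2), 0xFFFFFFFF): subtract, then keep the low 32 bits.
constexpr uint64_t subwu(uint64_t rs1, uint64_t rs2) {
  return (rs1 - rs2) & kLow32;
}

// (add rs1, (and rs2, 0xFFFFFFFF)): zero-extend rs2's low word, then add.
constexpr uint64_t adduw(uint64_t rs1, uint64_t rs2) {
  return rs1 + (rs2 & kLow32);
}

// (sub rs1, (and rs2, 0xFFFFFFFF)): zero-extend rs2's low word, then subtract.
constexpr uint64_t subuw(uint64_t rs1, uint64_t rs2) {
  return rs1 - (rs2 & kLow32);
}

static_assert(addwu(0xFFFFFFFFull, 1) == 0, "result wraps within 32 bits");
static_assert(adduw(0x100000000ull, 0xFFFFFFFF00000001ull) == 0x100000001ull,
              "only the low word of rs2 contributes");
```

The difference between the two groups is where the mask sits: on the result for addwu/subwu, and on the second operand for addu.w/subu.w.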

llvm/test/CodeGen/RISCV/rv32Zbb.ll

This file was added.

				; RUN: llc -mtriple=riscv32 -verify-machineinstrs < %s \
				; RUN:   | FileCheck %s -check-prefix=RV32I
				; RUN: llc -mtriple=riscv32 -mattr=+experimental-b -verify-machineinstrs < %s \
				; RUN:   | FileCheck %s -check-prefix=RV32IB
				; RUN: llc -mtriple=riscv32 -mattr=+experimental-zbb -verify-machineinstrs < %s \
				; RUN:   | FileCheck %s -check-prefix=RV32IB

				define i32 @slo(i32 %a, i32 %b) nounwind {
				; RV32I-NOT: slo a0, a0, a1
				;
				; RV32IB-LABEL: slo:
				; RV32IB: # %bb.0:
				; RV32IB-NEXT: slo a0, a0, a1
				; RV32IB-NEXT: ret
				%neg = xor i32 %a, -1
				%shl = shl i32 %neg, %b
				%neg1 = xor i32 %shl, -1
				ret i32 %neg1
				}

				define i32 @sro(i32 %a, i32 %b) nounwind {
				; RV32I-NOT: sro a0, a0, a1
				;
				; RV32IB-LABEL: sro:
				; RV32IB: # %bb.0:
				; RV32IB-NEXT: sro a0, a0, a1
				; RV32IB-NEXT: ret
				%neg = xor i32 %a, -1
				%shr = lshr i32 %neg, %b
				%neg1 = xor i32 %shr, -1
				ret i32 %neg1
				}

				declare i32 @llvm.ctlz.i32(i32, i1)

				define i32 @ctlz_i32(i32 %a) nounwind {
				; RV32I-NOT: clz a0, a0
				;
				; RV32IB-LABEL: ctlz_i32:
				; RV32IB: # %bb.0:
				; RV32IB-NEXT: beqz a0, .LBB2_2
				; RV32IB-NEXT: # %bb.1: # %cond.false
				; RV32IB-NEXT: clz a0, a0
				; RV32IB-NEXT: ret
				; RV32IB-NEXT: .LBB2_2:
				; RV32IB-NEXT: addi a0, zero, 32
				; RV32IB-NEXT: ret
				%1 = call i32 @llvm.ctlz.i32(i32 %a, i1 false)
				ret i32 %1
				}

				declare i32 @llvm.cttz.i32(i32, i1)

				define i32 @cttz_i32(i32 %a) nounwind {
				; RV32I-NOT: ctz a0, a0
				;
				; RV32IB-LABEL: cttz_i32:
				; RV32IB: # %bb.0:
				; RV32IB-NEXT: beqz a0, .LBB3_2
				; RV32IB-NEXT: # %bb.1: # %cond.false
				; RV32IB-NEXT: ctz a0, a0
				; RV32IB-NEXT: ret
				; RV32IB-NEXT: .LBB3_2:
				; RV32IB-NEXT: addi a0, zero, 32
				; RV32IB-NEXT: ret
				%1 = call i32 @llvm.cttz.i32(i32 %a, i1 false)
				ret i32 %1
				}

				declare i32 @llvm.ctpop.i32(i32)

				define i32 @ctpop_i32(i32 %a) nounwind {
				; RV32I-NOT: pcnt a0, a0
				;
				; RV32IB-LABEL: ctpop_i32:
				; RV32IB: # %bb.0:
				; RV32IB-NEXT: pcnt a0, a0
				; RV32IB-NEXT: ret
				%1 = call i32 @llvm.ctpop.i32(i32 %a)
				ret i32 %1
				}

				define i32 @sextb(i32 %a) nounwind {
				; RV32I-NOT: sext.b a0, a0
				;
				; RV32IB-LABEL: sextb:
				; RV32IB: # %bb.0:
				; RV32IB-NEXT: sext.b a0, a0
				; RV32IB-NEXT: ret
				%shl = shl i32 %a, 24
				%shr = ashr exact i32 %shl, 24
				ret i32 %shr
				}

				define i32 @sexth(i32 %a) nounwind {
				; RV32I-NOT: sext.h a0, a0
				;
				; RV32IB-LABEL: sexth:
				; RV32IB: # %bb.0:
				; RV32IB-NEXT: sext.h a0, a0
				; RV32IB-NEXT: ret
				%shl = shl i32 %a, 16
				%shr = ashr exact i32 %shl, 16
				ret i32 %shr
				}

				define i32 @min(i32 %a, i32 %b) nounwind {
				; RV32I-NOT: min a0, a0, a1
				;
				; RV32IB-LABEL: min:
				; RV32IB: # %bb.0:
				; RV32IB-NEXT: min a0, a0, a1
				; RV32IB-NEXT: ret
				%cmp = icmp slt i32 %a, %b
				%cond = select i1 %cmp, i32 %a, i32 %b
				ret i32 %cond
				}

				define i32 @max(i32 %a, i32 %b) nounwind {
				; RV32I-NOT: max a0, a0, a1
				;
				; RV32IB-LABEL: max:
				; RV32IB: # %bb.0:
				; RV32IB-NEXT: max a0, a0, a1
				; RV32IB-NEXT: ret
				%cmp = icmp sgt i32 %a, %b
				%cond = select i1 %cmp, i32 %a, i32 %b
				ret i32 %cond
				}

				define i32 @minu(i32 %a, i32 %b) nounwind {
				; RV32I-NOT: minu a0, a0, a1
				;
				; RV32IB-LABEL: minu:
				; RV32IB: # %bb.0:
				; RV32IB-NEXT: minu a0, a0, a1
				; RV32IB-NEXT: ret
				%cmp = icmp ult i32 %a, %b
				%cond = select i1 %cmp, i32 %a, i32 %b
				ret i32 %cond
				}

				define i32 @maxu(i32 %a, i32 %b) nounwind {
				; RV32I-NOT: maxu a0, a0, a1
				;
				; RV32IB-LABEL: maxu:
				; RV32IB: # %bb.0:
				; RV32IB-NEXT: maxu a0, a0, a1
				; RV32IB-NEXT: ret
				%cmp = icmp ugt i32 %a, %b
				%cond = select i1 %cmp, i32 %a, i32 %b
				ret i32 %cond
				}

llvm/test/CodeGen/RISCV/rv64Zbb.ll

This file was added.

				; RUN: llc -mtriple=riscv64 -verify-machineinstrs < %s \
				; RUN:   | FileCheck %s -check-prefix=RV64I
				; RUN: llc -mtriple=riscv64 -mattr=+experimental-b -verify-machineinstrs < %s \
				; RUN:   | FileCheck %s -check-prefix=RV64IB
				; RUN: llc -mtriple=riscv64 -mattr=+experimental-zbb -verify-machineinstrs < %s \
				; RUN:   | FileCheck %s -check-prefix=RV64IB

				define i64 @slo(i64 %a, i64 %b) nounwind {
				; RV64I-NOT: slo a0, a0, a1
				;
				; RV64IB-LABEL: slo:
				; RV64IB: # %bb.0:
				; RV64IB-NEXT: slo a0, a0, a1
				; RV64IB-NEXT: ret
				%neg = xor i64 %a, -1
				%shl = shl i64 %neg, %b
				%neg1 = xor i64 %shl, -1
				ret i64 %neg1
				}

				define i64 @sro(i64 %a, i64 %b) nounwind {
				; RV64I-NOT: sro a0, a0, a1
				;
				; RV64IB-LABEL: sro:
				; RV64IB: # %bb.0:
				; RV64IB-NEXT: sro a0, a0, a1
				; RV64IB-NEXT: ret
				%neg = xor i64 %a, -1
				%shr = lshr i64 %neg, %b
				%neg1 = xor i64 %shr, -1
				ret i64 %neg1
				}

				declare i64 @llvm.ctlz.i64(i64, i1)

				define i64 @ctlz_i64(i64 %a) nounwind {
				; RV64I-NOT: clz a0, a0
				;
				; RV64IB-LABEL: ctlz_i64:
				; RV64IB: # %bb.0:
				; RV64IB-NEXT: beqz a0, .LBB2_2
				; RV64IB-NEXT: # %bb.1: # %cond.false
				; RV64IB-NEXT: clz a0, a0
				; RV64IB-NEXT: ret
				; RV64IB-NEXT: .LBB2_2:
				; RV64IB-NEXT: addi a0, zero, 64
				; RV64IB-NEXT: ret
				%1 = call i64 @llvm.ctlz.i64(i64 %a, i1 false)
				ret i64 %1
				}

				declare i64 @llvm.cttz.i64(i64, i1)

				define i64 @cttz_i64(i64 %a) nounwind {
				; RV64I-NOT: ctz a0, a0
				;
				; RV64IB-LABEL: cttz_i64:
				; RV64IB: # %bb.0:
				; RV64IB-NEXT: beqz a0, .LBB3_2
				; RV64IB-NEXT: # %bb.1: # %cond.false
				; RV64IB-NEXT: ctz a0, a0
				; RV64IB-NEXT: ret
				; RV64IB-NEXT: .LBB3_2:
				; RV64IB-NEXT: addi a0, zero, 64
				; RV64IB-NEXT: ret
				%1 = call i64 @llvm.cttz.i64(i64 %a, i1 false)
				ret i64 %1
				}

				declare i64 @llvm.ctpop.i64(i64)

				define i64 @ctpop_i64(i64 %a) nounwind {
				; RV64I-NOT: pcnt a0, a0
				;
				; RV64IB-LABEL: ctpop_i64:
				; RV64IB: # %bb.0:
				; RV64IB-NEXT: pcnt a0, a0
				; RV64IB-NEXT: ret
				%1 = call i64 @llvm.ctpop.i64(i64 %a)
				ret i64 %1
				}

				define i64 @sextb(i64 %a) nounwind {
				; RV64I-NOT: sext.b a0, a0
				;
				; RV64IB-LABEL: sextb:
				; RV64IB: # %bb.0:
				; RV64IB-NEXT: sext.b a0, a0
				; RV64IB-NEXT: ret
				%shl = shl i64 %a, 56
				%shr = ashr exact i64 %shl, 56
				ret i64 %shr
				}

				define i64 @sexth(i64 %a) nounwind {
				; RV64I-NOT: sext.h a0, a0
				;
				; RV64IB-LABEL: sexth:
				; RV64IB: # %bb.0:
				; RV64IB-NEXT: sext.h a0, a0
				; RV64IB-NEXT: ret
				%shl = shl i64 %a, 48
				%shr = ashr exact i64 %shl, 48
				ret i64 %shr
				}

				define i64 @min(i64 %a, i64 %b) nounwind {
				; RV64I-NOT: min a0, a0, a1
				;
				; RV64IB-LABEL: min:
				; RV64IB: # %bb.0:
				; RV64IB-NEXT: min a0, a0, a1
				; RV64IB-NEXT: ret
				%cmp = icmp slt i64 %a, %b
				%cond = select i1 %cmp, i64 %a, i64 %b
				ret i64 %cond
				}

				define i64 @max(i64 %a, i64 %b) nounwind {
				; RV64I-NOT: max a0, a0, a1
				;
				; RV64IB-LABEL: max:
				; RV64IB: # %bb.0:
				; RV64IB-NEXT: max a0, a0, a1
				; RV64IB-NEXT: ret
				%cmp = icmp sgt i64 %a, %b
				%cond = select i1 %cmp, i64 %a, i64 %b
				ret i64 %cond
				}

				define i64 @minu(i64 %a, i64 %b) nounwind {
				; RV64I-NOT: minu a0, a0, a1
				;
				; RV64IB-LABEL: minu:
				; RV64IB: # %bb.0:
				; RV64IB-NEXT: minu a0, a0, a1
				; RV64IB-NEXT: ret
				%cmp = icmp ult i64 %a, %b
				%cond = select i1 %cmp, i64 %a, i64 %b
				ret i64 %cond
				}

				define i64 @maxu(i64 %a, i64 %b) nounwind {
				; RV64I-NOT: maxu a0, a0, a1
				;
				; RV64IB-LABEL: maxu:
				; RV64IB: # %bb.0:
				; RV64IB-NEXT: maxu a0, a0, a1
				; RV64IB-NEXT: ret
				%cmp = icmp ugt i64 %a, %b
				%cond = select i1 %cmp, i64 %a, i64 %b
				ret i64 %cond
				}

				define i64 @addiwu(i64 %a) nounwind {
				; RV64I-NOT: addiwu a0, a0, 1
				;
				; RV64IB-LABEL: addiwu:
				; RV64IB: # %bb.0:
				; RV64IB-NEXT: addiwu a0, a0, 1
				; RV64IB-NEXT: ret
				%conv = add i64 %a, 1
				%conv1 = and i64 %conv, 4294967295
				ret i64 %conv1
				}

				define i64 @addwu(i64 %a, i64 %b) nounwind {
				; RV64I-NOT: addwu a0, a1, a0
				;
				; RV64IB-LABEL: addwu:
				; RV64IB: # %bb.0:
				; RV64IB-NEXT: addwu a0, a1, a0
				; RV64IB-NEXT: ret
				%add = add i64 %b, %a
				%conv1 = and i64 %add, 4294967295
				ret i64 %conv1
				}

				define i64 @subwu(i64 %a, i64 %b) nounwind {
				; RV64I-NOT: subwu a0, a0, a1
				;
				; RV64IB-LABEL: subwu:
				; RV64IB: # %bb.0:
				; RV64IB-NEXT: subwu a0, a0, a1
				; RV64IB-NEXT: ret
				%sub = sub i64 %a, %b
				%conv1 = and i64 %sub, 4294967295
				ret i64 %conv1
				}

				define i64 @adduw(i64 %a, i64 %b) nounwind {
				; RV64I-NOT: addu.w a0, a0, a1
				;
				; RV64IB-LABEL: adduw:
				; RV64IB: # %bb.0:
				; RV64IB-NEXT: addu.w a0, a0, a1
				; RV64IB-NEXT: ret
				%and = and i64 %b, 4294967295
				%add = add i64 %and, %a
				ret i64 %add
				}

				define i64 @subuw(i64 %a, i64 %b) nounwind {
				; RV64I-NOT: subu.w a0, a0, a1
				;
				; RV64IB-LABEL: subuw:
				; RV64IB: # %bb.0:
				; RV64IB-NEXT: subu.w a0, a0, a1
				; RV64IB-NEXT: ret
				%and = and i64 %b, 4294967295
				%sub = sub i64 %a, %and
				ret i64 %sub
				}
