This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] ISel (and (shift X, C1), C2)) to shift pair in more cases
ClosedPublic

Authored by craig.topper on Mar 29 2022, 8:07 PM.

Details

Reviewers

luismarques
asb
frasercrmck

Commits

rG447750053328: [RISCV] ISel (and (shift X, C1), C2)) to shift pair in more cases

Summary

Previously, these isel optimizations were disabled if the AND could
be selected as an ANDI instruction. This patch disables the optimizations
only if the immediate is valid for C.ANDI. If we can't use C.ANDI,
we might be able to compress the shift instructions instead.
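
To make the changed gate concrete, here is a minimal sketch of the two immediate-range checks. `fitsSignedImm` is a hypothetical stand-in for `llvm::isInt<N>`, and the two gate function names are illustrative only, not names from the patch. The sample masks come from the updated tests in this revision. A real C.ANDI also needs its operand in one of the compressible registers (x8 to x15), which this sketch does not model.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for llvm::isInt<N>: true if V fits in an N-bit
// signed immediate field.
template <unsigned N> bool fitsSignedImm(int64_t V) {
  static_assert(N > 0 && N < 64, "width out of range");
  return V >= -(INT64_C(1) << (N - 1)) && V < (INT64_C(1) << (N - 1));
}

// Old gate (illustrative name): the shift-pair transform was skipped
// whenever the mask fit ANDI's 12-bit signed immediate.
bool blockedBeforePatch(int64_t Mask) { return fitsSignedImm<12>(Mask); }

// New gate (illustrative name): the transform is skipped only when the
// mask fits C.ANDI's 6-bit signed immediate.
bool blockedAfterPatch(int64_t Mask) { return fitsSignedImm<6>(Mask); }

// Masks that appear in the updated tests: all fit ANDI, none fit C.ANDI,
// so they now become shift pairs.
void checkExamples() {
  const int64_t TestMasks[] = {2040, -128, -256, 512, 2047};
  for (int64_t Mask : TestMasks) {
    assert(blockedBeforePatch(Mask));
    assert(!blockedAfterPatch(Mask));
  }
  // Small masks still fit C.ANDI and keep the plain AND form.
  assert(blockedAfterPatch(31));
  assert(blockedAfterPatch(-32));
}
```

Masks in the range -32 to 31 are the only ones that still block the transform after this change.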

I'm not checking the C extension since we have relatively poor test
coverage of the C extension. Without the C extension, the code size
should be equal. My only concern would be if the shift+andi had
better latency/throughput on a particular CPU.

I did have to add a peephole to match SRLIW if the input is zexti32
to prevent a regression in rv64zbp.ll.
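
The SRLIW peephole relies on an equivalence that can be sketched in plain integer arithmetic. The functions below are simplified software models of the RV64 instructions, not LLVM code, and their names are chosen for illustration. The equivalence holds for shift amounts of at least 1; in the patch the shift amount is C2 + C3, and C2 is already known to be nonzero.

```cpp
#include <cassert>
#include <cstdint>

// Model of RV64 SRLI: logical right shift of the full 64-bit register.
uint64_t srli(uint64_t X, unsigned Shamt) { return X >> Shamt; }

// Model of RV64 SRLIW: logically shift the low 32 bits right, then
// sign-extend the 32-bit result to 64 bits.
uint64_t srliw(uint64_t X, unsigned Shamt) {
  uint32_t Low = static_cast<uint32_t>(X) >> Shamt;
  // For Shamt >= 1 bit 31 of Low is clear, so the sign extension is a no-op.
  return static_cast<uint64_t>(static_cast<int64_t>(static_cast<int32_t>(Low)));
}

// zexti32: the (and X, 0xFFFFFFFF) input pattern the peephole matches.
uint64_t zext32(uint64_t X) { return X & UINT64_C(0xFFFFFFFF); }

// Check that srli(zext32(X), Shamt) == srliw(X, Shamt) for 1 <= Shamt < 32.
void checkSrliwPeephole() {
  const uint64_t Samples[] = {UINT64_C(0), UINT64_C(1), UINT64_C(0x80000000),
                              UINT64_C(0xFFFFFFFF),
                              UINT64_C(0xDEADBEEFCAFEF00D), ~UINT64_C(0)};
  for (uint64_t X : Samples)
    for (unsigned Shamt = 1; Shamt < 32; ++Shamt)
      assert(srli(zext32(X), Shamt) == srliw(X, Shamt));
}
```

With a shift amount of 0 the two forms differ whenever bit 31 of the input is set, because SRLIW then sign-extends.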

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

craig.topper created this revision. Mar 29 2022, 8:07 PM

Herald added a project: Restricted Project. Mar 29 2022, 8:07 PM

Herald added subscribers: sunshaoce, VincentWu, luke957 and 26 others.

craig.topper requested review of this revision. Mar 29 2022, 8:07 PM

Herald added subscribers: pcwang-thead, eopXD, MaskRay.

LGTM.

> I'm not checking the C extension since we have relatively poor test coverage of the C extension.

Note that in a TODO?

> My only concern would be if the shift+andi had better latency/throughput on a particular CPU.

If someone can list CPUs where that's true please chime in. The only one I'm aware of is (depending on configuration) the PicoRV32, so this seems like a good tradeoff.

This revision is now accepted and ready to land. Mar 30 2022, 10:26 AM

This revision was landed with ongoing or failed builds. Mar 30 2022, 11:49 AM

Closed by commit rG447750053328: [RISCV] ISel (and (shift X, C1), C2)) to shift pair in more cases (authored by craig.topper).

This revision was automatically updated to reflect the committed changes.

craig.topper added a commit: rG447750053328: [RISCV] ISel (and (shift X, C1), C2)) to shift pair in more cases.

Harbormaster completed remote builds in B156879: Diff 419040. Mar 30 2022, 1:14 PM

Revision Contents

Path                                                      Size
llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp               29 lines
llvm/test/CodeGen/RISCV/bitreverse-shift.ll               16 lines
llvm/test/CodeGen/RISCV/bswap-shift.ll                    8 lines
llvm/test/CodeGen/RISCV/rv64zbp.ll                        8 lines
llvm/test/CodeGen/RISCV/selectcc-to-shiftand.ll           8 lines
llvm/test/CodeGen/RISCV/urem-seteq-illegal-types.ll       8 lines

Diff 419235

llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp

[718 lines not shown; context: if (!C)]
       break;
     uint64_t C2 = C->getZExtValue();
     unsigned XLen = Subtarget->getXLen();
     if (!C2 || C2 >= XLen)
       break;

     uint64_t C1 = N1C->getZExtValue();

-    // Keep track of whether this is an andi.
-    bool IsANDI = isInt<12>(N1C->getSExtValue());
+    // Keep track of whether this is a c.andi. If we can't use c.andi, the
+    // shift pair might offer more compression opportunities.
+    // TODO: We could check for C extension here, but we don't have many lit
+    // tests with the C extension enabled so not checking gets better coverage.
+    // TODO: What if ANDI faster than shift?
+    bool IsCANDI = isInt<6>(N1C->getSExtValue());

     // Clear irrelevant bits in the mask.
     if (LeftShift)
       C1 &= maskTrailingZeros<uint64_t>(C2);
     else
       C1 &= maskTrailingOnes<uint64_t>(XLen - C2);

     // Some transforms should only be done if the shift has a single use or
[34 lines not shown; context: if (!LeftShift && isMask_64(C1)) {]
           return;
         }

         // (srli (slli x, c3-c2), c3).
         // Skip it in order to select sraiw.
         bool Skip = Subtarget->hasStdExtZba() && C3 == 32 &&
                     X.getOpcode() == ISD::SIGN_EXTEND_INREG &&
                     cast<VTSDNode>(X.getOperand(1))->getVT() == MVT::i32;
-        if (OneUseOrZExtW && !IsANDI && !Skip) {
+        if (OneUseOrZExtW && !IsCANDI && !Skip) {
           SDNode *SLLI = CurDAG->getMachineNode(
               RISCV::SLLI, DL, XLenVT, X,
               CurDAG->getTargetConstant(C3 - C2, DL, XLenVT));
           SDNode *SRLI =
               CurDAG->getMachineNode(RISCV::SRLI, DL, XLenVT, SDValue(SLLI, 0),
                                      CurDAG->getTargetConstant(C3, DL, XLenVT));
           ReplaceNode(Node, SRLI);
           return;
[13 lines not shown; context: if (LeftShift && isShiftedMask_64(C1)) {]
           SDNode *SLLI_UW =
               CurDAG->getMachineNode(RISCV::SLLI_UW, DL, XLenVT, X,
                                      CurDAG->getTargetConstant(C2, DL, XLenVT));
           ReplaceNode(Node, SLLI_UW);
           return;
         }

         // (srli (slli c2+c3), c3)
-        if (OneUseOrZExtW && !IsANDI) {
+        if (OneUseOrZExtW && !IsCANDI) {
           SDNode *SLLI = CurDAG->getMachineNode(
               RISCV::SLLI, DL, XLenVT, X,
               CurDAG->getTargetConstant(C2 + C3, DL, XLenVT));
           SDNode *SRLI =
               CurDAG->getMachineNode(RISCV::SRLI, DL, XLenVT, SDValue(SLLI, 0),
                                      CurDAG->getTargetConstant(C3, DL, XLenVT));
           ReplaceNode(Node, SRLI);
           return;
         }
       }
     }

     // Turn (and (shr x, c2), c1) -> (slli (srli x, c2+c3), c3) if c1 is a
     // shifted mask with c2 leading zeros and c3 trailing zeros.
     if (!LeftShift && isShiftedMask_64(C1)) {
       uint64_t Leading = XLen - (64 - countLeadingZeros(C1));
       uint64_t C3 = countTrailingZeros(C1);
-      if (Leading == C2 && C2 + C3 < XLen && OneUseOrZExtW && !IsANDI) {
+      if (Leading == C2 && C2 + C3 < XLen && OneUseOrZExtW && !IsCANDI) {
+        unsigned SrliOpc = RISCV::SRLI;
+        // If the input is zexti32 we should use SRLIW.
+        if (X.getOpcode() == ISD::AND && isa<ConstantSDNode>(X.getOperand(1)) &&
+            X.getConstantOperandVal(1) == UINT64_C(0xFFFFFFFF)) {
+          SrliOpc = RISCV::SRLIW;
+          X = X.getOperand(0);
+        }
         SDNode *SRLI = CurDAG->getMachineNode(
-            RISCV::SRLI, DL, XLenVT, X,
+            SrliOpc, DL, XLenVT, X,
             CurDAG->getTargetConstant(C2 + C3, DL, XLenVT));
         SDNode *SLLI =
             CurDAG->getMachineNode(RISCV::SLLI, DL, XLenVT, SDValue(SRLI, 0),
                                    CurDAG->getTargetConstant(C3, DL, XLenVT));
         ReplaceNode(Node, SLLI);
         return;
       }
       // If the leading zero count is C2+32, we can use SRLIW instead of SRLI.
       if (Leading > 32 && (Leading - 32) == C2 && C2 + C3 < 32 &&
-          OneUseOrZExtW && !IsANDI) {
+          OneUseOrZExtW && !IsCANDI) {
         SDNode *SRLIW = CurDAG->getMachineNode(
             RISCV::SRLIW, DL, XLenVT, X,
             CurDAG->getTargetConstant(C2 + C3, DL, XLenVT));
         SDNode *SLLI =
             CurDAG->getMachineNode(RISCV::SLLI, DL, XLenVT, SDValue(SRLIW, 0),
                                    CurDAG->getTargetConstant(C3, DL, XLenVT));
         ReplaceNode(Node, SLLI);
         return;
       }
     }

     // Turn (and (shl x, c2), c1) -> (slli (srli x, c3-c2), c3) if c1 is a
     // shifted mask with no leading zeros and c3 trailing zeros.
     if (LeftShift && isShiftedMask_64(C1)) {
       uint64_t Leading = XLen - (64 - countLeadingZeros(C1));
       uint64_t C3 = countTrailingZeros(C1);
-      if (Leading == 0 && C2 < C3 && OneUseOrZExtW && !IsANDI) {
+      if (Leading == 0 && C2 < C3 && OneUseOrZExtW && !IsCANDI) {
         SDNode *SRLI = CurDAG->getMachineNode(
             RISCV::SRLI, DL, XLenVT, X,
             CurDAG->getTargetConstant(C3 - C2, DL, XLenVT));
         SDNode *SLLI =
             CurDAG->getMachineNode(RISCV::SLLI, DL, XLenVT, SDValue(SRLI, 0),
                                    CurDAG->getTargetConstant(C3, DL, XLenVT));
         ReplaceNode(Node, SLLI);
         return;
       }
       // If we have (32-C2) leading zeros, we can use SRLIW instead of SRLI.
-      if (C2 < C3 && Leading + C2 == 32 && OneUseOrZExtW && !IsANDI) {
+      if (C2 < C3 && Leading + C2 == 32 && OneUseOrZExtW && !IsCANDI) {
         SDNode *SRLIW = CurDAG->getMachineNode(
             RISCV::SRLIW, DL, XLenVT, X,
             CurDAG->getTargetConstant(C3 - C2, DL, XLenVT));
         SDNode *SLLI =
             CurDAG->getMachineNode(RISCV::SLLI, DL, XLenVT, SDValue(SRLIW, 0),
                                    CurDAG->getTargetConstant(C3, DL, XLenVT));
         ReplaceNode(Node, SLLI);
         return;
[1,426 lines not shown]
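
The two "shifted mask" rewrites named in the comments of this file can be checked with ordinary 32-bit arithmetic. The sketch below models XLen = 32 and uses illustrative function names that do not appear in the patch. It assumes, as the guarded code does, that C2 is nonzero and that the shift amounts stay below the register width.

```cpp
#include <cassert>
#include <cstdint>

// Shifted mask with C2 leading zeros and C3 trailing zeros (XLen = 32).
// Requires C2 + C3 < 32.
uint32_t shiftedMask(unsigned C2, unsigned C3) {
  return (UINT32_MAX >> C2) & (UINT32_MAX << C3);
}

// Right-shift case as written in the DAG: (and (srl X, C2), C1).
uint32_t srlThenAnd(uint32_t X, unsigned C2, uint32_t C1) {
  return (X >> C2) & C1;
}
// Its replacement: (slli (srli X, C2 + C3), C3).
uint32_t srliThenSlli(uint32_t X, unsigned C2, unsigned C3) {
  return (X >> (C2 + C3)) << C3;
}

// Left-shift case as written in the DAG: (and (shl X, C2), C1), where C1
// has no leading zeros and C3 trailing zeros, with C2 < C3.
uint32_t shlThenAnd(uint32_t X, unsigned C2, uint32_t C1) {
  return (X << C2) & C1;
}
// Its replacement: (slli (srli X, C3 - C2), C3).
uint32_t srliThenSlliForShl(uint32_t X, unsigned C2, unsigned C3) {
  return (X >> (C3 - C2)) << C3;
}

// Compare each original form with its replacement over all valid shifts.
void checkShiftPairs() {
  const uint32_t Samples[] = {0u, 1u, 0x80000000u, 0xFFFFFFFFu, 0xDEADBEEFu,
                              0x12345678u};
  for (uint32_t X : Samples) {
    for (unsigned C2 = 1; C2 < 32; ++C2)
      for (unsigned C3 = 0; C2 + C3 < 32; ++C3)
        assert(srlThenAnd(X, C2, shiftedMask(C2, C3)) ==
               srliThenSlli(X, C2, C3));
    for (unsigned C3 = 2; C3 < 32; ++C3)
      for (unsigned C2 = 1; C2 < C3; ++C2)
        assert(shlThenAnd(X, C2, UINT32_MAX << C3) ==
               srliThenSlliForShl(X, C2, C3));
  }
}
```

For example, `shiftedMask(21, 3)` is 2040, which matches the RV32 change in bitreverse-shift.ll from `srli 21; andi 2040` to `srli 24; slli 3`.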

llvm/test/CodeGen/RISCV/bitreverse-shift.ll

[114 lines not shown; context: ; RV64ZBKB-NEXT: ret]
   ret i64 %3
 }

 define i8 @test_bitreverse_shli_bitreverse_i8(i8 %a) nounwind {
 ; RV32ZBKB-LABEL: test_bitreverse_shli_bitreverse_i8:
 ; RV32ZBKB: # %bb.0:
 ; RV32ZBKB-NEXT: rev8 a0, a0
 ; RV32ZBKB-NEXT: brev8 a0, a0
-; RV32ZBKB-NEXT: srli a0, a0, 21
-; RV32ZBKB-NEXT: andi a0, a0, 2040
+; RV32ZBKB-NEXT: srli a0, a0, 24
+; RV32ZBKB-NEXT: slli a0, a0, 3
 ; RV32ZBKB-NEXT: rev8 a0, a0
 ; RV32ZBKB-NEXT: brev8 a0, a0
 ; RV32ZBKB-NEXT: srli a0, a0, 24
 ; RV32ZBKB-NEXT: ret
 ;
 ; RV64ZBKB-LABEL: test_bitreverse_shli_bitreverse_i8:
 ; RV64ZBKB: # %bb.0:
 ; RV64ZBKB-NEXT: rev8 a0, a0
 ; RV64ZBKB-NEXT: brev8 a0, a0
-; RV64ZBKB-NEXT: srli a0, a0, 53
-; RV64ZBKB-NEXT: andi a0, a0, 2040
+; RV64ZBKB-NEXT: srli a0, a0, 56
+; RV64ZBKB-NEXT: slli a0, a0, 3
 ; RV64ZBKB-NEXT: rev8 a0, a0
 ; RV64ZBKB-NEXT: brev8 a0, a0
 ; RV64ZBKB-NEXT: srli a0, a0, 56
 ; RV64ZBKB-NEXT: ret
   %1 = call i8 @llvm.bitreverse.i8(i8 %a)
   %2 = shl i8 %1, 3
   %3 = call i8 @llvm.bitreverse.i8(i8 %2)
   ret i8 %3
 }

 define i16 @test_bitreverse_shli_bitreverse_i16(i16 %a) nounwind {
 ; RV32ZBKB-LABEL: test_bitreverse_shli_bitreverse_i16:
 ; RV32ZBKB: # %bb.0:
 ; RV32ZBKB-NEXT: rev8 a0, a0
 ; RV32ZBKB-NEXT: brev8 a0, a0
-; RV32ZBKB-NEXT: srli a0, a0, 9
-; RV32ZBKB-NEXT: andi a0, a0, -128
+; RV32ZBKB-NEXT: srli a0, a0, 16
+; RV32ZBKB-NEXT: slli a0, a0, 7
 ; RV32ZBKB-NEXT: rev8 a0, a0
 ; RV32ZBKB-NEXT: brev8 a0, a0
 ; RV32ZBKB-NEXT: srli a0, a0, 16
 ; RV32ZBKB-NEXT: ret
 ;
 ; RV64ZBKB-LABEL: test_bitreverse_shli_bitreverse_i16:
 ; RV64ZBKB: # %bb.0:
 ; RV64ZBKB-NEXT: rev8 a0, a0
 ; RV64ZBKB-NEXT: brev8 a0, a0
-; RV64ZBKB-NEXT: srli a0, a0, 41
-; RV64ZBKB-NEXT: andi a0, a0, -128
+; RV64ZBKB-NEXT: srli a0, a0, 48
+; RV64ZBKB-NEXT: slli a0, a0, 7
 ; RV64ZBKB-NEXT: rev8 a0, a0
 ; RV64ZBKB-NEXT: brev8 a0, a0
 ; RV64ZBKB-NEXT: srli a0, a0, 48
 ; RV64ZBKB-NEXT: ret
   %1 = call i16 @llvm.bitreverse.i16(i16 %a)
   %2 = shl i16 %1, 7
   %3 = call i16 @llvm.bitreverse.i16(i16 %2)
   ret i16 %3
[52 lines not shown]

llvm/test/CodeGen/RISCV/bswap-shift.ll

[117 lines not shown; context: ; RV64ZB-NEXT: ret]
   %3 = call i64 @llvm.bswap.i64(i64 %2)
   ret i64 %3
 }

 define i16 @test_bswap_shli_7_bswap_i16(i16 %a) nounwind {
 ; RV32ZB-LABEL: test_bswap_shli_7_bswap_i16:
 ; RV32ZB: # %bb.0:
 ; RV32ZB-NEXT: rev8 a0, a0
-; RV32ZB-NEXT: srli a0, a0, 9
-; RV32ZB-NEXT: andi a0, a0, -128
+; RV32ZB-NEXT: srli a0, a0, 16
+; RV32ZB-NEXT: slli a0, a0, 7
 ; RV32ZB-NEXT: rev8 a0, a0
 ; RV32ZB-NEXT: srli a0, a0, 16
 ; RV32ZB-NEXT: ret
 ;
 ; RV64ZB-LABEL: test_bswap_shli_7_bswap_i16:
 ; RV64ZB: # %bb.0:
 ; RV64ZB-NEXT: rev8 a0, a0
-; RV64ZB-NEXT: srli a0, a0, 41
-; RV64ZB-NEXT: andi a0, a0, -128
+; RV64ZB-NEXT: srli a0, a0, 48
+; RV64ZB-NEXT: slli a0, a0, 7
 ; RV64ZB-NEXT: rev8 a0, a0
 ; RV64ZB-NEXT: srli a0, a0, 48
 ; RV64ZB-NEXT: ret
   %1 = call i16 @llvm.bswap.i16(i16 %a)
   %2 = shl i16 %1, 7
   %3 = call i16 @llvm.bswap.i16(i16 %2)
   ret i16 %3
 }
[83 lines not shown]

llvm/test/CodeGen/RISCV/rv64zbp.ll

[2,752 lines not shown]
 ; RV64I-LABEL: bswap_rotr_i32:
 ; RV64I: # %bb.0:
 ; RV64I-NEXT: slli a1, a0, 8
 ; RV64I-NEXT: lui a2, 4080
 ; RV64I-NEXT: and a1, a1, a2
 ; RV64I-NEXT: slli a2, a0, 24
 ; RV64I-NEXT: or a1, a2, a1
 ; RV64I-NEXT: srliw a2, a0, 24
-; RV64I-NEXT: srliw a0, a0, 8
-; RV64I-NEXT: andi a0, a0, -256
+; RV64I-NEXT: srliw a0, a0, 16
+; RV64I-NEXT: slli a0, a0, 8
 ; RV64I-NEXT: or a0, a0, a2
 ; RV64I-NEXT: slliw a0, a0, 16
 ; RV64I-NEXT: srliw a1, a1, 16
 ; RV64I-NEXT: or a0, a1, a0
 ; RV64I-NEXT: ret
 ;
 ; RV64ZBP-LABEL: bswap_rotr_i32:
 ; RV64ZBP: # %bb.0:
 ; RV64ZBP-NEXT: greviw a0, a0, 8
 ; RV64ZBP-NEXT: ret
   %1 = call i32 @llvm.bswap.i32(i32 %a)
   %2 = call i32 @llvm.fshr.i32(i32 %1, i32 %1, i32 16)
   ret i32 %2
 }

 define i32 @bswap_rotl_i32(i32 %a) {
 ; RV64I-LABEL: bswap_rotl_i32:
 ; RV64I: # %bb.0:
 ; RV64I-NEXT: srliw a1, a0, 24
-; RV64I-NEXT: srliw a2, a0, 8
-; RV64I-NEXT: andi a2, a2, -256
+; RV64I-NEXT: srliw a2, a0, 16
+; RV64I-NEXT: slli a2, a2, 8
 ; RV64I-NEXT: or a1, a2, a1
 ; RV64I-NEXT: slli a2, a0, 8
 ; RV64I-NEXT: lui a3, 4080
 ; RV64I-NEXT: and a2, a2, a3
 ; RV64I-NEXT: slli a0, a0, 24
 ; RV64I-NEXT: or a0, a0, a2
 ; RV64I-NEXT: srliw a0, a0, 16
 ; RV64I-NEXT: slliw a1, a1, 16
[534 lines not shown]

llvm/test/CodeGen/RISCV/selectcc-to-shiftand.ll

[25 lines not shown; context: ; RV64-NEXT: ret]
   ret i32 %retval
 }

 ; Compare if negative and select of constants where one constant is zero and the
 ; other is a single bit.
 define i32 @neg_sel_special_constant(i32 signext %a) {
 ; RV32-LABEL: neg_sel_special_constant:
 ; RV32: # %bb.0:
-; RV32-NEXT: srli a0, a0, 22
-; RV32-NEXT: andi a0, a0, 512
+; RV32-NEXT: srli a0, a0, 31
+; RV32-NEXT: slli a0, a0, 9
 ; RV32-NEXT: ret
 ;
 ; RV64-LABEL: neg_sel_special_constant:
 ; RV64: # %bb.0:
 ; RV64-NEXT: li a1, 1
 ; RV64-NEXT: slli a1, a1, 31
 ; RV64-NEXT: and a0, a0, a1
 ; RV64-NEXT: srli a0, a0, 22
[51 lines not shown]
 ; Compare if positive and select of constants where one constant is zero and the
 ; other is a single bit.
 ; TODO: Why do RV32 and RV64 generate different code? RV64 uses more registers,
 ; but the addi isn't part of the dependency chain of %a so may be faster.
 define i32 @pos_sel_special_constant(i32 signext %a) {
 ; RV32-LABEL: pos_sel_special_constant:
 ; RV32: # %bb.0:
 ; RV32-NEXT: not a0, a0
-; RV32-NEXT: srli a0, a0, 22
-; RV32-NEXT: andi a0, a0, 512
+; RV32-NEXT: srli a0, a0, 31
+; RV32-NEXT: slli a0, a0, 9
 ; RV32-NEXT: ret
 ;
 ; RV64-LABEL: pos_sel_special_constant:
 ; RV64: # %bb.0:
 ; RV64-NEXT: slti a0, a0, 0
 ; RV64-NEXT: xori a0, a0, 1
 ; RV64-NEXT: slli a0, a0, 9
 ; RV64-NEXT: ret
[161 lines not shown]

llvm/test/CodeGen/RISCV/urem-seteq-illegal-types.ll

[529 lines not shown]
 ; RV64M-NEXT: ret
 ;
 ; RV32MV-LABEL: test_urem_vec:
 ; RV32MV: # %bb.0:
 ; RV32MV-NEXT: addi sp, sp, -16
 ; RV32MV-NEXT: lw a1, 0(a0)
 ; RV32MV-NEXT: andi a2, a1, 2047
 ; RV32MV-NEXT: sh a2, 8(sp)
-; RV32MV-NEXT: srli a2, a1, 11
-; RV32MV-NEXT: andi a2, a2, 2047
+; RV32MV-NEXT: slli a2, a1, 10
+; RV32MV-NEXT: srli a2, a2, 21
 ; RV32MV-NEXT: sh a2, 10(sp)
 ; RV32MV-NEXT: lb a2, 4(a0)
 ; RV32MV-NEXT: slli a2, a2, 10
 ; RV32MV-NEXT: srli a1, a1, 22
 ; RV32MV-NEXT: or a1, a1, a2
 ; RV32MV-NEXT: andi a1, a1, 2047
 ; RV32MV-NEXT: sh a1, 12(sp)
 ; RV32MV-NEXT: vsetivli zero, 4, e16, mf2, ta, mu
[53 lines not shown]
 ; RV64MV-NEXT: lbu a1, 4(a0)
 ; RV64MV-NEXT: lwu a2, 0(a0)
 ; RV64MV-NEXT: slli a1, a1, 32
 ; RV64MV-NEXT: or a1, a2, a1
 ; RV64MV-NEXT: srli a2, a1, 22
 ; RV64MV-NEXT: sh a2, 12(sp)
 ; RV64MV-NEXT: andi a2, a1, 2047
 ; RV64MV-NEXT: sh a2, 8(sp)
-; RV64MV-NEXT: srli a1, a1, 11
-; RV64MV-NEXT: andi a1, a1, 2047
+; RV64MV-NEXT: slli a1, a1, 42
+; RV64MV-NEXT: srli a1, a1, 53
 ; RV64MV-NEXT: sh a1, 10(sp)
 ; RV64MV-NEXT: vsetivli zero, 4, e16, mf2, ta, mu
 ; RV64MV-NEXT: addi a1, sp, 8
 ; RV64MV-NEXT: vle16.v v8, (a1)
 ; RV64MV-NEXT: vmv.v.i v9, 10
 ; RV64MV-NEXT: li a1, 9
 ; RV64MV-NEXT: vsetvli zero, zero, e16, mf2, tu, mu
 ; RV64MV-NEXT: vmv.s.x v9, a1
[50 lines not shown]
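
The change in test_urem_vec is the low-bit-mask case: extracting an 11-bit field that starts at bit 11. A small sketch in plain integer arithmetic shows why the new shift pair computes the same value on both RV32 and RV64. The function names are illustrative and only model the instruction sequences in the CHECK lines.

```cpp
#include <cassert>
#include <cstdint>

// Old RV32 sequence: srli a2, a1, 11 ; andi a2, a2, 2047.
uint32_t srliAndi32(uint32_t X) { return (X >> 11) & 2047u; }
// New RV32 sequence: slli a2, a1, 10 ; srli a2, a2, 21.
uint32_t slliSrli32(uint32_t X) { return (X << 10) >> 21; }

// Old RV64 sequence: srli a1, a1, 11 ; andi a1, a1, 2047.
uint64_t srliAndi64(uint64_t X) { return (X >> 11) & UINT64_C(2047); }
// New RV64 sequence: slli a1, a1, 42 ; srli a1, a1, 53.
uint64_t slliSrli64(uint64_t X) { return (X << 42) >> 53; }

// Both forms extract bits 11..21 of the input.
void checkFieldExtract() {
  const uint64_t Samples[] = {UINT64_C(0), UINT64_C(0x800),
                              UINT64_C(0x3FF800), UINT64_C(0xFFFFFFFF),
                              UINT64_C(0xDEADBEEFCAFEF00D), ~UINT64_C(0)};
  for (uint64_t X : Samples) {
    assert(srliAndi64(X) == slliSrli64(X));
    uint32_t Low = static_cast<uint32_t>(X);
    assert(srliAndi32(Low) == slliSrli32(Low));
  }
}
```

The left shift discards everything above bit 21, and the right shift discards everything below bit 11, so no AND is needed.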
