This is an archive of the discontinued LLVM Phabricator instance.

Differential D129980

[RISCV] Optimize (seteq (i64 (and X, 0xffffffff)), C1)
Closed, Public

Authored by craig.topper on Jul 17 2022, 8:20 PM.

Details

Reviewers

asb
luismarques
reames

Commits

rG0b0275289961: [RISCV] Optimize (seteq (i64 (and X, 0xffffffff)), C1)

Summary

(and X, 0xffffffff) requires 2 shifts in the base ISA. Since we
know the result is being used by a compare, we can use a sext_inreg
instead of an AND if we also modify C1 to have 33 sign bits instead
of 32 leading zeros. This can also improve the generated code for
materializing C1.
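The equivalence the summary relies on can be checked with a small standalone model. This is only an illustrative sketch, not code from the patch: `signExtend32`, `compareZeroExtended` and `compareSignExtended` are hypothetical names, and plain 64-bit integers stand in for the DAG nodes.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: sign extend the low 32 bits of V to 64 bits,
// modelling what sext.w (sext_inreg ..., i32) computes.
constexpr int64_t signExtend32(uint64_t V) {
  return static_cast<int64_t>((V & 0xffffffffULL) ^ 0x80000000ULL) -
         0x80000000LL;
}

// Original form: (seteq (and X, 0xffffffff), C1).
constexpr bool compareZeroExtended(uint64_t X, uint64_t C1) {
  return (X & 0xffffffffULL) == C1;
}

// Rewritten form: (seteq (sext_inreg X, i32), C1'), where C1' is C1 sign
// extended from bit 31. If C1 does not fit in 32 bits the original compare
// can never be true, so that case is answered directly.
constexpr bool compareSignExtended(uint64_t X, uint64_t C1) {
  if (C1 > 0xffffffffULL)
    return false;
  return signExtend32(X) == signExtend32(C1);
}

// 2309737967 is 0x89abcdef; sign extended from bit 31 it becomes negative.
static_assert(signExtend32(2309737967ULL) == -1985229329LL, "sext of C1");
static_assert(compareZeroExtended(0xdeadbeef89abcdefULL, 2309737967ULL) ==
                  compareSignExtended(0xdeadbeef89abcdefULL, 2309737967ULL),
              "both forms agree on a match");
static_assert(compareZeroExtended(0x124ULL, 0x123ULL) ==
                  compareSignExtended(0x124ULL, 0x123ULL),
              "both forms agree on a mismatch");
```

Both forms compare only the low 32 bits of X, which is why swapping the zero extension for a sign extension is safe once the constant is adjusted the same way.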

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

craig.topper created this revision. Jul 17 2022, 8:20 PM

Herald added a project: Restricted Project. Jul 17 2022, 8:20 PM

Herald added subscribers: sunshaoce, VincentWu, luke957 and 28 others.

craig.topper requested review of this revision. Jul 17 2022, 8:20 PM

Herald added a project: Restricted Project. Jul 17 2022, 8:20 PM

Herald added subscribers: pcwang-thead, eopXD, MaskRay.

craig.topper edited the summary of this revision. Jul 17 2022, 8:21 PM

Harbormaster completed remote builds in B175943: Diff 445380. Jul 17 2022, 8:21 PM

craig.topper added inline comments. Jul 17 2022, 8:22 PM

llvm/test/CodeGen/RISCV/i64-icmp.ll
693	I think we can use addiw here to avoid the sext.w?
708	I think we can use subw instead of xor to avoid the sext.w
733	I think we can use addiw to avoid the sext.w
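The inline comments above suggest that the explicit sext.w could be folded into a W-form instruction. The following is a rough model of that idea, assuming the usual RV64 semantics of addiw and subw (operate on the low 32 bits, then sign extend the result). The function names are hypothetical and nothing here is taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Model of sext.w: sign extend the low 32 bits, kept as a 64-bit pattern.
constexpr uint64_t sextW(uint64_t V) {
  return ((V & 0xffffffffULL) ^ 0x80000000ULL) - 0x80000000ULL;
}

// Models of addi, addiw and subw on 64-bit register values.
constexpr uint64_t addi(uint64_t Rs1, int64_t Imm) {
  return Rs1 + static_cast<uint64_t>(Imm);
}
constexpr uint64_t addiw(uint64_t Rs1, int64_t Imm) {
  return sextW(Rs1 + static_cast<uint64_t>(Imm));
}
constexpr uint64_t subw(uint64_t Rs1, uint64_t Rs2) {
  return sextW(Rs1 - Rs2);
}

// Current output: sext.w + addi + seqz. Suggested output: addiw + seqz.
constexpr bool isZeroSextAddi(uint64_t X, int64_t Imm) {
  return addi(sextW(X), Imm) == 0;
}
constexpr bool isZeroAddiw(uint64_t X, int64_t Imm) {
  return addiw(X, Imm) == 0;
}

// Current output: sext.w + xor + seqz. Suggested output: subw + seqz.
// C is assumed to already be sign extended from bit 31.
constexpr bool isZeroSextXor(uint64_t X, uint64_t C) {
  return (sextW(X) ^ C) == 0;
}
constexpr bool isZeroSubw(uint64_t X, uint64_t C) { return subw(X, C) == 0; }

static_assert(isZeroSextAddi(0xdeadbeef0000007bULL, -123) &&
                  isZeroAddiw(0xdeadbeef0000007bULL, -123),
              "low 32 bits equal 123");
static_assert(isZeroSextXor(0x1234567889abcdefULL, 0xffffffff89abcdefULL) &&
                  isZeroSubw(0x1234567889abcdefULL, 0xffffffff89abcdefULL),
              "low 32 bits equal 0x89abcdef");
```

In this model the zero test gives the same answer either way, because both sequences depend only on the low 32 bits of the input. Whether the backend can actually select these forms is a separate question that the comments leave open.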

craig.topper added a parent revision: D129981: [RISCV] Pre-commit tests for D129980. NFC. Jul 17 2022, 8:23 PM

LGTM

Correct me if I'm wrong, but this should improve the idiomatic icmp eq i32 patterns for ELEN=64 right? If so, have you thought about analogous cases for 16 and 8 bits w/Zb*? On the surface, using a sext.h/b over a zext.h/b wouldn't seem to matter, but the constant simplifications are potentially interesting. Should we be canonicalizing towards sext anyways?

This revision is now accepted and ready to land. Jul 18 2022, 8:59 AM

In D129980#3660043, @reames wrote:

LGTM

Correct me if I'm wrong, but this should improve the idiomatic icmp eq i32 patterns for ELEN=64 right? If so, have you thought about analogous cases for 16 and 8 bits w/Zb*? On the surface, using a sext.h/b over a zext.h/b wouldn't seem to matter, but the constant simplifications are potentially interesting. Should we be canonicalizing towards sext anyways?

I don't think I understand this question. ELEN=64 is a vector term, but we're talking about scalar instructions here. Unless you meant XLEN=64?

In D129980#3660107, @craig.topper wrote:

I don't think I understand this question. ELEN=64 is a vector term, but we're talking about scalar instructions here. Unless you meant XLEN=64?

Oops, yes.

In D129980#3660144, @reames wrote:

Oops, yes.

SelectionDAG type legalization is biased to prefer sign extend for all i32 compares via the isSExtCheaperThanZExt check in DAGTypeLegalizer::PromoteSetCCOperands. The cases I've seen this patch affect seem to come from the middle end with i64 type.

There is a generic fold to convert sext_in_reg to AND for equality compares (the opposite of this patch) in TargetLowering::SimplifySetCC. It is disabled for RISC-V by isSExtCheaperThanZExt.

I think you're right that there might be some interesting constant simplifications. I'm not sure it's restricted to Zb either. i16 constants in the range [63488, 65535] could be sign extended to enable the use of li instead of lui+addi. All i8 constants for this should fit in li already, but I think sign extending could enable c.li. Not sure if it makes sense to always use sext for i8 and i16 type legalization via isSExtCheaperThanZExt.
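The ranges mentioned in this comment can be reproduced with a short sketch. It assumes that a single li is an addi with a 12-bit signed immediate and that c.li takes a 6-bit signed immediate. `signExtend` and `fitsSImm` are hypothetical helper names used only for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Sign extend the low Bits bits of V (Bits must be in [1, 63]).
constexpr int64_t signExtend(uint64_t V, unsigned Bits) {
  return static_cast<int64_t>((V & ((1ULL << Bits) - 1)) ^
                              (1ULL << (Bits - 1))) -
         static_cast<int64_t>(1ULL << (Bits - 1));
}

// True if V is representable as a Bits-wide signed immediate.
constexpr bool fitsSImm(int64_t V, unsigned Bits) {
  return V >= -(1LL << (Bits - 1)) && V < (1LL << (Bits - 1));
}

// Zero-extended i16 constants in [63488, 65535] need more than 12 bits, but
// sign extended from bit 15 they land in [-2048, -1].
static_assert(!fitsSImm(63488, 12), "zero-extended form is too wide for li");
static_assert(signExtend(63488, 16) == -2048, "lower end of the range");
static_assert(signExtend(65535, 16) == -1, "upper end of the range");
static_assert(fitsSImm(signExtend(63488, 16), 12), "fits a single li");
static_assert(!fitsSImm(signExtend(63487, 16), 12), "just below the range");

// Zero-extended i8 constants always fit 12 bits, but only the sign-extended
// form of [224, 255] fits the 6-bit immediate assumed here for c.li.
static_assert(fitsSImm(255, 12) && !fitsSImm(255, 6), "li but not c.li");
static_assert(fitsSImm(signExtend(224, 8), 6), "c.li after sign extension");
```

The bound 63488 is 0xF800, the smallest 16-bit value whose sign-extended form still reaches -2048.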

This revision was landed with ongoing or failed builds. Jul 18 2022, 10:55 AM

Closed by commit rG0b0275289961: [RISCV] Optimize (seteq (i64 (and X, 0xffffffff)), C1) (authored by craig.topper).

This revision was automatically updated to reflect the committed changes.

craig.topper mentioned this in rG464b3a9d8a1a: [RISCV] Pre-commit tests for D129980. NFC.

craig.topper added a commit: rG0b0275289961: [RISCV] Optimize (seteq (i64 (and X, 0xffffffff)), C1).

craig.topper mentioned this in D131113: [RISCV] Prevent infinite loop after D129980. Aug 3 2022, 1:32 PM

craig.topper mentioned this in rG53d560b22f5b: [RISCV] Prevent infinite loop after D129980. Aug 3 2022, 3:21 PM

Revision Contents

Path                                          Size
llvm/lib/Target/RISCV/RISCVISelLowering.cpp   48 lines
llvm/test/CodeGen/RISCV/i64-icmp.ll           23 lines

Diff 445568

llvm/lib/Target/RISCV/RISCVISelLowering.cpp

@@ ... @@ RISCVTargetLowering::RISCVTargetLowering(const TargetMachine &TM,
   setPrefFunctionAlignment(FunctionAlignment);

   setMinimumJumpTableEntries(5);

   // Jumps are expensive, compared to logic
   setJumpIsExpensive();

   setTargetDAGCombine({ISD::INTRINSIC_WO_CHAIN, ISD::ADD, ISD::SUB, ISD::AND,
-                       ISD::OR, ISD::XOR});
+                       ISD::OR, ISD::XOR, ISD::SETCC});
   if (Subtarget.is64Bit())
     setTargetDAGCombine(ISD::SRA);

   if (Subtarget.hasStdExtF())
     setTargetDAGCombine({ISD::FADD, ISD::FMAXNUM, ISD::FMINNUM});

   if (Subtarget.hasStdExtZbp())
     setTargetDAGCombine({ISD::ROTL, ISD::ROTR});
@@ ... @@ static SDValue performXORCombine(SDNode *N, SelectionDAG &DAG) {

   if (SDValue V = combineBinOpToReduce(N, DAG))
     return V;
   // fold (xor (select cond, 0, y), x) ->
   //      (select cond, x, (xor x, y))
   return combineSelectAndUseCommutative(N, DAG, /*AllOnes*/ false);
 }

+// Replace (seteq (i64 (and X, 0xffffffff)), C1) with
+// (seteq (i64 (sext_inreg (X, i32)), C1')) where C1' is C1 sign extended from
+// bit 31. Same for setne. C1' may be cheaper to materialize and the sext_inreg
+// can become a sext.w instead of a shift pair.
+static SDValue performSETCCCombine(SDNode *N, SelectionDAG &DAG,
+                                   const RISCVSubtarget &Subtarget) {
+  SDValue N0 = N->getOperand(0);
+  SDValue N1 = N->getOperand(1);
+  EVT VT = N->getValueType(0);
+  EVT OpVT = N0.getValueType();
+
+  if (OpVT != MVT::i64 || !Subtarget.is64Bit())
+    return SDValue();
+
+  // RHS needs to be a constant.
+  auto *N1C = dyn_cast<ConstantSDNode>(N1);
+  if (!N1C)
+    return SDValue();
+
+  // LHS needs to be (and X, 0xffffffff).
+  if (N0.getOpcode() != ISD::AND || !N0.hasOneUse() ||
+      !isa<ConstantSDNode>(N0.getOperand(1)) ||
+      N0.getConstantOperandVal(1) != UINT64_C(0xffffffff))
+    return SDValue();
+
+  // Looking for an equality compare.
+  ISD::CondCode Cond = cast<CondCodeSDNode>(N->getOperand(2))->get();
+  if (!isIntEqualitySetCC(Cond))
+    return SDValue();
+
+  const APInt &C1 = cast<ConstantSDNode>(N1)->getAPIntValue();
+
+  SDLoc dl(N);
+  // If the constant is larger than 2^32 - 1 it is impossible for both sides
+  // to be equal.
+  if (C1.getActiveBits() > 32)
+    return DAG.getBoolConstant(Cond == ISD::SETNE, dl, VT, OpVT);
+
+  SDValue SExtOp = DAG.getNode(ISD::SIGN_EXTEND_INREG, N, OpVT,
+                               N0.getOperand(0), DAG.getValueType(MVT::i32));
+  return DAG.getSetCC(dl, VT, SExtOp, DAG.getConstant(C1.trunc(32).sext(64),
+                                                      dl, OpVT), Cond);
+}
+
 static SDValue
 performSIGN_EXTEND_INREGCombine(SDNode *N, SelectionDAG &DAG,
                                 const RISCVSubtarget &Subtarget) {
   SDValue Src = N->getOperand(0);
   EVT VT = N->getValueType(0);

   // Fold (sext_inreg (fmv_x_anyexth X), i16) -> (fmv_x_signexth X)
   if (Src.getOpcode() == RISCVISD::FMV_X_ANYEXTH &&
@@ ... @@ SDValue RISCVTargetLowering::PerformDAGCombine(SDNode *N,
   case ISD::FADD:
   case ISD::UMAX:
   case ISD::UMIN:
   case ISD::SMAX:
   case ISD::SMIN:
   case ISD::FMAXNUM:
   case ISD::FMINNUM:
     return combineBinOpToReduce(N, DAG);
+  case ISD::SETCC:
+    return performSETCCCombine(N, DAG, Subtarget);
   case ISD::SIGN_EXTEND_INREG:
     return performSIGN_EXTEND_INREGCombine(N, DAG, Subtarget);
   case ISD::ZERO_EXTEND:
     // Fold (zero_extend (fp_to_uint X)) to prevent forming fcvt+zexti32 during
     // type legalization. This is safe because fp_to_uint produces poison if
     // it overflows.
     if (N->getValueType(0) == MVT::i64 && Subtarget.is64Bit()) {
       SDValue Src = N->getOperand(0);

llvm/test/CodeGen/RISCV/i64-icmp.ll

@@ ... @@ ; RV64I-NEXT: ret
   %1 = icmp sle i64 %a, -2050
   %2 = zext i1 %1 to i64
   ret i64 %2
 }

 define i64 @icmp_eq_zext_inreg_small_constant(i64 %a) nounwind {
 ; RV64I-LABEL: icmp_eq_zext_inreg_small_constant:
 ; RV64I: # %bb.0:
-; RV64I-NEXT: slli a0, a0, 32
-; RV64I-NEXT: srli a0, a0, 32
+; RV64I-NEXT: sext.w a0, a0
 ; RV64I-NEXT: addi a0, a0, -123
 ; RV64I-NEXT: seqz a0, a0
 ; RV64I-NEXT: ret
   %1 = and i64 %a, 4294967295
   %2 = icmp eq i64 %1, 123
   %3 = zext i1 %2 to i64
   ret i64 %3
 }

 define i64 @icmp_eq_zext_inreg_large_constant(i64 %a) nounwind {
 ; RV64I-LABEL: icmp_eq_zext_inreg_large_constant:
 ; RV64I: # %bb.0:
-; RV64I-NEXT: slli a0, a0, 32
-; RV64I-NEXT: srli a0, a0, 32
-; RV64I-NEXT: lui a1, 138
-; RV64I-NEXT: addiw a1, a1, -1347
-; RV64I-NEXT: slli a1, a1, 12
-; RV64I-NEXT: addi a1, a1, -529
+; RV64I-NEXT: sext.w a0, a0
+; RV64I-NEXT: lui a1, 563901
+; RV64I-NEXT: addiw a1, a1, -529
 ; RV64I-NEXT: xor a0, a0, a1
 ; RV64I-NEXT: seqz a0, a0
 ; RV64I-NEXT: ret
   %1 = and i64 %a, 4294967295
   %2 = icmp eq i64 %1, 2309737967
   %3 = zext i1 %2 to i64
   ret i64 %3
 }

 define i64 @icmp_ne_zext_inreg_small_constant(i64 %a) nounwind {
 ; RV64I-LABEL: icmp_ne_zext_inreg_small_constant:
 ; RV64I: # %bb.0:
-; RV64I-NEXT: slli a0, a0, 32
-; RV64I-NEXT: srli a0, a0, 32
+; RV64I-NEXT: sext.w a0, a0
 ; RV64I-NEXT: snez a0, a0
 ; RV64I-NEXT: ret
   %1 = and i64 %a, 4294967295
   %2 = icmp ne i64 %1, 0
   %3 = zext i1 %2 to i64
   ret i64 %3
 }

 define i64 @icmp_ne_zext_inreg_large_constant(i64 %a) nounwind {
 ; RV64I-LABEL: icmp_ne_zext_inreg_large_constant:
 ; RV64I: # %bb.0:
-; RV64I-NEXT: slli a0, a0, 32
-; RV64I-NEXT: srli a0, a0, 32
-; RV64I-NEXT: li a1, 1
-; RV64I-NEXT: slli a1, a1, 32
-; RV64I-NEXT: addi a1, a1, -2
-; RV64I-NEXT: xor a0, a0, a1
+; RV64I-NEXT: sext.w a0, a0
+; RV64I-NEXT: addi a0, a0, 2
 ; RV64I-NEXT: snez a0, a0
 ; RV64I-NEXT: ret
   %1 = and i64 %a, 4294967295
   %2 = icmp ne i64 %1, 4294967294
   %3 = zext i1 %2 to i64
   ret i64 %3
 }
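The shorter constant sequences in the updated checks follow from the sign-extended constant fitting a lui+addiw pair. The sketch below reproduces the immediates seen in the large-constant tests under the standard hi20/lo12 split. `splitForLuiAddiw` and `materialize` are hypothetical names for this illustration and are not the backend's actual materialization code.

```cpp
#include <cassert>
#include <cstdint>

struct LuiAddiw {
  uint32_t Hi20; // immediate for lui
  int32_t Lo12;  // signed immediate for addiw
};

// Split a 32-bit constant into lui/addiw immediates. The low part is a signed
// 12-bit value, so the high part is adjusted to compensate when it is negative.
constexpr LuiAddiw splitForLuiAddiw(uint32_t C) {
  int32_t Lo12 = static_cast<int32_t>(C & 0xfff);
  if (Lo12 >= 2048)
    Lo12 -= 4096;
  uint32_t Hi20 = ((C - static_cast<uint32_t>(Lo12)) >> 12) & 0xfffff;
  return {Hi20, Lo12};
}

// Model of "lui rd, Hi20; addiw rd, rd, Lo12": the 32-bit sum, sign extended.
constexpr int64_t materialize(LuiAddiw Parts) {
  uint64_t Sum = (static_cast<uint64_t>(Parts.Hi20) << 12) +
                 static_cast<uint64_t>(static_cast<int64_t>(Parts.Lo12));
  return static_cast<int64_t>((Sum & 0xffffffffULL) ^ 0x80000000ULL) -
         0x80000000LL;
}

// icmp eq ..., 2309737967 (0x89abcdef): matches lui 563901 / addiw -529.
static_assert(splitForLuiAddiw(2309737967u).Hi20 == 563901, "lui immediate");
static_assert(splitForLuiAddiw(2309737967u).Lo12 == -529, "addiw immediate");
static_assert(materialize(splitForLuiAddiw(2309737967u)) == -1985229329LL,
              "result is the constant sign extended from bit 31");

// icmp ne ..., 4294967294 (0xfffffffe): sign extended this is just -2, which
// needs no lui at all.
static_assert(splitForLuiAddiw(4294967294u).Hi20 == 0, "no upper part");
static_assert(splitForLuiAddiw(4294967294u).Lo12 == -2, "fits 12 bits");
```

With a zero-extended constant, the same values need extra shift and add steps, as the removed check lines show.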