
Details

Reviewers

sdesmalen
kmclaughlin
frasercrmck
peterwaller-arm
spatel

Commits

rGdaa80339dfcb: [CodeGen] Support folds of not(cmp(cc, ...)) -> cmp(!cc, ...) for scalable…

Summary

I have updated TargetLowering::isConstTrueVal to also consider
SPLAT_VECTOR nodes with constant integer operands. This allows the
optimisation to also work for targets that support scalable vectors.
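
As a rough illustration of the fold named in the title (not code from this patch), the sketch below models a few signed integer condition codes and checks that negating a comparison gives the same result as comparing with the inverted condition code. `Pred`, `invert`, `eval` and `foldHolds` are hypothetical names made up for this example; the real fold uses `ISD::CondCode` and `ISD::getSetCCInverse`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a handful of signed integer condition codes.
enum class Pred { EQ, NE, SLT, SLE, SGT, SGE };

// Logical inverse of a predicate: !(a cc b) == (a invert(cc) b).
inline Pred invert(Pred P) {
  switch (P) {
  case Pred::EQ:  return Pred::NE;
  case Pred::NE:  return Pred::EQ;
  case Pred::SLT: return Pred::SGE;
  case Pred::SLE: return Pred::SGT;
  case Pred::SGT: return Pred::SLE;
  case Pred::SGE: return Pred::SLT;
  }
  return P; // not reached for valid enumerators
}

// Evaluate "A P B" on scalars.
inline bool eval(Pred P, int32_t A, int32_t B) {
  switch (P) {
  case Pred::EQ:  return A == B;
  case Pred::NE:  return A != B;
  case Pred::SLT: return A < B;
  case Pred::SLE: return A <= B;
  case Pred::SGT: return A > B;
  case Pred::SGE: return A >= B;
  }
  return false; // not reached for valid enumerators
}

// True if not(cmp(cc, A, B)) == cmp(!cc, A, B) for every modelled predicate.
inline bool foldHolds(int32_t A, int32_t B) {
  const Pred All[] = {Pred::EQ,  Pred::NE,  Pred::SLT,
                      Pred::SLE, Pred::SGT, Pred::SGE};
  for (Pred P : All)
    if ((!eval(P, A, B)) != eval(invert(P), A, B))
      return false;
  return true;
}
```

This matches the shape of the SVE test in this revision, where `not(icmp sle)` is selected as a `cmpgt`.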

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

david-arm created this revision. Jan 13 2022, 4:11 AM

Herald added subscribers: luke957, luismarques, apazos and 19 others. Jan 13 2022, 4:11 AM

david-arm requested review of this revision. Jan 13 2022, 4:11 AM

Herald added a project: Restricted Project. Jan 13 2022, 4:11 AM

Herald added subscribers: llvm-commits, MaskRay.

Harbormaster completed remote builds in B143132: Diff 399631. Jan 13 2022, 4:41 AM

Nice, I'd missed that one. I was going to ask about SPLAT_VECTOR_PARTS but I'm not sure we could get a test for that, seeing as I think it's only generated for 64-bit integer splats on RV32.

llvm/test/CodeGen/RISCV/rvv/cmp-folds.ll
4	I think using `--check-prefixes=CHECK,RV32` is fairly common in our RVV testing and looks like it'd help remove all duplicate checks.

Used better check prefixes for RISCV tests.

david-arm marked an inline comment as done. Jan 13 2022, 5:01 AM

david-arm added inline comments.

llvm/test/CodeGen/RISCV/rvv/cmp-folds.ll
4	Good suggestion, thanks!

Harbormaster completed remote builds in B143135: Diff 399637. Jan 13 2022, 5:36 AM

frasercrmck added inline comments. Jan 13 2022, 8:02 AM

llvm/test/CodeGen/RISCV/rvv/cmp-folds.ll
4	Oh but of course if there's no use of the RV32 or RV64 checks then lit will fail, sorry about that: I always forget. Sounds like you can just use `FileCheck %s`

Removed unnecessary CHECK lines.

Herald added a subscriber: alextsao1999. Jan 17 2022, 3:28 AM

david-arm marked an inline comment as done. Jan 17 2022, 3:29 AM

frasercrmck added inline comments. Jan 17 2022, 3:30 AM

llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
3208	I think we have to support the truncating case as with `BUILD_VECTOR` above, don't we?

david-arm added inline comments. Jan 17 2022, 3:45 AM

llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
3208	Hmm, I looked at the definition of SPLAT_VECTOR and you're right - the operand is allowed to be wider than the vector element type of the result. I hadn't realised that :(. The only problem is that I have absolutely no idea how to write a test case for this using IR! I'm not sure if there is a way to create a splat (using SPLAT_VECTOR) in IR that is truncating. Unless you know of any RISCV examples where this would happen?

Harbormaster completed remote builds in B143751: Diff 400485. Jan 17 2022, 4:13 AM

Add support for truncating splats.

Harbormaster completed remote builds in B143760: Diff 400497. Jan 17 2022, 5:54 AM

frasercrmck added inline comments. Jan 18 2022, 8:07 AM

llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
3208	Hmm, it would definitely happen during legalization as we only have either i32 (RV32) or i64 (RV64) as legal integer types, so most splats, really, like `<vscale x 1 x i8>` with a constant integer. But you'd need to be calling this during/after legalization. So no I'm not sure it's testable, per se. What you've got looks right, though?
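
To make the truncation point concrete, here is a minimal standalone sketch (an illustrative model, not the LLVM implementation) of the check being discussed: the splat operand may be wider than the vector element, so the value is truncated to the element width before it is compared against the "true" value for the target's boolean contents. `BooleanContent`, `truncBits` and `isTrueSplat` are hypothetical names, and a plain `uint64_t` stands in for `APInt`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical mirror of the three boolean-content kinds used in the patch.
enum class BooleanContent { Undefined, ZeroOrOne, ZeroOrNegativeOne };

// Keep only the low EltBits bits of V (EltBits is assumed to be 1..64).
inline uint64_t truncBits(uint64_t V, unsigned EltBits) {
  return EltBits >= 64 ? V : (V & ((uint64_t{1} << EltBits) - 1));
}

// Operand is the (possibly wider) splat operand value; EltBits is the width
// of each element of the resulting vector.
inline bool isTrueSplat(uint64_t Operand, unsigned EltBits, BooleanContent BC) {
  // Truncate first; otherwise a wide operand such as 0xFF feeding an i8
  // element would not be recognised as all-ones.
  uint64_t CVal = truncBits(Operand, EltBits);
  uint64_t AllOnes = truncBits(~uint64_t{0}, EltBits);
  switch (BC) {
  case BooleanContent::Undefined:
    return (CVal & 1) != 0; // only bit 0 is meaningful
  case BooleanContent::ZeroOrOne:
    return CVal == 1;
  case BooleanContent::ZeroOrNegativeOne:
    return CVal == AllOnes;
  }
  return false; // not reached for valid enumerators
}
```

For example, `isTrueSplat(0xFF, 8, BooleanContent::ZeroOrNegativeOne)` accepts a wide operand holding 255 for an 8-bit element, which would be rejected if the untruncated value were compared against all-ones.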

sdesmalen added inline comments. Jan 20 2022, 7:26 AM

llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp

3203–3204

Instead of checking for SPLAT_VECTOR explicitly, how about:

if (auto *CN = dyn_cast<ConstantSDNode>(N)) {
  CVal = CN->getAPIntValue();
} else if (isConstantSplatVector(N, CVal)) {
  unsigned EltWidth = N->getValueType(0).getScalarSizeInBits();
  if (EltWidth < CVal.getBitWidth())
    CVal = CVal.trunc(EltWidth);
} else
  return false;

david-arm added inline comments. Jan 21 2022, 1:42 AM

llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
3203–3204	Sadly this doesn't work because `isConstantSplatVector` demands the element width of the result be the same as the element width in the BuildVectorSDNode, which means we end up with lots of failing tests. It's just a minor point, but `isConstantSplatVector` also permits FP splats, so we'd have to restrict it to integer too.

sdesmalen added inline comments. Jan 27 2022, 9:36 AM

llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
3203–3204	In that case, maybe you can use ConstantSDNode* llvm::isConstOrConstSplat(SDValue N, bool AllowUndefs, bool AllowTruncation) which allows you to specify whether truncation is allowed? It's bonkers how many of these functions exist...
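
A small standalone model of what the `AllowTruncation` flag is taken to mean here, assuming it simply gates whether the splat operand may be wider than the vector element type. `ConstSplat` and `matchConstSplat` are hypothetical names for illustration only; the real helper works on `SDValue` and returns a `ConstantSDNode *`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of a constant splat: the scalar operand's width
// and value, plus the width of each element in the resulting vector.
struct ConstSplat {
  unsigned OperandBits;
  unsigned EltBits;
  uint64_t Value;
};

// Simplified model of the AllowTruncation check. Returns true and writes the
// untruncated operand value to Out when the splat is accepted. The caller is
// still responsible for truncating Out to EltBits afterwards.
inline bool matchConstSplat(const ConstSplat &S, bool AllowTruncation,
                            uint64_t &Out) {
  if (!AllowTruncation && S.OperandBits != S.EltBits)
    return false; // widths differ and truncation was not permitted
  Out = S.Value;
  return true;
}
```

Under this model a 32-bit operand splatted into 8-bit elements is only matched when `AllowTruncation` is true, which is the case the earlier inline comments were concerned about.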

Herald added a subscriber: pcwang-thead. Jan 27 2022, 9:36 AM

Updated isConstTrueVal to use isConstOrConstSplat.

Herald added a subscriber: ecnelises. Jan 28 2022, 4:17 AM

david-arm marked 4 inline comments as done. Jan 28 2022, 4:17 AM

Thanks for the change, I think this is an improvement since we can now reuse some existing code for recognising the splat!

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
909–910	Can you update isConstFalseVal as well to only take an SDValue as part of this patch?

david-arm updated this revision to Diff 403983. Jan 28 2022, 5:33 AM

david-arm marked an inline comment as done.

liaolucy added a subscriber: liaolucy. Jan 28 2022, 5:42 AM

Thanks @david-arm, LGTM!

This revision is now accepted and ready to land. Jan 28 2022, 5:53 AM

Harbormaster completed remote builds in B146258: Diff 403983. Jan 28 2022, 7:32 AM

This revision was landed with ongoing or failed builds. Feb 1 2022, 1:50 AM

Closed by commit rGdaa80339dfcb: [CodeGen] Support folds of not(cmp(cc, ...)) -> cmp(!cc, ...) for scalable… (authored by david-arm).

This revision was automatically updated to reflect the committed changes.

david-arm added a commit: rGdaa80339dfcb: [CodeGen] Support folds of not(cmp(cc, ...)) -> cmp(!cc, ...) for scalable….

Diff 404839

llvm/include/llvm/CodeGen/TargetLowering.h

public:

bool recursivelyDeleteUnusedNodes(SDNode *N);		bool recursivelyDeleteUnusedNodes(SDNode *N);

void CommitTargetLoweringOpt(const TargetLoweringOpt &TLO);		void CommitTargetLoweringOpt(const TargetLoweringOpt &TLO);
};		};

/// Return if the N is a constant or constant vector equal to the true value		/// Return if the N is a constant or constant vector equal to the true value
/// from getBooleanContents().		/// from getBooleanContents().
bool isConstTrueVal(const SDNode *N) const;		bool isConstTrueVal(SDValue N) const;

/// Return if the N is a constant or constant vector equal to the false value		/// Return if the N is a constant or constant vector equal to the false value
/// from getBooleanContents().		/// from getBooleanContents().
bool isConstFalseVal(const SDNode *N) const;		bool isConstFalseVal(SDValue N) const;

/// Return if \p N is a True value when extended to \p VT.		/// Return if \p N is a True value when extended to \p VT.
bool isExtendedTrueVal(const ConstantSDNode *N, EVT VT, bool SExt) const;		bool isExtendedTrueVal(const ConstantSDNode *N, EVT VT, bool SExt) const;

/// Try to simplify a setcc built with the specified operands and cc. If it is		/// Try to simplify a setcc built with the specified operands and cc. If it is
/// unable to simplify it, return a null SDValue.		/// unable to simplify it, return a null SDValue.
SDValue SimplifySetCC(EVT VT, SDValue N0, SDValue N1, ISD::CondCode Cond,		SDValue SimplifySetCC(EVT VT, SDValue N0, SDValue N1, ISD::CondCode Cond,
bool foldBooleans, DAGCombinerInfo &DCI,		bool foldBooleans, DAGCombinerInfo &DCI,

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp

if (MatchStrict &&
(N.getOpcode() == ISD::STRICT_FSETCC ||		(N.getOpcode() == ISD::STRICT_FSETCC ||
N.getOpcode() == ISD::STRICT_FSETCCS)) {		N.getOpcode() == ISD::STRICT_FSETCCS)) {
LHS = N.getOperand(1);		LHS = N.getOperand(1);
RHS = N.getOperand(2);		RHS = N.getOperand(2);
CC = N.getOperand(3);		CC = N.getOperand(3);
return true;		return true;
}		}

if (N.getOpcode() != ISD::SELECT_CC ||		if (N.getOpcode() != ISD::SELECT_CC || !TLI.isConstTrueVal(N.getOperand(2)) ||
!TLI.isConstTrueVal(N.getOperand(2).getNode()) ||		!TLI.isConstFalseVal(N.getOperand(3)))
!TLI.isConstFalseVal(N.getOperand(3).getNode()))
return false;		return false;

if (TLI.getBooleanContents(N.getValueType()) ==		if (TLI.getBooleanContents(N.getValueType()) ==
TargetLowering::UndefinedBooleanContent)		TargetLowering::UndefinedBooleanContent)
return false;		return false;

LHS = N.getOperand(0);		LHS = N.getOperand(0);
RHS = N.getOperand(1);		RHS = N.getOperand(1);

SDValue DAGCombiner::visitXOR(SDNode *N) {

// reassociate xor		// reassociate xor
if (SDValue RXOR = reassociateOps(ISD::XOR, DL, N0, N1, N->getFlags()))		if (SDValue RXOR = reassociateOps(ISD::XOR, DL, N0, N1, N->getFlags()))
return RXOR;		return RXOR;

// fold !(x cc y) -> (x !cc y)		// fold !(x cc y) -> (x !cc y)
unsigned N0Opcode = N0.getOpcode();		unsigned N0Opcode = N0.getOpcode();
SDValue LHS, RHS, CC;		SDValue LHS, RHS, CC;
if (TLI.isConstTrueVal(N1.getNode()) &&		if (TLI.isConstTrueVal(N1) &&
isSetCCEquivalent(N0, LHS, RHS, CC, /*MatchStrict*/true)) {		isSetCCEquivalent(N0, LHS, RHS, CC, /*MatchStrict*/ true)) {
ISD::CondCode NotCC = ISD::getSetCCInverse(cast<CondCodeSDNode>(CC)->get(),		ISD::CondCode NotCC = ISD::getSetCCInverse(cast<CondCodeSDNode>(CC)->get(),
LHS.getValueType());		LHS.getValueType());
if (!LegalOperations ||		if (!LegalOperations ||
TLI.isCondCodeLegal(NotCC, LHS.getSimpleValueType())) {		TLI.isCondCodeLegal(NotCC, LHS.getSimpleValueType())) {
switch (N0Opcode) {		switch (N0Opcode) {
default:		default:
llvm_unreachable("Unhandled SetCC Equivalent!");		llvm_unreachable("Unhandled SetCC Equivalent!");
case ISD::SETCC:		case ISD::SETCC:

llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp

assert((Op.getOpcode() >= ISD::BUILTIN_OP_END ||
"Should use isSplatValue if you don't know whether Op"		"Should use isSplatValue if you don't know whether Op"
" is a target node!");		" is a target node!");
return false;		return false;
}		}

// FIXME: Ideally, this would use ISD::isConstantSplatVector(), but that must		// FIXME: Ideally, this would use ISD::isConstantSplatVector(), but that must
// work with truncating build vectors and vectors with elements of less than		// work with truncating build vectors and vectors with elements of less than
// 8 bits.		// 8 bits.
bool TargetLowering::isConstTrueVal(const SDNode *N) const {		bool TargetLowering::isConstTrueVal(SDValue N) const {
if (!N)		if (!N)
return false;		return false;

		unsigned EltWidth;
APInt CVal;		APInt CVal;
if (auto *CN = dyn_cast<ConstantSDNode>(N)) {		if (ConstantSDNode *CN = isConstOrConstSplat(N, /*AllowUndefs=*/false,
		/*AllowTruncation=*/true)) {
CVal = CN->getAPIntValue();		CVal = CN->getAPIntValue();
} else if (auto *BV = dyn_cast<BuildVectorSDNode>(N)) {		EltWidth = N.getValueType().getScalarSizeInBits();
auto *CN = BV->getConstantSplatNode();		} else
if (!CN)
return false;		return false;

// If this is a truncating build vector, truncate the splat value.		// If this is a truncating splat, truncate the splat value.
// Otherwise, we may fail to match the expected values below.		// Otherwise, we may fail to match the expected values below.
unsigned BVEltWidth = BV->getValueType(0).getScalarSizeInBits();		if (EltWidth < CVal.getBitWidth())
CVal = CN->getAPIntValue();		CVal = CVal.trunc(EltWidth);
if (BVEltWidth < CVal.getBitWidth())
CVal = CVal.trunc(BVEltWidth);
} else {
return false;
}

switch (getBooleanContents(N->getValueType(0))) {		switch (getBooleanContents(N.getValueType())) {
case UndefinedBooleanContent:		case UndefinedBooleanContent:
return CVal[0];		return CVal[0];
case ZeroOrOneBooleanContent:		case ZeroOrOneBooleanContent:
return CVal.isOne();		return CVal.isOne();
case ZeroOrNegativeOneBooleanContent:		case ZeroOrNegativeOneBooleanContent:
return CVal.isAllOnes();		return CVal.isAllOnes();
}		}

llvm_unreachable("Invalid boolean contents");		llvm_unreachable("Invalid boolean contents");
}		}

bool TargetLowering::isConstFalseVal(const SDNode *N) const {		bool TargetLowering::isConstFalseVal(SDValue N) const {
if (!N)		if (!N)
return false;		return false;

const ConstantSDNode *CN = dyn_cast<ConstantSDNode>(N);		const ConstantSDNode *CN = dyn_cast<ConstantSDNode>(N);
if (!CN) {		if (!CN) {
const BuildVectorSDNode *BV = dyn_cast<BuildVectorSDNode>(N);		const BuildVectorSDNode *BV = dyn_cast<BuildVectorSDNode>(N);
if (!BV)		if (!BV)
return false;		return false;

if ((Cond == ISD::SETEQ || Cond == ISD::SETNE) &&
// setcc (sext (setcc x, y, cc)), -1, setne) -> setcc (x, y, inv(cc))		// setcc (sext (setcc x, y, cc)), -1, setne) -> setcc (x, y, inv(cc))
// setcc (sext (setcc x, y, cc)), -1, seteq) -> setcc (x, y, cc)		// setcc (sext (setcc x, y, cc)), -1, seteq) -> setcc (x, y, cc)
SDValue TopSetCC = N0->getOperand(0);		SDValue TopSetCC = N0->getOperand(0);
unsigned N0Opc = N0->getOpcode();		unsigned N0Opc = N0->getOpcode();
bool SExt = (N0Opc == ISD::SIGN_EXTEND);		bool SExt = (N0Opc == ISD::SIGN_EXTEND);
if (TopSetCC.getValueType() == MVT::i1 && VT == MVT::i1 &&		if (TopSetCC.getValueType() == MVT::i1 && VT == MVT::i1 &&
TopSetCC.getOpcode() == ISD::SETCC &&		TopSetCC.getOpcode() == ISD::SETCC &&
(N0Opc == ISD::ZERO_EXTEND || N0Opc == ISD::SIGN_EXTEND) &&		(N0Opc == ISD::ZERO_EXTEND || N0Opc == ISD::SIGN_EXTEND) &&
(isConstFalseVal(N1C) ||		(isConstFalseVal(N1) ||
isExtendedTrueVal(N1C, N0->getValueType(0), SExt))) {		isExtendedTrueVal(N1C, N0->getValueType(0), SExt))) {

bool Inverse = (N1C->isZero() && Cond == ISD::SETEQ) ||		bool Inverse = (N1C->isZero() && Cond == ISD::SETEQ) ||
(!N1C->isZero() && Cond == ISD::SETNE);		(!N1C->isZero() && Cond == ISD::SETNE);

if (!Inverse)		if (!Inverse)
return TopSetCC;		return TopSetCC;


llvm/lib/Target/ARM/ARMISelLowering.cpp

if (SDValue Result = PerformSHLSimplify(N, DCI, Subtarget))
return Result;		return Result;
}		}

if (Subtarget->hasMVEIntegerOps()) {		if (Subtarget->hasMVEIntegerOps()) {
// fold (xor(vcmp/z, 1)) into a vcmp with the opposite condition.		// fold (xor(vcmp/z, 1)) into a vcmp with the opposite condition.
SDValue N0 = N->getOperand(0);		SDValue N0 = N->getOperand(0);
SDValue N1 = N->getOperand(1);		SDValue N1 = N->getOperand(1);
const TargetLowering *TLI = Subtarget->getTargetLowering();		const TargetLowering *TLI = Subtarget->getTargetLowering();
if (TLI->isConstTrueVal(N1.getNode()) &&		if (TLI->isConstTrueVal(N1) &&
(N0->getOpcode() == ARMISD::VCMP || N0->getOpcode() == ARMISD::VCMPZ)) {		(N0->getOpcode() == ARMISD::VCMP || N0->getOpcode() == ARMISD::VCMPZ)) {
if (CanInvertMVEVCMP(N0)) {		if (CanInvertMVEVCMP(N0)) {
SDLoc DL(N0);		SDLoc DL(N0);
ARMCC::CondCodes CC = ARMCC::getOppositeCondition(getVCMPCondCode(N0));		ARMCC::CondCodes CC = ARMCC::getOppositeCondition(getVCMPCondCode(N0));

SmallVector<SDValue, 4> Ops;		SmallVector<SDValue, 4> Ops;
Ops.push_back(N0->getOperand(0));		Ops.push_back(N0->getOperand(0));
if (N0->getOpcode() == ARMISD::VCMP)		if (N0->getOpcode() == ARMISD::VCMP)

llvm/test/CodeGen/AArch64/sve-cmp-folds.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
				; RUN: llc -mtriple=aarch64-linux-unknown -mattr=+sve -o - < %s | FileCheck %s

				define <vscale x 8 x i1> @not_icmp_sle_nxv8i16(<vscale x 8 x i16> %a, <vscale x 8 x i16> %b) {
				; CHECK-LABEL: not_icmp_sle_nxv8i16:
				; CHECK: // %bb.0:
				; CHECK-NEXT: ptrue p0.h
				; CHECK-NEXT: cmpgt p0.h, p0/z, z0.h, z1.h
				; CHECK-NEXT: ret
				%icmp = icmp sle <vscale x 8 x i16> %a, %b
				%tmp = insertelement <vscale x 8 x i1> undef, i1 true, i32 0
				%ones = shufflevector <vscale x 8 x i1> %tmp, <vscale x 8 x i1> undef, <vscale x 8 x i32> zeroinitializer
				%not = xor <vscale x 8 x i1> %ones, %icmp
				ret <vscale x 8 x i1> %not
				}

				define <vscale x 4 x i1> @not_icmp_sgt_nxv4i32(<vscale x 4 x i32> %a, <vscale x 4 x i32> %b) {
				; CHECK-LABEL: not_icmp_sgt_nxv4i32:
				; CHECK: // %bb.0:
				; CHECK-NEXT: ptrue p0.s
				; CHECK-NEXT: cmpge p0.s, p0/z, z1.s, z0.s
				; CHECK-NEXT: ret
				%icmp = icmp sgt <vscale x 4 x i32> %a, %b
				%tmp = insertelement <vscale x 4 x i1> undef, i1 true, i32 0
				%ones = shufflevector <vscale x 4 x i1> %tmp, <vscale x 4 x i1> undef, <vscale x 4 x i32> zeroinitializer
				%not = xor <vscale x 4 x i1> %icmp, %ones
				ret <vscale x 4 x i1> %not
				}

				define <vscale x 2 x i1> @not_fcmp_une_nxv2f64(<vscale x 2 x double> %a, <vscale x 2 x double> %b) {
				; CHECK-LABEL: not_fcmp_une_nxv2f64:
				; CHECK: // %bb.0:
				; CHECK-NEXT: ptrue p0.d
				; CHECK-NEXT: fcmeq p0.d, p0/z, z0.d, z1.d
				; CHECK-NEXT: ret
				%icmp = fcmp une <vscale x 2 x double> %a, %b
				%tmp = insertelement <vscale x 2 x i1> undef, i1 true, i32 0
				%ones = shufflevector <vscale x 2 x i1> %tmp, <vscale x 2 x i1> undef, <vscale x 2 x i32> zeroinitializer
				%not = xor <vscale x 2 x i1> %icmp, %ones
				ret <vscale x 2 x i1> %not
				}

				define <vscale x 4 x i1> @not_fcmp_uge_nxv4f32(<vscale x 4 x float> %a, <vscale x 4 x float> %b) {
				; CHECK-LABEL: not_fcmp_uge_nxv4f32:
				; CHECK: // %bb.0:
				; CHECK-NEXT: ptrue p0.s
				; CHECK-NEXT: fcmgt p0.s, p0/z, z1.s, z0.s
				; CHECK-NEXT: ret
				%icmp = fcmp uge <vscale x 4 x float> %a, %b
				%tmp = insertelement <vscale x 4 x i1> undef, i1 true, i32 0
				%ones = shufflevector <vscale x 4 x i1> %tmp, <vscale x 4 x i1> undef, <vscale x 4 x i32> zeroinitializer
				%not = xor <vscale x 4 x i1> %icmp, %ones
				ret <vscale x 4 x i1> %not
				}
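
The two floating-point tests above depend on the inverse of an unordered predicate being an ordered one: `not(fcmp une)` is selected as `fcmeq`, and `not(fcmp uge a, b)` as `fcmgt` with the operands swapped, i.e. an ordered `a < b`. The scalar sketch below checks those two identities, including the NaN cases. The function names are hypothetical and only model IR predicate semantics under default IEEE comparison rules (no fast-math).

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Unordered predicates are true when either operand is NaN.
inline bool fcmpUNE(double A, double B) {
  return std::isnan(A) || std::isnan(B) || A != B;
}
inline bool fcmpUGE(double A, double B) {
  return std::isnan(A) || std::isnan(B) || A >= B;
}

// Ordered predicates are false when either operand is NaN; the built-in
// == and < operators already behave that way for IEEE doubles.
inline bool fcmpOEQ(double A, double B) { return A == B; }
inline bool fcmpOLT(double A, double B) { return A < B; }

// True if not(une) == oeq and not(uge) == olt for this pair of inputs.
inline bool inversesAgree(double A, double B) {
  return (!fcmpUNE(A, B)) == fcmpOEQ(A, B) &&
         (!fcmpUGE(A, B)) == fcmpOLT(A, B);
}
```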

llvm/test/CodeGen/RISCV/rvv/cmp-folds.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
				; RUN: llc -mtriple=riscv32 -mattr=+m,+d,+zfh,+v -verify-machineinstrs < %s | FileCheck %s
				; RUN: llc -mtriple=riscv64 -mattr=+m,+d,+zfh,+v -verify-machineinstrs < %s | FileCheck %s

				define <vscale x 8 x i1> @not_icmp_sle_nxv8i16(<vscale x 8 x i16> %a, <vscale x 8 x i16> %b) {
				; CHECK-LABEL: not_icmp_sle_nxv8i16:
				; CHECK: # %bb.0:
				; CHECK-NEXT: vsetvli a0, zero, e16, m2, ta, mu
				; CHECK-NEXT: vmslt.vv v0, v10, v8
				; CHECK-NEXT: ret
				%icmp = icmp sle <vscale x 8 x i16> %a, %b
				%tmp = insertelement <vscale x 8 x i1> undef, i1 true, i32 0
				%ones = shufflevector <vscale x 8 x i1> %tmp, <vscale x 8 x i1> undef, <vscale x 8 x i32> zeroinitializer
				%not = xor <vscale x 8 x i1> %ones, %icmp
				ret <vscale x 8 x i1> %not
				}

				define <vscale x 4 x i1> @not_icmp_sgt_nxv4i32(<vscale x 4 x i32> %a, <vscale x 4 x i32> %b) {
				; CHECK-LABEL: not_icmp_sgt_nxv4i32:
				; CHECK: # %bb.0:
				; CHECK-NEXT: vsetvli a0, zero, e32, m2, ta, mu
				; CHECK-NEXT: vmsle.vv v0, v8, v10
				; CHECK-NEXT: ret
				%icmp = icmp sgt <vscale x 4 x i32> %a, %b
				%tmp = insertelement <vscale x 4 x i1> undef, i1 true, i32 0
				%ones = shufflevector <vscale x 4 x i1> %tmp, <vscale x 4 x i1> undef, <vscale x 4 x i32> zeroinitializer
				%not = xor <vscale x 4 x i1> %icmp, %ones
				ret <vscale x 4 x i1> %not
				}

				define <vscale x 2 x i1> @not_fcmp_une_nxv2f64(<vscale x 2 x double> %a, <vscale x 2 x double> %b) {
				; CHECK-LABEL: not_fcmp_une_nxv2f64:
				; CHECK: # %bb.0:
				; CHECK-NEXT: vsetvli a0, zero, e64, m2, ta, mu
				; CHECK-NEXT: vmfeq.vv v0, v8, v10
				; CHECK-NEXT: ret
				%icmp = fcmp une <vscale x 2 x double> %a, %b
				%tmp = insertelement <vscale x 2 x i1> undef, i1 true, i32 0
				%ones = shufflevector <vscale x 2 x i1> %tmp, <vscale x 2 x i1> undef, <vscale x 2 x i32> zeroinitializer
				%not = xor <vscale x 2 x i1> %icmp, %ones
				ret <vscale x 2 x i1> %not
				}

				define <vscale x 4 x i1> @not_fcmp_uge_nxv4f32(<vscale x 4 x float> %a, <vscale x 4 x float> %b) {
				; CHECK-LABEL: not_fcmp_uge_nxv4f32:
				; CHECK: # %bb.0:
				; CHECK-NEXT: vsetvli a0, zero, e32, m2, ta, mu
				; CHECK-NEXT: vmflt.vv v0, v8, v10
				; CHECK-NEXT: ret
				%icmp = fcmp uge <vscale x 4 x float> %a, %b
				%tmp = insertelement <vscale x 4 x i1> undef, i1 true, i32 0
				%ones = shufflevector <vscale x 4 x i1> %tmp, <vscale x 4 x i1> undef, <vscale x 4 x i32> zeroinitializer
				%not = xor <vscale x 4 x i1> %icmp, %ones
				ret <vscale x 4 x i1> %not
				}

This is an archive of the discontinued LLVM Phabricator instance.

[CodeGen] Support folds of not(cmp(cc, ...)) -> cmp(!cc, ...) for scalable vectors
ClosedPublic
