This is an archive of the discontinued LLVM Phabricator instance.

Differential D109148

[DAGCombiner][VP] Fold zero-length or false-masked VP ops
Closed, Public

Authored by frasercrmck on Sep 2 2021, 5:06 AM.

Details

Reviewers

craig.topper
RKSimon
spatel
simoll

Commits

rGe2b46e336bad: [DAGCombiner][VP] Fold zero-length or false-masked VP ops

Summary

This patch adds a generic DAGCombine for vector-predicated (VP) nodes.
Those for which we can determine that no vector element is active can be
replaced by either undef or, for reductions, the start value.
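
As a rough illustration of the rule described above, the following self-contained C++ sketch models it outside of LLVM. `ToyVPOp`, `FoldResult`, `allLanesDisabled` and `foldToyVPOp` are hypothetical names invented for this example, not SelectionDAG APIs, and the sketch assumes a mask or EVL only counts when it is a known constant.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for a VP node; this is not an LLVM type.
enum class VPKind { BinaryOp, Reduction };

struct ToyVPOp {
  VPKind Kind;
  // Set only when the mask is a compile-time constant.
  std::optional<std::vector<bool>> ConstMask;
  // Set only when the explicit vector length is a compile-time constant.
  std::optional<uint32_t> ConstEVL;
  // Start value; only meaningful for reductions.
  int Start = 0;
};

enum class FoldResult { None, Undef, StartValue };

// True if we can prove that no lane is active: EVL is the constant 0, or
// the mask is a constant with every bit false.
bool allLanesDisabled(const ToyVPOp &Op) {
  if (Op.ConstEVL && *Op.ConstEVL == 0)
    return true;
  if (Op.ConstMask) {
    for (bool LaneActive : *Op.ConstMask)
      if (LaneActive)
        return false;
    return true;
  }
  return false;
}

// Reductions fold to their start value; other ops fold to undef.
FoldResult foldToyVPOp(const ToyVPOp &Op) {
  if (!allLanesDisabled(Op))
    return FoldResult::None;
  return Op.Kind == VPKind::Reduction ? FoldResult::StartValue
                                      : FoldResult::Undef;
}
```

In this model a binary op with a constant EVL of 0 folds to `FoldResult::Undef`, a reduction with an all-false constant mask folds to `FoldResult::StartValue`, and anything with a possibly active lane is left alone.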

This is tested rather trivially at the IR level, where it's possible
that we want to teach instcombine to perform this optimization.

However, we can also see the zero-evl case arise during SelectionDAG
legalization, when wide VP operations can be split into two and the
upper operation emerges as trivially false.

It's possible that we could perform this optimization "proactively"
(both on legal vectors and before splitting) and reduce the width of an
operation and insert it into a larger undef vector:

v8i32 vp_add x, y, mask, 4
->
v8i32 insert_subvector (v8i32 undef), (v4i32 vp_add xsub, ysub, mask, 4), i32 0

This is somewhat analogous to similar vector narrowing/widening
optimizations, but it's unclear at this point whether doing this for
VP ops is beneficial on any or all targets.
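
To make the narrowing idea above concrete, here is a small C++ model, offered as a sketch rather than anything the patch implements. Undef lanes are modelled as `std::nullopt`, and `toyVPAdd`, `toyInsertSubvector` and `toyNarrowedVPAdd` are hypothetical helpers. It assumes the narrowed operation uses the low lanes of the original mask and that the EVL does not exceed the narrowed width.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// A lane is either a defined integer or undef (modelled as std::nullopt).
using Lane = std::optional<int>;

// Toy vp_add: lanes at or beyond EVL, or with a false mask bit, stay undef.
std::vector<Lane> toyVPAdd(const std::vector<int> &X, const std::vector<int> &Y,
                           const std::vector<bool> &Mask, std::size_t EVL) {
  std::vector<Lane> Out(X.size());
  for (std::size_t I = 0; I < X.size(); ++I)
    if (I < EVL && Mask[I])
      Out[I] = X[I] + Y[I];
  return Out;
}

// Toy insert_subvector: overwrite lanes [Idx, Idx + Sub.size()) of Wide.
std::vector<Lane> toyInsertSubvector(std::vector<Lane> Wide,
                                     const std::vector<Lane> &Sub,
                                     std::size_t Idx) {
  for (std::size_t I = 0; I < Sub.size(); ++I)
    Wide[Idx + I] = Sub[I];
  return Wide;
}

// Narrowed form: run the add on the low NarrowLanes lanes only, then insert
// the result into an all-undef vector of the original width.
std::vector<Lane> toyNarrowedVPAdd(const std::vector<int> &X,
                                   const std::vector<int> &Y,
                                   const std::vector<bool> &Mask,
                                   std::size_t EVL, std::size_t NarrowLanes) {
  std::vector<int> XSub(X.begin(), X.begin() + NarrowLanes);
  std::vector<int> YSub(Y.begin(), Y.begin() + NarrowLanes);
  std::vector<bool> MaskSub(Mask.begin(), Mask.begin() + NarrowLanes);
  std::vector<Lane> Undef(X.size());
  return toyInsertSubvector(Undef, toyVPAdd(XSub, YSub, MaskSub, EVL), 0);
}
```

For 8-lane inputs with an EVL of 4, `toyVPAdd(X, Y, Mask, 4)` and `toyNarrowedVPAdd(X, Y, Mask, 4, 4)` produce the same lanes in this model, which is the equivalence the rewrite relies on. Whether the narrowed form is cheaper is target-dependent and not something the model captures.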

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

frasercrmck created this revision. Sep 2 2021, 5:06 AM

Herald added subscribers: ecnelises, rogfer01, luismarques and 20 others. Sep 2 2021, 5:06 AM

frasercrmck requested review of this revision. Sep 2 2021, 5:06 AM

Herald added a project: Restricted Project. Sep 2 2021, 5:06 AM

Herald added subscribers: llvm-commits, vkmr, MaskRay.

frasercrmck added a project: Restricted Project. Sep 2 2021, 5:07 AM

Harbormaster completed remote builds in B122283: Diff 370233. Sep 2 2021, 5:07 AM

This makes sense. +1 for also doing the corresponding IR optimization in InstSimplify.

I am not so sure about pro-active type narrowing (introducing insert_subvector nodes), at least not for all targets: we take VP nodes pretty much raw in the VE backend.

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
22063	You could use the property macros in `include/llvm/IR/VPIntrinsics.def` for this instead. E.g., define `ISD::isVPBinaryOp(ISD)` and `ISD::isVPReductionOP(ISD)` and use them here.
22083	`cast<MemSDNode>` and handle all mem ops generically? `MemSDNode::getChain()` gives you the chain. `MemSDNode::readMem()` tells you whether we need an `UNDEF`. This should automatically handle the upcoming `strided_load/store`.

remove switch
use binary-op and reduction helpers
use MemSDNode to simplify load, store, etc.
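
The memory-op part of this update can be pictured with a toy model. This is only a sketch: `ToyMemOp`, `ToyFold` and `foldDisabledMemOp` are hypothetical names, and the chain is represented by a plain string rather than an `SDValue`. The point is that a fully disabled store only needs its incoming chain forwarded, while a fully disabled load additionally needs an undef value for its result.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a VP memory node with no active lanes.
struct ToyMemOp {
  bool WritesMem;    // true for store/scatter, false for load/gather
  std::string Chain; // the incoming chain, modelled as a label
};

// What the node is replaced with once it is known to be fully disabled.
struct ToyFold {
  bool ProducesUndefValue; // loads must still yield a (undef) value result
  std::string Chain;       // the chain is forwarded unchanged in both cases
};

ToyFold foldDisabledMemOp(const ToyMemOp &Op) {
  if (Op.WritesMem)
    return {false, Op.Chain}; // store-like: chain only
  return {true, Op.Chain};    // load-like: undef value plus chain
}
```

Keying the decision on a single "writes memory" property is what lets one code path cover loads, stores, gathers and scatters without a per-opcode switch.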

Herald added a subscriber: dexonsmith. Sep 6 2021, 5:13 AM

frasercrmck marked 2 inline comments as done. Sep 6 2021, 5:15 AM

frasercrmck added inline comments.

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
22063	All good ideas, thank you. I've added those now. I slightly changed the macros from `HANDLE_` to `PROPERTY` since they're not currently being used to "handle" anything: they just flag an opcode as having a certain property.
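
For readers unfamiliar with the property-macro pattern discussed here, the following is a minimal self-contained sketch of the idea. It is not the layout of `VPIntrinsics.def`: the real file is pulled in with `#include` after defining the macros of interest, whereas this example passes the macros as parameters to a hypothetical list macro `TOY_VP_OPCODES` so that it fits in one translation unit. All `Toy`/`TOY_` names are invented for illustration.

```cpp
#include <cassert>

// Hypothetical opcode list. Each row registers an opcode and optionally
// tags it with a property; a property flags the opcode, nothing more.
#define TOY_VP_OPCODES(OPCODE, BINARYOP, REDUCTION) \
  OPCODE(VP_ADD) BINARYOP(VP_ADD) \
  OPCODE(VP_FADD) BINARYOP(VP_FADD) \
  OPCODE(VP_REDUCE_ADD) REDUCTION(VP_REDUCE_ADD) \
  OPCODE(VP_LOAD)

// Expansion helpers: ignore an entry, emit an enumerator, or emit a case.
#define TOY_IGNORE(ID)
#define TOY_ENUMERATOR(ID) ID,
#define TOY_CASE_TRUE(ID) \
  case ToyOpcode::ID: \
    return true;

enum class ToyOpcode { TOY_VP_OPCODES(TOY_ENUMERATOR, TOY_IGNORE, TOY_IGNORE) };

// Only the binary-op property expands to a case label here.
bool isToyVPBinaryOp(ToyOpcode Opc) {
  switch (Opc) {
  default:
    return false;
    TOY_VP_OPCODES(TOY_IGNORE, TOY_CASE_TRUE, TOY_IGNORE)
  }
}

// Only the reduction property expands to a case label here.
bool isToyVPReduction(ToyOpcode Opc) {
  switch (Opc) {
  default:
    return false;
    TOY_VP_OPCODES(TOY_IGNORE, TOY_IGNORE, TOY_CASE_TRUE)
  }
}
```

Adding a new opcode to the list with the right property tag is then enough for the predicates to pick it up, which is the maintenance benefit the review comment is after.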

Harbormaster completed remote builds in B122750: Diff 370898. Sep 6 2021, 6:01 AM

RKSimon added inline comments. Sep 7 2021, 10:32 AM

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
1744	Should we have a #undef BEGIN_REGISTER_VP_SDNODE after the include?

craig.topper added inline comments. Sep 7 2021, 10:47 AM

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
1744	Isn't it undefed at the end of VPIntrinsics.def?

RKSimon added inline comments. Sep 7 2021, 1:00 PM

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
1744	So it does - sorry about that :)

LGTM. We should clean up the VPIntrinsics.def file some time. Originally, every property was attached to some entity (intrinsic and/or sdnode) by placing the property macro in the scope of the entity. Now, we are seeing the ID being repeated in the property macros, making it independent of the scope. Let's discuss this on the call. This should not hold back the patch.

In D109148#3020272, @simoll wrote:

LGTM. We should clean up the VPIntrinsics.def file some time. Originally, every property was attached to some entity (intrinsic and/or sdnode) by placing the property macro in the scope of the entity. Now, we are seeing the ID being repeated in the property macros, making it independent of the scope. Let's discuss this on the call. This should not hold back the patch.

Makes sense, let's discuss.

I'll need someone's official acceptance before I can merge this patch. I'll also get round to doing the same thing in InstSimplify some time.

LGTM

This revision is now accepted and ready to land. Sep 24 2021, 9:41 AM

This revision was landed with ongoing or failed builds. Sep 27 2021, 3:40 AM

Closed by commit rGe2b46e336bad: [DAGCombiner][VP] Fold zero-length or false-masked VP ops (authored by frasercrmck).

This revision was automatically updated to reflect the committed changes.

frasercrmck added a commit: rGe2b46e336bad: [DAGCombiner][VP] Fold zero-length or false-masked VP ops.

Revision Contents

Path                                                    Size
llvm/include/llvm/CodeGen/ISDOpcodes.h                  6 lines
llvm/include/llvm/IR/VPIntrinsics.def                   17 lines
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp           38 lines
llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp          22 lines
llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vadd-vp.ll    25 lines
llvm/test/CodeGen/RISCV/rvv/undef-vp-ops.ll             633 lines
llvm/test/CodeGen/RISCV/rvv/vadd-vp.ll                  41 lines

Diff 375191

llvm/include/llvm/CodeGen/ISDOpcodes.h

	[...]

	/// Get underlying scalar opcode for VECREDUCE opcode.			/// Get underlying scalar opcode for VECREDUCE opcode.
	/// For example ISD::AND for ISD::VECREDUCE_AND.			/// For example ISD::AND for ISD::VECREDUCE_AND.
	NodeType getVecReduceBaseOpcode(unsigned VecReduceOpcode);			NodeType getVecReduceBaseOpcode(unsigned VecReduceOpcode);

	/// Whether this is a vector-predicated Opcode.			/// Whether this is a vector-predicated Opcode.
	bool isVPOpcode(unsigned Opcode);			bool isVPOpcode(unsigned Opcode);

				/// Whether this is a vector-predicated binary operation opcode.
				bool isVPBinaryOp(unsigned Opcode);

				/// Whether this is a vector-predicated reduction opcode.
				bool isVPReduction(unsigned Opcode);

	/// The operand position of the vector mask.			/// The operand position of the vector mask.
	Optional<unsigned> getVPMaskIdx(unsigned Opcode);			Optional<unsigned> getVPMaskIdx(unsigned Opcode);

	/// The operand position of the explicit vector length parameter.			/// The operand position of the explicit vector length parameter.
	Optional<unsigned> getVPExplicitVectorLengthIdx(unsigned Opcode);			Optional<unsigned> getVPExplicitVectorLengthIdx(unsigned Opcode);

	//===--------------------------------------------------------------------===//			//===--------------------------------------------------------------------===//
	/// MemIndexedMode enum - This enum defines the load / store indexed			/// MemIndexedMode enum - This enum defines the load / store indexed
	[...]

llvm/include/llvm/IR/VPIntrinsics.def

	[...]
	#define HANDLE_VP_IS_MEMOP(VPID, POINTERPOS, DATAPOS)			#define HANDLE_VP_IS_MEMOP(VPID, POINTERPOS, DATAPOS)
	#endif			#endif

	// Map this VP reduction intrinsic to its reduction operand positions.			// Map this VP reduction intrinsic to its reduction operand positions.
	#ifndef HANDLE_VP_REDUCTION			#ifndef HANDLE_VP_REDUCTION
	#define HANDLE_VP_REDUCTION(ID, STARTPOS, VECTORPOS)			#define HANDLE_VP_REDUCTION(ID, STARTPOS, VECTORPOS)
	#endif			#endif

				// A property to infer VP binary-op SDNode opcodes automatically.
				#ifndef PROPERTY_VP_BINARYOP_SDNODE
				#define PROPERTY_VP_BINARYOP_SDNODE(ID)
				#endif

				// A property to infer VP reduction SDNode opcodes automatically.
				#ifndef PROPERTY_VP_REDUCTION_SDNODE
				#define PROPERTY_VP_REDUCTION_SDNODE(ID)
				#endif

	/// } Property Macros			/// } Property Macros

	///// Integer Arithmetic {			///// Integer Arithmetic {

	// Specialized helper macro for integer binary operators (%x, %y, %mask, %evl).			// Specialized helper macro for integer binary operators (%x, %y, %mask, %evl).
	#ifdef HELPER_REGISTER_BINARY_INT_VP			#ifdef HELPER_REGISTER_BINARY_INT_VP
	#error "The internal helper macro HELPER_REGISTER_BINARY_INT_VP is already defined!"			#error "The internal helper macro HELPER_REGISTER_BINARY_INT_VP is already defined!"
	#endif			#endif
	#define HELPER_REGISTER_BINARY_INT_VP(INTRIN, SDOPC, OPC) \			#define HELPER_REGISTER_BINARY_INT_VP(INTRIN, SDOPC, OPC) \
	BEGIN_REGISTER_VP(INTRIN, 2, 3, SDOPC, -1) \			BEGIN_REGISTER_VP(INTRIN, 2, 3, SDOPC, -1) \
	HANDLE_VP_TO_OPC(OPC) \			HANDLE_VP_TO_OPC(OPC) \
				PROPERTY_VP_BINARYOP_SDNODE(SDOPC) \
	END_REGISTER_VP(INTRIN, SDOPC)			END_REGISTER_VP(INTRIN, SDOPC)



	// llvm.vp.add(x,y,mask,vlen)			// llvm.vp.add(x,y,mask,vlen)
	HELPER_REGISTER_BINARY_INT_VP(vp_add, VP_ADD, Add)			HELPER_REGISTER_BINARY_INT_VP(vp_add, VP_ADD, Add)

	// llvm.vp.and(x,y,mask,vlen)			// llvm.vp.and(x,y,mask,vlen)
	[...]
	#ifdef HELPER_REGISTER_BINARY_FP_VP			#ifdef HELPER_REGISTER_BINARY_FP_VP
	#error \			#error \
	"The internal helper macro HELPER_REGISTER_BINARY_FP_VP is already defined!"			"The internal helper macro HELPER_REGISTER_BINARY_FP_VP is already defined!"
	#endif			#endif
	#define HELPER_REGISTER_BINARY_FP_VP(OPSUFFIX, SDOPC, OPC) \			#define HELPER_REGISTER_BINARY_FP_VP(OPSUFFIX, SDOPC, OPC) \
	BEGIN_REGISTER_VP(vp_##OPSUFFIX, 2, 3, SDOPC, -1) \			BEGIN_REGISTER_VP(vp_##OPSUFFIX, 2, 3, SDOPC, -1) \
	HANDLE_VP_TO_OPC(OPC) \			HANDLE_VP_TO_OPC(OPC) \
	HANDLE_VP_TO_CONSTRAINEDFP(1, 1, experimental_constrained_##OPSUFFIX) \			HANDLE_VP_TO_CONSTRAINEDFP(1, 1, experimental_constrained_##OPSUFFIX) \
				PROPERTY_VP_BINARYOP_SDNODE(SDOPC) \
	END_REGISTER_VP(vp_##OPSUFFIX, SDOPC)			END_REGISTER_VP(vp_##OPSUFFIX, SDOPC)

	// llvm.vp.fadd(x,y,mask,vlen)			// llvm.vp.fadd(x,y,mask,vlen)
	HELPER_REGISTER_BINARY_FP_VP(fadd, VP_FADD, FAdd)			HELPER_REGISTER_BINARY_FP_VP(fadd, VP_FADD, FAdd)

	// llvm.vp.fsub(x,y,mask,vlen)			// llvm.vp.fsub(x,y,mask,vlen)
	HELPER_REGISTER_BINARY_FP_VP(fsub, VP_FSUB, FSub)			HELPER_REGISTER_BINARY_FP_VP(fsub, VP_FSUB, FSub)

	[...]
	// Specialized helper macro for VP reductions (%start, %x, %mask, %evl).			// Specialized helper macro for VP reductions (%start, %x, %mask, %evl).
	#ifdef HELPER_REGISTER_REDUCTION_VP			#ifdef HELPER_REGISTER_REDUCTION_VP
	#error "The internal helper macro HELPER_REGISTER_REDUCTION_VP is already defined!"			#error "The internal helper macro HELPER_REGISTER_REDUCTION_VP is already defined!"
	#endif			#endif
	#define HELPER_REGISTER_REDUCTION_VP(VPINTRIN, SDOPC, INTRIN) \			#define HELPER_REGISTER_REDUCTION_VP(VPINTRIN, SDOPC, INTRIN) \
	BEGIN_REGISTER_VP(VPINTRIN, 2, 3, SDOPC, -1) \			BEGIN_REGISTER_VP(VPINTRIN, 2, 3, SDOPC, -1) \
	HANDLE_VP_TO_INTRIN(INTRIN) \			HANDLE_VP_TO_INTRIN(INTRIN) \
	HANDLE_VP_REDUCTION(VPINTRIN, 0, 1) \			HANDLE_VP_REDUCTION(VPINTRIN, 0, 1) \
				PROPERTY_VP_REDUCTION_SDNODE(SDOPC) \
	END_REGISTER_VP(VPINTRIN, SDOPC)			END_REGISTER_VP(VPINTRIN, SDOPC)

	// llvm.vp.reduce.add(start,x,mask,vlen)			// llvm.vp.reduce.add(start,x,mask,vlen)
	HELPER_REGISTER_REDUCTION_VP(vp_reduce_add, VP_REDUCE_ADD,			HELPER_REGISTER_REDUCTION_VP(vp_reduce_add, VP_REDUCE_ADD,
	experimental_vector_reduce_add)			experimental_vector_reduce_add)

	// llvm.vp.reduce.mul(start,x,mask,vlen)			// llvm.vp.reduce.mul(start,x,mask,vlen)
	HELPER_REGISTER_REDUCTION_VP(vp_reduce_mul, VP_REDUCE_MUL,			HELPER_REGISTER_REDUCTION_VP(vp_reduce_mul, VP_REDUCE_MUL,
	[...]
	#define HELPER_REGISTER_REDUCTION_SEQ_VP(VPINTRIN, SDOPC, SEQ_SDOPC, INTRIN) \			#define HELPER_REGISTER_REDUCTION_SEQ_VP(VPINTRIN, SDOPC, SEQ_SDOPC, INTRIN) \
	BEGIN_REGISTER_VP_INTRINSIC(VPINTRIN, 2, 3) \			BEGIN_REGISTER_VP_INTRINSIC(VPINTRIN, 2, 3) \
	BEGIN_REGISTER_VP_SDNODE(SDOPC, -1, VPINTRIN, 2, 3) \			BEGIN_REGISTER_VP_SDNODE(SDOPC, -1, VPINTRIN, 2, 3) \
	END_REGISTER_VP_SDNODE(SDOPC) \			END_REGISTER_VP_SDNODE(SDOPC) \
	BEGIN_REGISTER_VP_SDNODE(SEQ_SDOPC, -1, VPINTRIN, 2, 3) \			BEGIN_REGISTER_VP_SDNODE(SEQ_SDOPC, -1, VPINTRIN, 2, 3) \
	END_REGISTER_VP_SDNODE(SEQ_SDOPC) \			END_REGISTER_VP_SDNODE(SEQ_SDOPC) \
	HANDLE_VP_TO_INTRIN(INTRIN) \			HANDLE_VP_TO_INTRIN(INTRIN) \
	HANDLE_VP_REDUCTION(VPINTRIN, 0, 1) \			HANDLE_VP_REDUCTION(VPINTRIN, 0, 1) \
				PROPERTY_VP_REDUCTION_SDNODE(SDOPC) \
				PROPERTY_VP_REDUCTION_SDNODE(SEQ_SDOPC) \
	END_REGISTER_VP_INTRINSIC(VPINTRIN)			END_REGISTER_VP_INTRINSIC(VPINTRIN)

	// llvm.vp.reduce.fadd(start,x,mask,vlen)			// llvm.vp.reduce.fadd(start,x,mask,vlen)
	HELPER_REGISTER_REDUCTION_SEQ_VP(vp_reduce_fadd, VP_REDUCE_FADD,			HELPER_REGISTER_REDUCTION_SEQ_VP(vp_reduce_fadd, VP_REDUCE_FADD,
	VP_REDUCE_SEQ_FADD,			VP_REDUCE_SEQ_FADD,
	experimental_vector_reduce_fadd)			experimental_vector_reduce_fadd)

	// llvm.vp.reduce.fmul(start,x,mask,vlen)			// llvm.vp.reduce.fmul(start,x,mask,vlen)
	[...]
	#undef END_REGISTER_VP			#undef END_REGISTER_VP
	#undef END_REGISTER_VP_INTRINSIC			#undef END_REGISTER_VP_INTRINSIC
	#undef END_REGISTER_VP_SDNODE			#undef END_REGISTER_VP_SDNODE
	#undef HANDLE_VP_TO_OPC			#undef HANDLE_VP_TO_OPC
	#undef HANDLE_VP_TO_CONSTRAINEDFP			#undef HANDLE_VP_TO_CONSTRAINEDFP
	#undef HANDLE_VP_TO_INTRIN			#undef HANDLE_VP_TO_INTRIN
	#undef HANDLE_VP_IS_MEMOP			#undef HANDLE_VP_IS_MEMOP
	#undef HANDLE_VP_REDUCTION			#undef HANDLE_VP_REDUCTION
				#undef PROPERTY_VP_BINARYOP_SDNODE
				#undef PROPERTY_VP_REDUCTION_SDNODE

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp

[...]	private:
SDValue visitINSERT_SUBVECTOR(SDNode *N);		SDValue visitINSERT_SUBVECTOR(SDNode *N);
SDValue visitMLOAD(SDNode *N);		SDValue visitMLOAD(SDNode *N);
SDValue visitMSTORE(SDNode *N);		SDValue visitMSTORE(SDNode *N);
SDValue visitMGATHER(SDNode *N);		SDValue visitMGATHER(SDNode *N);
SDValue visitMSCATTER(SDNode *N);		SDValue visitMSCATTER(SDNode *N);
SDValue visitFP_TO_FP16(SDNode *N);		SDValue visitFP_TO_FP16(SDNode *N);
SDValue visitFP16_TO_FP(SDNode *N);		SDValue visitFP16_TO_FP(SDNode *N);
SDValue visitVECREDUCE(SDNode *N);		SDValue visitVECREDUCE(SDNode *N);
		SDValue visitVPOp(SDNode *N);

SDValue visitFADDForFMACombine(SDNode *N);		SDValue visitFADDForFMACombine(SDNode *N);
SDValue visitFSUBForFMACombine(SDNode *N);		SDValue visitFSUBForFMACombine(SDNode *N);
SDValue visitFMULForFMADistributiveCombine(SDNode *N);		SDValue visitFMULForFMADistributiveCombine(SDNode *N);

SDValue XformToShuffleWithZero(SDNode *N);		SDValue XformToShuffleWithZero(SDNode *N);
bool reassociationCanBreakAddressingModePattern(unsigned Opc,		bool reassociationCanBreakAddressingModePattern(unsigned Opc,
const SDLoc &DL, SDValue N0,		const SDLoc &DL, SDValue N0,
[...]	SDValue DAGCombiner::visit(SDNode *N) {
case ISD::VECREDUCE_OR:		case ISD::VECREDUCE_OR:
case ISD::VECREDUCE_XOR:		case ISD::VECREDUCE_XOR:
case ISD::VECREDUCE_SMAX:		case ISD::VECREDUCE_SMAX:
case ISD::VECREDUCE_SMIN:		case ISD::VECREDUCE_SMIN:
case ISD::VECREDUCE_UMAX:		case ISD::VECREDUCE_UMAX:
case ISD::VECREDUCE_UMIN:		case ISD::VECREDUCE_UMIN:
case ISD::VECREDUCE_FMAX:		case ISD::VECREDUCE_FMAX:
case ISD::VECREDUCE_FMIN: return visitVECREDUCE(N);		case ISD::VECREDUCE_FMIN: return visitVECREDUCE(N);
		#define BEGIN_REGISTER_VP_SDNODE(SDOPC, ...) case ISD::SDOPC:
		#include "llvm/IR/VPIntrinsics.def"
		return visitVPOp(N);
}		}
return SDValue();		return SDValue();
}		}

SDValue DAGCombiner::combine(SDNode *N) {		SDValue DAGCombiner::combine(SDNode *N) {
SDValue RV;		SDValue RV;
if (!DisableGenericCombines)		if (!DisableGenericCombines)
RV = visit(N);		RV = visit(N);
[...]	if (!TLI.isOperationLegalOrCustom(Opcode, VT) &&
TLI.isOperationLegalOrCustom(NewOpcode, VT) &&		TLI.isOperationLegalOrCustom(NewOpcode, VT) &&
DAG.ComputeNumSignBits(N0) == VT.getScalarSizeInBits())		DAG.ComputeNumSignBits(N0) == VT.getScalarSizeInBits())
return DAG.getNode(NewOpcode, SDLoc(N), N->getValueType(0), N0);		return DAG.getNode(NewOpcode, SDLoc(N), N->getValueType(0), N0);
}		}

return SDValue();		return SDValue();
}		}

		SDValue DAGCombiner::visitVPOp(SDNode *N) {
		  // VP operations in which all vector elements are disabled - either by
		  // determining that the mask is all false or that the EVL is 0 - can be
		  // eliminated.
		  bool AreAllEltsDisabled = false;
		  if (auto EVLIdx = ISD::getVPExplicitVectorLengthIdx(N->getOpcode()))
		    AreAllEltsDisabled |= isNullConstant(N->getOperand(*EVLIdx));
		  if (auto MaskIdx = ISD::getVPMaskIdx(N->getOpcode()))
		    AreAllEltsDisabled |=
		        ISD::isConstantSplatVectorAllZeros(N->getOperand(*MaskIdx).getNode());

		  // This is the only generic VP combine we support for now.
		  if (!AreAllEltsDisabled)
		    return SDValue();

		  // Binary operations can be replaced by UNDEF.
		  if (ISD::isVPBinaryOp(N->getOpcode()))
		    return DAG.getUNDEF(N->getValueType(0));

		  // VP Memory operations can be replaced by either the chain (stores) or the
		  // chain + undef (loads).
		  if (const auto *MemSD = dyn_cast<MemSDNode>(N)) {
		    if (MemSD->writeMem())
		      return MemSD->getChain();
		    return CombineTo(N, DAG.getUNDEF(N->getValueType(0)), MemSD->getChain());
		  }

		  // Reduction operations return the start operand when no elements are active.
		  if (ISD::isVPReduction(N->getOpcode()))
		    return N->getOperand(0);

		  return SDValue();
		}

/// Returns a vector_shuffle if it able to transform an AND to a vector_shuffle		/// Returns a vector_shuffle if it able to transform an AND to a vector_shuffle
/// with the destination vector and a zero vector.		/// with the destination vector and a zero vector.
/// e.g. AND V, <0xffffffff, 0, 0xffffffff, 0>. ==>		/// e.g. AND V, <0xffffffff, 0, 0xffffffff, 0>. ==>
/// vector_shuffle V, Zero, <0, 4, 2, 4>		/// vector_shuffle V, Zero, <0, 4, 2, 4>
SDValue DAGCombiner::XformToShuffleWithZero(SDNode *N) {		SDValue DAGCombiner::XformToShuffleWithZero(SDNode *N) {
assert(N->getOpcode() == ISD::AND && "Unexpected opcode!");		assert(N->getOpcode() == ISD::AND && "Unexpected opcode!");

EVT VT = N->getValueType(0);		EVT VT = N->getValueType(0);
SDValue LHS = N->getOperand(0);		SDValue LHS = N->getOperand(0);
SDValue RHS = peekThroughBitcasts(N->getOperand(1));		SDValue RHS = peekThroughBitcasts(N->getOperand(1));
SDLoc DL(N);		SDLoc DL(N);

// Make sure we're not running after operation legalization where it		// Make sure we're not running after operation legalization where it
[...]

llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp

[...]	default:
return false;		return false;
#define BEGIN_REGISTER_VP_SDNODE(SDOPC, ...) \		#define BEGIN_REGISTER_VP_SDNODE(SDOPC, ...) \
case ISD::SDOPC: \		case ISD::SDOPC: \
return true;		return true;
#include "llvm/IR/VPIntrinsics.def"		#include "llvm/IR/VPIntrinsics.def"
}		}
}		}

		bool ISD::isVPBinaryOp(unsigned Opcode) {
		  switch (Opcode) {
		  default:
		    return false;
		#define PROPERTY_VP_BINARYOP_SDNODE(SDOPC) \
		  case ISD::SDOPC: \
		    return true;
		#include "llvm/IR/VPIntrinsics.def"
		  }
		}

		bool ISD::isVPReduction(unsigned Opcode) {
		  switch (Opcode) {
		  default:
		    return false;
		#define PROPERTY_VP_REDUCTION_SDNODE(SDOPC) \
		  case ISD::SDOPC: \
		    return true;
		#include "llvm/IR/VPIntrinsics.def"
		  }
		}

/// The operand position of the vector mask.		/// The operand position of the vector mask.
Optional<unsigned> ISD::getVPMaskIdx(unsigned Opcode) {		Optional<unsigned> ISD::getVPMaskIdx(unsigned Opcode) {
switch (Opcode) {		switch (Opcode) {
default:		default:
return None;		return None;
#define BEGIN_REGISTER_VP_SDNODE(SDOPC, LEGALPOS, TDNAME, MASKPOS, ...) \		#define BEGIN_REGISTER_VP_SDNODE(SDOPC, LEGALPOS, TDNAME, MASKPOS, ...) \
case ISD::SDOPC: \		case ISD::SDOPC: \
return MASKPOS;		return MASKPOS;
[...]

llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vadd-vp.ll

[...]	; CHECK-NEXT: ret
ret <256 x i8> %v		ret <256 x i8> %v
}		}

; FIXME: The upper half is doing nothing.		; FIXME: The upper half is doing nothing.

define <256 x i8> @vadd_vi_v258i8_evl128(<256 x i8> %va, <256 x i1> %m) {		define <256 x i8> @vadd_vi_v258i8_evl128(<256 x i8> %va, <256 x i1> %m) {
; CHECK-LABEL: vadd_vi_v258i8_evl128:		; CHECK-LABEL: vadd_vi_v258i8_evl128:
; CHECK: # %bb.0:		; CHECK: # %bb.0:
; CHECK-NEXT: addi a1, zero, 128		; CHECK-NEXT: addi a0, zero, 128
; CHECK-NEXT: vsetvli zero, a1, e8, m8, ta, mu		; CHECK-NEXT: vsetvli zero, a0, e8, m8, ta, mu
; CHECK-NEXT: vle1.v v25, (a0)
; CHECK-NEXT: vadd.vi v8, v8, -1, v0.t		; CHECK-NEXT: vadd.vi v8, v8, -1, v0.t
; CHECK-NEXT: vsetivli zero, 0, e8, m8, ta, mu
; CHECK-NEXT: vmv1r.v v0, v25
; CHECK-NEXT: vadd.vi v16, v16, -1, v0.t
; CHECK-NEXT: ret		; CHECK-NEXT: ret
%elt.head = insertelement <256 x i8> undef, i8 -1, i32 0		%elt.head = insertelement <256 x i8> undef, i8 -1, i32 0
%vb = shufflevector <256 x i8> %elt.head, <256 x i8> undef, <256 x i32> zeroinitializer		%vb = shufflevector <256 x i8> %elt.head, <256 x i8> undef, <256 x i32> zeroinitializer
%v = call <256 x i8> @llvm.vp.add.v258i8(<256 x i8> %va, <256 x i8> %vb, <256 x i1> %m, i32 128)		%v = call <256 x i8> @llvm.vp.add.v258i8(<256 x i8> %va, <256 x i8> %vb, <256 x i1> %m, i32 128)
ret <256 x i8> %v		ret <256 x i8> %v
}		}

declare <2 x i16> @llvm.vp.add.v2i16(<2 x i16>, <2 x i16>, <2 x i1>, i32)		declare <2 x i16> @llvm.vp.add.v2i16(<2 x i16>, <2 x i16>, <2 x i1>, i32)
[...]	; RV64-NEXT: ret
%elt.head = insertelement <32 x i64> undef, i64 -1, i32 0		%elt.head = insertelement <32 x i64> undef, i64 -1, i32 0
%vb = shufflevector <32 x i64> %elt.head, <32 x i64> undef, <32 x i32> zeroinitializer		%vb = shufflevector <32 x i64> %elt.head, <32 x i64> undef, <32 x i32> zeroinitializer
%head = insertelement <32 x i1> undef, i1 true, i32 0		%head = insertelement <32 x i1> undef, i1 true, i32 0
%m = shufflevector <32 x i1> %head, <32 x i1> undef, <32 x i32> zeroinitializer		%m = shufflevector <32 x i1> %head, <32 x i1> undef, <32 x i32> zeroinitializer
%v = call <32 x i64> @llvm.vp.add.v32i64(<32 x i64> %va, <32 x i64> %vb, <32 x i1> %m, i32 %evl)		%v = call <32 x i64> @llvm.vp.add.v32i64(<32 x i64> %va, <32 x i64> %vb, <32 x i1> %m, i32 %evl)
ret <32 x i64> %v		ret <32 x i64> %v
}		}

; FIXME: After splitting, the "high" vadd.vv is doing nothing; could be		; FIXME: We don't match vadd.vi on RV32.
; replaced by undef.

define <32 x i64> @vadd_vx_v32i64_evl12(<32 x i64> %va, <32 x i1> %m) {		define <32 x i64> @vadd_vx_v32i64_evl12(<32 x i64> %va, <32 x i1> %m) {
; RV32-LABEL: vadd_vx_v32i64_evl12:		; RV32-LABEL: vadd_vx_v32i64_evl12:
; RV32: # %bb.0:		; RV32: # %bb.0:
; RV32-NEXT: vsetivli zero, 2, e8, mf4, ta, mu
; RV32-NEXT: vslidedown.vi v1, v0, 2
; RV32-NEXT: addi a0, zero, 32		; RV32-NEXT: addi a0, zero, 32
; RV32-NEXT: vsetvli zero, a0, e32, m8, ta, mu		; RV32-NEXT: vsetvli zero, a0, e32, m8, ta, mu
; RV32-NEXT: vmv.v.i v24, -1		; RV32-NEXT: vmv.v.i v16, -1
; RV32-NEXT: vsetivli zero, 12, e64, m8, ta, mu		; RV32-NEXT: vsetivli zero, 12, e64, m8, ta, mu
; RV32-NEXT: vadd.vv v8, v8, v24, v0.t		; RV32-NEXT: vadd.vv v8, v8, v16, v0.t
; RV32-NEXT: vsetivli zero, 0, e64, m8, ta, mu
; RV32-NEXT: vmv1r.v v0, v1
; RV32-NEXT: vadd.vv v16, v16, v24, v0.t
; RV32-NEXT: ret		; RV32-NEXT: ret
;		;
; RV64-LABEL: vadd_vx_v32i64_evl12:		; RV64-LABEL: vadd_vx_v32i64_evl12:
; RV64: # %bb.0:		; RV64: # %bb.0:
; RV64-NEXT: vsetivli zero, 2, e8, mf4, ta, mu
; RV64-NEXT: vslidedown.vi v25, v0, 2
; RV64-NEXT: vsetivli zero, 12, e64, m8, ta, mu		; RV64-NEXT: vsetivli zero, 12, e64, m8, ta, mu
; RV64-NEXT: vadd.vi v8, v8, -1, v0.t		; RV64-NEXT: vadd.vi v8, v8, -1, v0.t
; RV64-NEXT: vsetivli zero, 0, e64, m8, ta, mu
; RV64-NEXT: vmv1r.v v0, v25
; RV64-NEXT: vadd.vi v16, v16, -1, v0.t
; RV64-NEXT: ret		; RV64-NEXT: ret
%elt.head = insertelement <32 x i64> undef, i64 -1, i32 0		%elt.head = insertelement <32 x i64> undef, i64 -1, i32 0
%vb = shufflevector <32 x i64> %elt.head, <32 x i64> undef, <32 x i32> zeroinitializer		%vb = shufflevector <32 x i64> %elt.head, <32 x i64> undef, <32 x i32> zeroinitializer
%v = call <32 x i64> @llvm.vp.add.v32i64(<32 x i64> %va, <32 x i64> %vb, <32 x i1> %m, i32 12)		%v = call <32 x i64> @llvm.vp.add.v32i64(<32 x i64> %va, <32 x i64> %vb, <32 x i1> %m, i32 12)
ret <32 x i64> %v		ret <32 x i64> %v
}		}

define <32 x i64> @vadd_vx_v32i64_evl27(<32 x i64> %va, <32 x i1> %m) {		define <32 x i64> @vadd_vx_v32i64_evl27(<32 x i64> %va, <32 x i1> %m) {
[...]

llvm/test/CodeGen/RISCV/rvv/undef-vp-ops.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
				; RUN: llc -mtriple=riscv32 -mattr=+d,+experimental-zfh,+experimental-v -target-abi=ilp32d -riscv-v-vector-bits-min=128 \
				; RUN: -verify-machineinstrs < %s | FileCheck %s
				; RUN: llc -mtriple=riscv64 -mattr=+d,+experimental-zfh,+experimental-v -target-abi=lp64d -riscv-v-vector-bits-min=128 \
				; RUN: -verify-machineinstrs < %s | FileCheck %s

				; Test that we can remove trivially-undef VP operations of various kinds.

				declare <4 x i32> @llvm.vp.load.v4i32.p0v4i32(<4 x i32>*, <4 x i1>, i32)

				define <4 x i32> @vload_v4i32_zero_evl(<4 x i32>* %ptr, <4 x i1> %m) {
				; CHECK-LABEL: vload_v4i32_zero_evl:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%v = call <4 x i32> @llvm.vp.load.v4i32.p0v4i32(<4 x i32>* %ptr, <4 x i1> %m, i32 0)
				ret <4 x i32> %v
				}

				define <4 x i32> @vload_v4i32_false_mask(<4 x i32>* %ptr, i32 %evl) {
				; CHECK-LABEL: vload_v4i32_false_mask:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%v = call <4 x i32> @llvm.vp.load.v4i32.p0v4i32(<4 x i32>* %ptr, <4 x i1> zeroinitializer, i32 %evl)
				ret <4 x i32> %v
				}

				declare <4 x i32> @llvm.vp.gather.v4i32.v4p0i32(<4 x i32*>, <4 x i1>, i32)

				define <4 x i32> @vgather_v4i32_v4i32_zero_evl(<4 x i32*> %ptrs, <4 x i1> %m) {
				; CHECK-LABEL: vgather_v4i32_v4i32_zero_evl:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%v = call <4 x i32> @llvm.vp.gather.v4i32.v4p0i32(<4 x i32*> %ptrs, <4 x i1> %m, i32 0)
				ret <4 x i32> %v
				}

				define <4 x i32> @vgather_v4i32_v4i32_false_mask(<4 x i32*> %ptrs, i32 %evl) {
				; CHECK-LABEL: vgather_v4i32_v4i32_false_mask:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%v = call <4 x i32> @llvm.vp.gather.v4i32.v4p0i32(<4 x i32*> %ptrs, <4 x i1> zeroinitializer, i32 %evl)
				ret <4 x i32> %v
				}

				declare void @llvm.vp.store.v4i32.p0v4i32(<4 x i32>, <4 x i32>*, <4 x i1>, i32)

				define void @vstore_v4i32_zero_evl(<4 x i32> %val, <4 x i32>* %ptr, <4 x i1> %m) {
				; CHECK-LABEL: vstore_v4i32_zero_evl:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				call void @llvm.vp.store.v4i32.p0v4i32(<4 x i32> %val, <4 x i32>* %ptr, <4 x i1> %m, i32 0)
				ret void
				}

				define void @vstore_v4i32_false_mask(<4 x i32> %val, <4 x i32>* %ptr, i32 %evl) {
				; CHECK-LABEL: vstore_v4i32_false_mask:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				call void @llvm.vp.store.v4i32.p0v4i32(<4 x i32> %val, <4 x i32>* %ptr, <4 x i1> zeroinitializer, i32 %evl)
				ret void
				}

				declare void @llvm.vp.scatter.v4i32.v4p0i32(<4 x i32>, <4 x i32*>, <4 x i1>, i32)

				define void @vscatter_v4i32_zero_evl(<4 x i32> %val, <4 x i32*> %ptrs, <4 x i1> %m) {
				; CHECK-LABEL: vscatter_v4i32_zero_evl:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				call void @llvm.vp.scatter.v4i32.v4p0i32(<4 x i32> %val, <4 x i32*> %ptrs, <4 x i1> %m, i32 0)
				ret void
				}

				define void @vscatter_v4i32_false_mask(<4 x i32> %val, <4 x i32*> %ptrs, i32 %evl) {
				; CHECK-LABEL: vscatter_v4i32_false_mask:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				call void @llvm.vp.scatter.v4i32.v4p0i32(<4 x i32> %val, <4 x i32*> %ptrs, <4 x i1> zeroinitializer, i32 %evl)
				ret void
				}

				declare <4 x i32> @llvm.vp.add.v4i32(<4 x i32>, <4 x i32>, <4 x i1>, i32)

				define <4 x i32> @vadd_v4i32_zero_evl(<4 x i32> %va, <4 x i32> %vb, <4 x i1> %m) {
				; CHECK-LABEL: vadd_v4i32_zero_evl:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call <4 x i32> @llvm.vp.add.v4i32(<4 x i32> %va, <4 x i32> %vb, <4 x i1> %m, i32 0)
				ret <4 x i32> %s
				}

				define <4 x i32> @vadd_v4i32_false_mask(<4 x i32> %va, <4 x i32> %vb, i32 %evl) {
				; CHECK-LABEL: vadd_v4i32_false_mask:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call <4 x i32> @llvm.vp.add.v4i32(<4 x i32> %va, <4 x i32> %vb, <4 x i1> zeroinitializer, i32 %evl)
				ret <4 x i32> %s
				}

				declare <4 x i32> @llvm.vp.and.v4i32(<4 x i32>, <4 x i32>, <4 x i1>, i32)

				define <4 x i32> @vand_v4i32_zero_evl(<4 x i32> %va, <4 x i32> %vb, <4 x i1> %m) {
				; CHECK-LABEL: vand_v4i32_zero_evl:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call <4 x i32> @llvm.vp.and.v4i32(<4 x i32> %va, <4 x i32> %vb, <4 x i1> %m, i32 0)
				ret <4 x i32> %s
				}

				define <4 x i32> @vand_v4i32_false_mask(<4 x i32> %va, <4 x i32> %vb, i32 %evl) {
				; CHECK-LABEL: vand_v4i32_false_mask:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call <4 x i32> @llvm.vp.and.v4i32(<4 x i32> %va, <4 x i32> %vb, <4 x i1> zeroinitializer, i32 %evl)
				ret <4 x i32> %s
				}

				declare <4 x i32> @llvm.vp.lshr.v4i32(<4 x i32>, <4 x i32>, <4 x i1>, i32)

				define <4 x i32> @vlshr_v4i32_zero_evl(<4 x i32> %va, <4 x i32> %vb, <4 x i1> %m) {
				; CHECK-LABEL: vlshr_v4i32_zero_evl:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call <4 x i32> @llvm.vp.lshr.v4i32(<4 x i32> %va, <4 x i32> %vb, <4 x i1> %m, i32 0)
				ret <4 x i32> %s
				}

				define <4 x i32> @vlshr_v4i32_false_mask(<4 x i32> %va, <4 x i32> %vb, i32 %evl) {
				; CHECK-LABEL: vlshr_v4i32_false_mask:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call <4 x i32> @llvm.vp.lshr.v4i32(<4 x i32> %va, <4 x i32> %vb, <4 x i1> zeroinitializer, i32 %evl)
				ret <4 x i32> %s
				}

				declare <4 x i32> @llvm.vp.mul.v4i32(<4 x i32>, <4 x i32>, <4 x i1>, i32)

				define <4 x i32> @vmul_v4i32_zero_evl(<4 x i32> %va, <4 x i32> %vb, <4 x i1> %m) {
				; CHECK-LABEL: vmul_v4i32_zero_evl:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call <4 x i32> @llvm.vp.mul.v4i32(<4 x i32> %va, <4 x i32> %vb, <4 x i1> %m, i32 0)
				ret <4 x i32> %s
				}

				define <4 x i32> @vmul_v4i32_false_mask(<4 x i32> %va, <4 x i32> %vb, i32 %evl) {
				; CHECK-LABEL: vmul_v4i32_false_mask:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call <4 x i32> @llvm.vp.mul.v4i32(<4 x i32> %va, <4 x i32> %vb, <4 x i1> zeroinitializer, i32 %evl)
				ret <4 x i32> %s
				}

				declare <4 x i32> @llvm.vp.or.v4i32(<4 x i32>, <4 x i32>, <4 x i1>, i32)

				define <4 x i32> @vor_v4i32_zero_evl(<4 x i32> %va, <4 x i32> %vb, <4 x i1> %m) {
				; CHECK-LABEL: vor_v4i32_zero_evl:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call <4 x i32> @llvm.vp.or.v4i32(<4 x i32> %va, <4 x i32> %vb, <4 x i1> %m, i32 0)
				ret <4 x i32> %s
				}

				define <4 x i32> @vor_v4i32_false_mask(<4 x i32> %va, <4 x i32> %vb, i32 %evl) {
				; CHECK-LABEL: vor_v4i32_false_mask:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call <4 x i32> @llvm.vp.or.v4i32(<4 x i32> %va, <4 x i32> %vb, <4 x i1> zeroinitializer, i32 %evl)
				ret <4 x i32> %s
				}

				declare <4 x i32> @llvm.vp.sdiv.v4i32(<4 x i32>, <4 x i32>, <4 x i1>, i32)

				define <4 x i32> @vsdiv_v4i32_zero_evl(<4 x i32> %va, <4 x i32> %vb, <4 x i1> %m) {
				; CHECK-LABEL: vsdiv_v4i32_zero_evl:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call <4 x i32> @llvm.vp.sdiv.v4i32(<4 x i32> %va, <4 x i32> %vb, <4 x i1> %m, i32 0)
				ret <4 x i32> %s
				}

				define <4 x i32> @vsdiv_v4i32_false_mask(<4 x i32> %va, <4 x i32> %vb, i32 %evl) {
				; CHECK-LABEL: vsdiv_v4i32_false_mask:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call <4 x i32> @llvm.vp.sdiv.v4i32(<4 x i32> %va, <4 x i32> %vb, <4 x i1> zeroinitializer, i32 %evl)
				ret <4 x i32> %s
				}

				declare <4 x i32> @llvm.vp.srem.v4i32(<4 x i32>, <4 x i32>, <4 x i1>, i32)

				define <4 x i32> @vsrem_v4i32_zero_evl(<4 x i32> %va, <4 x i32> %vb, <4 x i1> %m) {
				; CHECK-LABEL: vsrem_v4i32_zero_evl:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call <4 x i32> @llvm.vp.srem.v4i32(<4 x i32> %va, <4 x i32> %vb, <4 x i1> %m, i32 0)
				ret <4 x i32> %s
				}

				define <4 x i32> @vsrem_v4i32_false_mask(<4 x i32> %va, <4 x i32> %vb, i32 %evl) {
				; CHECK-LABEL: vsrem_v4i32_false_mask:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call <4 x i32> @llvm.vp.srem.v4i32(<4 x i32> %va, <4 x i32> %vb, <4 x i1> zeroinitializer, i32 %evl)
				ret <4 x i32> %s
				}

				declare <4 x i32> @llvm.vp.sub.v4i32(<4 x i32>, <4 x i32>, <4 x i1>, i32)

				define <4 x i32> @vsub_v4i32_zero_evl(<4 x i32> %va, <4 x i32> %vb, <4 x i1> %m) {
				; CHECK-LABEL: vsub_v4i32_zero_evl:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call <4 x i32> @llvm.vp.sub.v4i32(<4 x i32> %va, <4 x i32> %vb, <4 x i1> %m, i32 0)
				ret <4 x i32> %s
				}

				define <4 x i32> @vsub_v4i32_false_mask(<4 x i32> %va, <4 x i32> %vb, i32 %evl) {
				; CHECK-LABEL: vsub_v4i32_false_mask:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call <4 x i32> @llvm.vp.sub.v4i32(<4 x i32> %va, <4 x i32> %vb, <4 x i1> zeroinitializer, i32 %evl)
				ret <4 x i32> %s
				}

				declare <4 x i32> @llvm.vp.udiv.v4i32(<4 x i32>, <4 x i32>, <4 x i1>, i32)

				define <4 x i32> @vudiv_v4i32_zero_evl(<4 x i32> %va, <4 x i32> %vb, <4 x i1> %m) {
				; CHECK-LABEL: vudiv_v4i32_zero_evl:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call <4 x i32> @llvm.vp.udiv.v4i32(<4 x i32> %va, <4 x i32> %vb, <4 x i1> %m, i32 0)
				ret <4 x i32> %s
				}

				define <4 x i32> @vudiv_v4i32_false_mask(<4 x i32> %va, <4 x i32> %vb, i32 %evl) {
				; CHECK-LABEL: vudiv_v4i32_false_mask:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call <4 x i32> @llvm.vp.udiv.v4i32(<4 x i32> %va, <4 x i32> %vb, <4 x i1> zeroinitializer, i32 %evl)
				ret <4 x i32> %s
				}

				declare <4 x i32> @llvm.vp.urem.v4i32(<4 x i32>, <4 x i32>, <4 x i1>, i32)

				define <4 x i32> @vurem_v4i32_zero_evl(<4 x i32> %va, <4 x i32> %vb, <4 x i1> %m) {
				; CHECK-LABEL: vurem_v4i32_zero_evl:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call <4 x i32> @llvm.vp.urem.v4i32(<4 x i32> %va, <4 x i32> %vb, <4 x i1> %m, i32 0)
				ret <4 x i32> %s
				}

				define <4 x i32> @vurem_v4i32_false_mask(<4 x i32> %va, <4 x i32> %vb, i32 %evl) {
				; CHECK-LABEL: vurem_v4i32_false_mask:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call <4 x i32> @llvm.vp.urem.v4i32(<4 x i32> %va, <4 x i32> %vb, <4 x i1> zeroinitializer, i32 %evl)
				ret <4 x i32> %s
				}

				declare <4 x i32> @llvm.vp.xor.v4i32(<4 x i32>, <4 x i32>, <4 x i1>, i32)

				define <4 x i32> @vxor_v4i32_zero_evl(<4 x i32> %va, <4 x i32> %vb, <4 x i1> %m) {
				; CHECK-LABEL: vxor_v4i32_zero_evl:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call <4 x i32> @llvm.vp.xor.v4i32(<4 x i32> %va, <4 x i32> %vb, <4 x i1> %m, i32 0)
				ret <4 x i32> %s
				}

				define <4 x i32> @vxor_v4i32_false_mask(<4 x i32> %va, <4 x i32> %vb, i32 %evl) {
				; CHECK-LABEL: vxor_v4i32_false_mask:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call <4 x i32> @llvm.vp.xor.v4i32(<4 x i32> %va, <4 x i32> %vb, <4 x i1> zeroinitializer, i32 %evl)
				ret <4 x i32> %s
				}

				declare <4 x float> @llvm.vp.fadd.v4f32(<4 x float>, <4 x float>, <4 x i1>, i32)

				define <4 x float> @vfadd_v4f32_zero_evl(<4 x float> %va, <4 x float> %vb, <4 x i1> %m) {
				; CHECK-LABEL: vfadd_v4f32_zero_evl:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call <4 x float> @llvm.vp.fadd.v4f32(<4 x float> %va, <4 x float> %vb, <4 x i1> %m, i32 0)
				ret <4 x float> %s
				}

				define <4 x float> @vfadd_v4f32_false_mask(<4 x float> %va, <4 x float> %vb, i32 %evl) {
				; CHECK-LABEL: vfadd_v4f32_false_mask:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call <4 x float> @llvm.vp.fadd.v4f32(<4 x float> %va, <4 x float> %vb, <4 x i1> zeroinitializer, i32 %evl)
				ret <4 x float> %s
				}

				declare <4 x float> @llvm.vp.fsub.v4f32(<4 x float>, <4 x float>, <4 x i1>, i32)

				define <4 x float> @vfsub_v4f32_zero_evl(<4 x float> %va, <4 x float> %vb, <4 x i1> %m) {
				; CHECK-LABEL: vfsub_v4f32_zero_evl:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call <4 x float> @llvm.vp.fsub.v4f32(<4 x float> %va, <4 x float> %vb, <4 x i1> %m, i32 0)
				ret <4 x float> %s
				}

				define <4 x float> @vfsub_v4f32_false_mask(<4 x float> %va, <4 x float> %vb, i32 %evl) {
				; CHECK-LABEL: vfsub_v4f32_false_mask:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call <4 x float> @llvm.vp.fsub.v4f32(<4 x float> %va, <4 x float> %vb, <4 x i1> zeroinitializer, i32 %evl)
				ret <4 x float> %s
				}

				declare <4 x float> @llvm.vp.fmul.v4f32(<4 x float>, <4 x float>, <4 x i1>, i32)

				define <4 x float> @vfmul_v4f32_zero_evl(<4 x float> %va, <4 x float> %vb, <4 x i1> %m) {
				; CHECK-LABEL: vfmul_v4f32_zero_evl:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call <4 x float> @llvm.vp.fmul.v4f32(<4 x float> %va, <4 x float> %vb, <4 x i1> %m, i32 0)
				ret <4 x float> %s
				}

				define <4 x float> @vfmul_v4f32_false_mask(<4 x float> %va, <4 x float> %vb, i32 %evl) {
				; CHECK-LABEL: vfmul_v4f32_false_mask:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call <4 x float> @llvm.vp.fmul.v4f32(<4 x float> %va, <4 x float> %vb, <4 x i1> zeroinitializer, i32 %evl)
				ret <4 x float> %s
				}

				declare <4 x float> @llvm.vp.fdiv.v4f32(<4 x float>, <4 x float>, <4 x i1>, i32)

				define <4 x float> @vfdiv_v4f32_zero_evl(<4 x float> %va, <4 x float> %vb, <4 x i1> %m) {
				; CHECK-LABEL: vfdiv_v4f32_zero_evl:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call <4 x float> @llvm.vp.fdiv.v4f32(<4 x float> %va, <4 x float> %vb, <4 x i1> %m, i32 0)
				ret <4 x float> %s
				}

				define <4 x float> @vfdiv_v4f32_false_mask(<4 x float> %va, <4 x float> %vb, i32 %evl) {
				; CHECK-LABEL: vfdiv_v4f32_false_mask:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call <4 x float> @llvm.vp.fdiv.v4f32(<4 x float> %va, <4 x float> %vb, <4 x i1> zeroinitializer, i32 %evl)
				ret <4 x float> %s
				}

				declare <4 x float> @llvm.vp.frem.v4f32(<4 x float>, <4 x float>, <4 x i1>, i32)

				define <4 x float> @vfrem_v4f32_zero_evl(<4 x float> %va, <4 x float> %vb, <4 x i1> %m) {
				; CHECK-LABEL: vfrem_v4f32_zero_evl:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call <4 x float> @llvm.vp.frem.v4f32(<4 x float> %va, <4 x float> %vb, <4 x i1> %m, i32 0)
				ret <4 x float> %s
				}

				define <4 x float> @vfrem_v4f32_false_mask(<4 x float> %va, <4 x float> %vb, i32 %evl) {
				; CHECK-LABEL: vfrem_v4f32_false_mask:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call <4 x float> @llvm.vp.frem.v4f32(<4 x float> %va, <4 x float> %vb, <4 x i1> zeroinitializer, i32 %evl)
				ret <4 x float> %s
				}

				declare i32 @llvm.vp.reduce.add.v4i32(i32, <4 x i32>, <4 x i1>, i32)

				define i32 @vreduce_add_v4i32_zero_evl(i32 %start, <4 x i32> %val, <4 x i1> %m) {
				; CHECK-LABEL: vreduce_add_v4i32_zero_evl:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call i32 @llvm.vp.reduce.add.v4i32(i32 %start, <4 x i32> %val, <4 x i1> %m, i32 0)
				ret i32 %s
				}

				define i32 @vreduce_add_v4i32_false_mask(i32 %start, <4 x i32> %val, i32 %evl) {
				; CHECK-LABEL: vreduce_add_v4i32_false_mask:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call i32 @llvm.vp.reduce.add.v4i32(i32 %start, <4 x i32> %val, <4 x i1> zeroinitializer, i32 %evl)
				ret i32 %s
				}

				declare i32 @llvm.vp.reduce.mul.v4i32(i32, <4 x i32>, <4 x i1>, i32)

				define i32 @vreduce_mul_v4i32_zero_evl(i32 %start, <4 x i32> %val, <4 x i1> %m) {
				; CHECK-LABEL: vreduce_mul_v4i32_zero_evl:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call i32 @llvm.vp.reduce.mul.v4i32(i32 %start, <4 x i32> %val, <4 x i1> %m, i32 0)
				ret i32 %s
				}

				define i32 @vreduce_mul_v4i32_false_mask(i32 %start, <4 x i32> %val, i32 %evl) {
				; CHECK-LABEL: vreduce_mul_v4i32_false_mask:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call i32 @llvm.vp.reduce.mul.v4i32(i32 %start, <4 x i32> %val, <4 x i1> zeroinitializer, i32 %evl)
				ret i32 %s
				}

				declare i32 @llvm.vp.reduce.and.v4i32(i32, <4 x i32>, <4 x i1>, i32)

				define i32 @vreduce_and_v4i32_zero_evl(i32 %start, <4 x i32> %val, <4 x i1> %m) {
				; CHECK-LABEL: vreduce_and_v4i32_zero_evl:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call i32 @llvm.vp.reduce.and.v4i32(i32 %start, <4 x i32> %val, <4 x i1> %m, i32 0)
				ret i32 %s
				}

				define i32 @vreduce_and_v4i32_false_mask(i32 %start, <4 x i32> %val, i32 %evl) {
				; CHECK-LABEL: vreduce_and_v4i32_false_mask:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call i32 @llvm.vp.reduce.and.v4i32(i32 %start, <4 x i32> %val, <4 x i1> zeroinitializer, i32 %evl)
				ret i32 %s
				}

				declare i32 @llvm.vp.reduce.or.v4i32(i32, <4 x i32>, <4 x i1>, i32)

				define i32 @vreduce_or_v4i32_zero_evl(i32 %start, <4 x i32> %val, <4 x i1> %m) {
				; CHECK-LABEL: vreduce_or_v4i32_zero_evl:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call i32 @llvm.vp.reduce.or.v4i32(i32 %start, <4 x i32> %val, <4 x i1> %m, i32 0)
				ret i32 %s
				}

				define i32 @vreduce_or_v4i32_false_mask(i32 %start, <4 x i32> %val, i32 %evl) {
				; CHECK-LABEL: vreduce_or_v4i32_false_mask:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call i32 @llvm.vp.reduce.or.v4i32(i32 %start, <4 x i32> %val, <4 x i1> zeroinitializer, i32 %evl)
				ret i32 %s
				}

				declare i32 @llvm.vp.reduce.xor.v4i32(i32, <4 x i32>, <4 x i1>, i32)

				define i32 @vreduce_xor_v4i32_zero_evl(i32 %start, <4 x i32> %val, <4 x i1> %m) {
				; CHECK-LABEL: vreduce_xor_v4i32_zero_evl:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call i32 @llvm.vp.reduce.xor.v4i32(i32 %start, <4 x i32> %val, <4 x i1> %m, i32 0)
				ret i32 %s
				}

				define i32 @vreduce_xor_v4i32_false_mask(i32 %start, <4 x i32> %val, i32 %evl) {
				; CHECK-LABEL: vreduce_xor_v4i32_false_mask:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call i32 @llvm.vp.reduce.xor.v4i32(i32 %start, <4 x i32> %val, <4 x i1> zeroinitializer, i32 %evl)
				ret i32 %s
				}

				declare i32 @llvm.vp.reduce.smax.v4i32(i32, <4 x i32>, <4 x i1>, i32)

				define i32 @vreduce_smax_v4i32_zero_evl(i32 %start, <4 x i32> %val, <4 x i1> %m) {
				; CHECK-LABEL: vreduce_smax_v4i32_zero_evl:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call i32 @llvm.vp.reduce.smax.v4i32(i32 %start, <4 x i32> %val, <4 x i1> %m, i32 0)
				ret i32 %s
				}

				define i32 @vreduce_smax_v4i32_false_mask(i32 %start, <4 x i32> %val, i32 %evl) {
				; CHECK-LABEL: vreduce_smax_v4i32_false_mask:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call i32 @llvm.vp.reduce.smax.v4i32(i32 %start, <4 x i32> %val, <4 x i1> zeroinitializer, i32 %evl)
				ret i32 %s
				}

				declare i32 @llvm.vp.reduce.smin.v4i32(i32, <4 x i32>, <4 x i1>, i32)

				define i32 @vreduce_smin_v4i32_zero_evl(i32 %start, <4 x i32> %val, <4 x i1> %m) {
				; CHECK-LABEL: vreduce_smin_v4i32_zero_evl:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call i32 @llvm.vp.reduce.smin.v4i32(i32 %start, <4 x i32> %val, <4 x i1> %m, i32 0)
				ret i32 %s
				}

				define i32 @vreduce_smin_v4i32_false_mask(i32 %start, <4 x i32> %val, i32 %evl) {
				; CHECK-LABEL: vreduce_smin_v4i32_false_mask:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call i32 @llvm.vp.reduce.smin.v4i32(i32 %start, <4 x i32> %val, <4 x i1> zeroinitializer, i32 %evl)
				ret i32 %s
				}

				declare i32 @llvm.vp.reduce.umax.v4i32(i32, <4 x i32>, <4 x i1>, i32)

				define i32 @vreduce_umax_v4i32_zero_evl(i32 %start, <4 x i32> %val, <4 x i1> %m) {
				; CHECK-LABEL: vreduce_umax_v4i32_zero_evl:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call i32 @llvm.vp.reduce.umax.v4i32(i32 %start, <4 x i32> %val, <4 x i1> %m, i32 0)
				ret i32 %s
				}

				define i32 @vreduce_umax_v4i32_false_mask(i32 %start, <4 x i32> %val, i32 %evl) {
				; CHECK-LABEL: vreduce_umax_v4i32_false_mask:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call i32 @llvm.vp.reduce.umax.v4i32(i32 %start, <4 x i32> %val, <4 x i1> zeroinitializer, i32 %evl)
				ret i32 %s
				}

				declare i32 @llvm.vp.reduce.umin.v4i32(i32, <4 x i32>, <4 x i1>, i32)

				define i32 @vreduce_umin_v4i32_zero_evl(i32 %start, <4 x i32> %val, <4 x i1> %m) {
				; CHECK-LABEL: vreduce_umin_v4i32_zero_evl:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call i32 @llvm.vp.reduce.umin.v4i32(i32 %start, <4 x i32> %val, <4 x i1> %m, i32 0)
				ret i32 %s
				}

				define i32 @vreduce_umin_v4i32_false_mask(i32 %start, <4 x i32> %val, i32 %evl) {
				; CHECK-LABEL: vreduce_umin_v4i32_false_mask:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call i32 @llvm.vp.reduce.umin.v4i32(i32 %start, <4 x i32> %val, <4 x i1> zeroinitializer, i32 %evl)
				ret i32 %s
				}

				declare float @llvm.vp.reduce.fadd.v4f32(float, <4 x float>, <4 x i1>, i32)

				define float @vreduce_seq_fadd_v4f32_zero_evl(float %start, <4 x float> %val, <4 x i1> %m) {
				; CHECK-LABEL: vreduce_seq_fadd_v4f32_zero_evl:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call float @llvm.vp.reduce.fadd.v4f32(float %start, <4 x float> %val, <4 x i1> %m, i32 0)
				ret float %s
				}

				define float @vreduce_seq_fadd_v4f32_false_mask(float %start, <4 x float> %val, i32 %evl) {
				; CHECK-LABEL: vreduce_seq_fadd_v4f32_false_mask:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call float @llvm.vp.reduce.fadd.v4f32(float %start, <4 x float> %val, <4 x i1> zeroinitializer, i32 %evl)
				ret float %s
				}

				define float @vreduce_fadd_v4f32_zero_evl(float %start, <4 x float> %val, <4 x i1> %m) {
				; CHECK-LABEL: vreduce_fadd_v4f32_zero_evl:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call reassoc float @llvm.vp.reduce.fadd.v4f32(float %start, <4 x float> %val, <4 x i1> %m, i32 0)
				ret float %s
				}

				define float @vreduce_fadd_v4f32_false_mask(float %start, <4 x float> %val, i32 %evl) {
				; CHECK-LABEL: vreduce_fadd_v4f32_false_mask:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call reassoc float @llvm.vp.reduce.fadd.v4f32(float %start, <4 x float> %val, <4 x i1> zeroinitializer, i32 %evl)
				ret float %s
				}

				declare float @llvm.vp.reduce.fmul.v4f32(float, <4 x float>, <4 x i1>, i32)

				define float @vreduce_seq_fmul_v4f32_zero_evl(float %start, <4 x float> %val, <4 x i1> %m) {
				; CHECK-LABEL: vreduce_seq_fmul_v4f32_zero_evl:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call float @llvm.vp.reduce.fmul.v4f32(float %start, <4 x float> %val, <4 x i1> %m, i32 0)
				ret float %s
				}

				define float @vreduce_seq_fmul_v4f32_false_mask(float %start, <4 x float> %val, i32 %evl) {
				; CHECK-LABEL: vreduce_seq_fmul_v4f32_false_mask:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call float @llvm.vp.reduce.fmul.v4f32(float %start, <4 x float> %val, <4 x i1> zeroinitializer, i32 %evl)
				ret float %s
				}

				define float @vreduce_fmul_v4f32_zero_evl(float %start, <4 x float> %val, <4 x i1> %m) {
				; CHECK-LABEL: vreduce_fmul_v4f32_zero_evl:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call reassoc float @llvm.vp.reduce.fmul.v4f32(float %start, <4 x float> %val, <4 x i1> %m, i32 0)
				ret float %s
				}

				define float @vreduce_fmul_v4f32_false_mask(float %start, <4 x float> %val, i32 %evl) {
				; CHECK-LABEL: vreduce_fmul_v4f32_false_mask:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call reassoc float @llvm.vp.reduce.fmul.v4f32(float %start, <4 x float> %val, <4 x i1> zeroinitializer, i32 %evl)
				ret float %s
				}

				declare float @llvm.vp.reduce.fmin.v4f32(float, <4 x float>, <4 x i1>, i32)

				define float @vreduce_fmin_v4f32_zero_evl(float %start, <4 x float> %val, <4 x i1> %m) {
				; CHECK-LABEL: vreduce_fmin_v4f32_zero_evl:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call float @llvm.vp.reduce.fmin.v4f32(float %start, <4 x float> %val, <4 x i1> %m, i32 0)
				ret float %s
				}

				define float @vreduce_fmin_v4f32_false_mask(float %start, <4 x float> %val, i32 %evl) {
				; CHECK-LABEL: vreduce_fmin_v4f32_false_mask:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call float @llvm.vp.reduce.fmin.v4f32(float %start, <4 x float> %val, <4 x i1> zeroinitializer, i32 %evl)
				ret float %s
				}

				declare float @llvm.vp.reduce.fmax.v4f32(float, <4 x float>, <4 x i1>, i32)

				define float @vreduce_fmax_v4f32_zero_evl(float %start, <4 x float> %val, <4 x i1> %m) {
				; CHECK-LABEL: vreduce_fmax_v4f32_zero_evl:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call float @llvm.vp.reduce.fmax.v4f32(float %start, <4 x float> %val, <4 x i1> %m, i32 0)
				ret float %s
				}

				define float @vreduce_fmax_v4f32_false_mask(float %start, <4 x float> %val, i32 %evl) {
				; CHECK-LABEL: vreduce_fmax_v4f32_false_mask:
				; CHECK: # %bb.0:
				; CHECK-NEXT: ret
				%s = call float @llvm.vp.reduce.fmax.v4f32(float %start, <4 x float> %val, <4 x i1> zeroinitializer, i32 %evl)
				ret float %s
				}
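Every test in this file expects a bare `ret` because no lane is active: either the EVL is 0 or the mask is all-false. The following is a minimal scalar sketch of that reasoning, not code from the patch. The names `LaneResult`, `isActive`, `vpAdd`, `vpReduceAdd` and `noLaneDefined` are hypothetical, and the model assumes a lane is active only when its index is below the EVL and its mask bit is set.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Illustrative scalar model only. These helpers are hypothetical and are not
// LLVM APIs; unsigned lanes are used so that addition wraps without UB.
constexpr std::size_t NumLanes = 4;
using Vec = std::array<std::uint32_t, NumLanes>;
using Mask = std::array<bool, NumLanes>;

// Result of an element-wise VP op: a lane's value is meaningful only when
// the matching Defined flag is set.
struct LaneResult {
  Vec Value;
  Mask Defined;
};

// Assumption of this model: lane I is active only if it lies below the EVL
// and its mask bit is set.
inline bool isActive(const Mask &M, std::uint32_t EVL, std::size_t I) {
  assert(EVL <= NumLanes && "an EVL beyond the vector length is not modeled");
  return I < EVL && M[I];
}

// Models llvm.vp.add.v4i32: inactive lanes are left undefined.
inline LaneResult vpAdd(const Vec &A, const Vec &B, const Mask &M,
                        std::uint32_t EVL) {
  LaneResult R{}; // every lane starts out undefined
  for (std::size_t I = 0; I < NumLanes; ++I) {
    if (isActive(M, EVL, I)) {
      R.Value[I] = A[I] + B[I];
      R.Defined[I] = true;
    }
  }
  return R;
}

// Models llvm.vp.reduce.add.v4i32: only active lanes are accumulated onto
// the start value.
inline std::uint32_t vpReduceAdd(std::uint32_t Start, const Vec &V,
                                 const Mask &M, std::uint32_t EVL) {
  std::uint32_t Acc = Start;
  for (std::size_t I = 0; I < NumLanes; ++I)
    if (isActive(M, EVL, I))
      Acc += V[I];
  return Acc;
}

// True when no lane carries a defined value, i.e. the whole result could be
// replaced by undef.
inline bool noLaneDefined(const LaneResult &R) {
  for (bool D : R.Defined)
    if (D)
      return false;
  return true;
}
```

In this model, `vpAdd` with an EVL of 0 or an all-false mask defines no lane, which corresponds to folding the node to undef, and `vpReduceAdd` under the same conditions returns `Start` unchanged, which corresponds to folding a reduction to its start value.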

llvm/test/CodeGen/RISCV/rvv/vadd-vp.ll

 ; CHECK-NEXT: ret
 %vb = shufflevector <vscale x 32 x i32> %elt.head, <vscale x 32 x i32> undef, <vscale x 32 x i32> zeroinitializer
 %evl = call i32 @llvm.vscale.i32()
 %evl0 = mul i32 %evl, 8
 %v = call <vscale x 32 x i32> @llvm.vp.add.nxv32i32(<vscale x 32 x i32> %va, <vscale x 32 x i32> %vb, <vscale x 32 x i1> %m, i32 %evl0)
 ret <vscale x 32 x i32> %v
 }

 ; FIXME: The first vadd.vi should be able to infer that its AVL is equivalent to VLMAX.
-; FIXME: The upper half of the operation is doing nothing.
+; FIXME: The upper half of the operation is doing nothing but we don't catch
+; that on RV64; we issue a usubsat(and (vscale x 16), 0xffffffff, vscale x 16)
+; (the "original" %evl is the "and", due to known-bits issues with legalizing
+; the i32 %evl to i64) and this isn't detected as 0.
+; This could be resolved in the future with more detailed KnownBits analysis
+; for ISD::VSCALE.

 define <vscale x 32 x i32> @vadd_vi_nxv32i32_evl_nx16(<vscale x 32 x i32> %va, <vscale x 32 x i1> %m) {
-; CHECK-LABEL: vadd_vi_nxv32i32_evl_nx16:
-; CHECK: # %bb.0:
-; CHECK-NEXT: csrr a0, vlenb
-; CHECK-NEXT: srli a1, a0, 2
-; CHECK-NEXT: vsetvli a2, zero, e8, mf2, ta, mu
-; CHECK-NEXT: vslidedown.vx v25, v0, a1
-; CHECK-NEXT: slli a0, a0, 1
-; CHECK-NEXT: vsetvli zero, a0, e32, m8, ta, mu
-; CHECK-NEXT: vadd.vi v8, v8, -1, v0.t
-; CHECK-NEXT: vsetivli zero, 0, e32, m8, ta, mu
-; CHECK-NEXT: vmv1r.v v0, v25
-; CHECK-NEXT: vadd.vi v16, v16, -1, v0.t
-; CHECK-NEXT: ret
+; RV32-LABEL: vadd_vi_nxv32i32_evl_nx16:
+; RV32: # %bb.0:
+; RV32-NEXT: csrr a0, vlenb
+; RV32-NEXT: slli a0, a0, 1
+; RV32-NEXT: vsetvli zero, a0, e32, m8, ta, mu
+; RV32-NEXT: vadd.vi v8, v8, -1, v0.t
+; RV32-NEXT: ret
+;
+; RV64-LABEL: vadd_vi_nxv32i32_evl_nx16:
+; RV64: # %bb.0:
+; RV64-NEXT: csrr a0, vlenb
+; RV64-NEXT: srli a1, a0, 2
+; RV64-NEXT: vsetvli a2, zero, e8, mf2, ta, mu
+; RV64-NEXT: vslidedown.vx v25, v0, a1
+; RV64-NEXT: slli a0, a0, 1
+; RV64-NEXT: vsetvli zero, a0, e32, m8, ta, mu
+; RV64-NEXT: vadd.vi v8, v8, -1, v0.t
+; RV64-NEXT: vsetivli zero, 0, e32, m8, ta, mu
+; RV64-NEXT: vmv1r.v v0, v25
+; RV64-NEXT: vadd.vi v16, v16, -1, v0.t
+; RV64-NEXT: ret
 %elt.head = insertelement <vscale x 32 x i32> undef, i32 -1, i32 0
 %vb = shufflevector <vscale x 32 x i32> %elt.head, <vscale x 32 x i32> undef, <vscale x 32 x i32> zeroinitializer
 %evl = call i32 @llvm.vscale.i32()
 %evl0 = mul i32 %evl, 16
 %v = call <vscale x 32 x i32> @llvm.vp.add.nxv32i32(<vscale x 32 x i32> %va, <vscale x 32 x i32> %vb, <vscale x 32 x i1> %m, i32 %evl0)
 ret <vscale x 32 x i32> %v
 }
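The FIXME in this test says the upper half of the split operation does nothing. As a rough sketch of the arithmetic, assume that splitting gives the lower half `min(EVL, Half)` elements and the upper half `usubsat(EVL, Half)`. The source confirms only the `usubsat` form for the upper half. The helpers `usubsat`, `splitEVL` and `upperEVLOnRV64` below are hypothetical and not LLVM APIs.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; not LLVM APIs.

// Unsigned saturating subtraction: A - B, clamped at zero instead of wrapping.
constexpr std::uint64_t usubsat(std::uint64_t A, std::uint64_t B) {
  return A > B ? A - B : 0;
}

struct SplitEVL {
  std::uint64_t Lo; // EVL applied to the lower half
  std::uint64_t Hi; // EVL applied to the upper half
};

// Assumed split rule: the lower half covers at most Half elements and the
// upper half receives whatever remains, saturating at zero.
constexpr SplitEVL splitEVL(std::uint64_t EVL, std::uint64_t Half) {
  return SplitEVL{EVL < Half ? EVL : Half, usubsat(EVL, Half)};
}

// The RV64 expression quoted in the FIXME: the i32 EVL reaches the
// subtraction through an "and" with 0xffffffff. Masking can only clear bits,
// so the masked value never exceeds Half and the result is always zero.
constexpr std::uint64_t upperEVLOnRV64(std::uint64_t Half) {
  return usubsat(Half & 0xffffffffULL, Half);
}
```

With `vscale` equal to 2, the test's EVL of `vscale x 16` is 32 and each half holds 32 elements, so `splitEVL(32, 32)` yields `Lo == 32` and `Hi == 0`. The RV64 expression also evaluates to 0 for any input, but the FIXME notes that the combiner does not currently prove this.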

Diff 375191
