This is an archive of the discontinued LLVM Phabricator instance.

Differential D58015

[SelectionDAG][AArch64] Legalize VECREDUCE
ClosedPublic

Authored by nikic on Feb 10 2019, 7:44 AM.

Details

Reviewers

RKSimon
spatel
efriedma
gnzlbg
aemerson
sdesmalen

Commits

rGaa7cfa75f97e: [SDAG][AArch64] Legalize VECREDUCE
rL355860: [SDAG][AArch64] Legalize VECREDUCE

Summary

Fixes https://bugs.llvm.org/show_bug.cgi?id=36796.

Implement basic legalizations (PromoteIntRes, PromoteIntOp, ExpandIntRes, ScalarizeVecOp, WidenVecOp) for VECREDUCE opcodes. There are more legalizations missing (esp float legalizations), but there's no way to test them right now, so I'm not adding them.

This also includes a few more changes to make this work somewhat reasonably:

Add support for expanding VECREDUCE in SDAG. Usually experimental.vector.reduce is expanded prior to codegen, but if the target does have native vector reduce, it may of course still be necessary to expand due to legalization issues. Given the intended purpose, this uses a trivial linear expansion.
Allow the result type of integer VECREDUCE to be larger than the vector element type. For example, we need to be able to reduce a v8i8 into a (nominally) i32 result type on AArch64.
Use the vector operand type rather than the scalar result type to determine the action, so we can control exactly which vector types are supported. Also default VECREDUCE to Expand.
On AArch64 (the only target using VECREDUCE), explicitly specify the vector types for which the reductions are supported.

I believe this fixes all the VECREDUCE legalization/expansion related assertions on AArch64.
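To make the "trivial linear expansion" and the wider result type from the summary concrete, here is a minimal standalone C++ model. It is not the patch's SelectionDAG code: the function name `reduceAddLinear` and the use of `std::array` are assumptions made purely for illustration. The sum is accumulated in the element type (i8), so it wraps at 8 bits, and is then widened to a 32-bit result. The patch leaves the top bits unspecified; this sketch happens to zero them, which is just one permitted choice.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical model of a linear VECREDUCE_ADD expansion over v8i8:
// start from element 0, then fold in elements 1..7 one at a time.
// The arithmetic is done in the element type (uint8_t), so it wraps.
inline std::uint32_t reduceAddLinear(const std::array<std::uint8_t, 8> &V) {
  std::uint8_t Acc = V[0];
  for (std::size_t I = 1; I < V.size(); ++I)
    Acc = static_cast<std::uint8_t>(Acc + V[I]);
  // Widen to the (nominally) i32 result. Only the low 8 bits are meaningful;
  // zero-extending is an arbitrary choice for the unspecified top bits.
  return Acc;
}
```

A caller that needs a well-defined value should therefore look only at the low 8 bits of the result, for example `reduceAddLinear(V) & 0xFF`.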

Diff Detail

Repository: rL LLVM

Event Timeline

nikic created this revision. Feb 10 2019, 7:44 AM

Herald added a project: Restricted Project. Feb 10 2019, 7:44 AM

Herald added subscribers: llvm-commits, kristof.beyls, javed.absar.

RKSimon added a reviewer: gnzlbg. Feb 14 2019, 12:20 PM

Ping

RKSimon added a reviewer: aemerson. Feb 25 2019, 4:18 AM

Thanks for this patch @nikic! Please find some comments inline.

lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
3972 ↗	(On Diff #186151)	Although the code in SelectionDAGLegalize::LegalizeOp() will fall through and get into ConvertNodeToLibcall(), shouldn't this conceptually be done in SelectionDAGLegalize::ExpandNode()? It is not creating libcalls, but rather expanding the operation into scalar operations on the vector's elements.
lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
1538 ↗	(On Diff #186151)	This promotes the result type of the operation as well as the operand. As far as I understand it, the result type is legal at this point. If bits(EltVT) > bits(VT), should this promote the operation to work on a bigger type, and finally truncate the result?
3245 ↗	(On Diff #186151)	Do we need to expand the entire operation here if only the result needs expansion (and not the input vector itself)?
lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
4264 ↗	(On Diff #186151)	Do we want to expand the operation here, or just widen the input operand (inserted with zeros as you suggest in the comment)? If a target just wants the operation to be widened, but does have support for reductions on the widened vectors, we probably prefer that over the default lowering that expandVecReduce provides.
lib/CodeGen/SelectionDAG/TargetLowering.cpp
5640 ↗	(On Diff #186151)	Since all the VECREDUCE_<OP> are non-strict, can we split the vector in a low- and high half, and do the reduction on both halves? This would eventually boil down to using ScalarizeVecOp_VECREDUCE that transforms <1 x i32> => i32, and would allow more parallelised execution of the reduction. Is this maybe something you already tried?
5674 ↗	(On Diff #186151)	nit: unnecessary curly braces
lib/CodeGen/TargetLoweringBase.cpp
670 ↗	(On Diff #186151)	This patch has AArch64 tests for VECREDUCE_ADD, VECREDUCE_FMAX and VECREDUCE_UMAX, which all have custom lowering, but it actually adds a lot of generic support for VECREDUCE operations that don't have custom lowering. I think we'll also want to add some tests for those cases.
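The promotion question raised above (promote to a wider element type, then truncate the result) can be illustrated with a small standalone C++ model. This is only a sketch: the function names are hypothetical, and i8 lanes promoted to i16 stand in for whatever types a real target would use. It shows why the choice of extension matters for signed min/max. The patch's PromoteIntOp_VECREDUCE uses SExtPromotedInteger for SMAX/SMIN and ZExtPromotedInteger for UMAX/UMIN.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// SMAX over i8 lanes, computed in i16 after sign-extending each lane,
// then truncated back to i8.
inline std::int8_t reduceSMaxSignExtended(const std::vector<std::int8_t> &V) {
  std::int16_t Acc = static_cast<std::int16_t>(V.at(0));
  for (std::size_t I = 1; I < V.size(); ++I)
    Acc = std::max<std::int16_t>(Acc, static_cast<std::int16_t>(V[I]));
  return static_cast<std::int8_t>(Acc); // truncate
}

// Same reduction with zero-extended lanes: negative i8 values become large
// positive i16 values, so the signed comparison picks the wrong lane.
inline std::int8_t reduceSMaxZeroExtended(const std::vector<std::int8_t> &V) {
  std::int16_t Acc = static_cast<std::uint8_t>(V.at(0));
  for (std::size_t I = 1; I < V.size(); ++I)
    Acc = std::max<std::int16_t>(Acc, static_cast<std::uint8_t>(V[I]));
  return static_cast<std::int8_t>(Acc); // truncate
}
```

For the input `{-1, 1}` the sign-extended version returns 1, while the zero-extended version compares 255 against 1 and truncates 255 back to -1.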

Address review. Also perform expansion during vec op legalization.

nikic added inline comments. Mar 1 2019, 3:12 PM

lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
3972 ↗	(On Diff #186151)	Thanks for catching this, I didn't intend to put it in here. I scrolled too far while trying to find the end of the switch :)
lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
1538 ↗	(On Diff #186151)	Right, this was also promoting the result type (if necessary), but missed the truncate. I've added it now.
3245 ↗	(On Diff #186151)	I think so. Expanding the result without expanding the operand would need something like a hi/lo reduce, which can compute the hi/lo half of the reduction result.
lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
4264 ↗	(On Diff #186151)	I agree that this is the better approach in general and have implemented it now. It's slightly more complicated than just padding with zeros, as the neutral element is different for the different reductions. Doing this gives some pretty bad code for cases like `v1i8`, where we end up inserting 7 zero elements before performing the reduction. To avoid this, I've added a DAGCombine to convert a VECREDUCE of one element into an EXTRACT_VECTOR_ELT, but generating good code for cases like `v9i8` would need more work.
lib/CodeGen/SelectionDAG/TargetLowering.cpp
5640 ↗	(On Diff #186151)	This mostly already happens as part of the SplitVecOp legalization. Mostly, because it only goes down to a legal type. I've used a naive implementation here, because this expansion code is intended only for legalization purposes, not to produce particularly good code for unsupported general reductions. If a target doesn't support a reduction, it should request it to be expanded prior to SDAG construction, which will produce a shuffle reduction if possible. AArch64 currently always opts out of expansion, even though it does not support all reductions -- I plan to change that in a separate patch.
lib/CodeGen/TargetLoweringBase.cpp
670 ↗	(On Diff #186151)	I've added two more tests for AND and FADD. As mentioned in another comment, I plan to make these go through the pre-SDAG expansion code in the future though.
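The point about widening needing a different neutral element per reduction can be sketched in plain C++. Everything here is an assumption for illustration (the `ReduceKind` enum, the helper names, and the restriction to i8 lanes); it is not the code added to LegalizeVectorTypes.cpp. The sketch pads a short vector with the identity value of the chosen operation and reduces the widened vector.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical reduction kinds over i8 lanes (names chosen for this sketch).
enum class ReduceKind { Add, Mul, And, Or, Xor, SMax, SMin };

// Identity ("neutral") value: combining it with X leaves X unchanged.
inline std::int8_t neutralElement(ReduceKind K) {
  switch (K) {
  case ReduceKind::Add:
  case ReduceKind::Or:
  case ReduceKind::Xor:
    return 0;
  case ReduceKind::Mul:
    return 1;
  case ReduceKind::And:
    return -1; // all bits set
  case ReduceKind::SMax:
    return std::numeric_limits<std::int8_t>::min();
  case ReduceKind::SMin:
    return std::numeric_limits<std::int8_t>::max();
  }
  return 0; // not reached for valid kinds
}

// Combine two lanes; add and mul wrap at 8 bits like i8 arithmetic.
inline std::int8_t combine(ReduceKind K, std::int8_t A, std::int8_t B) {
  const auto UA = static_cast<std::uint8_t>(A);
  const auto UB = static_cast<std::uint8_t>(B);
  switch (K) {
  case ReduceKind::Add:
    return static_cast<std::int8_t>(static_cast<std::uint8_t>(UA + UB));
  case ReduceKind::Mul:
    return static_cast<std::int8_t>(static_cast<std::uint8_t>(UA * UB));
  case ReduceKind::And:
    return static_cast<std::int8_t>(static_cast<std::uint8_t>(UA & UB));
  case ReduceKind::Or:
    return static_cast<std::int8_t>(static_cast<std::uint8_t>(UA | UB));
  case ReduceKind::Xor:
    return static_cast<std::int8_t>(static_cast<std::uint8_t>(UA ^ UB));
  case ReduceKind::SMax:
    return A > B ? A : B;
  case ReduceKind::SMin:
    return A < B ? A : B;
  }
  return A; // not reached for valid kinds
}

// Linear reduction over a non-empty vector.
inline std::int8_t reduce(ReduceKind K, const std::vector<std::int8_t> &V) {
  std::int8_t Acc = V.at(0);
  for (std::size_t I = 1; I < V.size(); ++I)
    Acc = combine(K, Acc, V[I]);
  return Acc;
}

// Widen V to WideLen lanes (assumed >= V.size()), padding with the identity.
inline std::vector<std::int8_t> widen(std::vector<std::int8_t> V,
                                      std::size_t WideLen, ReduceKind K) {
  V.resize(WideLen, neutralElement(K));
  return V;
}
```

With the three-lane input `{5, -3, 7}` widened to eight lanes, the AND reduction still yields 5 when padded with all-ones lanes, whereas padding with zeros would force the result to 0.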

What about expanding the reductions into shuffle vector sequences? If we add support for that, such that the resulting constructed SDAG would be the same as the IR expansion shuffle vector sequence, then we pave the way for a move to using the intrinsics for all targets as the canonical form. So what we'd do is:

1) Add the expansion to shuffle vector sequences (instead of a naive implementation)
2) Move targets to use the intrinsic representation unconditionally. This means we don't need the useReductionIntrinsic TTI hook any more. Targets' TargetLowering would need to specify which reduction kinds to expand using the new SDAG expansion code.
3) ...and as a result we can kill the ExpandReductions pass and finally move these intrinsics from experimental to fully supported and preferred representations.

I'm not saying this all needs to be done by you, but I think it's worth tackling 1) here.

If possible, use shuffle reduction in vecreduce expansion.

Herald added a subscriber: hiraditya. Mar 9 2019, 12:07 PM

I've extended the expansion code to use a tree reduction for pow2 vectors as long as we have the necessary operations. I think the most relevant example is the v16f32 reduction for fadd:

define float @test_v16f32(<16 x float> %a) nounwind {
; CHECK-LABEL: test_v16f32:
; CHECK:       // %bb.0:
; CHECK-NEXT:    fadd v1.4s, v1.4s, v3.4s
; CHECK-NEXT:    fadd v0.4s, v0.4s, v2.4s
; CHECK-NEXT:    fadd v0.4s, v0.4s, v1.4s
; CHECK-NEXT:    ext v1.16b, v0.16b, v0.16b, #8
; CHECK-NEXT:    fadd v0.2s, v0.2s, v1.2s
; CHECK-NEXT:    faddp s0, v0.2s
; CHECK-NEXT:    ret
  %b = call fast nnan float @llvm.experimental.vector.reduce.fadd.f32.v16f32(float 0.0, <16 x float> %a)
  ret float %b
}

I'm not overly familiar with AArch64, but this looks optimal to me. Results for "weird" vector sizes are sometimes quite bad though (on the other hand, the current pre-sdag expansion code doesn't support them at all).
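The tree reduction for power-of-two vectors described above can be modelled in a few lines of standalone C++. This is a sketch under assumptions, not the expansion code from TargetLowering.cpp: the function name `reduceFAddHalving` and the optional step counter are hypothetical. At each step the upper half of the vector is added lane-wise onto the lower half, so 16 lanes need four vector steps. The AArch64 listing above appears to have the same shape: two steps on full registers, one after `ext`, and a final pairwise `faddp`. Reordering the additions like this is only valid because the reduction is non-strict.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of a halving (shuffle-style) FADD reduction.
// Precondition: V has a power-of-two number of lanes.
inline float reduceFAddHalving(std::vector<float> V,
                               unsigned *Steps = nullptr) {
  assert(!V.empty() && (V.size() & (V.size() - 1)) == 0 &&
         "sketch only handles power-of-two lengths");
  unsigned NumSteps = 0;
  while (V.size() > 1) {
    const std::size_t Half = V.size() / 2;
    for (std::size_t I = 0; I < Half; ++I)
      V[I] += V[I + Half]; // lane-wise add of upper half onto lower half
    V.resize(Half);
    ++NumSteps;
  }
  if (Steps)
    *Steps = NumSteps;
  return V[0];
}
```

Vector lengths that are not a power of two are deliberately left out of the sketch, since those are the "weird" sizes for which the results are reported to be worse.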

Thanks for making these changes, I think the patch looks good now!

lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
4264 ↗	(On Diff #186151)	Thanks for this!
llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
3918 ↗	(On Diff #189991)	nit: simplify change -> simply change?
llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
4307 ↗	(On Diff #189991)	I wanted to suggest doing a build_vector of all NeutralElems, and then an INSERT_SUBVECTOR, so you can benefit from cheap 'splat immediate' instructions, but unfortunately you then run into missing pieces of legalisation of INSERT_SUBVECTOR.

This revision is now accepted and ready to land. Mar 11 2019, 4:24 AM

Closed by commit rL355860: [SDAG][AArch64] Legalize VECREDUCE (authored by nikic). Mar 11 2019, 1:21 PM

This revision was automatically updated to reflect the committed changes.

nikic marked 2 inline comments as done.

nikic added inline comments. Mar 11 2019, 1:21 PM

llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
4307 ↗	(On Diff #189991)	I initially wanted to use INSERT_SUBVECTOR here, but then found out that it requires the insertion index to be a multiple of the subvector length, which is not terribly useful for legalization purposes :(

sdesmalen added inline comments. Mar 11 2019, 2:26 PM

llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
4307 ↗	(On Diff #189991)	Wouldn't the insertion index always be 0? (i.e. inserting the subvector into the lower elements (starting at index 0) of a wide all-NeutralElems vector).

nikic marked an inline comment as done. Mar 11 2019, 2:43 PM

nikic added inline comments.

llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
4307 ↗	(On Diff #189991)	Uh yes, sorry. I was focused on inserting the NeutralElems into the top of the wide vector, while doing it the other way around makes a lot more sense and only needs a zero index. I'm assuming the missing legalization you're referring to is WidenVecOp? Possible alternatives here could be a VSELECT or SHUFFLE_VECTOR, but not sure if these will get sensible lowerings.

aemerson added inline comments. Mar 11 2019, 7:08 PM

llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
4307 ↗	(On Diff #189991)	Nit: NeutralElem would be more accurately named as something like IdentityValue. And for widening, what about using CONCAT_VECTORS?
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
5673 ↗	(On Diff #189991)	To be honest this isn't quite what I meant. I meant generating the same tree of shufflevector instructions used for tree reductions. That way, if we switch targets to use the intrinsics and mark them all as "Expand" by default, then the existing pattern matching code would continue to work. E.g. SelectionDAG::matchBinOpReduction should match the code generated by this.

sdesmalen added inline comments. Mar 12 2019, 8:34 AM

llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
5673 ↗	(On Diff #189991)	AFAICT this code is doing the same as what `llvm::getShuffleReduction()` does (which is called by the `ExpandReductions` pass), so functionally the expansion should not be much different from what we had before the patch. I would actually think the code generated here should be easier to support by all targets than a true tree reduction, as there is always an element-wise vector-operation available, where there aren't always pair-wise operations, so the code resulting from this expansion should work reasonably well (log-n) without additional lowering/matching effort.

spatel mentioned this in D90247: [AArch64] Add legalizations for VECREDUCE_SEQ_FADD. Oct 27 2020, 1:09 PM

spatel mentioned this in D96552: [Vectorizers][TTI] remove option to bypass creation of vector reduction intrinsics. Feb 11 2021, 2:36 PM

spatel mentioned this in rG79b1b4a58151: [Vectorizers][TTI] remove option to bypass creation of vector reduction…. Feb 12 2021, 5:34 AM

Revision Contents

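Path                                                            Size
llvm/trunk/include/llvm/CodeGen/ISDOpcodes.h                    7 lines
llvm/trunk/include/llvm/CodeGen/TargetLowering.h                4 lines
llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp             32 lines
llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp             31 lines
llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp    81 lines
llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeTypes.h             5 lines
llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp       40 lines
llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp     95 lines
llvm/trunk/lib/CodeGen/SelectionDAG/TargetLowering.cpp          58 lines
llvm/trunk/lib/CodeGen/TargetLoweringBase.cpp                   15 lines
llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp           6 lines
llvm/trunk/test/CodeGen/AArch64/vecreduce-add-legalization.ll   169 lines
llvm/trunk/test/CodeGen/AArch64/vecreduce-and-legalization.ll   198 lines
llvm/trunk/test/CodeGen/AArch64/vecreduce-fadd-legalization.ll  83 lines
llvm/trunk/test/CodeGen/AArch64/vecreduce-fmax-legalization.ll  77 lines
llvm/trunk/test/CodeGen/AArch64/vecreduce-umax-legalization.ll  177 lines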

Diff 190146

llvm/trunk/include/llvm/CodeGen/ISDOpcodes.h

@@ 866 lines not shown @@ enum NodeType {
     /// Generic reduction nodes. These nodes represent horizontal vector
     /// reduction operations, producing a scalar result.
     /// The STRICT variants perform reductions in sequential order. The first
     /// operand is an initial scalar accumulator value, and the second operand
     /// is the vector to reduce.
     VECREDUCE_STRICT_FADD, VECREDUCE_STRICT_FMUL,
     /// These reductions are non-strict, and have a single vector operand.
     VECREDUCE_FADD, VECREDUCE_FMUL,
+    /// FMIN/FMAX nodes can have flags, for NaN/NoNaN variants.
+    VECREDUCE_FMAX, VECREDUCE_FMIN,
+    /// Integer reductions may have a result type larger than the vector element
+    /// type. However, the reduction is performed using the vector element type
+    /// and the value in the top bits is unspecified.
     VECREDUCE_ADD, VECREDUCE_MUL,
     VECREDUCE_AND, VECREDUCE_OR, VECREDUCE_XOR,
     VECREDUCE_SMAX, VECREDUCE_SMIN, VECREDUCE_UMAX, VECREDUCE_UMIN,
-    /// FMIN/FMAX nodes can have flags, for NaN/NoNaN variants.
-    VECREDUCE_FMAX, VECREDUCE_FMIN,

     /// BUILTIN_OP_END - This must be the last enum value in this list.
     /// The target-specific pre-isel opcode values start here.
     BUILTIN_OP_END
   };

   /// FIRST_TARGET_MEMORY_OPCODE - Target-specific pre-isel operations
   /// which do not reference a specific memory location should be less than
@@ 157 lines not shown @@

llvm/trunk/include/llvm/CodeGen/TargetLowering.h

@@ 3,887 lines not shown @@
   /// integers as its arguments.
   SDValue expandFixedPointMul(SDNode *Node, SelectionDAG &DAG) const;

   /// Method for building the DAG expansion of ISD::[US]MULO. Returns whether
   /// expansion was successful and populates the Result and Overflow arguments.
   bool expandMULO(SDNode *Node, SDValue &Result, SDValue &Overflow,
                   SelectionDAG &DAG) const;

+  /// Expand a VECREDUCE_* into an explicit calculation. If Count is specified,
+  /// only the first Count elements of the vector are used.
+  SDValue expandVecReduce(SDNode *Node, SelectionDAG &DAG) const;
+
   //===--------------------------------------------------------------------===//
   // Instruction Emitting Hooks
   //

   /// This method should be implemented by targets that mark instructions with
   /// the 'usesCustomInserter' flag. These instructions are special in various
   /// ways, which require special support to insert. The specified MachineInstr
   /// is created but not inserted into any basic blocks, and this method is
@@ 64 lines not shown @@

llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp

@@ 392 lines not shown @@ private:
     SDValue visitSCALAR_TO_VECTOR(SDNode *N);
     SDValue visitINSERT_SUBVECTOR(SDNode *N);
     SDValue visitMLOAD(SDNode *N);
     SDValue visitMSTORE(SDNode *N);
     SDValue visitMGATHER(SDNode *N);
     SDValue visitMSCATTER(SDNode *N);
     SDValue visitFP_TO_FP16(SDNode *N);
     SDValue visitFP16_TO_FP(SDNode *N);
+    SDValue visitVECREDUCE(SDNode *N);

     SDValue visitFADDForFMACombine(SDNode *N);
     SDValue visitFSUBForFMACombine(SDNode *N);
     SDValue visitFMULForFMADistributiveCombine(SDNode *N);

     SDValue XformToShuffleWithZero(SDNode *N);
     SDValue ReassociateOps(unsigned Opc, const SDLoc &DL, SDValue N0,
                            SDValue N1, SDNodeFlags Flags);
@@ 1,178 lines not shown @@ SDValue DAGCombiner::visit(SDNode *N) {
   case ISD::SCALAR_TO_VECTOR: return visitSCALAR_TO_VECTOR(N);
   case ISD::INSERT_SUBVECTOR: return visitINSERT_SUBVECTOR(N);
   case ISD::MGATHER: return visitMGATHER(N);
   case ISD::MLOAD: return visitMLOAD(N);
   case ISD::MSCATTER: return visitMSCATTER(N);
   case ISD::MSTORE: return visitMSTORE(N);
   case ISD::FP_TO_FP16: return visitFP_TO_FP16(N);
   case ISD::FP16_TO_FP: return visitFP16_TO_FP(N);
+  case ISD::VECREDUCE_FADD:
+  case ISD::VECREDUCE_FMUL:
+  case ISD::VECREDUCE_ADD:
+  case ISD::VECREDUCE_MUL:
+  case ISD::VECREDUCE_AND:
+  case ISD::VECREDUCE_OR:
+  case ISD::VECREDUCE_XOR:
+  case ISD::VECREDUCE_SMAX:
+  case ISD::VECREDUCE_SMIN:
+  case ISD::VECREDUCE_UMAX:
+  case ISD::VECREDUCE_UMIN:
+  case ISD::VECREDUCE_FMAX:
+  case ISD::VECREDUCE_FMIN: return visitVECREDUCE(N);
   }
   return SDValue();
 }

 SDValue DAGCombiner::combine(SDNode *N) {
   SDValue RV = visit(N);

   // If nothing happened, try a target-specific DAG combine.
@@ 16,699 lines not shown @@ if (AndConst && AndConst->getAPIntValue() == 0xffff) {
       return DAG.getNode(ISD::FP16_TO_FP, SDLoc(N), N->getValueType(0),
                          N0.getOperand(0));
     }
   }

   return SDValue();
 }

+SDValue DAGCombiner::visitVECREDUCE(SDNode *N) {
+  SDValue N0 = N->getOperand(0);
+  EVT VT = N0.getValueType();
+
+  // VECREDUCE over 1-element vector is just an extract.
+  if (VT.getVectorNumElements() == 1) {
+    SDLoc dl(N);
+    SDValue Res = DAG.getNode(
+        ISD::EXTRACT_VECTOR_ELT, dl, VT.getVectorElementType(), N0,
+        DAG.getConstant(0, dl, TLI.getVectorIdxTy(DAG.getDataLayout())));
+    if (Res.getValueType() != N->getValueType(0))
+      Res = DAG.getNode(ISD::ANY_EXTEND, dl, N->getValueType(0), Res);
+    return Res;
+  }
+
+  return SDValue();
+}
+
 /// Returns a vector_shuffle if it able to transform an AND to a vector_shuffle
 /// with the destination vector and a zero vector.
 /// e.g. AND V, <0xffffffff, 0, 0xffffffff, 0>. ==>
 /// vector_shuffle V, Zero, <0, 4, 2, 4>
 SDValue DAGCombiner::XformToShuffleWithZero(SDNode *N) {
   assert(N->getOpcode() == ISD::AND && "Unexpected opcode!");

   EVT VT = N->getValueType(0);
@@ 1,310 lines not shown @@

llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp

@@ 1,134 lines not shown @@ #endif
   case ISD::MSCATTER:
     Action = TLI.getOperationAction(Node->getOpcode(),
                     cast<MaskedScatterSDNode>(Node)->getValue().getValueType());
     break;
   case ISD::MSTORE:
     Action = TLI.getOperationAction(Node->getOpcode(),
                     cast<MaskedStoreSDNode>(Node)->getValue().getValueType());
     break;
+  case ISD::VECREDUCE_FADD:
+  case ISD::VECREDUCE_FMUL:
+  case ISD::VECREDUCE_ADD:
+  case ISD::VECREDUCE_MUL:
+  case ISD::VECREDUCE_AND:
+  case ISD::VECREDUCE_OR:
+  case ISD::VECREDUCE_XOR:
+  case ISD::VECREDUCE_SMAX:
+  case ISD::VECREDUCE_SMIN:
+  case ISD::VECREDUCE_UMAX:
+  case ISD::VECREDUCE_UMIN:
+  case ISD::VECREDUCE_FMAX:
+  case ISD::VECREDUCE_FMIN:
+    Action = TLI.getOperationAction(
+        Node->getOpcode(), Node->getOperand(0).getValueType());
+    break;
   default:
     if (Node->getOpcode() >= ISD::BUILTIN_OP_END) {
       Action = TargetLowering::Legal;
     } else {
       Action = TLI.getOperationAction(Node->getOpcode(), Node->getValueType(0));
     }
     break;
   }
@@ 2,446 lines not shown @@ for (unsigned Idx = 0; Idx < NumElem; Idx++) {
       Scalars.push_back(DAG.getNode(Node->getOpcode(), dl,
                                     VT.getScalarType(), Ex, Sh));
     }

     SDValue Result = DAG.getBuildVector(Node->getValueType(0), dl, Scalars);
     ReplaceNode(SDValue(Node, 0), Result);
     break;
   }
+  case ISD::VECREDUCE_FADD:
+  case ISD::VECREDUCE_FMUL:
+  case ISD::VECREDUCE_ADD:
+  case ISD::VECREDUCE_MUL:
+  case ISD::VECREDUCE_AND:
+  case ISD::VECREDUCE_OR:
+  case ISD::VECREDUCE_XOR:
+  case ISD::VECREDUCE_SMAX:
+  case ISD::VECREDUCE_SMIN:
+  case ISD::VECREDUCE_UMAX:
+  case ISD::VECREDUCE_UMIN:
+  case ISD::VECREDUCE_FMAX:
+  case ISD::VECREDUCE_FMIN:
+    Results.push_back(TLI.expandVecReduce(Node, DAG));
+    break;
   case ISD::GLOBAL_OFFSET_TABLE:
   case ISD::GlobalAddress:
   case ISD::GlobalTLSAddress:
   case ISD::ExternalSymbol:
   case ISD::ConstantPool:
   case ISD::JumpTable:
   case ISD::INTRINSIC_W_CHAIN:
   case ISD::INTRINSIC_WO_CHAIN:
@@ 885 lines not shown @@

llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp

@@ 166 lines not shown @@ #endif
   case ISD::ATOMIC_LOAD_UMAX:
   case ISD::ATOMIC_SWAP:
     Res = PromoteIntRes_Atomic1(cast<AtomicSDNode>(N)); break;

   case ISD::ATOMIC_CMP_SWAP:
   case ISD::ATOMIC_CMP_SWAP_WITH_SUCCESS:
     Res = PromoteIntRes_AtomicCmpSwap(cast<AtomicSDNode>(N), ResNo);
     break;
+
+  case ISD::VECREDUCE_ADD:
+  case ISD::VECREDUCE_MUL:
+  case ISD::VECREDUCE_AND:
+  case ISD::VECREDUCE_OR:
+  case ISD::VECREDUCE_XOR:
+  case ISD::VECREDUCE_SMAX:
+  case ISD::VECREDUCE_SMIN:
+  case ISD::VECREDUCE_UMAX:
+  case ISD::VECREDUCE_UMIN:
+    Res = PromoteIntRes_VECREDUCE(N);
+    break;
   }

   // If the result is null then the sub-method took care of registering it.
   if (Res.getNode())
     SetPromotedInteger(SDValue(N, ResNo), Res);
 }

 SDValue DAGTypeLegalizer::PromoteIntRes_MERGE_VALUES(SDNode *N,
@@ 919 lines not shown @@ bool DAGTypeLegalizer::PromoteIntegerOperand(SDNode *N, unsigned OpNo) {
   case ISD::RETURNADDR: Res = PromoteIntOp_FRAMERETURNADDR(N); break;

   case ISD::PREFETCH: Res = PromoteIntOp_PREFETCH(N, OpNo); break;

   case ISD::SMULFIX:
   case ISD::UMULFIX: Res = PromoteIntOp_MULFIX(N); break;

   case ISD::FPOWI: Res = PromoteIntOp_FPOWI(N); break;
+
+  case ISD::VECREDUCE_ADD:
+  case ISD::VECREDUCE_MUL:
+  case ISD::VECREDUCE_AND:
+  case ISD::VECREDUCE_OR:
+  case ISD::VECREDUCE_XOR:
+  case ISD::VECREDUCE_SMAX:
+  case ISD::VECREDUCE_SMIN:
+  case ISD::VECREDUCE_UMAX:
+  case ISD::VECREDUCE_UMIN: Res = PromoteIntOp_VECREDUCE(N); break;
   }

   // If the result is null, the sub-method took care of registering results etc.
   if (!Res.getNode()) return false;

   // If the result is N, the sub-method updated N in place. Tell the legalizer
   // core about this.
   if (Res.getNode() == N)
@@ 360 lines not shown @@ return SDValue(DAG.UpdateNodeOperands(N, N->getOperand(0), N->getOperand(1),
                  0);
 }

 SDValue DAGTypeLegalizer::PromoteIntOp_FPOWI(SDNode *N) {
   SDValue Op = SExtPromotedInteger(N->getOperand(1));
   return SDValue(DAG.UpdateNodeOperands(N, N->getOperand(0), Op), 0);
 }

+SDValue DAGTypeLegalizer::PromoteIntOp_VECREDUCE(SDNode *N) {
+  SDLoc dl(N);
+  SDValue Op;
+  switch (N->getOpcode()) {
+  default: llvm_unreachable("Expected integer vector reduction");
+  case ISD::VECREDUCE_ADD:
+  case ISD::VECREDUCE_MUL:
+  case ISD::VECREDUCE_AND:
+  case ISD::VECREDUCE_OR:
+  case ISD::VECREDUCE_XOR:
+    Op = GetPromotedInteger(N->getOperand(0));
+    break;
+  case ISD::VECREDUCE_SMAX:
+  case ISD::VECREDUCE_SMIN:
+    Op = SExtPromotedInteger(N->getOperand(0));
+    break;
+  case ISD::VECREDUCE_UMAX:
+  case ISD::VECREDUCE_UMIN:
+    Op = ZExtPromotedInteger(N->getOperand(0));
+    break;
+  }
+
+  EVT EltVT = Op.getValueType().getVectorElementType();
+  EVT VT = N->getValueType(0);
+  if (VT.bitsGE(EltVT))
+    return DAG.getNode(N->getOpcode(), SDLoc(N), VT, Op);
+
+  // Result size must be >= element size. If this is not the case after
+  // promotion, also promote the result type and then truncate.
+  SDValue Reduce = DAG.getNode(N->getOpcode(), dl, EltVT, Op);
+  return DAG.getNode(ISD::TRUNCATE, dl, VT, Reduce);
+}
+
 //===----------------------------------------------------------------------===//
 // Integer Result Expansion
 //===----------------------------------------------------------------------===//

 /// ExpandIntegerResult - This method is called when the specified result of the
 /// specified node is found to need expansion. At this point, the node may also
 /// have invalid operands or may have other results that need promotion, we just
 /// know that (at least) one result needs expansion.
@@ 125 lines not shown @@ #endif
   case ISD::SMULO: ExpandIntRes_XMULO(N, Lo, Hi); break;

   case ISD::SADDSAT:
   case ISD::UADDSAT:
   case ISD::SSUBSAT:
   case ISD::USUBSAT: ExpandIntRes_ADDSUBSAT(N, Lo, Hi); break;
   case ISD::SMULFIX:
   case ISD::UMULFIX: ExpandIntRes_MULFIX(N, Lo, Hi); break;
+
+  case ISD::VECREDUCE_ADD:
+  case ISD::VECREDUCE_MUL:
+  case ISD::VECREDUCE_AND:
+  case ISD::VECREDUCE_OR:
+  case ISD::VECREDUCE_XOR:
+  case ISD::VECREDUCE_SMAX:
+  case ISD::VECREDUCE_SMIN:
+  case ISD::VECREDUCE_UMAX:
+  case ISD::VECREDUCE_UMIN: ExpandIntRes_VECREDUCE(N, Lo, Hi); break;
   }

   // If Lo/Hi is null, the sub-method took care of registering results etc.
   if (Lo.getNode())
     SetExpandedInteger(SDValue(N, ResNo), Lo, Hi);
 }

 /// Lower an atomic node to the appropriate builtin call.
▲ Show 20 Lines • Show All 1,532 Lines • ▼ Show 20 Lines	SDValue Swap = DAG.getAtomicCmpSwap(
ISD::ATOMIC_CMP_SWAP_WITH_SUCCESS, dl,		ISD::ATOMIC_CMP_SWAP_WITH_SUCCESS, dl,
cast<AtomicSDNode>(N)->getMemoryVT(), VTs, N->getOperand(0),		cast<AtomicSDNode>(N)->getMemoryVT(), VTs, N->getOperand(0),
N->getOperand(1), Zero, Zero, cast<AtomicSDNode>(N)->getMemOperand());		N->getOperand(1), Zero, Zero, cast<AtomicSDNode>(N)->getMemOperand());

ReplaceValueWith(SDValue(N, 0), Swap.getValue(0));		ReplaceValueWith(SDValue(N, 0), Swap.getValue(0));
ReplaceValueWith(SDValue(N, 1), Swap.getValue(2));		ReplaceValueWith(SDValue(N, 1), Swap.getValue(2));
}		}

		void DAGTypeLegalizer::ExpandIntRes_VECREDUCE(SDNode *N,
		SDValue &Lo, SDValue &Hi) {
		// TODO For VECREDUCE_(AND|OR|XOR) we could split the vector and calculate
		// both halves independently.
		SDValue Res = TLI.expandVecReduce(N, DAG);
		SplitInteger(Res, Lo, Hi);
		}
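
The TODO above suggests that bitwise reductions could be computed on the two halves independently. The following is a minimal standalone sketch of why that is sound for XOR, written in plain C++ rather than against the LLVM API. The names `Halves`, `reduceThenSplit`, and `splitThenReduceXor` are hypothetical, and a 64-bit value split into two 32-bit halves stands in for an illegal wide integer split into Lo/Hi.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative model only: a 64-bit value stands in for an illegal wide
// integer, and two 32-bit halves stand in for the expanded Lo/Hi parts.
struct Halves {
  uint32_t Lo;
  uint32_t Hi;
};

// Mirrors the approach taken in the patch: reduce in the wide type first,
// then split the scalar result into its low and high halves.
Halves reduceThenSplit(const std::vector<uint64_t> &V, bool IsXor) {
  assert(!V.empty() && "reduction needs at least one element");
  uint64_t Acc = V[0];
  for (size_t I = 1; I < V.size(); ++I)
    Acc = IsXor ? (Acc ^ V[I]) : (Acc + V[I]);
  return {static_cast<uint32_t>(Acc), static_cast<uint32_t>(Acc >> 32)};
}

// The alternative from the TODO: reduce the low and high halves separately.
// This is valid for AND/OR/XOR because no result bit depends on another
// bit position. It is not valid for ADD, where carries cross the halves.
Halves splitThenReduceXor(const std::vector<uint64_t> &V) {
  assert(!V.empty() && "reduction needs at least one element");
  Halves Res = {0, 0};
  for (uint64_t X : V) {
    Res.Lo ^= static_cast<uint32_t>(X);
    Res.Hi ^= static_cast<uint32_t>(X >> 32);
  }
  return Res;
}
```

Both functions agree for XOR on any input. For ADD only the reduce-then-split order is correct in this model, which is presumably why the TODO restricts itself to the bitwise opcodes.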

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// Integer Operand Expansion		// Integer Operand Expansion
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

/// ExpandIntegerOperand - This method is called when the specified operand of		/// ExpandIntegerOperand - This method is called when the specified operand of
/// the specified node is found to need expansion. At this point, all of the		/// the specified node is found to need expansion. At this point, all of the
/// result types of the node are known to be legal, but other operands of the		/// result types of the node are known to be legal, but other operands of the
/// node may need promotion or expansion as well as the specified one.		/// node may need promotion or expansion as well as the specified one.
[652 lines not shown]	SDValue DAGTypeLegalizer::PromoteIntRes_INSERT_VECTOR_ELT(SDNode *N) {
SDValue V0 = GetPromotedInteger(N->getOperand(0));		SDValue V0 = GetPromotedInteger(N->getOperand(0));

SDValue ConvElem = DAG.getNode(ISD::ANY_EXTEND, dl,		SDValue ConvElem = DAG.getNode(ISD::ANY_EXTEND, dl,
NOutVTElem, N->getOperand(1));		NOutVTElem, N->getOperand(1));
return DAG.getNode(ISD::INSERT_VECTOR_ELT, dl, NOutVT,		return DAG.getNode(ISD::INSERT_VECTOR_ELT, dl, NOutVT,
V0, ConvElem, N->getOperand(2));		V0, ConvElem, N->getOperand(2));
}		}

		SDValue DAGTypeLegalizer::PromoteIntRes_VECREDUCE(SDNode *N) {
		// The VECREDUCE result size may be larger than the element size, so
		// we can simply change the result type.
		SDLoc dl(N);
		EVT NVT = TLI.getTypeToTransformTo(*DAG.getContext(), N->getValueType(0));
		return DAG.getNode(N->getOpcode(), dl, NVT, N->getOperand(0));
		}
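
As a rough illustration of the comment above: when the result type is wider than the element type, only the low element-width bits carry the reduction value, and a consumer that needs the narrow result truncates. The sketch below is plain C++, not LLVM API. `reduceAddWide` and `lowByte` are hypothetical names, and letting carries spill into the upper bits is just one possible choice for bits that the patch leaves unspecified (it uses ANY_EXTEND where it widens).

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Models an add reduction over i8 lanes whose result lives in a 32-bit
// container. In this model carries spill into the upper bits, which a
// consumer of the i8 result must ignore.
uint32_t reduceAddWide(const std::vector<uint8_t> &V) {
  assert(!V.empty() && "reduction needs at least one element");
  uint32_t Acc = 0;
  for (uint8_t X : V)
    Acc += X;
  return Acc;
}

// A consumer that wants the i8 result truncates the wide value.
uint8_t lowByte(uint32_t Wide) { return static_cast<uint8_t>(Wide); }
```

For the lanes `{200, 100}` the wide value is 300, while the i8 result obtained through `lowByte` is 44, the sum modulo 256.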

SDValue DAGTypeLegalizer::PromoteIntOp_EXTRACT_VECTOR_ELT(SDNode *N) {		SDValue DAGTypeLegalizer::PromoteIntOp_EXTRACT_VECTOR_ELT(SDNode *N) {
SDLoc dl(N);		SDLoc dl(N);
SDValue V0 = GetPromotedInteger(N->getOperand(0));		SDValue V0 = GetPromotedInteger(N->getOperand(0));
SDValue V1 = DAG.getZExtOrTrunc(N->getOperand(1), dl,		SDValue V1 = DAG.getZExtOrTrunc(N->getOperand(1), dl,
TLI.getVectorIdxTy(DAG.getDataLayout()));		TLI.getVectorIdxTy(DAG.getDataLayout()));
SDValue Ext = DAG.getNode(ISD::EXTRACT_VECTOR_ELT, dl,		SDValue Ext = DAG.getNode(ISD::EXTRACT_VECTOR_ELT, dl,
V0->getValueType(0).getScalarType(), V0, V1);		V0->getValueType(0).getScalarType(), V0, V1);

[43 lines not shown]

llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeTypes.h

[340 lines not shown]	private:
SDValue PromoteIntRes_UADDSUBO(SDNode *N, unsigned ResNo);		SDValue PromoteIntRes_UADDSUBO(SDNode *N, unsigned ResNo);
SDValue PromoteIntRes_ADDSUBCARRY(SDNode *N, unsigned ResNo);		SDValue PromoteIntRes_ADDSUBCARRY(SDNode *N, unsigned ResNo);
SDValue PromoteIntRes_UNDEF(SDNode *N);		SDValue PromoteIntRes_UNDEF(SDNode *N);
SDValue PromoteIntRes_VAARG(SDNode *N);		SDValue PromoteIntRes_VAARG(SDNode *N);
SDValue PromoteIntRes_XMULO(SDNode *N, unsigned ResNo);		SDValue PromoteIntRes_XMULO(SDNode *N, unsigned ResNo);
SDValue PromoteIntRes_ADDSUBSAT(SDNode *N);		SDValue PromoteIntRes_ADDSUBSAT(SDNode *N);
SDValue PromoteIntRes_MULFIX(SDNode *N);		SDValue PromoteIntRes_MULFIX(SDNode *N);
SDValue PromoteIntRes_FLT_ROUNDS(SDNode *N);		SDValue PromoteIntRes_FLT_ROUNDS(SDNode *N);
		SDValue PromoteIntRes_VECREDUCE(SDNode *N);

// Integer Operand Promotion.		// Integer Operand Promotion.
bool PromoteIntegerOperand(SDNode *N, unsigned OpNo);		bool PromoteIntegerOperand(SDNode *N, unsigned OpNo);
SDValue PromoteIntOp_ANY_EXTEND(SDNode *N);		SDValue PromoteIntOp_ANY_EXTEND(SDNode *N);
SDValue PromoteIntOp_ATOMIC_STORE(AtomicSDNode *N);		SDValue PromoteIntOp_ATOMIC_STORE(AtomicSDNode *N);
SDValue PromoteIntOp_BITCAST(SDNode *N);		SDValue PromoteIntOp_BITCAST(SDNode *N);
SDValue PromoteIntOp_BUILD_PAIR(SDNode *N);		SDValue PromoteIntOp_BUILD_PAIR(SDNode *N);
SDValue PromoteIntOp_BR_CC(SDNode *N, unsigned OpNo);		SDValue PromoteIntOp_BR_CC(SDNode *N, unsigned OpNo);
[18 lines not shown]	private:
SDValue PromoteIntOp_MLOAD(MaskedLoadSDNode *N, unsigned OpNo);		SDValue PromoteIntOp_MLOAD(MaskedLoadSDNode *N, unsigned OpNo);
SDValue PromoteIntOp_MSCATTER(MaskedScatterSDNode *N, unsigned OpNo);		SDValue PromoteIntOp_MSCATTER(MaskedScatterSDNode *N, unsigned OpNo);
SDValue PromoteIntOp_MGATHER(MaskedGatherSDNode *N, unsigned OpNo);		SDValue PromoteIntOp_MGATHER(MaskedGatherSDNode *N, unsigned OpNo);
SDValue PromoteIntOp_ADDSUBCARRY(SDNode *N, unsigned OpNo);		SDValue PromoteIntOp_ADDSUBCARRY(SDNode *N, unsigned OpNo);
SDValue PromoteIntOp_FRAMERETURNADDR(SDNode *N);		SDValue PromoteIntOp_FRAMERETURNADDR(SDNode *N);
SDValue PromoteIntOp_PREFETCH(SDNode *N, unsigned OpNo);		SDValue PromoteIntOp_PREFETCH(SDNode *N, unsigned OpNo);
SDValue PromoteIntOp_MULFIX(SDNode *N);		SDValue PromoteIntOp_MULFIX(SDNode *N);
SDValue PromoteIntOp_FPOWI(SDNode *N);		SDValue PromoteIntOp_FPOWI(SDNode *N);
		SDValue PromoteIntOp_VECREDUCE(SDNode *N);

void PromoteSetCCOperands(SDValue &LHS,SDValue &RHS, ISD::CondCode Code);		void PromoteSetCCOperands(SDValue &LHS,SDValue &RHS, ISD::CondCode Code);

//===--------------------------------------------------------------------===//		//===--------------------------------------------------------------------===//
// Integer Expansion Support: LegalizeIntegerTypes.cpp		// Integer Expansion Support: LegalizeIntegerTypes.cpp
//===--------------------------------------------------------------------===//		//===--------------------------------------------------------------------===//

/// Given a processed operand Op which was expanded into two integers of half		/// Given a processed operand Op which was expanded into two integers of half
[42 lines not shown]	private:

void ExpandIntRes_SADDSUBO (SDNode *N, SDValue &Lo, SDValue &Hi);		void ExpandIntRes_SADDSUBO (SDNode *N, SDValue &Lo, SDValue &Hi);
void ExpandIntRes_UADDSUBO (SDNode *N, SDValue &Lo, SDValue &Hi);		void ExpandIntRes_UADDSUBO (SDNode *N, SDValue &Lo, SDValue &Hi);
void ExpandIntRes_XMULO (SDNode *N, SDValue &Lo, SDValue &Hi);		void ExpandIntRes_XMULO (SDNode *N, SDValue &Lo, SDValue &Hi);
void ExpandIntRes_ADDSUBSAT (SDNode *N, SDValue &Lo, SDValue &Hi);		void ExpandIntRes_ADDSUBSAT (SDNode *N, SDValue &Lo, SDValue &Hi);
void ExpandIntRes_MULFIX (SDNode *N, SDValue &Lo, SDValue &Hi);		void ExpandIntRes_MULFIX (SDNode *N, SDValue &Lo, SDValue &Hi);

void ExpandIntRes_ATOMIC_LOAD (SDNode *N, SDValue &Lo, SDValue &Hi);		void ExpandIntRes_ATOMIC_LOAD (SDNode *N, SDValue &Lo, SDValue &Hi);
		void ExpandIntRes_VECREDUCE (SDNode *N, SDValue &Lo, SDValue &Hi);

void ExpandShiftByConstant(SDNode *N, const APInt &Amt,		void ExpandShiftByConstant(SDNode *N, const APInt &Amt,
SDValue &Lo, SDValue &Hi);		SDValue &Lo, SDValue &Hi);
bool ExpandShiftWithKnownAmountBit(SDNode *N, SDValue &Lo, SDValue &Hi);		bool ExpandShiftWithKnownAmountBit(SDNode *N, SDValue &Lo, SDValue &Hi);
bool ExpandShiftWithUnknownAmountBit(SDNode *N, SDValue &Lo, SDValue &Hi);		bool ExpandShiftWithUnknownAmountBit(SDNode *N, SDValue &Lo, SDValue &Hi);

// Integer Operand Expansion.		// Integer Operand Expansion.
bool ExpandIntegerOperand(SDNode *N, unsigned OpNo);		bool ExpandIntegerOperand(SDNode *N, unsigned OpNo);
[251 lines not shown]	private:
SDValue ScalarizeVecOp_BITCAST(SDNode *N);		SDValue ScalarizeVecOp_BITCAST(SDNode *N);
SDValue ScalarizeVecOp_UnaryOp(SDNode *N);		SDValue ScalarizeVecOp_UnaryOp(SDNode *N);
SDValue ScalarizeVecOp_CONCAT_VECTORS(SDNode *N);		SDValue ScalarizeVecOp_CONCAT_VECTORS(SDNode *N);
SDValue ScalarizeVecOp_EXTRACT_VECTOR_ELT(SDNode *N);		SDValue ScalarizeVecOp_EXTRACT_VECTOR_ELT(SDNode *N);
SDValue ScalarizeVecOp_VSELECT(SDNode *N);		SDValue ScalarizeVecOp_VSELECT(SDNode *N);
SDValue ScalarizeVecOp_VSETCC(SDNode *N);		SDValue ScalarizeVecOp_VSETCC(SDNode *N);
SDValue ScalarizeVecOp_STORE(StoreSDNode *N, unsigned OpNo);		SDValue ScalarizeVecOp_STORE(StoreSDNode *N, unsigned OpNo);
SDValue ScalarizeVecOp_FP_ROUND(SDNode *N, unsigned OpNo);		SDValue ScalarizeVecOp_FP_ROUND(SDNode *N, unsigned OpNo);
		SDValue ScalarizeVecOp_VECREDUCE(SDNode *N);

//===--------------------------------------------------------------------===//		//===--------------------------------------------------------------------===//
// Vector Splitting Support: LegalizeVectorTypes.cpp		// Vector Splitting Support: LegalizeVectorTypes.cpp
//===--------------------------------------------------------------------===//		//===--------------------------------------------------------------------===//

/// Given a processed vector Op which was split into vectors of half the size,		/// Given a processed vector Op which was split into vectors of half the size,
/// this method returns the halves. The first elements of Op coincide with the		/// this method returns the halves. The first elements of Op coincide with the
/// elements of Lo; the remaining elements of Op coincide with the elements of		/// elements of Lo; the remaining elements of Op coincide with the elements of
[114 lines not shown]	private:
SDValue WidenVecOp_STORE(SDNode* N);		SDValue WidenVecOp_STORE(SDNode* N);
SDValue WidenVecOp_MSTORE(SDNode* N, unsigned OpNo);		SDValue WidenVecOp_MSTORE(SDNode* N, unsigned OpNo);
SDValue WidenVecOp_MGATHER(SDNode* N, unsigned OpNo);		SDValue WidenVecOp_MGATHER(SDNode* N, unsigned OpNo);
SDValue WidenVecOp_MSCATTER(SDNode* N, unsigned OpNo);		SDValue WidenVecOp_MSCATTER(SDNode* N, unsigned OpNo);
SDValue WidenVecOp_SETCC(SDNode* N);		SDValue WidenVecOp_SETCC(SDNode* N);

SDValue WidenVecOp_Convert(SDNode *N);		SDValue WidenVecOp_Convert(SDNode *N);
SDValue WidenVecOp_FCOPYSIGN(SDNode *N);		SDValue WidenVecOp_FCOPYSIGN(SDNode *N);
		SDValue WidenVecOp_VECREDUCE(SDNode *N);

//===--------------------------------------------------------------------===//		//===--------------------------------------------------------------------===//
// Vector Widening Utilities Support: LegalizeVectorTypes.cpp		// Vector Widening Utilities Support: LegalizeVectorTypes.cpp
//===--------------------------------------------------------------------===//		//===--------------------------------------------------------------------===//

/// Helper function to generate a set of loads to load a vector with a		/// Helper function to generate a set of loads to load a vector with a
/// resulting wider type. It takes:		/// resulting wider type. It takes:
/// LdChain: list of chains for the load to be generated.		/// LdChain: list of chains for the load to be generated.
[107 lines not shown]

llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp

[288 lines not shown]	if (StVT.isVector() && ST->isTruncatingStore()) {
}		}
case TargetLowering::Expand:		case TargetLowering::Expand:
Changed = true;		Changed = true;
return LegalizeOp(ExpandStore(Op));		return LegalizeOp(ExpandStore(Op));
}		}
}		}
}		}

bool HasVectorValue = false;		bool HasVectorValueOrOp = false;
for (SDNode::value_iterator J = Node->value_begin(), E = Node->value_end();		for (auto J = Node->value_begin(), E = Node->value_end(); J != E; ++J)
J != E;		HasVectorValueOrOp |= J->isVector();
++J)		for (const SDValue &Op : Node->op_values())
HasVectorValue |= J->isVector();		HasVectorValueOrOp |= Op.getValueType().isVector();
if (!HasVectorValue)
		if (!HasVectorValueOrOp)
return TranslateLegalizeResults(Op, Result);		return TranslateLegalizeResults(Op, Result);

TargetLowering::LegalizeAction Action = TargetLowering::Legal;		TargetLowering::LegalizeAction Action = TargetLowering::Legal;
switch (Op.getOpcode()) {		switch (Op.getOpcode()) {
default:		default:
return TranslateLegalizeResults(Op, Result);		return TranslateLegalizeResults(Op, Result);
case ISD::STRICT_FADD:		case ISD::STRICT_FADD:
case ISD::STRICT_FSUB:		case ISD::STRICT_FSUB:
[125 lines not shown]	case ISD::UMULFIX: {
break;		break;
}		}
case ISD::FP_ROUND_INREG:		case ISD::FP_ROUND_INREG:
Action = TLI.getOperationAction(Node->getOpcode(),		Action = TLI.getOperationAction(Node->getOpcode(),
cast<VTSDNode>(Node->getOperand(1))->getVT());		cast<VTSDNode>(Node->getOperand(1))->getVT());
break;		break;
case ISD::SINT_TO_FP:		case ISD::SINT_TO_FP:
case ISD::UINT_TO_FP:		case ISD::UINT_TO_FP:
		case ISD::VECREDUCE_ADD:
		case ISD::VECREDUCE_MUL:
		case ISD::VECREDUCE_AND:
		case ISD::VECREDUCE_OR:
		case ISD::VECREDUCE_XOR:
		case ISD::VECREDUCE_SMAX:
		case ISD::VECREDUCE_SMIN:
		case ISD::VECREDUCE_UMAX:
		case ISD::VECREDUCE_UMIN:
		case ISD::VECREDUCE_FADD:
		case ISD::VECREDUCE_FMUL:
		case ISD::VECREDUCE_FMAX:
		case ISD::VECREDUCE_FMIN:
Action = TLI.getOperationAction(Node->getOpcode(),		Action = TLI.getOperationAction(Node->getOpcode(),
Node->getOperand(0).getValueType());		Node->getOperand(0).getValueType());
break;		break;
}		}

LLVM_DEBUG(dbgs() << "\nLegalizing vector op: "; Node->dump(&DAG));		LLVM_DEBUG(dbgs() << "\nLegalizing vector op: "; Node->dump(&DAG));

switch (Action) {		switch (Action) {
[359 lines not shown]	SDValue VectorLegalizer::Expand(SDValue Op) {
case ISD::STRICT_FNEARBYINT:		case ISD::STRICT_FNEARBYINT:
case ISD::STRICT_FMAXNUM:		case ISD::STRICT_FMAXNUM:
case ISD::STRICT_FMINNUM:		case ISD::STRICT_FMINNUM:
case ISD::STRICT_FCEIL:		case ISD::STRICT_FCEIL:
case ISD::STRICT_FFLOOR:		case ISD::STRICT_FFLOOR:
case ISD::STRICT_FROUND:		case ISD::STRICT_FROUND:
case ISD::STRICT_FTRUNC:		case ISD::STRICT_FTRUNC:
return ExpandStrictFPOp(Op);		return ExpandStrictFPOp(Op);
		case ISD::VECREDUCE_ADD:
		case ISD::VECREDUCE_MUL:
		case ISD::VECREDUCE_AND:
		case ISD::VECREDUCE_OR:
		case ISD::VECREDUCE_XOR:
		case ISD::VECREDUCE_SMAX:
		case ISD::VECREDUCE_SMIN:
		case ISD::VECREDUCE_UMAX:
		case ISD::VECREDUCE_UMIN:
		case ISD::VECREDUCE_FADD:
		case ISD::VECREDUCE_FMUL:
		case ISD::VECREDUCE_FMAX:
		case ISD::VECREDUCE_FMIN:
		return TLI.expandVecReduce(Op.getNode(), DAG);
default:		default:
return DAG.UnrollVectorOp(Op.getNode());		return DAG.UnrollVectorOp(Op.getNode());
}		}
}		}

SDValue VectorLegalizer::ExpandSELECT(SDValue Op) {		SDValue VectorLegalizer::ExpandSELECT(SDValue Op) {
// Lower a select instruction where the condition is a scalar and the		// Lower a select instruction where the condition is a scalar and the
// operands are vectors. Lower this select to VSELECT and implement it		// operands are vectors. Lower this select to VSELECT and implement it
[495 lines not shown]

llvm/trunk/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp

[600 lines not shown]	case ISD::SETCC:
Res = ScalarizeVecOp_VSETCC(N);		Res = ScalarizeVecOp_VSETCC(N);
break;		break;
case ISD::STORE:		case ISD::STORE:
Res = ScalarizeVecOp_STORE(cast<StoreSDNode>(N), OpNo);		Res = ScalarizeVecOp_STORE(cast<StoreSDNode>(N), OpNo);
break;		break;
case ISD::FP_ROUND:		case ISD::FP_ROUND:
Res = ScalarizeVecOp_FP_ROUND(N, OpNo);		Res = ScalarizeVecOp_FP_ROUND(N, OpNo);
break;		break;
		case ISD::VECREDUCE_FADD:
		case ISD::VECREDUCE_FMUL:
		case ISD::VECREDUCE_ADD:
		case ISD::VECREDUCE_MUL:
		case ISD::VECREDUCE_AND:
		case ISD::VECREDUCE_OR:
		case ISD::VECREDUCE_XOR:
		case ISD::VECREDUCE_SMAX:
		case ISD::VECREDUCE_SMIN:
		case ISD::VECREDUCE_UMAX:
		case ISD::VECREDUCE_UMIN:
		case ISD::VECREDUCE_FMAX:
		case ISD::VECREDUCE_FMIN:
		Res = ScalarizeVecOp_VECREDUCE(N);
		break;
}		}
}		}

// If the result is null, the sub-method took care of registering results etc.		// If the result is null, the sub-method took care of registering results etc.
if (!Res.getNode()) return false;		if (!Res.getNode()) return false;

// If the result is N, the sub-method updated N in place. Tell the legalizer		// If the result is N, the sub-method updated N in place. Tell the legalizer
// core about this.		// core about this.
[114 lines not shown]
SDValue DAGTypeLegalizer::ScalarizeVecOp_FP_ROUND(SDNode *N, unsigned OpNo) {		SDValue DAGTypeLegalizer::ScalarizeVecOp_FP_ROUND(SDNode *N, unsigned OpNo) {
SDValue Elt = GetScalarizedVector(N->getOperand(0));		SDValue Elt = GetScalarizedVector(N->getOperand(0));
SDValue Res = DAG.getNode(ISD::FP_ROUND, SDLoc(N),		SDValue Res = DAG.getNode(ISD::FP_ROUND, SDLoc(N),
N->getValueType(0).getVectorElementType(), Elt,		N->getValueType(0).getVectorElementType(), Elt,
N->getOperand(1));		N->getOperand(1));
return DAG.getNode(ISD::SCALAR_TO_VECTOR, SDLoc(N), N->getValueType(0), Res);		return DAG.getNode(ISD::SCALAR_TO_VECTOR, SDLoc(N), N->getValueType(0), Res);
}		}

		SDValue DAGTypeLegalizer::ScalarizeVecOp_VECREDUCE(SDNode *N) {
		SDValue Res = GetScalarizedVector(N->getOperand(0));
		// Result type may be wider than element type.
		if (Res.getValueType() != N->getValueType(0))
		Res = DAG.getNode(ISD::ANY_EXTEND, SDLoc(N), N->getValueType(0), Res);
		return Res;
		}

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// Result Vector Splitting		// Result Vector Splitting
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

/// This method is called when the specified result of the specified node is		/// This method is called when the specified result of the specified node is
/// found to need vector splitting. At this point, the node may also have		/// found to need vector splitting. At this point, the node may also have
/// invalid operands or may have other results that need legalization, we just		/// invalid operands or may have other results that need legalization, we just
/// know that (at least) one result needs vector splitting.		/// know that (at least) one result needs vector splitting.
[3,116 lines not shown]	#endif
case ISD::FP_EXTEND:		case ISD::FP_EXTEND:
case ISD::FP_TO_SINT:		case ISD::FP_TO_SINT:
case ISD::FP_TO_UINT:		case ISD::FP_TO_UINT:
case ISD::SINT_TO_FP:		case ISD::SINT_TO_FP:
case ISD::UINT_TO_FP:		case ISD::UINT_TO_FP:
case ISD::TRUNCATE:		case ISD::TRUNCATE:
Res = WidenVecOp_Convert(N);		Res = WidenVecOp_Convert(N);
break;		break;

		case ISD::VECREDUCE_FADD:
		case ISD::VECREDUCE_FMUL:
		case ISD::VECREDUCE_ADD:
		case ISD::VECREDUCE_MUL:
		case ISD::VECREDUCE_AND:
		case ISD::VECREDUCE_OR:
		case ISD::VECREDUCE_XOR:
		case ISD::VECREDUCE_SMAX:
		case ISD::VECREDUCE_SMIN:
		case ISD::VECREDUCE_UMAX:
		case ISD::VECREDUCE_UMIN:
		case ISD::VECREDUCE_FMAX:
		case ISD::VECREDUCE_FMIN:
		Res = WidenVecOp_VECREDUCE(N);
		break;
}		}

// If Res is null, the sub-method took care of registering the result.		// If Res is null, the sub-method took care of registering the result.
if (!Res.getNode()) return false;		if (!Res.getNode()) return false;

// If the result is N, the sub-method updated N in place. Tell the legalizer		// If the result is N, the sub-method updated N in place. Tell the legalizer
// core about this.		// core about this.
if (Res.getNode() == N)		if (Res.getNode() == N)
[332 lines not shown]	EVT ResVT = EVT::getVectorVT(*DAG.getContext(),
VT.getVectorNumElements());		VT.getVectorNumElements());
SDValue CC = DAG.getNode(		SDValue CC = DAG.getNode(
ISD::EXTRACT_SUBVECTOR, dl, ResVT, WideSETCC,		ISD::EXTRACT_SUBVECTOR, dl, ResVT, WideSETCC,
DAG.getConstant(0, dl, TLI.getVectorIdxTy(DAG.getDataLayout())));		DAG.getConstant(0, dl, TLI.getVectorIdxTy(DAG.getDataLayout())));

return PromoteTargetBoolean(CC, VT);		return PromoteTargetBoolean(CC, VT);
}		}

		SDValue DAGTypeLegalizer::WidenVecOp_VECREDUCE(SDNode *N) {
		SDLoc dl(N);
		SDValue Op = GetWidenedVector(N->getOperand(0));
		EVT OrigVT = N->getOperand(0).getValueType();
		EVT WideVT = Op.getValueType();
		EVT ElemVT = OrigVT.getVectorElementType();

		SDValue NeutralElem;
		switch (N->getOpcode()) {
		case ISD::VECREDUCE_ADD:
		case ISD::VECREDUCE_OR:
		case ISD::VECREDUCE_XOR:
		case ISD::VECREDUCE_UMAX:
		NeutralElem = DAG.getConstant(0, dl, ElemVT);
		break;
		case ISD::VECREDUCE_MUL:
		NeutralElem = DAG.getConstant(1, dl, ElemVT);
		break;
		case ISD::VECREDUCE_AND:
		case ISD::VECREDUCE_UMIN:
		NeutralElem = DAG.getAllOnesConstant(dl, ElemVT);
		break;
		case ISD::VECREDUCE_SMAX:
		NeutralElem = DAG.getConstant(
		APInt::getSignedMinValue(ElemVT.getSizeInBits()), dl, ElemVT);
		break;
		case ISD::VECREDUCE_SMIN:
		NeutralElem = DAG.getConstant(
		APInt::getSignedMaxValue(ElemVT.getSizeInBits()), dl, ElemVT);
		break;
		case ISD::VECREDUCE_FADD:
		NeutralElem = DAG.getConstantFP(0.0, dl, ElemVT);
		break;
		case ISD::VECREDUCE_FMUL:
		NeutralElem = DAG.getConstantFP(1.0, dl, ElemVT);
		break;
		// -inf is the neutral element for max; +inf is the neutral element for min.
		case ISD::VECREDUCE_FMAX:
		NeutralElem = DAG.getConstantFP(
		-std::numeric_limits<double>::infinity(), dl, ElemVT);
		break;
		case ISD::VECREDUCE_FMIN:
		NeutralElem = DAG.getConstantFP(
		std::numeric_limits<double>::infinity(), dl, ElemVT);
		break;
		}

		// Pad the vector with the neutral element.
		unsigned OrigElts = OrigVT.getVectorNumElements();
		unsigned WideElts = WideVT.getVectorNumElements();
		for (unsigned Idx = OrigElts; Idx < WideElts; Idx++)
		Op = DAG.getNode(ISD::INSERT_VECTOR_ELT, dl, WideVT, Op, NeutralElem,
		DAG.getConstant(Idx, dl, TLI.getVectorIdxTy(DAG.getDataLayout())));

		return DAG.getNode(N->getOpcode(), dl, N->getValueType(0), Op, N->getFlags());
		}
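
The padding step relies on each reduction having a neutral element, so that extra lanes do not change the result. The following is a minimal standalone sketch of that property for the integer opcodes, in plain C++ on 8-bit lanes. It does not use the LLVM API, and `Reduce`, `neutralElement`, `applyOp`, `reduceLanes`, and `padAndReduce` are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the integer VECREDUCE opcodes (not LLVM's enum).
enum class Reduce { Add, Mul, And, Or, Xor, SMax, SMin, UMax, UMin };

// Neutral element E of each operation on 8-bit lanes: op(X, E) == X.
uint8_t neutralElement(Reduce R) {
  switch (R) {
  case Reduce::Add:
  case Reduce::Or:
  case Reduce::Xor:
  case Reduce::UMax:
    return 0x00;
  case Reduce::Mul:
    return 0x01;
  case Reduce::And:
  case Reduce::UMin:
    return 0xFF; // all ones
  case Reduce::SMax:
    return 0x80; // signed minimum
  case Reduce::SMin:
    return 0x7F; // signed maximum
  }
  return 0;
}

// Applies one binary step of the reduction with wrap-around at 8 bits.
uint8_t applyOp(Reduce R, uint8_t A, uint8_t B) {
  const int8_t SA = static_cast<int8_t>(A), SB = static_cast<int8_t>(B);
  switch (R) {
  case Reduce::Add:  return static_cast<uint8_t>(A + B);
  case Reduce::Mul:  return static_cast<uint8_t>(A * B);
  case Reduce::And:  return static_cast<uint8_t>(A & B);
  case Reduce::Or:   return static_cast<uint8_t>(A | B);
  case Reduce::Xor:  return static_cast<uint8_t>(A ^ B);
  case Reduce::SMax: return SA > SB ? A : B;
  case Reduce::SMin: return SA < SB ? A : B;
  case Reduce::UMax: return A > B ? A : B;
  case Reduce::UMin: return A < B ? A : B;
  }
  return A;
}

// Linear reduction over all lanes.
uint8_t reduceLanes(Reduce R, const std::vector<uint8_t> &V) {
  assert(!V.empty() && "reduction needs at least one element");
  uint8_t Acc = V[0];
  for (size_t I = 1; I < V.size(); ++I)
    Acc = applyOp(R, Acc, V[I]);
  return Acc;
}

// Widen to WideElts lanes, fill the new lanes with the neutral element,
// then reduce the widened vector.
uint8_t padAndReduce(Reduce R, std::vector<uint8_t> V, size_t WideElts) {
  assert(WideElts >= V.size() && "widening cannot drop lanes");
  V.resize(WideElts, neutralElement(R));
  return reduceLanes(R, V);
}
```

For any of the nine operations, `padAndReduce(R, V, N)` should equal `reduceLanes(R, V)`. Filling the extra lanes with anything other than the neutral element would break that equality for at least some inputs.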


//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// Vector Widening Utilities		// Vector Widening Utilities
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

// Utility function to find the type to chop up a widen vector for load/store		// Utility function to find the type to chop up a widen vector for load/store
// TLI: Target lowering used to determine legal types.		// TLI: Target lowering used to determine legal types.
// Width: Width left need to load/store.		// Width: Width left need to load/store.
[453 lines not shown]

llvm/trunk/lib/CodeGen/SelectionDAG/TargetLowering.cpp

[5,611 lines not shown]	bool TargetLowering::expandMULO(SDNode *Node, SDValue &Result,
EVT RType = Node->getValueType(1);		EVT RType = Node->getValueType(1);
if (RType.getSizeInBits() < Overflow.getValueSizeInBits())		if (RType.getSizeInBits() < Overflow.getValueSizeInBits())
Overflow = DAG.getNode(ISD::TRUNCATE, dl, RType, Overflow);		Overflow = DAG.getNode(ISD::TRUNCATE, dl, RType, Overflow);

assert(RType.getSizeInBits() == Overflow.getValueSizeInBits() &&		assert(RType.getSizeInBits() == Overflow.getValueSizeInBits() &&
"Unexpected result type for S/UMULO legalization");		"Unexpected result type for S/UMULO legalization");
return true;		return true;
}		}

		SDValue TargetLowering::expandVecReduce(SDNode *Node, SelectionDAG &DAG) const {
		SDLoc dl(Node);
		bool NoNaN = Node->getFlags().hasNoNaNs();
		unsigned BaseOpcode = 0;
		switch (Node->getOpcode()) {
		default: llvm_unreachable("Expected VECREDUCE opcode");
		case ISD::VECREDUCE_FADD: BaseOpcode = ISD::FADD; break;
		case ISD::VECREDUCE_FMUL: BaseOpcode = ISD::FMUL; break;
		case ISD::VECREDUCE_ADD: BaseOpcode = ISD::ADD; break;
		case ISD::VECREDUCE_MUL: BaseOpcode = ISD::MUL; break;
		case ISD::VECREDUCE_AND: BaseOpcode = ISD::AND; break;
		case ISD::VECREDUCE_OR: BaseOpcode = ISD::OR; break;
		case ISD::VECREDUCE_XOR: BaseOpcode = ISD::XOR; break;
		case ISD::VECREDUCE_SMAX: BaseOpcode = ISD::SMAX; break;
		case ISD::VECREDUCE_SMIN: BaseOpcode = ISD::SMIN; break;
		case ISD::VECREDUCE_UMAX: BaseOpcode = ISD::UMAX; break;
		case ISD::VECREDUCE_UMIN: BaseOpcode = ISD::UMIN; break;
		case ISD::VECREDUCE_FMAX:
		BaseOpcode = NoNaN ? ISD::FMAXNUM : ISD::FMAXIMUM;
		break;
		case ISD::VECREDUCE_FMIN:
		BaseOpcode = NoNaN ? ISD::FMINNUM : ISD::FMINIMUM;
		break;
		}

		SDValue Op = Node->getOperand(0);
		EVT VT = Op.getValueType();

		// Try to use a shuffle reduction for power of two vectors.
		if (VT.isPow2VectorType()) {
		while (VT.getVectorNumElements() > 1) {
		EVT HalfVT = VT.getHalfNumVectorElementsVT(*DAG.getContext());
		if (!isOperationLegalOrCustom(BaseOpcode, HalfVT))
		break;

		SDValue Lo, Hi;
		std::tie(Lo, Hi) = DAG.SplitVector(Op, dl);
		Op = DAG.getNode(BaseOpcode, dl, HalfVT, Lo, Hi);
		VT = HalfVT;
		}
		}

		EVT EltVT = VT.getVectorElementType();
		unsigned NumElts = VT.getVectorNumElements();

		SmallVector<SDValue, 8> Ops;
		DAG.ExtractVectorElements(Op, Ops, 0, NumElts);

		SDValue Res = Ops[0];
		for (unsigned i = 1; i < NumElts; i++)
		Res = DAG.getNode(BaseOpcode, dl, EltVT, Res, Ops[i], Node->getFlags());

		// Result type may be wider than element type.
		if (EltVT != Node->getValueType(0))
		Res = DAG.getNode(ISD::ANY_EXTEND, dl, Node->getValueType(0), Res);
		return Res;
		}
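
The two-phase shape of expandVecReduce (halve while the half-width operation is supported, then finish linearly over the remaining lanes) can be sketched in plain C++ as follows. This is an illustrative model, not the LLVM implementation: `expandReduce`, `addInts`, `maxInts`, and `BinOp` are hypothetical names, and the `MinLegalLanes` parameter is an assumed stand-in for the `isOperationLegalOrCustom(BaseOpcode, HalfVT)` query.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using BinOp = int (*)(int, int);

int addInts(int A, int B) { return A + B; }
int maxInts(int A, int B) { return A > B ? A : B; }

// MinLegalLanes is a hypothetical stand-in for the legality query: halving
// continues only while the half-width vector has at least this many lanes.
int expandReduce(std::vector<int> V, BinOp Op, size_t MinLegalLanes) {
  assert(!V.empty() && "reduction needs at least one element");

  // Phase 1: for power-of-two lane counts, combine the low half with the
  // high half lane by lane, halving the vector each time.
  const bool IsPow2 = (V.size() & (V.size() - 1)) == 0;
  if (IsPow2) {
    while (V.size() > 1) {
      const size_t Half = V.size() / 2;
      if (Half < MinLegalLanes)
        break;
      for (size_t I = 0; I < Half; ++I)
        V[I] = Op(V[I], V[I + Half]);
      V.resize(Half);
    }
  }

  // Phase 2: reduce whatever lanes remain one after another.
  int Res = V[0];
  for (size_t I = 1; I < V.size(); ++I)
    Res = Op(Res, V[I]);
  return Res;
}
```

With eight lanes and `MinLegalLanes` set to 2, the model halves twice (8 to 4 to 2 lanes) and finishes with a single scalar operation. With three lanes it skips halving entirely and reduces linearly.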

llvm/trunk/lib/CodeGen/TargetLoweringBase.cpp

[659 lines not shown]	if (VT.isVector()) {
setOperationAction(ISD::FCOPYSIGN, VT, Expand);		setOperationAction(ISD::FCOPYSIGN, VT, Expand);
setOperationAction(ISD::ANY_EXTEND_VECTOR_INREG, VT, Expand);		setOperationAction(ISD::ANY_EXTEND_VECTOR_INREG, VT, Expand);
setOperationAction(ISD::SIGN_EXTEND_VECTOR_INREG, VT, Expand);		setOperationAction(ISD::SIGN_EXTEND_VECTOR_INREG, VT, Expand);
setOperationAction(ISD::ZERO_EXTEND_VECTOR_INREG, VT, Expand);		setOperationAction(ISD::ZERO_EXTEND_VECTOR_INREG, VT, Expand);
}		}

// For most targets @llvm.get.dynamic.area.offset just returns 0.		// For most targets @llvm.get.dynamic.area.offset just returns 0.
setOperationAction(ISD::GET_DYNAMIC_AREA_OFFSET, VT, Expand);		setOperationAction(ISD::GET_DYNAMIC_AREA_OFFSET, VT, Expand);

		// Vector reductions default to expand.
		setOperationAction(ISD::VECREDUCE_FADD, VT, Expand);
		setOperationAction(ISD::VECREDUCE_FMUL, VT, Expand);
		setOperationAction(ISD::VECREDUCE_ADD, VT, Expand);
		setOperationAction(ISD::VECREDUCE_MUL, VT, Expand);
		setOperationAction(ISD::VECREDUCE_AND, VT, Expand);
		setOperationAction(ISD::VECREDUCE_OR, VT, Expand);
		setOperationAction(ISD::VECREDUCE_XOR, VT, Expand);
		setOperationAction(ISD::VECREDUCE_SMAX, VT, Expand);
		setOperationAction(ISD::VECREDUCE_SMIN, VT, Expand);
		setOperationAction(ISD::VECREDUCE_UMAX, VT, Expand);
		setOperationAction(ISD::VECREDUCE_UMIN, VT, Expand);
		setOperationAction(ISD::VECREDUCE_FMAX, VT, Expand);
		setOperationAction(ISD::VECREDUCE_FMIN, VT, Expand);
}		}

// Most targets ignore the @llvm.prefetch intrinsic.		// Most targets ignore the @llvm.prefetch intrinsic.
setOperationAction(ISD::PREFETCH, MVT::Other, Expand);		setOperationAction(ISD::PREFETCH, MVT::Other, Expand);

// Most targets also ignore the @llvm.readcyclecounter intrinsic.		// Most targets also ignore the @llvm.readcyclecounter intrinsic.
setOperationAction(ISD::READCYCLECOUNTER, MVT::i64, Expand);		setOperationAction(ISD::READCYCLECOUNTER, MVT::i64, Expand);

[1,203 lines not shown]

llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp

[692 lines not shown]	if (Subtarget->hasNEON()) {
// AArch64 doesn't have MUL.2d:		// AArch64 doesn't have MUL.2d:
setOperationAction(ISD::MUL, MVT::v2i64, Expand);		setOperationAction(ISD::MUL, MVT::v2i64, Expand);
// Custom handling for some quad-vector types to detect MULL.		// Custom handling for some quad-vector types to detect MULL.
setOperationAction(ISD::MUL, MVT::v8i16, Custom);		setOperationAction(ISD::MUL, MVT::v8i16, Custom);
setOperationAction(ISD::MUL, MVT::v4i32, Custom);		setOperationAction(ISD::MUL, MVT::v4i32, Custom);
setOperationAction(ISD::MUL, MVT::v2i64, Custom);		setOperationAction(ISD::MUL, MVT::v2i64, Custom);

// Vector reductions		// Vector reductions
for (MVT VT : MVT::integer_valuetypes()) {		for (MVT VT : { MVT::v8i8, MVT::v4i16, MVT::v2i32,
		MVT::v16i8, MVT::v8i16, MVT::v4i32, MVT::v2i64 }) {
setOperationAction(ISD::VECREDUCE_ADD, VT, Custom);		setOperationAction(ISD::VECREDUCE_ADD, VT, Custom);
setOperationAction(ISD::VECREDUCE_SMAX, VT, Custom);		setOperationAction(ISD::VECREDUCE_SMAX, VT, Custom);
setOperationAction(ISD::VECREDUCE_SMIN, VT, Custom);		setOperationAction(ISD::VECREDUCE_SMIN, VT, Custom);
setOperationAction(ISD::VECREDUCE_UMAX, VT, Custom);		setOperationAction(ISD::VECREDUCE_UMAX, VT, Custom);
setOperationAction(ISD::VECREDUCE_UMIN, VT, Custom);		setOperationAction(ISD::VECREDUCE_UMIN, VT, Custom);
}		}
for (MVT VT : MVT::fp_valuetypes()) {		for (MVT VT : { MVT::v4f16, MVT::v2f32,
		MVT::v8f16, MVT::v4f32, MVT::v2f64 }) {
setOperationAction(ISD::VECREDUCE_FMAX, VT, Custom);		setOperationAction(ISD::VECREDUCE_FMAX, VT, Custom);
setOperationAction(ISD::VECREDUCE_FMIN, VT, Custom);		setOperationAction(ISD::VECREDUCE_FMIN, VT, Custom);
}		}

setOperationAction(ISD::ANY_EXTEND, MVT::v4i32, Legal);		setOperationAction(ISD::ANY_EXTEND, MVT::v4i32, Legal);
setTruncStoreAction(MVT::v2i32, MVT::v2i16, Expand);		setTruncStoreAction(MVT::v2i32, MVT::v2i16, Expand);
// Likewise, narrowing and extending vector loads/stores aren't handled		// Likewise, narrowing and extending vector loads/stores aren't handled
// directly.		// directly.
[11,254 lines not shown]

llvm/trunk/test/CodeGen/AArch64/vecreduce-add-legalization.ll

				; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
				; RUN: llc < %s -mtriple=aarch64-none-linux-gnu -mattr=+neon | FileCheck %s --check-prefix=CHECK

				declare i1 @llvm.experimental.vector.reduce.add.i1.v1i1(<1 x i1> %a)
				declare i8 @llvm.experimental.vector.reduce.add.i8.v1i8(<1 x i8> %a)
				declare i16 @llvm.experimental.vector.reduce.add.i16.v1i16(<1 x i16> %a)
				declare i24 @llvm.experimental.vector.reduce.add.i24.v1i24(<1 x i24> %a)
				declare i32 @llvm.experimental.vector.reduce.add.i32.v1i32(<1 x i32> %a)
				declare i64 @llvm.experimental.vector.reduce.add.i64.v1i64(<1 x i64> %a)
				declare i128 @llvm.experimental.vector.reduce.add.i128.v1i128(<1 x i128> %a)

				declare i8 @llvm.experimental.vector.reduce.add.i8.v3i8(<3 x i8> %a)
				declare i8 @llvm.experimental.vector.reduce.add.i8.v9i8(<9 x i8> %a)
				declare i32 @llvm.experimental.vector.reduce.add.i32.v3i32(<3 x i32> %a)
				declare i1 @llvm.experimental.vector.reduce.add.i1.v4i1(<4 x i1> %a)
				declare i24 @llvm.experimental.vector.reduce.add.i24.v4i24(<4 x i24> %a)
				declare i128 @llvm.experimental.vector.reduce.add.i128.v2i128(<2 x i128> %a)
				declare i32 @llvm.experimental.vector.reduce.add.i32.v16i32(<16 x i32> %a)

				define i1 @test_v1i1(<1 x i1> %a) nounwind {
				; CHECK-LABEL: test_v1i1:
				; CHECK: // %bb.0:
				; CHECK-NEXT: and w0, w0, #0x1
				; CHECK-NEXT: ret
				%b = call i1 @llvm.experimental.vector.reduce.add.i1.v1i1(<1 x i1> %a)
				ret i1 %b
				}

				define i8 @test_v1i8(<1 x i8> %a) nounwind {
				; CHECK-LABEL: test_v1i8:
				; CHECK: // %bb.0:
				; CHECK-NEXT: // kill: def $d0 killed $d0 def $q0
				; CHECK-NEXT: umov w0, v0.b[0]
				; CHECK-NEXT: ret
				%b = call i8 @llvm.experimental.vector.reduce.add.i8.v1i8(<1 x i8> %a)
				ret i8 %b
				}

				define i16 @test_v1i16(<1 x i16> %a) nounwind {
				; CHECK-LABEL: test_v1i16:
				; CHECK: // %bb.0:
				; CHECK-NEXT: // kill: def $d0 killed $d0 def $q0
				; CHECK-NEXT: umov w0, v0.h[0]
				; CHECK-NEXT: ret
				%b = call i16 @llvm.experimental.vector.reduce.add.i16.v1i16(<1 x i16> %a)
				ret i16 %b
				}

				define i24 @test_v1i24(<1 x i24> %a) nounwind {
				; CHECK-LABEL: test_v1i24:
				; CHECK: // %bb.0:
				; CHECK-NEXT: ret
				%b = call i24 @llvm.experimental.vector.reduce.add.i24.v1i24(<1 x i24> %a)
				ret i24 %b
				}

				define i32 @test_v1i32(<1 x i32> %a) nounwind {
				; CHECK-LABEL: test_v1i32:
				; CHECK: // %bb.0:
				; CHECK-NEXT: // kill: def $d0 killed $d0 def $q0
				; CHECK-NEXT: fmov w0, s0
				; CHECK-NEXT: ret
				%b = call i32 @llvm.experimental.vector.reduce.add.i32.v1i32(<1 x i32> %a)
				ret i32 %b
				}

				define i64 @test_v1i64(<1 x i64> %a) nounwind {
				; CHECK-LABEL: test_v1i64:
				; CHECK: // %bb.0:
				; CHECK-NEXT: // kill: def $d0 killed $d0 def $q0
				; CHECK-NEXT: fmov x0, d0
				; CHECK-NEXT: ret
				%b = call i64 @llvm.experimental.vector.reduce.add.i64.v1i64(<1 x i64> %a)
				ret i64 %b
				}

				define i128 @test_v1i128(<1 x i128> %a) nounwind {
				; CHECK-LABEL: test_v1i128:
				; CHECK: // %bb.0:
				; CHECK-NEXT: ret
				%b = call i128 @llvm.experimental.vector.reduce.add.i128.v1i128(<1 x i128> %a)
				ret i128 %b
				}

				define i8 @test_v3i8(<3 x i8> %a) nounwind {
				; CHECK-LABEL: test_v3i8:
				; CHECK: // %bb.0:
				; CHECK-NEXT: movi d0, #0000000000000000
				; CHECK-NEXT: mov v0.h[0], w0
				; CHECK-NEXT: mov v0.h[1], w1
				; CHECK-NEXT: mov v0.h[2], w2
				; CHECK-NEXT: addv h0, v0.4h
				; CHECK-NEXT: fmov w0, s0
				; CHECK-NEXT: ret
				%b = call i8 @llvm.experimental.vector.reduce.add.i8.v3i8(<3 x i8> %a)
				ret i8 %b
				}

				define i8 @test_v9i8(<9 x i8> %a) nounwind {
				; CHECK-LABEL: test_v9i8:
				; CHECK: // %bb.0:
				; CHECK-NEXT: mov v0.b[9], wzr
				; CHECK-NEXT: mov v0.b[10], wzr
				; CHECK-NEXT: mov v0.b[11], wzr
				; CHECK-NEXT: mov v0.b[12], wzr
				; CHECK-NEXT: mov v0.b[13], wzr
				; CHECK-NEXT: mov v0.b[14], wzr
				; CHECK-NEXT: mov v0.b[15], wzr
				; CHECK-NEXT: addv b0, v0.16b
				; CHECK-NEXT: fmov w0, s0
				; CHECK-NEXT: ret
				%b = call i8 @llvm.experimental.vector.reduce.add.i8.v9i8(<9 x i8> %a)
				ret i8 %b
				}

				define i32 @test_v3i32(<3 x i32> %a) nounwind {
				; CHECK-LABEL: test_v3i32:
				; CHECK: // %bb.0:
				; CHECK-NEXT: mov v0.s[3], wzr
				; CHECK-NEXT: addv s0, v0.4s
				; CHECK-NEXT: fmov w0, s0
				; CHECK-NEXT: ret
				%b = call i32 @llvm.experimental.vector.reduce.add.i32.v3i32(<3 x i32> %a)
				ret i32 %b
				}

				define i1 @test_v4i1(<4 x i1> %a) nounwind {
				; CHECK-LABEL: test_v4i1:
				; CHECK: // %bb.0:
				; CHECK-NEXT: addv h0, v0.4h
				; CHECK-NEXT: fmov w8, s0
				; CHECK-NEXT: and w0, w8, #0x1
				; CHECK-NEXT: ret
				%b = call i1 @llvm.experimental.vector.reduce.add.i1.v4i1(<4 x i1> %a)
				ret i1 %b
				}

				define i24 @test_v4i24(<4 x i24> %a) nounwind {
				; CHECK-LABEL: test_v4i24:
				; CHECK: // %bb.0:
				; CHECK-NEXT: addv s0, v0.4s
				; CHECK-NEXT: fmov w0, s0
				; CHECK-NEXT: ret
				%b = call i24 @llvm.experimental.vector.reduce.add.i24.v4i24(<4 x i24> %a)
				ret i24 %b
				}

				define i128 @test_v2i128(<2 x i128> %a) nounwind {
				; CHECK-LABEL: test_v2i128:
				; CHECK: // %bb.0:
				; CHECK-NEXT: adds x0, x0, x2
				; CHECK-NEXT: adcs x1, x1, x3
				; CHECK-NEXT: ret
				%b = call i128 @llvm.experimental.vector.reduce.add.i128.v2i128(<2 x i128> %a)
				ret i128 %b
				}

				define i32 @test_v16i32(<16 x i32> %a) nounwind {
				; CHECK-LABEL: test_v16i32:
				; CHECK: // %bb.0:
				; CHECK-NEXT: add v1.4s, v1.4s, v3.4s
				; CHECK-NEXT: add v0.4s, v0.4s, v2.4s
				; CHECK-NEXT: add v0.4s, v0.4s, v1.4s
				; CHECK-NEXT: addv s0, v0.4s
				; CHECK-NEXT: fmov w0, s0
				; CHECK-NEXT: ret
				%b = call i32 @llvm.experimental.vector.reduce.add.i32.v16i32(<16 x i32> %a)
				ret i32 %b
				}

llvm/trunk/test/CodeGen/AArch64/vecreduce-and-legalization.ll

				; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
				; RUN: llc < %s -mtriple=aarch64-none-linux-gnu -mattr=+neon | FileCheck %s --check-prefix=CHECK

				declare i1 @llvm.experimental.vector.reduce.and.i1.v1i1(<1 x i1> %a)
				declare i8 @llvm.experimental.vector.reduce.and.i8.v1i8(<1 x i8> %a)
				declare i16 @llvm.experimental.vector.reduce.and.i16.v1i16(<1 x i16> %a)
				declare i24 @llvm.experimental.vector.reduce.and.i24.v1i24(<1 x i24> %a)
				declare i32 @llvm.experimental.vector.reduce.and.i32.v1i32(<1 x i32> %a)
				declare i64 @llvm.experimental.vector.reduce.and.i64.v1i64(<1 x i64> %a)
				declare i128 @llvm.experimental.vector.reduce.and.i128.v1i128(<1 x i128> %a)

				declare i8 @llvm.experimental.vector.reduce.and.i8.v3i8(<3 x i8> %a)
				declare i8 @llvm.experimental.vector.reduce.and.i8.v9i8(<9 x i8> %a)
				declare i32 @llvm.experimental.vector.reduce.and.i32.v3i32(<3 x i32> %a)
				declare i1 @llvm.experimental.vector.reduce.and.i1.v4i1(<4 x i1> %a)
				declare i24 @llvm.experimental.vector.reduce.and.i24.v4i24(<4 x i24> %a)
				declare i128 @llvm.experimental.vector.reduce.and.i128.v2i128(<2 x i128> %a)
				declare i32 @llvm.experimental.vector.reduce.and.i32.v16i32(<16 x i32> %a)

				define i1 @test_v1i1(<1 x i1> %a) nounwind {
				; CHECK-LABEL: test_v1i1:
				; CHECK: // %bb.0:
				; CHECK-NEXT: and w0, w0, #0x1
				; CHECK-NEXT: ret
				%b = call i1 @llvm.experimental.vector.reduce.and.i1.v1i1(<1 x i1> %a)
				ret i1 %b
				}

				define i8 @test_v1i8(<1 x i8> %a) nounwind {
				; CHECK-LABEL: test_v1i8:
				; CHECK: // %bb.0:
				; CHECK-NEXT: // kill: def $d0 killed $d0 def $q0
				; CHECK-NEXT: umov w0, v0.b[0]
				; CHECK-NEXT: ret
				%b = call i8 @llvm.experimental.vector.reduce.and.i8.v1i8(<1 x i8> %a)
				ret i8 %b
				}

				define i16 @test_v1i16(<1 x i16> %a) nounwind {
				; CHECK-LABEL: test_v1i16:
				; CHECK: // %bb.0:
				; CHECK-NEXT: // kill: def $d0 killed $d0 def $q0
				; CHECK-NEXT: umov w0, v0.h[0]
				; CHECK-NEXT: ret
				%b = call i16 @llvm.experimental.vector.reduce.and.i16.v1i16(<1 x i16> %a)
				ret i16 %b
				}

				define i24 @test_v1i24(<1 x i24> %a) nounwind {
				; CHECK-LABEL: test_v1i24:
				; CHECK: // %bb.0:
				; CHECK-NEXT: ret
				%b = call i24 @llvm.experimental.vector.reduce.and.i24.v1i24(<1 x i24> %a)
				ret i24 %b
				}

				define i32 @test_v1i32(<1 x i32> %a) nounwind {
				; CHECK-LABEL: test_v1i32:
				; CHECK: // %bb.0:
				; CHECK-NEXT: // kill: def $d0 killed $d0 def $q0
				; CHECK-NEXT: fmov w0, s0
				; CHECK-NEXT: ret
				%b = call i32 @llvm.experimental.vector.reduce.and.i32.v1i32(<1 x i32> %a)
				ret i32 %b
				}

				define i64 @test_v1i64(<1 x i64> %a) nounwind {
				; CHECK-LABEL: test_v1i64:
				; CHECK: // %bb.0:
				; CHECK-NEXT: // kill: def $d0 killed $d0 def $q0
				; CHECK-NEXT: fmov x0, d0
				; CHECK-NEXT: ret
				%b = call i64 @llvm.experimental.vector.reduce.and.i64.v1i64(<1 x i64> %a)
				ret i64 %b
				}

				define i128 @test_v1i128(<1 x i128> %a) nounwind {
				; CHECK-LABEL: test_v1i128:
				; CHECK: // %bb.0:
				; CHECK-NEXT: ret
				%b = call i128 @llvm.experimental.vector.reduce.and.i128.v1i128(<1 x i128> %a)
				ret i128 %b
				}

				define i8 @test_v3i8(<3 x i8> %a) nounwind {
				; CHECK-LABEL: test_v3i8:
				; CHECK: // %bb.0:
				; CHECK-NEXT: and w8, w0, w1
				; CHECK-NEXT: and w8, w8, w2
				; CHECK-NEXT: and w0, w8, #0xff
				; CHECK-NEXT: ret
				%b = call i8 @llvm.experimental.vector.reduce.and.i8.v3i8(<3 x i8> %a)
				ret i8 %b
				}

				define i8 @test_v9i8(<9 x i8> %a) nounwind {
				; CHECK-LABEL: test_v9i8:
				; CHECK: // %bb.0:
				; CHECK-NEXT: mov w8, #-1
				; CHECK-NEXT: mov v0.b[9], w8
				; CHECK-NEXT: mov v0.b[10], w8
				; CHECK-NEXT: mov v0.b[11], w8
				; CHECK-NEXT: mov v0.b[12], w8
				; CHECK-NEXT: mov v0.b[13], w8
				; CHECK-NEXT: mov v0.b[14], w8
				; CHECK-NEXT: mov v0.b[15], w8
				; CHECK-NEXT: ext v1.16b, v0.16b, v0.16b, #8
				; CHECK-NEXT: and v0.8b, v0.8b, v1.8b
				; CHECK-NEXT: umov w8, v0.b[1]
				; CHECK-NEXT: umov w9, v0.b[0]
				; CHECK-NEXT: and w8, w9, w8
				; CHECK-NEXT: umov w9, v0.b[2]
				; CHECK-NEXT: and w8, w8, w9
				; CHECK-NEXT: umov w9, v0.b[3]
				; CHECK-NEXT: and w8, w8, w9
				; CHECK-NEXT: umov w9, v0.b[4]
				; CHECK-NEXT: and w8, w8, w9
				; CHECK-NEXT: umov w9, v0.b[5]
				; CHECK-NEXT: and w8, w8, w9
				; CHECK-NEXT: umov w9, v0.b[6]
				; CHECK-NEXT: and w8, w8, w9
				; CHECK-NEXT: umov w9, v0.b[7]
				; CHECK-NEXT: and w0, w8, w9
				; CHECK-NEXT: ret
				%b = call i8 @llvm.experimental.vector.reduce.and.i8.v9i8(<9 x i8> %a)
				ret i8 %b
				}

				define i32 @test_v3i32(<3 x i32> %a) nounwind {
				; CHECK-LABEL: test_v3i32:
				; CHECK: // %bb.0:
				; CHECK-NEXT: mov w8, #-1
				; CHECK-NEXT: mov v0.s[3], w8
				; CHECK-NEXT: ext v1.16b, v0.16b, v0.16b, #8
				; CHECK-NEXT: and v0.8b, v0.8b, v1.8b
				; CHECK-NEXT: mov w8, v0.s[1]
				; CHECK-NEXT: fmov w9, s0
				; CHECK-NEXT: and w0, w9, w8
				; CHECK-NEXT: ret
				%b = call i32 @llvm.experimental.vector.reduce.and.i32.v3i32(<3 x i32> %a)
				ret i32 %b
				}

				define i1 @test_v4i1(<4 x i1> %a) nounwind {
				; CHECK-LABEL: test_v4i1:
				; CHECK: // %bb.0:
				; CHECK-NEXT: // kill: def $d0 killed $d0 def $q0
				; CHECK-NEXT: umov w10, v0.h[1]
				; CHECK-NEXT: umov w11, v0.h[0]
				; CHECK-NEXT: umov w9, v0.h[2]
				; CHECK-NEXT: and w10, w11, w10
				; CHECK-NEXT: umov w8, v0.h[3]
				; CHECK-NEXT: and w9, w10, w9
				; CHECK-NEXT: and w8, w9, w8
				; CHECK-NEXT: and w0, w8, #0x1
				; CHECK-NEXT: ret
				%b = call i1 @llvm.experimental.vector.reduce.and.i1.v4i1(<4 x i1> %a)
				ret i1 %b
				}

				define i24 @test_v4i24(<4 x i24> %a) nounwind {
				; CHECK-LABEL: test_v4i24:
				; CHECK: // %bb.0:
				; CHECK-NEXT: ext v1.16b, v0.16b, v0.16b, #8
				; CHECK-NEXT: and v0.8b, v0.8b, v1.8b
				; CHECK-NEXT: mov w8, v0.s[1]
				; CHECK-NEXT: fmov w9, s0
				; CHECK-NEXT: and w0, w9, w8
				; CHECK-NEXT: ret
				%b = call i24 @llvm.experimental.vector.reduce.and.i24.v4i24(<4 x i24> %a)
				ret i24 %b
				}

				define i128 @test_v2i128(<2 x i128> %a) nounwind {
				; CHECK-LABEL: test_v2i128:
				; CHECK: // %bb.0:
				; CHECK-NEXT: and x0, x0, x2
				; CHECK-NEXT: and x1, x1, x3
				; CHECK-NEXT: ret
				%b = call i128 @llvm.experimental.vector.reduce.and.i128.v2i128(<2 x i128> %a)
				ret i128 %b
				}

				define i32 @test_v16i32(<16 x i32> %a) nounwind {
				; CHECK-LABEL: test_v16i32:
				; CHECK: // %bb.0:
				; CHECK-NEXT: and v1.16b, v1.16b, v3.16b
				; CHECK-NEXT: and v0.16b, v0.16b, v2.16b
				; CHECK-NEXT: and v0.16b, v0.16b, v1.16b
				; CHECK-NEXT: ext v1.16b, v0.16b, v0.16b, #8
				; CHECK-NEXT: and v0.8b, v0.8b, v1.8b
				; CHECK-NEXT: mov w8, v0.s[1]
				; CHECK-NEXT: fmov w9, s0
				; CHECK-NEXT: and w0, w9, w8
				; CHECK-NEXT: ret
				%b = call i32 @llvm.experimental.vector.reduce.and.i32.v16i32(<16 x i32> %a)
				ret i32 %b
				}
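
The widened cases in the add and and tests above (`test_v9i8`, `test_v3i32`) fill the extra lanes with a value that cannot change the result: zero (`wzr`) for add and all-ones (`#-1`) for and. The following is a minimal C++ sketch of that idea, not the SelectionDAG implementation; `widenAndReduce` and its parameters are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical helper (not an LLVM API): pad `v` out to `width` lanes with
// `neutral`, then fold every lane with `op`, keeping i8 wrap-around semantics.
std::uint8_t widenAndReduce(std::vector<std::uint8_t> v, std::size_t width,
                            std::uint8_t neutral,
                            const std::function<unsigned(unsigned, unsigned)> &op) {
  assert(!v.empty() && v.size() <= width);
  v.resize(width, neutral);            // padding lanes hold the neutral element
  unsigned acc = v[0];
  for (std::size_t i = 1; i < v.size(); ++i)
    acc = op(acc, v[i]) & 0xFFu;       // truncate back to 8 bits after each step
  return static_cast<std::uint8_t>(acc);
}
```

Padding a 9-lane vector to 16 lanes with `0x00` under `std::plus` or with `0xFF` under `std::bit_and` leaves the reduction unchanged, whereas padding an and reduction with zero would force the result to zero.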

llvm/trunk/test/CodeGen/AArch64/vecreduce-fadd-legalization.ll

				; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
				; RUN: llc < %s -mtriple=aarch64-none-linux-gnu -mattr=+neon | FileCheck %s --check-prefix=CHECK

				declare half @llvm.experimental.vector.reduce.fadd.f16.v1f16(half, <1 x half>)
				declare float @llvm.experimental.vector.reduce.fadd.f32.v1f32(float, <1 x float>)
				declare double @llvm.experimental.vector.reduce.fadd.f64.v1f64(double, <1 x double>)
				declare fp128 @llvm.experimental.vector.reduce.fadd.f128.v1f128(fp128, <1 x fp128>)

				declare float @llvm.experimental.vector.reduce.fadd.f32.v3f32(float, <3 x float>)
				declare fp128 @llvm.experimental.vector.reduce.fadd.f128.v2f128(fp128, <2 x fp128>)
				declare float @llvm.experimental.vector.reduce.fadd.f32.v16f32(float, <16 x float>)

				define half @test_v1f16(<1 x half> %a) nounwind {
				; CHECK-LABEL: test_v1f16:
				; CHECK: // %bb.0:
				; CHECK-NEXT: ret
				%b = call fast nnan half @llvm.experimental.vector.reduce.fadd.f16.v1f16(half 0.0, <1 x half> %a)
				ret half %b
				}

				define float @test_v1f32(<1 x float> %a) nounwind {
				; CHECK-LABEL: test_v1f32:
				; CHECK: // %bb.0:
				; CHECK-NEXT: // kill: def $d0 killed $d0 def $q0
				; CHECK-NEXT: // kill: def $s0 killed $s0 killed $q0
				; CHECK-NEXT: ret
				%b = call fast nnan float @llvm.experimental.vector.reduce.fadd.f32.v1f32(float 0.0, <1 x float> %a)
				ret float %b
				}

				define double @test_v1f64(<1 x double> %a) nounwind {
				; CHECK-LABEL: test_v1f64:
				; CHECK: // %bb.0:
				; CHECK-NEXT: ret
				%b = call fast nnan double @llvm.experimental.vector.reduce.fadd.f64.v1f64(double 0.0, <1 x double> %a)
				ret double %b
				}

				define fp128 @test_v1f128(<1 x fp128> %a) nounwind {
				; CHECK-LABEL: test_v1f128:
				; CHECK: // %bb.0:
				; CHECK-NEXT: ret
				%b = call fast nnan fp128 @llvm.experimental.vector.reduce.fadd.f128.v1f128(fp128 zeroinitializer, <1 x fp128> %a)
				ret fp128 %b
				}

				define float @test_v3f32(<3 x float> %a) nounwind {
				; CHECK-LABEL: test_v3f32:
				; CHECK: // %bb.0:
				; CHECK-NEXT: fmov s1, wzr
				; CHECK-NEXT: mov v0.s[3], v1.s[0]
				; CHECK-NEXT: ext v1.16b, v0.16b, v0.16b, #8
				; CHECK-NEXT: fadd v0.2s, v0.2s, v1.2s
				; CHECK-NEXT: faddp s0, v0.2s
				; CHECK-NEXT: ret
				%b = call fast nnan float @llvm.experimental.vector.reduce.fadd.f32.v3f32(float 0.0, <3 x float> %a)
				ret float %b
				}

				define fp128 @test_v2f128(<2 x fp128> %a) nounwind {
				; CHECK-LABEL: test_v2f128:
				; CHECK: // %bb.0:
				; CHECK-NEXT: str x30, [sp, #-16]! // 8-byte Folded Spill
				; CHECK-NEXT: bl __addtf3
				; CHECK-NEXT: ldr x30, [sp], #16 // 8-byte Folded Reload
				; CHECK-NEXT: ret
				%b = call fast nnan fp128 @llvm.experimental.vector.reduce.fadd.f128.v2f128(fp128 zeroinitializer, <2 x fp128> %a)
				ret fp128 %b
				}

				define float @test_v16f32(<16 x float> %a) nounwind {
				; CHECK-LABEL: test_v16f32:
				; CHECK: // %bb.0:
				; CHECK-NEXT: fadd v1.4s, v1.4s, v3.4s
				; CHECK-NEXT: fadd v0.4s, v0.4s, v2.4s
				; CHECK-NEXT: fadd v0.4s, v0.4s, v1.4s
				; CHECK-NEXT: ext v1.16b, v0.16b, v0.16b, #8
				; CHECK-NEXT: fadd v0.2s, v0.2s, v1.2s
				; CHECK-NEXT: faddp s0, v0.2s
				; CHECK-NEXT: ret
				%b = call fast nnan float @llvm.experimental.vector.reduce.fadd.f32.v16f32(float 0.0, <16 x float> %a)
				ret float %b
				}
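
Every call in this file carries the `fast` flag, and the expected output for `test_v16f32` adds vector halves together rather than accumulating lane by lane. That ordering is only valid when reassociation is permitted, because floating-point addition is not associative. The sketch below illustrates the difference in plain C++; `reduceLinear` and `reduceHalving` are hypothetical names, and the code assumes ordinary IEEE single-precision evaluation without fast-math compiler flags.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Strict left-to-right order: ((0 + a0) + a1) + a2 + ...
float reduceLinear(const std::vector<float> &v) {
  float acc = 0.0f;
  for (float x : v)
    acc += x;
  return acc;
}

// Halving order: repeatedly add the upper half onto the lower half.
// Assumes a non-empty input whose length is a power of two.
float reduceHalving(std::vector<float> v) {
  assert(!v.empty() && (v.size() & (v.size() - 1)) == 0);
  for (std::size_t half = v.size() / 2; half >= 1; half /= 2)
    for (std::size_t i = 0; i < half; ++i)
      v[i] += v[i + half];
  return v[0];
}
```

For the input `{1e8f, 1.0f, -1e8f, 1.0f}` the two orders disagree: the linear sum loses the first `1.0f` when it is absorbed into `1e8f`, while the halving order cancels the large terms first.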

llvm/trunk/test/CodeGen/AArch64/vecreduce-fmax-legalization.ll

				; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
				; RUN: llc < %s -mtriple=aarch64-none-linux-gnu -mattr=+neon | FileCheck %s --check-prefix=CHECK

				declare half @llvm.experimental.vector.reduce.fmax.f16.v1f16(<1 x half> %a)
				declare float @llvm.experimental.vector.reduce.fmax.f32.v1f32(<1 x float> %a)
				declare double @llvm.experimental.vector.reduce.fmax.f64.v1f64(<1 x double> %a)
				declare fp128 @llvm.experimental.vector.reduce.fmax.f128.v1f128(<1 x fp128> %a)

				declare float @llvm.experimental.vector.reduce.fmax.f32.v3f32(<3 x float> %a)
				declare fp128 @llvm.experimental.vector.reduce.fmax.f128.v2f128(<2 x fp128> %a)
				declare float @llvm.experimental.vector.reduce.fmax.f32.v16f32(<16 x float> %a)

				define half @test_v1f16(<1 x half> %a) nounwind {
				; CHECK-LABEL: test_v1f16:
				; CHECK: // %bb.0:
				; CHECK-NEXT: ret
				%b = call nnan half @llvm.experimental.vector.reduce.fmax.f16.v1f16(<1 x half> %a)
				ret half %b
				}

				define float @test_v1f32(<1 x float> %a) nounwind {
				; CHECK-LABEL: test_v1f32:
				; CHECK: // %bb.0:
				; CHECK-NEXT: // kill: def $d0 killed $d0 def $q0
				; CHECK-NEXT: // kill: def $s0 killed $s0 killed $q0
				; CHECK-NEXT: ret
				%b = call nnan float @llvm.experimental.vector.reduce.fmax.f32.v1f32(<1 x float> %a)
				ret float %b
				}

				define double @test_v1f64(<1 x double> %a) nounwind {
				; CHECK-LABEL: test_v1f64:
				; CHECK: // %bb.0:
				; CHECK-NEXT: ret
				%b = call nnan double @llvm.experimental.vector.reduce.fmax.f64.v1f64(<1 x double> %a)
				ret double %b
				}

				define fp128 @test_v1f128(<1 x fp128> %a) nounwind {
				; CHECK-LABEL: test_v1f128:
				; CHECK: // %bb.0:
				; CHECK-NEXT: ret
				%b = call nnan fp128 @llvm.experimental.vector.reduce.fmax.f128.v1f128(<1 x fp128> %a)
				ret fp128 %b
				}

				define float @test_v3f32(<3 x float> %a) nounwind {
				; CHECK-LABEL: test_v3f32:
				; CHECK: // %bb.0:
				; CHECK-NEXT: orr w8, wzr, #0x7f800000
				; CHECK-NEXT: fmov s1, w8
				; CHECK-NEXT: mov v0.s[3], v1.s[0]
				; CHECK-NEXT: fmaxnmv s0, v0.4s
				; CHECK-NEXT: ret
				%b = call nnan float @llvm.experimental.vector.reduce.fmax.f32.v3f32(<3 x float> %a)
				ret float %b
				}

				define fp128 @test_v2f128(<2 x fp128> %a) nounwind {
				; CHECK-LABEL: test_v2f128:
				; CHECK: // %bb.0:
				; CHECK-NEXT: b fmaxl
				%b = call nnan fp128 @llvm.experimental.vector.reduce.fmax.f128.v2f128(<2 x fp128> %a)
				ret fp128 %b
				}

				define float @test_v16f32(<16 x float> %a) nounwind {
				; CHECK-LABEL: test_v16f32:
				; CHECK: // %bb.0:
				; CHECK-NEXT: fmaxnm v1.4s, v1.4s, v3.4s
				; CHECK-NEXT: fmaxnm v0.4s, v0.4s, v2.4s
				; CHECK-NEXT: fmaxnm v0.4s, v0.4s, v1.4s
				; CHECK-NEXT: fmaxnmv s0, v0.4s
				; CHECK-NEXT: ret
				%b = call nnan float @llvm.experimental.vector.reduce.fmax.f32.v16f32(<16 x float> %a)
				ret float %b
				}

llvm/trunk/test/CodeGen/AArch64/vecreduce-umax-legalization.ll

				; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
				; RUN: llc < %s -mtriple=aarch64-none-linux-gnu -mattr=+neon | FileCheck %s --check-prefix=CHECK

				declare i1 @llvm.experimental.vector.reduce.umax.i1.v1i1(<1 x i1> %a)
				declare i8 @llvm.experimental.vector.reduce.umax.i8.v1i8(<1 x i8> %a)
				declare i16 @llvm.experimental.vector.reduce.umax.i16.v1i16(<1 x i16> %a)
				declare i24 @llvm.experimental.vector.reduce.umax.i24.v1i24(<1 x i24> %a)
				declare i32 @llvm.experimental.vector.reduce.umax.i32.v1i32(<1 x i32> %a)
				declare i64 @llvm.experimental.vector.reduce.umax.i64.v1i64(<1 x i64> %a)
				declare i128 @llvm.experimental.vector.reduce.umax.i128.v1i128(<1 x i128> %a)

				declare i8 @llvm.experimental.vector.reduce.umax.i8.v3i8(<3 x i8> %a)
				declare i8 @llvm.experimental.vector.reduce.umax.i8.v9i8(<9 x i8> %a)
				declare i32 @llvm.experimental.vector.reduce.umax.i32.v3i32(<3 x i32> %a)
				declare i1 @llvm.experimental.vector.reduce.umax.i1.v4i1(<4 x i1> %a)
				declare i24 @llvm.experimental.vector.reduce.umax.i24.v4i24(<4 x i24> %a)
				declare i128 @llvm.experimental.vector.reduce.umax.i128.v2i128(<2 x i128> %a)
				declare i32 @llvm.experimental.vector.reduce.umax.i32.v16i32(<16 x i32> %a)

				define i1 @test_v1i1(<1 x i1> %a) nounwind {
				; CHECK-LABEL: test_v1i1:
				; CHECK: // %bb.0:
				; CHECK-NEXT: and w0, w0, #0x1
				; CHECK-NEXT: ret
				%b = call i1 @llvm.experimental.vector.reduce.umax.i1.v1i1(<1 x i1> %a)
				ret i1 %b
				}

				define i8 @test_v1i8(<1 x i8> %a) nounwind {
				; CHECK-LABEL: test_v1i8:
				; CHECK: // %bb.0:
				; CHECK-NEXT: // kill: def $d0 killed $d0 def $q0
				; CHECK-NEXT: umov w0, v0.b[0]
				; CHECK-NEXT: ret
				%b = call i8 @llvm.experimental.vector.reduce.umax.i8.v1i8(<1 x i8> %a)
				ret i8 %b
				}

				define i16 @test_v1i16(<1 x i16> %a) nounwind {
				; CHECK-LABEL: test_v1i16:
				; CHECK: // %bb.0:
				; CHECK-NEXT: // kill: def $d0 killed $d0 def $q0
				; CHECK-NEXT: umov w0, v0.h[0]
				; CHECK-NEXT: ret
				%b = call i16 @llvm.experimental.vector.reduce.umax.i16.v1i16(<1 x i16> %a)
				ret i16 %b
				}

				define i24 @test_v1i24(<1 x i24> %a) nounwind {
				; CHECK-LABEL: test_v1i24:
				; CHECK: // %bb.0:
				; CHECK-NEXT: ret
				%b = call i24 @llvm.experimental.vector.reduce.umax.i24.v1i24(<1 x i24> %a)
				ret i24 %b
				}

				define i32 @test_v1i32(<1 x i32> %a) nounwind {
				; CHECK-LABEL: test_v1i32:
				; CHECK: // %bb.0:
				; CHECK-NEXT: // kill: def $d0 killed $d0 def $q0
				; CHECK-NEXT: fmov w0, s0
				; CHECK-NEXT: ret
				%b = call i32 @llvm.experimental.vector.reduce.umax.i32.v1i32(<1 x i32> %a)
				ret i32 %b
				}

				define i64 @test_v1i64(<1 x i64> %a) nounwind {
				; CHECK-LABEL: test_v1i64:
				; CHECK: // %bb.0:
				; CHECK-NEXT: // kill: def $d0 killed $d0 def $q0
				; CHECK-NEXT: fmov x0, d0
				; CHECK-NEXT: ret
				%b = call i64 @llvm.experimental.vector.reduce.umax.i64.v1i64(<1 x i64> %a)
				ret i64 %b
				}

				define i128 @test_v1i128(<1 x i128> %a) nounwind {
				; CHECK-LABEL: test_v1i128:
				; CHECK: // %bb.0:
				; CHECK-NEXT: ret
				%b = call i128 @llvm.experimental.vector.reduce.umax.i128.v1i128(<1 x i128> %a)
				ret i128 %b
				}

				define i8 @test_v3i8(<3 x i8> %a) nounwind {
				; CHECK-LABEL: test_v3i8:
				; CHECK: // %bb.0:
				; CHECK-NEXT: movi d0, #0000000000000000
				; CHECK-NEXT: mov v0.h[0], w0
				; CHECK-NEXT: mov v0.h[1], w1
				; CHECK-NEXT: mov v0.h[2], w2
				; CHECK-NEXT: bic v0.4h, #255, lsl #8
				; CHECK-NEXT: umaxv h0, v0.4h
				; CHECK-NEXT: fmov w0, s0
				; CHECK-NEXT: ret
				%b = call i8 @llvm.experimental.vector.reduce.umax.i8.v3i8(<3 x i8> %a)
				ret i8 %b
				}

				define i8 @test_v9i8(<9 x i8> %a) nounwind {
				; CHECK-LABEL: test_v9i8:
				; CHECK: // %bb.0:
				; CHECK-NEXT: mov v0.b[9], wzr
				; CHECK-NEXT: mov v0.b[10], wzr
				; CHECK-NEXT: mov v0.b[11], wzr
				; CHECK-NEXT: mov v0.b[12], wzr
				; CHECK-NEXT: mov v0.b[13], wzr
				; CHECK-NEXT: mov v0.b[14], wzr
				; CHECK-NEXT: mov v0.b[15], wzr
				; CHECK-NEXT: umaxv b0, v0.16b
				; CHECK-NEXT: fmov w0, s0
				; CHECK-NEXT: ret
				%b = call i8 @llvm.experimental.vector.reduce.umax.i8.v9i8(<9 x i8> %a)
				ret i8 %b
				}

				define i32 @test_v3i32(<3 x i32> %a) nounwind {
				; CHECK-LABEL: test_v3i32:
				; CHECK: // %bb.0:
				; CHECK-NEXT: mov v0.s[3], wzr
				; CHECK-NEXT: umaxv s0, v0.4s
				; CHECK-NEXT: fmov w0, s0
				; CHECK-NEXT: ret
				%b = call i32 @llvm.experimental.vector.reduce.umax.i32.v3i32(<3 x i32> %a)
				ret i32 %b
				}

				define i1 @test_v4i1(<4 x i1> %a) nounwind {
				; CHECK-LABEL: test_v4i1:
				; CHECK: // %bb.0:
				; CHECK-NEXT: movi v1.4h, #1
				; CHECK-NEXT: and v0.8b, v0.8b, v1.8b
				; CHECK-NEXT: umaxv h0, v0.4h
				; CHECK-NEXT: fmov w8, s0
				; CHECK-NEXT: and w0, w8, #0x1
				; CHECK-NEXT: ret
				%b = call i1 @llvm.experimental.vector.reduce.umax.i1.v4i1(<4 x i1> %a)
				ret i1 %b
				}

				define i24 @test_v4i24(<4 x i24> %a) nounwind {
				; CHECK-LABEL: test_v4i24:
				; CHECK: // %bb.0:
				; CHECK-NEXT: bic v0.4s, #255, lsl #24
				; CHECK-NEXT: umaxv s0, v0.4s
				; CHECK-NEXT: fmov w0, s0
				; CHECK-NEXT: ret
				%b = call i24 @llvm.experimental.vector.reduce.umax.i24.v4i24(<4 x i24> %a)
				ret i24 %b
				}

				define i128 @test_v2i128(<2 x i128> %a) nounwind {
				; CHECK-LABEL: test_v2i128:
				; CHECK: // %bb.0:
				; CHECK-NEXT: cmp x0, x2
				; CHECK-NEXT: csel x8, x0, x2, hi
				; CHECK-NEXT: cmp x1, x3
				; CHECK-NEXT: csel x9, x0, x2, hi
				; CHECK-NEXT: csel x0, x8, x9, eq
				; CHECK-NEXT: csel x1, x1, x3, hi
				; CHECK-NEXT: ret
				%b = call i128 @llvm.experimental.vector.reduce.umax.i128.v2i128(<2 x i128> %a)
				ret i128 %b
				}

				define i32 @test_v16i32(<16 x i32> %a) nounwind {
				; CHECK-LABEL: test_v16i32:
				; CHECK: // %bb.0:
				; CHECK-NEXT: umax v1.4s, v1.4s, v3.4s
				; CHECK-NEXT: umax v0.4s, v0.4s, v2.4s
				; CHECK-NEXT: umax v0.4s, v0.4s, v1.4s
				; CHECK-NEXT: umaxv s0, v0.4s
				; CHECK-NEXT: fmov w0, s0
				; CHECK-NEXT: ret
				%b = call i32 @llvm.experimental.vector.reduce.umax.i32.v16i32(<16 x i32> %a)
				ret i32 %b
				}
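
In `test_v3i8` above, the i8 elements sit in 16-bit lanes and a `bic v0.4h, #255, lsl #8` clears the upper byte of each lane before `umaxv`; the corresponding add test has no such instruction. A plausible reading is that the low byte of a sum does not depend on the upper bytes of its inputs, whereas an unsigned maximum does. The C++ sketch below models that difference under this assumption; the function names and the `clearHigh` parameter are hypothetical and do not correspond to LLVM code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model: each i8 element lives in a 16-bit lane whose upper
// byte may contain arbitrary bits after promotion.
std::uint8_t addInWideLanes(const std::vector<std::uint16_t> &lanes) {
  std::uint16_t acc = 0;
  for (std::uint16_t l : lanes)
    acc = static_cast<std::uint16_t>(acc + l);
  return static_cast<std::uint8_t>(acc);   // truncation drops upper-byte junk
}

std::uint8_t umaxInWideLanes(const std::vector<std::uint16_t> &lanes,
                             bool clearHigh) {
  std::uint16_t acc = 0;
  for (std::uint16_t l : lanes) {
    if (clearHigh)
      l &= 0x00FF;                         // models `bic v0.4h, #255, lsl #8`
    acc = std::max(acc, l);
  }
  return static_cast<std::uint8_t>(acc);
}
```

With lanes `{0xAB05, 0x0007, 0x1203}` (i8 values 5, 7, 3), the add result is correct without masking, but the unsigned maximum is only correct when the upper bytes are cleared first.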

Revision Contents

Diff 190146
