This is an archive of the discontinued LLVM Phabricator instance.

Differential D97299

[IR][SVE] Add new llvm.experimental.stepvector intrinsic
Closed, Public

Authored by david-arm on Feb 23 2021, 7:50 AM.


Details

Reviewers

sdesmalen
kmclaughlin
paulwalker-arm
ctetreau
fhahn
efriedma
rogfer01

Summary

This patch adds a new llvm.experimental.stepvector intrinsic,
which takes no arguments and returns a linear integer sequence of
values of the form <0, 1, ...>. It is primarily intended for
scalable vectors, although it will work for fixed width vectors
too. It is intended that later patches will make use of this
new intrinsic when vectorising induction variables, currently only
supported for fixed width. I've added a new CreateStepVector
method to the IRBuilder, which will generate a call to this
intrinsic for scalable vectors and fall back on creating a
ConstantVector for fixed width.
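
To make the described semantics concrete, here is a minimal stand-alone C++ sketch (no LLVM headers) of the lane values the intrinsic is described as producing. `VecShape`, `stepVectorLanes` and the explicit `VScale` parameter are hypothetical names used only for illustration; they are not part of the patch or of the LLVM API.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model of a vector type: MinElts lanes, multiplied by a runtime
// vscale when Scalable is true (e.g. <vscale x 4 x i32> has MinElts == 4).
struct VecShape {
  unsigned MinElts;
  bool Scalable;
};

// Returns the lane values <0, 1, 2, ...> for the given shape. For a
// fixed-width shape VScale is ignored, which matches the summary's point that
// the fixed-width result is known at compile time and can be a constant.
std::vector<uint64_t> stepVectorLanes(VecShape Shape, unsigned VScale) {
  assert(VScale >= 1 && "vscale must be at least 1");
  unsigned NumLanes = Shape.Scalable ? Shape.MinElts * VScale : Shape.MinElts;
  std::vector<uint64_t> Lanes(NumLanes);
  for (unsigned I = 0; I < NumLanes; ++I)
    Lanes[I] = I;
  return Lanes;
}
```

The scalable case is the one that needs an intrinsic: the number of lanes depends on a runtime value, so the sequence cannot be written out as a literal constant.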

For scalable vectors this intrinsic is lowered to a new ISD node
called STEP_VECTOR, which takes a single constant integer argument
as the step. During lowering this argument is set to a value of 1.
The reason for this additional argument at the codegen level is
because in future patches we will introduce various generic DAG
combines such as

mul step_vector(1), 2 -> step_vector(2)
add step_vector(1), step_vector(1) -> step_vector(2)
shl step_vector(1), 1 -> step_vector(2)
etc.

that encourage a canonical format for all targets. This hopefully
means all other targets supporting scalable vectors can benefit
from this too.
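
The three combines listed above can be sanity-checked lane by lane. The following is a plain C++ model, not SelectionDAG code: it assumes lane `I` of `step_vector(Step)` holds `I * Step`, and the helper names (`stepVector`, `mulSplat`, `add`, `shlSplat`) are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Lane I of step_vector(Step) holds I * Step.
std::vector<uint64_t> stepVector(uint64_t Step, unsigned NumLanes) {
  std::vector<uint64_t> V(NumLanes);
  for (unsigned I = 0; I < NumLanes; ++I)
    V[I] = I * Step;
  return V;
}

// Multiply every lane by the same scalar (vector mul with a splat operand).
std::vector<uint64_t> mulSplat(std::vector<uint64_t> V, uint64_t C) {
  for (auto &X : V)
    X *= C;
  return V;
}

// Lane-wise addition of two vectors with the same number of lanes.
std::vector<uint64_t> add(std::vector<uint64_t> A,
                          const std::vector<uint64_t> &B) {
  assert(A.size() == B.size() && "lane counts must match");
  for (std::size_t I = 0; I < A.size(); ++I)
    A[I] += B[I];
  return A;
}

// Shift every lane left by the same amount (vector shl with a splat operand).
std::vector<uint64_t> shlSplat(std::vector<uint64_t> V, unsigned Amt) {
  for (auto &X : V)
    X <<= Amt;
  return V;
}
```

In this model `mulSplat(stepVector(1, N), 2)`, `add(stepVector(1, N), stepVector(1, N))` and `shlSplat(stepVector(1, N), 1)` all produce the same lanes as `stepVector(2, N)`, which is why a single canonical `step_vector(2)` node can stand in for each of them.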

I've added cost model tests for both fixed width and scalable
vectors:

llvm/test/Analysis/CostModel/AArch64/neon-stepvector.ll
llvm/test/Analysis/CostModel/AArch64/sve-stepvector.ll

as well as codegen lowering tests for fixed width and scalable
vectors:

llvm/test/CodeGen/AArch64/neon-stepvector.ll
llvm/test/CodeGen/AArch64/sve-stepvector.ll

See this thread for discussion of the intrinsic:
https://lists.llvm.org/pipermail/llvm-dev/2021-January/147943.html


Event Timeline

david-arm created this revision. Feb 23 2021, 7:50 AM

Herald added a reviewer: efriedma. Feb 23 2021, 7:50 AM

Herald added subscribers: dexonsmith, jdoerfert, psnobl and 3 others.

david-arm requested review of this revision. Feb 23 2021, 7:50 AM

Herald added a project: Restricted Project. Feb 23 2021, 7:50 AM

Herald added a subscriber: llvm-commits.

Harbormaster completed remote builds in B90403: Diff 325794. Feb 23 2021, 7:50 AM

david-arm added a parent revision: D97276: [CodeGen] Canonicalise adds/subs of i1 vectors using XOR. Feb 23 2021, 7:51 AM

david-arm added a reviewer: rogfer01. Feb 23 2021, 7:56 AM

kmclaughlin added inline comments. Feb 24 2021, 5:27 AM

llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
1649	nit: can you add an error message to this assert?
llvm/test/CodeGen/AArch64/neon-stepvector.ll
148	Should there also be some tests here for illegal predicate types such as v32i1, as there are in sve-stepvector.ll?

david-arm added inline comments. Feb 24 2021, 8:34 AM

llvm/test/CodeGen/AArch64/neon-stepvector.ll
148	Hi @kmclaughlin, thanks for pointing that out. I originally did add some tests for illegal types for neon, but for some reason the i1 element types weren't being promoted to i8 and so we just ended up using GPRs instead, i.e. something like `mov %x0, #some_immediate`. I thought it looked really odd and inconsistent with the legal types so I left them out. I'm not sure if that's a bug with the AArch64 backend for Neon or intended behaviour.

paulwalker-arm added inline comments. Feb 25 2021, 10:44 AM

llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
4800	Given that we mandate the Step must be an immediate, I'm wondering if DAG should have a getStepVector function that takes such. I say this because, for example, what are you expecting to happen if `DAG.getNode(ISD::ANY_EXTEND` did not get folded away? As an aside, do you really not care about the type of extension? I think you absolutely need zero extension based on the current node restrictions.
llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
1657–1659	I guess there's no action required for this patch as fixed length vectors are currently excluded, however, if this idiom becomes common then I suggest creating a DAG.getElementCountAsNode(EVT) like function. That way code like this will just work for both fixed and scalable vectors.
llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
4323–4325	`if (cast<ConstantSDNode>(Step)->isNullValue())` ?
4668	Is it worth also validating that Operand is at least as big as the result element type? That way, if the "not-negative" requirement ever gets dropped, I think the signedness will not actually matter.
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
6949	Perhaps worth punting this into visitStepVector. This is what we've done for vector.reverse.
llvm/lib/IR/IRBuilder.cpp
97	Given this is a FixedVectorType you should be able to short cut ElementCount and use getNumElements() directly.
105–107	I cannot see how this assert will ever fire and so `return ConstantVector::get(Indices);` just looks nicer.
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
9052–9053	I'm not a fan of the pattern duplication within AArch64SVEInstrInfo.td so I'm thinking we should lower all ISD::STEP_VECTORs to AArch64ISD::INDEX_VECTOR. Regardless of whether you do that I would expect the predicate handling to be more generic, for example: `return ISD::TRUNC(ISD::STEP_VECTOR(Op.getConstantOperandVal())`

david-arm added inline comments. Mar 1 2021, 12:40 AM

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
9052–9053	Hi @paulwalker-arm, me too for the i1 case. However, I tried the generic route first, exactly as you suggest, and it didn't work. The problem was that if I created the ISD::TRUNC node we never ended up custom lowering as you'd expect. If you look at AArch64TargetLowering::LowerTRUNCATE, for i1 element types we return `DAG.getSetCC(dl, VT, And, Zero, ISD::SETNE)`, which then never gets into our custom lowering code, despite the action being set to Custom for precisely those types. We then crash in isel trying to match a `setcc` operation. I'd love to do it this way if you've got any suggestions for how I can solve this problem. I suspect there are just too many levels of custom lowering involved, i.e. custom lower STEP_VECTOR, followed by custom lower TRUNCATE, followed by custom lower SETCC. So in general your preference is to custom lower all types and go through this function to create AArch64ISD::INDEX_VECTOR?

david-arm added inline comments. Mar 1 2021, 12:50 AM

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
9052–9053	One solution might be to rewrite LowerTRUNCATE to avoid returning SETCC for scalable vectors and use AArch64ISD::SETCC_MERGE_ZERO directly? I'd just assumed people would find this a bit ugly.

david-arm updated this revision to Diff 327720. Mar 3 2021, 1:58 AM

david-arm marked 9 inline comments as done. Mar 3 2021, 2:27 AM

Harbormaster completed remote builds in B91767: Diff 327720. Mar 3 2021, 5:48 AM

david-arm added a child revision: D97861: [LoopVectorize][NFC] Refactor code to use IRBuilder::CreateStepVector. Mar 3 2021, 7:55 AM

If the ISD node is going to get an argument for the step, why not let the new intrinsic have this same argument?

In D97299#2603806, @ctetreau wrote:

If the ISD node is going to get an argument for the step, why not let the new intrinsic have this same argument?

@ctetreau: We've followed the same route as for llvm.vscale() -> ISD::VSCALE so yes the code generation side is more complete/convenient. Considering we're expecting llvm.experimental.stepvector() to be redundant once there's time to push for the preferred ConstantVector solution, we feel it's better to keep the intrinsic as simple as possible.

OK, fair enough.

I'm still working through the patch but here's what I've got for now.

llvm/docs/LangRef.rst
16572	To be consistent with the other experimental vector intrinsics this should be `fixed-width`?
llvm/include/llvm/CodeGen/BasicTTIImpl.h
1253–1254	This seems like the wrong place to validate the intrinsic.
llvm/include/llvm/CodeGen/SelectionDAG.h
650–651	I guess I've caused this with my previous "don't use a node request" and thus the answer will be no, but I'll ask the question anyway. Can OpVT be dropped here? as it seems inconvenient.
llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
1657–1663	The Step is an immediate, as is the known part of LoVT's element count, which suggests you shouldn't need to use DAG for the maths because you can do `getVScale(N->getConstantOperandAPInt(0) * LoVT.getVectorMinNumElements())`
llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
4325	I think `DAG.getConstant(0...` is better here so that the target's preferred nodes (i.e. SPLAT_VECTOR or BUILD_VECTOR) are used.
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
10938–10939	Is this needed? I ask because `DAG.getStepVector` will validate the result VT in one place rather than expecting all its users to do likewise.
10946–10949	What about pushing this into `DAG.getStepVector`? That way there's a generic way for any code to create this vector without needing to worry about the result type.
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
272–273	In a world where InstructionCost is everywhere I think it's better for the InstructionCost side to be the LHS because it reduces the number of required operator overloads. Check with @sdesmalen but I think you'll save some work if you write this as `getArithmeticInstrCost() * (LT.first - 1)`.

paulwalker-arm added inline comments. Mar 10 2021, 3:03 AM

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
9059–9062	Is it possible to use getPromotedVTForPredicate here?
9067	I can imagine a route where this promotion is also required for the `ElemVT != MVT::i1` case. Related to this I suspect we'll need an implementation of `PromoteIntOp_STEP_VECTOR` but currently don't because there's nothing that can exercise the `Step != 1` case.
10216	Is this test needed? I'm guessing this code was added as part of SVE support (v#i1 being an illegal type for NEON).

david-arm added inline comments. Mar 11 2021, 12:44 AM

llvm/include/llvm/CodeGen/SelectionDAG.h
650–651	Yeah, that's right. This arose because of trying to pass in an immediate value here. I'm not sure of a much less ugly way when removing OpVT. If we remove it then I have to create a type every time based upon the current width of Step and assume the caller has set the bit width correctly. I can't just use the element type of ResVT because this takes me back to square one, i.e. that I then need to worry about promoting the type to i32 or i64. I could pass in a ConstantSDNode instead of the {OpVT, Step} pair, which has both the OpVT type and the APInt combined?

paulwalker-arm added inline comments. Mar 11 2021, 2:02 AM

llvm/include/llvm/CodeGen/SelectionDAG.h
650–651	Thanks for the information Dave and sorry for messing you around here. How about we go back to a stock SDValue but add an assert in `getNode` that the second operand is the constant we expect? This doesn't prevent people using DAG interfaces to transform what we know to be a constant, but at least we'll catch it very early if it goes wrong. I still think this interface is of value as a generic way to create such vectors regardless of vector type.

Changed interface SelectionDAG::getStepVector to take a SDValue for the Step and to support fixed width vectors.
Use getPromotedVTForPredicate in lowering code for STEP_VECTOR operating on i1 vectors.
In SplitVecRes_STEP_VECTOR folded the multiply into the getVScale call.
Addressed other review comments.
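
One way to read the folded multiply in SplitVecRes_STEP_VECTOR: when a step vector is split in two, the low half is itself a step vector, and the high half is the same step vector offset by `Step` times the number of lanes in the low half, where that lane count is `vscale` times the minimum element count of the low type. The sketch below models this in plain C++ under that assumption. `splitStepVector`, `LoMinElts` and the explicit `VScale` argument are hypothetical and do not reproduce the patch's code.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

using Lanes = std::vector<uint64_t>;

// Models splitting step_vector(Step) with (2 * LoMinElts * VScale) lanes into
// two halves. The high half starts at Step * LoMinElts * VScale; the constant
// part of that product (Step * LoMinElts) is what can be folded into a single
// vscale-scaled value.
std::pair<Lanes, Lanes> splitStepVector(uint64_t Step, unsigned LoMinElts,
                                        unsigned VScale) {
  assert(VScale >= 1 && "vscale must be at least 1");
  unsigned HalfLanes = LoMinElts * VScale;
  uint64_t StartOfHi = Step * LoMinElts * VScale;
  Lanes Lo(HalfLanes), Hi(HalfLanes);
  for (unsigned I = 0; I < HalfLanes; ++I) {
    Lo[I] = I * Step;             // step_vector(Step)
    Hi[I] = StartOfHi + I * Step; // splat(StartOfHi) + step_vector(Step)
  }
  return {Lo, Hi};
}
```

For example, with `Step = 3`, `LoMinElts = 2` and `VScale = 2` the two halves are `<0, 3, 6, 9>` and `<12, 15, 18, 21>`, which concatenate to the original eight-lane sequence.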

Harbormaster completed remote builds in B93490: Diff 330233. Mar 12 2021, 7:05 AM

david-arm marked 13 inline comments as done. Mar 12 2021, 7:07 AM

bin.cheng-ali added a subscriber: bin.cheng-ali. Mar 13 2021, 8:38 AM

bin.cheng-ali added inline comments.

llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
1657–1659	My question is if it should become common? I mean if fixed length vector can be handled more efficiently, shall we always lower and handle fixed/scalable length differently, rather than handle them simultaneously in more and more functions, which has no obvious benefit?

paulwalker-arm added inline comments. Mar 14 2021, 4:03 AM

llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
1657–1659	I agree and hope we evolve to a more unified set of nodes that work for all vector types. That said, this is a larger conversation and so today we're trying to add scalable vector support in a way that is a NFC for fixed length vectors. This patch does add a unified interface in the form of `SelectionDAG::getStepVector(...)` that should allow for portable code regardless of vector type. SplitVecRes_STEP_VECTOR is specifically linked to `ISD::STEP_VECTOR`, which can only be created for scalable vector types. So whilst this function can be easily extended to cover fixed length vector types, it would not be testable in a way that preserved this "NFC for fixed length vectors" goal.

dexonsmith removed a subscriber: dexonsmith. Mar 15 2021, 1:31 PM

I'm still struggling with how best to represent STEPVECTOR's scale operand. The patch as it currently stands is probably good enough but I do wonder how long we'll get by without needing to implement PromoteIntOp_STEP_VECTOR. I also wonder if we can take a similar approach as done for INSERT_SUBVECTOR's index operand.

The description for how STEPVECTOR handles overflow also seems a bit limiting. Specifically, this patch lowers vector of i1 based STEP_VECTOR operations, but with the current description only the first two lanes of any vector type will result in defined behaviour, which makes me wonder about the value of allowing such support.
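
The point about i1 vectors can be illustrated numerically. The sketch below assumes that lane `I` holds `I * Step` as an unsigned value and that a lane becomes undefined once that value no longer fits in the element type, which is one reading of the overflow description under discussion. `definedLanes` is a hypothetical helper, not an LLVM function.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Number of leading lanes whose value I * Step still fits in an unsigned
// element of ElemBits bits. Lanes beyond this count would be undefined under
// the overflow rule assumed above. Requires 1 <= ElemBits <= 63 and Step >= 1.
uint64_t definedLanes(unsigned ElemBits, uint64_t Step, uint64_t NumLanes) {
  assert(ElemBits >= 1 && ElemBits <= 63 && "unsupported element width");
  assert(Step >= 1 && "step must be positive");
  uint64_t MaxValue = (uint64_t{1} << ElemBits) - 1;
  return std::min(NumLanes, MaxValue / Step + 1);
}
```

With `ElemBits = 1` and `Step = 1` only lanes 0 and 1 are representable regardless of how many lanes the vector has, whereas an i8 element type keeps up to 256 lanes defined for the same step.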

llvm/include/llvm/CodeGen/SelectionDAG.h
836	This is no longer true, perhaps just `Returns a vector of type...`
llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
1754	Given this is fixed length specific I think calling `getVectorNumElements()` is clearer.
4324	Comment is no longer valid. I'd just remove it.
4670	FYI: APInt has a function for this called `isNonNegative()`.
llvm/lib/IR/IRBuilder.cpp
95	Keep this style if you prefer, but I think it will be cleaner done as `if (isa<ScalableVectorType>(DstType)) return CreateIntrinsic(Intrinsic::experimental_stepvector, {DstType}, {},....` followed by `<fixed vector handling here>`.
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
264–265	This is safe to assume and already checked within Verifier.cpp.
llvm/test/CodeGen/AArch64/neon-stepvector.ll
10–11	For this test to be correct requires LCPI0_0 to point to something meaningful. I believe update_llc_test_checks.py strips this information so you'll need to manually add it. We did likewise for named-vector-shuffle-reverse-neon.ll.

Addressed review comments.
After recent upstream discussion on the SVE sync call it was decided to remove support for vectors of i1 types.

Harbormaster completed remote builds in B94191: Diff 331195. Mar 17 2021, 4:00 AM

Changed code to check for element sizes >= 8 bits instead of 1!

Harbormaster completed remote builds in B94499: Diff 331610. Mar 18 2021, 11:58 AM

A couple of niggles but otherwise looks good to me.

llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp
4793	Could use `N->getConstantOperandAPInt(0)` here.
llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
4667	To match the comment you'll need to add an `isInteger` test.
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
9052–9053	This assert should not be necessary as you're protecting this already within getNode and thus can assume the node to be valid.

This revision is now accepted and ready to land. Mar 19 2021, 8:57 AM

fhahn added inline comments. Mar 21 2021, 2:33 PM

llvm/docs/LangRef.rst
16614	Would it make sense to limit the intrinsic to integer vectors only? Otherwise, why is it not recommended to use it with different element types? I think the behaviour with other element types should also be spelled out, if other element types are supported.
16615	Can we make the wording here consistent with the other scalable intrinsics?

Updated LangRef to better describe the intrinsic behaviour.

Harbormaster completed remote builds in B94966: Diff 332253. Mar 22 2021, 6:43 AM

In D97299#2641106, @david-arm wrote:

Updated LangRef to better describe the intrinsic behaviour.

Thanks. I think it should be possible to enforce the new restrictions in the IR verifier, to prevent users from accidentally using them with wrong types.

In D97299#2641273, @fhahn wrote:

Thanks. I think it should be possible to enforce the new restrictions in the IR verifier, to prevent users from accidentally using them with wrong types.

Yeah that's right. The type (including the element size) are defended in this patch in Verifier.cpp.

In D97299#2641278, @david-arm wrote:

Yeah that's right. The type (including the element size) are defended in this patch in Verifier.cpp.

That's great! It still looks like there are negative tests missing for the checks?

Added verifier tests.
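
The restriction discussed above (an integer vector result whose elements are at least 8 bits wide) can be sketched as a simple predicate. This is a stand-alone C++ model: `ResultType` and `isValidStepVectorResult` are hypothetical names, and the actual check in Verifier.cpp is not reproduced here.

```cpp
#include <cassert>

// Hypothetical description of an intrinsic result type.
struct ResultType {
  bool IsVector;
  bool IsIntegerElement;
  unsigned ElementBits;
};

// True when the type satisfies the restrictions stated in the review: a
// vector of integers whose elements are at least 8 bits wide.
bool isValidStepVectorResult(const ResultType &Ty) {
  assert(Ty.ElementBits > 0 && "element width must be non-zero");
  return Ty.IsVector && Ty.IsIntegerElement && Ty.ElementBits >= 8;
}
```

Negative tests of the kind requested would then cover each failing condition separately: a non-vector result, a non-integer element type, and an element narrower than 8 bits such as i1.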

Harbormaster completed remote builds in B94989: Diff 332285. Mar 22 2021, 9:12 AM

Thanks, LGTM!

Closed by commit 748ae5281d4f7f0ff261ba9e8c57e6b6fcfdb31e

In D97299#2629320, @paulwalker-arm wrote:

I'm still struggling with how best to represent STEPVECTOR's scale operand. The patch as it currently stands is probably good enough but I do wonder how long we'll get by without needing to implement PromoteIntOp_STEP_VECTOR. I also wonder if we can take a similar approach as done for INSERT_SUBVECTOR's index operand.

Hey. Sorry to raise a closed thread but this exact issue is making it difficult to support STEP_VECTOR on RISC-V. On RV32 we don't have legal i64 but do have legal i64 vectors. So we're hitting an assert during visitStepVector where we try and create a nxvXi64 STEP_VECTOR with an i32 operand. Are you aware of anything that stops us from changing the requirements to be an integer pointer type? It might limit optimizations but we can always not perform DAG combines if it's not safe to do so?

In D97299#2658448, @frasercrmck wrote:

Hey. Sorry to raise a closed thread but this exact issue is making it difficult to support STEP_VECTOR on RISC-V. On RV32 we don't have legal i64 but do have legal i64 vectors. So we're hitting an assert during visitStepVector where we try and create a nxvXi64 STEP_VECTOR with an i32 operand. Are you aware of anything that stops us from changing the requirements to be an integer pointer type? It might limit optimizations but we can always not perform DAG combines if it's not safe to do so?

That's good feedback, thanks. The problem is if we ever decide to allow a negative operand. By forcing the operand to have at least the same bit width as the result's element type we managed to build in some future-proof-ness. That said, I'm not locked into this idea, plus we can easily switch the operand to be signed if that makes more sense and, as you say, not dag combine when it would be unsafe to do so. So if this is proving problematic for you then I've nothing against making the operand an integer pointer type.

Revision Contents

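Path                                                        Size

llvm/docs/LangRef.rst                                       30 lines
llvm/include/llvm/CodeGen/BasicTTIImpl.h                    6 lines
llvm/include/llvm/CodeGen/ISDOpcodes.h                      8 lines
llvm/include/llvm/CodeGen/SelectionDAG.h                    4 lines
llvm/include/llvm/IR/IRBuilder.h                            3 lines
llvm/include/llvm/IR/Intrinsics.td                          3 lines
llvm/include/llvm/Target/TargetSelectionDAG.td              2 lines
llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp      13 lines
llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h               2 lines
llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp       27 lines
llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp              37 lines
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h         1 line
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp       14 lines
llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp        1 line
llvm/lib/IR/IRBuilder.cpp                                   17 lines
llvm/lib/IR/Verifier.cpp                                    9 lines
llvm/lib/Target/AArch64/AArch64ISelLowering.h               1 line
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp             18 lines
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp      13 lines
llvm/test/Analysis/CostModel/AArch64/neon-stepvector.ll     34 lines
llvm/test/Analysis/CostModel/AArch64/sve-stepvector.ll      39 lines
llvm/test/CodeGen/AArch64/neon-stepvector.ll                181 lines
llvm/test/CodeGen/AArch64/sve-stepvector.ll                 121 lines
llvm/test/Verifier/stepvector-intrinsic.ll                  29 lines
llvm/unittests/CodeGen/AArch64SelectionDAGTest.cpp          14 lines
llvm/unittests/IR/IRBuilderTest.cpp                         26 lines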

Diff 332285

llvm/docs/LangRef.rst

[...]
  Overview:
  """""""""

  The '``llvm.experimental.vector.splice.*``' intrinsics construct a vector by
  concatenating elements from the first input vector with elements of the second
  input vector, returning a vector of the same type as the input vectors. The
  signed immediate, modulo the number of elements in the vector, is the index
  into the first vector from which to extract the result value. This means
  conceptually that for a positive immediate, a vector is extracted from
  ``concat(%vec1, %vec2)`` starting at index ``imm``, whereas for a negative
  immediate, it extracts ``-imm`` trailing elements from the first vector, and
  the remaining elements from ``%vec2``.

  These intrinsics work for both fixed and scalable vectors. While this intrinsic
  is marked as experimental, the recommended way to express this operation for
  fixed-width vectors is still to use a shufflevector, as that may allow for more
[...]
  Arguments:
  """"""""""

  The first two operands are vectors with the same type. The third argument
  ``imm`` is the start index, modulo VL, where VL is the runtime vector length of
  the source/result vector. The ``imm`` is a signed integer constant in the range
  ``-VL <= imm < VL``. For values outside of this range the result is poison.

+
+ '``llvm.experimental.stepvector``' Intrinsic
+
+ This is an overloaded intrinsic. You can use ``llvm.experimental.stepvector``
+ to generate a vector whose lane values comprise the linear sequence
+ <0, 1, 2, ...>. It is primarily intended for scalable vectors.
+
+ ::
+
+       declare <vscale x 4 x i32> @llvm.experimental.stepvector.nxv4i32()
+       declare <vscale x 8 x i16> @llvm.experimental.stepvector.nxv8i16()
+
+ The '``llvm.experimental.stepvector``' intrinsics are used to create vectors
+ of integers whose elements contain a linear sequence of values starting from 0
+ with a step of 1. This experimental intrinsic can only be used for vectors
+ with integer elements that are at least 8 bits in size. If the sequence value
+ exceeds the allowed limit for the element type then the result for that lane is
+ undefined.
+
+ These intrinsics work for both fixed and scalable vectors. While this intrinsic
+ is marked as experimental, the recommended way to express this operation for
+ fixed-width vectors is still to generate a constant vector instead.
+
+
+ Arguments:
+ """"""""""
+
+ None.
+

  Matrix Intrinsics
  -----------------

  Operations on matrixes requiring shape information (like number of rows/columns
  or the memory layout) can be expressed using the matrix intrinsics. These
  intrinsics require matrix dimensions to be passed as immediate arguments, and
  matrixes are passed and returned as vectors. This means that for a ``R`` x
  ``C`` matrix, element ``i`` of column ``j`` is at index ``j * R + i`` in the
[...]

llvm/include/llvm/CodeGen/BasicTTIImpl.h

[...] unsigned getIntrinsicInstrCost(const IntrinsicCostAttributes &ICA,
    }
    case Intrinsic::masked_gather: {
      const Value *Mask = Args[2];
      bool VarMask = !isa<Constant>(Mask);
      Align Alignment = cast<ConstantInt>(Args[1])->getAlignValue();
      return thisT()->getGatherScatterOpCost(Instruction::Load, RetTy, Args[0],
                                             VarMask, Alignment, CostKind, I);
    }
+   case Intrinsic::experimental_stepvector: {
+     if (isa<ScalableVectorType>(RetTy))
+       return BaseT::getIntrinsicInstrCost(ICA, CostKind);
+     // The cost of materialising a constant integer vector.
+     return TargetTransformInfo::TCC_Basic;
+   }
    case Intrinsic::experimental_vector_extract: {
      // FIXME: Handle case where a scalable vector is extracted from a scalable
      // vector
      if (isa<ScalableVectorType>(RetTy))
        return BaseT::getIntrinsicInstrCost(ICA, CostKind);
      unsigned Index = cast<ConstantInt>(Args[1])->getZExtValue();
      return thisT()->getShuffleCost(TTI::SK_ExtractSubvector,
                                     cast<VectorType>(Args[0]->getType()),
[...]

llvm/include/llvm/CodeGen/ISDOpcodes.h

[...] enum NodeType {
    /// scalar values joined together and then duplicated in all lanes. This
    /// represents a SPLAT_VECTOR that has had its scalar operand expanded. This
    /// allows representing a 64-bit splat on a target with 32-bit integers. The
    /// total width of the scalars must cover the element width. SCALAR1 contains
    /// the least significant bits of the value regardless of endianness and all
    /// scalars should have the same type.
    SPLAT_VECTOR_PARTS,

+   /// STEP_VECTOR(IMM) - Returns a scalable vector whose lanes are comprised
+   /// of a linear sequence of unsigned values starting from 0 with a step of
+   /// IMM, where IMM must be a constant positive integer value. The operation
+   /// does not support returning fixed-width vectors or non-constant operands.
+   /// If the sequence value exceeds the limit allowed for the element type then
+   /// the values for those lanes are undefined.
+   STEP_VECTOR,
+
    /// MULHU/MULHS - Multiply high - Multiply two integers of type iN,
    /// producing an unsigned/signed value of type i[2*N], then return the top
    /// part.
    MULHU,
    MULHS,

    /// [US]{MIN/MAX} - Binary minimum or maximum or signed or unsigned
    /// integers.
[...]

llvm/include/llvm/CodeGen/SelectionDAG.h

Show First 20 Lines • Show All 641 Lines • ▼ Show 20 Lines	SDValue getTargetConstant(const APInt &Val, const SDLoc &DL, EVT VT,
bool isOpaque = false) {		bool isOpaque = false) {
return getConstant(Val, DL, VT, true, isOpaque);		return getConstant(Val, DL, VT, true, isOpaque);
}		}
SDValue getTargetConstant(const ConstantInt &Val, const SDLoc &DL, EVT VT,		SDValue getTargetConstant(const ConstantInt &Val, const SDLoc &DL, EVT VT,
bool isOpaque = false) {		bool isOpaque = false) {
return getConstant(Val, DL, VT, true, isOpaque);		return getConstant(Val, DL, VT, true, isOpaque);
}		}

/// Create a true or false constant of type \p VT using the target's		/// Create a true or false constant of type \p VT using the target's
/// BooleanContent for type \p OpVT.		/// BooleanContent for type \p OpVT.
		paulwalker-arm (Done): I guess I've caused this with my previous "don't use a node" request and thus the answer will be no, but I'll ask the question anyway. Can OpVT be dropped here, as it seems inconvenient?
		david-arm (Author, Done): Yeah, that's right. This arose because of trying to pass in an immediate value here. I'm not sure of a much less ugly way when removing OpVT. If we remove it then I have to create a type every time based upon the current width of Step and assume the caller has set the bit width correctly. I can't just use the element type of ResVT because this takes me back to square one, i.e. that I then need to worry about promoting the type to i32 or i64. I could pass in a ConstantSDNode instead of the {OpVT, Step} pair, which has both the OpVT type and the APInt combined?
		paulwalker-arm (Done): Thanks for the information Dave and sorry for messing you around here. How about we go back to a stock SDValue but add an assert in `getNode` that the second operand is the constant we expect? This doesn't prevent people using DAG interfaces to transform what we know to be a constant, but at least we'll catch it very early if it goes wrong. I still think this interface is of value as a generic way to create such vectors regardless of vector type.
SDValue getBoolConstant(bool V, const SDLoc &DL, EVT VT, EVT OpVT);		SDValue getBoolConstant(bool V, const SDLoc &DL, EVT VT, EVT OpVT);
/// @}		/// @}

/// Create a ConstantFPSDNode wrapping a constant value.		/// Create a ConstantFPSDNode wrapping a constant value.
/// If VT is a vector type, the constant is splatted into a BUILD_VECTOR.		/// If VT is a vector type, the constant is splatted into a BUILD_VECTOR.
///		///
/// If only legal types can be produced, this does the necessary		/// If only legal types can be produced, this does the necessary
/// transformations (e.g., if the vector element type is illegal).		/// transformations (e.g., if the vector element type is illegal).
if (Op.getOpcode() == ISD::UNDEF) {
VT.getVectorElementType().bitsLE(Op.getValueType()))) &&		VT.getVectorElementType().bitsLE(Op.getValueType()))) &&
"A splatted value must have a width equal or (for integers) "		"A splatted value must have a width equal or (for integers) "
"greater than the vector element type!");		"greater than the vector element type!");
return getNode(ISD::UNDEF, SDLoc(), VT);		return getNode(ISD::UNDEF, SDLoc(), VT);
}		}
return getNode(ISD::SPLAT_VECTOR, DL, VT, Op);		return getNode(ISD::SPLAT_VECTOR, DL, VT, Op);
}		}

		/// Returns a vector of type ResVT whose elements contain the linear sequence
		paulwalker-arm (Not Done): This is no longer true, perhaps just `Returns a vector of type...`
		/// <0, Step, Step * 2, Step * 3, ...>
		SDValue getStepVector(const SDLoc &DL, EVT ResVT, SDValue Step);

/// Returns an ISD::VECTOR_SHUFFLE node semantically equivalent to		/// Returns an ISD::VECTOR_SHUFFLE node semantically equivalent to
/// the shuffle node in input but with swapped operands.		/// the shuffle node in input but with swapped operands.
///		///
/// Example: shuffle A, B, <0,5,2,7> -> shuffle B, A, <4,1,6,3>		/// Example: shuffle A, B, <0,5,2,7> -> shuffle B, A, <4,1,6,3>
SDValue getCommutedVectorShuffle(const ShuffleVectorSDNode &SV);		SDValue getCommutedVectorShuffle(const ShuffleVectorSDNode &SV);

/// Convert Op, which must be of float type, to the		/// Convert Op, which must be of float type, to the
/// float type VT, by either extending or rounding (by truncation).		/// float type VT, by either extending or rounding (by truncation).

llvm/include/llvm/IR/IRBuilder.h

CallInst *CreateGCRelocate(Instruction *Statepoint,
int DerivedOffset,		int DerivedOffset,
Type *ResultType,		Type *ResultType,
const Twine &Name = "");		const Twine &Name = "");

/// Create a call to llvm.vscale, multiplied by \p Scaling. The type of VScale		/// Create a call to llvm.vscale, multiplied by \p Scaling. The type of VScale
/// will be the same type as that of \p Scaling.		/// will be the same type as that of \p Scaling.
Value *CreateVScale(Constant *Scaling, const Twine &Name = "");		Value *CreateVScale(Constant *Scaling, const Twine &Name = "");

		/// Creates a vector of type \p DstType with the linear sequence <0, 1, ...>
		Value *CreateStepVector(Type *DstType, const Twine &Name = "");
		Lint (pre-merge checks): clang-tidy: warning: invalid case style for function 'CreateStepVector' [readability-identifier-naming]

/// Create a call to intrinsic \p ID with 1 operand which is mangled on its		/// Create a call to intrinsic \p ID with 1 operand which is mangled on its
/// type.		/// type.
CallInst *CreateUnaryIntrinsic(Intrinsic::ID ID, Value *V,		CallInst *CreateUnaryIntrinsic(Intrinsic::ID ID, Value *V,
Instruction *FMFSource = nullptr,		Instruction *FMFSource = nullptr,
const Twine &Name = "");		const Twine &Name = "");

/// Create a call to intrinsic \p ID with 2 operands which is mangled on the		/// Create a call to intrinsic \p ID with 2 operands which is mangled on the
/// first type.		/// first type.

llvm/include/llvm/IR/Intrinsics.td

	def int_is_constant : DefaultAttrsIntrinsic<[llvm_i1_ty], [llvm_any_ty],			def int_is_constant : DefaultAttrsIntrinsic<[llvm_i1_ty], [llvm_any_ty],
	[IntrNoMem, IntrWillReturn, IntrConvergent],			[IntrNoMem, IntrWillReturn, IntrConvergent],
	"llvm.is.constant">;			"llvm.is.constant">;

	// Intrinsic to mask out bits of a pointer.			// Intrinsic to mask out bits of a pointer.
	def int_ptrmask: DefaultAttrsIntrinsic<[llvm_anyptr_ty], [LLVMMatchType<0>, llvm_anyint_ty],			def int_ptrmask: DefaultAttrsIntrinsic<[llvm_anyptr_ty], [LLVMMatchType<0>, llvm_anyint_ty],
	[IntrNoMem, IntrSpeculatable, IntrWillReturn]>;			[IntrNoMem, IntrSpeculatable, IntrWillReturn]>;

				def int_experimental_stepvector : DefaultAttrsIntrinsic<[llvm_anyvector_ty],
				[], [IntrNoMem]>;

	//===---------------- Vector Predication Intrinsics --------------===//			//===---------------- Vector Predication Intrinsics --------------===//

	// Speculatable Binary operators			// Speculatable Binary operators
	let IntrProperties = [IntrSpeculatable, IntrNoMem, IntrNoSync, IntrWillReturn] in {			let IntrProperties = [IntrSpeculatable, IntrNoMem, IntrNoSync, IntrWillReturn] in {
	def int_vp_add : DefaultAttrsIntrinsic<[ llvm_anyvector_ty ],			def int_vp_add : DefaultAttrsIntrinsic<[ llvm_anyvector_ty ],
	[ LLVMMatchType<0>,			[ LLVMMatchType<0>,
	LLVMMatchType<0>,			LLVMMatchType<0>,
	LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,			LLVMScalarOrSameVectorWidth<0, llvm_i1_ty>,

llvm/include/llvm/Target/TargetSelectionDAG.td

	def ist : SDNode<"ISD::STORE" , SDTIStore,			def ist : SDNode<"ISD::STORE" , SDTIStore,
	[SDNPHasChain, SDNPMayStore, SDNPMemOperand]>;			[SDNPHasChain, SDNPMayStore, SDNPMemOperand]>;

	def vector_shuffle : SDNode<"ISD::VECTOR_SHUFFLE", SDTVecShuffle, []>;			def vector_shuffle : SDNode<"ISD::VECTOR_SHUFFLE", SDTVecShuffle, []>;
	def vector_reverse : SDNode<"ISD::VECTOR_REVERSE", SDTVecReverse>;			def vector_reverse : SDNode<"ISD::VECTOR_REVERSE", SDTVecReverse>;
	def vector_splice : SDNode<"ISD::VECTOR_SPLICE", SDTVecSlice, []>;			def vector_splice : SDNode<"ISD::VECTOR_SPLICE", SDTVecSlice, []>;
	def build_vector : SDNode<"ISD::BUILD_VECTOR", SDTypeProfile<1, -1, []>, []>;			def build_vector : SDNode<"ISD::BUILD_VECTOR", SDTypeProfile<1, -1, []>, []>;
	def splat_vector : SDNode<"ISD::SPLAT_VECTOR", SDTypeProfile<1, 1, []>, []>;			def splat_vector : SDNode<"ISD::SPLAT_VECTOR", SDTypeProfile<1, 1, []>, []>;
				def step_vector : SDNode<"ISD::STEP_VECTOR", SDTypeProfile<1, 1,
				[SDTCisVec<0>, SDTCisInt<1>]>, []>;
	def scalar_to_vector : SDNode<"ISD::SCALAR_TO_VECTOR", SDTypeProfile<1, 1, []>,			def scalar_to_vector : SDNode<"ISD::SCALAR_TO_VECTOR", SDTypeProfile<1, 1, []>,
	[]>;			[]>;

	// vector_extract/vector_insert are deprecated. extractelt/insertelt			// vector_extract/vector_insert are deprecated. extractelt/insertelt
	// are preferred.			// are preferred.
	def vector_extract : SDNode<"ISD::EXTRACT_VECTOR_ELT",			def vector_extract : SDNode<"ISD::EXTRACT_VECTOR_ELT",
	SDTypeProfile<1, 2, [SDTCisPtrTy<2>]>, []>;			SDTypeProfile<1, 2, [SDTCisPtrTy<2>]>, []>;
	def vector_insert : SDNode<"ISD::INSERT_VECTOR_ELT",			def vector_insert : SDNode<"ISD::INSERT_VECTOR_ELT",

llvm/lib/CodeGen/SelectionDAG/LegalizeIntegerTypes.cpp

#endif
case ISD::INSERT_VECTOR_ELT:		case ISD::INSERT_VECTOR_ELT:
Res = PromoteIntRes_INSERT_VECTOR_ELT(N); break;		Res = PromoteIntRes_INSERT_VECTOR_ELT(N); break;
case ISD::BUILD_VECTOR:		case ISD::BUILD_VECTOR:
Res = PromoteIntRes_BUILD_VECTOR(N); break;		Res = PromoteIntRes_BUILD_VECTOR(N); break;
case ISD::SCALAR_TO_VECTOR:		case ISD::SCALAR_TO_VECTOR:
Res = PromoteIntRes_SCALAR_TO_VECTOR(N); break;		Res = PromoteIntRes_SCALAR_TO_VECTOR(N); break;
case ISD::SPLAT_VECTOR:		case ISD::SPLAT_VECTOR:
Res = PromoteIntRes_SPLAT_VECTOR(N); break;		Res = PromoteIntRes_SPLAT_VECTOR(N); break;
		case ISD::STEP_VECTOR: Res = PromoteIntRes_STEP_VECTOR(N); break;
		Lint (pre-merge checks): clang-format: please reformat the code. `case ISD::STEP_VECTOR: Res = PromoteIntRes_STEP_VECTOR(N); break;` should be split across three lines: `case ISD::STEP_VECTOR:`, `Res = PromoteIntRes_STEP_VECTOR(N);`, `break;`.
case ISD::CONCAT_VECTORS:		case ISD::CONCAT_VECTORS:
Res = PromoteIntRes_CONCAT_VECTORS(N); break;		Res = PromoteIntRes_CONCAT_VECTORS(N); break;

case ISD::ANY_EXTEND_VECTOR_INREG:		case ISD::ANY_EXTEND_VECTOR_INREG:
case ISD::SIGN_EXTEND_VECTOR_INREG:		case ISD::SIGN_EXTEND_VECTOR_INREG:
case ISD::ZERO_EXTEND_VECTOR_INREG:		case ISD::ZERO_EXTEND_VECTOR_INREG:
Res = PromoteIntRes_EXTEND_VECTOR_INREG(N); break;		Res = PromoteIntRes_EXTEND_VECTOR_INREG(N); break;

SDValue DAGTypeLegalizer::PromoteIntRes_SPLAT_VECTOR(SDNode *N) {
assert(NOutVT.isVector() && "Type must be promoted to a vector type");		assert(NOutVT.isVector() && "Type must be promoted to a vector type");
EVT NOutElemVT = NOutVT.getVectorElementType();		EVT NOutElemVT = NOutVT.getVectorElementType();

SDValue Op = DAG.getNode(ISD::ANY_EXTEND, dl, NOutElemVT, SplatVal);		SDValue Op = DAG.getNode(ISD::ANY_EXTEND, dl, NOutElemVT, SplatVal);

return DAG.getNode(ISD::SPLAT_VECTOR, dl, NOutVT, Op);		return DAG.getNode(ISD::SPLAT_VECTOR, dl, NOutVT, Op);
}		}

		SDValue DAGTypeLegalizer::PromoteIntRes_STEP_VECTOR(SDNode *N) {
		Lint (pre-merge checks): clang-tidy: warning: invalid case style for function 'PromoteIntRes_STEP_VECTOR' [readability-identifier-naming]
		SDLoc dl(N);
		Lint (pre-merge checks): clang-tidy: warning: invalid case style for variable 'dl' [readability-identifier-naming]
		EVT OutVT = N->getValueType(0);
		EVT NOutVT = TLI.getTypeToTransformTo(*DAG.getContext(), OutVT);
		assert(NOutVT.isVector() && "Type must be promoted to a vector type");
		EVT NOutElemVT = TLI.getTypeToTransformTo(*DAG.getContext(),
		NOutVT.getVectorElementType());
		APInt StepVal = cast<ConstantSDNode>(N->getOperand(0))->getAPIntValue();
		paulwalker-arm (Not Done): Could use `N->getConstantOperandAPInt(0)` here.
		SDValue Step = DAG.getConstant(StepVal.getZExtValue(), dl, NOutElemVT);
		return DAG.getStepVector(dl, NOutVT, Step);
		}
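The promotion above rebuilds the step constant from `StepVal.getZExtValue()`, and the reviewer notes that zero extension is what the node's restrictions require. A small standalone sketch of why the kind of extension matters when widening a step constant; `zeroExtend` and `signExtend` are illustrative helpers, not LLVM APIs.

```cpp
#include <cassert>
#include <cstdint>

// Keep the low FromBits bits of Value and clear everything above them.
uint64_t zeroExtend(uint64_t Value, unsigned FromBits) {
  assert(FromBits >= 1 && FromBits <= 64 && "unsupported bit width");
  if (FromBits == 64)
    return Value;
  return Value & ((uint64_t{1} << FromBits) - 1);
}

// Interpret the low FromBits bits of Value as a two's complement number.
int64_t signExtend(uint64_t Value, unsigned FromBits) {
  assert(FromBits >= 1 && FromBits <= 64 && "unsupported bit width");
  if (FromBits == 64)
    return static_cast<int64_t>(Value);
  const uint64_t Low = Value & ((uint64_t{1} << FromBits) - 1);
  const uint64_t SignBit = uint64_t{1} << (FromBits - 1);
  // Flipping and subtracting the sign bit propagates it into the upper bits.
  return static_cast<int64_t>((Low ^ SignBit) - SignBit);
}
```

An i8 step of 200 stays 200 under zero extension but would become -56 under sign extension, which would no longer be a positive step.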

SDValue DAGTypeLegalizer::PromoteIntRes_CONCAT_VECTORS(SDNode *N) {		SDValue DAGTypeLegalizer::PromoteIntRes_CONCAT_VECTORS(SDNode *N) {
SDLoc dl(N);		SDLoc dl(N);

		paulwalker-arm (Done): Given that we mandate the Step must be an immediate, I'm wondering if DAG should have a getStepVector function that takes such. I say this because, for example, what are you expecting to happen if `DAG.getNode(ISD::ANY_EXTEND` did not get folded away? As an aside, do you really not care about the type of extension? I think you absolutely need zero extension based on the current node restrictions.
EVT OutVT = N->getValueType(0);		EVT OutVT = N->getValueType(0);
EVT NOutVT = TLI.getTypeToTransformTo(*DAG.getContext(), OutVT);		EVT NOutVT = TLI.getTypeToTransformTo(*DAG.getContext(), OutVT);
assert(NOutVT.isVector() && "This type must be promoted to a vector type");		assert(NOutVT.isVector() && "This type must be promoted to a vector type");

EVT OutElemTy = NOutVT.getVectorElementType();		EVT OutElemTy = NOutVT.getVectorElementType();

unsigned NumElem = N->getOperand(0).getValueType().getVectorNumElements();		unsigned NumElem = N->getOperand(0).getValueType().getVectorNumElements();
unsigned NumOutElem = NOutVT.getVectorNumElements();		unsigned NumOutElem = NOutVT.getVectorNumElements();

llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h

private:
SDValue PromoteIntRes_AtomicCmpSwap(AtomicSDNode *N, unsigned ResNo);		SDValue PromoteIntRes_AtomicCmpSwap(AtomicSDNode *N, unsigned ResNo);
SDValue PromoteIntRes_EXTRACT_SUBVECTOR(SDNode *N);		SDValue PromoteIntRes_EXTRACT_SUBVECTOR(SDNode *N);
SDValue PromoteIntRes_VECTOR_REVERSE(SDNode *N);		SDValue PromoteIntRes_VECTOR_REVERSE(SDNode *N);
SDValue PromoteIntRes_VECTOR_SHUFFLE(SDNode *N);		SDValue PromoteIntRes_VECTOR_SHUFFLE(SDNode *N);
SDValue PromoteIntRes_VECTOR_SPLICE(SDNode *N);		SDValue PromoteIntRes_VECTOR_SPLICE(SDNode *N);
SDValue PromoteIntRes_BUILD_VECTOR(SDNode *N);		SDValue PromoteIntRes_BUILD_VECTOR(SDNode *N);
SDValue PromoteIntRes_SCALAR_TO_VECTOR(SDNode *N);		SDValue PromoteIntRes_SCALAR_TO_VECTOR(SDNode *N);
SDValue PromoteIntRes_SPLAT_VECTOR(SDNode *N);		SDValue PromoteIntRes_SPLAT_VECTOR(SDNode *N);
		SDValue PromoteIntRes_STEP_VECTOR(SDNode *N);
		Lint (pre-merge checks): clang-tidy: warning: invalid case style for function 'PromoteIntRes_STEP_VECTOR' [readability-identifier-naming]
SDValue PromoteIntRes_EXTEND_VECTOR_INREG(SDNode *N);		SDValue PromoteIntRes_EXTEND_VECTOR_INREG(SDNode *N);
SDValue PromoteIntRes_INSERT_VECTOR_ELT(SDNode *N);		SDValue PromoteIntRes_INSERT_VECTOR_ELT(SDNode *N);
SDValue PromoteIntRes_CONCAT_VECTORS(SDNode *N);		SDValue PromoteIntRes_CONCAT_VECTORS(SDNode *N);
SDValue PromoteIntRes_BITCAST(SDNode *N);		SDValue PromoteIntRes_BITCAST(SDNode *N);
SDValue PromoteIntRes_BSWAP(SDNode *N);		SDValue PromoteIntRes_BSWAP(SDNode *N);
SDValue PromoteIntRes_BITREVERSE(SDNode *N);		SDValue PromoteIntRes_BITREVERSE(SDNode *N);
SDValue PromoteIntRes_BUILD_PAIR(SDNode *N);		SDValue PromoteIntRes_BUILD_PAIR(SDNode *N);
SDValue PromoteIntRes_Constant(SDNode *N);		SDValue PromoteIntRes_Constant(SDNode *N);
private:
void SplitVecRes_INSERT_SUBVECTOR(SDNode *N, SDValue &Lo, SDValue &Hi);		void SplitVecRes_INSERT_SUBVECTOR(SDNode *N, SDValue &Lo, SDValue &Hi);
void SplitVecRes_FPOWI(SDNode *N, SDValue &Lo, SDValue &Hi);		void SplitVecRes_FPOWI(SDNode *N, SDValue &Lo, SDValue &Hi);
void SplitVecRes_FCOPYSIGN(SDNode *N, SDValue &Lo, SDValue &Hi);		void SplitVecRes_FCOPYSIGN(SDNode *N, SDValue &Lo, SDValue &Hi);
void SplitVecRes_INSERT_VECTOR_ELT(SDNode *N, SDValue &Lo, SDValue &Hi);		void SplitVecRes_INSERT_VECTOR_ELT(SDNode *N, SDValue &Lo, SDValue &Hi);
void SplitVecRes_LOAD(LoadSDNode *LD, SDValue &Lo, SDValue &Hi);		void SplitVecRes_LOAD(LoadSDNode *LD, SDValue &Lo, SDValue &Hi);
void SplitVecRes_MLOAD(MaskedLoadSDNode *MLD, SDValue &Lo, SDValue &Hi);		void SplitVecRes_MLOAD(MaskedLoadSDNode *MLD, SDValue &Lo, SDValue &Hi);
void SplitVecRes_MGATHER(MaskedGatherSDNode *MGT, SDValue &Lo, SDValue &Hi);		void SplitVecRes_MGATHER(MaskedGatherSDNode *MGT, SDValue &Lo, SDValue &Hi);
void SplitVecRes_ScalarOp(SDNode *N, SDValue &Lo, SDValue &Hi);		void SplitVecRes_ScalarOp(SDNode *N, SDValue &Lo, SDValue &Hi);
		void SplitVecRes_STEP_VECTOR(SDNode *N, SDValue &Lo, SDValue &Hi);
		Lint (pre-merge checks): clang-tidy: warning: invalid case style for function 'SplitVecRes_STEP_VECTOR' [readability-identifier-naming]
void SplitVecRes_SETCC(SDNode *N, SDValue &Lo, SDValue &Hi);		void SplitVecRes_SETCC(SDNode *N, SDValue &Lo, SDValue &Hi);
void SplitVecRes_VECTOR_REVERSE(SDNode *N, SDValue &Lo, SDValue &Hi);		void SplitVecRes_VECTOR_REVERSE(SDNode *N, SDValue &Lo, SDValue &Hi);
void SplitVecRes_VECTOR_SHUFFLE(ShuffleVectorSDNode *N, SDValue &Lo,		void SplitVecRes_VECTOR_SHUFFLE(ShuffleVectorSDNode *N, SDValue &Lo,
SDValue &Hi);		SDValue &Hi);
void SplitVecRes_VECTOR_SPLICE(SDNode *N, SDValue &Lo, SDValue &Hi);		void SplitVecRes_VECTOR_SPLICE(SDNode *N, SDValue &Lo, SDValue &Hi);
void SplitVecRes_VAARG(SDNode *N, SDValue &Lo, SDValue &Hi);		void SplitVecRes_VAARG(SDNode *N, SDValue &Lo, SDValue &Hi);
void SplitVecRes_FP_TO_XINT_SAT(SDNode *N, SDValue &Lo, SDValue &Hi);		void SplitVecRes_FP_TO_XINT_SAT(SDNode *N, SDValue &Lo, SDValue &Hi);


llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp

#endif
case ISD::INSERT_SUBVECTOR: SplitVecRes_INSERT_SUBVECTOR(N, Lo, Hi); break;		case ISD::INSERT_SUBVECTOR: SplitVecRes_INSERT_SUBVECTOR(N, Lo, Hi); break;
case ISD::FPOWI: SplitVecRes_FPOWI(N, Lo, Hi); break;		case ISD::FPOWI: SplitVecRes_FPOWI(N, Lo, Hi); break;
case ISD::FCOPYSIGN: SplitVecRes_FCOPYSIGN(N, Lo, Hi); break;		case ISD::FCOPYSIGN: SplitVecRes_FCOPYSIGN(N, Lo, Hi); break;
case ISD::INSERT_VECTOR_ELT: SplitVecRes_INSERT_VECTOR_ELT(N, Lo, Hi); break;		case ISD::INSERT_VECTOR_ELT: SplitVecRes_INSERT_VECTOR_ELT(N, Lo, Hi); break;
case ISD::SPLAT_VECTOR:		case ISD::SPLAT_VECTOR:
case ISD::SCALAR_TO_VECTOR:		case ISD::SCALAR_TO_VECTOR:
SplitVecRes_ScalarOp(N, Lo, Hi);		SplitVecRes_ScalarOp(N, Lo, Hi);
break;		break;
		case ISD::STEP_VECTOR:
		SplitVecRes_STEP_VECTOR(N, Lo, Hi);
		break;
case ISD::SIGN_EXTEND_INREG: SplitVecRes_InregOp(N, Lo, Hi); break;		case ISD::SIGN_EXTEND_INREG: SplitVecRes_InregOp(N, Lo, Hi); break;
case ISD::LOAD:		case ISD::LOAD:
SplitVecRes_LOAD(cast<LoadSDNode>(N), Lo, Hi);		SplitVecRes_LOAD(cast<LoadSDNode>(N), Lo, Hi);
break;		break;
case ISD::MLOAD:		case ISD::MLOAD:
SplitVecRes_MLOAD(cast<MaskedLoadSDNode>(N), Lo, Hi);		SplitVecRes_MLOAD(cast<MaskedLoadSDNode>(N), Lo, Hi);
break;		break;
case ISD::MGATHER:		case ISD::MGATHER:
void DAGTypeLegalizer::SplitVecRes_INSERT_VECTOR_ELT(SDNode *N, SDValue &Lo,
// If we adjusted the original type, we need to truncate the results.		// If we adjusted the original type, we need to truncate the results.
std::tie(LoVT, HiVT) = DAG.GetSplitDestVTs(N->getValueType(0));		std::tie(LoVT, HiVT) = DAG.GetSplitDestVTs(N->getValueType(0));
if (LoVT != Lo.getValueType())		if (LoVT != Lo.getValueType())
Lo = DAG.getNode(ISD::TRUNCATE, dl, LoVT, Lo);		Lo = DAG.getNode(ISD::TRUNCATE, dl, LoVT, Lo);
if (HiVT != Hi.getValueType())		if (HiVT != Hi.getValueType())
Hi = DAG.getNode(ISD::TRUNCATE, dl, HiVT, Hi);		Hi = DAG.getNode(ISD::TRUNCATE, dl, HiVT, Hi);
}		}

		void DAGTypeLegalizer::SplitVecRes_STEP_VECTOR(SDNode *N, SDValue &Lo,
		Lint (pre-merge checks): clang-tidy: warning: invalid case style for function 'SplitVecRes_STEP_VECTOR' [readability-identifier-naming]
		SDValue &Hi) {
		EVT LoVT, HiVT;
		SDLoc dl(N);
		Lint (pre-merge checks): clang-tidy: warning: invalid case style for variable 'dl' [readability-identifier-naming]
		assert(N->getValueType(0).isScalableVector() &&
		kmclaughlin (Done): nit: can you add an error message to this assert?
		"Only scalable vectors are supported for STEP_VECTOR");
		std::tie(LoVT, HiVT) = DAG.GetSplitDestVTs(N->getValueType(0));
		SDValue Step = N->getOperand(0);

		Lo = DAG.getNode(ISD::STEP_VECTOR, dl, LoVT, Step);

		// Hi = Lo + (EltCnt * Step)
		EVT EltVT = Step.getValueType();
		SDValue StartOfHi =
		DAG.getVScale(dl, EltVT,
		paulwalker-arm (Done): I guess there's no action required for this patch as fixed length vectors are currently excluded; however, if this idiom becomes common then I suggest creating a `DAG.getElementCountAsNode(EVT)`-like function. That way code like this will just work for both fixed and scalable vectors.
		bin.cheng-ali (Not Done): My question is whether it should become common. I mean, if fixed length vectors can be handled more efficiently, shall we always lower and handle fixed/scalable length differently, rather than handle them simultaneously in more and more functions, which has no obvious benefit?
		paulwalker-arm (Not Done): I agree and hope we evolve to a more unified set of nodes that work for all vector types. That said, this is a larger conversation, so today we're trying to add scalable vector support in a way that is NFC for fixed length vectors. This patch does add a unified interface in the form of `SelectionDAG::getStepVector(...)` that should allow for portable code regardless of vector type. SplitVecRes_STEP_VECTOR is specifically linked to `ISD::STEP_VECTOR`, which can only be created for scalable vector types. So whilst this function can be easily extended to cover fixed length vector types, it would not be testable in a way that preserved this "NFC for fixed length vectors" goal.
		cast<ConstantSDNode>(Step)->getAPIntValue() *
		LoVT.getVectorMinNumElements());
		StartOfHi = DAG.getZExtOrTrunc(StartOfHi, dl, HiVT.getVectorElementType());
		StartOfHi = DAG.getNode(ISD::SPLAT_VECTOR, dl, HiVT, StartOfHi);
		paulwalker-arm (Done): The Step is an immediate, as is the known part of LoVT's element count, which suggests you shouldn't need to use DAG for the maths because you can do `getVScale(N->getConstantOperandAPInt(0) * LoVT.getVectorMinNumElements())`.

		Hi = DAG.getNode(ISD::STEP_VECTOR, dl, HiVT, Step);
		Hi = DAG.getNode(ISD::ADD, dl, HiVT, Hi, StartOfHi);
		}
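The split above follows the comment `Hi = Lo + (EltCnt * Step)`: both halves are step vectors, and the high half is offset by a splat of `vscale * Step * LoMinElts`. A minimal standalone sketch of that arithmetic, assuming equal-sized halves; `stepLanes`, `SplitResult` and `splitStepVector` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Lanes <0, Step, 2 * Step, ...> for a known runtime lane count.
std::vector<uint64_t> stepLanes(uint64_t NumLanes, uint64_t Step) {
  std::vector<uint64_t> Lanes;
  for (uint64_t I = 0; I < NumLanes; ++I)
    Lanes.push_back(I * Step);
  return Lanes;
}

struct SplitResult {
  std::vector<uint64_t> Lo, Hi;
};

// Model of splitting a <vscale x (2 * LoMinElts)> step vector in two.
SplitResult splitStepVector(unsigned VScale, unsigned LoMinElts,
                            uint64_t Step) {
  assert(VScale > 0 && LoMinElts > 0 && "each half needs at least one lane");
  const uint64_t HalfLanes = static_cast<uint64_t>(VScale) * LoMinElts;
  SplitResult R;
  R.Lo = stepLanes(HalfLanes, Step);
  // StartOfHi = vscale * (Step * LoMinElts), added to every lane of Hi.
  const uint64_t StartOfHi = static_cast<uint64_t>(VScale) * (Step * LoMinElts);
  R.Hi = stepLanes(HalfLanes, Step);
  for (uint64_t &Lane : R.Hi)
    Lane += StartOfHi;
  return R;
}
```

Concatenating `Lo` and `Hi` reproduces the unsplit sequence: the first lane of `Hi` is exactly one step past the last lane of `Lo`.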

void DAGTypeLegalizer::SplitVecRes_ScalarOp(SDNode *N, SDValue &Lo,		void DAGTypeLegalizer::SplitVecRes_ScalarOp(SDNode *N, SDValue &Lo,
SDValue &Hi) {		SDValue &Hi) {
EVT LoVT, HiVT;		EVT LoVT, HiVT;
SDLoc dl(N);		SDLoc dl(N);
std::tie(LoVT, HiVT) = DAG.GetSplitDestVTs(N->getValueType(0));		std::tie(LoVT, HiVT) = DAG.GetSplitDestVTs(N->getValueType(0));
Lo = DAG.getNode(N->getOpcode(), dl, LoVT, N->getOperand(0));		Lo = DAG.getNode(N->getOpcode(), dl, LoVT, N->getOperand(0));
if (N->getOpcode() == ISD::SCALAR_TO_VECTOR) {		if (N->getOpcode() == ISD::SCALAR_TO_VECTOR) {
Hi = DAG.getUNDEF(HiVT);		Hi = DAG.getUNDEF(HiVT);

llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp


if (!CondCodeNodes[Cond]) {
auto *N = newSDNode<CondCodeSDNode>(Cond);		auto *N = newSDNode<CondCodeSDNode>(Cond);
CondCodeNodes[Cond] = N;		CondCodeNodes[Cond] = N;
InsertNode(N);		InsertNode(N);
}		}

return SDValue(CondCodeNodes[Cond], 0);		return SDValue(CondCodeNodes[Cond], 0);
}		}

		SDValue SelectionDAG::getStepVector(const SDLoc &DL, EVT ResVT, SDValue Step) {
		if (ResVT.isScalableVector())
		return getNode(ISD::STEP_VECTOR, DL, ResVT, Step);

		EVT OpVT = Step.getValueType();
		APInt StepVal = cast<ConstantSDNode>(Step)->getAPIntValue();
		SmallVector<SDValue, 16> OpsStepConstants;
		for (uint64_t i = 0; i < ResVT.getVectorNumElements(); i++)
		Lint (pre-merge checks): clang-tidy: warning: invalid case style for variable 'i' [readability-identifier-naming]
		paulwalker-arm (Not Done): Given this is fixed length specific I think calling `getVectorNumElements()` is clearer.
		OpsStepConstants.push_back(getConstant(StepVal * i, DL, OpVT));
		return getBuildVector(ResVT, DL, OpsStepConstants);
		}
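For fixed-width results, `getStepVector` above emits one constant per lane and builds a vector from them. The sketch below models that path outside LLVM, assuming the per-lane product wraps at the operand's bit width as fixed-width integer arithmetic does; `truncToBits` and `fixedStepVector` are hypothetical helpers.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Keep only the low Bits bits of Value, mimicking a fixed-width integer.
uint64_t truncToBits(uint64_t Value, unsigned Bits) {
  assert(Bits >= 1 && Bits <= 64 && "unsupported bit width");
  return Bits == 64 ? Value : (Value & ((uint64_t{1} << Bits) - 1));
}

// Model of the fixed-width path: lane I gets the constant Step * I.
std::vector<uint64_t> fixedStepVector(unsigned NumElts, uint64_t Step,
                                      unsigned Bits) {
  std::vector<uint64_t> Ops;
  Ops.reserve(NumElts);
  for (uint64_t I = 0; I < NumElts; ++I)
    Ops.push_back(truncToBits(Step * I, Bits));
  return Ops;
}
```

For a four-lane vector with a 32-bit step of 2 this gives 0, 2, 4, 6. With an 8-bit step of 100 the last lane wraps to 44, which illustrates why the node's documentation treats out-of-range lanes as undefined rather than relying on any particular wrapped value.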

/// Swaps the values of N1 and N2. Swaps all indices in the shuffle mask M that		/// Swaps the values of N1 and N2. Swaps all indices in the shuffle mask M that
/// point at N1 to point at N2 and indices that point at N2 to point at N1.		/// point at N1 to point at N2 and indices that point at N2 to point at N1.
static void commuteShuffle(SDValue &N1, SDValue &N2, MutableArrayRef<int> M) {		static void commuteShuffle(SDValue &N1, SDValue &N2, MutableArrayRef<int> M) {
std::swap(N1, N2);		std::swap(N1, N2);
ShuffleVectorSDNode::commuteMask(M);		ShuffleVectorSDNode::commuteMask(M);
}		}

SDValue SelectionDAG::getVectorShuffle(EVT VT, const SDLoc &dl, SDValue N1,		SDValue SelectionDAG::getVectorShuffle(EVT VT, const SDLoc &dl, SDValue N1,
// FIXME: unify with llvm::haveNoCommonBitsSet.		// FIXME: unify with llvm::haveNoCommonBitsSet.
// FIXME: could also handle masked merge pattern (X & ~M) op (Y & M)		// FIXME: could also handle masked merge pattern (X & ~M) op (Y & M)
bool SelectionDAG::haveNoCommonBitsSet(SDValue A, SDValue B) const {		bool SelectionDAG::haveNoCommonBitsSet(SDValue A, SDValue B) const {
assert(A.getValueType() == B.getValueType() &&		assert(A.getValueType() == B.getValueType() &&
"Values must have the same type");		"Values must have the same type");
return (computeKnownBits(A).Zero | computeKnownBits(B).Zero).isAllOnesValue();		return (computeKnownBits(A).Zero | computeKnownBits(B).Zero).isAllOnesValue();
}		}

		static SDValue FoldSTEP_VECTOR(const SDLoc &DL, EVT VT, SDValue Step,
		SelectionDAG &DAG) {
		if (cast<ConstantSDNode>(Step)->isNullValue())
		return DAG.getConstant(0, DL, VT);
		paulwalker-arm (Not Done): Comment is no longer valid. I'd just remove it.

		paulwalker-arm (Done): `if (cast<ConstantSDNode>(Step)->isNullValue())` ?
		paulwalker-arm (Done): I think `DAG.getConstant(0...` is better here so that the target's preferred nodes (i.e. SPLAT_VECTOR or BUILD_VECTOR) are used.
		return SDValue();
		}

static SDValue FoldBUILD_VECTOR(const SDLoc &DL, EVT VT,		static SDValue FoldBUILD_VECTOR(const SDLoc &DL, EVT VT,
ArrayRef<SDValue> Ops,		ArrayRef<SDValue> Ops,
SelectionDAG &DAG) {		SelectionDAG &DAG) {
int NumOps = Ops.size();		int NumOps = Ops.size();
assert(NumOps != 0 && "Can't build an empty vector!");		assert(NumOps != 0 && "Can't build an empty vector!");
assert(!VT.isScalableVector() &&		assert(!VT.isScalableVector() &&
"BUILD_VECTOR cannot be used with scalable types");		"BUILD_VECTOR cannot be used with scalable types");
assert(VT.getVectorNumElements() == (unsigned)NumOps &&		assert(VT.getVectorNumElements() == (unsigned)NumOps &&
for (int i = 0; i != NumOps; ++i) {		for (int i = 0; i != NumOps; ++i) {
if (Ops[i].getOpcode() != ISD::EXTRACT_VECTOR_ELT ||		if (Ops[i].getOpcode() != ISD::EXTRACT_VECTOR_ELT ||
Ops[i].getOperand(0).getValueType() != VT ||		Ops[i].getOperand(0).getValueType() != VT ||
(IdentitySrc && Ops[i].getOperand(0) != IdentitySrc) ||		(IdentitySrc && Ops[i].getOperand(0) != IdentitySrc) ||
!isa<ConstantSDNode>(Ops[i].getOperand(1)) ||		!isa<ConstantSDNode>(Ops[i].getOperand(1)) ||
cast<ConstantSDNode>(Ops[i].getOperand(1))->getAPIntValue() != i) {		cast<ConstantSDNode>(Ops[i].getOperand(1))->getAPIntValue() != i) {
IsIdentity = false;		IsIdentity = false;
break;		break;
}		}
		Lint (pre-merge checks): clang-tidy: warning: invalid case style for function 'FoldSTEP_VECTOR' [readability-identifier-naming]
IdentitySrc = Ops[i].getOperand(0);		IdentitySrc = Ops[i].getOperand(0);
}		}
if (IsIdentity)		if (IsIdentity)
return IdentitySrc;		return IdentitySrc;

return SDValue();		return SDValue();
}		}

case ISD::FP16_TO_FP: {
(Val.getBitWidth() == 16) ? Val : Val.trunc(16));		(Val.getBitWidth() == 16) ? Val : Val.trunc(16));

// This can return overflow, underflow, or inexact; we don't care.		// This can return overflow, underflow, or inexact; we don't care.
// FIXME need to be more flexible about rounding mode.		// FIXME need to be more flexible about rounding mode.
(void)FPV.convert(EVTToAPFloatSemantics(VT),		(void)FPV.convert(EVTToAPFloatSemantics(VT),
APFloat::rmNearestTiesToEven, &Ignored);		APFloat::rmNearestTiesToEven, &Ignored);
return getConstantFP(FPV, DL, VT);		return getConstantFP(FPV, DL, VT);
}		}
		case ISD::STEP_VECTOR: {
		if (SDValue V = FoldSTEP_VECTOR(DL, VT, Operand, *this))
		return V;
		break;
		}
}		}
}		}

// Constant fold unary operations with a floating point constant operand.		// Constant fold unary operations with a floating point constant operand.
if (ConstantFPSDNode *C = dyn_cast<ConstantFPSDNode>(Operand)) {		if (ConstantFPSDNode *C = dyn_cast<ConstantFPSDNode>(Operand)) {
APFloat V = C->getValueAPF(); // make copy		APFloat V = C->getValueAPF(); // make copy
switch (Opcode) {		switch (Opcode) {
case ISD::FNEG:		case ISD::FNEG:
▲ Show 20 Lines • Show All 93 Lines • ▼ Show 20 Lines	if (BV->isConstant()) {
return Fold;		return Fold;
}		}
}		}
}		}
}		}

unsigned OpOpcode = Operand.getNode()->getOpcode();		unsigned OpOpcode = Operand.getNode()->getOpcode();
switch (Opcode) {		switch (Opcode) {
		case ISD::STEP_VECTOR:
		assert(VT.isScalableVector() &&
		"STEP_VECTOR can only be used with scalable types");
		assert(VT.getScalarSizeInBits() >= 8 &&
		paulwalker-arm (Not Done): To match the comment you'll need to add an `isInteger` test.
		"STEP_VECTOR can only be used with vectors of integers that are at "
		paulwalker-arm (Done): Is it worth also validating that Operand is at least as big as the result element type? That way, if the "not-negative" requirement ever gets dropped, I think the signedness will not actually matter.
		"least 8 bits wide");
		assert(Operand.getValueType().bitsGE(VT.getScalarType()) &&
		paulwalker-arm (Not Done): FYI: APInt has a function for this called `isNonNegative()`.
		"Operand type should be at least as large as the element type");
		assert(isa<ConstantSDNode>(Operand) &&
		cast<ConstantSDNode>(Operand)->getAPIntValue().isNonNegative() &&
		"Expected positive integer constant for STEP_VECTOR");
		break;
case ISD::FREEZE:		case ISD::FREEZE:
assert(VT == Operand.getValueType() && "Unexpected VT!");		assert(VT == Operand.getValueType() && "Unexpected VT!");
break;		break;
case ISD::TokenFactor:		case ISD::TokenFactor:
case ISD::MERGE_VALUES:		case ISD::MERGE_VALUES:
case ISD::CONCAT_VECTORS:		case ISD::CONCAT_VECTORS:
return Operand; // Factor, merge or concat of one node? No need.		return Operand; // Factor, merge or concat of one node? No need.
case ISD::BUILD_VECTOR: {		case ISD::BUILD_VECTOR: {
▲ Show 20 Lines • Show All 5,735 Lines • Show Last 20 Lines

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h

Show First 20 Lines • Show All 773 Lines • ▼ Show 20 Lines	private:

// These two are implemented in StatepointLowering.cpp		// These two are implemented in StatepointLowering.cpp
void visitGCRelocate(const GCRelocateInst &Relocate);		void visitGCRelocate(const GCRelocateInst &Relocate);
void visitGCResult(const GCResultInst &I);		void visitGCResult(const GCResultInst &I);

void visitVectorReduce(const CallInst &I, unsigned Intrinsic);		void visitVectorReduce(const CallInst &I, unsigned Intrinsic);
void visitVectorReverse(const CallInst &I);		void visitVectorReverse(const CallInst &I);
void visitVectorSplice(const CallInst &I);		void visitVectorSplice(const CallInst &I);
		void visitStepVector(const CallInst &I);

void visitUserOp1(const Instruction &I) {		void visitUserOp1(const Instruction &I) {
llvm_unreachable("UserOp1 should not exist at instruction selection time!");		llvm_unreachable("UserOp1 should not exist at instruction selection time!");
}		}
void visitUserOp2(const Instruction &I) {		void visitUserOp2(const Instruction &I) {
llvm_unreachable("UserOp2 should not exist at instruction selection time!");		llvm_unreachable("UserOp2 should not exist at instruction selection time!");
}		}

▲ Show 20 Lines • Show All 121 Lines • Show Last 20 Lines

llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp


Show First 20 Lines • Show All 6,939 Lines • ▼ Show 20 Lines	case Intrinsic::xray_typedevent: {
SDValue patchableNode = SDValue(MN, 0);		SDValue patchableNode = SDValue(MN, 0);
DAG.setRoot(patchableNode);		DAG.setRoot(patchableNode);
setValue(&I, patchableNode);		setValue(&I, patchableNode);
return;		return;
}		}
case Intrinsic::experimental_deoptimize:		case Intrinsic::experimental_deoptimize:
LowerDeoptimizeCall(&I);		LowerDeoptimizeCall(&I);
return;		return;
		case Intrinsic::experimental_stepvector:
		visitStepVector(I);
		paulwalker-arm (Done): Perhaps worth punting this into visitStepVector. This is what we've done for vector.reverse.
		return;
case Intrinsic::vector_reduce_fadd:		case Intrinsic::vector_reduce_fadd:
case Intrinsic::vector_reduce_fmul:		case Intrinsic::vector_reduce_fmul:
case Intrinsic::vector_reduce_add:		case Intrinsic::vector_reduce_add:
case Intrinsic::vector_reduce_mul:		case Intrinsic::vector_reduce_mul:
case Intrinsic::vector_reduce_and:		case Intrinsic::vector_reduce_and:
case Intrinsic::vector_reduce_or:		case Intrinsic::vector_reduce_or:
case Intrinsic::vector_reduce_xor:		case Intrinsic::vector_reduce_xor:
case Intrinsic::vector_reduce_smax:		case Intrinsic::vector_reduce_smax:
▲ Show 20 Lines • Show All 3,967 Lines • ▼ Show 20 Lines	if (NumClusters > 3 && TM.getOptLevel() != CodeGenOpt::None &&
splitWorkItem(WorkList, W, SI.getCondition(), SwitchMBB);		splitWorkItem(WorkList, W, SI.getCondition(), SwitchMBB);
continue;		continue;
}		}

lowerWorkItem(W, SI.getCondition(), SwitchMBB, DefaultMBB);		lowerWorkItem(W, SI.getCondition(), SwitchMBB, DefaultMBB);
}		}
}		}

		void SelectionDAGBuilder::visitStepVector(const CallInst &I) {
		const TargetLowering &TLI = DAG.getTargetLoweringInfo();
		auto DL = getCurSDLoc();
		EVT ResultVT = TLI.getValueType(DAG.getDataLayout(), I.getType());
		EVT OpVT =
		TLI.getTypeToTransformTo(*DAG.getContext(), ResultVT.getScalarType());
		paulwalker-arm (Done): Is this needed? I ask because `DAG.getStepVector` will validate the result VT in one place rather than expecting all its users to do likewise.
		SDValue Step = DAG.getConstant(1, DL, OpVT);
		setValue(&I, DAG.getStepVector(DL, ResultVT, Step));
		}

void SelectionDAGBuilder::visitVectorReverse(const CallInst &I) {		void SelectionDAGBuilder::visitVectorReverse(const CallInst &I) {
const TargetLowering &TLI = DAG.getTargetLoweringInfo();		const TargetLowering &TLI = DAG.getTargetLoweringInfo();
EVT VT = TLI.getValueType(DAG.getDataLayout(), I.getType());		EVT VT = TLI.getValueType(DAG.getDataLayout(), I.getType());

SDLoc DL = getCurSDLoc();		SDLoc DL = getCurSDLoc();
SDValue V = getValue(I.getOperand(0));		SDValue V = getValue(I.getOperand(0));
		paulwalker-arm (Done): What about pushing this into `DAG.getStepVector`? That way there's a generic way for any code to create this vector without needing to worry about the result type.
assert(VT == V.getValueType() && "Malformed vector.reverse!");		assert(VT == V.getValueType() && "Malformed vector.reverse!");

if (VT.isScalableVector()) {		if (VT.isScalableVector()) {
setValue(&I, DAG.getNode(ISD::VECTOR_REVERSE, DL, VT, V));		setValue(&I, DAG.getNode(ISD::VECTOR_REVERSE, DL, VT, V));
return;		return;
}		}

// Use VECTOR_SHUFFLE for the fixed-length vector		// Use VECTOR_SHUFFLE for the fixed-length vector
▲ Show 20 Lines • Show All 60 Lines • Show Last 20 Lines

llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp

Show First 20 Lines • Show All 286 Lines • ▼ Show 20 Lines	#endif
case ISD::INSERT_SUBVECTOR: return "insert_subvector";		case ISD::INSERT_SUBVECTOR: return "insert_subvector";
case ISD::EXTRACT_SUBVECTOR: return "extract_subvector";		case ISD::EXTRACT_SUBVECTOR: return "extract_subvector";
case ISD::SCALAR_TO_VECTOR: return "scalar_to_vector";		case ISD::SCALAR_TO_VECTOR: return "scalar_to_vector";
case ISD::VECTOR_SHUFFLE: return "vector_shuffle";		case ISD::VECTOR_SHUFFLE: return "vector_shuffle";
case ISD::VECTOR_SPLICE: return "vector_splice";		case ISD::VECTOR_SPLICE: return "vector_splice";
case ISD::SPLAT_VECTOR: return "splat_vector";		case ISD::SPLAT_VECTOR: return "splat_vector";
case ISD::SPLAT_VECTOR_PARTS: return "splat_vector_parts";		case ISD::SPLAT_VECTOR_PARTS: return "splat_vector_parts";
case ISD::VECTOR_REVERSE: return "vector_reverse";		case ISD::VECTOR_REVERSE: return "vector_reverse";
		case ISD::STEP_VECTOR: return "step_vector";
case ISD::CARRY_FALSE: return "carry_false";		case ISD::CARRY_FALSE: return "carry_false";
case ISD::ADDC: return "addc";		case ISD::ADDC: return "addc";
case ISD::ADDE: return "adde";		case ISD::ADDE: return "adde";
case ISD::ADDCARRY: return "addcarry";		case ISD::ADDCARRY: return "addcarry";
case ISD::SADDO_CARRY: return "saddo_carry";		case ISD::SADDO_CARRY: return "saddo_carry";
case ISD::SADDO: return "saddo";		case ISD::SADDO: return "saddo";
case ISD::UADDO: return "uaddo";		case ISD::UADDO: return "uaddo";
case ISD::SSUBO: return "ssubo";		case ISD::SSUBO: return "ssubo";
▲ Show 20 Lines • Show All 754 Lines • Show Last 20 Lines

llvm/lib/IR/IRBuilder.cpp

Show First 20 Lines • Show All 85 Lines • ▼ Show 20 Lines	Value *IRBuilderBase::CreateVScale(Constant *Scaling, const Twine &Name) {
Function *TheFn =		Function *TheFn =
Intrinsic::getDeclaration(M, Intrinsic::vscale, {Scaling->getType()});		Intrinsic::getDeclaration(M, Intrinsic::vscale, {Scaling->getType()});
CallInst *CI = createCallHelper(TheFn, {}, this, Name);		CallInst *CI = createCallHelper(TheFn, {}, this, Name);
return cast<ConstantInt>(Scaling)->getSExtValue() == 1		return cast<ConstantInt>(Scaling)->getSExtValue() == 1
? CI		? CI
: CreateMul(CI, Scaling);		: CreateMul(CI, Scaling);
}		}

		Value *IRBuilderBase::CreateStepVector(Type *DstType, const Twine &Name) {
		if (isa<ScalableVectorType>(DstType))
		paulwalker-arm (Not Done): Keep this style if you prefer but I think it will be cleaner done as `if (isa<ScalableVectorType>(DstType)) return CreateIntrinsic(Intrinsic::experimental_stepvector, {DstType}, {},....` followed by `<fixed vector handling here>`.
		return CreateIntrinsic(Intrinsic::experimental_stepvector, {DstType}, {},
		nullptr, Name);
		paulwalker-arm (Done): Given this is a FixedVectorType you should be able to short cut ElementCount and use getNumElements() directly.

		Type *STy = DstType->getScalarType();
		unsigned NumEls = cast<FixedVectorType>(DstType)->getNumElements();

		// Create a vector of consecutive numbers from zero to VF.
		SmallVector<Constant *, 8> Indices;
		for (unsigned i = 0; i < NumEls; ++i)
		Indices.push_back(ConstantInt::get(STy, i));

		// Add the consecutive indices to the vector value.
		paulwalker-arm (Done): I cannot see how this assert will ever fire and so `return ConstantVector::get(Indices);` just looks nicer.
		return ConstantVector::get(Indices);
		}

CallInst *IRBuilderBase::CreateMemSet(Value *Ptr, Value *Val, Value *Size,		CallInst *IRBuilderBase::CreateMemSet(Value *Ptr, Value *Val, Value *Size,
MaybeAlign Align, bool isVolatile,		MaybeAlign Align, bool isVolatile,
MDNode *TBAATag, MDNode *ScopeTag,		MDNode *TBAATag, MDNode *ScopeTag,
MDNode *NoAliasTag) {		MDNode *NoAliasTag) {
Ptr = getCastedInt8PtrValue(Ptr);		Ptr = getCastedInt8PtrValue(Ptr);
Value *Ops[] = {Ptr, Val, Size, getInt1(isVolatile)};		Value *Ops[] = {Ptr, Val, Size, getInt1(isVolatile)};
Type *Tys[] = { Ptr->getType(), Size->getType() };		Type *Tys[] = { Ptr->getType(), Size->getType() };
Module *M = BB->getParent()->getParent();		Module *M = BB->getParent()->getParent();
▲ Show 20 Lines • Show All 1,064 Lines • Show Last 20 Lines
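
The fixed-width branch of CreateStepVector above builds the constant `<0, 1, ..., NumEls - 1>` one lane at a time. The following is a minimal standalone sketch of that loop, not LLVM code: `makeFixedStepVector` is a hypothetical name, plain `uint64_t` lanes stand in for `ConstantInt`, and wrapping each index to the element width is an assumption about how out-of-range lane numbers would behave.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model of the fixed-width fallback: returns the lane values
// <0, 1, ..., NumEls - 1>, each wrapped to BitWidth bits (assumption).
std::vector<std::uint64_t> makeFixedStepVector(unsigned NumEls,
                                               unsigned BitWidth) {
  // The verifier in this patch only accepts integer elements of >= 8 bits;
  // the 64-bit upper bound is a limit of this sketch only.
  assert(BitWidth >= 8 && BitWidth <= 64 && "element width out of range");
  const std::uint64_t Mask = BitWidth == 64
                                 ? ~std::uint64_t{0}
                                 : ((std::uint64_t{1} << BitWidth) - 1);
  std::vector<std::uint64_t> Lanes;
  Lanes.reserve(NumEls);
  for (unsigned I = 0; I < NumEls; ++I)
    Lanes.push_back(I & Mask); // lane I holds the index I
  return Lanes;
}
```

For example, `makeFixedStepVector(4, 32)` yields the lanes 0, 1, 2, 3, which corresponds to the `<4 x i32>` constant the IRBuilder would produce for a fixed-width type.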

llvm/lib/IR/Verifier.cpp

Show First 20 Lines • Show All 5,179 Lines • ▼ Show 20 Lines	Assert(cast<FixedVectorType>(ResultTy)->getNumElements() ==
"Result of a matrix operation does not fit in the returned vector!");		"Result of a matrix operation does not fit in the returned vector!");

if (Stride)		if (Stride)
Assert(Stride->getZExtValue() >= NumRows->getZExtValue(),		Assert(Stride->getZExtValue() >= NumRows->getZExtValue(),
"Stride must be greater or equal than the number of rows!", IF);		"Stride must be greater or equal than the number of rows!", IF);

break;		break;
}		}
		case Intrinsic::experimental_stepvector: {
		VectorType *VecTy = dyn_cast<VectorType>(Call.getType());
		Assert(VecTy && VecTy->getScalarType()->isIntegerTy() &&
		VecTy->getScalarSizeInBits() >= 8,
		"experimental_stepvector only supported for vectors of integers "
		"with a bitwidth of at least 8.",
		&Call);
		break;
		}
case Intrinsic::experimental_vector_insert: {		case Intrinsic::experimental_vector_insert: {
VectorType *VecTy = cast<VectorType>(Call.getArgOperand(0)->getType());		VectorType *VecTy = cast<VectorType>(Call.getArgOperand(0)->getType());
VectorType *SubVecTy = cast<VectorType>(Call.getArgOperand(1)->getType());		VectorType *SubVecTy = cast<VectorType>(Call.getArgOperand(1)->getType());

Assert(VecTy->getElementType() == SubVecTy->getElementType(),		Assert(VecTy->getElementType() == SubVecTy->getElementType(),
"experimental_vector_insert parameters must have the same element "		"experimental_vector_insert parameters must have the same element "
"type.",		"type.",
&Call);		&Call);
▲ Show 20 Lines • Show All 898 Lines • Show Last 20 Lines

llvm/lib/Target/AArch64/AArch64ISelLowering.h

Show First 20 Lines • Show All 930 Lines • ▼ Show 20 Lines	private:
SDValue LowerFLT_ROUNDS_(SDValue Op, SelectionDAG &DAG) const;		SDValue LowerFLT_ROUNDS_(SDValue Op, SelectionDAG &DAG) const;
SDValue LowerSET_ROUNDING(SDValue Op, SelectionDAG &DAG) const;		SDValue LowerSET_ROUNDING(SDValue Op, SelectionDAG &DAG) const;
SDValue LowerINSERT_VECTOR_ELT(SDValue Op, SelectionDAG &DAG) const;		SDValue LowerINSERT_VECTOR_ELT(SDValue Op, SelectionDAG &DAG) const;
SDValue LowerEXTRACT_VECTOR_ELT(SDValue Op, SelectionDAG &DAG) const;		SDValue LowerEXTRACT_VECTOR_ELT(SDValue Op, SelectionDAG &DAG) const;
SDValue LowerSCALAR_TO_VECTOR(SDValue Op, SelectionDAG &DAG) const;		SDValue LowerSCALAR_TO_VECTOR(SDValue Op, SelectionDAG &DAG) const;
SDValue LowerBUILD_VECTOR(SDValue Op, SelectionDAG &DAG) const;		SDValue LowerBUILD_VECTOR(SDValue Op, SelectionDAG &DAG) const;
SDValue LowerVECTOR_SHUFFLE(SDValue Op, SelectionDAG &DAG) const;		SDValue LowerVECTOR_SHUFFLE(SDValue Op, SelectionDAG &DAG) const;
SDValue LowerSPLAT_VECTOR(SDValue Op, SelectionDAG &DAG) const;		SDValue LowerSPLAT_VECTOR(SDValue Op, SelectionDAG &DAG) const;
		SDValue LowerSTEP_VECTOR(SDValue Op, SelectionDAG &DAG) const;
SDValue LowerDUPQLane(SDValue Op, SelectionDAG &DAG) const;		SDValue LowerDUPQLane(SDValue Op, SelectionDAG &DAG) const;
SDValue LowerToPredicatedOp(SDValue Op, SelectionDAG &DAG, unsigned NewOp,		SDValue LowerToPredicatedOp(SDValue Op, SelectionDAG &DAG, unsigned NewOp,
bool OverrideNEON = false) const;		bool OverrideNEON = false) const;
SDValue LowerToScalableOp(SDValue Op, SelectionDAG &DAG) const;		SDValue LowerToScalableOp(SDValue Op, SelectionDAG &DAG) const;
SDValue LowerEXTRACT_SUBVECTOR(SDValue Op, SelectionDAG &DAG) const;		SDValue LowerEXTRACT_SUBVECTOR(SDValue Op, SelectionDAG &DAG) const;
SDValue LowerINSERT_SUBVECTOR(SDValue Op, SelectionDAG &DAG) const;		SDValue LowerINSERT_SUBVECTOR(SDValue Op, SelectionDAG &DAG) const;
SDValue LowerDIV(SDValue Op, SelectionDAG &DAG) const;		SDValue LowerDIV(SDValue Op, SelectionDAG &DAG) const;
SDValue LowerMUL(SDValue Op, SelectionDAG &DAG) const;		SDValue LowerMUL(SDValue Op, SelectionDAG &DAG) const;
▲ Show 20 Lines • Show All 139 Lines • Show Last 20 Lines

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp


Show First 20 Lines • Show All 1,121 Lines • ▼ Show 20 Lines	for (auto VT : {MVT::nxv16i8, MVT::nxv8i16, MVT::nxv4i32, MVT::nxv2i64}) {
setOperationAction(ISD::VECREDUCE_ADD, VT, Custom);		setOperationAction(ISD::VECREDUCE_ADD, VT, Custom);
setOperationAction(ISD::VECREDUCE_AND, VT, Custom);		setOperationAction(ISD::VECREDUCE_AND, VT, Custom);
setOperationAction(ISD::VECREDUCE_OR, VT, Custom);		setOperationAction(ISD::VECREDUCE_OR, VT, Custom);
setOperationAction(ISD::VECREDUCE_XOR, VT, Custom);		setOperationAction(ISD::VECREDUCE_XOR, VT, Custom);
setOperationAction(ISD::VECREDUCE_UMIN, VT, Custom);		setOperationAction(ISD::VECREDUCE_UMIN, VT, Custom);
setOperationAction(ISD::VECREDUCE_UMAX, VT, Custom);		setOperationAction(ISD::VECREDUCE_UMAX, VT, Custom);
setOperationAction(ISD::VECREDUCE_SMIN, VT, Custom);		setOperationAction(ISD::VECREDUCE_SMIN, VT, Custom);
setOperationAction(ISD::VECREDUCE_SMAX, VT, Custom);		setOperationAction(ISD::VECREDUCE_SMAX, VT, Custom);
		setOperationAction(ISD::STEP_VECTOR, VT, Custom);

setOperationAction(ISD::MULHU, VT, Expand);		setOperationAction(ISD::MULHU, VT, Expand);
setOperationAction(ISD::MULHS, VT, Expand);		setOperationAction(ISD::MULHS, VT, Expand);
setOperationAction(ISD::UMUL_LOHI, VT, Expand);		setOperationAction(ISD::UMUL_LOHI, VT, Expand);
setOperationAction(ISD::SMUL_LOHI, VT, Expand);		setOperationAction(ISD::SMUL_LOHI, VT, Expand);
}		}

// Illegal unpacked integer vector types.		// Illegal unpacked integer vector types.
▲ Show 20 Lines • Show All 3,249 Lines • ▼ Show 20 Lines	SDValue AArch64TargetLowering::LowerOperation(SDValue Op,
case ISD::EXTRACT_VECTOR_ELT:		case ISD::EXTRACT_VECTOR_ELT:
return LowerEXTRACT_VECTOR_ELT(Op, DAG);		return LowerEXTRACT_VECTOR_ELT(Op, DAG);
case ISD::BUILD_VECTOR:		case ISD::BUILD_VECTOR:
return LowerBUILD_VECTOR(Op, DAG);		return LowerBUILD_VECTOR(Op, DAG);
case ISD::VECTOR_SHUFFLE:		case ISD::VECTOR_SHUFFLE:
return LowerVECTOR_SHUFFLE(Op, DAG);		return LowerVECTOR_SHUFFLE(Op, DAG);
case ISD::SPLAT_VECTOR:		case ISD::SPLAT_VECTOR:
return LowerSPLAT_VECTOR(Op, DAG);		return LowerSPLAT_VECTOR(Op, DAG);
		case ISD::STEP_VECTOR:
		return LowerSTEP_VECTOR(Op, DAG);
case ISD::EXTRACT_SUBVECTOR:		case ISD::EXTRACT_SUBVECTOR:
return LowerEXTRACT_SUBVECTOR(Op, DAG);		return LowerEXTRACT_SUBVECTOR(Op, DAG);
case ISD::INSERT_SUBVECTOR:		case ISD::INSERT_SUBVECTOR:
return LowerINSERT_SUBVECTOR(Op, DAG);		return LowerINSERT_SUBVECTOR(Op, DAG);
case ISD::SDIV:		case ISD::SDIV:
case ISD::UDIV:		case ISD::UDIV:
return LowerDIV(Op, DAG);		return LowerDIV(Op, DAG);
case ISD::SMIN:		case ISD::SMIN:
▲ Show 20 Lines • Show All 4,631 Lines • ▼ Show 20 Lines	if (NumElts == 4) {

if (Cost <= 4)		if (Cost <= 4)
return GeneratePerfectShuffle(PFEntry, V1, V2, DAG, dl);		return GeneratePerfectShuffle(PFEntry, V1, V2, DAG, dl);
}		}

return GenerateTBL(Op, ShuffleMask, DAG);		return GenerateTBL(Op, ShuffleMask, DAG);
}		}

		SDValue AArch64TargetLowering::LowerSTEP_VECTOR(SDValue Op,
		SelectionDAG &DAG) const {
		SDLoc dl(Op);
		EVT VT = Op.getValueType();
		assert(VT.isScalableVector() &&
		"Only expect scalable vectors for STEP_VECTOR");
		EVT ElemVT = VT.getScalarType();
		assert(ElemVT != MVT::i1 &&
		"Vectors of i1 types not supported for STEP_VECTOR");
		paulwalker-arm (Done): I'm not a fan of the pattern duplication within AArch64SVEInstrInfo.td so I'm thinking we should lower all ISD::STEP_VECTORs to AArch64ISD::INDEX_VECTOR. Regardless of whether you do that I would expect the predicate handling to be more generic, for example: `return ISD::TRUNC(ISD::STEP_VECTOR(Op.getConstantOperandVal())`
		paulwalker-arm (Not Done): This assert should not be necessary as you're protecting this already within getNode and thus can assume the node to be valid.
		david-arm (Author, Done): Hi @paulwalker-arm, me too for i1 case, however I tried the generic route first exactly as you suggest and it didn't work. The problem was that if I created the ISD::TRUNC node we never ended up custom lowering as you'd expect. If you look at AArch64TargetLowering::LowerTRUNCATE for i1 element types we return this: `return DAG.getSetCC(dl, VT, And, Zero, ISD::SETNE);` which then never gets into our custom lowering code, despite the action being set to Custom for precisely those types. We then crash in isel trying to match a `setcc` operation. However I'd love to do it this way if you've got any suggestions for how I can solve this problem? I suspect there are just too many levels of custom lowering involved, i.e. custom lower STEP_VECTOR, followed by custom lower TRUNCATE, followed by custom lower SETCC. So in general your preference is to custom lower all types and go through this function to create AArch64ISD::INDEX_VECTOR?
		david-arm (Author, Done): One solution might be to rewrite LowerTRUNCATE to avoid returning SETCC for scalable vectors and use AArch64ISD::SETCC_MERGE_ZERO directly? I'd just assumed people would find this a bit ugly.

		SDValue StepVal = Op.getOperand(0);
		SDValue Zero = DAG.getConstant(0, dl, StepVal.getValueType());
		return DAG.getNode(AArch64ISD::INDEX_VECTOR, dl, VT, Zero, StepVal);
		}

SDValue AArch64TargetLowering::LowerSPLAT_VECTOR(SDValue Op,		SDValue AArch64TargetLowering::LowerSPLAT_VECTOR(SDValue Op,
SelectionDAG &DAG) const {		SelectionDAG &DAG) const {
SDLoc dl(Op);		SDLoc dl(Op);
		paulwalker-arm (Done): Is it possible to use getPromotedVTForPredicate here?
EVT VT = Op.getValueType();		EVT VT = Op.getValueType();
EVT ElemVT = VT.getScalarType();		EVT ElemVT = VT.getScalarType();
SDValue SplatVal = Op.getOperand(0);		SDValue SplatVal = Op.getOperand(0);

if (useSVEForFixedLengthVectorVT(VT))		if (useSVEForFixedLengthVectorVT(VT))
		paulwalker-arm (Done): I can imagine a route where this promotion is also required for the `ElemVT != MVT::i1` case. Related to this I suspect we'll need an implementation of `PromoteIntOp_STEP_VECTOR` but currently don't because there's nothing that can exercise the `Step != 1` case.
return LowerToScalableOp(Op, DAG);		return LowerToScalableOp(Op, DAG);

// Extend input splat value where needed to fit into a GPR (32b or 64b only)		// Extend input splat value where needed to fit into a GPR (32b or 64b only)
// FPRs don't have this restriction.		// FPRs don't have this restriction.
switch (ElemVT.getSimpleVT().SimpleTy) {		switch (ElemVT.getSimpleVT().SimpleTy) {
case MVT::i1: {		case MVT::i1: {
// The only legal i1 vectors are SVE vectors, so we can use SVE-specific		// The only legal i1 vectors are SVE vectors, so we can use SVE-specific
// lowering code.		// lowering code.
▲ Show 20 Lines • Show All 1,132 Lines • ▼ Show 20 Lines	SDValue AArch64TargetLowering::LowerTRUNCATE(SDValue Op,

if (VT.getScalarType() == MVT::i1) {		if (VT.getScalarType() == MVT::i1) {
// Lower i1 truncate to `(x & 1) != 0`.		// Lower i1 truncate to `(x & 1) != 0`.
SDLoc dl(Op);		SDLoc dl(Op);
EVT OpVT = Op.getOperand(0).getValueType();		EVT OpVT = Op.getOperand(0).getValueType();
SDValue Zero = DAG.getConstant(0, dl, OpVT);		SDValue Zero = DAG.getConstant(0, dl, OpVT);
SDValue One = DAG.getConstant(1, dl, OpVT);		SDValue One = DAG.getConstant(1, dl, OpVT);
SDValue And = DAG.getNode(ISD::AND, dl, OpVT, Op.getOperand(0), One);		SDValue And = DAG.getNode(ISD::AND, dl, OpVT, Op.getOperand(0), One);
return DAG.getSetCC(dl, VT, And, Zero, ISD::SETNE);		return DAG.getSetCC(dl, VT, And, Zero, ISD::SETNE);
		paulwalker-arm (Done): Is this test needed? I'm guessing this code was added as part of SVE support (v#i1 being an illegal type for NEON).
}		}

if (!VT.isVector() || VT.isScalableVector())		if (!VT.isVector() || VT.isScalableVector())
return SDValue();		return SDValue();

if (useSVEForFixedLengthVectorVT(Op.getOperand(0).getValueType()))		if (useSVEForFixedLengthVectorVT(Op.getOperand(0).getValueType()))
return LowerFixedLengthVectorTruncateToSVE(Op, DAG);		return LowerFixedLengthVectorTruncateToSVE(Op, DAG);

▲ Show 20 Lines • Show All 7,198 Lines • Show Last 20 Lines
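
LowerSTEP_VECTOR above emits `AArch64ISD::INDEX_VECTOR` with a base of zero and the step taken from the STEP_VECTOR operand. The sketch below models the lane values this is expected to produce, assuming the usual SVE `index` semantics (lane I holds base + I * step). It is a standalone illustration rather than LLVM code: `indexVector`, `stepVector`, and the explicit `VScale` parameter are hypothetical, since the real vector length is only known at run time.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of an INDEX_VECTOR node: a scalable vector with
// MinElts * VScale lanes, where lane I holds Base + I * Step.
std::vector<std::uint64_t> indexVector(std::uint64_t Base, std::uint64_t Step,
                                       unsigned MinElts, unsigned VScale) {
  assert(VScale >= 1 && "vscale is a positive runtime multiple");
  std::vector<std::uint64_t> Lanes(static_cast<std::size_t>(MinElts) * VScale);
  for (std::size_t I = 0; I < Lanes.size(); ++I)
    Lanes[I] = Base + I * Step;
  return Lanes;
}

// STEP_VECTOR(Step) modelled as INDEX_VECTOR(0, Step), as in the lowering.
std::vector<std::uint64_t> stepVector(std::uint64_t Step, unsigned MinElts,
                                      unsigned VScale) {
  return indexVector(0, Step, MinElts, VScale);
}
```

With `MinElts = 4` and an assumed `VScale` of 2, `stepVector(1, 4, 2)` gives the lanes 0 through 7, which is what a `<vscale x 4 x i32>` step vector would hold on a 256-bit implementation.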

llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp

Show First 20 Lines • Show All 254 Lines • ▼ Show 20 Lines	case Intrinsic::abs: {
static const auto ValidAbsTys = {MVT::v8i8, MVT::v16i8, MVT::v4i16,		static const auto ValidAbsTys = {MVT::v8i8, MVT::v16i8, MVT::v4i16,
MVT::v8i16, MVT::v2i32, MVT::v4i32,		MVT::v8i16, MVT::v2i32, MVT::v4i32,
MVT::v2i64};		MVT::v2i64};
auto LT = TLI->getTypeLegalizationCost(DL, RetTy);		auto LT = TLI->getTypeLegalizationCost(DL, RetTy);
if (any_of(ValidAbsTys, [&LT](MVT M) { return M == LT.second; }))		if (any_of(ValidAbsTys, [&LT](MVT M) { return M == LT.second; }))
return LT.first;		return LT.first;
break;		break;
}		}
		case Intrinsic::experimental_stepvector: {
		unsigned Cost = 1; // Cost of the `index' instruction
		auto LT = TLI->getTypeLegalizationCost(DL, RetTy);
		paulwalker-arm (Not Done): This is safe to assume and already checked within Verifier.cpp.
		// Legalisation of illegal vectors involves an `index' instruction plus
		// (LT.first - 1) vector adds.
		if (LT.first > 1) {
		Type *LegalVTy = EVT(LT.second).getTypeForEVT(RetTy->getContext());
		unsigned AddCost =
		getArithmeticInstrCost(Instruction::Add, LegalVTy, CostKind);
		Cost += AddCost * (LT.first - 1);
		}
		paulwalker-arm (Done): In a world where InstructionCost is everywhere I think it's better for the InstructionCost side to be the LHS because it reduces the number of required operator overloads. Check with @sdesmalen but I think you'll save some work if you write this as `getArithmeticInstrCost() * (LT.first - 1)`.
		return Cost;
		}
default:		default:
break;		break;
}		}
return BaseT::getIntrinsicInstrCost(ICA, CostKind);		return BaseT::getIntrinsicInstrCost(ICA, CostKind);
}		}

bool AArch64TTIImpl::isWideningInstruction(Type *DstTy, unsigned Opcode,		bool AArch64TTIImpl::isWideningInstruction(Type *DstTy, unsigned Opcode,
ArrayRef<const Value *> Args) {		ArrayRef<const Value *> Args) {
▲ Show 20 Lines • Show All 1,023 Lines • Show Last 20 Lines
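
The cost computation for `experimental_stepvector` above charges one `index` instruction plus one vector add for every additional legal register the type is split into. A minimal standalone sketch of that arithmetic follows. `stepVectorCost` is a hypothetical helper, `NumLegalParts` stands in for `LT.first`, and `AddCost` stands in for the result of `getArithmeticInstrCost`; the value 1 used for it in the example below is an assumption.

```cpp
#include <cassert>

// Hypothetical model of the cost formula:
//   Cost = 1 (index) + AddCost * (NumLegalParts - 1)
unsigned stepVectorCost(unsigned NumLegalParts, unsigned AddCost) {
  assert(NumLegalParts >= 1 && "a type legalizes to at least one register");
  unsigned Cost = 1; // cost of the single `index` instruction
  if (NumLegalParts > 1)
    Cost += AddCost * (NumLegalParts - 1); // one add per extra register
  return Cost;
}
```

Assuming an add cost of 1, a type that legalizes to two registers (such as `<4 x i64>` on NEON) costs `stepVectorCost(2, 1) == 2`, and one that legalizes to four (such as `<16 x i32>`) costs `stepVectorCost(4, 1) == 4`, in line with the expected values in the cost-model tests in this patch.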

llvm/test/Analysis/CostModel/AArch64/neon-stepvector.ll

This file was added.

				; RUN: opt -cost-model -analyze -mtriple=aarch64--linux-gnu -mattr=+neon < %s | FileCheck %s

				; Check icmp for legal integer vectors.
				define void @stepvector_legal_int() {
				; CHECK-LABEL: 'stepvector_legal_int'
				; CHECK: Cost Model: Found an estimated cost of 1 for instruction: %1 = call <2 x i64> @llvm.experimental.stepvector.v2i64()
				; CHECK: Cost Model: Found an estimated cost of 1 for instruction: %2 = call <4 x i32> @llvm.experimental.stepvector.v4i32()
				; CHECK: Cost Model: Found an estimated cost of 1 for instruction: %3 = call <8 x i16> @llvm.experimental.stepvector.v8i16()
				; CHECK: Cost Model: Found an estimated cost of 1 for instruction: %4 = call <16 x i8> @llvm.experimental.stepvector.v16i8()
				%1 = call <2 x i64> @llvm.experimental.stepvector.v2i64()
				%2 = call <4 x i32> @llvm.experimental.stepvector.v4i32()
				%3 = call <8 x i16> @llvm.experimental.stepvector.v8i16()
				%4 = call <16 x i8> @llvm.experimental.stepvector.v16i8()
				ret void
				}

				; Check icmp for an illegal integer vector.
				define void @stepvector_illegal_int() {
				; CHECK-LABEL: 'stepvector_illegal_int'
				; CHECK: Cost Model: Found an estimated cost of 2 for instruction: %1 = call <4 x i64> @llvm.experimental.stepvector.v4i64()
				; CHECK: Cost Model: Found an estimated cost of 4 for instruction: %2 = call <16 x i32> @llvm.experimental.stepvector.v16i32()
				%1 = call <4 x i64> @llvm.experimental.stepvector.v4i64()
				%2 = call <16 x i32> @llvm.experimental.stepvector.v16i32()
				ret void
				}


				declare <2 x i64> @llvm.experimental.stepvector.v2i64()
				declare <4 x i32> @llvm.experimental.stepvector.v4i32()
				declare <8 x i16> @llvm.experimental.stepvector.v8i16()
				declare <16 x i8> @llvm.experimental.stepvector.v16i8()

				declare <4 x i64> @llvm.experimental.stepvector.v4i64()
				declare <16 x i32> @llvm.experimental.stepvector.v16i32()

llvm/test/Analysis/CostModel/AArch64/sve-stepvector.ll

This file was added.

				; RUN: opt -cost-model -analyze -mtriple=aarch64--linux-gnu -mattr=+sve < %s 2>%t | FileCheck %s

				; RUN: FileCheck --check-prefix=WARN --allow-empty %s <%t

				; If this check fails please read test/CodeGen/AArch64/README for instructions on how to resolve it.
				; WARN-NOT: warning

				; Check stepvector for legal integer vectors.
				define void @stepvector_legal_int() {
				; CHECK-LABEL: 'stepvector_legal_int'
				; CHECK: Cost Model: Found an estimated cost of 1 for instruction: %1 = call <vscale x 2 x i64> @llvm.experimental.stepvector.nxv2i64()
				; CHECK: Cost Model: Found an estimated cost of 1 for instruction: %2 = call <vscale x 4 x i32> @llvm.experimental.stepvector.nxv4i32()
				; CHECK: Cost Model: Found an estimated cost of 1 for instruction: %3 = call <vscale x 8 x i16> @llvm.experimental.stepvector.nxv8i16()
				; CHECK: Cost Model: Found an estimated cost of 1 for instruction: %4 = call <vscale x 16 x i8> @llvm.experimental.stepvector.nxv16i8()
				%1 = call <vscale x 2 x i64> @llvm.experimental.stepvector.nxv2i64()
				%2 = call <vscale x 4 x i32> @llvm.experimental.stepvector.nxv4i32()
				%3 = call <vscale x 8 x i16> @llvm.experimental.stepvector.nxv8i16()
				%4 = call <vscale x 16 x i8> @llvm.experimental.stepvector.nxv16i8()
				ret void
				}

				; Check stepvector for an illegal integer vector.
				define void @stepvector_illegal_int() {
				; CHECK-LABEL: 'stepvector_illegal_int'
				; CHECK: Cost Model: Found an estimated cost of 2 for instruction: %1 = call <vscale x 4 x i64> @llvm.experimental.stepvector.nxv4i64()
				; CHECK: Cost Model: Found an estimated cost of 4 for instruction: %2 = call <vscale x 16 x i32> @llvm.experimental.stepvector.nxv16i32()
				%1 = call <vscale x 4 x i64> @llvm.experimental.stepvector.nxv4i64()
				%2 = call <vscale x 16 x i32> @llvm.experimental.stepvector.nxv16i32()
				ret void
				}


				declare <vscale x 2 x i64> @llvm.experimental.stepvector.nxv2i64()
				declare <vscale x 4 x i32> @llvm.experimental.stepvector.nxv4i32()
				declare <vscale x 8 x i16> @llvm.experimental.stepvector.nxv8i16()
				declare <vscale x 16 x i8> @llvm.experimental.stepvector.nxv16i8()

				declare <vscale x 4 x i64> @llvm.experimental.stepvector.nxv4i64()
				declare <vscale x 16 x i32> @llvm.experimental.stepvector.nxv16i32()

llvm/test/CodeGen/AArch64/neon-stepvector.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
				; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+neon < %s | FileCheck %s --check-prefixes=CHECK

				; LEGAL INTEGER TYPES

				define <2 x i64> @stepvector_v2i64() {
				; CHECK-LABEL: .LCPI0_0:
				; CHECK-NEXT: .xword 0
				; CHECK-NEXT: .xword 1
				; CHECK-LABEL: stepvector_v2i64:
				; CHECK: // %bb.0: // %entry
				paulwalker-arm: For this test to be correct, LCPI0_0 must point to something meaningful. I believe update_llc_test_checks.py strips this information, so you'll need to add it manually. We did likewise for named-vector-shuffle-reverse-neon.ll.
				; CHECK-NEXT: adrp x8, .LCPI0_0
				; CHECK-NEXT: ldr q0, [x8, :lo12:.LCPI0_0]
				; CHECK-NEXT: ret
				entry:
				%0 = call <2 x i64> @llvm.experimental.stepvector.v2i64()
				ret <2 x i64> %0
				}

				define <4 x i32> @stepvector_v4i32() {
				; CHECK-LABEL: .LCPI1_0:
				; CHECK-NEXT: .word 0
				; CHECK-NEXT: .word 1
				; CHECK-NEXT: .word 2
				; CHECK-NEXT: .word 3
				; CHECK-LABEL: stepvector_v4i32:
				; CHECK: // %bb.0: // %entry
				; CHECK-NEXT: adrp x8, .LCPI1_0
				; CHECK-NEXT: ldr q0, [x8, :lo12:.LCPI1_0]
				; CHECK-NEXT: ret
				entry:
				%0 = call <4 x i32> @llvm.experimental.stepvector.v4i32()
				ret <4 x i32> %0
				}

				define <8 x i16> @stepvector_v8i16() {
				; CHECK-LABEL: .LCPI2_0:
				; CHECK-NEXT: .hword 0
				; CHECK-NEXT: .hword 1
				; CHECK-NEXT: .hword 2
				; CHECK-NEXT: .hword 3
				; CHECK-NEXT: .hword 4
				; CHECK-NEXT: .hword 5
				; CHECK-NEXT: .hword 6
				; CHECK-NEXT: .hword 7
				; CHECK-LABEL: stepvector_v8i16:
				; CHECK: // %bb.0: // %entry
				; CHECK-NEXT: adrp x8, .LCPI2_0
				; CHECK-NEXT: ldr q0, [x8, :lo12:.LCPI2_0]
				; CHECK-NEXT: ret
				entry:
				%0 = call <8 x i16> @llvm.experimental.stepvector.v8i16()
				ret <8 x i16> %0
				}

				define <16 x i8> @stepvector_v16i8() {
				; CHECK-LABEL: .LCPI3_0:
				; CHECK-NEXT: .byte 0
				; CHECK-NEXT: .byte 1
				; CHECK-NEXT: .byte 2
				; CHECK-NEXT: .byte 3
				; CHECK-NEXT: .byte 4
				; CHECK-NEXT: .byte 5
				; CHECK-NEXT: .byte 6
				; CHECK-NEXT: .byte 7
				; CHECK-NEXT: .byte 8
				; CHECK-NEXT: .byte 9
				; CHECK-NEXT: .byte 10
				; CHECK-NEXT: .byte 11
				; CHECK-NEXT: .byte 12
				; CHECK-NEXT: .byte 13
				; CHECK-NEXT: .byte 14
				; CHECK-NEXT: .byte 15
				; CHECK-LABEL: stepvector_v16i8:
				; CHECK: // %bb.0: // %entry
				; CHECK-NEXT: adrp x8, .LCPI3_0
				; CHECK-NEXT: ldr q0, [x8, :lo12:.LCPI3_0]
				; CHECK-NEXT: ret
				entry:
				%0 = call <16 x i8> @llvm.experimental.stepvector.v16i8()
				ret <16 x i8> %0
				}

				; ILLEGAL INTEGER TYPES

				define <4 x i64> @stepvector_v4i64() {
				; CHECK-LABEL: .LCPI4_0:
				; CHECK-NEXT: .xword 0
				; CHECK-NEXT: .xword 1
				; CHECK-LABEL: .LCPI4_1:
				; CHECK-NEXT: .xword 2
				; CHECK-NEXT: .xword 3
				; CHECK-LABEL: stepvector_v4i64:
				; CHECK: // %bb.0: // %entry
				; CHECK-NEXT: adrp x8, .LCPI4_0
				; CHECK-NEXT: adrp x9, .LCPI4_1
				; CHECK-NEXT: ldr q0, [x8, :lo12:.LCPI4_0]
				; CHECK-NEXT: ldr q1, [x9, :lo12:.LCPI4_1]
				; CHECK-NEXT: ret
				entry:
				%0 = call <4 x i64> @llvm.experimental.stepvector.v4i64()
				ret <4 x i64> %0
				}

				define <16 x i32> @stepvector_v16i32() {
				; CHECK-LABEL: .LCPI5_0:
				; CHECK-NEXT: .word 0
				; CHECK-NEXT: .word 1
				; CHECK-NEXT: .word 2
				; CHECK-NEXT: .word 3
				; CHECK-LABEL: .LCPI5_1:
				; CHECK-NEXT: .word 4
				; CHECK-NEXT: .word 5
				; CHECK-NEXT: .word 6
				; CHECK-NEXT: .word 7
				; CHECK-LABEL: .LCPI5_2:
				; CHECK-NEXT: .word 8
				; CHECK-NEXT: .word 9
				; CHECK-NEXT: .word 10
				; CHECK-NEXT: .word 11
				; CHECK-LABEL: .LCPI5_3:
				; CHECK-NEXT: .word 12
				; CHECK-NEXT: .word 13
				; CHECK-NEXT: .word 14
				; CHECK-NEXT: .word 15
				; CHECK-LABEL: stepvector_v16i32:
				; CHECK: // %bb.0: // %entry
				; CHECK-NEXT: adrp x8, .LCPI5_0
				; CHECK-NEXT: adrp x9, .LCPI5_1
				; CHECK-NEXT: adrp x10, .LCPI5_2
				; CHECK-NEXT: adrp x11, .LCPI5_3
				; CHECK-NEXT: ldr q0, [x8, :lo12:.LCPI5_0]
				; CHECK-NEXT: ldr q1, [x9, :lo12:.LCPI5_1]
				; CHECK-NEXT: ldr q2, [x10, :lo12:.LCPI5_2]
				; CHECK-NEXT: ldr q3, [x11, :lo12:.LCPI5_3]
				; CHECK-NEXT: ret
				entry:
				%0 = call <16 x i32> @llvm.experimental.stepvector.v16i32()
				ret <16 x i32> %0
				}

				define <2 x i32> @stepvector_v2i32() {
				; CHECK-LABEL: .LCPI6_0:
				; CHECK-NEXT: .word 0
				; CHECK-NEXT: .word 1
				; CHECK-LABEL: stepvector_v2i32:
				; CHECK: // %bb.0: // %entry
				; CHECK-NEXT: adrp x8, .LCPI6_0
				kmclaughlin: Should there also be some tests here for illegal predicate types such as v32i1, as there are in sve-stepvector.ll?
				david-arm (author): Hi @kmclaughlin, thanks for pointing that out. I originally did add some tests for illegal types for NEON, but for some reason the i1 element types weren't being promoted to i8, so we ended up using GPRs instead, i.e. something like `mov %x0, #some_immediate`. I thought it looked really odd and inconsistent with the legal types, so I left them out. I'm not sure whether that's a bug in the AArch64 backend for NEON or intended behaviour.
				; CHECK-NEXT: ldr d0, [x8, :lo12:.LCPI6_0]
				; CHECK-NEXT: ret
				entry:
				%0 = call <2 x i32> @llvm.experimental.stepvector.v2i32()
				ret <2 x i32> %0
				}

				define <4 x i16> @stepvector_v4i16() {
				; CHECK-LABEL: .LCPI7_0:
				; CHECK-NEXT: .hword 0
				; CHECK-NEXT: .hword 1
				; CHECK-NEXT: .hword 2
				; CHECK-NEXT: .hword 3
				; CHECK-LABEL: stepvector_v4i16:
				; CHECK: // %bb.0: // %entry
				; CHECK-NEXT: adrp x8, .LCPI7_0
				; CHECK-NEXT: ldr d0, [x8, :lo12:.LCPI7_0]
				; CHECK-NEXT: ret
				entry:
				%0 = call <4 x i16> @llvm.experimental.stepvector.v4i16()
				ret <4 x i16> %0
				}


				declare <2 x i64> @llvm.experimental.stepvector.v2i64()
				declare <4 x i32> @llvm.experimental.stepvector.v4i32()
				declare <8 x i16> @llvm.experimental.stepvector.v8i16()
				declare <16 x i8> @llvm.experimental.stepvector.v16i8()

				declare <4 x i64> @llvm.experimental.stepvector.v4i64()
				declare <16 x i32> @llvm.experimental.stepvector.v16i32()
				declare <2 x i32> @llvm.experimental.stepvector.v2i32()
				declare <4 x i16> @llvm.experimental.stepvector.v4i16()

llvm/test/CodeGen/AArch64/sve-stepvector.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
				; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve < %s 2>%t | FileCheck %s --check-prefixes=CHECK
				; RUN: FileCheck --check-prefix=WARN --allow-empty %s < %t

				; If this check fails please read test/CodeGen/AArch64/README for instructions on how to resolve it.
				; WARN-NOT: warning

				; LEGAL INTEGER TYPES

				define <vscale x 2 x i64> @stepvector_nxv2i64() {
				; CHECK-LABEL: stepvector_nxv2i64:
				; CHECK: // %bb.0: // %entry
				; CHECK-NEXT: index z0.d, #0, #1
				; CHECK-NEXT: ret
				entry:
				%0 = call <vscale x 2 x i64> @llvm.experimental.stepvector.nxv2i64()
				ret <vscale x 2 x i64> %0
				}

				define <vscale x 4 x i32> @stepvector_nxv4i32() {
				; CHECK-LABEL: stepvector_nxv4i32:
				; CHECK: // %bb.0: // %entry
				; CHECK-NEXT: index z0.s, #0, #1
				; CHECK-NEXT: ret
				entry:
				%0 = call <vscale x 4 x i32> @llvm.experimental.stepvector.nxv4i32()
				ret <vscale x 4 x i32> %0
				}

				define <vscale x 8 x i16> @stepvector_nxv8i16() {
				; CHECK-LABEL: stepvector_nxv8i16:
				; CHECK: // %bb.0: // %entry
				; CHECK-NEXT: index z0.h, #0, #1
				; CHECK-NEXT: ret
				entry:
				%0 = call <vscale x 8 x i16> @llvm.experimental.stepvector.nxv8i16()
				ret <vscale x 8 x i16> %0
				}

				define <vscale x 16 x i8> @stepvector_nxv16i8() {
				; CHECK-LABEL: stepvector_nxv16i8:
				; CHECK: // %bb.0: // %entry
				; CHECK-NEXT: index z0.b, #0, #1
				; CHECK-NEXT: ret
				entry:
				%0 = call <vscale x 16 x i8> @llvm.experimental.stepvector.nxv16i8()
				ret <vscale x 16 x i8> %0
				}

				; ILLEGAL INTEGER TYPES

				define <vscale x 4 x i64> @stepvector_nxv4i64() {
				; CHECK-LABEL: stepvector_nxv4i64:
				; CHECK: // %bb.0: // %entry
				; CHECK-NEXT: cntd x8
				; CHECK-NEXT: mov z1.d, x8
				; CHECK-NEXT: index z0.d, #0, #1
				; CHECK-NEXT: add z1.d, z0.d, z1.d
				; CHECK-NEXT: ret
				entry:
				%0 = call <vscale x 4 x i64> @llvm.experimental.stepvector.nxv4i64()
				ret <vscale x 4 x i64> %0
				}

				define <vscale x 16 x i32> @stepvector_nxv16i32() {
				; CHECK-LABEL: stepvector_nxv16i32:
				; CHECK: // %bb.0: // %entry
				; CHECK-NEXT: cntw x9
				; CHECK-NEXT: cnth x8
				; CHECK-NEXT: index z0.s, #0, #1
				; CHECK-NEXT: mov z1.s, w9
				; CHECK-NEXT: mov z3.s, w8
				; CHECK-NEXT: add z1.s, z0.s, z1.s
				; CHECK-NEXT: add z2.s, z0.s, z3.s
				; CHECK-NEXT: add z3.s, z1.s, z3.s
				; CHECK-NEXT: ret
				entry:
				%0 = call <vscale x 16 x i32> @llvm.experimental.stepvector.nxv16i32()
				ret <vscale x 16 x i32> %0
				}

				define <vscale x 2 x i32> @stepvector_nxv2i32() {
				; CHECK-LABEL: stepvector_nxv2i32:
				; CHECK: // %bb.0: // %entry
				; CHECK-NEXT: index z0.d, #0, #1
				; CHECK-NEXT: ret
				entry:
				%0 = call <vscale x 2 x i32> @llvm.experimental.stepvector.nxv2i32()
				ret <vscale x 2 x i32> %0
				}

				define <vscale x 4 x i16> @stepvector_nxv4i16() {
				; CHECK-LABEL: stepvector_nxv4i16:
				; CHECK: // %bb.0: // %entry
				; CHECK-NEXT: index z0.s, #0, #1
				; CHECK-NEXT: ret
				entry:
				%0 = call <vscale x 4 x i16> @llvm.experimental.stepvector.nxv4i16()
				ret <vscale x 4 x i16> %0
				}

				define <vscale x 8 x i8> @stepvector_nxv8i8() {
				; CHECK-LABEL: stepvector_nxv8i8:
				; CHECK: // %bb.0: // %entry
				; CHECK-NEXT: index z0.h, #0, #1
				; CHECK-NEXT: ret
				entry:
				%0 = call <vscale x 8 x i8> @llvm.experimental.stepvector.nxv8i8()
				ret <vscale x 8 x i8> %0
				}

				declare <vscale x 2 x i64> @llvm.experimental.stepvector.nxv2i64()
				declare <vscale x 4 x i32> @llvm.experimental.stepvector.nxv4i32()
				declare <vscale x 8 x i16> @llvm.experimental.stepvector.nxv8i16()
				declare <vscale x 16 x i8> @llvm.experimental.stepvector.nxv16i8()

				declare <vscale x 4 x i64> @llvm.experimental.stepvector.nxv4i64()
				declare <vscale x 16 x i32> @llvm.experimental.stepvector.nxv16i32()
				declare <vscale x 2 x i32> @llvm.experimental.stepvector.nxv2i32()
				declare <vscale x 8 x i8> @llvm.experimental.stepvector.nxv8i8()
				declare <vscale x 4 x i16> @llvm.experimental.stepvector.nxv4i16()
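
The SVE output above for the illegal types `nxv4i64` and `nxv16i32` follows one pattern: a single `index` produces the first register, and each later register is that base plus a splat of the running lane count (`cntd`, `cntw`, `cnth`). The C++ sketch below models this with a concrete lane count instead of a runtime `vscale`. The helper names are hypothetical, and this is not the code used by the legalizer.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Models `index z0.<T>, #0, #1` for a register holding N lanes.
inline std::vector<uint64_t> stepVector(uint64_t N) {
  std::vector<uint64_t> V(N);
  for (uint64_t I = 0; I < N; ++I)
    V[I] = I;
  return V;
}

// Hypothetical model of splitting a wide step vector into Parts registers of
// LanesPerReg lanes each: part P is the base step vector plus a splat of
// P * LanesPerReg (the value cntd/cntw/cnth supply at run time).
inline std::vector<std::vector<uint64_t>> splitStepVector(uint64_t LanesPerReg,
                                                          uint64_t Parts) {
  assert(LanesPerReg > 0 && Parts > 0 && "need at least one lane and part");
  const std::vector<uint64_t> Base = stepVector(LanesPerReg);
  std::vector<std::vector<uint64_t>> Result;
  for (uint64_t P = 0; P < Parts; ++P) {
    std::vector<uint64_t> Part(Base);
    for (uint64_t &Lane : Part)
      Lane += P * LanesPerReg; // add the splatted offset for this part
    Result.push_back(Part);
  }
  return Result;
}

// Concatenates the parts back into one wide vector.
inline std::vector<uint64_t>
flatten(const std::vector<std::vector<uint64_t>> &Parts) {
  std::vector<uint64_t> Out;
  for (const auto &Part : Parts)
    Out.insert(Out.end(), Part.begin(), Part.end());
  return Out;
}
```

With four 32-bit lanes per register (the minimum SVE vector length), `flatten(splitStepVector(4, 4))` equals `stepVector(16)`, matching the four result registers z0 to z3 in `stepvector_nxv16i32`.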

llvm/test/Verifier/stepvector-intrinsic.ll

This file was added.

				; RUN: not opt -S -verify < %s 2>&1 | FileCheck %s

				; Reject stepvector intrinsics that return a scalar

				define i32 @stepvector_i32() {
				; CHECK: Intrinsic has incorrect return type!
				%1 = call i32 @llvm.experimental.stepvector.i32()
				ret i32 %1
				}

				; Reject vectors with non-integer elements

				define <vscale x 4 x float> @stepvector_float() {
				; CHECK: experimental_stepvector only supported for vectors of integers with a bitwidth of at least 8
				%1 = call <vscale x 4 x float> @llvm.experimental.stepvector.nxv4f32()
				ret <vscale x 4 x float> %1
				}

				; Reject vectors of integers less than 8 bits in width

				define <vscale x 16 x i1> @stepvector_i1() {
				; CHECK: experimental_stepvector only supported for vectors of integers with a bitwidth of at least 8
				%1 = call <vscale x 16 x i1> @llvm.experimental.stepvector.nxv16i1()
				ret <vscale x 16 x i1> %1
				}

				declare i32 @llvm.experimental.stepvector.i32()
				declare <vscale x 4 x float> @llvm.experimental.stepvector.nxv4f32()
				declare <vscale x 16 x i1> @llvm.experimental.stepvector.nxv16i1()
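
The three rejected cases in the verifier test above amount to two rules: the return type must be a vector, and its elements must be integers at least 8 bits wide. The C++ sketch below restates those rules over a simplified type description. `TypeDesc`, `StepVectorCheck` and `checkStepVectorType` are hypothetical names for illustration, not the code in Verifier.cpp.

```cpp
#include <cassert>

// Hypothetical, simplified description of a return type (not an LLVM class).
struct TypeDesc {
  bool IsVector;         // false for scalars such as i32
  bool IsIntegerElement; // false for float elements
  unsigned ElementBits;  // width of one element in bits
};

enum class StepVectorCheck { Ok, NotAVector, BadElementType };

// Mirrors the two diagnostics exercised by the test: a scalar return type is
// rejected outright; a vector must have integer elements of >= 8 bits.
inline StepVectorCheck checkStepVectorType(const TypeDesc &Ty) {
  if (!Ty.IsVector)
    return StepVectorCheck::NotAVector;
  if (!Ty.IsIntegerElement || Ty.ElementBits < 8)
    return StepVectorCheck::BadElementType;
  return StepVectorCheck::Ok;
}
```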

llvm/unittests/CodeGen/AArch64SelectionDAGTest.cpp

				TEST_F(AArch64SelectionDAGTest, getTypeConversion_NoScalarizeEVT_nxv1f128) {
				if (!TM)
				return;

				EVT FromVT = EVT::getVectorVT(Context, MVT::f128, 1, true);
				EXPECT_DEATH(getTypeAction(FromVT), "Cannot legalize this vector");
				}

				TEST_F(AArch64SelectionDAGTest, TestFold_STEP_VECTOR) {
				if (!TM)
				return;

				SDLoc Loc;
				auto IntVT = EVT::getIntegerVT(Context, 8);
				auto VecVT = EVT::getVectorVT(Context, MVT::i8, 16, true);

				// Should create SPLAT_VECTOR
				SDValue Zero = DAG->getConstant(0, Loc, IntVT);
				SDValue Op = DAG->getNode(ISD::STEP_VECTOR, Loc, VecVT, Zero);
				EXPECT_EQ(Op.getOpcode(), ISD::SPLAT_VECTOR);
				}

				} // end namespace llvm
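
The TestFold_STEP_VECTOR unit test above expects a STEP_VECTOR node with a step of 0 to come back as a SPLAT_VECTOR. The reason is arithmetic: if lane I holds I * Step, a step of 0 makes every lane 0, which is a splat of 0. A small C++ sketch of that identity, using hypothetical helpers in place of SelectionDAG nodes:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model of a step vector with a constant stride:
// lane I holds I * Step.
inline std::vector<int64_t> stepVectorWithStride(uint64_t N, int64_t Step) {
  std::vector<int64_t> V(N);
  for (uint64_t I = 0; I < N; ++I)
    V[I] = static_cast<int64_t>(I) * Step;
  return V;
}

// Hypothetical model of a splat: every lane holds the same value.
inline std::vector<int64_t> splat(uint64_t N, int64_t Value) {
  return std::vector<int64_t>(N, Value);
}
```

Here `stepVectorWithStride(16, 0)` and `splat(16, 0)` compare equal, whereas a step of 1 gives the `<0, 1, ...>` sequence described in the summary.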

llvm/unittests/IR/IRBuilderTest.cpp

		Call = Builder.CreateIntrinsic(Intrinsic::masked_load,
		{VecTy, PtrToVecTy}, ArgTys,
		nullptr, "masked.load");
		FTy = Call->getFunctionType();
		EXPECT_EQ(FTy->getReturnType(), VecTy);
		for (unsigned i = 0; i != ArgTys.size(); ++i)
		EXPECT_EQ(FTy->getParamType(i), ArgTys[i]->getType());
		}

		TEST_F(IRBuilderTest, CreateStepVector) {
		IRBuilder<> Builder(BB);

		// Fixed width vectors
		Type *DstVecTy = VectorType::get(Builder.getInt32Ty(), 4, false);
		Value *StepVec = Builder.CreateStepVector(DstVecTy);
		EXPECT_TRUE(isa<Constant>(StepVec));
		EXPECT_EQ(StepVec->getType(), DstVecTy);

		const auto *VectorValue = cast<Constant>(StepVec);
		for (unsigned i = 0; i < 4; i++) {
		Lint: Pre-merge checks: clang-tidy: warning: invalid case style for variable 'i' [readability-identifier-naming]
		EXPECT_TRUE(isa<ConstantInt>(VectorValue->getAggregateElement(i)));
		ConstantInt *El = cast<ConstantInt>(VectorValue->getAggregateElement(i));
		EXPECT_EQ(El->getValue(), i);
		}

		// Scalable vectors
		DstVecTy = VectorType::get(Builder.getInt32Ty(), 4, true);
		StepVec = Builder.CreateStepVector(DstVecTy);
		EXPECT_TRUE(isa<CallInst>(StepVec));
		CallInst *Call = cast<CallInst>(StepVec);
		FunctionType *FTy = Call->getFunctionType();
		EXPECT_EQ(FTy->getReturnType(), DstVecTy);
		EXPECT_EQ(Call->getIntrinsicID(), Intrinsic::experimental_stepvector);
		}

		TEST_F(IRBuilderTest, ConstrainedFP) {
		IRBuilder<> Builder(BB);
		Value *V;
		Value *VDouble;
		Value *VInt;
		CallInst *Call;
		IntrinsicInst *II;
		GlobalVariable *GVDouble = new GlobalVariable(M, Type::getDoubleTy(Ctx),


Diff 332285
