This is an archive of the discontinued LLVM Phabricator instance.

Differential D92760

[SelectionDAG] Implement SplitVecOp_INSERT_SUBVECTOR
Closed, Public

Authored by joechrisellis on Dec 7 2020, 6:10 AM.

Details

Reviewers

peterwaller-arm
DavidTruby
kmclaughlin
paulwalker-arm

Commits

rGd863a0ddebc8: [SelectionDAG] Implement SplitVecOp_INSERT_SUBVECTOR

Summary

This function is needed for when it is necessary to split the subvector
operand of an llvm.experimental.vector.insert call. Splitting the
subvector operand means performing two insertions: one inserting the
lower part of the split subvector into the destination vector, and
another for inserting the upper part.

From our experiments, it seems quite rare to need to split the subvector
operand, but this function is necessary to avoid assertion errors.
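The two-insertion idea from the summary can be sketched outside of LLVM. The following is a minimal model, assuming plain `std::vector<int>` stands in for a fixed-length vector; the helper names `insertSubvector` and `insertSplitSubvector` are hypothetical and are not part of the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model of INSERT_SUBVECTOR on fixed-length vectors: overwrite
// Vec[Idx .. Idx + Sub.size()) with Sub and return the result.
std::vector<int> insertSubvector(std::vector<int> Vec,
                                 const std::vector<int> &Sub,
                                 std::size_t Idx) {
  assert(Idx + Sub.size() <= Vec.size() && "subvector out of bounds");
  for (std::size_t I = 0; I < Sub.size(); ++I)
    Vec[Idx + I] = Sub[I];
  return Vec;
}

// Split Sub into a low and a high half and insert each half separately:
// the low half at Idx, the high half at Idx + (number of low elements).
std::vector<int> insertSplitSubvector(const std::vector<int> &Vec,
                                      const std::vector<int> &Sub,
                                      std::size_t Idx) {
  std::size_t LoElts = Sub.size() / 2;
  std::vector<int> Lo(Sub.begin(), Sub.begin() + LoElts);
  std::vector<int> Hi(Sub.begin() + LoElts, Sub.end());
  std::vector<int> First = insertSubvector(Vec, Lo, Idx);
  return insertSubvector(First, Hi, Idx + LoElts);
}
```

In this model, inserting the two halves one after the other gives the same vector as a single insertion of the whole subvector, which is the property the split relies on.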

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

joechrisellis created this revision. Dec 7 2020, 6:10 AM

Herald added subscribers: ecnelises, hiraditya. Dec 7 2020, 6:10 AM

joechrisellis requested review of this revision. Dec 7 2020, 6:10 AM

Herald added a project: Restricted Project. Dec 7 2020, 6:10 AM

Herald added a subscriber: llvm-commits.

Tests missing

@lebedev.ri hi, I am aware of this. Changes planned. 🙂

Harbormaster completed remote builds in B81283: Diff 309892. Dec 7 2020, 6:58 AM

Add test.

Harbormaster completed remote builds in B81409: Diff 310122. Dec 8 2020, 3:45 AM

Hi @joechrisellis, hope you don't mind but I've added @paulwalker-arm and @kmclaughlin as reviewers, due to the legalisation work they've been involved in for the last few months!

Is it possible to add a brief description of what the intent of this patch is? It's not immediately obvious why we're adding the split function from the code or summary. I'm assuming it's related to the new intrinsic in your test, since LLVM has not needed this function until now?

In D92760#2439765, @david-arm wrote:

Is it possible to add a brief description of what the intent of this patch is? It's not immediately obvious why we're adding the split function from the code or summary. I'm assuming it's related to the new intrinsic in your test, since LLVM has not needed this function until now?

Yes, good idea. I am not sure, but I think it might have actually been possible to hit a codepath that expects the split function to exist even without the new llvm.experimental.vector.insert intrinsics (unless the INSERT_SUBVECTOR ISD nodes are never created?). From our experiments, it's pretty rare that we end up needing this function even with the intrinsics anyway. But it does prevent an assertion error in certain circumstances. 🙂

Also, it might be worth noting that operand splitting is already implemented for EXTRACT_SUBVECTOR.

cameron.mcinally added a subscriber: cameron.mcinally. Dec 8 2020, 12:33 PM

If you can find a test that exposes the need for this split function with generic IR that would be really helpful I think - LLVM has managed for a long time without needing this split function so something has changed. I suspect it's probably just the new intrinsic that now exposes this code path.

llvm/test/CodeGen/AArch64/split-vector-insert.ll
7	Could you add a few more tests here, for example test floating point (nxv2f64.v8f64) and predicate vectors (nxv2i1.v8i1)?

In D92760#2442179, @david-arm wrote:

If you can find a test that exposes the need for this split function with generic IR that would be really helpful I think - LLVM has managed for a long time without needing this split function so something has changed. I suspect it's probably just the new intrinsic that now exposes this code path.

Personally I think using the intrinsic as the test is better than trying to rely on an exact sequence of unluckiness to generate the failure.

For information I believe scalable vectors is what has changed. Specifically the changes we've made to INSERT_SUBVECTOR to allow a fixed-length vector to be extracted from/inserted into a scalable vector. Before this you'd only ever insert "fixed into fixed" or "scalable into scalable". Given the nature of INSERT_SUBVECTOR the operand is always smaller than the result and thus if the operand type is illegal then the result type must also be illegal, in which case SplitVecRes_INSERT_SUBVECTOR will handle type legalisation. However by allowing mixed vector types we've opened up the possibility for extracting an illegal vector type, that requires splitting to be made legal, from a legal one.
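The legality argument above can be illustrated with a toy model. This is only a sketch: `ToyVT`, `isLegal` and `needsOperandSplit` are hypothetical names, and the rule "at most 128 known-minimum bits is legal" is an assumption made for illustration, not how a real target decides legality.

```cpp
#include <cassert>

// Hypothetical stand-in for a vector type: the number of bits known at
// compile time (a minimum for scalable vectors) plus a scalable flag.
struct ToyVT {
  unsigned MinBits;
  bool Scalable;
};

// Assumed rule for this sketch only: anything up to 128 (minimum) bits
// fits in one register and is therefore legal.
bool isLegal(ToyVT VT) { return VT.MinBits <= 128; }

// True when the subvector operand needs splitting although the result
// type is already legal, i.e. the case the operand-splitting function
// has to handle.
bool needsOperandSplit(ToyVT Result, ToyVT SubVec) {
  return isLegal(Result) && !isLegal(SubVec);
}
```

With fixed-into-fixed insertion the operand is never wider than the result, so an illegal operand implies an illegal result and this predicate is false. With a scalable result such as nxv2i64 (128 minimum bits) and a fixed v8i64 operand (512 bits) it becomes true.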

Hi @paulwalker-arm, right, that makes sense, and thanks for the explanation!

paulwalker-arm added inline comments. Dec 9 2020, 3:19 AM

llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
2288–2292	Is this protection necessary? From what I can see the code below should work for all valid forms of INSERT_SUBVECTOR. That's to say you don't need to worry about the case of inserting a scalable vector into a fixed length vector because that is not a valid use of INSERT_SUBVECTOR and thus should be caught before getting here.
llvm/test/CodeGen/AArch64/split-vector-insert.ll
55	Is this required? I'm wondering if simply passing the subvector as a function parameter (or loading it from memory) and returning the scalable result directly, leads to a simpler test.

paulwalker-arm added inline comments. Dec 9 2020, 3:27 AM

llvm/test/CodeGen/AArch64/split-vector-insert.ll
3	This RUN line requires an asserts build so the test will need `; REQUIRES: asserts` . However, for what it's worth it seems overkill to test both the code generation output and the result of the legaliser with generally the assembler output from llc usually being enough.

Add more tests, address @paulwalker-arm and @david-arm's review comments.

Remove unneeded files.

llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
2288–2292	It might not be necessary, but since `SplitVecOp_EXTRACT_SUBVECTOR` has a similar check I would like to keep this for the time being!
llvm/test/CodeGen/AArch64/split-vector-insert.ll
3	I've added the `; REQUIRES: asserts` line. I am a little reluctant to remove the legaliser checks because without them it is not clear we are testing `SplitVecOp_INSERT_SUBVECTOR`?
7	Done for floating point vectors -- I also tried to hit this codepath with predicate vectors but couldn't find a test case that works. Please ping if you think this is something we definitely need! 😄
55	I think it is necessary. I did try to simplify this test further, but didn't get anywhere. FWIW, calls to `SplitVecOp_INSERT_SUBVECTOR` seem to be very rare. I was unable to write a test by hand that exercised this codepath -- I actually got this test by using creduce + bugpoint to reduce a failure we had in a project that uses the ACLE. It seems that simpler examples, like what you suggest, allow the compiler to factor out the `INSERT_SUBVECTOR` ISD node earlier on.

paulwalker-arm added inline comments. Dec 9 2020, 7:35 AM

llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
2288–2292	I believe in `SplitVecOp_EXTRACT_SUBVECTOR`'s case there is a check for a legitimate scenario that if hit may require an update to the function. However, in your instance the check is simply validating the DAG, which is not necessary because that's the job of `SelectionDAG::getNode`, of which I can see: assert((VT.isScalableVector() || N2VT.isFixedLengthVector()) && "Cannot insert a scalable vector into a fixed length vector!"); So unless the check serves a different purpose the code looks redundant and redundant code should be removed. I guess if you are super paranoid you could replicate `SelectionDAG::getNode`'s assert but I really don't see the need.
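For readers unfamiliar with the condition in the quoted assert, here is a small standalone sketch of the same predicate. `VecKind` and `isValidInsertSubvector` are hypothetical names used only for illustration; the real check operates on `EVT` values inside `SelectionDAG::getNode`.

```cpp
#include <cassert>

// Hypothetical two-state stand-in for "fixed length" vs "scalable".
enum class VecKind { Fixed, Scalable };

// Same shape as the quoted condition: the result must be scalable, or
// the inserted subvector must be fixed length. The only rejected
// combination is a scalable subvector going into a fixed-length result.
bool isValidInsertSubvector(VecKind Result, VecKind SubVec) {
  return Result == VecKind::Scalable || SubVec == VecKind::Fixed;
}
```

Three of the four combinations pass, so code that runs after node creation only ever sees fixed-into-fixed, fixed-into-scalable, or scalable-into-scalable insertions.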

Harbormaster completed remote builds in B81630: Diff 310518. Dec 9 2020, 7:48 AM

Harbormaster completed remote builds in B81631: Diff 310521.

Remove redundant check for valid INSERT_SUBVECTOR ISD node.

joechrisellis added inline comments. Dec 9 2020, 8:02 AM

llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
2288–2292	Ahh okay, that makes sense -- my bad! Will remove this check. 🙂

Simplify test.

Harbormaster completed remote builds in B81642: Diff 310539. Dec 9 2020, 9:03 AM

Harbormaster completed remote builds in B81650: Diff 310558. Dec 9 2020, 10:03 AM

A few stylistic things to consider (I'm only really bothered about losing the v32 tests) but otherwise looks good.

llvm/test/CodeGen/AArch64/split-vector-insert.ll
2	It's up to you but I prefer to keep RUN lines minimal, so: `-o -` is not needed, as that is the default when redirecting files into llc; `-mtriple=aarch64--` can be replaced with `target triple = "aarch64-unknown-linux-gnu"`; and `-mcpu=a64fx` can be replaced with a function attribute `attributes #0 = { "target-features"="+sve" }`, remembering to reference #0 in the function definitions.
3	Again your choice, but you could drop this then just use CHECK for the code generation validation, which is more in keeping with the majority of the code generation tests.
117	I don't believe this and the other v32 test offer any value. The v8 tests are already testing nested (two levels) type legalisation, so there's no reason to test it again at four levels. Add to this the fact that the generated code is pretty unreadable, and I recommend removing them.
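The "two levels" versus "four levels" remark can be made concrete with a short sketch. It assumes the subvector is halved repeatedly until each piece has a legal element count; `collectInsertIndices` is a hypothetical helper, and the order in which the real legaliser creates nodes may differ.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Collect the insertion indices produced when a subvector of NumElts
// elements, inserted at Idx, is halved until each piece has at most
// LegalElts elements. Assumes NumElts is LegalElts times a power of two.
void collectInsertIndices(uint64_t Idx, uint64_t NumElts, uint64_t LegalElts,
                          std::vector<uint64_t> &Out) {
  if (NumElts <= LegalElts) {
    Out.push_back(Idx); // This piece is legal: one insertion at Idx.
    return;
  }
  uint64_t LoElts = NumElts / 2;
  collectInsertIndices(Idx, LoElts, LegalElts, Out);                    // low half
  collectInsertIndices(Idx + LoElts, NumElts - LoElts, LegalElts, Out); // high half
}
```

For a v8i64 subvector with two-element legal pieces, this yields four insertions after two levels of halving. A v32 subvector goes through four levels and yields sixteen, exercising the same code path more times.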

This revision is now accepted and ready to land. Dec 10 2020, 5:04 AM

Address @paulwalker-arm's style comments.

Thanks for the comments! 😄

Harbormaster completed remote builds in B81825: Diff 310862. Dec 10 2020, 6:36 AM

paulwalker-arm added inline comments. Dec 10 2020, 7:11 AM

llvm/test/CodeGen/AArch64/split-vector-insert.ll
3	--check-prefix=CHECK is not needed as it's the default when no --check-prefix is specified.

Remove redundant --check-prefix=CHECK.

Harbormaster completed remote builds in B81984: Diff 311135. Dec 11 2020, 1:29 AM

This revision was landed with ongoing or failed builds. Dec 11 2020, 3:08 AM

Closed by commit rGd863a0ddebc8: [SelectionDAG] Implement SplitVecOp_INSERT_SUBVECTOR (authored by joechrisellis).

This revision was automatically updated to reflect the committed changes.

joechrisellis added a commit: rGd863a0ddebc8: [SelectionDAG] Implement SplitVecOp_INSERT_SUBVECTOR.

Revision Contents

Path	Size
llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h	1 line
llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp	27 lines
llvm/test/CodeGen/AArch64/split-vector-insert.ll	115 lines

Diff 311155

llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h

 ...
 private:
 ...
   bool SplitVectorOperand(SDNode *N, unsigned OpNo);
   SDValue SplitVecOp_VSELECT(SDNode *N, unsigned OpNo);
   SDValue SplitVecOp_VECREDUCE(SDNode *N, unsigned OpNo);
   SDValue SplitVecOp_VECREDUCE_SEQ(SDNode *N);
   SDValue SplitVecOp_UnaryOp(SDNode *N);
   SDValue SplitVecOp_TruncateHelper(SDNode *N);

   SDValue SplitVecOp_BITCAST(SDNode *N);
+  SDValue SplitVecOp_INSERT_SUBVECTOR(SDNode *N, unsigned OpNo);
   SDValue SplitVecOp_EXTRACT_SUBVECTOR(SDNode *N);
   SDValue SplitVecOp_EXTRACT_VECTOR_ELT(SDNode *N);
   SDValue SplitVecOp_ExtVecInRegOp(SDNode *N);
   SDValue SplitVecOp_STORE(StoreSDNode *N, unsigned OpNo);
   SDValue SplitVecOp_MSTORE(MaskedStoreSDNode *N, unsigned OpNo);
   SDValue SplitVecOp_MSCATTER(MaskedScatterSDNode *N, unsigned OpNo);
   SDValue SplitVecOp_MGATHER(MaskedGatherSDNode *MGT, unsigned OpNo);
   SDValue SplitVecOp_CONCAT_VECTORS(SDNode *N);
 ...

llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp

 ...
 #ifndef NDEBUG
 ...
     dbgs() << "\n";
 #endif
     report_fatal_error("Do not know how to split this operator's "
                        "operand!\n");

   case ISD::SETCC: Res = SplitVecOp_VSETCC(N); break;
   case ISD::BITCAST: Res = SplitVecOp_BITCAST(N); break;
   case ISD::EXTRACT_SUBVECTOR: Res = SplitVecOp_EXTRACT_SUBVECTOR(N); break;
+  case ISD::INSERT_SUBVECTOR: Res = SplitVecOp_INSERT_SUBVECTOR(N, OpNo); break;
   case ISD::EXTRACT_VECTOR_ELT:Res = SplitVecOp_EXTRACT_VECTOR_ELT(N); break;
   case ISD::CONCAT_VECTORS: Res = SplitVecOp_CONCAT_VECTORS(N); break;
   case ISD::TRUNCATE:
     Res = SplitVecOp_TruncateHelper(N);
     break;
   case ISD::STRICT_FP_ROUND:
   case ISD::FP_ROUND: Res = SplitVecOp_FP_ROUND(N); break;
   case ISD::FCOPYSIGN: Res = SplitVecOp_FCOPYSIGN(N); break;
 ...
 SDValue DAGTypeLegalizer::SplitVecOp_BITCAST(SDNode *N) {
 ...

   if (DAG.getDataLayout().isBigEndian())
     std::swap(Lo, Hi);

   return DAG.getNode(ISD::BITCAST, SDLoc(N), N->getValueType(0),
                      JoinIntegers(Lo, Hi));
 }

+SDValue DAGTypeLegalizer::SplitVecOp_INSERT_SUBVECTOR(SDNode *N,
+                                                      unsigned OpNo) {
+  assert(OpNo == 1 && "Invalid OpNo; can only split SubVec.");
+  // We know that the result type is legal.
+  EVT ResVT = N->getValueType(0);
+
+  SDValue Vec = N->getOperand(0);
+  SDValue SubVec = N->getOperand(1);
+  SDValue Idx = N->getOperand(2);
+  SDLoc dl(N);
+
+  SDValue Lo, Hi;
+  GetSplitVector(SubVec, Lo, Hi);
+
+  uint64_t IdxVal = cast<ConstantSDNode>(Idx)->getZExtValue();
+  uint64_t LoElts = Lo.getValueType().getVectorMinNumElements();
+
+  SDValue FirstInsertion =
+      DAG.getNode(ISD::INSERT_SUBVECTOR, dl, ResVT, Vec, Lo, Idx);
+  SDValue SecondInsertion =
+      DAG.getNode(ISD::INSERT_SUBVECTOR, dl, ResVT, FirstInsertion, Hi,
+                  DAG.getVectorIdxConstant(IdxVal + LoElts, dl));
+
+  return SecondInsertion;
+}
+
 SDValue DAGTypeLegalizer::SplitVecOp_EXTRACT_SUBVECTOR(SDNode *N) {
   // We know that the extracted result type is legal.
   EVT SubVT = N->getValueType(0);

   SDValue Idx = N->getOperand(1);
   SDLoc dl(N);
   SDValue Lo, Hi;

 ...

llvm/test/CodeGen/AArch64/split-vector-insert.ll

This file was added.

				; RUN: llc < %s -debug-only=legalize-types 2>&1 | FileCheck %s --check-prefix=CHECK-LEGALIZATION
				; RUN: llc < %s | FileCheck %s
				; REQUIRES: asserts

				target triple = "aarch64-unknown-linux-gnu"
				attributes #0 = {"target-features"="+sve"}

				declare <vscale x 2 x i64> @llvm.experimental.vector.insert.nxv2i64.v8i64(<vscale x 2 x i64>, <8 x i64>, i64)
				declare <vscale x 2 x double> @llvm.experimental.vector.insert.nxv2f64.v8f64(<vscale x 2 x double>, <8 x double>, i64)

				define <vscale x 2 x i64> @test_nxv2i64_v8i64(<vscale x 2 x i64> %a, <8 x i64> %b) #0 {
				; CHECK-LEGALIZATION: Legally typed node: [[T1:t[0-9]+]]: nxv2i64 = insert_subvector {{t[0-9]+}}, {{t[0-9]+}}, Constant:i64<0>
				; CHECK-LEGALIZATION: Legally typed node: [[T2:t[0-9]+]]: nxv2i64 = insert_subvector [[T1]], {{t[0-9]+}}, Constant:i64<2>
				; CHECK-LEGALIZATION: Legally typed node: [[T3:t[0-9]+]]: nxv2i64 = insert_subvector [[T2]], {{t[0-9]+}}, Constant:i64<4>
				; CHECK-LEGALIZATION: Legally typed node: [[T4:t[0-9]+]]: nxv2i64 = insert_subvector [[T3]], {{t[0-9]+}}, Constant:i64<6>

				; CHECK-LABEL: test_nxv2i64_v8i64:
				; CHECK: // %bb.0:
				; CHECK-NEXT: str x29, [sp, #-16]! // 8-byte Folded Spill
				; CHECK-NEXT: addvl sp, sp, #-4
				; CHECK-NEXT: .cfi_escape 0x0f, 0x0c, 0x8f, 0x00, 0x11, 0x10, 0x22, 0x11, 0x20, 0x92, 0x2e, 0x00, 0x1e, 0x22 // sp + 16 + 32 * VG
				; CHECK-NEXT: .cfi_offset w29, -16
				; CHECK-NEXT: cntd x8
				; CHECK-NEXT: sub x8, x8, #1 // =1
				; CHECK-NEXT: cmp x8, #0 // =0
				; CHECK-NEXT: csel x10, x8, xzr, lo
				; CHECK-NEXT: ptrue p0.d
				; CHECK-NEXT: mov x9, sp
				; CHECK-NEXT: lsl x10, x10, #3
				; CHECK-NEXT: st1d { z0.d }, p0, [sp]
				; CHECK-NEXT: str q1, [x9, x10]
				; CHECK-NEXT: addvl x10, sp, #1
				; CHECK-NEXT: ld1d { z0.d }, p0/z, [sp]
				; CHECK-NEXT: mov w9, #2
				; CHECK-NEXT: cmp x8, #2 // =2
				; CHECK-NEXT: csel x9, x8, x9, lo
				; CHECK-NEXT: lsl x9, x9, #3
				; CHECK-NEXT: st1d { z0.d }, p0, [sp, #1, mul vl]
				; CHECK-NEXT: str q2, [x10, x9]
				; CHECK-NEXT: addvl x10, sp, #2
				; CHECK-NEXT: ld1d { z0.d }, p0/z, [sp, #1, mul vl]
				; CHECK-NEXT: mov w9, #4
				; CHECK-NEXT: cmp x8, #4 // =4
				; CHECK-NEXT: csel x9, x8, x9, lo
				; CHECK-NEXT: lsl x9, x9, #3
				; CHECK-NEXT: st1d { z0.d }, p0, [sp, #2, mul vl]
				; CHECK-NEXT: str q3, [x10, x9]
				; CHECK-NEXT: addvl x10, sp, #3
				; CHECK-NEXT: ld1d { z0.d }, p0/z, [sp, #2, mul vl]
				; CHECK-NEXT: mov w9, #6
				; CHECK-NEXT: cmp x8, #6 // =6
				; CHECK-NEXT: csel x8, x8, x9, lo
				; CHECK-NEXT: lsl x8, x8, #3
				; CHECK-NEXT: st1d { z0.d }, p0, [sp, #3, mul vl]
				; CHECK-NEXT: str q4, [x10, x8]
				; CHECK-NEXT: ld1d { z0.d }, p0/z, [sp, #3, mul vl]
				; CHECK-NEXT: addvl sp, sp, #4
				; CHECK-NEXT: ldr x29, [sp], #16 // 8-byte Folded Reload
				; CHECK-NEXT: ret
				%r = call <vscale x 2 x i64> @llvm.experimental.vector.insert.nxv2i64.v8i64(<vscale x 2 x i64> %a, <8 x i64> %b, i64 0)
				ret <vscale x 2 x i64> %r
				}

				define <vscale x 2 x double> @test_nxv2f64_v8f64(<vscale x 2 x double> %a, <8 x double> %b) #0 {
				; CHECK-LEGALIZATION: Legally typed node: [[T1:t[0-9]+]]: nxv2f64 = insert_subvector {{t[0-9]+}}, {{t[0-9]+}}, Constant:i64<0>
				; CHECK-LEGALIZATION: Legally typed node: [[T2:t[0-9]+]]: nxv2f64 = insert_subvector [[T1]], {{t[0-9]+}}, Constant:i64<2>
				; CHECK-LEGALIZATION: Legally typed node: [[T3:t[0-9]+]]: nxv2f64 = insert_subvector [[T2]], {{t[0-9]+}}, Constant:i64<4>
				; CHECK-LEGALIZATION: Legally typed node: [[T4:t[0-9]+]]: nxv2f64 = insert_subvector [[T3]], {{t[0-9]+}}, Constant:i64<6>

				; CHECK-LABEL: test_nxv2f64_v8f64:
				; CHECK: // %bb.0:
				; CHECK-NEXT: str x29, [sp, #-16]! // 8-byte Folded Spill
				; CHECK-NEXT: addvl sp, sp, #-4
				; CHECK-NEXT: .cfi_escape 0x0f, 0x0c, 0x8f, 0x00, 0x11, 0x10, 0x22, 0x11, 0x20, 0x92, 0x2e, 0x00, 0x1e, 0x22 // sp + 16 + 32 * VG
				; CHECK-NEXT: .cfi_offset w29, -16
				; CHECK-NEXT: cntd x8
				; CHECK-NEXT: sub x8, x8, #1 // =1
				; CHECK-NEXT: cmp x8, #0 // =0
				; CHECK-NEXT: csel x10, x8, xzr, lo
				; CHECK-NEXT: ptrue p0.d
				; CHECK-NEXT: mov x9, sp
				; CHECK-NEXT: lsl x10, x10, #3
				; CHECK-NEXT: st1d { z0.d }, p0, [sp]
				; CHECK-NEXT: str q1, [x9, x10]
				; CHECK-NEXT: addvl x10, sp, #1
				; CHECK-NEXT: ld1d { z0.d }, p0/z, [sp]
				; CHECK-NEXT: mov w9, #2
				; CHECK-NEXT: cmp x8, #2 // =2
				; CHECK-NEXT: csel x9, x8, x9, lo
				; CHECK-NEXT: lsl x9, x9, #3
				; CHECK-NEXT: st1d { z0.d }, p0, [sp, #1, mul vl]
				; CHECK-NEXT: str q2, [x10, x9]
				; CHECK-NEXT: addvl x10, sp, #2
				; CHECK-NEXT: ld1d { z0.d }, p0/z, [sp, #1, mul vl]
				; CHECK-NEXT: mov w9, #4
				; CHECK-NEXT: cmp x8, #4 // =4
				; CHECK-NEXT: csel x9, x8, x9, lo
				; CHECK-NEXT: lsl x9, x9, #3
				; CHECK-NEXT: st1d { z0.d }, p0, [sp, #2, mul vl]
				; CHECK-NEXT: str q3, [x10, x9]
				; CHECK-NEXT: addvl x10, sp, #3
				; CHECK-NEXT: ld1d { z0.d }, p0/z, [sp, #2, mul vl]
				; CHECK-NEXT: mov w9, #6
				; CHECK-NEXT: cmp x8, #6 // =6
				; CHECK-NEXT: csel x8, x8, x9, lo
				; CHECK-NEXT: lsl x8, x8, #3
				; CHECK-NEXT: st1d { z0.d }, p0, [sp, #3, mul vl]
				; CHECK-NEXT: str q4, [x10, x8]
				; CHECK-NEXT: ld1d { z0.d }, p0/z, [sp, #3, mul vl]
				; CHECK-NEXT: addvl sp, sp, #4
				; CHECK-NEXT: ldr x29, [sp], #16 // 8-byte Folded Reload
				; CHECK-NEXT: ret
				%r = call <vscale x 2 x double> @llvm.experimental.vector.insert.nxv2f64.v8f64(<vscale x 2 x double> %a, <8 x double> %b, i64 0)
				ret <vscale x 2 x double> %r
				}