This is an archive of the discontinued LLVM Phabricator instance.

Differential D102842

[Verifier] Fail on invalid indices for {insert,extract} vector intrinsics
Abandoned, Public

Authored by joechrisellis on May 20 2021, 5:21 AM.

Details

Reviewers

david-arm
sdesmalen
cameron.mcinally
DavidTruby
peterwaller-arm

Summary

The langref (llvm/docs/LangRef.rst) specifies the following for the
llvm.experimental.vector.insert intrinsic:

``idx`` represents the starting element number at which ``subvec``
will be inserted. ``idx`` must be a constant multiple of
``subvec``'s known minimum vector length.

and the following for the llvm.experimental.vector.extract intrinsic:

The ``idx`` specifies the starting element number within ``vec``
from which a subvector is extracted. ``idx`` must be a constant
multiple of the known-minimum vector length of the result type.

These conditions were not previously enforced in the verifier, meaning
that the implementation was not entirely consistent with the langref. In
some circumstances, invalid indices were permitted silently, and in
other circumstances, an undef was spawned where a verifier error would
have been preferred.

This patch adds more checks to the verifier to ensure that these
constraints are not violated. It also updates existing tests to make
sure that they abide by the use of the intrinsic documented in the
langref.
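
As a rough illustration of the constraint quoted above, the check reduces to a single modulo test on the constant index. The sketch below is standalone C++, not the verifier code itself; `isIndexAligned` and its parameters are hypothetical names, and the known-minimum element count is passed as a plain integer rather than read from a vector type.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of LLVM. KnownMinElts models the
// known-minimum element count of the subvector (insert) or of the result
// type (extract); for <vscale x 4 x i32> and <4 x i32> alike it would be 4.
bool isIndexAligned(std::uint64_t Idx, unsigned KnownMinElts) {
  assert(KnownMinElts > 0 && "vector types have at least one element");
  // LangRef: idx must be a multiple of the known-minimum vector length.
  return Idx % KnownMinElts == 0;
}
```

Under this model, extracting a <2 x i64> at index 1 is rejected while index 2 is accepted, which matches the index changes made to the AArch64 tests in this revision.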

Diff Detail

Repository: rG LLVM Github Monorepo

Unit Tests: Failed

Time	Test
6,960 ms	x64 debian > libFuzzer.libFuzzer::fork-ubsan.test

Event Timeline

joechrisellis created this revision. May 20 2021, 5:21 AM

Herald added subscribers: dexonsmith, hiraditya. May 20 2021, 5:22 AM

joechrisellis requested review of this revision. May 20 2021, 5:22 AM

Herald added a project: Restricted Project. May 20 2021, 5:22 AM

Herald added a subscriber: llvm-commits.

joechrisellis retitled this revision from [Verifier] Fail on invalid indices in {insert,extract} vector intrinsics to [Verifier] Fail on invalid indices for {insert,extract} vector intrinsics. May 20 2021, 5:26 AM

Reviewers, especially @david-arm: there are still additional verifier checks to be implemented. In particular, we can add a verifier check for 'overrunning' the vectors with an insert/extract. This is currently handled by spawning an undef, but where we can determine this statically it makes sense to have the verifier spit an error. 🙂
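
For the fixed-length case, the overrun condition mentioned here can be decided from the types and the constant index alone. A minimal sketch, assuming all element counts are known at compile time (the names are hypothetical, and the comparison is arranged to avoid unsigned wraparound):

```cpp
#include <cstdint>

// Hypothetical helper, not LLVM code. Returns true when writing (or reading)
// SubElts elements starting at Idx would run past the end of a fixed-length
// vector with VecElts elements.
bool overrunsFixedVector(std::uint64_t Idx, std::uint64_t SubElts,
                         std::uint64_t VecElts) {
  // Equivalent to Idx + SubElts > VecElts, without risking overflow.
  return SubElts > VecElts || Idx > VecElts - SubElts;
}
```

For a scalable destination the real element count depends on vscale, so this test alone does not decide the question statically.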

Harbormaster completed remote builds in B105410: Diff 346704. May 20 2021, 6:02 AM

peterwaller-arm added a reviewer: peterwaller-arm. May 20 2021, 7:26 AM

joechrisellis mentioned this in D102907: [Verifier] Fail if vectors overrun for {insert,extract} vector intrinsics. May 21 2021, 2:37 AM

david-arm added inline comments. May 21 2021, 4:48 AM

llvm/test/CodeGen/AArch64/sve-extract-vector.ll
14–15	I think the name needs updating to idx2 now?
47–48	Same here
llvm/test/CodeGen/AArch64/sve-insert-vector.ll
112	This is just FYI and not caused by your patch, but suppose vscale = 1 and we clamp the index of 8 to 7. The `str q1, [x9, x8]` instruction actually ends up storing out 128 bits starting from index 7, which is beyond the end of temporary stack space. It looks like we could be corrupting the stack!
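
The arithmetic behind this observation can be sketched as follows, assuming the clamp is `min(idx, numElts - 1)`, as the `sub`/`cmp`/`csel` sequence in the updated CHECK lines suggests. The function names are hypothetical and all sizes are in bytes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical model of the index clamp: the index is limited to the last
// valid element of the runtime-sized vector.
std::uint64_t clampIndex(std::uint64_t Idx, std::uint64_t NumElts) {
  assert(NumElts > 0 && "vector must have at least one element");
  return std::min(Idx, NumElts - 1);
}

// One past the last byte touched by storing SubBytes at the clamped index.
std::uint64_t storeEndByte(std::uint64_t ClampedIdx, std::uint64_t EltBytes,
                           std::uint64_t SubBytes) {
  return ClampedIdx * EltBytes + SubBytes;
}
```

Assuming 16-bit elements, a vector at vscale = 1 has 8 elements in a 16-byte slot. An index of 8 clamps to 7, so a 16-byte store would end at byte 7 * 2 + 16 = 30, well past the end of the slot.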

Address review comments.

@david-arm:
- update function names.

joechrisellis added a subscriber: paulwalker-arm. May 21 2021, 8:10 AM

joechrisellis added inline comments.

llvm/test/CodeGen/AArch64/sve-insert-vector.ll
112	Hi Dave, thanks for pointing this out -- this is a great spot. Had a chat with @paulwalker-arm about this and will be looking into what the possibilities are here. A key takeaway from our conversation was that if vscale = 1 for this particular insertion then we're not strictly performing an 'insert subvector', so in some sense we could consider this abuse of the intrinsic. Most of our use-cases for the intrinsic are with an idx of 0 as well, so I don't think this will cause problems if the compilation pipeline is run end-to-end on a C program or something. Of course, we can't really defend against something like this statically, so the possibilities are limited. We discussed:
- more stringent checks at runtime -- but this could easily result in poor code quality. Perhaps there is a different/better way of checking the bounds/clamping.
- allocating more stack space than strictly necessary to give us some breathing room (? -- might have misunderstood this).
(@paulwalker-arm, if I have misunderstood anything let me know 😄)

paulwalker-arm added inline comments. May 21 2021, 9:06 AM

llvm/test/CodeGen/AArch64/sve-insert-vector.ll
112	I'll say it's not about "breathing room" but more guaranteeing there's enough stack space, so something like `max(sizeof(ResultVT), sizeof(SubVT) + clamped(idx)*sizeof(Elt))` noting that this is only a problem when inserting a fixed-length vector into a scalable vector.
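
Spelled out as code, the suggested size computation might look like the sketch below. It only illustrates the formula in this comment; the function name and the byte-sized parameters are assumptions, and nothing here describes how the backend would actually implement it.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper: bytes of stack needed so that storing the subvector
// at the clamped index stays inside the temporary slot.
//   max(sizeof(ResultVT), sizeof(SubVT) + clamped(idx) * sizeof(Elt))
std::uint64_t requiredStackBytes(std::uint64_t ResultBytes,
                                 std::uint64_t SubBytes,
                                 std::uint64_t ClampedIdx,
                                 std::uint64_t EltBytes) {
  return std::max(ResultBytes, SubBytes + ClampedIdx * EltBytes);
}
```

For a 16-byte result, a 16-byte subvector, 2-byte elements and a clamped index of 7, this gives max(16, 16 + 14) = 30 bytes; with a clamped index of 0 it stays at 16.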

Harbormaster completed remote builds in B105639: Diff 347037. May 21 2021, 9:31 AM

The patch looks good thanks @joechrisellis! Just one comment left ...

llvm/lib/IR/Verifier.cpp
5317	Hi @joechrisellis could you add a verifier test for this new Assert too?

Matt added a subscriber: Matt. May 31 2021, 9:43 AM

Fixup RISCV tests.

The removed functions were not valid uses of the intrinsic -- this was
previously never flagged by the verifier.

Herald added subscribers: frasercrmck, luismarques, apazos and 19 others. Jun 1 2021, 2:44 AM

joechrisellis marked an inline comment as done. Jun 1 2021, 3:11 AM

joechrisellis added inline comments.

llvm/lib/IR/Verifier.cpp
5317	Thanks for pointing this out -- turns out that this assertion is unnecessary because the intrinsic is defined like so:

    def int_experimental_vector_insert : DefaultAttrsIntrinsic<[llvm_anyvector_ty],
        [LLVMMatchType<0>, llvm_anyvector_ty, llvm_i64_ty],
        [IntrNoMem, ImmArg<ArgIndex<2>>]>;

The `LLVMMatchType` here means that the constraint described by this assertion is already enforced -- there must be an existing mechanism that does this for the intrinsics. I'll remove this assertion. I think it's safe to assume that the generic checking mechanism for `LLVMMatchType` is already tested, so no need to add additional tests. 😄

Address review comments.

@david-arm:
- remove superfluous assert.

Harbormaster completed remote builds in B107000: Diff 348920. Jun 1 2021, 4:25 AM

Gentle ping. 🙂

david-arm added inline comments. Jun 17 2021, 1:28 AM

llvm/test/CodeGen/RISCV/rvv/fixed-vectors-insert-subvector.ll
92	Is it worth keeping this test, but changing the index of 4 to 8?

frasercrmck added inline comments. Jun 17 2021, 1:39 AM

llvm/test/CodeGen/RISCV/rvv/fixed-vectors-insert-subvector.ll
92	Don't think so because the test below uses an index of 8. Axe it, I say. And thanks for catching this!

LGTM! Thanks for addressing the review comments!

This revision is now accepted and ready to land. Jun 17 2021, 1:41 AM

paulwalker-arm added inline comments. Jun 17 2021, 3:44 AM

llvm/lib/IR/Verifier.cpp
5317	I don't understand this assert and the matching one for `experimental_vector_extract`. When inserting a fixed length vector into a scalable vector we allow any index, assuming we are matching the logic of ISD::INSERT_SUBVECTOR. If you look at SelectionDAG::getNode you see the set of asserts we use there.
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
1866–1867	I appreciate I'm late to the party here but why do we bother transforming illegal usage? I would have thought it better to keep the intrinsic and let the verifier report an error. By transforming the illegal usage to an undef we're just hiding a bug, which might cause people to waste time investigating a non-related area.

joechrisellis added inline comments. Jun 17 2021, 4:01 AM

llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
1866–1867	There is a second patch here which I think addresses this concern: https://reviews.llvm.org/D102907

paulwalker-arm added inline comments. Jun 17 2021, 4:08 AM

llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
1866–1867	Any objections to me recommending you merge this patch with D102907 and just have a single patch to fix the verification of these intrinsics.

joechrisellis marked an inline comment as done. Jun 17 2021, 5:37 AM

joechrisellis added inline comments.

llvm/lib/IR/Verifier.cpp
5317	Discussed offline -- repeating the conclusion here for clarity. LangRef and the ISD node don’t make a distinction between the ‘inserting fixed into scalable’ and ‘inserting fixed into fixed’ case as far as the index constraints are concerned. As a result, we thought it might be better to avoid implementing something that was not documented as a thing that is supported. So will leave this as-is for now. It's easy enough to relax in the future if need be.
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
1866–1867	Nope, I can do that. 👍

Thanks for the reviews guys.

Abandoning in favour of D104468.

Cheers!

dexonsmith removed a subscriber: dexonsmith. Jun 17 2021, 10:40 AM

joechrisellis abandoned this revision. Jun 18 2021, 1:31 AM

Revision Contents

Path	Size
llvm/lib/IR/Verifier.cpp	20 lines
llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp	11 lines
llvm/test/CodeGen/AArch64/sve-extract-vector.ll	59 lines
llvm/test/CodeGen/AArch64/sve-insert-vector.ll	59 lines
llvm/test/CodeGen/RISCV/rvv/fixed-vectors-extract-subvector.ll	17 lines
llvm/test/CodeGen/RISCV/rvv/fixed-vectors-insert-subvector.ll	47 lines
llvm/test/Transforms/InstCombine/canonicalize-vector-extract.ll	23 lines
llvm/test/Transforms/InstCombine/canonicalize-vector-insert.ll	23 lines
llvm/test/Verifier/insert-extract-intrinsics-invalid.ll	16 lines

Diff 348920

llvm/lib/IR/Verifier.cpp

[preceding lines not shown; context: case Intrinsic::experimental_stepvector: {]
Assert(VecTy && VecTy->getScalarType()->isIntegerTy() &&		Assert(VecTy && VecTy->getScalarType()->isIntegerTy() &&
VecTy->getScalarSizeInBits() >= 8,		VecTy->getScalarSizeInBits() >= 8,
"experimental_stepvector only supported for vectors of integers "		"experimental_stepvector only supported for vectors of integers "
"with a bitwidth of at least 8.",		"with a bitwidth of at least 8.",
&Call);		&Call);
break;		break;
}		}
case Intrinsic::experimental_vector_insert: {		case Intrinsic::experimental_vector_insert: {
VectorType *VecTy = cast<VectorType>(Call.getArgOperand(0)->getType());		Value *Vec = Call.getArgOperand(0);
VectorType *SubVecTy = cast<VectorType>(Call.getArgOperand(1)->getType());		Value *SubVec = Call.getArgOperand(1);
		Value *Idx = Call.getArgOperand(2);
		unsigned IdxN = cast<ConstantInt>(Idx)->getZExtValue();

		VectorType *VecTy = cast<VectorType>(Vec->getType());
		VectorType *SubVecTy = cast<VectorType>(SubVec->getType());

Assert(VecTy->getElementType() == SubVecTy->getElementType(),		Assert(VecTy->getElementType() == SubVecTy->getElementType(),
"experimental_vector_insert parameters must have the same element "		"experimental_vector_insert parameters must have the same element "
"type.",		"type.",
&Call);		&Call);
		Assert(IdxN % SubVecTy->getElementCount().getKnownMinValue() == 0,
		"experimental_vector_insert index must be a constant multiple of "
		"the subvector's known minimum vector length.");
break;		break;
}		}
case Intrinsic::experimental_vector_extract: {		case Intrinsic::experimental_vector_extract: {
		Value *Vec = Call.getArgOperand(0);
		Value *Idx = Call.getArgOperand(1);
		unsigned IdxN = cast<ConstantInt>(Idx)->getZExtValue();
VectorType *ResultTy = cast<VectorType>(Call.getType());		VectorType *ResultTy = cast<VectorType>(Call.getType());
VectorType *VecTy = cast<VectorType>(Call.getArgOperand(0)->getType());		VectorType *VecTy = cast<VectorType>(Vec->getType());

Assert(ResultTy->getElementType() == VecTy->getElementType(),		Assert(ResultTy->getElementType() == VecTy->getElementType(),
"experimental_vector_extract result must have the same element "		"experimental_vector_extract result must have the same element "
"type as the input vector.",		"type as the input vector.",
&Call);		&Call);
		Assert(IdxN % ResultTy->getElementCount().getKnownMinValue() == 0,
		"experimental_vector_extract index must be a constant multiple of "
		"the result type's known minimum vector length.");
break;		break;
}		}
case Intrinsic::experimental_noalias_scope_decl: {		case Intrinsic::experimental_noalias_scope_decl: {
NoAliasScopeDecls.push_back(cast<IntrinsicInst>(&Call));		NoAliasScopeDecls.push_back(cast<IntrinsicInst>(&Call));
break;		break;
}		}
};		};
}		}

llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp

[preceding lines not shown; context: case Intrinsic::experimental_vector_insert: {]
// Only canonicalize if the destination vector, Vec, and SubVec are all		// Only canonicalize if the destination vector, Vec, and SubVec are all
// fixed vectors.		// fixed vectors.
if (DstTy && VecTy && SubVecTy) {		if (DstTy && VecTy && SubVecTy) {
unsigned DstNumElts = DstTy->getNumElements();		unsigned DstNumElts = DstTy->getNumElements();
unsigned VecNumElts = VecTy->getNumElements();		unsigned VecNumElts = VecTy->getNumElements();
unsigned SubVecNumElts = SubVecTy->getNumElements();		unsigned SubVecNumElts = SubVecTy->getNumElements();
unsigned IdxN = cast<ConstantInt>(Idx)->getZExtValue();		unsigned IdxN = cast<ConstantInt>(Idx)->getZExtValue();

// The result of this call is undefined if IdxN is not a constant multiple		// The result of this call is undefined if the insertion overruns Vec.
// of the SubVec's minimum vector length OR the insertion overruns Vec.		if (IdxN + SubVecNumElts > VecNumElts) {
if (IdxN % SubVecNumElts != 0 || IdxN + SubVecNumElts > VecNumElts) {
replaceInstUsesWith(CI, UndefValue::get(CI.getType()));		replaceInstUsesWith(CI, UndefValue::get(CI.getType()));
return eraseInstFromFunction(CI);		return eraseInstFromFunction(CI);
}		}

// An insert that entirely overwrites Vec with SubVec is a nop.		// An insert that entirely overwrites Vec with SubVec is a nop.
if (VecNumElts == SubVecNumElts) {		if (VecNumElts == SubVecNumElts) {
replaceInstUsesWith(CI, SubVec);		replaceInstUsesWith(CI, SubVec);
return eraseInstFromFunction(CI);		return eraseInstFromFunction(CI);
[35 lines not shown; context: case Intrinsic::experimental_vector_extract: {]

// Only canonicalize if the the destination vector and Vec are fixed		// Only canonicalize if the the destination vector and Vec are fixed
// vectors.		// vectors.
if (DstTy && VecTy) {		if (DstTy && VecTy) {
unsigned DstNumElts = DstTy->getNumElements();		unsigned DstNumElts = DstTy->getNumElements();
unsigned VecNumElts = VecTy->getNumElements();		unsigned VecNumElts = VecTy->getNumElements();
unsigned IdxN = cast<ConstantInt>(Idx)->getZExtValue();		unsigned IdxN = cast<ConstantInt>(Idx)->getZExtValue();

// The result of this call is undefined if IdxN is not a constant multiple		// The result of this call is undefined if the extraction overruns Vec.
// of the result type's minimum vector length OR the extraction overruns		if (IdxN + DstNumElts > VecNumElts) {
// Vec.
if (IdxN % DstNumElts != 0 || IdxN + DstNumElts > VecNumElts) {
replaceInstUsesWith(CI, UndefValue::get(CI.getType()));		replaceInstUsesWith(CI, UndefValue::get(CI.getType()));
return eraseInstFromFunction(CI);		return eraseInstFromFunction(CI);
}		}

// Extracting the entirety of Vec is a nop.		// Extracting the entirety of Vec is a nop.
if (VecNumElts == DstNumElts) {		if (VecNumElts == DstNumElts) {
replaceInstUsesWith(CI, Vec);		replaceInstUsesWith(CI, Vec);
return eraseInstFromFunction(CI);		return eraseInstFromFunction(CI);

llvm/test/CodeGen/AArch64/sve-extract-vector.ll

	; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
	; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve < %s | FileCheck %s --check-prefixes=CHECK			; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve < %s | FileCheck %s --check-prefixes=CHECK

	; Should codegen to a nop, since idx is zero.			; Should codegen to a nop, since idx is zero.
	define <2 x i64> @extract_v2i64_nxv2i64(<vscale x 2 x i64> %vec) nounwind {			define <2 x i64> @extract_v2i64_nxv2i64(<vscale x 2 x i64> %vec) nounwind {
	; CHECK-LABEL: extract_v2i64_nxv2i64:			; CHECK-LABEL: extract_v2i64_nxv2i64:
	; CHECK: // %bb.0:			; CHECK: // %bb.0:
	; CHECK-NEXT: // kill: def $q0 killed $q0 killed $z0			; CHECK-NEXT: // kill: def $q0 killed $q0 killed $z0
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	%retval = call <2 x i64> @llvm.experimental.vector.extract.v2i64.nxv2i64(<vscale x 2 x i64> %vec, i64 0)			%retval = call <2 x i64> @llvm.experimental.vector.extract.v2i64.nxv2i64(<vscale x 2 x i64> %vec, i64 0)
	ret <2 x i64> %retval			ret <2 x i64> %retval
	}			}

	; Goes through memory currently; idx != 0.			; Goes through memory currently; idx != 0.
	define <2 x i64> @extract_v2i64_nxv2i64_idx1(<vscale x 2 x i64> %vec) nounwind {			define <2 x i64> @extract_v2i64_nxv2i64_idx2(<vscale x 2 x i64> %vec) nounwind {
	; CHECK-LABEL: extract_v2i64_nxv2i64_idx1:			; CHECK-LABEL: extract_v2i64_nxv2i64_idx2:
	; CHECK: // %bb.0:			; CHECK: // %bb.0:
	; CHECK-NEXT: str x29, [sp, #-16]! // 8-byte Folded Spill			; CHECK-NEXT: str x29, [sp, #-16]! // 8-byte Folded Spill
	; CHECK-NEXT: addvl sp, sp, #-1			; CHECK-NEXT: addvl sp, sp, #-1
				; CHECK-NEXT: cntd x9
				; CHECK-NEXT: sub x9, x9, #1 // =1
				; CHECK-NEXT: mov w8, #2
				; CHECK-NEXT: cmp x9, #2 // =2
	; CHECK-NEXT: ptrue p0.d			; CHECK-NEXT: ptrue p0.d
				; CHECK-NEXT: csel x8, x9, x8, lo
	; CHECK-NEXT: st1d { z0.d }, p0, [sp]			; CHECK-NEXT: st1d { z0.d }, p0, [sp]
	; CHECK-NEXT: ldur q0, [sp, #8]			; CHECK-NEXT: lsl x8, x8, #3
				; CHECK-NEXT: mov x9, sp
				; CHECK-NEXT: ldr q0, [x9, x8]
	; CHECK-NEXT: addvl sp, sp, #1			; CHECK-NEXT: addvl sp, sp, #1
	; CHECK-NEXT: ldr x29, [sp], #16 // 8-byte Folded Reload			; CHECK-NEXT: ldr x29, [sp], #16 // 8-byte Folded Reload
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	%retval = call <2 x i64> @llvm.experimental.vector.extract.v2i64.nxv2i64(<vscale x 2 x i64> %vec, i64 1)			%retval = call <2 x i64> @llvm.experimental.vector.extract.v2i64.nxv2i64(<vscale x 2 x i64> %vec, i64 2)
	ret <2 x i64> %retval			ret <2 x i64> %retval
	}			}

	; Should codegen to a nop, since idx is zero.			; Should codegen to a nop, since idx is zero.
	define <4 x i32> @extract_v4i32_nxv4i32(<vscale x 4 x i32> %vec) nounwind {			define <4 x i32> @extract_v4i32_nxv4i32(<vscale x 4 x i32> %vec) nounwind {
	; CHECK-LABEL: extract_v4i32_nxv4i32:			; CHECK-LABEL: extract_v4i32_nxv4i32:
	; CHECK: // %bb.0:			; CHECK: // %bb.0:
	; CHECK-NEXT: // kill: def $q0 killed $q0 killed $z0			; CHECK-NEXT: // kill: def $q0 killed $q0 killed $z0
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	%retval = call <4 x i32> @llvm.experimental.vector.extract.v4i32.nxv4i32(<vscale x 4 x i32> %vec, i64 0)			%retval = call <4 x i32> @llvm.experimental.vector.extract.v4i32.nxv4i32(<vscale x 4 x i32> %vec, i64 0)
	ret <4 x i32> %retval			ret <4 x i32> %retval
	}			}

	; Goes through memory currently; idx != 0.			; Goes through memory currently; idx != 0.
	define <4 x i32> @extract_v4i32_nxv4i32_idx1(<vscale x 4 x i32> %vec) nounwind {			define <4 x i32> @extract_v4i32_nxv4i32_idx4(<vscale x 4 x i32> %vec) nounwind {
	; CHECK-LABEL: extract_v4i32_nxv4i32_idx1:			; CHECK-LABEL: extract_v4i32_nxv4i32_idx4:
	; CHECK: // %bb.0:			; CHECK: // %bb.0:
	; CHECK-NEXT: str x29, [sp, #-16]! // 8-byte Folded Spill			; CHECK-NEXT: str x29, [sp, #-16]! // 8-byte Folded Spill
	; CHECK-NEXT: addvl sp, sp, #-1			; CHECK-NEXT: addvl sp, sp, #-1
				; CHECK-NEXT: cntw x9
				; CHECK-NEXT: sub x9, x9, #1 // =1
				; CHECK-NEXT: mov w8, #4
				; CHECK-NEXT: cmp x9, #4 // =4
	; CHECK-NEXT: ptrue p0.s			; CHECK-NEXT: ptrue p0.s
				; CHECK-NEXT: csel x8, x9, x8, lo
	; CHECK-NEXT: st1w { z0.s }, p0, [sp]			; CHECK-NEXT: st1w { z0.s }, p0, [sp]
	; CHECK-NEXT: ldur q0, [sp, #4]			; CHECK-NEXT: lsl x8, x8, #2
				; CHECK-NEXT: mov x9, sp
				; CHECK-NEXT: ldr q0, [x9, x8]
	; CHECK-NEXT: addvl sp, sp, #1			; CHECK-NEXT: addvl sp, sp, #1
	; CHECK-NEXT: ldr x29, [sp], #16 // 8-byte Folded Reload			; CHECK-NEXT: ldr x29, [sp], #16 // 8-byte Folded Reload
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	%retval = call <4 x i32> @llvm.experimental.vector.extract.v4i32.nxv4i32(<vscale x 4 x i32> %vec, i64 1)			%retval = call <4 x i32> @llvm.experimental.vector.extract.v4i32.nxv4i32(<vscale x 4 x i32> %vec, i64 4)
	ret <4 x i32> %retval			ret <4 x i32> %retval
	}			}

	; Should codegen to a nop, since idx is zero.			; Should codegen to a nop, since idx is zero.
	define <8 x i16> @extract_v8i16_nxv8i16(<vscale x 8 x i16> %vec) nounwind {			define <8 x i16> @extract_v8i16_nxv8i16(<vscale x 8 x i16> %vec) nounwind {
	; CHECK-LABEL: extract_v8i16_nxv8i16:			; CHECK-LABEL: extract_v8i16_nxv8i16:
	; CHECK: // %bb.0:			; CHECK: // %bb.0:
	; CHECK-NEXT: // kill: def $q0 killed $q0 killed $z0			; CHECK-NEXT: // kill: def $q0 killed $q0 killed $z0
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	%retval = call <8 x i16> @llvm.experimental.vector.extract.v8i16.nxv8i16(<vscale x 8 x i16> %vec, i64 0)			%retval = call <8 x i16> @llvm.experimental.vector.extract.v8i16.nxv8i16(<vscale x 8 x i16> %vec, i64 0)
	ret <8 x i16> %retval			ret <8 x i16> %retval
	}			}

	; Goes through memory currently; idx != 0.			; Goes through memory currently; idx != 0.
	define <8 x i16> @extract_v8i16_nxv8i16_idx1(<vscale x 8 x i16> %vec) nounwind {			define <8 x i16> @extract_v8i16_nxv8i16_idx8(<vscale x 8 x i16> %vec) nounwind {
	; CHECK-LABEL: extract_v8i16_nxv8i16_idx1:			; CHECK-LABEL: extract_v8i16_nxv8i16_idx8:
	; CHECK: // %bb.0:			; CHECK: // %bb.0:
	; CHECK-NEXT: str x29, [sp, #-16]! // 8-byte Folded Spill			; CHECK-NEXT: str x29, [sp, #-16]! // 8-byte Folded Spill
	; CHECK-NEXT: addvl sp, sp, #-1			; CHECK-NEXT: addvl sp, sp, #-1
				; CHECK-NEXT: cnth x9
				; CHECK-NEXT: sub x9, x9, #1 // =1
				; CHECK-NEXT: mov w8, #8
				; CHECK-NEXT: cmp x9, #8 // =8
	; CHECK-NEXT: ptrue p0.h			; CHECK-NEXT: ptrue p0.h
				; CHECK-NEXT: csel x8, x9, x8, lo
	; CHECK-NEXT: st1h { z0.h }, p0, [sp]			; CHECK-NEXT: st1h { z0.h }, p0, [sp]
	; CHECK-NEXT: ldur q0, [sp, #2]			; CHECK-NEXT: lsl x8, x8, #1
				; CHECK-NEXT: mov x9, sp
				; CHECK-NEXT: ldr q0, [x9, x8]
	; CHECK-NEXT: addvl sp, sp, #1			; CHECK-NEXT: addvl sp, sp, #1
	; CHECK-NEXT: ldr x29, [sp], #16 // 8-byte Folded Reload			; CHECK-NEXT: ldr x29, [sp], #16 // 8-byte Folded Reload
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	%retval = call <8 x i16> @llvm.experimental.vector.extract.v8i16.nxv8i16(<vscale x 8 x i16> %vec, i64 1)			%retval = call <8 x i16> @llvm.experimental.vector.extract.v8i16.nxv8i16(<vscale x 8 x i16> %vec, i64 8)
	ret <8 x i16> %retval			ret <8 x i16> %retval
	}			}

	; Should codegen to a nop, since idx is zero.			; Should codegen to a nop, since idx is zero.
	define <16 x i8> @extract_v16i8_nxv16i8(<vscale x 16 x i8> %vec) nounwind {			define <16 x i8> @extract_v16i8_nxv16i8(<vscale x 16 x i8> %vec) nounwind {
	; CHECK-LABEL: extract_v16i8_nxv16i8:			; CHECK-LABEL: extract_v16i8_nxv16i8:
	; CHECK: // %bb.0:			; CHECK: // %bb.0:
	; CHECK-NEXT: // kill: def $q0 killed $q0 killed $z0			; CHECK-NEXT: // kill: def $q0 killed $q0 killed $z0
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	%retval = call <16 x i8> @llvm.experimental.vector.extract.v16i8.nxv16i8(<vscale x 16 x i8> %vec, i64 0)			%retval = call <16 x i8> @llvm.experimental.vector.extract.v16i8.nxv16i8(<vscale x 16 x i8> %vec, i64 0)
	ret <16 x i8> %retval			ret <16 x i8> %retval
	}			}

	; Goes through memory currently; idx != 0.			; Goes through memory currently; idx != 0.
	define <16 x i8> @extract_v16i8_nxv16i8_idx1(<vscale x 16 x i8> %vec) nounwind {			define <16 x i8> @extract_v16i8_nxv16i8_idx16(<vscale x 16 x i8> %vec) nounwind {
	; CHECK-LABEL: extract_v16i8_nxv16i8_idx1:			; CHECK-LABEL: extract_v16i8_nxv16i8_idx16:
	; CHECK: // %bb.0:			; CHECK: // %bb.0:
	; CHECK-NEXT: str x29, [sp, #-16]! // 8-byte Folded Spill			; CHECK-NEXT: str x29, [sp, #-16]! // 8-byte Folded Spill
	; CHECK-NEXT: addvl sp, sp, #-1			; CHECK-NEXT: addvl sp, sp, #-1
				; CHECK-NEXT: rdvl x9, #1
				; CHECK-NEXT: sub x9, x9, #1 // =1
	; CHECK-NEXT: ptrue p0.b			; CHECK-NEXT: ptrue p0.b
				; CHECK-NEXT: mov w8, #16
				; CHECK-NEXT: cmp x9, #16 // =16
	; CHECK-NEXT: st1b { z0.b }, p0, [sp]			; CHECK-NEXT: st1b { z0.b }, p0, [sp]
	; CHECK-NEXT: ldur q0, [sp, #1]			; CHECK-NEXT: csel x8, x9, x8, lo
				; CHECK-NEXT: mov x9, sp
				; CHECK-NEXT: ldr q0, [x9, x8]
	; CHECK-NEXT: addvl sp, sp, #1			; CHECK-NEXT: addvl sp, sp, #1
	; CHECK-NEXT: ldr x29, [sp], #16 // 8-byte Folded Reload			; CHECK-NEXT: ldr x29, [sp], #16 // 8-byte Folded Reload
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	%retval = call <16 x i8> @llvm.experimental.vector.extract.v16i8.nxv16i8(<vscale x 16 x i8> %vec, i64 1)			%retval = call <16 x i8> @llvm.experimental.vector.extract.v16i8.nxv16i8(<vscale x 16 x i8> %vec, i64 16)
	ret <16 x i8> %retval			ret <16 x i8> %retval
	}			}

	declare <2 x i64> @llvm.experimental.vector.extract.v2i64.nxv2i64(<vscale x 2 x i64>, i64)			declare <2 x i64> @llvm.experimental.vector.extract.v2i64.nxv2i64(<vscale x 2 x i64>, i64)
	declare <4 x i32> @llvm.experimental.vector.extract.v4i32.nxv4i32(<vscale x 4 x i32>, i64)			declare <4 x i32> @llvm.experimental.vector.extract.v4i32.nxv4i32(<vscale x 4 x i32>, i64)
	declare <8 x i16> @llvm.experimental.vector.extract.v8i16.nxv8i16(<vscale x 8 x i16>, i64)			declare <8 x i16> @llvm.experimental.vector.extract.v8i16.nxv8i16(<vscale x 8 x i16>, i64)
	declare <16 x i8> @llvm.experimental.vector.extract.v16i8.nxv16i8(<vscale x 16 x i8>, i64)			declare <16 x i8> @llvm.experimental.vector.extract.v16i8.nxv16i8(<vscale x 16 x i8>, i64)

llvm/test/CodeGen/AArch64/sve-insert-vector.ll

	[first 11 lines not shown]
	; CHECK-NEXT: ld1d { z0.d }, p0/z, [sp]			; CHECK-NEXT: ld1d { z0.d }, p0/z, [sp]
	; CHECK-NEXT: addvl sp, sp, #1			; CHECK-NEXT: addvl sp, sp, #1
	; CHECK-NEXT: ldr x29, [sp], #16 // 8-byte Folded Reload			; CHECK-NEXT: ldr x29, [sp], #16 // 8-byte Folded Reload
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	%retval = call <vscale x 2 x i64> @llvm.experimental.vector.insert.nxv2i64.v2i64(<vscale x 2 x i64> %vec, <2 x i64> %subvec, i64 0)			%retval = call <vscale x 2 x i64> @llvm.experimental.vector.insert.nxv2i64.v2i64(<vscale x 2 x i64> %vec, <2 x i64> %subvec, i64 0)
	ret <vscale x 2 x i64> %retval			ret <vscale x 2 x i64> %retval
	}			}

	define <vscale x 2 x i64> @insert_v2i64_nxv2i64_idx1(<vscale x 2 x i64> %vec, <2 x i64> %subvec) nounwind {			define <vscale x 2 x i64> @insert_v2i64_nxv2i64_idx2(<vscale x 2 x i64> %vec, <2 x i64> %subvec) nounwind {
	; CHECK-LABEL: insert_v2i64_nxv2i64_idx1:			; CHECK-LABEL: insert_v2i64_nxv2i64_idx2:
	; CHECK: // %bb.0:			; CHECK: // %bb.0:
	; CHECK-NEXT: str x29, [sp, #-16]! // 8-byte Folded Spill			; CHECK-NEXT: str x29, [sp, #-16]! // 8-byte Folded Spill
	; CHECK-NEXT: addvl sp, sp, #-1			; CHECK-NEXT: addvl sp, sp, #-1
				; CHECK-NEXT: cntd x9
				; CHECK-NEXT: sub x9, x9, #1 // =1
				; CHECK-NEXT: mov w8, #2
				; CHECK-NEXT: cmp x9, #2 // =2
				; CHECK-NEXT: csel x8, x9, x8, lo
	; CHECK-NEXT: ptrue p0.d			; CHECK-NEXT: ptrue p0.d
				; CHECK-NEXT: lsl x8, x8, #3
				; CHECK-NEXT: mov x9, sp
	; CHECK-NEXT: st1d { z0.d }, p0, [sp]			; CHECK-NEXT: st1d { z0.d }, p0, [sp]
	; CHECK-NEXT: stur q1, [sp, #8]			; CHECK-NEXT: str q1, [x9, x8]
	; CHECK-NEXT: ld1d { z0.d }, p0/z, [sp]			; CHECK-NEXT: ld1d { z0.d }, p0/z, [sp]
	; CHECK-NEXT: addvl sp, sp, #1			; CHECK-NEXT: addvl sp, sp, #1
	; CHECK-NEXT: ldr x29, [sp], #16 // 8-byte Folded Reload			; CHECK-NEXT: ldr x29, [sp], #16 // 8-byte Folded Reload
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	%retval = call <vscale x 2 x i64> @llvm.experimental.vector.insert.nxv2i64.v2i64(<vscale x 2 x i64> %vec, <2 x i64> %subvec, i64 1)			%retval = call <vscale x 2 x i64> @llvm.experimental.vector.insert.nxv2i64.v2i64(<vscale x 2 x i64> %vec, <2 x i64> %subvec, i64 2)
	ret <vscale x 2 x i64> %retval			ret <vscale x 2 x i64> %retval
	}			}

	define <vscale x 4 x i32> @insert_v4i32_nxv4i32(<vscale x 4 x i32> %vec, <4 x i32> %subvec) nounwind {			define <vscale x 4 x i32> @insert_v4i32_nxv4i32(<vscale x 4 x i32> %vec, <4 x i32> %subvec) nounwind {
	; CHECK-LABEL: insert_v4i32_nxv4i32:			; CHECK-LABEL: insert_v4i32_nxv4i32:
	; CHECK: // %bb.0:			; CHECK: // %bb.0:
	; CHECK-NEXT: str x29, [sp, #-16]! // 8-byte Folded Spill			; CHECK-NEXT: str x29, [sp, #-16]! // 8-byte Folded Spill
	; CHECK-NEXT: addvl sp, sp, #-1			; CHECK-NEXT: addvl sp, sp, #-1
	; CHECK-NEXT: ptrue p0.s			; CHECK-NEXT: ptrue p0.s
	; CHECK-NEXT: st1w { z0.s }, p0, [sp]			; CHECK-NEXT: st1w { z0.s }, p0, [sp]
	; CHECK-NEXT: str q1, [sp]			; CHECK-NEXT: str q1, [sp]
	; CHECK-NEXT: ld1w { z0.s }, p0/z, [sp]			; CHECK-NEXT: ld1w { z0.s }, p0/z, [sp]
	; CHECK-NEXT: addvl sp, sp, #1			; CHECK-NEXT: addvl sp, sp, #1
	; CHECK-NEXT: ldr x29, [sp], #16 // 8-byte Folded Reload			; CHECK-NEXT: ldr x29, [sp], #16 // 8-byte Folded Reload
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	%retval = call <vscale x 4 x i32> @llvm.experimental.vector.insert.nxv4i32.v4i32(<vscale x 4 x i32> %vec, <4 x i32> %subvec, i64 0)			%retval = call <vscale x 4 x i32> @llvm.experimental.vector.insert.nxv4i32.v4i32(<vscale x 4 x i32> %vec, <4 x i32> %subvec, i64 0)
	ret <vscale x 4 x i32> %retval			ret <vscale x 4 x i32> %retval
	}			}

	define <vscale x 4 x i32> @insert_v4i32_nxv4i32_idx1(<vscale x 4 x i32> %vec, <4 x i32> %subvec) nounwind {			define <vscale x 4 x i32> @insert_v4i32_nxv4i32_idx4(<vscale x 4 x i32> %vec, <4 x i32> %subvec) nounwind {
	; CHECK-LABEL: insert_v4i32_nxv4i32_idx1:			; CHECK-LABEL: insert_v4i32_nxv4i32_idx4:
	; CHECK: // %bb.0:			; CHECK: // %bb.0:
	; CHECK-NEXT: str x29, [sp, #-16]! // 8-byte Folded Spill			; CHECK-NEXT: str x29, [sp, #-16]! // 8-byte Folded Spill
	; CHECK-NEXT: addvl sp, sp, #-1			; CHECK-NEXT: addvl sp, sp, #-1
				; CHECK-NEXT: cntw x9
				; CHECK-NEXT: sub x9, x9, #1 // =1
				; CHECK-NEXT: mov w8, #4
				; CHECK-NEXT: cmp x9, #4 // =4
				; CHECK-NEXT: csel x8, x9, x8, lo
	; CHECK-NEXT: ptrue p0.s			; CHECK-NEXT: ptrue p0.s
				; CHECK-NEXT: lsl x8, x8, #2
				; CHECK-NEXT: mov x9, sp
	; CHECK-NEXT: st1w { z0.s }, p0, [sp]			; CHECK-NEXT: st1w { z0.s }, p0, [sp]
	; CHECK-NEXT: stur q1, [sp, #4]			; CHECK-NEXT: str q1, [x9, x8]
	; CHECK-NEXT: ld1w { z0.s }, p0/z, [sp]			; CHECK-NEXT: ld1w { z0.s }, p0/z, [sp]
	; CHECK-NEXT: addvl sp, sp, #1			; CHECK-NEXT: addvl sp, sp, #1
	; CHECK-NEXT: ldr x29, [sp], #16 // 8-byte Folded Reload			; CHECK-NEXT: ldr x29, [sp], #16 // 8-byte Folded Reload
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	%retval = call <vscale x 4 x i32> @llvm.experimental.vector.insert.nxv4i32.v4i32(<vscale x 4 x i32> %vec, <4 x i32> %subvec, i64 1)			%retval = call <vscale x 4 x i32> @llvm.experimental.vector.insert.nxv4i32.v4i32(<vscale x 4 x i32> %vec, <4 x i32> %subvec, i64 4)
	ret <vscale x 4 x i32> %retval			ret <vscale x 4 x i32> %retval
	}			}

	define <vscale x 8 x i16> @insert_v8i16_nxv8i16(<vscale x 8 x i16> %vec, <8 x i16> %subvec) nounwind {			define <vscale x 8 x i16> @insert_v8i16_nxv8i16(<vscale x 8 x i16> %vec, <8 x i16> %subvec) nounwind {
	; CHECK-LABEL: insert_v8i16_nxv8i16:			; CHECK-LABEL: insert_v8i16_nxv8i16:
	; CHECK: // %bb.0:			; CHECK: // %bb.0:
	; CHECK-NEXT: str x29, [sp, #-16]! // 8-byte Folded Spill			; CHECK-NEXT: str x29, [sp, #-16]! // 8-byte Folded Spill
	; CHECK-NEXT: addvl sp, sp, #-1			; CHECK-NEXT: addvl sp, sp, #-1
	; CHECK-NEXT: ptrue p0.h			; CHECK-NEXT: ptrue p0.h
	; CHECK-NEXT: st1h { z0.h }, p0, [sp]			; CHECK-NEXT: st1h { z0.h }, p0, [sp]
	; CHECK-NEXT: str q1, [sp]			; CHECK-NEXT: str q1, [sp]
	; CHECK-NEXT: ld1h { z0.h }, p0/z, [sp]			; CHECK-NEXT: ld1h { z0.h }, p0/z, [sp]
	; CHECK-NEXT: addvl sp, sp, #1			; CHECK-NEXT: addvl sp, sp, #1
	; CHECK-NEXT: ldr x29, [sp], #16 // 8-byte Folded Reload			; CHECK-NEXT: ldr x29, [sp], #16 // 8-byte Folded Reload
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	%retval = call <vscale x 8 x i16> @llvm.experimental.vector.insert.nxv8i16.v8i16(<vscale x 8 x i16> %vec, <8 x i16> %subvec, i64 0)			%retval = call <vscale x 8 x i16> @llvm.experimental.vector.insert.nxv8i16.v8i16(<vscale x 8 x i16> %vec, <8 x i16> %subvec, i64 0)
	ret <vscale x 8 x i16> %retval			ret <vscale x 8 x i16> %retval
	}			}

	define <vscale x 8 x i16> @insert_v8i16_nxv8i16_idx1(<vscale x 8 x i16> %vec, <8 x i16> %subvec) nounwind {			define <vscale x 8 x i16> @insert_v8i16_nxv8i16_idx8(<vscale x 8 x i16> %vec, <8 x i16> %subvec) nounwind {
	; CHECK-LABEL: insert_v8i16_nxv8i16_idx1:			; CHECK-LABEL: insert_v8i16_nxv8i16_idx8:
	; CHECK: // %bb.0:			; CHECK: // %bb.0:
	; CHECK-NEXT: str x29, [sp, #-16]! // 8-byte Folded Spill			; CHECK-NEXT: str x29, [sp, #-16]! // 8-byte Folded Spill
	; CHECK-NEXT: addvl sp, sp, #-1			; CHECK-NEXT: addvl sp, sp, #-1
				; CHECK-NEXT: cnth x9
				; CHECK-NEXT: sub x9, x9, #1 // =1
				; CHECK-NEXT: mov w8, #8
				; CHECK-NEXT: cmp x9, #8 // =8
				; CHECK-NEXT: csel x8, x9, x8, lo
	; CHECK-NEXT: ptrue p0.h			; CHECK-NEXT: ptrue p0.h
				; CHECK-NEXT: lsl x8, x8, #1
				; CHECK-NEXT: mov x9, sp
	; CHECK-NEXT: st1h { z0.h }, p0, [sp]			; CHECK-NEXT: st1h { z0.h }, p0, [sp]
	; CHECK-NEXT: stur q1, [sp, #2]			; CHECK-NEXT: str q1, [x9, x8]
				david-arm: This is just FYI and not caused by your patch, but suppose vscale = 1 and we clamp the index of 8 to 7. The `str q1, [x9, x8]` instruction actually ends up storing out 128 bits starting from index 7, which is beyond the end of temporary stack space. It looks like we could be corrupting the stack!
				joechrisellis: Hi Dave, thanks for pointing this out -- this is a great spot. Had a chat with @paulwalker-arm about this and will be looking into what the possibilities are here. A key takeaway from our conversation was that if vscale = 1 for this particular insertion then we're not strictly performing an 'insert subvector', so in some sense we could consider this abuse of the intrinsic. Most of our use-cases for the intrinsic are with an idx of 0 as well, so I don't think this will cause problems if the compilation pipeline is run end-to-end on a C program or something. Of course, we can't really defend against something like this statically, so the possibilities are limited. We discussed:
				- more stringent checks at runtime -- but this could easily result in poor code quality. Perhaps there is a different/better way of checking the bounds/clamping.
				- allocating more stack space than strictly necessary to give us some breathing room (? -- might have misunderstood this).
				(@paulwalker-arm, if I have misunderstood anything let me know 😄)
				paulwalker-arm: I'll say it's not about "breathing room" but more guaranteeing there's enough stack space, so something like `max(sizeof(ResultVT), sizeof(SubVT) + clamped(idx) * sizeof(Elt))`, noting that this is only a problem when inserting a fixed-length vector into a scalable vector.
	; CHECK-NEXT: ld1h { z0.h }, p0/z, [sp]			; CHECK-NEXT: ld1h { z0.h }, p0/z, [sp]
	; CHECK-NEXT: addvl sp, sp, #1			; CHECK-NEXT: addvl sp, sp, #1
	; CHECK-NEXT: ldr x29, [sp], #16 // 8-byte Folded Reload			; CHECK-NEXT: ldr x29, [sp], #16 // 8-byte Folded Reload
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	%retval = call <vscale x 8 x i16> @llvm.experimental.vector.insert.nxv8i16.v8i16(<vscale x 8 x i16> %vec, <8 x i16> %subvec, i64 1)			%retval = call <vscale x 8 x i16> @llvm.experimental.vector.insert.nxv8i16.v8i16(<vscale x 8 x i16> %vec, <8 x i16> %subvec, i64 8)
	ret <vscale x 8 x i16> %retval			ret <vscale x 8 x i16> %retval
	}			}
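
The arithmetic behind the inline discussion on `insert_v8i16_nxv8i16_idx8` can be sketched as follows. This is a rough model only: the struct and function names are hypothetical, and the clamp is inferred from the `cnth` / `sub` / `cmp` / `csel` sequence in the CHECK lines rather than taken from the lowering code. It assumes the stack slot reserved by `addvl sp, sp, #-1` is exactly one scalable vector wide.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical model of the stack-slot insert lowering seen in the CHECK
// lines. None of these names are LLVM APIs.
struct InsertLayout {
  uint64_t vscale;    // runtime vscale (1 means 128-bit SVE registers)
  uint64_t minElts;   // known minimum element count, e.g. 8 for nxv8i16
  uint64_t subElts;   // element count of the fixed-length subvector
  uint64_t eltBytes;  // bytes per element, e.g. 2 for i16
};

// Bytes reserved by `addvl sp, sp, #-1`: one full scalable vector.
uint64_t slotBytes(const InsertLayout &l) {
  return l.vscale * l.minElts * l.eltBytes;
}

// Mirrors `cnth; sub #1; cmp; csel ..., lo`: min(idx, numElts - 1).
uint64_t clampedIndex(const InsertLayout &l, uint64_t idx) {
  assert(l.vscale > 0 && l.minElts > 0 && "vector must be non-empty");
  return std::min(idx, l.vscale * l.minElts - 1);
}

// One past the last byte written by the `str q1, [x9, x8]` store.
uint64_t storeEndByte(const InsertLayout &l, uint64_t idx) {
  return (clampedIndex(l, idx) + l.subElts) * l.eltBytes;
}

// True when the subvector store writes past the reserved slot.
bool storeOverrunsSlot(const InsertLayout &l, uint64_t idx) {
  return storeEndByte(l, idx) > slotBytes(l);
}

// Slot size along the lines paulwalker-arm suggests:
// max(sizeof(ResultVT), sizeof(SubVT) + clamped(idx) * sizeof(Elt)).
uint64_t requiredSlotBytes(const InsertLayout &l, uint64_t idx) {
  return std::max(slotBytes(l),
                  l.subElts * l.eltBytes + clampedIndex(l, idx) * l.eltBytes);
}
```

Under these assumptions, with vscale = 1 the index 8 clamps to 7, so the 16-byte store starts at byte 14 and ends at byte 30 of a 16-byte slot. With vscale = 2 the same store ends exactly at the 32-byte slot boundary.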

	define <vscale x 16 x i8> @insert_v16i8_nxv16i8(<vscale x 16 x i8> %vec, <16 x i8> %subvec) nounwind {			define <vscale x 16 x i8> @insert_v16i8_nxv16i8(<vscale x 16 x i8> %vec, <16 x i8> %subvec) nounwind {
	; CHECK-LABEL: insert_v16i8_nxv16i8:			; CHECK-LABEL: insert_v16i8_nxv16i8:
	; CHECK: // %bb.0:			; CHECK: // %bb.0:
	; CHECK-NEXT: str x29, [sp, #-16]! // 8-byte Folded Spill			; CHECK-NEXT: str x29, [sp, #-16]! // 8-byte Folded Spill
	; CHECK-NEXT: addvl sp, sp, #-1			; CHECK-NEXT: addvl sp, sp, #-1
	; CHECK-NEXT: ptrue p0.b			; CHECK-NEXT: ptrue p0.b
	; CHECK-NEXT: st1b { z0.b }, p0, [sp]			; CHECK-NEXT: st1b { z0.b }, p0, [sp]
	; CHECK-NEXT: str q1, [sp]			; CHECK-NEXT: str q1, [sp]
	; CHECK-NEXT: ld1b { z0.b }, p0/z, [sp]			; CHECK-NEXT: ld1b { z0.b }, p0/z, [sp]
	; CHECK-NEXT: addvl sp, sp, #1			; CHECK-NEXT: addvl sp, sp, #1
	; CHECK-NEXT: ldr x29, [sp], #16 // 8-byte Folded Reload			; CHECK-NEXT: ldr x29, [sp], #16 // 8-byte Folded Reload
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	%retval = call <vscale x 16 x i8> @llvm.experimental.vector.insert.nxv16i8.v16i8(<vscale x 16 x i8> %vec, <16 x i8> %subvec, i64 0)			%retval = call <vscale x 16 x i8> @llvm.experimental.vector.insert.nxv16i8.v16i8(<vscale x 16 x i8> %vec, <16 x i8> %subvec, i64 0)
	ret <vscale x 16 x i8> %retval			ret <vscale x 16 x i8> %retval
	}			}

	define <vscale x 16 x i8> @insert_v16i8_nxv16i8_idx1(<vscale x 16 x i8> %vec, <16 x i8> %subvec) nounwind {			define <vscale x 16 x i8> @insert_v16i8_nxv16i8_idx16(<vscale x 16 x i8> %vec, <16 x i8> %subvec) nounwind {
	; CHECK-LABEL: insert_v16i8_nxv16i8_idx1:			; CHECK-LABEL: insert_v16i8_nxv16i8_idx16:
	; CHECK: // %bb.0:			; CHECK: // %bb.0:
	; CHECK-NEXT: str x29, [sp, #-16]! // 8-byte Folded Spill			; CHECK-NEXT: str x29, [sp, #-16]! // 8-byte Folded Spill
	; CHECK-NEXT: addvl sp, sp, #-1			; CHECK-NEXT: addvl sp, sp, #-1
				; CHECK-NEXT: rdvl x9, #1
				; CHECK-NEXT: sub x9, x9, #1 // =1
				; CHECK-NEXT: mov w8, #16
				; CHECK-NEXT: cmp x9, #16 // =16
	; CHECK-NEXT: ptrue p0.b			; CHECK-NEXT: ptrue p0.b
				; CHECK-NEXT: csel x8, x9, x8, lo
				; CHECK-NEXT: mov x9, sp
	; CHECK-NEXT: st1b { z0.b }, p0, [sp]			; CHECK-NEXT: st1b { z0.b }, p0, [sp]
	; CHECK-NEXT: stur q1, [sp, #1]			; CHECK-NEXT: str q1, [x9, x8]
	; CHECK-NEXT: ld1b { z0.b }, p0/z, [sp]			; CHECK-NEXT: ld1b { z0.b }, p0/z, [sp]
	; CHECK-NEXT: addvl sp, sp, #1			; CHECK-NEXT: addvl sp, sp, #1
	; CHECK-NEXT: ldr x29, [sp], #16 // 8-byte Folded Reload			; CHECK-NEXT: ldr x29, [sp], #16 // 8-byte Folded Reload
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	%retval = call <vscale x 16 x i8> @llvm.experimental.vector.insert.nxv16i8.v16i8(<vscale x 16 x i8> %vec, <16 x i8> %subvec, i64 1)			%retval = call <vscale x 16 x i8> @llvm.experimental.vector.insert.nxv16i8.v16i8(<vscale x 16 x i8> %vec, <16 x i8> %subvec, i64 16)
	ret <vscale x 16 x i8> %retval			ret <vscale x 16 x i8> %retval
	}			}


	; Insert subvectors into illegal vectors			; Insert subvectors into illegal vectors

	define void @insert_nxv8i64_nxv16i64(<vscale x 8 x i64> %sv0, <vscale x 8 x i64> %sv1, <vscale x 16 x i64>* %out) {			define void @insert_nxv8i64_nxv16i64(<vscale x 8 x i64> %sv0, <vscale x 8 x i64> %sv1, <vscale x 16 x i64>* %out) {
	; CHECK-LABEL: insert_nxv8i64_nxv16i64:			; CHECK-LABEL: insert_nxv8i64_nxv16i64:
	(119 lines not shown)

llvm/test/CodeGen/RISCV/rvv/fixed-vectors-extract-subvector.ll

	(291 lines not shown)
	; CHECK-NEXT: vsetivli a1, 8, e8,mf2,ta,mu			; CHECK-NEXT: vsetivli a1, 8, e8,mf2,ta,mu
	; CHECK-NEXT: vse1.v v0, (a0)			; CHECK-NEXT: vse1.v v0, (a0)
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	%c = call <8 x i1> @llvm.experimental.vector.extract.v8i1.nxv2i1(<vscale x 2 x i1> %x, i64 0)			%c = call <8 x i1> @llvm.experimental.vector.extract.v8i1.nxv2i1(<vscale x 2 x i1> %x, i64 0)
	store <8 x i1> %c, <8 x i1>* %y			store <8 x i1> %c, <8 x i1>* %y
	ret void			ret void
	}			}

	define void @extract_v8i1_nxv2i1_2(<vscale x 2 x i1> %x, <8 x i1>* %y) {
	; CHECK-LABEL: extract_v8i1_nxv2i1_2:
	; CHECK: # %bb.0:
	; CHECK-NEXT: vsetvli a1, zero, e8,mf4,ta,mu
	; CHECK-NEXT: vmv.v.i v25, 0
	; CHECK-NEXT: vmerge.vim v25, v25, 1, v0
	; CHECK-NEXT: vsetivli a1, 8, e8,mf4,ta,mu
	; CHECK-NEXT: vslidedown.vi v25, v25, 2
	; CHECK-NEXT: vsetivli a1, 8, e8,mf2,ta,mu
	; CHECK-NEXT: vmsne.vi v26, v25, 0
	; CHECK-NEXT: vse1.v v26, (a0)
	; CHECK-NEXT: ret
	%c = call <8 x i1> @llvm.experimental.vector.extract.v8i1.nxv2i1(<vscale x 2 x i1> %x, i64 2)
	store <8 x i1> %c, <8 x i1>* %y
	ret void
	}

	define void @extract_v8i1_nxv64i1_0(<vscale x 64 x i1> %x, <8 x i1>* %y) {			define void @extract_v8i1_nxv64i1_0(<vscale x 64 x i1> %x, <8 x i1>* %y) {
	; CHECK-LABEL: extract_v8i1_nxv64i1_0:			; CHECK-LABEL: extract_v8i1_nxv64i1_0:
	; CHECK: # %bb.0:			; CHECK: # %bb.0:
	; CHECK-NEXT: vsetivli a1, 8, e8,mf2,ta,mu			; CHECK-NEXT: vsetivli a1, 8, e8,mf2,ta,mu
	; CHECK-NEXT: vse1.v v0, (a0)			; CHECK-NEXT: vse1.v v0, (a0)
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	%c = call <8 x i1> @llvm.experimental.vector.extract.v8i1.nxv64i1(<vscale x 64 x i1> %x, i64 0)			%c = call <8 x i1> @llvm.experimental.vector.extract.v8i1.nxv64i1(<vscale x 64 x i1> %x, i64 0)
	store <8 x i1> %c, <8 x i1>* %y			store <8 x i1> %c, <8 x i1>* %y
	(341 lines not shown)

llvm/test/CodeGen/RISCV/rvv/fixed-vectors-insert-subvector.ll

	(62 lines not shown)
	; LMULMAX1-NEXT: vsetivli a0, 8, e32,m4,tu,mu			; LMULMAX1-NEXT: vsetivli a0, 8, e32,m4,tu,mu
	; LMULMAX1-NEXT: vslideup.vi v8, v12, 4			; LMULMAX1-NEXT: vslideup.vi v8, v12, 4
	; LMULMAX1-NEXT: ret			; LMULMAX1-NEXT: ret
	%sv = load <8 x i32>, <8 x i32>* %svp			%sv = load <8 x i32>, <8 x i32>* %svp
	%v = call <vscale x 8 x i32> @llvm.experimental.vector.insert.v8i32.nxv8i32(<vscale x 8 x i32> %vec, <8 x i32> %sv, i64 0)			%v = call <vscale x 8 x i32> @llvm.experimental.vector.insert.v8i32.nxv8i32(<vscale x 8 x i32> %vec, <8 x i32> %sv, i64 0)
	ret <vscale x 8 x i32> %v			ret <vscale x 8 x i32> %v
	}			}

	define <vscale x 8 x i32> @insert_nxv8i32_v8i32_4(<vscale x 8 x i32> %vec, <8 x i32>* %svp) {
	; LMULMAX2-LABEL: insert_nxv8i32_v8i32_4:
	; LMULMAX2: # %bb.0:
	; LMULMAX2-NEXT: vsetivli a1, 8, e32,m2,ta,mu
	; LMULMAX2-NEXT: vle32.v v28, (a0)
	; LMULMAX2-NEXT: vsetivli a0, 12, e32,m4,tu,mu
	; LMULMAX2-NEXT: vslideup.vi v8, v28, 4
	; LMULMAX2-NEXT: ret
	;
	; LMULMAX1-LABEL: insert_nxv8i32_v8i32_4:
	; LMULMAX1: # %bb.0:
	; LMULMAX1-NEXT: vsetivli a1, 4, e32,m1,ta,mu
	; LMULMAX1-NEXT: vle32.v v28, (a0)
	; LMULMAX1-NEXT: addi a0, a0, 16
	; LMULMAX1-NEXT: vle32.v v12, (a0)
	; LMULMAX1-NEXT: vsetivli a0, 8, e32,m4,tu,mu
	; LMULMAX1-NEXT: vslideup.vi v8, v28, 4
	; LMULMAX1-NEXT: vsetivli a0, 12, e32,m4,tu,mu
	; LMULMAX1-NEXT: vslideup.vi v8, v12, 8
	; LMULMAX1-NEXT: ret
	%sv = load <8 x i32>, <8 x i32>* %svp
	%v = call <vscale x 8 x i32> @llvm.experimental.vector.insert.v8i32.nxv8i32(<vscale x 8 x i32> %vec, <8 x i32> %sv, i64 4)
	david-arm: Is it worth keeping this test, but changing the index of 4 to 8?
	frasercrmck: Don't think so because the test below uses an index of 8. Axe it, I say. And thanks for catching this!
	ret <vscale x 8 x i32> %v
	}

	define <vscale x 8 x i32> @insert_nxv8i32_v8i32_8(<vscale x 8 x i32> %vec, <8 x i32>* %svp) {			define <vscale x 8 x i32> @insert_nxv8i32_v8i32_8(<vscale x 8 x i32> %vec, <8 x i32>* %svp) {
	; LMULMAX2-LABEL: insert_nxv8i32_v8i32_8:			; LMULMAX2-LABEL: insert_nxv8i32_v8i32_8:
	; LMULMAX2: # %bb.0:			; LMULMAX2: # %bb.0:
	; LMULMAX2-NEXT: vsetivli a1, 8, e32,m2,ta,mu			; LMULMAX2-NEXT: vsetivli a1, 8, e32,m2,ta,mu
	; LMULMAX2-NEXT: vle32.v v28, (a0)			; LMULMAX2-NEXT: vle32.v v28, (a0)
	; LMULMAX2-NEXT: vsetivli a0, 16, e32,m4,tu,mu			; LMULMAX2-NEXT: vsetivli a0, 16, e32,m4,tu,mu
	; LMULMAX2-NEXT: vslideup.vi v8, v28, 8			; LMULMAX2-NEXT: vslideup.vi v8, v28, 8
	; LMULMAX2-NEXT: ret			; LMULMAX2-NEXT: ret
	(400 lines not shown)
	; CHECK-NEXT: vsetvli a0, zero, e8,mf4,ta,mu			; CHECK-NEXT: vsetvli a0, zero, e8,mf4,ta,mu
	; CHECK-NEXT: vmsne.vi v0, v25, 0			; CHECK-NEXT: vmsne.vi v0, v25, 0
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	%sv = load <4 x i1>, <4 x i1>* %svp			%sv = load <4 x i1>, <4 x i1>* %svp
	%c = call <vscale x 2 x i1> @llvm.experimental.vector.insert.v4i1.nxv2i1(<vscale x 2 x i1> %v, <4 x i1> %sv, i64 0)			%c = call <vscale x 2 x i1> @llvm.experimental.vector.insert.v4i1.nxv2i1(<vscale x 2 x i1> %v, <4 x i1> %sv, i64 0)
	ret <vscale x 2 x i1> %c			ret <vscale x 2 x i1> %c
	}			}

	define <vscale x 2 x i1> @insert_nxv2i1_v4i1_6(<vscale x 2 x i1> %v, <4 x i1>* %svp) {
	; CHECK-LABEL: insert_nxv2i1_v4i1_6:
	; CHECK: # %bb.0:
	; CHECK-NEXT: vsetivli a1, 4, e8,mf4,ta,mu
	; CHECK-NEXT: vle1.v v27, (a0)
	; CHECK-NEXT: vsetvli a0, zero, e8,mf4,ta,mu
	; CHECK-NEXT: vmv.v.i v25, 0
	; CHECK-NEXT: vmerge.vim v25, v25, 1, v0
	; CHECK-NEXT: vsetivli a0, 4, e8,mf4,ta,mu
	; CHECK-NEXT: vmv.v.i v26, 0
	; CHECK-NEXT: vmv1r.v v0, v27
	; CHECK-NEXT: vmerge.vim v26, v26, 1, v0
	; CHECK-NEXT: vsetivli a0, 10, e8,mf4,tu,mu
	; CHECK-NEXT: vslideup.vi v25, v26, 6
	; CHECK-NEXT: vsetvli a0, zero, e8,mf4,ta,mu
	; CHECK-NEXT: vmsne.vi v0, v25, 0
	; CHECK-NEXT: ret
	%sv = load <4 x i1>, <4 x i1>* %svp
	%c = call <vscale x 2 x i1> @llvm.experimental.vector.insert.v4i1.nxv2i1(<vscale x 2 x i1> %v, <4 x i1> %sv, i64 6)
	ret <vscale x 2 x i1> %c
	}

	define <vscale x 8 x i1> @insert_nxv8i1_v4i1_0(<vscale x 8 x i1> %v, <8 x i1>* %svp) {			define <vscale x 8 x i1> @insert_nxv8i1_v4i1_0(<vscale x 8 x i1> %v, <8 x i1>* %svp) {
	; CHECK-LABEL: insert_nxv8i1_v4i1_0:			; CHECK-LABEL: insert_nxv8i1_v4i1_0:
	; CHECK: # %bb.0:			; CHECK: # %bb.0:
	; CHECK-NEXT: vsetivli a1, 8, e8,mf2,ta,mu			; CHECK-NEXT: vsetivli a1, 8, e8,mf2,ta,mu
	; CHECK-NEXT: vle1.v v25, (a0)			; CHECK-NEXT: vle1.v v25, (a0)
	; CHECK-NEXT: vsetivli a0, 1, e8,mf8,tu,mu			; CHECK-NEXT: vsetivli a0, 1, e8,mf8,tu,mu
	; CHECK-NEXT: vslideup.vi v0, v25, 0			; CHECK-NEXT: vslideup.vi v0, v25, 0
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	(117 lines not shown)

llvm/test/Transforms/InstCombine/canonicalize-vector-extract.ll

	(95 lines not shown)
	; CHECK-NEXT: [[TMP1:%.*]] = shufflevector <8 x i32> [[VEC:%.*]], <8 x i32> undef, <3 x i32> <i32 3, i32 4, i32 5>			; CHECK-NEXT: [[TMP1:%.*]] = shufflevector <8 x i32> [[VEC:%.*]], <8 x i32> undef, <3 x i32> <i32 3, i32 4, i32 5>
	; CHECK-NEXT: ret <3 x i32> [[TMP1]]			; CHECK-NEXT: ret <3 x i32> [[TMP1]]
	;			;
	%1 = call <3 x i32> @llvm.experimental.vector.extract.v3i32.v8i32(<8 x i32> %vec, i64 3)			%1 = call <3 x i32> @llvm.experimental.vector.extract.v3i32.v8i32(<8 x i32> %vec, i64 3)
	ret <3 x i32> %1			ret <3 x i32> %1
	}			}
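
The shuffle mask in the CHECK line above (`<i32 3, i32 4, i32 5>` for a `<3 x i32>` extracted at index 3) is just a run of consecutive source lanes. A minimal sketch of that mask computation, using a hypothetical helper name and plain integers rather than LLVM types, and not taken from InstCombineCalls.cpp:

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical helper: lanes selected when a fixed-length extract of
// `resElts` elements at `idx` is rewritten as a single shufflevector.
std::vector<int> extractShuffleMask(int srcElts, int resElts, int idx) {
  assert(idx >= 0 && resElts > 0 && idx + resElts <= srcElts &&
         "extraction must stay inside the source vector");
  std::vector<int> mask(resElts);
  std::iota(mask.begin(), mask.end(), idx);  // idx, idx + 1, ...
  return mask;
}
```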

	; ============================================================================ ;			; ============================================================================ ;
	; Invalid canonicalizations
	; ============================================================================ ;

	; Idx must be the be a constant multiple of the destination vector's length,
	; otherwise the result is undefined.
	define <4 x i32> @idx_not_constant_multiple(<8 x i32> %vec) {
	; CHECK-LABEL: @idx_not_constant_multiple(
	; CHECK-NEXT: ret <4 x i32> undef
	;
	%1 = call <4 x i32> @llvm.experimental.vector.extract.v4i32.v8i32(<8 x i32> %vec, i64 1)
	ret <4 x i32> %1
	}

	; If the extraction overruns the vector, the result is undefined.
	define <10 x i32> @extract_overrun(<8 x i32> %vec) {
	; CHECK-LABEL: @extract_overrun(
	; CHECK-NEXT: ret <10 x i32> undef
	;
	%1 = call <10 x i32> @llvm.experimental.vector.extract.v10i32.v8i32(<8 x i32> %vec, i64 0)
	ret <10 x i32> %1
	}

	; ============================================================================ ;
	; Scalable cases			; Scalable cases
	; ============================================================================ ;			; ============================================================================ ;

	; Scalable extractions should not be canonicalized. This will be lowered to the			; Scalable extractions should not be canonicalized. This will be lowered to the
	; EXTRACT_SUBVECTOR ISD node later.			; EXTRACT_SUBVECTOR ISD node later.
	define <4 x i32> @scalable_extract(<vscale x 4 x i32> %vec) {			define <4 x i32> @scalable_extract(<vscale x 4 x i32> %vec) {
	; CHECK-LABEL: @scalable_extract(			; CHECK-LABEL: @scalable_extract(
	; CHECK-NEXT: [[TMP1:%.*]] = call <4 x i32> @llvm.experimental.vector.extract.v4i32.nxv4i32(<vscale x 4 x i32> [[VEC:%.*]], i64 0)			; CHECK-NEXT: [[TMP1:%.*]] = call <4 x i32> @llvm.experimental.vector.extract.v4i32.nxv4i32(<vscale x 4 x i32> [[VEC:%.*]], i64 0)
	; CHECK-NEXT: ret <4 x i32> [[TMP1]]			; CHECK-NEXT: ret <4 x i32> [[TMP1]]
	;			;
	%1 = call <4 x i32> @llvm.experimental.vector.extract.v4i32.nxv4i32(<vscale x 4 x i32> %vec, i64 0)			%1 = call <4 x i32> @llvm.experimental.vector.extract.v4i32.nxv4i32(<vscale x 4 x i32> %vec, i64 0)
	ret <4 x i32> %1			ret <4 x i32> %1
	}			}

llvm/test/Transforms/InstCombine/canonicalize-vector-insert.ll

	(103 lines not shown)
	; CHECK-NEXT: [[TMP2:%.*]] = shufflevector <8 x i32> [[VEC:%.*]], <8 x i32> [[TMP1]], <8 x i32> <i32 0, i32 1, i32 2, i32 8, i32 9, i32 10, i32 6, i32 7>			; CHECK-NEXT: [[TMP2:%.*]] = shufflevector <8 x i32> [[VEC:%.*]], <8 x i32> [[TMP1]], <8 x i32> <i32 0, i32 1, i32 2, i32 8, i32 9, i32 10, i32 6, i32 7>
	; CHECK-NEXT: ret <8 x i32> [[TMP2]]			; CHECK-NEXT: ret <8 x i32> [[TMP2]]
	;			;
	%1 = call <8 x i32> @llvm.experimental.vector.insert.v8i32.v3i32(<8 x i32> %vec, <3 x i32> %subvec, i64 3)			%1 = call <8 x i32> @llvm.experimental.vector.insert.v8i32.v3i32(<8 x i32> %vec, <3 x i32> %subvec, i64 3)
	ret <8 x i32> %1			ret <8 x i32> %1
	}			}
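
The mask in the CHECK line above (`0, 1, 2, 8, 9, 10, 6, 7`) follows from treating the widened subvector as the second shuffle operand, so its lanes are numbered from the destination length upward. A minimal sketch of that mask computation, assuming the subvector has already been widened to the destination length; the helper name is hypothetical and this is not the code from InstCombineCalls.cpp:

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: mask for shufflevector(vec, widenedSubvec, mask) that
// overwrites lanes [idx, idx + subElts) of vec with the subvector's lanes.
std::vector<int> insertShuffleMask(int dstElts, int subElts, int idx) {
  assert(idx >= 0 && subElts > 0 && idx + subElts <= dstElts &&
         "insertion must stay inside the destination vector");
  std::vector<int> mask;
  mask.reserve(dstElts);
  for (int i = 0; i < dstElts; ++i) {
    bool inWindow = i >= idx && i < idx + subElts;
    // Second-operand lanes start at dstElts; first-operand lanes keep i.
    mask.push_back(inWindow ? dstElts + (i - idx) : i);
  }
  return mask;
}
```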

	; ============================================================================ ;			; ============================================================================ ;
	; Invalid canonicalizations
	; ============================================================================ ;

	; Idx must be the be a constant multiple of the subvector's minimum vector
	; length, otherwise the result is undefined.
	define <8 x i32> @idx_not_constant_multiple(<8 x i32> %vec, <4 x i32> %subvec) {
	; CHECK-LABEL: @idx_not_constant_multiple(
	; CHECK-NEXT: ret <8 x i32> undef
	;
	%1 = call <8 x i32> @llvm.experimental.vector.insert.v8i32.v4i32(<8 x i32> %vec, <4 x i32> %subvec, i64 2)
	ret <8 x i32> %1
	}

	; If the insertion overruns the vector, the result is undefined.
	define <8 x i32> @insert_overrun(<8 x i32> %vec, <8 x i32> %subvec) {
	; CHECK-LABEL: @insert_overrun(
	; CHECK-NEXT: ret <8 x i32> undef
	;
	%1 = call <8 x i32> @llvm.experimental.vector.insert.v8i32.v8i32(<8 x i32> %vec, <8 x i32> %subvec, i64 4)
	ret <8 x i32> %1
	}

	; ============================================================================ ;
	; Scalable cases			; Scalable cases
	; ============================================================================ ;			; ============================================================================ ;

	; Scalable insertions should not be canonicalized. This will be lowered to the			; Scalable insertions should not be canonicalized. This will be lowered to the
	; INSERT_SUBVECTOR ISD node later.			; INSERT_SUBVECTOR ISD node later.
	define <vscale x 4 x i32> @scalable_insert(<vscale x 4 x i32> %vec, <4 x i32> %subvec) {			define <vscale x 4 x i32> @scalable_insert(<vscale x 4 x i32> %vec, <4 x i32> %subvec) {
	; CHECK-LABEL: @scalable_insert(			; CHECK-LABEL: @scalable_insert(
	; CHECK-NEXT: [[TMP1:%.*]] = call <vscale x 4 x i32> @llvm.experimental.vector.insert.nxv4i32.v4i32(<vscale x 4 x i32> [[VEC:%.*]], <4 x i32> [[SUBVEC:%.*]], i64 0)			; CHECK-NEXT: [[TMP1:%.*]] = call <vscale x 4 x i32> @llvm.experimental.vector.insert.nxv4i32.v4i32(<vscale x 4 x i32> [[VEC:%.*]], <4 x i32> [[SUBVEC:%.*]], i64 0)
	; CHECK-NEXT: ret <vscale x 4 x i32> [[TMP1]]			; CHECK-NEXT: ret <vscale x 4 x i32> [[TMP1]]
	;			;
	%1 = call <vscale x 4 x i32> @llvm.experimental.vector.insert.nxv4i32.v4i32(<vscale x 4 x i32> %vec, <4 x i32> %subvec, i64 0)			%1 = call <vscale x 4 x i32> @llvm.experimental.vector.insert.nxv4i32.v4i32(<vscale x 4 x i32> %vec, <4 x i32> %subvec, i64 0)
	ret <vscale x 4 x i32> %1			ret <vscale x 4 x i32> %1
	}			}

llvm/test/Verifier/insert-extract-intrinsics-invalid.ll

This file was added.

				; RUN: not opt -verify -S < %s 2>&1 >/dev/null | FileCheck %s

				; CHECK: experimental_vector_extract index must be a constant multiple of the result type's known minimum vector length.
				define <4 x i32> @extract_idx_not_constant_multiple(<8 x i32> %vec) {
				%1 = call <4 x i32> @llvm.experimental.vector.extract.v4i32.v8i32(<8 x i32> %vec, i64 1)
				ret <4 x i32> %1
				}

				; CHECK: experimental_vector_insert index must be a constant multiple of the subvector's known minimum vector length.
				define <8 x i32> @insert_idx_not_constant_multiple(<8 x i32> %vec, <4 x i32> %subvec) {
				%1 = call <8 x i32> @llvm.experimental.vector.insert.v8i32.v4i32(<8 x i32> %vec, <4 x i32> %subvec, i64 2)
				ret <8 x i32> %1
				}

				declare <4 x i32> @llvm.experimental.vector.extract.v4i32.v8i32(<8 x i32>, i64)
				declare <8 x i32> @llvm.experimental.vector.insert.v8i32.v4i32(<8 x i32>, <4 x i32>, i64)
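
The condition these two tests exercise reduces to a divisibility check on the constant index. The sketch below is illustrative only: it uses a hypothetical helper over plain integers and is not the code from llvm/lib/IR/Verifier.cpp. `knownMinElts` stands for the known minimum element count of the result type (extract) or of the subvector (insert).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the verifier condition: idx must be a multiple
// of the known minimum vector length of the relevant type.
bool isIndexMultipleOfKnownMin(uint64_t idx, uint64_t knownMinElts) {
  assert(knownMinElts > 0 && "vector types have at least one element");
  return idx % knownMinElts == 0;
}
```

With this rule, the extract of `<4 x i32>` at index 1 and the insert of `<4 x i32>` at index 2 are both rejected, while the renamed AArch64 tests (index 2 for `<2 x i64>`, 4 for `<4 x i32>`, 8 for `<8 x i16>`, 16 for `<16 x i8>`) pass.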

Diff 348920