This is an archive of the discontinued LLVM Phabricator instance.

[SVE][Builtins] Add metadata to intrinsic calls for builtins that don't define the result of inactive lanes.
Abandoned · Public

Authored by paulwalker-arm on Jan 8 2023, 4:39 PM.

Details

Reviewers

efriedma
sdesmalen
david-arm
kmclaughlin

Summary

The ACLE for SVE defines a repeating set of builtins that allow the
result of inactive lanes to be zeroed (Z), copied from an input
operand (M) or have an undefined value (X). When lowering these
builtins we lose the semantics of the undefined variants because, to
keep the intrinsic count down, we chose to treat them as M forms.

This largely makes sense because in the majority of instances only
the M form is backed by a real instruction. It does mean we miss
out on some optimisation opportunities, so this patch attaches
metadata to the intrinsic calls that allows us to represent the cases
where an M form can be considered to be an X form. This metadata is
freely ignorable because copying the inactive lanes from an input
operand is a valid way to represent an undefined value, and
matches the behaviour before this patch.
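As an illustration of the three forms, here is a minimal scalar sketch. It is not part of the patch: the fixed lane count and the names `add_m`, `add_z`, `add_unpredicated` and `is_valid_x_result` are hypothetical stand-ins, and real SVE vectors are scalable. It shows why an M-form result is always an acceptable X-form result.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical fixed-width stand-in for a scalable vector; real SVE vectors
// have a hardware-dependent length.
constexpr std::size_t kLanes = 4;
using Vec = std::array<std::uint32_t, kLanes>;
using Pred = std::array<bool, kLanes>;

// M form: inactive lanes are copied from the first data operand.
Vec add_m(const Pred &pg, const Vec &op1, const Vec &op2) {
  Vec r{};
  for (std::size_t i = 0; i < kLanes; ++i)
    r[i] = pg[i] ? op1[i] + op2[i] : op1[i];
  return r;
}

// Z form: inactive lanes are zeroed.
Vec add_z(const Pred &pg, const Vec &op1, const Vec &op2) {
  Vec r{};
  for (std::size_t i = 0; i < kLanes; ++i)
    r[i] = pg[i] ? op1[i] + op2[i] : 0u;
  return r;
}

// Unpredicated add: every lane is computed, the predicate is not consulted.
Vec add_unpredicated(const Vec &op1, const Vec &op2) {
  Vec r{};
  for (std::size_t i = 0; i < kLanes; ++i)
    r[i] = op1[i] + op2[i];
  return r;
}

// An X-form result is acceptable when every active lane holds the sum;
// inactive lanes may hold anything at all.
bool is_valid_x_result(const Pred &pg, const Vec &op1, const Vec &op2,
                       const Vec &r) {
  for (std::size_t i = 0; i < kLanes; ++i)
    if (pg[i] && r[i] != op1[i] + op2[i])
      return false;
  return true;
}
```

In this model `add_m`, `add_z` and `add_unpredicated` all satisfy `is_valid_x_result`, which is the sense in which treating X as M loses information without being wrong.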

To demonstrate the metadata's usage this patch includes a trivial
optimisation so that svadd_x emits the unpredicated variant of ADD
as expected.
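The shape of that optimisation can be sketched with a toy instruction model. This is not the InstCombine code from the patch: `ToyInst` and `try_unpredicate` are hypothetical, and only the metadata name `inactive_lanes_undefined` comes from the diff.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical, heavily simplified stand-in for an IR instruction.
struct ToyInst {
  std::string opcode;                // e.g. "sve.add" or "add"
  std::vector<std::string> operands; // predicate first for predicated ops
  std::set<std::string> metadata;    // metadata kinds attached to the call
};

// If a predicated add is tagged as having undefined inactive lanes, the
// predicate can be dropped. Otherwise the instruction is returned unchanged.
ToyInst try_unpredicate(const ToyInst &inst) {
  if (inst.opcode != "sve.add" || inst.operands.size() != 3)
    return inst;
  if (inst.metadata.count("inactive_lanes_undefined") == 0)
    return inst; // M form: inactive lanes must still equal operand 1.
  return ToyInst{"add", {inst.operands[1], inst.operands[2]}, {}};
}
```

A call without the metadata is left alone, which matches the claim that the metadata is freely ignorable: doing nothing is always a correct outcome.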

NOTE: I did investigate representing the undefined lanes using a select on the governing predicate, but this proved a poor design: optimisations became order sensitive, the extra IR made use-count protection harder to handle, and the select instruction itself has strict rules relating to poison that hampered the intent of this change.

NOTE: All the existing tests pass without regeneration, so to keep the reviewed patch small I only regenerated one of the tests to show the effect. If agreeable, I'll regenerate all the other tests just before landing the patch.

Depends on D141056

Diff Detail

Repository: rG LLVM Github Monorepo

Unit Tests: Failed

	Time	Test
	60,060 ms	x64 debian > libFuzzer.libFuzzer::fuzzer-leak.test
	60,050 ms	x64 debian > libFuzzer.libFuzzer::minimize_crash.test
	60,050 ms	x64 debian > libFuzzer.libFuzzer::out-of-process-fuzz.test
	60,070 ms	x64 debian > libFuzzer.libFuzzer::value-profile-load.test

Event Timeline

paulwalker-arm created this revision. Jan 8 2023, 4:39 PM

Herald added a reviewer: efriedma. Jan 8 2023, 4:39 PM

Herald added a project: Restricted Project.

Herald added subscribers: ctetreau, psnobl, hiraditya, tschuett.

paulwalker-arm requested review of this revision. Jan 8 2023, 4:39 PM

Herald added projects: Restricted Project, Restricted Project. Jan 8 2023, 4:39 PM

Herald added subscribers: llvm-commits, cfe-commits.

Fixed typo.

paulwalker-arm added reviewers: sdesmalen, david-arm, kmclaughlin. Jan 8 2023, 4:43 PM

Harbormaster completed remote builds in B206401: Diff 487247. Jan 8 2023, 5:32 PM

Using metadata seems sensible, but did you also identify any downsides? I could imagine that we'd need to manually propagate metadata to any nodes after we do a combine (which can't be blindly copied?), e.g. add + mul -> mla, this new intrinsic would also need the metadata.

For intrinsics that don't have a directly corresponding (unpredicated) LLVM IR instruction, is there still a way to use this information in SelectionDAG?

the select instruction itself has strict rules relating to poison that hampered the intent of this change

For my understanding, can you elaborate what these strict rules regarding poison are that hamper such a change, and what it was that you tried?

Matt added a subscriber: Matt. Jan 9 2023, 3:15 PM

In D141240#4035438, @sdesmalen wrote:

Using metadata seems sensible, but did you also identify any downsides? I could imagine that we'd need to manually propagate metadata to any nodes after we do a combine (which can't be blindly copied?), e.g. add + mul -> mla, this new intrinsic would also need the metadata.

I don't really see manual propagation as a downside because it's not functionally required, but rather advantageous to maximise optimisation opportunities. The downside is the opposite: any transformation that wants to rely on the inactive lanes being defined, as they were before this patch, will now need to check for the presence of (or rather lack of) the new metadata before blindly reusing the result of an existing SVE intrinsic call. The transformation can still reuse the call; it must just discard the metadata first.
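A minimal sketch of that check, using a toy call record rather than the real LLVM classes. `ToyCall`, `inactive_lanes_are_defined` and `reuse_with_merging_semantics` are hypothetical names; only the metadata kind comes from the patch.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical stand-in for an SVE intrinsic call and its attached metadata.
struct ToyCall {
  std::string intrinsic;
  std::set<std::string> metadata;
};

// True when the inactive lanes of the result may be assumed to equal the
// first data operand, i.e. the call still has plain M-form semantics.
bool inactive_lanes_are_defined(const ToyCall &call) {
  return call.metadata.count("inactive_lanes_undefined") == 0;
}

// Prepare an existing call for reuse by a transformation that depends on
// merging semantics. Dropping the metadata is always safe because the M-form
// behaviour is one valid choice for the undefined lanes.
void reuse_with_merging_semantics(ToyCall &call) {
  call.metadata.erase("inactive_lanes_undefined");
}
```

The cost is that every such transformation has to remember to do this, which is the downside described above.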

For intrinsics that don't have a directly corresponding (unpredicated) LLVM IR instruction, is there still a way to use this information in SelectionDAG?

Truth be told, I'm not entirely sure how this will play out. I'm not sure whether it's better to use the information within the IR, as I'm doing in this patch, or whether it should be used solely when lowering IR to DAG. So it's really an experiment to see what sticks while providing a route to fix some of the issues we've already observed with how we represent the X forms.

Predicated->unpredicated aside, another use for encoding undefinedness is that it helps with things like the FMAs, where we can use FMAD if that better suits register allocation, much like we do for stock IR.

the select instruction itself has strict rules relating to poison that hampered the intent of this change

For my understanding, can you elaborate what these strict rules regarding poison are that hamper such a change, and what it was that you tried?

The LangRef states the transformation "select P, A, undef ==> A" is only valid when you can prove the inactive lanes of "A" do not contain poison. I'm unsure if this is a true blocker or a mere inconvenience because, to maintain the maximum amount of information, we likely don't want to remove the selects anyway. I went down this path by creating an SVE undef intrinsic, which nothing knows about and thus will be left alone. The problem is that it massively polluted the IR and I was worried it would make it harder to spot/implement the typical combines. For sure the existing combines would need to be changed because they would not know to look through the new selects.
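The poison restriction can be illustrated with a simplified per-lane model. This is a sketch under stated assumptions: each lane is reduced to one of three kinds, a poison condition is not modelled, and `Lane`, `select_lane`, `refines` and `fold_is_legal` are hypothetical names, not LLVM APIs.

```cpp
#include <cassert>

// Simplified per-lane value kinds; real LLVM values carry more detail.
enum class Lane { Concrete, Undef, Poison };

// select P, A, B for a single lane (a poison condition is not modelled).
Lane select_lane(bool p, Lane a, Lane b) { return p ? a : b; }

// True when replacing `before` with `after` is a legal refinement in this
// model: poison may become anything, undef may become anything except
// poison, and a concrete value must stay concrete (assumed to be the same
// value).
bool refines(Lane before, Lane after) {
  switch (before) {
  case Lane::Poison:
    return true;
  case Lane::Undef:
    return after != Lane::Poison;
  case Lane::Concrete:
    return after == Lane::Concrete;
  }
  return false;
}

// Is folding `select P, A, undef` to `A` legal for this lane?
bool fold_is_legal(bool p, Lane a) {
  return refines(select_lane(p, a, Lane::Undef), a);
}
```

The only failing case is an inactive lane where A is poison: the select yields undef there, and replacing undef with poison makes the program less defined, so the fold needs a proof that A's inactive lanes are not poison.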

There is the option to change the clang builtin lowering to provide finer control over which builtins emit these selects, but that just means more changes (updates to existing instcombines) each time we decide a builtin is worth the extra select.

I'll keep experimenting but, as I mention in the in-code comment, the likely best solution is to have dedicated intrinsics, with this being the least intrusive hack.

Perhaps the key word there is "hack" :) I'll investigate the dedicated intrinsics route because perhaps we only require a handful to get the majority of the benefit.

paulwalker-arm planned changes to this revision. Jan 11 2023, 5:30 PM

Just a heads up that I'm likely to abandon this patch because, as predicted, implementing dedicated intrinsics is looking like the better design. Almost all the code generation plumbing is already present, so even the implementation is minimal.

paulwalker-arm mentioned this in D141937: [SVE] Add intrinsics for integer binops that explicitly undefine the result for inactive lanes. Jan 17 2023, 8:07 AM

D141939 turned out to be the better approach.

Revision Contents

Path	Size
clang/lib/CodeGen/CGBuiltin.cpp	15 lines
clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_add.c	88 lines
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp	10 lines
llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-unpredicate.ll	19 lines

Diff 487247

clang/lib/CodeGen/CGBuiltin.cpp

   else if (Builtin->LLVMIntrinsic != 0) {
     if (TypeFlags.getMergeType() == SVETypeFlags::MergeZero) {
       llvm::Type *OpndTy = Ops[1]->getType();
       auto *SplatZero = Constant::getNullValue(OpndTy);
       Ops[1] = Builder.CreateSelect(Ops[0], Ops[1], SplatZero);
     }

     Function *F = CGM.getIntrinsic(Builtin->LLVMIntrinsic,
                                    getSVEOverloadTypes(TypeFlags, Ty, Ops));
-    Value *Call = Builder.CreateCall(F, Ops);
+    CallInst *Call = Builder.CreateCall(F, Ops);
+
+    // These builtins don't have a defined result for inactive lanes.
+    // NOTE: The intention of these builtins is to allow the compiler to better
+    // utilise unpredicated SVE instructions. Arguably a better implementation
+    // is to have dedicated intrinsics for these builtins. However, there is a
+    // lot of them and most have no equivalent unpredicated variant so instead
+    // we treat them as SVETypeFlags::MergeOp1. Metadata is applied, which is
+    // freely ignorable, to help identify when the predicate can be dropped.
+    if (TypeFlags.getMergeType() == SVETypeFlags::MergeAny)
+      Call->setMetadata("inactive_lanes_undefined",
+                        MDNode::get(getLLVMContext(), {}));

     // Predicate results must be converted to svbool_t.
     if (auto PredTy = dyn_cast<llvm::VectorType>(Call->getType()))
       if (PredTy->getScalarType()->isIntegerTy(1))
-        Call = EmitSVEPredicateCast(Call, cast<llvm::ScalableVectorType>(Ty));
+        return EmitSVEPredicateCast(Call, cast<llvm::ScalableVectorType>(Ty));

     return Call;
   }

   switch (BuiltinID) {
   default:
     return nullptr;


clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_add.c

 //
 svuint64_t test_svadd_u64_m(svbool_t pg, svuint64_t op1, svuint64_t op2)
 {
   return SVE_ACLE_FUNC(svadd,_u64,_m,)(pg, op1, op2);
 }

 // CHECK-LABEL: @test_svadd_s8_x(
 // CHECK-NEXT: entry:
-// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 16 x i8> @llvm.aarch64.sve.add.nxv16i8(<vscale x 16 x i1> [[PG:%.*]], <vscale x 16 x i8> [[OP1:%.*]], <vscale x 16 x i8> [[OP2:%.*]])
+// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 16 x i8> @llvm.aarch64.sve.add.nxv16i8(<vscale x 16 x i1> [[PG:%.*]], <vscale x 16 x i8> [[OP1:%.*]], <vscale x 16 x i8> [[OP2:%.*]]), !inactive_lanes_undefined !2
 // CHECK-NEXT: ret <vscale x 16 x i8> [[TMP0]]
 //
 // CPP-CHECK-LABEL: @_Z15test_svadd_s8_xu10__SVBool_tu10__SVInt8_tu10__SVInt8_t(
 // CPP-CHECK-NEXT: entry:
-// CPP-CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 16 x i8> @llvm.aarch64.sve.add.nxv16i8(<vscale x 16 x i1> [[PG:%.*]], <vscale x 16 x i8> [[OP1:%.*]], <vscale x 16 x i8> [[OP2:%.*]])
+// CPP-CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 16 x i8> @llvm.aarch64.sve.add.nxv16i8(<vscale x 16 x i1> [[PG:%.*]], <vscale x 16 x i8> [[OP1:%.*]], <vscale x 16 x i8> [[OP2:%.*]]), !inactive_lanes_undefined !2
 // CPP-CHECK-NEXT: ret <vscale x 16 x i8> [[TMP0]]
 //
 svint8_t test_svadd_s8_x(svbool_t pg, svint8_t op1, svint8_t op2)
 {
   return SVE_ACLE_FUNC(svadd,_s8,_x,)(pg, op1, op2);
 }

 // CHECK-LABEL: @test_svadd_s16_x(
 // CHECK-NEXT: entry:
 // CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 8 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv8i1(<vscale x 16 x i1> [[PG:%.*]])
-// CHECK-NEXT: [[TMP1:%.*]] = tail call <vscale x 8 x i16> @llvm.aarch64.sve.add.nxv8i16(<vscale x 8 x i1> [[TMP0]], <vscale x 8 x i16> [[OP1:%.*]], <vscale x 8 x i16> [[OP2:%.*]])
+// CHECK-NEXT: [[TMP1:%.*]] = tail call <vscale x 8 x i16> @llvm.aarch64.sve.add.nxv8i16(<vscale x 8 x i1> [[TMP0]], <vscale x 8 x i16> [[OP1:%.*]], <vscale x 8 x i16> [[OP2:%.*]]), !inactive_lanes_undefined !2
 // CHECK-NEXT: ret <vscale x 8 x i16> [[TMP1]]
 //
 // CPP-CHECK-LABEL: @_Z16test_svadd_s16_xu10__SVBool_tu11__SVInt16_tu11__SVInt16_t(
 // CPP-CHECK-NEXT: entry:
 // CPP-CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 8 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv8i1(<vscale x 16 x i1> [[PG:%.*]])
-// CPP-CHECK-NEXT: [[TMP1:%.*]] = tail call <vscale x 8 x i16> @llvm.aarch64.sve.add.nxv8i16(<vscale x 8 x i1> [[TMP0]], <vscale x 8 x i16> [[OP1:%.*]], <vscale x 8 x i16> [[OP2:%.*]])
+// CPP-CHECK-NEXT: [[TMP1:%.*]] = tail call <vscale x 8 x i16> @llvm.aarch64.sve.add.nxv8i16(<vscale x 8 x i1> [[TMP0]], <vscale x 8 x i16> [[OP1:%.*]], <vscale x 8 x i16> [[OP2:%.*]]), !inactive_lanes_undefined !2
 // CPP-CHECK-NEXT: ret <vscale x 8 x i16> [[TMP1]]
 //
 svint16_t test_svadd_s16_x(svbool_t pg, svint16_t op1, svint16_t op2)
 {
   return SVE_ACLE_FUNC(svadd,_s16,_x,)(pg, op1, op2);
 }

 // CHECK-LABEL: @test_svadd_s32_x(
 // CHECK-NEXT: entry:
 // CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 4 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv4i1(<vscale x 16 x i1> [[PG:%.*]])
-// CHECK-NEXT: [[TMP1:%.*]] = tail call <vscale x 4 x i32> @llvm.aarch64.sve.add.nxv4i32(<vscale x 4 x i1> [[TMP0]], <vscale x 4 x i32> [[OP1:%.*]], <vscale x 4 x i32> [[OP2:%.*]])
+// CHECK-NEXT: [[TMP1:%.*]] = tail call <vscale x 4 x i32> @llvm.aarch64.sve.add.nxv4i32(<vscale x 4 x i1> [[TMP0]], <vscale x 4 x i32> [[OP1:%.*]], <vscale x 4 x i32> [[OP2:%.*]]), !inactive_lanes_undefined !2
 // CHECK-NEXT: ret <vscale x 4 x i32> [[TMP1]]
 //
 // CPP-CHECK-LABEL: @_Z16test_svadd_s32_xu10__SVBool_tu11__SVInt32_tu11__SVInt32_t(
 // CPP-CHECK-NEXT: entry:
 // CPP-CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 4 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv4i1(<vscale x 16 x i1> [[PG:%.*]])
-// CPP-CHECK-NEXT: [[TMP1:%.*]] = tail call <vscale x 4 x i32> @llvm.aarch64.sve.add.nxv4i32(<vscale x 4 x i1> [[TMP0]], <vscale x 4 x i32> [[OP1:%.*]], <vscale x 4 x i32> [[OP2:%.*]])
+// CPP-CHECK-NEXT: [[TMP1:%.*]] = tail call <vscale x 4 x i32> @llvm.aarch64.sve.add.nxv4i32(<vscale x 4 x i1> [[TMP0]], <vscale x 4 x i32> [[OP1:%.*]], <vscale x 4 x i32> [[OP2:%.*]]), !inactive_lanes_undefined !2
 // CPP-CHECK-NEXT: ret <vscale x 4 x i32> [[TMP1]]
 //
 svint32_t test_svadd_s32_x(svbool_t pg, svint32_t op1, svint32_t op2)
 {
   return SVE_ACLE_FUNC(svadd,_s32,_x,)(pg, op1, op2);
 }

 // CHECK-LABEL: @test_svadd_s64_x(
 // CHECK-NEXT: entry:
 // CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 2 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv2i1(<vscale x 16 x i1> [[PG:%.*]])
-// CHECK-NEXT: [[TMP1:%.*]] = tail call <vscale x 2 x i64> @llvm.aarch64.sve.add.nxv2i64(<vscale x 2 x i1> [[TMP0]], <vscale x 2 x i64> [[OP1:%.*]], <vscale x 2 x i64> [[OP2:%.*]])
+// CHECK-NEXT: [[TMP1:%.*]] = tail call <vscale x 2 x i64> @llvm.aarch64.sve.add.nxv2i64(<vscale x 2 x i1> [[TMP0]], <vscale x 2 x i64> [[OP1:%.*]], <vscale x 2 x i64> [[OP2:%.*]]), !inactive_lanes_undefined !2
 // CHECK-NEXT: ret <vscale x 2 x i64> [[TMP1]]
 //
 // CPP-CHECK-LABEL: @_Z16test_svadd_s64_xu10__SVBool_tu11__SVInt64_tu11__SVInt64_t(
 // CPP-CHECK-NEXT: entry:
 // CPP-CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 2 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv2i1(<vscale x 16 x i1> [[PG:%.*]])
-// CPP-CHECK-NEXT: [[TMP1:%.*]] = tail call <vscale x 2 x i64> @llvm.aarch64.sve.add.nxv2i64(<vscale x 2 x i1> [[TMP0]], <vscale x 2 x i64> [[OP1:%.*]], <vscale x 2 x i64> [[OP2:%.*]])
+// CPP-CHECK-NEXT: [[TMP1:%.*]] = tail call <vscale x 2 x i64> @llvm.aarch64.sve.add.nxv2i64(<vscale x 2 x i1> [[TMP0]], <vscale x 2 x i64> [[OP1:%.*]], <vscale x 2 x i64> [[OP2:%.*]]), !inactive_lanes_undefined !2
 // CPP-CHECK-NEXT: ret <vscale x 2 x i64> [[TMP1]]
 //
 svint64_t test_svadd_s64_x(svbool_t pg, svint64_t op1, svint64_t op2)
 {
   return SVE_ACLE_FUNC(svadd,_s64,_x,)(pg, op1, op2);
 }

 // CHECK-LABEL: @test_svadd_u8_x(
 // CHECK-NEXT: entry:
-// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 16 x i8> @llvm.aarch64.sve.add.nxv16i8(<vscale x 16 x i1> [[PG:%.*]], <vscale x 16 x i8> [[OP1:%.*]], <vscale x 16 x i8> [[OP2:%.*]])
+// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 16 x i8> @llvm.aarch64.sve.add.nxv16i8(<vscale x 16 x i1> [[PG:%.*]], <vscale x 16 x i8> [[OP1:%.*]], <vscale x 16 x i8> [[OP2:%.*]]), !inactive_lanes_undefined !2
 // CHECK-NEXT: ret <vscale x 16 x i8> [[TMP0]]
 //
 // CPP-CHECK-LABEL: @_Z15test_svadd_u8_xu10__SVBool_tu11__SVUint8_tu11__SVUint8_t(
 // CPP-CHECK-NEXT: entry:
-// CPP-CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 16 x i8> @llvm.aarch64.sve.add.nxv16i8(<vscale x 16 x i1> [[PG:%.*]], <vscale x 16 x i8> [[OP1:%.*]], <vscale x 16 x i8> [[OP2:%.*]])
+// CPP-CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 16 x i8> @llvm.aarch64.sve.add.nxv16i8(<vscale x 16 x i1> [[PG:%.*]], <vscale x 16 x i8> [[OP1:%.*]], <vscale x 16 x i8> [[OP2:%.*]]), !inactive_lanes_undefined !2
 // CPP-CHECK-NEXT: ret <vscale x 16 x i8> [[TMP0]]
 //
 svuint8_t test_svadd_u8_x(svbool_t pg, svuint8_t op1, svuint8_t op2)
 {
   return SVE_ACLE_FUNC(svadd,_u8,_x,)(pg, op1, op2);
 }

 // CHECK-LABEL: @test_svadd_u16_x(
 // CHECK-NEXT: entry:
 // CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 8 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv8i1(<vscale x 16 x i1> [[PG:%.*]])
-// CHECK-NEXT: [[TMP1:%.*]] = tail call <vscale x 8 x i16> @llvm.aarch64.sve.add.nxv8i16(<vscale x 8 x i1> [[TMP0]], <vscale x 8 x i16> [[OP1:%.*]], <vscale x 8 x i16> [[OP2:%.*]])
+// CHECK-NEXT: [[TMP1:%.*]] = tail call <vscale x 8 x i16> @llvm.aarch64.sve.add.nxv8i16(<vscale x 8 x i1> [[TMP0]], <vscale x 8 x i16> [[OP1:%.*]], <vscale x 8 x i16> [[OP2:%.*]]), !inactive_lanes_undefined !2
 // CHECK-NEXT: ret <vscale x 8 x i16> [[TMP1]]
 //
 // CPP-CHECK-LABEL: @_Z16test_svadd_u16_xu10__SVBool_tu12__SVUint16_tu12__SVUint16_t(
 // CPP-CHECK-NEXT: entry:
 // CPP-CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 8 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv8i1(<vscale x 16 x i1> [[PG:%.*]])
-// CPP-CHECK-NEXT: [[TMP1:%.*]] = tail call <vscale x 8 x i16> @llvm.aarch64.sve.add.nxv8i16(<vscale x 8 x i1> [[TMP0]], <vscale x 8 x i16> [[OP1:%.*]], <vscale x 8 x i16> [[OP2:%.*]])
+// CPP-CHECK-NEXT: [[TMP1:%.*]] = tail call <vscale x 8 x i16> @llvm.aarch64.sve.add.nxv8i16(<vscale x 8 x i1> [[TMP0]], <vscale x 8 x i16> [[OP1:%.*]], <vscale x 8 x i16> [[OP2:%.*]]), !inactive_lanes_undefined !2
 // CPP-CHECK-NEXT: ret <vscale x 8 x i16> [[TMP1]]
 //
 svuint16_t test_svadd_u16_x(svbool_t pg, svuint16_t op1, svuint16_t op2)
 {
   return SVE_ACLE_FUNC(svadd,_u16,_x,)(pg, op1, op2);
 }

 // CHECK-LABEL: @test_svadd_u32_x(
 // CHECK-NEXT: entry:
 // CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 4 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv4i1(<vscale x 16 x i1> [[PG:%.*]])
-// CHECK-NEXT: [[TMP1:%.*]] = tail call <vscale x 4 x i32> @llvm.aarch64.sve.add.nxv4i32(<vscale x 4 x i1> [[TMP0]], <vscale x 4 x i32> [[OP1:%.*]], <vscale x 4 x i32> [[OP2:%.*]])
+// CHECK-NEXT: [[TMP1:%.*]] = tail call <vscale x 4 x i32> @llvm.aarch64.sve.add.nxv4i32(<vscale x 4 x i1> [[TMP0]], <vscale x 4 x i32> [[OP1:%.*]], <vscale x 4 x i32> [[OP2:%.*]]), !inactive_lanes_undefined !2
 // CHECK-NEXT: ret <vscale x 4 x i32> [[TMP1]]
 //
 // CPP-CHECK-LABEL: @_Z16test_svadd_u32_xu10__SVBool_tu12__SVUint32_tu12__SVUint32_t(
 // CPP-CHECK-NEXT: entry:
 // CPP-CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 4 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv4i1(<vscale x 16 x i1> [[PG:%.*]])
-// CPP-CHECK-NEXT: [[TMP1:%.*]] = tail call <vscale x 4 x i32> @llvm.aarch64.sve.add.nxv4i32(<vscale x 4 x i1> [[TMP0]], <vscale x 4 x i32> [[OP1:%.*]], <vscale x 4 x i32> [[OP2:%.*]])
+// CPP-CHECK-NEXT: [[TMP1:%.*]] = tail call <vscale x 4 x i32> @llvm.aarch64.sve.add.nxv4i32(<vscale x 4 x i1> [[TMP0]], <vscale x 4 x i32> [[OP1:%.*]], <vscale x 4 x i32> [[OP2:%.*]]), !inactive_lanes_undefined !2
 // CPP-CHECK-NEXT: ret <vscale x 4 x i32> [[TMP1]]
 //
 svuint32_t test_svadd_u32_x(svbool_t pg, svuint32_t op1, svuint32_t op2)
 {
   return SVE_ACLE_FUNC(svadd,_u32,_x,)(pg, op1, op2);
 }

 // CHECK-LABEL: @test_svadd_u64_x(
 // CHECK-NEXT: entry:
 // CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 2 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv2i1(<vscale x 16 x i1> [[PG:%.*]])
-// CHECK-NEXT: [[TMP1:%.*]] = tail call <vscale x 2 x i64> @llvm.aarch64.sve.add.nxv2i64(<vscale x 2 x i1> [[TMP0]], <vscale x 2 x i64> [[OP1:%.*]], <vscale x 2 x i64> [[OP2:%.*]])
+// CHECK-NEXT: [[TMP1:%.*]] = tail call <vscale x 2 x i64> @llvm.aarch64.sve.add.nxv2i64(<vscale x 2 x i1> [[TMP0]], <vscale x 2 x i64> [[OP1:%.*]], <vscale x 2 x i64> [[OP2:%.*]]), !inactive_lanes_undefined !2
 // CHECK-NEXT: ret <vscale x 2 x i64> [[TMP1]]
 //
 // CPP-CHECK-LABEL: @_Z16test_svadd_u64_xu10__SVBool_tu12__SVUint64_tu12__SVUint64_t(
 // CPP-CHECK-NEXT: entry:
 // CPP-CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 2 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv2i1(<vscale x 16 x i1> [[PG:%.*]])
-// CPP-CHECK-NEXT: [[TMP1:%.*]] = tail call <vscale x 2 x i64> @llvm.aarch64.sve.add.nxv2i64(<vscale x 2 x i1> [[TMP0]], <vscale x 2 x i64> [[OP1:%.*]], <vscale x 2 x i64> [[OP2:%.*]])
+// CPP-CHECK-NEXT: [[TMP1:%.*]] = tail call <vscale x 2 x i64> @llvm.aarch64.sve.add.nxv2i64(<vscale x 2 x i1> [[TMP0]], <vscale x 2 x i64> [[OP1:%.*]], <vscale x 2 x i64> [[OP2:%.*]]), !inactive_lanes_undefined !2
 // CPP-CHECK-NEXT: ret <vscale x 2 x i64> [[TMP1]]
 //
 svuint64_t test_svadd_u64_x(svbool_t pg, svuint64_t op1, svuint64_t op2)
 {
   return SVE_ACLE_FUNC(svadd,_u64,_x,)(pg, op1, op2);
 }

 // CHECK-LABEL: @test_svadd_n_s8_z(
[339 unchanged lines not shown]
{		{
return SVE_ACLE_FUNC(svadd,_n_u64,_m,)(pg, op1, op2);		return SVE_ACLE_FUNC(svadd,_n_u64,_m,)(pg, op1, op2);
}		}

// CHECK-LABEL: @test_svadd_n_s8_x(		// CHECK-LABEL: @test_svadd_n_s8_x(
// CHECK-NEXT: entry:		// CHECK-NEXT: entry:
// CHECK-NEXT: [[DOTSPLATINSERT:%.]] = insertelement <vscale x 16 x i8> poison, i8 [[OP2:%.]], i64 0		// CHECK-NEXT: [[DOTSPLATINSERT:%.]] = insertelement <vscale x 16 x i8> poison, i8 [[OP2:%.]], i64 0
// CHECK-NEXT: [[TMP0:%.*]] = shufflevector <vscale x 16 x i8> [[DOTSPLATINSERT]], <vscale x 16 x i8> poison, <vscale x 16 x i32> zeroinitializer		// CHECK-NEXT: [[TMP0:%.*]] = shufflevector <vscale x 16 x i8> [[DOTSPLATINSERT]], <vscale x 16 x i8> poison, <vscale x 16 x i32> zeroinitializer
// CHECK-NEXT: [[TMP1:%.]] = tail call <vscale x 16 x i8> @llvm.aarch64.sve.add.nxv16i8(<vscale x 16 x i1> [[PG:%.]], <vscale x 16 x i8> [[OP1:%.*]], <vscale x 16 x i8> [[TMP0]])		// CHECK-NEXT: [[TMP1:%.]] = tail call <vscale x 16 x i8> @llvm.aarch64.sve.add.nxv16i8(<vscale x 16 x i1> [[PG:%.]], <vscale x 16 x i8> [[OP1:%.*]], <vscale x 16 x i8> [[TMP0]]), !inactive_lanes_undefined !2
// CHECK-NEXT: ret <vscale x 16 x i8> [[TMP1]]		// CHECK-NEXT: ret <vscale x 16 x i8> [[TMP1]]
//		//
// CPP-CHECK-LABEL: @_Z17test_svadd_n_s8_xu10__SVBool_tu10__SVInt8_ta(		// CPP-CHECK-LABEL: @_Z17test_svadd_n_s8_xu10__SVBool_tu10__SVInt8_ta(
// CPP-CHECK-NEXT: entry:		// CPP-CHECK-NEXT: entry:
// CPP-CHECK-NEXT: [[DOTSPLATINSERT:%.]] = insertelement <vscale x 16 x i8> poison, i8 [[OP2:%.]], i64 0		// CPP-CHECK-NEXT: [[DOTSPLATINSERT:%.]] = insertelement <vscale x 16 x i8> poison, i8 [[OP2:%.]], i64 0
// CPP-CHECK-NEXT: [[TMP0:%.*]] = shufflevector <vscale x 16 x i8> [[DOTSPLATINSERT]], <vscale x 16 x i8> poison, <vscale x 16 x i32> zeroinitializer		// CPP-CHECK-NEXT: [[TMP0:%.*]] = shufflevector <vscale x 16 x i8> [[DOTSPLATINSERT]], <vscale x 16 x i8> poison, <vscale x 16 x i32> zeroinitializer
// CPP-CHECK-NEXT: [[TMP1:%.]] = tail call <vscale x 16 x i8> @llvm.aarch64.sve.add.nxv16i8(<vscale x 16 x i1> [[PG:%.]], <vscale x 16 x i8> [[OP1:%.*]], <vscale x 16 x i8> [[TMP0]])		// CPP-CHECK-NEXT: [[TMP1:%.]] = tail call <vscale x 16 x i8> @llvm.aarch64.sve.add.nxv16i8(<vscale x 16 x i1> [[PG:%.]], <vscale x 16 x i8> [[OP1:%.*]], <vscale x 16 x i8> [[TMP0]]), !inactive_lanes_undefined !2
// CPP-CHECK-NEXT: ret <vscale x 16 x i8> [[TMP1]]		// CPP-CHECK-NEXT: ret <vscale x 16 x i8> [[TMP1]]
//		//
svint8_t test_svadd_n_s8_x(svbool_t pg, svint8_t op1, int8_t op2)		svint8_t test_svadd_n_s8_x(svbool_t pg, svint8_t op1, int8_t op2)
{		{
return SVE_ACLE_FUNC(svadd,_n_s8,_x,)(pg, op1, op2);		return SVE_ACLE_FUNC(svadd,_n_s8,_x,)(pg, op1, op2);
}		}

// CHECK-LABEL: @test_svadd_n_s16_x(		// CHECK-LABEL: @test_svadd_n_s16_x(
// CHECK-NEXT: entry:		// CHECK-NEXT: entry:
// CHECK-NEXT: [[TMP0:%.]] = tail call <vscale x 8 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv8i1(<vscale x 16 x i1> [[PG:%.]])		// CHECK-NEXT: [[TMP0:%.]] = tail call <vscale x 8 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv8i1(<vscale x 16 x i1> [[PG:%.]])
// CHECK-NEXT: [[DOTSPLATINSERT:%.]] = insertelement <vscale x 8 x i16> poison, i16 [[OP2:%.]], i64 0		// CHECK-NEXT: [[DOTSPLATINSERT:%.]] = insertelement <vscale x 8 x i16> poison, i16 [[OP2:%.]], i64 0
// CHECK-NEXT: [[TMP1:%.*]] = shufflevector <vscale x 8 x i16> [[DOTSPLATINSERT]], <vscale x 8 x i16> poison, <vscale x 8 x i32> zeroinitializer		// CHECK-NEXT: [[TMP1:%.*]] = shufflevector <vscale x 8 x i16> [[DOTSPLATINSERT]], <vscale x 8 x i16> poison, <vscale x 8 x i32> zeroinitializer
// CHECK-NEXT: [[TMP2:%.]] = tail call <vscale x 8 x i16> @llvm.aarch64.sve.add.nxv8i16(<vscale x 8 x i1> [[TMP0]], <vscale x 8 x i16> [[OP1:%.]], <vscale x 8 x i16> [[TMP1]])		// CHECK-NEXT: [[TMP2:%.]] = tail call <vscale x 8 x i16> @llvm.aarch64.sve.add.nxv8i16(<vscale x 8 x i1> [[TMP0]], <vscale x 8 x i16> [[OP1:%.]], <vscale x 8 x i16> [[TMP1]]), !inactive_lanes_undefined !2
// CHECK-NEXT: ret <vscale x 8 x i16> [[TMP2]]		// CHECK-NEXT: ret <vscale x 8 x i16> [[TMP2]]
//		//
// CPP-CHECK-LABEL: @_Z18test_svadd_n_s16_xu10__SVBool_tu11__SVInt16_ts(		// CPP-CHECK-LABEL: @_Z18test_svadd_n_s16_xu10__SVBool_tu11__SVInt16_ts(
// CPP-CHECK-NEXT: entry:		// CPP-CHECK-NEXT: entry:
// CPP-CHECK-NEXT: [[TMP0:%.]] = tail call <vscale x 8 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv8i1(<vscale x 16 x i1> [[PG:%.]])		// CPP-CHECK-NEXT: [[TMP0:%.]] = tail call <vscale x 8 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv8i1(<vscale x 16 x i1> [[PG:%.]])
// CPP-CHECK-NEXT: [[DOTSPLATINSERT:%.]] = insertelement <vscale x 8 x i16> poison, i16 [[OP2:%.]], i64 0		// CPP-CHECK-NEXT: [[DOTSPLATINSERT:%.]] = insertelement <vscale x 8 x i16> poison, i16 [[OP2:%.]], i64 0
// CPP-CHECK-NEXT: [[TMP1:%.*]] = shufflevector <vscale x 8 x i16> [[DOTSPLATINSERT]], <vscale x 8 x i16> poison, <vscale x 8 x i32> zeroinitializer		// CPP-CHECK-NEXT: [[TMP1:%.*]] = shufflevector <vscale x 8 x i16> [[DOTSPLATINSERT]], <vscale x 8 x i16> poison, <vscale x 8 x i32> zeroinitializer
// CPP-CHECK-NEXT: [[TMP2:%.]] = tail call <vscale x 8 x i16> @llvm.aarch64.sve.add.nxv8i16(<vscale x 8 x i1> [[TMP0]], <vscale x 8 x i16> [[OP1:%.]], <vscale x 8 x i16> [[TMP1]])		// CPP-CHECK-NEXT: [[TMP2:%.]] = tail call <vscale x 8 x i16> @llvm.aarch64.sve.add.nxv8i16(<vscale x 8 x i1> [[TMP0]], <vscale x 8 x i16> [[OP1:%.]], <vscale x 8 x i16> [[TMP1]]), !inactive_lanes_undefined !2
// CPP-CHECK-NEXT: ret <vscale x 8 x i16> [[TMP2]]		// CPP-CHECK-NEXT: ret <vscale x 8 x i16> [[TMP2]]
//		//
svint16_t test_svadd_n_s16_x(svbool_t pg, svint16_t op1, int16_t op2)		svint16_t test_svadd_n_s16_x(svbool_t pg, svint16_t op1, int16_t op2)
{		{
return SVE_ACLE_FUNC(svadd,_n_s16,_x,)(pg, op1, op2);		return SVE_ACLE_FUNC(svadd,_n_s16,_x,)(pg, op1, op2);
}		}

// CHECK-LABEL: @test_svadd_n_s32_x(		// CHECK-LABEL: @test_svadd_n_s32_x(
// CHECK-NEXT: entry:		// CHECK-NEXT: entry:
// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 4 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv4i1(<vscale x 16 x i1> [[PG:%.*]])		// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 4 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv4i1(<vscale x 16 x i1> [[PG:%.*]])
// CHECK-NEXT: [[DOTSPLATINSERT:%.*]] = insertelement <vscale x 4 x i32> poison, i32 [[OP2:%.*]], i64 0		// CHECK-NEXT: [[DOTSPLATINSERT:%.*]] = insertelement <vscale x 4 x i32> poison, i32 [[OP2:%.*]], i64 0
// CHECK-NEXT: [[TMP1:%.*]] = shufflevector <vscale x 4 x i32> [[DOTSPLATINSERT]], <vscale x 4 x i32> poison, <vscale x 4 x i32> zeroinitializer		// CHECK-NEXT: [[TMP1:%.*]] = shufflevector <vscale x 4 x i32> [[DOTSPLATINSERT]], <vscale x 4 x i32> poison, <vscale x 4 x i32> zeroinitializer
// CHECK-NEXT: [[TMP2:%.*]] = tail call <vscale x 4 x i32> @llvm.aarch64.sve.add.nxv4i32(<vscale x 4 x i1> [[TMP0]], <vscale x 4 x i32> [[OP1:%.*]], <vscale x 4 x i32> [[TMP1]])		// CHECK-NEXT: [[TMP2:%.*]] = tail call <vscale x 4 x i32> @llvm.aarch64.sve.add.nxv4i32(<vscale x 4 x i1> [[TMP0]], <vscale x 4 x i32> [[OP1:%.*]], <vscale x 4 x i32> [[TMP1]]), !inactive_lanes_undefined !2
// CHECK-NEXT: ret <vscale x 4 x i32> [[TMP2]]		// CHECK-NEXT: ret <vscale x 4 x i32> [[TMP2]]
//		//
// CPP-CHECK-LABEL: @_Z18test_svadd_n_s32_xu10__SVBool_tu11__SVInt32_ti(		// CPP-CHECK-LABEL: @_Z18test_svadd_n_s32_xu10__SVBool_tu11__SVInt32_ti(
// CPP-CHECK-NEXT: entry:		// CPP-CHECK-NEXT: entry:
// CPP-CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 4 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv4i1(<vscale x 16 x i1> [[PG:%.*]])		// CPP-CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 4 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv4i1(<vscale x 16 x i1> [[PG:%.*]])
// CPP-CHECK-NEXT: [[DOTSPLATINSERT:%.*]] = insertelement <vscale x 4 x i32> poison, i32 [[OP2:%.*]], i64 0		// CPP-CHECK-NEXT: [[DOTSPLATINSERT:%.*]] = insertelement <vscale x 4 x i32> poison, i32 [[OP2:%.*]], i64 0
// CPP-CHECK-NEXT: [[TMP1:%.*]] = shufflevector <vscale x 4 x i32> [[DOTSPLATINSERT]], <vscale x 4 x i32> poison, <vscale x 4 x i32> zeroinitializer		// CPP-CHECK-NEXT: [[TMP1:%.*]] = shufflevector <vscale x 4 x i32> [[DOTSPLATINSERT]], <vscale x 4 x i32> poison, <vscale x 4 x i32> zeroinitializer
// CPP-CHECK-NEXT: [[TMP2:%.*]] = tail call <vscale x 4 x i32> @llvm.aarch64.sve.add.nxv4i32(<vscale x 4 x i1> [[TMP0]], <vscale x 4 x i32> [[OP1:%.*]], <vscale x 4 x i32> [[TMP1]])		// CPP-CHECK-NEXT: [[TMP2:%.*]] = tail call <vscale x 4 x i32> @llvm.aarch64.sve.add.nxv4i32(<vscale x 4 x i1> [[TMP0]], <vscale x 4 x i32> [[OP1:%.*]], <vscale x 4 x i32> [[TMP1]]), !inactive_lanes_undefined !2
// CPP-CHECK-NEXT: ret <vscale x 4 x i32> [[TMP2]]		// CPP-CHECK-NEXT: ret <vscale x 4 x i32> [[TMP2]]
//		//
svint32_t test_svadd_n_s32_x(svbool_t pg, svint32_t op1, int32_t op2)		svint32_t test_svadd_n_s32_x(svbool_t pg, svint32_t op1, int32_t op2)
{		{
return SVE_ACLE_FUNC(svadd,_n_s32,_x,)(pg, op1, op2);		return SVE_ACLE_FUNC(svadd,_n_s32,_x,)(pg, op1, op2);
}		}

// CHECK-LABEL: @test_svadd_n_s64_x(		// CHECK-LABEL: @test_svadd_n_s64_x(
// CHECK-NEXT: entry:		// CHECK-NEXT: entry:
// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 2 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv2i1(<vscale x 16 x i1> [[PG:%.*]])		// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 2 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv2i1(<vscale x 16 x i1> [[PG:%.*]])
// CHECK-NEXT: [[DOTSPLATINSERT:%.*]] = insertelement <vscale x 2 x i64> poison, i64 [[OP2:%.*]], i64 0		// CHECK-NEXT: [[DOTSPLATINSERT:%.*]] = insertelement <vscale x 2 x i64> poison, i64 [[OP2:%.*]], i64 0
// CHECK-NEXT: [[TMP1:%.*]] = shufflevector <vscale x 2 x i64> [[DOTSPLATINSERT]], <vscale x 2 x i64> poison, <vscale x 2 x i32> zeroinitializer		// CHECK-NEXT: [[TMP1:%.*]] = shufflevector <vscale x 2 x i64> [[DOTSPLATINSERT]], <vscale x 2 x i64> poison, <vscale x 2 x i32> zeroinitializer
// CHECK-NEXT: [[TMP2:%.*]] = tail call <vscale x 2 x i64> @llvm.aarch64.sve.add.nxv2i64(<vscale x 2 x i1> [[TMP0]], <vscale x 2 x i64> [[OP1:%.*]], <vscale x 2 x i64> [[TMP1]])		// CHECK-NEXT: [[TMP2:%.*]] = tail call <vscale x 2 x i64> @llvm.aarch64.sve.add.nxv2i64(<vscale x 2 x i1> [[TMP0]], <vscale x 2 x i64> [[OP1:%.*]], <vscale x 2 x i64> [[TMP1]]), !inactive_lanes_undefined !2
// CHECK-NEXT: ret <vscale x 2 x i64> [[TMP2]]		// CHECK-NEXT: ret <vscale x 2 x i64> [[TMP2]]
//		//
// CPP-CHECK-LABEL: @_Z18test_svadd_n_s64_xu10__SVBool_tu11__SVInt64_tl(		// CPP-CHECK-LABEL: @_Z18test_svadd_n_s64_xu10__SVBool_tu11__SVInt64_tl(
// CPP-CHECK-NEXT: entry:		// CPP-CHECK-NEXT: entry:
// CPP-CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 2 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv2i1(<vscale x 16 x i1> [[PG:%.*]])		// CPP-CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 2 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv2i1(<vscale x 16 x i1> [[PG:%.*]])
// CPP-CHECK-NEXT: [[DOTSPLATINSERT:%.*]] = insertelement <vscale x 2 x i64> poison, i64 [[OP2:%.*]], i64 0		// CPP-CHECK-NEXT: [[DOTSPLATINSERT:%.*]] = insertelement <vscale x 2 x i64> poison, i64 [[OP2:%.*]], i64 0
// CPP-CHECK-NEXT: [[TMP1:%.*]] = shufflevector <vscale x 2 x i64> [[DOTSPLATINSERT]], <vscale x 2 x i64> poison, <vscale x 2 x i32> zeroinitializer		// CPP-CHECK-NEXT: [[TMP1:%.*]] = shufflevector <vscale x 2 x i64> [[DOTSPLATINSERT]], <vscale x 2 x i64> poison, <vscale x 2 x i32> zeroinitializer
// CPP-CHECK-NEXT: [[TMP2:%.*]] = tail call <vscale x 2 x i64> @llvm.aarch64.sve.add.nxv2i64(<vscale x 2 x i1> [[TMP0]], <vscale x 2 x i64> [[OP1:%.*]], <vscale x 2 x i64> [[TMP1]])		// CPP-CHECK-NEXT: [[TMP2:%.*]] = tail call <vscale x 2 x i64> @llvm.aarch64.sve.add.nxv2i64(<vscale x 2 x i1> [[TMP0]], <vscale x 2 x i64> [[OP1:%.*]], <vscale x 2 x i64> [[TMP1]]), !inactive_lanes_undefined !2
// CPP-CHECK-NEXT: ret <vscale x 2 x i64> [[TMP2]]		// CPP-CHECK-NEXT: ret <vscale x 2 x i64> [[TMP2]]
//		//
svint64_t test_svadd_n_s64_x(svbool_t pg, svint64_t op1, int64_t op2)		svint64_t test_svadd_n_s64_x(svbool_t pg, svint64_t op1, int64_t op2)
{		{
return SVE_ACLE_FUNC(svadd,_n_s64,_x,)(pg, op1, op2);		return SVE_ACLE_FUNC(svadd,_n_s64,_x,)(pg, op1, op2);
}		}

// CHECK-LABEL: @test_svadd_n_u8_x(		// CHECK-LABEL: @test_svadd_n_u8_x(
// CHECK-NEXT: entry:		// CHECK-NEXT: entry:
// CHECK-NEXT: [[DOTSPLATINSERT:%.*]] = insertelement <vscale x 16 x i8> poison, i8 [[OP2:%.*]], i64 0		// CHECK-NEXT: [[DOTSPLATINSERT:%.*]] = insertelement <vscale x 16 x i8> poison, i8 [[OP2:%.*]], i64 0
// CHECK-NEXT: [[TMP0:%.*]] = shufflevector <vscale x 16 x i8> [[DOTSPLATINSERT]], <vscale x 16 x i8> poison, <vscale x 16 x i32> zeroinitializer		// CHECK-NEXT: [[TMP0:%.*]] = shufflevector <vscale x 16 x i8> [[DOTSPLATINSERT]], <vscale x 16 x i8> poison, <vscale x 16 x i32> zeroinitializer
// CHECK-NEXT: [[TMP1:%.*]] = tail call <vscale x 16 x i8> @llvm.aarch64.sve.add.nxv16i8(<vscale x 16 x i1> [[PG:%.*]], <vscale x 16 x i8> [[OP1:%.*]], <vscale x 16 x i8> [[TMP0]])		// CHECK-NEXT: [[TMP1:%.*]] = tail call <vscale x 16 x i8> @llvm.aarch64.sve.add.nxv16i8(<vscale x 16 x i1> [[PG:%.*]], <vscale x 16 x i8> [[OP1:%.*]], <vscale x 16 x i8> [[TMP0]]), !inactive_lanes_undefined !2
// CHECK-NEXT: ret <vscale x 16 x i8> [[TMP1]]		// CHECK-NEXT: ret <vscale x 16 x i8> [[TMP1]]
//		//
// CPP-CHECK-LABEL: @_Z17test_svadd_n_u8_xu10__SVBool_tu11__SVUint8_th(		// CPP-CHECK-LABEL: @_Z17test_svadd_n_u8_xu10__SVBool_tu11__SVUint8_th(
// CPP-CHECK-NEXT: entry:		// CPP-CHECK-NEXT: entry:
// CPP-CHECK-NEXT: [[DOTSPLATINSERT:%.*]] = insertelement <vscale x 16 x i8> poison, i8 [[OP2:%.*]], i64 0		// CPP-CHECK-NEXT: [[DOTSPLATINSERT:%.*]] = insertelement <vscale x 16 x i8> poison, i8 [[OP2:%.*]], i64 0
// CPP-CHECK-NEXT: [[TMP0:%.*]] = shufflevector <vscale x 16 x i8> [[DOTSPLATINSERT]], <vscale x 16 x i8> poison, <vscale x 16 x i32> zeroinitializer		// CPP-CHECK-NEXT: [[TMP0:%.*]] = shufflevector <vscale x 16 x i8> [[DOTSPLATINSERT]], <vscale x 16 x i8> poison, <vscale x 16 x i32> zeroinitializer
// CPP-CHECK-NEXT: [[TMP1:%.*]] = tail call <vscale x 16 x i8> @llvm.aarch64.sve.add.nxv16i8(<vscale x 16 x i1> [[PG:%.*]], <vscale x 16 x i8> [[OP1:%.*]], <vscale x 16 x i8> [[TMP0]])		// CPP-CHECK-NEXT: [[TMP1:%.*]] = tail call <vscale x 16 x i8> @llvm.aarch64.sve.add.nxv16i8(<vscale x 16 x i1> [[PG:%.*]], <vscale x 16 x i8> [[OP1:%.*]], <vscale x 16 x i8> [[TMP0]]), !inactive_lanes_undefined !2
// CPP-CHECK-NEXT: ret <vscale x 16 x i8> [[TMP1]]		// CPP-CHECK-NEXT: ret <vscale x 16 x i8> [[TMP1]]
//		//
svuint8_t test_svadd_n_u8_x(svbool_t pg, svuint8_t op1, uint8_t op2)		svuint8_t test_svadd_n_u8_x(svbool_t pg, svuint8_t op1, uint8_t op2)
{		{
return SVE_ACLE_FUNC(svadd,_n_u8,_x,)(pg, op1, op2);		return SVE_ACLE_FUNC(svadd,_n_u8,_x,)(pg, op1, op2);
}		}

// CHECK-LABEL: @test_svadd_n_u16_x(		// CHECK-LABEL: @test_svadd_n_u16_x(
// CHECK-NEXT: entry:		// CHECK-NEXT: entry:
// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 8 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv8i1(<vscale x 16 x i1> [[PG:%.*]])		// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 8 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv8i1(<vscale x 16 x i1> [[PG:%.*]])
// CHECK-NEXT: [[DOTSPLATINSERT:%.*]] = insertelement <vscale x 8 x i16> poison, i16 [[OP2:%.*]], i64 0		// CHECK-NEXT: [[DOTSPLATINSERT:%.*]] = insertelement <vscale x 8 x i16> poison, i16 [[OP2:%.*]], i64 0
// CHECK-NEXT: [[TMP1:%.*]] = shufflevector <vscale x 8 x i16> [[DOTSPLATINSERT]], <vscale x 8 x i16> poison, <vscale x 8 x i32> zeroinitializer		// CHECK-NEXT: [[TMP1:%.*]] = shufflevector <vscale x 8 x i16> [[DOTSPLATINSERT]], <vscale x 8 x i16> poison, <vscale x 8 x i32> zeroinitializer
// CHECK-NEXT: [[TMP2:%.*]] = tail call <vscale x 8 x i16> @llvm.aarch64.sve.add.nxv8i16(<vscale x 8 x i1> [[TMP0]], <vscale x 8 x i16> [[OP1:%.*]], <vscale x 8 x i16> [[TMP1]])		// CHECK-NEXT: [[TMP2:%.*]] = tail call <vscale x 8 x i16> @llvm.aarch64.sve.add.nxv8i16(<vscale x 8 x i1> [[TMP0]], <vscale x 8 x i16> [[OP1:%.*]], <vscale x 8 x i16> [[TMP1]]), !inactive_lanes_undefined !2
// CHECK-NEXT: ret <vscale x 8 x i16> [[TMP2]]		// CHECK-NEXT: ret <vscale x 8 x i16> [[TMP2]]
//		//
// CPP-CHECK-LABEL: @_Z18test_svadd_n_u16_xu10__SVBool_tu12__SVUint16_tt(		// CPP-CHECK-LABEL: @_Z18test_svadd_n_u16_xu10__SVBool_tu12__SVUint16_tt(
// CPP-CHECK-NEXT: entry:		// CPP-CHECK-NEXT: entry:
// CPP-CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 8 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv8i1(<vscale x 16 x i1> [[PG:%.*]])		// CPP-CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 8 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv8i1(<vscale x 16 x i1> [[PG:%.*]])
// CPP-CHECK-NEXT: [[DOTSPLATINSERT:%.*]] = insertelement <vscale x 8 x i16> poison, i16 [[OP2:%.*]], i64 0		// CPP-CHECK-NEXT: [[DOTSPLATINSERT:%.*]] = insertelement <vscale x 8 x i16> poison, i16 [[OP2:%.*]], i64 0
// CPP-CHECK-NEXT: [[TMP1:%.*]] = shufflevector <vscale x 8 x i16> [[DOTSPLATINSERT]], <vscale x 8 x i16> poison, <vscale x 8 x i32> zeroinitializer		// CPP-CHECK-NEXT: [[TMP1:%.*]] = shufflevector <vscale x 8 x i16> [[DOTSPLATINSERT]], <vscale x 8 x i16> poison, <vscale x 8 x i32> zeroinitializer
// CPP-CHECK-NEXT: [[TMP2:%.*]] = tail call <vscale x 8 x i16> @llvm.aarch64.sve.add.nxv8i16(<vscale x 8 x i1> [[TMP0]], <vscale x 8 x i16> [[OP1:%.*]], <vscale x 8 x i16> [[TMP1]])		// CPP-CHECK-NEXT: [[TMP2:%.*]] = tail call <vscale x 8 x i16> @llvm.aarch64.sve.add.nxv8i16(<vscale x 8 x i1> [[TMP0]], <vscale x 8 x i16> [[OP1:%.*]], <vscale x 8 x i16> [[TMP1]]), !inactive_lanes_undefined !2
// CPP-CHECK-NEXT: ret <vscale x 8 x i16> [[TMP2]]		// CPP-CHECK-NEXT: ret <vscale x 8 x i16> [[TMP2]]
//		//
svuint16_t test_svadd_n_u16_x(svbool_t pg, svuint16_t op1, uint16_t op2)		svuint16_t test_svadd_n_u16_x(svbool_t pg, svuint16_t op1, uint16_t op2)
{		{
return SVE_ACLE_FUNC(svadd,_n_u16,_x,)(pg, op1, op2);		return SVE_ACLE_FUNC(svadd,_n_u16,_x,)(pg, op1, op2);
}		}

// CHECK-LABEL: @test_svadd_n_u32_x(		// CHECK-LABEL: @test_svadd_n_u32_x(
// CHECK-NEXT: entry:		// CHECK-NEXT: entry:
// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 4 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv4i1(<vscale x 16 x i1> [[PG:%.*]])		// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 4 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv4i1(<vscale x 16 x i1> [[PG:%.*]])
// CHECK-NEXT: [[DOTSPLATINSERT:%.*]] = insertelement <vscale x 4 x i32> poison, i32 [[OP2:%.*]], i64 0		// CHECK-NEXT: [[DOTSPLATINSERT:%.*]] = insertelement <vscale x 4 x i32> poison, i32 [[OP2:%.*]], i64 0
// CHECK-NEXT: [[TMP1:%.*]] = shufflevector <vscale x 4 x i32> [[DOTSPLATINSERT]], <vscale x 4 x i32> poison, <vscale x 4 x i32> zeroinitializer		// CHECK-NEXT: [[TMP1:%.*]] = shufflevector <vscale x 4 x i32> [[DOTSPLATINSERT]], <vscale x 4 x i32> poison, <vscale x 4 x i32> zeroinitializer
// CHECK-NEXT: [[TMP2:%.*]] = tail call <vscale x 4 x i32> @llvm.aarch64.sve.add.nxv4i32(<vscale x 4 x i1> [[TMP0]], <vscale x 4 x i32> [[OP1:%.*]], <vscale x 4 x i32> [[TMP1]])		// CHECK-NEXT: [[TMP2:%.*]] = tail call <vscale x 4 x i32> @llvm.aarch64.sve.add.nxv4i32(<vscale x 4 x i1> [[TMP0]], <vscale x 4 x i32> [[OP1:%.*]], <vscale x 4 x i32> [[TMP1]]), !inactive_lanes_undefined !2
// CHECK-NEXT: ret <vscale x 4 x i32> [[TMP2]]		// CHECK-NEXT: ret <vscale x 4 x i32> [[TMP2]]
//		//
// CPP-CHECK-LABEL: @_Z18test_svadd_n_u32_xu10__SVBool_tu12__SVUint32_tj(		// CPP-CHECK-LABEL: @_Z18test_svadd_n_u32_xu10__SVBool_tu12__SVUint32_tj(
// CPP-CHECK-NEXT: entry:		// CPP-CHECK-NEXT: entry:
// CPP-CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 4 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv4i1(<vscale x 16 x i1> [[PG:%.*]])		// CPP-CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 4 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv4i1(<vscale x 16 x i1> [[PG:%.*]])
// CPP-CHECK-NEXT: [[DOTSPLATINSERT:%.*]] = insertelement <vscale x 4 x i32> poison, i32 [[OP2:%.*]], i64 0		// CPP-CHECK-NEXT: [[DOTSPLATINSERT:%.*]] = insertelement <vscale x 4 x i32> poison, i32 [[OP2:%.*]], i64 0
// CPP-CHECK-NEXT: [[TMP1:%.*]] = shufflevector <vscale x 4 x i32> [[DOTSPLATINSERT]], <vscale x 4 x i32> poison, <vscale x 4 x i32> zeroinitializer		// CPP-CHECK-NEXT: [[TMP1:%.*]] = shufflevector <vscale x 4 x i32> [[DOTSPLATINSERT]], <vscale x 4 x i32> poison, <vscale x 4 x i32> zeroinitializer
// CPP-CHECK-NEXT: [[TMP2:%.*]] = tail call <vscale x 4 x i32> @llvm.aarch64.sve.add.nxv4i32(<vscale x 4 x i1> [[TMP0]], <vscale x 4 x i32> [[OP1:%.*]], <vscale x 4 x i32> [[TMP1]])		// CPP-CHECK-NEXT: [[TMP2:%.*]] = tail call <vscale x 4 x i32> @llvm.aarch64.sve.add.nxv4i32(<vscale x 4 x i1> [[TMP0]], <vscale x 4 x i32> [[OP1:%.*]], <vscale x 4 x i32> [[TMP1]]), !inactive_lanes_undefined !2
// CPP-CHECK-NEXT: ret <vscale x 4 x i32> [[TMP2]]		// CPP-CHECK-NEXT: ret <vscale x 4 x i32> [[TMP2]]
//		//
svuint32_t test_svadd_n_u32_x(svbool_t pg, svuint32_t op1, uint32_t op2)		svuint32_t test_svadd_n_u32_x(svbool_t pg, svuint32_t op1, uint32_t op2)
{		{
return SVE_ACLE_FUNC(svadd,_n_u32,_x,)(pg, op1, op2);		return SVE_ACLE_FUNC(svadd,_n_u32,_x,)(pg, op1, op2);
}		}

// CHECK-LABEL: @test_svadd_n_u64_x(		// CHECK-LABEL: @test_svadd_n_u64_x(
// CHECK-NEXT: entry:		// CHECK-NEXT: entry:
// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 2 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv2i1(<vscale x 16 x i1> [[PG:%.*]])		// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 2 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv2i1(<vscale x 16 x i1> [[PG:%.*]])
// CHECK-NEXT: [[DOTSPLATINSERT:%.*]] = insertelement <vscale x 2 x i64> poison, i64 [[OP2:%.*]], i64 0		// CHECK-NEXT: [[DOTSPLATINSERT:%.*]] = insertelement <vscale x 2 x i64> poison, i64 [[OP2:%.*]], i64 0
// CHECK-NEXT: [[TMP1:%.*]] = shufflevector <vscale x 2 x i64> [[DOTSPLATINSERT]], <vscale x 2 x i64> poison, <vscale x 2 x i32> zeroinitializer		// CHECK-NEXT: [[TMP1:%.*]] = shufflevector <vscale x 2 x i64> [[DOTSPLATINSERT]], <vscale x 2 x i64> poison, <vscale x 2 x i32> zeroinitializer
// CHECK-NEXT: [[TMP2:%.*]] = tail call <vscale x 2 x i64> @llvm.aarch64.sve.add.nxv2i64(<vscale x 2 x i1> [[TMP0]], <vscale x 2 x i64> [[OP1:%.*]], <vscale x 2 x i64> [[TMP1]])		// CHECK-NEXT: [[TMP2:%.*]] = tail call <vscale x 2 x i64> @llvm.aarch64.sve.add.nxv2i64(<vscale x 2 x i1> [[TMP0]], <vscale x 2 x i64> [[OP1:%.*]], <vscale x 2 x i64> [[TMP1]]), !inactive_lanes_undefined !2
// CHECK-NEXT: ret <vscale x 2 x i64> [[TMP2]]		// CHECK-NEXT: ret <vscale x 2 x i64> [[TMP2]]
//		//
// CPP-CHECK-LABEL: @_Z18test_svadd_n_u64_xu10__SVBool_tu12__SVUint64_tm(		// CPP-CHECK-LABEL: @_Z18test_svadd_n_u64_xu10__SVBool_tu12__SVUint64_tm(
// CPP-CHECK-NEXT: entry:		// CPP-CHECK-NEXT: entry:
// CPP-CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 2 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv2i1(<vscale x 16 x i1> [[PG:%.*]])		// CPP-CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 2 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv2i1(<vscale x 16 x i1> [[PG:%.*]])
// CPP-CHECK-NEXT: [[DOTSPLATINSERT:%.*]] = insertelement <vscale x 2 x i64> poison, i64 [[OP2:%.*]], i64 0		// CPP-CHECK-NEXT: [[DOTSPLATINSERT:%.*]] = insertelement <vscale x 2 x i64> poison, i64 [[OP2:%.*]], i64 0
// CPP-CHECK-NEXT: [[TMP1:%.*]] = shufflevector <vscale x 2 x i64> [[DOTSPLATINSERT]], <vscale x 2 x i64> poison, <vscale x 2 x i32> zeroinitializer		// CPP-CHECK-NEXT: [[TMP1:%.*]] = shufflevector <vscale x 2 x i64> [[DOTSPLATINSERT]], <vscale x 2 x i64> poison, <vscale x 2 x i32> zeroinitializer
// CPP-CHECK-NEXT: [[TMP2:%.*]] = tail call <vscale x 2 x i64> @llvm.aarch64.sve.add.nxv2i64(<vscale x 2 x i1> [[TMP0]], <vscale x 2 x i64> [[OP1:%.*]], <vscale x 2 x i64> [[TMP1]])		// CPP-CHECK-NEXT: [[TMP2:%.*]] = tail call <vscale x 2 x i64> @llvm.aarch64.sve.add.nxv2i64(<vscale x 2 x i1> [[TMP0]], <vscale x 2 x i64> [[OP1:%.*]], <vscale x 2 x i64> [[TMP1]]), !inactive_lanes_undefined !2
// CPP-CHECK-NEXT: ret <vscale x 2 x i64> [[TMP2]]		// CPP-CHECK-NEXT: ret <vscale x 2 x i64> [[TMP2]]
//		//
svuint64_t test_svadd_n_u64_x(svbool_t pg, svuint64_t op1, uint64_t op2)		svuint64_t test_svadd_n_u64_x(svbool_t pg, svuint64_t op1, uint64_t op2)
{		{
return SVE_ACLE_FUNC(svadd,_n_u64,_x,)(pg, op1, op2);		return SVE_ACLE_FUNC(svadd,_n_u64,_x,)(pg, op1, op2);
}		}

// CHECK-LABEL: @test_svadd_f16_z(		// CHECK-LABEL: @test_svadd_f16_z(
svfloat64_t test_svadd_f64_m(svbool_t pg, svfloat64_t op1, svfloat64_t op2)		svfloat64_t test_svadd_f64_m(svbool_t pg, svfloat64_t op1, svfloat64_t op2)
{		{
return SVE_ACLE_FUNC(svadd,_f64,_m,)(pg, op1, op2);		return SVE_ACLE_FUNC(svadd,_f64,_m,)(pg, op1, op2);
}		}

// CHECK-LABEL: @test_svadd_f16_x(		// CHECK-LABEL: @test_svadd_f16_x(
// CHECK-NEXT: entry:		// CHECK-NEXT: entry:
// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 8 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv8i1(<vscale x 16 x i1> [[PG:%.*]])		// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 8 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv8i1(<vscale x 16 x i1> [[PG:%.*]])
// CHECK-NEXT: [[TMP1:%.*]] = tail call <vscale x 8 x half> @llvm.aarch64.sve.fadd.nxv8f16(<vscale x 8 x i1> [[TMP0]], <vscale x 8 x half> [[OP1:%.*]], <vscale x 8 x half> [[OP2:%.*]])		// CHECK-NEXT: [[TMP1:%.*]] = tail call <vscale x 8 x half> @llvm.aarch64.sve.fadd.nxv8f16(<vscale x 8 x i1> [[TMP0]], <vscale x 8 x half> [[OP1:%.*]], <vscale x 8 x half> [[OP2:%.*]]), !inactive_lanes_undefined !2
// CHECK-NEXT: ret <vscale x 8 x half> [[TMP1]]		// CHECK-NEXT: ret <vscale x 8 x half> [[TMP1]]
//		//
// CPP-CHECK-LABEL: @_Z16test_svadd_f16_xu10__SVBool_tu13__SVFloat16_tu13__SVFloat16_t(		// CPP-CHECK-LABEL: @_Z16test_svadd_f16_xu10__SVBool_tu13__SVFloat16_tu13__SVFloat16_t(
// CPP-CHECK-NEXT: entry:		// CPP-CHECK-NEXT: entry:
// CPP-CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 8 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv8i1(<vscale x 16 x i1> [[PG:%.*]])		// CPP-CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 8 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv8i1(<vscale x 16 x i1> [[PG:%.*]])
// CPP-CHECK-NEXT: [[TMP1:%.*]] = tail call <vscale x 8 x half> @llvm.aarch64.sve.fadd.nxv8f16(<vscale x 8 x i1> [[TMP0]], <vscale x 8 x half> [[OP1:%.*]], <vscale x 8 x half> [[OP2:%.*]])		// CPP-CHECK-NEXT: [[TMP1:%.*]] = tail call <vscale x 8 x half> @llvm.aarch64.sve.fadd.nxv8f16(<vscale x 8 x i1> [[TMP0]], <vscale x 8 x half> [[OP1:%.*]], <vscale x 8 x half> [[OP2:%.*]]), !inactive_lanes_undefined !2
// CPP-CHECK-NEXT: ret <vscale x 8 x half> [[TMP1]]		// CPP-CHECK-NEXT: ret <vscale x 8 x half> [[TMP1]]
//		//
svfloat16_t test_svadd_f16_x(svbool_t pg, svfloat16_t op1, svfloat16_t op2)		svfloat16_t test_svadd_f16_x(svbool_t pg, svfloat16_t op1, svfloat16_t op2)
{		{
return SVE_ACLE_FUNC(svadd,_f16,_x,)(pg, op1, op2);		return SVE_ACLE_FUNC(svadd,_f16,_x,)(pg, op1, op2);
}		}

// CHECK-LABEL: @test_svadd_f32_x(		// CHECK-LABEL: @test_svadd_f32_x(
// CHECK-NEXT: entry:		// CHECK-NEXT: entry:
// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 4 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv4i1(<vscale x 16 x i1> [[PG:%.*]])		// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 4 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv4i1(<vscale x 16 x i1> [[PG:%.*]])
// CHECK-NEXT: [[TMP1:%.*]] = tail call <vscale x 4 x float> @llvm.aarch64.sve.fadd.nxv4f32(<vscale x 4 x i1> [[TMP0]], <vscale x 4 x float> [[OP1:%.*]], <vscale x 4 x float> [[OP2:%.*]])		// CHECK-NEXT: [[TMP1:%.*]] = tail call <vscale x 4 x float> @llvm.aarch64.sve.fadd.nxv4f32(<vscale x 4 x i1> [[TMP0]], <vscale x 4 x float> [[OP1:%.*]], <vscale x 4 x float> [[OP2:%.*]]), !inactive_lanes_undefined !2
// CHECK-NEXT: ret <vscale x 4 x float> [[TMP1]]		// CHECK-NEXT: ret <vscale x 4 x float> [[TMP1]]
//		//
// CPP-CHECK-LABEL: @_Z16test_svadd_f32_xu10__SVBool_tu13__SVFloat32_tu13__SVFloat32_t(		// CPP-CHECK-LABEL: @_Z16test_svadd_f32_xu10__SVBool_tu13__SVFloat32_tu13__SVFloat32_t(
// CPP-CHECK-NEXT: entry:		// CPP-CHECK-NEXT: entry:
// CPP-CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 4 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv4i1(<vscale x 16 x i1> [[PG:%.*]])		// CPP-CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 4 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv4i1(<vscale x 16 x i1> [[PG:%.*]])
// CPP-CHECK-NEXT: [[TMP1:%.*]] = tail call <vscale x 4 x float> @llvm.aarch64.sve.fadd.nxv4f32(<vscale x 4 x i1> [[TMP0]], <vscale x 4 x float> [[OP1:%.*]], <vscale x 4 x float> [[OP2:%.*]])		// CPP-CHECK-NEXT: [[TMP1:%.*]] = tail call <vscale x 4 x float> @llvm.aarch64.sve.fadd.nxv4f32(<vscale x 4 x i1> [[TMP0]], <vscale x 4 x float> [[OP1:%.*]], <vscale x 4 x float> [[OP2:%.*]]), !inactive_lanes_undefined !2
// CPP-CHECK-NEXT: ret <vscale x 4 x float> [[TMP1]]		// CPP-CHECK-NEXT: ret <vscale x 4 x float> [[TMP1]]
//		//
svfloat32_t test_svadd_f32_x(svbool_t pg, svfloat32_t op1, svfloat32_t op2)		svfloat32_t test_svadd_f32_x(svbool_t pg, svfloat32_t op1, svfloat32_t op2)
{		{
return SVE_ACLE_FUNC(svadd,_f32,_x,)(pg, op1, op2);		return SVE_ACLE_FUNC(svadd,_f32,_x,)(pg, op1, op2);
}		}

// CHECK-LABEL: @test_svadd_f64_x(		// CHECK-LABEL: @test_svadd_f64_x(
// CHECK-NEXT: entry:		// CHECK-NEXT: entry:
// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 2 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv2i1(<vscale x 16 x i1> [[PG:%.*]])		// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 2 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv2i1(<vscale x 16 x i1> [[PG:%.*]])
// CHECK-NEXT: [[TMP1:%.*]] = tail call <vscale x 2 x double> @llvm.aarch64.sve.fadd.nxv2f64(<vscale x 2 x i1> [[TMP0]], <vscale x 2 x double> [[OP1:%.*]], <vscale x 2 x double> [[OP2:%.*]])		// CHECK-NEXT: [[TMP1:%.*]] = tail call <vscale x 2 x double> @llvm.aarch64.sve.fadd.nxv2f64(<vscale x 2 x i1> [[TMP0]], <vscale x 2 x double> [[OP1:%.*]], <vscale x 2 x double> [[OP2:%.*]]), !inactive_lanes_undefined !2
// CHECK-NEXT: ret <vscale x 2 x double> [[TMP1]]		// CHECK-NEXT: ret <vscale x 2 x double> [[TMP1]]
//		//
// CPP-CHECK-LABEL: @_Z16test_svadd_f64_xu10__SVBool_tu13__SVFloat64_tu13__SVFloat64_t(		// CPP-CHECK-LABEL: @_Z16test_svadd_f64_xu10__SVBool_tu13__SVFloat64_tu13__SVFloat64_t(
// CPP-CHECK-NEXT: entry:		// CPP-CHECK-NEXT: entry:
// CPP-CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 2 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv2i1(<vscale x 16 x i1> [[PG:%.*]])		// CPP-CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 2 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv2i1(<vscale x 16 x i1> [[PG:%.*]])
// CPP-CHECK-NEXT: [[TMP1:%.*]] = tail call <vscale x 2 x double> @llvm.aarch64.sve.fadd.nxv2f64(<vscale x 2 x i1> [[TMP0]], <vscale x 2 x double> [[OP1:%.*]], <vscale x 2 x double> [[OP2:%.*]])		// CPP-CHECK-NEXT: [[TMP1:%.*]] = tail call <vscale x 2 x double> @llvm.aarch64.sve.fadd.nxv2f64(<vscale x 2 x i1> [[TMP0]], <vscale x 2 x double> [[OP1:%.*]], <vscale x 2 x double> [[OP2:%.*]]), !inactive_lanes_undefined !2
// CPP-CHECK-NEXT: ret <vscale x 2 x double> [[TMP1]]		// CPP-CHECK-NEXT: ret <vscale x 2 x double> [[TMP1]]
//		//
svfloat64_t test_svadd_f64_x(svbool_t pg, svfloat64_t op1, svfloat64_t op2)		svfloat64_t test_svadd_f64_x(svbool_t pg, svfloat64_t op1, svfloat64_t op2)
{		{
return SVE_ACLE_FUNC(svadd,_f64,_x,)(pg, op1, op2);		return SVE_ACLE_FUNC(svadd,_f64,_x,)(pg, op1, op2);
}		}

// CHECK-LABEL: @test_svadd_n_f16_z(		// CHECK-LABEL: @test_svadd_n_f16_z(
svfloat64_t test_svadd_n_f64_m(svbool_t pg, svfloat64_t op1, float64_t op2)		svfloat64_t test_svadd_n_f64_m(svbool_t pg, svfloat64_t op1, float64_t op2)
return SVE_ACLE_FUNC(svadd,_n_f64,_m,)(pg, op1, op2);		return SVE_ACLE_FUNC(svadd,_n_f64,_m,)(pg, op1, op2);
}		}

// CHECK-LABEL: @test_svadd_n_f16_x(		// CHECK-LABEL: @test_svadd_n_f16_x(
// CHECK-NEXT: entry:		// CHECK-NEXT: entry:
// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 8 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv8i1(<vscale x 16 x i1> [[PG:%.*]])		// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 8 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv8i1(<vscale x 16 x i1> [[PG:%.*]])
// CHECK-NEXT: [[DOTSPLATINSERT:%.*]] = insertelement <vscale x 8 x half> poison, half [[OP2:%.*]], i64 0		// CHECK-NEXT: [[DOTSPLATINSERT:%.*]] = insertelement <vscale x 8 x half> poison, half [[OP2:%.*]], i64 0
// CHECK-NEXT: [[TMP1:%.*]] = shufflevector <vscale x 8 x half> [[DOTSPLATINSERT]], <vscale x 8 x half> poison, <vscale x 8 x i32> zeroinitializer		// CHECK-NEXT: [[TMP1:%.*]] = shufflevector <vscale x 8 x half> [[DOTSPLATINSERT]], <vscale x 8 x half> poison, <vscale x 8 x i32> zeroinitializer
// CHECK-NEXT: [[TMP2:%.*]] = tail call <vscale x 8 x half> @llvm.aarch64.sve.fadd.nxv8f16(<vscale x 8 x i1> [[TMP0]], <vscale x 8 x half> [[OP1:%.*]], <vscale x 8 x half> [[TMP1]])		// CHECK-NEXT: [[TMP2:%.*]] = tail call <vscale x 8 x half> @llvm.aarch64.sve.fadd.nxv8f16(<vscale x 8 x i1> [[TMP0]], <vscale x 8 x half> [[OP1:%.*]], <vscale x 8 x half> [[TMP1]]), !inactive_lanes_undefined !2
// CHECK-NEXT: ret <vscale x 8 x half> [[TMP2]]		// CHECK-NEXT: ret <vscale x 8 x half> [[TMP2]]
//		//
// CPP-CHECK-LABEL: @_Z18test_svadd_n_f16_xu10__SVBool_tu13__SVFloat16_tDh(		// CPP-CHECK-LABEL: @_Z18test_svadd_n_f16_xu10__SVBool_tu13__SVFloat16_tDh(
// CPP-CHECK-NEXT: entry:		// CPP-CHECK-NEXT: entry:
// CPP-CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 8 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv8i1(<vscale x 16 x i1> [[PG:%.*]])		// CPP-CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 8 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv8i1(<vscale x 16 x i1> [[PG:%.*]])
// CPP-CHECK-NEXT: [[DOTSPLATINSERT:%.*]] = insertelement <vscale x 8 x half> poison, half [[OP2:%.*]], i64 0		// CPP-CHECK-NEXT: [[DOTSPLATINSERT:%.*]] = insertelement <vscale x 8 x half> poison, half [[OP2:%.*]], i64 0
// CPP-CHECK-NEXT: [[TMP1:%.*]] = shufflevector <vscale x 8 x half> [[DOTSPLATINSERT]], <vscale x 8 x half> poison, <vscale x 8 x i32> zeroinitializer		// CPP-CHECK-NEXT: [[TMP1:%.*]] = shufflevector <vscale x 8 x half> [[DOTSPLATINSERT]], <vscale x 8 x half> poison, <vscale x 8 x i32> zeroinitializer
// CPP-CHECK-NEXT: [[TMP2:%.*]] = tail call <vscale x 8 x half> @llvm.aarch64.sve.fadd.nxv8f16(<vscale x 8 x i1> [[TMP0]], <vscale x 8 x half> [[OP1:%.*]], <vscale x 8 x half> [[TMP1]])		// CPP-CHECK-NEXT: [[TMP2:%.*]] = tail call <vscale x 8 x half> @llvm.aarch64.sve.fadd.nxv8f16(<vscale x 8 x i1> [[TMP0]], <vscale x 8 x half> [[OP1:%.*]], <vscale x 8 x half> [[TMP1]]), !inactive_lanes_undefined !2
// CPP-CHECK-NEXT: ret <vscale x 8 x half> [[TMP2]]		// CPP-CHECK-NEXT: ret <vscale x 8 x half> [[TMP2]]
//		//
svfloat16_t test_svadd_n_f16_x(svbool_t pg, svfloat16_t op1, float16_t op2)		svfloat16_t test_svadd_n_f16_x(svbool_t pg, svfloat16_t op1, float16_t op2)
{		{
return SVE_ACLE_FUNC(svadd,_n_f16,_x,)(pg, op1, op2);		return SVE_ACLE_FUNC(svadd,_n_f16,_x,)(pg, op1, op2);
}		}

// CHECK-LABEL: @test_svadd_n_f32_x(		// CHECK-LABEL: @test_svadd_n_f32_x(
// CHECK-NEXT: entry:		// CHECK-NEXT: entry:
// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 4 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv4i1(<vscale x 16 x i1> [[PG:%.*]])		// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 4 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv4i1(<vscale x 16 x i1> [[PG:%.*]])
// CHECK-NEXT: [[DOTSPLATINSERT:%.*]] = insertelement <vscale x 4 x float> poison, float [[OP2:%.*]], i64 0		// CHECK-NEXT: [[DOTSPLATINSERT:%.*]] = insertelement <vscale x 4 x float> poison, float [[OP2:%.*]], i64 0
// CHECK-NEXT: [[TMP1:%.*]] = shufflevector <vscale x 4 x float> [[DOTSPLATINSERT]], <vscale x 4 x float> poison, <vscale x 4 x i32> zeroinitializer		// CHECK-NEXT: [[TMP1:%.*]] = shufflevector <vscale x 4 x float> [[DOTSPLATINSERT]], <vscale x 4 x float> poison, <vscale x 4 x i32> zeroinitializer
// CHECK-NEXT: [[TMP2:%.*]] = tail call <vscale x 4 x float> @llvm.aarch64.sve.fadd.nxv4f32(<vscale x 4 x i1> [[TMP0]], <vscale x 4 x float> [[OP1:%.*]], <vscale x 4 x float> [[TMP1]])		// CHECK-NEXT: [[TMP2:%.*]] = tail call <vscale x 4 x float> @llvm.aarch64.sve.fadd.nxv4f32(<vscale x 4 x i1> [[TMP0]], <vscale x 4 x float> [[OP1:%.*]], <vscale x 4 x float> [[TMP1]]), !inactive_lanes_undefined !2
// CHECK-NEXT: ret <vscale x 4 x float> [[TMP2]]		// CHECK-NEXT: ret <vscale x 4 x float> [[TMP2]]
//		//
// CPP-CHECK-LABEL: @_Z18test_svadd_n_f32_xu10__SVBool_tu13__SVFloat32_tf(		// CPP-CHECK-LABEL: @_Z18test_svadd_n_f32_xu10__SVBool_tu13__SVFloat32_tf(
// CPP-CHECK-NEXT: entry:		// CPP-CHECK-NEXT: entry:
// CPP-CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 4 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv4i1(<vscale x 16 x i1> [[PG:%.*]])		// CPP-CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 4 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv4i1(<vscale x 16 x i1> [[PG:%.*]])
// CPP-CHECK-NEXT: [[DOTSPLATINSERT:%.*]] = insertelement <vscale x 4 x float> poison, float [[OP2:%.*]], i64 0		// CPP-CHECK-NEXT: [[DOTSPLATINSERT:%.*]] = insertelement <vscale x 4 x float> poison, float [[OP2:%.*]], i64 0
// CPP-CHECK-NEXT: [[TMP1:%.*]] = shufflevector <vscale x 4 x float> [[DOTSPLATINSERT]], <vscale x 4 x float> poison, <vscale x 4 x i32> zeroinitializer		// CPP-CHECK-NEXT: [[TMP1:%.*]] = shufflevector <vscale x 4 x float> [[DOTSPLATINSERT]], <vscale x 4 x float> poison, <vscale x 4 x i32> zeroinitializer
// CPP-CHECK-NEXT: [[TMP2:%.*]] = tail call <vscale x 4 x float> @llvm.aarch64.sve.fadd.nxv4f32(<vscale x 4 x i1> [[TMP0]], <vscale x 4 x float> [[OP1:%.*]], <vscale x 4 x float> [[TMP1]])		// CPP-CHECK-NEXT: [[TMP2:%.*]] = tail call <vscale x 4 x float> @llvm.aarch64.sve.fadd.nxv4f32(<vscale x 4 x i1> [[TMP0]], <vscale x 4 x float> [[OP1:%.*]], <vscale x 4 x float> [[TMP1]]), !inactive_lanes_undefined !2
// CPP-CHECK-NEXT: ret <vscale x 4 x float> [[TMP2]]		// CPP-CHECK-NEXT: ret <vscale x 4 x float> [[TMP2]]
//		//
svfloat32_t test_svadd_n_f32_x(svbool_t pg, svfloat32_t op1, float32_t op2)		svfloat32_t test_svadd_n_f32_x(svbool_t pg, svfloat32_t op1, float32_t op2)
{		{
return SVE_ACLE_FUNC(svadd,_n_f32,_x,)(pg, op1, op2);		return SVE_ACLE_FUNC(svadd,_n_f32,_x,)(pg, op1, op2);
}		}

// CHECK-LABEL: @test_svadd_n_f64_x(		// CHECK-LABEL: @test_svadd_n_f64_x(
// CHECK-NEXT: entry:		// CHECK-NEXT: entry:
// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 2 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv2i1(<vscale x 16 x i1> [[PG:%.*]])		// CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 2 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv2i1(<vscale x 16 x i1> [[PG:%.*]])
// CHECK-NEXT: [[DOTSPLATINSERT:%.*]] = insertelement <vscale x 2 x double> poison, double [[OP2:%.*]], i64 0		// CHECK-NEXT: [[DOTSPLATINSERT:%.*]] = insertelement <vscale x 2 x double> poison, double [[OP2:%.*]], i64 0
// CHECK-NEXT: [[TMP1:%.*]] = shufflevector <vscale x 2 x double> [[DOTSPLATINSERT]], <vscale x 2 x double> poison, <vscale x 2 x i32> zeroinitializer		// CHECK-NEXT: [[TMP1:%.*]] = shufflevector <vscale x 2 x double> [[DOTSPLATINSERT]], <vscale x 2 x double> poison, <vscale x 2 x i32> zeroinitializer
// CHECK-NEXT: [[TMP2:%.*]] = tail call <vscale x 2 x double> @llvm.aarch64.sve.fadd.nxv2f64(<vscale x 2 x i1> [[TMP0]], <vscale x 2 x double> [[OP1:%.*]], <vscale x 2 x double> [[TMP1]])		// CHECK-NEXT: [[TMP2:%.*]] = tail call <vscale x 2 x double> @llvm.aarch64.sve.fadd.nxv2f64(<vscale x 2 x i1> [[TMP0]], <vscale x 2 x double> [[OP1:%.*]], <vscale x 2 x double> [[TMP1]]), !inactive_lanes_undefined !2
// CHECK-NEXT: ret <vscale x 2 x double> [[TMP2]]		// CHECK-NEXT: ret <vscale x 2 x double> [[TMP2]]
//		//
// CPP-CHECK-LABEL: @_Z18test_svadd_n_f64_xu10__SVBool_tu13__SVFloat64_td(		// CPP-CHECK-LABEL: @_Z18test_svadd_n_f64_xu10__SVBool_tu13__SVFloat64_td(
// CPP-CHECK-NEXT: entry:		// CPP-CHECK-NEXT: entry:
// CPP-CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 2 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv2i1(<vscale x 16 x i1> [[PG:%.*]])		// CPP-CHECK-NEXT: [[TMP0:%.*]] = tail call <vscale x 2 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv2i1(<vscale x 16 x i1> [[PG:%.*]])
// CPP-CHECK-NEXT: [[DOTSPLATINSERT:%.*]] = insertelement <vscale x 2 x double> poison, double [[OP2:%.*]], i64 0		// CPP-CHECK-NEXT: [[DOTSPLATINSERT:%.*]] = insertelement <vscale x 2 x double> poison, double [[OP2:%.*]], i64 0
// CPP-CHECK-NEXT: [[TMP1:%.*]] = shufflevector <vscale x 2 x double> [[DOTSPLATINSERT]], <vscale x 2 x double> poison, <vscale x 2 x i32> zeroinitializer		// CPP-CHECK-NEXT: [[TMP1:%.*]] = shufflevector <vscale x 2 x double> [[DOTSPLATINSERT]], <vscale x 2 x double> poison, <vscale x 2 x i32> zeroinitializer
// CPP-CHECK-NEXT: [[TMP2:%.*]] = tail call <vscale x 2 x double> @llvm.aarch64.sve.fadd.nxv2f64(<vscale x 2 x i1> [[TMP0]], <vscale x 2 x double> [[OP1:%.*]], <vscale x 2 x double> [[TMP1]])		// CPP-CHECK-NEXT: [[TMP2:%.*]] = tail call <vscale x 2 x double> @llvm.aarch64.sve.fadd.nxv2f64(<vscale x 2 x i1> [[TMP0]], <vscale x 2 x double> [[OP1:%.*]], <vscale x 2 x double> [[TMP1]]), !inactive_lanes_undefined !2
// CPP-CHECK-NEXT: ret <vscale x 2 x double> [[TMP2]]		// CPP-CHECK-NEXT: ret <vscale x 2 x double> [[TMP2]]
//		//
svfloat64_t test_svadd_n_f64_x(svbool_t pg, svfloat64_t op1, float64_t op2)		svfloat64_t test_svadd_n_f64_x(svbool_t pg, svfloat64_t op1, float64_t op2)
{		{
return SVE_ACLE_FUNC(svadd,_n_f64,_x,)(pg, op1, op2);		return SVE_ACLE_FUNC(svadd,_n_f64,_x,)(pg, op1, op2);
}		}

llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp

if (auto FMAD =		if (auto FMAD =
instCombineSVEVectorFuseMulAddSub<Intrinsic::aarch64_sve_fmul,		instCombineSVEVectorFuseMulAddSub<Intrinsic::aarch64_sve_fmul,
Intrinsic::aarch64_sve_fmad>(IC, II,		Intrinsic::aarch64_sve_fmad>(IC, II,
false))		false))
return FMAD;		return FMAD;
if (auto MAD = instCombineSVEVectorFuseMulAddSub<Intrinsic::aarch64_sve_mul,		if (auto MAD = instCombineSVEVectorFuseMulAddSub<Intrinsic::aarch64_sve_mul,
Intrinsic::aarch64_sve_mad>(		Intrinsic::aarch64_sve_mad>(
IC, II, false))		IC, II, false))
return MAD;		return MAD;

		// The predicate is redundant if we don't care about inactive lanes.
		if (II.getIntrinsicID() == Intrinsic::aarch64_sve_add &&
		II.hasMetadata("inactive_lanes_undefined")) {
		auto *UnpredAdd =
		BinaryOperator::Create(Instruction::Add, II.getArgOperand(1),
		II.getArgOperand(2), II.getName(), &II);
		return IC.replaceInstUsesWith(II, UnpredAdd);
		}

return instCombineSVEVectorBinOp(IC, II);		return instCombineSVEVectorBinOp(IC, II);
}		}

static std::optional<Instruction *> instCombineSVEVectorSub(InstCombiner &IC,		static std::optional<Instruction *> instCombineSVEVectorSub(InstCombiner &IC,
IntrinsicInst &II) {		IntrinsicInst &II) {
if (auto FMLS =		if (auto FMLS =
instCombineSVEVectorFuseMulAddSub<Intrinsic::aarch64_sve_fmul,		instCombineSVEVectorFuseMulAddSub<Intrinsic::aarch64_sve_fmul,
Intrinsic::aarch64_sve_fmls>(IC, II,		Intrinsic::aarch64_sve_fmls>(IC, II,

llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-unpredicate.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
				; RUN: opt -S -passes=instcombine < %s | FileCheck %s

				target triple = "aarch64-unknown-linux-gnu"

				define <vscale x 4 x i32> @unpredicate_add_x(<vscale x 4 x i1> %p, <vscale x 4 x i32> %a, <vscale x 4 x i32> %b) #0 {
				; CHECK-LABEL: @unpredicate_add_x(
				; CHECK-NEXT: [[OP1:%.*]] = add <vscale x 4 x i32> [[A:%.*]], [[B:%.*]]
				; CHECK-NEXT: ret <vscale x 4 x i32> [[OP1]]
				;
				%op = tail call <vscale x 4 x i32> @llvm.aarch64.sve.add.nxv4i32(<vscale x 4 x i1> %p, <vscale x 4 x i32> %a, <vscale x 4 x i32> %b), !inactive_lanes_undefined !0
				ret <vscale x 4 x i32> %op
				}

				declare <vscale x 4 x i32> @llvm.aarch64.sve.add.nxv4i32(<vscale x 4 x i1>, <vscale x 4 x i32>, <vscale x 4 x i32>)

				attributes #0 = { "target-features"="+sve" }

				!0 = !{}