This is an archive of the discontinued LLVM Phabricator instance.

[ARM] [AARCH64] Add CodeGen IR tests for {VS}QRDML{AS}H v8.1a intrinsics.
Closed, Public

Authored by labrinea on Dec 4 2015, 2:38 AM.

Details

Reviewers

rengolin
echristo
cfe-commits

Commits

rGd162b5c8c479: [ARM] [AARCH64] Add CodeGen IR tests for {VS}QRDML{AS}H v8.1a intrinsics.
rC256822: [ARM] [AARCH64] Add CodeGen IR tests for {VS}QRDML{AS}H v8.1a intrinsics.
rL256822: [ARM] [AARCH64] Add CodeGen IR tests for {VS}QRDML{AS}H v8.1a intrinsics.

Summary

The existing tests are checking back-end generated assembly. Instead, we want to check front-end generated IR.

Diff Detail

Repository: rL LLVM

Event Timeline

labrinea updated this revision to Diff 41854. Dec 4 2015, 2:38 AM

labrinea retitled this revision to [ARM] [AARCH64] Add CodeGen IR tests for {VS}QRDML{AS}H v8.1a intrinsics.

labrinea updated this object.

labrinea added reviewers: jmolloy, rengolin, echristo, cfe-commits.

Herald added subscribers: rengolin, aemerson. Dec 4 2015, 2:38 AM

labrinea updated this object. Dec 4 2015, 5:52 AM

Please remove the asm tests here. As I stated in the original review thread there's no reason for them to be here.

Thanks.

-eric

ASM tests have been removed.

One inline comment, thanks!

-eric

test/CodeGen/aarch64-v8.1a-neon-intrinsics.c
Line 4 (On Diff #42035): Why do you need to enable the optimizers?

labrinea added inline comments. Dec 8 2015, 3:14 AM

test/CodeGen/aarch64-v8.1a-neon-intrinsics.c
Line 4 (On Diff #42035): Our intention with these tests is to check that we generate a sequence of {v/s}qrdmulh, {v/s}q{add/sub}{s}, shufflevector, and {insert/extract}element IR instructions. Using -O1 promotes memory to registers and combines instructions, which reduces the amount of surrounding IR that we need to check.
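
To make the intended sequence concrete, the sketch below models it in portable scalar C rather than with the NEON intrinsics themselves. It assumes the semantics the checks imply: the selected lane is splatted (the shufflevector), multiplied with a saturating rounding doubling multiply that keeps the high half (sqrdmulh), and the product is then accumulated with a saturating add (sqadd). All `model_*` names and `sat16` are hypothetical helpers introduced for illustration. The code models the two-step sequence only and is not claimed to be bit-identical to a fused instruction.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp a wide intermediate value to the int16_t range. */
int16_t sat16(int64_t x) {
    if (x > INT16_MAX) return INT16_MAX;
    if (x < INT16_MIN) return INT16_MIN;
    return (int16_t)x;
}

/* One lane of sqrdmulh: double the product, add the rounding constant,
   keep the high 16 bits, saturate. Assumes >> on a negative value is an
   arithmetic shift (implementation-defined in C; GCC and Clang do this). */
int16_t model_sqrdmulh16(int16_t a, int16_t b) {
    int64_t doubled = 2 * (int64_t)a * (int64_t)b;
    return sat16((doubled + (1 << 15)) >> 16);
}

/* One lane of sqadd and sqsub. */
int16_t model_sqadd16(int16_t a, int16_t b) { return sat16((int64_t)a + b); }
int16_t model_sqsub16(int16_t a, int16_t b) { return sat16((int64_t)a - b); }

/* Sequence expected for vqrdmlah_laneq_s16(a, b, v, lane):
   splat v[lane], multiply with b, then accumulate into a. */
void model_vqrdmlah_laneq_s16(int16_t out[4], const int16_t a[4],
                              const int16_t b[4], const int16_t v[8],
                              int lane) {
    assert(lane >= 0 && lane < 8);
    for (int i = 0; i < 4; ++i)
        out[i] = model_sqadd16(a[i], model_sqrdmulh16(b[i], v[lane]));
}
```

The vqrdmlsh variants follow the same shape with `model_sqsub16` in place of `model_sqadd16`.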

Should be pretty easy to either use CHECK-DAG or pick out the particular instructions you want to check here. Otherwise you're just checking how the optimizer runs. That, in particular, also sounds like a good backend check.
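
A minimal sketch of the CHECK-DAG approach, using a hypothetical plain-C function (`mul_then_add` is not part of this patch) so that it does not depend on arm_neon.h. Adjacent CHECK-DAG directives may match in either order, so the test pins down which instructions appear without fixing their relative position. The IR spellings (`mul nsw i32`, `add nsw i32`) are what clang is assumed to emit for signed `int` arithmetic when no optimization is requested.

```c
// RUN: %clang_cc1 -triple aarch64-linux-gnu -emit-llvm -o - %s | FileCheck %s

// CHECK-LABEL: define{{.*}} @mul_then_add(
// CHECK-DAG: mul nsw i32
// CHECK-DAG: add nsw i32
int mul_then_add(int a, int b, int c) {
  return a + b * c;
}
```

Because no optimization level is passed, the matched instructions come straight from clang's code generation rather than from any optimizer pass.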

Eric is reviewing this; resigning myself.

Hi Eric,

The main optimization I feel is useful is mem2reg. Without that, if I want to properly check that the right values go to the right operands of the intrinsic calls, I have to write FileCheck matchers that match stores and their relevant loads, plus bitcasts. This not only looks more obfuscated than matching the mem2reg output, but it is also less resilient to changes in the way clang generates code.

The generated IR for each intrinsic is around 50 lines. I can just pick out the particular instructions I want to check, as you suggested, but they won't be connected by the flow of values. In my opinion such a test will be less valuable.

I can do this both ways but my preferred way is to run the bare minimum of optimization to de-cruft the output and make the test robust and readable. If you feel however that you don't want the optimizers run I will make a best effort at writing a test that doesn't use them.

I understand the conflicting priorities here for sure. You'd like a test that's as minimal as possible, without having to depend on external (to clang) libraries here. I really would appreciate it if you'd make the test not rely on mem2reg etc so we can be sure that clang's code generation is the thing tested here and not the optimizer. Making sure that the unoptimized output reduces properly would be a great opt test for the backend though.

Thanks!

-eric

Disabled optimizers.

LGTM, and thanks for all of the iteration.

-eric

This revision is now accepted and ready to land. Jan 4 2016, 3:01 PM

Closed by commit rL256822: [ARM] [AARCH64] Add CodeGen IR tests for {VS}QRDML{AS}H v8.1a intrinsics. (authored by alelab01). Jan 5 2016, 2:02 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path	Size

cfe/trunk/test/CodeGen/aarch64-v8.1a-neon-intrinsics.c	156 lines

cfe/trunk/test/CodeGen/arm-v8.1a-neon-intrinsics.c	135 lines

Diff 43969

cfe/trunk/test/CodeGen/aarch64-v8.1a-neon-intrinsics.c

	// REQUIRES: aarch64-registered-target			// REQUIRES: aarch64-registered-target

	// RUN: %clang_cc1 -triple aarch64-linux-gnu -target-feature +neon \			// RUN: %clang_cc1 -triple aarch64-linux-gnu -target-feature +neon \
	// RUN: -target-feature +v8.1a -O3 -S -o - %s \			// RUN: -target-feature +v8.1a -S -emit-llvm -o - %s | FileCheck %s
	// RUN: | FileCheck %s --check-prefix=CHECK --check-prefix=CHECK-AARCH64

	#include <arm_neon.h>			#include <arm_neon.h>

	// CHECK-AARCH64-LABEL: test_vqrdmlah_laneq_s16			// CHECK-LABEL: test_vqrdmlah_laneq_s16
	int16x4_t test_vqrdmlah_laneq_s16(int16x4_t a, int16x4_t b, int16x8_t v) {			int16x4_t test_vqrdmlah_laneq_s16(int16x4_t a, int16x4_t b, int16x8_t v) {
	// CHECK-AARCH64: sqrdmlah {{v[0-9]+}}.4h, {{v[0-9]+}}.4h, {{v[0-9]+}}.h[7]			// CHECK: shufflevector <8 x i16> {{%.*}}, <8 x i16> {{%.*}}, <4 x i32> <i32 7, i32 7, i32 7, i32 7>
				// CHECK: call <4 x i16> @llvm.aarch64.neon.sqrdmulh.v4i16(<4 x i16> {{%.*}}, <4 x i16> {{%.*}})
				// CHECK: call <4 x i16> @llvm.aarch64.neon.sqadd.v4i16(<4 x i16> {{%.*}}, <4 x i16> {{%.*}})
	return vqrdmlah_laneq_s16(a, b, v, 7);			return vqrdmlah_laneq_s16(a, b, v, 7);
	}			}

	// CHECK-AARCH64-LABEL: test_vqrdmlah_laneq_s32			// CHECK-LABEL: test_vqrdmlah_laneq_s32
	int32x2_t test_vqrdmlah_laneq_s32(int32x2_t a, int32x2_t b, int32x4_t v) {			int32x2_t test_vqrdmlah_laneq_s32(int32x2_t a, int32x2_t b, int32x4_t v) {
	// CHECK-AARCH64: sqrdmlah {{v[0-9]+}}.2s, {{v[0-9]+}}.2s, {{v[0-9]+}}.s[3]			// CHECK: shufflevector <4 x i32> {{%.*}}, <4 x i32> {{%.*}}, <2 x i32> <i32 3, i32 3>
				// CHECK: call <2 x i32> @llvm.aarch64.neon.sqrdmulh.v2i32(<2 x i32> {{%.*}}, <2 x i32> {{%.*}})
				// CHECK: call <2 x i32> @llvm.aarch64.neon.sqadd.v2i32(<2 x i32> {{%.*}}, <2 x i32> {{%.*}})
	return vqrdmlah_laneq_s32(a, b, v, 3);			return vqrdmlah_laneq_s32(a, b, v, 3);
	}			}

	// CHECK-AARCH64-LABEL: test_vqrdmlahq_laneq_s16			// CHECK-LABEL: test_vqrdmlahq_laneq_s16
	int16x8_t test_vqrdmlahq_laneq_s16(int16x8_t a, int16x8_t b, int16x8_t v) {			int16x8_t test_vqrdmlahq_laneq_s16(int16x8_t a, int16x8_t b, int16x8_t v) {
	// CHECK-AARCH64: sqrdmlah {{v[0-9]+}}.8h, {{v[0-9]+}}.8h, {{v[0-9]+}}.h[7]			// CHECK: shufflevector <8 x i16> {{%.*}}, <8 x i16> {{%.*}}, <8 x i32> <i32 7, i32 7, i32 7, i32 7, i32 7, i32 7, i32 7, i32 7>
				// CHECK: call <8 x i16> @llvm.aarch64.neon.sqrdmulh.v8i16(<8 x i16> {{%.*}}, <8 x i16> {{%.*}})
				// CHECK: call <8 x i16> @llvm.aarch64.neon.sqadd.v8i16(<8 x i16> {{%.*}}, <8 x i16> {{%.*}})
	return vqrdmlahq_laneq_s16(a, b, v, 7);			return vqrdmlahq_laneq_s16(a, b, v, 7);
	}			}

	// CHECK-AARCH64-LABEL: test_vqrdmlahq_laneq_s32			// CHECK-LABEL: test_vqrdmlahq_laneq_s32
	int32x4_t test_vqrdmlahq_laneq_s32(int32x4_t a, int32x4_t b, int32x4_t v) {			int32x4_t test_vqrdmlahq_laneq_s32(int32x4_t a, int32x4_t b, int32x4_t v) {
	// CHECK-AARCH64: sqrdmlah {{v[0-9]+}}.4s, {{v[0-9]+}}.4s, {{v[0-9]+}}.s[3]			// CHECK: shufflevector <4 x i32> {{%.*}}, <4 x i32> {{%.*}}, <4 x i32> <i32 3, i32 3, i32 3, i32 3>
				// CHECK: call <4 x i32> @llvm.aarch64.neon.sqrdmulh.v4i32(<4 x i32> {{%.*}}, <4 x i32> {{%.*}})
				// CHECK: call <4 x i32> @llvm.aarch64.neon.sqadd.v4i32(<4 x i32> {{%.*}}, <4 x i32> {{%.*}})
	return vqrdmlahq_laneq_s32(a, b, v, 3);			return vqrdmlahq_laneq_s32(a, b, v, 3);
	}			}

	// CHECK-AARCH64-LABEL: test_vqrdmlahh_s16			// CHECK-LABEL: test_vqrdmlahh_s16
	int16_t test_vqrdmlahh_s16(int16_t a, int16_t b, int16_t c) {			int16_t test_vqrdmlahh_s16(int16_t a, int16_t b, int16_t c) {
	// CHECK-AARCH64: sqrdmlah {{h[0-9]+|v[0-9]+.4h}}, {{h[0-9]+|v[0-9]+.4h}}, {{h[0-9]+|v[0-9]+.4h}}			// CHECK: [[insb:%.*]] = insertelement <4 x i16> undef, i16 {{%.*}}, i64 0
				// CHECK: [[insc:%.*]] = insertelement <4 x i16> undef, i16 {{%.*}}, i64 0
				// CHECK: [[mul:%.*]] = call <4 x i16> @llvm.aarch64.neon.sqrdmulh.v4i16(<4 x i16> [[insb]], <4 x i16> [[insc]])
				// CHECK: extractelement <4 x i16> [[mul]], i64 0
				// CHECK: [[insa:%.*]] = insertelement <4 x i16> undef, i16 {{%.*}}, i64 0
				// CHECK: [[insmul:%.*]] = insertelement <4 x i16> undef, i16 {{%.*}}, i64 0
				// CHECK: [[add:%.*]] = call <4 x i16> @llvm.aarch64.neon.sqadd.v4i16(<4 x i16> [[insa]], <4 x i16> [[insmul]])
				// CHECK: extractelement <4 x i16> [[add]], i64 0
	return vqrdmlahh_s16(a, b, c);			return vqrdmlahh_s16(a, b, c);
	}			}

	// CHECK-AARCH64-LABEL: test_vqrdmlahs_s32			// CHECK-LABEL: test_vqrdmlahs_s32
	int32_t test_vqrdmlahs_s32(int32_t a, int32_t b, int32_t c) {			int32_t test_vqrdmlahs_s32(int32_t a, int32_t b, int32_t c) {
	// CHECK-AARCH64: sqrdmlah {{s[0-9]+}}, {{s[0-9]+}}, {{s[0-9]+}}			// CHECK: call i32 @llvm.aarch64.neon.sqrdmulh.i32(i32 {{%.*}}, i32 {{%.*}})
				// CHECK: call i32 @llvm.aarch64.neon.sqadd.i32(i32 {{%.*}}, i32 {{%.*}})
	return vqrdmlahs_s32(a, b, c);			return vqrdmlahs_s32(a, b, c);
	}			}

	// CHECK-AARCH64-LABEL: test_vqrdmlahh_lane_s16			// CHECK-LABEL: test_vqrdmlahh_lane_s16
	int16_t test_vqrdmlahh_lane_s16(int16_t a, int16_t b, int16x4_t c) {			int16_t test_vqrdmlahh_lane_s16(int16_t a, int16_t b, int16x4_t c) {
	// CHECK-AARCH64: sqrdmlah {{h[0-9]+|v[0-9]+.4h}}, {{h[0-9]+|v[0-9]+.4h}}, {{v[0-9]+}}.h[3]			// CHECK: extractelement <4 x i16> {{%.*}}, i32 3
				// CHECK: [[insb:%.*]] = insertelement <4 x i16> undef, i16 {{%.*}}, i64 0
				// CHECK: [[insc:%.*]] = insertelement <4 x i16> undef, i16 {{%.*}}, i64 0
				// CHECK: [[mul:%.*]] = call <4 x i16> @llvm.aarch64.neon.sqrdmulh.v4i16(<4 x i16> [[insb]], <4 x i16> [[insc]])
				// CHECK: extractelement <4 x i16> [[mul]], i64 0
				// CHECK: [[insa:%.*]] = insertelement <4 x i16> undef, i16 {{%.*}}, i64 0
				// CHECK: [[insmul:%.*]] = insertelement <4 x i16> undef, i16 {{%.*}}, i64 0
				// CHECK: [[add:%.*]] = call <4 x i16> @llvm.aarch64.neon.sqadd.v4i16(<4 x i16> [[insa]], <4 x i16> [[insmul]])
				// CHECK: extractelement <4 x i16> [[add]], i64 0
	return vqrdmlahh_lane_s16(a, b, c, 3);			return vqrdmlahh_lane_s16(a, b, c, 3);
	}			}

	// CHECK-AARCH64-LABEL: test_vqrdmlahs_lane_s32			// CHECK-LABEL: test_vqrdmlahs_lane_s32
	int32_t test_vqrdmlahs_lane_s32(int32_t a, int32_t b, int32x2_t c) {			int32_t test_vqrdmlahs_lane_s32(int32_t a, int32_t b, int32x2_t c) {
	// CHECK-AARCH64: sqrdmlah {{s[0-9]+}}, {{s[0-9]+}}, {{v[0-9]+}}.s[1]			// CHECK: extractelement <2 x i32> {{%.*}}, i32 1
				// CHECK: call i32 @llvm.aarch64.neon.sqrdmulh.i32(i32 {{%.*}}, i32 {{%.*}})
				// CHECK: call i32 @llvm.aarch64.neon.sqadd.i32(i32 {{%.*}}, i32 {{%.*}})
	return vqrdmlahs_lane_s32(a, b, c, 1);			return vqrdmlahs_lane_s32(a, b, c, 1);
	}			}

	// CHECK-AARCH64-LABEL: test_vqrdmlahh_laneq_s16			// CHECK-LABEL: test_vqrdmlahh_laneq_s16
	int16_t test_vqrdmlahh_laneq_s16(int16_t a, int16_t b, int16x8_t c) {			int16_t test_vqrdmlahh_laneq_s16(int16_t a, int16_t b, int16x8_t c) {
	// CHECK-AARCH64: sqrdmlah {{h[0-9]+|v[0-9]+.4h}}, {{h[0-9]+|v[0-9]+.4h}}, {{v[0-9]+}}.h[7]			// CHECK: extractelement <8 x i16> {{%.*}}, i32 7
				// CHECK: [[insb:%.*]] = insertelement <4 x i16> undef, i16 {{%.*}}, i64 0
				// CHECK: [[insc:%.*]] = insertelement <4 x i16> undef, i16 {{%.*}}, i64 0
				// CHECK: [[mul:%.*]] = call <4 x i16> @llvm.aarch64.neon.sqrdmulh.v4i16(<4 x i16> [[insb]], <4 x i16> [[insc]])
				// CHECK: extractelement <4 x i16> [[mul]], i64 0
				// CHECK: [[insa:%.*]] = insertelement <4 x i16> undef, i16 {{%.*}}, i64 0
				// CHECK: [[insmul:%.*]] = insertelement <4 x i16> undef, i16 {{%.*}}, i64 0
				// CHECK: [[add:%.*]] = call <4 x i16> @llvm.aarch64.neon.sqadd.v4i16(<4 x i16> [[insa]], <4 x i16> [[insmul]])
				// CHECK: extractelement <4 x i16> [[add]], i64 0
	return vqrdmlahh_laneq_s16(a, b, c, 7);			return vqrdmlahh_laneq_s16(a, b, c, 7);
	}			}

	// CHECK-AARCH64-LABEL: test_vqrdmlahs_laneq_s32			// CHECK-LABEL: test_vqrdmlahs_laneq_s32
	int32_t test_vqrdmlahs_laneq_s32(int32_t a, int32_t b, int32x4_t c) {			int32_t test_vqrdmlahs_laneq_s32(int32_t a, int32_t b, int32x4_t c) {
	// CHECK-AARCH64: sqrdmlah {{s[0-9]+}}, {{s[0-9]+}}, {{v[0-9]+}}.s[3]			// CHECK: extractelement <4 x i32> {{%.*}}, i32 3
				// CHECK: call i32 @llvm.aarch64.neon.sqrdmulh.i32(i32 {{%.*}}, i32 {{%.*}})
				// CHECK: call i32 @llvm.aarch64.neon.sqadd.i32(i32 {{%.*}}, i32 {{%.*}})
	return vqrdmlahs_laneq_s32(a, b, c, 3);			return vqrdmlahs_laneq_s32(a, b, c, 3);
	}			}

	// CHECK-AARCH64-LABEL: test_vqrdmlsh_laneq_s16			// CHECK-LABEL: test_vqrdmlsh_laneq_s16
	int16x4_t test_vqrdmlsh_laneq_s16(int16x4_t a, int16x4_t b, int16x8_t v) {			int16x4_t test_vqrdmlsh_laneq_s16(int16x4_t a, int16x4_t b, int16x8_t v) {
	// CHECK-AARCH64: sqrdmlsh {{v[0-9]+}}.4h, {{v[0-9]+}}.4h, {{v[0-9]+}}.h[7]			// CHECK: shufflevector <8 x i16> {{%.*}}, <8 x i16> {{%.*}}, <4 x i32> <i32 7, i32 7, i32 7, i32 7>
				// CHECK: call <4 x i16> @llvm.aarch64.neon.sqrdmulh.v4i16(<4 x i16> {{%.*}}, <4 x i16> {{%.*}})
				// CHECK: call <4 x i16> @llvm.aarch64.neon.sqsub.v4i16(<4 x i16> {{%.*}}, <4 x i16> {{%.*}})
	return vqrdmlsh_laneq_s16(a, b, v, 7);			return vqrdmlsh_laneq_s16(a, b, v, 7);
	}			}

	// CHECK-AARCH64-LABEL: test_vqrdmlsh_laneq_s32			// CHECK-LABEL: test_vqrdmlsh_laneq_s32
	int32x2_t test_vqrdmlsh_laneq_s32(int32x2_t a, int32x2_t b, int32x4_t v) {			int32x2_t test_vqrdmlsh_laneq_s32(int32x2_t a, int32x2_t b, int32x4_t v) {
	// CHECK-AARCH64: sqrdmlsh {{v[0-9]+}}.2s, {{v[0-9]+}}.2s, {{v[0-9]+}}.s[3]			// CHECK: shufflevector <4 x i32> {{%.*}}, <4 x i32> {{%.*}}, <2 x i32> <i32 3, i32 3>
				// CHECK: call <2 x i32> @llvm.aarch64.neon.sqrdmulh.v2i32(<2 x i32> {{%.*}}, <2 x i32> {{%.*}})
				// CHECK: call <2 x i32> @llvm.aarch64.neon.sqsub.v2i32(<2 x i32> {{%.*}}, <2 x i32> {{%.*}})
	return vqrdmlsh_laneq_s32(a, b, v, 3);			return vqrdmlsh_laneq_s32(a, b, v, 3);
	}			}

	// CHECK-AARCH64-LABEL: test_vqrdmlshq_laneq_s16			// CHECK-LABEL: test_vqrdmlshq_laneq_s16
	int16x8_t test_vqrdmlshq_laneq_s16(int16x8_t a, int16x8_t b, int16x8_t v) {			int16x8_t test_vqrdmlshq_laneq_s16(int16x8_t a, int16x8_t b, int16x8_t v) {
	// CHECK-AARCH64: sqrdmlsh {{v[0-9]+}}.8h, {{v[0-9]+}}.8h, {{v[0-9]+}}.h[7]			// CHECK: shufflevector <8 x i16> {{%.*}}, <8 x i16> {{%.*}}, <8 x i32> <i32 7, i32 7, i32 7, i32 7, i32 7, i32 7, i32 7, i32 7>
				// CHECK: call <8 x i16> @llvm.aarch64.neon.sqrdmulh.v8i16(<8 x i16> {{%.*}}, <8 x i16> {{%.*}})
				// CHECK: call <8 x i16> @llvm.aarch64.neon.sqsub.v8i16(<8 x i16> {{%.*}}, <8 x i16> {{%.*}})
	return vqrdmlshq_laneq_s16(a, b, v, 7);			return vqrdmlshq_laneq_s16(a, b, v, 7);
	}			}

	// CHECK-AARCH64-LABEL: test_vqrdmlshq_laneq_s32			// CHECK-LABEL: test_vqrdmlshq_laneq_s32
	int32x4_t test_vqrdmlshq_laneq_s32(int32x4_t a, int32x4_t b, int32x4_t v) {			int32x4_t test_vqrdmlshq_laneq_s32(int32x4_t a, int32x4_t b, int32x4_t v) {
	// CHECK-AARCH64: sqrdmlsh {{v[0-9]+}}.4s, {{v[0-9]+}}.4s, {{v[0-9]+}}.s[3]			// CHECK: shufflevector <4 x i32> {{%.*}}, <4 x i32> {{%.*}}, <4 x i32> <i32 3, i32 3, i32 3, i32 3>
				// CHECK: call <4 x i32> @llvm.aarch64.neon.sqrdmulh.v4i32(<4 x i32> {{%.*}}, <4 x i32> {{%.*}})
				// CHECK: call <4 x i32> @llvm.aarch64.neon.sqsub.v4i32(<4 x i32> {{%.*}}, <4 x i32> {{%.*}})
	return vqrdmlshq_laneq_s32(a, b, v, 3);			return vqrdmlshq_laneq_s32(a, b, v, 3);
	}			}

	// CHECK-AARCH64-LABEL: test_vqrdmlshh_s16			// CHECK-LABEL: test_vqrdmlshh_s16
	int16_t test_vqrdmlshh_s16(int16_t a, int16_t b, int16_t c) {			int16_t test_vqrdmlshh_s16(int16_t a, int16_t b, int16_t c) {
	// CHECK-AARCH64: sqrdmlsh {{h[0-9]+|v[0-9]+.4h}}, {{h[0-9]+|v[0-9]+.4h}}, {{h[0-9]+|v[0-9]+.4h}}			// CHECK: [[insb:%.*]] = insertelement <4 x i16> undef, i16 {{%.*}}, i64 0
				// CHECK: [[insc:%.*]] = insertelement <4 x i16> undef, i16 {{%.*}}, i64 0
				// CHECK: [[mul:%.*]] = call <4 x i16> @llvm.aarch64.neon.sqrdmulh.v4i16(<4 x i16> [[insb]], <4 x i16> [[insc]])
				// CHECK: extractelement <4 x i16> [[mul]], i64 0
				// CHECK: [[insa:%.*]] = insertelement <4 x i16> undef, i16 {{%.*}}, i64 0
				// CHECK: [[insmul:%.*]] = insertelement <4 x i16> undef, i16 {{%.*}}, i64 0
				// CHECK: [[sub:%.*]] = call <4 x i16> @llvm.aarch64.neon.sqsub.v4i16(<4 x i16> [[insa]], <4 x i16> [[insmul]])
				// CHECK: extractelement <4 x i16> [[sub]], i64 0
	return vqrdmlshh_s16(a, b, c);			return vqrdmlshh_s16(a, b, c);
	}			}

	// CHECK-AARCH64-LABEL: test_vqrdmlshs_s32			// CHECK-LABEL: test_vqrdmlshs_s32
	int32_t test_vqrdmlshs_s32(int32_t a, int32_t b, int32_t c) {			int32_t test_vqrdmlshs_s32(int32_t a, int32_t b, int32_t c) {
	// CHECK-AARCH64: sqrdmlsh {{s[0-9]+}}, {{s[0-9]+}}, {{s[0-9]+}}			// CHECK: call i32 @llvm.aarch64.neon.sqrdmulh.i32(i32 {{%.*}}, i32 {{%.*}})
				// CHECK: call i32 @llvm.aarch64.neon.sqsub.i32(i32 {{%.*}}, i32 {{%.*}})
	return vqrdmlshs_s32(a, b, c);			return vqrdmlshs_s32(a, b, c);
	}			}

	// CHECK-AARCH64-LABEL: test_vqrdmlshh_lane_s16			// CHECK-LABEL: test_vqrdmlshh_lane_s16
	int16_t test_vqrdmlshh_lane_s16(int16_t a, int16_t b, int16x4_t c) {			int16_t test_vqrdmlshh_lane_s16(int16_t a, int16_t b, int16x4_t c) {
	// CHECK-AARCH64: sqrdmlsh {{h[0-9]+|v[0-9]+.4h}}, {{h[0-9]+|v[0-9]+.4h}}, {{v[0-9]+}}.h[3]			// CHECK: extractelement <4 x i16> {{%.*}}, i32 3
				// CHECK: [[insb:%.*]] = insertelement <4 x i16> undef, i16 {{%.*}}, i64 0
				// CHECK: [[insc:%.*]] = insertelement <4 x i16> undef, i16 {{%.*}}, i64 0
				// CHECK: [[mul:%.*]] = call <4 x i16> @llvm.aarch64.neon.sqrdmulh.v4i16(<4 x i16> [[insb]], <4 x i16> [[insc]])
				// CHECK: extractelement <4 x i16> [[mul]], i64 0
				// CHECK: [[insa:%.*]] = insertelement <4 x i16> undef, i16 {{%.*}}, i64 0
				// CHECK: [[insmul:%.*]] = insertelement <4 x i16> undef, i16 {{%.*}}, i64 0
				// CHECK: [[sub:%.*]] = call <4 x i16> @llvm.aarch64.neon.sqsub.v4i16(<4 x i16> [[insa]], <4 x i16> [[insmul]])
				// CHECK: extractelement <4 x i16> [[sub]], i64 0
	return vqrdmlshh_lane_s16(a, b, c, 3);			return vqrdmlshh_lane_s16(a, b, c, 3);
	}			}

	// CHECK-AARCH64-LABEL: test_vqrdmlshs_lane_s32			// CHECK-LABEL: test_vqrdmlshs_lane_s32
	int32_t test_vqrdmlshs_lane_s32(int32_t a, int32_t b, int32x2_t c) {			int32_t test_vqrdmlshs_lane_s32(int32_t a, int32_t b, int32x2_t c) {
	// CHECK-AARCH64: sqrdmlsh {{s[0-9]+}}, {{s[0-9]+}}, {{v[0-9]+}}.s[1]			// CHECK: extractelement <2 x i32> {{%.*}}, i32 1
				// CHECK: call i32 @llvm.aarch64.neon.sqrdmulh.i32(i32 {{%.*}}, i32 {{%.*}})
				// CHECK: call i32 @llvm.aarch64.neon.sqsub.i32(i32 {{%.*}}, i32 {{%.*}})
	return vqrdmlshs_lane_s32(a, b, c, 1);			return vqrdmlshs_lane_s32(a, b, c, 1);
	}			}

	// CHECK-AARCH64-LABEL: test_vqrdmlshh_laneq_s16			// CHECK-LABEL: test_vqrdmlshh_laneq_s16
	int16_t test_vqrdmlshh_laneq_s16(int16_t a, int16_t b, int16x8_t c) {			int16_t test_vqrdmlshh_laneq_s16(int16_t a, int16_t b, int16x8_t c) {
	// CHECK-AARCH64: sqrdmlsh {{h[0-9]+|v[0-9]+.4h}}, {{h[0-9]+|v[0-9]+.4h}}, {{v[0-9]+}}.h[7]			// CHECK: extractelement <8 x i16> {{%.*}}, i32 7
				// CHECK: [[insb:%.*]] = insertelement <4 x i16> undef, i16 {{%.*}}, i64 0
				// CHECK: [[insc:%.*]] = insertelement <4 x i16> undef, i16 {{%.*}}, i64 0
				// CHECK: [[mul:%.*]] = call <4 x i16> @llvm.aarch64.neon.sqrdmulh.v4i16(<4 x i16> [[insb]], <4 x i16> [[insc]])
				// CHECK: extractelement <4 x i16> [[mul]], i64 0
				// CHECK: [[insa:%.*]] = insertelement <4 x i16> undef, i16 {{%.*}}, i64 0
				// CHECK: [[insmul:%.*]] = insertelement <4 x i16> undef, i16 {{%.*}}, i64 0
				// CHECK: [[sub:%.*]] = call <4 x i16> @llvm.aarch64.neon.sqsub.v4i16(<4 x i16> [[insa]], <4 x i16> [[insmul]])
				// CHECK: extractelement <4 x i16> [[sub]], i64 0
	return vqrdmlshh_laneq_s16(a, b, c, 7);			return vqrdmlshh_laneq_s16(a, b, c, 7);
	}			}

	// CHECK-AARCH64-LABEL: test_vqrdmlshs_laneq_s32			// CHECK-LABEL: test_vqrdmlshs_laneq_s32
	int32_t test_vqrdmlshs_laneq_s32(int32_t a, int32_t b, int32x4_t c) {			int32_t test_vqrdmlshs_laneq_s32(int32_t a, int32_t b, int32x4_t c) {
	// CHECK-AARCH64: sqrdmlsh {{s[0-9]+}}, {{s[0-9]+}}, {{v[0-9]+}}.s[3]			// CHECK: extractelement <4 x i32> {{%.*}}, i32 3
				// CHECK: call i32 @llvm.aarch64.neon.sqrdmulh.i32(i32 {{%.*}}, i32 {{%.*}})
				// CHECK: call i32 @llvm.aarch64.neon.sqsub.i32(i32 {{%.*}}, i32 {{%.*}})
	return vqrdmlshs_laneq_s32(a, b, c, 3);			return vqrdmlshs_laneq_s32(a, b, c, 3);
	}			}

cfe/trunk/test/CodeGen/arm-v8.1a-neon-intrinsics.c

	// RUN: %clang_cc1 -triple armv8.1a-linux-gnu -target-feature +neon \			// RUN: %clang_cc1 -triple armv8.1a-linux-gnu -target-feature +neon \
	// RUN: -O3 -S -o - %s \			// RUN: -S -emit-llvm -o - %s \
	// RUN: | FileCheck %s --check-prefix=CHECK --check-prefix=CHECK-ARM			// RUN: | FileCheck %s --check-prefix=CHECK --check-prefix=CHECK-ARM

	// RUN: %clang_cc1 -triple aarch64-linux-gnu -target-feature +neon \			// RUN: %clang_cc1 -triple aarch64-linux-gnu -target-feature +neon \
	// RUN: -target-feature +v8.1a -O3 -S -o - %s \			// RUN: -target-feature +v8.1a -S -emit-llvm -o - %s \
	// RUN: | FileCheck %s --check-prefix=CHECK --check-prefix=CHECK-AARCH64			// RUN: | FileCheck %s --check-prefix=CHECK --check-prefix=CHECK-AARCH64

	// REQUIRES: arm-registered-target,aarch64-registered-target			// REQUIRES: arm-registered-target,aarch64-registered-target

	#include <arm_neon.h>			#include <arm_neon.h>

	// CHECK-LABEL: test_vqrdmlah_s16			// CHECK-LABEL: test_vqrdmlah_s16
	int16x4_t test_vqrdmlah_s16(int16x4_t a, int16x4_t b, int16x4_t c) {			int16x4_t test_vqrdmlah_s16(int16x4_t a, int16x4_t b, int16x4_t c) {
	// CHECK-ARM: vqrdmlah.s16 d{{[0-9]+}}, d{{[0-9]+}}, d{{[0-9]+}}			// CHECK-ARM: call <4 x i16> @llvm.arm.neon.vqrdmulh.v4i16(<4 x i16> {{%.*}}, <4 x i16> {{%.*}})
	// CHECK-AARCH64: sqrdmlah {{v[0-9]+}}.4h, {{v[0-9]+}}.4h, {{v[0-9]+}}.4h			// CHECK-ARM: call <4 x i16> @llvm.arm.neon.vqadds.v4i16(<4 x i16> {{%.*}}, <4 x i16> {{%.*}})

				// CHECK-AARCH64: call <4 x i16> @llvm.aarch64.neon.sqrdmulh.v4i16(<4 x i16> {{%.*}}, <4 x i16> {{%.*}})
				// CHECK-AARCH64: call <4 x i16> @llvm.aarch64.neon.sqadd.v4i16(<4 x i16> {{%.*}}, <4 x i16> {{%.*}})
	return vqrdmlah_s16(a, b, c);			return vqrdmlah_s16(a, b, c);
	}			}

	// CHECK-LABEL: test_vqrdmlah_s32			// CHECK-LABEL: test_vqrdmlah_s32
	int32x2_t test_vqrdmlah_s32(int32x2_t a, int32x2_t b, int32x2_t c) {			int32x2_t test_vqrdmlah_s32(int32x2_t a, int32x2_t b, int32x2_t c) {
	// CHECK-ARM: vqrdmlah.s32 d{{[0-9]+}}, d{{[0-9]+}}, d{{[0-9]+}}			// CHECK-ARM: call <2 x i32> @llvm.arm.neon.vqrdmulh.v2i32(<2 x i32> {{%.*}}, <2 x i32> {{%.*}})
	// CHECK-AARCH64: sqrdmlah {{v[0-9]+}}.2s, {{v[0-9]+}}.2s, {{v[0-9]+}}.2s			// CHECK-ARM: call <2 x i32> @llvm.arm.neon.vqadds.v2i32(<2 x i32> {{%.*}}, <2 x i32> {{%.*}})

				// CHECK-AARCH64: call <2 x i32> @llvm.aarch64.neon.sqrdmulh.v2i32(<2 x i32> {{%.*}}, <2 x i32> {{%.*}})
				// CHECK-AARCH64: call <2 x i32> @llvm.aarch64.neon.sqadd.v2i32(<2 x i32> {{%.*}}, <2 x i32> {{%.*}})
	return vqrdmlah_s32(a, b, c);			return vqrdmlah_s32(a, b, c);
	}			}

	// CHECK-LABEL: test_vqrdmlahq_s16			// CHECK-LABEL: test_vqrdmlahq_s16
	int16x8_t test_vqrdmlahq_s16(int16x8_t a, int16x8_t b, int16x8_t c) {			int16x8_t test_vqrdmlahq_s16(int16x8_t a, int16x8_t b, int16x8_t c) {
	// CHECK-ARM: vqrdmlah.s16 q{{[0-9]+}}, q{{[0-9]+}}, q{{[0-9]+}}			// CHECK-ARM: call <8 x i16> @llvm.arm.neon.vqrdmulh.v8i16(<8 x i16> {{%.*}}, <8 x i16> {{%.*}})
	// CHECK-AARCH64: sqrdmlah {{v[0-9]+}}.8h, {{v[0-9]+}}.8h, {{v[0-9]+}}.8h			// CHECK-ARM: call <8 x i16> @llvm.arm.neon.vqadds.v8i16(<8 x i16> {{%.*}}, <8 x i16> {{%.*}})

				// CHECK-AARCH64: call <8 x i16> @llvm.aarch64.neon.sqrdmulh.v8i16(<8 x i16> {{%.*}}, <8 x i16> {{%.*}})
				// CHECK-AARCH64: call <8 x i16> @llvm.aarch64.neon.sqadd.v8i16(<8 x i16> {{%.*}}, <8 x i16> {{%.*}})
	return vqrdmlahq_s16(a, b, c);			return vqrdmlahq_s16(a, b, c);
	}			}

	// CHECK-LABEL: test_vqrdmlahq_s32			// CHECK-LABEL: test_vqrdmlahq_s32
	int32x4_t test_vqrdmlahq_s32(int32x4_t a, int32x4_t b, int32x4_t c) {			int32x4_t test_vqrdmlahq_s32(int32x4_t a, int32x4_t b, int32x4_t c) {
	// CHECK-ARM: vqrdmlah.s32 q{{[0-9]+}}, q{{[0-9]+}}, q{{[0-9]+}}			// CHECK-ARM: call <4 x i32> @llvm.arm.neon.vqrdmulh.v4i32(<4 x i32> {{%.*}}, <4 x i32> {{%.*}})
	// CHECK-AARCH64: sqrdmlah {{v[0-9]+}}.4s, {{v[0-9]+}}.4s, {{v[0-9]+}}.4s			// CHECK-ARM: call <4 x i32> @llvm.arm.neon.vqadds.v4i32(<4 x i32> {{%.*}}, <4 x i32> {{%.*}})

				// CHECK-AARCH64: call <4 x i32> @llvm.aarch64.neon.sqrdmulh.v4i32(<4 x i32> {{%.*}}, <4 x i32> {{%.*}})
				// CHECK-AARCH64: call <4 x i32> @llvm.aarch64.neon.sqadd.v4i32(<4 x i32> {{%.*}}, <4 x i32> {{%.*}})
	return vqrdmlahq_s32(a, b, c);			return vqrdmlahq_s32(a, b, c);
	}			}

	// CHECK-LABEL: test_vqrdmlah_lane_s16			// CHECK-LABEL: test_vqrdmlah_lane_s16
	int16x4_t test_vqrdmlah_lane_s16(int16x4_t a, int16x4_t b, int16x4_t c) {			int16x4_t test_vqrdmlah_lane_s16(int16x4_t a, int16x4_t b, int16x4_t c) {
	// CHECK-ARM: vqrdmlah.s16 d{{[0-9]+}}, d{{[0-9]+}}, d{{[0-9]+}}[3]			// CHECK-ARM: shufflevector <4 x i16> {{%.*}}, <4 x i16> {{%.*}}, <4 x i32> <i32 3, i32 3, i32 3, i32 3>
	// CHECK-AARCH64: sqrdmlah {{v[0-9]+}}.4h, {{v[0-9]+}}.4h, {{v[0-9]+}}.h[3]			// CHECK-ARM: call <4 x i16> @llvm.arm.neon.vqrdmulh.v4i16(<4 x i16> {{%.*}}, <4 x i16> {{%.*}})
				// CHECK-ARM: call <4 x i16> @llvm.arm.neon.vqadds.v4i16(<4 x i16> {{%.*}}, <4 x i16> {{%.*}})

				// CHECK-AARCH64: shufflevector <4 x i16> {{%.*}}, <4 x i16> {{%.*}}, <4 x i32> <i32 3, i32 3, i32 3, i32 3>
				// CHECK-AARCH64: call <4 x i16> @llvm.aarch64.neon.sqrdmulh.v4i16(<4 x i16> {{%.*}}, <4 x i16> {{%.*}})
				// CHECK-AARCH64: call <4 x i16> @llvm.aarch64.neon.sqadd.v4i16(<4 x i16> {{%.*}}, <4 x i16> {{%.*}})
	return vqrdmlah_lane_s16(a, b, c, 3);			return vqrdmlah_lane_s16(a, b, c, 3);
	}			}

	// CHECK-LABEL: test_vqrdmlah_lane_s32			// CHECK-LABEL: test_vqrdmlah_lane_s32
	int32x2_t test_vqrdmlah_lane_s32(int32x2_t a, int32x2_t b, int32x2_t c) {			int32x2_t test_vqrdmlah_lane_s32(int32x2_t a, int32x2_t b, int32x2_t c) {
	// CHECK-ARM: vqrdmlah.s32 d{{[0-9]+}}, d{{[0-9]+}}, d{{[0-9]+}}[1]			// CHECK-ARM: shufflevector <2 x i32> {{%.*}}, <2 x i32> {{%.*}}, <2 x i32> <i32 1, i32 1>
	// CHECK-AARCH64: sqrdmlah {{v[0-9]+}}.2s, {{v[0-9]+}}.2s, {{v[0-9]+}}.s[1]			// CHECK-ARM: call <2 x i32> @llvm.arm.neon.vqrdmulh.v2i32(<2 x i32> {{%.*}}, <2 x i32> {{%.*}})
				// CHECK-ARM: call <2 x i32> @llvm.arm.neon.vqadds.v2i32(<2 x i32> {{%.*}}, <2 x i32> {{%.*}})

				// CHECK-AARCH64: shufflevector <2 x i32> {{%.*}}, <2 x i32> {{%.*}}, <2 x i32> <i32 1, i32 1>
				// CHECK-AARCH64: call <2 x i32> @llvm.aarch64.neon.sqrdmulh.v2i32(<2 x i32> {{%.*}}, <2 x i32> {{%.*}})
				// CHECK-AARCH64: call <2 x i32> @llvm.aarch64.neon.sqadd.v2i32(<2 x i32> {{%.*}}, <2 x i32> {{%.*}})
	return vqrdmlah_lane_s32(a, b, c, 1);			return vqrdmlah_lane_s32(a, b, c, 1);
	}			}

	// CHECK-LABEL: test_vqrdmlahq_lane_s16			// CHECK-LABEL: test_vqrdmlahq_lane_s16
	int16x8_t test_vqrdmlahq_lane_s16(int16x8_t a, int16x8_t b, int16x4_t c) {			int16x8_t test_vqrdmlahq_lane_s16(int16x8_t a, int16x8_t b, int16x4_t c) {
	// CHECK-ARM: vqrdmlah.s16 q{{[0-9]+}}, q{{[0-9]+}}, d{{[0-9]+}}[3]			// CHECK-ARM: shufflevector <4 x i16> {{%.*}}, <4 x i16> {{%.*}}, <8 x i32> <i32 3, i32 3, i32 3, i32 3, i32 3, i32 3, i32 3, i32 3>
	// CHECK-AARCH64: sqrdmlah {{v[0-9]+}}.8h, {{v[0-9]+}}.8h, {{v[0-9]+}}.h[3]			// CHECK-ARM: call <8 x i16> @llvm.arm.neon.vqrdmulh.v8i16(<8 x i16> {{%.*}}, <8 x i16> {{%.*}})
				// CHECK-ARM: call <8 x i16> @llvm.arm.neon.vqadds.v8i16(<8 x i16> {{%.*}}, <8 x i16> {{%.*}})

				// CHECK-AARCH64: shufflevector <4 x i16> {{%.*}}, <4 x i16> {{%.*}}, <8 x i32> <i32 3, i32 3, i32 3, i32 3, i32 3, i32 3, i32 3, i32 3>
				// CHECK-AARCH64: call <8 x i16> @llvm.aarch64.neon.sqrdmulh.v8i16(<8 x i16> {{%.*}}, <8 x i16> {{%.*}})
				// CHECK-AARCH64: call <8 x i16> @llvm.aarch64.neon.sqadd.v8i16(<8 x i16> {{%.*}}, <8 x i16> {{%.*}})
	return vqrdmlahq_lane_s16(a, b, c, 3);			return vqrdmlahq_lane_s16(a, b, c, 3);
	}			}

	// CHECK-LABEL: test_vqrdmlahq_lane_s32			// CHECK-LABEL: test_vqrdmlahq_lane_s32
	int32x4_t test_vqrdmlahq_lane_s32(int32x4_t a, int32x4_t b, int32x2_t c) {			int32x4_t test_vqrdmlahq_lane_s32(int32x4_t a, int32x4_t b, int32x2_t c) {
	// CHECK-ARM: vqrdmlah.s32 q{{[0-9]+}}, q{{[0-9]+}}, d{{[0-9]+}}[1]			// CHECK-ARM: shufflevector <2 x i32> {{%.*}}, <2 x i32> {{%.*}}, <4 x i32> <i32 1, i32 1, i32 1, i32 1>
	// CHECK-AARCH64: sqrdmlah {{v[0-9]+}}.4s, {{v[0-9]+}}.4s, {{v[0-9]+}}.s[1]			// CHECK-ARM: call <4 x i32> @llvm.arm.neon.vqrdmulh.v4i32(<4 x i32> {{%.*}}, <4 x i32> {{%.*}})
				// CHECK-ARM: call <4 x i32> @llvm.arm.neon.vqadds.v4i32(<4 x i32> {{%.*}}, <4 x i32> {{%.*}})

				// CHECK-AARCH64: shufflevector <2 x i32> {{%.*}}, <2 x i32> {{%.*}}, <4 x i32> <i32 1, i32 1, i32 1, i32 1>
				// CHECK-AARCH64: call <4 x i32> @llvm.aarch64.neon.sqrdmulh.v4i32(<4 x i32> {{%.*}}, <4 x i32> {{%.*}})
				// CHECK-AARCH64: call <4 x i32> @llvm.aarch64.neon.sqadd.v4i32(<4 x i32> {{%.*}}, <4 x i32> {{%.*}})
	return vqrdmlahq_lane_s32(a, b, c, 1);			return vqrdmlahq_lane_s32(a, b, c, 1);
	}			}

	// CHECK-LABEL: test_vqrdmlsh_s16			// CHECK-LABEL: test_vqrdmlsh_s16
	int16x4_t test_vqrdmlsh_s16(int16x4_t a, int16x4_t b, int16x4_t c) {			int16x4_t test_vqrdmlsh_s16(int16x4_t a, int16x4_t b, int16x4_t c) {
	// CHECK-ARM: vqrdmlsh.s16 d{{[0-9]+}}, d{{[0-9]+}}, d{{[0-9]+}}			// CHECK-ARM: call <4 x i16> @llvm.arm.neon.vqrdmulh.v4i16(<4 x i16> {{%.*}}, <4 x i16> {{%.*}})
	// CHECK-AARCH64: sqrdmlsh {{v[0-9]+}}.4h, {{v[0-9]+}}.4h, {{v[0-9]+}}.4h			// CHECK-ARM: call <4 x i16> @llvm.arm.neon.vqsubs.v4i16(<4 x i16> {{%.*}}, <4 x i16> {{%.*}})

				// CHECK-AARCH64: call <4 x i16> @llvm.aarch64.neon.sqrdmulh.v4i16(<4 x i16> {{%.*}}, <4 x i16> {{%.*}})
				// CHECK-AARCH64: call <4 x i16> @llvm.aarch64.neon.sqsub.v4i16(<4 x i16> {{%.*}}, <4 x i16> {{%.*}})
	return vqrdmlsh_s16(a, b, c);			return vqrdmlsh_s16(a, b, c);
	}			}

	// CHECK-LABEL: test_vqrdmlsh_s32			// CHECK-LABEL: test_vqrdmlsh_s32
	int32x2_t test_vqrdmlsh_s32(int32x2_t a, int32x2_t b, int32x2_t c) {			int32x2_t test_vqrdmlsh_s32(int32x2_t a, int32x2_t b, int32x2_t c) {
	// CHECK-ARM: vqrdmlsh.s32 d{{[0-9]+}}, d{{[0-9]+}}, d{{[0-9]+}}			// CHECK-ARM: call <2 x i32> @llvm.arm.neon.vqrdmulh.v2i32(<2 x i32> {{%.*}}, <2 x i32> {{%.*}})
	// CHECK-AARCH64: sqrdmlsh {{v[0-9]+}}.2s, {{v[0-9]+}}.2s, {{v[0-9]+}}.2s			// CHECK-ARM: call <2 x i32> @llvm.arm.neon.vqsubs.v2i32(<2 x i32> {{%.*}}, <2 x i32> {{%.*}})

				// CHECK-AARCH64: call <2 x i32> @llvm.aarch64.neon.sqrdmulh.v2i32(<2 x i32> {{%.*}}, <2 x i32> {{%.*}})
				// CHECK-AARCH64: call <2 x i32> @llvm.aarch64.neon.sqsub.v2i32(<2 x i32> {{%.*}}, <2 x i32> {{%.*}})
	return vqrdmlsh_s32(a, b, c);			return vqrdmlsh_s32(a, b, c);
	}			}

	// CHECK-LABEL: test_vqrdmlshq_s16			// CHECK-LABEL: test_vqrdmlshq_s16
	int16x8_t test_vqrdmlshq_s16(int16x8_t a, int16x8_t b, int16x8_t c) {			int16x8_t test_vqrdmlshq_s16(int16x8_t a, int16x8_t b, int16x8_t c) {
	// CHECK-ARM: vqrdmlsh.s16 q{{[0-9]+}}, q{{[0-9]+}}, q{{[0-9]+}}			// CHECK-ARM: call <8 x i16> @llvm.arm.neon.vqrdmulh.v8i16(<8 x i16> {{%.*}}, <8 x i16> {{%.*}})
	// CHECK-AARCH64: sqrdmlsh {{v[0-9]+}}.8h, {{v[0-9]+}}.8h, {{v[0-9]+}}.8h			// CHECK-ARM: call <8 x i16> @llvm.arm.neon.vqsubs.v8i16(<8 x i16> {{%.*}}, <8 x i16> {{%.*}})

				// CHECK-AARCH64: call <8 x i16> @llvm.aarch64.neon.sqrdmulh.v8i16(<8 x i16> {{%.*}}, <8 x i16> {{%.*}})
				// CHECK-AARCH64: call <8 x i16> @llvm.aarch64.neon.sqsub.v8i16(<8 x i16> {{%.*}}, <8 x i16> {{%.*}})
	return vqrdmlshq_s16(a, b, c);			return vqrdmlshq_s16(a, b, c);
	}			}

	// CHECK-LABEL: test_vqrdmlshq_s32			// CHECK-LABEL: test_vqrdmlshq_s32
	int32x4_t test_vqrdmlshq_s32(int32x4_t a, int32x4_t b, int32x4_t c) {			int32x4_t test_vqrdmlshq_s32(int32x4_t a, int32x4_t b, int32x4_t c) {
	// CHECK-ARM: vqrdmlsh.s32 q{{[0-9]+}}, q{{[0-9]+}}, q{{[0-9]+}}			// CHECK-ARM: call <4 x i32> @llvm.arm.neon.vqrdmulh.v4i32(<4 x i32> {{%.*}}, <4 x i32> {{%.*}})
	// CHECK-AARCH64: sqrdmlsh {{v[0-9]+}}.4s, {{v[0-9]+}}.4s, {{v[0-9]+}}.4s			// CHECK-ARM: call <4 x i32> @llvm.arm.neon.vqsubs.v4i32(<4 x i32> {{%.*}}, <4 x i32> {{%.*}})

				// CHECK-AARCH64: call <4 x i32> @llvm.aarch64.neon.sqrdmulh.v4i32(<4 x i32> {{%.*}}, <4 x i32> {{%.*}})
				// CHECK-AARCH64: call <4 x i32> @llvm.aarch64.neon.sqsub.v4i32(<4 x i32> {{%.*}}, <4 x i32> {{%.*}})
	return vqrdmlshq_s32(a, b, c);			return vqrdmlshq_s32(a, b, c);
	}			}

	// CHECK-LABEL: test_vqrdmlsh_lane_s16			// CHECK-LABEL: test_vqrdmlsh_lane_s16
	int16x4_t test_vqrdmlsh_lane_s16(int16x4_t a, int16x4_t b, int16x4_t c) {			int16x4_t test_vqrdmlsh_lane_s16(int16x4_t a, int16x4_t b, int16x4_t c) {
	// CHECK-ARM: vqrdmlsh.s16 d{{[0-9]+}}, d{{[0-9]+}}, d{{[0-9]+}}[3]			// CHECK-ARM: shufflevector <4 x i16> {{%.*}}, <4 x i16> {{%.*}}, <4 x i32> <i32 3, i32 3, i32 3, i32 3>
	// CHECK-AARCH64: sqrdmlsh {{v[0-9]+}}.4h, {{v[0-9]+}}.4h, {{v[0-9]+}}.h[3]			// CHECK-ARM: call <4 x i16> @llvm.arm.neon.vqrdmulh.v4i16(<4 x i16> {{%.*}}, <4 x i16> {{%.*}})
				// CHECK-ARM: call <4 x i16> @llvm.arm.neon.vqsubs.v4i16(<4 x i16> {{%.*}}, <4 x i16> {{%.*}})

				// CHECK-AARCH64: shufflevector <4 x i16> {{%.*}}, <4 x i16> {{%.*}}, <4 x i32> <i32 3, i32 3, i32 3, i32 3>
				// CHECK-AARCH64: call <4 x i16> @llvm.aarch64.neon.sqrdmulh.v4i16(<4 x i16> {{%.*}}, <4 x i16> {{%.*}})
				// CHECK-AARCH64: call <4 x i16> @llvm.aarch64.neon.sqsub.v4i16(<4 x i16> {{%.*}}, <4 x i16> {{%.*}})
	return vqrdmlsh_lane_s16(a, b, c, 3);			return vqrdmlsh_lane_s16(a, b, c, 3);
	}			}

	// CHECK-LABEL: test_vqrdmlsh_lane_s32			// CHECK-LABEL: test_vqrdmlsh_lane_s32
	int32x2_t test_vqrdmlsh_lane_s32(int32x2_t a, int32x2_t b, int32x2_t c) {			int32x2_t test_vqrdmlsh_lane_s32(int32x2_t a, int32x2_t b, int32x2_t c) {
	// CHECK-ARM: vqrdmlsh.s32 d{{[0-9]+}}, d{{[0-9]+}}, d{{[0-9]+}}[1]			// CHECK-ARM: shufflevector <2 x i32> {{%.*}}, <2 x i32> {{%.*}}, <2 x i32> <i32 1, i32 1>
	// CHECK-AARCH64: sqrdmlsh {{v[0-9]+}}.2s, {{v[0-9]+}}.2s, {{v[0-9]+}}.s[1]			// CHECK-ARM: call <2 x i32> @llvm.arm.neon.vqrdmulh.v2i32(<2 x i32> {{%.*}}, <2 x i32> {{%.*}})
				// CHECK-ARM: call <2 x i32> @llvm.arm.neon.vqsubs.v2i32(<2 x i32> {{%.*}}, <2 x i32> {{%.*}})

				// CHECK-AARCH64: shufflevector <2 x i32> {{%.*}}, <2 x i32> {{%.*}}, <2 x i32> <i32 1, i32 1>
				// CHECK-AARCH64: call <2 x i32> @llvm.aarch64.neon.sqrdmulh.v2i32(<2 x i32> {{%.*}}, <2 x i32> {{%.*}})
				// CHECK-AARCH64: call <2 x i32> @llvm.aarch64.neon.sqsub.v2i32(<2 x i32> {{%.*}}, <2 x i32> {{%.*}})
	return vqrdmlsh_lane_s32(a, b, c, 1);			return vqrdmlsh_lane_s32(a, b, c, 1);
	}			}

	// CHECK-LABEL: test_vqrdmlshq_lane_s16			// CHECK-LABEL: test_vqrdmlshq_lane_s16
	int16x8_t test_vqrdmlshq_lane_s16(int16x8_t a, int16x8_t b, int16x4_t c) {			int16x8_t test_vqrdmlshq_lane_s16(int16x8_t a, int16x8_t b, int16x4_t c) {
	// CHECK-ARM: vqrdmlsh.s16 q{{[0-9]+}}, q{{[0-9]+}}, d{{[0-9]+}}[3]			// CHECK-ARM: shufflevector <4 x i16> {{%.*}}, <4 x i16> {{%.*}}, <8 x i32> <i32 3, i32 3, i32 3, i32 3, i32 3, i32 3, i32 3, i32 3>
	// CHECK-AARCH64: sqrdmlsh {{v[0-9]+}}.8h, {{v[0-9]+}}.8h, {{v[0-9]+}}.h[3]			// CHECK-ARM: call <8 x i16> @llvm.arm.neon.vqrdmulh.v8i16(<8 x i16> {{%.*}}, <8 x i16> {{%.*}})
				// CHECK-ARM: call <8 x i16> @llvm.arm.neon.vqsubs.v8i16(<8 x i16> {{%.*}}, <8 x i16> {{%.*}})

				// CHECK-AARCH64: shufflevector <4 x i16> {{%.*}}, <4 x i16> {{%.*}}, <8 x i32> <i32 3, i32 3, i32 3, i32 3, i32 3, i32 3, i32 3, i32 3>
				// CHECK-AARCH64: call <8 x i16> @llvm.aarch64.neon.sqrdmulh.v8i16(<8 x i16> {{%.*}}, <8 x i16> {{%.*}})
				// CHECK-AARCH64: call <8 x i16> @llvm.aarch64.neon.sqsub.v8i16(<8 x i16> {{%.*}}, <8 x i16> {{%.*}})
	return vqrdmlshq_lane_s16(a, b, c, 3);			return vqrdmlshq_lane_s16(a, b, c, 3);
	}			}

	// CHECK-LABEL: test_vqrdmlshq_lane_s32			// CHECK-LABEL: test_vqrdmlshq_lane_s32
	int32x4_t test_vqrdmlshq_lane_s32(int32x4_t a, int32x4_t b, int32x2_t c) {			int32x4_t test_vqrdmlshq_lane_s32(int32x4_t a, int32x4_t b, int32x2_t c) {
	// CHECK-ARM: vqrdmlsh.s32 q{{[0-9]+}}, q{{[0-9]+}}, d{{[0-9]+}}[1]			// CHECK-ARM: shufflevector <2 x i32> {{%.*}}, <2 x i32> {{%.*}}, <4 x i32> <i32 1, i32 1, i32 1, i32 1>
	// CHECK-AARCH64: sqrdmlsh {{v[0-9]+}}.4s, {{v[0-9]+}}.4s, {{v[0-9]+}}.s[1]			// CHECK-ARM: call <4 x i32> @llvm.arm.neon.vqrdmulh.v4i32(<4 x i32> {{%.*}}, <4 x i32> {{%.*}})
				// CHECK-ARM: call <4 x i32> @llvm.arm.neon.vqsubs.v4i32(<4 x i32> {{%.*}}, <4 x i32> {{%.*}})

				// CHECK-AARCH64: shufflevector <2 x i32> {{%.*}}, <2 x i32> {{%.*}}, <4 x i32> <i32 1, i32 1, i32 1, i32 1>
				// CHECK-AARCH64: call <4 x i32> @llvm.aarch64.neon.sqrdmulh.v4i32(<4 x i32> {{%.*}}, <4 x i32> {{%.*}})
				// CHECK-AARCH64: call <4 x i32> @llvm.aarch64.neon.sqsub.v4i32(<4 x i32> {{%.*}}, <4 x i32> {{%.*}})
	return vqrdmlshq_lane_s32(a, b, c, 1);			return vqrdmlshq_lane_s32(a, b, c, 1);
	}			}
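
For readers unfamiliar with the IR sequence these new checks look for, here is a minimal scalar sketch of what one 16-bit lane of the checked pattern computes: a lane splat (the `shufflevector`), then `sqrdmulh`, then `sqsub`. The `model_*` names are hypothetical and exist only for illustration; they are not part of Clang or arm_neon.h. The sketch models the two-call IR sequence shown in the diff, not the fused v8.1a instruction, and it assumes `>>` on a negative signed value is an arithmetic shift.

```c
#include <stdint.h>

/* Hypothetical helper: clamp a wide intermediate to the int16_t range. */
static int16_t model_sat_s16(int64_t v) {
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* One lane of sqrdmulh: saturating, rounding, doubling multiply that
   returns the high half. Assumes arithmetic right shift for negatives. */
static int16_t model_sqrdmulh_s16(int16_t b, int16_t c) {
    int64_t doubled = 2 * (int64_t)b * (int64_t)c;
    return model_sat_s16((doubled + (1 << 15)) >> 16);
}

/* One lane of sqsub: saturating subtract. */
static int16_t model_sqsub_s16(int16_t a, int16_t x) {
    return model_sat_s16((int64_t)a - (int64_t)x);
}

/* The pattern checked for the _lane_s16 test: splat c[lane] across all
   lanes, multiply with b, then subtract the product from a. */
static void model_vqrdmlsh_lane_s16(int16_t out[4], const int16_t a[4],
                                    const int16_t b[4], const int16_t c[4],
                                    int lane) {
    for (int i = 0; i < 4; ++i)
        out[i] = model_sqsub_s16(a[i], model_sqrdmulh_s16(b[i], c[lane]));
}
```

The `q` variants follow the same shape with eight (or, for `s32`, four) result lanes, which is why their `shufflevector` masks widen a 64-bit `c` operand to the 128-bit result width.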

Revision Contents

Diff 43969

cfe/trunk/test/CodeGen/aarch64-v8.1a-neon-intrinsics.c

cfe/trunk/test/CodeGen/arm-v8.1a-neon-intrinsics.c
