This is an archive of the discontinued LLVM Phabricator instance.

Differential D115457

[AArch64] Convert sra(X, elt_size(X)-1) to cmlt(X, 0)
Closed · Public

Authored by labrinea on Dec 9 2021, 11:05 AM.

Details

Reviewers

llvm-commits
SjoerdMeijer
dmgreen
momchil.velikov

Commits

rG61bb8b5d4040: [AArch64] Convert sra(X, elt_size(X)-1) to cmlt(X, 0)

Summary

CMLT has twice the execution throughput of SSHR on Arm out-of-order cores.
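
The rewrite in the title rests on a lane-wise identity: an arithmetic shift right by elt_size(X)-1 copies the sign bit into every bit of the lane, which is the same all-ones or all-zeros mask that a signed compare against zero produces. A minimal Python sketch of that identity follows. The helper names (`to_signed`, `sra`, `cmlt_zero`, `identity_holds`) are illustrative assumptions that model a single lane; they are not LLVM APIs.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def sra(x, shift, bits):
    """Model one lane of an arithmetic shift right (what SSHR computes)."""
    # Python's >> on a negative int is already an arithmetic (floor) shift.
    return to_signed(to_signed(x, bits) >> shift, bits)


def cmlt_zero(x, bits):
    """Model one lane of CMLT against zero: all ones (-1) if x < 0, else 0."""
    return -1 if to_signed(x, bits) < 0 else 0


def identity_holds(bits):
    """Check sra(x, bits - 1) == cmlt(x, 0) for every value of the lane width."""
    lo, hi = -(1 << (bits - 1)), 1 << (bits - 1)
    return all(sra(x, bits - 1, bits) == cmlt_zero(x, bits) for x in range(lo, hi))
```

Calling `identity_holds(8)` or `identity_holds(16)` checks the identity exhaustively for those lane widths; the 32-bit and 64-bit cases follow by the same argument.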

Diff Detail

Unit Tests: Failed

	Time	Test
	100 ms	x64 debian > LLVM.Bindings/Go::go.test

Event Timeline

labrinea created this revision. Dec 9 2021, 11:05 AM

Herald added subscribers: hiraditya, kristof.beyls. Dec 9 2021, 11:05 AM

labrinea requested review of this revision. Dec 9 2021, 11:05 AM

Herald added a project: Restricted Project. Dec 9 2021, 11:05 AM

Harbormaster completed remote builds in B138487: Diff 393225. Dec 9 2021, 11:48 AM

Can you add the reasoning to the commit message? As far as I understand, CMLT has higher throughput than SSHR on many CPUs.

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
11229 ↗	(On Diff #393225)	This ADD exemption makes it a bit awkward. Do we need to convert this early, or could we just add tablegen patterns and let instruction selection pick the "best" one for the given code, which it is fairly decent at? Something like this, for each type, might be enough: def : Pat<(v16i8 (AArch64vashr (v16i8 V128:$Rn), (i32 7))), (CMLTv16i8rz V128:$Rn)>;
llvm/test/CodeGen/AArch64/arm64-vshr.ll
50–51	I would leave the old test name, or maybe change the constant so it's no longer 63 and the test still checks that an sshr is produced.
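
Changing the constant works because only a shift by elt_size-1 collapses every lane to 0 or -1. Any smaller shift leaves other values, so it cannot be replaced by a compare against zero. A small Python sketch of that difference follows; the helper names and sample values are illustrative assumptions.

```python
def ashr64(x, shift):
    """Arithmetic shift right of a signed 64-bit value held in a Python int."""
    return x >> shift  # Python's >> is an arithmetic shift for ints


def is_sign_mask(values):
    """True if every value is 0 or -1, i.e. what a compare-with-zero would give."""
    return all(v in (0, -1) for v in values)


# Hand-picked signed 64-bit samples (assumed for illustration).
samples = [-(1 << 63), -(1 << 50), -1, 0, 1, 1 << 50, (1 << 63) - 1]

by_63 = [ashr64(x, 63) for x in samples]  # every result is 0 or -1
by_42 = [ashr64(x, 42) for x in samples]  # results such as -256 and 256 appear
```

With a shift of 42, `-(1 << 50)` becomes -256, which no compare against zero can produce, so a test using that constant still expects sshr.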

chill added a subscriber: chill. Dec 10 2021, 2:08 AM

chill added inline comments.

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
11229 ↗	(On Diff #393225)	Why is ADD an exception?

labrinea added inline comments. Dec 13 2021, 11:50 AM

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
11229 ↗	(On Diff #393225)	When the Shift is followed by an Add we should prefer emitting a single instruction (SSRA) instead of two instructions (CMLT, ADD). As Dave suggested above I might be able to work around this with tablegen patterns.
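
To illustrate the point about ADD: SSRA shifts and accumulates in a single instruction, so rewriting the shift to CMLT ahead of an add would need two instructions for the same result. The Python sketch below models one 8-bit lane and checks that both sequences agree. The helper names are hypothetical and the model ignores everything except lane arithmetic.

```python
def wrap8(value):
    """Wrap an integer to a signed 8-bit lane."""
    value &= 0xFF
    return value - 0x100 if value & 0x80 else value


def ssra8(acc, x, shift):
    """One instruction: shift x right arithmetically, then accumulate into acc."""
    return wrap8(acc + (x >> shift))


def cmlt_then_add8(acc, x):
    """Two instructions: CMLT against zero, followed by a separate ADD."""
    mask = -1 if x < 0 else 0
    return wrap8(acc + mask)


def sequences_agree():
    """Compare both sequences for every pair of signed 8-bit lane values."""
    lanes = range(-128, 128)
    return all(ssra8(acc, x, 7) == cmlt_then_add8(acc, x) for acc in lanes for x in lanes)
```

Both forms compute the same value, so the choice between them is purely about instruction count, which is why the shift-plus-add case should keep SSRA.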

Reimplemented the transformation using tablegen patterns.
Updated the description to explain the motivation behind this change.
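
The patterns in the updated diff differ only in vector type, register class, and shift amount (element width minus one). The Python sketch below regenerates that text from a table purely to show the regularity. `VECTOR_TYPES`, `element_bits`, and `cmlt_pattern` are hypothetical names, the indentation is arbitrary, and this is not how the patch was produced.

```python
# (vector type, register class) pairs for the vector forms covered by the patch.
VECTOR_TYPES = [
    ("v8i8", "V64"), ("v4i16", "V64"), ("v2i32", "V64"),
    ("v16i8", "V128"), ("v8i16", "V128"), ("v4i32", "V128"), ("v2i64", "V128"),
]


def element_bits(vector_type):
    """Extract the element width from a name such as 'v4i16' (returns 16)."""
    return int(vector_type.split("i")[1])


def cmlt_pattern(vector_type, reg_class):
    """Format one selection pattern: a shift by width-1 selects CMLT against zero."""
    shift = element_bits(vector_type) - 1
    return (
        f"def : Pat<({vector_type} (AArch64vashr ({vector_type} {reg_class}:$Rn), (i32 {shift}))),\n"
        f"          (CMLT{vector_type}rz {reg_class}:$Rn)>;"
    )


patterns = [cmlt_pattern(vt, rc) for vt, rc in VECTOR_TYPES]
```

Printing `"\n".join(patterns)` yields seven definitions, one per vector type, with shift amounts 7, 15, 31, and 63.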

Harbormaster completed remote builds in B139040: Diff 393987. Dec 13 2021, 12:53 PM

Thanks. LGTM with a couple of suggestions.

llvm/lib/Target/AArch64/AArch64InstrInfo.td
4179	I would remove all these empty lines, to keep the related patterns together.
4815	There is a v1i64 CMLT here if you want to add the pattern for that too. v1 types usually matter less but it may be good to have it for consistency.

This revision is now accepted and ready to land. Dec 14 2021, 1:22 AM

labrinea added inline comments. Dec 14 2021, 1:51 AM

llvm/lib/Target/AArch64/AArch64InstrInfo.td
4815	There is no `1D` variant if that's what you meant. According to the Arm ARM, the encoding (size = 11, Q = 0) is reserved.

dmgreen added inline comments. Dec 14 2021, 1:56 AM

llvm/lib/Target/AArch64/AArch64InstrInfo.td
4815	This is the "Scalar" variant, not the "Vector" variant. This one, I think: https://developer.arm.com/documentation/dui0802/a/A64-Advanced-SIMD-Scalar-Instructions/CMLT--scalar--zero-

This revision was landed with ongoing or failed builds. Dec 14 2021, 8:09 AM

Closed by commit rG61bb8b5d4040: [AArch64] Convert sra(X, elt_size(X)-1) to cmlt(X, 0) (authored by labrinea).

This revision was automatically updated to reflect the committed changes.

labrinea added a commit: rG61bb8b5d4040: [AArch64] Convert sra(X, elt_size(X)-1) to cmlt(X, 0).

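Revision Contents

	Path	Size
	llvm/lib/Target/AArch64/AArch64InstrInfo.td	21 lines
	llvm/test/Analysis/CostModel/AArch64/vector-select.ll	2 lines
	llvm/test/CodeGen/AArch64/arm64-subvector-extend.ll	12 lines
	llvm/test/CodeGen/AArch64/arm64-vshr.ll	4 lines
	llvm/test/CodeGen/AArch64/cmp-select-sign.ll	14 lines
	llvm/test/CodeGen/AArch64/dag-numsignbits.ll	2 lines
	llvm/test/CodeGen/AArch64/div_minsize.ll	2 lines
	llvm/test/CodeGen/AArch64/selectcc-to-shiftand.ll	8 lines
	llvm/test/CodeGen/AArch64/srem-seteq-vec-splat.ll	4 lines
	llvm/test/CodeGen/AArch64/sve-fixed-length-fp-vselect.ll	10 lines
	llvm/test/CodeGen/AArch64/sve-fixed-length-int-vselect.ll	14 lines
	llvm/test/CodeGen/AArch64/sve-fixed-length-masked-gather.ll	2 lines
	llvm/test/CodeGen/AArch64/sve-fixed-length-masked-loads.ll	2 lines
	llvm/test/CodeGen/AArch64/sve-fixed-length-masked-scatter.ll	2 lines
	llvm/test/CodeGen/AArch64/sve-fixed-length-masked-stores.ll	2 lines
	llvm/test/CodeGen/AArch64/vec_uaddo.ll	14 lines
	llvm/test/CodeGen/AArch64/vec_umulo.ll	14 lines
	llvm/test/CodeGen/AArch64/vselect-constants.ll	30 lines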

Diff 393987

llvm/lib/Target/AArch64/AArch64InstrInfo.td

	[4,168 lines not shown]
	defm CMEQ : SIMDCmpTwoVector<0, 0b01001, "cmeq", AArch64cmeqz>;			defm CMEQ : SIMDCmpTwoVector<0, 0b01001, "cmeq", AArch64cmeqz>;
	defm CMGE : SIMDCmpTwoVector<1, 0b01000, "cmge", AArch64cmgez>;			defm CMGE : SIMDCmpTwoVector<1, 0b01000, "cmge", AArch64cmgez>;
	defm CMGT : SIMDCmpTwoVector<0, 0b01000, "cmgt", AArch64cmgtz>;			defm CMGT : SIMDCmpTwoVector<0, 0b01000, "cmgt", AArch64cmgtz>;
	defm CMLE : SIMDCmpTwoVector<1, 0b01001, "cmle", AArch64cmlez>;			defm CMLE : SIMDCmpTwoVector<1, 0b01001, "cmle", AArch64cmlez>;
	defm CMLT : SIMDCmpTwoVector<0, 0b01010, "cmlt", AArch64cmltz>;			defm CMLT : SIMDCmpTwoVector<0, 0b01010, "cmlt", AArch64cmltz>;
	defm CNT : SIMDTwoVectorB<0, 0b00, 0b00101, "cnt", ctpop>;			defm CNT : SIMDTwoVectorB<0, 0b00, 0b00101, "cnt", ctpop>;
	defm FABS : SIMDTwoVectorFP<0, 1, 0b01111, "fabs", fabs>;			defm FABS : SIMDTwoVectorFP<0, 1, 0b01111, "fabs", fabs>;

				def : Pat<(v8i8 (AArch64vashr (v8i8 V64:$Rn), (i32 7))),
				(CMLTv8i8rz V64:$Rn)>;

				def : Pat<(v4i16 (AArch64vashr (v4i16 V64:$Rn), (i32 15))),
				(CMLTv4i16rz V64:$Rn)>;

				def : Pat<(v2i32 (AArch64vashr (v2i32 V64:$Rn), (i32 31))),
				(CMLTv2i32rz V64:$Rn)>;

				def : Pat<(v16i8 (AArch64vashr (v16i8 V128:$Rn), (i32 7))),
				(CMLTv16i8rz V128:$Rn)>;

				def : Pat<(v8i16 (AArch64vashr (v8i16 V128:$Rn), (i32 15))),
				(CMLTv8i16rz V128:$Rn)>;

				def : Pat<(v4i32 (AArch64vashr (v4i32 V128:$Rn), (i32 31))),
				(CMLTv4i32rz V128:$Rn)>;

				def : Pat<(v2i64 (AArch64vashr (v2i64 V128:$Rn), (i32 63))),
				(CMLTv2i64rz V128:$Rn)>;

	defm FCMEQ : SIMDFPCmpTwoVector<0, 1, 0b01101, "fcmeq", AArch64fcmeqz>;			defm FCMEQ : SIMDFPCmpTwoVector<0, 1, 0b01101, "fcmeq", AArch64fcmeqz>;
	defm FCMGE : SIMDFPCmpTwoVector<1, 1, 0b01100, "fcmge", AArch64fcmgez>;			defm FCMGE : SIMDFPCmpTwoVector<1, 1, 0b01100, "fcmge", AArch64fcmgez>;
	defm FCMGT : SIMDFPCmpTwoVector<0, 1, 0b01100, "fcmgt", AArch64fcmgtz>;			defm FCMGT : SIMDFPCmpTwoVector<0, 1, 0b01100, "fcmgt", AArch64fcmgtz>;
	defm FCMLE : SIMDFPCmpTwoVector<1, 1, 0b01101, "fcmle", AArch64fcmlez>;			defm FCMLE : SIMDFPCmpTwoVector<1, 1, 0b01101, "fcmle", AArch64fcmlez>;
	defm FCMLT : SIMDFPCmpTwoVector<0, 1, 0b01110, "fcmlt", AArch64fcmltz>;			defm FCMLT : SIMDFPCmpTwoVector<0, 1, 0b01110, "fcmlt", AArch64fcmltz>;
	defm FCVTAS : SIMDTwoVectorFPToInt<0,0,0b11100, "fcvtas",int_aarch64_neon_fcvtas>;			defm FCVTAS : SIMDTwoVectorFPToInt<0,0,0b11100, "fcvtas",int_aarch64_neon_fcvtas>;
	defm FCVTAU : SIMDTwoVectorFPToInt<1,0,0b11100, "fcvtau",int_aarch64_neon_fcvtau>;			defm FCVTAU : SIMDTwoVectorFPToInt<1,0,0b11100, "fcvtau",int_aarch64_neon_fcvtau>;
	defm FCVTL : SIMDFPWidenTwoVector<0, 0, 0b10111, "fcvtl">;			defm FCVTL : SIMDFPWidenTwoVector<0, 0, 0b10111, "fcvtl">;
	[601 lines not shown]
	// Advanced SIMD two scalar instructions.			// Advanced SIMD two scalar instructions.
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	defm ABS : SIMDTwoScalarD< 0, 0b01011, "abs", abs>;			defm ABS : SIMDTwoScalarD< 0, 0b01011, "abs", abs>;
	defm CMEQ : SIMDCmpTwoScalarD< 0, 0b01001, "cmeq", AArch64cmeqz>;			defm CMEQ : SIMDCmpTwoScalarD< 0, 0b01001, "cmeq", AArch64cmeqz>;
	defm CMGE : SIMDCmpTwoScalarD< 1, 0b01000, "cmge", AArch64cmgez>;			defm CMGE : SIMDCmpTwoScalarD< 1, 0b01000, "cmge", AArch64cmgez>;
	defm CMGT : SIMDCmpTwoScalarD< 0, 0b01000, "cmgt", AArch64cmgtz>;			defm CMGT : SIMDCmpTwoScalarD< 0, 0b01000, "cmgt", AArch64cmgtz>;
	defm CMLE : SIMDCmpTwoScalarD< 1, 0b01001, "cmle", AArch64cmlez>;			defm CMLE : SIMDCmpTwoScalarD< 1, 0b01001, "cmle", AArch64cmlez>;
	defm CMLT : SIMDCmpTwoScalarD< 0, 0b01010, "cmlt", AArch64cmltz>;			defm CMLT : SIMDCmpTwoScalarD< 0, 0b01010, "cmlt", AArch64cmltz>;
	defm FCMEQ : SIMDFPCmpTwoScalar<0, 1, 0b01101, "fcmeq", AArch64fcmeqz>;			defm FCMEQ : SIMDFPCmpTwoScalar<0, 1, 0b01101, "fcmeq", AArch64fcmeqz>;
	defm FCMGE : SIMDFPCmpTwoScalar<1, 1, 0b01100, "fcmge", AArch64fcmgez>;			defm FCMGE : SIMDFPCmpTwoScalar<1, 1, 0b01100, "fcmge", AArch64fcmgez>;
	defm FCMGT : SIMDFPCmpTwoScalar<0, 1, 0b01100, "fcmgt", AArch64fcmgtz>;			defm FCMGT : SIMDFPCmpTwoScalar<0, 1, 0b01100, "fcmgt", AArch64fcmgtz>;
	defm FCMLE : SIMDFPCmpTwoScalar<1, 1, 0b01101, "fcmle", AArch64fcmlez>;			defm FCMLE : SIMDFPCmpTwoScalar<1, 1, 0b01101, "fcmle", AArch64fcmlez>;
	defm FCMLT : SIMDFPCmpTwoScalar<0, 1, 0b01110, "fcmlt", AArch64fcmltz>;			defm FCMLT : SIMDFPCmpTwoScalar<0, 1, 0b01110, "fcmlt", AArch64fcmltz>;
	defm FCVTAS : SIMDFPTwoScalar< 0, 0, 0b11100, "fcvtas">;			defm FCVTAS : SIMDFPTwoScalar< 0, 0, 0b11100, "fcvtas">;
	defm FCVTAU : SIMDFPTwoScalar< 1, 0, 0b11100, "fcvtau">;			defm FCVTAU : SIMDFPTwoScalar< 1, 0, 0b11100, "fcvtau">;
	defm FCVTMS : SIMDFPTwoScalar< 0, 0, 0b11011, "fcvtms">;			defm FCVTMS : SIMDFPTwoScalar< 0, 0, 0b11011, "fcvtms">;
	[3,437 lines not shown]

llvm/test/Analysis/CostModel/AArch64/vector-select.ll

	[137 lines not shown]

	; COST-LABEL: v2i64_select_no_cmp			; COST-LABEL: v2i64_select_no_cmp
	; COST-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %s.1 = select <2 x i1> %cond, <2 x i64> %a, <2 x i64> %b			; COST-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %s.1 = select <2 x i1> %cond, <2 x i64> %a, <2 x i64> %b

	; CODE-LABEL: v2i64_select_no_cmp			; CODE-LABEL: v2i64_select_no_cmp
	; CODE: bb.0			; CODE: bb.0
	; CODE-NEXT: ushll v{{.+}}.2d, v{{.+}}.2s, #0			; CODE-NEXT: ushll v{{.+}}.2d, v{{.+}}.2s, #0
	; CODE-NEXT: shl v{{.+}}.2d, v{{.+}}.2d, #63			; CODE-NEXT: shl v{{.+}}.2d, v{{.+}}.2d, #63
	; CODE-NEXT: sshr v{{.+}}.2d, v{{.+}}.2d, #63			; CODE-NEXT: cmlt v{{.+}}.2d, v{{.+}}.2d, #0
	; CODE-NEXT: bif v{{.+}}.16b, v{{.+}}.16b, v{{.+}}.16b			; CODE-NEXT: bif v{{.+}}.16b, v{{.+}}.16b, v{{.+}}.16b
	; CODE-NEXT: ret			; CODE-NEXT: ret

	define <2 x i64> @v2i64_select_no_cmp(<2 x i64> %a, <2 x i64> %b, <2 x i1> %cond) {			define <2 x i64> @v2i64_select_no_cmp(<2 x i64> %a, <2 x i64> %b, <2 x i1> %cond) {
	%s.1 = select <2 x i1> %cond, <2 x i64> %a, <2 x i64> %b			%s.1 = select <2 x i1> %cond, <2 x i64> %a, <2 x i64> %b
	ret <2 x i64> %s.1			ret <2 x i64> %s.1
	}			}

llvm/test/CodeGen/AArch64/arm64-subvector-extend.ll

	[342 lines not shown]
	; CHECK-NEXT: ldr w8, [sp, #176]			; CHECK-NEXT: ldr w8, [sp, #176]
	; CHECK-NEXT: mov.b v0[14], w9			; CHECK-NEXT: mov.b v0[14], w9
	; CHECK-NEXT: mov.b v1[14], w8			; CHECK-NEXT: mov.b v1[14], w8
	; CHECK-NEXT: ldr w8, [sp, #184]			; CHECK-NEXT: ldr w8, [sp, #184]
	; CHECK-NEXT: mov.b v0[15], w10			; CHECK-NEXT: mov.b v0[15], w10
	; CHECK-NEXT: mov.b v1[15], w8			; CHECK-NEXT: mov.b v1[15], w8
	; CHECK-NEXT: shl.16b v0, v0, #7			; CHECK-NEXT: shl.16b v0, v0, #7
	; CHECK-NEXT: shl.16b v1, v1, #7			; CHECK-NEXT: shl.16b v1, v1, #7
	; CHECK-NEXT: sshr.16b v0, v0, #7			; CHECK-NEXT: cmlt.16b v0, v0, #0
	; CHECK-NEXT: sshr.16b v1, v1, #7			; CHECK-NEXT: cmlt.16b v1, v1, #0
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	%res = sext <32 x i1> %arg to <32 x i8>			%res = sext <32 x i1> %arg to <32 x i8>
	ret <32 x i8> %res			ret <32 x i8> %res
	}			}

	define <64 x i8> @zext_v64i1(<64 x i1> %arg) {			define <64 x i8> @zext_v64i1(<64 x i1> %arg) {
	; CHECK-LABEL: zext_v64i1:			; CHECK-LABEL: zext_v64i1:
	; CHECK: // %bb.0:			; CHECK: // %bb.0:
	[249 lines not shown]
	; CHECK-NEXT: mov.b v3[15], w8			; CHECK-NEXT: mov.b v3[15], w8
	; CHECK-NEXT: mov.b v2[15], w9			; CHECK-NEXT: mov.b v2[15], w9
	; CHECK-NEXT: mov.b v1[15], w13			; CHECK-NEXT: mov.b v1[15], w13
	; CHECK-NEXT: mov.b v0[15], w10			; CHECK-NEXT: mov.b v0[15], w10
	; CHECK-NEXT: shl.16b v3, v3, #7			; CHECK-NEXT: shl.16b v3, v3, #7
	; CHECK-NEXT: shl.16b v2, v2, #7			; CHECK-NEXT: shl.16b v2, v2, #7
	; CHECK-NEXT: shl.16b v4, v1, #7			; CHECK-NEXT: shl.16b v4, v1, #7
	; CHECK-NEXT: shl.16b v5, v0, #7			; CHECK-NEXT: shl.16b v5, v0, #7
	; CHECK-NEXT: sshr.16b v0, v3, #7			; CHECK-NEXT: cmlt.16b v0, v3, #0
	; CHECK-NEXT: sshr.16b v1, v2, #7			; CHECK-NEXT: cmlt.16b v1, v2, #0
	; CHECK-NEXT: sshr.16b v2, v4, #7			; CHECK-NEXT: cmlt.16b v2, v4, #0
	; CHECK-NEXT: sshr.16b v3, v5, #7			; CHECK-NEXT: cmlt.16b v3, v5, #0
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	%res = sext <64 x i1> %arg to <64 x i8>			%res = sext <64 x i1> %arg to <64 x i8>
	ret <64 x i8> %res			ret <64 x i8> %res
	}			}

	define <1 x i128> @sext_v1x64(<1 x i64> %arg) {			define <1 x i128> @sext_v1x64(<1 x i64> %arg) {
	; X0 & X1 are the real return registers, SDAG messes with v0 too for unknown reasons.			; X0 & X1 are the real return registers, SDAG messes with v0 too for unknown reasons.
	; CHECKDAG-LABEL: sext_v1x64:			; CHECKDAG-LABEL: sext_v1x64:
	[17 lines not shown]

llvm/test/CodeGen/AArch64/arm64-vshr.ll

[41 lines not shown]
store <8 x i16> %b, <8 x i16>* %b.addr, align 16		store <8 x i16> %b, <8 x i16>* %b.addr, align 16
%0 = load <8 x i16>, <8 x i16>* %a.addr, align 16		%0 = load <8 x i16>, <8 x i16>* %a.addr, align 16
%1 = load <8 x i16>, <8 x i16>* %b.addr, align 16		%1 = load <8 x i16>, <8 x i16>* %b.addr, align 16
%shr = lshr <8 x i16> %0, %1		%shr = lshr <8 x i16> %0, %1
ret <8 x i16> %shr		ret <8 x i16> %shr
}		}

define <1 x i64> @sshr_v1i64(<1 x i64> %A) nounwind {		define <1 x i64> @sshr_v1i64(<1 x i64> %A) nounwind {
; CHECK-LABEL: sshr_v1i64:		; CHECK-LABEL: sshr_v1i64:
; CHECK: sshr d0, d0, #63		; CHECK: sshr d0, d0, #42
%tmp3 = ashr <1 x i64> %A, < i64 63 >		%tmp3 = ashr <1 x i64> %A, < i64 42 >
ret <1 x i64> %tmp3		ret <1 x i64> %tmp3
}		}

define <1 x i64> @ushr_v1i64(<1 x i64> %A) nounwind {		define <1 x i64> @ushr_v1i64(<1 x i64> %A) nounwind {
; CHECK-LABEL: ushr_v1i64:		; CHECK-LABEL: ushr_v1i64:
; CHECK: ushr d0, d0, #63		; CHECK: ushr d0, d0, #63
%tmp3 = lshr <1 x i64> %A, < i64 63 >		%tmp3 = lshr <1 x i64> %A, < i64 63 >
ret <1 x i64> %tmp3		ret <1 x i64> %tmp3
}		}

attributes #0 = { nounwind }		attributes #0 = { nounwind }

llvm/test/CodeGen/AArch64/cmp-select-sign.ll

[109 lines not shown]
%res = select i1 %c, i64 1, i64 -1		%res = select i1 %c, i64 1, i64 -1
ret i64 %res		ret i64 %res
}		}

define <7 x i8> @sign_7xi8(<7 x i8> %a) {		define <7 x i8> @sign_7xi8(<7 x i8> %a) {
; CHECK-LABEL: sign_7xi8:		; CHECK-LABEL: sign_7xi8:
; CHECK: // %bb.0:		; CHECK: // %bb.0:
; CHECK-NEXT: movi v1.8b, #1		; CHECK-NEXT: movi v1.8b, #1
; CHECK-NEXT: sshr v0.8b, v0.8b, #7		; CHECK-NEXT: cmlt v0.8b, v0.8b, #0
; CHECK-NEXT: orr v0.8b, v0.8b, v1.8b		; CHECK-NEXT: orr v0.8b, v0.8b, v1.8b
; CHECK-NEXT: ret		; CHECK-NEXT: ret
%c = icmp sgt <7 x i8> %a, <i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1>		%c = icmp sgt <7 x i8> %a, <i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1>
%res = select <7 x i1> %c, <7 x i8> <i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1>, <7 x i8> <i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1>		%res = select <7 x i1> %c, <7 x i8> <i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1>, <7 x i8> <i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1>
ret <7 x i8> %res		ret <7 x i8> %res
}		}

define <8 x i8> @sign_8xi8(<8 x i8> %a) {		define <8 x i8> @sign_8xi8(<8 x i8> %a) {
; CHECK-LABEL: sign_8xi8:		; CHECK-LABEL: sign_8xi8:
; CHECK: // %bb.0:		; CHECK: // %bb.0:
; CHECK-NEXT: movi v1.8b, #1		; CHECK-NEXT: movi v1.8b, #1
; CHECK-NEXT: sshr v0.8b, v0.8b, #7		; CHECK-NEXT: cmlt v0.8b, v0.8b, #0
; CHECK-NEXT: orr v0.8b, v0.8b, v1.8b		; CHECK-NEXT: orr v0.8b, v0.8b, v1.8b
; CHECK-NEXT: ret		; CHECK-NEXT: ret
%c = icmp sgt <8 x i8> %a, <i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1>		%c = icmp sgt <8 x i8> %a, <i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1>
%res = select <8 x i1> %c, <8 x i8> <i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1>, <8 x i8> <i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1>		%res = select <8 x i1> %c, <8 x i8> <i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1>, <8 x i8> <i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1>
ret <8 x i8> %res		ret <8 x i8> %res
}		}

define <16 x i8> @sign_16xi8(<16 x i8> %a) {		define <16 x i8> @sign_16xi8(<16 x i8> %a) {
; CHECK-LABEL: sign_16xi8:		; CHECK-LABEL: sign_16xi8:
; CHECK: // %bb.0:		; CHECK: // %bb.0:
; CHECK-NEXT: movi v1.16b, #1		; CHECK-NEXT: movi v1.16b, #1
; CHECK-NEXT: sshr v0.16b, v0.16b, #7		; CHECK-NEXT: cmlt v0.16b, v0.16b, #0
; CHECK-NEXT: orr v0.16b, v0.16b, v1.16b		; CHECK-NEXT: orr v0.16b, v0.16b, v1.16b
; CHECK-NEXT: ret		; CHECK-NEXT: ret
%c = icmp sgt <16 x i8> %a, <i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1>		%c = icmp sgt <16 x i8> %a, <i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1>
%res = select <16 x i1> %c, <16 x i8> <i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1>, <16 x i8> <i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1>		%res = select <16 x i1> %c, <16 x i8> <i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1, i8 1>, <16 x i8> <i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1>
ret <16 x i8> %res		ret <16 x i8> %res
}		}

define <3 x i32> @sign_3xi32(<3 x i32> %a) {		define <3 x i32> @sign_3xi32(<3 x i32> %a) {
; CHECK-LABEL: sign_3xi32:		; CHECK-LABEL: sign_3xi32:
; CHECK: // %bb.0:		; CHECK: // %bb.0:
; CHECK-NEXT: sshr v0.4s, v0.4s, #31		; CHECK-NEXT: cmlt v0.4s, v0.4s, #0
; CHECK-NEXT: orr v0.4s, #1		; CHECK-NEXT: orr v0.4s, #1
; CHECK-NEXT: ret		; CHECK-NEXT: ret
%c = icmp sgt <3 x i32> %a, <i32 -1, i32 -1, i32 -1>		%c = icmp sgt <3 x i32> %a, <i32 -1, i32 -1, i32 -1>
%res = select <3 x i1> %c, <3 x i32> <i32 1, i32 1, i32 1>, <3 x i32> <i32 -1, i32 -1, i32 -1>		%res = select <3 x i1> %c, <3 x i32> <i32 1, i32 1, i32 1>, <3 x i32> <i32 -1, i32 -1, i32 -1>
ret <3 x i32> %res		ret <3 x i32> %res
}		}

define <4 x i32> @sign_4xi32(<4 x i32> %a) {		define <4 x i32> @sign_4xi32(<4 x i32> %a) {
; CHECK-LABEL: sign_4xi32:		; CHECK-LABEL: sign_4xi32:
; CHECK: // %bb.0:		; CHECK: // %bb.0:
; CHECK-NEXT: sshr v0.4s, v0.4s, #31		; CHECK-NEXT: cmlt v0.4s, v0.4s, #0
; CHECK-NEXT: orr v0.4s, #1		; CHECK-NEXT: orr v0.4s, #1
; CHECK-NEXT: ret		; CHECK-NEXT: ret
%c = icmp sgt <4 x i32> %a, <i32 -1, i32 -1, i32 -1, i32 -1>		%c = icmp sgt <4 x i32> %a, <i32 -1, i32 -1, i32 -1, i32 -1>
%res = select <4 x i1> %c, <4 x i32> <i32 1, i32 1, i32 1, i32 1>, <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>		%res = select <4 x i1> %c, <4 x i32> <i32 1, i32 1, i32 1, i32 1>, <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>
ret <4 x i32> %res		ret <4 x i32> %res
}		}

define <4 x i32> @sign_4xi32_multi_use(<4 x i32> %a) {		define <4 x i32> @sign_4xi32_multi_use(<4 x i32> %a) {
; CHECK-LABEL: sign_4xi32_multi_use:		; CHECK-LABEL: sign_4xi32_multi_use:
; CHECK: // %bb.0:		; CHECK: // %bb.0:
; CHECK-NEXT: sub sp, sp, #32		; CHECK-NEXT: sub sp, sp, #32
; CHECK-NEXT: str x30, [sp, #16] // 8-byte Folded Spill		; CHECK-NEXT: str x30, [sp, #16] // 8-byte Folded Spill
; CHECK-NEXT: .cfi_def_cfa_offset 32		; CHECK-NEXT: .cfi_def_cfa_offset 32
; CHECK-NEXT: .cfi_offset w30, -16		; CHECK-NEXT: .cfi_offset w30, -16
; CHECK-NEXT: movi v1.2d, #0xffffffffffffffff		; CHECK-NEXT: movi v1.2d, #0xffffffffffffffff
; CHECK-NEXT: sshr v2.4s, v0.4s, #31		; CHECK-NEXT: cmlt v2.4s, v0.4s, #0
; CHECK-NEXT: cmgt v0.4s, v0.4s, v1.4s		; CHECK-NEXT: cmgt v0.4s, v0.4s, v1.4s
; CHECK-NEXT: orr v2.4s, #1		; CHECK-NEXT: orr v2.4s, #1
; CHECK-NEXT: xtn v0.4h, v0.4s		; CHECK-NEXT: xtn v0.4h, v0.4s
; CHECK-NEXT: str q2, [sp] // 16-byte Folded Spill		; CHECK-NEXT: str q2, [sp] // 16-byte Folded Spill
; CHECK-NEXT: bl use_4xi1		; CHECK-NEXT: bl use_4xi1
; CHECK-NEXT: ldr q0, [sp] // 16-byte Folded Reload		; CHECK-NEXT: ldr q0, [sp] // 16-byte Folded Reload
; CHECK-NEXT: ldr x30, [sp, #16] // 8-byte Folded Reload		; CHECK-NEXT: ldr x30, [sp, #16] // 8-byte Folded Reload
; CHECK-NEXT: add sp, sp, #32		; CHECK-NEXT: add sp, sp, #32
[20 lines not shown]
ret <4 x i32> %res		ret <4 x i32> %res
}		}

; First select operand breaks sign pattern.		; First select operand breaks sign pattern.
define <4 x i32> @not_sign_4xi32_2(<4 x i32> %a) {		define <4 x i32> @not_sign_4xi32_2(<4 x i32> %a) {
; CHECK-LABEL: not_sign_4xi32_2:		; CHECK-LABEL: not_sign_4xi32_2:
; CHECK: // %bb.0:		; CHECK: // %bb.0:
; CHECK-NEXT: adrp x8, .LCPI17_0		; CHECK-NEXT: adrp x8, .LCPI17_0
; CHECK-NEXT: sshr v0.4s, v0.4s, #31		; CHECK-NEXT: cmlt v0.4s, v0.4s, #0
; CHECK-NEXT: ldr q1, [x8, :lo12:.LCPI17_0]		; CHECK-NEXT: ldr q1, [x8, :lo12:.LCPI17_0]
; CHECK-NEXT: orr v0.16b, v0.16b, v1.16b		; CHECK-NEXT: orr v0.16b, v0.16b, v1.16b
; CHECK-NEXT: ret		; CHECK-NEXT: ret
%c = icmp sgt <4 x i32> %a, <i32 -1, i32 -1, i32 -1, i32 -1>		%c = icmp sgt <4 x i32> %a, <i32 -1, i32 -1, i32 -1, i32 -1>
%res = select <4 x i1> %c, <4 x i32> <i32 1, i32 1, i32 -1, i32 1>, <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>		%res = select <4 x i1> %c, <4 x i32> <i32 1, i32 1, i32 -1, i32 1>, <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>
ret <4 x i32> %res		ret <4 x i32> %res
}		}

[42 lines not shown]

llvm/test/CodeGen/AArch64/dag-numsignbits.ll

	[13 lines not shown]
	; CHECK-NEXT: ldr d1, [x8, :lo12:.LCPI0_0]			; CHECK-NEXT: ldr d1, [x8, :lo12:.LCPI0_0]
	; CHECK-NEXT: adrp x8, .LCPI0_1			; CHECK-NEXT: adrp x8, .LCPI0_1
	; CHECK-NEXT: add v0.4h, v0.4h, v1.4h			; CHECK-NEXT: add v0.4h, v0.4h, v1.4h
	; CHECK-NEXT: movi v1.4h, #1			; CHECK-NEXT: movi v1.4h, #1
	; CHECK-NEXT: cmgt v0.4h, v1.4h, v0.4h			; CHECK-NEXT: cmgt v0.4h, v1.4h, v0.4h
	; CHECK-NEXT: ldr d1, [x8, :lo12:.LCPI0_1]			; CHECK-NEXT: ldr d1, [x8, :lo12:.LCPI0_1]
	; CHECK-NEXT: and v0.8b, v0.8b, v1.8b			; CHECK-NEXT: and v0.8b, v0.8b, v1.8b
	; CHECK-NEXT: shl v0.4h, v0.4h, #15			; CHECK-NEXT: shl v0.4h, v0.4h, #15
	; CHECK-NEXT: sshr v0.4h, v0.4h, #15			; CHECK-NEXT: cmlt v0.4h, v0.4h, #0
	; CHECK-NEXT: umov w0, v0.h[0]			; CHECK-NEXT: umov w0, v0.h[0]
	; CHECK-NEXT: umov w3, v0.h[3]			; CHECK-NEXT: umov w3, v0.h[3]
	; CHECK-NEXT: b foo			; CHECK-NEXT: b foo
	%tmp3 = shufflevector <4 x i16> %a1, <4 x i16> undef, <4 x i32> zeroinitializer			%tmp3 = shufflevector <4 x i16> %a1, <4 x i16> undef, <4 x i32> zeroinitializer
	%tmp5 = add <4 x i16> %tmp3, <i16 18249, i16 6701, i16 -18744, i16 -25086>			%tmp5 = add <4 x i16> %tmp3, <i16 18249, i16 6701, i16 -18744, i16 -25086>
	%tmp6 = icmp slt <4 x i16> %tmp5, <i16 1, i16 1, i16 1, i16 1>			%tmp6 = icmp slt <4 x i16> %tmp5, <i16 1, i16 1, i16 1, i16 1>
	%tmp7 = and <4 x i1> %tmp6, <i1 true, i1 false, i1 false, i1 true>			%tmp7 = and <4 x i1> %tmp6, <i1 true, i1 false, i1 false, i1 true>
	%tmp8 = sext <4 x i1> %tmp7 to <4 x i16>			%tmp8 = sext <4 x i1> %tmp7 to <4 x i16>
	[13 lines not shown]

llvm/test/CodeGen/AArch64/div_minsize.ll

[29 lines not shown]
ret i32 %div		ret i32 %div
; CHECK-LABEL: testsize4		; CHECK-LABEL: testsize4
; CHECK: udiv		; CHECK: udiv
}		}

define <8 x i16> @sdiv_vec8x16_minsize(<8 x i16> %var) minsize {		define <8 x i16> @sdiv_vec8x16_minsize(<8 x i16> %var) minsize {
entry:		entry:
; CHECK: sdiv_vec8x16_minsize		; CHECK: sdiv_vec8x16_minsize
; CHECK: sshr v1.8h, v0.8h, #15		; CHECK: cmlt v1.8h, v0.8h, #0
; CHECK: usra v0.8h, v1.8h, #11		; CHECK: usra v0.8h, v1.8h, #11
; CHECK: sshr v0.8h, v0.8h, #5		; CHECK: sshr v0.8h, v0.8h, #5
; CHECK: ret		; CHECK: ret
%0 = sdiv <8 x i16> %var, <i16 32, i16 32, i16 32, i16 32, i16 32, i16 32, i16 32, i16 32>		%0 = sdiv <8 x i16> %var, <i16 32, i16 32, i16 32, i16 32, i16 32, i16 32, i16 32, i16 32>
ret <8 x i16> %0		ret <8 x i16> %0
}		}
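
The three-instruction sequence checked in `sdiv_vec8x16_minsize` can be read as "add 31 to negative lanes, then shift right by 5", which rounds the quotient toward zero as `sdiv` requires. A minimal Python model of one 16-bit lane follows, assuming the instruction semantics noted in the comments; the helper names are hypothetical.

```python
def wrap16(value):
    """Wrap an integer to a signed 16-bit lane."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def sdiv32_lane(x):
    """Model cmlt / usra #11 / sshr #5 on a single signed 16-bit lane."""
    mask = -1 if x < 0 else 0      # cmlt v1.8h, v0.8h, #0
    bias = (mask & 0xFFFF) >> 11   # logical shift inside usra: 31 if x < 0, else 0
    biased = wrap16(x + bias)      # usra v0.8h, v1.8h, #11 (shift, then accumulate)
    return wrap16(biased >> 5)     # sshr v0.8h, v0.8h, #5


def trunc_div(x, divisor):
    """Reference signed division that rounds toward zero."""
    quotient = abs(x) // divisor
    return -quotient if x < 0 else quotient
```

Comparing `sdiv32_lane(x)` with `trunc_div(x, 32)` over the whole 16-bit range shows that the mask from CMLT serves the same purpose as the one previously produced by `sshr #15`.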

llvm/test/CodeGen/AArch64/selectcc-to-shiftand.ll

[161 lines not shown]
ret i64 %shl		ret i64 %shl
}		}

define <16 x i8> @sel_shift_bool_v16i8(<16 x i1> %t) {		define <16 x i8> @sel_shift_bool_v16i8(<16 x i1> %t) {
; CHECK-LABEL: sel_shift_bool_v16i8:		; CHECK-LABEL: sel_shift_bool_v16i8:
; CHECK: // %bb.0:		; CHECK: // %bb.0:
; CHECK-NEXT: shl v0.16b, v0.16b, #7		; CHECK-NEXT: shl v0.16b, v0.16b, #7
; CHECK-NEXT: movi v1.16b, #128		; CHECK-NEXT: movi v1.16b, #128
; CHECK-NEXT: sshr v0.16b, v0.16b, #7		; CHECK-NEXT: cmlt v0.16b, v0.16b, #0
; CHECK-NEXT: and v0.16b, v0.16b, v1.16b		; CHECK-NEXT: and v0.16b, v0.16b, v1.16b
; CHECK-NEXT: ret		; CHECK-NEXT: ret
%shl = select <16 x i1> %t, <16 x i8> <i8 128, i8 128, i8 128, i8 128, i8 128, i8 128, i8 128, i8 128, i8 128, i8 128, i8 128, i8 128, i8 128, i8 128, i8 128, i8 128>, <16 x i8> zeroinitializer		%shl = select <16 x i1> %t, <16 x i8> <i8 128, i8 128, i8 128, i8 128, i8 128, i8 128, i8 128, i8 128, i8 128, i8 128, i8 128, i8 128, i8 128, i8 128, i8 128, i8 128>, <16 x i8> zeroinitializer
ret <16 x i8> %shl		ret <16 x i8> %shl
}		}

define <8 x i16> @sel_shift_bool_v8i16(<8 x i1> %t) {		define <8 x i16> @sel_shift_bool_v8i16(<8 x i1> %t) {
; CHECK-LABEL: sel_shift_bool_v8i16:		; CHECK-LABEL: sel_shift_bool_v8i16:
; CHECK: // %bb.0:		; CHECK: // %bb.0:
; CHECK-NEXT: ushll v0.8h, v0.8b, #0		; CHECK-NEXT: ushll v0.8h, v0.8b, #0
; CHECK-NEXT: movi v1.8h, #128		; CHECK-NEXT: movi v1.8h, #128
; CHECK-NEXT: shl v0.8h, v0.8h, #15		; CHECK-NEXT: shl v0.8h, v0.8h, #15
; CHECK-NEXT: sshr v0.8h, v0.8h, #15		; CHECK-NEXT: cmlt v0.8h, v0.8h, #0
; CHECK-NEXT: and v0.16b, v0.16b, v1.16b		; CHECK-NEXT: and v0.16b, v0.16b, v1.16b
; CHECK-NEXT: ret		; CHECK-NEXT: ret
%shl= select <8 x i1> %t, <8 x i16> <i16 128, i16 128, i16 128, i16 128, i16 128, i16 128, i16 128, i16 128>, <8 x i16> zeroinitializer		%shl= select <8 x i1> %t, <8 x i16> <i16 128, i16 128, i16 128, i16 128, i16 128, i16 128, i16 128, i16 128>, <8 x i16> zeroinitializer
ret <8 x i16> %shl		ret <8 x i16> %shl
}		}

define <4 x i32> @sel_shift_bool_v4i32(<4 x i1> %t) {		define <4 x i32> @sel_shift_bool_v4i32(<4 x i1> %t) {
; CHECK-LABEL: sel_shift_bool_v4i32:		; CHECK-LABEL: sel_shift_bool_v4i32:
; CHECK: // %bb.0:		; CHECK: // %bb.0:
; CHECK-NEXT: ushll v0.4s, v0.4h, #0		; CHECK-NEXT: ushll v0.4s, v0.4h, #0
; CHECK-NEXT: movi v1.4s, #64		; CHECK-NEXT: movi v1.4s, #64
; CHECK-NEXT: shl v0.4s, v0.4s, #31		; CHECK-NEXT: shl v0.4s, v0.4s, #31
; CHECK-NEXT: sshr v0.4s, v0.4s, #31		; CHECK-NEXT: cmlt v0.4s, v0.4s, #0
; CHECK-NEXT: and v0.16b, v0.16b, v1.16b		; CHECK-NEXT: and v0.16b, v0.16b, v1.16b
; CHECK-NEXT: ret		; CHECK-NEXT: ret
%shl = select <4 x i1> %t, <4 x i32> <i32 64, i32 64, i32 64, i32 64>, <4 x i32> zeroinitializer		%shl = select <4 x i1> %t, <4 x i32> <i32 64, i32 64, i32 64, i32 64>, <4 x i32> zeroinitializer
ret <4 x i32> %shl		ret <4 x i32> %shl
}		}

define <2 x i64> @sel_shift_bool_v2i64(<2 x i1> %t) {		define <2 x i64> @sel_shift_bool_v2i64(<2 x i1> %t) {
; CHECK-LABEL: sel_shift_bool_v2i64:		; CHECK-LABEL: sel_shift_bool_v2i64:
; CHECK: // %bb.0:		; CHECK: // %bb.0:
; CHECK-NEXT: ushll v0.2d, v0.2s, #0		; CHECK-NEXT: ushll v0.2d, v0.2s, #0
; CHECK-NEXT: mov w8, #65536		; CHECK-NEXT: mov w8, #65536
; CHECK-NEXT: dup v1.2d, x8		; CHECK-NEXT: dup v1.2d, x8
; CHECK-NEXT: shl v0.2d, v0.2d, #63		; CHECK-NEXT: shl v0.2d, v0.2d, #63
; CHECK-NEXT: sshr v0.2d, v0.2d, #63		; CHECK-NEXT: cmlt v0.2d, v0.2d, #0
; CHECK-NEXT: and v0.16b, v0.16b, v1.16b		; CHECK-NEXT: and v0.16b, v0.16b, v1.16b
; CHECK-NEXT: ret		; CHECK-NEXT: ret
%shl = select <2 x i1> %t, <2 x i64> <i64 65536, i64 65536>, <2 x i64> zeroinitializer		%shl = select <2 x i1> %t, <2 x i64> <i64 65536, i64 65536>, <2 x i64> zeroinitializer
ret <2 x i64> %shl		ret <2 x i64> %shl
}		}
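
The `sel_shift_bool_*` checks share one shape: move the boolean into the sign bit with `shl`, expand it to a full-lane mask (previously `sshr`, now `cmlt`), then `and` with the constant. A small Python model of one 8-bit lane follows. It assumes only bit 0 of the incoming lane is meaningful, and the helper names are hypothetical.

```python
def wrap8(value):
    """Wrap an integer to a signed 8-bit lane."""
    value &= 0xFF
    return value - 0x100 if value & 0x80 else value


def select_const_or_zero(lane, const):
    """Model shl #7 / cmlt #0 / and for one 8-bit lane."""
    shifted = wrap8(lane << 7)          # shl v0.16b, v0.16b, #7
    mask = -1 if shifted < 0 else 0     # cmlt v0.16b, v0.16b, #0
    return mask & const                 # and v0.16b, v0.16b, v1.16b


def matches_select(const):
    """Check the sequence against `const if bit 0 is set else 0` for every lane value."""
    return all(
        select_const_or_zero(lane, const) == (const if lane & 1 else 0)
        for lane in range(256)
    )
```

`matches_select(128)` covers the constant used in `sel_shift_bool_v16i8`; the wider element types follow the same pattern with shifts of 15, 31, and 63.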

llvm/test/CodeGen/AArch64/srem-seteq-vec-splat.ll

[178 lines not shown]
%ret = zext <4 x i1> %cmp to <4 x i32>		%ret = zext <4 x i1> %cmp to <4 x i32>
ret <4 x i32> %ret		ret <4 x i32> %ret
}		}

; We can lower remainder of division by powers of two much better elsewhere.		; We can lower remainder of division by powers of two much better elsewhere.
define <4 x i32> @test_srem_pow2(<4 x i32> %X) nounwind {		define <4 x i32> @test_srem_pow2(<4 x i32> %X) nounwind {
; CHECK-LABEL: test_srem_pow2:		; CHECK-LABEL: test_srem_pow2:
; CHECK: // %bb.0:		; CHECK: // %bb.0:
; CHECK-NEXT: sshr v2.4s, v0.4s, #31		; CHECK-NEXT: cmlt v2.4s, v0.4s, #0
; CHECK-NEXT: mov v3.16b, v0.16b		; CHECK-NEXT: mov v3.16b, v0.16b
; CHECK-NEXT: movi v1.4s, #1		; CHECK-NEXT: movi v1.4s, #1
; CHECK-NEXT: usra v3.4s, v2.4s, #28		; CHECK-NEXT: usra v3.4s, v2.4s, #28
; CHECK-NEXT: bic v3.4s, #15		; CHECK-NEXT: bic v3.4s, #15
; CHECK-NEXT: sub v0.4s, v0.4s, v3.4s		; CHECK-NEXT: sub v0.4s, v0.4s, v3.4s
; CHECK-NEXT: cmeq v0.4s, v0.4s, #0		; CHECK-NEXT: cmeq v0.4s, v0.4s, #0
; CHECK-NEXT: and v0.16b, v0.16b, v1.16b		; CHECK-NEXT: and v0.16b, v0.16b, v1.16b
; CHECK-NEXT: ret		; CHECK-NEXT: ret
%srem = srem <4 x i32> %X, <i32 16, i32 16, i32 16, i32 16>		%srem = srem <4 x i32> %X, <i32 16, i32 16, i32 16, i32 16>
%cmp = icmp eq <4 x i32> %srem, <i32 0, i32 0, i32 0, i32 0>		%cmp = icmp eq <4 x i32> %srem, <i32 0, i32 0, i32 0, i32 0>
%ret = zext <4 x i1> %cmp to <4 x i32>		%ret = zext <4 x i1> %cmp to <4 x i32>
ret <4 x i32> %ret		ret <4 x i32> %ret
}		}

; We could lower remainder of division by INT_MIN much better elsewhere.		; We could lower remainder of division by INT_MIN much better elsewhere.
define <4 x i32> @test_srem_int_min(<4 x i32> %X) nounwind {		define <4 x i32> @test_srem_int_min(<4 x i32> %X) nounwind {
; CHECK-LABEL: test_srem_int_min:		; CHECK-LABEL: test_srem_int_min:
; CHECK: // %bb.0:		; CHECK: // %bb.0:
; CHECK-NEXT: sshr v2.4s, v0.4s, #31		; CHECK-NEXT: cmlt v2.4s, v0.4s, #0
; CHECK-NEXT: mov v3.16b, v0.16b		; CHECK-NEXT: mov v3.16b, v0.16b
; CHECK-NEXT: movi v1.4s, #128, lsl #24		; CHECK-NEXT: movi v1.4s, #128, lsl #24
; CHECK-NEXT: usra v3.4s, v2.4s, #1		; CHECK-NEXT: usra v3.4s, v2.4s, #1
; CHECK-NEXT: and v1.16b, v3.16b, v1.16b		; CHECK-NEXT: and v1.16b, v3.16b, v1.16b
; CHECK-NEXT: sub v0.4s, v0.4s, v1.4s		; CHECK-NEXT: sub v0.4s, v0.4s, v1.4s
; CHECK-NEXT: movi v1.4s, #1		; CHECK-NEXT: movi v1.4s, #1
; CHECK-NEXT: cmeq v0.4s, v0.4s, #0		; CHECK-NEXT: cmeq v0.4s, v0.4s, #0
; CHECK-NEXT: and v0.16b, v0.16b, v1.16b		; CHECK-NEXT: and v0.16b, v0.16b, v1.16b
[...]

llvm/test/CodeGen/AArch64/sve-fixed-length-fp-vselect.ll

[...]
; Don't use SVE when its registers are no bigger than NEON.		; Don't use SVE when its registers are no bigger than NEON.
; NO_SVE-NOT: ptrue		; NO_SVE-NOT: ptrue

; Don't use SVE for 64-bit vectors.		; Don't use SVE for 64-bit vectors.
define <4 x half> @select_v4f16(<4 x half> %op1, <4 x half> %op2, <4 x i1> %mask) #0 {		define <4 x half> @select_v4f16(<4 x half> %op1, <4 x half> %op2, <4 x i1> %mask) #0 {
; CHECK-LABEL: select_v4f16:		; CHECK-LABEL: select_v4f16:
; CHECK: // %bb.0:		; CHECK: // %bb.0:
; CHECK-NEXT: shl v2.4h, v2.4h, #15		; CHECK-NEXT: shl v2.4h, v2.4h, #15
; CHECK-NEXT: sshr v2.4h, v2.4h, #15		; CHECK-NEXT: cmlt v2.4h, v2.4h, #0
; CHECK-NEXT: bif v0.8b, v1.8b, v2.8b		; CHECK-NEXT: bif v0.8b, v1.8b, v2.8b
; CHECK-NEXT: ret		; CHECK-NEXT: ret
%sel = select <4 x i1> %mask, <4 x half> %op1, <4 x half> %op2		%sel = select <4 x i1> %mask, <4 x half> %op1, <4 x half> %op2
ret <4 x half> %sel		ret <4 x half> %sel
}		}

; Don't use SVE for 128-bit vectors.		; Don't use SVE for 128-bit vectors.
define <8 x half> @select_v8f16(<8 x half> %op1, <8 x half> %op2, <8 x i1> %mask) #0 {		define <8 x half> @select_v8f16(<8 x half> %op1, <8 x half> %op2, <8 x i1> %mask) #0 {
; CHECK-LABEL: select_v8f16:		; CHECK-LABEL: select_v8f16:
; CHECK: // %bb.0:		; CHECK: // %bb.0:
; CHECK-NEXT: ushll v2.8h, v2.8b, #0		; CHECK-NEXT: ushll v2.8h, v2.8b, #0
; CHECK-NEXT: shl v2.8h, v2.8h, #15		; CHECK-NEXT: shl v2.8h, v2.8h, #15
; CHECK-NEXT: sshr v2.8h, v2.8h, #15		; CHECK-NEXT: cmlt v2.8h, v2.8h, #0
; CHECK-NEXT: bif v0.16b, v1.16b, v2.16b		; CHECK-NEXT: bif v0.16b, v1.16b, v2.16b
; CHECK-NEXT: ret		; CHECK-NEXT: ret
%sel = select <8 x i1> %mask, <8 x half> %op1, <8 x half> %op2		%sel = select <8 x i1> %mask, <8 x half> %op1, <8 x half> %op2
ret <8 x half> %sel		ret <8 x half> %sel
}		}

define void @select_v16f16(<16 x half>* %a, <16 x half>* %b, <16 x i1>* %c) #0 {		define void @select_v16f16(<16 x half>* %a, <16 x half>* %b, <16 x i1>* %c) #0 {
; CHECK-LABEL: select_v16f16:		; CHECK-LABEL: select_v16f16:
[...]
; VBITS_GE_2048-NEXT: ret
ret void		ret void
}		}

; Don't use SVE for 64-bit vectors.		; Don't use SVE for 64-bit vectors.
define <2 x float> @select_v2f32(<2 x float> %op1, <2 x float> %op2, <2 x i1> %mask) #0 {		define <2 x float> @select_v2f32(<2 x float> %op1, <2 x float> %op2, <2 x i1> %mask) #0 {
; CHECK-LABEL: select_v2f32:		; CHECK-LABEL: select_v2f32:
; CHECK: // %bb.0:		; CHECK: // %bb.0:
; CHECK-NEXT: shl v2.2s, v2.2s, #31		; CHECK-NEXT: shl v2.2s, v2.2s, #31
; CHECK-NEXT: sshr v2.2s, v2.2s, #31		; CHECK-NEXT: cmlt v2.2s, v2.2s, #0
; CHECK-NEXT: bif v0.8b, v1.8b, v2.8b		; CHECK-NEXT: bif v0.8b, v1.8b, v2.8b
; CHECK-NEXT: ret		; CHECK-NEXT: ret
%sel = select <2 x i1> %mask, <2 x float> %op1, <2 x float> %op2		%sel = select <2 x i1> %mask, <2 x float> %op1, <2 x float> %op2
ret <2 x float> %sel		ret <2 x float> %sel
}		}

; Don't use SVE for 128-bit vectors.		; Don't use SVE for 128-bit vectors.
define <4 x float> @select_v4f32(<4 x float> %op1, <4 x float> %op2, <4 x i1> %mask) #0 {		define <4 x float> @select_v4f32(<4 x float> %op1, <4 x float> %op2, <4 x i1> %mask) #0 {
; CHECK-LABEL: select_v4f32:		; CHECK-LABEL: select_v4f32:
; CHECK: // %bb.0:		; CHECK: // %bb.0:
; CHECK-NEXT: ushll v2.4s, v2.4h, #0		; CHECK-NEXT: ushll v2.4s, v2.4h, #0
; CHECK-NEXT: shl v2.4s, v2.4s, #31		; CHECK-NEXT: shl v2.4s, v2.4s, #31
; CHECK-NEXT: sshr v2.4s, v2.4s, #31		; CHECK-NEXT: cmlt v2.4s, v2.4s, #0
; CHECK-NEXT: bif v0.16b, v1.16b, v2.16b		; CHECK-NEXT: bif v0.16b, v1.16b, v2.16b
; CHECK-NEXT: ret		; CHECK-NEXT: ret
%sel = select <4 x i1> %mask, <4 x float> %op1, <4 x float> %op2		%sel = select <4 x i1> %mask, <4 x float> %op1, <4 x float> %op2
ret <4 x float> %sel		ret <4 x float> %sel
}		}

define void @select_v8f32(<8 x float>* %a, <8 x float>* %b, <8 x i1>* %c) #0 {		define void @select_v8f32(<8 x float>* %a, <8 x float>* %b, <8 x i1>* %c) #0 {
; CHECK-LABEL: select_v8f32:		; CHECK-LABEL: select_v8f32:
[...]
}		}

; Don't use SVE for 128-bit vectors.		; Don't use SVE for 128-bit vectors.
define <2 x double> @select_v2f64(<2 x double> %op1, <2 x double> %op2, <2 x i1> %mask) #0 {		define <2 x double> @select_v2f64(<2 x double> %op1, <2 x double> %op2, <2 x i1> %mask) #0 {
; CHECK-LABEL: select_v2f64:		; CHECK-LABEL: select_v2f64:
; CHECK: // %bb.0:		; CHECK: // %bb.0:
; CHECK-NEXT: ushll v2.2d, v2.2s, #0		; CHECK-NEXT: ushll v2.2d, v2.2s, #0
; CHECK-NEXT: shl v2.2d, v2.2d, #63		; CHECK-NEXT: shl v2.2d, v2.2d, #63
; CHECK-NEXT: sshr v2.2d, v2.2d, #63		; CHECK-NEXT: cmlt v2.2d, v2.2d, #0
; CHECK-NEXT: bif v0.16b, v1.16b, v2.16b		; CHECK-NEXT: bif v0.16b, v1.16b, v2.16b
; CHECK-NEXT: ret		; CHECK-NEXT: ret
%sel = select <2 x i1> %mask, <2 x double> %op1, <2 x double> %op2		%sel = select <2 x i1> %mask, <2 x double> %op1, <2 x double> %op2
ret <2 x double> %sel		ret <2 x double> %sel
}		}

define void @select_v4f64(<4 x double>* %a, <4 x double>* %b, <4 x i1>* %c) #0 {		define void @select_v4f64(<4 x double>* %a, <4 x double>* %b, <4 x i1>* %c) #0 {
; CHECK-LABEL: select_v4f64:		; CHECK-LABEL: select_v4f64:
[...]

llvm/test/CodeGen/AArch64/sve-fixed-length-int-vselect.ll

[...]
; Don't use SVE when its registers are no bigger than NEON.		; Don't use SVE when its registers are no bigger than NEON.
; NO_SVE-NOT: ptrue		; NO_SVE-NOT: ptrue

; Don't use SVE for 64-bit vectors.		; Don't use SVE for 64-bit vectors.
define <8 x i8> @select_v8i8(<8 x i8> %op1, <8 x i8> %op2, <8 x i1> %mask) #0 {		define <8 x i8> @select_v8i8(<8 x i8> %op1, <8 x i8> %op2, <8 x i1> %mask) #0 {
; CHECK-LABEL: select_v8i8:		; CHECK-LABEL: select_v8i8:
; CHECK: // %bb.0:		; CHECK: // %bb.0:
; CHECK-NEXT: shl v2.8b, v2.8b, #7		; CHECK-NEXT: shl v2.8b, v2.8b, #7
; CHECK-NEXT: sshr v2.8b, v2.8b, #7		; CHECK-NEXT: cmlt v2.8b, v2.8b, #0
; CHECK-NEXT: bif v0.8b, v1.8b, v2.8b		; CHECK-NEXT: bif v0.8b, v1.8b, v2.8b
; CHECK-NEXT: ret		; CHECK-NEXT: ret
%sel = select <8 x i1> %mask, <8 x i8> %op1, <8 x i8> %op2		%sel = select <8 x i1> %mask, <8 x i8> %op1, <8 x i8> %op2
ret <8 x i8> %sel		ret <8 x i8> %sel
}		}

; Don't use SVE for 128-bit vectors.		; Don't use SVE for 128-bit vectors.
define <16 x i8> @select_v16i8(<16 x i8> %op1, <16 x i8> %op2, <16 x i1> %mask) #0 {		define <16 x i8> @select_v16i8(<16 x i8> %op1, <16 x i8> %op2, <16 x i1> %mask) #0 {
; CHECK-LABEL: select_v16i8:		; CHECK-LABEL: select_v16i8:
; CHECK: // %bb.0:		; CHECK: // %bb.0:
; CHECK-NEXT: shl v2.16b, v2.16b, #7		; CHECK-NEXT: shl v2.16b, v2.16b, #7
; CHECK-NEXT: sshr v2.16b, v2.16b, #7		; CHECK-NEXT: cmlt v2.16b, v2.16b, #0
; CHECK-NEXT: bif v0.16b, v1.16b, v2.16b		; CHECK-NEXT: bif v0.16b, v1.16b, v2.16b
; CHECK-NEXT: ret		; CHECK-NEXT: ret
%sel = select <16 x i1> %mask, <16 x i8> %op1, <16 x i8> %op2		%sel = select <16 x i1> %mask, <16 x i8> %op1, <16 x i8> %op2
ret <16 x i8> %sel		ret <16 x i8> %sel
}		}

define void @select_v32i8(<32 x i8>* %a, <32 x i8>* %b, <32 x i1>* %c) #0 {		define void @select_v32i8(<32 x i8>* %a, <32 x i8>* %b, <32 x i1>* %c) #0 {
; CHECK-LABEL: select_v32i8:		; CHECK-LABEL: select_v32i8:
[...]
; VBITS_GE_2048-NEXT: ret
ret void		ret void
}		}

; Don't use SVE for 64-bit vectors.		; Don't use SVE for 64-bit vectors.
define <4 x i16> @select_v4i16(<4 x i16> %op1, <4 x i16> %op2, <4 x i1> %mask) #0 {		define <4 x i16> @select_v4i16(<4 x i16> %op1, <4 x i16> %op2, <4 x i1> %mask) #0 {
; CHECK-LABEL: select_v4i16:		; CHECK-LABEL: select_v4i16:
; CHECK: // %bb.0:		; CHECK: // %bb.0:
; CHECK-NEXT: shl v2.4h, v2.4h, #15		; CHECK-NEXT: shl v2.4h, v2.4h, #15
; CHECK-NEXT: sshr v2.4h, v2.4h, #15		; CHECK-NEXT: cmlt v2.4h, v2.4h, #0
; CHECK-NEXT: bif v0.8b, v1.8b, v2.8b		; CHECK-NEXT: bif v0.8b, v1.8b, v2.8b
; CHECK-NEXT: ret		; CHECK-NEXT: ret
%sel = select <4 x i1> %mask, <4 x i16> %op1, <4 x i16> %op2		%sel = select <4 x i1> %mask, <4 x i16> %op1, <4 x i16> %op2
ret <4 x i16> %sel		ret <4 x i16> %sel
}		}

; Don't use SVE for 128-bit vectors.		; Don't use SVE for 128-bit vectors.
define <8 x i16> @select_v8i16(<8 x i16> %op1, <8 x i16> %op2, <8 x i1> %mask) #0 {		define <8 x i16> @select_v8i16(<8 x i16> %op1, <8 x i16> %op2, <8 x i1> %mask) #0 {
; CHECK-LABEL: select_v8i16:		; CHECK-LABEL: select_v8i16:
; CHECK: // %bb.0:		; CHECK: // %bb.0:
; CHECK-NEXT: ushll v2.8h, v2.8b, #0		; CHECK-NEXT: ushll v2.8h, v2.8b, #0
; CHECK-NEXT: shl v2.8h, v2.8h, #15		; CHECK-NEXT: shl v2.8h, v2.8h, #15
; CHECK-NEXT: sshr v2.8h, v2.8h, #15		; CHECK-NEXT: cmlt v2.8h, v2.8h, #0
; CHECK-NEXT: bif v0.16b, v1.16b, v2.16b		; CHECK-NEXT: bif v0.16b, v1.16b, v2.16b
; CHECK-NEXT: ret		; CHECK-NEXT: ret
%sel = select <8 x i1> %mask, <8 x i16> %op1, <8 x i16> %op2		%sel = select <8 x i1> %mask, <8 x i16> %op1, <8 x i16> %op2
ret <8 x i16> %sel		ret <8 x i16> %sel
}		}

define void @select_v16i16(<16 x i16>* %a, <16 x i16>* %b, <16 x i1>* %c) #0 {		define void @select_v16i16(<16 x i16>* %a, <16 x i16>* %b, <16 x i1>* %c) #0 {
; CHECK-LABEL: select_v16i16:		; CHECK-LABEL: select_v16i16:
[...]
; VBITS_GE_2048-NEXT: ret
ret void		ret void
}		}

; Don't use SVE for 64-bit vectors.		; Don't use SVE for 64-bit vectors.
define <2 x i32> @select_v2i32(<2 x i32> %op1, <2 x i32> %op2, <2 x i1> %mask) #0 {		define <2 x i32> @select_v2i32(<2 x i32> %op1, <2 x i32> %op2, <2 x i1> %mask) #0 {
; CHECK-LABEL: select_v2i32:		; CHECK-LABEL: select_v2i32:
; CHECK: // %bb.0:		; CHECK: // %bb.0:
; CHECK-NEXT: shl v2.2s, v2.2s, #31		; CHECK-NEXT: shl v2.2s, v2.2s, #31
; CHECK-NEXT: sshr v2.2s, v2.2s, #31		; CHECK-NEXT: cmlt v2.2s, v2.2s, #0
; CHECK-NEXT: bif v0.8b, v1.8b, v2.8b		; CHECK-NEXT: bif v0.8b, v1.8b, v2.8b
; CHECK-NEXT: ret		; CHECK-NEXT: ret
%sel = select <2 x i1> %mask, <2 x i32> %op1, <2 x i32> %op2		%sel = select <2 x i1> %mask, <2 x i32> %op1, <2 x i32> %op2
ret <2 x i32> %sel		ret <2 x i32> %sel
}		}

; Don't use SVE for 128-bit vectors.		; Don't use SVE for 128-bit vectors.
define <4 x i32> @select_v4i32(<4 x i32> %op1, <4 x i32> %op2, <4 x i1> %mask) #0 {		define <4 x i32> @select_v4i32(<4 x i32> %op1, <4 x i32> %op2, <4 x i1> %mask) #0 {
; CHECK-LABEL: select_v4i32:		; CHECK-LABEL: select_v4i32:
; CHECK: // %bb.0:		; CHECK: // %bb.0:
; CHECK-NEXT: ushll v2.4s, v2.4h, #0		; CHECK-NEXT: ushll v2.4s, v2.4h, #0
; CHECK-NEXT: shl v2.4s, v2.4s, #31		; CHECK-NEXT: shl v2.4s, v2.4s, #31
; CHECK-NEXT: sshr v2.4s, v2.4s, #31		; CHECK-NEXT: cmlt v2.4s, v2.4s, #0
; CHECK-NEXT: bif v0.16b, v1.16b, v2.16b		; CHECK-NEXT: bif v0.16b, v1.16b, v2.16b
; CHECK-NEXT: ret		; CHECK-NEXT: ret
%sel = select <4 x i1> %mask, <4 x i32> %op1, <4 x i32> %op2		%sel = select <4 x i1> %mask, <4 x i32> %op1, <4 x i32> %op2
ret <4 x i32> %sel		ret <4 x i32> %sel
}		}

define void @select_v8i32(<8 x i32>* %a, <8 x i32>* %b, <8 x i1>* %c) #0 {		define void @select_v8i32(<8 x i32>* %a, <8 x i32>* %b, <8 x i1>* %c) #0 {
; CHECK-LABEL: select_v8i32:		; CHECK-LABEL: select_v8i32:
[...]
}		}

; Don't use SVE for 128-bit vectors.		; Don't use SVE for 128-bit vectors.
define <2 x i64> @select_v2i64(<2 x i64> %op1, <2 x i64> %op2, <2 x i1> %mask) #0 {		define <2 x i64> @select_v2i64(<2 x i64> %op1, <2 x i64> %op2, <2 x i1> %mask) #0 {
; CHECK-LABEL: select_v2i64:		; CHECK-LABEL: select_v2i64:
; CHECK: // %bb.0:		; CHECK: // %bb.0:
; CHECK-NEXT: ushll v2.2d, v2.2s, #0		; CHECK-NEXT: ushll v2.2d, v2.2s, #0
; CHECK-NEXT: shl v2.2d, v2.2d, #63		; CHECK-NEXT: shl v2.2d, v2.2d, #63
; CHECK-NEXT: sshr v2.2d, v2.2d, #63		; CHECK-NEXT: cmlt v2.2d, v2.2d, #0
; CHECK-NEXT: bif v0.16b, v1.16b, v2.16b		; CHECK-NEXT: bif v0.16b, v1.16b, v2.16b
; CHECK-NEXT: ret		; CHECK-NEXT: ret
%sel = select <2 x i1> %mask, <2 x i64> %op1, <2 x i64> %op2		%sel = select <2 x i1> %mask, <2 x i64> %op1, <2 x i64> %op2
ret <2 x i64> %sel		ret <2 x i64> %sel
}		}

define void @select_v4i64(<4 x i64>* %a, <4 x i64>* %b, <4 x i1>* %c) #0 {		define void @select_v4i64(<4 x i64>* %a, <4 x i64>* %b, <4 x i1>* %c) #0 {
; CHECK-LABEL: select_v4i64:		; CHECK-LABEL: select_v4i64:
[...]

llvm/test/CodeGen/AArch64/sve-fixed-length-masked-gather.ll

	[...]
	; CHECK-NEXT: shl v1.2s, v1.2s, #16			; CHECK-NEXT: shl v1.2s, v1.2s, #16
	; CHECK-NEXT: sshr v1.2s, v1.2s, #16			; CHECK-NEXT: sshr v1.2s, v1.2s, #16
	; CHECK-NEXT: fmov w8, s1			; CHECK-NEXT: fmov w8, s1
	; CHECK-NEXT: mov w9, v1.s[1]			; CHECK-NEXT: mov w9, v1.s[1]
	; CHECK-NEXT: ldr q1, [x1]			; CHECK-NEXT: ldr q1, [x1]
	; CHECK-NEXT: mov v0.h[0], w8			; CHECK-NEXT: mov v0.h[0], w8
	; CHECK-NEXT: mov v0.h[1], w9			; CHECK-NEXT: mov v0.h[1], w9
	; CHECK-NEXT: shl v0.4h, v0.4h, #15			; CHECK-NEXT: shl v0.4h, v0.4h, #15
	; CHECK-NEXT: sshr v0.4h, v0.4h, #15			; CHECK-NEXT: cmlt v0.4h, v0.4h, #0
	; CHECK-NEXT: sunpklo z0.s, z0.h			; CHECK-NEXT: sunpklo z0.s, z0.h
	; CHECK-NEXT: sunpklo z0.d, z0.s			; CHECK-NEXT: sunpklo z0.d, z0.s
	; CHECK-NEXT: cmpne p0.d, p0/z, z0.d, #0			; CHECK-NEXT: cmpne p0.d, p0/z, z0.d, #0
	; CHECK-NEXT: ld1h { z0.d }, p0/z, [z1.d]			; CHECK-NEXT: ld1h { z0.d }, p0/z, [z1.d]
	; CHECK-NEXT: uzp1 z0.s, z0.s, z0.s			; CHECK-NEXT: uzp1 z0.s, z0.s, z0.s
	; CHECK-NEXT: uzp1 z0.h, z0.h, z0.h			; CHECK-NEXT: uzp1 z0.h, z0.h, z0.h
	; CHECK-NEXT: str s0, [x0]			; CHECK-NEXT: str s0, [x0]
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	[...]

llvm/test/CodeGen/AArch64/sve-fixed-length-masked-loads.ll

	[...]
	; CHECK-NEXT: mov v1.s[1], w9			; CHECK-NEXT: mov v1.s[1], w9
	; CHECK-NEXT: shl v1.2s, v1.2s, #16			; CHECK-NEXT: shl v1.2s, v1.2s, #16
	; CHECK-NEXT: sshr v1.2s, v1.2s, #16			; CHECK-NEXT: sshr v1.2s, v1.2s, #16
	; CHECK-NEXT: fmov w8, s1			; CHECK-NEXT: fmov w8, s1
	; CHECK-NEXT: mov w9, v1.s[1]			; CHECK-NEXT: mov w9, v1.s[1]
	; CHECK-NEXT: mov v0.h[0], w8			; CHECK-NEXT: mov v0.h[0], w8
	; CHECK-NEXT: mov v0.h[1], w9			; CHECK-NEXT: mov v0.h[1], w9
	; CHECK-NEXT: shl v0.4h, v0.4h, #15			; CHECK-NEXT: shl v0.4h, v0.4h, #15
	; CHECK-NEXT: sshr v0.4h, v0.4h, #15			; CHECK-NEXT: cmlt v0.4h, v0.4h, #0
	; CHECK-NEXT: cmpne p0.h, p0/z, z0.h, #0			; CHECK-NEXT: cmpne p0.h, p0/z, z0.h, #0
	; CHECK-NEXT: ld1h { z0.h }, p0/z, [x0]			; CHECK-NEXT: ld1h { z0.h }, p0/z, [x0]
	; CHECK-NEXT: // kill: def $d0 killed $d0 killed $z0			; CHECK-NEXT: // kill: def $d0 killed $d0 killed $z0
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	%a = load <2 x half>, <2 x half>* %ap			%a = load <2 x half>, <2 x half>* %ap
	%b = load <2 x half>, <2 x half>* %bp			%b = load <2 x half>, <2 x half>* %bp
	%mask = fcmp oeq <2 x half> %a, %b			%mask = fcmp oeq <2 x half> %a, %b
	%load = call <2 x half> @llvm.masked.load.v2f16(<2 x half>* %ap, i32 8, <2 x i1> %mask, <2 x half> zeroinitializer)			%load = call <2 x half> @llvm.masked.load.v2f16(<2 x half>* %ap, i32 8, <2 x i1> %mask, <2 x half> zeroinitializer)
	[...]

llvm/test/CodeGen/AArch64/sve-fixed-length-masked-scatter.ll

	[...]
	; CHECK-NEXT: shl v2.2s, v2.2s, #16			; CHECK-NEXT: shl v2.2s, v2.2s, #16
	; CHECK-NEXT: sshr v2.2s, v2.2s, #16			; CHECK-NEXT: sshr v2.2s, v2.2s, #16
	; CHECK-NEXT: fmov w8, s2			; CHECK-NEXT: fmov w8, s2
	; CHECK-NEXT: mov w9, v2.s[1]			; CHECK-NEXT: mov w9, v2.s[1]
	; CHECK-NEXT: ldr q2, [x1]			; CHECK-NEXT: ldr q2, [x1]
	; CHECK-NEXT: mov v0.h[0], w8			; CHECK-NEXT: mov v0.h[0], w8
	; CHECK-NEXT: mov v0.h[1], w9			; CHECK-NEXT: mov v0.h[1], w9
	; CHECK-NEXT: shl v0.4h, v0.4h, #15			; CHECK-NEXT: shl v0.4h, v0.4h, #15
	; CHECK-NEXT: sshr v0.4h, v0.4h, #15			; CHECK-NEXT: cmlt v0.4h, v0.4h, #0
	; CHECK-NEXT: sunpklo z0.s, z0.h			; CHECK-NEXT: sunpklo z0.s, z0.h
	; CHECK-NEXT: sunpklo z0.d, z0.s			; CHECK-NEXT: sunpklo z0.d, z0.s
	; CHECK-NEXT: cmpne p0.d, p0/z, z0.d, #0			; CHECK-NEXT: cmpne p0.d, p0/z, z0.d, #0
	; CHECK-NEXT: uunpklo z0.d, z1.s			; CHECK-NEXT: uunpklo z0.d, z1.s
	; CHECK-NEXT: st1h { z0.d }, p0, [z2.d]			; CHECK-NEXT: st1h { z0.d }, p0, [z2.d]
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	%vals = load <2 x half>, <2 x half>* %a			%vals = load <2 x half>, <2 x half>* %a
	%ptrs = load <2 x half>, <2 x half>* %b			%ptrs = load <2 x half>, <2 x half>* %b
	[...]

llvm/test/CodeGen/AArch64/sve-fixed-length-masked-stores.ll

	[...]
	; CHECK-NEXT: mov v2.s[1], w9			; CHECK-NEXT: mov v2.s[1], w9
	; CHECK-NEXT: shl v2.2s, v2.2s, #16			; CHECK-NEXT: shl v2.2s, v2.2s, #16
	; CHECK-NEXT: sshr v2.2s, v2.2s, #16			; CHECK-NEXT: sshr v2.2s, v2.2s, #16
	; CHECK-NEXT: fmov w8, s2			; CHECK-NEXT: fmov w8, s2
	; CHECK-NEXT: mov w9, v2.s[1]			; CHECK-NEXT: mov w9, v2.s[1]
	; CHECK-NEXT: mov v0.h[0], w8			; CHECK-NEXT: mov v0.h[0], w8
	; CHECK-NEXT: mov v0.h[1], w9			; CHECK-NEXT: mov v0.h[1], w9
	; CHECK-NEXT: shl v0.4h, v0.4h, #15			; CHECK-NEXT: shl v0.4h, v0.4h, #15
	; CHECK-NEXT: sshr v0.4h, v0.4h, #15			; CHECK-NEXT: cmlt v0.4h, v0.4h, #0
	; CHECK-NEXT: cmpne p0.h, p0/z, z0.h, #0			; CHECK-NEXT: cmpne p0.h, p0/z, z0.h, #0
	; CHECK-NEXT: st1h { z1.h }, p0, [x1]			; CHECK-NEXT: st1h { z1.h }, p0, [x1]
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	%a = load <2 x half>, <2 x half>* %ap			%a = load <2 x half>, <2 x half>* %ap
	%b = load <2 x half>, <2 x half>* %bp			%b = load <2 x half>, <2 x half>* %bp
	%mask = fcmp oeq <2 x half> %a, %b			%mask = fcmp oeq <2 x half> %a, %b
	call void @llvm.masked.store.v2f16(<2 x half> %a, <2 x half>* %bp, i32 8, <2 x i1> %mask)			call void @llvm.masked.store.v2f16(<2 x half> %a, <2 x half>* %bp, i32 8, <2 x i1> %mask)
	ret void			ret void
	[...]

llvm/test/CodeGen/AArch64/vec_uaddo.ll

	[...]
	; CHECK-NEXT: ushll v2.4s, v2.4h, #0			; CHECK-NEXT: ushll v2.4s, v2.4h, #0
	; CHECK-NEXT: zip1 v3.8b, v1.8b, v0.8b			; CHECK-NEXT: zip1 v3.8b, v1.8b, v0.8b
	; CHECK-NEXT: zip2 v1.8b, v1.8b, v0.8b			; CHECK-NEXT: zip2 v1.8b, v1.8b, v0.8b
	; CHECK-NEXT: ushll v0.4s, v0.4h, #0			; CHECK-NEXT: ushll v0.4s, v0.4h, #0
	; CHECK-NEXT: shl v2.4s, v2.4s, #31			; CHECK-NEXT: shl v2.4s, v2.4s, #31
	; CHECK-NEXT: ushll v3.4s, v3.4h, #0			; CHECK-NEXT: ushll v3.4s, v3.4h, #0
	; CHECK-NEXT: ushll v1.4s, v1.4h, #0			; CHECK-NEXT: ushll v1.4s, v1.4h, #0
	; CHECK-NEXT: shl v5.4s, v0.4s, #31			; CHECK-NEXT: shl v5.4s, v0.4s, #31
	; CHECK-NEXT: sshr v0.4s, v2.4s, #31			; CHECK-NEXT: cmlt v0.4s, v2.4s, #0
	; CHECK-NEXT: shl v3.4s, v3.4s, #31			; CHECK-NEXT: shl v3.4s, v3.4s, #31
	; CHECK-NEXT: shl v6.4s, v1.4s, #31			; CHECK-NEXT: shl v6.4s, v1.4s, #31
	; CHECK-NEXT: sshr v1.4s, v5.4s, #31			; CHECK-NEXT: cmlt v1.4s, v5.4s, #0
	; CHECK-NEXT: sshr v2.4s, v3.4s, #31			; CHECK-NEXT: cmlt v2.4s, v3.4s, #0
	; CHECK-NEXT: sshr v3.4s, v6.4s, #31			; CHECK-NEXT: cmlt v3.4s, v6.4s, #0
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	%t = call {<16 x i8>, <16 x i1>} @llvm.uadd.with.overflow.v16i8(<16 x i8> %a0, <16 x i8> %a1)			%t = call {<16 x i8>, <16 x i1>} @llvm.uadd.with.overflow.v16i8(<16 x i8> %a0, <16 x i8> %a1)
	%val = extractvalue {<16 x i8>, <16 x i1>} %t, 0			%val = extractvalue {<16 x i8>, <16 x i1>} %t, 0
	%obit = extractvalue {<16 x i8>, <16 x i1>} %t, 1			%obit = extractvalue {<16 x i8>, <16 x i1>} %t, 1
	%res = sext <16 x i1> %obit to <16 x i32>			%res = sext <16 x i1> %obit to <16 x i32>
	store <16 x i8> %val, <16 x i8>* %p2			store <16 x i8> %val, <16 x i8>* %p2
	ret <16 x i32> %res			ret <16 x i32> %res
	}			}

	define <8 x i32> @uaddo_v8i16(<8 x i16> %a0, <8 x i16> %a1, <8 x i16>* %p2) nounwind {			define <8 x i32> @uaddo_v8i16(<8 x i16> %a0, <8 x i16> %a1, <8 x i16>* %p2) nounwind {
	; CHECK-LABEL: uaddo_v8i16:			; CHECK-LABEL: uaddo_v8i16:
	; CHECK: // %bb.0:			; CHECK: // %bb.0:
	; CHECK-NEXT: add v2.8h, v0.8h, v1.8h			; CHECK-NEXT: add v2.8h, v0.8h, v1.8h
	; CHECK-NEXT: cmhi v0.8h, v0.8h, v2.8h			; CHECK-NEXT: cmhi v0.8h, v0.8h, v2.8h
	; CHECK-NEXT: str q2, [x0]			; CHECK-NEXT: str q2, [x0]
	; CHECK-NEXT: xtn v0.8b, v0.8h			; CHECK-NEXT: xtn v0.8b, v0.8h
	; CHECK-NEXT: zip1 v1.8b, v0.8b, v0.8b			; CHECK-NEXT: zip1 v1.8b, v0.8b, v0.8b
	; CHECK-NEXT: zip2 v0.8b, v0.8b, v0.8b			; CHECK-NEXT: zip2 v0.8b, v0.8b, v0.8b
	; CHECK-NEXT: ushll v1.4s, v1.4h, #0			; CHECK-NEXT: ushll v1.4s, v1.4h, #0
	; CHECK-NEXT: ushll v0.4s, v0.4h, #0			; CHECK-NEXT: ushll v0.4s, v0.4h, #0
	; CHECK-NEXT: shl v1.4s, v1.4s, #31			; CHECK-NEXT: shl v1.4s, v1.4s, #31
	; CHECK-NEXT: shl v3.4s, v0.4s, #31			; CHECK-NEXT: shl v3.4s, v0.4s, #31
	; CHECK-NEXT: sshr v0.4s, v1.4s, #31			; CHECK-NEXT: cmlt v0.4s, v1.4s, #0
	; CHECK-NEXT: sshr v1.4s, v3.4s, #31			; CHECK-NEXT: cmlt v1.4s, v3.4s, #0
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	%t = call {<8 x i16>, <8 x i1>} @llvm.uadd.with.overflow.v8i16(<8 x i16> %a0, <8 x i16> %a1)			%t = call {<8 x i16>, <8 x i1>} @llvm.uadd.with.overflow.v8i16(<8 x i16> %a0, <8 x i16> %a1)
	%val = extractvalue {<8 x i16>, <8 x i1>} %t, 0			%val = extractvalue {<8 x i16>, <8 x i1>} %t, 0
	%obit = extractvalue {<8 x i16>, <8 x i1>} %t, 1			%obit = extractvalue {<8 x i16>, <8 x i1>} %t, 1
	%res = sext <8 x i1> %obit to <8 x i32>			%res = sext <8 x i1> %obit to <8 x i32>
	store <8 x i16> %val, <8 x i16>* %p2			store <8 x i16> %val, <8 x i16>* %p2
	ret <8 x i32> %res			ret <8 x i32> %res
	}			}
	[...]
	; CHECK-NEXT: cset w14, lo			; CHECK-NEXT: cset w14, lo
	; CHECK-NEXT: csel w13, w13, w14, eq			; CHECK-NEXT: csel w13, w13, w14, eq
	; CHECK-NEXT: fmov s0, w13			; CHECK-NEXT: fmov s0, w13
	; CHECK-NEXT: mov v0.s[1], w10			; CHECK-NEXT: mov v0.s[1], w10
	; CHECK-NEXT: ldr x10, [sp]			; CHECK-NEXT: ldr x10, [sp]
	; CHECK-NEXT: stp x8, x9, [x10, #16]			; CHECK-NEXT: stp x8, x9, [x10, #16]
	; CHECK-NEXT: shl v0.2s, v0.2s, #31			; CHECK-NEXT: shl v0.2s, v0.2s, #31
	; CHECK-NEXT: stp x11, x12, [x10]			; CHECK-NEXT: stp x11, x12, [x10]
	; CHECK-NEXT: sshr v0.2s, v0.2s, #31			; CHECK-NEXT: cmlt v0.2s, v0.2s, #0
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	%t = call {<2 x i128>, <2 x i1>} @llvm.uadd.with.overflow.v2i128(<2 x i128> %a0, <2 x i128> %a1)			%t = call {<2 x i128>, <2 x i1>} @llvm.uadd.with.overflow.v2i128(<2 x i128> %a0, <2 x i128> %a1)
	%val = extractvalue {<2 x i128>, <2 x i1>} %t, 0			%val = extractvalue {<2 x i128>, <2 x i1>} %t, 0
	%obit = extractvalue {<2 x i128>, <2 x i1>} %t, 1			%obit = extractvalue {<2 x i128>, <2 x i1>} %t, 1
	%res = sext <2 x i1> %obit to <2 x i32>			%res = sext <2 x i1> %obit to <2 x i32>
	store <2 x i128> %val, <2 x i128>* %p2			store <2 x i128> %val, <2 x i128>* %p2
	ret <2 x i32> %res			ret <2 x i32> %res
	}			}

llvm/test/CodeGen/AArch64/vec_umulo.ll

	[...]
	; CHECK-NEXT: ushll v4.4s, v4.4h, #0			; CHECK-NEXT: ushll v4.4s, v4.4h, #0
	; CHECK-NEXT: ushll v2.4s, v2.4h, #0			; CHECK-NEXT: ushll v2.4s, v2.4h, #0
	; CHECK-NEXT: ushll v5.4s, v5.4h, #0			; CHECK-NEXT: ushll v5.4s, v5.4h, #0
	; CHECK-NEXT: ushll v3.4s, v3.4h, #0			; CHECK-NEXT: ushll v3.4s, v3.4h, #0
	; CHECK-NEXT: shl v4.4s, v4.4s, #31			; CHECK-NEXT: shl v4.4s, v4.4s, #31
	; CHECK-NEXT: shl v2.4s, v2.4s, #31			; CHECK-NEXT: shl v2.4s, v2.4s, #31
	; CHECK-NEXT: shl v6.4s, v5.4s, #31			; CHECK-NEXT: shl v6.4s, v5.4s, #31
	; CHECK-NEXT: shl v3.4s, v3.4s, #31			; CHECK-NEXT: shl v3.4s, v3.4s, #31
	; CHECK-NEXT: sshr v4.4s, v4.4s, #31			; CHECK-NEXT: cmlt v4.4s, v4.4s, #0
	; CHECK-NEXT: sshr v5.4s, v2.4s, #31			; CHECK-NEXT: cmlt v5.4s, v2.4s, #0
	; CHECK-NEXT: sshr v2.4s, v6.4s, #31			; CHECK-NEXT: cmlt v2.4s, v6.4s, #0
	; CHECK-NEXT: sshr v3.4s, v3.4s, #31			; CHECK-NEXT: cmlt v3.4s, v3.4s, #0
	; CHECK-NEXT: mul v6.16b, v0.16b, v1.16b			; CHECK-NEXT: mul v6.16b, v0.16b, v1.16b
	; CHECK-NEXT: mov v0.16b, v4.16b			; CHECK-NEXT: mov v0.16b, v4.16b
	; CHECK-NEXT: mov v1.16b, v5.16b			; CHECK-NEXT: mov v1.16b, v5.16b
	; CHECK-NEXT: str q6, [x0]			; CHECK-NEXT: str q6, [x0]
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	%t = call {<16 x i8>, <16 x i1>} @llvm.umul.with.overflow.v16i8(<16 x i8> %a0, <16 x i8> %a1)			%t = call {<16 x i8>, <16 x i1>} @llvm.umul.with.overflow.v16i8(<16 x i8> %a0, <16 x i8> %a1)
	%val = extractvalue {<16 x i8>, <16 x i1>} %t, 0			%val = extractvalue {<16 x i8>, <16 x i1>} %t, 0
	%obit = extractvalue {<16 x i8>, <16 x i1>} %t, 1			%obit = extractvalue {<16 x i8>, <16 x i1>} %t, 1
	[...]
	; CHECK-NEXT: cmtst v2.8h, v2.8h, v2.8h			; CHECK-NEXT: cmtst v2.8h, v2.8h, v2.8h
	; CHECK-NEXT: xtn v2.8b, v2.8h			; CHECK-NEXT: xtn v2.8b, v2.8h
	; CHECK-NEXT: zip1 v3.8b, v2.8b, v0.8b			; CHECK-NEXT: zip1 v3.8b, v2.8b, v0.8b
	; CHECK-NEXT: zip2 v2.8b, v2.8b, v0.8b			; CHECK-NEXT: zip2 v2.8b, v2.8b, v0.8b
	; CHECK-NEXT: ushll v3.4s, v3.4h, #0			; CHECK-NEXT: ushll v3.4s, v3.4h, #0
	; CHECK-NEXT: ushll v2.4s, v2.4h, #0			; CHECK-NEXT: ushll v2.4s, v2.4h, #0
	; CHECK-NEXT: shl v3.4s, v3.4s, #31			; CHECK-NEXT: shl v3.4s, v3.4s, #31
	; CHECK-NEXT: shl v4.4s, v2.4s, #31			; CHECK-NEXT: shl v4.4s, v2.4s, #31
	; CHECK-NEXT: sshr v2.4s, v3.4s, #31			; CHECK-NEXT: cmlt v2.4s, v3.4s, #0
	; CHECK-NEXT: sshr v3.4s, v4.4s, #31			; CHECK-NEXT: cmlt v3.4s, v4.4s, #0
	; CHECK-NEXT: mul v4.8h, v0.8h, v1.8h			; CHECK-NEXT: mul v4.8h, v0.8h, v1.8h
	; CHECK-NEXT: mov v0.16b, v2.16b			; CHECK-NEXT: mov v0.16b, v2.16b
	; CHECK-NEXT: mov v1.16b, v3.16b			; CHECK-NEXT: mov v1.16b, v3.16b
	; CHECK-NEXT: str q4, [x0]			; CHECK-NEXT: str q4, [x0]
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	%t = call {<8 x i16>, <8 x i1>} @llvm.umul.with.overflow.v8i16(<8 x i16> %a0, <8 x i16> %a1)			%t = call {<8 x i16>, <8 x i1>} @llvm.umul.with.overflow.v8i16(<8 x i16> %a0, <8 x i16> %a1)
	%val = extractvalue {<8 x i16>, <8 x i1>} %t, 0			%val = extractvalue {<8 x i16>, <8 x i1>} %t, 0
	%obit = extractvalue {<8 x i16>, <8 x i1>} %t, 1			%obit = extractvalue {<8 x i16>, <8 x i1>} %t, 1
	[...]
	; CHECK-NEXT: orr w9, w9, w10			; CHECK-NEXT: orr w9, w9, w10
	; CHECK-NEXT: ldr x10, [sp]			; CHECK-NEXT: ldr x10, [sp]
	; CHECK-NEXT: fmov s0, w12			; CHECK-NEXT: fmov s0, w12
	; CHECK-NEXT: stp x11, x15, [x10]			; CHECK-NEXT: stp x11, x15, [x10]
	; CHECK-NEXT: mov v0.s[1], w9			; CHECK-NEXT: mov v0.s[1], w9
	; CHECK-NEXT: mul x9, x2, x6			; CHECK-NEXT: mul x9, x2, x6
	; CHECK-NEXT: shl v0.2s, v0.2s, #31			; CHECK-NEXT: shl v0.2s, v0.2s, #31
	; CHECK-NEXT: stp x9, x8, [x10, #16]			; CHECK-NEXT: stp x9, x8, [x10, #16]
	; CHECK-NEXT: sshr v0.2s, v0.2s, #31			; CHECK-NEXT: cmlt v0.2s, v0.2s, #0
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	%t = call {<2 x i128>, <2 x i1>} @llvm.umul.with.overflow.v2i128(<2 x i128> %a0, <2 x i128> %a1)			%t = call {<2 x i128>, <2 x i1>} @llvm.umul.with.overflow.v2i128(<2 x i128> %a0, <2 x i128> %a1)
	%val = extractvalue {<2 x i128>, <2 x i1>} %t, 0			%val = extractvalue {<2 x i128>, <2 x i1>} %t, 0
	%obit = extractvalue {<2 x i128>, <2 x i1>} %t, 1			%obit = extractvalue {<2 x i128>, <2 x i1>} %t, 1
	%res = sext <2 x i1> %obit to <2 x i32>			%res = sext <2 x i1> %obit to <2 x i32>
	store <2 x i128> %val, <2 x i128>* %p2			store <2 x i128> %val, <2 x i128>* %p2
	ret <2 x i32> %res			ret <2 x i32> %res
	}			}

llvm/test/CodeGen/AArch64/vselect-constants.ll

[...]
; CHECK-LABEL: sel_C1_or_C2_vec:		; CHECK-LABEL: sel_C1_or_C2_vec:
; CHECK: // %bb.0:		; CHECK: // %bb.0:
; CHECK-NEXT: ushll v0.4s, v0.4h, #0		; CHECK-NEXT: ushll v0.4s, v0.4h, #0
; CHECK-NEXT: adrp x8, .LCPI0_0		; CHECK-NEXT: adrp x8, .LCPI0_0
; CHECK-NEXT: adrp x9, .LCPI0_1		; CHECK-NEXT: adrp x9, .LCPI0_1
; CHECK-NEXT: ldr q1, [x8, :lo12:.LCPI0_0]		; CHECK-NEXT: ldr q1, [x8, :lo12:.LCPI0_0]
; CHECK-NEXT: shl v0.4s, v0.4s, #31		; CHECK-NEXT: shl v0.4s, v0.4s, #31
; CHECK-NEXT: ldr q2, [x9, :lo12:.LCPI0_1]		; CHECK-NEXT: ldr q2, [x9, :lo12:.LCPI0_1]
; CHECK-NEXT: sshr v0.4s, v0.4s, #31		; CHECK-NEXT: cmlt v0.4s, v0.4s, #0
; CHECK-NEXT: bsl v0.16b, v2.16b, v1.16b		; CHECK-NEXT: bsl v0.16b, v2.16b, v1.16b
; CHECK-NEXT: ret		; CHECK-NEXT: ret
%add = select <4 x i1> %cond, <4 x i32> <i32 3000, i32 1, i32 -1, i32 0>, <4 x i32> <i32 42, i32 0, i32 -2, i32 -1>		%add = select <4 x i1> %cond, <4 x i32> <i32 3000, i32 1, i32 -1, i32 0>, <4 x i32> <i32 42, i32 0, i32 -2, i32 -1>
ret <4 x i32> %add		ret <4 x i32> %add
}		}

define <4 x i32> @cmp_sel_C1_or_C2_vec(<4 x i32> %x, <4 x i32> %y) {		define <4 x i32> @cmp_sel_C1_or_C2_vec(<4 x i32> %x, <4 x i32> %y) {
; CHECK-LABEL: cmp_sel_C1_or_C2_vec:		; CHECK-LABEL: cmp_sel_C1_or_C2_vec:
[...]
; CHECK-LABEL: sel_Cplus1_or_C_vec:		; CHECK-LABEL: sel_Cplus1_or_C_vec:
; CHECK: // %bb.0:		; CHECK: // %bb.0:
; CHECK-NEXT: ushll v0.4s, v0.4h, #0		; CHECK-NEXT: ushll v0.4s, v0.4h, #0
; CHECK-NEXT: adrp x8, .LCPI2_0		; CHECK-NEXT: adrp x8, .LCPI2_0
; CHECK-NEXT: adrp x9, .LCPI2_1		; CHECK-NEXT: adrp x9, .LCPI2_1
; CHECK-NEXT: ldr q1, [x8, :lo12:.LCPI2_0]		; CHECK-NEXT: ldr q1, [x8, :lo12:.LCPI2_0]
; CHECK-NEXT: shl v0.4s, v0.4s, #31		; CHECK-NEXT: shl v0.4s, v0.4s, #31
; CHECK-NEXT: ldr q2, [x9, :lo12:.LCPI2_1]		; CHECK-NEXT: ldr q2, [x9, :lo12:.LCPI2_1]
; CHECK-NEXT: sshr v0.4s, v0.4s, #31		; CHECK-NEXT: cmlt v0.4s, v0.4s, #0
; CHECK-NEXT: bsl v0.16b, v2.16b, v1.16b		; CHECK-NEXT: bsl v0.16b, v2.16b, v1.16b
; CHECK-NEXT: ret		; CHECK-NEXT: ret
%add = select <4 x i1> %cond, <4 x i32> <i32 43, i32 1, i32 -1, i32 0>, <4 x i32> <i32 42, i32 0, i32 -2, i32 -1>		%add = select <4 x i1> %cond, <4 x i32> <i32 43, i32 1, i32 -1, i32 0>, <4 x i32> <i32 42, i32 0, i32 -2, i32 -1>
ret <4 x i32> %add		ret <4 x i32> %add
}		}

define <4 x i32> @cmp_sel_Cplus1_or_C_vec(<4 x i32> %x, <4 x i32> %y) {		define <4 x i32> @cmp_sel_Cplus1_or_C_vec(<4 x i32> %x, <4 x i32> %y) {
; CHECK-LABEL: cmp_sel_Cplus1_or_C_vec:		; CHECK-LABEL: cmp_sel_Cplus1_or_C_vec:
[...]
; CHECK-LABEL: sel_Cminus1_or_C_vec:		; CHECK-LABEL: sel_Cminus1_or_C_vec:
; CHECK: // %bb.0:		; CHECK: // %bb.0:
; CHECK-NEXT: ushll v0.4s, v0.4h, #0		; CHECK-NEXT: ushll v0.4s, v0.4h, #0
; CHECK-NEXT: adrp x8, .LCPI4_0		; CHECK-NEXT: adrp x8, .LCPI4_0
; CHECK-NEXT: adrp x9, .LCPI4_1		; CHECK-NEXT: adrp x9, .LCPI4_1
; CHECK-NEXT: ldr q1, [x8, :lo12:.LCPI4_0]		; CHECK-NEXT: ldr q1, [x8, :lo12:.LCPI4_0]
; CHECK-NEXT: shl v0.4s, v0.4s, #31		; CHECK-NEXT: shl v0.4s, v0.4s, #31
; CHECK-NEXT: ldr q2, [x9, :lo12:.LCPI4_1]		; CHECK-NEXT: ldr q2, [x9, :lo12:.LCPI4_1]
; CHECK-NEXT: sshr v0.4s, v0.4s, #31		; CHECK-NEXT: cmlt v0.4s, v0.4s, #0
; CHECK-NEXT: bsl v0.16b, v2.16b, v1.16b		; CHECK-NEXT: bsl v0.16b, v2.16b, v1.16b
; CHECK-NEXT: ret		; CHECK-NEXT: ret
%add = select <4 x i1> %cond, <4 x i32> <i32 43, i32 1, i32 -1, i32 0>, <4 x i32> <i32 44, i32 2, i32 0, i32 1>		%add = select <4 x i1> %cond, <4 x i32> <i32 43, i32 1, i32 -1, i32 0>, <4 x i32> <i32 44, i32 2, i32 0, i32 1>
ret <4 x i32> %add		ret <4 x i32> %add
}		}

define <4 x i32> @cmp_sel_Cminus1_or_C_vec(<4 x i32> %x, <4 x i32> %y) {		define <4 x i32> @cmp_sel_Cminus1_or_C_vec(<4 x i32> %x, <4 x i32> %y) {
; CHECK-LABEL: cmp_sel_Cminus1_or_C_vec:		; CHECK-LABEL: cmp_sel_Cminus1_or_C_vec:
[10 lines not shown]	; CHECK-NEXT: ret
ret <4 x i32> %add		ret <4 x i32> %add
}		}

define <4 x i32> @sel_minus1_or_0_vec(<4 x i1> %cond) {		define <4 x i32> @sel_minus1_or_0_vec(<4 x i1> %cond) {
; CHECK-LABEL: sel_minus1_or_0_vec:		; CHECK-LABEL: sel_minus1_or_0_vec:
; CHECK: // %bb.0:		; CHECK: // %bb.0:
; CHECK-NEXT: ushll v0.4s, v0.4h, #0		; CHECK-NEXT: ushll v0.4s, v0.4h, #0
; CHECK-NEXT: shl v0.4s, v0.4s, #31		; CHECK-NEXT: shl v0.4s, v0.4s, #31
; CHECK-NEXT: sshr v0.4s, v0.4s, #31		; CHECK-NEXT: cmlt v0.4s, v0.4s, #0
; CHECK-NEXT: ret		; CHECK-NEXT: ret
%add = select <4 x i1> %cond, <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>, <4 x i32> <i32 0, i32 0, i32 0, i32 0>		%add = select <4 x i1> %cond, <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>, <4 x i32> <i32 0, i32 0, i32 0, i32 0>
ret <4 x i32> %add		ret <4 x i32> %add
}		}

define <4 x i32> @cmp_sel_minus1_or_0_vec(<4 x i32> %x, <4 x i32> %y) {		define <4 x i32> @cmp_sel_minus1_or_0_vec(<4 x i32> %x, <4 x i32> %y) {
; CHECK-LABEL: cmp_sel_minus1_or_0_vec:		; CHECK-LABEL: cmp_sel_minus1_or_0_vec:
; CHECK: // %bb.0:		; CHECK: // %bb.0:
[27 lines not shown]
}		}

define <4 x i32> @sel_1_or_0_vec(<4 x i1> %cond) {		define <4 x i32> @sel_1_or_0_vec(<4 x i1> %cond) {
; CHECK-LABEL: sel_1_or_0_vec:		; CHECK-LABEL: sel_1_or_0_vec:
; CHECK: // %bb.0:		; CHECK: // %bb.0:
; CHECK-NEXT: ushll v0.4s, v0.4h, #0		; CHECK-NEXT: ushll v0.4s, v0.4h, #0
; CHECK-NEXT: movi v1.4s, #1		; CHECK-NEXT: movi v1.4s, #1
; CHECK-NEXT: shl v0.4s, v0.4s, #31		; CHECK-NEXT: shl v0.4s, v0.4s, #31
; CHECK-NEXT: sshr v0.4s, v0.4s, #31		; CHECK-NEXT: cmlt v0.4s, v0.4s, #0
; CHECK-NEXT: and v0.16b, v0.16b, v1.16b		; CHECK-NEXT: and v0.16b, v0.16b, v1.16b
; CHECK-NEXT: ret		; CHECK-NEXT: ret
%add = select <4 x i1> %cond, <4 x i32> <i32 1, i32 1, i32 1, i32 1>, <4 x i32> <i32 0, i32 0, i32 0, i32 0>		%add = select <4 x i1> %cond, <4 x i32> <i32 1, i32 1, i32 1, i32 1>, <4 x i32> <i32 0, i32 0, i32 0, i32 0>
ret <4 x i32> %add		ret <4 x i32> %add
}		}

define <4 x i32> @cmp_sel_1_or_0_vec(<4 x i32> %x, <4 x i32> %y) {		define <4 x i32> @cmp_sel_1_or_0_vec(<4 x i32> %x, <4 x i32> %y) {
; CHECK-LABEL: cmp_sel_1_or_0_vec:		; CHECK-LABEL: cmp_sel_1_or_0_vec:
[30 lines not shown]	; CHECK-NEXT: ret
%cond = icmp eq <4 x i32> %x, %y		%cond = icmp eq <4 x i32> %x, %y
%add = select <4 x i1> %cond, <4 x i32> <i32 0, i32 0, i32 0, i32 0>, <4 x i32> <i32 1, i32 1, i32 1, i32 1>		%add = select <4 x i1> %cond, <4 x i32> <i32 0, i32 0, i32 0, i32 0>, <4 x i32> <i32 1, i32 1, i32 1, i32 1>
ret <4 x i32> %add		ret <4 x i32> %add
}		}

define <16 x i8> @signbit_mask_v16i8(<16 x i8> %a, <16 x i8> %b) {		define <16 x i8> @signbit_mask_v16i8(<16 x i8> %a, <16 x i8> %b) {
; CHECK-LABEL: signbit_mask_v16i8:		; CHECK-LABEL: signbit_mask_v16i8:
; CHECK: // %bb.0:		; CHECK: // %bb.0:
; CHECK-NEXT: sshr v0.16b, v0.16b, #7		; CHECK-NEXT: cmlt v0.16b, v0.16b, #0
; CHECK-NEXT: and v0.16b, v0.16b, v1.16b		; CHECK-NEXT: and v0.16b, v0.16b, v1.16b
; CHECK-NEXT: ret		; CHECK-NEXT: ret
%cond = icmp slt <16 x i8> %a, zeroinitializer		%cond = icmp slt <16 x i8> %a, zeroinitializer
%r = select <16 x i1> %cond, <16 x i8> %b, <16 x i8> zeroinitializer		%r = select <16 x i1> %cond, <16 x i8> %b, <16 x i8> zeroinitializer
ret <16 x i8> %r		ret <16 x i8> %r
}		}

; Swap cmp pred and select ops. This is logically equivalent to the above test.		; Swap cmp pred and select ops. This is logically equivalent to the above test.

define <16 x i8> @signbit_mask_swap_v16i8(<16 x i8> %a, <16 x i8> %b) {		define <16 x i8> @signbit_mask_swap_v16i8(<16 x i8> %a, <16 x i8> %b) {
; CHECK-LABEL: signbit_mask_swap_v16i8:		; CHECK-LABEL: signbit_mask_swap_v16i8:
; CHECK: // %bb.0:		; CHECK: // %bb.0:
; CHECK-NEXT: sshr v0.16b, v0.16b, #7		; CHECK-NEXT: cmlt v0.16b, v0.16b, #0
; CHECK-NEXT: and v0.16b, v0.16b, v1.16b		; CHECK-NEXT: and v0.16b, v0.16b, v1.16b
; CHECK-NEXT: ret		; CHECK-NEXT: ret
%cond = icmp sgt <16 x i8> %a, <i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1>		%cond = icmp sgt <16 x i8> %a, <i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1>
%r = select <16 x i1> %cond, <16 x i8> zeroinitializer, <16 x i8> %b		%r = select <16 x i1> %cond, <16 x i8> zeroinitializer, <16 x i8> %b
ret <16 x i8> %r		ret <16 x i8> %r
}		}

define <8 x i16> @signbit_mask_v8i16(<8 x i16> %a, <8 x i16> %b) {		define <8 x i16> @signbit_mask_v8i16(<8 x i16> %a, <8 x i16> %b) {
; CHECK-LABEL: signbit_mask_v8i16:		; CHECK-LABEL: signbit_mask_v8i16:
; CHECK: // %bb.0:		; CHECK: // %bb.0:
; CHECK-NEXT: sshr v0.8h, v0.8h, #15		; CHECK-NEXT: cmlt v0.8h, v0.8h, #0
; CHECK-NEXT: and v0.16b, v0.16b, v1.16b		; CHECK-NEXT: and v0.16b, v0.16b, v1.16b
; CHECK-NEXT: ret		; CHECK-NEXT: ret
%cond = icmp slt <8 x i16> %a, zeroinitializer		%cond = icmp slt <8 x i16> %a, zeroinitializer
%r = select <8 x i1> %cond, <8 x i16> %b, <8 x i16> zeroinitializer		%r = select <8 x i1> %cond, <8 x i16> %b, <8 x i16> zeroinitializer
ret <8 x i16> %r		ret <8 x i16> %r
}		}

define <4 x i32> @signbit_mask_v4i32(<4 x i32> %a, <4 x i32> %b) {		define <4 x i32> @signbit_mask_v4i32(<4 x i32> %a, <4 x i32> %b) {
; CHECK-LABEL: signbit_mask_v4i32:		; CHECK-LABEL: signbit_mask_v4i32:
; CHECK: // %bb.0:		; CHECK: // %bb.0:
; CHECK-NEXT: sshr v0.4s, v0.4s, #31		; CHECK-NEXT: cmlt v0.4s, v0.4s, #0
; CHECK-NEXT: and v0.16b, v0.16b, v1.16b		; CHECK-NEXT: and v0.16b, v0.16b, v1.16b
; CHECK-NEXT: ret		; CHECK-NEXT: ret
%cond = icmp slt <4 x i32> %a, zeroinitializer		%cond = icmp slt <4 x i32> %a, zeroinitializer
%r = select <4 x i1> %cond, <4 x i32> %b, <4 x i32> zeroinitializer		%r = select <4 x i1> %cond, <4 x i32> %b, <4 x i32> zeroinitializer
ret <4 x i32> %r		ret <4 x i32> %r
}		}

define <2 x i64> @signbit_mask_v2i64(<2 x i64> %a, <2 x i64> %b) {		define <2 x i64> @signbit_mask_v2i64(<2 x i64> %a, <2 x i64> %b) {
; CHECK-LABEL: signbit_mask_v2i64:		; CHECK-LABEL: signbit_mask_v2i64:
; CHECK: // %bb.0:		; CHECK: // %bb.0:
; CHECK-NEXT: sshr v0.2d, v0.2d, #63		; CHECK-NEXT: cmlt v0.2d, v0.2d, #0
; CHECK-NEXT: and v0.16b, v0.16b, v1.16b		; CHECK-NEXT: and v0.16b, v0.16b, v1.16b
; CHECK-NEXT: ret		; CHECK-NEXT: ret
%cond = icmp slt <2 x i64> %a, zeroinitializer		%cond = icmp slt <2 x i64> %a, zeroinitializer
%r = select <2 x i1> %cond, <2 x i64> %b, <2 x i64> zeroinitializer		%r = select <2 x i1> %cond, <2 x i64> %b, <2 x i64> zeroinitializer
ret <2 x i64> %r		ret <2 x i64> %r
}		}

define <16 x i8> @signbit_setmask_v16i8(<16 x i8> %a, <16 x i8> %b) {		define <16 x i8> @signbit_setmask_v16i8(<16 x i8> %a, <16 x i8> %b) {
; CHECK-LABEL: signbit_setmask_v16i8:		; CHECK-LABEL: signbit_setmask_v16i8:
; CHECK: // %bb.0:		; CHECK: // %bb.0:
; CHECK-NEXT: sshr v0.16b, v0.16b, #7		; CHECK-NEXT: cmlt v0.16b, v0.16b, #0
; CHECK-NEXT: orr v0.16b, v0.16b, v1.16b		; CHECK-NEXT: orr v0.16b, v0.16b, v1.16b
; CHECK-NEXT: ret		; CHECK-NEXT: ret
%cond = icmp slt <16 x i8> %a, zeroinitializer		%cond = icmp slt <16 x i8> %a, zeroinitializer
%r = select <16 x i1> %cond, <16 x i8> <i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1>, <16 x i8> %b		%r = select <16 x i1> %cond, <16 x i8> <i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1>, <16 x i8> %b
ret <16 x i8> %r		ret <16 x i8> %r
}		}

define <8 x i16> @signbit_setmask_v8i16(<8 x i16> %a, <8 x i16> %b) {		define <8 x i16> @signbit_setmask_v8i16(<8 x i16> %a, <8 x i16> %b) {
; CHECK-LABEL: signbit_setmask_v8i16:		; CHECK-LABEL: signbit_setmask_v8i16:
; CHECK: // %bb.0:		; CHECK: // %bb.0:
; CHECK-NEXT: sshr v0.8h, v0.8h, #15		; CHECK-NEXT: cmlt v0.8h, v0.8h, #0
; CHECK-NEXT: orr v0.16b, v0.16b, v1.16b		; CHECK-NEXT: orr v0.16b, v0.16b, v1.16b
; CHECK-NEXT: ret		; CHECK-NEXT: ret
%cond = icmp slt <8 x i16> %a, zeroinitializer		%cond = icmp slt <8 x i16> %a, zeroinitializer
%r = select <8 x i1> %cond, <8 x i16> <i16 -1, i16 -1, i16 -1, i16 -1, i16 -1, i16 -1, i16 -1, i16 -1>, <8 x i16> %b		%r = select <8 x i1> %cond, <8 x i16> <i16 -1, i16 -1, i16 -1, i16 -1, i16 -1, i16 -1, i16 -1, i16 -1>, <8 x i16> %b
ret <8 x i16> %r		ret <8 x i16> %r
}		}

; Swap cmp pred and select ops. This is logically equivalent to the above test.		; Swap cmp pred and select ops. This is logically equivalent to the above test.

define <8 x i16> @signbit_setmask_swap_v8i16(<8 x i16> %a, <8 x i16> %b) {		define <8 x i16> @signbit_setmask_swap_v8i16(<8 x i16> %a, <8 x i16> %b) {
; CHECK-LABEL: signbit_setmask_swap_v8i16:		; CHECK-LABEL: signbit_setmask_swap_v8i16:
; CHECK: // %bb.0:		; CHECK: // %bb.0:
; CHECK-NEXT: sshr v0.8h, v0.8h, #15		; CHECK-NEXT: cmlt v0.8h, v0.8h, #0
; CHECK-NEXT: orr v0.16b, v0.16b, v1.16b		; CHECK-NEXT: orr v0.16b, v0.16b, v1.16b
; CHECK-NEXT: ret		; CHECK-NEXT: ret
%cond = icmp sgt <8 x i16> %a, <i16 -1, i16 -1, i16 -1, i16 -1, i16 -1, i16 -1, i16 -1, i16 -1>		%cond = icmp sgt <8 x i16> %a, <i16 -1, i16 -1, i16 -1, i16 -1, i16 -1, i16 -1, i16 -1, i16 -1>
%r = select <8 x i1> %cond, <8 x i16> %b, <8 x i16> <i16 -1, i16 -1, i16 -1, i16 -1, i16 -1, i16 -1, i16 -1, i16 -1>		%r = select <8 x i1> %cond, <8 x i16> %b, <8 x i16> <i16 -1, i16 -1, i16 -1, i16 -1, i16 -1, i16 -1, i16 -1, i16 -1>
ret <8 x i16> %r		ret <8 x i16> %r
}		}

define <4 x i32> @signbit_setmask_v4i32(<4 x i32> %a, <4 x i32> %b) {		define <4 x i32> @signbit_setmask_v4i32(<4 x i32> %a, <4 x i32> %b) {
; CHECK-LABEL: signbit_setmask_v4i32:		; CHECK-LABEL: signbit_setmask_v4i32:
; CHECK: // %bb.0:		; CHECK: // %bb.0:
; CHECK-NEXT: sshr v0.4s, v0.4s, #31		; CHECK-NEXT: cmlt v0.4s, v0.4s, #0
; CHECK-NEXT: orr v0.16b, v0.16b, v1.16b		; CHECK-NEXT: orr v0.16b, v0.16b, v1.16b
; CHECK-NEXT: ret		; CHECK-NEXT: ret
%cond = icmp slt <4 x i32> %a, zeroinitializer		%cond = icmp slt <4 x i32> %a, zeroinitializer
%r = select <4 x i1> %cond, <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>, <4 x i32> %b		%r = select <4 x i1> %cond, <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>, <4 x i32> %b
ret <4 x i32> %r		ret <4 x i32> %r
}		}

define <2 x i64> @signbit_setmask_v2i64(<2 x i64> %a, <2 x i64> %b) {		define <2 x i64> @signbit_setmask_v2i64(<2 x i64> %a, <2 x i64> %b) {
; CHECK-LABEL: signbit_setmask_v2i64:		; CHECK-LABEL: signbit_setmask_v2i64:
; CHECK: // %bb.0:		; CHECK: // %bb.0:
; CHECK-NEXT: sshr v0.2d, v0.2d, #63		; CHECK-NEXT: cmlt v0.2d, v0.2d, #0
; CHECK-NEXT: orr v0.16b, v0.16b, v1.16b		; CHECK-NEXT: orr v0.16b, v0.16b, v1.16b
; CHECK-NEXT: ret		; CHECK-NEXT: ret
%cond = icmp slt <2 x i64> %a, zeroinitializer		%cond = icmp slt <2 x i64> %a, zeroinitializer
%r = select <2 x i1> %cond, <2 x i64> <i64 -1, i64 -1>, <2 x i64> %b		%r = select <2 x i1> %cond, <2 x i64> <i64 -1, i64 -1>, <2 x i64> %b
ret <2 x i64> %r		ret <2 x i64> %r
}		}

define <16 x i8> @not_signbit_mask_v16i8(<16 x i8> %a, <16 x i8> %b) {		define <16 x i8> @not_signbit_mask_v16i8(<16 x i8> %a, <16 x i8> %b) {
[73 lines not shown]
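
The CHECK-line changes above all rest on one lane-wise identity: an arithmetic shift right by elt_size-1 copies the sign bit into every bit of the lane, which is the same all-ones or all-zeros mask that a signed compare against zero produces. The following is a minimal Python sketch of that identity, modelling one vector lane as an integer bit pattern. The helper names (`to_signed`, `sshr_lane`, `cmlt_zero_lane`, `signbit_mask`) are hypothetical and are not LLVM APIs; the sketch says nothing about instruction throughput.

```python
def to_signed(x: int, bits: int) -> int:
    """Interpret the low `bits` bits of x as a two's-complement value."""
    x &= (1 << bits) - 1
    return x - (1 << bits) if x & (1 << (bits - 1)) else x


def sshr_lane(x: int, bits: int) -> int:
    """Model of sshr by bits-1 on one lane; result is an unsigned bit pattern."""
    # Python's >> on negative ints is an arithmetic (sign-propagating) shift.
    return (to_signed(x, bits) >> (bits - 1)) & ((1 << bits) - 1)


def cmlt_zero_lane(x: int, bits: int) -> int:
    """Model of cmlt against #0 on one lane: all ones if negative, else zero."""
    return (1 << bits) - 1 if to_signed(x, bits) < 0 else 0


def signbit_mask(a: int, b: int, bits: int) -> int:
    """Model of select(a < 0, b, 0) as a mask followed by an AND."""
    return cmlt_zero_lane(a, bits) & (b & ((1 << bits) - 1))


if __name__ == "__main__":
    # Exhaustive check for 8-bit lanes, spot checks for wider lanes.
    assert all(sshr_lane(x, 8) == cmlt_zero_lane(x, 8) for x in range(256))
    for bits in (16, 32, 64):
        for x in (0, 1, (1 << (bits - 1)) - 1, 1 << (bits - 1), (1 << bits) - 1):
            assert sshr_lane(x, bits) == cmlt_zero_lane(x, bits)
    print("sshr by elt_size-1 matches cmlt #0 on all tested lanes")
```

The `signbit_mask` helper mirrors the `signbit_mask_*` tests: selecting `%b` when `%a` is negative and zero otherwise reduces to the sign mask ANDed with `%b`, which is why those functions lower to a mask instruction followed by `and`.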

Revision Contents

Diff 393987

llvm/lib/Target/AArch64/AArch64InstrInfo.td

llvm/test/Analysis/CostModel/AArch64/vector-select.ll

llvm/test/CodeGen/AArch64/arm64-subvector-extend.ll

llvm/test/CodeGen/AArch64/arm64-vshr.ll

llvm/test/CodeGen/AArch64/cmp-select-sign.ll

llvm/test/CodeGen/AArch64/dag-numsignbits.ll

llvm/test/CodeGen/AArch64/div_minsize.ll

llvm/test/CodeGen/AArch64/selectcc-to-shiftand.ll

llvm/test/CodeGen/AArch64/srem-seteq-vec-splat.ll

llvm/test/CodeGen/AArch64/sve-fixed-length-fp-vselect.ll

llvm/test/CodeGen/AArch64/sve-fixed-length-int-vselect.ll

llvm/test/CodeGen/AArch64/sve-fixed-length-masked-gather.ll

llvm/test/CodeGen/AArch64/sve-fixed-length-masked-loads.ll

llvm/test/CodeGen/AArch64/sve-fixed-length-masked-scatter.ll

llvm/test/CodeGen/AArch64/sve-fixed-length-masked-stores.ll

llvm/test/CodeGen/AArch64/vec_uaddo.ll

llvm/test/CodeGen/AArch64/vec_umulo.ll

llvm/test/CodeGen/AArch64/vselect-constants.ll
