This is an archive of the discontinued LLVM Phabricator instance.

Differential D138107

[AArch64][MachineCombiner] Update isAssociativeAndCommutative
Abandoned · Public

Authored by kawashima-fj on Nov 16 2022, 2:23 AM.

Details

Reviewers

dmgreen
fhahn
t.p.northover

Summary

This commit adds opcodes for the ADD, MUL, AND, ORR, and EOR base/SIMD/SVE instructions, and the missing opcodes for the FADD and FMUL FP/SIMD/SVE instructions, to the isAssociativeAndCommutative function. It also removes the opcodes for the FMULX instruction, which is not associative (bug fix).
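One way to see why FMULX cannot be treated as associative is its special case: FMULX behaves like an ordinary multiply except that zero times infinity yields 2.0 (with the sign given by the XOR of the operand signs) instead of NaN. The following is a minimal sketch of that behaviour for doubles; `fmulxModel` and `fmulxGroupingsAgree` are hypothetical helper names for illustration and are not part of LLVM or of this patch.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical software model of the scalar FMULX result for doubles:
// an ordinary multiply, except that (+/-0) * (+/-inf) gives +/-2.0.
double fmulxModel(double A, double B) {
  bool ZeroTimesInf =
      (A == 0.0 && std::isinf(B)) || (std::isinf(A) && B == 0.0);
  if (ZeroTimesInf) {
    // The result is negative when exactly one operand is negative.
    bool Negative = std::signbit(A) != std::signbit(B);
    return Negative ? -2.0 : 2.0;
  }
  return A * B;
}

// Returns true when (A x B) x C and A x (B x C) produce the same value.
// NaN inputs are excluded because NaN never compares equal to itself.
bool fmulxGroupingsAgree(double A, double B, double C) {
  assert(!std::isnan(A) && !std::isnan(B) && !std::isnan(C));
  double Left = fmulxModel(fmulxModel(A, B), C);
  double Right = fmulxModel(A, fmulxModel(B, C));
  return Left == Right;
}
```

With A = 0, B = infinity, and C = 3, the left grouping evaluates to 2 * 3 while the right grouping evaluates to 0 multiplied by infinity, so reassociating the operands changes the result.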

This helps the existing MachineCombiner pass increase instruction-level parallelism (ILP). It supersedes D132828, which implements tree height reduction in a new LLVM IR pass. The advantages of using the existing MachineCombiner pass are (1) more precise cost estimation, (2) no redundant processing, and (3) less compile-time impact. The disadvantages are (4) a per-target isAssociativeAndCommutative implementation and (5) constraints imposed by the instruction set (see the comment about MULWrr in AArch64InstrInfo::isAssociativeAndCommutative). In addition, (6) the resulting instruction sequence may not be optimal in terms of ILP in some cases, because the algorithm in TargetInstrInfo::getMachineCombinerPatterns is simpler than that of D132828. Nonetheless, it generates a fairly good sequence of instructions.
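To illustrate what tree height reduction buys in terms of ILP, the sketch below compares the dependency depth of a serial chain such as ((a + b) + c) + d with that of a balanced grouping such as (a + b) + (c + d). It is a standalone toy model: `Node`, `buildChain`, `buildBalanced`, and `criticalPath` are hypothetical names, and a fully balanced tree is an idealization rather than what the MachineCombiner pass is guaranteed to produce.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <utility>

// Toy expression tree: a node with no children is a leaf operand,
// a node with two children is one associative binary operation.
struct Node {
  std::unique_ptr<Node> Lhs;
  std::unique_ptr<Node> Rhs;
};

std::unique_ptr<Node> makeLeaf() { return std::make_unique<Node>(); }

std::unique_ptr<Node> makeOp(std::unique_ptr<Node> L,
                             std::unique_ptr<Node> R) {
  auto N = std::make_unique<Node>();
  N->Lhs = std::move(L);
  N->Rhs = std::move(R);
  return N;
}

// Number of dependent operations on the longest path to the root.
int criticalPath(const Node &N) {
  if (!N.Lhs)
    return 0;
  return 1 + std::max(criticalPath(*N.Lhs), criticalPath(*N.Rhs));
}

// Serial chain: ((x0 op x1) op x2) op ...
std::unique_ptr<Node> buildChain(int NumOperands) {
  assert(NumOperands >= 1);
  auto Acc = makeLeaf();
  for (int I = 1; I < NumOperands; ++I)
    Acc = makeOp(std::move(Acc), makeLeaf());
  return Acc;
}

// Balanced grouping: split the operands in half recursively.
std::unique_ptr<Node> buildBalanced(int NumOperands) {
  assert(NumOperands >= 1);
  if (NumOperands == 1)
    return makeLeaf();
  int Half = NumOperands / 2;
  return makeOp(buildBalanced(Half), buildBalanced(NumOperands - Half));
}
```

For four operands the chain has a critical path of three dependent operations and the balanced form has two. The shorter path is what lets a core with more than one suitable pipeline execute independent operations in parallel.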

I ran the C/C++ benchmarks in SPECrate 2017 on a Fujitsu A64FX processor, which has two pipelines each for integer operations and for SIMD/FP operations. 511.povray_r improved by 4%. The other benchmarks (int: 500, 502, 505, 520, 523, 525, 531, 541, 557; fp: 508, 510, 519, 538, 544) were within ±1%. For a synthetic benchmark, performance doubled.

Diff Detail

Unit Tests: Failed

Time	Test
60,080 ms	x64 debian > ThreadSanitizer-x86_64.ThreadSanitizer-x86_64::restore_stack.cpp

Event Timeline

kawashima-fj created this revision. · Nov 16 2022, 2:23 AM

Herald added a project: Restricted Project. · Nov 16 2022, 2:23 AM

Herald added subscribers: ctetreau, hiraditya, kristof.beyls.

kawashima-fj requested review of this revision. · Nov 16 2022, 2:23 AM

Herald added a subscriber: llvm-commits.

kawashima-fj mentioned this in D132828: Add new optimization pass of Tree Height Reduction. · Nov 16 2022, 2:32 AM

@melver
As shown in the change to llvm/test/CodeGen/AArch64/GlobalISel/arm64-pcsections.ll, the MachineCombiner pass removes pcsections metadata when it replaces a MachineInstr. I'm not sure whether it should.

Essentially, the MachineCombiner pass changes the MIR

$5 = ADD $1, $2, pcsections !0
$6 = ADD $5, $3, pcsections !0
$7 = ADD $6, $4, pcsections !0

to

$5 = ADD $1, $2, pcsections !0
$6 = ADD $3, $4
$7 = ADD $5, $6

Should pcsections !0 on $6 and $7 be preserved? If so, the code that creates the new MachineInstr should probably be updated.

In reply to D138107#3930158 (@kawashima-fj):

Yes, because the add instructions in LLVM IR have !pcsections attached to them.

It should be enough to use BuildMI(*MF, MIMetadata(Prev), TII->get(Opcode), NewVR) ... for Prev and BuildMI(*MF, MIMetadata(Root), TII->get(Opcode), RegC)... for Root.

Thanks for pointing it out.
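The suggestion above can be pictured with a small standalone model. In the sketch below, each rebuilt instruction inherits the metadata flag of the instruction it replaces, mirroring the idea of passing MIMetadata(Prev) and MIMetadata(Root) to BuildMI. `ToyInstr` and `reassociate` are hypothetical names used only for illustration; this is not LLVM code and it ignores everything except operands and a single flag.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Toy stand-in for a two-operand machine instruction.
struct ToyInstr {
  std::string Dest;
  std::string Src0;
  std::string Src1;
  bool HasPCSections; // models an attached "pcsections !0"
};

// Input pattern:   Prev: B = A op X      Root: C = B op Y
// Output pattern:  NewPrev: NewVR = X op Y   NewRoot: C = A op NewVR
// Each new instruction copies the metadata flag of the one it replaces.
std::pair<ToyInstr, ToyInstr> reassociate(const ToyInstr &Prev,
                                          const ToyInstr &Root,
                                          const std::string &NewVR) {
  assert(Root.Src0 == Prev.Dest && "Root must consume the result of Prev");
  ToyInstr NewPrev{NewVR, Prev.Src1, Root.Src1, Prev.HasPCSections};
  ToyInstr NewRoot{Root.Dest, Prev.Src0, NewVR, Root.HasPCSections};
  return {NewPrev, NewRoot};
}
```

Applied to `$6 = ADD $5, $3` and `$7 = ADD $6, $4` from the example in this thread, the model yields `$8 = ADD $3, $4` and `$7 = ADD $5, $8`, both still carrying the flag (`$8` is an arbitrary name chosen here for the new virtual register).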

I think it would be best to split this up into some instruction groups. The scalar ones were already being handled by https://reviews.llvm.org/D134260 (ANDS can also be added). I hadn't committed that because it ran into the same pcsections test failure. I have a patch to fix it up though; I can put that up for review and get D134260 committed.

The others could be split into scalar fp/neon/sve/etc.

dmgreen mentioned this in D138112: [AArch64][MachineCombiner] Use MIMetadata to copy pcsections metadata to reassociated instructions. · Nov 16 2022, 3:14 AM

Harbormaster completed remote builds in B197944: Diff 475717. · Nov 16 2022, 3:20 AM

In reply to D138107#3930239 (@dmgreen):

Thanks. I hadn't noticed your D134260.
Do you mean splitting this patch into several patches, or splitting the case ... return statements in the code?

In reply to @kawashima-fj:

I mean a few different patches. It can be good to commit things in smaller chunks in case some part of it needs to be reverted. The patch looks sensible from what I can tell, I just feel it is doing a little too much all at once.

In reply to D138107#3930322 (@dmgreen):

OK. Do you plan to add FP/NEON/SVE patches after D134260? If not, I'll post the split patches after D134260 has landed.

I have no plans past D134260, I will leave all the rest to you!

SjoerdMeijer added a subscriber: SjoerdMeijer. · Nov 16 2022, 4:55 AM

dmgreen mentioned this in rG71609871dd73: [AArch64][MachineCombiner] Use MIMetadata to copy pcsections metadata to…. · Nov 16 2022, 5:23 AM

Matt added a subscriber: Matt. · Nov 22 2022, 11:03 AM

kawashima-fj mentioned this in D139606: [AArch64][NFC] Add tests for D134260. · Dec 8 2022, 12:13 AM

kawashima-fj mentioned this in D139607: [AArch64][NFC] Change order of instructions in isAssociativeAndCommutative. · Dec 8 2022, 12:19 AM

Abandoning this revision. I'll post split (and improved) patches.

kawashima-fj mentioned this in D139809: [AArch64] Add FP16 instructions to isAssociativeAndCommutative. · Dec 11 2022, 11:04 PM

kawashima-fj mentioned this in D139810: [AArch64] Add Neon int instructions to isAssociativeAndCommutative. · Dec 11 2022, 11:07 PM

kawashima-fj mentioned this in D140396: [AArch64] Add SVE FP instructions to isAssociativeAndCommutative. · Dec 20 2022, 6:53 AM

kawashima-fj mentioned this in D140398: [AArch64] Add SVE int instructions to isAssociativeAndCommutative. · Dec 20 2022, 6:56 AM

HsiangKai mentioned this in D140530: [RISCV] Add integer scalar instructions to isAssociativeAndCommutative. · Dec 22 2022, 1:01 AM

HsiangKai mentioned this in rG002005e6740e: [RISCV] Add integer scalar instructions to isAssociativeAndCommutative. · Dec 29 2022, 3:59 AM

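Revision Contents

Path	Size
llvm/lib/Target/AArch64/AArch64InstrInfo.cpp	91 lines
llvm/test/CodeGen/AArch64/GlobalISel/arm64-atomic.ll	32 lines
llvm/test/CodeGen/AArch64/GlobalISel/arm64-pcsections.ll	16 lines
llvm/test/CodeGen/AArch64/aarch64-dynamic-stack-layout.ll	4 lines
llvm/test/CodeGen/AArch64/arm64-rev.ll	86 lines
llvm/test/CodeGen/AArch64/cmp-chains.ll	12 lines
llvm/test/CodeGen/AArch64/machine-combiner.ll	273 lines
llvm/test/CodeGen/AArch64/reduce-and.ll	88 lines
llvm/test/CodeGen/AArch64/reduce-or.ll	88 lines
llvm/test/CodeGen/AArch64/reduce-shuffle.ll	98 lines
llvm/test/CodeGen/AArch64/reduce-xor.ll	88 lines
llvm/test/CodeGen/AArch64/swift-return.ll	20 lines
llvm/test/CodeGen/AArch64/vecreduce-and-legalization.ll	28 lines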

Diff 475717

llvm/lib/Target/AArch64/AArch64InstrInfo.cpp

[4,895 lines not shown]

 //
 // Is \param MO defined by a floating-point multiply and can be combined?
 static bool canCombineWithFMUL(MachineBasicBlock &MBB, MachineOperand &MO,
                                unsigned MulOpc) {
   return canCombine(MBB, MO, MulOpc);
 }

-// TODO: There are many more machine instruction opcodes to match:
-// 1. Other data types (integer, vectors)
-// 2. Other math / logic operations (xor, or)
+// TODO: There may be more machine instruction opcodes to match:
+// 1. Other data types
+// 2. Other math / logic operations
 // 3. Other forms of the same operation (intrinsics and other variants)
 bool AArch64InstrInfo::isAssociativeAndCommutative(
     const MachineInstr &Inst) const {
   switch (Inst.getOpcode()) {
-  case AArch64::FADDDrr:
+  // == Integer types ==
+  // -- Base instructions --
+  // Opcodes MULWrr and MULXrr don't exist because
+  // `MUL <Wd>, <Wn>, <Wm>` and `MUL <Xd>, <Xn>, <Xm>` are aliases of
+  // `MADD <Wd>, <Wn>, <Wm>, WZR` and `MADD <Xd>, <Xn>, <Xm>, XZR` respectively.
+  // The machine-combiner does not support three-source-operands machine
+  // instruction. So we cannot reassociate MULs.
+  case AArch64::ADDWrr:
+  case AArch64::ADDXrr:
+  case AArch64::ANDWrr:
+  case AArch64::ANDXrr:
+  case AArch64::ORRWrr:
+  case AArch64::ORRXrr:
+  case AArch64::EORWrr:
+  case AArch64::EORXrr:
+  // -- Advanced SIMD instructions --
+  // Opcodes MULv1i64 and MULv2i64 don't exist because corresponding
+  // `MUL <Vd>.1D, <Vn>.1D, <Vm>.1D` and `MUL <Vd>.2D, <Vn>.2D, <Vm>.2D`
+  // don't exist in the Advanced SIMD instruction set.
+  case AArch64::ADDv8i8:
+  case AArch64::ADDv16i8:
+  case AArch64::ADDv4i16:
+  case AArch64::ADDv8i16:
+  case AArch64::ADDv2i32:
+  case AArch64::ADDv4i32:
+  case AArch64::ADDv1i64:
+  case AArch64::ADDv2i64:
+  case AArch64::MULv8i8:
+  case AArch64::MULv16i8:
+  case AArch64::MULv4i16:
+  case AArch64::MULv8i16:
+  case AArch64::MULv2i32:
+  case AArch64::MULv4i32:
+  case AArch64::ANDv8i8:
+  case AArch64::ANDv16i8:
+  case AArch64::ORRv8i8:
+  case AArch64::ORRv16i8:
+  case AArch64::EORv8i8:
+  case AArch64::EORv16i8:
+  // -- SVE instructions --
+  case AArch64::ADD_ZZZ_B:
+  case AArch64::ADD_ZZZ_H:
+  case AArch64::ADD_ZZZ_S:
+  case AArch64::ADD_ZZZ_D:
+  case AArch64::MUL_ZZZ_B:
+  case AArch64::MUL_ZZZ_H:
+  case AArch64::MUL_ZZZ_S:
+  case AArch64::MUL_ZZZ_D:
+  case AArch64::AND_ZZZ:
+  case AArch64::ORR_ZZZ:
+  case AArch64::EOR_ZZZ:
+    return true;
+
+  // == Floating-point types ==
+  // -- Floating-point instructions --
+  case AArch64::FADDHrr:
   case AArch64::FADDSrr:
+  case AArch64::FADDDrr:
+  case AArch64::FMULHrr:
+  case AArch64::FMULSrr:
+  case AArch64::FMULDrr:
+  // -- Advanced SIMD instructions --
+  case AArch64::FADDv4f16:
+  case AArch64::FADDv8f16:
   case AArch64::FADDv2f32:
-  case AArch64::FADDv2f64:
   case AArch64::FADDv4f32:
-  case AArch64::FMULDrr:
-  case AArch64::FMULSrr:
-  case AArch64::FMULX32:
-  case AArch64::FMULX64:
-  case AArch64::FMULXv2f32:
-  case AArch64::FMULXv2f64:
-  case AArch64::FMULXv4f32:
+  case AArch64::FADDv2f64:
+  case AArch64::FMULv4f16:
+  case AArch64::FMULv8f16:
   case AArch64::FMULv2f32:
-  case AArch64::FMULv2f64:
   case AArch64::FMULv4f32:
+  case AArch64::FMULv2f64:
+  // -- SVE instructions --
+  case AArch64::FADD_ZZZ_H:
+  case AArch64::FADD_ZZZ_S:
+  case AArch64::FADD_ZZZ_D:
+  case AArch64::FMUL_ZZZ_H:
+  case AArch64::FMUL_ZZZ_S:
+  case AArch64::FMUL_ZZZ_D:
     return Inst.getParent()->getParent()->getTarget().Options.UnsafeFPMath ||
            (Inst.getFlag(MachineInstr::MIFlag::FmReassoc) &&
             Inst.getFlag(MachineInstr::MIFlag::FmNsz));

   default:
     return false;
   }
 }

 /// Find instructions that can be turned into madd.
 static bool getMaddPatterns(MachineInstr &Root,
                             SmallVectorImpl<MachineCombinerPattern> &Patterns) {

[3,270 lines not shown]

llvm/test/CodeGen/AArch64/GlobalISel/arm64-atomic.ll

[704 lines not shown]

 define i8 @atomic_load_relaxed_8(i8* %p, i32 %off32) #0 {
 ; CHECK-NOLSE-O1-LABEL: atomic_load_relaxed_8:
 ; CHECK-NOLSE-O1: ; %bb.0:
 ; CHECK-NOLSE-O1-NEXT: add x8, x0, #291, lsl #12 ; =1191936
 ; CHECK-NOLSE-O1-NEXT: ldrb w9, [x0, #4095]
 ; CHECK-NOLSE-O1-NEXT: ldrb w10, [x0, w1, sxtw]
 ; CHECK-NOLSE-O1-NEXT: ldurb w11, [x0, #-256]
 ; CHECK-NOLSE-O1-NEXT: ldrb w8, [x8]
-; CHECK-NOLSE-O1-NEXT: add w9, w9, w10
 ; CHECK-NOLSE-O1-NEXT: add w9, w9, w11
+; CHECK-NOLSE-O1-NEXT: add w9, w10, w9
 ; CHECK-NOLSE-O1-NEXT: add w0, w9, w8
 ; CHECK-NOLSE-O1-NEXT: ret
 ;
 ; CHECK-NOLSE-O0-LABEL: atomic_load_relaxed_8:
 ; CHECK-NOLSE-O0: ; %bb.0:
 ; CHECK-NOLSE-O0-NEXT: ldrb w9, [x0, #4095]
 ; CHECK-NOLSE-O0-NEXT: add x8, x0, w1, sxtw
 ; CHECK-NOLSE-O0-NEXT: ldrb w8, [x8]
 ; CHECK-NOLSE-O0-NEXT: add w8, w8, w9, uxtb
 ; CHECK-NOLSE-O0-NEXT: subs x9, x0, #256
 ; CHECK-NOLSE-O0-NEXT: ldrb w9, [x9]
 ; CHECK-NOLSE-O0-NEXT: add w8, w8, w9, uxtb
 ; CHECK-NOLSE-O0-NEXT: add x9, x0, #291, lsl #12 ; =1191936
 ; CHECK-NOLSE-O0-NEXT: ldrb w9, [x9]
 ; CHECK-NOLSE-O0-NEXT: add w0, w8, w9, uxtb
 ; CHECK-NOLSE-O0-NEXT: ret
 ;
 ; CHECK-LSE-O1-LABEL: atomic_load_relaxed_8:
 ; CHECK-LSE-O1: ; %bb.0:
 ; CHECK-LSE-O1-NEXT: ldrb w8, [x0, #4095]
 ; CHECK-LSE-O1-NEXT: ldrb w9, [x0, w1, sxtw]
-; CHECK-LSE-O1-NEXT: add w8, w8, w9
-; CHECK-LSE-O1-NEXT: ldurb w9, [x0, #-256]
-; CHECK-LSE-O1-NEXT: add w8, w8, w9
+; CHECK-LSE-O1-NEXT: ldurb w10, [x0, #-256]
+; CHECK-LSE-O1-NEXT: add w8, w8, w10
+; CHECK-LSE-O1-NEXT: add w8, w9, w8
 ; CHECK-LSE-O1-NEXT: add x9, x0, #291, lsl #12 ; =1191936
 ; CHECK-LSE-O1-NEXT: ldrb w9, [x9]
 ; CHECK-LSE-O1-NEXT: add w0, w8, w9
 ; CHECK-LSE-O1-NEXT: ret
 ;
 ; CHECK-LSE-O0-LABEL: atomic_load_relaxed_8:
 ; CHECK-LSE-O0: ; %bb.0:
 ; CHECK-LSE-O0-NEXT: ldrb w9, [x0, #4095]

[28 lines not shown]

 define i16 @atomic_load_relaxed_16(i16* %p, i32 %off32) #0 {
 ; CHECK-NOLSE-O1-LABEL: atomic_load_relaxed_16:
 ; CHECK-NOLSE-O1: ; %bb.0:
 ; CHECK-NOLSE-O1-NEXT: add x8, x0, #291, lsl #12 ; =1191936
 ; CHECK-NOLSE-O1-NEXT: ldrh w9, [x0, #8190]
 ; CHECK-NOLSE-O1-NEXT: ldrh w10, [x0, w1, sxtw #1]
 ; CHECK-NOLSE-O1-NEXT: ldurh w11, [x0, #-256]
 ; CHECK-NOLSE-O1-NEXT: ldrh w8, [x8]
-; CHECK-NOLSE-O1-NEXT: add w9, w9, w10
 ; CHECK-NOLSE-O1-NEXT: add w9, w9, w11
+; CHECK-NOLSE-O1-NEXT: add w9, w10, w9
 ; CHECK-NOLSE-O1-NEXT: add w0, w9, w8
 ; CHECK-NOLSE-O1-NEXT: ret
 ;
 ; CHECK-NOLSE-O0-LABEL: atomic_load_relaxed_16:
 ; CHECK-NOLSE-O0: ; %bb.0:
 ; CHECK-NOLSE-O0-NEXT: ldrh w9, [x0, #8190]
 ; CHECK-NOLSE-O0-NEXT: add x8, x0, w1, sxtw #1
 ; CHECK-NOLSE-O0-NEXT: ldrh w8, [x8]
 ; CHECK-NOLSE-O0-NEXT: add w8, w8, w9, uxth
 ; CHECK-NOLSE-O0-NEXT: subs x9, x0, #256
 ; CHECK-NOLSE-O0-NEXT: ldrh w9, [x9]
 ; CHECK-NOLSE-O0-NEXT: add w8, w8, w9, uxth
 ; CHECK-NOLSE-O0-NEXT: add x9, x0, #291, lsl #12 ; =1191936
 ; CHECK-NOLSE-O0-NEXT: ldrh w9, [x9]
 ; CHECK-NOLSE-O0-NEXT: add w0, w8, w9, uxth
 ; CHECK-NOLSE-O0-NEXT: ret
 ;
 ; CHECK-LSE-O1-LABEL: atomic_load_relaxed_16:
 ; CHECK-LSE-O1: ; %bb.0:
 ; CHECK-LSE-O1-NEXT: ldrh w8, [x0, #8190]
 ; CHECK-LSE-O1-NEXT: ldrh w9, [x0, w1, sxtw #1]
-; CHECK-LSE-O1-NEXT: add w8, w8, w9
-; CHECK-LSE-O1-NEXT: ldurh w9, [x0, #-256]
-; CHECK-LSE-O1-NEXT: add w8, w8, w9
+; CHECK-LSE-O1-NEXT: ldurh w10, [x0, #-256]
+; CHECK-LSE-O1-NEXT: add w8, w8, w10
+; CHECK-LSE-O1-NEXT: add w8, w9, w8
 ; CHECK-LSE-O1-NEXT: add x9, x0, #291, lsl #12 ; =1191936
 ; CHECK-LSE-O1-NEXT: ldrh w9, [x9]
 ; CHECK-LSE-O1-NEXT: add w0, w8, w9
 ; CHECK-LSE-O1-NEXT: ret
 ;
 ; CHECK-LSE-O0-LABEL: atomic_load_relaxed_16:
 ; CHECK-LSE-O0: ; %bb.0:
 ; CHECK-LSE-O0-NEXT: ldrh w9, [x0, #8190]

[28 lines not shown]

 define i32 @atomic_load_relaxed_32(i32* %p, i32 %off32) #0 {
 ; CHECK-NOLSE-O1-LABEL: atomic_load_relaxed_32:
 ; CHECK-NOLSE-O1: ; %bb.0:
 ; CHECK-NOLSE-O1-NEXT: add x8, x0, #291, lsl #12 ; =1191936
 ; CHECK-NOLSE-O1-NEXT: ldr w9, [x0, #16380]
 ; CHECK-NOLSE-O1-NEXT: ldr w10, [x0, w1, sxtw #2]
 ; CHECK-NOLSE-O1-NEXT: ldur w11, [x0, #-256]
 ; CHECK-NOLSE-O1-NEXT: ldr w8, [x8]
-; CHECK-NOLSE-O1-NEXT: add w9, w9, w10
 ; CHECK-NOLSE-O1-NEXT: add w9, w9, w11
+; CHECK-NOLSE-O1-NEXT: add w9, w10, w9
 ; CHECK-NOLSE-O1-NEXT: add w0, w9, w8
 ; CHECK-NOLSE-O1-NEXT: ret
 ;
 ; CHECK-NOLSE-O0-LABEL: atomic_load_relaxed_32:
 ; CHECK-NOLSE-O0: ; %bb.0:
 ; CHECK-NOLSE-O0-NEXT: ldr w8, [x0, #16380]
 ; CHECK-NOLSE-O0-NEXT: ldr w9, [x0, w1, sxtw #2]
 ; CHECK-NOLSE-O0-NEXT: add w8, w8, w9
 ; CHECK-NOLSE-O0-NEXT: ldur w9, [x0, #-256]
 ; CHECK-NOLSE-O0-NEXT: add w8, w8, w9
 ; CHECK-NOLSE-O0-NEXT: add x9, x0, #291, lsl #12 ; =1191936
 ; CHECK-NOLSE-O0-NEXT: ldr w9, [x9]
 ; CHECK-NOLSE-O0-NEXT: add w0, w8, w9
 ; CHECK-NOLSE-O0-NEXT: ret
 ;
 ; CHECK-LSE-O1-LABEL: atomic_load_relaxed_32:
 ; CHECK-LSE-O1: ; %bb.0:
 ; CHECK-LSE-O1-NEXT: ldr w8, [x0, #16380]
 ; CHECK-LSE-O1-NEXT: ldr w9, [x0, w1, sxtw #2]
-; CHECK-LSE-O1-NEXT: add w8, w8, w9
-; CHECK-LSE-O1-NEXT: ldur w9, [x0, #-256]
-; CHECK-LSE-O1-NEXT: add w8, w8, w9
+; CHECK-LSE-O1-NEXT: ldur w10, [x0, #-256]
+; CHECK-LSE-O1-NEXT: add w8, w8, w10
+; CHECK-LSE-O1-NEXT: add w8, w9, w8
 ; CHECK-LSE-O1-NEXT: add x9, x0, #291, lsl #12 ; =1191936
 ; CHECK-LSE-O1-NEXT: ldr w9, [x9]
 ; CHECK-LSE-O1-NEXT: add w0, w8, w9
 ; CHECK-LSE-O1-NEXT: ret
 ;
 ; CHECK-LSE-O0-LABEL: atomic_load_relaxed_32:
 ; CHECK-LSE-O0: ; %bb.0:
 ; CHECK-LSE-O0-NEXT: ldr w8, [x0, #16380]

[26 lines not shown]

 define i64 @atomic_load_relaxed_64(i64* %p, i32 %off32) #0 {
 ; CHECK-NOLSE-O1-LABEL: atomic_load_relaxed_64:
 ; CHECK-NOLSE-O1: ; %bb.0:
 ; CHECK-NOLSE-O1-NEXT: add x8, x0, #291, lsl #12 ; =1191936
 ; CHECK-NOLSE-O1-NEXT: ldr x9, [x0, #32760]
 ; CHECK-NOLSE-O1-NEXT: ldr x10, [x0, w1, sxtw #3]
 ; CHECK-NOLSE-O1-NEXT: ldur x11, [x0, #-256]
 ; CHECK-NOLSE-O1-NEXT: ldr x8, [x8]
-; CHECK-NOLSE-O1-NEXT: add x9, x9, x10
 ; CHECK-NOLSE-O1-NEXT: add x9, x9, x11
+; CHECK-NOLSE-O1-NEXT: add x9, x10, x9
 ; CHECK-NOLSE-O1-NEXT: add x0, x9, x8
 ; CHECK-NOLSE-O1-NEXT: ret
 ;
 ; CHECK-NOLSE-O0-LABEL: atomic_load_relaxed_64:
 ; CHECK-NOLSE-O0: ; %bb.0:
 ; CHECK-NOLSE-O0-NEXT: ldr x8, [x0, #32760]
 ; CHECK-NOLSE-O0-NEXT: ldr x9, [x0, w1, sxtw #3]
 ; CHECK-NOLSE-O0-NEXT: add x8, x8, x9
 ; CHECK-NOLSE-O0-NEXT: ldur x9, [x0, #-256]
 ; CHECK-NOLSE-O0-NEXT: add x8, x8, x9
 ; CHECK-NOLSE-O0-NEXT: add x9, x0, #291, lsl #12 ; =1191936
 ; CHECK-NOLSE-O0-NEXT: ldr x9, [x9]
 ; CHECK-NOLSE-O0-NEXT: add x0, x8, x9
 ; CHECK-NOLSE-O0-NEXT: ret
 ;
 ; CHECK-LSE-O1-LABEL: atomic_load_relaxed_64:
 ; CHECK-LSE-O1: ; %bb.0:
 ; CHECK-LSE-O1-NEXT: ldr x8, [x0, #32760]
 ; CHECK-LSE-O1-NEXT: ldr x9, [x0, w1, sxtw #3]
-; CHECK-LSE-O1-NEXT: add x8, x8, x9
-; CHECK-LSE-O1-NEXT: ldur x9, [x0, #-256]
-; CHECK-LSE-O1-NEXT: add x8, x8, x9
+; CHECK-LSE-O1-NEXT: ldur x10, [x0, #-256]
+; CHECK-LSE-O1-NEXT: add x8, x8, x10
+; CHECK-LSE-O1-NEXT: add x8, x9, x8
 ; CHECK-LSE-O1-NEXT: add x9, x0, #291, lsl #12 ; =1191936
 ; CHECK-LSE-O1-NEXT: ldr x9, [x9]
 ; CHECK-LSE-O1-NEXT: add x0, x8, x9
 ; CHECK-LSE-O1-NEXT: ret
 ;
 ; CHECK-LSE-O0-LABEL: atomic_load_relaxed_64:
 ; CHECK-LSE-O0: ; %bb.0:
 ; CHECK-LSE-O0-NEXT: ldr x8, [x0, #32760]

[2,017 lines not shown]

llvm/test/CodeGen/AArch64/GlobalISel/arm64-pcsections.ll

[383 lines not shown]

 define i8 @atomic_load_relaxed_8(i8* %p, i32 %off32) {
 ; CHECK: bb.0 (%ir-block.0):
 ; CHECK-NEXT: liveins: $w1, $x0
 ; CHECK-NEXT: {{ $}}
 ; CHECK-NEXT: renamable $x8 = ADDXri renamable $x0, 291, 12
 ; CHECK-NEXT: renamable $w9 = LDRBBui renamable $x0, 4095, pcsections !0 :: (load monotonic (s8) from %ir.ptr_unsigned)
 ; CHECK-NEXT: renamable $w10 = LDRBBroW renamable $x0, killed renamable $w1, 1, 0, pcsections !0 :: (load unordered (s8) from %ir.ptr_regoff)
 ; CHECK-NEXT: renamable $w11 = LDURBBi killed renamable $x0, -256, pcsections !0 :: (load monotonic (s8) from %ir.ptr_unscaled)
 ; CHECK-NEXT: renamable $w8 = LDRBBui killed renamable $x8, 0, pcsections !0 :: (load unordered (s8) from %ir.ptr_random)
-; CHECK-NEXT: $w9 = ADDWrs killed renamable $w9, killed renamable $w10, 0, pcsections !0
-; CHECK-NEXT: $w9 = ADDWrs killed renamable $w9, killed renamable $w11, 0, pcsections !0
+; CHECK-NEXT: $w9 = ADDWrs killed renamable $w9, killed renamable $w11, 0
+; CHECK-NEXT: $w9 = ADDWrs killed renamable $w10, killed renamable $w9, 0
 ; CHECK-NEXT: $w0 = ADDWrs killed renamable $w9, killed renamable $w8, 0, pcsections !0
 ; CHECK-NEXT: RET undef $lr, implicit $w0
 %ptr_unsigned = getelementptr i8, i8* %p, i32 4095
 %val_unsigned = load atomic i8, i8* %ptr_unsigned monotonic, align 1, !pcsections !0

 %ptr_regoff = getelementptr i8, i8* %p, i32 %off32
 %val_regoff = load atomic i8, i8* %ptr_regoff unordered, align 1, !pcsections !0
 %tot1 = add i8 %val_unsigned, %val_regoff, !pcsections !0

[14 lines not shown]

 define i16 @atomic_load_relaxed_16(i16* %p, i32 %off32) {
 ; CHECK: bb.0 (%ir-block.0):
 ; CHECK-NEXT: liveins: $w1, $x0
 ; CHECK-NEXT: {{ $}}
 ; CHECK-NEXT: renamable $x8 = ADDXri renamable $x0, 291, 12
 ; CHECK-NEXT: renamable $w9 = LDRHHui renamable $x0, 4095, pcsections !0 :: (load monotonic (s16) from %ir.ptr_unsigned)
 ; CHECK-NEXT: renamable $w10 = LDRHHroW renamable $x0, killed renamable $w1, 1, 1, pcsections !0 :: (load unordered (s16) from %ir.ptr_regoff)
 ; CHECK-NEXT: renamable $w11 = LDURHHi killed renamable $x0, -256, pcsections !0 :: (load monotonic (s16) from %ir.ptr_unscaled)
 ; CHECK-NEXT: renamable $w8 = LDRHHui killed renamable $x8, 0, pcsections !0 :: (load unordered (s16) from %ir.ptr_random)
-; CHECK-NEXT: $w9 = ADDWrs killed renamable $w9, killed renamable $w10, 0, pcsections !0
-; CHECK-NEXT: $w9 = ADDWrs killed renamable $w9, killed renamable $w11, 0, pcsections !0
+; CHECK-NEXT: $w9 = ADDWrs killed renamable $w9, killed renamable $w11, 0
+; CHECK-NEXT: $w9 = ADDWrs killed renamable $w10, killed renamable $w9, 0
 ; CHECK-NEXT: $w0 = ADDWrs killed renamable $w9, killed renamable $w8, 0, pcsections !0
 ; CHECK-NEXT: RET undef $lr, implicit $w0
 %ptr_unsigned = getelementptr i16, i16* %p, i32 4095
 %val_unsigned = load atomic i16, i16* %ptr_unsigned monotonic, align 2, !pcsections !0

 %ptr_regoff = getelementptr i16, i16* %p, i32 %off32
 %val_regoff = load atomic i16, i16* %ptr_regoff unordered, align 2, !pcsections !0
 %tot1 = add i16 %val_unsigned, %val_regoff, !pcsections !0

[14 lines not shown]

 define i32 @atomic_load_relaxed_32(i32* %p, i32 %off32) {
 ; CHECK: bb.0 (%ir-block.0):
 ; CHECK-NEXT: liveins: $w1, $x0
 ; CHECK-NEXT: {{ $}}
 ; CHECK-NEXT: renamable $x8 = ADDXri renamable $x0, 291, 12
 ; CHECK-NEXT: renamable $w9 = LDRWui renamable $x0, 4095, pcsections !0 :: (load monotonic (s32) from %ir.ptr_unsigned)
 ; CHECK-NEXT: renamable $w10 = LDRWroW renamable $x0, killed renamable $w1, 1, 1, pcsections !0 :: (load unordered (s32) from %ir.ptr_regoff)
 ; CHECK-NEXT: renamable $w11 = LDURWi killed renamable $x0, -256, pcsections !0 :: (load monotonic (s32) from %ir.ptr_unscaled)
 ; CHECK-NEXT: renamable $w8 = LDRWui killed renamable $x8, 0, pcsections !0 :: (load unordered (s32) from %ir.ptr_random)
-; CHECK-NEXT: $w9 = ADDWrs killed renamable $w9, killed renamable $w10, 0, pcsections !0
-; CHECK-NEXT: $w9 = ADDWrs killed renamable $w9, killed renamable $w11, 0, pcsections !0
+; CHECK-NEXT: $w9 = ADDWrs killed renamable $w9, killed renamable $w11, 0
+; CHECK-NEXT: $w9 = ADDWrs killed renamable $w10, killed renamable $w9, 0
 ; CHECK-NEXT: $w0 = ADDWrs killed renamable $w9, killed renamable $w8, 0, pcsections !0
 ; CHECK-NEXT: RET undef $lr, implicit $w0
 %ptr_unsigned = getelementptr i32, i32* %p, i32 4095
 %val_unsigned = load atomic i32, i32* %ptr_unsigned monotonic, align 4, !pcsections !0

 %ptr_regoff = getelementptr i32, i32* %p, i32 %off32
 %val_regoff = load atomic i32, i32* %ptr_regoff unordered, align 4, !pcsections !0
 %tot1 = add i32 %val_unsigned, %val_regoff, !pcsections !0
Show All 14 Lines	define i64 @atomic_load_relaxed_64(i64* %p, i32 %off32) {
; CHECK: bb.0 (%ir-block.0):		; CHECK: bb.0 (%ir-block.0):
; CHECK-NEXT: liveins: $w1, $x0		; CHECK-NEXT: liveins: $w1, $x0
; CHECK-NEXT: {{ $}}		; CHECK-NEXT: {{ $}}
; CHECK-NEXT: renamable $x8 = ADDXri renamable $x0, 291, 12		; CHECK-NEXT: renamable $x8 = ADDXri renamable $x0, 291, 12
; CHECK-NEXT: renamable $x9 = LDRXui renamable $x0, 4095, pcsections !0 :: (load monotonic (s64) from %ir.ptr_unsigned)		; CHECK-NEXT: renamable $x9 = LDRXui renamable $x0, 4095, pcsections !0 :: (load monotonic (s64) from %ir.ptr_unsigned)
; CHECK-NEXT: renamable $x10 = LDRXroW renamable $x0, killed renamable $w1, 1, 1, pcsections !0 :: (load unordered (s64) from %ir.ptr_regoff)		; CHECK-NEXT: renamable $x10 = LDRXroW renamable $x0, killed renamable $w1, 1, 1, pcsections !0 :: (load unordered (s64) from %ir.ptr_regoff)
; CHECK-NEXT: renamable $x11 = LDURXi killed renamable $x0, -256, pcsections !0 :: (load monotonic (s64) from %ir.ptr_unscaled)		; CHECK-NEXT: renamable $x11 = LDURXi killed renamable $x0, -256, pcsections !0 :: (load monotonic (s64) from %ir.ptr_unscaled)
; CHECK-NEXT: renamable $x8 = LDRXui killed renamable $x8, 0, pcsections !0 :: (load unordered (s64) from %ir.ptr_random)		; CHECK-NEXT: renamable $x8 = LDRXui killed renamable $x8, 0, pcsections !0 :: (load unordered (s64) from %ir.ptr_random)
; CHECK-NEXT: $x9 = ADDXrs killed renamable $x9, killed renamable $x10, 0, pcsections !0		; CHECK-NEXT: $x9 = ADDXrs killed renamable $x9, killed renamable $x11, 0
; CHECK-NEXT: $x9 = ADDXrs killed renamable $x9, killed renamable $x11, 0, pcsections !0		; CHECK-NEXT: $x9 = ADDXrs killed renamable $x10, killed renamable $x9, 0
; CHECK-NEXT: $x0 = ADDXrs killed renamable $x9, killed renamable $x8, 0, pcsections !0		; CHECK-NEXT: $x0 = ADDXrs killed renamable $x9, killed renamable $x8, 0, pcsections !0
; CHECK-NEXT: RET undef $lr, implicit $x0		; CHECK-NEXT: RET undef $lr, implicit $x0
%ptr_unsigned = getelementptr i64, i64* %p, i32 4095		%ptr_unsigned = getelementptr i64, i64* %p, i32 4095
%val_unsigned = load atomic i64, i64* %ptr_unsigned monotonic, align 8, !pcsections !0		%val_unsigned = load atomic i64, i64* %ptr_unsigned monotonic, align 8, !pcsections !0

%ptr_regoff = getelementptr i64, i64* %p, i32 %off32		%ptr_regoff = getelementptr i64, i64* %p, i32 %off32
%val_regoff = load atomic i64, i64* %ptr_regoff unordered, align 8, !pcsections !0		%val_regoff = load atomic i64, i64* %ptr_regoff unordered, align 8, !pcsections !0
%tot1 = add i64 %val_unsigned, %val_regoff, !pcsections !0		%tot1 = add i64 %val_unsigned, %val_regoff, !pcsections !0

llvm/test/CodeGen/AArch64/aarch64-dynamic-stack-layout.ll

	; RUN: llc -verify-machineinstrs -mtriple=aarch64-none-linux-gnu -disable-post-ra < %s | FileCheck %s			; RUN: llc -verify-machineinstrs -mtriple=aarch64-none-linux-gnu -disable-post-ra -aarch64-enable-mcr=false < %s | FileCheck %s
	; RUN: llc -verify-machineinstrs -mtriple=arm64-apple-ios -frame-pointer=all -disable-post-ra < %s | FileCheck %s --check-prefix=CHECK-MACHO			; RUN: llc -verify-machineinstrs -mtriple=arm64-apple-ios -frame-pointer=all -disable-post-ra -aarch64-enable-mcr=false < %s | FileCheck %s --check-prefix=CHECK-MACHO

	; This test aims to check basic correctness of frame layout &			; This test aims to check basic correctness of frame layout &
	; frame access code. There are 8 functions in this test file,			; frame access code. There are 8 functions in this test file,
	; each function implements one element in the cartesian product			; each function implements one element in the cartesian product
	; of:			; of:
	; . a function having a VLA/noVLA			; . a function having a VLA/noVLA
	; . a function with dynamic stack realignment/no dynamic stack realignment.			; . a function with dynamic stack realignment/no dynamic stack realignment.
	; . a function needing a frame pointer/no frame pointer,			; . a function needing a frame pointer/no frame pointer,

llvm/test/CodeGen/AArch64/arm64-rev.ll

; CHECK-NEXT: ret		; CHECK-NEXT: ret
;		;
; GISEL-LABEL: test_rev16_w:		; GISEL-LABEL: test_rev16_w:
; GISEL: // %bb.0: // %entry		; GISEL: // %bb.0: // %entry
; GISEL-NEXT: lsr w8, w0, #8		; GISEL-NEXT: lsr w8, w0, #8
; GISEL-NEXT: lsl w9, w0, #8		; GISEL-NEXT: lsl w9, w0, #8
; GISEL-NEXT: and w10, w8, #0xff0000		; GISEL-NEXT: and w10, w8, #0xff0000
; GISEL-NEXT: and w11, w9, #0xff000000		; GISEL-NEXT: and w11, w9, #0xff000000
		; GISEL-NEXT: and w8, w8, #0xff
; GISEL-NEXT: and w9, w9, #0xff00		; GISEL-NEXT: and w9, w9, #0xff00
; GISEL-NEXT: orr w10, w11, w10		; GISEL-NEXT: orr w10, w11, w10
; GISEL-NEXT: and w8, w8, #0xff		; GISEL-NEXT: orr w8, w9, w8
; GISEL-NEXT: orr w9, w10, w9		; GISEL-NEXT: orr w0, w10, w8
; GISEL-NEXT: orr w0, w9, w8
; GISEL-NEXT: ret		; GISEL-NEXT: ret
entry:		entry:
%tmp1 = lshr i32 %X, 8		%tmp1 = lshr i32 %X, 8
%X15 = bitcast i32 %X to i32		%X15 = bitcast i32 %X to i32
%tmp4 = shl i32 %X15, 8		%tmp4 = shl i32 %X15, 8
%tmp2 = and i32 %tmp1, 16711680		%tmp2 = and i32 %tmp1, 16711680
%tmp5 = and i32 %tmp4, -16777216		%tmp5 = and i32 %tmp4, -16777216
%tmp9 = and i32 %tmp1, 255		%tmp9 = and i32 %tmp1, 255
▲ Show 20 Lines • Show All 525 Lines • ▼ Show 20 Lines
; CHECK-NEXT: ret		; CHECK-NEXT: ret
;		;
; GISEL-LABEL: test_rev16_x_hwbyteswaps_complex1:		; GISEL-LABEL: test_rev16_x_hwbyteswaps_complex1:
; GISEL: // %bb.0: // %entry		; GISEL: // %bb.0: // %entry
; GISEL-NEXT: lsr x8, x0, #8		; GISEL-NEXT: lsr x8, x0, #8
; GISEL-NEXT: lsl x9, x0, #8		; GISEL-NEXT: lsl x9, x0, #8
; GISEL-NEXT: and x10, x8, #0xff000000000000		; GISEL-NEXT: and x10, x8, #0xff000000000000
; GISEL-NEXT: and x11, x9, #0xff00000000000000		; GISEL-NEXT: and x11, x9, #0xff00000000000000
		; GISEL-NEXT: and x12, x8, #0xff00000000
		; GISEL-NEXT: and x13, x9, #0xff0000000000
; GISEL-NEXT: orr x10, x10, x11		; GISEL-NEXT: orr x10, x10, x11
; GISEL-NEXT: and x11, x8, #0xff00000000		; GISEL-NEXT: orr x11, x12, x13
; GISEL-NEXT: orr x10, x10, x11		; GISEL-NEXT: and x12, x8, #0xff0000
; GISEL-NEXT: and x11, x9, #0xff0000000000		; GISEL-NEXT: and x13, x9, #0xff000000
; GISEL-NEXT: orr x10, x10, x11		; GISEL-NEXT: orr x12, x12, x13
; GISEL-NEXT: and x11, x8, #0xff0000
; GISEL-NEXT: orr x10, x10, x11
; GISEL-NEXT: and x11, x9, #0xff000000
; GISEL-NEXT: orr x10, x10, x11
; GISEL-NEXT: and x8, x8, #0xff		; GISEL-NEXT: and x8, x8, #0xff
		; GISEL-NEXT: orr x10, x10, x11
		; GISEL-NEXT: orr x8, x12, x8
; GISEL-NEXT: orr x8, x10, x8		; GISEL-NEXT: orr x8, x10, x8
; GISEL-NEXT: and x9, x9, #0xff00		; GISEL-NEXT: and x9, x9, #0xff00
; GISEL-NEXT: orr x0, x8, x9		; GISEL-NEXT: orr x0, x8, x9
; GISEL-NEXT: ret		; GISEL-NEXT: ret
entry:		entry:
%0 = lshr i64 %a, 8		%0 = lshr i64 %a, 8
%1 = and i64 %0, 71776119061217280		%1 = and i64 %0, 71776119061217280
%2 = shl i64 %a, 8		%2 = shl i64 %a, 8
Show All 27 Lines
; CHECK-NEXT: bfi x8, x11, #24, #8		; CHECK-NEXT: bfi x8, x11, #24, #8
; CHECK-NEXT: bfi x8, x0, #8, #8		; CHECK-NEXT: bfi x8, x0, #8, #8
; CHECK-NEXT: mov x0, x8		; CHECK-NEXT: mov x0, x8
; CHECK-NEXT: ret		; CHECK-NEXT: ret
;		;
; GISEL-LABEL: test_rev16_x_hwbyteswaps_complex2:		; GISEL-LABEL: test_rev16_x_hwbyteswaps_complex2:
; GISEL: // %bb.0: // %entry		; GISEL: // %bb.0: // %entry
; GISEL-NEXT: lsr x8, x0, #8		; GISEL-NEXT: lsr x8, x0, #8
; GISEL-NEXT: lsl x10, x0, #8		; GISEL-NEXT: lsl x9, x0, #8
; GISEL-NEXT: and x9, x8, #0xff000000000000		; GISEL-NEXT: and x10, x8, #0xff000000000000
; GISEL-NEXT: and x11, x8, #0xff00000000		; GISEL-NEXT: and x11, x8, #0xff00000000
; GISEL-NEXT: orr x9, x9, x11		; GISEL-NEXT: and x12, x8, #0xff0000
; GISEL-NEXT: and x11, x8, #0xff0000
; GISEL-NEXT: orr x9, x9, x11
; GISEL-NEXT: and x8, x8, #0xff		; GISEL-NEXT: and x8, x8, #0xff
; GISEL-NEXT: orr x8, x9, x8		; GISEL-NEXT: orr x10, x10, x11
; GISEL-NEXT: and x9, x10, #0xff00000000000000		; GISEL-NEXT: orr x8, x12, x8
; GISEL-NEXT: orr x8, x8, x9		; GISEL-NEXT: and x11, x9, #0xff00000000000000
; GISEL-NEXT: and x9, x10, #0xff0000000000		; GISEL-NEXT: and x12, x9, #0xff0000000000
; GISEL-NEXT: orr x8, x8, x9		; GISEL-NEXT: orr x11, x11, x12
; GISEL-NEXT: and x9, x10, #0xff000000		; GISEL-NEXT: and x12, x9, #0xff000000
; GISEL-NEXT: orr x8, x8, x9		; GISEL-NEXT: orr x8, x10, x8
; GISEL-NEXT: and x9, x10, #0xff00		; GISEL-NEXT: orr x10, x11, x12
		; GISEL-NEXT: orr x8, x8, x10
		; GISEL-NEXT: and x9, x9, #0xff00
; GISEL-NEXT: orr x0, x8, x9		; GISEL-NEXT: orr x0, x8, x9
; GISEL-NEXT: ret		; GISEL-NEXT: ret
entry:		entry:
%0 = lshr i64 %a, 8		%0 = lshr i64 %a, 8
%1 = and i64 %0, 71776119061217280		%1 = and i64 %0, 71776119061217280
%2 = shl i64 %a, 8		%2 = shl i64 %a, 8
%3 = and i64 %0, 1095216660480		%3 = and i64 %0, 1095216660480
%4 = or i64 %1, %3		%4 = or i64 %1, %3
Show All 34 Lines
; CHECK-NEXT: ret		; CHECK-NEXT: ret
;		;
; GISEL-LABEL: test_rev16_x_hwbyteswaps_complex3:		; GISEL-LABEL: test_rev16_x_hwbyteswaps_complex3:
; GISEL: // %bb.0: // %entry		; GISEL: // %bb.0: // %entry
; GISEL-NEXT: lsr x8, x0, #8		; GISEL-NEXT: lsr x8, x0, #8
; GISEL-NEXT: lsl x9, x0, #8		; GISEL-NEXT: lsl x9, x0, #8
; GISEL-NEXT: and x10, x8, #0xff000000000000		; GISEL-NEXT: and x10, x8, #0xff000000000000
; GISEL-NEXT: and x11, x9, #0xff00000000000000		; GISEL-NEXT: and x11, x9, #0xff00000000000000
		; GISEL-NEXT: and x12, x8, #0xff00000000
		; GISEL-NEXT: and x13, x9, #0xff0000000000
; GISEL-NEXT: orr x10, x11, x10		; GISEL-NEXT: orr x10, x11, x10
; GISEL-NEXT: and x11, x8, #0xff00000000		; GISEL-NEXT: orr x11, x12, x13
; GISEL-NEXT: orr x10, x11, x10		; GISEL-NEXT: and x12, x8, #0xff0000
; GISEL-NEXT: and x11, x9, #0xff0000000000		; GISEL-NEXT: and x13, x9, #0xff000000
; GISEL-NEXT: orr x10, x11, x10		; GISEL-NEXT: orr x12, x12, x13
; GISEL-NEXT: and x11, x8, #0xff0000
; GISEL-NEXT: orr x10, x11, x10
; GISEL-NEXT: and x11, x9, #0xff000000
; GISEL-NEXT: orr x10, x11, x10
; GISEL-NEXT: and x8, x8, #0xff		; GISEL-NEXT: and x8, x8, #0xff
; GISEL-NEXT: orr x8, x8, x10		; GISEL-NEXT: orr x10, x10, x11
		; GISEL-NEXT: orr x8, x12, x8
		; GISEL-NEXT: orr x8, x10, x8
; GISEL-NEXT: and x9, x9, #0xff00		; GISEL-NEXT: and x9, x9, #0xff00
; GISEL-NEXT: orr x0, x9, x8		; GISEL-NEXT: orr x0, x9, x8
; GISEL-NEXT: ret		; GISEL-NEXT: ret
entry:		entry:
%0 = lshr i64 %a, 8		%0 = lshr i64 %a, 8
%1 = and i64 %0, 71776119061217280		%1 = and i64 %0, 71776119061217280
%2 = shl i64 %a, 8		%2 = shl i64 %a, 8
%3 = and i64 %2, -72057594037927936		%3 = and i64 %2, -72057594037927936
▲ Show 20 Lines • Show All 44 Lines • ▼ Show 20 Lines	entry:
%6 = or i64 %4, %5		%6 = or i64 %4, %5
ret i64 %6		ret i64 %6
}		}

define i64 @test_or_and_combine2(i64 %a, i64 %b) nounwind {		define i64 @test_or_and_combine2(i64 %a, i64 %b) nounwind {
; CHECK-LABEL: test_or_and_combine2:		; CHECK-LABEL: test_or_and_combine2:
; CHECK: // %bb.0: // %entry		; CHECK: // %bb.0: // %entry
; CHECK-NEXT: lsr x8, x0, #8		; CHECK-NEXT: lsr x8, x0, #8
; CHECK-NEXT: lsl x10, x0, #8		; CHECK-NEXT: lsl x9, x0, #8
; CHECK-NEXT: and x9, x8, #0xff000000000000		; CHECK-NEXT: and x10, x8, #0xff000000000000
		; CHECK-NEXT: and x11, x9, #0xff00000000
; CHECK-NEXT: and x8, x8, #0xff0000		; CHECK-NEXT: and x8, x8, #0xff0000
; CHECK-NEXT: orr x9, x9, x10		; CHECK-NEXT: orr x9, x10, x9
; CHECK-NEXT: and x10, x10, #0xff00000000		; CHECK-NEXT: orr x8, x11, x8
; CHECK-NEXT: orr x9, x9, x10
; CHECK-NEXT: orr x0, x9, x8		; CHECK-NEXT: orr x0, x9, x8
; CHECK-NEXT: ret		; CHECK-NEXT: ret
;		;
; GISEL-LABEL: test_or_and_combine2:		; GISEL-LABEL: test_or_and_combine2:
; GISEL: // %bb.0: // %entry		; GISEL: // %bb.0: // %entry
; GISEL-NEXT: lsr x8, x0, #8		; GISEL-NEXT: lsr x8, x0, #8
; GISEL-NEXT: lsl x10, x0, #8		; GISEL-NEXT: lsl x9, x0, #8
; GISEL-NEXT: and x9, x8, #0xff000000000000		; GISEL-NEXT: and x10, x8, #0xff000000000000
		; GISEL-NEXT: and x11, x9, #0xff00000000
; GISEL-NEXT: and x8, x8, #0xff0000		; GISEL-NEXT: and x8, x8, #0xff0000
; GISEL-NEXT: orr x9, x9, x10		; GISEL-NEXT: orr x9, x10, x9
; GISEL-NEXT: and x10, x10, #0xff00000000		; GISEL-NEXT: orr x8, x11, x8
; GISEL-NEXT: orr x9, x9, x10
; GISEL-NEXT: orr x0, x9, x8		; GISEL-NEXT: orr x0, x9, x8
; GISEL-NEXT: ret		; GISEL-NEXT: ret
entry:		entry:
%0 = lshr i64 %a, 8		%0 = lshr i64 %a, 8
%1 = and i64 %0, 71776119061217280		%1 = and i64 %0, 71776119061217280
%2 = shl i64 %a, 8		%2 = shl i64 %a, 8
%3 = or i64 %1, %2		%3 = or i64 %1, %2
%4 = and i64 %2, 1095216660480		%4 = and i64 %2, 1095216660480
Show All 27 Lines

llvm/test/CodeGen/AArch64/cmp-chains.ll

	;			;
	; GISEL-LABEL: cmp_and4:			; GISEL-LABEL: cmp_and4:
	; GISEL: // %bb.0:			; GISEL: // %bb.0:
	; GISEL-NEXT: cmp w2, w3			; GISEL-NEXT: cmp w2, w3
	; GISEL-NEXT: cset w8, hi			; GISEL-NEXT: cset w8, hi
	; GISEL-NEXT: cmp w0, w1			; GISEL-NEXT: cmp w0, w1
	; GISEL-NEXT: cset w9, lo			; GISEL-NEXT: cset w9, lo
	; GISEL-NEXT: cmp w4, w5			; GISEL-NEXT: cmp w4, w5
	; GISEL-NEXT: and w8, w8, w9			; GISEL-NEXT: cset w10, ne
	; GISEL-NEXT: cset w9, ne
	; GISEL-NEXT: cmp w6, w7			; GISEL-NEXT: cmp w6, w7
				; GISEL-NEXT: cset w11, eq
	; GISEL-NEXT: and w8, w8, w9			; GISEL-NEXT: and w8, w8, w9
	; GISEL-NEXT: cset w9, eq			; GISEL-NEXT: and w9, w10, w11
	; GISEL-NEXT: and w0, w8, w9			; GISEL-NEXT: and w0, w8, w9
	; GISEL-NEXT: ret			; GISEL-NEXT: ret
	%9 = icmp ugt i32 %2, %3			%9 = icmp ugt i32 %2, %3
	%10 = icmp ult i32 %0, %1			%10 = icmp ult i32 %0, %1
	%11 = select i1 %9, i1 %10, i1 false			%11 = select i1 %9, i1 %10, i1 false
	%12 = icmp ne i32 %4, %5			%12 = icmp ne i32 %4, %5
	%13 = select i1 %11, i1 %12, i1 false			%13 = select i1 %11, i1 %12, i1 false
	%14 = icmp eq i32 %6, %7			%14 = icmp eq i32 %6, %7
	▲ Show 20 Lines • Show All 69 Lines • ▼ Show 20 Lines
	;			;
	; GISEL-LABEL: cmp_or4:			; GISEL-LABEL: cmp_or4:
	; GISEL: // %bb.0:			; GISEL: // %bb.0:
	; GISEL-NEXT: cmp w0, w1			; GISEL-NEXT: cmp w0, w1
	; GISEL-NEXT: cset w8, lo			; GISEL-NEXT: cset w8, lo
	; GISEL-NEXT: cmp w2, w3			; GISEL-NEXT: cmp w2, w3
	; GISEL-NEXT: cset w9, hi			; GISEL-NEXT: cset w9, hi
	; GISEL-NEXT: cmp w4, w5			; GISEL-NEXT: cmp w4, w5
	; GISEL-NEXT: orr w8, w8, w9			; GISEL-NEXT: cset w10, ne
	; GISEL-NEXT: cset w9, ne
	; GISEL-NEXT: cmp w6, w7			; GISEL-NEXT: cmp w6, w7
				; GISEL-NEXT: cset w11, eq
	; GISEL-NEXT: orr w8, w8, w9			; GISEL-NEXT: orr w8, w8, w9
	; GISEL-NEXT: cset w9, eq			; GISEL-NEXT: orr w9, w10, w11
	; GISEL-NEXT: orr w0, w8, w9			; GISEL-NEXT: orr w0, w8, w9
	; GISEL-NEXT: ret			; GISEL-NEXT: ret
	%9 = icmp ult i32 %0, %1			%9 = icmp ult i32 %0, %1
	%10 = icmp ugt i32 %2, %3			%10 = icmp ugt i32 %2, %3
	%11 = select i1 %9, i1 true, i1 %10			%11 = select i1 %9, i1 true, i1 %10
	%12 = icmp ne i32 %4, %5			%12 = icmp ne i32 %4, %5
	%13 = select i1 %11, i1 true, i1 %12			%13 = select i1 %11, i1 true, i1 %12
	%14 = icmp eq i32 %6, %7			%14 = icmp eq i32 %6, %7
	▲ Show 20 Lines • Show All 60 Lines • Show Last 20 Lines
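
As a reading aid for the GISEL changes to cmp_and4 and cmp_or4 above: the old output combines the four cset results with a serial chain of three dependent and/orr instructions, while the new output forms two independent pairs and then combines them. The Python sketch below only models that regrouping and its dependency depth. The helper names (`serial_chain`, `balanced_tree`) and the 32-bit mask are assumptions made for illustration and are not part of LLVM; the MachineCombiner itself rewrites pairs of instructions guided by the target's latency model rather than building a whole tree like this.

```python
from functools import reduce

MASK32 = 0xFFFFFFFF  # assumption: model 32-bit W registers


def and32(a, b):
    return (a & b) & MASK32


def or32(a, b):
    return (a | b) & MASK32


def serial_chain(op, values):
    """((v0 op v1) op v2) op ...: every op waits for the previous one."""
    return reduce(op, values), len(values) - 1


def balanced_tree(op, values):
    """Combine neighbours pairwise, level by level; returns (result, depth)."""
    level = list(values)
    depth = 0
    while len(level) > 1:
        nxt = [op(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:  # an odd element is carried to the next level
            nxt.append(level[-1])
        level = nxt
        depth += 1
    return level[0], depth


flags = [1, 1, 0, 1]  # stand-ins for the four cset results
print(serial_chain(and32, flags))   # same value, dependency depth 3
print(balanced_tree(and32, flags))  # same value, dependency depth 2
```

Because and/orr are associative and commutative, both groupings give the same value; only the length of the dependency chain changes, which is what lets two of the three operations issue in parallel.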

llvm/test/CodeGen/AArch64/machine-combiner.ll

	; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
	; RUN: llc -mtriple=aarch64-gnu-linux -mcpu=cortex-a57 < %s | FileCheck %s --check-prefixes=CHECK,CHECK-STD			; RUN: llc -mtriple=aarch64-gnu-linux -mcpu=cortex-a710 < %s | FileCheck %s --check-prefixes=CHECK,CHECK-STD
	; RUN: llc -mtriple=aarch64-gnu-linux -mcpu=cortex-a57 -enable-unsafe-fp-math < %s | FileCheck %s --check-prefixes=CHECK,CHECK-UNSAFE			; RUN: llc -mtriple=aarch64-gnu-linux -mcpu=cortex-a710 -enable-unsafe-fp-math < %s | FileCheck %s --check-prefixes=CHECK,CHECK-UNSAFE

	; Incremental updates of the instruction depths should be enough for this test			; Incremental updates of the instruction depths should be enough for this test
	; case.			; case.
	; RUN: llc -mtriple=aarch64-gnu-linux -mcpu=cortex-a57 -enable-unsafe-fp-math \			; RUN: llc -mtriple=aarch64-gnu-linux -mcpu=cortex-a710 -enable-unsafe-fp-math \
	; RUN: -machine-combiner-inc-threshold=0 -machine-combiner-verify-pattern-order=true < %s | FileCheck %s --check-prefixes=CHECK,CHECK-UNSAFE			; RUN: -machine-combiner-inc-threshold=0 -machine-combiner-verify-pattern-order=true < %s | FileCheck %s --check-prefixes=CHECK,CHECK-UNSAFE

	; Verify that the first two adds are independent regardless of how the inputs are			; Verify that the first two adds are independent regardless of how the inputs are
	; commuted. The destination registers are used as source registers for the third add.			; commuted. The destination registers are used as source registers for the third add.

	define float @reassociate_adds1(float %x0, float %x1, float %x2, float %x3) {			define float @reassociate_adds1(float %x0, float %x1, float %x2, float %x3) {
	; CHECK-STD-LABEL: reassociate_adds1:			; CHECK-STD-LABEL: reassociate_adds1:
	; CHECK-STD: // %bb.0:			; CHECK-STD: // %bb.0:
	▲ Show 20 Lines • Show All 168 Lines • ▼ Show 20 Lines
	}			}

	; Verify that scalar single-precision multiplies are reassociated.			; Verify that scalar single-precision multiplies are reassociated.

	define float @reassociate_muls1(float %x0, float %x1, float %x2, float %x3) {			define float @reassociate_muls1(float %x0, float %x1, float %x2, float %x3) {
	; CHECK-STD-LABEL: reassociate_muls1:			; CHECK-STD-LABEL: reassociate_muls1:
	; CHECK-STD: // %bb.0:			; CHECK-STD: // %bb.0:
	; CHECK-STD-NEXT: fdiv s0, s0, s1			; CHECK-STD-NEXT: fdiv s0, s0, s1
	; CHECK-STD-NEXT: fmul s1, s2, s0			; CHECK-STD-NEXT: fmul s0, s2, s0
	; CHECK-STD-NEXT: fmul s0, s3, s1			; CHECK-STD-NEXT: fmul s0, s3, s0
	; CHECK-STD-NEXT: ret			; CHECK-STD-NEXT: ret
	;			;
	; CHECK-UNSAFE-LABEL: reassociate_muls1:			; CHECK-UNSAFE-LABEL: reassociate_muls1:
	; CHECK-UNSAFE: // %bb.0:			; CHECK-UNSAFE: // %bb.0:
	; CHECK-UNSAFE-NEXT: fdiv s0, s0, s1			; CHECK-UNSAFE-NEXT: fdiv s0, s0, s1
	; CHECK-UNSAFE-NEXT: fmul s1, s2, s3			; CHECK-UNSAFE-NEXT: fmul s1, s2, s3
	; CHECK-UNSAFE-NEXT: fmul s0, s0, s1			; CHECK-UNSAFE-NEXT: fmul s0, s0, s1
	; CHECK-UNSAFE-NEXT: ret			; CHECK-UNSAFE-NEXT: ret
	Show All 26 Lines
	}			}

	; Verify that scalar double-precision multiplies are reassociated.			; Verify that scalar double-precision multiplies are reassociated.

	define double @reassociate_muls_double(double %x0, double %x1, double %x2, double %x3) {			define double @reassociate_muls_double(double %x0, double %x1, double %x2, double %x3) {
	; CHECK-STD-LABEL: reassociate_muls_double:			; CHECK-STD-LABEL: reassociate_muls_double:
	; CHECK-STD: // %bb.0:			; CHECK-STD: // %bb.0:
	; CHECK-STD-NEXT: fdiv d0, d0, d1			; CHECK-STD-NEXT: fdiv d0, d0, d1
	; CHECK-STD-NEXT: fmul d1, d2, d0			; CHECK-STD-NEXT: fmul d0, d2, d0
	; CHECK-STD-NEXT: fmul d0, d3, d1			; CHECK-STD-NEXT: fmul d0, d3, d0
	; CHECK-STD-NEXT: ret			; CHECK-STD-NEXT: ret
	;			;
	; CHECK-UNSAFE-LABEL: reassociate_muls_double:			; CHECK-UNSAFE-LABEL: reassociate_muls_double:
	; CHECK-UNSAFE: // %bb.0:			; CHECK-UNSAFE: // %bb.0:
	; CHECK-UNSAFE-NEXT: fdiv d0, d0, d1			; CHECK-UNSAFE-NEXT: fdiv d0, d0, d1
	; CHECK-UNSAFE-NEXT: fmul d1, d2, d3			; CHECK-UNSAFE-NEXT: fmul d1, d2, d3
	; CHECK-UNSAFE-NEXT: fmul d0, d0, d1			; CHECK-UNSAFE-NEXT: fmul d0, d0, d1
	; CHECK-UNSAFE-NEXT: ret			; CHECK-UNSAFE-NEXT: ret
	%t0 = fdiv double %x0, %x1			%t0 = fdiv double %x0, %x1
	%t1 = fmul double %x2, %t0			%t1 = fmul double %x2, %t0
	%t2 = fmul double %x3, %t1			%t2 = fmul double %x3, %t1
	ret double %t2			ret double %t2
	}			}
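
A note on why the CHECK-STD and CHECK-UNSAFE expectations above differ: floating-point multiplication is not associative in general, so the fmul chain keeps its serial order unless -enable-unsafe-fp-math is passed, whereas wrapping integer arithmetic can always be regrouped. The Python sketch below illustrates the difference with ordinary doubles. It assumes Python floats are IEEE 754 binary64, and the helper names (`add32`, `chain`, `regrouped`) are hypothetical, chosen only for this illustration.

```python
MASK32 = 0xFFFFFFFF  # assumption: model a 32-bit W register


def add32(a, b):
    """Wrapping 32-bit integer add."""
    return (a + b) & MASK32


def fmul(a, b):
    """Double-precision multiply."""
    return a * b


def chain(op, x0, x1, x2):
    """Serial grouping: (x0 op x1) op x2."""
    return op(op(x0, x1), x2)


def regrouped(op, x0, x1, x2):
    """Reassociated grouping: x0 op (x1 op x2)."""
    return op(x0, op(x1, x2))


ints = (0xFFFFFFFF, 1, 0x7FFFFFFF)
int_same = chain(add32, *ints) == regrouped(add32, *ints)

fps = (1e308, 10.0, 0.1)
fp_chain = chain(fmul, *fps)          # 1e308 * 10.0 overflows to inf first
fp_regrouped = regrouped(fmul, *fps)  # 10.0 * 0.1 rounds to 1.0, so the result stays finite

print(int_same, fp_chain, fp_regrouped)
```

The integer groupings agree even when an intermediate sum wraps, while the two floating-point groupings produce an infinity and a finite value respectively, which is why reassociating FP operations needs explicit permission.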

				; Verify that scalar integer adds are reassociated.

				define i32 @reassociate_adds_i32(i32 %x0, i32 %x1, i32 %x2, i32 %x3) {
				; CHECK-LABEL: reassociate_adds_i32:
				; CHECK: // %bb.0:
				; CHECK-NEXT: udiv w8, w0, w1
				; CHECK-NEXT: add w9, w2, w3
				; CHECK-NEXT: add w0, w8, w9
				; CHECK-NEXT: ret
				%t0 = udiv i32 %x0, %x1
				%t1 = add i32 %x2, %t0
				%t2 = add i32 %x3, %t1
				ret i32 %t2
				}

				define i64 @reassociate_adds_i64(i64 %x0, i64 %x1, i64 %x2, i64 %x3) {
				; CHECK-LABEL: reassociate_adds_i64:
				; CHECK: // %bb.0:
				; CHECK-NEXT: udiv x8, x0, x1
				; CHECK-NEXT: add x9, x2, x3
				; CHECK-NEXT: add x0, x8, x9
				; CHECK-NEXT: ret
				%t0 = udiv i64 %x0, %x1
				%t1 = add i64 %x2, %t0
				%t2 = add i64 %x3, %t1
				ret i64 %t2
				}

				; Verify that scalar bitwise operations are reassociated.

				define i32 @reassociate_ands_i32(i32 %x0, i32 %x1, i32 %x2, i32 %x3) {
				; CHECK-LABEL: reassociate_ands_i32:
				; CHECK: // %bb.0:
				; CHECK-NEXT: and w8, w0, w1
				; CHECK-NEXT: and w9, w2, w3
				; CHECK-NEXT: and w0, w8, w9
				; CHECK-NEXT: ret
				%t0 = and i32 %x0, %x1
				%t1 = and i32 %t0, %x2
				%t2 = and i32 %t1, %x3
				ret i32 %t2
				}

				define i64 @reassociate_ors_i64(i64 %x0, i64 %x1, i64 %x2, i64 %x3) {
				; CHECK-LABEL: reassociate_ors_i64:
				; CHECK: // %bb.0:
				; CHECK-NEXT: orr x8, x0, x1
				; CHECK-NEXT: orr x9, x2, x3
				; CHECK-NEXT: orr x0, x8, x9
				; CHECK-NEXT: ret
				%t0 = or i64 %x0, %x1
				%t1 = or i64 %t0, %x2
				%t2 = or i64 %t1, %x3
				ret i64 %t2
				}

				define i32 @reassociate_xors_i32(i32 %x0, i32 %x1, i32 %x2, i32 %x3) {
				; CHECK-LABEL: reassociate_xors_i32:
				; CHECK: // %bb.0:
				; CHECK-NEXT: eor w8, w0, w1
				; CHECK-NEXT: eor w9, w2, w3
				; CHECK-NEXT: eor w0, w8, w9
				; CHECK-NEXT: ret
				%t0 = xor i32 %x0, %x1
				%t1 = xor i32 %t0, %x2
				%t2 = xor i32 %t1, %x3
				ret i32 %t2
				}

	; Verify that we reassociate vector instructions too.			; Verify that we reassociate vector instructions too.

	define <4 x float> @vector_reassociate_adds1(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, <4 x float> %x3) {			define <4 x float> @vector_reassociate_adds1(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, <4 x float> %x3) {
	; CHECK-STD-LABEL: vector_reassociate_adds1:			; CHECK-STD-LABEL: vector_reassociate_adds1:
	; CHECK-STD: // %bb.0:			; CHECK-STD: // %bb.0:
	; CHECK-STD-NEXT: fadd v0.4s, v0.4s, v1.4s			; CHECK-STD-NEXT: fadd v0.4s, v0.4s, v1.4s
	; CHECK-STD-NEXT: fadd v0.4s, v0.4s, v2.4s			; CHECK-STD-NEXT: fadd v0.4s, v0.4s, v2.4s
	; CHECK-STD-NEXT: fadd v0.4s, v0.4s, v3.4s			; CHECK-STD-NEXT: fadd v0.4s, v0.4s, v3.4s
	▲ Show 20 Lines • Show All 65 Lines • ▼ Show 20 Lines
	; CHECK-UNSAFE-NEXT: fadd v1.4s, v2.4s, v3.4s			; CHECK-UNSAFE-NEXT: fadd v1.4s, v2.4s, v3.4s
	; CHECK-UNSAFE-NEXT: fadd v0.4s, v0.4s, v1.4s			; CHECK-UNSAFE-NEXT: fadd v0.4s, v0.4s, v1.4s
	; CHECK-UNSAFE-NEXT: ret			; CHECK-UNSAFE-NEXT: ret
	%t0 = fadd <4 x float> %x0, %x1			%t0 = fadd <4 x float> %x0, %x1
	%t1 = fadd <4 x float> %x2, %t0			%t1 = fadd <4 x float> %x2, %t0
	%t2 = fadd <4 x float> %x3, %t1			%t2 = fadd <4 x float> %x3, %t1
	ret <4 x float> %t2			ret <4 x float> %t2
	}			}

	; Verify that 128-bit vector single-precision multiplies are reassociated.			; Verify that 128-bit vector single-precision multiplies are reassociated.

	define <4 x float> @reassociate_muls_v4f32(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, <4 x float> %x3) {			define <4 x float> @reassociate_muls_v4f32(<4 x float> %x0, <4 x float> %x1, <4 x float> %x2, <4 x float> %x3) {
	; CHECK-STD-LABEL: reassociate_muls_v4f32:			; CHECK-STD-LABEL: reassociate_muls_v4f32:
	; CHECK-STD: // %bb.0:			; CHECK-STD: // %bb.0:
	; CHECK-STD-NEXT: fadd v0.4s, v0.4s, v1.4s			; CHECK-STD-NEXT: fadd v0.4s, v0.4s, v1.4s
	; CHECK-STD-NEXT: fmul v0.4s, v2.4s, v0.4s			; CHECK-STD-NEXT: fmul v0.4s, v2.4s, v0.4s
	; CHECK-STD-NEXT: fmul v0.4s, v3.4s, v0.4s			; CHECK-STD-NEXT: fmul v0.4s, v3.4s, v0.4s
	Show All 28 Lines
	; CHECK-UNSAFE-NEXT: fmul v0.2d, v0.2d, v1.2d			; CHECK-UNSAFE-NEXT: fmul v0.2d, v0.2d, v1.2d
	; CHECK-UNSAFE-NEXT: ret			; CHECK-UNSAFE-NEXT: ret
	%t0 = fadd <2 x double> %x0, %x1			%t0 = fadd <2 x double> %x0, %x1
	%t1 = fmul <2 x double> %x2, %t0			%t1 = fmul <2 x double> %x2, %t0
	%t2 = fmul <2 x double> %x3, %t1			%t2 = fmul <2 x double> %x3, %t1
	ret <2 x double> %t2			ret <2 x double> %t2
	}			}

				; Verify that vector integer arithmetic operations are reassociated.

				define <2 x i32> @reassociate_muls_v2i32(<2 x i32> %x0, <2 x i32> %x1, <2 x i32> %x2, <2 x i32> %x3) {
				; CHECK-LABEL: reassociate_muls_v2i32:
				; CHECK: // %bb.0:
				; CHECK-NEXT: mul v0.2s, v0.2s, v1.2s
				; CHECK-NEXT: mul v1.2s, v2.2s, v3.2s
				; CHECK-NEXT: mul v0.2s, v0.2s, v1.2s
				; CHECK-NEXT: ret
				%t0 = mul <2 x i32> %x0, %x1
				%t1 = mul <2 x i32> %x2, %t0
				%t2 = mul <2 x i32> %x3, %t1
				ret <2 x i32> %t2
				}

				define <2 x i64> @reassociate_adds_v2i64(<2 x i64> %x0, <2 x i64> %x1, <2 x i64> %x2, <2 x i64> %x3) {
				; CHECK-LABEL: reassociate_adds_v2i64:
				; CHECK: // %bb.0:
				; CHECK-NEXT: add v0.2d, v0.2d, v1.2d
				; CHECK-NEXT: add v1.2d, v2.2d, v3.2d
				; CHECK-NEXT: add v0.2d, v0.2d, v1.2d
				; CHECK-NEXT: ret
				%t0 = add <2 x i64> %x0, %x1
				%t1 = add <2 x i64> %x2, %t0
				%t2 = add <2 x i64> %x3, %t1
				ret <2 x i64> %t2
				}

				; Verify that vector bitwise operations are reassociated.

				define <16 x i8> @reassociate_ands_v16i8(<16 x i8> %x0, <16 x i8> %x1, <16 x i8> %x2, <16 x i8> %x3) {
				; CHECK-LABEL: reassociate_ands_v16i8:
				; CHECK: // %bb.0:
				; CHECK-NEXT: orr v0.16b, v0.16b, v1.16b
				; CHECK-NEXT: and v1.16b, v2.16b, v3.16b
				; CHECK-NEXT: and v0.16b, v0.16b, v1.16b
				; CHECK-NEXT: ret
				%t0 = or <16 x i8> %x0, %x1
				%t1 = and <16 x i8> %t0, %x2
				%t2 = and <16 x i8> %t1, %x3
				ret <16 x i8> %t2
				}

				define <4 x i16> @reassociate_ors_v4i16(<4 x i16> %x0, <4 x i16> %x1, <4 x i16> %x2, <4 x i16> %x3) {
				; CHECK-LABEL: reassociate_ors_v4i16:
				; CHECK: // %bb.0:
				; CHECK-NEXT: eor v0.8b, v0.8b, v1.8b
				; CHECK-NEXT: orr v1.8b, v2.8b, v3.8b
				; CHECK-NEXT: orr v0.8b, v0.8b, v1.8b
				; CHECK-NEXT: ret
				%t0 = xor <4 x i16> %x0, %x1
				%t1 = or <4 x i16> %t0, %x2
				%t2 = or <4 x i16> %t1, %x3
				ret <4 x i16> %t2
				}

				define <4 x i32> @reassociate_xors_v4i32(<4 x i32> %x0, <4 x i32> %x1, <4 x i32> %x2, <4 x i32> %x3) {
				; CHECK-LABEL: reassociate_xors_v4i32:
				; CHECK: // %bb.0:
				; CHECK-NEXT: and v0.16b, v0.16b, v1.16b
				; CHECK-NEXT: eor v1.16b, v2.16b, v3.16b
				; CHECK-NEXT: eor v0.16b, v0.16b, v1.16b
				; CHECK-NEXT: ret
				%t0 = and <4 x i32> %x0, %x1
				%t1 = xor <4 x i32> %t0, %x2
				%t2 = xor <4 x i32> %t1, %x3
				ret <4 x i32> %t2
				}

				; Verify that scalable vector FP arithmetic operations are reassociated.

				define <vscale x 4 x float> @reassociate_adds_nxv4f32(<vscale x 4 x float> %x0, <vscale x 4 x float> %x1, <vscale x 4 x float> %x2, <vscale x 4 x float> %x3) {
				; CHECK-STD-LABEL: reassociate_adds_nxv4f32:
				; CHECK-STD: // %bb.0:
				; CHECK-STD-NEXT: fadd z0.s, z0.s, z1.s
				; CHECK-STD-NEXT: fadd z0.s, z2.s, z0.s
				; CHECK-STD-NEXT: fadd z0.s, z3.s, z0.s
				; CHECK-STD-NEXT: ret
				;
				; CHECK-UNSAFE-LABEL: reassociate_adds_nxv4f32:
				; CHECK-UNSAFE: // %bb.0:
				; CHECK-UNSAFE-NEXT: fadd z0.s, z0.s, z1.s
				; CHECK-UNSAFE-NEXT: fadd z1.s, z2.s, z3.s
				; CHECK-UNSAFE-NEXT: fadd z0.s, z0.s, z1.s
				; CHECK-UNSAFE-NEXT: ret
				%t0 = fadd reassoc <vscale x 4 x float> %x0, %x1
				%t1 = fadd reassoc <vscale x 4 x float> %x2, %t0
				%t2 = fadd reassoc <vscale x 4 x float> %x3, %t1
				ret <vscale x 4 x float> %t2
				}

				define <vscale x 2 x double> @reassociate_muls_nxv2f64(<vscale x 2 x double> %x0, <vscale x 2 x double> %x1, <vscale x 2 x double> %x2, <vscale x 2 x double> %x3) {
				; CHECK-STD-LABEL: reassociate_muls_nxv2f64:
				; CHECK-STD: // %bb.0:
				; CHECK-STD-NEXT: fmul z0.d, z0.d, z1.d
				; CHECK-STD-NEXT: fmul z0.d, z2.d, z0.d
				; CHECK-STD-NEXT: fmul z0.d, z3.d, z0.d
				; CHECK-STD-NEXT: ret
				;
				; CHECK-UNSAFE-LABEL: reassociate_muls_nxv2f64:
				; CHECK-UNSAFE: // %bb.0:
				; CHECK-UNSAFE-NEXT: fmul z0.d, z0.d, z1.d
				; CHECK-UNSAFE-NEXT: fmul z1.d, z2.d, z3.d
				; CHECK-UNSAFE-NEXT: fmul z0.d, z0.d, z1.d
				; CHECK-UNSAFE-NEXT: ret
				%t0 = fmul reassoc <vscale x 2 x double> %x0, %x1
				%t1 = fmul reassoc <vscale x 2 x double> %x2, %t0
				%t2 = fmul reassoc <vscale x 2 x double> %x3, %t1
				ret <vscale x 2 x double> %t2
				}

				; Verify that scalable vector integer arithmetic operations are reassociated.

				define <vscale x 4 x i32> @reassociate_muls_nxv4i32(<vscale x 4 x i32> %x0, <vscale x 4 x i32> %x1, <vscale x 4 x i32> %x2, <vscale x 4 x i32> %x3) {
				; CHECK-LABEL: reassociate_muls_nxv4i32:
				; CHECK: // %bb.0:
				; CHECK-NEXT: mul z0.s, z0.s, z1.s
				; CHECK-NEXT: mul z1.s, z2.s, z3.s
				; CHECK-NEXT: mul z0.s, z0.s, z1.s
				; CHECK-NEXT: ret
				%t0 = mul <vscale x 4 x i32> %x0, %x1
				%t1 = mul <vscale x 4 x i32> %x2, %t0
				%t2 = mul <vscale x 4 x i32> %x3, %t1
				ret <vscale x 4 x i32> %t2
				}

				define <vscale x 2 x i64> @reassociate_adds_nxv2i64(<vscale x 2 x i64> %x0, <vscale x 2 x i64> %x1, <vscale x 2 x i64> %x2, <vscale x 2 x i64> %x3) {
				; CHECK-LABEL: reassociate_adds_nxv2i64:
				; CHECK: // %bb.0:
				; CHECK-NEXT: add z0.d, z0.d, z1.d
				; CHECK-NEXT: add z1.d, z2.d, z3.d
				; CHECK-NEXT: add z0.d, z0.d, z1.d
				; CHECK-NEXT: ret
				%t0 = add <vscale x 2 x i64> %x0, %x1
				%t1 = add <vscale x 2 x i64> %x2, %t0
				%t2 = add <vscale x 2 x i64> %x3, %t1
				ret <vscale x 2 x i64> %t2
				}

				; Verify that scalable vector bitwise operations are reassociated.

				define <vscale x 16 x i8> @reassociate_ands_nxv16i8(<vscale x 16 x i8> %x0, <vscale x 16 x i8> %x1, <vscale x 16 x i8> %x2, <vscale x 16 x i8> %x3) {
				; CHECK-LABEL: reassociate_ands_nxv16i8:
				; CHECK: // %bb.0:
				; CHECK-NEXT: orr z0.d, z0.d, z1.d
				; CHECK-NEXT: and z1.d, z2.d, z3.d
				; CHECK-NEXT: and z0.d, z0.d, z1.d
				; CHECK-NEXT: ret
				%t0 = or <vscale x 16 x i8> %x0, %x1
				%t1 = and <vscale x 16 x i8> %t0, %x2
				%t2 = and <vscale x 16 x i8> %t1, %x3
				ret <vscale x 16 x i8> %t2
				}

				define <vscale x 8 x i16> @reassociate_ors_nxv8i16(<vscale x 8 x i16> %x0, <vscale x 8 x i16> %x1, <vscale x 8 x i16> %x2, <vscale x 8 x i16> %x3) {
				; CHECK-LABEL: reassociate_ors_nxv8i16:
				; CHECK: // %bb.0:
				; CHECK-NEXT: eor z0.d, z0.d, z1.d
				; CHECK-NEXT: orr z1.d, z2.d, z3.d
				; CHECK-NEXT: orr z0.d, z0.d, z1.d
				; CHECK-NEXT: ret
				%t0 = xor <vscale x 8 x i16> %x0, %x1
				%t1 = or <vscale x 8 x i16> %t0, %x2
				%t2 = or <vscale x 8 x i16> %t1, %x3
				ret <vscale x 8 x i16> %t2
				}

				define <vscale x 4 x i32> @reassociate_xors_nxv4i32(<vscale x 4 x i32> %x0, <vscale x 4 x i32> %x1, <vscale x 4 x i32> %x2, <vscale x 4 x i32> %x3) {
				; CHECK-LABEL: reassociate_xors_nxv4i32:
				; CHECK: // %bb.0:
				; CHECK-NEXT: and z0.d, z0.d, z1.d
				; CHECK-NEXT: eor z1.d, z2.d, z3.d
				; CHECK-NEXT: eor z0.d, z0.d, z1.d
				; CHECK-NEXT: ret
				%t0 = and <vscale x 4 x i32> %x0, %x1
				%t1 = xor <vscale x 4 x i32> %t0, %x2
				%t2 = xor <vscale x 4 x i32> %t1, %x3
				ret <vscale x 4 x i32> %t2
				}

	; PR25016: https://llvm.org/bugs/show_bug.cgi?id=25016			; PR25016: https://llvm.org/bugs/show_bug.cgi?id=25016
	; Verify that reassociation is not happening needlessly or wrongly.			; Verify that reassociation is not happening needlessly or wrongly.

	declare double @bar()			declare double @bar()

	define double @reassociate_adds_from_calls() {			define double @reassociate_adds_from_calls() {
	; CHECK-STD-LABEL: reassociate_adds_from_calls:			; CHECK-STD-LABEL: reassociate_adds_from_calls:
	; CHECK-STD: // %bb.0:			; CHECK-STD: // %bb.0:
	; ... (33 lines not shown)
	; CHECK-UNSAFE-NEXT: bl bar			; CHECK-UNSAFE-NEXT: bl bar
	; CHECK-UNSAFE-NEXT: fmov d8, d0			; CHECK-UNSAFE-NEXT: fmov d8, d0
	; CHECK-UNSAFE-NEXT: bl bar			; CHECK-UNSAFE-NEXT: bl bar
	; CHECK-UNSAFE-NEXT: fmov d9, d0			; CHECK-UNSAFE-NEXT: fmov d9, d0
	; CHECK-UNSAFE-NEXT: bl bar			; CHECK-UNSAFE-NEXT: bl bar
	; CHECK-UNSAFE-NEXT: fmov d10, d0			; CHECK-UNSAFE-NEXT: fmov d10, d0
	; CHECK-UNSAFE-NEXT: bl bar			; CHECK-UNSAFE-NEXT: bl bar
	; CHECK-UNSAFE-NEXT: fadd d1, d8, d9			; CHECK-UNSAFE-NEXT: fadd d1, d8, d9
	; CHECK-UNSAFE-NEXT: fadd d0, d10, d0
	; CHECK-UNSAFE-NEXT: ldr x30, [sp, #24] // 8-byte Folded Reload
	; CHECK-UNSAFE-NEXT: ldp d9, d8, [sp, #8] // 16-byte Folded Reload			; CHECK-UNSAFE-NEXT: ldp d9, d8, [sp, #8] // 16-byte Folded Reload
				; CHECK-UNSAFE-NEXT: ldr x30, [sp, #24] // 8-byte Folded Reload
				; CHECK-UNSAFE-NEXT: fadd d0, d10, d0
	; CHECK-UNSAFE-NEXT: fadd d0, d1, d0			; CHECK-UNSAFE-NEXT: fadd d0, d1, d0
	; CHECK-UNSAFE-NEXT: ldr d10, [sp], #32 // 8-byte Folded Reload			; CHECK-UNSAFE-NEXT: ldr d10, [sp], #32 // 8-byte Folded Reload
	; CHECK-UNSAFE-NEXT: ret			; CHECK-UNSAFE-NEXT: ret
	%x0 = call double @bar()			%x0 = call double @bar()
	%x1 = call double @bar()			%x1 = call double @bar()
	%x2 = call double @bar()			%x2 = call double @bar()
	%x3 = call double @bar()			%x3 = call double @bar()
	%t0 = fadd double %x0, %x1			%t0 = fadd double %x0, %x1
	; ... (16 lines not shown)
	; CHECK-NEXT: bl bar			; CHECK-NEXT: bl bar
	; CHECK-NEXT: fmov d8, d0			; CHECK-NEXT: fmov d8, d0
	; CHECK-NEXT: bl bar			; CHECK-NEXT: bl bar
	; CHECK-NEXT: fmov d9, d0			; CHECK-NEXT: fmov d9, d0
	; CHECK-NEXT: bl bar			; CHECK-NEXT: bl bar
	; CHECK-NEXT: fmov d10, d0			; CHECK-NEXT: fmov d10, d0
	; CHECK-NEXT: bl bar			; CHECK-NEXT: bl bar
	; CHECK-NEXT: fadd d1, d8, d9			; CHECK-NEXT: fadd d1, d8, d9
	; CHECK-NEXT: fadd d0, d10, d0
	; CHECK-NEXT: ldr x30, [sp, #24] // 8-byte Folded Reload
	; CHECK-NEXT: ldp d9, d8, [sp, #8] // 16-byte Folded Reload			; CHECK-NEXT: ldp d9, d8, [sp, #8] // 16-byte Folded Reload
				; CHECK-NEXT: ldr x30, [sp, #24] // 8-byte Folded Reload
				; CHECK-NEXT: fadd d0, d10, d0
	; CHECK-NEXT: fadd d0, d1, d0			; CHECK-NEXT: fadd d0, d1, d0
	; CHECK-NEXT: ldr d10, [sp], #32 // 8-byte Folded Reload			; CHECK-NEXT: ldr d10, [sp], #32 // 8-byte Folded Reload
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	%x0 = call double @bar()			%x0 = call double @bar()
	%x1 = call double @bar()			%x1 = call double @bar()
	%x2 = call double @bar()			%x2 = call double @bar()
	%x3 = call double @bar()			%x3 = call double @bar()
	%t0 = fadd double %x0, %x1			%t0 = fadd double %x0, %x1
	%t1 = fadd double %x2, %x3			%t1 = fadd double %x2, %x3
	%t2 = fadd double %t0, %t1			%t2 = fadd double %t0, %t1
	ret double %t2			ret double %t2
	}			}
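
The reassociation these machine-combiner.ll tests check can be sketched outside the compiler. The Python below is only an illustration of the arithmetic idea, not the MachineCombiner implementation: the helper names `chain_depth` and `balanced` are hypothetical, and the real pass works on machine instructions with a cost model and may not produce a fully balanced tree. It shows that for an associative and commutative operation, regrouping a serial chain into pairs gives the same value with a shorter dependency chain.

```python
from functools import reduce
import operator


def chain_depth(n):
    """Dependency depth of a serial chain ((x0 op x1) op x2) ... over n operands."""
    return n - 1


def balanced(op, xs):
    """Combine xs pairwise; return (value, depth of the resulting dependency tree)."""
    if len(xs) == 1:
        return xs[0], 0
    mid = len(xs) // 2
    left_value, left_depth = balanced(op, xs[:mid])
    right_value, right_depth = balanced(op, xs[mid:])
    return op(left_value, right_value), max(left_depth, right_depth) + 1


# Four operands, as in the reassociate_* tests above.
xs = [0b1111, 0b1011, 0b1110, 0b0111]
serial = reduce(operator.and_, xs)          # ((x0 & x1) & x2) & x3
value, depth = balanced(operator.and_, xs)  # (x0 & x1) & (x2 & x3)

print(serial, value, chain_depth(len(xs)), depth)
```

With four operands the serial chain has three dependent operations while the regrouped form has two levels, which matches the shape of the expected output in the tests: two independent operations followed by one that combines them.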

llvm/test/CodeGen/AArch64/reduce-and.ll

; ... (258 lines not shown)
	; GISEL-NEXT: ret
%and_result = call i8 @llvm.vector.reduce.and.v3i8(<3 x i8> %a)		%and_result = call i8 @llvm.vector.reduce.and.v3i8(<3 x i8> %a)
ret i8 %and_result		ret i8 %and_result
}		}

define i8 @test_redand_v4i8(<4 x i8> %a) {		define i8 @test_redand_v4i8(<4 x i8> %a) {
; CHECK-LABEL: test_redand_v4i8:		; CHECK-LABEL: test_redand_v4i8:
; CHECK: // %bb.0:		; CHECK: // %bb.0:
; CHECK-NEXT: // kill: def $d0 killed $d0 def $q0		; CHECK-NEXT: // kill: def $d0 killed $d0 def $q0
; CHECK-NEXT: umov w8, v0.h[1]		; CHECK-NEXT: umov w8, v0.h[3]
; CHECK-NEXT: umov w9, v0.h[0]		; CHECK-NEXT: umov w9, v0.h[2]
; CHECK-NEXT: umov w10, v0.h[2]		; CHECK-NEXT: umov w10, v0.h[1]
; CHECK-NEXT: umov w11, v0.h[3]		; CHECK-NEXT: umov w11, v0.h[0]
; CHECK-NEXT: and w8, w9, w8		; CHECK-NEXT: and w8, w9, w8
; CHECK-NEXT: and w8, w8, w10		; CHECK-NEXT: and w10, w11, w10
; CHECK-NEXT: and w0, w8, w11		; CHECK-NEXT: and w0, w10, w8
; CHECK-NEXT: ret		; CHECK-NEXT: ret
;		;
; GISEL-LABEL: test_redand_v4i8:		; GISEL-LABEL: test_redand_v4i8:
; GISEL: // %bb.0:		; GISEL: // %bb.0:
; GISEL-NEXT: // kill: def $d0 killed $d0 def $q0		; GISEL-NEXT: // kill: def $d0 killed $d0 def $q0
; GISEL-NEXT: mov h1, v0.h[1]		; GISEL-NEXT: mov h1, v0.h[1]
; GISEL-NEXT: mov h2, v0.h[2]		; GISEL-NEXT: mov h2, v0.h[2]
; GISEL-NEXT: mov h3, v0.h[3]		; GISEL-NEXT: mov h3, v0.h[3]
; GISEL-NEXT: fmov w8, s0		; GISEL-NEXT: fmov w8, s0
; GISEL-NEXT: fmov w9, s1		; GISEL-NEXT: fmov w9, s1
; GISEL-NEXT: fmov w10, s2		; GISEL-NEXT: fmov w10, s2
; GISEL-NEXT: fmov w11, s3		; GISEL-NEXT: fmov w11, s3
; GISEL-NEXT: and w8, w8, w9		; GISEL-NEXT: and w8, w8, w9
; GISEL-NEXT: and w9, w10, w11		; GISEL-NEXT: and w9, w10, w11
; GISEL-NEXT: and w0, w8, w9		; GISEL-NEXT: and w0, w8, w9
; GISEL-NEXT: ret		; GISEL-NEXT: ret
%and_result = call i8 @llvm.vector.reduce.and.v4i8(<4 x i8> %a)		%and_result = call i8 @llvm.vector.reduce.and.v4i8(<4 x i8> %a)
ret i8 %and_result		ret i8 %and_result
}		}

define i8 @test_redand_v8i8(<8 x i8> %a) {		define i8 @test_redand_v8i8(<8 x i8> %a) {
; CHECK-LABEL: test_redand_v8i8:		; CHECK-LABEL: test_redand_v8i8:
; CHECK: // %bb.0:		; CHECK: // %bb.0:
; CHECK-NEXT: // kill: def $d0 killed $d0 def $q0		; CHECK-NEXT: // kill: def $d0 killed $d0 def $q0
; CHECK-NEXT: umov w8, v0.b[1]		; CHECK-NEXT: umov w8, v0.b[5]
; CHECK-NEXT: umov w9, v0.b[0]		; CHECK-NEXT: umov w9, v0.b[4]
; CHECK-NEXT: umov w10, v0.b[2]		; CHECK-NEXT: umov w10, v0.b[1]
; CHECK-NEXT: umov w11, v0.b[3]		; CHECK-NEXT: umov w11, v0.b[0]
; CHECK-NEXT: umov w12, v0.b[4]		; CHECK-NEXT: umov w12, v0.b[3]
; CHECK-NEXT: umov w13, v0.b[5]		; CHECK-NEXT: umov w13, v0.b[2]
		; CHECK-NEXT: umov w14, v0.b[6]
		; CHECK-NEXT: umov w15, v0.b[7]
; CHECK-NEXT: and w8, w9, w8		; CHECK-NEXT: and w8, w9, w8
; CHECK-NEXT: umov w9, v0.b[6]		; CHECK-NEXT: and w10, w11, w10
; CHECK-NEXT: and w8, w8, w10		; CHECK-NEXT: and w11, w13, w12
; CHECK-NEXT: umov w10, v0.b[7]		; CHECK-NEXT: and w9, w10, w11
; CHECK-NEXT: and w8, w8, w11		; CHECK-NEXT: and w8, w8, w14
; CHECK-NEXT: and w8, w8, w12		; CHECK-NEXT: and w8, w9, w8
; CHECK-NEXT: and w8, w8, w13		; CHECK-NEXT: and w0, w8, w15
; CHECK-NEXT: and w8, w8, w9
; CHECK-NEXT: and w0, w8, w10
; CHECK-NEXT: ret		; CHECK-NEXT: ret
;		;
; GISEL-LABEL: test_redand_v8i8:		; GISEL-LABEL: test_redand_v8i8:
; GISEL: // %bb.0:		; GISEL: // %bb.0:
; GISEL-NEXT: // kill: def $d0 killed $d0 def $q0		; GISEL-NEXT: // kill: def $d0 killed $d0 def $q0
; GISEL-NEXT: mov b1, v0.b[1]		; GISEL-NEXT: mov b1, v0.b[1]
; GISEL-NEXT: mov b2, v0.b[2]		; GISEL-NEXT: mov b2, v0.b[2]
; GISEL-NEXT: mov b3, v0.b[3]		; GISEL-NEXT: mov b3, v0.b[3]
; ... (26 lines not shown)
; CHECK: // %bb.0:		; CHECK: // %bb.0:
; CHECK-NEXT: ext v1.16b, v0.16b, v0.16b, #8		; CHECK-NEXT: ext v1.16b, v0.16b, v0.16b, #8
; CHECK-NEXT: and v0.8b, v0.8b, v1.8b		; CHECK-NEXT: and v0.8b, v0.8b, v1.8b
; CHECK-NEXT: umov w8, v0.b[1]		; CHECK-NEXT: umov w8, v0.b[1]
; CHECK-NEXT: umov w9, v0.b[0]		; CHECK-NEXT: umov w9, v0.b[0]
; CHECK-NEXT: umov w10, v0.b[2]		; CHECK-NEXT: umov w10, v0.b[2]
; CHECK-NEXT: umov w11, v0.b[3]		; CHECK-NEXT: umov w11, v0.b[3]
; CHECK-NEXT: umov w12, v0.b[4]		; CHECK-NEXT: umov w12, v0.b[4]
		; CHECK-NEXT: umov w13, v0.b[5]
		; CHECK-NEXT: umov w14, v0.b[6]
; CHECK-NEXT: and w8, w9, w8		; CHECK-NEXT: and w8, w9, w8
; CHECK-NEXT: umov w9, v0.b[5]		; CHECK-NEXT: umov w9, v0.b[7]
		; CHECK-NEXT: and w10, w10, w11
		; CHECK-NEXT: and w11, w12, w13
; CHECK-NEXT: and w8, w8, w10		; CHECK-NEXT: and w8, w8, w10
; CHECK-NEXT: umov w10, v0.b[6]		; CHECK-NEXT: and w10, w11, w14
; CHECK-NEXT: and w8, w8, w11
; CHECK-NEXT: umov w11, v0.b[7]
; CHECK-NEXT: and w8, w8, w12
; CHECK-NEXT: and w8, w8, w9
; CHECK-NEXT: and w8, w8, w10		; CHECK-NEXT: and w8, w8, w10
; CHECK-NEXT: and w0, w8, w11		; CHECK-NEXT: and w0, w8, w9
; CHECK-NEXT: ret		; CHECK-NEXT: ret
;		;
; GISEL-LABEL: test_redand_v16i8:		; GISEL-LABEL: test_redand_v16i8:
; GISEL: // %bb.0:		; GISEL: // %bb.0:
; GISEL-NEXT: mov d1, v0.d[1]		; GISEL-NEXT: mov d1, v0.d[1]
; GISEL-NEXT: and v0.8b, v0.8b, v1.8b		; GISEL-NEXT: and v0.8b, v0.8b, v1.8b
; GISEL-NEXT: mov b1, v0.b[1]		; GISEL-NEXT: mov b1, v0.b[1]
; GISEL-NEXT: mov b2, v0.b[2]		; GISEL-NEXT: mov b2, v0.b[2]
; ... (28 lines not shown)
; CHECK-NEXT: and v0.16b, v0.16b, v1.16b		; CHECK-NEXT: and v0.16b, v0.16b, v1.16b
; CHECK-NEXT: ext v1.16b, v0.16b, v0.16b, #8		; CHECK-NEXT: ext v1.16b, v0.16b, v0.16b, #8
; CHECK-NEXT: and v0.8b, v0.8b, v1.8b		; CHECK-NEXT: and v0.8b, v0.8b, v1.8b
; CHECK-NEXT: umov w8, v0.b[1]		; CHECK-NEXT: umov w8, v0.b[1]
; CHECK-NEXT: umov w9, v0.b[0]		; CHECK-NEXT: umov w9, v0.b[0]
; CHECK-NEXT: umov w10, v0.b[2]		; CHECK-NEXT: umov w10, v0.b[2]
; CHECK-NEXT: umov w11, v0.b[3]		; CHECK-NEXT: umov w11, v0.b[3]
; CHECK-NEXT: umov w12, v0.b[4]		; CHECK-NEXT: umov w12, v0.b[4]
		; CHECK-NEXT: umov w13, v0.b[5]
		; CHECK-NEXT: umov w14, v0.b[6]
; CHECK-NEXT: and w8, w9, w8		; CHECK-NEXT: and w8, w9, w8
; CHECK-NEXT: umov w9, v0.b[5]		; CHECK-NEXT: umov w9, v0.b[7]
		; CHECK-NEXT: and w10, w10, w11
		; CHECK-NEXT: and w11, w12, w13
; CHECK-NEXT: and w8, w8, w10		; CHECK-NEXT: and w8, w8, w10
; CHECK-NEXT: umov w10, v0.b[6]		; CHECK-NEXT: and w10, w11, w14
; CHECK-NEXT: and w8, w8, w11
; CHECK-NEXT: umov w11, v0.b[7]
; CHECK-NEXT: and w8, w8, w12
; CHECK-NEXT: and w8, w8, w9
; CHECK-NEXT: and w8, w8, w10		; CHECK-NEXT: and w8, w8, w10
; CHECK-NEXT: and w0, w8, w11		; CHECK-NEXT: and w0, w8, w9
; CHECK-NEXT: ret		; CHECK-NEXT: ret
;		;
; GISEL-LABEL: test_redand_v32i8:		; GISEL-LABEL: test_redand_v32i8:
; GISEL: // %bb.0:		; GISEL: // %bb.0:
; GISEL-NEXT: and v0.16b, v0.16b, v1.16b		; GISEL-NEXT: and v0.16b, v0.16b, v1.16b
; GISEL-NEXT: mov d1, v0.d[1]		; GISEL-NEXT: mov d1, v0.d[1]
; GISEL-NEXT: and v0.8b, v0.8b, v1.8b		; GISEL-NEXT: and v0.8b, v0.8b, v1.8b
; GISEL-NEXT: mov b1, v0.b[1]		; GISEL-NEXT: mov b1, v0.b[1]
; ... (22 lines not shown)
	; GISEL-NEXT: ret
%and_result = call i8 @llvm.vector.reduce.and.v32i8(<32 x i8> %a)		%and_result = call i8 @llvm.vector.reduce.and.v32i8(<32 x i8> %a)
ret i8 %and_result		ret i8 %and_result
}		}

define i16 @test_redand_v4i16(<4 x i16> %a) {		define i16 @test_redand_v4i16(<4 x i16> %a) {
; CHECK-LABEL: test_redand_v4i16:		; CHECK-LABEL: test_redand_v4i16:
; CHECK: // %bb.0:		; CHECK: // %bb.0:
; CHECK-NEXT: // kill: def $d0 killed $d0 def $q0		; CHECK-NEXT: // kill: def $d0 killed $d0 def $q0
; CHECK-NEXT: umov w8, v0.h[1]		; CHECK-NEXT: umov w8, v0.h[3]
; CHECK-NEXT: umov w9, v0.h[0]		; CHECK-NEXT: umov w9, v0.h[2]
; CHECK-NEXT: umov w10, v0.h[2]		; CHECK-NEXT: umov w10, v0.h[1]
; CHECK-NEXT: umov w11, v0.h[3]		; CHECK-NEXT: umov w11, v0.h[0]
; CHECK-NEXT: and w8, w9, w8		; CHECK-NEXT: and w8, w9, w8
; CHECK-NEXT: and w8, w8, w10		; CHECK-NEXT: and w10, w11, w10
; CHECK-NEXT: and w0, w8, w11		; CHECK-NEXT: and w0, w10, w8
; CHECK-NEXT: ret		; CHECK-NEXT: ret
;		;
; GISEL-LABEL: test_redand_v4i16:		; GISEL-LABEL: test_redand_v4i16:
; GISEL: // %bb.0:		; GISEL: // %bb.0:
; GISEL-NEXT: // kill: def $d0 killed $d0 def $q0		; GISEL-NEXT: // kill: def $d0 killed $d0 def $q0
; GISEL-NEXT: mov h1, v0.h[1]		; GISEL-NEXT: mov h1, v0.h[1]
; GISEL-NEXT: mov h2, v0.h[2]		; GISEL-NEXT: mov h2, v0.h[2]
; GISEL-NEXT: mov h3, v0.h[3]		; GISEL-NEXT: mov h3, v0.h[3]
; ... (14 lines not shown)
; CHECK: // %bb.0:		; CHECK: // %bb.0:
; CHECK-NEXT: ext v1.16b, v0.16b, v0.16b, #8		; CHECK-NEXT: ext v1.16b, v0.16b, v0.16b, #8
; CHECK-NEXT: and v0.8b, v0.8b, v1.8b		; CHECK-NEXT: and v0.8b, v0.8b, v1.8b
; CHECK-NEXT: umov w8, v0.h[1]		; CHECK-NEXT: umov w8, v0.h[1]
; CHECK-NEXT: umov w9, v0.h[0]		; CHECK-NEXT: umov w9, v0.h[0]
; CHECK-NEXT: umov w10, v0.h[2]		; CHECK-NEXT: umov w10, v0.h[2]
; CHECK-NEXT: umov w11, v0.h[3]		; CHECK-NEXT: umov w11, v0.h[3]
; CHECK-NEXT: and w8, w9, w8		; CHECK-NEXT: and w8, w9, w8
; CHECK-NEXT: and w8, w8, w10		; CHECK-NEXT: and w9, w10, w11
; CHECK-NEXT: and w0, w8, w11		; CHECK-NEXT: and w0, w8, w9
; CHECK-NEXT: ret		; CHECK-NEXT: ret
;		;
; GISEL-LABEL: test_redand_v8i16:		; GISEL-LABEL: test_redand_v8i16:
; GISEL: // %bb.0:		; GISEL: // %bb.0:
; GISEL-NEXT: mov d1, v0.d[1]		; GISEL-NEXT: mov d1, v0.d[1]
; GISEL-NEXT: and v0.8b, v0.8b, v1.8b		; GISEL-NEXT: and v0.8b, v0.8b, v1.8b
; GISEL-NEXT: mov h1, v0.h[1]		; GISEL-NEXT: mov h1, v0.h[1]
; GISEL-NEXT: mov h2, v0.h[2]		; GISEL-NEXT: mov h2, v0.h[2]
; ... (16 lines not shown)
; CHECK-NEXT: and v0.16b, v0.16b, v1.16b		; CHECK-NEXT: and v0.16b, v0.16b, v1.16b
; CHECK-NEXT: ext v1.16b, v0.16b, v0.16b, #8		; CHECK-NEXT: ext v1.16b, v0.16b, v0.16b, #8
; CHECK-NEXT: and v0.8b, v0.8b, v1.8b		; CHECK-NEXT: and v0.8b, v0.8b, v1.8b
; CHECK-NEXT: umov w8, v0.h[1]		; CHECK-NEXT: umov w8, v0.h[1]
; CHECK-NEXT: umov w9, v0.h[0]		; CHECK-NEXT: umov w9, v0.h[0]
; CHECK-NEXT: umov w10, v0.h[2]		; CHECK-NEXT: umov w10, v0.h[2]
; CHECK-NEXT: umov w11, v0.h[3]		; CHECK-NEXT: umov w11, v0.h[3]
; CHECK-NEXT: and w8, w9, w8		; CHECK-NEXT: and w8, w9, w8
; CHECK-NEXT: and w8, w8, w10		; CHECK-NEXT: and w9, w10, w11
; CHECK-NEXT: and w0, w8, w11		; CHECK-NEXT: and w0, w8, w9
; CHECK-NEXT: ret		; CHECK-NEXT: ret
;		;
; GISEL-LABEL: test_redand_v16i16:		; GISEL-LABEL: test_redand_v16i16:
; GISEL: // %bb.0:		; GISEL: // %bb.0:
; GISEL-NEXT: and v0.16b, v0.16b, v1.16b		; GISEL-NEXT: and v0.16b, v0.16b, v1.16b
; GISEL-NEXT: mov d1, v0.d[1]		; GISEL-NEXT: mov d1, v0.d[1]
; GISEL-NEXT: and v0.8b, v0.8b, v1.8b		; GISEL-NEXT: and v0.8b, v0.8b, v1.8b
; GISEL-NEXT: mov h1, v0.h[1]		; GISEL-NEXT: mov h1, v0.h[1]
; ... (142 lines not shown)

llvm/test/CodeGen/AArch64/reduce-or.ll

; ... (257 lines not shown)
	; GISEL-NEXT: ret
%or_result = call i8 @llvm.vector.reduce.or.v3i8(<3 x i8> %a)		%or_result = call i8 @llvm.vector.reduce.or.v3i8(<3 x i8> %a)
ret i8 %or_result		ret i8 %or_result
}		}

define i8 @test_redor_v4i8(<4 x i8> %a) {		define i8 @test_redor_v4i8(<4 x i8> %a) {
; CHECK-LABEL: test_redor_v4i8:		; CHECK-LABEL: test_redor_v4i8:
; CHECK: // %bb.0:		; CHECK: // %bb.0:
; CHECK-NEXT: // kill: def $d0 killed $d0 def $q0		; CHECK-NEXT: // kill: def $d0 killed $d0 def $q0
; CHECK-NEXT: umov w8, v0.h[1]		; CHECK-NEXT: umov w8, v0.h[3]
; CHECK-NEXT: umov w9, v0.h[0]		; CHECK-NEXT: umov w9, v0.h[2]
; CHECK-NEXT: umov w10, v0.h[2]		; CHECK-NEXT: umov w10, v0.h[1]
; CHECK-NEXT: umov w11, v0.h[3]		; CHECK-NEXT: umov w11, v0.h[0]
; CHECK-NEXT: orr w8, w9, w8		; CHECK-NEXT: orr w8, w9, w8
; CHECK-NEXT: orr w8, w8, w10		; CHECK-NEXT: orr w10, w11, w10
; CHECK-NEXT: orr w0, w8, w11		; CHECK-NEXT: orr w0, w10, w8
; CHECK-NEXT: ret		; CHECK-NEXT: ret
;		;
; GISEL-LABEL: test_redor_v4i8:		; GISEL-LABEL: test_redor_v4i8:
; GISEL: // %bb.0:		; GISEL: // %bb.0:
; GISEL-NEXT: // kill: def $d0 killed $d0 def $q0		; GISEL-NEXT: // kill: def $d0 killed $d0 def $q0
; GISEL-NEXT: mov h1, v0.h[1]		; GISEL-NEXT: mov h1, v0.h[1]
; GISEL-NEXT: mov h2, v0.h[2]		; GISEL-NEXT: mov h2, v0.h[2]
; GISEL-NEXT: mov h3, v0.h[3]		; GISEL-NEXT: mov h3, v0.h[3]
; GISEL-NEXT: fmov w8, s0		; GISEL-NEXT: fmov w8, s0
; GISEL-NEXT: fmov w9, s1		; GISEL-NEXT: fmov w9, s1
; GISEL-NEXT: fmov w10, s2		; GISEL-NEXT: fmov w10, s2
; GISEL-NEXT: fmov w11, s3		; GISEL-NEXT: fmov w11, s3
; GISEL-NEXT: orr w8, w8, w9		; GISEL-NEXT: orr w8, w8, w9
; GISEL-NEXT: orr w9, w10, w11		; GISEL-NEXT: orr w9, w10, w11
; GISEL-NEXT: orr w0, w8, w9		; GISEL-NEXT: orr w0, w8, w9
; GISEL-NEXT: ret		; GISEL-NEXT: ret
%or_result = call i8 @llvm.vector.reduce.or.v4i8(<4 x i8> %a)		%or_result = call i8 @llvm.vector.reduce.or.v4i8(<4 x i8> %a)
ret i8 %or_result		ret i8 %or_result
}		}

define i8 @test_redor_v8i8(<8 x i8> %a) {		define i8 @test_redor_v8i8(<8 x i8> %a) {
; CHECK-LABEL: test_redor_v8i8:		; CHECK-LABEL: test_redor_v8i8:
; CHECK: // %bb.0:		; CHECK: // %bb.0:
; CHECK-NEXT: // kill: def $d0 killed $d0 def $q0		; CHECK-NEXT: // kill: def $d0 killed $d0 def $q0
; CHECK-NEXT: umov w8, v0.b[1]		; CHECK-NEXT: umov w8, v0.b[5]
; CHECK-NEXT: umov w9, v0.b[0]		; CHECK-NEXT: umov w9, v0.b[4]
; CHECK-NEXT: umov w10, v0.b[2]		; CHECK-NEXT: umov w10, v0.b[1]
; CHECK-NEXT: umov w11, v0.b[3]		; CHECK-NEXT: umov w11, v0.b[0]
; CHECK-NEXT: umov w12, v0.b[4]		; CHECK-NEXT: umov w12, v0.b[3]
; CHECK-NEXT: umov w13, v0.b[5]		; CHECK-NEXT: umov w13, v0.b[2]
		; CHECK-NEXT: umov w14, v0.b[6]
		; CHECK-NEXT: umov w15, v0.b[7]
; CHECK-NEXT: orr w8, w9, w8		; CHECK-NEXT: orr w8, w9, w8
; CHECK-NEXT: umov w9, v0.b[6]		; CHECK-NEXT: orr w10, w11, w10
; CHECK-NEXT: orr w8, w8, w10		; CHECK-NEXT: orr w11, w13, w12
; CHECK-NEXT: umov w10, v0.b[7]		; CHECK-NEXT: orr w9, w10, w11
; CHECK-NEXT: orr w8, w8, w11		; CHECK-NEXT: orr w8, w8, w14
; CHECK-NEXT: orr w8, w8, w12		; CHECK-NEXT: orr w8, w9, w8
; CHECK-NEXT: orr w8, w8, w13		; CHECK-NEXT: orr w0, w8, w15
; CHECK-NEXT: orr w8, w8, w9
; CHECK-NEXT: orr w0, w8, w10
; CHECK-NEXT: ret		; CHECK-NEXT: ret
;		;
; GISEL-LABEL: test_redor_v8i8:		; GISEL-LABEL: test_redor_v8i8:
; GISEL: // %bb.0:		; GISEL: // %bb.0:
; GISEL-NEXT: // kill: def $d0 killed $d0 def $q0		; GISEL-NEXT: // kill: def $d0 killed $d0 def $q0
; GISEL-NEXT: mov b1, v0.b[1]		; GISEL-NEXT: mov b1, v0.b[1]
; GISEL-NEXT: mov b2, v0.b[2]		; GISEL-NEXT: mov b2, v0.b[2]
; GISEL-NEXT: mov b3, v0.b[3]		; GISEL-NEXT: mov b3, v0.b[3]
; ... (26 lines not shown)
; CHECK: // %bb.0:		; CHECK: // %bb.0:
; CHECK-NEXT: ext v1.16b, v0.16b, v0.16b, #8		; CHECK-NEXT: ext v1.16b, v0.16b, v0.16b, #8
; CHECK-NEXT: orr v0.8b, v0.8b, v1.8b		; CHECK-NEXT: orr v0.8b, v0.8b, v1.8b
; CHECK-NEXT: umov w8, v0.b[1]		; CHECK-NEXT: umov w8, v0.b[1]
; CHECK-NEXT: umov w9, v0.b[0]		; CHECK-NEXT: umov w9, v0.b[0]
; CHECK-NEXT: umov w10, v0.b[2]		; CHECK-NEXT: umov w10, v0.b[2]
; CHECK-NEXT: umov w11, v0.b[3]		; CHECK-NEXT: umov w11, v0.b[3]
; CHECK-NEXT: umov w12, v0.b[4]		; CHECK-NEXT: umov w12, v0.b[4]
		; CHECK-NEXT: umov w13, v0.b[5]
		; CHECK-NEXT: umov w14, v0.b[6]
; CHECK-NEXT: orr w8, w9, w8		; CHECK-NEXT: orr w8, w9, w8
; CHECK-NEXT: umov w9, v0.b[5]		; CHECK-NEXT: umov w9, v0.b[7]
		; CHECK-NEXT: orr w10, w10, w11
		; CHECK-NEXT: orr w11, w12, w13
; CHECK-NEXT: orr w8, w8, w10		; CHECK-NEXT: orr w8, w8, w10
; CHECK-NEXT: umov w10, v0.b[6]		; CHECK-NEXT: orr w10, w11, w14
; CHECK-NEXT: orr w8, w8, w11
; CHECK-NEXT: umov w11, v0.b[7]
; CHECK-NEXT: orr w8, w8, w12
; CHECK-NEXT: orr w8, w8, w9
; CHECK-NEXT: orr w8, w8, w10		; CHECK-NEXT: orr w8, w8, w10
; CHECK-NEXT: orr w0, w8, w11		; CHECK-NEXT: orr w0, w8, w9
; CHECK-NEXT: ret		; CHECK-NEXT: ret
;		;
; GISEL-LABEL: test_redor_v16i8:		; GISEL-LABEL: test_redor_v16i8:
; GISEL: // %bb.0:		; GISEL: // %bb.0:
; GISEL-NEXT: mov d1, v0.d[1]		; GISEL-NEXT: mov d1, v0.d[1]
; GISEL-NEXT: orr v0.8b, v0.8b, v1.8b		; GISEL-NEXT: orr v0.8b, v0.8b, v1.8b
; GISEL-NEXT: mov b1, v0.b[1]		; GISEL-NEXT: mov b1, v0.b[1]
; GISEL-NEXT: mov b2, v0.b[2]		; GISEL-NEXT: mov b2, v0.b[2]
; ... (28 lines not shown)
; CHECK-NEXT: orr v0.16b, v0.16b, v1.16b		; CHECK-NEXT: orr v0.16b, v0.16b, v1.16b
; CHECK-NEXT: ext v1.16b, v0.16b, v0.16b, #8		; CHECK-NEXT: ext v1.16b, v0.16b, v0.16b, #8
; CHECK-NEXT: orr v0.8b, v0.8b, v1.8b		; CHECK-NEXT: orr v0.8b, v0.8b, v1.8b
; CHECK-NEXT: umov w8, v0.b[1]		; CHECK-NEXT: umov w8, v0.b[1]
; CHECK-NEXT: umov w9, v0.b[0]		; CHECK-NEXT: umov w9, v0.b[0]
; CHECK-NEXT: umov w10, v0.b[2]		; CHECK-NEXT: umov w10, v0.b[2]
; CHECK-NEXT: umov w11, v0.b[3]		; CHECK-NEXT: umov w11, v0.b[3]
; CHECK-NEXT: umov w12, v0.b[4]		; CHECK-NEXT: umov w12, v0.b[4]
		; CHECK-NEXT: umov w13, v0.b[5]
		; CHECK-NEXT: umov w14, v0.b[6]
; CHECK-NEXT: orr w8, w9, w8		; CHECK-NEXT: orr w8, w9, w8
; CHECK-NEXT: umov w9, v0.b[5]		; CHECK-NEXT: umov w9, v0.b[7]
		; CHECK-NEXT: orr w10, w10, w11
		; CHECK-NEXT: orr w11, w12, w13
; CHECK-NEXT: orr w8, w8, w10		; CHECK-NEXT: orr w8, w8, w10
; CHECK-NEXT: umov w10, v0.b[6]		; CHECK-NEXT: orr w10, w11, w14
; CHECK-NEXT: orr w8, w8, w11
; CHECK-NEXT: umov w11, v0.b[7]
; CHECK-NEXT: orr w8, w8, w12
; CHECK-NEXT: orr w8, w8, w9
; CHECK-NEXT: orr w8, w8, w10		; CHECK-NEXT: orr w8, w8, w10
; CHECK-NEXT: orr w0, w8, w11		; CHECK-NEXT: orr w0, w8, w9
; CHECK-NEXT: ret		; CHECK-NEXT: ret
;		;
; GISEL-LABEL: test_redor_v32i8:		; GISEL-LABEL: test_redor_v32i8:
; GISEL: // %bb.0:		; GISEL: // %bb.0:
; GISEL-NEXT: orr v0.16b, v0.16b, v1.16b		; GISEL-NEXT: orr v0.16b, v0.16b, v1.16b
; GISEL-NEXT: mov d1, v0.d[1]		; GISEL-NEXT: mov d1, v0.d[1]
; GISEL-NEXT: orr v0.8b, v0.8b, v1.8b		; GISEL-NEXT: orr v0.8b, v0.8b, v1.8b
; GISEL-NEXT: mov b1, v0.b[1]		; GISEL-NEXT: mov b1, v0.b[1]
; ... (22 lines not shown)
	; GISEL-NEXT: ret
%or_result = call i8 @llvm.vector.reduce.or.v32i8(<32 x i8> %a)		%or_result = call i8 @llvm.vector.reduce.or.v32i8(<32 x i8> %a)
ret i8 %or_result		ret i8 %or_result
}		}

define i16 @test_redor_v4i16(<4 x i16> %a) {		define i16 @test_redor_v4i16(<4 x i16> %a) {
; CHECK-LABEL: test_redor_v4i16:		; CHECK-LABEL: test_redor_v4i16:
; CHECK: // %bb.0:		; CHECK: // %bb.0:
; CHECK-NEXT: // kill: def $d0 killed $d0 def $q0		; CHECK-NEXT: // kill: def $d0 killed $d0 def $q0
; CHECK-NEXT: umov w8, v0.h[1]		; CHECK-NEXT: umov w8, v0.h[3]
; CHECK-NEXT: umov w9, v0.h[0]		; CHECK-NEXT: umov w9, v0.h[2]
; CHECK-NEXT: umov w10, v0.h[2]		; CHECK-NEXT: umov w10, v0.h[1]
; CHECK-NEXT: umov w11, v0.h[3]		; CHECK-NEXT: umov w11, v0.h[0]
; CHECK-NEXT: orr w8, w9, w8		; CHECK-NEXT: orr w8, w9, w8
; CHECK-NEXT: orr w8, w8, w10		; CHECK-NEXT: orr w10, w11, w10
; CHECK-NEXT: orr w0, w8, w11		; CHECK-NEXT: orr w0, w10, w8
; CHECK-NEXT: ret		; CHECK-NEXT: ret
;		;
; GISEL-LABEL: test_redor_v4i16:		; GISEL-LABEL: test_redor_v4i16:
; GISEL: // %bb.0:		; GISEL: // %bb.0:
; GISEL-NEXT: // kill: def $d0 killed $d0 def $q0		; GISEL-NEXT: // kill: def $d0 killed $d0 def $q0
; GISEL-NEXT: mov h1, v0.h[1]		; GISEL-NEXT: mov h1, v0.h[1]
; GISEL-NEXT: mov h2, v0.h[2]		; GISEL-NEXT: mov h2, v0.h[2]
; GISEL-NEXT: mov h3, v0.h[3]		; GISEL-NEXT: mov h3, v0.h[3]
; ... (14 lines not shown)
; CHECK: // %bb.0:		; CHECK: // %bb.0:
; CHECK-NEXT: ext v1.16b, v0.16b, v0.16b, #8		; CHECK-NEXT: ext v1.16b, v0.16b, v0.16b, #8
; CHECK-NEXT: orr v0.8b, v0.8b, v1.8b		; CHECK-NEXT: orr v0.8b, v0.8b, v1.8b
; CHECK-NEXT: umov w8, v0.h[1]		; CHECK-NEXT: umov w8, v0.h[1]
; CHECK-NEXT: umov w9, v0.h[0]		; CHECK-NEXT: umov w9, v0.h[0]
; CHECK-NEXT: umov w10, v0.h[2]		; CHECK-NEXT: umov w10, v0.h[2]
; CHECK-NEXT: umov w11, v0.h[3]		; CHECK-NEXT: umov w11, v0.h[3]
; CHECK-NEXT: orr w8, w9, w8		; CHECK-NEXT: orr w8, w9, w8
; CHECK-NEXT: orr w8, w8, w10		; CHECK-NEXT: orr w9, w10, w11
; CHECK-NEXT: orr w0, w8, w11		; CHECK-NEXT: orr w0, w8, w9
; CHECK-NEXT: ret		; CHECK-NEXT: ret
;		;
; GISEL-LABEL: test_redor_v8i16:		; GISEL-LABEL: test_redor_v8i16:
; GISEL: // %bb.0:		; GISEL: // %bb.0:
; GISEL-NEXT: mov d1, v0.d[1]		; GISEL-NEXT: mov d1, v0.d[1]
; GISEL-NEXT: orr v0.8b, v0.8b, v1.8b		; GISEL-NEXT: orr v0.8b, v0.8b, v1.8b
; GISEL-NEXT: mov h1, v0.h[1]		; GISEL-NEXT: mov h1, v0.h[1]
; GISEL-NEXT: mov h2, v0.h[2]		; GISEL-NEXT: mov h2, v0.h[2]
; ... (16 lines not shown)
; CHECK-NEXT: orr v0.16b, v0.16b, v1.16b		; CHECK-NEXT: orr v0.16b, v0.16b, v1.16b
; CHECK-NEXT: ext v1.16b, v0.16b, v0.16b, #8		; CHECK-NEXT: ext v1.16b, v0.16b, v0.16b, #8
; CHECK-NEXT: orr v0.8b, v0.8b, v1.8b		; CHECK-NEXT: orr v0.8b, v0.8b, v1.8b
; CHECK-NEXT: umov w8, v0.h[1]		; CHECK-NEXT: umov w8, v0.h[1]
; CHECK-NEXT: umov w9, v0.h[0]		; CHECK-NEXT: umov w9, v0.h[0]
; CHECK-NEXT: umov w10, v0.h[2]		; CHECK-NEXT: umov w10, v0.h[2]
; CHECK-NEXT: umov w11, v0.h[3]		; CHECK-NEXT: umov w11, v0.h[3]
; CHECK-NEXT: orr w8, w9, w8		; CHECK-NEXT: orr w8, w9, w8
; CHECK-NEXT: orr w8, w8, w10		; CHECK-NEXT: orr w9, w10, w11
; CHECK-NEXT: orr w0, w8, w11		; CHECK-NEXT: orr w0, w8, w9
; CHECK-NEXT: ret		; CHECK-NEXT: ret
;		;
; GISEL-LABEL: test_redor_v16i16:		; GISEL-LABEL: test_redor_v16i16:
; GISEL: // %bb.0:		; GISEL: // %bb.0:
; GISEL-NEXT: orr v0.16b, v0.16b, v1.16b		; GISEL-NEXT: orr v0.16b, v0.16b, v1.16b
; GISEL-NEXT: mov d1, v0.d[1]		; GISEL-NEXT: mov d1, v0.d[1]
; GISEL-NEXT: orr v0.8b, v0.8b, v1.8b		; GISEL-NEXT: orr v0.8b, v0.8b, v1.8b
; GISEL-NEXT: mov h1, v0.h[1]		; GISEL-NEXT: mov h1, v0.h[1]
; ... (142 lines not shown)

llvm/test/CodeGen/AArch64/reduce-shuffle.ll

	; ... (35 lines not shown)
	; CHECK-NEXT: shll2 v0.4s, v0.8h, #16			; CHECK-NEXT: shll2 v0.4s, v0.8h, #16
	; CHECK-NEXT: saddw2 v2.4s, v2.4s, v3.8h			; CHECK-NEXT: saddw2 v2.4s, v2.4s, v3.8h
	; CHECK-NEXT: saddw v3.4s, v4.4s, v3.4h			; CHECK-NEXT: saddw v3.4s, v4.4s, v3.4h
	; CHECK-NEXT: saddw2 v0.4s, v0.4s, v1.8h			; CHECK-NEXT: saddw2 v0.4s, v0.4s, v1.8h
	; CHECK-NEXT: saddw v1.4s, v5.4s, v1.4h			; CHECK-NEXT: saddw v1.4s, v5.4s, v1.4h
	; CHECK-NEXT: uzp2 v5.4s, v3.4s, v2.4s			; CHECK-NEXT: uzp2 v5.4s, v3.4s, v2.4s
	; CHECK-NEXT: ext v16.16b, v3.16b, v3.16b, #12			; CHECK-NEXT: ext v16.16b, v3.16b, v3.16b, #12
	; CHECK-NEXT: zip1 v17.4s, v1.4s, v0.4s			; CHECK-NEXT: zip1 v17.4s, v1.4s, v0.4s
				; CHECK-NEXT: mov v7.16b, v3.16b
				; CHECK-NEXT: zip2 v4.4s, v2.4s, v3.4s
	; CHECK-NEXT: zip2 v6.4s, v1.4s, v0.4s			; CHECK-NEXT: zip2 v6.4s, v1.4s, v0.4s
	; CHECK-NEXT: zip2 v18.4s, v3.4s, v2.4s			; CHECK-NEXT: zip2 v18.4s, v3.4s, v2.4s
	; CHECK-NEXT: uzp2 v5.4s, v5.4s, v3.4s
	; CHECK-NEXT: ext v19.16b, v1.16b, v17.16b, #8
	; CHECK-NEXT: mov v1.s[3], v0.s[2]
	; CHECK-NEXT: zip2 v4.4s, v2.4s, v3.4s
	; CHECK-NEXT: mov v7.16b, v3.16b
	; CHECK-NEXT: ext v16.16b, v2.16b, v16.16b, #12
	; CHECK-NEXT: mov v7.s[0], v2.s[1]			; CHECK-NEXT: mov v7.s[0], v2.s[1]
				; CHECK-NEXT: ext v16.16b, v2.16b, v16.16b, #12
				; CHECK-NEXT: ext v19.16b, v1.16b, v17.16b, #8
				; CHECK-NEXT: uzp2 v5.4s, v5.4s, v3.4s
	; CHECK-NEXT: mov v2.s[1], v3.s[0]			; CHECK-NEXT: mov v2.s[1], v3.s[0]
				; CHECK-NEXT: mov v1.s[3], v0.s[2]
				; CHECK-NEXT: mov v7.d[1], v17.d[1]
	; CHECK-NEXT: mov v5.d[1], v6.d[1]			; CHECK-NEXT: mov v5.d[1], v6.d[1]
				; CHECK-NEXT: mov v2.d[1], v19.d[1]
	; CHECK-NEXT: mov v18.d[1], v1.d[1]			; CHECK-NEXT: mov v18.d[1], v1.d[1]
	; CHECK-NEXT: mov v16.d[1], v6.d[1]			; CHECK-NEXT: mov v16.d[1], v6.d[1]
	; CHECK-NEXT: mov v4.d[1], v1.d[1]			; CHECK-NEXT: mov v4.d[1], v1.d[1]
	; CHECK-NEXT: mov v7.d[1], v17.d[1]			; CHECK-NEXT: add v0.4s, v7.4s, v2.4s
	; CHECK-NEXT: mov v2.d[1], v19.d[1]
	; CHECK-NEXT: add v1.4s, v5.4s, v18.4s			; CHECK-NEXT: add v1.4s, v5.4s, v18.4s
				; CHECK-NEXT: rev64 v5.4s, v0.4s
	; CHECK-NEXT: sub v3.4s, v4.4s, v16.4s			; CHECK-NEXT: sub v3.4s, v4.4s, v16.4s
	; CHECK-NEXT: rev64 v4.4s, v1.4s			; CHECK-NEXT: rev64 v4.4s, v1.4s
	; CHECK-NEXT: add v0.4s, v7.4s, v2.4s
	; CHECK-NEXT: sub v2.4s, v2.4s, v7.4s			; CHECK-NEXT: sub v2.4s, v2.4s, v7.4s
	; CHECK-NEXT: rev64 v5.4s, v0.4s			; CHECK-NEXT: mov v5.d[1], v0.d[1]
	; CHECK-NEXT: mov v4.d[1], v1.d[1]
	; CHECK-NEXT: add v6.4s, v3.4s, v2.4s			; CHECK-NEXT: add v6.4s, v3.4s, v2.4s
	; CHECK-NEXT: sub v2.4s, v2.4s, v3.4s			; CHECK-NEXT: sub v2.4s, v2.4s, v3.4s
	; CHECK-NEXT: mov v5.d[1], v0.d[1]			; CHECK-NEXT: mov v4.d[1], v1.d[1]
	; CHECK-NEXT: sub v0.4s, v0.4s, v4.4s
	; CHECK-NEXT: rev64 v7.4s, v2.4s			; CHECK-NEXT: rev64 v7.4s, v2.4s
	; CHECK-NEXT: rev64 v3.4s, v6.4s			; CHECK-NEXT: rev64 v3.4s, v6.4s
	; CHECK-NEXT: rev64 v4.4s, v0.4s
	; CHECK-NEXT: add v1.4s, v1.4s, v5.4s			; CHECK-NEXT: add v1.4s, v1.4s, v5.4s
				; CHECK-NEXT: sub v0.4s, v0.4s, v4.4s
	; CHECK-NEXT: sub v7.4s, v2.4s, v7.4s			; CHECK-NEXT: sub v7.4s, v2.4s, v7.4s
	; CHECK-NEXT: addp v5.4s, v1.4s, v6.4s			; CHECK-NEXT: addp v5.4s, v1.4s, v6.4s
	; CHECK-NEXT: addp v2.4s, v0.4s, v2.4s			; CHECK-NEXT: addp v2.4s, v0.4s, v2.4s
	; CHECK-NEXT: sub v3.4s, v6.4s, v3.4s			; CHECK-NEXT: sub v3.4s, v6.4s, v3.4s
				; CHECK-NEXT: rev64 v4.4s, v0.4s
	; CHECK-NEXT: rev64 v6.4s, v1.4s			; CHECK-NEXT: rev64 v6.4s, v1.4s
	; CHECK-NEXT: sub v0.4s, v0.4s, v4.4s
	; CHECK-NEXT: zip1 v16.4s, v5.4s, v5.4s			; CHECK-NEXT: zip1 v16.4s, v5.4s, v5.4s
	; CHECK-NEXT: ext v17.16b, v2.16b, v7.16b, #4			; CHECK-NEXT: ext v17.16b, v2.16b, v7.16b, #4
	; CHECK-NEXT: ext v18.16b, v5.16b, v3.16b, #4			; CHECK-NEXT: ext v18.16b, v5.16b, v3.16b, #4
	; CHECK-NEXT: ext v4.16b, v0.16b, v2.16b, #8			; CHECK-NEXT: sub v0.4s, v0.4s, v4.4s
	; CHECK-NEXT: sub v1.4s, v1.4s, v6.4s			; CHECK-NEXT: sub v1.4s, v1.4s, v6.4s
				; CHECK-NEXT: ext v4.16b, v0.16b, v2.16b, #8
	; CHECK-NEXT: ext v6.16b, v1.16b, v5.16b, #4			; CHECK-NEXT: ext v6.16b, v1.16b, v5.16b, #4
	; CHECK-NEXT: trn2 v1.4s, v16.4s, v1.4s			; CHECK-NEXT: trn2 v1.4s, v16.4s, v1.4s
	; CHECK-NEXT: zip2 v16.4s, v17.4s, v2.4s			; CHECK-NEXT: zip2 v16.4s, v17.4s, v2.4s
	; CHECK-NEXT: zip2 v17.4s, v18.4s, v5.4s			; CHECK-NEXT: zip2 v17.4s, v18.4s, v5.4s
	; CHECK-NEXT: ext v18.16b, v4.16b, v0.16b, #4			; CHECK-NEXT: ext v18.16b, v4.16b, v0.16b, #4
	; CHECK-NEXT: ext v6.16b, v6.16b, v6.16b, #4			; CHECK-NEXT: ext v6.16b, v6.16b, v6.16b, #4
	; CHECK-NEXT: ext v16.16b, v7.16b, v16.16b, #12			; CHECK-NEXT: ext v16.16b, v7.16b, v16.16b, #12
	; CHECK-NEXT: ext v17.16b, v3.16b, v17.16b, #12			; CHECK-NEXT: ext v17.16b, v3.16b, v17.16b, #12
	; CHECK-NEXT: mov v0.s[2], v2.s[1]
	; CHECK-NEXT: uzp2 v4.4s, v4.4s, v18.4s
	; CHECK-NEXT: mov v3.s[2], v5.s[3]			; CHECK-NEXT: mov v3.s[2], v5.s[3]
	; CHECK-NEXT: mov v7.s[2], v2.s[3]			; CHECK-NEXT: mov v7.s[2], v2.s[3]
	; CHECK-NEXT: sub v18.4s, v1.4s, v6.4s			; CHECK-NEXT: mov v0.s[2], v2.s[1]
	; CHECK-NEXT: mov v6.s[0], v5.s[1]			; CHECK-NEXT: uzp2 v4.4s, v4.4s, v18.4s
	; CHECK-NEXT: sub v19.4s, v0.4s, v4.4s
	; CHECK-NEXT: sub v20.4s, v3.4s, v17.4s			; CHECK-NEXT: sub v20.4s, v3.4s, v17.4s
	; CHECK-NEXT: sub v21.4s, v7.4s, v16.4s			; CHECK-NEXT: sub v21.4s, v7.4s, v16.4s
	; CHECK-NEXT: mov v0.s[1], v2.s[0]
	; CHECK-NEXT: mov v3.s[1], v5.s[2]			; CHECK-NEXT: mov v3.s[1], v5.s[2]
	; CHECK-NEXT: mov v7.s[1], v2.s[2]			; CHECK-NEXT: mov v7.s[1], v2.s[2]
	; CHECK-NEXT: add v1.4s, v1.4s, v6.4s			; CHECK-NEXT: sub v18.4s, v1.4s, v6.4s
	; CHECK-NEXT: add v0.4s, v0.4s, v4.4s			; CHECK-NEXT: mov v6.s[0], v5.s[1]
				; CHECK-NEXT: sub v19.4s, v0.4s, v4.4s
				; CHECK-NEXT: mov v0.s[1], v2.s[0]
	; CHECK-NEXT: add v2.4s, v3.4s, v17.4s			; CHECK-NEXT: add v2.4s, v3.4s, v17.4s
	; CHECK-NEXT: add v3.4s, v7.4s, v16.4s			; CHECK-NEXT: add v3.4s, v7.4s, v16.4s
	; CHECK-NEXT: mov v1.d[1], v18.d[1]			; CHECK-NEXT: add v1.4s, v1.4s, v6.4s
	; CHECK-NEXT: mov v0.d[1], v19.d[1]
	; CHECK-NEXT: mov v3.d[1], v21.d[1]			; CHECK-NEXT: mov v3.d[1], v21.d[1]
	; CHECK-NEXT: mov v2.d[1], v20.d[1]			; CHECK-NEXT: mov v2.d[1], v20.d[1]
	; CHECK-NEXT: cmlt v4.8h, v1.8h, #0			; CHECK-NEXT: add v0.4s, v0.4s, v4.4s
	; CHECK-NEXT: cmlt v5.8h, v0.8h, #0			; CHECK-NEXT: mov v1.d[1], v18.d[1]
				; CHECK-NEXT: mov v0.d[1], v19.d[1]
	; CHECK-NEXT: cmlt v6.8h, v3.8h, #0			; CHECK-NEXT: cmlt v6.8h, v3.8h, #0
	; CHECK-NEXT: cmlt v7.8h, v2.8h, #0			; CHECK-NEXT: cmlt v7.8h, v2.8h, #0
				; CHECK-NEXT: cmlt v4.8h, v1.8h, #0
	; CHECK-NEXT: add v3.4s, v6.4s, v3.4s			; CHECK-NEXT: add v3.4s, v6.4s, v3.4s
	; CHECK-NEXT: add v2.4s, v7.4s, v2.4s			; CHECK-NEXT: add v2.4s, v7.4s, v2.4s
				; CHECK-NEXT: cmlt v5.8h, v0.8h, #0
	; CHECK-NEXT: add v1.4s, v4.4s, v1.4s			; CHECK-NEXT: add v1.4s, v4.4s, v1.4s
	; CHECK-NEXT: add v0.4s, v5.4s, v0.4s
	; CHECK-NEXT: eor v1.16b, v1.16b, v4.16b
	; CHECK-NEXT: eor v0.16b, v0.16b, v5.16b
	; CHECK-NEXT: eor v2.16b, v2.16b, v7.16b			; CHECK-NEXT: eor v2.16b, v2.16b, v7.16b
	; CHECK-NEXT: eor v3.16b, v3.16b, v6.16b			; CHECK-NEXT: eor v3.16b, v3.16b, v6.16b
	; CHECK-NEXT: add v2.4s, v2.4s, v3.4s			; CHECK-NEXT: add v2.4s, v2.4s, v3.4s
	; CHECK-NEXT: add v0.4s, v1.4s, v0.4s			; CHECK-NEXT: add v0.4s, v5.4s, v0.4s
	; CHECK-NEXT: add v0.4s, v0.4s, v2.4s			; CHECK-NEXT: eor v1.16b, v1.16b, v4.16b
				; CHECK-NEXT: add v1.4s, v1.4s, v2.4s
				; CHECK-NEXT: eor v0.16b, v0.16b, v5.16b
				; CHECK-NEXT: add v0.4s, v0.4s, v1.4s
	; CHECK-NEXT: addv s0, v0.4s			; CHECK-NEXT: addv s0, v0.4s
	; CHECK-NEXT: fmov w8, s0			; CHECK-NEXT: fmov w8, s0
	; CHECK-NEXT: lsr w9, w8, #16			; CHECK-NEXT: lsr w9, w8, #16
	; CHECK-NEXT: add w8, w9, w8, uxth			; CHECK-NEXT: add w8, w9, w8, uxth
	; CHECK-NEXT: lsr w0, w8, #1			; CHECK-NEXT: lsr w0, w8, #1
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	entry:			entry:
	%idx.ext = sext i32 %i1 to i64			%idx.ext = sext i32 %i1 to i64
	[... 179 lines not shown ...]
	; CHECK-NEXT: zip1 v6.4s, v1.4s, v3.4s			; CHECK-NEXT: zip1 v6.4s, v1.4s, v3.4s
	; CHECK-NEXT: zip2 v7.4s, v1.4s, v3.4s			; CHECK-NEXT: zip2 v7.4s, v1.4s, v3.4s
	; CHECK-NEXT: zip2 v1.4s, v3.4s, v1.4s			; CHECK-NEXT: zip2 v1.4s, v3.4s, v1.4s
	; CHECK-NEXT: zip1 v17.4s, v2.4s, v0.4s			; CHECK-NEXT: zip1 v17.4s, v2.4s, v0.4s
	; CHECK-NEXT: zip2 v2.4s, v2.4s, v0.4s			; CHECK-NEXT: zip2 v2.4s, v2.4s, v0.4s
	; CHECK-NEXT: ext v0.16b, v4.16b, v0.16b, #8			; CHECK-NEXT: ext v0.16b, v4.16b, v0.16b, #8
	; CHECK-NEXT: ext v3.16b, v16.16b, v3.16b, #8			; CHECK-NEXT: ext v3.16b, v16.16b, v3.16b, #8
	; CHECK-NEXT: add v1.4s, v5.4s, v1.4s			; CHECK-NEXT: add v1.4s, v5.4s, v1.4s
	; CHECK-NEXT: sub v5.4s, v6.4s, v17.4s			; CHECK-NEXT: sub v2.4s, v7.4s, v2.4s
	; CHECK-NEXT: ext v0.16b, v0.16b, v4.16b, #4			; CHECK-NEXT: ext v0.16b, v0.16b, v4.16b, #4
	; CHECK-NEXT: ext v3.16b, v3.16b, v16.16b, #4			; CHECK-NEXT: ext v3.16b, v3.16b, v16.16b, #4
	; CHECK-NEXT: cmlt v6.8h, v5.8h, #0			; CHECK-NEXT: sub v5.4s, v6.4s, v17.4s
	; CHECK-NEXT: sub v2.4s, v7.4s, v2.4s
	; CHECK-NEXT: add v4.4s, v6.4s, v5.4s
	; CHECK-NEXT: add v0.4s, v0.4s, v3.4s
	; CHECK-NEXT: cmlt v7.8h, v2.8h, #0			; CHECK-NEXT: cmlt v7.8h, v2.8h, #0
	; CHECK-NEXT: cmlt v17.8h, v1.8h, #0			; CHECK-NEXT: cmlt v17.8h, v1.8h, #0
	; CHECK-NEXT: eor v3.16b, v4.16b, v6.16b			; CHECK-NEXT: cmlt v6.8h, v5.8h, #0
	; CHECK-NEXT: cmlt v4.8h, v0.8h, #0
	; CHECK-NEXT: add v1.4s, v17.4s, v1.4s			; CHECK-NEXT: add v1.4s, v17.4s, v1.4s
	; CHECK-NEXT: add v2.4s, v7.4s, v2.4s			; CHECK-NEXT: add v2.4s, v7.4s, v2.4s
	; CHECK-NEXT: add v0.4s, v4.4s, v0.4s			; CHECK-NEXT: add v0.4s, v0.4s, v3.4s
				; CHECK-NEXT: add v4.4s, v6.4s, v5.4s
	; CHECK-NEXT: eor v2.16b, v2.16b, v7.16b			; CHECK-NEXT: eor v2.16b, v2.16b, v7.16b
	; CHECK-NEXT: eor v1.16b, v1.16b, v17.16b			; CHECK-NEXT: eor v1.16b, v1.16b, v17.16b
	; CHECK-NEXT: eor v0.16b, v0.16b, v4.16b			; CHECK-NEXT: cmlt v3.8h, v0.8h, #0
	; CHECK-NEXT: add v1.4s, v1.4s, v2.4s			; CHECK-NEXT: add v1.4s, v1.4s, v2.4s
	; CHECK-NEXT: add v0.4s, v0.4s, v3.4s			; CHECK-NEXT: add v0.4s, v3.4s, v0.4s
				; CHECK-NEXT: eor v2.16b, v4.16b, v6.16b
				; CHECK-NEXT: add v1.4s, v2.4s, v1.4s
				; CHECK-NEXT: eor v0.16b, v0.16b, v3.16b
	; CHECK-NEXT: add v0.4s, v0.4s, v1.4s			; CHECK-NEXT: add v0.4s, v0.4s, v1.4s
	; CHECK-NEXT: addv s0, v0.4s			; CHECK-NEXT: addv s0, v0.4s
	; CHECK-NEXT: fmov w8, s0			; CHECK-NEXT: fmov w8, s0
	; CHECK-NEXT: lsr w9, w8, #16			; CHECK-NEXT: lsr w9, w8, #16
	; CHECK-NEXT: add w8, w9, w8, uxth			; CHECK-NEXT: add w8, w9, w8, uxth
	; CHECK-NEXT: lsr w0, w8, #1			; CHECK-NEXT: lsr w0, w8, #1
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	entry:			entry:
	[... 189 lines not shown ...]
	; CHECK-NEXT: sub v4.4s, v4.4s, v17.4s			; CHECK-NEXT: sub v4.4s, v4.4s, v17.4s
	; CHECK-NEXT: ext v2.16b, v2.16b, v6.16b, #4			; CHECK-NEXT: ext v2.16b, v2.16b, v6.16b, #4
	; CHECK-NEXT: ext v0.16b, v0.16b, v3.16b, #4			; CHECK-NEXT: ext v0.16b, v0.16b, v3.16b, #4
	; CHECK-NEXT: sub v3.4s, v5.4s, v7.4s			; CHECK-NEXT: sub v3.4s, v5.4s, v7.4s
	; CHECK-NEXT: cmlt v5.8h, v4.8h, #0			; CHECK-NEXT: cmlt v5.8h, v4.8h, #0
	; CHECK-NEXT: cmlt v6.8h, v3.8h, #0			; CHECK-NEXT: cmlt v6.8h, v3.8h, #0
	; CHECK-NEXT: add v0.4s, v0.4s, v2.4s			; CHECK-NEXT: add v0.4s, v0.4s, v2.4s
	; CHECK-NEXT: cmlt v2.8h, v1.8h, #0			; CHECK-NEXT: cmlt v2.8h, v1.8h, #0
	; CHECK-NEXT: cmlt v7.8h, v0.8h, #0
	; CHECK-NEXT: add v1.4s, v2.4s, v1.4s
	; CHECK-NEXT: add v3.4s, v6.4s, v3.4s			; CHECK-NEXT: add v3.4s, v6.4s, v3.4s
				; CHECK-NEXT: add v1.4s, v2.4s, v1.4s
				; CHECK-NEXT: cmlt v7.8h, v0.8h, #0
	; CHECK-NEXT: add v4.4s, v5.4s, v4.4s			; CHECK-NEXT: add v4.4s, v5.4s, v4.4s
	; CHECK-NEXT: add v0.4s, v7.4s, v0.4s
	; CHECK-NEXT: eor v4.16b, v4.16b, v5.16b
	; CHECK-NEXT: eor v0.16b, v0.16b, v7.16b
	; CHECK-NEXT: eor v3.16b, v3.16b, v6.16b			; CHECK-NEXT: eor v3.16b, v3.16b, v6.16b
	; CHECK-NEXT: eor v1.16b, v1.16b, v2.16b			; CHECK-NEXT: eor v1.16b, v1.16b, v2.16b
	; CHECK-NEXT: add v1.4s, v1.4s, v3.4s			; CHECK-NEXT: add v1.4s, v1.4s, v3.4s
	; CHECK-NEXT: add v0.4s, v0.4s, v4.4s			; CHECK-NEXT: add v0.4s, v7.4s, v0.4s
				; CHECK-NEXT: eor v2.16b, v4.16b, v5.16b
				; CHECK-NEXT: add v1.4s, v2.4s, v1.4s
				; CHECK-NEXT: eor v0.16b, v0.16b, v7.16b
	; CHECK-NEXT: add v0.4s, v0.4s, v1.4s			; CHECK-NEXT: add v0.4s, v0.4s, v1.4s
	; CHECK-NEXT: addv s0, v0.4s			; CHECK-NEXT: addv s0, v0.4s
	; CHECK-NEXT: fmov w8, s0			; CHECK-NEXT: fmov w8, s0
	; CHECK-NEXT: lsr w9, w8, #16			; CHECK-NEXT: lsr w9, w8, #16
	; CHECK-NEXT: add w8, w9, w8, uxth			; CHECK-NEXT: add w8, w9, w8, uxth
	; CHECK-NEXT: lsr w0, w8, #1			; CHECK-NEXT: lsr w0, w8, #1
	; CHECK-NEXT: ret			; CHECK-NEXT: ret
	entry:			entry:
	[... 96 lines not shown ...]

llvm/test/CodeGen/AArch64/reduce-xor.ll

[... 256 lines not shown ...]	; GISEL-NEXT: ret
%xor_result = call i8 @llvm.vector.reduce.xor.v3i8(<3 x i8> %a)		%xor_result = call i8 @llvm.vector.reduce.xor.v3i8(<3 x i8> %a)
ret i8 %xor_result		ret i8 %xor_result
}		}

define i8 @test_redxor_v4i8(<4 x i8> %a) {		define i8 @test_redxor_v4i8(<4 x i8> %a) {
; CHECK-LABEL: test_redxor_v4i8:		; CHECK-LABEL: test_redxor_v4i8:
; CHECK: // %bb.0:		; CHECK: // %bb.0:
; CHECK-NEXT: // kill: def $d0 killed $d0 def $q0		; CHECK-NEXT: // kill: def $d0 killed $d0 def $q0
; CHECK-NEXT: umov w8, v0.h[1]		; CHECK-NEXT: umov w8, v0.h[3]
; CHECK-NEXT: umov w9, v0.h[0]		; CHECK-NEXT: umov w9, v0.h[2]
; CHECK-NEXT: umov w10, v0.h[2]		; CHECK-NEXT: umov w10, v0.h[1]
; CHECK-NEXT: umov w11, v0.h[3]		; CHECK-NEXT: umov w11, v0.h[0]
; CHECK-NEXT: eor w8, w9, w8		; CHECK-NEXT: eor w8, w9, w8
; CHECK-NEXT: eor w8, w8, w10		; CHECK-NEXT: eor w10, w11, w10
; CHECK-NEXT: eor w0, w8, w11		; CHECK-NEXT: eor w0, w10, w8
; CHECK-NEXT: ret		; CHECK-NEXT: ret
;		;
; GISEL-LABEL: test_redxor_v4i8:		; GISEL-LABEL: test_redxor_v4i8:
; GISEL: // %bb.0:		; GISEL: // %bb.0:
; GISEL-NEXT: // kill: def $d0 killed $d0 def $q0		; GISEL-NEXT: // kill: def $d0 killed $d0 def $q0
; GISEL-NEXT: mov h1, v0.h[1]		; GISEL-NEXT: mov h1, v0.h[1]
; GISEL-NEXT: mov h2, v0.h[2]		; GISEL-NEXT: mov h2, v0.h[2]
; GISEL-NEXT: mov h3, v0.h[3]		; GISEL-NEXT: mov h3, v0.h[3]
; GISEL-NEXT: fmov w8, s0		; GISEL-NEXT: fmov w8, s0
; GISEL-NEXT: fmov w9, s1		; GISEL-NEXT: fmov w9, s1
; GISEL-NEXT: fmov w10, s2		; GISEL-NEXT: fmov w10, s2
; GISEL-NEXT: fmov w11, s3		; GISEL-NEXT: fmov w11, s3
; GISEL-NEXT: eor w8, w8, w9		; GISEL-NEXT: eor w8, w8, w9
; GISEL-NEXT: eor w9, w10, w11		; GISEL-NEXT: eor w9, w10, w11
; GISEL-NEXT: eor w0, w8, w9		; GISEL-NEXT: eor w0, w8, w9
; GISEL-NEXT: ret		; GISEL-NEXT: ret
%xor_result = call i8 @llvm.vector.reduce.xor.v4i8(<4 x i8> %a)		%xor_result = call i8 @llvm.vector.reduce.xor.v4i8(<4 x i8> %a)
ret i8 %xor_result		ret i8 %xor_result
}		}

define i8 @test_redxor_v8i8(<8 x i8> %a) {		define i8 @test_redxor_v8i8(<8 x i8> %a) {
; CHECK-LABEL: test_redxor_v8i8:		; CHECK-LABEL: test_redxor_v8i8:
; CHECK: // %bb.0:		; CHECK: // %bb.0:
; CHECK-NEXT: // kill: def $d0 killed $d0 def $q0		; CHECK-NEXT: // kill: def $d0 killed $d0 def $q0
; CHECK-NEXT: umov w8, v0.b[1]		; CHECK-NEXT: umov w8, v0.b[5]
; CHECK-NEXT: umov w9, v0.b[0]		; CHECK-NEXT: umov w9, v0.b[4]
; CHECK-NEXT: umov w10, v0.b[2]		; CHECK-NEXT: umov w10, v0.b[1]
; CHECK-NEXT: umov w11, v0.b[3]		; CHECK-NEXT: umov w11, v0.b[0]
; CHECK-NEXT: umov w12, v0.b[4]		; CHECK-NEXT: umov w12, v0.b[3]
; CHECK-NEXT: umov w13, v0.b[5]		; CHECK-NEXT: umov w13, v0.b[2]
		; CHECK-NEXT: umov w14, v0.b[6]
		; CHECK-NEXT: umov w15, v0.b[7]
; CHECK-NEXT: eor w8, w9, w8		; CHECK-NEXT: eor w8, w9, w8
; CHECK-NEXT: umov w9, v0.b[6]		; CHECK-NEXT: eor w10, w11, w10
; CHECK-NEXT: eor w8, w8, w10		; CHECK-NEXT: eor w11, w13, w12
; CHECK-NEXT: umov w10, v0.b[7]		; CHECK-NEXT: eor w9, w10, w11
; CHECK-NEXT: eor w8, w8, w11		; CHECK-NEXT: eor w8, w8, w14
; CHECK-NEXT: eor w8, w8, w12		; CHECK-NEXT: eor w8, w9, w8
; CHECK-NEXT: eor w8, w8, w13		; CHECK-NEXT: eor w0, w8, w15
; CHECK-NEXT: eor w8, w8, w9
; CHECK-NEXT: eor w0, w8, w10
; CHECK-NEXT: ret		; CHECK-NEXT: ret
;		;
; GISEL-LABEL: test_redxor_v8i8:		; GISEL-LABEL: test_redxor_v8i8:
; GISEL: // %bb.0:		; GISEL: // %bb.0:
; GISEL-NEXT: // kill: def $d0 killed $d0 def $q0		; GISEL-NEXT: // kill: def $d0 killed $d0 def $q0
; GISEL-NEXT: mov b1, v0.b[1]		; GISEL-NEXT: mov b1, v0.b[1]
; GISEL-NEXT: mov b2, v0.b[2]		; GISEL-NEXT: mov b2, v0.b[2]
; GISEL-NEXT: mov b3, v0.b[3]		; GISEL-NEXT: mov b3, v0.b[3]
[... 26 lines not shown ...]
; CHECK: // %bb.0:		; CHECK: // %bb.0:
; CHECK-NEXT: ext v1.16b, v0.16b, v0.16b, #8		; CHECK-NEXT: ext v1.16b, v0.16b, v0.16b, #8
; CHECK-NEXT: eor v0.8b, v0.8b, v1.8b		; CHECK-NEXT: eor v0.8b, v0.8b, v1.8b
; CHECK-NEXT: umov w8, v0.b[1]		; CHECK-NEXT: umov w8, v0.b[1]
; CHECK-NEXT: umov w9, v0.b[0]		; CHECK-NEXT: umov w9, v0.b[0]
; CHECK-NEXT: umov w10, v0.b[2]		; CHECK-NEXT: umov w10, v0.b[2]
; CHECK-NEXT: umov w11, v0.b[3]		; CHECK-NEXT: umov w11, v0.b[3]
; CHECK-NEXT: umov w12, v0.b[4]		; CHECK-NEXT: umov w12, v0.b[4]
		; CHECK-NEXT: umov w13, v0.b[5]
		; CHECK-NEXT: umov w14, v0.b[6]
; CHECK-NEXT: eor w8, w9, w8		; CHECK-NEXT: eor w8, w9, w8
; CHECK-NEXT: umov w9, v0.b[5]		; CHECK-NEXT: umov w9, v0.b[7]
		; CHECK-NEXT: eor w10, w10, w11
		; CHECK-NEXT: eor w11, w12, w13
; CHECK-NEXT: eor w8, w8, w10		; CHECK-NEXT: eor w8, w8, w10
; CHECK-NEXT: umov w10, v0.b[6]		; CHECK-NEXT: eor w10, w11, w14
; CHECK-NEXT: eor w8, w8, w11
; CHECK-NEXT: umov w11, v0.b[7]
; CHECK-NEXT: eor w8, w8, w12
; CHECK-NEXT: eor w8, w8, w9
; CHECK-NEXT: eor w8, w8, w10		; CHECK-NEXT: eor w8, w8, w10
; CHECK-NEXT: eor w0, w8, w11		; CHECK-NEXT: eor w0, w8, w9
; CHECK-NEXT: ret		; CHECK-NEXT: ret
;		;
; GISEL-LABEL: test_redxor_v16i8:		; GISEL-LABEL: test_redxor_v16i8:
; GISEL: // %bb.0:		; GISEL: // %bb.0:
; GISEL-NEXT: mov d1, v0.d[1]		; GISEL-NEXT: mov d1, v0.d[1]
; GISEL-NEXT: eor v0.8b, v0.8b, v1.8b		; GISEL-NEXT: eor v0.8b, v0.8b, v1.8b
; GISEL-NEXT: mov b1, v0.b[1]		; GISEL-NEXT: mov b1, v0.b[1]
; GISEL-NEXT: mov b2, v0.b[2]		; GISEL-NEXT: mov b2, v0.b[2]
[... 28 lines not shown ...]
; CHECK-NEXT: eor v0.16b, v0.16b, v1.16b		; CHECK-NEXT: eor v0.16b, v0.16b, v1.16b
; CHECK-NEXT: ext v1.16b, v0.16b, v0.16b, #8		; CHECK-NEXT: ext v1.16b, v0.16b, v0.16b, #8
; CHECK-NEXT: eor v0.8b, v0.8b, v1.8b		; CHECK-NEXT: eor v0.8b, v0.8b, v1.8b
; CHECK-NEXT: umov w8, v0.b[1]		; CHECK-NEXT: umov w8, v0.b[1]
; CHECK-NEXT: umov w9, v0.b[0]		; CHECK-NEXT: umov w9, v0.b[0]
; CHECK-NEXT: umov w10, v0.b[2]		; CHECK-NEXT: umov w10, v0.b[2]
; CHECK-NEXT: umov w11, v0.b[3]		; CHECK-NEXT: umov w11, v0.b[3]
; CHECK-NEXT: umov w12, v0.b[4]		; CHECK-NEXT: umov w12, v0.b[4]
		; CHECK-NEXT: umov w13, v0.b[5]
		; CHECK-NEXT: umov w14, v0.b[6]
; CHECK-NEXT: eor w8, w9, w8		; CHECK-NEXT: eor w8, w9, w8
; CHECK-NEXT: umov w9, v0.b[5]		; CHECK-NEXT: umov w9, v0.b[7]
		; CHECK-NEXT: eor w10, w10, w11
		; CHECK-NEXT: eor w11, w12, w13
; CHECK-NEXT: eor w8, w8, w10		; CHECK-NEXT: eor w8, w8, w10
; CHECK-NEXT: umov w10, v0.b[6]		; CHECK-NEXT: eor w10, w11, w14
; CHECK-NEXT: eor w8, w8, w11
; CHECK-NEXT: umov w11, v0.b[7]
; CHECK-NEXT: eor w8, w8, w12
; CHECK-NEXT: eor w8, w8, w9
; CHECK-NEXT: eor w8, w8, w10		; CHECK-NEXT: eor w8, w8, w10
; CHECK-NEXT: eor w0, w8, w11		; CHECK-NEXT: eor w0, w8, w9
; CHECK-NEXT: ret		; CHECK-NEXT: ret
;		;
; GISEL-LABEL: test_redxor_v32i8:		; GISEL-LABEL: test_redxor_v32i8:
; GISEL: // %bb.0:		; GISEL: // %bb.0:
; GISEL-NEXT: eor v0.16b, v0.16b, v1.16b		; GISEL-NEXT: eor v0.16b, v0.16b, v1.16b
; GISEL-NEXT: mov d1, v0.d[1]		; GISEL-NEXT: mov d1, v0.d[1]
; GISEL-NEXT: eor v0.8b, v0.8b, v1.8b		; GISEL-NEXT: eor v0.8b, v0.8b, v1.8b
; GISEL-NEXT: mov b1, v0.b[1]		; GISEL-NEXT: mov b1, v0.b[1]
[... 22 lines not shown ...]	; GISEL-NEXT: ret
%xor_result = call i8 @llvm.vector.reduce.xor.v32i8(<32 x i8> %a)		%xor_result = call i8 @llvm.vector.reduce.xor.v32i8(<32 x i8> %a)
ret i8 %xor_result		ret i8 %xor_result
}		}

define i16 @test_redxor_v4i16(<4 x i16> %a) {		define i16 @test_redxor_v4i16(<4 x i16> %a) {
; CHECK-LABEL: test_redxor_v4i16:		; CHECK-LABEL: test_redxor_v4i16:
; CHECK: // %bb.0:		; CHECK: // %bb.0:
; CHECK-NEXT: // kill: def $d0 killed $d0 def $q0		; CHECK-NEXT: // kill: def $d0 killed $d0 def $q0
; CHECK-NEXT: umov w8, v0.h[1]		; CHECK-NEXT: umov w8, v0.h[3]
; CHECK-NEXT: umov w9, v0.h[0]		; CHECK-NEXT: umov w9, v0.h[2]
; CHECK-NEXT: umov w10, v0.h[2]		; CHECK-NEXT: umov w10, v0.h[1]
; CHECK-NEXT: umov w11, v0.h[3]		; CHECK-NEXT: umov w11, v0.h[0]
; CHECK-NEXT: eor w8, w9, w8		; CHECK-NEXT: eor w8, w9, w8
; CHECK-NEXT: eor w8, w8, w10		; CHECK-NEXT: eor w10, w11, w10
; CHECK-NEXT: eor w0, w8, w11		; CHECK-NEXT: eor w0, w10, w8
; CHECK-NEXT: ret		; CHECK-NEXT: ret
;		;
; GISEL-LABEL: test_redxor_v4i16:		; GISEL-LABEL: test_redxor_v4i16:
; GISEL: // %bb.0:		; GISEL: // %bb.0:
; GISEL-NEXT: // kill: def $d0 killed $d0 def $q0		; GISEL-NEXT: // kill: def $d0 killed $d0 def $q0
; GISEL-NEXT: mov h1, v0.h[1]		; GISEL-NEXT: mov h1, v0.h[1]
; GISEL-NEXT: mov h2, v0.h[2]		; GISEL-NEXT: mov h2, v0.h[2]
; GISEL-NEXT: mov h3, v0.h[3]		; GISEL-NEXT: mov h3, v0.h[3]
[... 14 lines not shown ...]
; CHECK: // %bb.0:		; CHECK: // %bb.0:
; CHECK-NEXT: ext v1.16b, v0.16b, v0.16b, #8		; CHECK-NEXT: ext v1.16b, v0.16b, v0.16b, #8
; CHECK-NEXT: eor v0.8b, v0.8b, v1.8b		; CHECK-NEXT: eor v0.8b, v0.8b, v1.8b
; CHECK-NEXT: umov w8, v0.h[1]		; CHECK-NEXT: umov w8, v0.h[1]
; CHECK-NEXT: umov w9, v0.h[0]		; CHECK-NEXT: umov w9, v0.h[0]
; CHECK-NEXT: umov w10, v0.h[2]		; CHECK-NEXT: umov w10, v0.h[2]
; CHECK-NEXT: umov w11, v0.h[3]		; CHECK-NEXT: umov w11, v0.h[3]
; CHECK-NEXT: eor w8, w9, w8		; CHECK-NEXT: eor w8, w9, w8
; CHECK-NEXT: eor w8, w8, w10		; CHECK-NEXT: eor w9, w10, w11
; CHECK-NEXT: eor w0, w8, w11		; CHECK-NEXT: eor w0, w8, w9
; CHECK-NEXT: ret		; CHECK-NEXT: ret
;		;
; GISEL-LABEL: test_redxor_v8i16:		; GISEL-LABEL: test_redxor_v8i16:
; GISEL: // %bb.0:		; GISEL: // %bb.0:
; GISEL-NEXT: mov d1, v0.d[1]		; GISEL-NEXT: mov d1, v0.d[1]
; GISEL-NEXT: eor v0.8b, v0.8b, v1.8b		; GISEL-NEXT: eor v0.8b, v0.8b, v1.8b
; GISEL-NEXT: mov h1, v0.h[1]		; GISEL-NEXT: mov h1, v0.h[1]
; GISEL-NEXT: mov h2, v0.h[2]		; GISEL-NEXT: mov h2, v0.h[2]
[... 16 lines not shown ...]
; CHECK-NEXT: eor v0.16b, v0.16b, v1.16b		; CHECK-NEXT: eor v0.16b, v0.16b, v1.16b
; CHECK-NEXT: ext v1.16b, v0.16b, v0.16b, #8		; CHECK-NEXT: ext v1.16b, v0.16b, v0.16b, #8
; CHECK-NEXT: eor v0.8b, v0.8b, v1.8b		; CHECK-NEXT: eor v0.8b, v0.8b, v1.8b
; CHECK-NEXT: umov w8, v0.h[1]		; CHECK-NEXT: umov w8, v0.h[1]
; CHECK-NEXT: umov w9, v0.h[0]		; CHECK-NEXT: umov w9, v0.h[0]
; CHECK-NEXT: umov w10, v0.h[2]		; CHECK-NEXT: umov w10, v0.h[2]
; CHECK-NEXT: umov w11, v0.h[3]		; CHECK-NEXT: umov w11, v0.h[3]
; CHECK-NEXT: eor w8, w9, w8		; CHECK-NEXT: eor w8, w9, w8
; CHECK-NEXT: eor w8, w8, w10		; CHECK-NEXT: eor w9, w10, w11
; CHECK-NEXT: eor w0, w8, w11		; CHECK-NEXT: eor w0, w8, w9
; CHECK-NEXT: ret		; CHECK-NEXT: ret
;		;
; GISEL-LABEL: test_redxor_v16i16:		; GISEL-LABEL: test_redxor_v16i16:
; GISEL: // %bb.0:		; GISEL: // %bb.0:
; GISEL-NEXT: eor v0.16b, v0.16b, v1.16b		; GISEL-NEXT: eor v0.16b, v0.16b, v1.16b
; GISEL-NEXT: mov d1, v0.d[1]		; GISEL-NEXT: mov d1, v0.d[1]
; GISEL-NEXT: eor v0.8b, v0.8b, v1.8b		; GISEL-NEXT: eor v0.8b, v0.8b, v1.8b
; GISEL-NEXT: mov h1, v0.h[1]		; GISEL-NEXT: mov h1, v0.h[1]
[... 142 lines not shown ...]

llvm/test/CodeGen/AArch64/swift-return.ll

[... 21 lines not shown ...]	entry:
%conv = trunc i32 %add to i16		%conv = trunc i32 %add to i16
ret i16 %conv		ret i16 %conv
}		}

declare swiftcc { i16, i8 } @gen(i32)		declare swiftcc { i16, i8 } @gen(i32)

; CHECK-LABEL: test2		; CHECK-LABEL: test2
; CHECK: bl _gen2		; CHECK: bl _gen2
; CHECK: add [[TMP:x.*]], x0, x1		; CHECK: add [[TMP1:x.*]], x0, x1
; CHECK: add [[TMP]], [[TMP]], x2		; CHECK: add [[TMP2:x.*]], x2, x3
; CHECK: add [[TMP]], [[TMP]], x3		; CHECK: add [[TMP3:x.*]], [[TMP1]], [[TMP2]]
; CHECK: add x0, [[TMP]], x4		; CHECK: add x0, [[TMP3]], x4
; CHECK-O0-LABEL: test2		; CHECK-O0-LABEL: test2
; CHECK-O0: bl _gen2		; CHECK-O0: bl _gen2
; CHECK-O0: add [[TMP:x.*]], x0, x1		; CHECK-O0: add [[TMP:x.*]], x0, x1
; CHECK-O0: add [[TMP]], [[TMP]], x2		; CHECK-O0: add [[TMP]], [[TMP]], x2
; CHECK-O0: add [[TMP]], [[TMP]], x3		; CHECK-O0: add [[TMP]], [[TMP]], x3
; CHECK-O0: add x0, [[TMP]], x4		; CHECK-O0: add x0, [[TMP]], x4

define i64 @test2(i64 %key) {		define i64 @test2(i64 %key) {
[... 27 lines not shown ...]	define swiftcc { i64, i64, i64, i64, i64 } @gen2(i64 %key) {
%Z2 = insertvalue { i64, i64, i64, i64, i64 } %Z, i64 %key, 2		%Z2 = insertvalue { i64, i64, i64, i64, i64 } %Z, i64 %key, 2
%Z3 = insertvalue { i64, i64, i64, i64, i64 } %Z2, i64 %key, 3		%Z3 = insertvalue { i64, i64, i64, i64, i64 } %Z2, i64 %key, 3
%Z4 = insertvalue { i64, i64, i64, i64, i64 } %Z3, i64 %key, 4		%Z4 = insertvalue { i64, i64, i64, i64, i64 } %Z3, i64 %key, 4
ret { i64, i64, i64, i64, i64 } %Z4		ret { i64, i64, i64, i64, i64 } %Z4
}		}

; CHECK-LABEL: test3		; CHECK-LABEL: test3
; CHECK: bl _gen3		; CHECK: bl _gen3
; CHECK: add [[TMP:w.*]], w0, w1		; CHECK: add [[TMP1:w.*]], w0, w1
; CHECK: add [[TMP]], [[TMP]], w2		; CHECK: add [[TMP2:w.*]], w2, w3
; CHECK: add w0, [[TMP]], w3		; CHECK: add [[TMP3:w.*]], [[TMP1]], [[TMP2]]
; CHECK-O0-LABEL: test3		; CHECK-O0-LABEL: test3
; CHECK-O0: bl _gen3		; CHECK-O0: bl _gen3
; CHECK-O0: add [[TMP:w.*]], w0, w1		; CHECK-O0: add [[TMP:w.*]], w0, w1
; CHECK-O0: add [[TMP]], [[TMP]], w2		; CHECK-O0: add [[TMP]], [[TMP]], w2
; CHECK-O0: add w0, [[TMP]], w3		; CHECK-O0: add w0, [[TMP]], w3
define i32 @test3(i32) {		define i32 @test3(i32) {
entry:		entry:
%call = call swiftcc { i32, i32, i32, i32 } @gen3(i32 %0)		%call = call swiftcc { i32, i32, i32, i32 } @gen3(i32 %0)
[... 65 lines not shown ...]

declare swiftcc { double, double, double, double } @gen5()		declare swiftcc { double, double, double, double } @gen5()

; CHECK-LABEL: test6		; CHECK-LABEL: test6
; CHECK: bl _gen6		; CHECK: bl _gen6
; CHECK-DAG: fadd d0, d0, d1		; CHECK-DAG: fadd d0, d0, d1
; CHECK-DAG: fadd d0, d0, d2		; CHECK-DAG: fadd d0, d0, d2
; CHECK-DAG: fadd d0, d0, d3		; CHECK-DAG: fadd d0, d0, d3
; CHECK-DAG: add [[TMP:w.*]], w0, w1		; CHECK-DAG: add [[TMP1:w.*]], w0, w1
; CHECK-DAG: add [[TMP]], [[TMP]], w2		; CHECK-DAG: add [[TMP2:w.*]], w2, w3
; CHECK-DAG: add w0, [[TMP]], w3		; CHECK-DAG: add [[TMP3:w.*]], [[TMP1]], [[TMP2]]
; CHECK-O0-LABEL: test6		; CHECK-O0-LABEL: test6
; CHECK-O0: bl _gen6		; CHECK-O0: bl _gen6
; CHECK-O0-DAG: fadd d0, d0, d1		; CHECK-O0-DAG: fadd d0, d0, d1
; CHECK-O0-DAG: fadd d0, d0, d2		; CHECK-O0-DAG: fadd d0, d0, d2
; CHECK-O0-DAG: fadd d0, d0, d3		; CHECK-O0-DAG: fadd d0, d0, d3
; CHECK-O0-DAG: add [[TMP:w.*]], w0, w1		; CHECK-O0-DAG: add [[TMP:w.*]], w0, w1
; CHECK-O0-DAG: add [[TMP]], [[TMP]], w2		; CHECK-O0-DAG: add [[TMP]], [[TMP]], w2
; CHECK-O0-DAG: add w0, [[TMP]], w3		; CHECK-O0-DAG: add w0, [[TMP]], w3
[... 127 lines not shown ...]

llvm/test/CodeGen/AArch64/vecreduce-and-legalization.ll

[... 91 lines not shown ...]	; CHECK-NEXT: ret
%b = call i8 @llvm.vector.reduce.and.v3i8(<3 x i8> %a)		%b = call i8 @llvm.vector.reduce.and.v3i8(<3 x i8> %a)
ret i8 %b		ret i8 %b
}		}

define i8 @test_v9i8(<9 x i8> %a) nounwind {		define i8 @test_v9i8(<9 x i8> %a) nounwind {
; CHECK-LABEL: test_v9i8:		; CHECK-LABEL: test_v9i8:
; CHECK: // %bb.0:		; CHECK: // %bb.0:
; CHECK-NEXT: mov w8, #-1		; CHECK-NEXT: mov w8, #-1
; CHECK-NEXT: umov w12, v0.b[4]		; CHECK-NEXT: umov w9, v0.b[5]
; CHECK-NEXT: mov v1.16b, v0.16b		; CHECK-NEXT: mov v1.16b, v0.16b
		; CHECK-NEXT: umov w10, v0.b[6]
		; CHECK-NEXT: umov w15, v0.b[7]
; CHECK-NEXT: mov v1.b[9], w8		; CHECK-NEXT: mov v1.b[9], w8
; CHECK-NEXT: mov v1.b[10], w8		; CHECK-NEXT: mov v1.b[10], w8
; CHECK-NEXT: mov v1.b[11], w8		; CHECK-NEXT: mov v1.b[11], w8
; CHECK-NEXT: mov v1.b[13], w8		; CHECK-NEXT: mov v1.b[13], w8
		; CHECK-NEXT: umov w8, v0.b[4]
; CHECK-NEXT: ext v1.16b, v1.16b, v1.16b, #8		; CHECK-NEXT: ext v1.16b, v1.16b, v1.16b, #8
; CHECK-NEXT: and v1.8b, v0.8b, v1.8b
; CHECK-NEXT: umov w8, v1.b[1]
; CHECK-NEXT: umov w9, v1.b[0]
; CHECK-NEXT: umov w10, v1.b[2]
; CHECK-NEXT: umov w11, v1.b[3]
; CHECK-NEXT: and w8, w9, w8
; CHECK-NEXT: umov w9, v0.b[5]
; CHECK-NEXT: and w8, w8, w10
; CHECK-NEXT: umov w10, v0.b[6]
; CHECK-NEXT: and w8, w8, w11
; CHECK-NEXT: umov w11, v0.b[7]
; CHECK-NEXT: and w8, w8, w12
; CHECK-NEXT: and w8, w8, w9		; CHECK-NEXT: and w8, w8, w9
; CHECK-NEXT: and w8, w8, w10		; CHECK-NEXT: and w8, w8, w10
; CHECK-NEXT: and w0, w8, w11		; CHECK-NEXT: and w8, w8, w15
		; CHECK-NEXT: and v1.8b, v0.8b, v1.8b
		; CHECK-NEXT: umov w11, v1.b[1]
		; CHECK-NEXT: umov w12, v1.b[0]
		; CHECK-NEXT: umov w13, v1.b[2]
		; CHECK-NEXT: umov w14, v1.b[3]
		; CHECK-NEXT: and w9, w12, w11
		; CHECK-NEXT: and w11, w13, w14
		; CHECK-NEXT: and w9, w9, w11
		; CHECK-NEXT: and w0, w9, w8
; CHECK-NEXT: ret		; CHECK-NEXT: ret
%b = call i8 @llvm.vector.reduce.and.v9i8(<9 x i8> %a)		%b = call i8 @llvm.vector.reduce.and.v9i8(<9 x i8> %a)
ret i8 %b		ret i8 %b
}		}

define i32 @test_v3i32(<3 x i32> %a) nounwind {		define i32 @test_v3i32(<3 x i32> %a) nounwind {
; CHECK-LABEL: test_v3i32:		; CHECK-LABEL: test_v3i32:
; CHECK: // %bb.0:		; CHECK: // %bb.0:
[... 59 lines not shown ...]


Unit Tests: Failed


Diff 475717
