The generic cost of logical or/and reductions should be the cost of a bitcast from <ReduxWidth x i1> to iReduxWidth plus a cmp eq|ne on iReduxWidth.
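For concreteness, here is a minimal IR sketch of the scalar pattern this default cost is meant to model, assuming a <4 x i1> input and the current llvm.vector.reduce.and/or intrinsic names (both assumptions and the function names are illustrative, not part of the patch): the and-reduction checks that all mask bits are set, the or-reduction that any bit is set.

  ; reduction being costed:
  ;   %r = call i1 @llvm.vector.reduce.and.v4i1(<4 x i1> %m)
  ; scalar pattern the default cost models for the and case:
  define i1 @reduce_and_as_modeled(<4 x i1> %m) {
    %bits = bitcast <4 x i1> %m to i4   ; <ReduxWidth x i1> -> iReduxWidth
    %all = icmp eq i4 %bits, -1         ; all bits set => all lanes true
    ret i1 %all
  }

  ; and for the or case:
  define i1 @reduce_or_as_modeled(<4 x i1> %m) {
    %bits = bitcast <4 x i1> %m to i4
    %any = icmp ne i4 %bits, 0          ; any bit set => some lane true
    ret i1 %any
  }

So the generic estimate charges one vector-to-integer bitcast plus one scalar compare; targets where an <N x i1> mask is not held as N packed bits can still override this in their own TTI, as discussed below.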
Diff Detail
Repository: rG LLVM Github Monorepo
Event Timeline
llvm/test/Analysis/CostModel/SystemZ/reduce-and.ll, line 13:
The cost model for SystemZ is not complete/correct and needs to be fixed; that's why there is a regression for it.
llvm/include/llvm/CodeGen/BasicTTIImpl.h, line 1909:
The SystemZ cost model does not implement vector-to-int bitcast and crashes. That's why we have to use Base::getCastInstrCost() here rather than thisT()->getCastInstrCost().
llvm/include/llvm/CodeGen/BasicTTIImpl.h, line 1902:
I'm not sure this is always true because some backends (e.g. AArch64) promote i1 to larger integers. The costs for AArch64 still look a bit odd to be honest. I tried them out manually and I observe about 8 instructions for AND reductions using <4 x i1> vectors, since we have lots of bytewise moves of -1 into the vector lanes of a <4 x i32> vector.
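To make this concern concrete, here is a small IR sketch (my own illustration, not from the patch) of how such a <4 x i1> mask typically arises: the lane predicates come from a vector compare and are held as lane-wide masks in a vector register, so the flat <4 x i1> -> i4 bitcast assumed by the generic model has no direct in-register equivalent on targets that promote i1 lanes.

  define i1 @all_lanes_equal(<4 x i32> %a, <4 x i32> %b) {
    ; on NEON-style targets %m is materialized as four 32-bit
    ; all-ones/all-zeros lanes rather than four packed bits
    %m = icmp eq <4 x i32> %a, %b
    %r = call i1 @llvm.vector.reduce.and.v4i1(<4 x i1> %m)
    ret i1 %r
  }

  declare i1 @llvm.vector.reduce.and.v4i1(<4 x i1>)

Feeding an example like this to the AArch64 backend is one way to reproduce the instruction counts described above.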
llvm/include/llvm/CodeGen/BasicTTIImpl.h, line 1902:
This is a known problem, see
https://bugs.llvm.org/show_bug.cgi?id=41636
https://bugs.llvm.org/show_bug.cgi?id=41635
https://bugs.llvm.org/show_bug.cgi?id=41634
Looks like the construct is not lowered properly on some targets.
llvm/include/llvm/CodeGen/BasicTTIImpl.h, line 1902:
Sure, I totally agree the codegen for ARM and AArch64 is awful, and I take your point. I was just wondering if this assumption was a problem:

  %val = bitcast <ReduxWidth x i1> to iReduxWidth

as I don't think it is true for targets that promote i1 to i32 or something like that. In the bug shown above (https://bugs.llvm.org/show_bug.cgi?id=41636) even the optimal code is still operating on vectors of i8 types. I guess targets that do promote i1->iX can come up with their own cost in the target-specific getArithmeticReductionCost, so maybe this isn't really a problem?
llvm/include/llvm/CodeGen/BasicTTIImpl.h, line 1902:
Yes, this is the idea. This patch provides just the basic cost estimation for this particular case; if the target's cost is different, it should define its own cost for this case.
llvm/include/llvm/CodeGen/BasicTTIImpl.h, line 1909:
How much of a task would it be to tweak the SystemZ TTI to avoid this? I worry that this kind of thing gets forgotten about and could cause other problems in the future.
LGTM - cheers.
Please can you ping those bugs mentioning this default cost change?