This is an archive of the discontinued LLVM Phabricator instance.

Differential D152693

LoopVectorize: introduce RecurKind::Induction(I|F)(Max|Min)
Abandoned · Public

Authored by artagnon on Jun 12 2023, 3:44 AM.

Details

Reviewers

dmgreen
fhahn
reames
david-arm
sdesmalen
kmclaughlin
Mel-Chen

Summary

LoopVectorize's SelectCmp pattern suffers from the deficiency that it
cannot handle non-invariant statements. In order to support
vectorization of the following example,

int src[n] = {4, 5, 2};
int r = 331;
for (int i = 0; i < n; i++) {
  if (src[i] > 3)
    r = i;
}
return r;

introduce RecurKind::InductionIMax, RecurKind::InductionIMin,
RecurKind::InductionFMax, and RecurKind::InductionFMin. We currently
only support assignments to the induction variable; in particular, we do
not support the following case:

int src[n] = {4, 5, 2};
int r = 331;
for (int i = 0; i < n; i++) {
  if (src[i] > 3)
    r = src[i];
}
return r;

CodeGen'ing for our original example involves checking the SCEV AddRec
expression (the min/max is inverted if the assignment is r = -i, in
place of r = i). Indeed, once we determine whether it's a min or max
reduction, CodeGen'ing involves the following:

Create a Splat with the int_min/int_max values, depending on the RecurKind.
The Src is filled with values: {0, 1, 2, 3, 4, ...}.
The Right is filled with values: {0, 1, 331, 331, 331, ...}.
The CmpVector is filled with values: {1, 1, 0, 0, 0, ...}.
Select using this CmpVector between Src and the Splat with int_min/int_max values.
Use a max-reduce/min-reduce on the result of the Select.
Use the existing Cmp, which determines whether or not any assignment took place in the loop, to select between the result of the max-reduce/min-reduce and the initial value (InitVal).

Hence, the original example is vectorized.
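
The steps above can be emulated in plain C++ to check them against the scalar loop. This is only a sketch under stated assumptions, not the code the patch emits: the vector width `VF = 4`, the function names, and the separate `anyAssigned` flag (standing in for the existing Cmp) are all hypothetical, and tail lanes are simply masked off.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Scalar reference: the original loop from the summary.
int lastIndexScalar(const std::vector<int> &src, int init) {
  int r = init;
  for (int i = 0; i < static_cast<int>(src.size()); i++)
    if (src[i] > 3)
      r = i;
  return r;
}

// Emulation of the vectorized form with a hypothetical width VF = 4.
int lastIndexVectorSketch(const std::vector<int> &src, int init) {
  // Indices are kept in int, so the trip count must fit in int.
  assert(src.size() <=
         static_cast<std::size_t>(std::numeric_limits<int>::max()));
  constexpr int VF = 4;
  // Splat of int_min: the neutral value for a max-reduce.
  const int sentinel = std::numeric_limits<int>::min();
  std::array<int, VF> acc;
  acc.fill(sentinel);
  bool anyAssigned = false; // stands in for the existing Cmp
  const int n = static_cast<int>(src.size());
  for (int base = 0; base < n; base += VF) {
    for (int lane = 0; lane < VF; lane++) {
      const int i = base + lane;              // lane of the index vector
      const bool cmp = i < n && src[i] > 3;   // lane of the CmpVector
      const int sel = cmp ? i : sentinel;     // Select(CmpVector, index, Splat)
      acc[lane] = std::max(acc[lane], sel);
      anyAssigned |= cmp;
    }
  }
  // Horizontal max-reduce, then pick between it and the initial value.
  const int reduced = *std::max_element(acc.begin(), acc.end());
  return anyAssigned ? reduced : init;
}
```

For `src = {4, 5, 2}` and an initial value of 331, both functions return 1, and when no element exceeds 3 both return the initial value.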

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

artagnon created this revision. · Jun 12 2023, 3:44 AM

Herald added a project: Restricted Project. · Jun 12 2023, 3:44 AM

Herald added subscribers: shiva0217, StephenFan, arphaman and 3 others.

artagnon requested review of this revision. · Jun 12 2023, 3:44 AM

Herald added a project: Restricted Project. · Jun 12 2023, 3:44 AM

Herald added subscribers: llvm-commits, pcwang-thead, vkmr.

Harbormaster completed remote builds in B238145: Diff 530447. · Jun 12 2023, 4:28 AM

It looks like this addresses a similar issue as D150851 by @Mel-Chen? It would be good to iterate on D150851 first and then extend it to handle additional cases from here

artagnon added a reviewer: Mel-Chen. · Jun 15 2023, 2:17 AM

I'm glad that others have also noticed this, especially the implementation of the select-cmp pattern for decreasing induction variables. The select-cmp pattern for decreasing induction variables is a feature that I haven't implemented yet. Have you come across any real applications or benchmarks that make use of it?

llvm/include/llvm/Analysis/IVDescriptors.h
55–62	I can understand the literal meaning of "the induction variable to be maximized/minimized," but perhaps there is a better way to describe this pattern, such as "monotonic increasing" and "monotonic decreasing."
llvm/lib/Analysis/IVDescriptors.cpp
741	I recommend using `isKnownNegative`: `bool isKnownNegative(const SCEV *S)` tests if the given expression is known to be negative.
1021–1026	Actually, I've had an idea: for these types of RecurKind, we should only need to call `AddReductionVar` once.
1174–1177	Based on my understanding, in a broader sense, ternary operations do not have an identity. Although I used the term "identity" for convenience in implementation to refer to the sentinel value, I have been considering whether to separate the concept of the sentinel value from identity.
llvm/lib/Transforms/Utils/LoopUtils.cpp
1094	Based on your implementation, `Right` would not be what you stated as {0, 1, 331, 331, 331, ...}, but rather {331, 331, 331, 331, 331, ...}.
1112	The behavior of `isSigned` is different from what you have in mind. (Indeed, the name is really misleading.) `bool isSigned() const` returns true if all source operands of the recurrence are SExtInsts.
llvm/test/Transforms/LoopVectorize/select-min-index.ll
2–3	Why skip interleave?
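
The inline comment on IVDescriptors.cpp line 741 concerns how the sign of the step picks between the min and max kinds. As a rough illustration, assuming a plain `int` step in place of the SCEV constant and a hypothetical `Kind` enum (no LLVM API is called here), the choice and its effect on `r = step * i` could be sketched as:

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <vector>

// Hypothetical stand-ins for the two integer recurrence kinds in the patch.
enum class Kind { InductionIMax, InductionIMin };

// With a negative step the assigned value shrinks as i grows, so the last
// assignment is the minimum of all assigned values rather than the maximum.
Kind classifyStep(int step) {
  assert(step != 0 && "a zero step makes the assigned value loop-invariant");
  return step < 0 ? Kind::InductionIMin : Kind::InductionIMax;
}

// Scalar reference for: if (src[i] > 3) r = step * i;
int scalarLoop(const std::vector<int> &src, int step, int init) {
  int r = init;
  for (int i = 0; i < static_cast<int>(src.size()); i++)
    if (src[i] > 3)
      r = step * i;
  return r;
}

// The same result computed as a min or max reduction chosen from the step.
int reductionForm(const std::vector<int> &src, int step, int init) {
  const bool isMin = classifyStep(step) == Kind::InductionIMin;
  int acc = isMin ? std::numeric_limits<int>::max()
                  : std::numeric_limits<int>::min();
  bool anyAssigned = false;
  for (int i = 0; i < static_cast<int>(src.size()); i++) {
    if (src[i] > 3) {
      acc = isMin ? std::min(acc, step * i) : std::max(acc, step * i);
      anyAssigned = true;
    }
  }
  // Fall back to the initial value when the condition never fired.
  return anyAssigned ? acc : init;
}
```

With `src = {4, 5, 2}` and `step = -1`, the scalar loop ends with `r = -1`, which is also the minimum of the assigned values 0 and -1.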

Mel-Chen mentioned this in D150851: [LoopVectorize] Vectorize select-cmp reduction pattern for increasing integer induction variable. · Jul 7 2023, 1:48 AM

In D152693#4455117, @Mel-Chen wrote:

I'm glad that others have also noticed this, especially the implementation of the select-cmp pattern for decreasing induction variables. The select-cmp pattern for decreasing induction variables is a feature that I haven't implemented yet. Have you come across any real applications or benchmarks that make use of it?

I implemented this patch based on a TSVC run: gcc-aarch64 vectorizes increasing and decreasing induction variables, while llvm-riscv does not.

Thanks for your review comments: I will keep them in mind when developing a follow-up patch once your patch lands.

llvm/lib/Analysis/IVDescriptors.cpp
1021–1026	Yes, your approach of extending `isSelectCmpPattern` saves us these.
llvm/lib/Transforms/Utils/LoopUtils.cpp
1094	Ah yes, my error.
1112	Oh, ouch.

Herald added a subscriber: wangpc. · Jul 7 2023, 4:24 AM

Hi Ram,

To fix the issue mentioned in https://reviews.llvm.org/D150851#4480312,
is it possible to introduce a vector that records whether the array has an element equal to the initial value?
That vector could then be used to adjust the result if the result comes from the max reduction and is smaller than the initial value.

Something like:

    Init_val = 331;

Loop_header:
    AnyInitVal = {0, 0, 0, 0}

Loop_body:
    ...
    last_AnyInitVal = PHI(AnyInitVal, next_AnyInitVal)
    current_AnyInitVal = veq current_array_vec, Init_val   // if current_array_vec is (1, 2, 331, 331) then current_AnyInitVal = (0, 0, 1, 1)
    next_AnyInitVal = vor current_AnyInitVal, last_AnyInitVal
    ...

Loop_exit:
    if (result_come_from_max_reduction)
       if ((max_reduction < Init_val) && (next_AnyInitVal has non-zero element))
          return Init_val;
       else return max_reduction;
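
The pseudocode above can be transcribed into plain C++ to make the two pieces concrete. This is a literal transcription under assumptions, not a tested fix for the linked issue: the width `VF = 4`, the `Mask`/`Vec` aliases, and both function names are hypothetical, and the caller is assumed to have already established that the result comes from the max reduction.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t VF = 4; // hypothetical vector width
using Mask = std::array<bool, VF>;
using Vec = std::array<int, VF>;

// Loop body step:
//   current_AnyInitVal = veq current_array_vec, Init_val
//   next_AnyInitVal    = vor current_AnyInitVal, last_AnyInitVal
Mask accumulateAnyInitVal(const Mask &last, const Vec &currentVec,
                          int initVal) {
  Mask next{};
  for (std::size_t lane = 0; lane < VF; lane++)
    next[lane] = last[lane] || currentVec[lane] == initVal;
  return next;
}

// Loop exit: return the initial value when the max reduction is smaller
// than it and some lane saw the initial value; otherwise keep the reduction.
int fixupMaxReduction(int maxReduction, int initVal, const Mask &anyInitVal) {
  bool any = false;
  for (bool lane : anyInitVal)
    any = any || lane;
  if (maxReduction < initVal && any)
    return initVal;
  return maxReduction;
}
```

For the lane values (1, 2, 331, 331) from the comment in the pseudocode and `Init_val = 331`, the accumulated mask is (0, 0, 1, 1), so a max reduction of 5 would be replaced by 331 while a max reduction of 400 would be kept.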

Subsumed by https://reviews.llvm.org/D150851 and follow-ups.

Herald added a subscriber: sunshaoce. · Aug 29 2023, 2:35 AM

Revision Contents

Path

Size

llvm/

include/

llvm/

Analysis/

IVDescriptors.h

30 lines

Transforms/

Utils/

LoopUtils.h

9 lines

lib/

Analysis/

IVDescriptors.cpp

130 lines

Transforms/

Utils/

LoopUtils.cpp

72 lines

Vectorize/

LoopVectorize.cpp

9 lines

VPlanRecipes.cpp

3 lines

test/

Transforms/

LoopVectorize/

if-reduction.ll

14 lines

induction-min-max.ll

295 lines

select-min-index.ll

60 lines

Diff 530447

llvm/include/llvm/Analysis/IVDescriptors.h

(44 lines not shown)	enum class RecurKind {
UMax, ///< Unsigned integer max implemented in terms of select(cmp()).		UMax, ///< Unsigned integer max implemented in terms of select(cmp()).
FAdd, ///< Sum of floats.		FAdd, ///< Sum of floats.
FMul, ///< Product of floats.		FMul, ///< Product of floats.
FMin, ///< FP min implemented in terms of select(cmp()).		FMin, ///< FP min implemented in terms of select(cmp()).
FMax, ///< FP max implemented in terms of select(cmp()).		FMax, ///< FP max implemented in terms of select(cmp()).
FMulAdd, ///< Fused multiply-add of floats (a * b + c).		FMulAdd, ///< Fused multiply-add of floats (a * b + c).
SelectICmp, ///< Integer select(icmp(),x,y) where one of (x,y) is loop		SelectICmp, ///< Integer select(icmp(),x,y) where one of (x,y) is loop
///< invariant		///< invariant
SelectFCmp ///< Integer select(fcmp(),x,y) where one of (x,y) is loop		SelectFCmp, ///< Integer select(fcmp(),x,y) where one of (x,y) is loop
///< invariant		///< invariant
		InductionIMax, ///< Integer select(icmp(), x, y) where one of (x, y) is
		///< the induction variable to be maximized.
		InductionIMin, ///< Integer select(icmp(), x, y) where one of (x, y) is
		///< the induction variable to be minimized.
		InductionFMax, ///< Integer select(fcmp(), x, y) where one of (x, y) is
		///< the induction variable to be maximized.
		InductionFMin ///< Integer select(fcmp(), x, y) where one of (x, y) is
		///< the induction variable to be minimized.
		[Inline comment by Mel-Chen] I can understand the literal meaning of "the induction variable to be maximized/minimized," but perhaps there is a better way to describe this pattern, such as "monotonic increasing" and "monotonic decreasing."
};		};

/// The RecurrenceDescriptor is used to identify recurrences variables in a		/// The RecurrenceDescriptor is used to identify recurrences variables in a
/// loop. Reduction is a special case of recurrence that has uses of the		/// loop. Reduction is a special case of recurrence that has uses of the
/// recurrence variable outside the loop. The method isReductionPHI identifies		/// recurrence variable outside the loop. The method isReductionPHI identifies
/// reductions that are basic recurrences.		/// reductions that are basic recurrences.
///		///
/// Basic recurrences are defined as the summation, product, OR, AND, XOR, min,		/// Basic recurrences are defined as the summation, product, OR, AND, XOR, min,
(55 lines not shown)	public:
/// Returns a struct describing if the instruction 'I' can be a recurrence		/// Returns a struct describing if the instruction 'I' can be a recurrence
/// variable of type 'Kind' for a Loop \p L and reduction PHI \p Phi.		/// variable of type 'Kind' for a Loop \p L and reduction PHI \p Phi.
/// If the recurrence is a min/max pattern of select(icmp()) this function		/// If the recurrence is a min/max pattern of select(icmp()) this function
/// advances the instruction pointer 'I' from the compare instruction to the		/// advances the instruction pointer 'I' from the compare instruction to the
/// select instruction and stores this pointer in 'PatternLastInst' member of		/// select instruction and stores this pointer in 'PatternLastInst' member of
/// the returned struct.		/// the returned struct.
static InstDesc isRecurrenceInstr(Loop *L, PHINode *Phi, Instruction *I,		static InstDesc isRecurrenceInstr(Loop *L, PHINode *Phi, Instruction *I,
RecurKind Kind, InstDesc &Prev,		RecurKind Kind, InstDesc &Prev,
FastMathFlags FuncFMF);		FastMathFlags FuncFMF, ScalarEvolution *SE);

/// Returns true if instruction I has multiple uses in Insts		/// Returns true if instruction I has multiple uses in Insts
static bool hasMultipleUsesOf(Instruction *I,		static bool hasMultipleUsesOf(Instruction *I,
SmallPtrSetImpl<Instruction *> &Insts,		SmallPtrSetImpl<Instruction *> &Insts,
unsigned MaxNumUses);		unsigned MaxNumUses);

/// Returns true if all uses of the instruction I is within the Set.		/// Returns true if all uses of the instruction I is within the Set.
static bool areAllUsesIn(Instruction *I, SmallPtrSetImpl<Instruction *> &Set);		static bool areAllUsesIn(Instruction *I, SmallPtrSetImpl<Instruction *> &Set);
(10 lines not shown)	public:
/// Select(ICmp(A, B), X, Y), or		/// Select(ICmp(A, B), X, Y), or
/// Select(FCmp(A, B), X, Y)		/// Select(FCmp(A, B), X, Y)
/// where one of (X, Y) is a loop invariant integer and the other is a PHI		/// where one of (X, Y) is a loop invariant integer and the other is a PHI
/// value. \p Prev specifies the description of an already processed select		/// value. \p Prev specifies the description of an already processed select
/// instruction, so its corresponding cmp can be matched to it.		/// instruction, so its corresponding cmp can be matched to it.
static InstDesc isSelectCmpPattern(Loop *Loop, PHINode *OrigPhi,		static InstDesc isSelectCmpPattern(Loop *Loop, PHINode *OrigPhi,
Instruction *I, InstDesc &Prev);		Instruction *I, InstDesc &Prev);

		/// Returns a struct describing whether the instruction is either a
		/// Select(ICmp(A, B), X, Y), or
		/// Select(FCmp(A, B), X, Y)
		/// where one of (X, Y) is the loop induction variable and the other is a PHI
		/// value. \p Prev specifies the description of an already processed select
		/// instruction, so its corresponding cmp can be matched to it.
		static InstDesc isInductionMinMaxPattern(Loop *Loop, PHINode *OrigPhi,
		Instruction *I, InstDesc &Prev,
		ScalarEvolution *SE);

/// Returns a struct describing if the instruction is a		/// Returns a struct describing if the instruction is a
/// Select(FCmp(X, Y), (Z = X op PHINode), PHINode) instruction pattern.		/// Select(FCmp(X, Y), (Z = X op PHINode), PHINode) instruction pattern.
static InstDesc isConditionalRdxPattern(RecurKind Kind, Instruction *I);		static InstDesc isConditionalRdxPattern(RecurKind Kind, Instruction *I);

/// Returns identity corresponding to the RecurrenceKind.		/// Returns identity corresponding to the RecurrenceKind.
Value *getRecurrenceIdentity(RecurKind K, Type *Tp, FastMathFlags FMF) const;		Value *getRecurrenceIdentity(RecurKind K, Type *Tp, FastMathFlags FMF) const;

/// Returns the opcode corresponding to the RecurrenceKind.		/// Returns the opcode corresponding to the RecurrenceKind.
(71 lines not shown)	public:
}		}

/// Returns true if the recurrence kind is of the form		/// Returns true if the recurrence kind is of the form
/// select(cmp(),x,y) where one of (x,y) is loop invariant.		/// select(cmp(),x,y) where one of (x,y) is loop invariant.
static bool isSelectCmpRecurrenceKind(RecurKind Kind) {		static bool isSelectCmpRecurrenceKind(RecurKind Kind) {
return Kind == RecurKind::SelectICmp || Kind == RecurKind::SelectFCmp;		return Kind == RecurKind::SelectICmp || Kind == RecurKind::SelectFCmp;
}		}

		/// Returns true if the recurrence kind is of the form
		/// select(cmp(),x,y) where one of (x,y) is the loop induction variable.
		static bool isInductionMinMaxRecurrenceKind(RecurKind Kind) {
		return Kind == RecurKind::InductionIMax ||
		Kind == RecurKind::InductionIMin ||
		Kind == RecurKind::InductionFMax || Kind == RecurKind::InductionFMin;
		}

/// Returns the type of the recurrence. This type can be narrower than the		/// Returns the type of the recurrence. This type can be narrower than the
/// actual type of the Phi if the recurrence has been type-promoted.		/// actual type of the Phi if the recurrence has been type-promoted.
Type *getRecurrenceType() const { return RecurrenceType; }		Type *getRecurrenceType() const { return RecurrenceType; }

/// Returns a reference to the instructions used for type-promoting the		/// Returns a reference to the instructions used for type-promoting the
/// recurrence.		/// recurrence.
const SmallPtrSet<Instruction *, 8> &getCastInsts() const { return CastInsts; }		const SmallPtrSet<Instruction *, 8> &getCastInsts() const { return CastInsts; }

(159 lines not shown)

llvm/include/llvm/Transforms/Utils/LoopUtils.h

	(394 lines not shown)
	/// kind RecurKind::SelectICmp or RecurKind::SelectFCmp. The reduction operation			/// kind RecurKind::SelectICmp or RecurKind::SelectFCmp. The reduction operation
	/// is described by \p Desc.			/// is described by \p Desc.
	Value *createSelectCmpTargetReduction(IRBuilderBase &B,			Value *createSelectCmpTargetReduction(IRBuilderBase &B,
	const TargetTransformInfo *TTI,			const TargetTransformInfo *TTI,
	Value *Src,			Value *Src,
	const RecurrenceDescriptor &Desc,			const RecurrenceDescriptor &Desc,
	PHINode *OrigPhi);			PHINode *OrigPhi);

				/// Create a target reduction of the given vector \p Src for a reduction of the
				/// kind RecurKind::InductionIMax, RecurKind::InductionIMin,
				/// RecurKind::InductionFMax or RecurKind::InductionFMin. The reduction
				/// operation is described by \p Desc.
				Value *createInductionMinMaxTargetReduction(IRBuilderBase &B,
				const TargetTransformInfo *TTI,
				Value *Src,
				const RecurrenceDescriptor &Desc);

	/// Create a generic target reduction using a recurrence descriptor \p Desc			/// Create a generic target reduction using a recurrence descriptor \p Desc
	/// The target is queried to determine if intrinsics or shuffle sequences are			/// The target is queried to determine if intrinsics or shuffle sequences are
	/// required to implement the reduction.			/// required to implement the reduction.
	/// Fast-math-flags are propagated using the RecurrenceDescriptor.			/// Fast-math-flags are propagated using the RecurrenceDescriptor.
	Value *createTargetReduction(IRBuilderBase &B, const TargetTransformInfo *TTI,			Value *createTargetReduction(IRBuilderBase &B, const TargetTransformInfo *TTI,
	const RecurrenceDescriptor &Desc, Value *Src,			const RecurrenceDescriptor &Desc, Value *Src,
	PHINode *OrigPhi = nullptr);			PHINode *OrigPhi = nullptr);

	(154 lines not shown)

llvm/lib/Analysis/IVDescriptors.cpp

(48 lines not shown)	bool RecurrenceDescriptor::isIntegerRecurrenceKind(RecurKind Kind) {
case RecurKind::And:		case RecurKind::And:
case RecurKind::Xor:		case RecurKind::Xor:
case RecurKind::SMax:		case RecurKind::SMax:
case RecurKind::SMin:		case RecurKind::SMin:
case RecurKind::UMax:		case RecurKind::UMax:
case RecurKind::UMin:		case RecurKind::UMin:
case RecurKind::SelectICmp:		case RecurKind::SelectICmp:
case RecurKind::SelectFCmp:		case RecurKind::SelectFCmp:
		case RecurKind::InductionIMax:
		case RecurKind::InductionIMin:
		case RecurKind::InductionFMax:
		case RecurKind::InductionFMin:
return true;		return true;
}		}
return false;		return false;
}		}

bool RecurrenceDescriptor::isFloatingPointRecurrenceKind(RecurKind Kind) {		bool RecurrenceDescriptor::isFloatingPointRecurrenceKind(RecurKind Kind) {
return (Kind != RecurKind::None) && !isIntegerRecurrenceKind(Kind);		return (Kind != RecurKind::None) && !isIntegerRecurrenceKind(Kind);
}		}
(305 lines not shown)	if (!Cur->isCommutative() && !IsAPhi && !isa<SelectInst>(Cur) &&
!VisitedInsts.count(dyn_cast<Instruction>(Cur->getOperand(0))))		!VisitedInsts.count(dyn_cast<Instruction>(Cur->getOperand(0))))
return false;		return false;

// Any reduction instruction must be of one of the allowed kinds. We ignore		// Any reduction instruction must be of one of the allowed kinds. We ignore
// the starting value (the Phi or an AND instruction if the Phi has been		// the starting value (the Phi or an AND instruction if the Phi has been
// type-promoted).		// type-promoted).
if (Cur != Start) {		if (Cur != Start) {
ReduxDesc =		ReduxDesc =
isRecurrenceInstr(TheLoop, Phi, Cur, Kind, ReduxDesc, FuncFMF);		isRecurrenceInstr(TheLoop, Phi, Cur, Kind, ReduxDesc, FuncFMF, SE);
ExactFPMathInst = ExactFPMathInst == nullptr		ExactFPMathInst = ExactFPMathInst == nullptr
? ReduxDesc.getExactFPMathInst()		? ReduxDesc.getExactFPMathInst()
: ExactFPMathInst;		: ExactFPMathInst;
if (!ReduxDesc.isRecurrence())		if (!ReduxDesc.isRecurrence())
return false;		return false;
// FIXME: FMF is allowed on phi, but propagation is not handled correctly.		// FIXME: FMF is allowed on phi, but propagation is not handled correctly.
if (isa<FPMathOperator>(ReduxDesc.getPatternInst()) && !IsAPhi) {		if (isa<FPMathOperator>(ReduxDesc.getPatternInst()) && !IsAPhi) {
FastMathFlags CurFMF = ReduxDesc.getPatternInst()->getFastMathFlags();		FastMathFlags CurFMF = ReduxDesc.getPatternInst()->getFastMathFlags();
(20 lines not shown)	while (!Worklist.empty()) {
// VisitedInsts.		// VisitedInsts.
if (IsASelect && (Kind == RecurKind::FAdd || Kind == RecurKind::FMul) &&		if (IsASelect && (Kind == RecurKind::FAdd || Kind == RecurKind::FMul) &&
hasMultipleUsesOf(Cur, VisitedInsts, 2))		hasMultipleUsesOf(Cur, VisitedInsts, 2))
return false;		return false;

// A reduction operation must only have one use of the reduction value.		// A reduction operation must only have one use of the reduction value.
if (!IsAPhi && !IsASelect && !isMinMaxRecurrenceKind(Kind) &&		if (!IsAPhi && !IsASelect && !isMinMaxRecurrenceKind(Kind) &&
!isSelectCmpRecurrenceKind(Kind) &&		!isSelectCmpRecurrenceKind(Kind) &&
		!isInductionMinMaxRecurrenceKind(Kind) &&
hasMultipleUsesOf(Cur, VisitedInsts, 1))		hasMultipleUsesOf(Cur, VisitedInsts, 1))
return false;		return false;

// All inputs to a PHI node must be a reduction value.		// All inputs to a PHI node must be a reduction value.
if (IsAPhi && Cur != Phi && !areAllUsesIn(Cur, VisitedInsts))		if (IsAPhi && Cur != Phi && !areAllUsesIn(Cur, VisitedInsts))
return false;		return false;

if ((isIntMinMaxRecurrenceKind(Kind) || Kind == RecurKind::SelectICmp) &&		if ((isIntMinMaxRecurrenceKind(Kind) || Kind == RecurKind::SelectICmp ||
		Kind == RecurKind::InductionIMax ||
		Kind == RecurKind::InductionIMin) &&
(isa<ICmpInst>(Cur) \|\| isa<SelectInst>(Cur)))		(isa<ICmpInst>(Cur) \|\| isa<SelectInst>(Cur)))
++NumCmpSelectPatternInst;		++NumCmpSelectPatternInst;
if ((isFPMinMaxRecurrenceKind(Kind) || Kind == RecurKind::SelectFCmp) &&		if ((isFPMinMaxRecurrenceKind(Kind) || Kind == RecurKind::SelectFCmp ||
		Kind == RecurKind::InductionFMax ||
		Kind == RecurKind::InductionFMin) &&
(isa<FCmpInst>(Cur) \|\| isa<SelectInst>(Cur)))		(isa<FCmpInst>(Cur) \|\| isa<SelectInst>(Cur)))
++NumCmpSelectPatternInst;		++NumCmpSelectPatternInst;

// Check whether we found a reduction operator.		// Check whether we found a reduction operator.
FoundReduxOp |= !IsAPhi && Cur != Start;		FoundReduxOp |= !IsAPhi && Cur != Start;

// Process users of current instruction. Push non-PHI nodes after PHI nodes		// Process users of current instruction. Push non-PHI nodes after PHI nodes
// onto the stack. This way we are going to have seen all inputs to PHI		// onto the stack. This way we are going to have seen all inputs to PHI
(51 lines not shown)	for (User *U : Cur->users()) {
NonPHIs.push_back(UI);		NonPHIs.push_back(UI);
}		}
} else if (!isa<PHINode>(UI) &&		} else if (!isa<PHINode>(UI) &&
((!isa<FCmpInst>(UI) && !isa<ICmpInst>(UI) &&		((!isa<FCmpInst>(UI) && !isa<ICmpInst>(UI) &&
!isa<SelectInst>(UI)) ||		!isa<SelectInst>(UI)) ||
(!isConditionalRdxPattern(Kind, UI).isRecurrence() &&		(!isConditionalRdxPattern(Kind, UI).isRecurrence() &&
!isSelectCmpPattern(TheLoop, Phi, UI, IgnoredVal)		!isSelectCmpPattern(TheLoop, Phi, UI, IgnoredVal)
.isRecurrence() &&		.isRecurrence() &&
		!isInductionMinMaxPattern(TheLoop, Phi, UI, IgnoredVal, SE)
		.isRecurrence() &&
!isMinMaxPattern(UI, Kind, IgnoredVal).isRecurrence())))		!isMinMaxPattern(UI, Kind, IgnoredVal).isRecurrence())))
return false;		return false;

// Remember that we completed the cycle.		// Remember that we completed the cycle.
if (UI == Phi)		if (UI == Phi)
FoundStartPHI = true;		FoundStartPHI = true;
}		}
Worklist.append(PHIs.begin(), PHIs.end());		Worklist.append(PHIs.begin(), PHIs.end());
Worklist.append(NonPHIs.begin(), NonPHIs.end());		Worklist.append(NonPHIs.begin(), NonPHIs.end());
}		}

// This means we have seen one but not the other instruction of the		// This means we have seen one but not the other instruction of the
// pattern or more than just a select and cmp. Zero implies that we saw a		// pattern or more than just a select and cmp. Zero implies that we saw a
// llvm.min/max intrinsic, which is always OK.		// llvm.min/max intrinsic, which is always OK.
if (isMinMaxRecurrenceKind(Kind) && NumCmpSelectPatternInst != 2 &&		if (isMinMaxRecurrenceKind(Kind) && NumCmpSelectPatternInst != 2 &&
NumCmpSelectPatternInst != 0)		NumCmpSelectPatternInst != 0)
return false;		return false;

if (isSelectCmpRecurrenceKind(Kind) && NumCmpSelectPatternInst != 1)		if ((isSelectCmpRecurrenceKind(Kind) ||
		isInductionMinMaxRecurrenceKind(Kind)) &&
		NumCmpSelectPatternInst != 1)
return false;		return false;

if (IntermediateStore) {		if (IntermediateStore) {
// Check that stored value goes to the phi node again. This way we make sure		// Check that stored value goes to the phi node again. This way we make sure
// that the value stored in IntermediateStore is indeed the final reduction		// that the value stored in IntermediateStore is indeed the final reduction
// value.		// value.
if (!is_contained(Phi->operands(), IntermediateStore->getValueOperand())) {		if (!is_contained(Phi->operands(), IntermediateStore->getValueOperand())) {
LLVM_DEBUG(dbgs() << "Not a final reduction value stored: "		LLVM_DEBUG(dbgs() << "Not a final reduction value stored: "
(138 lines not shown)	RecurrenceDescriptor::isSelectCmpPattern(Loop *Loop, PHINode *OrigPhi,
// select(cmp(), loop_invariant, phi)		// select(cmp(), loop_invariant, phi)
if (!Loop->isLoopInvariant(NonPhi))		if (!Loop->isLoopInvariant(NonPhi))
return InstDesc(false, I);		return InstDesc(false, I);

return InstDesc(I, isa<ICmpInst>(I->getOperand(0)) ? RecurKind::SelectICmp		return InstDesc(I, isa<ICmpInst>(I->getOperand(0)) ? RecurKind::SelectICmp
: RecurKind::SelectFCmp);		: RecurKind::SelectFCmp);
}		}

		// We are looking for loops that do something like this:
		// int r = 0;
		// for (int i = 0; i < n; i++) {
		// if (src[i] > 3)
		// r = 2*i;
		// }
		// int r = 0;
		// for (int i = 0; i < n; i++) {
		// if (src[i] > 3)
		// r = -i;
		// }
		// where the reduction value (r) is generated from a combination of a min/max
		// reduction, depending on the step in the AddRec expression, and a select to
		// pick between this reduction and the initial value, depending on whether there
		// was an assignment in the loop.
		RecurrenceDescriptor::InstDesc
		RecurrenceDescriptor::isInductionMinMaxPattern(Loop *Loop, PHINode *OrigPhi,
		Instruction *I, InstDesc &Prev,
		ScalarEvolution *SE) {
		// We must handle the select(cmp(),x,y) as a single instruction. Advance to
		// the select.
		CmpInst::Predicate Pred;
		if (match(I, m_OneUse(m_Cmp(Pred, m_Value(), m_Value())))) {
		if (auto *Select = dyn_cast<SelectInst>(*I->user_begin()))
		return InstDesc(Select, Prev.getRecKind());
		}

		// Only match select with single use cmp condition.
		if (!match(I, m_Select(m_OneUse(m_Cmp(Pred, m_Value(), m_Value())), m_Value(),
		m_Value())))
		return InstDesc(false, I);

		SelectInst *SI = cast<SelectInst>(I);
		Value *NonPhi = nullptr;

		// We are looking for selects of the form:
		// select(cmp(), phi, SCEVAddRecExpr) or
		// select(cmp(), SCEVAddRecExpr, phi)
		if (OrigPhi == dyn_cast<PHINode>(SI->getTrueValue()))
		NonPhi = SI->getFalseValue();
		else if (OrigPhi == dyn_cast<PHINode>(SI->getFalseValue()))
		NonPhi = SI->getTrueValue();
		else
		return InstDesc(false, I);

		// If the NonPhi is loop invariant, we should match the SelectCmp pattern
		if (Loop->isLoopInvariant(NonPhi))
		return InstDesc(false, I);

		// We require an SCEV AddRec expression with a constant step. If the constant
		// step is negative, switch the min and max.
		if (!SE->isSCEVable(NonPhi->getType()))
		return InstDesc(false, I);

		if (auto *AR = dyn_cast<SCEVAddRecExpr>(SE->getSCEV(NonPhi)))
		if (auto *ConstStep = dyn_cast<SCEVConstant>(AR->getStepRecurrence(*SE))) {
		RecurKind RMin =
		(isa<ICmpInst>(I->getOperand(0)) ? RecurKind::InductionIMin
		: RecurKind::InductionFMin);
		RecurKind RMax =
		(isa<ICmpInst>(I->getOperand(0)) ? RecurKind::InductionIMax
		: RecurKind::InductionFMax);
		return InstDesc(I, ConstStep->getAPInt().isNegative() ? RMin : RMax);
		[Inline comment by Mel-Chen] I recommend using `isKnownNegative`: `bool isKnownNegative(const SCEV *S)` tests if the given expression is known to be negative.
		}

		return InstDesc(false, I);
		}

RecurrenceDescriptor::InstDesc		RecurrenceDescriptor::InstDesc
RecurrenceDescriptor::isMinMaxPattern(Instruction *I, RecurKind Kind,		RecurrenceDescriptor::isMinMaxPattern(Instruction *I, RecurKind Kind,
const InstDesc &Prev) {		const InstDesc &Prev) {
assert((isa<CmpInst>(I) || isa<SelectInst>(I) || isa<CallInst>(I)) &&		assert((isa<CmpInst>(I) || isa<SelectInst>(I) || isa<CallInst>(I)) &&
"Expected a cmp or select or call instruction");		"Expected a cmp or select or call instruction");
if (!isMinMaxRecurrenceKind(Kind))		if (!isMinMaxRecurrenceKind(Kind))
return InstDesc(false, I);		return InstDesc(false, I);

(83 lines not shown)	RecurrenceDescriptor::isConditionalRdxPattern(RecurKind Kind, Instruction *I) {
Instruction *IPhi = isa<PHINode>(*Op1) ? dyn_cast<Instruction>(Op1)		Instruction *IPhi = isa<PHINode>(*Op1) ? dyn_cast<Instruction>(Op1)
: dyn_cast<Instruction>(Op2);		: dyn_cast<Instruction>(Op2);
if (!IPhi || IPhi != FalseVal)		if (!IPhi || IPhi != FalseVal)
return InstDesc(false, I);		return InstDesc(false, I);

return InstDesc(true, SI);		return InstDesc(true, SI);
}		}

RecurrenceDescriptor::InstDesc		RecurrenceDescriptor::InstDesc RecurrenceDescriptor::isRecurrenceInstr(
RecurrenceDescriptor::isRecurrenceInstr(Loop *L, PHINode *OrigPhi,		Loop *L, PHINode *OrigPhi, Instruction *I, RecurKind Kind, InstDesc &Prev,
Instruction *I, RecurKind Kind,		FastMathFlags FuncFMF, ScalarEvolution *SE) {
InstDesc &Prev, FastMathFlags FuncFMF) {
assert(Prev.getRecKind() == RecurKind::None || Prev.getRecKind() == Kind);		assert(Prev.getRecKind() == RecurKind::None || Prev.getRecKind() == Kind);
switch (I->getOpcode()) {		switch (I->getOpcode()) {
default:		default:
return InstDesc(false, I);		return InstDesc(false, I);
case Instruction::PHI:		case Instruction::PHI:
return InstDesc(I, Prev.getRecKind(), Prev.getExactFPMathInst());		return InstDesc(I, Prev.getRecKind(), Prev.getExactFPMathInst());
case Instruction::Sub:		case Instruction::Sub:
case Instruction::Add:		case Instruction::Add:
(19 lines not shown)	if (Kind == RecurKind::FAdd || Kind == RecurKind::FMul ||
Kind == RecurKind::Add || Kind == RecurKind::Mul)		Kind == RecurKind::Add || Kind == RecurKind::Mul)
return isConditionalRdxPattern(Kind, I);		return isConditionalRdxPattern(Kind, I);
[[fallthrough]];		[[fallthrough]];
case Instruction::FCmp:		case Instruction::FCmp:
case Instruction::ICmp:		case Instruction::ICmp:
case Instruction::Call:		case Instruction::Call:
if (isSelectCmpRecurrenceKind(Kind))		if (isSelectCmpRecurrenceKind(Kind))
return isSelectCmpPattern(L, OrigPhi, I, Prev);		return isSelectCmpPattern(L, OrigPhi, I, Prev);
		if (isInductionMinMaxRecurrenceKind(Kind))
		return isInductionMinMaxPattern(L, OrigPhi, I, Prev, SE);
if (isIntMinMaxRecurrenceKind(Kind) ||		if (isIntMinMaxRecurrenceKind(Kind) ||
(((FuncFMF.noNaNs() && FuncFMF.noSignedZeros()) ||		(((FuncFMF.noNaNs() && FuncFMF.noSignedZeros()) ||
(isa<FPMathOperator>(I) && I->hasNoNaNs() &&		(isa<FPMathOperator>(I) && I->hasNoNaNs() &&
I->hasNoSignedZeros())) &&		I->hasNoSignedZeros())) &&
isFPMinMaxRecurrenceKind(Kind)))		isFPMinMaxRecurrenceKind(Kind)))
return isMinMaxPattern(I, Kind, Prev);		return isMinMaxPattern(I, Kind, Prev);
else if (isFMulAddIntrinsic(I))		else if (isFMulAddIntrinsic(I))
return InstDesc(Kind == RecurKind::FMulAdd, I,		return InstDesc(Kind == RecurKind::FMulAdd, I,
(75 lines not shown)	if (AddReductionVar(Phi, RecurKind::UMin, TheLoop, FMF, RedDes, DB, AC, DT,
return true;		return true;
}		}
if (AddReductionVar(Phi, RecurKind::SelectICmp, TheLoop, FMF, RedDes, DB, AC,		if (AddReductionVar(Phi, RecurKind::SelectICmp, TheLoop, FMF, RedDes, DB, AC,
DT, SE)) {		DT, SE)) {
LLVM_DEBUG(dbgs() << "Found an integer conditional select reduction PHI."		LLVM_DEBUG(dbgs() << "Found an integer conditional select reduction PHI."
<< *Phi << "\n");		<< *Phi << "\n");
return true;		return true;
}		}
		if (AddReductionVar(Phi, RecurKind::InductionIMax, TheLoop, FMF, RedDes, DB,
		AC, DT, SE)) {
		LLVM_DEBUG(dbgs() << "Found a MAX reduction on loop induction variable PHI."
		<< *Phi << "\n");
		return true;
		}
		if (AddReductionVar(Phi, RecurKind::InductionIMin, TheLoop, FMF, RedDes, DB,
		AC, DT, SE)) {
		LLVM_DEBUG(dbgs() << "Found a MIN reduction on loop induction variable PHI."
		<< *Phi << "\n");
		return true;
		}
if (AddReductionVar(Phi, RecurKind::FMul, TheLoop, FMF, RedDes, DB, AC, DT,		if (AddReductionVar(Phi, RecurKind::FMul, TheLoop, FMF, RedDes, DB, AC, DT,
SE)) {		SE)) {
LLVM_DEBUG(dbgs() << "Found an FMult reduction PHI." << *Phi << "\n");		LLVM_DEBUG(dbgs() << "Found an FMult reduction PHI." << *Phi << "\n");
return true;		return true;
}		}
if (AddReductionVar(Phi, RecurKind::FAdd, TheLoop, FMF, RedDes, DB, AC, DT,		if (AddReductionVar(Phi, RecurKind::FAdd, TheLoop, FMF, RedDes, DB, AC, DT,
SE)) {		SE)) {
LLVM_DEBUG(dbgs() << "Found an FAdd reduction PHI." << *Phi << "\n");		LLVM_DEBUG(dbgs() << "Found an FAdd reduction PHI." << *Phi << "\n");
Show All 10 Lines	if (AddReductionVar(Phi, RecurKind::FMin, TheLoop, FMF, RedDes, DB, AC, DT,
return true;		return true;
}		}
if (AddReductionVar(Phi, RecurKind::SelectFCmp, TheLoop, FMF, RedDes, DB, AC,		if (AddReductionVar(Phi, RecurKind::SelectFCmp, TheLoop, FMF, RedDes, DB, AC,
DT, SE)) {		DT, SE)) {
LLVM_DEBUG(dbgs() << "Found a float conditional select reduction PHI."		LLVM_DEBUG(dbgs() << "Found a float conditional select reduction PHI."
<< " PHI." << *Phi << "\n");		<< " PHI." << *Phi << "\n");
return true;		return true;
}		}
		if (AddReductionVar(Phi, RecurKind::InductionFMax, TheLoop, FMF, RedDes, DB,
		AC, DT, SE)) {
		LLVM_DEBUG(dbgs() << "Found a MAX reduction on loop induction variable PHI."
		<< *Phi << "\n");
		return true;
		}
		if (AddReductionVar(Phi, RecurKind::InductionFMin, TheLoop, FMF, RedDes, DB,
		AC, DT, SE)) {
		LLVM_DEBUG(dbgs() << "Found a MIN reduction on loop induction variable PHI."
		<< *Phi << "\n");
		return true;
		}
		Mel-Chen: Actually, I've had an idea: for these types of RecurKind, we should only need to do `AddReductionVar` once.
		artagnon (author): Yes, your approach of extending `isSelectCmpPattern` saves us these.
if (AddReductionVar(Phi, RecurKind::FMulAdd, TheLoop, FMF, RedDes, DB, AC, DT,		if (AddReductionVar(Phi, RecurKind::FMulAdd, TheLoop, FMF, RedDes, DB, AC, DT,
SE)) {		SE)) {
LLVM_DEBUG(dbgs() << "Found an FMulAdd reduction PHI." << *Phi << "\n");		LLVM_DEBUG(dbgs() << "Found an FMulAdd reduction PHI." << *Phi << "\n");
return true;		return true;
}		}
// Not a reduction of known type.		// Not a reduction of known type.
return false;		return false;
}		}
▲ Show 20 Lines • Show All 131 Lines • ▼ Show 20 Lines	assert((FMF.noNaNs() && FMF.noSignedZeros()) &&
"nnan, nsz is expected to be set for FP min reduction.");		"nnan, nsz is expected to be set for FP min reduction.");
return ConstantFP::getInfinity(Tp, false /*Negative*/);		return ConstantFP::getInfinity(Tp, false /*Negative*/);
case RecurKind::FMax:		case RecurKind::FMax:
assert((FMF.noNaNs() && FMF.noSignedZeros()) &&		assert((FMF.noNaNs() && FMF.noSignedZeros()) &&
"nnan, nsz is expected to be set for FP max reduction.");		"nnan, nsz is expected to be set for FP max reduction.");
return ConstantFP::getInfinity(Tp, true /*Negative*/);		return ConstantFP::getInfinity(Tp, true /*Negative*/);
case RecurKind::SelectICmp:		case RecurKind::SelectICmp:
case RecurKind::SelectFCmp:		case RecurKind::SelectFCmp:
		case RecurKind::InductionIMax:
		case RecurKind::InductionIMin:
		case RecurKind::InductionFMax:
		case RecurKind::InductionFMin:
		Mel-Chen: Based on my understanding, in a broader sense, ternary operations do not have an identity. Although I used the term "identity" for convenience in implementation to refer to the sentinel value, I have been considering whether to separate the concept of the sentinel value from identity.
return getRecurrenceStartValue();		return getRecurrenceStartValue();
break;		break;
default:		default:
llvm_unreachable("Unknown recurrence kind");		llvm_unreachable("Unknown recurrence kind");
}		}
}		}

unsigned RecurrenceDescriptor::getOpcode(RecurKind Kind) {		unsigned RecurrenceDescriptor::getOpcode(RecurKind Kind) {
Show All 13 Lines	unsigned RecurrenceDescriptor::getOpcode(RecurKind Kind) {
case RecurKind::FMulAdd:		case RecurKind::FMulAdd:
case RecurKind::FAdd:		case RecurKind::FAdd:
return Instruction::FAdd;		return Instruction::FAdd;
case RecurKind::SMax:		case RecurKind::SMax:
case RecurKind::SMin:		case RecurKind::SMin:
case RecurKind::UMax:		case RecurKind::UMax:
case RecurKind::UMin:		case RecurKind::UMin:
case RecurKind::SelectICmp:		case RecurKind::SelectICmp:
		case RecurKind::InductionIMax:
		case RecurKind::InductionIMin:
return Instruction::ICmp;		return Instruction::ICmp;
case RecurKind::FMax:		case RecurKind::FMax:
case RecurKind::FMin:		case RecurKind::FMin:
case RecurKind::SelectFCmp:		case RecurKind::SelectFCmp:
		case RecurKind::InductionFMax:
		case RecurKind::InductionFMin:
return Instruction::FCmp;		return Instruction::FCmp;
default:		default:
llvm_unreachable("Unknown recurrence operation");		llvm_unreachable("Unknown recurrence operation");
}		}
}		}

SmallVector<Instruction *, 4>		SmallVector<Instruction *, 4>
RecurrenceDescriptor::getReductionOpChain(PHINode *Phi, Loop *L) const {		RecurrenceDescriptor::getReductionOpChain(PHINode *Phi, Loop *L) const {
▲ Show 20 Lines • Show All 448 Lines • Show Last 20 Lines

llvm/lib/Transforms/Utils/LoopUtils.cpp

Show First 20 Lines • Show All 1,055 Lines • ▼ Show 20 Lines	Value *llvm::createSelectCmpTargetReduction(IRBuilderBase &Builder,
Value *Cmp =		Value *Cmp =
Builder.CreateCmp(CmpInst::ICMP_NE, Src, Right, "rdx.select.cmp");		Builder.CreateCmp(CmpInst::ICMP_NE, Src, Right, "rdx.select.cmp");

// If any predicate is true it means that we want to select the new value.		// If any predicate is true it means that we want to select the new value.
Cmp = Builder.CreateOrReduce(Cmp);		Cmp = Builder.CreateOrReduce(Cmp);
return Builder.CreateSelect(Cmp, NewVal, InitVal, "rdx.select");		return Builder.CreateSelect(Cmp, NewVal, InitVal, "rdx.select");
}		}

		Value *llvm::createInductionMinMaxTargetReduction(
		IRBuilderBase &Builder, const TargetTransformInfo *TTI, Value *Src,
		const RecurrenceDescriptor &Desc) {
		assert(RecurrenceDescriptor::isInductionMinMaxRecurrenceKind(
		Desc.getRecurrenceKind()) &&
		"Unexpected reduction kind");
		Value *InitVal = Desc.getRecurrenceStartValue();
		Value *NewVal = nullptr;

		// Create a splat vector with the new value and compare this to the vector
		// we want to reduce.
		ElementCount EC = cast<VectorType>(Src->getType())->getElementCount();
		Value *Right = Builder.CreateVectorSplat(EC, InitVal);
		Value *CmpVector =
		Builder.CreateCmp(CmpInst::ICMP_NE, Src, Right, "rdx.select.cmp");
		Value *Cmp = Builder.CreateOrReduce(CmpVector);
		auto IdxTy = cast<IntegerType>(InitVal->getType());

		// Consider the following example:
		// int src[n] = {4, 5, 2};
		// int r = 331;
		// for (int i = 0; i < n; i++) {
		// if (src[i] > 3)
		// r = i;
		// }
		// return r;
		// We vectorize this case as follows:
		// 1. Create a Splat with the int_min/int_max values, depending on the
		// RecurKind.
		// 2. The Src is filled with values: {0, 1, 2, 3, 4, ...}.
		// 3. The Right is filled with values: {0, 1, 331, 331, 331, ...}.
		Mel-Chen: Based on your implementation, `Right` would not be what you stated as {0, 1, 331, 331, 331, ...}, but rather {331, 331, 331, 331, 331, ...}.
		artagnon (author): Ah yes, my error.
		// 4. The CmpVector is filled with values: {1, 1, 0, 0, 0, ...}.
		// 5. Select using this CmpVector between Src and the Splat with
		// int_min/int_max values.
		// 6. Use a max-reduce/min-reduce on the result of the Select.
		// 7. Use the existing Cmp which determines whether or not any assignment
		// took place in the loop to select between the result of the
		// max-reduce/min-reduce, and the initial value (InitVal).
		switch (Desc.getRecurrenceKind()) {
		case RecurKind::InductionIMax:
		case RecurKind::InductionFMax: {
		auto MinValue = Desc.isSigned()
		? APInt::getSignedMinValue(IdxTy->getBitWidth())
		: APInt::getMinValue(IdxTy->getBitWidth());
		Value *MinSplat =
		Builder.CreateVectorSplat(EC, ConstantInt::get(IdxTy, MinValue));
		Value *Mask =
		Builder.CreateSelect(CmpVector, Src, MinSplat, "rdx.select.mask");
		NewVal = Builder.CreateIntMaxReduce(Mask, Desc.isSigned());
		Mel-Chen: The behavior of `isSigned` is different from what you have in mind. (Indeed, the name is really misleading.) `bool isSigned() const`: Returns true if all source operands of the recurrence are SExtInsts.
		artagnon (author): Oh, ouch.
		break;
		}
		case RecurKind::InductionIMin:
		case RecurKind::InductionFMin: {
		auto MaxValue = Desc.isSigned()
		? APInt::getSignedMaxValue(IdxTy->getBitWidth())
		: APInt::getMaxValue(IdxTy->getBitWidth());
		Value *MaxSplat =
		Builder.CreateVectorSplat(EC, ConstantInt::get(IdxTy, MaxValue));
		Value *Mask =
		Builder.CreateSelect(CmpVector, Src, MaxSplat, "rdx.select.mask");
		NewVal = Builder.CreateIntMinReduce(Mask, Desc.isSigned());
		break;
		}
		default:
		llvm_unreachable("Unexpected reduction kind");
		}

		return Builder.CreateSelect(Cmp, NewVal, InitVal, "rdx.select");
		}
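
The seven steps in the comment of `createInductionMinMaxTargetReduction` can be modelled in plain C++ to check the lane values by hand. This is only a minimal sketch, assuming VF = 4, a signed 32-bit index, and the max variant; `reduceInductionMax` and `VF` are hypothetical names, not LLVM APIs. `Src` models the final value of the vectorized reduction PHI, so `Right` is a splat of `InitVal`, as noted in the review thread.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <limits>

// Hypothetical scalar model of the middle-block reduction; not LLVM code.
constexpr int VF = 4;

int reduceInductionMax(const std::array<int, VF> &Src, int InitVal) {
  // Right is a splat of InitVal, e.g. {331, 331, 331, 331}.
  std::array<int, VF> Right;
  Right.fill(InitVal);

  // Sentinel for lanes that were never assigned (signed int_min for max).
  const int Sentinel = std::numeric_limits<int>::min();

  bool AnyAssigned = false; // models the or-reduce of CmpVector
  int NewVal = Sentinel;    // models the max-reduce of the masked vector
  for (int Lane = 0; Lane < VF; ++Lane) {
    bool Cmp = Src[Lane] != Right[Lane];       // CmpVector lane
    AnyAssigned |= Cmp;
    int Masked = Cmp ? Src[Lane] : Sentinel;   // select(CmpVector, Src, Splat)
    NewVal = std::max(NewVal, Masked);
  }
  // Fall back to the start value if no lane was assigned in the loop.
  return AnyAssigned ? NewVal : InitVal;
}
```

For the running example (`r = 331`, assignments at `i = 0` and `i = 1`), the lanes `{0, 1, 331, 331}` reduce to 1, which matches the scalar loop. Note that this model cannot tell a lane that was never assigned from a lane assigned a value equal to `InitVal`.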

Value *llvm::createSimpleTargetReduction(IRBuilderBase &Builder,		Value *llvm::createSimpleTargetReduction(IRBuilderBase &Builder,
const TargetTransformInfo *TTI,		const TargetTransformInfo *TTI,
Value *Src, RecurKind RdxKind) {		Value *Src, RecurKind RdxKind) {
auto *SrcVecEltTy = cast<VectorType>(Src->getType())->getElementType();		auto *SrcVecEltTy = cast<VectorType>(Src->getType())->getElementType();
switch (RdxKind) {		switch (RdxKind) {
case RecurKind::Add:		case RecurKind::Add:
return Builder.CreateAddReduce(Src);		return Builder.CreateAddReduce(Src);
case RecurKind::Mul:		case RecurKind::Mul:
Show All 35 Lines	Value *llvm::createTargetReduction(IRBuilderBase &B,
// All ops in the reduction inherit fast-math-flags from the recurrence		// All ops in the reduction inherit fast-math-flags from the recurrence
// descriptor.		// descriptor.
IRBuilderBase::FastMathFlagGuard FMFGuard(B);		IRBuilderBase::FastMathFlagGuard FMFGuard(B);
B.setFastMathFlags(Desc.getFastMathFlags());		B.setFastMathFlags(Desc.getFastMathFlags());

RecurKind RK = Desc.getRecurrenceKind();		RecurKind RK = Desc.getRecurrenceKind();
if (RecurrenceDescriptor::isSelectCmpRecurrenceKind(RK))		if (RecurrenceDescriptor::isSelectCmpRecurrenceKind(RK))
return createSelectCmpTargetReduction(B, TTI, Src, Desc, OrigPhi);		return createSelectCmpTargetReduction(B, TTI, Src, Desc, OrigPhi);
		if (RecurrenceDescriptor::isInductionMinMaxRecurrenceKind(RK))
		return createInductionMinMaxTargetReduction(B, TTI, Src, Desc);

return createSimpleTargetReduction(B, TTI, Src, RK);		return createSimpleTargetReduction(B, TTI, Src, RK);
}		}

Value *llvm::createOrderedReduction(IRBuilderBase &B,		Value *llvm::createOrderedReduction(IRBuilderBase &B,
const RecurrenceDescriptor &Desc,		const RecurrenceDescriptor &Desc,
Value *Src, Value *Start) {		Value *Src, Value *Start) {
assert((Desc.getRecurrenceKind() == RecurKind::FAdd ||		assert((Desc.getRecurrenceKind() == RecurKind::FAdd ||
▲ Show 20 Lines • Show All 808 Lines • Show Last 20 Lines

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp


Show First 20 Lines • Show All 3,971 Lines • ▼ Show 20 Lines	else {
// Floating-point operations should have some FMF to enable the reduction.		// Floating-point operations should have some FMF to enable the reduction.
IRBuilderBase::FastMathFlagGuard FMFG(Builder);		IRBuilderBase::FastMathFlagGuard FMFG(Builder);
Builder.setFastMathFlags(RdxDesc.getFastMathFlags());		Builder.setFastMathFlags(RdxDesc.getFastMathFlags());
for (unsigned Part = 1; Part < UF; ++Part) {		for (unsigned Part = 1; Part < UF; ++Part) {
Value *RdxPart = State.get(LoopExitInstDef, Part);		Value *RdxPart = State.get(LoopExitInstDef, Part);
if (Op != Instruction::ICmp && Op != Instruction::FCmp) {		if (Op != Instruction::ICmp && Op != Instruction::FCmp) {
ReducedPartRdx = Builder.CreateBinOp(		ReducedPartRdx = Builder.CreateBinOp(
(Instruction::BinaryOps)Op, RdxPart, ReducedPartRdx, "bin.rdx");		(Instruction::BinaryOps)Op, RdxPart, ReducedPartRdx, "bin.rdx");
} else if (RecurrenceDescriptor::isSelectCmpRecurrenceKind(RK))		} else if (RecurrenceDescriptor::isSelectCmpRecurrenceKind(RK) ||
		RecurrenceDescriptor::isInductionMinMaxRecurrenceKind(RK))
ReducedPartRdx = createSelectCmpOp(Builder, ReductionStartValue, RK,		ReducedPartRdx = createSelectCmpOp(Builder, ReductionStartValue, RK,
ReducedPartRdx, RdxPart);		ReducedPartRdx, RdxPart);
else		else
ReducedPartRdx = createMinMaxOp(Builder, RK, ReducedPartRdx, RdxPart);		ReducedPartRdx = createMinMaxOp(Builder, RK, ReducedPartRdx, RdxPart);
}		}
}		}

// Create the reduction after the loop. Note that inloop reductions create the		// Create the reduction after the loop. Note that inloop reductions create the
▲ Show 20 Lines • Show All 1,897 Lines • ▼ Show 20 Lines	if (!ScalarInterleavingRequiresRuntimePointerCheck &&
// and compares when VF=1 since it may just create more overhead than it's		// and compares when VF=1 since it may just create more overhead than it's
// worth for loops with small trip counts. This is because we still have to		// worth for loops with small trip counts. This is because we still have to
// do the final reduction after the loop.		// do the final reduction after the loop.
bool HasSelectCmpReductions =		bool HasSelectCmpReductions =
HasReductions &&		HasReductions &&
any_of(Legal->getReductionVars(), [&](auto &Reduction) -> bool {		any_of(Legal->getReductionVars(), [&](auto &Reduction) -> bool {
const RecurrenceDescriptor &RdxDesc = Reduction.second;		const RecurrenceDescriptor &RdxDesc = Reduction.second;
return RecurrenceDescriptor::isSelectCmpRecurrenceKind(		return RecurrenceDescriptor::isSelectCmpRecurrenceKind(
		RdxDesc.getRecurrenceKind()) ||
		RecurrenceDescriptor::isInductionMinMaxRecurrenceKind(
RdxDesc.getRecurrenceKind());		RdxDesc.getRecurrenceKind());
});		});
if (HasSelectCmpReductions) {		if (HasSelectCmpReductions) {
LLVM_DEBUG(dbgs() << "LV: Not interleaving select-cmp reductions.\n");		LLVM_DEBUG(dbgs() << "LV: Not interleaving select-cmp reductions.\n");
return 1;		return 1;
}		}

// If we have a scalar reduction (vector reductions are already dealt with		// If we have a scalar reduction (vector reductions are already dealt with
// by this point), we can increase the critical path length if the loop		// by this point), we can increase the critical path length if the loop
▲ Show 20 Lines • Show All 2,931 Lines • ▼ Show 20 Lines	for (const auto &Reduction : CM.getInLoopReductionChains()) {
const SmallVector<Instruction *, 4> &ReductionOperations = Reduction.second;		const SmallVector<Instruction *, 4> &ReductionOperations = Reduction.second;

RecipeBuilder.recordRecipeOf(Phi);		RecipeBuilder.recordRecipeOf(Phi);
for (const auto &R : ReductionOperations) {		for (const auto &R : ReductionOperations) {
RecipeBuilder.recordRecipeOf(R);		RecipeBuilder.recordRecipeOf(R);
// For min/max reductions, where we have a pair of icmp/select, we also		// For min/max reductions, where we have a pair of icmp/select, we also
// need to record the ICmp recipe, so it can be removed later.		// need to record the ICmp recipe, so it can be removed later.
assert(!RecurrenceDescriptor::isSelectCmpRecurrenceKind(Kind) &&		assert(!RecurrenceDescriptor::isSelectCmpRecurrenceKind(Kind) &&
		!RecurrenceDescriptor::isInductionMinMaxRecurrenceKind(Kind) &&
"Only min/max recurrences allowed for inloop reductions");		"Only min/max recurrences allowed for inloop reductions");
if (RecurrenceDescriptor::isMinMaxRecurrenceKind(Kind))		if (RecurrenceDescriptor::isMinMaxRecurrenceKind(Kind))
RecipeBuilder.recordRecipeOf(cast<Instruction>(R->getOperand(0)));		RecipeBuilder.recordRecipeOf(cast<Instruction>(R->getOperand(0)));
}		}
}		}

// For each interleave group which is relevant for this (possibly trimmed)		// For each interleave group which is relevant for this (possibly trimmed)
// Range, add it to the set of groups to be later applied to the VPlan and add		// Range, add it to the set of groups to be later applied to the VPlan and add
▲ Show 20 Lines • Show All 268 Lines • ▼ Show 20 Lines	for (const auto &Reduction : CM.getInLoopReductionChains()) {
Instruction *Chain = Phi;		Instruction *Chain = Phi;
for (Instruction *R : ReductionOperations) {		for (Instruction *R : ReductionOperations) {
VPRecipeBase *WidenRecipe = RecipeBuilder.getRecipe(R);		VPRecipeBase *WidenRecipe = RecipeBuilder.getRecipe(R);
RecurKind Kind = RdxDesc.getRecurrenceKind();		RecurKind Kind = RdxDesc.getRecurrenceKind();

VPValue *ChainOp = Plan->getVPValue(Chain);		VPValue *ChainOp = Plan->getVPValue(Chain);
unsigned FirstOpId;		unsigned FirstOpId;
assert(!RecurrenceDescriptor::isSelectCmpRecurrenceKind(Kind) &&		assert(!RecurrenceDescriptor::isSelectCmpRecurrenceKind(Kind) &&
		!RecurrenceDescriptor::isInductionMinMaxRecurrenceKind(Kind) &&
"Only min/max recurrences allowed for inloop reductions");		"Only min/max recurrences allowed for inloop reductions");
// Recognize a call to the llvm.fmuladd intrinsic.		// Recognize a call to the llvm.fmuladd intrinsic.
bool IsFMulAdd = (Kind == RecurKind::FMulAdd);		bool IsFMulAdd = (Kind == RecurKind::FMulAdd);
assert((!IsFMulAdd || RecurrenceDescriptor::isFMulAddIntrinsic(R)) &&		assert((!IsFMulAdd || RecurrenceDescriptor::isFMulAddIntrinsic(R)) &&
"Expected instruction to be a call to the llvm.fmuladd intrinsic");		"Expected instruction to be a call to the llvm.fmuladd intrinsic");
if (RecurrenceDescriptor::isMinMaxRecurrenceKind(Kind)) {		if (RecurrenceDescriptor::isMinMaxRecurrenceKind(Kind)) {
assert(isa<VPWidenSelectRecipe>(WidenRecipe) &&		assert(isa<VPWidenSelectRecipe>(WidenRecipe) &&
"Expected to replace a VPWidenSelectSC");		"Expected to replace a VPWidenSelectSC");
▲ Show 20 Lines • Show All 1,492 Lines • Show Last 20 Lines

llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp

Show First 20 Lines • Show All 1,288 Lines • ▼ Show 20 Lines	void VPReductionPHIRecipe::execute(VPTransformState &State) {
// Reductions do not have to start at zero. They can start with		// Reductions do not have to start at zero. They can start with
// any loop invariant values.		// any loop invariant values.
VPValue *StartVPV = getStartValue();		VPValue *StartVPV = getStartValue();
Value *StartV = StartVPV->getLiveInIRValue();		Value *StartV = StartVPV->getLiveInIRValue();

Value *Iden = nullptr;		Value *Iden = nullptr;
RecurKind RK = RdxDesc.getRecurrenceKind();		RecurKind RK = RdxDesc.getRecurrenceKind();
if (RecurrenceDescriptor::isMinMaxRecurrenceKind(RK) ||		if (RecurrenceDescriptor::isMinMaxRecurrenceKind(RK) ||
RecurrenceDescriptor::isSelectCmpRecurrenceKind(RK)) {		RecurrenceDescriptor::isSelectCmpRecurrenceKind(RK) ||
		RecurrenceDescriptor::isInductionMinMaxRecurrenceKind(RK)) {
// MinMax reduction have the start value as their identify.		// MinMax reduction have the start value as their identify.
if (ScalarPHI) {		if (ScalarPHI) {
Iden = StartV;		Iden = StartV;
} else {		} else {
IRBuilderBase::InsertPointGuard IPBuilder(Builder);		IRBuilderBase::InsertPointGuard IPBuilder(Builder);
Builder.SetInsertPoint(VectorPH->getTerminator());		Builder.SetInsertPoint(VectorPH->getTerminator());
StartV = Iden =		StartV = Iden =
Builder.CreateVectorSplat(State.VF, StartV, "minmax.ident");		Builder.CreateVectorSplat(State.VF, StartV, "minmax.ident");
▲ Show 20 Lines • Show All 105 Lines • Show Last 20 Lines

llvm/test/Transforms/LoopVectorize/if-reduction.ll

	Show First 20 Lines • Show All 906 Lines • ▼ Show 20 Lines

	for.end: ; preds = %for.body, %entry			for.end: ; preds = %for.body, %entry
	%1 = phi i32 [ 0, %entry ], [ %sum.2, %for.body ]			%1 = phi i32 [ 0, %entry ], [ %sum.2, %for.body ]
	ret i32 %1			ret i32 %1
	}			}

	@table = constant [13 x i16] [i16 10, i16 35, i16 69, i16 147, i16 280, i16 472, i16 682, i16 1013, i16 1559, i16 2544, i16 4553, i16 6494, i16 10000], align 1			@table = constant [13 x i16] [i16 10, i16 35, i16 69, i16 147, i16 280, i16 472, i16 682, i16 1013, i16 1559, i16 2544, i16 4553, i16 6494, i16 10000], align 1

	; CHECK-LABEL: @non_reduction_index(			; CHECK-LABEL: @reduction_index(
	; CHECK-NOT: <4 x i16>			; CHECK: %[[V1:.*]] = insertelement <4 x i16> poison, i16 %[[ARG:.*]], i64 0
	define i16 @non_reduction_index(i16 noundef %val) {			; CHECK: shufflevector <4 x i16> %[[V1]], <4 x i16> poison, <4 x i32> zeroinitializer
				define i16 @reduction_index(i16 noundef %val) {
	entry:			entry:
	br label %for.body			br label %for.body

	for.cond.cleanup: ; preds = %for.body			for.cond.cleanup: ; preds = %for.body
	%spec.select.lcssa = phi i16 [ %spec.select, %for.body ]			%spec.select.lcssa = phi i16 [ %spec.select, %for.body ]
	ret i16 %spec.select.lcssa			ret i16 %spec.select.lcssa

	for.body: ; preds = %entry, %for.body			for.body: ; preds = %entry, %for.body
	%i.05 = phi i16 [ 12, %entry ], [ %sub, %for.body ]			%i.05 = phi i16 [ 12, %entry ], [ %sub, %for.body ]
	%k.04 = phi i16 [ 0, %entry ], [ %spec.select, %for.body ]			%k.04 = phi i16 [ 0, %entry ], [ %spec.select, %for.body ]
	%arrayidx = getelementptr inbounds [13 x i16], ptr @table, i16 0, i16 %i.05			%arrayidx = getelementptr inbounds [13 x i16], ptr @table, i16 0, i16 %i.05
	%0 = load i16, ptr %arrayidx, align 1			%0 = load i16, ptr %arrayidx, align 1
	%cmp1 = icmp ugt i16 %0, %val			%cmp1 = icmp ugt i16 %0, %val
	%sub = add nsw i16 %i.05, -1			%sub = add nsw i16 %i.05, -1
	%spec.select = select i1 %cmp1, i16 %sub, i16 %k.04			%spec.select = select i1 %cmp1, i16 %sub, i16 %k.04
	%cmp.not = icmp eq i16 %sub, 0			%cmp.not = icmp eq i16 %sub, 0
	br i1 %cmp.not, label %for.cond.cleanup, label %for.body			br i1 %cmp.not, label %for.cond.cleanup, label %for.body
	}			}

	@tablef = constant [13 x half] [half 10.0, half 35.0, half 69.0, half 147.0, half 280.0, half 472.0, half 682.0, half 1013.0, half 1559.0, half 2544.0, half 4556.0, half 6496.0, half 10000.0], align 1			@tablef = constant [13 x half] [half 10.0, half 35.0, half 69.0, half 147.0, half 280.0, half 472.0, half 682.0, half 1013.0, half 1559.0, half 2544.0, half 4556.0, half 6496.0, half 10000.0], align 1

	; CHECK-LABEL: @non_reduction_index_half(			; CHECK-LABEL: @reduction_index_half(
	; CHECK-NOT: <4 x half>			; CHECK: %[[V1:.*]] = insertelement <4 x half> poison, half %[[ARG:.*]], i64 0
	define i16 @non_reduction_index_half(half noundef %val) {			; CHECK: shufflevector <4 x half> %[[V1]], <4 x half> poison, <4 x i32> zeroinitializer
				define i16 @reduction_index_half(half noundef %val) {
	entry:			entry:
	br label %for.body			br label %for.body

	for.cond.cleanup: ; preds = %for.body			for.cond.cleanup: ; preds = %for.body
	%spec.select.lcssa = phi i16 [ %spec.select, %for.body ]			%spec.select.lcssa = phi i16 [ %spec.select, %for.body ]
	ret i16 %spec.select.lcssa			ret i16 %spec.select.lcssa

	for.body: ; preds = %entry, %for.body			for.body: ; preds = %entry, %for.body
	Show All 13 Lines

llvm/test/Transforms/LoopVectorize/induction-min-max.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
				; RUN: opt -passes=loop-vectorize,instcombine,dce -force-vector-interleave=1 -force-vector-width=4 -S < %s | FileCheck %s

				define i32 @test_induction_icmp_max(ptr %a) {
				; CHECK-LABEL: @test_induction_icmp_max(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[TMP0:%.*]] = load i32, ptr [[A:%.*]], align 4
				; CHECK-NEXT: br i1 false, label [[SCALAR_PH:%.*]], label [[VECTOR_PH:%.*]]
				; CHECK: vector.ph:
				; CHECK-NEXT: [[MINMAX_IDENT_SPLATINSERT:%.*]] = insertelement <4 x i32> poison, i32 [[TMP0]], i64 0
				; CHECK-NEXT: [[MINMAX_IDENT_SPLAT:%.*]] = shufflevector <4 x i32> [[MINMAX_IDENT_SPLATINSERT]], <4 x i32> poison, <4 x i32> zeroinitializer
				; CHECK-NEXT: br label [[VECTOR_BODY:%.*]]
				; CHECK: vector.body:
				; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
				; CHECK-NEXT: [[VEC_PHI:%.*]] = phi <4 x i32> [ [[MINMAX_IDENT_SPLAT]], [[VECTOR_PH]] ], [ [[TMP3:%.*]], [[VECTOR_BODY]] ]
				; CHECK-NEXT: [[VEC_IND:%.*]] = phi <4 x i32> [ <i32 0, i32 1, i32 2, i32 3>, [[VECTOR_PH]] ], [ [[VEC_IND_NEXT:%.*]], [[VECTOR_BODY]] ]
				; CHECK-NEXT: [[TMP1:%.*]] = getelementptr inbounds i32, ptr [[A]], i64 [[INDEX]]
				; CHECK-NEXT: [[WIDE_LOAD:%.*]] = load <4 x i32>, ptr [[TMP1]], align 4
				; CHECK-NEXT: [[TMP2:%.*]] = icmp slt <4 x i32> [[WIDE_LOAD]], zeroinitializer
				; CHECK-NEXT: [[TMP3]] = select <4 x i1> [[TMP2]], <4 x i32> [[VEC_IND]], <4 x i32> [[VEC_PHI]]
				; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 4
				; CHECK-NEXT: [[VEC_IND_NEXT]] = add <4 x i32> [[VEC_IND]], <i32 4, i32 4, i32 4, i32 4>
				; CHECK-NEXT: [[TMP4:%.*]] = icmp eq i64 [[INDEX_NEXT]], 32000
				; CHECK-NEXT: br i1 [[TMP4]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP0:![0-9]+]]
				; CHECK: middle.block:
				; CHECK-NEXT: [[DOTSPLATINSERT:%.*]] = insertelement <4 x i32> poison, i32 [[TMP0]], i64 0
				; CHECK-NEXT: [[DOTSPLAT:%.*]] = shufflevector <4 x i32> [[DOTSPLATINSERT]], <4 x i32> poison, <4 x i32> zeroinitializer
				; CHECK-NEXT: [[RDX_SELECT_CMP:%.*]] = icmp ne <4 x i32> [[TMP3]], [[DOTSPLAT]]
				; CHECK-NEXT: [[TMP5:%.*]] = bitcast <4 x i1> [[RDX_SELECT_CMP]] to i4
				; CHECK-NEXT: [[DOTNOT:%.*]] = icmp eq i4 [[TMP5]], 0
				; CHECK-NEXT: [[RDX_SELECT_MASK:%.*]] = select <4 x i1> [[RDX_SELECT_CMP]], <4 x i32> [[TMP3]], <4 x i32> zeroinitializer
				; CHECK-NEXT: [[TMP6:%.*]] = call i32 @llvm.vector.reduce.umax.v4i32(<4 x i32> [[RDX_SELECT_MASK]])
				; CHECK-NEXT: [[RDX_SELECT:%.*]] = select i1 [[DOTNOT]], i32 [[TMP0]], i32 [[TMP6]]
				; CHECK-NEXT: br i1 true, label [[FOR_COND_CLEANUP:%.*]], label [[SCALAR_PH]]
				; CHECK: scalar.ph:
				; CHECK-NEXT: br label [[FOR_BODY:%.*]]
				; CHECK: for.cond.cleanup:
				; CHECK-NEXT: [[SPEC_SELECT_LCSSA:%.*]] = phi i32 [ poison, [[FOR_BODY]] ], [ [[RDX_SELECT]], [[MIDDLE_BLOCK]] ]
				; CHECK-NEXT: ret i32 [[SPEC_SELECT_LCSSA]]
				; CHECK: for.body:
				; CHECK-NEXT: br i1 poison, label [[FOR_COND_CLEANUP]], label [[FOR_BODY]], !llvm.loop [[LOOP3:![0-9]+]]
				;
				entry:
				%0 = load i32, ptr %a
				br label %for.body

				for.cond.cleanup:
				ret i32 %spec.select

				for.body:
				%indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ]
				%k.07 = phi i32 [ %0, %entry ], [ %spec.select, %for.body ]
				%arrayidx1 = getelementptr inbounds i32, ptr %a, i64 %indvars.iv
				%1 = load i32, ptr %arrayidx1
				%cmp2 = icmp slt i32 %1, 0
				%2 = trunc i64 %indvars.iv to i32
				%spec.select = select i1 %cmp2, i32 %2, i32 %k.07
				%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
				%exitcond.not = icmp eq i64 %indvars.iv.next, 32000
				br i1 %exitcond.not, label %for.cond.cleanup, label %for.body
				}

				define i32 @test_induction_icmp_min_negative_addrec(ptr %a) {
				; CHECK-LABEL: @test_induction_icmp_min_negative_addrec(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[TMP0:%.*]] = load i32, ptr [[A:%.*]], align 4
				; CHECK-NEXT: br i1 false, label [[SCALAR_PH:%.*]], label [[VECTOR_PH:%.*]]
				; CHECK: vector.ph:
				; CHECK-NEXT: [[MINMAX_IDENT_SPLATINSERT:%.*]] = insertelement <4 x i32> poison, i32 [[TMP0]], i64 0
				; CHECK-NEXT: [[MINMAX_IDENT_SPLAT:%.*]] = shufflevector <4 x i32> [[MINMAX_IDENT_SPLATINSERT]], <4 x i32> poison, <4 x i32> zeroinitializer
				; CHECK-NEXT: br label [[VECTOR_BODY:%.*]]
				; CHECK: vector.body:
				; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
				; CHECK-NEXT: [[VEC_PHI:%.*]] = phi <4 x i32> [ [[MINMAX_IDENT_SPLAT]], [[VECTOR_PH]] ], [ [[TMP4:%.*]], [[VECTOR_BODY]] ]
				; CHECK-NEXT: [[VEC_IND:%.*]] = phi <4 x i32> [ <i32 0, i32 1, i32 2, i32 3>, [[VECTOR_PH]] ], [ [[VEC_IND_NEXT:%.*]], [[VECTOR_BODY]] ]
				; CHECK-NEXT: [[TMP1:%.*]] = getelementptr inbounds i32, ptr [[A]], i64 [[INDEX]]
				; CHECK-NEXT: [[WIDE_LOAD:%.*]] = load <4 x i32>, ptr [[TMP1]], align 4
				; CHECK-NEXT: [[TMP2:%.*]] = icmp slt <4 x i32> [[WIDE_LOAD]], zeroinitializer
				; CHECK-NEXT: [[TMP3:%.*]] = mul <4 x i32> [[VEC_IND]], <i32 -2, i32 -2, i32 -2, i32 -2>
				; CHECK-NEXT: [[TMP4]] = select <4 x i1> [[TMP2]], <4 x i32> [[TMP3]], <4 x i32> [[VEC_PHI]]
				; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 4
				; CHECK-NEXT: [[VEC_IND_NEXT]] = add <4 x i32> [[VEC_IND]], <i32 4, i32 4, i32 4, i32 4>
				; CHECK-NEXT: [[TMP5:%.*]] = icmp eq i64 [[INDEX_NEXT]], 32000
				; CHECK-NEXT: br i1 [[TMP5]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP4:![0-9]+]]
				; CHECK: middle.block:
				; CHECK-NEXT: [[DOTSPLATINSERT:%.*]] = insertelement <4 x i32> poison, i32 [[TMP0]], i64 0
				; CHECK-NEXT: [[DOTSPLAT:%.*]] = shufflevector <4 x i32> [[DOTSPLATINSERT]], <4 x i32> poison, <4 x i32> zeroinitializer
				; CHECK-NEXT: [[RDX_SELECT_CMP:%.*]] = icmp ne <4 x i32> [[TMP4]], [[DOTSPLAT]]
				; CHECK-NEXT: [[TMP6:%.*]] = bitcast <4 x i1> [[RDX_SELECT_CMP]] to i4
				; CHECK-NEXT: [[DOTNOT:%.*]] = icmp eq i4 [[TMP6]], 0
				; CHECK-NEXT: [[RDX_SELECT_MASK:%.*]] = select <4 x i1> [[RDX_SELECT_CMP]], <4 x i32> [[TMP4]], <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>
				; CHECK-NEXT: [[TMP7:%.*]] = call i32 @llvm.vector.reduce.umin.v4i32(<4 x i32> [[RDX_SELECT_MASK]])
				; CHECK-NEXT: [[RDX_SELECT:%.*]] = select i1 [[DOTNOT]], i32 [[TMP0]], i32 [[TMP7]]
				; CHECK-NEXT: br i1 true, label [[FOR_COND_CLEANUP:%.*]], label [[SCALAR_PH]]
				; CHECK: scalar.ph:
				; CHECK-NEXT: br label [[FOR_BODY:%.*]]
				; CHECK: for.cond.cleanup:
				; CHECK-NEXT: [[SPEC_SELECT_LCSSA:%.*]] = phi i32 [ poison, [[FOR_BODY]] ], [ [[RDX_SELECT]], [[MIDDLE_BLOCK]] ]
				; CHECK-NEXT: ret i32 [[SPEC_SELECT_LCSSA]]
				; CHECK: for.body:
				; CHECK-NEXT: br i1 poison, label [[FOR_COND_CLEANUP]], label [[FOR_BODY]], !llvm.loop [[LOOP5:![0-9]+]]
				;
				entry:
				%0 = load i32, ptr %a
				br label %for.body

				for.cond.cleanup:
				ret i32 %spec.select

				for.body:
				%indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ]
				%k.07 = phi i32 [ %0, %entry ], [ %spec.select, %for.body ]
				%arrayidx1 = getelementptr inbounds i32, ptr %a, i64 %indvars.iv
				%1 = load i32, ptr %arrayidx1
				%cmp2 = icmp slt i32 %1, 0
				%2 = trunc i64 %indvars.iv to i32
				%3 = mul i32 %2, -2
				%spec.select = select i1 %cmp2, i32 %3, i32 %k.07
				%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
				%exitcond.not = icmp eq i64 %indvars.iv.next, 32000
				br i1 %exitcond.not, label %for.cond.cleanup, label %for.body
				}
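
The CHECK lines for `@test_induction_icmp_min_negative_addrec` reduce with `llvm.vector.reduce.umin` even though the selected values (`-2 * i`) are zero or negative, which ties in with the review note that `isSigned()` only reports whether all source operands are SExtInsts. The following is a minimal sketch, assuming 4 x i32 lanes, of how signed and unsigned min reductions can disagree on such values; `reduceSMin` and `reduceUMin` are hypothetical helpers that only model the intrinsics, not LLVM code.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdint>

// Models a signed min reduction over 4 x i32.
int32_t reduceSMin(const std::array<int32_t, 4> &V) {
  return *std::min_element(V.begin(), V.end());
}

// Models an unsigned min reduction over the same bits: each lane is
// compared as a uint32_t, so negative values look like very large ones.
int32_t reduceUMin(const std::array<int32_t, 4> &V) {
  auto UnsignedLess = [](int32_t A, int32_t B) {
    return static_cast<uint32_t>(A) < static_cast<uint32_t>(B);
  };
  return *std::min_element(V.begin(), V.end(), UnsignedLess);
}
```

With lanes `{0, -2, -4, -6}` (that is, `-2 * i` for `i = 0..3`), the signed reduction yields -6 while the unsigned one yields 0, so the two only agree while all lanes have the same sign.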

				define i32 @test_induction_icmp_max_negative_addrec(ptr %a) {
				; CHECK-LABEL: @test_induction_icmp_max_negative_addrec(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: br i1 false, label [[SCALAR_PH:%.*]], label [[VECTOR_PH:%.*]]
				; CHECK: vector.ph:
				; CHECK-NEXT: br label [[VECTOR_BODY:%.*]]
				; CHECK: vector.body:
				; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
				; CHECK-NEXT: [[VEC_PHI:%.*]] = phi <4 x i32> [ <i32 -1, i32 -1, i32 -1, i32 -1>, [[VECTOR_PH]] ], [ [[TMP4:%.*]], [[VECTOR_BODY]] ]
				; CHECK-NEXT: [[VEC_IND:%.*]] = phi <4 x i32> [ <i32 31999, i32 31998, i32 31997, i32 31996>, [[VECTOR_PH]] ], [ [[VEC_IND_NEXT:%.*]], [[VECTOR_BODY]] ]
				; CHECK-NEXT: [[OFFSET_IDX:%.*]] = sub i64 31999, [[INDEX]]
				; CHECK-NEXT: [[TMP0:%.*]] = getelementptr inbounds i32, ptr [[A:%.*]], i64 [[OFFSET_IDX]]
				; CHECK-NEXT: [[TMP1:%.*]] = getelementptr inbounds i32, ptr [[TMP0]], i64 -3
				; CHECK-NEXT: [[WIDE_LOAD:%.*]] = load <4 x i32>, ptr [[TMP1]], align 4
				; CHECK-NEXT: [[REVERSE:%.*]] = shufflevector <4 x i32> [[WIDE_LOAD]], <4 x i32> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
				; CHECK-NEXT: [[TMP2:%.*]] = icmp slt <4 x i32> [[REVERSE]], zeroinitializer
				; CHECK-NEXT: [[TMP3:%.*]] = sub <4 x i32> zeroinitializer, [[VEC_IND]]
				; CHECK-NEXT: [[TMP4]] = select <4 x i1> [[TMP2]], <4 x i32> [[TMP3]], <4 x i32> [[VEC_PHI]]
				; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 4
				; CHECK-NEXT: [[VEC_IND_NEXT]] = add <4 x i32> [[VEC_IND]], <i32 -4, i32 -4, i32 -4, i32 -4>
				; CHECK-NEXT: [[TMP5:%.*]] = icmp eq i64 [[INDEX_NEXT]], 32000
				; CHECK-NEXT: br i1 [[TMP5]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP6:![0-9]+]]
				; CHECK: middle.block:
				; CHECK-NEXT: [[RDX_SELECT_CMP:%.*]] = icmp ne <4 x i32> [[TMP4]], <i32 -1, i32 -1, i32 -1, i32 -1>
				; CHECK-NEXT: [[TMP6:%.*]] = bitcast <4 x i1> [[RDX_SELECT_CMP]] to i4
				; CHECK-NEXT: [[DOTNOT:%.*]] = icmp eq i4 [[TMP6]], 0
				; CHECK-NEXT: [[RDX_SELECT_MASK:%.*]] = select <4 x i1> [[RDX_SELECT_CMP]], <4 x i32> [[TMP4]], <4 x i32> zeroinitializer
				; CHECK-NEXT: [[TMP7:%.*]] = call i32 @llvm.vector.reduce.umax.v4i32(<4 x i32> [[RDX_SELECT_MASK]])
				; CHECK-NEXT: [[RDX_SELECT:%.*]] = select i1 [[DOTNOT]], i32 -1, i32 [[TMP7]]
				; CHECK-NEXT: br i1 true, label [[FOR_COND_CLEANUP:%.*]], label [[SCALAR_PH]]
				; CHECK: scalar.ph:
				; CHECK-NEXT: br label [[FOR_BODY:%.*]]
				; CHECK: for.cond.cleanup:
				; CHECK-NEXT: [[SPEC_SELECT_LCSSA:%.*]] = phi i32 [ poison, [[FOR_BODY]] ], [ [[RDX_SELECT]], [[MIDDLE_BLOCK]] ]
				; CHECK-NEXT: ret i32 [[SPEC_SELECT_LCSSA]]
				; CHECK: for.body:
				; CHECK-NEXT: br i1 poison, label [[FOR_COND_CLEANUP]], label [[FOR_BODY]], !llvm.loop [[LOOP7:![0-9]+]]
				;
				entry:
				br label %for.body

				for.cond.cleanup:
				ret i32 %spec.select

				for.body:
				%indvars.iv = phi i64 [ 31999, %entry ], [ %indvars.iv.next, %for.body ]
				%k.05 = phi i32 [ -1, %entry ], [ %spec.select, %for.body ]
				%arrayidx = getelementptr inbounds i32, ptr %a, i64 %indvars.iv
				%0 = load i32, ptr %arrayidx
				%cmp1 = icmp slt i32 %0, 0
				%1 = trunc i64 %indvars.iv to i32
				%2 = sub i32 0, %1
				%spec.select = select i1 %cmp1, i32 %2, i32 %k.05
				%indvars.iv.next = add nsw i64 %indvars.iv, -1
				%cmp.not = icmp eq i64 %indvars.iv, 0
				br i1 %cmp.not, label %for.cond.cleanup, label %for.body
				}

				define i32 @test_induction_icmp_min(ptr %a) {
				; CHECK-LABEL: @test_induction_icmp_min(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: [[TMP0:%.*]] = load i32, ptr [[A:%.*]], align 4
				; CHECK-NEXT: br i1 false, label [[SCALAR_PH:%.*]], label [[VECTOR_PH:%.*]]
				; CHECK: vector.ph:
				; CHECK-NEXT: [[MINMAX_IDENT_SPLATINSERT:%.*]] = insertelement <4 x i32> poison, i32 [[TMP0]], i64 0
				; CHECK-NEXT: [[MINMAX_IDENT_SPLAT:%.*]] = shufflevector <4 x i32> [[MINMAX_IDENT_SPLATINSERT]], <4 x i32> poison, <4 x i32> zeroinitializer
				; CHECK-NEXT: br label [[VECTOR_BODY:%.*]]
				; CHECK: vector.body:
				; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
				; CHECK-NEXT: [[VEC_PHI:%.*]] = phi <4 x i32> [ [[MINMAX_IDENT_SPLAT]], [[VECTOR_PH]] ], [ [[TMP4:%.*]], [[VECTOR_BODY]] ]
				; CHECK-NEXT: [[VEC_IND:%.*]] = phi <4 x i32> [ <i32 31999, i32 31998, i32 31997, i32 31996>, [[VECTOR_PH]] ], [ [[VEC_IND_NEXT:%.*]], [[VECTOR_BODY]] ]
				; CHECK-NEXT: [[OFFSET_IDX:%.*]] = sub i64 31999, [[INDEX]]
				; CHECK-NEXT: [[TMP1:%.*]] = getelementptr inbounds i32, ptr [[A]], i64 [[OFFSET_IDX]]
				; CHECK-NEXT: [[TMP2:%.*]] = getelementptr inbounds i32, ptr [[TMP1]], i64 -3
				; CHECK-NEXT: [[WIDE_LOAD:%.*]] = load <4 x i32>, ptr [[TMP2]], align 4
				; CHECK-NEXT: [[REVERSE:%.*]] = shufflevector <4 x i32> [[WIDE_LOAD]], <4 x i32> poison, <4 x i32> <i32 3, i32 2, i32 1, i32 0>
				; CHECK-NEXT: [[TMP3:%.*]] = icmp slt <4 x i32> [[REVERSE]], zeroinitializer
				; CHECK-NEXT: [[TMP4]] = select <4 x i1> [[TMP3]], <4 x i32> [[VEC_IND]], <4 x i32> [[VEC_PHI]]
				; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 4
				; CHECK-NEXT: [[VEC_IND_NEXT]] = add <4 x i32> [[VEC_IND]], <i32 -4, i32 -4, i32 -4, i32 -4>
				; CHECK-NEXT: [[TMP5:%.*]] = icmp eq i64 [[INDEX_NEXT]], 32000
				; CHECK-NEXT: br i1 [[TMP5]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP8:![0-9]+]]
				; CHECK: middle.block:
				; CHECK-NEXT: [[DOTSPLATINSERT:%.*]] = insertelement <4 x i32> poison, i32 [[TMP0]], i64 0
				; CHECK-NEXT: [[DOTSPLAT:%.*]] = shufflevector <4 x i32> [[DOTSPLATINSERT]], <4 x i32> poison, <4 x i32> zeroinitializer
				; CHECK-NEXT: [[RDX_SELECT_CMP:%.*]] = icmp ne <4 x i32> [[TMP4]], [[DOTSPLAT]]
				; CHECK-NEXT: [[TMP6:%.*]] = bitcast <4 x i1> [[RDX_SELECT_CMP]] to i4
				; CHECK-NEXT: [[DOTNOT:%.*]] = icmp eq i4 [[TMP6]], 0
				; CHECK-NEXT: [[RDX_SELECT_MASK:%.*]] = select <4 x i1> [[RDX_SELECT_CMP]], <4 x i32> [[TMP4]], <4 x i32> <i32 -1, i32 -1, i32 -1, i32 -1>
				; CHECK-NEXT: [[TMP7:%.*]] = call i32 @llvm.vector.reduce.umin.v4i32(<4 x i32> [[RDX_SELECT_MASK]])
				; CHECK-NEXT: [[RDX_SELECT:%.*]] = select i1 [[DOTNOT]], i32 [[TMP0]], i32 [[TMP7]]
				; CHECK-NEXT: br i1 true, label [[FOR_COND_CLEANUP:%.*]], label [[SCALAR_PH]]
				; CHECK: scalar.ph:
				; CHECK-NEXT: br label [[FOR_BODY:%.*]]
				; CHECK: for.cond.cleanup:
				; CHECK-NEXT: [[SPEC_SELECT_LCSSA:%.*]] = phi i32 [ poison, [[FOR_BODY]] ], [ [[RDX_SELECT]], [[MIDDLE_BLOCK]] ]
				; CHECK-NEXT: ret i32 [[SPEC_SELECT_LCSSA]]
				; CHECK: for.body:
				; CHECK-NEXT: br i1 poison, label [[FOR_COND_CLEANUP]], label [[FOR_BODY]], !llvm.loop [[LOOP9:![0-9]+]]
				;
				entry:
				%0 = load i32, ptr %a
				br label %for.body

				for.cond.cleanup:
				ret i32 %spec.select

				for.body:
				%indvars.iv = phi i64 [ 31999, %entry ], [ %indvars.iv.next, %for.body ]
				%k.07 = phi i32 [ %0, %entry ], [ %spec.select, %for.body ]
				%arrayidx1 = getelementptr inbounds i32, ptr %a, i64 %indvars.iv
				%1 = load i32, ptr %arrayidx1
				%cmp2 = icmp slt i32 %1, 0
				%2 = trunc i64 %indvars.iv to i32
				%spec.select = select i1 %cmp2, i32 %2, i32 %k.07
				%indvars.iv.next = add nsw i64 %indvars.iv, -1
				%cmp.not = icmp eq i64 %indvars.iv, 0
				br i1 %cmp.not, label %for.cond.cleanup, label %for.body
				}

				define i32 @test_induction_fcmp_max(ptr %a) {
				; CHECK-LABEL: @test_induction_fcmp_max(
				; CHECK-NEXT: entry:
				; CHECK-NEXT: br i1 false, label [[SCALAR_PH:%.*]], label [[VECTOR_PH:%.*]]
				; CHECK: vector.ph:
				; CHECK-NEXT: br label [[VECTOR_BODY:%.*]]
				; CHECK: vector.body:
				; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
				; CHECK-NEXT: [[VEC_PHI:%.*]] = phi <4 x i32> [ <i32 -1, i32 -1, i32 -1, i32 -1>, [[VECTOR_PH]] ], [ [[TMP2:%.*]], [[VECTOR_BODY]] ]
				; CHECK-NEXT: [[VEC_IND:%.*]] = phi <4 x i32> [ <i32 0, i32 1, i32 2, i32 3>, [[VECTOR_PH]] ], [ [[VEC_IND_NEXT:%.*]], [[VECTOR_BODY]] ]
				; CHECK-NEXT: [[TMP0:%.*]] = getelementptr inbounds float, ptr [[A:%.*]], i64 [[INDEX]]
				; CHECK-NEXT: [[WIDE_LOAD:%.*]] = load <4 x float>, ptr [[TMP0]], align 4
				; CHECK-NEXT: [[TMP1:%.*]] = fcmp olt <4 x float> [[WIDE_LOAD]], zeroinitializer
				; CHECK-NEXT: [[TMP2]] = select <4 x i1> [[TMP1]], <4 x i32> [[VEC_IND]], <4 x i32> [[VEC_PHI]]
				; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 4
				; CHECK-NEXT: [[VEC_IND_NEXT]] = add <4 x i32> [[VEC_IND]], <i32 4, i32 4, i32 4, i32 4>
				; CHECK-NEXT: [[TMP3:%.*]] = icmp eq i64 [[INDEX_NEXT]], 32000
				; CHECK-NEXT: br i1 [[TMP3]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP10:![0-9]+]]
				; CHECK: middle.block:
				; CHECK-NEXT: [[RDX_SELECT_CMP:%.*]] = icmp ne <4 x i32> [[TMP2]], <i32 -1, i32 -1, i32 -1, i32 -1>
				; CHECK-NEXT: [[TMP4:%.*]] = bitcast <4 x i1> [[RDX_SELECT_CMP]] to i4
				; CHECK-NEXT: [[DOTNOT:%.*]] = icmp eq i4 [[TMP4]], 0
				; CHECK-NEXT: [[RDX_SELECT_MASK:%.*]] = select <4 x i1> [[RDX_SELECT_CMP]], <4 x i32> [[TMP2]], <4 x i32> zeroinitializer
				; CHECK-NEXT: [[TMP5:%.*]] = call i32 @llvm.vector.reduce.umax.v4i32(<4 x i32> [[RDX_SELECT_MASK]])
				; CHECK-NEXT: [[RDX_SELECT:%.*]] = select i1 [[DOTNOT]], i32 -1, i32 [[TMP5]]
				; CHECK-NEXT: br i1 true, label [[FOR_COND_CLEANUP:%.*]], label [[SCALAR_PH]]
				; CHECK: scalar.ph:
				; CHECK-NEXT: br label [[FOR_BODY:%.*]]
				; CHECK: for.cond.cleanup:
				; CHECK-NEXT: [[K_1_LCSSA:%.*]] = phi i32 [ poison, [[FOR_BODY]] ], [ [[RDX_SELECT]], [[MIDDLE_BLOCK]] ]
				; CHECK-NEXT: ret i32 [[K_1_LCSSA]]
				; CHECK: for.body:
				; CHECK-NEXT: br i1 poison, label [[FOR_COND_CLEANUP]], label [[FOR_BODY]], !llvm.loop [[LOOP11:![0-9]+]]
				;
				entry:
				br label %for.body

				for.cond.cleanup:
				ret i32 %k.1

				for.body:
				%indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ]
				%k.05 = phi i32 [ -1, %entry ], [ %k.1, %for.body ]
				%arrayidx = getelementptr inbounds float, ptr %a, i64 %indvars.iv
				%0 = load float, ptr %arrayidx
				%cmp1 = fcmp olt float %0, 0.000000e+00
				%1 = trunc i64 %indvars.iv to i32
				%k.1 = select i1 %cmp1, i32 %1, i32 %k.05
				%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
				%exitcond.not = icmp eq i64 %indvars.iv.next, 32000
				br i1 %exitcond.not, label %for.cond.cleanup, label %for.body
				}
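
Reading aid for the middle.block CHECK lines of @test_induction_fcmp_max above: the sequence icmp ne / select / reduce.umax / select can be modelled in scalar C++ roughly as follows. This is a minimal sketch written for illustration, not code from the patch; the function name reduceInductionMax and its parameters are hypothetical, and it assumes the start value (-1 in this test) never collides with a real index.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>

// Hypothetical scalar model of the middle.block in @test_induction_fcmp_max.
// `lanes` stands in for the final <4 x i32> select result ([[TMP2]]);
// `init` is the reduction's start value (-1 in the test).
int32_t reduceInductionMax(const std::array<int32_t, 4> &lanes, int32_t init) {
  bool anyAssigned = false; // inverse of [[DOTNOT]]: did any lane leave the start value?
  uint32_t best = 0;        // lanes still equal to init are masked to 0 before the umax
  for (int32_t lane : lanes) {
    if (lane != init) {     // [[RDX_SELECT_CMP]]: icmp ne against the start value
      anyAssigned = true;
      best = std::max(best, static_cast<uint32_t>(lane)); // llvm.vector.reduce.umax
    }
  }
  // [[RDX_SELECT]]: fall back to the start value when no assignment took place.
  return anyAssigned ? static_cast<int32_t>(best) : init;
}
```

The umin variant checked in @test_induction_icmp_min follows the same shape, except that untouched lanes are masked to all-ones (i32 -1) and the reduction is llvm.vector.reduce.umin.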

llvm/test/Transforms/LoopVectorize/select-min-index.ll

; RUN: opt -passes=loop-vectorize -force-vector-width=4 -force-vector-interleave=1 -S %s | FileCheck %s		; RUN: opt -passes=loop-vectorize -force-vector-width=4 -force-vector-interleave=1 -S %s | FileCheck %s
; RUN: opt -passes=loop-vectorize -force-vector-width=4 -force-vector-interleave=2 -S %s | FileCheck %s
; RUN: opt -passes=loop-vectorize -force-vector-width=1 -force-vector-interleave=2 -S %s | FileCheck %s
Mel-Chen: Why skip interleave?

; Test cases for selecting the index with the minimum value.		; Test cases for selecting the index with the minimum value.

define i64 @test_vectorize_select_umin_idx(ptr %src) {		define i64 @test_vectorize_select_umin_idx(ptr %src) {
; CHECK-LABEL: @test_vectorize_select_umin_idx(		; CHECK-LABEL: @test_vectorize_select_umin_idx(
; CHECK-NOT: vector.body:		; CHECK-NOT: vector.body:
;		;
entry:		entry:
%exitcond.not = icmp eq i64 %iv.next, 0		%exitcond.not = icmp eq i64 %iv.next, 0
br i1 %exitcond.not, label %exit, label %loop		br i1 %exitcond.not, label %exit, label %loop

exit:		exit:
%res = phi i64 [ %min.idx.next, %loop ]		%res = phi i64 [ %min.idx.next, %loop ]
ret i64 %res		ret i64 %res
}		}

define i64 @test_not_vectorize_select_no_min_reduction(ptr %src) {		define i64 @test_vectorize_select_no_min_reduction(ptr %src) {
; CHECK-LABEL: @test_not_vectorize_select_no_min_reduction(		; CHECK-LABEL: define i64 @test_vectorize_select_no_min_reduction
; CHECK-NOT: vector.body:		; CHECK-SAME: (ptr [[SRC:%.*]]) {
		; CHECK-NEXT: entry:
		; CHECK-NEXT: br i1 true, label [[SCALAR_PH:%.*]], label [[VECTOR_PH:%.*]]
		; CHECK: vector.ph:
		; CHECK-NEXT: br label [[VECTOR_BODY:%.*]]
		; CHECK: vector.body:
		; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
		; CHECK-NEXT: [[VEC_IND:%.*]] = phi <4 x i64> [ <i64 0, i64 1, i64 2, i64 3>, [[VECTOR_PH]] ], [ [[VEC_IND_NEXT:%.*]], [[VECTOR_BODY]] ]
		; CHECK-NEXT: [[VEC_PHI:%.*]] = phi <4 x i64> [ zeroinitializer, [[VECTOR_PH]] ], [ [[TMP6:%.*]], [[VECTOR_BODY]] ]
		; CHECK-NEXT: [[VECTOR_RECUR:%.*]] = phi <4 x i64> [ <i64 poison, i64 poison, i64 poison, i64 0>, [[VECTOR_PH]] ], [ [[TMP3:%.*]], [[VECTOR_BODY]] ]
		; CHECK-NEXT: [[TMP0:%.*]] = add i64 [[INDEX]], 0
		; CHECK-NEXT: [[TMP1:%.*]] = getelementptr i64, ptr [[SRC]], i64 [[TMP0]]
		; CHECK-NEXT: [[TMP2:%.*]] = getelementptr i64, ptr [[TMP1]], i32 0
		; CHECK-NEXT: [[WIDE_LOAD:%.*]] = load <4 x i64>, ptr [[TMP2]], align 4
		; CHECK-NEXT: [[TMP3]] = add <4 x i64> [[WIDE_LOAD]], <i64 1, i64 1, i64 1, i64 1>
		; CHECK-NEXT: [[TMP4:%.*]] = shufflevector <4 x i64> [[VECTOR_RECUR]], <4 x i64> [[TMP3]], <4 x i32> <i32 3, i32 4, i32 5, i32 6>
		; CHECK-NEXT: [[TMP5:%.*]] = icmp ugt <4 x i64> [[TMP4]], [[WIDE_LOAD]]
		; CHECK-NEXT: [[TMP6]] = select <4 x i1> [[TMP5]], <4 x i64> [[VEC_IND]], <4 x i64> [[VEC_PHI]]
		; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 4
		; CHECK-NEXT: [[VEC_IND_NEXT]] = add <4 x i64> [[VEC_IND]], <i64 4, i64 4, i64 4, i64 4>
		; CHECK-NEXT: [[TMP7:%.*]] = icmp eq i64 [[INDEX_NEXT]], 0
		; CHECK-NEXT: br i1 [[TMP7]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP0:![0-9]+]]
		; CHECK: middle.block:
		; CHECK-NEXT: [[RDX_SELECT_CMP:%.*]] = icmp ne <4 x i64> [[TMP6]], zeroinitializer
		; CHECK-NEXT: [[TMP8:%.*]] = call i1 @llvm.vector.reduce.or.v4i1(<4 x i1> [[RDX_SELECT_CMP]])
		; CHECK-NEXT: [[RDX_SELECT_MASK:%.*]] = select <4 x i1> [[RDX_SELECT_CMP]], <4 x i64> [[TMP6]], <4 x i64> zeroinitializer
		; CHECK-NEXT: [[TMP9:%.*]] = call i64 @llvm.vector.reduce.umax.v4i64(<4 x i64> [[RDX_SELECT_MASK]])
		; CHECK-NEXT: [[RDX_SELECT:%.*]] = select i1 [[TMP8]], i64 [[TMP9]], i64 0
		; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 0, 0
		; CHECK-NEXT: [[VECTOR_RECUR_EXTRACT:%.*]] = extractelement <4 x i64> [[TMP3]], i32 3
		; CHECK-NEXT: br i1 [[CMP_N]], label [[EXIT:%.*]], label [[SCALAR_PH]]
		; CHECK: scalar.ph:
		; CHECK-NEXT: [[SCALAR_RECUR_INIT:%.*]] = phi i64 [ 0, [[ENTRY:%.*]] ], [ [[VECTOR_RECUR_EXTRACT]], [[MIDDLE_BLOCK]] ]
		; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ 0, [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY]] ]
		; CHECK-NEXT: [[BC_MERGE_RDX:%.*]] = phi i64 [ 0, [[ENTRY]] ], [ [[RDX_SELECT]], [[MIDDLE_BLOCK]] ]
		; CHECK-NEXT: br label [[LOOP:%.*]]
		; CHECK: loop:
		; CHECK-NEXT: [[IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
		; CHECK-NEXT: [[MIN_IDX:%.*]] = phi i64 [ [[BC_MERGE_RDX]], [[SCALAR_PH]] ], [ [[MIN_IDX_NEXT:%.*]], [[LOOP]] ]
		; CHECK-NEXT: [[SCALAR_RECUR:%.*]] = phi i64 [ [[SCALAR_RECUR_INIT]], [[SCALAR_PH]] ], [ [[MIN_VAL_NEXT:%.*]], [[LOOP]] ]
		; CHECK-NEXT: [[GEP:%.*]] = getelementptr i64, ptr [[SRC]], i64 [[IV]]
		; CHECK-NEXT: [[L:%.*]] = load i64, ptr [[GEP]], align 4
		; CHECK-NEXT: [[CMP:%.*]] = icmp ugt i64 [[SCALAR_RECUR]], [[L]]
		; CHECK-NEXT: [[MIN_VAL_NEXT]] = add i64 [[L]], 1
		; CHECK-NEXT: [[FOO:%.*]] = call i64 @llvm.umin.i64(i64 [[SCALAR_RECUR]], i64 [[L]])
		; CHECK-NEXT: [[MIN_IDX_NEXT]] = select i1 [[CMP]], i64 [[IV]], i64 [[MIN_IDX]]
		; CHECK-NEXT: [[IV_NEXT]] = add nuw nsw i64 [[IV]], 1
		; CHECK-NEXT: [[EXITCOND_NOT:%.*]] = icmp eq i64 [[IV_NEXT]], 0
		; CHECK-NEXT: br i1 [[EXITCOND_NOT]], label [[EXIT]], label [[LOOP]], !llvm.loop [[LOOP3:![0-9]+]]
		; CHECK: exit:
		; CHECK-NEXT: [[RES:%.*]] = phi i64 [ [[MIN_IDX_NEXT]], [[LOOP]] ], [ [[RDX_SELECT]], [[MIDDLE_BLOCK]] ]
		; CHECK-NEXT: ret i64 [[RES]]
;		;
entry:		entry:
br label %loop		br label %loop

loop:		loop:
%iv = phi i64 [ 0, %entry ], [ %iv.next, %loop ]		%iv = phi i64 [ 0, %entry ], [ %iv.next, %loop ]
%min.idx = phi i64 [ 0, %entry ], [ %min.idx.next, %loop ]		%min.idx = phi i64 [ 0, %entry ], [ %min.idx.next, %loop ]
%min.val = phi i64 [ 0, %entry ], [ %min.val.next, %loop ]		%min.val = phi i64 [ 0, %entry ], [ %min.val.next, %loop ]
%gep = getelementptr i64, ptr %src, i64 %iv		%gep = getelementptr i64, ptr %src, i64 %iv
%l = load i64, ptr %gep		%l = load i64, ptr %gep
%cmp = icmp ugt i64 %min.val, %l		%cmp = icmp ugt i64 %min.val, %l
%min.val.next = add i64 %l, 1		%min.val.next = add i64 %l, 1
%foo = call i64 @llvm.umin.i64(i64 %min.val, i64 %l)		%foo = call i64 @llvm.umin.i64(i64 %min.val, i64 %l)
%min.idx.next = select i1 %cmp, i64 %iv, i64 %min.idx		%min.idx.next = select i1 %cmp, i64 %iv, i64 %min.idx
%iv.next = add nuw nsw i64 %iv, 1		%iv.next = add nuw nsw i64 %iv, 1
%exitcond.not = icmp eq i64 %iv.next, 0		%exitcond.not = icmp eq i64 %iv.next, 0
br i1 %exitcond.not, label %exit, label %loop		br i1 %exitcond.not, label %exit, label %loop

exit:		exit:
%res = phi i64 [ %min.idx.next, %loop ]		%res = phi i64 [ %min.idx.next, %loop ]
ret i64 %res		ret i64 %res
}		}


define i64 @test_not_vectorize_cmp_value(i64 %x) {		define i64 @test_not_vectorize_cmp_value(i64 %x) {
; CHECK-LABEL: @test_not_vectorize_cmp_value(		; CHECK-LABEL: @test_not_vectorize_cmp_value(
; CHECK-NOT: vector.body:		; CHECK-NOT: vector.body:
;		;
entry:		entry:
br label %loop		br label %loop

loop:		loop:

Diff 530447
