This is an archive of the discontinued LLVM Phabricator instance.

Differential D111077

[LV] Support converting FP add to integer reductions.
Needs Review · Public

Authored by fhahn on Oct 4 2021, 10:06 AM.

Details

Reviewers

scanon
spatel
dmgreen
Ayal
gilr
kmclaughlin

Summary

Floating point reductions that start at an integer value and only get
incremented by another positive integer value can be performed on
integers. The result then needs to be clamped to the step times the
largest power of two up to which every integer is exactly representable
(0x1.0p24 for float, 0x1.0p53 for double).

This approach has been suggested by @scanon.

One thing I am not sure about is if there is an API to get the
corresponding integer value of an APFloat, if it is an integer. The patch
does some manual matching against explicitly constructed APFloats.

Another thing to note is that the patch uses float->signed int/signed
int->float conversions. Should conversions to unsigned be used?
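
As a rough illustration of the summary (this is not code from the patch), the saturation can be reproduced in plain C++ for a float accumulator. The names `sumAsFloat` and `sumAsIntClamped` are hypothetical. The sketch assumes IEEE-754 single precision with round-to-nearest-even, no fast-math, a start value of 0 and a power-of-two step. It uses unsigned arithmetic and `std::min`, whereas the patch uses signed conversions and a minnum call.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Reference: accumulate in float, like the original scalar loop.
float sumAsFloat(uint32_t TripCount, float Step) {
  float Sum = 0.0f;
  for (uint32_t I = 0; I < TripCount; ++I)
    Sum += Step;
  return Sum;
}

// Accumulate on integers, then convert once and clamp.
float sumAsIntClamped(uint32_t TripCount, uint32_t Step) {
  // Assumption of this sketch: the step is a power of two.
  assert(Step != 0 && (Step & (Step - 1)) == 0 && "power-of-two step expected");
  uint64_t Sum = 0;
  for (uint32_t I = 0; I < TripCount; ++I)
    Sum += Step;
  // 2^24 is the point where adding 1.0f to a float stops changing it;
  // scaling by the step gives the value at which the float loop saturates.
  const float Max = 16777216.0f * static_cast<float>(Step);
  return std::min(static_cast<float>(Sum), Max);
}
```

Both functions should agree for any trip count under these assumptions: below the limit the integer sum converts exactly, and above it the clamp reproduces the value at which the float accumulator stops growing.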

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

fhahn created this revision. Oct 4 2021, 10:06 AM

Herald added subscribers: rogfer01, hiraditya. Oct 4 2021, 10:06 AM

fhahn requested review of this revision. Oct 4 2021, 10:06 AM

Herald added a project: Restricted Project. Oct 4 2021, 10:06 AM

Herald added a subscriber: vkmr.

Harbormaster completed remote builds in B126852: Diff 376935. Oct 4 2021, 10:06 AM

Use 0x1.0p53 and 0x1.0p24 for maximum constants instead of large integer values.

Harbormaster completed remote builds in B126856: Diff 376941. Oct 4 2021, 10:17 AM

Interesting idea. Are these two bits of code always the same?
https://godbolt.org/z/EfPKPTMdf

Should we be doing this more generally, outside the vectorizing reductions?

tschuett added a subscriber: tschuett. Oct 4 2021, 10:49 AM

In D111077#3040417, @dmgreen wrote:

Interesting idea. Are these two bits of code always the same?
https://godbolt.org/z/EfPKPTMdf

I think both cases above should be the same. But I think we can construct slight variations where they would not be. E.g. consider a loop where the induction variable starts at 0 and is incremented and overflow is allowed. If n were negative, the result of removing the loop and converting n to a float would yield a negative number, but the loop version would always return a positive number. I might be missing some subtleties when it comes to sign handling; perhaps @scanon has further thoughts.

Should we be doing this more generally, outside the vectorizing reductions?

I think it might be worthwhile to convert such reductions outside the vectorizer in some cases. My motivation for starting in LV is that it should be clearly profitable if it allows vectorization. For general loops without vectorization, it might not be profitable I think, e.g. for loops that only execute once, due to the conversion overhead.

This is an interesting patch @fhahn! I haven't looked at it yet, but the principle sounds good! I think we can also do this at -O3 for FP induction variables, i.e.

float f = 0;
for (int i = 0; i < n; i++)
{
  dst[i] = f;
  f += 2.0;
}

provided we know the conversion between float and integer is guaranteed to be lossless for every iteration. This can be done by versioning the loop with runtime SCEV checks I think, right? Or if we know the trip count at runtime we don't even need runtime checks.
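
A minimal sketch of that idea in plain C++, for illustration only: the function names are hypothetical, and a simple trip-count guard stands in for the runtime SCEV checks mentioned above. It assumes IEEE-754 single precision and the step of 2.0 from the example, where every value written stays exactly representable as long as the trip count does not exceed 2^24.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Original form: floating-point induction variable.
void fillWithFloatIV(std::vector<float> &Dst) {
  float F = 0.0f;
  for (std::size_t I = 0; I < Dst.size(); ++I) {
    Dst[I] = F;
    F += 2.0f;
  }
}

// Integer-IV form. Returns false when the guard fails, in which case a
// caller would fall back to the original loop (the "versioned" path).
bool fillWithIntIV(std::vector<float> &Dst) {
  // With a step of 2, all values 2 * I for I < 2^24 are below 2^25 and even,
  // so they convert to float without rounding.
  const std::size_t MaxLosslessTrip = std::size_t(1) << 24;
  if (Dst.size() > MaxLosslessTrip)
    return false;
  for (std::size_t I = 0; I < Dst.size(); ++I) {
    const float Value = static_cast<float>(2 * I);
    assert(static_cast<std::size_t>(Value) == 2 * I && "conversion must be lossless");
    Dst[I] = Value;
  }
  return true;
}
```

The integer form removes the loop-carried floating-point dependence, which is what makes it attractive for vectorization.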

Something like

const APFloat *NF = …;

bool Ignored;
APSInt NI(64, false); // 64-bit signed result
if (NF->convertToInteger(NI, APFloat::rmTowardZero, &Ignored) == APFloat::opOK)

Check the ConvertToSInt helper in IndVarSimplify.
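
For readers without an LLVM checkout, the check can be sketched with the standard library alone. This is only an analogue operating on `double`, not the APFloat API and not the IndVarSimplify helper; the name `convertToSInt64` is hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Returns true and sets IntVal if X is a finite integer that fits in int64_t.
bool convertToSInt64(double X, int64_t &IntVal) {
  // Reject NaN, infinities and values with a fractional part.
  if (!std::isfinite(X) || std::trunc(X) != X)
    return false;
  // Reject values outside [-2^63, 2^63), where the cast would be undefined.
  if (X < -9223372036854775808.0 || X >= 9223372036854775808.0)
    return false;
  IntVal = static_cast<int64_t>(X);
  assert(static_cast<double>(IntVal) == X && "conversion must round-trip");
  return true;
}
```

A check of this shape could replace comparing the step against a fixed list of constants, since it yields the integer value directly when one exists.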

In D111077#3040938, @fhahn wrote:

In D111077#3040417, @dmgreen wrote:

Interesting idea. Are these two bits of code always the same?
https://godbolt.org/z/EfPKPTMdf

I think both cases above should be the same. But I think we can construct slight variations where they would not be. E.g. consider a loop where the induction variable starts at 0 and is incremented and overflow is allowed. If n were negative, the result of removing the loop and converting n to a float would yield a negative number, but the loop version would always return a positive number. I might be missing some subtleties when it comes to sign handling; perhaps @scanon has further thoughts.

Yep, I was ignoring the negative numbers :) I meant more about the general idea of converting the loop to straight line code.

Should we be doing this more generally, outside the vectorizing reductions?

I think it might be worthwhile to convert such reductions outside the vectorizer in some cases. My motivation for starting in LV is that it should be clearly profitable if it allows vectorization. For general loops without vectorization, it might not be profitable I think, e.g. for loops that only execute once, due to the conversion overhead.

As far as I can tell from this code: https://godbolt.org/z/caPszPafr
The trace through when n==1 would be

cmp     w1, #1
b.lt    .LBB0_3
cmp     w1, #1
b.ne    .LBB0_4
mov     w8, wzr
movi    d0, #0000000000000000
b       .LBB0_7
sub     w8, w1, w8
fmov    s1, #1.00000000
subs    w8, w8, #1
fadd    s0, s0, s1
b.ne    .LBB0_8
ret

vs straight line code with no branches:

bic     w8, w1, w1, asr #31
mov     w9, #1266679808
scvtf   s0, w8
fmov    s1, w9
fminnm  s0, s0, s1
ret

And that's not including vectorization. It's kind of like a "high cost expansion" from SCEV (but to be fair as far as I understand we wouldn't always rewrite high cost exit values, even if it would mean deleting the loop (?)). Which is what made me wonder if we should be doing it generally, not just in the vectorizer. (Not that I have anything against this patch - it looks pretty sensible and doesn't complicate the reduction code any more than it already is. It seems to fit quite well).
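
To make the straight-line sequence easier to follow, here is a hedged C++ reading of it. The function names are hypothetical, the mapping of instructions to statements is an interpretation of the assembly above, and IEEE-754 single precision is assumed. The constant 1266679808 is 0x4B800000, which is the bit pattern of 2^24 as a float.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstring>

// Reinterpret a 32-bit pattern as a float (what fmov s1, w9 does).
float floatFromBits(uint32_t Bits) {
  float F;
  static_assert(sizeof(F) == sizeof(Bits), "float is expected to be 32 bits");
  std::memcpy(&F, &Bits, sizeof(F));
  return F;
}

float closedFormCount(int32_t N) {
  // bic w8, w1, w1, asr #31: clears N when it is negative, i.e. max(N, 0).
  const int32_t NonNegative = std::max<int32_t>(N, 0);
  // mov w9, #1266679808 ; fmov s1, w9: materialize 2^24 as a float.
  const float Limit = floatFromBits(1266679808u);
  assert(Limit == 16777216.0f);
  // scvtf s0, w8 ; fminnm s0, s0, s1: convert, then clamp to the limit.
  return std::min(static_cast<float>(NonNegative), Limit);
}
```

For example, `closedFormCount(-5)` yields 0, matching a loop that never executes, and any count above 2^24 yields 16777216.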

In D111077#3047408, @dmgreen wrote:

In D111077#3040938, @fhahn wrote:

In D111077#3040417, @dmgreen wrote:

Interesting idea. Are these two bits of code always the same?
https://godbolt.org/z/EfPKPTMdf

I think both cases above should be the same. But I think we can construct slight variations where they would not be. E.g. consider a loop where the induction variable starts at 0 and is incremented and overflow is allowed. If n were negative, the result of removing the loop and converting n to a float would yield a negative number, but the loop version would always return a positive number. I might be missing some subtleties when it comes to sign handling; perhaps @scanon has further thoughts.

Yep, I was ignoring the negative numbers :) I meant more about the general idea of converting the loop to straight line code.

Should we be doing this more generally, outside the vectorizing reductions?

I think it might be worthwhile to convert such reductions outside the vectorizer in some cases. My motivation for starting in LV is that it should be clearly profitable if it allows vectorization. For general loops without vectorization, it might not be profitable I think, e.g. for loops that only execute once, due to the conversion overhead.

And that's not including vectorization. It's kind of like a "high cost expansion" from SCEV (but to be fair as far as I understand we wouldn't always rewrite high cost exit values, even if it would mean deleting the loop (?)). Which is what made me wonder if we should be doing it generally, not just in the vectorizer. (Not that I have anything against this patch - it looks pretty sensible and doesn't complicate the reduction code any more than it already is. It seems to fit quite well).

I think the case you shared should be beneficial. I'd need to check where it would fit best and I think we'd need a few extra cost checks, but it shouldn't be too tricky.

I think there will still be cases in which it may not be profitable to perform the transformation on its own, so it might be good to also support this in LV, especially as we should be able to support it quite naturally.

As related work in the same area, and possibly an alternative (though I haven't looked at your test cases closely), I want to mention two old reviews of mine.

https://reviews.llvm.org/D68844 teaches SCEV to compute trip counts for simple floating point IVs
https://reviews.llvm.org/D68954 uses the above in IndVars to canonicalize to integers when possible

In D111077#3053926, @reames wrote:

As related work in the same area, and possibly an alternative (though I haven't looked at your test cases closely), I want to mention two old reviews of mine.

https://reviews.llvm.org/D68844 teaches SCEV to compute trip counts for simple floating point IVs
https://reviews.llvm.org/D68954 uses the above in IndVars to canonicalize to integers when possible

Thanks for sharing those patches. After taking a look, it seems a notable difference is that the current patch supports loops with variable trip counts due to clamping to the max value.

I've not thought too much about how this potentially could fit into SCEV, but a similar approach may be feasible there as well.

Rebase and add tests where the IV may overflow (signed/unsigned).

Harbormaster completed remote builds in B134773: Diff 387996. Nov 17 2021, 10:56 AM

Like I said, I do think a lot of the more common cases of this kind of code would be better served by deleting the loop and transforming to straight-line code. Vectorization will become more useful when there is extra stuff going on in the loop. It might be worth looking into the loop-deletion case too, at some point.

llvm/lib/Analysis/IVDescriptors.cpp
Line 236: Should half be supported too?
Line 644: Is fsub supported? I don't see any tests or the code to handle them.
llvm/lib/Transforms/Vectorize/VPlan.cpp
Line 1342: Are all start values valid? What if it is not an integer? Or if it is already -0x1.0p24? Or not a multiple of the induction amount? (I'm not sure any of those will be wrong, they might be fine and would start to "saturate" at the same point. I would have to test it to see what would happen).
llvm/test/Transforms/LoopVectorize/AArch64/fadd-reduction-as-int.ll
Line 27: What prevents this add from overflowing?

Floating point reductions that start at an integer value and only get
incremented by another positive integer value can be performed on
integers

When the (integer) increments are invariant, as in several examples, such reductions should better be identified as (integer) inductions, with their final live-out value of saturate(init_value + trip_count[-1] * invariant_increment) pre-computed at the pre-header? Their in-loop values could then also be used, more easily (and efficiently) for trip counts known to avoid saturation. The general reduction infrastructure is needed when the increments are variant, as in the select'ing examples.
Why restrict to integer increments that are powers of 2?

llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp
Line 914: The `EnableStrictReductions`-part of the comment above should be moved below?

Revision Contents

Path                                                                   Size
llvm/include/llvm/Analysis/IVDescriptors.h                             30 lines
llvm/lib/Analysis/IVDescriptors.cpp                                    42 lines
llvm/lib/Transforms/Utils/LoopUtils.cpp                                1 line
llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp            19 lines
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp                        83 lines
llvm/lib/Transforms/Vectorize/VPlan.cpp                                13 lines
llvm/test/Transforms/LoopVectorize/AArch64/fadd-reduction-as-int.ll    291 lines

Diff 376935

llvm/include/llvm/Analysis/IVDescriptors.h

[...]
 class Loop;
 class PredicatedScalarEvolution;
 class ScalarEvolution;
 class SCEV;
 class DominatorTree;

 /// These are the kinds of recurrences that we support.
 enum class RecurKind {
   None, ///< Not a recurrence.
   Add, ///< Sum of integers.
   Mul, ///< Product of integers.
   Or, ///< Bitwise or logical OR of integers.
   And, ///< Bitwise or logical AND of integers.
   Xor, ///< Bitwise or logical XOR of integers.
   SMin, ///< Signed integer min implemented in terms of select(cmp()).
   SMax, ///< Signed integer max implemented in terms of select(cmp()).
   UMin, ///< Unisgned integer min implemented in terms of select(cmp()).
   UMax, ///< Unsigned integer max implemented in terms of select(cmp()).
   FAdd, ///< Sum of floats.
+  FAddAsInt, ///< Sum of floats that can be performed on integers followed by
+             ///< clamping to the largest representable integer.
   FMul, ///< Product of floats.
   FMin, ///< FP min implemented in terms of select(cmp()).
-  FMax ///< FP max implemented in terms of select(cmp()).
+  FMax, ///< FP max implemented in terms of select(cmp()).
 };

 /// The RecurrenceDescriptor is used to identify recurrences variables in a
 /// loop. Reduction is a special case of recurrence that has uses of the
 /// recurrence variable outside the loop. The method isReductionPHI identifies
 /// reductions that are basic recurrences.
 ///
 /// Basic recurrences are defined as the summation, product, OR, AND, XOR, min,
[...]

llvm/lib/Analysis/IVDescriptors.cpp

[...] bool RecurrenceDescriptor::AddReductionVar(PHINode *Phi, RecurKind Kind,
   // Reduction variables are only found in the loop header block.
   if (Phi->getParent() != TheLoop->getHeader())
     return false;

   // Obtain the reduction start value from the value that comes from the loop
   // preheader.
   Value *RdxStart = Phi->getIncomingValueForBlock(TheLoop->getLoopPreheader());

+  if (Kind == RecurKind::FAddAsInt) {
+    if (!RdxStart->getType()->isFloatTy() && !RdxStart->getType()->isDoubleTy())
+      return false;
+
+    auto *C = dyn_cast<ConstantFP>(RdxStart);
+    if (!C || !C->getValue().isInteger())
+      return false;
+  }
+
   // ExitInstruction is the single value which is used outside the loop.
   // We only allow for a single reduction value to be used outside the loop.
   // This includes users of the reduction, variables (which form a cycle
   // which ends in the phi node).
   Instruction *ExitInstruction = nullptr;
   // Indicates that we found a reduction operation in our scan.
   bool FoundReduxOp = false;

[...] bool RecurrenceDescriptor::AddReductionVar(PHINode *Phi, RecurKind Kind,
   }

   // We found a reduction var if we have reached the original phi node and we
   // only have a single instruction with out-of-loop users.

   // The ExitInstruction(Instruction which is allowed to have out-of-loop users)
   // is saved as part of the RecurrenceDescriptor.

+  if (Kind == RecurKind::FAddAsInt) {
+    RecurrenceType = IntegerType::get(RecurrenceType->getContext(),
+                                      RecurrenceType->getScalarSizeInBits());
+  }
   // Save the description of this reduction variable.
   RecurrenceDescriptor RD(RdxStart, ExitInstruction, Kind, FMF,
                           ReduxDesc.getExactFPMathInst(), RecurrenceType,
                           IsSigned, IsOrdered, CastInsts);
   RedDes = RD;

   return true;
 }
[...] if (!I1 || !I1->isBinaryOp())
     return InstDesc(false, I);

   Value *Op1, *Op2;
   if ((m_FAdd(m_Value(Op1), m_Value(Op2)).match(I1) ||
        m_FSub(m_Value(Op1), m_Value(Op2)).match(I1)) &&
       I1->isFast())
     return InstDesc(Kind == RecurKind::FAdd, SI);

+  if (m_FAdd(m_Value(Op1), m_Value(Op2)).match(I1))
+    return InstDesc(Kind == RecurKind::FAddAsInt, SI);
+
   if (m_FMul(m_Value(Op1), m_Value(Op2)).match(I1) && (I1->isFast()))
     return InstDesc(Kind == RecurKind::FMul, SI);

   return InstDesc(false, I);
 }

 RecurrenceDescriptor::InstDesc
 RecurrenceDescriptor::isRecurrenceInstr(Instruction *I, RecurKind Kind,
[...] RecurrenceDescriptor::isRecurrenceInstr(Instruction *I, RecurKind Kind,
   case Instruction::Or:
     return InstDesc(Kind == RecurKind::Or, I);
   case Instruction::Xor:
     return InstDesc(Kind == RecurKind::Xor, I);
   case Instruction::FDiv:
   case Instruction::FMul:
     return InstDesc(Kind == RecurKind::FMul, I,
                     I->hasAllowReassoc() ? nullptr : I);
   case Instruction::FSub:
   case Instruction::FAdd:
+    if (Kind == RecurKind::FAddAsInt) {
+      // FAdd-to-int conversion only supports a single fadd in the chain at the
+      // moment.
+      if (Prev.isRecurrence())
+        return InstDesc(false, nullptr);
+      if (auto *C = dyn_cast<ConstantFP>(I->getOperand(1))) {
+        auto F = C->getValue();
+        if (F == APFloat(F.getSemantics(), 1) ||
+            F == APFloat(F.getSemantics(), 2) ||
+            F == APFloat(F.getSemantics(), 4) ||
+            F == APFloat(F.getSemantics(), 8) ||
+            F == APFloat(F.getSemantics(), 16) ||
+            F == APFloat(F.getSemantics(), 32))
+          return InstDesc(true, I, I);
+      }
+    }
     return InstDesc(Kind == RecurKind::FAdd, I,
                     I->hasAllowReassoc() ? nullptr : I);
   case Instruction::Select:
-    if (Kind == RecurKind::FAdd || Kind == RecurKind::FMul)
+    if (Kind == RecurKind::FAdd || Kind == RecurKind::FAddAsInt ||
+        Kind == RecurKind::FMul)
       return isConditionalRdxPattern(Kind, I);
     LLVM_FALLTHROUGH;
   case Instruction::FCmp:
   case Instruction::ICmp:
   case Instruction::Call:
     if (isIntMinMaxRecurrenceKind(Kind) ||
         (((FuncFMF.noNaNs() && FuncFMF.noSignedZeros()) ||
           (isa<FPMathOperator>(I) && I->hasNoNaNs() &&
[...] bool RecurrenceDescriptor::isReductionPHI(PHINode *Phi, Loop *TheLoop,
   if (AddReductionVar(Phi, RecurKind::UMin, TheLoop, FMF, RedDes, DB, AC, DT)) {
     LLVM_DEBUG(dbgs() << "Found a UMIN reduction PHI." << *Phi << "\n");
     return true;
   }
   if (AddReductionVar(Phi, RecurKind::FMul, TheLoop, FMF, RedDes, DB, AC, DT)) {
     LLVM_DEBUG(dbgs() << "Found an FMult reduction PHI." << *Phi << "\n");
     return true;
   }
+  if (AddReductionVar(Phi, RecurKind::FAddAsInt, TheLoop, FMF, RedDes, DB, AC,
+                      DT)) {
+    LLVM_DEBUG(dbgs() << "Found an FAddAsInt reduction PHI." << *Phi << "\n");
+    return true;
+  }
   if (AddReductionVar(Phi, RecurKind::FAdd, TheLoop, FMF, RedDes, DB, AC, DT)) {
     LLVM_DEBUG(dbgs() << "Found an FAdd reduction PHI." << *Phi << "\n");
     return true;
   }
   if (AddReductionVar(Phi, RecurKind::FMax, TheLoop, FMF, RedDes, DB, AC, DT)) {
     LLVM_DEBUG(dbgs() << "Found a float MAX reduction PHI." << *Phi << "\n");
     return true;
   }
[...]

 /// This function returns the identity element (or neutral element) for
 /// the operation K.
 Constant *RecurrenceDescriptor::getRecurrenceIdentity(RecurKind K, Type *Tp,
                                                       FastMathFlags FMF) {
   switch (K) {
   case RecurKind::Xor:
   case RecurKind::Add:
+  case RecurKind::FAddAsInt:
   case RecurKind::Or:
     // Adding, Xoring, Oring zero to a number does not change it.
     return ConstantInt::get(Tp, 0);
   case RecurKind::Mul:
     // Multiplying a number by 1 does not change it.
     return ConstantInt::get(Tp, 1);
   case RecurKind::And:
     // AND-ing a number with an all-1 value does not change it.
[...] Constant *RecurrenceDescriptor::getRecurrenceIdentity(RecurKind K, Type *Tp,
   default:
     llvm_unreachable("Unknown recurrence kind");
   }
 }

 unsigned RecurrenceDescriptor::getOpcode(RecurKind Kind) {
   switch (Kind) {
   case RecurKind::Add:
+  case RecurKind::FAddAsInt:
     return Instruction::Add;
   case RecurKind::Mul:
     return Instruction::Mul;
   case RecurKind::Or:
     return Instruction::Or;
   case RecurKind::And:
     return Instruction::And;
   case RecurKind::Xor:
[...]

llvm/lib/Transforms/Utils/LoopUtils.cpp

[...]

 Value *llvm::createSimpleTargetReduction(IRBuilderBase &Builder,
                                          const TargetTransformInfo *TTI,
                                          Value *Src, RecurKind RdxKind,
                                          ArrayRef<Value *> RedOps) {
   auto *SrcVecEltTy = cast<VectorType>(Src->getType())->getElementType();
   switch (RdxKind) {
   case RecurKind::Add:
+  case RecurKind::FAddAsInt:
     return Builder.CreateAddReduce(Src);
   case RecurKind::Mul:
     return Builder.CreateMulReduce(Src);
   case RecurKind::And:
     return Builder.CreateAndReduce(Src);
   case RecurKind::Or:
     return Builder.CreateOrReduce(Src);
   case RecurKind::Xor:
[...]

llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp

[...]
 bool LoopVectorizationLegality::canVectorizeFPMath(
     bool EnableStrictReductions) {

   // First check if there is any ExactFP math or if we allow reassociations
   if (!Requirements->getExactFPInst() || Hints->allowReordering())
     return true;

   // If the above is false, we have ExactFPMath & do not allow reordering.
   // If the EnableStrictReductions flag is set, first check if we have any
   // Exact FP induction vars, which we cannot vectorize.
-  if (!EnableStrictReductions ||
-      any_of(getInductionVars(), [&](auto &Induction) -> bool {
+  if (any_of(getInductionVars(), [&](auto &Induction) -> bool {
         InductionDescriptor IndDesc = Induction.second;
         return IndDesc.getExactFPMathInst();
       }))
     return false;

+  if (all_of(getReductionVars(), [&](auto &Reduction) -> bool {
+        const RecurrenceDescriptor &RdxDesc = Reduction.second;
+        return !RdxDesc.hasExactFPMath() ||
+               RdxDesc.getRecurrenceKind() == RecurKind::FAddAsInt;
+      }))
+    return true;
+
   // We can now only vectorize if all reductions with Exact FP math also
   // have the isOrdered flag set, which indicates that we can move the
   // reduction operations in-loop.
-  return (all_of(getReductionVars(), [&](auto &Reduction) -> bool {
+  return EnableStrictReductions &&
+         all_of(getReductionVars(), [&](auto &Reduction) -> bool {
     const RecurrenceDescriptor &RdxDesc = Reduction.second;
     return !RdxDesc.hasExactFPMath() || RdxDesc.isOrdered();
-  }));
+  });
 }

 bool LoopVectorizationLegality::isInductionPhi(const Value *V) {
   Value *In0 = const_cast<Value *>(V);
   PHINode *PN = dyn_cast_or_null<PHINode>(In0);
   if (!PN)
     return false;

[...]

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp

This file is larger than 256 KB, so syntax highlighting is disabled by default.

Show First 20 Lines • Show All 4,179 Lines • ▼ Show 20 Lines	void InnerLoopVectorizer::fixCrossIterationPHIs(VPTransformState &State) {
for (VPRecipeBase &R : Header->phis()) {		for (VPRecipeBase &R : Header->phis()) {
if (auto *ReductionPhi = dyn_cast<VPReductionPHIRecipe>(&R))		if (auto *ReductionPhi = dyn_cast<VPReductionPHIRecipe>(&R))
fixReduction(ReductionPhi, State);		fixReduction(ReductionPhi, State);
else if (auto *FOR = dyn_cast<VPFirstOrderRecurrencePHIRecipe>(&R))		else if (auto *FOR = dyn_cast<VPFirstOrderRecurrencePHIRecipe>(&R))
fixFirstOrderRecurrence(FOR, State);		fixFirstOrderRecurrence(FOR, State);
}		}
}		}

		static VPRecipeBase getReductionIncrementRecipe(VPReductionPHIRecipe PhiR) {
		SmallVector<VPUser *, 2> Users(PhiR->users());
		assert((Users.size() == 1 \|\| Users.size() == 2) &&
		"reduction phi must have either 1 or 2 users");
		auto *R = cast<VPRecipeBase>(Users[0]);
		if (isa<VPWidenSelectRecipe>(R))
		return cast<VPRecipeBase>(Users[1]);
		return R;
		}

void InnerLoopVectorizer::fixFirstOrderRecurrence(VPWidenPHIRecipe *PhiR,		void InnerLoopVectorizer::fixFirstOrderRecurrence(VPWidenPHIRecipe *PhiR,
VPTransformState &State) {		VPTransformState &State) {
// This is the second phase of vectorizing first-order recurrences. An		// This is the second phase of vectorizing first-order recurrences. An
// overview of the transformation is described below. Suppose we have the		// overview of the transformation is described below. Suppose we have the
// following loop.		// following loop.
//		//
// for (int i = 0; i < n; ++i)		// for (int i = 0; i < n; ++i)
// b[i] = a[i] - a[i - 1];		// b[i] = a[i] - a[i - 1];
▲ Show 20 Lines • Show All 224 Lines • ▼ Show 20 Lines	else {
}		}
}		}

// Create the reduction after the loop. Note that inloop reductions create the		// Create the reduction after the loop. Note that inloop reductions create the
// target reduction in the loop using a Reduction recipe.		// target reduction in the loop using a Reduction recipe.
if (VF.isVector() && !PhiR->isInLoop()) {		if (VF.isVector() && !PhiR->isInLoop()) {
ReducedPartRdx =		ReducedPartRdx =
createTargetReduction(Builder, TTI, RdxDesc, ReducedPartRdx);		createTargetReduction(Builder, TTI, RdxDesc, ReducedPartRdx);
		if (RK == RecurKind::FAddAsInt) {
		// Convert back the integer result to a floating point number and clamp it
		// to the correct maximum value.

		VPRecipeBase *R = getReductionIncrementRecipe(PhiR);
		// Return the maximum integral value with 1.0 mantissa for float or
		// double, depending on the bitwidth.
		auto GetMaxVal = [](Type *T) -> uint64_t {
		// Reduction on float type.
		if (T->getScalarSizeInBits() == 32)
		return 16777216; // 2^24 (0x1.0p24)

		assert(T->getScalarSizeInBits() == 64);
		return 9007199254740992; // 2^53 (0x1.0p53)
		};
		auto *C = R->getOperand(1)->getLiveInIRValue();
		uint64_t Max =
		GetMaxVal(C->getType()) * cast<ConstantInt>(C)->getSExtValue();
		ReducedPartRdx = Builder.CreateSIToFP(ReducedPartRdx, PhiTy);
		ReducedPartRdx =
		Builder.CreateMinNum(ReducedPartRdx, ConstantFP::get(PhiTy, Max));
		} else {
// If the reduction can be performed in a smaller type, we need to extend		// If the reduction can be performed in a smaller type, we need to extend
// the reduction to the wider type before we branch to the original loop.		// the reduction to the wider type before we branch to the original loop.
if (PhiTy != RdxDesc.getRecurrenceType())		if (PhiTy != RdxDesc.getRecurrenceType())
ReducedPartRdx = RdxDesc.isSigned()		ReducedPartRdx = RdxDesc.isSigned()
? Builder.CreateSExt(ReducedPartRdx, PhiTy)		? Builder.CreateSExt(ReducedPartRdx, PhiTy)
: Builder.CreateZExt(ReducedPartRdx, PhiTy);		: Builder.CreateZExt(ReducedPartRdx, PhiTy);
}		}
		}

// Create a phi node that merges control-flow from the backedge-taken check		// Create a phi node that merges control-flow from the backedge-taken check
// block and the middle block.		// block and the middle block.
PHINode *BCBlockPhi = PHINode::Create(PhiTy, 2, "bc.merge.rdx",		PHINode *BCBlockPhi = PHINode::Create(PhiTy, 2, "bc.merge.rdx",
LoopScalarPreHeader->getTerminator());		LoopScalarPreHeader->getTerminator());
for (unsigned I = 0, E = LoopBypassBlocks.size(); I != E; ++I)		for (unsigned I = 0, E = LoopBypassBlocks.size(); I != E; ++I)
BCBlockPhi->addIncoming(ReductionStartValue, LoopBypassBlocks[I]);		BCBlockPhi->addIncoming(ReductionStartValue, LoopBypassBlocks[I]);
BCBlockPhi->addIncoming(ReducedPartRdx, LoopMiddleBlock);		BCBlockPhi->addIncoming(ReducedPartRdx, LoopMiddleBlock);
[5,170 lines hidden]	for (VPRecipeBase &R : Plan->getEntry()->getEntryBasicBlock()->phis()) {
continue;		continue;
Builder.setInsertPoint(LatchVPBB);		Builder.setInsertPoint(LatchVPBB);
VPValue *Cond =		VPValue *Cond =
RecipeBuilder.createBlockInMask(OrigLoop->getHeader(), Plan);		RecipeBuilder.createBlockInMask(OrigLoop->getHeader(), Plan);
VPValue *Red = PhiR->getBackedgeValue();		VPValue *Red = PhiR->getBackedgeValue();
Builder.createNaryOp(Instruction::Select, {Cond, Red, PhiR});		Builder.createNaryOp(Instruction::Select, {Cond, Red, PhiR});
}		}
}		}

		SmallVector<VPRecipeBase *> ToRemove;
		for (VPRecipeBase &P : Plan->getEntry()->getEntryBasicBlock()->phis()) {
		auto *RedPhi = dyn_cast<VPReductionPHIRecipe>(&P);
		if (!RedPhi || RedPhi->getRecurrenceDescriptor().getRecurrenceKind() !=
		RecurKind::FAddAsInt)
		continue;
		// Convert floating point reduction increment to an integer addition using a
		// VPInstruction.
		VPRecipeBase *R = getReductionIncrementRecipe(RedPhi);
		SmallVector<VPValue *> Ops(R->op_begin(), R->op_end());
		Value *In = Ops[1]->getLiveInIRValue();
		auto GetInt = [](Value *V) {
		auto *C = cast<ConstantFP>(V);
		auto F = C->getValue();
		if (F == APFloat(F.getSemantics(), 1))
		return 1;
		if (F == APFloat(F.getSemantics(), 2))
		return 2;
		if (F == APFloat(F.getSemantics(), 4))
		return 4;
		if (F == APFloat(F.getSemantics(), 8))
		return 8;
		if (F == APFloat(F.getSemantics(), 16))
		return 16;
		if (F == APFloat(F.getSemantics(), 32))
		return 32;
		llvm_unreachable("unexpected float constant");
		};
		Ops[1] = Plan->getOrAddVPValue(ConstantInt::get(
		RedPhi->getRecurrenceDescriptor().getRecurrenceType(), GetInt(In)));
		auto *IntAdd = new VPInstruction(Instruction::Add, Ops);
		R->getVPSingleValue()->replaceAllUsesWith(IntAdd);
		IntAdd->insertAfter(R);
		ToRemove.push_back(R);
		}
		for (VPRecipeBase *R : ToRemove)
		R->eraseFromParent();
}		}

#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)		#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
void VPInterleaveRecipe::print(raw_ostream &O, const Twine &Indent,		void VPInterleaveRecipe::print(raw_ostream &O, const Twine &Indent,
VPSlotTracker &SlotTracker) const {		VPSlotTracker &SlotTracker) const {
O << Indent << "INTERLEAVE-GROUP with factor " << IG->getFactor() << " at ";		O << Indent << "INTERLEAVE-GROUP with factor " << IG->getFactor() << " at ";
IG->getInsertPos()->printAsOperand(O, false);		IG->getInsertPos()->printAsOperand(O, false);
O << ", ";		O << ", ";
[967 lines hidden]
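
The clamp emitted in the middle block relies on a float accumulator that is incremented by a small power-of-two constant stopping at 2^24 times the step, because the next sum is a rounding tie that resolves back to the same value. The following is a minimal standalone sketch of that reasoning, not code from the patch: `faddReference` and `faddAsInt` are hypothetical names, the integer version assumes the sum fits in `int32_t`, and `std::fmin` stands in for `llvm.minnum`.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Reference semantics: the original scalar loop, accumulating in float.
// `volatile` forces every partial sum to be rounded to float, so the result
// does not depend on the compiler keeping extra precision.
float faddReference(float Start, float Step, uint64_t Cnt) {
  volatile float Red = Start;
  for (uint64_t I = 0; I < Cnt; ++I)
    Red = Red + Step;
  return Red;
}

// Sketch of the transformed computation: accumulate in i32, convert the
// result back with a signed int->float conversion, then clamp.
// Assumption: Start + Cnt * Step fits in int32_t (no overflow handling).
float faddAsInt(int32_t Start, int32_t Step, uint64_t Cnt) {
  int32_t Red = Start;
  for (uint64_t I = 0; I < Cnt; ++I)
    Red += Step;
  const float Max = 16777216.0f * static_cast<float>(Step); // 2^24 * step
  return std::fmin(static_cast<float>(Red), Max);
}
```

For trip counts just past 2^24 the two functions agree for steps 1 and 2 and for the start value -3 used in the tests; the sketch says nothing about trip counts that overflow the integer accumulator.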

llvm/lib/Transforms/Vectorize/VPlan.cpp

[1,310 lines hidden]	void VPReductionPHIRecipe::execute(VPTransformState &State) {
PHINode *PN = cast<PHINode>(getUnderlyingValue());		PHINode *PN = cast<PHINode>(getUnderlyingValue());
auto &Builder = State.Builder;		auto &Builder = State.Builder;

// In order to support recurrences we need to be able to vectorize Phi nodes.		// In order to support recurrences we need to be able to vectorize Phi nodes.
// Phi nodes have cycles, so we need to vectorize them in two stages. This is		// Phi nodes have cycles, so we need to vectorize them in two stages. This is
// stage #1: We create a new vector PHI node with no incoming edges. We'll use		// stage #1: We create a new vector PHI node with no incoming edges. We'll use
// this value when we vectorize all of the instructions that use the PHI.		// this value when we vectorize all of the instructions that use the PHI.
bool ScalarPHI = State.VF.isScalar() || IsInLoop;		bool ScalarPHI = State.VF.isScalar() || IsInLoop;
Type *VecTy =		RecurKind RK = RdxDesc.getRecurrenceKind();
ScalarPHI ? PN->getType() : VectorType::get(PN->getType(), State.VF);		Type *BaseTy =
		RK == RecurKind::FAddAsInt ? RdxDesc.getRecurrenceType() : PN->getType();
		Type *VecTy = ScalarPHI ? BaseTy : VectorType::get(BaseTy, State.VF);

BasicBlock *HeaderBB = State.CFG.PrevBB;		BasicBlock *HeaderBB = State.CFG.PrevBB;
assert(State.LI->getLoopFor(HeaderBB)->getHeader() == HeaderBB &&		assert(State.LI->getLoopFor(HeaderBB)->getHeader() == HeaderBB &&
"recipe must be in the vector loop header");		"recipe must be in the vector loop header");
unsigned LastPartForNewPhi = isOrdered() ? 1 : State.UF;		unsigned LastPartForNewPhi = isOrdered() ? 1 : State.UF;
for (unsigned Part = 0; Part < LastPartForNewPhi; ++Part) {		for (unsigned Part = 0; Part < LastPartForNewPhi; ++Part) {
Value *EntryPart =		Value *EntryPart =
PHINode::Create(VecTy, 2, "vec.phi", &*HeaderBB->getFirstInsertionPt());		PHINode::Create(VecTy, 2, "vec.phi", &*HeaderBB->getFirstInsertionPt());
State.set(this, EntryPart, Part);		State.set(this, EntryPart, Part);
}		}

// Reductions do not have to start at zero. They can start with		// Reductions do not have to start at zero. They can start with
// any loop invariant values.		// any loop invariant values.
VPValue *StartVPV = getStartValue();		VPValue *StartVPV = getStartValue();
Value *StartV = StartVPV->getLiveInIRValue();		Value *StartV = StartVPV->getLiveInIRValue();

		if (RK == RecurKind::FAddAsInt) {
		IRBuilderBase::InsertPointGuard IPBuilder(Builder);
		Builder.SetInsertPoint(State.CFG.VectorPreHeader->getTerminator());
		StartV = Builder.CreateFPToSI(StartV, RdxDesc.getRecurrenceType());
		dmgreen (inline comment; unsubmitted, not done): Are all start values valid? What if it is not an integer? Or if it is already -0x1.0p24? Or not a multiple of the induction amount? (I'm not sure any of those will be wrong, they might be fine and would start to "saturate" at the same point. I would have to test it to see what would happen).
		}

Value *Iden = nullptr;		Value *Iden = nullptr;
RecurKind RK = RdxDesc.getRecurrenceKind();
if (RecurrenceDescriptor::isMinMaxRecurrenceKind(RK)) {		if (RecurrenceDescriptor::isMinMaxRecurrenceKind(RK)) {
// MinMax reductions have the start value as their identity.		// MinMax reductions have the start value as their identity.
if (ScalarPHI) {		if (ScalarPHI) {
Iden = StartV;		Iden = StartV;
} else {		} else {
IRBuilderBase::InsertPointGuard IPBuilder(Builder);		IRBuilderBase::InsertPointGuard IPBuilder(Builder);
Builder.SetInsertPoint(State.CFG.VectorPreHeader->getTerminator());		Builder.SetInsertPoint(State.CFG.VectorPreHeader->getTerminator());
StartV = Iden =		StartV = Iden =
[147 lines hidden]
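
The inline question on the `CreateFPToSI` of the start value (are all start values valid?) can be probed outside the compiler. The sketch below is an assumption-laden model, not the patch: `scalarLoop`, `intLoop` and `startValueAgrees` are hypothetical names, `static_cast<int32_t>` stands in for `fptosi`, `std::fmin` for `llvm.minnum`, and the integer sum is assumed to fit in `int32_t`. Whether the legality check in the patch already rejects the failing start values is not visible in this part of the diff.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Original scalar loop; `volatile` forces rounding to float on every step.
float scalarLoop(float Start, float Step, uint64_t Cnt) {
  volatile float Red = Start;
  for (uint64_t I = 0; I < Cnt; ++I)
    Red = Red + Step;
  return Red;
}

// Model of the converted reduction: truncate the start value to i32,
// add in i32, convert back, clamp at 2^24 * step.
// Assumption: Start is within int32_t range and the sum does not overflow.
float intLoop(float Start, int32_t Step, uint64_t Cnt) {
  int32_t Red = static_cast<int32_t>(Start); // truncates toward zero
  for (uint64_t I = 0; I < Cnt; ++I)
    Red += Step;
  const float Max = 16777216.0f * static_cast<float>(Step);
  return std::fmin(static_cast<float>(Red), Max);
}

// True if both computations produce the same float for this input.
bool startValueAgrees(float Start, int32_t Step, uint64_t Cnt) {
  return scalarLoop(Start, static_cast<float>(Step), Cnt) ==
         intLoop(Start, Step, Cnt);
}
```

Under this model, the start values -3.0 and -0x1.0p24 agree with the scalar loop for step 1. A non-integer start such as 1.25 does not, and neither does -0x1.0p25, where the float loop cannot make progress because adding 1.0 rounds back to the same value. A start of 1.0 with step 2 (not a multiple of the step) also diverges once the sum passes 2^24, since the float loop and the final int-to-float conversion round at different points.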

llvm/test/Transforms/LoopVectorize/AArch64/fadd-reduction-as-int.ll

	; NOTE: Assertions have been autogenerated by utils/update_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
	; RUN: opt -loop-vectorize -mtriple arm64-apple-darwin -S -o - %s | FileCheck %s			; RUN: opt -loop-vectorize -mtriple arm64-apple-darwin -S -o - %s | FileCheck %s


	define float @test_fadd_to_int_add_1(i64 %cnt) {			define float @test_fadd_to_int_add_1(i64 %cnt) {
	; CHECK-LABEL: @test_fadd_to_int_add_1(			; CHECK-LABEL: @test_fadd_to_int_add_1(
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
	; CHECK-NEXT: [[MIN_ITERS_CHECK:%.*]] = icmp ult i64 [[CNT:%.*]], 2			; CHECK-NEXT: [[MIN_ITERS_CHECK:%.*]] = icmp ult i64 [[CNT:%.*]], 8
	; CHECK-NEXT: br i1 [[MIN_ITERS_CHECK]], label [[SCALAR_PH:%.*]], label [[VECTOR_PH:%.*]]			; CHECK-NEXT: br i1 [[MIN_ITERS_CHECK]], label [[SCALAR_PH:%.*]], label [[VECTOR_PH:%.*]]
	; CHECK: vector.ph:			; CHECK: vector.ph:
	; CHECK-NEXT: [[N_MOD_VF:%.*]] = urem i64 [[CNT]], 2			; CHECK-NEXT: [[N_MOD_VF:%.*]] = urem i64 [[CNT]], 8
	; CHECK-NEXT: [[N_VEC:%.*]] = sub i64 [[CNT]], [[N_MOD_VF]]			; CHECK-NEXT: [[N_VEC:%.*]] = sub i64 [[CNT]], [[N_MOD_VF]]
	; CHECK-NEXT: br label [[VECTOR_BODY:%.*]]			; CHECK-NEXT: br label [[VECTOR_BODY:%.*]]
	; CHECK: vector.body:			; CHECK: vector.body:
	; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]			; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
	; CHECK-NEXT: [[VEC_PHI:%.*]] = phi float [ 0.000000e+00, [[VECTOR_PH]] ], [ [[TMP1:%.*]], [[VECTOR_BODY]] ]			; CHECK-NEXT: [[VEC_PHI:%.*]] = phi <4 x i32> [ zeroinitializer, [[VECTOR_PH]] ], [ [[TMP2:%.*]], [[VECTOR_BODY]] ]
	; CHECK-NEXT: [[INDUCTION:%.*]] = add i64 [[INDEX]], 0			; CHECK-NEXT: [[VEC_PHI1:%.*]] = phi <4 x i32> [ zeroinitializer, [[VECTOR_PH]] ], [ [[TMP3:%.*]], [[VECTOR_BODY]] ]
	; CHECK-NEXT: [[INDUCTION1:%.*]] = add i64 [[INDEX]], 1			; CHECK-NEXT: [[TMP0:%.*]] = add i64 [[INDEX]], 0
	; CHECK-NEXT: [[TMP0:%.*]] = fadd float [[VEC_PHI]], 1.000000e+00			; CHECK-NEXT: [[TMP1:%.*]] = add i64 [[INDEX]], 4
	; CHECK-NEXT: [[TMP1]] = fadd float [[TMP0]], 1.000000e+00			; CHECK-NEXT: [[TMP2]] = add <4 x i32> [[VEC_PHI]], <i32 1, i32 1, i32 1, i32 1>
	; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 2			; CHECK-NEXT: [[TMP3]] = add <4 x i32> [[VEC_PHI1]], <i32 1, i32 1, i32 1, i32 1>
	; CHECK-NEXT: [[TMP2:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]			; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 8
	; CHECK-NEXT: br i1 [[TMP2]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP0:![0-9]+]]			; CHECK-NEXT: [[TMP4:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
				; CHECK-NEXT: br i1 [[TMP4]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP0:![0-9]+]]
	; CHECK: middle.block:			; CHECK: middle.block:
				; CHECK-NEXT: [[BIN_RDX:%.*]] = add <4 x i32> [[TMP3]], [[TMP2]]
				; CHECK-NEXT: [[TMP5:%.*]] = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> [[BIN_RDX]])
				dmgreen (inline comment; unsubmitted, not done): What prevents this add from overflowing?
				; CHECK-NEXT: [[TMP6:%.*]] = sitofp i32 [[TMP5]] to float
				; CHECK-NEXT: [[TMP7:%.*]] = call float @llvm.minnum.f32(float [[TMP6]], float 0x4170000000000000)
	; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[CNT]], [[N_VEC]]			; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[CNT]], [[N_VEC]]
	; CHECK-NEXT: br i1 [[CMP_N]], label [[EXIT:%.*]], label [[SCALAR_PH]]			; CHECK-NEXT: br i1 [[CMP_N]], label [[EXIT:%.*]], label [[SCALAR_PH]]
	; CHECK: scalar.ph:			; CHECK: scalar.ph:
	; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ]			; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ]
	; CHECK-NEXT: [[BC_MERGE_RDX:%.*]] = phi float [ 0.000000e+00, [[ENTRY]] ], [ [[TMP1]], [[MIDDLE_BLOCK]] ]			; CHECK-NEXT: [[BC_MERGE_RDX:%.*]] = phi float [ 0.000000e+00, [[ENTRY]] ], [ [[TMP7]], [[MIDDLE_BLOCK]] ]
	; CHECK-NEXT: br label [[LOOP:%.*]]			; CHECK-NEXT: br label [[LOOP:%.*]]
	; CHECK: loop:			; CHECK: loop:
	; CHECK-NEXT: [[IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]			; CHECK-NEXT: [[IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
	; CHECK-NEXT: [[RED:%.*]] = phi float [ [[BC_MERGE_RDX]], [[SCALAR_PH]] ], [ [[RED_NEXT:%.*]], [[LOOP]] ]			; CHECK-NEXT: [[RED:%.*]] = phi float [ [[BC_MERGE_RDX]], [[SCALAR_PH]] ], [ [[RED_NEXT:%.*]], [[LOOP]] ]
	; CHECK-NEXT: [[RED_NEXT]] = fadd float [[RED]], 1.000000e+00			; CHECK-NEXT: [[RED_NEXT]] = fadd float [[RED]], 1.000000e+00
	; CHECK-NEXT: [[IV_NEXT]] = add nuw nsw i64 [[IV]], 1			; CHECK-NEXT: [[IV_NEXT]] = add nuw nsw i64 [[IV]], 1
	; CHECK-NEXT: [[EXITCOND_NOT:%.*]] = icmp eq i64 [[IV_NEXT]], [[CNT]]			; CHECK-NEXT: [[EXITCOND_NOT:%.*]] = icmp eq i64 [[IV_NEXT]], [[CNT]]
	; CHECK-NEXT: br i1 [[EXITCOND_NOT]], label [[EXIT]], label [[LOOP]], !llvm.loop [[LOOP2:![0-9]+]]			; CHECK-NEXT: br i1 [[EXITCOND_NOT]], label [[EXIT]], label [[LOOP]], !llvm.loop [[LOOP2:![0-9]+]]
	; CHECK: exit:			; CHECK: exit:
	; CHECK-NEXT: [[RED_NEXT_LCSSA:%.*]] = phi float [ [[RED_NEXT]], [[LOOP]] ], [ [[TMP1]], [[MIDDLE_BLOCK]] ]			; CHECK-NEXT: [[RED_NEXT_LCSSA:%.*]] = phi float [ [[RED_NEXT]], [[LOOP]] ], [ [[TMP7]], [[MIDDLE_BLOCK]] ]
	; CHECK-NEXT: ret float [[RED_NEXT_LCSSA]]			; CHECK-NEXT: ret float [[RED_NEXT_LCSSA]]
	;			;
	entry:			entry:
	br label %loop			br label %loop

	loop:			loop:
	%iv = phi i64 [ 0, %entry ], [ %iv.next, %loop ]			%iv = phi i64 [ 0, %entry ], [ %iv.next, %loop ]
	%red = phi float [ 0.000000e+00, %entry ], [ %red.next, %loop ]			%red = phi float [ 0.000000e+00, %entry ], [ %red.next, %loop ]
	[19 lines hidden]
	; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]			; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
	; CHECK-NEXT: [[VEC_PHI:%.*]] = phi float [ 0.000000e+00, [[VECTOR_PH]] ], [ [[TMP1:%.*]], [[VECTOR_BODY]] ]			; CHECK-NEXT: [[VEC_PHI:%.*]] = phi float [ 0.000000e+00, [[VECTOR_PH]] ], [ [[TMP1:%.*]], [[VECTOR_BODY]] ]
	; CHECK-NEXT: [[INDUCTION:%.*]] = add i64 [[INDEX]], 0			; CHECK-NEXT: [[INDUCTION:%.*]] = add i64 [[INDEX]], 0
	; CHECK-NEXT: [[INDUCTION1:%.*]] = add i64 [[INDEX]], 1			; CHECK-NEXT: [[INDUCTION1:%.*]] = add i64 [[INDEX]], 1
	; CHECK-NEXT: [[TMP0:%.*]] = fadd float [[VEC_PHI]], 1.500000e+00			; CHECK-NEXT: [[TMP0:%.*]] = fadd float [[VEC_PHI]], 1.500000e+00
	; CHECK-NEXT: [[TMP1]] = fadd float [[TMP0]], 1.500000e+00			; CHECK-NEXT: [[TMP1]] = fadd float [[TMP0]], 1.500000e+00
	; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 2			; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 2
	; CHECK-NEXT: [[TMP2:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]			; CHECK-NEXT: [[TMP2:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
	; CHECK-NEXT: br i1 [[TMP2]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP3:![0-9]+]]			; CHECK-NEXT: br i1 [[TMP2]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP4:![0-9]+]]
	; CHECK: middle.block:			; CHECK: middle.block:
	; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[CNT]], [[N_VEC]]			; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[CNT]], [[N_VEC]]
	; CHECK-NEXT: br i1 [[CMP_N]], label [[EXIT:%.*]], label [[SCALAR_PH]]			; CHECK-NEXT: br i1 [[CMP_N]], label [[EXIT:%.*]], label [[SCALAR_PH]]
	; CHECK: scalar.ph:			; CHECK: scalar.ph:
	; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ]			; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ]
	; CHECK-NEXT: [[BC_MERGE_RDX:%.*]] = phi float [ 0.000000e+00, [[ENTRY]] ], [ [[TMP1]], [[MIDDLE_BLOCK]] ]			; CHECK-NEXT: [[BC_MERGE_RDX:%.*]] = phi float [ 0.000000e+00, [[ENTRY]] ], [ [[TMP1]], [[MIDDLE_BLOCK]] ]
	; CHECK-NEXT: br label [[LOOP:%.*]]			; CHECK-NEXT: br label [[LOOP:%.*]]
	; CHECK: loop:			; CHECK: loop:
	; CHECK-NEXT: [[IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]			; CHECK-NEXT: [[IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
	; CHECK-NEXT: [[RED:%.*]] = phi float [ [[BC_MERGE_RDX]], [[SCALAR_PH]] ], [ [[RED_NEXT:%.*]], [[LOOP]] ]			; CHECK-NEXT: [[RED:%.*]] = phi float [ [[BC_MERGE_RDX]], [[SCALAR_PH]] ], [ [[RED_NEXT:%.*]], [[LOOP]] ]
	; CHECK-NEXT: [[RED_NEXT]] = fadd float [[RED]], 1.500000e+00			; CHECK-NEXT: [[RED_NEXT]] = fadd float [[RED]], 1.500000e+00
	; CHECK-NEXT: [[IV_NEXT]] = add nuw nsw i64 [[IV]], 1			; CHECK-NEXT: [[IV_NEXT]] = add nuw nsw i64 [[IV]], 1
	; CHECK-NEXT: [[EXITCOND_NOT:%.*]] = icmp eq i64 [[IV_NEXT]], [[CNT]]			; CHECK-NEXT: [[EXITCOND_NOT:%.*]] = icmp eq i64 [[IV_NEXT]], [[CNT]]
	; CHECK-NEXT: br i1 [[EXITCOND_NOT]], label [[EXIT]], label [[LOOP]], !llvm.loop [[LOOP4:![0-9]+]]			; CHECK-NEXT: br i1 [[EXITCOND_NOT]], label [[EXIT]], label [[LOOP]], !llvm.loop [[LOOP5:![0-9]+]]
	; CHECK: exit:			; CHECK: exit:
	; CHECK-NEXT: [[RED_NEXT_LCSSA:%.*]] = phi float [ [[RED_NEXT]], [[LOOP]] ], [ [[TMP1]], [[MIDDLE_BLOCK]] ]			; CHECK-NEXT: [[RED_NEXT_LCSSA:%.*]] = phi float [ [[RED_NEXT]], [[LOOP]] ], [ [[TMP1]], [[MIDDLE_BLOCK]] ]
	; CHECK-NEXT: ret float [[RED_NEXT_LCSSA]]			; CHECK-NEXT: ret float [[RED_NEXT_LCSSA]]
	;			;
	entry:			entry:
	br label %loop			br label %loop

	loop:			loop:
	%iv = phi i64 [ 0, %entry ], [ %iv.next, %loop ]			%iv = phi i64 [ 0, %entry ], [ %iv.next, %loop ]
	%red = phi float [ 0.000000e+00, %entry ], [ %red.next, %loop ]			%red = phi float [ 0.000000e+00, %entry ], [ %red.next, %loop ]
	%red.next = fadd float %red, 1.500000e+00			%red.next = fadd float %red, 1.500000e+00
	%iv.next = add nuw nsw i64 %iv, 1			%iv.next = add nuw nsw i64 %iv, 1
	%exitcond.not = icmp eq i64 %iv.next, %cnt			%exitcond.not = icmp eq i64 %iv.next, %cnt
	br i1 %exitcond.not, label %exit, label %loop			br i1 %exitcond.not, label %exit, label %loop

	exit:			exit:
	ret float %red.next			ret float %red.next
	}			}
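
The inline question on the `llvm.vector.reduce.add` check in `test_fadd_to_int_add_1` (what prevents the add from overflowing?) can be illustrated with a closed-form model. This is a sketch under stated assumptions, not a statement about what the patch guarantees: the start value is 0, the step is a power of two, the trip count is a multiple of the vector width so no scalar remainder runs, and all i32 adds wrap modulo 2^32. The function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Model of the vector path: the lane-wise adds and the final horizontal add
// are all modulo 2^32, so the reduced value is (Cnt * Step) mod 2^32,
// reinterpreted as signed before the sitofp + minnum sequence.
float intReductionModel(uint64_t Cnt, uint32_t Step) {
  uint32_t Wrapped = static_cast<uint32_t>(Cnt * Step); // modulo 2^32
  int32_t Signed;
  std::memcpy(&Signed, &Wrapped, sizeof(Signed)); // two's complement view
  const float Max = 16777216.0f * static_cast<float>(Step); // 2^24 * step
  return std::fmin(static_cast<float>(Signed), Max);
}

// Model of the scalar float loop for start 0 and a power-of-two step:
// exact up to 2^24 * Step, then stuck at that value.
// Assumption: Cnt * Step does not overflow uint64_t.
float scalarLoopModel(uint64_t Cnt, uint32_t Step) {
  const uint64_t Limit = (uint64_t{1} << 24) * Step;
  return static_cast<float>(std::min(Cnt * Step, Limit));
}
```

With an `i64` trip count of 2^31 and step 1, the scalar model saturates at 2^24, while the integer model wraps to -2^31 and the clamp (a minimum only) leaves it negative. At 2^32 the integer model yields 0. Under these assumptions the clamp alone does not cover large trip counts.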

	define float @test_fadd_to_int_add_1_start_negative(i64 %cnt) {			define float @test_fadd_to_int_add_1_start_negative(i64 %cnt) {
	; CHECK-LABEL: @test_fadd_to_int_add_1_start_negative(			; CHECK-LABEL: @test_fadd_to_int_add_1_start_negative(
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
	; CHECK-NEXT: [[MIN_ITERS_CHECK:%.*]] = icmp ult i64 [[CNT:%.*]], 2			; CHECK-NEXT: [[MIN_ITERS_CHECK:%.*]] = icmp ult i64 [[CNT:%.*]], 8
	; CHECK-NEXT: br i1 [[MIN_ITERS_CHECK]], label [[SCALAR_PH:%.*]], label [[VECTOR_PH:%.*]]			; CHECK-NEXT: br i1 [[MIN_ITERS_CHECK]], label [[SCALAR_PH:%.*]], label [[VECTOR_PH:%.*]]
	; CHECK: vector.ph:			; CHECK: vector.ph:
	; CHECK-NEXT: [[N_MOD_VF:%.*]] = urem i64 [[CNT]], 2			; CHECK-NEXT: [[N_MOD_VF:%.*]] = urem i64 [[CNT]], 8
	; CHECK-NEXT: [[N_VEC:%.*]] = sub i64 [[CNT]], [[N_MOD_VF]]			; CHECK-NEXT: [[N_VEC:%.*]] = sub i64 [[CNT]], [[N_MOD_VF]]
	; CHECK-NEXT: br label [[VECTOR_BODY:%.*]]			; CHECK-NEXT: br label [[VECTOR_BODY:%.*]]
	; CHECK: vector.body:			; CHECK: vector.body:
	; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]			; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
	; CHECK-NEXT: [[VEC_PHI:%.*]] = phi float [ -3.000000e+00, [[VECTOR_PH]] ], [ [[TMP1:%.*]], [[VECTOR_BODY]] ]			; CHECK-NEXT: [[VEC_PHI:%.*]] = phi <4 x i32> [ <i32 poison, i32 0, i32 0, i32 0>, [[VECTOR_PH]] ], [ [[TMP2:%.*]], [[VECTOR_BODY]] ]
	; CHECK-NEXT: [[INDUCTION:%.*]] = add i64 [[INDEX]], 0			; CHECK-NEXT: [[VEC_PHI1:%.*]] = phi <4 x i32> [ zeroinitializer, [[VECTOR_PH]] ], [ [[TMP3:%.*]], [[VECTOR_BODY]] ]
	; CHECK-NEXT: [[INDUCTION1:%.*]] = add i64 [[INDEX]], 1			; CHECK-NEXT: [[TMP0:%.*]] = add i64 [[INDEX]], 0
	; CHECK-NEXT: [[TMP0:%.*]] = fadd float [[VEC_PHI]], 1.000000e+00			; CHECK-NEXT: [[TMP1:%.*]] = add i64 [[INDEX]], 4
	; CHECK-NEXT: [[TMP1]] = fadd float [[TMP0]], 1.000000e+00			; CHECK-NEXT: [[TMP2]] = add <4 x i32> [[VEC_PHI]], <i32 1, i32 1, i32 1, i32 1>
	; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 2			; CHECK-NEXT: [[TMP3]] = add <4 x i32> [[VEC_PHI1]], <i32 1, i32 1, i32 1, i32 1>
	; CHECK-NEXT: [[TMP2:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]			; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 8
	; CHECK-NEXT: br i1 [[TMP2]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP5:![0-9]+]]			; CHECK-NEXT: [[TMP4:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
				; CHECK-NEXT: br i1 [[TMP4]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP6:![0-9]+]]
	; CHECK: middle.block:			; CHECK: middle.block:
				; CHECK-NEXT: [[BIN_RDX:%.*]] = add <4 x i32> [[TMP3]], [[TMP2]]
				; CHECK-NEXT: [[TMP5:%.*]] = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> [[BIN_RDX]])
				; CHECK-NEXT: [[TMP6:%.*]] = sitofp i32 [[TMP5]] to float
				; CHECK-NEXT: [[TMP7:%.*]] = call float @llvm.minnum.f32(float [[TMP6]], float 0x4170000000000000)
	; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[CNT]], [[N_VEC]]			; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[CNT]], [[N_VEC]]
	; CHECK-NEXT: br i1 [[CMP_N]], label [[EXIT:%.*]], label [[SCALAR_PH]]			; CHECK-NEXT: br i1 [[CMP_N]], label [[EXIT:%.*]], label [[SCALAR_PH]]
	; CHECK: scalar.ph:			; CHECK: scalar.ph:
	; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ]			; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ]
	; CHECK-NEXT: [[BC_MERGE_RDX:%.*]] = phi float [ -3.000000e+00, [[ENTRY]] ], [ [[TMP1]], [[MIDDLE_BLOCK]] ]			; CHECK-NEXT: [[BC_MERGE_RDX:%.*]] = phi float [ -3.000000e+00, [[ENTRY]] ], [ [[TMP7]], [[MIDDLE_BLOCK]] ]
	; CHECK-NEXT: br label [[LOOP:%.*]]			; CHECK-NEXT: br label [[LOOP:%.*]]
	; CHECK: loop:			; CHECK: loop:
	; CHECK-NEXT: [[IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]			; CHECK-NEXT: [[IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
	; CHECK-NEXT: [[RED:%.*]] = phi float [ [[BC_MERGE_RDX]], [[SCALAR_PH]] ], [ [[RED_NEXT:%.*]], [[LOOP]] ]			; CHECK-NEXT: [[RED:%.*]] = phi float [ [[BC_MERGE_RDX]], [[SCALAR_PH]] ], [ [[RED_NEXT:%.*]], [[LOOP]] ]
	; CHECK-NEXT: [[RED_NEXT]] = fadd float [[RED]], 1.000000e+00			; CHECK-NEXT: [[RED_NEXT]] = fadd float [[RED]], 1.000000e+00
	; CHECK-NEXT: [[IV_NEXT]] = add nuw nsw i64 [[IV]], 1			; CHECK-NEXT: [[IV_NEXT]] = add nuw nsw i64 [[IV]], 1
	; CHECK-NEXT: [[EXITCOND_NOT:%.*]] = icmp eq i64 [[IV_NEXT]], [[CNT]]			; CHECK-NEXT: [[EXITCOND_NOT:%.*]] = icmp eq i64 [[IV_NEXT]], [[CNT]]
	; CHECK-NEXT: br i1 [[EXITCOND_NOT]], label [[EXIT]], label [[LOOP]], !llvm.loop [[LOOP6:![0-9]+]]			; CHECK-NEXT: br i1 [[EXITCOND_NOT]], label [[EXIT]], label [[LOOP]], !llvm.loop [[LOOP7:![0-9]+]]
	; CHECK: exit:			; CHECK: exit:
	; CHECK-NEXT: [[RED_NEXT_LCSSA:%.*]] = phi float [ [[RED_NEXT]], [[LOOP]] ], [ [[TMP1]], [[MIDDLE_BLOCK]] ]			; CHECK-NEXT: [[RED_NEXT_LCSSA:%.*]] = phi float [ [[RED_NEXT]], [[LOOP]] ], [ [[TMP7]], [[MIDDLE_BLOCK]] ]
	; CHECK-NEXT: ret float [[RED_NEXT_LCSSA]]			; CHECK-NEXT: ret float [[RED_NEXT_LCSSA]]
	;			;
	entry:			entry:
	br label %loop			br label %loop

	loop:			loop:
	%iv = phi i64 [ 0, %entry ], [ %iv.next, %loop ]			%iv = phi i64 [ 0, %entry ], [ %iv.next, %loop ]
	%red = phi float [ -3.000000e+00, %entry ], [ %red.next, %loop ]			%red = phi float [ -3.000000e+00, %entry ], [ %red.next, %loop ]
	[19 lines hidden]
	; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]			; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
	; CHECK-NEXT: [[VEC_PHI:%.*]] = phi float [ 1.250000e+00, [[VECTOR_PH]] ], [ [[TMP1:%.*]], [[VECTOR_BODY]] ]			; CHECK-NEXT: [[VEC_PHI:%.*]] = phi float [ 1.250000e+00, [[VECTOR_PH]] ], [ [[TMP1:%.*]], [[VECTOR_BODY]] ]
	; CHECK-NEXT: [[INDUCTION:%.*]] = add i64 [[INDEX]], 0			; CHECK-NEXT: [[INDUCTION:%.*]] = add i64 [[INDEX]], 0
	; CHECK-NEXT: [[INDUCTION1:%.*]] = add i64 [[INDEX]], 1			; CHECK-NEXT: [[INDUCTION1:%.*]] = add i64 [[INDEX]], 1
	; CHECK-NEXT: [[TMP0:%.*]] = fadd float [[VEC_PHI]], 1.000000e+00			; CHECK-NEXT: [[TMP0:%.*]] = fadd float [[VEC_PHI]], 1.000000e+00
	; CHECK-NEXT: [[TMP1]] = fadd float [[TMP0]], 1.000000e+00			; CHECK-NEXT: [[TMP1]] = fadd float [[TMP0]], 1.000000e+00
	; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 2			; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 2
	; CHECK-NEXT: [[TMP2:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]			; CHECK-NEXT: [[TMP2:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
	; CHECK-NEXT: br i1 [[TMP2]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP7:![0-9]+]]			; CHECK-NEXT: br i1 [[TMP2]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP8:![0-9]+]]
	; CHECK: middle.block:			; CHECK: middle.block:
	; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[CNT]], [[N_VEC]]			; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[CNT]], [[N_VEC]]
	; CHECK-NEXT: br i1 [[CMP_N]], label [[EXIT:%.*]], label [[SCALAR_PH]]			; CHECK-NEXT: br i1 [[CMP_N]], label [[EXIT:%.*]], label [[SCALAR_PH]]
	; CHECK: scalar.ph:			; CHECK: scalar.ph:
	; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ]			; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ]
	; CHECK-NEXT: [[BC_MERGE_RDX:%.*]] = phi float [ 1.250000e+00, [[ENTRY]] ], [ [[TMP1]], [[MIDDLE_BLOCK]] ]			; CHECK-NEXT: [[BC_MERGE_RDX:%.*]] = phi float [ 1.250000e+00, [[ENTRY]] ], [ [[TMP1]], [[MIDDLE_BLOCK]] ]
	; CHECK-NEXT: br label [[LOOP:%.*]]			; CHECK-NEXT: br label [[LOOP:%.*]]
	; CHECK: loop:			; CHECK: loop:
	; CHECK-NEXT: [[IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]			; CHECK-NEXT: [[IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
	; CHECK-NEXT: [[RED:%.*]] = phi float [ [[BC_MERGE_RDX]], [[SCALAR_PH]] ], [ [[RED_NEXT:%.*]], [[LOOP]] ]			; CHECK-NEXT: [[RED:%.*]] = phi float [ [[BC_MERGE_RDX]], [[SCALAR_PH]] ], [ [[RED_NEXT:%.*]], [[LOOP]] ]
	; CHECK-NEXT: [[RED_NEXT]] = fadd float [[RED]], 1.000000e+00			; CHECK-NEXT: [[RED_NEXT]] = fadd float [[RED]], 1.000000e+00
	; CHECK-NEXT: [[IV_NEXT]] = add nuw nsw i64 [[IV]], 1			; CHECK-NEXT: [[IV_NEXT]] = add nuw nsw i64 [[IV]], 1
	; CHECK-NEXT: [[EXITCOND_NOT:%.*]] = icmp eq i64 [[IV_NEXT]], [[CNT]]			; CHECK-NEXT: [[EXITCOND_NOT:%.*]] = icmp eq i64 [[IV_NEXT]], [[CNT]]
	; CHECK-NEXT: br i1 [[EXITCOND_NOT]], label [[EXIT]], label [[LOOP]], !llvm.loop [[LOOP8:![0-9]+]]			; CHECK-NEXT: br i1 [[EXITCOND_NOT]], label [[EXIT]], label [[LOOP]], !llvm.loop [[LOOP9:![0-9]+]]
	; CHECK: exit:			; CHECK: exit:
	; CHECK-NEXT: [[RED_NEXT_LCSSA:%.*]] = phi float [ [[RED_NEXT]], [[LOOP]] ], [ [[TMP1]], [[MIDDLE_BLOCK]] ]			; CHECK-NEXT: [[RED_NEXT_LCSSA:%.*]] = phi float [ [[RED_NEXT]], [[LOOP]] ], [ [[TMP1]], [[MIDDLE_BLOCK]] ]
	; CHECK-NEXT: ret float [[RED_NEXT_LCSSA]]			; CHECK-NEXT: ret float [[RED_NEXT_LCSSA]]
	;			;
	entry:			entry:
	br label %loop			br label %loop

	loop:			loop:
	[21 lines hidden]
	; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]			; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
	; CHECK-NEXT: [[VEC_PHI:%.*]] = phi float [ 0.000000e+00, [[VECTOR_PH]] ], [ [[TMP1:%.*]], [[VECTOR_BODY]] ]			; CHECK-NEXT: [[VEC_PHI:%.*]] = phi float [ 0.000000e+00, [[VECTOR_PH]] ], [ [[TMP1:%.*]], [[VECTOR_BODY]] ]
	; CHECK-NEXT: [[INDUCTION:%.*]] = add i64 [[INDEX]], 0			; CHECK-NEXT: [[INDUCTION:%.*]] = add i64 [[INDEX]], 0
	; CHECK-NEXT: [[INDUCTION1:%.*]] = add i64 [[INDEX]], 1			; CHECK-NEXT: [[INDUCTION1:%.*]] = add i64 [[INDEX]], 1
	; CHECK-NEXT: [[TMP0:%.*]] = fadd float [[VEC_PHI]], -1.000000e+00			; CHECK-NEXT: [[TMP0:%.*]] = fadd float [[VEC_PHI]], -1.000000e+00
	; CHECK-NEXT: [[TMP1]] = fadd float [[TMP0]], -1.000000e+00			; CHECK-NEXT: [[TMP1]] = fadd float [[TMP0]], -1.000000e+00
	; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 2			; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 2
	; CHECK-NEXT: [[TMP2:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]			; CHECK-NEXT: [[TMP2:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
	; CHECK-NEXT: br i1 [[TMP2]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP9:![0-9]+]]			; CHECK-NEXT: br i1 [[TMP2]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP10:![0-9]+]]
	; CHECK: middle.block:			; CHECK: middle.block:
	; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[CNT]], [[N_VEC]]			; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[CNT]], [[N_VEC]]
	; CHECK-NEXT: br i1 [[CMP_N]], label [[EXIT:%.*]], label [[SCALAR_PH]]			; CHECK-NEXT: br i1 [[CMP_N]], label [[EXIT:%.*]], label [[SCALAR_PH]]
	; CHECK: scalar.ph:			; CHECK: scalar.ph:
	; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ]			; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ]
	; CHECK-NEXT: [[BC_MERGE_RDX:%.*]] = phi float [ 0.000000e+00, [[ENTRY]] ], [ [[TMP1]], [[MIDDLE_BLOCK]] ]			; CHECK-NEXT: [[BC_MERGE_RDX:%.*]] = phi float [ 0.000000e+00, [[ENTRY]] ], [ [[TMP1]], [[MIDDLE_BLOCK]] ]
	; CHECK-NEXT: br label [[LOOP:%.*]]			; CHECK-NEXT: br label [[LOOP:%.*]]
	; CHECK: loop:			; CHECK: loop:
	; CHECK-NEXT: [[IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]			; CHECK-NEXT: [[IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
	; CHECK-NEXT: [[RED:%.*]] = phi float [ [[BC_MERGE_RDX]], [[SCALAR_PH]] ], [ [[RED_NEXT:%.*]], [[LOOP]] ]			; CHECK-NEXT: [[RED:%.*]] = phi float [ [[BC_MERGE_RDX]], [[SCALAR_PH]] ], [ [[RED_NEXT:%.*]], [[LOOP]] ]
	; CHECK-NEXT: [[RED_NEXT]] = fadd float [[RED]], -1.000000e+00			; CHECK-NEXT: [[RED_NEXT]] = fadd float [[RED]], -1.000000e+00
	; CHECK-NEXT: [[IV_NEXT]] = add nuw nsw i64 [[IV]], 1			; CHECK-NEXT: [[IV_NEXT]] = add nuw nsw i64 [[IV]], 1
	; CHECK-NEXT: [[EXITCOND_NOT:%.*]] = icmp eq i64 [[IV_NEXT]], [[CNT]]			; CHECK-NEXT: [[EXITCOND_NOT:%.*]] = icmp eq i64 [[IV_NEXT]], [[CNT]]
	; CHECK-NEXT: br i1 [[EXITCOND_NOT]], label [[EXIT]], label [[LOOP]], !llvm.loop [[LOOP10:![0-9]+]]			; CHECK-NEXT: br i1 [[EXITCOND_NOT]], label [[EXIT]], label [[LOOP]], !llvm.loop [[LOOP11:![0-9]+]]
	; CHECK: exit:			; CHECK: exit:
	; CHECK-NEXT: [[RED_NEXT_LCSSA:%.*]] = phi float [ [[RED_NEXT]], [[LOOP]] ], [ [[TMP1]], [[MIDDLE_BLOCK]] ]			; CHECK-NEXT: [[RED_NEXT_LCSSA:%.*]] = phi float [ [[RED_NEXT]], [[LOOP]] ], [ [[TMP1]], [[MIDDLE_BLOCK]] ]
	; CHECK-NEXT: ret float [[RED_NEXT_LCSSA]]			; CHECK-NEXT: ret float [[RED_NEXT_LCSSA]]
	;			;
	entry:			entry:
	br label %loop			br label %loop

	loop:			loop:
	%iv = phi i64 [ 0, %entry ], [ %iv.next, %loop ]			%iv = phi i64 [ 0, %entry ], [ %iv.next, %loop ]
	%red = phi float [ 0.000000e+00, %entry ], [ %red.next, %loop ]			%red = phi float [ 0.000000e+00, %entry ], [ %red.next, %loop ]
	%red.next = fadd float %red, -1.000000e+00			%red.next = fadd float %red, -1.000000e+00
	%iv.next = add nuw nsw i64 %iv, 1			%iv.next = add nuw nsw i64 %iv, 1
	%exitcond.not = icmp eq i64 %iv.next, %cnt			%exitcond.not = icmp eq i64 %iv.next, %cnt
	br i1 %exitcond.not, label %exit, label %loop			br i1 %exitcond.not, label %exit, label %loop

	exit:			exit:
	ret float %red.next			ret float %red.next
	}			}

	define double @test_fadd_to_int_add_1_double(i64 %cnt) {			define double @test_fadd_to_int_add_1_double(i64 %cnt) {
	; CHECK-LABEL: @test_fadd_to_int_add_1_double(			; CHECK-LABEL: @test_fadd_to_int_add_1_double(
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
	; CHECK-NEXT: [[MIN_ITERS_CHECK:%.*]] = icmp ult i64 [[CNT:%.*]], 2			; CHECK-NEXT: [[MIN_ITERS_CHECK:%.*]] = icmp ult i64 [[CNT:%.*]], 4
	; CHECK-NEXT: br i1 [[MIN_ITERS_CHECK]], label [[SCALAR_PH:%.*]], label [[VECTOR_PH:%.*]]			; CHECK-NEXT: br i1 [[MIN_ITERS_CHECK]], label [[SCALAR_PH:%.*]], label [[VECTOR_PH:%.*]]
	; CHECK: vector.ph:			; CHECK: vector.ph:
	; CHECK-NEXT: [[N_MOD_VF:%.*]] = urem i64 [[CNT]], 2			; CHECK-NEXT: [[N_MOD_VF:%.*]] = urem i64 [[CNT]], 4
	; CHECK-NEXT: [[N_VEC:%.*]] = sub i64 [[CNT]], [[N_MOD_VF]]			; CHECK-NEXT: [[N_VEC:%.*]] = sub i64 [[CNT]], [[N_MOD_VF]]
	; CHECK-NEXT: br label [[VECTOR_BODY:%.*]]			; CHECK-NEXT: br label [[VECTOR_BODY:%.*]]
	; CHECK: vector.body:			; CHECK: vector.body:
	; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]			; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
	; CHECK-NEXT: [[VEC_PHI:%.*]] = phi double [ 0.000000e+00, [[VECTOR_PH]] ], [ [[TMP1:%.*]], [[VECTOR_BODY]] ]			; CHECK-NEXT: [[VEC_PHI:%.*]] = phi <2 x i64> [ zeroinitializer, [[VECTOR_PH]] ], [ [[TMP2:%.*]], [[VECTOR_BODY]] ]
	; CHECK-NEXT: [[INDUCTION:%.*]] = add i64 [[INDEX]], 0			; CHECK-NEXT: [[VEC_PHI1:%.*]] = phi <2 x i64> [ zeroinitializer, [[VECTOR_PH]] ], [ [[TMP3:%.*]], [[VECTOR_BODY]] ]
	; CHECK-NEXT: [[INDUCTION1:%.*]] = add i64 [[INDEX]], 1			; CHECK-NEXT: [[TMP0:%.*]] = add i64 [[INDEX]], 0
	; CHECK-NEXT: [[TMP0:%.*]] = fadd double [[VEC_PHI]], 1.000000e+00			; CHECK-NEXT: [[TMP1:%.*]] = add i64 [[INDEX]], 2
	; CHECK-NEXT: [[TMP1]] = fadd double [[TMP0]], 1.000000e+00			; CHECK-NEXT: [[TMP2]] = add <2 x i64> [[VEC_PHI]], <i64 1, i64 1>
	; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 2			; CHECK-NEXT: [[TMP3]] = add <2 x i64> [[VEC_PHI1]], <i64 1, i64 1>
	; CHECK-NEXT: [[TMP2:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]			; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 4
	; CHECK-NEXT: br i1 [[TMP2]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP11:![0-9]+]]			; CHECK-NEXT: [[TMP4:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
				; CHECK-NEXT: br i1 [[TMP4]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP12:![0-9]+]]
	; CHECK: middle.block:			; CHECK: middle.block:
				; CHECK-NEXT: [[BIN_RDX:%.*]] = add <2 x i64> [[TMP3]], [[TMP2]]
				; CHECK-NEXT: [[TMP5:%.*]] = call i64 @llvm.vector.reduce.add.v2i64(<2 x i64> [[BIN_RDX]])
				; CHECK-NEXT: [[TMP6:%.*]] = sitofp i64 [[TMP5]] to double
				; CHECK-NEXT: [[TMP7:%.*]] = call double @llvm.minnum.f64(double [[TMP6]], double 0x4340000000000000)
	; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[CNT]], [[N_VEC]]			; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[CNT]], [[N_VEC]]
	; CHECK-NEXT: br i1 [[CMP_N]], label [[EXIT:%.*]], label [[SCALAR_PH]]			; CHECK-NEXT: br i1 [[CMP_N]], label [[EXIT:%.*]], label [[SCALAR_PH]]
	; CHECK: scalar.ph:			; CHECK: scalar.ph:
	; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ]			; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ]
	; CHECK-NEXT: [[BC_MERGE_RDX:%.*]] = phi double [ 0.000000e+00, [[ENTRY]] ], [ [[TMP1]], [[MIDDLE_BLOCK]] ]			; CHECK-NEXT: [[BC_MERGE_RDX:%.*]] = phi double [ 0.000000e+00, [[ENTRY]] ], [ [[TMP7]], [[MIDDLE_BLOCK]] ]
	; CHECK-NEXT: br label [[LOOP:%.*]]			; CHECK-NEXT: br label [[LOOP:%.*]]
	; CHECK: loop:			; CHECK: loop:
	; CHECK-NEXT: [[IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]			; CHECK-NEXT: [[IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
	; CHECK-NEXT: [[RED:%.*]] = phi double [ [[BC_MERGE_RDX]], [[SCALAR_PH]] ], [ [[RED_NEXT:%.*]], [[LOOP]] ]			; CHECK-NEXT: [[RED:%.*]] = phi double [ [[BC_MERGE_RDX]], [[SCALAR_PH]] ], [ [[RED_NEXT:%.*]], [[LOOP]] ]
	; CHECK-NEXT: [[RED_NEXT]] = fadd double [[RED]], 1.000000e+00			; CHECK-NEXT: [[RED_NEXT]] = fadd double [[RED]], 1.000000e+00
	; CHECK-NEXT: [[IV_NEXT]] = add nuw nsw i64 [[IV]], 1			; CHECK-NEXT: [[IV_NEXT]] = add nuw nsw i64 [[IV]], 1
	; CHECK-NEXT: [[EXITCOND_NOT:%.*]] = icmp eq i64 [[IV_NEXT]], [[CNT]]			; CHECK-NEXT: [[EXITCOND_NOT:%.*]] = icmp eq i64 [[IV_NEXT]], [[CNT]]
	; CHECK-NEXT: br i1 [[EXITCOND_NOT]], label [[EXIT]], label [[LOOP]], !llvm.loop [[LOOP12:![0-9]+]]			; CHECK-NEXT: br i1 [[EXITCOND_NOT]], label [[EXIT]], label [[LOOP]], !llvm.loop [[LOOP13:![0-9]+]]
	; CHECK: exit:			; CHECK: exit:
	; CHECK-NEXT: [[RED_NEXT_LCSSA:%.*]] = phi double [ [[RED_NEXT]], [[LOOP]] ], [ [[TMP1]], [[MIDDLE_BLOCK]] ]			; CHECK-NEXT: [[RED_NEXT_LCSSA:%.*]] = phi double [ [[RED_NEXT]], [[LOOP]] ], [ [[TMP7]], [[MIDDLE_BLOCK]] ]
	; CHECK-NEXT: ret double [[RED_NEXT_LCSSA]]			; CHECK-NEXT: ret double [[RED_NEXT_LCSSA]]
	;			;
	entry:			entry:
	br label %loop			br label %loop

	loop:			loop:
	%iv = phi i64 [ 0, %entry ], [ %iv.next, %loop ]			%iv = phi i64 [ 0, %entry ], [ %iv.next, %loop ]
	%red = phi double [ 0.000000e+00, %entry ], [ %red.next, %loop ]			%red = phi double [ 0.000000e+00, %entry ], [ %red.next, %loop ]
	%red.next = fadd double %red, 1.000000e+00			%red.next = fadd double %red, 1.000000e+00
	%iv.next = add nuw nsw i64 %iv, 1			%iv.next = add nuw nsw i64 %iv, 1
	%exitcond.not = icmp eq i64 %iv.next, %cnt			%exitcond.not = icmp eq i64 %iv.next, %cnt
	br i1 %exitcond.not, label %exit, label %loop			br i1 %exitcond.not, label %exit, label %loop

	exit:			exit:
	ret double %red.next			ret double %red.next
	}			}

	define float @test_fadd_to_int_add_2(i64 %cnt) {			define float @test_fadd_to_int_add_2(i64 %cnt) {
	; CHECK-LABEL: @test_fadd_to_int_add_2(			; CHECK-LABEL: @test_fadd_to_int_add_2(
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
	; CHECK-NEXT: [[MIN_ITERS_CHECK:%.*]] = icmp ult i64 [[CNT:%.*]], 2			; CHECK-NEXT: [[MIN_ITERS_CHECK:%.*]] = icmp ult i64 [[CNT:%.*]], 8
	; CHECK-NEXT: br i1 [[MIN_ITERS_CHECK]], label [[SCALAR_PH:%.*]], label [[VECTOR_PH:%.*]]			; CHECK-NEXT: br i1 [[MIN_ITERS_CHECK]], label [[SCALAR_PH:%.*]], label [[VECTOR_PH:%.*]]
	; CHECK: vector.ph:			; CHECK: vector.ph:
	; CHECK-NEXT: [[N_MOD_VF:%.*]] = urem i64 [[CNT]], 2			; CHECK-NEXT: [[N_MOD_VF:%.*]] = urem i64 [[CNT]], 8
	; CHECK-NEXT: [[N_VEC:%.*]] = sub i64 [[CNT]], [[N_MOD_VF]]			; CHECK-NEXT: [[N_VEC:%.*]] = sub i64 [[CNT]], [[N_MOD_VF]]
	; CHECK-NEXT: br label [[VECTOR_BODY:%.*]]			; CHECK-NEXT: br label [[VECTOR_BODY:%.*]]
	; CHECK: vector.body:			; CHECK: vector.body:
	; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]			; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
	; CHECK-NEXT: [[VEC_PHI:%.*]] = phi float [ 0.000000e+00, [[VECTOR_PH]] ], [ [[TMP1:%.*]], [[VECTOR_BODY]] ]			; CHECK-NEXT: [[VEC_PHI:%.*]] = phi <4 x i32> [ zeroinitializer, [[VECTOR_PH]] ], [ [[TMP2:%.*]], [[VECTOR_BODY]] ]
	; CHECK-NEXT: [[INDUCTION:%.*]] = add i64 [[INDEX]], 0			; CHECK-NEXT: [[VEC_PHI1:%.*]] = phi <4 x i32> [ zeroinitializer, [[VECTOR_PH]] ], [ [[TMP3:%.*]], [[VECTOR_BODY]] ]
	; CHECK-NEXT: [[INDUCTION1:%.*]] = add i64 [[INDEX]], 1			; CHECK-NEXT: [[TMP0:%.*]] = add i64 [[INDEX]], 0
	; CHECK-NEXT: [[TMP0:%.*]] = fadd float [[VEC_PHI]], 2.000000e+00			; CHECK-NEXT: [[TMP1:%.*]] = add i64 [[INDEX]], 4
	; CHECK-NEXT: [[TMP1]] = fadd float [[TMP0]], 2.000000e+00			; CHECK-NEXT: [[TMP2]] = add <4 x i32> [[VEC_PHI]], <i32 2, i32 2, i32 2, i32 2>
	; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 2			; CHECK-NEXT: [[TMP3]] = add <4 x i32> [[VEC_PHI1]], <i32 2, i32 2, i32 2, i32 2>
	; CHECK-NEXT: [[TMP2:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]			; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 8
	; CHECK-NEXT: br i1 [[TMP2]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP13:![0-9]+]]			; CHECK-NEXT: [[TMP4:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
				; CHECK-NEXT: br i1 [[TMP4]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP14:![0-9]+]]
	; CHECK: middle.block:			; CHECK: middle.block:
				; CHECK-NEXT: [[BIN_RDX:%.*]] = add <4 x i32> [[TMP3]], [[TMP2]]
				; CHECK-NEXT: [[TMP5:%.*]] = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> [[BIN_RDX]])
				; CHECK-NEXT: [[TMP6:%.*]] = sitofp i32 [[TMP5]] to float
				; CHECK-NEXT: [[TMP7:%.*]] = call float @llvm.minnum.f32(float [[TMP6]], float 0x4180000000000000)
	; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[CNT]], [[N_VEC]]			; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[CNT]], [[N_VEC]]
	; CHECK-NEXT: br i1 [[CMP_N]], label [[EXIT:%.*]], label [[SCALAR_PH]]			; CHECK-NEXT: br i1 [[CMP_N]], label [[EXIT:%.*]], label [[SCALAR_PH]]
	; CHECK: scalar.ph:			; CHECK: scalar.ph:
	; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ]			; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ]
	; CHECK-NEXT: [[BC_MERGE_RDX:%.*]] = phi float [ 0.000000e+00, [[ENTRY]] ], [ [[TMP1]], [[MIDDLE_BLOCK]] ]			; CHECK-NEXT: [[BC_MERGE_RDX:%.*]] = phi float [ 0.000000e+00, [[ENTRY]] ], [ [[TMP7]], [[MIDDLE_BLOCK]] ]
	; CHECK-NEXT: br label [[LOOP:%.*]]			; CHECK-NEXT: br label [[LOOP:%.*]]
	; CHECK: loop:			; CHECK: loop:
	; CHECK-NEXT: [[IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]			; CHECK-NEXT: [[IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
	; CHECK-NEXT: [[RED:%.*]] = phi float [ [[BC_MERGE_RDX]], [[SCALAR_PH]] ], [ [[RED_NEXT:%.*]], [[LOOP]] ]			; CHECK-NEXT: [[RED:%.*]] = phi float [ [[BC_MERGE_RDX]], [[SCALAR_PH]] ], [ [[RED_NEXT:%.*]], [[LOOP]] ]
	; CHECK-NEXT: [[RED_NEXT]] = fadd float [[RED]], 2.000000e+00			; CHECK-NEXT: [[RED_NEXT]] = fadd float [[RED]], 2.000000e+00
	; CHECK-NEXT: [[IV_NEXT]] = add nuw nsw i64 [[IV]], 1			; CHECK-NEXT: [[IV_NEXT]] = add nuw nsw i64 [[IV]], 1
	; CHECK-NEXT: [[EXITCOND_NOT:%.*]] = icmp eq i64 [[IV_NEXT]], [[CNT]]			; CHECK-NEXT: [[EXITCOND_NOT:%.*]] = icmp eq i64 [[IV_NEXT]], [[CNT]]
	; CHECK-NEXT: br i1 [[EXITCOND_NOT]], label [[EXIT]], label [[LOOP]], !llvm.loop [[LOOP14:![0-9]+]]			; CHECK-NEXT: br i1 [[EXITCOND_NOT]], label [[EXIT]], label [[LOOP]], !llvm.loop [[LOOP15:![0-9]+]]
	; CHECK: exit:			; CHECK: exit:
	; CHECK-NEXT: [[RED_NEXT_LCSSA:%.*]] = phi float [ [[RED_NEXT]], [[LOOP]] ], [ [[TMP1]], [[MIDDLE_BLOCK]] ]			; CHECK-NEXT: [[RED_NEXT_LCSSA:%.*]] = phi float [ [[RED_NEXT]], [[LOOP]] ], [ [[TMP7]], [[MIDDLE_BLOCK]] ]
	; CHECK-NEXT: ret float [[RED_NEXT_LCSSA]]			; CHECK-NEXT: ret float [[RED_NEXT_LCSSA]]
	;			;
	entry:			entry:
	br label %loop			br label %loop

	loop:			loop:
	%iv = phi i64 [ 0, %entry ], [ %iv.next, %loop ]			%iv = phi i64 [ 0, %entry ], [ %iv.next, %loop ]
	%red = phi float [ 0.000000e+00, %entry ], [ %red.next, %loop ]			%red = phi float [ 0.000000e+00, %entry ], [ %red.next, %loop ]
	[19 lines not shown]
	; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]			; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
	; CHECK-NEXT: [[VEC_PHI:%.*]] = phi float [ 0.000000e+00, [[VECTOR_PH]] ], [ [[TMP1:%.*]], [[VECTOR_BODY]] ]			; CHECK-NEXT: [[VEC_PHI:%.*]] = phi float [ 0.000000e+00, [[VECTOR_PH]] ], [ [[TMP1:%.*]], [[VECTOR_BODY]] ]
	; CHECK-NEXT: [[INDUCTION:%.*]] = add i64 [[INDEX]], 0			; CHECK-NEXT: [[INDUCTION:%.*]] = add i64 [[INDEX]], 0
	; CHECK-NEXT: [[INDUCTION1:%.*]] = add i64 [[INDEX]], 1			; CHECK-NEXT: [[INDUCTION1:%.*]] = add i64 [[INDEX]], 1
	; CHECK-NEXT: [[TMP0:%.*]] = fadd float [[VEC_PHI]], 0.000000e+00			; CHECK-NEXT: [[TMP0:%.*]] = fadd float [[VEC_PHI]], 0.000000e+00
	; CHECK-NEXT: [[TMP1]] = fadd float [[TMP0]], 0.000000e+00			; CHECK-NEXT: [[TMP1]] = fadd float [[TMP0]], 0.000000e+00
	; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 2			; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 2
	; CHECK-NEXT: [[TMP2:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]			; CHECK-NEXT: [[TMP2:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
	; CHECK-NEXT: br i1 [[TMP2]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP15:![0-9]+]]			; CHECK-NEXT: br i1 [[TMP2]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP16:![0-9]+]]
	; CHECK: middle.block:			; CHECK: middle.block:
	; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[CNT]], [[N_VEC]]			; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[CNT]], [[N_VEC]]
	; CHECK-NEXT: br i1 [[CMP_N]], label [[EXIT:%.*]], label [[SCALAR_PH]]			; CHECK-NEXT: br i1 [[CMP_N]], label [[EXIT:%.*]], label [[SCALAR_PH]]
	; CHECK: scalar.ph:			; CHECK: scalar.ph:
	; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ]			; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ]
	; CHECK-NEXT: [[BC_MERGE_RDX:%.*]] = phi float [ 0.000000e+00, [[ENTRY]] ], [ [[TMP1]], [[MIDDLE_BLOCK]] ]			; CHECK-NEXT: [[BC_MERGE_RDX:%.*]] = phi float [ 0.000000e+00, [[ENTRY]] ], [ [[TMP1]], [[MIDDLE_BLOCK]] ]
	; CHECK-NEXT: br label [[LOOP:%.*]]			; CHECK-NEXT: br label [[LOOP:%.*]]
	; CHECK: loop:			; CHECK: loop:
	; CHECK-NEXT: [[IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]			; CHECK-NEXT: [[IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
	; CHECK-NEXT: [[RED:%.*]] = phi float [ [[BC_MERGE_RDX]], [[SCALAR_PH]] ], [ [[RED_NEXT:%.*]], [[LOOP]] ]			; CHECK-NEXT: [[RED:%.*]] = phi float [ [[BC_MERGE_RDX]], [[SCALAR_PH]] ], [ [[RED_NEXT:%.*]], [[LOOP]] ]
	; CHECK-NEXT: [[RED_NEXT]] = fadd float [[RED]], 0.000000e+00			; CHECK-NEXT: [[RED_NEXT]] = fadd float [[RED]], 0.000000e+00
	; CHECK-NEXT: [[IV_NEXT]] = add nuw nsw i64 [[IV]], 1			; CHECK-NEXT: [[IV_NEXT]] = add nuw nsw i64 [[IV]], 1
	; CHECK-NEXT: [[EXITCOND_NOT:%.*]] = icmp eq i64 [[IV_NEXT]], [[CNT]]			; CHECK-NEXT: [[EXITCOND_NOT:%.*]] = icmp eq i64 [[IV_NEXT]], [[CNT]]
	; CHECK-NEXT: br i1 [[EXITCOND_NOT]], label [[EXIT]], label [[LOOP]], !llvm.loop [[LOOP16:![0-9]+]]			; CHECK-NEXT: br i1 [[EXITCOND_NOT]], label [[EXIT]], label [[LOOP]], !llvm.loop [[LOOP17:![0-9]+]]
	; CHECK: exit:			; CHECK: exit:
	; CHECK-NEXT: [[RED_NEXT_LCSSA:%.*]] = phi float [ [[RED_NEXT]], [[LOOP]] ], [ [[TMP1]], [[MIDDLE_BLOCK]] ]			; CHECK-NEXT: [[RED_NEXT_LCSSA:%.*]] = phi float [ [[RED_NEXT]], [[LOOP]] ], [ [[TMP1]], [[MIDDLE_BLOCK]] ]
	; CHECK-NEXT: ret float [[RED_NEXT_LCSSA]]			; CHECK-NEXT: ret float [[RED_NEXT_LCSSA]]
	;			;
	entry:			entry:
	br label %loop			br label %loop

	loop:			loop:
	%iv = phi i64 [ 0, %entry ], [ %iv.next, %loop ]			%iv = phi i64 [ 0, %entry ], [ %iv.next, %loop ]
	%red = phi float [ 0.000000e+00, %entry ], [ %red.next, %loop ]			%red = phi float [ 0.000000e+00, %entry ], [ %red.next, %loop ]
	%red.next = fadd float %red, 0.000000e+00			%red.next = fadd float %red, 0.000000e+00
	%iv.next = add nuw nsw i64 %iv, 1			%iv.next = add nuw nsw i64 %iv, 1
	%exitcond.not = icmp eq i64 %iv.next, %cnt			%exitcond.not = icmp eq i64 %iv.next, %cnt
	br i1 %exitcond.not, label %exit, label %loop			br i1 %exitcond.not, label %exit, label %loop

	exit:			exit:
	ret float %red.next			ret float %red.next
	}			}

	define float @test_fadd_to_int_with_select_add_1(i32* nocapture readonly %A, i64 %cnt) {			define float @test_fadd_to_int_with_select_add_1(i32* nocapture readonly %A, i64 %cnt) {
	; CHECK-LABEL: @test_fadd_to_int_with_select_add_1(			; CHECK-LABEL: @test_fadd_to_int_with_select_add_1(
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
				; CHECK-NEXT: [[MIN_ITERS_CHECK:%.*]] = icmp ult i64 [[CNT:%.*]], 8
				; CHECK-NEXT: br i1 [[MIN_ITERS_CHECK]], label [[SCALAR_PH:%.*]], label [[VECTOR_PH:%.*]]
				; CHECK: vector.ph:
				; CHECK-NEXT: [[N_MOD_VF:%.*]] = urem i64 [[CNT]], 8
				; CHECK-NEXT: [[N_VEC:%.*]] = sub i64 [[CNT]], [[N_MOD_VF]]
				; CHECK-NEXT: br label [[VECTOR_BODY:%.*]]
				; CHECK: vector.body:
				; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
				; CHECK-NEXT: [[VEC_PHI:%.*]] = phi <4 x i32> [ zeroinitializer, [[VECTOR_PH]] ], [ [[TMP12:%.*]], [[VECTOR_BODY]] ]
				; CHECK-NEXT: [[VEC_PHI1:%.*]] = phi <4 x i32> [ zeroinitializer, [[VECTOR_PH]] ], [ [[TMP13:%.*]], [[VECTOR_BODY]] ]
				; CHECK-NEXT: [[TMP0:%.*]] = add i64 [[INDEX]], 0
				; CHECK-NEXT: [[TMP1:%.*]] = add i64 [[INDEX]], 4
				; CHECK-NEXT: [[TMP2:%.*]] = getelementptr inbounds i32, i32* [[A:%.*]], i64 [[TMP0]]
				; CHECK-NEXT: [[TMP3:%.*]] = getelementptr inbounds i32, i32* [[A]], i64 [[TMP1]]
				; CHECK-NEXT: [[TMP4:%.*]] = getelementptr inbounds i32, i32* [[TMP2]], i32 0
				; CHECK-NEXT: [[TMP5:%.*]] = bitcast i32* [[TMP4]] to <4 x i32>*
				; CHECK-NEXT: [[WIDE_LOAD:%.*]] = load <4 x i32>, <4 x i32>* [[TMP5]], align 4
				; CHECK-NEXT: [[TMP6:%.*]] = getelementptr inbounds i32, i32* [[TMP2]], i32 4
				; CHECK-NEXT: [[TMP7:%.*]] = bitcast i32* [[TMP6]] to <4 x i32>*
				; CHECK-NEXT: [[WIDE_LOAD2:%.*]] = load <4 x i32>, <4 x i32>* [[TMP7]], align 4
				; CHECK-NEXT: [[TMP8:%.*]] = icmp sgt <4 x i32> [[WIDE_LOAD]], zeroinitializer
				; CHECK-NEXT: [[TMP9:%.*]] = icmp sgt <4 x i32> [[WIDE_LOAD2]], zeroinitializer
				; CHECK-NEXT: [[TMP10:%.*]] = add <4 x i32> [[VEC_PHI]], <i32 1, i32 1, i32 1, i32 1>
				; CHECK-NEXT: [[TMP11:%.*]] = add <4 x i32> [[VEC_PHI1]], <i32 1, i32 1, i32 1, i32 1>
				; CHECK-NEXT: [[TMP12]] = select <4 x i1> [[TMP8]], <4 x i32> [[TMP10]], <4 x i32> [[VEC_PHI]]
				; CHECK-NEXT: [[TMP13]] = select <4 x i1> [[TMP9]], <4 x i32> [[TMP11]], <4 x i32> [[VEC_PHI1]]
				; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 8
				; CHECK-NEXT: [[TMP14:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
				; CHECK-NEXT: br i1 [[TMP14]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP18:![0-9]+]]
				; CHECK: middle.block:
				; CHECK-NEXT: [[BIN_RDX:%.*]] = add <4 x i32> [[TMP13]], [[TMP12]]
				; CHECK-NEXT: [[TMP15:%.*]] = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> [[BIN_RDX]])
				; CHECK-NEXT: [[TMP16:%.*]] = sitofp i32 [[TMP15]] to float
				; CHECK-NEXT: [[TMP17:%.*]] = call float @llvm.minnum.f32(float [[TMP16]], float 0x4170000000000000)
				; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[CNT]], [[N_VEC]]
				; CHECK-NEXT: br i1 [[CMP_N]], label [[EXIT:%.*]], label [[SCALAR_PH]]
				; CHECK: scalar.ph:
				; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ]
				; CHECK-NEXT: [[BC_MERGE_RDX:%.*]] = phi float [ 0.000000e+00, [[ENTRY]] ], [ [[TMP17]], [[MIDDLE_BLOCK]] ]
	; CHECK-NEXT: br label [[LOOP:%.*]]			; CHECK-NEXT: br label [[LOOP:%.*]]
	; CHECK: loop:			; CHECK: loop:
	; CHECK-NEXT: [[IV:%.*]] = phi i64 [ 0, [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]			; CHECK-NEXT: [[IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
	; CHECK-NEXT: [[RED:%.*]] = phi float [ 0.000000e+00, [[ENTRY]] ], [ [[RED_NEXT:%.*]], [[LOOP]] ]			; CHECK-NEXT: [[RED:%.*]] = phi float [ [[BC_MERGE_RDX]], [[SCALAR_PH]] ], [ [[RED_NEXT:%.*]], [[LOOP]] ]
	; CHECK-NEXT: [[GEP_A:%.*]] = getelementptr inbounds i32, i32* [[A:%.*]], i64 [[IV]]			; CHECK-NEXT: [[GEP_A:%.*]] = getelementptr inbounds i32, i32* [[A]], i64 [[IV]]
	; CHECK-NEXT: [[LV_A:%.*]] = load i32, i32* [[GEP_A]], align 4			; CHECK-NEXT: [[LV_A:%.*]] = load i32, i32* [[GEP_A]], align 4
	; CHECK-NEXT: [[CMP1:%.*]] = icmp sgt i32 [[LV_A]], 0			; CHECK-NEXT: [[CMP1:%.*]] = icmp sgt i32 [[LV_A]], 0
	; CHECK-NEXT: [[ADD:%.*]] = fadd float [[RED]], 1.000000e+00			; CHECK-NEXT: [[ADD:%.*]] = fadd float [[RED]], 1.000000e+00
	; CHECK-NEXT: [[RED_NEXT]] = select i1 [[CMP1]], float [[ADD]], float [[RED]]			; CHECK-NEXT: [[RED_NEXT]] = select i1 [[CMP1]], float [[ADD]], float [[RED]]
	; CHECK-NEXT: [[IV_NEXT]] = add nuw nsw i64 [[IV]], 1			; CHECK-NEXT: [[IV_NEXT]] = add nuw nsw i64 [[IV]], 1
	; CHECK-NEXT: [[EXITCOND_NOT:%.*]] = icmp eq i64 [[IV_NEXT]], [[CNT:%.*]]			; CHECK-NEXT: [[EXITCOND_NOT:%.*]] = icmp eq i64 [[IV_NEXT]], [[CNT]]
	; CHECK-NEXT: br i1 [[EXITCOND_NOT]], label [[EXIT:%.*]], label [[LOOP]]			; CHECK-NEXT: br i1 [[EXITCOND_NOT]], label [[EXIT]], label [[LOOP]], !llvm.loop [[LOOP19:![0-9]+]]
	; CHECK: exit:			; CHECK: exit:
	; CHECK-NEXT: [[RED_NEXT_LCSSA:%.*]] = phi float [ [[RED_NEXT]], [[LOOP]] ]			; CHECK-NEXT: [[RED_NEXT_LCSSA:%.*]] = phi float [ [[RED_NEXT]], [[LOOP]] ], [ [[TMP17]], [[MIDDLE_BLOCK]] ]
	; CHECK-NEXT: ret float [[RED_NEXT_LCSSA]]			; CHECK-NEXT: ret float [[RED_NEXT_LCSSA]]
	;			;
	entry:			entry:
	br label %loop			br label %loop

	loop:			loop:
	%iv = phi i64 [ 0, %entry ], [ %iv.next, %loop ]			%iv = phi i64 [ 0, %entry ], [ %iv.next, %loop ]
	%red = phi float [ 0.000000e+00, %entry ], [ %red.next, %loop ]			%red = phi float [ 0.000000e+00, %entry ], [ %red.next, %loop ]
	%gep.A = getelementptr inbounds i32, i32* %A, i64 %iv			%gep.A = getelementptr inbounds i32, i32* %A, i64 %iv
	%lv.A = load i32, i32* %gep.A, align 4			%lv.A = load i32, i32* %gep.A, align 4
	%cmp1 = icmp sgt i32 %lv.A, 0			%cmp1 = icmp sgt i32 %lv.A, 0
	%add = fadd float %red, 1.000000e+00			%add = fadd float %red, 1.000000e+00
	%red.next = select i1 %cmp1, float %add, float %red			%red.next = select i1 %cmp1, float %add, float %red
	%iv.next = add nuw nsw i64 %iv, 1			%iv.next = add nuw nsw i64 %iv, 1
	%exitcond.not = icmp eq i64 %iv.next, %cnt			%exitcond.not = icmp eq i64 %iv.next, %cnt
	br i1 %exitcond.not, label %exit, label %loop			br i1 %exitcond.not, label %exit, label %loop

	exit:			exit:
	ret float %red.next			ret float %red.next
	}			}
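
The `llvm.minnum` clamp constants in the CHECK lines above are printed by LLVM as IEEE-754 double bit patterns, even when the operand type is `float`. As a rough sanity check (a sketch only, not part of the patch; the helper names `decode_llvm_fp_hex` and `to_f32` are made up for illustration), the following Python decodes the three constants and checks that each one is a value at which adding the step no longer changes the accumulator under round-to-nearest-even, which is where the scalar FP loop would saturate:

```python
import struct


def decode_llvm_fp_hex(hex_bits: str) -> float:
    """Interpret an LLVM hex FP literal (a 64-bit IEEE-754 double pattern)."""
    raw = bytes.fromhex(hex_bits[2:].zfill(16))  # drop the "0x" prefix
    return struct.unpack(">d", raw)[0]


def to_f32(x: float) -> float:
    """Round a Python float (a double) to the nearest IEEE-754 single."""
    return struct.unpack("f", struct.pack("f", x))[0]


# Constants copied from the CHECK lines of the tests above.
clamp_f32_step1 = decode_llvm_fp_hex("0x4170000000000000")  # float, step 1.0
clamp_f32_step2 = decode_llvm_fp_hex("0x4180000000000000")  # float, step 2.0
clamp_f64_step1 = decode_llvm_fp_hex("0x4340000000000000")  # double, step 1.0

# Adding the step at the clamp value rounds back to the clamp value.
saturates_f32_step1 = to_f32(clamp_f32_step1 + 1.0) == clamp_f32_step1
saturates_f32_step2 = to_f32(clamp_f32_step2 + 2.0) == clamp_f32_step2
saturates_f64_step1 = (clamp_f64_step1 + 1.0) == clamp_f64_step1

print(clamp_f32_step1, clamp_f32_step2, clamp_f64_step1)
print(saturates_f32_step1, saturates_f32_step2, saturates_f64_step1)
```

The decoded values are 2^24 and 2^25 for the `float` tests and 2^53 for the `double` test, which matches the "maximum value with mantissa 1.0 times the step" wording in the summary.
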

	define double @test_fadd_to_int_with_select_add_1_double(i32* nocapture readonly %A, i64 %cnt) {			define double @test_fadd_to_int_with_select_add_1_double(i32* nocapture readonly %A, i64 %cnt) {
	; CHECK-LABEL: @test_fadd_to_int_with_select_add_1_double(			; CHECK-LABEL: @test_fadd_to_int_with_select_add_1_double(
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
				; CHECK-NEXT: [[MIN_ITERS_CHECK:%.*]] = icmp ult i64 [[CNT:%.*]], 4
				; CHECK-NEXT: br i1 [[MIN_ITERS_CHECK]], label [[SCALAR_PH:%.*]], label [[VECTOR_PH:%.*]]
				; CHECK: vector.ph:
				; CHECK-NEXT: [[N_MOD_VF:%.*]] = urem i64 [[CNT]], 4
				; CHECK-NEXT: [[N_VEC:%.*]] = sub i64 [[CNT]], [[N_MOD_VF]]
				; CHECK-NEXT: br label [[VECTOR_BODY:%.*]]
				; CHECK: vector.body:
				; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
				; CHECK-NEXT: [[VEC_PHI:%.*]] = phi <2 x i64> [ zeroinitializer, [[VECTOR_PH]] ], [ [[TMP12:%.*]], [[VECTOR_BODY]] ]
				; CHECK-NEXT: [[VEC_PHI1:%.*]] = phi <2 x i64> [ zeroinitializer, [[VECTOR_PH]] ], [ [[TMP13:%.*]], [[VECTOR_BODY]] ]
				; CHECK-NEXT: [[TMP0:%.*]] = add i64 [[INDEX]], 0
				; CHECK-NEXT: [[TMP1:%.*]] = add i64 [[INDEX]], 2
				; CHECK-NEXT: [[TMP2:%.*]] = getelementptr inbounds i32, i32* [[A:%.*]], i64 [[TMP0]]
				; CHECK-NEXT: [[TMP3:%.*]] = getelementptr inbounds i32, i32* [[A]], i64 [[TMP1]]
				; CHECK-NEXT: [[TMP4:%.*]] = getelementptr inbounds i32, i32* [[TMP2]], i32 0
				; CHECK-NEXT: [[TMP5:%.*]] = bitcast i32* [[TMP4]] to <2 x i32>*
				; CHECK-NEXT: [[WIDE_LOAD:%.*]] = load <2 x i32>, <2 x i32>* [[TMP5]], align 4
				; CHECK-NEXT: [[TMP6:%.*]] = getelementptr inbounds i32, i32* [[TMP2]], i32 2
				; CHECK-NEXT: [[TMP7:%.*]] = bitcast i32* [[TMP6]] to <2 x i32>*
				; CHECK-NEXT: [[WIDE_LOAD2:%.*]] = load <2 x i32>, <2 x i32>* [[TMP7]], align 4
				; CHECK-NEXT: [[TMP8:%.*]] = icmp sgt <2 x i32> [[WIDE_LOAD]], zeroinitializer
				; CHECK-NEXT: [[TMP9:%.*]] = icmp sgt <2 x i32> [[WIDE_LOAD2]], zeroinitializer
				; CHECK-NEXT: [[TMP10:%.*]] = add <2 x i64> [[VEC_PHI]], <i64 1, i64 1>
				; CHECK-NEXT: [[TMP11:%.*]] = add <2 x i64> [[VEC_PHI1]], <i64 1, i64 1>
				; CHECK-NEXT: [[TMP12]] = select <2 x i1> [[TMP8]], <2 x i64> [[TMP10]], <2 x i64> [[VEC_PHI]]
				; CHECK-NEXT: [[TMP13]] = select <2 x i1> [[TMP9]], <2 x i64> [[TMP11]], <2 x i64> [[VEC_PHI1]]
				; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 4
				; CHECK-NEXT: [[TMP14:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
				; CHECK-NEXT: br i1 [[TMP14]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP20:![0-9]+]]
				; CHECK: middle.block:
				; CHECK-NEXT: [[BIN_RDX:%.*]] = add <2 x i64> [[TMP13]], [[TMP12]]
				; CHECK-NEXT: [[TMP15:%.*]] = call i64 @llvm.vector.reduce.add.v2i64(<2 x i64> [[BIN_RDX]])
				; CHECK-NEXT: [[TMP16:%.*]] = sitofp i64 [[TMP15]] to double
				; CHECK-NEXT: [[TMP17:%.*]] = call double @llvm.minnum.f64(double [[TMP16]], double 0x4340000000000000)
				; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[CNT]], [[N_VEC]]
				; CHECK-NEXT: br i1 [[CMP_N]], label [[EXIT:%.*]], label [[SCALAR_PH]]
				; CHECK: scalar.ph:
				; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ]
				; CHECK-NEXT: [[BC_MERGE_RDX:%.*]] = phi double [ 0.000000e+00, [[ENTRY]] ], [ [[TMP17]], [[MIDDLE_BLOCK]] ]
	; CHECK-NEXT: br label [[LOOP:%.*]]			; CHECK-NEXT: br label [[LOOP:%.*]]
	; CHECK: loop:			; CHECK: loop:
	; CHECK-NEXT: [[IV:%.*]] = phi i64 [ 0, [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]			; CHECK-NEXT: [[IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
	; CHECK-NEXT: [[RED:%.*]] = phi double [ 0.000000e+00, [[ENTRY]] ], [ [[RED_NEXT:%.*]], [[LOOP]] ]			; CHECK-NEXT: [[RED:%.*]] = phi double [ [[BC_MERGE_RDX]], [[SCALAR_PH]] ], [ [[RED_NEXT:%.*]], [[LOOP]] ]
	; CHECK-NEXT: [[GEP_A:%.*]] = getelementptr inbounds i32, i32* [[A:%.*]], i64 [[IV]]			; CHECK-NEXT: [[GEP_A:%.*]] = getelementptr inbounds i32, i32* [[A]], i64 [[IV]]
	; CHECK-NEXT: [[LV_A:%.*]] = load i32, i32* [[GEP_A]], align 4			; CHECK-NEXT: [[LV_A:%.*]] = load i32, i32* [[GEP_A]], align 4
	; CHECK-NEXT: [[CMP1:%.*]] = icmp sgt i32 [[LV_A]], 0			; CHECK-NEXT: [[CMP1:%.*]] = icmp sgt i32 [[LV_A]], 0
	; CHECK-NEXT: [[ADD:%.*]] = fadd double [[RED]], 1.000000e+00			; CHECK-NEXT: [[ADD:%.*]] = fadd double [[RED]], 1.000000e+00
	; CHECK-NEXT: [[RED_NEXT]] = select i1 [[CMP1]], double [[ADD]], double [[RED]]			; CHECK-NEXT: [[RED_NEXT]] = select i1 [[CMP1]], double [[ADD]], double [[RED]]
	; CHECK-NEXT: [[IV_NEXT]] = add nuw nsw i64 [[IV]], 1			; CHECK-NEXT: [[IV_NEXT]] = add nuw nsw i64 [[IV]], 1
	; CHECK-NEXT: [[EXITCOND_NOT:%.*]] = icmp eq i64 [[IV_NEXT]], [[CNT:%.*]]			; CHECK-NEXT: [[EXITCOND_NOT:%.*]] = icmp eq i64 [[IV_NEXT]], [[CNT]]
	; CHECK-NEXT: br i1 [[EXITCOND_NOT]], label [[EXIT:%.*]], label [[LOOP]]			; CHECK-NEXT: br i1 [[EXITCOND_NOT]], label [[EXIT]], label [[LOOP]], !llvm.loop [[LOOP21:![0-9]+]]
	; CHECK: exit:			; CHECK: exit:
	; CHECK-NEXT: [[RED_NEXT_LCSSA:%.*]] = phi double [ [[RED_NEXT]], [[LOOP]] ]			; CHECK-NEXT: [[RED_NEXT_LCSSA:%.*]] = phi double [ [[RED_NEXT]], [[LOOP]] ], [ [[TMP17]], [[MIDDLE_BLOCK]] ]
	; CHECK-NEXT: ret double [[RED_NEXT_LCSSA]]			; CHECK-NEXT: ret double [[RED_NEXT_LCSSA]]
	;			;
	entry:			entry:
	br label %loop			br label %loop

	loop:			loop:
	%iv = phi i64 [ 0, %entry ], [ %iv.next, %loop ]			%iv = phi i64 [ 0, %entry ], [ %iv.next, %loop ]
	%red = phi double [ 0.000000e+00, %entry ], [ %red.next, %loop ]			%red = phi double [ 0.000000e+00, %entry ], [ %red.next, %loop ]
	%gep.A = getelementptr inbounds i32, i32* %A, i64 %iv			%gep.A = getelementptr inbounds i32, i32* %A, i64 %iv
	%lv.A = load i32, i32* %gep.A, align 4			%lv.A = load i32, i32* %gep.A, align 4
	%cmp1 = icmp sgt i32 %lv.A, 0			%cmp1 = icmp sgt i32 %lv.A, 0
	%add = fadd double %red, 1.000000e+00			%add = fadd double %red, 1.000000e+00
	%red.next = select i1 %cmp1, double %add, double %red			%red.next = select i1 %cmp1, double %add, double %red
	%iv.next = add nuw nsw i64 %iv, 1			%iv.next = add nuw nsw i64 %iv, 1
	%exitcond.not = icmp eq i64 %iv.next, %cnt			%exitcond.not = icmp eq i64 %iv.next, %cnt
	br i1 %exitcond.not, label %exit, label %loop			br i1 %exitcond.not, label %exit, label %loop

	exit:			exit:
	ret double %red.next			ret double %red.next
	}			}

	define float @test_fadd_to_int_with_select_add_2(i32* nocapture readonly %A, i64 %cnt) {			define float @test_fadd_to_int_with_select_add_2(i32* nocapture readonly %A, i64 %cnt) {
	; CHECK-LABEL: @test_fadd_to_int_with_select_add_2(			; CHECK-LABEL: @test_fadd_to_int_with_select_add_2(
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
				; CHECK-NEXT: [[MIN_ITERS_CHECK:%.*]] = icmp ult i64 [[CNT:%.*]], 8
				; CHECK-NEXT: br i1 [[MIN_ITERS_CHECK]], label [[SCALAR_PH:%.*]], label [[VECTOR_PH:%.*]]
				; CHECK: vector.ph:
				; CHECK-NEXT: [[N_MOD_VF:%.*]] = urem i64 [[CNT]], 8
				; CHECK-NEXT: [[N_VEC:%.*]] = sub i64 [[CNT]], [[N_MOD_VF]]
				; CHECK-NEXT: br label [[VECTOR_BODY:%.*]]
				; CHECK: vector.body:
				; CHECK-NEXT: [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
				; CHECK-NEXT: [[VEC_PHI:%.*]] = phi <4 x i32> [ zeroinitializer, [[VECTOR_PH]] ], [ [[TMP12:%.*]], [[VECTOR_BODY]] ]
				; CHECK-NEXT: [[VEC_PHI1:%.*]] = phi <4 x i32> [ zeroinitializer, [[VECTOR_PH]] ], [ [[TMP13:%.*]], [[VECTOR_BODY]] ]
				; CHECK-NEXT: [[TMP0:%.*]] = add i64 [[INDEX]], 0
				; CHECK-NEXT: [[TMP1:%.*]] = add i64 [[INDEX]], 4
				; CHECK-NEXT: [[TMP2:%.*]] = getelementptr inbounds i32, i32* [[A:%.*]], i64 [[TMP0]]
				; CHECK-NEXT: [[TMP3:%.*]] = getelementptr inbounds i32, i32* [[A]], i64 [[TMP1]]
				; CHECK-NEXT: [[TMP4:%.*]] = getelementptr inbounds i32, i32* [[TMP2]], i32 0
				; CHECK-NEXT: [[TMP5:%.*]] = bitcast i32* [[TMP4]] to <4 x i32>*
				; CHECK-NEXT: [[WIDE_LOAD:%.*]] = load <4 x i32>, <4 x i32>* [[TMP5]], align 4
				; CHECK-NEXT: [[TMP6:%.*]] = getelementptr inbounds i32, i32* [[TMP2]], i32 4
				; CHECK-NEXT: [[TMP7:%.*]] = bitcast i32* [[TMP6]] to <4 x i32>*
				; CHECK-NEXT: [[WIDE_LOAD2:%.*]] = load <4 x i32>, <4 x i32>* [[TMP7]], align 4
				; CHECK-NEXT: [[TMP8:%.*]] = icmp sgt <4 x i32> [[WIDE_LOAD]], zeroinitializer
				; CHECK-NEXT: [[TMP9:%.*]] = icmp sgt <4 x i32> [[WIDE_LOAD2]], zeroinitializer
				; CHECK-NEXT: [[TMP10:%.*]] = add <4 x i32> [[VEC_PHI]], <i32 1, i32 1, i32 1, i32 1>
				; CHECK-NEXT: [[TMP11:%.*]] = add <4 x i32> [[VEC_PHI1]], <i32 1, i32 1, i32 1, i32 1>
				; CHECK-NEXT: [[TMP12]] = select <4 x i1> [[TMP8]], <4 x i32> [[TMP10]], <4 x i32> [[VEC_PHI]]
				; CHECK-NEXT: [[TMP13]] = select <4 x i1> [[TMP9]], <4 x i32> [[TMP11]], <4 x i32> [[VEC_PHI1]]
				; CHECK-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 8
				; CHECK-NEXT: [[TMP14:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
				; CHECK-NEXT: br i1 [[TMP14]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP22:![0-9]+]]
				; CHECK: middle.block:
				; CHECK-NEXT: [[BIN_RDX:%.*]] = add <4 x i32> [[TMP13]], [[TMP12]]
				; CHECK-NEXT: [[TMP15:%.*]] = call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> [[BIN_RDX]])
				; CHECK-NEXT: [[TMP16:%.*]] = sitofp i32 [[TMP15]] to float
				; CHECK-NEXT: [[TMP17:%.*]] = call float @llvm.minnum.f32(float [[TMP16]], float 0x4170000000000000)
				; CHECK-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[CNT]], [[N_VEC]]
				; CHECK-NEXT: br i1 [[CMP_N]], label [[EXIT:%.*]], label [[SCALAR_PH]]
				; CHECK: scalar.ph:
				; CHECK-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ [[N_VEC]], [[MIDDLE_BLOCK]] ], [ 0, [[ENTRY:%.*]] ]
				; CHECK-NEXT: [[BC_MERGE_RDX:%.*]] = phi float [ 0.000000e+00, [[ENTRY]] ], [ [[TMP17]], [[MIDDLE_BLOCK]] ]
	; CHECK-NEXT: br label [[LOOP:%.*]]			; CHECK-NEXT: br label [[LOOP:%.*]]
	; CHECK: loop:			; CHECK: loop:
	; CHECK-NEXT: [[IV:%.*]] = phi i64 [ 0, [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]			; CHECK-NEXT: [[IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], [[SCALAR_PH]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
	; CHECK-NEXT: [[RED:%.*]] = phi float [ 0.000000e+00, [[ENTRY]] ], [ [[RED_NEXT:%.*]], [[LOOP]] ]			; CHECK-NEXT: [[RED:%.*]] = phi float [ [[BC_MERGE_RDX]], [[SCALAR_PH]] ], [ [[RED_NEXT:%.*]], [[LOOP]] ]
	; CHECK-NEXT: [[GEP_A:%.*]] = getelementptr inbounds i32, i32* [[A:%.*]], i64 [[IV]]			; CHECK-NEXT: [[GEP_A:%.*]] = getelementptr inbounds i32, i32* [[A]], i64 [[IV]]
	; CHECK-NEXT: [[LV_A:%.*]] = load i32, i32* [[GEP_A]], align 4			; CHECK-NEXT: [[LV_A:%.*]] = load i32, i32* [[GEP_A]], align 4
	; CHECK-NEXT: [[CMP1:%.*]] = icmp sgt i32 [[LV_A]], 0			; CHECK-NEXT: [[CMP1:%.*]] = icmp sgt i32 [[LV_A]], 0
	; CHECK-NEXT: [[ADD:%.*]] = fadd float [[RED]], 1.000000e+00			; CHECK-NEXT: [[ADD:%.*]] = fadd float [[RED]], 1.000000e+00
	; CHECK-NEXT: [[RED_NEXT]] = select i1 [[CMP1]], float [[ADD]], float [[RED]]			; CHECK-NEXT: [[RED_NEXT]] = select i1 [[CMP1]], float [[ADD]], float [[RED]]
	; CHECK-NEXT: [[IV_NEXT]] = add nuw nsw i64 [[IV]], 1			; CHECK-NEXT: [[IV_NEXT]] = add nuw nsw i64 [[IV]], 1
	; CHECK-NEXT: [[EXITCOND_NOT:%.*]] = icmp eq i64 [[IV_NEXT]], [[CNT:%.*]]			; CHECK-NEXT: [[EXITCOND_NOT:%.*]] = icmp eq i64 [[IV_NEXT]], [[CNT]]
	; CHECK-NEXT: br i1 [[EXITCOND_NOT]], label [[EXIT:%.*]], label [[LOOP]]			; CHECK-NEXT: br i1 [[EXITCOND_NOT]], label [[EXIT]], label [[LOOP]], !llvm.loop [[LOOP23:![0-9]+]]
	; CHECK: exit:			; CHECK: exit:
	; CHECK-NEXT: [[RED_NEXT_LCSSA:%.*]] = phi float [ [[RED_NEXT]], [[LOOP]] ]			; CHECK-NEXT: [[RED_NEXT_LCSSA:%.*]] = phi float [ [[RED_NEXT]], [[LOOP]] ], [ [[TMP17]], [[MIDDLE_BLOCK]] ]
	; CHECK-NEXT: ret float [[RED_NEXT_LCSSA]]			; CHECK-NEXT: ret float [[RED_NEXT_LCSSA]]
	;			;
	entry:			entry:
	br label %loop			br label %loop

	loop:			loop:
	%iv = phi i64 [ 0, %entry ], [ %iv.next, %loop ]			%iv = phi i64 [ 0, %entry ], [ %iv.next, %loop ]
	%red = phi float [ 0.000000e+00, %entry ], [ %red.next, %loop ]			%red = phi float [ 0.000000e+00, %entry ], [ %red.next, %loop ]
	[157 lines not shown]