This is an archive of the discontinued LLVM Phabricator instance.

Differential D95286

[LSR] Drop potentially invalid nowrap flags when switching to post-inc IV (PR46943)
Closed, Public

Authored by nikic on Jan 23 2021, 5:52 AM.

Details

Reviewers

fhahn
reames
mkazantsev
dmgreen

Commits

rG835104a1141a: [LSR] Drop potentially invalid nowrap flags when switching to post-inc IV…

Summary

When LSR converts a branch on pre-inc IV into a branch on post-inc IV, the nowrap flags on the addition may no longer be valid. Previously, a poison result of the addition might have been ignored, in which case the program was well defined. After branching on the post-inc IV, we might be branching on poison, which is undefined behavior.

Fix this by discarding nowrap flags which are not present on the SCEV expression. Nowrap flags on the SCEV expression are proven by SCEV to always hold, independently of how the expression will be used. This is essentially the same fix we applied to IndVars LFTR, which also performs this kind of pre-inc to post-inc conversion.

I believe a similar problem can also exist for getelementptr inbounds, but I was not able to come up with a problematic test case. The inbounds case would have to be addressed differently anyway (as SCEV does not track this property).

Fixes https://bugs.llvm.org/show_bug.cgi?id=46943.
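
As a rough illustration of the rule described in the summary (keep a nowrap flag on the add only if SCEV has proven it), the following stand-alone sketch models the flags as a bitmask and intersects them. The names `NoWrapFlags` and `keepProvenFlags` are hypothetical stand-ins for illustration and are not LLVM's API; the actual patch operates on the expanded instruction and the SCEV expression.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the nowrap flags; these are not LLVM's types.
enum NoWrapFlags : std::uint8_t {
  FlagNone = 0,
  FlagNUW = 1 << 0, // no unsigned wrap
  FlagNSW = 1 << 1, // no signed wrap
};

// Keep only those flags on the IR add that the analysis has proven to hold
// regardless of how the value is used. Flags are only ever dropped here,
// never added, even if the analysis proved more than the IR claimed.
inline std::uint8_t keepProvenFlags(std::uint8_t irFlags,
                                    std::uint8_t provenFlags) {
  assert((irFlags | provenFlags) <= (FlagNUW | FlagNSW) && "unknown flag bit");
  return irFlags & provenFlags;
}
```

For example, an add carrying both `nuw` and `nsw` whose expression is only proven `nsw` would keep `nsw` and lose `nuw` under this model.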

Event Timeline

nikic created this revision. · Jan 23 2021, 5:52 AM

Herald added subscribers: dmgreen, javed.absar. · Jan 23 2021, 5:52 AM

nikic requested review of this revision. · Jan 23 2021, 5:52 AM

Herald added a project: Restricted Project. · Jan 23 2021, 5:52 AM

Herald added a subscriber: llvm-commits.

Harbormaster completed remote builds in B86408: Diff 318751. · Jan 23 2021, 5:55 AM

Adding @dmgreen for the changes to the Thumb2 tests.

jrmuizel added a subscriber: jrmuizel. · Jan 23 2021, 10:48 AM

Hmm. These are the only tests that changed?

They are no longer producing hardware loops for what look like unrolled loops. Those are very old tests though and I don't believe we generate code quite like that any more.

@dmgreen Yeah, those are the only test changes. I believe the problem for those loops is that they have two induction variables, one that counts up and one that counts down. The branch is on the one that counts down (so this is the IV on which flags can be safely preserved), while there's a nuw flag on the one that counts up.

Yeah OK. I ran some tests and did some experiments - none of which showed any similar problems. The correctness fix sounds valid to me so I'm inclined to say that the Thumb2 tests are OK (as in not worth worrying about too much) and if we run into similar problems later on we can try and do something about them then.

LGTM

This revision is now accepted and ready to land. · Jan 24 2021, 10:10 AM

fhahn added inline comments. · Jan 24 2021, 12:56 PM

lib/Transforms/Utils/ScalarEvolutionExpander.cpp
1446	`OverflowingBinaryOperator`?
test/Transforms/LoopStrengthReduce/X86/pr46943.ll
48	Is this transform actually still helpful, if we have to drop flags to do it?

nikic added inline comments. · Jan 24 2021, 1:07 PM

test/Transforms/LoopStrengthReduce/X86/pr46943.ll
48	As LSR runs in the backend, and the backend makes rather little use of nowrap flags, I would assume so. When we did the same change in LFTR (which runs in the middle of the pipeline where nowrap flags are more important), I don't think any performance regressions were reported.

Use OverflowingBinaryOperator.

Herald added a subscriber: hiraditya. · Jan 24 2021, 1:10 PM

LGTM as well w/minor comments.

llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp
1447 ↗	(On Diff #318864)	if (auto *op = dyn_cast<OBO>...) e.g. you don't need the instruction, you can use the accessors on OBO.
1451 ↗	(On Diff #318864)	Any reason not to just copy the SCEV flags? Inferring stronger flags should be legal here.
llvm/test/Transforms/LoopStrengthReduce/X86/sibling-loops.ll
20 ↗	(On Diff #318864)	Given the nsw is present in the source, SCEV should know this is nsw. Any idea why it doesn't?

nikic added inline comments. · Jan 25 2021, 12:56 AM

llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp
1447 ↗	(On Diff #318864)	OBO can generally also be a constant expression, which is why it's not allowed to set nowrap flags directly on it. We need to go through Instruction to modify the flags.
1451 ↗	(On Diff #318864)	I don't think that's quite correct without additional checks. We're checking the flags on the post-inc addrec, which don't make any statement about overflow on the first iteration. If you have something like pre-inc `{255,+,1}` and `{0,+,1}` post-inc, then the latter would be nuw (assuming appropriate BE count), while the former would not be. The add can't be nuw in that case, due to the overflow on the first iteration.
llvm/test/Transforms/LoopStrengthReduce/X86/sibling-loops.ll
20 ↗	(On Diff #318864)	The `%inc` IV doesn't seem to ever be branched on, so there's no guarantee that %inc being poison would result in undefined behavior. Thus SCEV can't transfer poison flags from IR. There are some additional cases we could transfer using D92739 (for branches in non-latch exits), but I don't think that would help this case either.
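
The `{255,+,1}` pre-inc / `{0,+,1}` post-inc case from the comment on line 1451 can be checked with a small stand-alone simulation in 8-bit arithmetic. This is only a sketch: the helper names `incrementEverWraps` and `postIncSequenceEverWraps` are hypothetical, and the code models the arithmetic rather than SCEV itself. It shows that the post-inc sequence can be free of unsigned wrap while the add that produces its first value still wraps.

```cpp
#include <cassert>
#include <cstdint>

// Returns true if the add "pre + 1" wraps (unsigned, 8 bits) on any of the
// first `iterations` iterations, with the pre-inc IV starting at `start`.
inline bool incrementEverWraps(std::uint8_t start, unsigned iterations) {
  assert(iterations >= 1 && "expects at least one iteration");
  std::uint8_t pre = start;
  for (unsigned i = 0; i < iterations; ++i) {
    if (pre == 0xFF)
      return true; // 255 + 1 does not fit in 8 bits
    pre = static_cast<std::uint8_t>(pre + 1);
  }
  return false;
}

// Returns true if the sequence of post-inc values wraps from one iteration
// to the next. The step from the pre-inc start value to the first post-inc
// value is deliberately not examined, mirroring the point that flags on the
// post-inc addrec say nothing about the first iteration's add.
inline bool postIncSequenceEverWraps(std::uint8_t start, unsigned iterations) {
  assert(iterations >= 1 && "expects at least one iteration");
  std::uint8_t post = static_cast<std::uint8_t>(start + 1);
  for (unsigned i = 1; i < iterations; ++i) {
    std::uint8_t next = static_cast<std::uint8_t>(post + 1);
    if (next < post)
      return true;
    post = next;
  }
  return false;
}
```

With a start value of 255 and ten iterations, the post-inc values run 0, 1, ..., 9 without wrapping, yet the very first add (255 + 1) wraps, which is why copying `nuw` from the post-inc addrec onto the add would not be sound without further checks.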

fhahn added inline comments. · Jan 25 2021, 4:28 AM

llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp
1447 ↗	(On Diff #318864)	But can `Result` be a constant expression here? `PN` must be an AddRec for L, then the incoming value from the loop should be non-constant, otherwise it wouldn't be an AddRec?

nikic added inline comments. · Jan 25 2021, 5:13 AM

llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp
1447 ↗	(On Diff #318864)	Yes, it can't be a constant expression here, but it can be in general, thus the OBO API does not support setting nowrap flags. (This is why I was originally using BinaryOperator here.) Or do you mean that this code should be using `cast<>` rather than `dyn_cast<>` for Instruction? I can change that.

Use cast<> instead of dyn_cast<>.

Closed by commit rG835104a1141a: [LSR] Drop potentially invalid nowrap flags when switching to post-inc IV… (authored by nikic). · Jan 25 2021, 2:15 PM

This revision was automatically updated to reflect the committed changes.

nikic added a commit: rG835104a1141a: [LSR] Drop potentially invalid nowrap flags when switching to post-inc IV….

fhahn added inline comments. · Jan 26 2021, 4:14 AM

llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp
1447 ↗	(On Diff #318864)	"Or do you mean that this code should be using cast<> rather than dyn_cast<> for Instruction? I can change that." Yep, thanks!

Revision Contents

Path                                                              Size
llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp             10 lines
llvm/test/CodeGen/Thumb2/LowOverheadLoops/fast-fp-loops.ll        58 lines
llvm/test/CodeGen/Thumb2/LowOverheadLoops/mve-float-loops.ll      153 lines
llvm/test/CodeGen/Thumb2/LowOverheadLoops/mve-tail-data-types.ll  292 lines
llvm/test/Transforms/LoopStrengthReduce/X86/pr46943.ll            7 lines
llvm/test/Transforms/LoopStrengthReduce/X86/sibling-loops.ll      10 lines

Diff 318751

lib/Transforms/Utils/ScalarEvolutionExpander.cpp

[... 113 lines not shown ...] SCEVExpander::findInsertPointAfter(Instruction *I, Instruction *MustDominate) {
  return IP;
}

/// InsertNoopCastOfTo - Insert a cast of V to the specified type,
/// which must be possible with a noop cast, doing what we can to share
/// the casts.
Value *SCEVExpander::InsertNoopCastOfTo(Value *V, Type *Ty) {
  Instruction::CastOps Op = CastInst::getCastOpcode(V, false, Ty, false);
  assert((Op == Instruction::BitCast ||
          Op == Instruction::PtrToInt ||
          Op == Instruction::IntToPtr) &&
         "InsertNoopCastOfTo cannot perform non-noop casts!");
  assert(SE.getTypeSizeInBits(V->getType()) == SE.getTypeSizeInBits(Ty) &&
         "InsertNoopCastOfTo cannot change sizes!");

  // inttoptr only works for integral pointers. For non-integral pointers, we
  // can create a GEP on i8* null with the integral value as index. Note that
[... 22 lines not shown ...] Value *SCEVExpander::InsertNoopCastOfTo(Value *V, Type *Ty) {
  }
  // Short-circuit unnecessary inttoptr<->ptrtoint casts.
  if ((Op == Instruction::PtrToInt || Op == Instruction::IntToPtr) &&
      SE.getTypeSizeInBits(Ty) == SE.getTypeSizeInBits(V->getType())) {
    if (CastInst *CI = dyn_cast<CastInst>(V))
      if ((CI->getOpcode() == Instruction::PtrToInt ||
           CI->getOpcode() == Instruction::IntToPtr) &&
          SE.getTypeSizeInBits(CI->getType()) ==
          SE.getTypeSizeInBits(CI->getOperand(0)->getType()))
        return CI->getOperand(0);
    if (ConstantExpr *CE = dyn_cast<ConstantExpr>(V))
      if ((CE->getOpcode() == Instruction::PtrToInt ||
           CE->getOpcode() == Instruction::IntToPtr) &&
          SE.getTypeSizeInBits(CE->getType()) ==
          SE.getTypeSizeInBits(CE->getOperand(0)->getType()))
        return CE->getOperand(0);
  }

  // Fold a cast of a constant.
  if (Constant *C = dyn_cast<Constant>(V))
    return ConstantExpr::getCast(Op, C, Ty);

  // Cast the argument at the beginning of the entry block, after
[... 12 lines not shown ...] Value *SCEVExpander::InsertNoopCastOfTo(Value *V, Type *Ty) {
  Instruction *I = cast<Instruction>(V);
  BasicBlock::iterator IP = findInsertPointAfter(I, &*Builder.GetInsertPoint());
  return ReuseOrCreateCast(I, Ty, Op, IP);
}

/// InsertBinop - Insert the specified binary operator, doing a small amount
/// of work to avoid inserting an obviously redundant operation, and hoisting
/// to an outer loop when the opportunity is there and it is safe.
Value *SCEVExpander::InsertBinop(Instruction::BinaryOps Opcode,
                                 Value *LHS, Value *RHS,
                                 SCEV::NoWrapFlags Flags, bool IsSafeToHoist) {
  // Fold a binop with constant operands.
  if (Constant *CLHS = dyn_cast<Constant>(LHS))
    if (Constant *CRHS = dyn_cast<Constant>(RHS))
      return ConstantExpr::get(Opcode, CLHS, CRHS);

  // Do a quick scan to see if we have this binop nearby. If so, reuse it.
[... 21 lines not shown ...] for (; ScanLimit; --IP, --ScanLimit) {
        // flags installed.
        if (isa<PossiblyExactOperator>(I) && I->isExact())
          return true;
        return false;
      };
      if (IP->getOpcode() == (unsigned)Opcode && IP->getOperand(0) == LHS &&
          IP->getOperand(1) == RHS && !canGenerateIncompatiblePoison(&*IP))
        return &*IP;
      if (IP == BlockBegin) break;
    }
  }

  // Save the original insertion point so we can restore it when we're done.
  DebugLoc Loc = Builder.GetInsertPoint()->getDebugLoc();
  SCEVInsertPointGuard Guard(Builder, this);

  if (IsSafeToHoist) {
    // Move the insertion point out of as many loops as we can.
    while (const Loop *L = SE.LI.getLoopFor(Builder.GetInsertBlock())) {
      if (!L->isLoopInvariant(LHS) || !L->isLoopInvariant(RHS)) break;
      BasicBlock *Preheader = L->getLoopPreheader();
      if (!Preheader) break;

      // Ok, move up a level.
      Builder.SetInsertPoint(Preheader->getTerminator());
    }
  }

  // If we haven't found this binop, insert it.
  Instruction *BO = cast<Instruction>(Builder.CreateBinOp(Opcode, LHS, RHS));
[... 78 lines not shown ...] static bool FactorOutConstant(const SCEV *&S, const SCEV *&Remainder,

  return false;
}

/// SimplifyAddOperands - Sort and simplify a list of add operands. NumAddRecs
/// is the number of SCEVAddRecExprs present, which are kept at the end of
/// the list.
///
static void SimplifyAddOperands(SmallVectorImpl<const SCEV *> &Ops,
                                Type *Ty,
                                ScalarEvolution &SE) {
  unsigned NumAddRecs = 0;
  for (unsigned i = Ops.size(); i > 0 && isa<SCEVAddRecExpr>(Ops[i-1]); --i)
    ++NumAddRecs;
  // Group Ops into non-addrecs and addrecs.
  SmallVector<const SCEV *, 8> NoAddRecs(Ops.begin(), Ops.end() - NumAddRecs);
  SmallVector<const SCEV *, 8> AddRecs(Ops.end() - NumAddRecs, Ops.end());
  // Let ScalarEvolution sort and simplify the non-addrecs list.
  const SCEV *Sum = NoAddRecs.empty() ?
                    SE.getConstant(Ty, 0) :
                    SE.getAddExpr(NoAddRecs);
  // If it returned an add, use the operands. Otherwise it simplified
  // the sum into a single value, so just use that.
  Ops.clear();
  if (const SCEVAddExpr *Add = dyn_cast<SCEVAddExpr>(Sum))
    Ops.append(Add->op_begin(), Add->op_end());
  else if (!Sum->isZero())
    Ops.push_back(Sum);
  // Then append the addrecs.
  Ops.append(AddRecs.begin(), AddRecs.end());
}

/// SplitAddRecs - Flatten a list of add operands, moving addrec start values
/// out to the top level. For example, convert {a + b,+,c} to a, b, {0,+,d}.
/// This helps expose more opportunities for folding parts of the expressions
/// into GEP indices.
///
static void SplitAddRecs(SmallVectorImpl<const SCEV *> &Ops,
                         Type *Ty,
                         ScalarEvolution &SE) {
  // Find the addrecs.
  SmallVector<const SCEV *, 8> AddRecs;
  for (unsigned i = 0, e = Ops.size(); i != e; ++i)
    while (const SCEVAddRecExpr *A = dyn_cast<SCEVAddRecExpr>(Ops[i])) {
      const SCEV *Start = A->getStart();
      if (Start->isZero()) break;
      const SCEV *Zero = SE.getConstant(Ty, 0);
      AddRecs.push_back(SE.getAddRecExpr(Zero,
                                         A->getStepRecurrence(SE),
                                         A->getLoop(),
                                         A->getNoWrapFlags(SCEV::FlagNW)));
      if (const SCEVAddExpr *Add = dyn_cast<SCEVAddExpr>(Start)) {
        Ops[i] = Zero;
        Ops.append(Add->op_begin(), Add->op_end());
        e += Add->getNumOperands();
      } else {
[... 31 lines not shown ...]
/// pushing loop-invariant computation down into loops, so even if the
/// GEPs were split here, the work would quickly be undone. The
/// LoopStrengthReduction pass, which is usually run quite late (and
/// after the last InstructionCombining pass), takes care of hoisting
/// loop-invariant portions of expressions, after considering what
/// can be folded using target addressing modes.
///
Value *SCEVExpander::expandAddToGEP(const SCEV *const *op_begin,
                                    const SCEV *const *op_end,
                                    PointerType *PTy,
                                    Type *Ty,
                                    Value *V) {
  Type *OriginalElTy = PTy->getElementType();
  Type *ElTy = OriginalElTy;
  SmallVector<Value *, 4> GepIndices;
  SmallVector<const SCEV *, 8> Ops(op_begin, op_end);
  bool AnyNonZeroIndices = false;
[... 48 lines not shown ...] Value *Scaled =
            ? Constant::getNullValue(Ty)
            : expandCodeForImpl(SE.getAddExpr(ScaledOps), Ty, false);
    GepIndices.push_back(Scaled);

    // Collect struct field index operands.
    while (StructType *STy = dyn_cast<StructType>(ElTy)) {
      bool FoundFieldNo = false;
      // An empty struct has no fields.
      if (STy->getNumElements() == 0) break;
      // Field offsets are known. See if a constant offset falls within any of
      // the struct fields.
      if (Ops.empty())
        break;
      if (const SCEVConstant *C = dyn_cast<SCEVConstant>(Ops[0]))
        if (SE.getTypeSizeInBits(C->getType()) <= 64) {
          const StructLayout &SL = *DL.getStructLayout(STy);
          uint64_t FullOffset = C->getValue()->getZExtValue();
[... 9 lines not shown ...] while (StructType *STy = dyn_cast<StructType>(ElTy)) {
          }
        }
      // If no struct field offsets were found, tentatively assume that
      // field zero was selected (since the zero offset would obviously
      // be folded away).
      if (!FoundFieldNo) {
        ElTy = STy->getTypeAtIndex(0u);
        GepIndices.push_back(
          Constant::getNullValue(Type::getInt32Ty(Ty->getContext())));
      }
    }

    if (ArrayType *ATy = dyn_cast<ArrayType>(ElTy))
      ElTy = ATy->getElementType();
    else
      // FIXME: Handle VectorType.
      // E.g., If ElTy is scalable vector, then ElSize is not a compile-time
      // constant, therefore can not be factored out. The generated IR is less
      // ideal with base 'V' cast to i8* and do ugly getelementptr over that.
      break;
  }

  // If none of the operands were convertible to proper GEP indices, cast
  // the base to i8* and do an ugly getelementptr with that. It's still
  // better than ptrtoint+arithmetic+inttoptr at least.
  if (!AnyNonZeroIndices) {
    // Cast the base to i8*.
    V = InsertNoopCastOfTo(V,
       Type::getInt8PtrTy(Ty->getContext(), PTy->getAddressSpace()));

    assert(!isa<Instruction>(V) ||
           SE.DT.dominates(cast<Instruction>(V), &*Builder.GetInsertPoint()));

    // Expand the operands for a plain byte offset.
    Value *Idx = expandCodeForImpl(SE.getAddExpr(Ops), Ty, false);

Show All 13 Lines	if (IP != BlockBegin) {
for (; ScanLimit; --IP, --ScanLimit) {		for (; ScanLimit; --IP, --ScanLimit) {
// Don't count dbg.value against the ScanLimit, to avoid perturbing the		// Don't count dbg.value against the ScanLimit, to avoid perturbing the
// generated code.		// generated code.
if (isa<DbgInfoIntrinsic>(IP))		if (isa<DbgInfoIntrinsic>(IP))
ScanLimit++;		ScanLimit++;
if (IP->getOpcode() == Instruction::GetElementPtr &&		if (IP->getOpcode() == Instruction::GetElementPtr &&
IP->getOperand(0) == V && IP->getOperand(1) == Idx)		IP->getOperand(0) == V && IP->getOperand(1) == Idx)
return &*IP;		return &*IP;
if (IP == BlockBegin) break;		if (IP == BlockBegin) break;
		Lint: Pre-merge checks Inline Actions clang-format: please reformat the code - if (IP == BlockBegin) break; + if (IP == BlockBegin) + break; Lint: Pre-merge checks: clang-format: please reformat the code ``` - if (IP == BlockBegin) break; + if…
}		}
}		}

// Save the original insertion point so we can restore it when we're done.		// Save the original insertion point so we can restore it when we're done.
SCEVInsertPointGuard Guard(Builder, this);		SCEVInsertPointGuard Guard(Builder, this);

// Move the insertion point out of as many loops as we can.		// Move the insertion point out of as many loops as we can.
while (const Loop *L = SE.LI.getLoopFor(Builder.GetInsertBlock())) {		while (const Loop *L = SE.LI.getLoopFor(Builder.GetInsertBlock())) {
if (!L->isLoopInvariant(V) \|\| !L->isLoopInvariant(Idx)) break;		if (!L->isLoopInvariant(V) \|\| !L->isLoopInvariant(Idx)) break;
		Lint: Pre-merge checks Inline Actions clang-format: please reformat the code - if (!L->isLoopInvariant(V) \|\| !L->isLoopInvariant(Idx)) break; + if (!L->isLoopInvariant(V) \|\| !L->isLoopInvariant(Idx)) + break; Lint: Pre-merge checks: clang-format: please reformat the code ``` - if (!L->isLoopInvariant(V) \|\| !L…
BasicBlock *Preheader = L->getLoopPreheader();		BasicBlock *Preheader = L->getLoopPreheader();
if (!Preheader) break;		if (!Preheader) break;
		Lint: Pre-merge checks Inline Actions clang-format: please reformat the code - if (!Preheader) break; + if (!Preheader) + break; Lint: Pre-merge checks: clang-format: please reformat the code ``` - if (!Preheader) break; + if (!Preheader)…

// Ok, move up a level.		// Ok, move up a level.
Builder.SetInsertPoint(Preheader->getTerminator());		Builder.SetInsertPoint(Preheader->getTerminator());
}		}

// Emit a GEP.		// Emit a GEP.
return Builder.CreateGEP(Builder.getInt8Ty(), V, Idx, "uglygep");		return Builder.CreateGEP(Builder.getInt8Ty(), V, Idx, "uglygep");
}		}

{		{
SCEVInsertPointGuard Guard(Builder, this);		SCEVInsertPointGuard Guard(Builder, this);

// Move the insertion point out of as many loops as we can.		// Move the insertion point out of as many loops as we can.
while (const Loop *L = SE.LI.getLoopFor(Builder.GetInsertBlock())) {		while (const Loop *L = SE.LI.getLoopFor(Builder.GetInsertBlock())) {
if (!L->isLoopInvariant(V)) break;		if (!L->isLoopInvariant(V)) break;

bool AnyIndexNotLoopInvariant = any_of(		bool AnyIndexNotLoopInvariant = any_of(
GepIndices, [L](Value *Op) { return !L->isLoopInvariant(Op); });		GepIndices, [L](Value *Op) { return !L->isLoopInvariant(Op); });

if (AnyIndexNotLoopInvariant)		if (AnyIndexNotLoopInvariant)
break;		break;

BasicBlock *Preheader = L->getLoopPreheader();		BasicBlock *Preheader = L->getLoopPreheader();
if (!Preheader) break;		if (!Preheader) break;

// Ok, move up a level.		// Ok, move up a level.
Builder.SetInsertPoint(Preheader->getTerminator());		Builder.SetInsertPoint(Preheader->getTerminator());
}		}

// Insert a pretty getelementptr. Note that this GEP is not marked inbounds,		// Insert a pretty getelementptr. Note that this GEP is not marked inbounds,
// because ScalarEvolution may have changed the address arithmetic to		// because ScalarEvolution may have changed the address arithmetic to
// compute a value which is beyond the end of the allocated object.		// compute a value which is beyond the end of the allocated object.
Show All 13 Lines	Value *SCEVExpander::expandAddToGEP(const SCEV *Op, PointerType *PTy, Type *Ty,
return expandAddToGEP(Ops, Ops + 1, PTy, Ty, V);		return expandAddToGEP(Ops, Ops + 1, PTy, Ty, V);
}		}

/// PickMostRelevantLoop - Given two loops pick the one that's most relevant for		/// PickMostRelevantLoop - Given two loops pick the one that's most relevant for
/// SCEV expansion. If they are nested, this is the most nested. If they are		/// SCEV expansion. If they are nested, this is the most nested. If they are
/// neighboring, pick the later.		/// neighboring, pick the later.
static const Loop *PickMostRelevantLoop(const Loop *A, const Loop *B,		static const Loop *PickMostRelevantLoop(const Loop *A, const Loop *B,
DominatorTree &DT) {		DominatorTree &DT) {
if (!A) return B;		if (!A) return B;
if (!B) return A;		if (!B) return A;
if (A->contains(B)) return B;		if (A->contains(B)) return B;
if (B->contains(A)) return A;		if (B->contains(A)) return A;
if (DT.dominates(A->getHeader(), B->getHeader())) return B;		if (DT.dominates(A->getHeader(), B->getHeader())) return B;
if (DT.dominates(B->getHeader(), A->getHeader())) return A;		if (DT.dominates(B->getHeader(), A->getHeader())) return A;
return A; // Arbitrarily break the tie.		return A; // Arbitrarily break the tie.
}		}
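
The selection order in PickMostRelevantLoop can be tried outside LLVM with a small stand-in. The sketch below is not the LLVM API: `ToyLoop`, its `Parent` link, and the integer `HeaderOrder` (a simplified substitute for `DominatorTree::dominates` on loop headers) are assumptions made purely for illustration.

```cpp
#include <cassert>

// Hypothetical stand-in for llvm::Loop. Parent models nesting; HeaderOrder
// models header dominance as a simple linear order (smaller comes first).
struct ToyLoop {
  const ToyLoop *Parent;
  int HeaderOrder;

  // True if Other is this loop or is nested at any depth inside it.
  bool contains(const ToyLoop *Other) const {
    for (const ToyLoop *L = Other; L; L = L->Parent)
      if (L == this)
        return true;
    return false;
  }
};

// Mirrors the decision order of PickMostRelevantLoop: prefer the non-null
// loop, then the more deeply nested one, then the one whose header is later.
const ToyLoop *pickMostRelevant(const ToyLoop *A, const ToyLoop *B) {
  if (!A)
    return B;
  if (!B)
    return A;
  if (A->contains(B))
    return B;
  if (B->contains(A))
    return A;
  if (A->HeaderOrder < B->HeaderOrder)
    return B;
  if (B->HeaderOrder < A->HeaderOrder)
    return A;
  return A; // Arbitrarily break the tie.
}
```

With an outer loop, a loop nested inside it, and a later sibling, the nested loop wins against its parent and the later sibling wins against the earlier one, regardless of argument order in the nested case.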

Show All 33 Lines	const Loop *SCEVExpander::getRelevantLoop(const SCEV *S) {
}		}
llvm_unreachable("Unexpected SCEV type!");		llvm_unreachable("Unexpected SCEV type!");
}		}

namespace {		namespace {

/// LoopCompare - Compare loops by PickMostRelevantLoop.		/// LoopCompare - Compare loops by PickMostRelevantLoop.
class LoopCompare {		class LoopCompare {
DominatorTree &DT;		DominatorTree &DT;
public:		public:
explicit LoopCompare(DominatorTree &dt) : DT(dt) {}		explicit LoopCompare(DominatorTree &dt) : DT(dt) {}

bool operator()(std::pair<const Loop *, const SCEV *> LHS,		bool operator()(std::pair<const Loop *, const SCEV *> LHS,
std::pair<const Loop *, const SCEV *> RHS) const {		std::pair<const Loop *, const SCEV *> RHS) const {
// Keep pointer operands sorted at the end.		// Keep pointer operands sorted at the end.
if (LHS.second->getType()->isPointerTy() !=		if (LHS.second->getType()->isPointerTy() !=
RHS.second->getType()->isPointerTy())		RHS.second->getType()->isPointerTy())
Show All 12 Lines	bool operator()(std::pair<const Loop *, const SCEV *> LHS,
} else if (RHS.second->isNonConstantNegative())		} else if (RHS.second->isNonConstantNegative())
return true;		return true;

// Otherwise they are equivalent according to this comparison.		// Otherwise they are equivalent according to this comparison.
return false;		return false;
}		}
};		};

}		}

Value *SCEVExpander::visitAddExpr(const SCEVAddExpr *S) {		Value *SCEVExpander::visitAddExpr(const SCEVAddExpr *S) {
Type *Ty = SE.getEffectiveSCEVType(S->getType());		Type *Ty = SE.getEffectiveSCEVType(S->getType());

// Collect all the add operands in a loop, along with their associated loops.		// Collect all the add operands in a loop, along with their associated loops.
// Iterate in reverse so that constants are emitted last, all else equal, and		// Iterate in reverse so that constants are emitted last, all else equal, and
// so that pointer operands are inserted first, which the code below relies on		// so that pointer operands are inserted first, which the code below relies on
// to form more involved GEPs.		// to form more involved GEPs.
SmallVector<std::pair<const Loop *, const SCEV *>, 8> OpsAndLoops;		SmallVector<std::pair<const Loop *, const SCEV *>, 8> OpsAndLoops;
for (std::reverse_iterator<SCEVAddExpr::op_iterator> I(S->op_end()),		for (std::reverse_iterator<SCEVAddExpr::op_iterator> I(S->op_end()),
E(S->op_begin()); I != E; ++I)		E(S->op_begin()); I != E; ++I)
OpsAndLoops.push_back(std::make_pair(getRelevantLoop(*I), *I));		OpsAndLoops.push_back(std::make_pair(getRelevantLoop(*I), *I));

// Sort by loop. Use a stable sort so that constants follow non-constants and		// Sort by loop. Use a stable sort so that constants follow non-constants and
// pointer operands precede non-pointer operands.		// pointer operands precede non-pointer operands.
llvm::stable_sort(OpsAndLoops, LoopCompare(SE.DT));		llvm::stable_sort(OpsAndLoops, LoopCompare(SE.DT));

// Emit instructions to add all the operands. Hoist as much as possible		// Emit instructions to add all the operands. Hoist as much as possible
// out of loops, and form meaningful getelementptrs where possible.		// out of loops, and form meaningful getelementptrs where possible.
Show All 19 Lines	if (!Sum) {
NewOps.push_back(X);		NewOps.push_back(X);
}		}
Sum = expandAddToGEP(NewOps.begin(), NewOps.end(), PTy, Ty, Sum);		Sum = expandAddToGEP(NewOps.begin(), NewOps.end(), PTy, Ty, Sum);
} else if (PointerType *PTy = dyn_cast<PointerType>(Op->getType())) {		} else if (PointerType *PTy = dyn_cast<PointerType>(Op->getType())) {
// The running sum is an integer, and there's a pointer at this level.		// The running sum is an integer, and there's a pointer at this level.
// Try to form a getelementptr. If the running sum is instructions,		// Try to form a getelementptr. If the running sum is instructions,
// use a SCEVUnknown to avoid re-analyzing them.		// use a SCEVUnknown to avoid re-analyzing them.
SmallVector<const SCEV *, 4> NewOps;		SmallVector<const SCEV *, 4> NewOps;
NewOps.push_back(isa<Instruction>(Sum) ? SE.getUnknown(Sum) :		NewOps.push_back(isa<Instruction>(Sum) ? SE.getUnknown(Sum) :
SE.getSCEV(Sum));		SE.getSCEV(Sum));
for (++I; I != E && I->first == CurLoop; ++I)		for (++I; I != E && I->first == CurLoop; ++I)
NewOps.push_back(I->second);		NewOps.push_back(I->second);
Sum = expandAddToGEP(NewOps.begin(), NewOps.end(), PTy, Ty, expand(Op));		Sum = expandAddToGEP(NewOps.begin(), NewOps.end(), PTy, Ty, expand(Op));
} else if (Op->isNonConstantNegative()) {		} else if (Op->isNonConstantNegative()) {
// Instead of doing a negate and add, just do a subtract.		// Instead of doing a negate and add, just do a subtract.
Value *W = expandCodeForImpl(SE.getNegativeSCEV(Op), Ty, false);		Value *W = expandCodeForImpl(SE.getNegativeSCEV(Op), Ty, false);
Sum = InsertNoopCastOfTo(Sum, Ty);		Sum = InsertNoopCastOfTo(Sum, Ty);
Sum = InsertBinop(Instruction::Sub, Sum, W, SCEV::FlagAnyWrap,		Sum = InsertBinop(Instruction::Sub, Sum, W, SCEV::FlagAnyWrap,
/*IsSafeToHoist*/ true);		/*IsSafeToHoist*/ true);
++I;		++I;
} else {		} else {
// A simple add.		// A simple add.
Value *W = expandCodeForImpl(Op, Ty, false);		Value *W = expandCodeForImpl(Op, Ty, false);
Sum = InsertNoopCastOfTo(Sum, Ty);		Sum = InsertNoopCastOfTo(Sum, Ty);
// Canonicalize a constant to the RHS.		// Canonicalize a constant to the RHS.
if (isa<Constant>(Sum)) std::swap(Sum, W);		if (isa<Constant>(Sum)) std::swap(Sum, W);
Sum = InsertBinop(Instruction::Add, Sum, W, S->getNoWrapFlags(),		Sum = InsertBinop(Instruction::Add, Sum, W, S->getNoWrapFlags(),
/*IsSafeToHoist*/ true);		/*IsSafeToHoist*/ true);
++I;		++I;
}		}
}		}

return Sum;		return Sum;
}		}

Value *SCEVExpander::visitMulExpr(const SCEVMulExpr *S) {		Value *SCEVExpander::visitMulExpr(const SCEVMulExpr *S) {
Type *Ty = SE.getEffectiveSCEVType(S->getType());		Type *Ty = SE.getEffectiveSCEVType(S->getType());

// Collect all the mul operands in a loop, along with their associated loops.		// Collect all the mul operands in a loop, along with their associated loops.
// Iterate in reverse so that constants are emitted last, all else equal.		// Iterate in reverse so that constants are emitted last, all else equal.
SmallVector<std::pair<const Loop *, const SCEV *>, 8> OpsAndLoops;		SmallVector<std::pair<const Loop *, const SCEV *>, 8> OpsAndLoops;
for (std::reverse_iterator<SCEVMulExpr::op_iterator> I(S->op_end()),		for (std::reverse_iterator<SCEVMulExpr::op_iterator> I(S->op_end()),
E(S->op_begin()); I != E; ++I)		E(S->op_begin()); I != E; ++I)
OpsAndLoops.push_back(std::make_pair(getRelevantLoop(*I), *I));		OpsAndLoops.push_back(std::make_pair(getRelevantLoop(*I), *I));

// Sort by loop. Use a stable sort so that constants follow non-constants.		// Sort by loop. Use a stable sort so that constants follow non-constants.
llvm::stable_sort(OpsAndLoops, LoopCompare(SE.DT));		llvm::stable_sort(OpsAndLoops, LoopCompare(SE.DT));

// Emit instructions to mul all the operands. Hoist as much as possible		// Emit instructions to mul all the operands. Hoist as much as possible
// out of loops.		// out of loops.
Value *Prod = nullptr;		Value *Prod = nullptr;
Show All 23 Lines	const auto ExpandOpBinPowN = [this, &I, &OpsAndLoops, &Ty]() {
Value *P = expandCodeForImpl(I->second, Ty, false);		Value *P = expandCodeForImpl(I->second, Ty, false);
Value *Result = nullptr;		Value *Result = nullptr;
if (Exponent & 1)		if (Exponent & 1)
Result = P;		Result = P;
for (uint64_t BinExp = 2; BinExp <= Exponent; BinExp <<= 1) {		for (uint64_t BinExp = 2; BinExp <= Exponent; BinExp <<= 1) {
P = InsertBinop(Instruction::Mul, P, P, SCEV::FlagAnyWrap,		P = InsertBinop(Instruction::Mul, P, P, SCEV::FlagAnyWrap,
/*IsSafeToHoist*/ true);		/*IsSafeToHoist*/ true);
if (Exponent & BinExp)		if (Exponent & BinExp)
Result = Result ? InsertBinop(Instruction::Mul, Result, P,		Result = Result ? InsertBinop(Instruction::Mul, Result, P,
SCEV::FlagAnyWrap,		SCEV::FlagAnyWrap,
/*IsSafeToHoist*/ true)		/*IsSafeToHoist*/ true)
: P;		: P;
}		}

I = E;		I = E;
assert(Result && "Nothing was expanded?");		assert(Result && "Nothing was expanded?");
return Result;		return Result;
Show All 9 Lines	if (!Prod) {
Prod = InsertBinop(Instruction::Sub, Constant::getNullValue(Ty), Prod,		Prod = InsertBinop(Instruction::Sub, Constant::getNullValue(Ty), Prod,
SCEV::FlagAnyWrap, /*IsSafeToHoist*/ true);		SCEV::FlagAnyWrap, /*IsSafeToHoist*/ true);
++I;		++I;
} else {		} else {
// A simple mul.		// A simple mul.
Value *W = ExpandOpBinPowN();		Value *W = ExpandOpBinPowN();
Prod = InsertNoopCastOfTo(Prod, Ty);		Prod = InsertNoopCastOfTo(Prod, Ty);
// Canonicalize a constant to the RHS.		// Canonicalize a constant to the RHS.
if (isa<Constant>(Prod)) std::swap(Prod, W);		if (isa<Constant>(Prod)) std::swap(Prod, W);
const APInt *RHS;		const APInt *RHS;
if (match(W, m_Power2(RHS))) {		if (match(W, m_Power2(RHS))) {
// Canonicalize Prod*(1<<C) to Prod<<C.		// Canonicalize Prod*(1<<C) to Prod<<C.
assert(!Ty->isVectorTy() && "vector types are not SCEVable");		assert(!Ty->isVectorTy() && "vector types are not SCEVable");
auto NWFlags = S->getNoWrapFlags();		auto NWFlags = S->getNoWrapFlags();
// clear nsw flag if shl will produce poison value.		// clear nsw flag if shl will produce poison value.
if (RHS->logBase2() == RHS->getBitWidth() - 1)		if (RHS->logBase2() == RHS->getBitWidth() - 1)
NWFlags = ScalarEvolution::clearFlags(NWFlags, SCEV::FlagNSW);		NWFlags = ScalarEvolution::clearFlags(NWFlags, SCEV::FlagNSW);
Show All 29 Lines
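
The nsw clearing in the `Prod*(1<<C)` to `Prod<<C` canonicalization above is easiest to see at a small bit width. The following is a standalone 8-bit model, not LLVM code; the helper names are hypothetical. It treats an nsw operation as poison-free exactly when the mathematically exact result fits in the signed type. In i8 the constant `1<<7` is -128 when read as signed, so `mul nsw` multiplies by -128 while `shl nsw` by 7 multiplies by +128, and the two no-overflow conditions differ.

```cpp
#include <cassert>
#include <cstdint>

// `mul nsw X, C` on i8: poison-free when the exact product fits in i8.
bool mulNSWIsDefined(int8_t X, int8_t C) {
  int Wide = int(X) * int(C);
  return Wide >= INT8_MIN && Wide <= INT8_MAX;
}

// `shl nsw X, Amt` on i8: poison-free when the exact value X * 2^Amt fits.
bool shlNSWIsDefined(int8_t X, unsigned Amt) {
  int Wide = int(X) * (1 << Amt);
  return Wide >= INT8_MIN && Wide <= INT8_MAX;
}

// Checks every i8 value of X: does keeping nsw on the shift describe the
// same set of poison-free inputs as nsw on the original multiply?
bool flagsAgreeForAllX(unsigned Amt) {
  // The i8 bit pattern of 1 << Amt; for Amt == 7 this is -128.
  int8_t C = static_cast<int8_t>(static_cast<uint8_t>(1u << Amt));
  for (int X = INT8_MIN; X <= INT8_MAX; ++X)
    if (mulNSWIsDefined(static_cast<int8_t>(X), C) !=
        shlNSWIsDefined(static_cast<int8_t>(X), Amt))
      return false;
  return true;
}
```

In this model the two conditions agree for shift amounts 0 through 6 and disagree only for 7, the `logBase2() == getBitWidth() - 1` case: `X = 1` is fine for the multiply but would be poison for `shl nsw`.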

/// Move parts of Base into Rest to leave Base with the minimal		/// Move parts of Base into Rest to leave Base with the minimal
/// expression that provides a pointer operand suitable for a		/// expression that provides a pointer operand suitable for a
/// GEP expansion.		/// GEP expansion.
static void ExposePointerBase(const SCEV *&Base, const SCEV *&Rest,		static void ExposePointerBase(const SCEV *&Base, const SCEV *&Rest,
ScalarEvolution &SE) {		ScalarEvolution &SE) {
while (const SCEVAddRecExpr *A = dyn_cast<SCEVAddRecExpr>(Base)) {		while (const SCEVAddRecExpr *A = dyn_cast<SCEVAddRecExpr>(Base)) {
Base = A->getStart();		Base = A->getStart();
Rest = SE.getAddExpr(Rest,		Rest = SE.getAddExpr(Rest,
SE.getAddRecExpr(SE.getConstant(A->getType(), 0),		SE.getAddRecExpr(SE.getConstant(A->getType(), 0),
A->getStepRecurrence(SE),		A->getStepRecurrence(SE),
A->getLoop(),		A->getLoop(),
A->getNoWrapFlags(SCEV::FlagNW)));		A->getNoWrapFlags(SCEV::FlagNW)));
}		}
if (const SCEVAddExpr *A = dyn_cast<SCEVAddExpr>(Base)) {		if (const SCEVAddExpr *A = dyn_cast<SCEVAddExpr>(Base)) {
Base = A->getOperand(A->getNumOperands()-1);		Base = A->getOperand(A->getNumOperands()-1);
SmallVector<const SCEV *, 8> NewAddOps(A->operands());		SmallVector<const SCEV *, 8> NewAddOps(A->operands());
NewAddOps.back() = Rest;		NewAddOps.back() = Rest;
Rest = SE.getAddExpr(NewAddOps);		Rest = SE.getAddExpr(NewAddOps);
ExposePointerBase(Base, Rest, SE);		ExposePointerBase(Base, Rest, SE);
}		}
}		}

/// Determine if this is a well-behaved chain of instructions leading back to		/// Determine if this is a well-behaved chain of instructions leading back to
/// the PHI. If so, it may be reused by expanded expressions.		/// the PHI. If so, it may be reused by expanded expressions.
bool SCEVExpander::isNormalAddRecExprPHI(PHINode *PN, Instruction *IncV,		bool SCEVExpander::isNormalAddRecExprPHI(PHINode *PN, Instruction *IncV,
const Loop *L) {		const Loop *L) {
if (IncV->getNumOperands() == 0 || isa<PHINode>(IncV) ||		if (IncV->getNumOperands() == 0 || isa<PHINode>(IncV) ||
(isa<CastInst>(IncV) && !isa<BitCastInst>(IncV)))		(isa<CastInst>(IncV) && !isa<BitCastInst>(IncV)))
return false;		return false;
// If any of the operands don't dominate the insert position, bail.		// If any of the operands don't dominate the insert position, bail.
// Addrec operands are always loop-invariant, so this can only happen		// Addrec operands are always loop-invariant, so this can only happen
// if there are instructions which haven't been hoisted.		// if there are instructions which haven't been hoisted.
if (L == IVIncInsertLoop) {		if (L == IVIncInsertLoop) {
for (User::op_iterator OI = IncV->op_begin()+1,		for (User::op_iterator OI = IncV->op_begin()+1,
OE = IncV->op_end(); OI != OE; ++OI)		OE = IncV->op_end(); OI != OE; ++OI)
if (Instruction *OInst = dyn_cast<Instruction>(OI))		if (Instruction *OInst = dyn_cast<Instruction>(OI))
if (!SE.DT.dominates(OInst, IVIncInsertPos))		if (!SE.DT.dominates(OInst, IVIncInsertPos))
return false;		return false;
}		}
// Advance to the next instruction.		// Advance to the next instruction.
IncV = dyn_cast<Instruction>(IncV->getOperand(0));		IncV = dyn_cast<Instruction>(IncV->getOperand(0));
if (!IncV)		if (!IncV)
▲ Show 20 Lines • Show All 50 Lines • ▼ Show 20 Lines	for (auto I = IncV->op_begin() + 1, E = IncV->op_end(); I != E; ++I) {
}		}
// This must be a pointer addition of constants (pretty), which is already		// This must be a pointer addition of constants (pretty), which is already
// handled, or some number of address-size elements (ugly). Ugly geps		// handled, or some number of address-size elements (ugly). Ugly geps
// have 2 operands. i1* is used by the expander to represent an		// have 2 operands. i1* is used by the expander to represent an
// address-size element.		// address-size element.
if (IncV->getNumOperands() != 2)		if (IncV->getNumOperands() != 2)
return nullptr;		return nullptr;
unsigned AS = cast<PointerType>(IncV->getType())->getAddressSpace();		unsigned AS = cast<PointerType>(IncV->getType())->getAddressSpace();
if (IncV->getType() != Type::getInt1PtrTy(SE.getContext(), AS)		if (IncV->getType() != Type::getInt1PtrTy(SE.getContext(), AS)
&& IncV->getType() != Type::getInt8PtrTy(SE.getContext(), AS))		&& IncV->getType() != Type::getInt8PtrTy(SE.getContext(), AS))
return nullptr;		return nullptr;
break;		break;
}		}
return dyn_cast<Instruction>(IncV->getOperand(0));		return dyn_cast<Instruction>(IncV->getOperand(0));
}		}
}		}

Show All 14 Lines	if (InsertPtGuard->GetInsertPoint() == It)
InsertPtGuard->SetInsertPoint(NewInsertPt);		InsertPtGuard->SetInsertPoint(NewInsertPt);
}		}

/// hoistStep - Attempt to hoist a simple IV increment above InsertPos to make		/// hoistStep - Attempt to hoist a simple IV increment above InsertPos to make
/// it available to other uses in this loop. Recursively hoist any operands,		/// it available to other uses in this loop. Recursively hoist any operands,
/// until we reach a value that dominates InsertPos.		/// until we reach a value that dominates InsertPos.
bool SCEVExpander::hoistIVInc(Instruction *IncV, Instruction *InsertPos) {		bool SCEVExpander::hoistIVInc(Instruction *IncV, Instruction *InsertPos) {
if (SE.DT.dominates(IncV, InsertPos))		if (SE.DT.dominates(IncV, InsertPos))
return true;		return true;

// InsertPos must itself dominate IncV so that IncV's new position satisfies		// InsertPos must itself dominate IncV so that IncV's new position satisfies
// its existing users.		// its existing users.
if (isa<PHINode>(InsertPos) ||		if (isa<PHINode>(InsertPos) ||
!SE.DT.dominates(InsertPos->getParent(), IncV->getParent()))		!SE.DT.dominates(InsertPos->getParent(), IncV->getParent()))
return false;		return false;

if (!SE.LI.movementPreservesLCSSAForm(IncV, InsertPos))		if (!SE.LI.movementPreservesLCSSAForm(IncV, InsertPos))
return false;		return false;

// Check that the chain of IV operands leading back to Phi can be hoisted.		// Check that the chain of IV operands leading back to Phi can be hoisted.
SmallVector<Instruction*, 4> IVIncs;		SmallVector<Instruction*, 4> IVIncs;
for(;;) {		for(;;) {
Instruction *Oper = getIVIncOperand(IncV, InsertPos, /*allowScale*/true);		Instruction *Oper = getIVIncOperand(IncV, InsertPos, /*allowScale*/true);
if (!Oper)		if (!Oper)
return false;		return false;
// IncV is safe to hoist.		// IncV is safe to hoist.
IVIncs.push_back(IncV);		IVIncs.push_back(IncV);
IncV = Oper;		IncV = Oper;
if (SE.DT.dominates(IncV, InsertPos))		if (SE.DT.dominates(IncV, InsertPos))
break;		break;
}		}
for (auto I = IVIncs.rbegin(), E = IVIncs.rend(); I != E; ++I) {		for (auto I = IVIncs.rbegin(), E = IVIncs.rend(); I != E; ++I) {
fixupInsertPoints(*I);		fixupInsertPoints(*I);
(*I)->moveBefore(InsertPos);		(*I)->moveBefore(InsertPos);
}		}
return true;		return true;
}		}

/// Determine if this cyclic phi is in a form that would have been generated by		/// Determine if this cyclic phi is in a form that would have been generated by
/// LSR. We don't care if the phi was actually expanded in this pass, as long		/// LSR. We don't care if the phi was actually expanded in this pass, as long
/// as it is in a low-cost form, for example, no implied multiplication. This		/// as it is in a low-cost form, for example, no implied multiplication. This
/// should match any patterns generated by getAddRecExprPHILiterally and		/// should match any patterns generated by getAddRecExprPHILiterally and
/// expandAddtoGEP.		/// expandAddtoGEP.
bool SCEVExpander::isExpandedAddRecExprPHI(PHINode *PN, Instruction *IncV,		bool SCEVExpander::isExpandedAddRecExprPHI(PHINode *PN, Instruction *IncV,
const Loop *L) {		const Loop *L) {
for(Instruction *IVOper = IncV;		for(Instruction *IVOper = IncV;
(IVOper = getIVIncOperand(IVOper, L->getLoopPreheader()->getTerminator(),		(IVOper = getIVIncOperand(IVOper, L->getLoopPreheader()->getTerminator(),
/*allowScale=*/false));) {		/*allowScale=*/false));) {
if (IVOper == PN)		if (IVOper == PN)
return true;		return true;
}		}
return false;		return false;
}		}

Show All 11 Lines	if (ExpandTy->isPointerTy()) {
// that would require a multiply inside the loop.		// that would require a multiply inside the loop.
if (!isa<ConstantInt>(StepV))		if (!isa<ConstantInt>(StepV))
GEPPtrTy = PointerType::get(Type::getInt1Ty(SE.getContext()),		GEPPtrTy = PointerType::get(Type::getInt1Ty(SE.getContext()),
GEPPtrTy->getAddressSpace());		GEPPtrTy->getAddressSpace());
IncV = expandAddToGEP(SE.getSCEV(StepV), GEPPtrTy, IntTy, PN);		IncV = expandAddToGEP(SE.getSCEV(StepV), GEPPtrTy, IntTy, PN);
if (IncV->getType() != PN->getType())		if (IncV->getType() != PN->getType())
IncV = Builder.CreateBitCast(IncV, PN->getType());		IncV = Builder.CreateBitCast(IncV, PN->getType());
} else {		} else {
IncV = useSubtract ?		IncV = useSubtract ?
Builder.CreateSub(PN, StepV, Twine(IVName) + ".iv.next") :		Builder.CreateSub(PN, StepV, Twine(IVName) + ".iv.next") :
Builder.CreateAdd(PN, StepV, Twine(IVName) + ".iv.next");		Builder.CreateAdd(PN, StepV, Twine(IVName) + ".iv.next");
}		}
return IncV;		return IncV;
}		}

/// Hoist the addrec instruction chain rooted in the loop phi above the		/// Hoist the addrec instruction chain rooted in the loop phi above the
/// position. This routine assumes that this is possible (has been checked).		/// position. This routine assumes that this is possible (has been checked).
Show All 30 Lines	static bool canBeCheaplyTransformed(ScalarEvolution &SE,

// Check whether truncation will help.		// Check whether truncation will help.
if (Phi == Requested) {		if (Phi == Requested) {
InvertStep = false;		InvertStep = false;
return true;		return true;
}		}

// Check whether inverting will help: {R,+,-1} == R - {0,+,1}.		// Check whether inverting will help: {R,+,-1} == R - {0,+,1}.
if (SE.getAddExpr(Requested->getStart(),		if (SE.getAddExpr(Requested->getStart(),
SE.getNegativeSCEV(Requested)) == Phi) {		SE.getNegativeSCEV(Requested)) == Phi) {
InvertStep = true;		InvertStep = true;
return true;		return true;
}		}

return false;		return false;
}		}

static bool IsIncrementNSW(ScalarEvolution &SE, const SCEVAddRecExpr *AR) {		static bool IsIncrementNSW(ScalarEvolution &SE, const SCEVAddRecExpr *AR) {
if (!isa<IntegerType>(AR->getType()))		if (!isa<IntegerType>(AR->getType()))
return false;		return false;

unsigned BitWidth = cast<IntegerType>(AR->getType())->getBitWidth();		unsigned BitWidth = cast<IntegerType>(AR->getType())->getBitWidth();
Type *WideTy = IntegerType::get(AR->getType()->getContext(), BitWidth * 2);		Type *WideTy = IntegerType::get(AR->getType()->getContext(), BitWidth * 2);
const SCEV *Step = AR->getStepRecurrence(SE);		const SCEV *Step = AR->getStepRecurrence(SE);
const SCEV *OpAfterExtend = SE.getAddExpr(SE.getSignExtendExpr(Step, WideTy),		const SCEV *OpAfterExtend = SE.getAddExpr(SE.getSignExtendExpr(Step, WideTy),
SE.getSignExtendExpr(AR, WideTy));		SE.getSignExtendExpr(AR, WideTy));
const SCEV *ExtendAfterOp =		const SCEV *ExtendAfterOp =
SE.getSignExtendExpr(SE.getAddExpr(AR, Step), WideTy);		SE.getSignExtendExpr(SE.getAddExpr(AR, Step), WideTy);
return ExtendAfterOp == OpAfterExtend;		return ExtendAfterOp == OpAfterExtend;
}		}

static bool IsIncrementNUW(ScalarEvolution &SE, const SCEVAddRecExpr *AR) {		static bool IsIncrementNUW(ScalarEvolution &SE, const SCEVAddRecExpr *AR) {
if (!isa<IntegerType>(AR->getType()))		if (!isa<IntegerType>(AR->getType()))
return false;		return false;

unsigned BitWidth = cast<IntegerType>(AR->getType())->getBitWidth();		unsigned BitWidth = cast<IntegerType>(AR->getType())->getBitWidth();
Type *WideTy = IntegerType::get(AR->getType()->getContext(), BitWidth * 2);		Type *WideTy = IntegerType::get(AR->getType()->getContext(), BitWidth * 2);
const SCEV *Step = AR->getStepRecurrence(SE);		const SCEV *Step = AR->getStepRecurrence(SE);
const SCEV *OpAfterExtend = SE.getAddExpr(SE.getZeroExtendExpr(Step, WideTy),		const SCEV *OpAfterExtend = SE.getAddExpr(SE.getZeroExtendExpr(Step, WideTy),
SE.getZeroExtendExpr(AR, WideTy));		SE.getZeroExtendExpr(AR, WideTy));
const SCEV *ExtendAfterOp =		const SCEV *ExtendAfterOp =
SE.getZeroExtendExpr(SE.getAddExpr(AR, Step), WideTy);		SE.getZeroExtendExpr(SE.getAddExpr(AR, Step), WideTy);
return ExtendAfterOp == OpAfterExtend;		return ExtendAfterOp == OpAfterExtend;
}		}
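
IsIncrementNSW and IsIncrementNUW both rest on one identity: an addition does not wrap exactly when extending the operands to twice the width and then adding gives the same result as adding at the narrow width and then extending. The sketch below checks that identity on concrete 8-bit values widened to 16 bits. It is only an illustration with hypothetical helper names; the real functions compare SCEV expressions symbolically, which establishes the property for every iteration rather than for a single pair of values.

```cpp
#include <cassert>
#include <cstdint>

// Signed case: "add in i8, then sign-extend" versus "sign-extend, then add
// in i16". The two agree exactly when the i8 addition has no signed wrap.
bool addHasNoSignedWrap(int8_t A, int8_t Step) {
  // Wrapping i8 add done on unsigned values to avoid signed overflow in C++.
  // The final uint8_t -> int8_t conversion is modular (guaranteed since
  // C++20, and the behavior of mainstream compilers before that).
  uint8_t Narrow = static_cast<uint8_t>(static_cast<uint8_t>(A) +
                                        static_cast<uint8_t>(Step));
  int16_t ExtendAfterOp = static_cast<int8_t>(Narrow);
  int16_t OpAfterExtend =
      static_cast<int16_t>(static_cast<int16_t>(A) + static_cast<int16_t>(Step));
  return ExtendAfterOp == OpAfterExtend;
}

// Unsigned case: the same comparison using zero extension.
bool addHasNoUnsignedWrap(uint8_t A, uint8_t Step) {
  uint16_t ExtendAfterOp = static_cast<uint8_t>(A + Step);
  uint16_t OpAfterExtend =
      static_cast<uint16_t>(static_cast<uint16_t>(A) + static_cast<uint16_t>(Step));
  return ExtendAfterOp == OpAfterExtend;
}
```

For instance, 100 + 27 stays within i8 and the two sides match, while 100 + 28 wraps to -128 at the narrow width but is 128 at the wide width, so the comparison fails.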

/// getAddRecExprPHILiterally - Helper for expandAddRecExprLiterally. Expand		/// getAddRecExprPHILiterally - Helper for expandAddRecExprLiterally. Expand
/// the base addrec, which is the addrec without any non-loop-dominating		/// the base addrec, which is the addrec without any non-loop-dominating
/// values, and return the PHI.		/// values, and return the PHI.
PHINode *		PHINode *
SCEVExpander::getAddRecExprPHILiterally(const SCEVAddRecExpr *Normalized,		SCEVExpander::getAddRecExprPHILiterally(const SCEVAddRecExpr *Normalized,
const Loop *L,		const Loop *L,
Type *ExpandTy,		Type *ExpandTy,
Type *IntTy,		Type *IntTy,
Type *&TruncTy,		Type *&TruncTy,
bool &InvertStep) {		bool &InvertStep) {
assert((!IVIncInsertLoop||IVIncInsertPos) && "Uninitialized insert position");		assert((!IVIncInsertLoop||IVIncInsertPos) && "Uninitialized insert position");

Show All 27 Lines	for (PHINode &PN : L->getHeader()->phis()) {
if (!PhiSCEV)		if (!PhiSCEV)
continue;		continue;

bool IsMatchingSCEV = PhiSCEV == Normalized;		bool IsMatchingSCEV = PhiSCEV == Normalized;
// We only handle truncation and inversion of phi recurrences for the		// We only handle truncation and inversion of phi recurrences for the
// expanded expression if the expanded expression's loop dominates the		// expanded expression if the expanded expression's loop dominates the
// loop we insert to. Check now, so we can bail out early.		// loop we insert to. Check now, so we can bail out early.
if (!IsMatchingSCEV && !TryNonMatchingSCEV)		if (!IsMatchingSCEV && !TryNonMatchingSCEV)
continue;		continue;

// TODO: this possibly can be reworked to avoid this cast at all.		// TODO: this possibly can be reworked to avoid this cast at all.
Instruction *TempIncV =		Instruction *TempIncV =
dyn_cast<Instruction>(PN.getIncomingValueForBlock(LatchBlock));		dyn_cast<Instruction>(PN.getIncomingValueForBlock(LatchBlock));
if (!TempIncV)		if (!TempIncV)
continue;		continue;

// Check whether we can reuse this PHI node.		// Check whether we can reuse this PHI node.
▲ Show 20 Lines • Show All 106 Lines • ▼ Show 20 Lines	for (pred_iterator HPI = HPB; HPI != HPE; ++HPI) {
if (!L->contains(Pred)) {		if (!L->contains(Pred)) {
PN->addIncoming(StartV, Pred);		PN->addIncoming(StartV, Pred);
continue;		continue;
}		}

// Create a step value and add it to the PHI.		// Create a step value and add it to the PHI.
// If IVIncInsertLoop is non-null and equal to the addrec's loop, insert the		// If IVIncInsertLoop is non-null and equal to the addrec's loop, insert the
// instructions at IVIncInsertPos.		// instructions at IVIncInsertPos.
Instruction *InsertPos = L == IVIncInsertLoop ?		Instruction *InsertPos = L == IVIncInsertLoop ?
IVIncInsertPos : Pred->getTerminator();		IVIncInsertPos : Pred->getTerminator();
Builder.SetInsertPoint(InsertPos);		Builder.SetInsertPoint(InsertPos);
Value *IncV = expandIVInc(PN, StepV, L, ExpandTy, IntTy, useSubtract);		Value *IncV = expandIVInc(PN, StepV, L, ExpandTy, IntTy, useSubtract);

if (isa<OverflowingBinaryOperator>(IncV)) {		if (isa<OverflowingBinaryOperator>(IncV)) {
if (IncrementIsNUW)		if (IncrementIsNUW)
cast<BinaryOperator>(IncV)->setHasNoUnsignedWrap();		cast<BinaryOperator>(IncV)->setHasNoUnsignedWrap();
if (IncrementIsNSW)		if (IncrementIsNSW)
Show All 27 Lines	Value SCEVExpander::expandAddRecExprLiterally(const SCEVAddRecExpr S) {
}		}

// Strip off any non-loop-dominating component from the addrec start.		// Strip off any non-loop-dominating component from the addrec start.
const SCEV *Start = Normalized->getStart();		const SCEV *Start = Normalized->getStart();
const SCEV *PostLoopOffset = nullptr;		const SCEV *PostLoopOffset = nullptr;
if (!SE.properlyDominates(Start, L->getHeader())) {		if (!SE.properlyDominates(Start, L->getHeader())) {
PostLoopOffset = Start;		PostLoopOffset = Start;
Start = SE.getConstant(Normalized->getType(), 0);		Start = SE.getConstant(Normalized->getType(), 0);
Normalized = cast<SCEVAddRecExpr>(		Normalized = cast<SCEVAddRecExpr>(
SE.getAddRecExpr(Start, Normalized->getStepRecurrence(SE),		SE.getAddRecExpr(Start, Normalized->getStepRecurrence(SE),
Normalized->getLoop(),		Normalized->getLoop(),
Normalized->getNoWrapFlags(SCEV::FlagNW)));		Normalized->getNoWrapFlags(SCEV::FlagNW)));
}		}

// Strip off any non-loop-dominating component from the addrec step.		// Strip off any non-loop-dominating component from the addrec step.
const SCEV *Step = Normalized->getStepRecurrence(SE);		const SCEV *Step = Normalized->getStepRecurrence(SE);
const SCEV *PostLoopScale = nullptr;		const SCEV *PostLoopScale = nullptr;
if (!SE.dominates(Step, L->getHeader())) {		if (!SE.dominates(Step, L->getHeader())) {
PostLoopScale = Step;		PostLoopScale = Step;
Step = SE.getConstant(Normalized->getType(), 1);		Step = SE.getConstant(Normalized->getType(), 1);
if (!Start->isZero()) {		if (!Start->isZero()) {
// The normalization below assumes that Start is constant zero, so if		// The normalization below assumes that Start is constant zero, so if
// it isn't re-associate Start to PostLoopOffset.		// it isn't re-associate Start to PostLoopOffset.
assert(!PostLoopOffset && "Start not-null but PostLoopOffset set?");		assert(!PostLoopOffset && "Start not-null but PostLoopOffset set?");
PostLoopOffset = Start;		PostLoopOffset = Start;
Start = SE.getConstant(Normalized->getType(), 0);		Start = SE.getConstant(Normalized->getType(), 0);
}		}
Normalized =		Normalized =
cast<SCEVAddRecExpr>(SE.getAddRecExpr(		cast<SCEVAddRecExpr>(SE.getAddRecExpr(
Start, Step, Normalized->getLoop(),		Start, Step, Normalized->getLoop(),
Normalized->getNoWrapFlags(SCEV::FlagNW)));		Normalized->getNoWrapFlags(SCEV::FlagNW)));
}		}

// Expand the core addrec. If we need post-loop scaling, force it to		// Expand the core addrec. If we need post-loop scaling, force it to
// expand to an integer type to avoid the need for additional casting.		// expand to an integer type to avoid the need for additional casting.
Type *ExpandTy = PostLoopScale ? IntTy : STy;		Type *ExpandTy = PostLoopScale ? IntTy : STy;
Show All 14 Lines	Value SCEVExpander::expandAddRecExprLiterally(const SCEVAddRecExpr S) {
if (!PostIncLoops.count(L))		if (!PostIncLoops.count(L))
Result = PN;		Result = PN;
else {		else {
// In PostInc mode, use the post-incremented value.		// In PostInc mode, use the post-incremented value.
BasicBlock *LatchBlock = L->getLoopLatch();		BasicBlock *LatchBlock = L->getLoopLatch();
assert(LatchBlock && "PostInc mode requires a unique loop latch!");		assert(LatchBlock && "PostInc mode requires a unique loop latch!");
Result = PN->getIncomingValueForBlock(LatchBlock);		Result = PN->getIncomingValueForBlock(LatchBlock);

		// We might be introducing a new use of the post-inc IV that is not poison
		// safe, in which case we should drop poison generating flags. Only keep
		// those flags for which SCEV has proven that they always hold.
		if (auto *BO = dyn_cast<BinaryOperator>(Result)) {
		fhahn: `OverflowingBinaryOperator`?
		if (!S->hasNoUnsignedWrap())
		BO->setHasNoUnsignedWrap(false);
		if (!S->hasNoSignedWrap())
		BO->setHasNoSignedWrap(false);
		}
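
The added block only ever clears flags: a flag on the increment survives only if SCEV has also proven it for the addrec. A minimal sketch of that rule follows, using hypothetical stand-in structs (`MockAddRec`, `MockIncrement`) rather than the real `SCEVAddRecExpr` and `BinaryOperator` classes.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration only; these are not LLVM types.
struct MockAddRec {    // flags that SCEV has proven for the expression
  bool NUW;
  bool NSW;
};
struct MockIncrement { // flags currently present on the IR increment
  bool NUW;
  bool NSW;
};

// Keep a nowrap flag on the increment only if the addrec also carries it.
// Flags are never added here, only dropped.
inline void dropUnprovenFlags(MockIncrement &Inc, const MockAddRec &S) {
  if (!S.NUW)
    Inc.NUW = false;
  if (!S.NSW)
    Inc.NSW = false;
}
```

With an increment marked both `nuw` and `nsw` and an addrec for which only `nsw` is proven, the increment ends up with `nsw` alone.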

// For an expansion to use the postinc form, the client must call		// For an expansion to use the postinc form, the client must call
// expandCodeFor with an InsertPoint that is either outside the PostIncLoop		// expandCodeFor with an InsertPoint that is either outside the PostIncLoop
// or dominated by IVIncInsertPos.		// or dominated by IVIncInsertPos.
if (isa<Instruction>(Result) &&		if (isa<Instruction>(Result) &&
!SE.DT.dominates(cast<Instruction>(Result),		!SE.DT.dominates(cast<Instruction>(Result),
&*Builder.GetInsertPoint())) {		&*Builder.GetInsertPoint())) {
// The induction variable's postinc expansion does not dominate this use.		// The induction variable's postinc expansion does not dominate this use.
// IVUsers tries to prevent this case, so it is rare. However, it can		// IVUsers tries to prevent this case, so it is rare. However, it can
// happen when an IVUser outside the loop is not dominated by the latch		// happen when an IVUser outside the loop is not dominated by the latch
// block. Adjusting IVIncInsertPos before expansion begins cannot handle		// block. Adjusting IVIncInsertPos before expansion begins cannot handle
// all cases. Consider a phi outside whose operand is replaced during		// all cases. Consider a phi outside whose operand is replaced during
// expansion with the value of the postinc user. Without fundamentally		// expansion with the value of the postinc user. Without fundamentally
// changing the way postinc users are tracked, the only remedy is		// changing the way postinc users are tracked, the only remedy is
// inserting an extra IV increment. StepV might fold into PostLoopOffset,		// inserting an extra IV increment. StepV might fold into PostLoopOffset,
// but hopefully expandCodeFor handles that.		// but hopefully expandCodeFor handles that.
bool useSubtract =		bool useSubtract =
!ExpandTy->isPointerTy() && Step->isNonConstantNegative();		!ExpandTy->isPointerTy() && Step->isNonConstantNegative();
if (useSubtract)		if (useSubtract)
Step = SE.getNegativeSCEV(Step);		Step = SE.getNegativeSCEV(Step);
Value *StepV;		Value *StepV;
{		{
// Expand the step somewhere that dominates the loop header.		// Expand the step somewhere that dominates the loop header.
SCEVInsertPointGuard Guard(Builder, this);		SCEVInsertPointGuard Guard(Builder, this);
StepV = expandCodeForImpl(		StepV = expandCodeForImpl(
Step, IntTy, &*L->getHeader()->getFirstInsertionPt(), false);		Step, IntTy, &*L->getHeader()->getFirstInsertionPt(), false);
▲ Show 20 Lines • Show All 67 Lines • ▼ Show 20 Lines	Value SCEVExpander::visitAddRecExpr(const SCEVAddRecExpr S) {
PHINode *CanonicalIV = nullptr;		PHINode *CanonicalIV = nullptr;
if (PHINode *PN = L->getCanonicalInductionVariable())		if (PHINode *PN = L->getCanonicalInductionVariable())
if (SE.getTypeSizeInBits(PN->getType()) >= SE.getTypeSizeInBits(Ty))		if (SE.getTypeSizeInBits(PN->getType()) >= SE.getTypeSizeInBits(Ty))
CanonicalIV = PN;		CanonicalIV = PN;

// Rewrite an AddRec in terms of the canonical induction variable, if		// Rewrite an AddRec in terms of the canonical induction variable, if
// its type is more narrow.		// its type is more narrow.
if (CanonicalIV &&		if (CanonicalIV &&
SE.getTypeSizeInBits(CanonicalIV->getType()) >		SE.getTypeSizeInBits(CanonicalIV->getType()) >
SE.getTypeSizeInBits(Ty)) {		SE.getTypeSizeInBits(Ty)) {
SmallVector<const SCEV *, 4> NewOps(S->getNumOperands());		SmallVector<const SCEV *, 4> NewOps(S->getNumOperands());
for (unsigned i = 0, e = S->getNumOperands(); i != e; ++i)		for (unsigned i = 0, e = S->getNumOperands(); i != e; ++i)
NewOps[i] = SE.getAnyExtendExpr(S->op_begin()[i], CanonicalIV->getType());		NewOps[i] = SE.getAnyExtendExpr(S->op_begin()[i], CanonicalIV->getType());
Value *V = expand(SE.getAddRecExpr(NewOps, S->getLoop(),		Value *V = expand(SE.getAddRecExpr(NewOps, S->getLoop(),
S->getNoWrapFlags(SCEV::FlagNW)));		S->getNoWrapFlags(SCEV::FlagNW)));
BasicBlock::iterator NewInsertPt =		BasicBlock::iterator NewInsertPt =
findInsertPointAfter(cast<Instruction>(V), &*Builder.GetInsertPoint());		findInsertPointAfter(cast<Instruction>(V), &*Builder.GetInsertPoint());
V = expandCodeForImpl(SE.getTruncateExpr(SE.getUnknown(V), Ty), nullptr,		V = expandCodeForImpl(SE.getTruncateExpr(SE.getUnknown(V), Ty), nullptr,
&*NewInsertPt, false);		&*NewInsertPt, false);
return V;		return V;
}		}

// {X,+,F} --> X + {0,+,F}		// {X,+,F} --> X + {0,+,F}
if (!S->getStart()->isZero()) {		if (!S->getStart()->isZero()) {
SmallVector<const SCEV *, 4> NewOps(S->operands());		SmallVector<const SCEV *, 4> NewOps(S->operands());
NewOps[0] = SE.getConstant(Ty, 0);		NewOps[0] = SE.getConstant(Ty, 0);
const SCEV *Rest = SE.getAddRecExpr(NewOps, L,		const SCEV *Rest = SE.getAddRecExpr(NewOps, L,
S->getNoWrapFlags(SCEV::FlagNW));		S->getNoWrapFlags(SCEV::FlagNW));

// Turn things like ptrtoint+arithmetic+inttoptr into GEP. See the		// Turn things like ptrtoint+arithmetic+inttoptr into GEP. See the
// comments on expandAddToGEP for details.		// comments on expandAddToGEP for details.
const SCEV *Base = S->getStart();		const SCEV *Base = S->getStart();
// Dig into the expression to find the pointer base for a GEP.		// Dig into the expression to find the pointer base for a GEP.
const SCEV *ExposedRest = Rest;		const SCEV *ExposedRest = Rest;
ExposePointerBase(Base, ExposedRest, SE);		ExposePointerBase(Base, ExposedRest, SE);
Show All 37 Lines	for (pred_iterator HPI = HPB; HPI != HPE; ++HPI) {
// duplicates!		// duplicates!
CanonicalIV->addIncoming(CanonicalIV->getIncomingValueForBlock(HP), HP);		CanonicalIV->addIncoming(CanonicalIV->getIncomingValueForBlock(HP), HP);
continue;		continue;
}		}

if (L->contains(HP)) {		if (L->contains(HP)) {
// Insert a unit add instruction right before the terminator		// Insert a unit add instruction right before the terminator
// corresponding to the back-edge.		// corresponding to the back-edge.
Instruction *Add = BinaryOperator::CreateAdd(CanonicalIV, One,		Instruction *Add = BinaryOperator::CreateAdd(CanonicalIV, One,
"indvar.next",		"indvar.next",
HP->getTerminator());		HP->getTerminator());
Add->setDebugLoc(HP->getTerminator()->getDebugLoc());		Add->setDebugLoc(HP->getTerminator()->getDebugLoc());
rememberInstruction(Add);		rememberInstruction(Add);
CanonicalIV->addIncoming(Add, HP);		CanonicalIV->addIncoming(Add, HP);
} else {		} else {
CanonicalIV->addIncoming(Constant::getNullValue(Ty), HP);		CanonicalIV->addIncoming(Constant::getNullValue(Ty), HP);
}		}
}		}
}		}

// {0,+,1} --> Insert a canonical induction variable into the loop!		// {0,+,1} --> Insert a canonical induction variable into the loop!
if (S->isAffine() && S->getOperand(1)->isOne()) {		if (S->isAffine() && S->getOperand(1)->isOne()) {
assert(Ty == SE.getEffectiveSCEVType(CanonicalIV->getType()) &&		assert(Ty == SE.getEffectiveSCEVType(CanonicalIV->getType()) &&
"IVs with types different from the canonical IV should "		"IVs with types different from the canonical IV should "
"already have been handled!");		"already have been handled!");
return CanonicalIV;		return CanonicalIV;
}		}

// {0,+,F} --> {0,+,1} * F		// {0,+,F} --> {0,+,1} * F

// If this is a simple linear addrec, emit it now as a special case.		// If this is a simple linear addrec, emit it now as a special case.
if (S->isAffine()) // {0,+,F} --> i*F		if (S->isAffine()) // {0,+,F} --> i*F
return		return
expand(SE.getTruncateOrNoop(		expand(SE.getTruncateOrNoop(
SE.getMulExpr(SE.getUnknown(CanonicalIV),		SE.getMulExpr(SE.getUnknown(CanonicalIV),
SE.getNoopOrAnyExtend(S->getOperand(1),		SE.getNoopOrAnyExtend(S->getOperand(1),
CanonicalIV->getType())),		CanonicalIV->getType())),
Ty));		Ty));

// If this is a chain of recurrences, turn it into a closed form, using the		// If this is a chain of recurrences, turn it into a closed form, using the
// folders, then expandCodeFor the closed form. This allows the folders to		// folders, then expandCodeFor the closed form. This allows the folders to
// simplify the expression without having to build a bunch of special code		// simplify the expression without having to build a bunch of special code
// into this folder.		// into this folder.
const SCEV *IH = SE.getUnknown(CanonicalIV); // Get I as a "symbolic" SCEV.		const SCEV *IH = SE.getUnknown(CanonicalIV); // Get I as a "symbolic" SCEV.

// Promote S up to the canonical IV type, if the cast is foldable.		// Promote S up to the canonical IV type, if the cast is foldable.
const SCEV *NewS = S;		const SCEV *NewS = S;
const SCEV *Ext = SE.getNoopOrAnyExtend(S, CanonicalIV->getType());		const SCEV *Ext = SE.getNoopOrAnyExtend(S, CanonicalIV->getType());
if (isa<SCEVAddRecExpr>(Ext))		if (isa<SCEVAddRecExpr>(Ext))
NewS = Ext;		NewS = Ext;

const SCEV *V = cast<SCEVAddRecExpr>(NewS)->evaluateAtIteration(IH, SE);		const SCEV *V = cast<SCEVAddRecExpr>(NewS)->evaluateAtIteration(IH, SE);
//cerr << "Evaluated: " << *this << "\n to: " << *V << "\n";		//cerr << "Evaluated: " << *this << "\n to: " << *V << "\n";

// Truncate the result down to the original type, if needed.		// Truncate the result down to the original type, if needed.
const SCEV *T = SE.getTruncateOrNoop(V, Ty);		const SCEV *T = SE.getTruncateOrNoop(V, Ty);
return expand(T);		return expand(T);
}		}

Value *SCEVExpander::visitPtrToIntExpr(const SCEVPtrToIntExpr *S) {		Value *SCEVExpander::visitPtrToIntExpr(const SCEVPtrToIntExpr *S) {
Value *V =		Value *V =
Show All 21 Lines	Value SCEVExpander::visitSignExtendExpr(const SCEVSignExtendExpr S) {
Type *Ty = SE.getEffectiveSCEVType(S->getType());		Type *Ty = SE.getEffectiveSCEVType(S->getType());
Value *V = expandCodeForImpl(		Value *V = expandCodeForImpl(
S->getOperand(), SE.getEffectiveSCEVType(S->getOperand()->getType()),		S->getOperand(), SE.getEffectiveSCEVType(S->getOperand()->getType()),
false);		false);
return Builder.CreateSExt(V, Ty);		return Builder.CreateSExt(V, Ty);
}		}

Value *SCEVExpander::visitSMaxExpr(const SCEVSMaxExpr *S) {		Value *SCEVExpander::visitSMaxExpr(const SCEVSMaxExpr *S) {
Value *LHS = expand(S->getOperand(S->getNumOperands()-1));		Value *LHS = expand(S->getOperand(S->getNumOperands()-1));
Type *Ty = LHS->getType();		Type *Ty = LHS->getType();
for (int i = S->getNumOperands()-2; i >= 0; --i) {		for (int i = S->getNumOperands()-2; i >= 0; --i) {
// In the case of mixed integer and pointer types, do the		// In the case of mixed integer and pointer types, do the
// rest of the comparisons as integer.		// rest of the comparisons as integer.
Type *OpTy = S->getOperand(i)->getType();		Type *OpTy = S->getOperand(i)->getType();
if (OpTy->isIntegerTy() != Ty->isIntegerTy()) {		if (OpTy->isIntegerTy() != Ty->isIntegerTy()) {
Ty = SE.getEffectiveSCEVType(Ty);		Ty = SE.getEffectiveSCEVType(Ty);
LHS = InsertNoopCastOfTo(LHS, Ty);		LHS = InsertNoopCastOfTo(LHS, Ty);
}		}
Value *RHS = expandCodeForImpl(S->getOperand(i), Ty, false);		Value *RHS = expandCodeForImpl(S->getOperand(i), Ty, false);
Value *ICmp = Builder.CreateICmpSGT(LHS, RHS);		Value *ICmp = Builder.CreateICmpSGT(LHS, RHS);
Value *Sel = Builder.CreateSelect(ICmp, LHS, RHS, "smax");		Value *Sel = Builder.CreateSelect(ICmp, LHS, RHS, "smax");
LHS = Sel;		LHS = Sel;
}		}
// In the case of mixed integer and pointer types, cast the		// In the case of mixed integer and pointer types, cast the
// final result back to the pointer type.		// final result back to the pointer type.
if (LHS->getType() != S->getType())		if (LHS->getType() != S->getType())
LHS = InsertNoopCastOfTo(LHS, S->getType());		LHS = InsertNoopCastOfTo(LHS, S->getType());
return LHS;		return LHS;
}		}

Value *SCEVExpander::visitUMaxExpr(const SCEVUMaxExpr *S) {		Value *SCEVExpander::visitUMaxExpr(const SCEVUMaxExpr *S) {
Value *LHS = expand(S->getOperand(S->getNumOperands()-1));		Value *LHS = expand(S->getOperand(S->getNumOperands()-1));
Type *Ty = LHS->getType();		Type *Ty = LHS->getType();
for (int i = S->getNumOperands()-2; i >= 0; --i) {		for (int i = S->getNumOperands()-2; i >= 0; --i) {
// In the case of mixed integer and pointer types, do the		// In the case of mixed integer and pointer types, do the
// rest of the comparisons as integer.		// rest of the comparisons as integer.
Type *OpTy = S->getOperand(i)->getType();		Type *OpTy = S->getOperand(i)->getType();
if (OpTy->isIntegerTy() != Ty->isIntegerTy()) {		if (OpTy->isIntegerTy() != Ty->isIntegerTy()) {
Ty = SE.getEffectiveSCEVType(Ty);		Ty = SE.getEffectiveSCEVType(Ty);
LHS = InsertNoopCastOfTo(LHS, Ty);		LHS = InsertNoopCastOfTo(LHS, Ty);
}		}
Value *RHS = expandCodeForImpl(S->getOperand(i), Ty, false);		Value *RHS = expandCodeForImpl(S->getOperand(i), Ty, false);
▲ Show 20 Lines • Show All 140 Lines • ▼ Show 20 Lines	Value SCEVExpander::expand(const SCEV S) {
// Compute an insertion point for this SCEV object. Hoist the instructions		// Compute an insertion point for this SCEV object. Hoist the instructions
// as far out in the loop nest as possible.		// as far out in the loop nest as possible.
Instruction *InsertPt = &*Builder.GetInsertPoint();		Instruction *InsertPt = &*Builder.GetInsertPoint();

// We can move insertion point only if there is no div or rem operations		// We can move insertion point only if there is no div or rem operations
// otherwise we are risky to move it over the check for zero denominator.		// otherwise we are risky to move it over the check for zero denominator.
auto SafeToHoist = [](const SCEV *S) {		auto SafeToHoist = [](const SCEV *S) {
return !SCEVExprContains(S, [](const SCEV *S) {		return !SCEVExprContains(S, [](const SCEV *S) {
if (const auto *D = dyn_cast<SCEVUDivExpr>(S)) {		if (const auto *D = dyn_cast<SCEVUDivExpr>(S)) {
if (const auto *SC = dyn_cast<SCEVConstant>(D->getRHS()))		if (const auto *SC = dyn_cast<SCEVConstant>(D->getRHS()))
// Division by non-zero constants can be hoisted.		// Division by non-zero constants can be hoisted.
return SC->getValue()->isZero();		return SC->getValue()->isZero();
// All other divisions should not be moved as they may be		// All other divisions should not be moved as they may be
// divisions by zero and should be kept within the		// divisions by zero and should be kept within the
// conditions of the surrounding loops that guard their		// conditions of the surrounding loops that guard their
// execution (see PR35406).		// execution (see PR35406).
return true;		return true;
}		}
return false;		return false;
});		});
};		};
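
The `SafeToHoist` predicate above rejects any expression containing an unsigned division whose divisor is not a known non-zero constant. A self-contained sketch of that traversal on a toy expression tree follows; `MockExpr`, `containsUnsafeUDiv`, and `safeToHoist` are hypothetical names for illustration, and the real code walks SCEV nodes via `SCEVExprContains`.

```cpp
#include <cassert>

// Hypothetical toy expression node; not an LLVM type.
struct MockExpr {
  enum Kind { Constant, Unknown, Add, UDiv };
  Kind kind;
  unsigned long value;  // meaningful only for Constant
  const MockExpr *lhs;  // null for leaves
  const MockExpr *rhs;  // null for leaves
};

// True if the tree contains a udiv whose divisor is either a zero constant
// or not a constant at all (and so might be zero at run time).
inline bool containsUnsafeUDiv(const MockExpr *e) {
  if (!e)
    return false;
  if (e->kind == MockExpr::UDiv) {
    bool rhsIsNonZeroConst = e->rhs && e->rhs->kind == MockExpr::Constant &&
                             e->rhs->value != 0;
    if (!rhsIsNonZeroConst)
      return true;
  }
  return containsUnsafeUDiv(e->lhs) || containsUnsafeUDiv(e->rhs);
}

// Hoisting is allowed only when no potentially trapping division is present.
inline bool safeToHoist(const MockExpr *e) { return !containsUnsafeUDiv(e); }
```

Under these assumptions, `x / 4` may be hoisted, while `4 / x`, `x / 0`, and any sum containing one of them may not.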
if (SafeToHoist(S)) {		if (SafeToHoist(S)) {
for (Loop *L = SE.LI.getLoopFor(Builder.GetInsertBlock());;		for (Loop *L = SE.LI.getLoopFor(Builder.GetInsertBlock());;
L = L->getParentLoop()) {		L = L->getParentLoop()) {
if (SE.isLoopInvariant(S, L)) {		if (SE.isLoopInvariant(S, L)) {
if (!L) break;		if (!L) break;
if (BasicBlock *Preheader = L->getLoopPreheader())		if (BasicBlock *Preheader = L->getLoopPreheader())
InsertPt = Preheader->getTerminator();		InsertPt = Preheader->getTerminator();
else		else
// LSR sets the insertion point for AddRec start/step values to the		// LSR sets the insertion point for AddRec start/step values to the
// block start to simplify value reuse, even though it's an invalid		// block start to simplify value reuse, even though it's an invalid
// position. SCEVExpander must correct for this in all cases.		// position. SCEVExpander must correct for this in all cases.
InsertPt = &*L->getHeader()->getFirstInsertionPt();		InsertPt = &*L->getHeader()->getFirstInsertionPt();
} else {		} else {
▲ Show 20 Lines • Show All 87 Lines • ▼ Show 20 Lines
///		///
/// This does not depend on any SCEVExpander state but should be used in		/// This does not depend on any SCEVExpander state but should be used in
/// the same context that SCEVExpander is used.		/// the same context that SCEVExpander is used.
unsigned		unsigned
SCEVExpander::replaceCongruentIVs(Loop *L, const DominatorTree *DT,		SCEVExpander::replaceCongruentIVs(Loop *L, const DominatorTree *DT,
SmallVectorImpl<WeakTrackingVH> &DeadInsts,		SmallVectorImpl<WeakTrackingVH> &DeadInsts,
const TargetTransformInfo *TTI) {		const TargetTransformInfo *TTI) {
// Find integer phis in order of increasing width.		// Find integer phis in order of increasing width.
SmallVector<PHINode*, 8> Phis;		SmallVector<PHINode*, 8> Phis;
for (PHINode &PN : L->getHeader()->phis())		for (PHINode &PN : L->getHeader()->phis())
Phis.push_back(&PN);		Phis.push_back(&PN);

if (TTI)		if (TTI)
llvm::sort(Phis, [](Value *LHS, Value *RHS) {		llvm::sort(Phis, [](Value *LHS, Value *RHS) {
// Put pointers at the back and make sure pointer < pointer = false.		// Put pointers at the back and make sure pointer < pointer = false.
if (!LHS->getType()->isIntegerTy() || !RHS->getType()->isIntegerTy())		if (!LHS->getType()->isIntegerTy() || !RHS->getType()->isIntegerTy())
return RHS->getType()->isIntegerTy() && !LHS->getType()->isIntegerTy();		return RHS->getType()->isIntegerTy() && !LHS->getType()->isIntegerTy();
Show All 20 Lines	for (PHINode *Phi : Phis) {
// Fold constant phis. They may be congruent to other constant phis and		// Fold constant phis. They may be congruent to other constant phis and
// would confuse the logic below that expects proper IVs.		// would confuse the logic below that expects proper IVs.
if (Value *V = SimplifyPHINode(Phi)) {		if (Value *V = SimplifyPHINode(Phi)) {
if (V->getType() != Phi->getType())		if (V->getType() != Phi->getType())
continue;		continue;
Phi->replaceAllUsesWith(V);		Phi->replaceAllUsesWith(V);
DeadInsts.emplace_back(Phi);		DeadInsts.emplace_back(Phi);
++NumElim;		++NumElim;
DEBUG_WITH_TYPE(DebugType, dbgs()		DEBUG_WITH_TYPE(DebugType, dbgs()
<< "INDVARS: Eliminated constant iv: " << *Phi << '\n');		<< "INDVARS: Eliminated constant iv: " << *Phi << '\n');
continue;		continue;
}		}

if (!SE.isSCEVable(Phi->getType()))		if (!SE.isSCEVable(Phi->getType()))
continue;		continue;

PHINode *&OrigPhiRef = ExprToIVMap[SE.getSCEV(Phi)];		PHINode *&OrigPhiRef = ExprToIVMap[SE.getSCEV(Phi)];
if (!OrigPhiRef) {		if (!OrigPhiRef) {
OrigPhiRef = Phi;		OrigPhiRef = Phi;
if (Phi->getType()->isIntegerTy() && TTI &&		if (Phi->getType()->isIntegerTy() && TTI &&
TTI->isTruncateFree(Phi->getType(), Phis.back()->getType())) {		TTI->isTruncateFree(Phi->getType(), Phis.back()->getType())) {
// This phi can be freely truncated to the narrowest phi type. Map the		// This phi can be freely truncated to the narrowest phi type. Map the
// truncated expression to it so it will be reused for narrow types.		// truncated expression to it so it will be reused for narrow types.
const SCEV *TruncExpr =		const SCEV *TruncExpr =
SE.getTruncateExpr(SE.getSCEV(Phi), Phis.back()->getType());		SE.getTruncateExpr(SE.getSCEV(Phi), Phis.back()->getType());
ExprToIVMap[TruncExpr] = Phi;		ExprToIVMap[TruncExpr] = Phi;
}		}
continue;		continue;
}		}

// Replacing a pointer phi with an integer phi or vice-versa doesn't make		// Replacing a pointer phi with an integer phi or vice-versa doesn't make
// sense.		// sense.
if (OrigPhiRef->getType()->isPointerTy() != Phi->getType()->isPointerTy())		if (OrigPhiRef->getType()->isPointerTy() != Phi->getType()->isPointerTy())
[48 lines hidden]	if (BasicBlock *LatchBlock = L->getLoopLatch()) {
}		}
IsomorphicInc->replaceAllUsesWith(NewInc);		IsomorphicInc->replaceAllUsesWith(NewInc);
DeadInsts.emplace_back(IsomorphicInc);		DeadInsts.emplace_back(IsomorphicInc);
}		}
}		}
}		}
DEBUG_WITH_TYPE(DebugType, dbgs() << "INDVARS: Eliminated congruent iv: "		DEBUG_WITH_TYPE(DebugType, dbgs() << "INDVARS: Eliminated congruent iv: "
<< *Phi << '\n');		<< *Phi << '\n');
DEBUG_WITH_TYPE(DebugType, dbgs() << "INDVARS: Original iv: "		DEBUG_WITH_TYPE(DebugType, dbgs() << "INDVARS: Original iv: "
<< *OrigPhiRef << '\n');		<< *OrigPhiRef << '\n');
++NumElim;		++NumElim;
Value *NewIV = OrigPhiRef;		Value *NewIV = OrigPhiRef;
if (OrigPhiRef->getType() != Phi->getType()) {		if (OrigPhiRef->getType() != Phi->getType()) {
IRBuilder<> Builder(&*L->getHeader()->getFirstInsertionPt());		IRBuilder<> Builder(&*L->getHeader()->getFirstInsertionPt());
Builder.SetCurrentDebugLocation(Phi->getDebugLoc());		Builder.SetCurrentDebugLocation(Phi->getDebugLoc());
NewIV = Builder.CreateTruncOrBitCast(OrigPhiRef, Phi->getType(), IVName);		NewIV = Builder.CreateTruncOrBitCast(OrigPhiRef, Phi->getType(), IVName);
}		}
[36 lines hidden]	SCEVExpander::getRelatedExistingExpansion(const SCEV *S, const Instruction *At,

// There is potential to make this significantly smarter, but this simple		// There is potential to make this significantly smarter, but this simple
// heuristic already gets some interesting cases.		// heuristic already gets some interesting cases.

// Can not find suitable value.		// Can not find suitable value.
return None;		return None;
}		}

template<typename T> static int costAndCollectOperands(		template<typename T> static int costAndCollectOperands(
const SCEVOperand &WorkItem, const TargetTransformInfo &TTI,		const SCEVOperand &WorkItem, const TargetTransformInfo &TTI,
TargetTransformInfo::TargetCostKind CostKind,		TargetTransformInfo::TargetCostKind CostKind,
SmallVectorImpl<SCEVOperand> &Worklist) {		SmallVectorImpl<SCEVOperand> &Worklist) {

const T *S = cast<T>(WorkItem.S);		const T *S = cast<T>(WorkItem.S);
int Cost = 0;		int Cost = 0;
// Object to help map SCEV operands to expanded IR instructions.		// Object to help map SCEV operands to expanded IR instructions.
struct OperationIndices {		struct OperationIndices {
OperationIndices(unsigned Opc, size_t min, size_t max) :		OperationIndices(unsigned Opc, size_t min, size_t max) :
Opcode(Opc), MinIdx(min), MaxIdx(max) { }		Opcode(Opc), MinIdx(min), MaxIdx(max) { }
unsigned Opcode;		unsigned Opcode;
size_t MinIdx;		size_t MinIdx;
size_t MaxIdx;		size_t MaxIdx;
};		};

// Collect the operations of all the instructions that will be needed to		// Collect the operations of all the instructions that will be needed to
// expand the SCEVExpr. This is so that when we come to cost the operands,		// expand the SCEVExpr. This is so that when we come to cost the operands,
// we know what the generated user(s) will be.		// we know what the generated user(s) will be.
SmallVector<OperationIndices, 2> Operations;		SmallVector<OperationIndices, 2> Operations;

auto CastCost = [&](unsigned Opcode) {		auto CastCost = [&](unsigned Opcode) {
Operations.emplace_back(Opcode, 0, 0);		Operations.emplace_back(Opcode, 0, 0);
return TTI.getCastInstrCost(Opcode, S->getType(),		return TTI.getCastInstrCost(Opcode, S->getType(),
S->getOperand(0)->getType(),		S->getOperand(0)->getType(),
TTI::CastContextHint::None, CostKind);		TTI::CastContextHint::None, CostKind);
};		};

auto ArithCost = [&](unsigned Opcode, unsigned NumRequired,		auto ArithCost = [&](unsigned Opcode, unsigned NumRequired,
unsigned MinIdx = 0, unsigned MaxIdx = 1) {		unsigned MinIdx = 0, unsigned MaxIdx = 1) {
Operations.emplace_back(Opcode, MinIdx, MaxIdx);		Operations.emplace_back(Opcode, MinIdx, MaxIdx);
return NumRequired *		return NumRequired *
TTI.getArithmeticInstrCost(Opcode, S->getType(), CostKind);		TTI.getArithmeticInstrCost(Opcode, S->getType(), CostKind);
};		};

auto CmpSelCost = [&](unsigned Opcode, unsigned NumRequired,		auto CmpSelCost = [&](unsigned Opcode, unsigned NumRequired,
unsigned MinIdx, unsigned MaxIdx) {		unsigned MinIdx, unsigned MaxIdx) {
Operations.emplace_back(Opcode, MinIdx, MaxIdx);		Operations.emplace_back(Opcode, MinIdx, MaxIdx);
Type *OpType = S->getOperand(0)->getType();		Type *OpType = S->getOperand(0)->getType();
return NumRequired * TTI.getCmpSelInstrCost(		return NumRequired * TTI.getCmpSelInstrCost(
Opcode, OpType, CmpInst::makeCmpResultType(OpType),		Opcode, OpType, CmpInst::makeCmpResultType(OpType),
CmpInst::BAD_ICMP_PREDICATE, CostKind);		CmpInst::BAD_ICMP_PREDICATE, CostKind);
};		};

[38 lines hidden]	template<typename T> static int costAndCollectOperands(
case scUMinExpr: {		case scUMinExpr: {
Cost += CmpSelCost(Instruction::ICmp, S->getNumOperands() - 1, 0, 1);		Cost += CmpSelCost(Instruction::ICmp, S->getNumOperands() - 1, 0, 1);
Cost += CmpSelCost(Instruction::Select, S->getNumOperands() - 1, 0, 2);		Cost += CmpSelCost(Instruction::Select, S->getNumOperands() - 1, 0, 2);
break;		break;
}		}
case scAddRecExpr: {		case scAddRecExpr: {
// In this polynominal, we may have some zero operands, and we shouldn't		// In this polynominal, we may have some zero operands, and we shouldn't
// really charge for those. So how many non-zero coeffients are there?		// really charge for those. So how many non-zero coeffients are there?
int NumTerms = llvm::count_if(S->operands(), [](const SCEV *Op) {		int NumTerms = llvm::count_if(S->operands(), [](const SCEV *Op) {
return !Op->isZero();		return !Op->isZero();
});		});

assert(NumTerms >= 1 && "Polynominal should have at least one term.");		assert(NumTerms >= 1 && "Polynominal should have at least one term.");
assert(!(*std::prev(S->operands().end()))->isZero() &&		assert(!(*std::prev(S->operands().end()))->isZero() &&
"Last operand should not be zero");		"Last operand should not be zero");

// Ignoring constant term (operand 0), how many of the coeffients are u> 1?		// Ignoring constant term (operand 0), how many of the coeffients are u> 1?
int NumNonZeroDegreeNonOneTerms =		int NumNonZeroDegreeNonOneTerms =
llvm::count_if(S->operands(), [](const SCEV *Op) {		llvm::count_if(S->operands(), [](const SCEV *Op) {
auto *SConst = dyn_cast<SCEVConstant>(Op);		auto *SConst = dyn_cast<SCEVConstant>(Op);
return !SConst || SConst->getAPInt().ugt(1);		return !SConst || SConst->getAPInt().ugt(1);
});		});

// Much like with normal add expr, the polynominal will require		// Much like with normal add expr, the polynominal will require
// one less addition than the number of it's terms.		// one less addition than the number of it's terms.
int AddCost = ArithCost(Instruction::Add, NumTerms - 1,		int AddCost = ArithCost(Instruction::Add, NumTerms - 1,
/*MinIdx*/1, /*MaxIdx*/1);		/*MinIdx*/1, /*MaxIdx*/1);
// Here, each one of those will require a multiplication.		// Here, each one of those will require a multiplication.
int MulCost = ArithCost(Instruction::Mul, NumNonZeroDegreeNonOneTerms);		int MulCost = ArithCost(Instruction::Mul, NumNonZeroDegreeNonOneTerms);
Cost = AddCost + MulCost;		Cost = AddCost + MulCost;

// What is the degree of this polynominal?		// What is the degree of this polynominal?
int PolyDegree = S->getNumOperands() - 1;		int PolyDegree = S->getNumOperands() - 1;
assert(PolyDegree >= 1 && "Should be at least affine.");		assert(PolyDegree >= 1 && "Should be at least affine.");

[325 lines hidden]
// scaling the recurrence outside the loop, but this technique isn't generally		// scaling the recurrence outside the loop, but this technique isn't generally
// applicable. Expanding a nested recurrence outside a loop requires computing		// applicable. Expanding a nested recurrence outside a loop requires computing
// binomial coefficients. This could be done, but the recurrence has to be in a		// binomial coefficients. This could be done, but the recurrence has to be in a
// perfectly reduced form, which can't be guaranteed.		// perfectly reduced form, which can't be guaranteed.
struct SCEVFindUnsafe {		struct SCEVFindUnsafe {
ScalarEvolution &SE;		ScalarEvolution &SE;
bool IsUnsafe;		bool IsUnsafe;

SCEVFindUnsafe(ScalarEvolution &se): SE(se), IsUnsafe(false) {}		SCEVFindUnsafe(ScalarEvolution &se): SE(se), IsUnsafe(false) {}

bool follow(const SCEV *S) {		bool follow(const SCEV *S) {
if (const SCEVUDivExpr *D = dyn_cast<SCEVUDivExpr>(S)) {		if (const SCEVUDivExpr *D = dyn_cast<SCEVUDivExpr>(S)) {
const SCEVConstant *SC = dyn_cast<SCEVConstant>(D->getRHS());		const SCEVConstant *SC = dyn_cast<SCEVConstant>(D->getRHS());
if (!SC || SC->getValue()->isZero()) {		if (!SC || SC->getValue()->isZero()) {
IsUnsafe = true;		IsUnsafe = true;
return false;		return false;
}		}
}		}
if (const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(S)) {		if (const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(S)) {
const SCEV *Step = AR->getStepRecurrence(SE);		const SCEV *Step = AR->getStepRecurrence(SE);
if (!AR->isAffine() && !SE.dominates(Step, AR->getLoop()->getHeader())) {		if (!AR->isAffine() && !SE.dominates(Step, AR->getLoop()->getHeader())) {
IsUnsafe = true;		IsUnsafe = true;
return false;		return false;
}		}
}		}
return true;		return true;
}		}
bool isDone() const { return IsUnsafe; }		bool isDone() const { return IsUnsafe; }
};		};
}		}

namespace llvm {		namespace llvm {
bool isSafeToExpand(const SCEV *S, ScalarEvolution &SE) {		bool isSafeToExpand(const SCEV *S, ScalarEvolution &SE) {
SCEVFindUnsafe Search(SE);		SCEVFindUnsafe Search(SE);
visitAll(S, Search);		visitAll(S, Search);
return !Search.IsUnsafe;		return !Search.IsUnsafe;
}		}
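
For readers unfamiliar with `SCEVFindUnsafe`, the unsigned-division part of its check can be sketched outside LLVM. The following is only an illustration under stated assumptions: `Expr` and `isSafeToExpandToy` are hypothetical names invented here, not LLVM classes or functions, and the sketch omits the non-affine add-recurrence check that the real visitor also performs. It treats an expression as unsafe to expand if any unsigned division in it has a divisor that is not a constant, or is the constant zero.

```cpp
#include <cassert>
#include <vector>

// Hypothetical toy expression node; not an LLVM class.
struct Expr {
  enum Kind { Constant, Unknown, Add, UDiv } K;
  long long Value = 0;           // meaningful only when K == Constant
  std::vector<const Expr *> Ops; // for UDiv: Ops[0] / Ops[1]
};

// Returns false if any udiv in the tree has a divisor that is not a
// constant, or is the constant zero (mirrors the SCEVUDivExpr branch only).
bool isSafeToExpandToy(const Expr &E) {
  if (E.K == Expr::UDiv) {
    const Expr *RHS = E.Ops[1];
    if (RHS->K != Expr::Constant || RHS->Value == 0)
      return false;
  }
  // Visit every operand; one unsafe subexpression makes the whole tree unsafe.
  for (const Expr *Op : E.Ops)
    if (!isSafeToExpandToy(*Op))
      return false;
  return true;
}
```

In this toy model a division by a non-zero constant is accepted, while a division by zero or by an unknown value is rejected, including when it is nested inside another expression.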

[50 lines hidden]	assert(all_of(I->users(),
"during expansion");		"during expansion");
#endif		#endif
assert(!I->getType()->isVoidTy() &&		assert(!I->getType()->isVoidTy() &&
"inserted instruction should have non-void types");		"inserted instruction should have non-void types");
I->replaceAllUsesWith(UndefValue::get(I->getType()));		I->replaceAllUsesWith(UndefValue::get(I->getType()));
I->eraseFromParent();		I->eraseFromParent();
}		}
}		}
}		}
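
The operation counting in the `scAddRecExpr` case of `costAndCollectOperands` shown earlier in this file can be illustrated with a small standalone sketch. The names `Coeff`, `AddRecOpCounts` and `countAddRecOps` are assumptions made up for this illustration and do not exist in LLVM. A coefficient is modelled as either a known constant or "not a constant", and the sketch only counts instructions; it does not query `TargetTransformInfo` for per-instruction costs.

```cpp
#include <algorithm>
#include <cassert>
#include <optional>
#include <vector>

// Toy stand-in for one SCEV operand (an assumption, not an LLVM type):
// a known constant, or std::nullopt for a non-constant coefficient.
using Coeff = std::optional<unsigned long long>;

struct AddRecOpCounts {
  int Adds; // number of add instructions charged
  int Muls; // number of mul instructions charged
};

AddRecOpCounts countAddRecOps(const std::vector<Coeff> &Ops) {
  // Operands that are not the constant zero; a non-constant operand counts.
  int NumTerms = static_cast<int>(std::count_if(
      Ops.begin(), Ops.end(), [](const Coeff &C) { return !C || *C != 0; }));
  assert(NumTerms >= 1 && "Polynomial should have at least one term.");

  // Operands that are non-constant or a constant greater than 1. Like the
  // displayed code, this scans every operand, including operand 0.
  int NumMulTerms = static_cast<int>(std::count_if(
      Ops.begin(), Ops.end(), [](const Coeff &C) { return !C || *C > 1; }));

  // One fewer addition than the number of terms; one multiply per counted term.
  return {NumTerms - 1, NumMulTerms};
}
```

Under this model, the recurrence `{0,+,1}` is charged no additions and no multiplications, while `{5,+,3}` is charged one addition and two multiplications, because both constants exceed 1 and the scan does not skip operand 0.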

test/CodeGen/Thumb2/LowOverheadLoops/fast-fp-loops.ll

	[21 lines hidden]
	; CHECK-NEXT: ands r4, r5			; CHECK-NEXT: ands r4, r5
	; CHECK-NEXT: lsls r4, r4, #31			; CHECK-NEXT: lsls r4, r4, #31
	; CHECK-NEXT: itt eq			; CHECK-NEXT: itt eq
	; CHECK-NEXT: andeq.w r5, lr, r12			; CHECK-NEXT: andeq.w r5, lr, r12
	; CHECK-NEXT: lslseq.w r5, r5, #31			; CHECK-NEXT: lslseq.w r5, r5, #31
	; CHECK-NEXT: beq .LBB0_4			; CHECK-NEXT: beq .LBB0_4
	; CHECK-NEXT: @ %bb.2: @ %for.body.preheader			; CHECK-NEXT: @ %bb.2: @ %for.body.preheader
	; CHECK-NEXT: subs r5, r3, #1			; CHECK-NEXT: subs r5, r3, #1
	; CHECK-NEXT: and r7, r3, #3			; CHECK-NEXT: and lr, r3, #3
	; CHECK-NEXT: cmp r5, #3			; CHECK-NEXT: cmp r5, #3
	; CHECK-NEXT: bhs .LBB0_6			; CHECK-NEXT: bhs .LBB0_6
	; CHECK-NEXT: @ %bb.3:			; CHECK-NEXT: @ %bb.3:
	; CHECK-NEXT: mov.w r12, #0			; CHECK-NEXT: movs r3, #0
	; CHECK-NEXT: b .LBB0_8			; CHECK-NEXT: b .LBB0_8
	; CHECK-NEXT: .LBB0_4: @ %vector.ph			; CHECK-NEXT: .LBB0_4: @ %vector.ph
	; CHECK-NEXT: mov.w r12, #0			; CHECK-NEXT: mov.w r12, #0
	; CHECK-NEXT: dlstp.32 lr, r3			; CHECK-NEXT: dlstp.32 lr, r3
	; CHECK-NEXT: .LBB0_5: @ %vector.body			; CHECK-NEXT: .LBB0_5: @ %vector.body
	; CHECK-NEXT: @ =>This Inner Loop Header: Depth=1			; CHECK-NEXT: @ =>This Inner Loop Header: Depth=1
	; CHECK-NEXT: add.w r12, r12, #4			; CHECK-NEXT: add.w r12, r12, #4
	; CHECK-NEXT: vldrw.u32 q0, [r1], #16			; CHECK-NEXT: vldrw.u32 q0, [r1], #16
	; CHECK-NEXT: vldrw.u32 q1, [r2], #16			; CHECK-NEXT: vldrw.u32 q1, [r2], #16
	; CHECK-NEXT: vmul.f32 q0, q1, q0			; CHECK-NEXT: vmul.f32 q0, q1, q0
	; CHECK-NEXT: vstrw.32 q0, [r0], #16			; CHECK-NEXT: vstrw.32 q0, [r0], #16
	; CHECK-NEXT: letp lr, .LBB0_5			; CHECK-NEXT: letp lr, .LBB0_5
	; CHECK-NEXT: b .LBB0_11			; CHECK-NEXT: b .LBB0_11
	; CHECK-NEXT: .LBB0_6: @ %for.body.preheader.new			; CHECK-NEXT: .LBB0_6: @ %for.body.preheader.new
	; CHECK-NEXT: bic r3, r3, #3			; CHECK-NEXT: sub.w r12, r3, lr
	; CHECK-NEXT: movs r5, #1			; CHECK-NEXT: movs r4, #0
	; CHECK-NEXT: subs r3, #4
	; CHECK-NEXT: mov.w r12, #0
	; CHECK-NEXT: add.w lr, r5, r3, lsr #2
	; CHECK-NEXT: movs r3, #0			; CHECK-NEXT: movs r3, #0
	; CHECK-NEXT: dls lr, lr
	; CHECK-NEXT: .LBB0_7: @ %for.body			; CHECK-NEXT: .LBB0_7: @ %for.body
	; CHECK-NEXT: @ =>This Inner Loop Header: Depth=1			; CHECK-NEXT: @ =>This Inner Loop Header: Depth=1
	; CHECK-NEXT: adds r4, r1, r3			; CHECK-NEXT: adds r5, r1, r4
	; CHECK-NEXT: adds r5, r2, r3			; CHECK-NEXT: adds r6, r2, r4
	; CHECK-NEXT: adds r6, r0, r3			; CHECK-NEXT: adds r7, r0, r4
	; CHECK-NEXT: adds r3, #16			; CHECK-NEXT: adds r3, #4
	; CHECK-NEXT: vldr s0, [r4]			; CHECK-NEXT: vldr s0, [r5]
	; CHECK-NEXT: add.w r12, r12, #4			; CHECK-NEXT: adds r4, #16
	; CHECK-NEXT: vldr s2, [r5]			; CHECK-NEXT: vldr s2, [r6]
				; CHECK-NEXT: cmp r12, r3
	; CHECK-NEXT: vmul.f32 s0, s2, s0			; CHECK-NEXT: vmul.f32 s0, s2, s0
	; CHECK-NEXT: vstr s0, [r6]			; CHECK-NEXT: vstr s0, [r7]
	; CHECK-NEXT: vldr s0, [r4, #4]			; CHECK-NEXT: vldr s0, [r5, #4]
	; CHECK-NEXT: vldr s2, [r5, #4]			; CHECK-NEXT: vldr s2, [r6, #4]
	; CHECK-NEXT: vmul.f32 s0, s2, s0			; CHECK-NEXT: vmul.f32 s0, s2, s0
	; CHECK-NEXT: vstr s0, [r6, #4]			; CHECK-NEXT: vstr s0, [r7, #4]
	; CHECK-NEXT: vldr s0, [r4, #8]			; CHECK-NEXT: vldr s0, [r5, #8]
	; CHECK-NEXT: vldr s2, [r5, #8]			; CHECK-NEXT: vldr s2, [r6, #8]
	; CHECK-NEXT: vmul.f32 s0, s2, s0			; CHECK-NEXT: vmul.f32 s0, s2, s0
	; CHECK-NEXT: vstr s0, [r6, #8]			; CHECK-NEXT: vstr s0, [r7, #8]
	; CHECK-NEXT: vldr s0, [r4, #12]			; CHECK-NEXT: vldr s0, [r5, #12]
	; CHECK-NEXT: vldr s2, [r5, #12]			; CHECK-NEXT: vldr s2, [r6, #12]
	; CHECK-NEXT: vmul.f32 s0, s2, s0			; CHECK-NEXT: vmul.f32 s0, s2, s0
	; CHECK-NEXT: vstr s0, [r6, #12]			; CHECK-NEXT: vstr s0, [r7, #12]
	; CHECK-NEXT: le lr, .LBB0_7			; CHECK-NEXT: bne .LBB0_7
	; CHECK-NEXT: .LBB0_8: @ %for.cond.cleanup.loopexit.unr-lcssa			; CHECK-NEXT: .LBB0_8: @ %for.cond.cleanup.loopexit.unr-lcssa
	; CHECK-NEXT: wls lr, r7, .LBB0_11			; CHECK-NEXT: wls lr, lr, .LBB0_11
	; CHECK-NEXT: @ %bb.9: @ %for.body.epil.preheader			; CHECK-NEXT: @ %bb.9: @ %for.body.epil.preheader
	; CHECK-NEXT: add.w r1, r1, r12, lsl #2			; CHECK-NEXT: add.w r1, r1, r3, lsl #2
	; CHECK-NEXT: add.w r2, r2, r12, lsl #2			; CHECK-NEXT: add.w r2, r2, r3, lsl #2
	; CHECK-NEXT: add.w r0, r0, r12, lsl #2			; CHECK-NEXT: add.w r0, r0, r3, lsl #2
	; CHECK-NEXT: mov lr, r7
	; CHECK-NEXT: .LBB0_10: @ %for.body.epil			; CHECK-NEXT: .LBB0_10: @ %for.body.epil
	; CHECK-NEXT: @ =>This Inner Loop Header: Depth=1			; CHECK-NEXT: @ =>This Inner Loop Header: Depth=1
	; CHECK-NEXT: vldr s0, [r1]			; CHECK-NEXT: vldr s0, [r1]
	; CHECK-NEXT: adds r1, #4			; CHECK-NEXT: adds r1, #4
	; CHECK-NEXT: vldr s2, [r2]			; CHECK-NEXT: vldr s2, [r2]
	; CHECK-NEXT: adds r2, #4			; CHECK-NEXT: adds r2, #4
	; CHECK-NEXT: vmul.f32 s0, s2, s0			; CHECK-NEXT: vmul.f32 s0, s2, s0
	; CHECK-NEXT: vstr s0, [r0]			; CHECK-NEXT: vstr s0, [r0]
	[501 lines hidden]

test/CodeGen/Thumb2/LowOverheadLoops/mve-float-loops.ll

	[1,453 lines hidden]

	define arm_aapcs_vfpcc float @half_half_mac(half* nocapture readonly %a, half* nocapture readonly %b, i32 %N) {			define arm_aapcs_vfpcc float @half_half_mac(half* nocapture readonly %a, half* nocapture readonly %b, i32 %N) {
	; CHECK-LABEL: half_half_mac:			; CHECK-LABEL: half_half_mac:
	; CHECK: @ %bb.0: @ %entry			; CHECK: @ %bb.0: @ %entry
	; CHECK-NEXT: push {r4, r5, r7, lr}			; CHECK-NEXT: push {r4, r5, r7, lr}
	; CHECK-NEXT: cbz r2, .LBB9_3			; CHECK-NEXT: cbz r2, .LBB9_3
	; CHECK-NEXT: @ %bb.1: @ %for.body.preheader			; CHECK-NEXT: @ %bb.1: @ %for.body.preheader
	; CHECK-NEXT: subs r3, r2, #1			; CHECK-NEXT: subs r3, r2, #1
	; CHECK-NEXT: and r5, r2, #3			; CHECK-NEXT: and lr, r2, #3
				; CHECK-NEXT: vldr s0, .LCPI9_0
	; CHECK-NEXT: cmp r3, #3			; CHECK-NEXT: cmp r3, #3
	; CHECK-NEXT: bhs .LBB9_4			; CHECK-NEXT: bhs .LBB9_4
	; CHECK-NEXT: @ %bb.2:			; CHECK-NEXT: @ %bb.2:
	; CHECK-NEXT: vldr s0, .LCPI9_0			; CHECK-NEXT: movs r2, #0
	; CHECK-NEXT: mov.w r12, #0
	; CHECK-NEXT: b .LBB9_6			; CHECK-NEXT: b .LBB9_6
	; CHECK-NEXT: .LBB9_3:			; CHECK-NEXT: .LBB9_3:
	; CHECK-NEXT: vldr s0, .LCPI9_0			; CHECK-NEXT: vldr s0, .LCPI9_0
	; CHECK-NEXT: b .LBB9_9			; CHECK-NEXT: b .LBB9_9
	; CHECK-NEXT: .LBB9_4: @ %for.body.preheader.new			; CHECK-NEXT: .LBB9_4: @ %for.body.preheader.new
	; CHECK-NEXT: bic r2, r2, #3			; CHECK-NEXT: sub.w r12, r2, lr
	; CHECK-NEXT: movs r3, #1
	; CHECK-NEXT: subs r2, #4
	; CHECK-NEXT: vldr s0, .LCPI9_0
	; CHECK-NEXT: mov.w r12, #0
	; CHECK-NEXT: add.w lr, r3, r2, lsr #2
	; CHECK-NEXT: movs r3, #0			; CHECK-NEXT: movs r3, #0
	; CHECK-NEXT: dls lr, lr			; CHECK-NEXT: movs r2, #0
	; CHECK-NEXT: .LBB9_5: @ %for.body			; CHECK-NEXT: .LBB9_5: @ %for.body
	; CHECK-NEXT: @ =>This Inner Loop Header: Depth=1			; CHECK-NEXT: @ =>This Inner Loop Header: Depth=1
	; CHECK-NEXT: adds r4, r0, r3			; CHECK-NEXT: adds r5, r0, r3
	; CHECK-NEXT: adds r2, r1, r3			; CHECK-NEXT: adds r4, r1, r3
	; CHECK-NEXT: vldr.16 s2, [r2, #6]			; CHECK-NEXT: vldr.16 s2, [r4, #6]
	; CHECK-NEXT: vldr.16 s4, [r4, #6]			; CHECK-NEXT: vldr.16 s4, [r5, #6]
	; CHECK-NEXT: vldr.16 s6, [r4, #4]			; CHECK-NEXT: vldr.16 s6, [r5, #4]
	; CHECK-NEXT: vldr.16 s8, [r4, #2]			; CHECK-NEXT: vldr.16 s8, [r5, #2]
	; CHECK-NEXT: vmul.f16 s2, s4, s2			; CHECK-NEXT: vmul.f16 s2, s4, s2
	; CHECK-NEXT: vldr.16 s4, [r2, #4]			; CHECK-NEXT: vldr.16 s4, [r4, #4]
	; CHECK-NEXT: vldr.16 s10, [r4]			; CHECK-NEXT: vldr.16 s10, [r5]
	; CHECK-NEXT: vcvtb.f32.f16 s2, s2			; CHECK-NEXT: vcvtb.f32.f16 s2, s2
	; CHECK-NEXT: vmul.f16 s4, s6, s4			; CHECK-NEXT: vmul.f16 s4, s6, s4
	; CHECK-NEXT: vldr.16 s6, [r2, #2]			; CHECK-NEXT: vldr.16 s6, [r4, #2]
	; CHECK-NEXT: vcvtb.f32.f16 s4, s4			; CHECK-NEXT: vcvtb.f32.f16 s4, s4
	; CHECK-NEXT: adds r3, #8			; CHECK-NEXT: adds r2, #4
	; CHECK-NEXT: vmul.f16 s6, s8, s6			; CHECK-NEXT: vmul.f16 s6, s8, s6
	; CHECK-NEXT: vldr.16 s8, [r2]			; CHECK-NEXT: vldr.16 s8, [r4]
	; CHECK-NEXT: vcvtb.f32.f16 s6, s6			; CHECK-NEXT: vcvtb.f32.f16 s6, s6
	; CHECK-NEXT: add.w r12, r12, #4			; CHECK-NEXT: adds r3, #8
	; CHECK-NEXT: vmul.f16 s8, s10, s8			; CHECK-NEXT: vmul.f16 s8, s10, s8
				; CHECK-NEXT: cmp r12, r2
	; CHECK-NEXT: vcvtb.f32.f16 s8, s8			; CHECK-NEXT: vcvtb.f32.f16 s8, s8
	; CHECK-NEXT: vadd.f32 s0, s0, s8			; CHECK-NEXT: vadd.f32 s0, s0, s8
	; CHECK-NEXT: vadd.f32 s0, s0, s6			; CHECK-NEXT: vadd.f32 s0, s0, s6
	; CHECK-NEXT: vadd.f32 s0, s0, s4			; CHECK-NEXT: vadd.f32 s0, s0, s4
	; CHECK-NEXT: vadd.f32 s0, s0, s2			; CHECK-NEXT: vadd.f32 s0, s0, s2
	; CHECK-NEXT: le lr, .LBB9_5			; CHECK-NEXT: bne .LBB9_5
	; CHECK-NEXT: .LBB9_6: @ %for.cond.cleanup.loopexit.unr-lcssa			; CHECK-NEXT: .LBB9_6: @ %for.cond.cleanup.loopexit.unr-lcssa
	; CHECK-NEXT: wls lr, r5, .LBB9_9			; CHECK-NEXT: wls lr, lr, .LBB9_9
	; CHECK-NEXT: @ %bb.7: @ %for.body.epil.preheader			; CHECK-NEXT: @ %bb.7: @ %for.body.epil.preheader
	; CHECK-NEXT: add.w r0, r0, r12, lsl #1			; CHECK-NEXT: add.w r0, r0, r2, lsl #1
	; CHECK-NEXT: add.w r1, r1, r12, lsl #1			; CHECK-NEXT: add.w r1, r1, r2, lsl #1
	; CHECK-NEXT: mov lr, r5
	; CHECK-NEXT: .LBB9_8: @ %for.body.epil			; CHECK-NEXT: .LBB9_8: @ %for.body.epil
	; CHECK-NEXT: @ =>This Inner Loop Header: Depth=1			; CHECK-NEXT: @ =>This Inner Loop Header: Depth=1
	; CHECK-NEXT: vldr.16 s2, [r1]			; CHECK-NEXT: vldr.16 s2, [r1]
	; CHECK-NEXT: vldr.16 s4, [r0]			; CHECK-NEXT: vldr.16 s4, [r0]
	; CHECK-NEXT: adds r0, #2			; CHECK-NEXT: adds r0, #2
	; CHECK-NEXT: adds r1, #2			; CHECK-NEXT: adds r1, #2
	; CHECK-NEXT: vmul.f16 s2, s4, s2			; CHECK-NEXT: vmul.f16 s2, s4, s2
	; CHECK-NEXT: vcvtb.f32.f16 s2, s2			; CHECK-NEXT: vcvtb.f32.f16 s2, s2
	[89 lines hidden]

	define arm_aapcs_vfpcc float @half_half_acc(half* nocapture readonly %a, half* nocapture readonly %b, i32 %N) {			define arm_aapcs_vfpcc float @half_half_acc(half* nocapture readonly %a, half* nocapture readonly %b, i32 %N) {
	; CHECK-LABEL: half_half_acc:			; CHECK-LABEL: half_half_acc:
	; CHECK: @ %bb.0: @ %entry			; CHECK: @ %bb.0: @ %entry
	; CHECK-NEXT: push {r4, r5, r7, lr}			; CHECK-NEXT: push {r4, r5, r7, lr}
	; CHECK-NEXT: cbz r2, .LBB10_3			; CHECK-NEXT: cbz r2, .LBB10_3
	; CHECK-NEXT: @ %bb.1: @ %for.body.preheader			; CHECK-NEXT: @ %bb.1: @ %for.body.preheader
	; CHECK-NEXT: subs r3, r2, #1			; CHECK-NEXT: subs r3, r2, #1
	; CHECK-NEXT: and r5, r2, #3			; CHECK-NEXT: and lr, r2, #3
				; CHECK-NEXT: vldr s0, .LCPI10_0
	; CHECK-NEXT: cmp r3, #3			; CHECK-NEXT: cmp r3, #3
	; CHECK-NEXT: bhs .LBB10_4			; CHECK-NEXT: bhs .LBB10_4
	; CHECK-NEXT: @ %bb.2:			; CHECK-NEXT: @ %bb.2:
	; CHECK-NEXT: vldr s0, .LCPI10_0			; CHECK-NEXT: movs r2, #0
	; CHECK-NEXT: mov.w r12, #0
	; CHECK-NEXT: b .LBB10_6			; CHECK-NEXT: b .LBB10_6
	; CHECK-NEXT: .LBB10_3:			; CHECK-NEXT: .LBB10_3:
	; CHECK-NEXT: vldr s0, .LCPI10_0			; CHECK-NEXT: vldr s0, .LCPI10_0
	; CHECK-NEXT: b .LBB10_9			; CHECK-NEXT: b .LBB10_9
	; CHECK-NEXT: .LBB10_4: @ %for.body.preheader.new			; CHECK-NEXT: .LBB10_4: @ %for.body.preheader.new
	; CHECK-NEXT: bic r2, r2, #3			; CHECK-NEXT: sub.w r12, r2, lr
	; CHECK-NEXT: movs r3, #1
	; CHECK-NEXT: subs r2, #4
	; CHECK-NEXT: vldr s0, .LCPI10_0
	; CHECK-NEXT: mov.w r12, #0
	; CHECK-NEXT: add.w lr, r3, r2, lsr #2
	; CHECK-NEXT: movs r3, #0			; CHECK-NEXT: movs r3, #0
	; CHECK-NEXT: dls lr, lr			; CHECK-NEXT: movs r2, #0
	; CHECK-NEXT: .LBB10_5: @ %for.body			; CHECK-NEXT: .LBB10_5: @ %for.body
	; CHECK-NEXT: @ =>This Inner Loop Header: Depth=1			; CHECK-NEXT: @ =>This Inner Loop Header: Depth=1
	; CHECK-NEXT: adds r4, r0, r3			; CHECK-NEXT: adds r5, r0, r3
	; CHECK-NEXT: adds r2, r1, r3			; CHECK-NEXT: adds r4, r1, r3
	; CHECK-NEXT: vldr.16 s2, [r2, #6]			; CHECK-NEXT: vldr.16 s2, [r4, #6]
	; CHECK-NEXT: vldr.16 s4, [r4, #6]			; CHECK-NEXT: vldr.16 s4, [r5, #6]
	; CHECK-NEXT: vldr.16 s6, [r4, #4]			; CHECK-NEXT: vldr.16 s6, [r5, #4]
	; CHECK-NEXT: vldr.16 s8, [r4, #2]			; CHECK-NEXT: vldr.16 s8, [r5, #2]
	; CHECK-NEXT: vadd.f16 s2, s4, s2			; CHECK-NEXT: vadd.f16 s2, s4, s2
	; CHECK-NEXT: vldr.16 s4, [r2, #4]			; CHECK-NEXT: vldr.16 s4, [r4, #4]
	; CHECK-NEXT: vldr.16 s10, [r4]			; CHECK-NEXT: vldr.16 s10, [r5]
	; CHECK-NEXT: vcvtb.f32.f16 s2, s2			; CHECK-NEXT: vcvtb.f32.f16 s2, s2
	; CHECK-NEXT: vadd.f16 s4, s6, s4			; CHECK-NEXT: vadd.f16 s4, s6, s4
	; CHECK-NEXT: vldr.16 s6, [r2, #2]			; CHECK-NEXT: vldr.16 s6, [r4, #2]
	; CHECK-NEXT: vcvtb.f32.f16 s4, s4			; CHECK-NEXT: vcvtb.f32.f16 s4, s4
	; CHECK-NEXT: adds r3, #8			; CHECK-NEXT: adds r2, #4
	; CHECK-NEXT: vadd.f16 s6, s8, s6			; CHECK-NEXT: vadd.f16 s6, s8, s6
	; CHECK-NEXT: vldr.16 s8, [r2]			; CHECK-NEXT: vldr.16 s8, [r4]
	; CHECK-NEXT: vcvtb.f32.f16 s6, s6			; CHECK-NEXT: vcvtb.f32.f16 s6, s6
	; CHECK-NEXT: add.w r12, r12, #4			; CHECK-NEXT: adds r3, #8
	; CHECK-NEXT: vadd.f16 s8, s10, s8			; CHECK-NEXT: vadd.f16 s8, s10, s8
				; CHECK-NEXT: cmp r12, r2
	; CHECK-NEXT: vcvtb.f32.f16 s8, s8			; CHECK-NEXT: vcvtb.f32.f16 s8, s8
	; CHECK-NEXT: vadd.f32 s0, s0, s8			; CHECK-NEXT: vadd.f32 s0, s0, s8
	; CHECK-NEXT: vadd.f32 s0, s0, s6			; CHECK-NEXT: vadd.f32 s0, s0, s6
	; CHECK-NEXT: vadd.f32 s0, s0, s4			; CHECK-NEXT: vadd.f32 s0, s0, s4
	; CHECK-NEXT: vadd.f32 s0, s0, s2			; CHECK-NEXT: vadd.f32 s0, s0, s2
	; CHECK-NEXT: le lr, .LBB10_5			; CHECK-NEXT: bne .LBB10_5
	; CHECK-NEXT: .LBB10_6: @ %for.cond.cleanup.loopexit.unr-lcssa			; CHECK-NEXT: .LBB10_6: @ %for.cond.cleanup.loopexit.unr-lcssa
	; CHECK-NEXT: wls lr, r5, .LBB10_9			; CHECK-NEXT: wls lr, lr, .LBB10_9
	; CHECK-NEXT: @ %bb.7: @ %for.body.epil.preheader			; CHECK-NEXT: @ %bb.7: @ %for.body.epil.preheader
	; CHECK-NEXT: add.w r0, r0, r12, lsl #1			; CHECK-NEXT: add.w r0, r0, r2, lsl #1
	; CHECK-NEXT: add.w r1, r1, r12, lsl #1			; CHECK-NEXT: add.w r1, r1, r2, lsl #1
	; CHECK-NEXT: mov lr, r5
	; CHECK-NEXT: .LBB10_8: @ %for.body.epil			; CHECK-NEXT: .LBB10_8: @ %for.body.epil
	; CHECK-NEXT: @ =>This Inner Loop Header: Depth=1			; CHECK-NEXT: @ =>This Inner Loop Header: Depth=1
	; CHECK-NEXT: vldr.16 s2, [r1]			; CHECK-NEXT: vldr.16 s2, [r1]
	; CHECK-NEXT: vldr.16 s4, [r0]			; CHECK-NEXT: vldr.16 s4, [r0]
	; CHECK-NEXT: adds r0, #2			; CHECK-NEXT: adds r0, #2
	; CHECK-NEXT: adds r1, #2			; CHECK-NEXT: adds r1, #2
	; CHECK-NEXT: vadd.f16 s2, s4, s2			; CHECK-NEXT: vadd.f16 s2, s4, s2
	; CHECK-NEXT: vcvtb.f32.f16 s2, s2			; CHECK-NEXT: vcvtb.f32.f16 s2, s2
	[89 lines hidden]

	define arm_aapcs_vfpcc float @half_short_mac(half* nocapture readonly %a, i16* nocapture readonly %b, i32 %N) {			define arm_aapcs_vfpcc float @half_short_mac(half* nocapture readonly %a, i16* nocapture readonly %b, i32 %N) {
	; CHECK-LABEL: half_short_mac:			; CHECK-LABEL: half_short_mac:
	; CHECK: @ %bb.0: @ %entry			; CHECK: @ %bb.0: @ %entry
	; CHECK-NEXT: push {r4, r5, r6, lr}			; CHECK-NEXT: push {r4, r5, r6, lr}
	; CHECK-NEXT: cbz r2, .LBB11_3			; CHECK-NEXT: cbz r2, .LBB11_3
	; CHECK-NEXT: @ %bb.1: @ %for.body.preheader			; CHECK-NEXT: @ %bb.1: @ %for.body.preheader
	; CHECK-NEXT: subs r3, r2, #1			; CHECK-NEXT: subs r3, r2, #1
	; CHECK-NEXT: and r6, r2, #3			; CHECK-NEXT: and lr, r2, #3
				; CHECK-NEXT: vldr s0, .LCPI11_0
	; CHECK-NEXT: cmp r3, #3			; CHECK-NEXT: cmp r3, #3
	; CHECK-NEXT: bhs .LBB11_4			; CHECK-NEXT: bhs .LBB11_4
	; CHECK-NEXT: @ %bb.2:			; CHECK-NEXT: @ %bb.2:
	; CHECK-NEXT: vldr s0, .LCPI11_0			; CHECK-NEXT: movs r2, #0
	; CHECK-NEXT: mov.w r12, #0
	; CHECK-NEXT: b .LBB11_6			; CHECK-NEXT: b .LBB11_6
	; CHECK-NEXT: .LBB11_3:			; CHECK-NEXT: .LBB11_3:
	; CHECK-NEXT: vldr s0, .LCPI11_0			; CHECK-NEXT: vldr s0, .LCPI11_0
	; CHECK-NEXT: b .LBB11_9			; CHECK-NEXT: b .LBB11_9
	; CHECK-NEXT: .LBB11_4: @ %for.body.preheader.new			; CHECK-NEXT: .LBB11_4: @ %for.body.preheader.new
	; CHECK-NEXT: bic r2, r2, #3			; CHECK-NEXT: sub.w r12, r2, lr
	; CHECK-NEXT: movs r3, #1
	; CHECK-NEXT: subs r2, #4
	; CHECK-NEXT: vldr s0, .LCPI11_0
	; CHECK-NEXT: mov.w r12, #0
	; CHECK-NEXT: add.w lr, r3, r2, lsr #2
	; CHECK-NEXT: adds r3, r1, #4			; CHECK-NEXT: adds r3, r1, #4
	; CHECK-NEXT: dls lr, lr			; CHECK-NEXT: adds r4, r0, #4
	; CHECK-NEXT: adds r2, r0, #4			; CHECK-NEXT: movs r2, #0
	; CHECK-NEXT: .LBB11_5: @ %for.body			; CHECK-NEXT: .LBB11_5: @ %for.body
	; CHECK-NEXT: @ =>This Inner Loop Header: Depth=1			; CHECK-NEXT: @ =>This Inner Loop Header: Depth=1
	; CHECK-NEXT: ldrsh.w r4, [r3, #2]			; CHECK-NEXT: ldrsh.w r5, [r3, #2]
	; CHECK-NEXT: vldr.16 s2, [r2, #2]			; CHECK-NEXT: vldr.16 s2, [r4, #2]
	; CHECK-NEXT: add.w r12, r12, #4			; CHECK-NEXT: adds r2, #4
	; CHECK-NEXT: vmov s4, r4			; CHECK-NEXT: cmp r12, r2
	; CHECK-NEXT: ldrsh r4, [r3], #8			; CHECK-NEXT: vmov s4, r5
				; CHECK-NEXT: ldrsh r5, [r3], #8
	; CHECK-NEXT: vcvt.f16.s32 s4, s4			; CHECK-NEXT: vcvt.f16.s32 s4, s4
	; CHECK-NEXT: ldrsh r5, [r3, #-10]			; CHECK-NEXT: ldrsh r6, [r3, #-10]
	; CHECK-NEXT: vmul.f16 s2, s2, s4			; CHECK-NEXT: vmul.f16 s2, s2, s4
	; CHECK-NEXT: vmov s6, r4			; CHECK-NEXT: vmov s6, r5
	; CHECK-NEXT: vldr.16 s4, [r2]			; CHECK-NEXT: vldr.16 s4, [r4]
	; CHECK-NEXT: vcvt.f16.s32 s6, s6			; CHECK-NEXT: vcvt.f16.s32 s6, s6
	; CHECK-NEXT: ldrsh r4, [r3, #-12]			; CHECK-NEXT: ldrsh r5, [r3, #-12]
	; CHECK-NEXT: vmul.f16 s4, s4, s6			; CHECK-NEXT: vmul.f16 s4, s4, s6
	; CHECK-NEXT: vmov s8, r5			; CHECK-NEXT: vmov s8, r6
	; CHECK-NEXT: vldr.16 s6, [r2, #-2]			; CHECK-NEXT: vldr.16 s6, [r4, #-2]
	; CHECK-NEXT: vcvt.f16.s32 s8, s8			; CHECK-NEXT: vcvt.f16.s32 s8, s8
	; CHECK-NEXT: vmov s10, r4			; CHECK-NEXT: vmov s10, r5
	; CHECK-NEXT: vcvtb.f32.f16 s4, s4			; CHECK-NEXT: vcvtb.f32.f16 s4, s4
	; CHECK-NEXT: vmul.f16 s6, s6, s8			; CHECK-NEXT: vmul.f16 s6, s6, s8
	; CHECK-NEXT: vldr.16 s8, [r2, #-4]			; CHECK-NEXT: vldr.16 s8, [r4, #-4]
	; CHECK-NEXT: vcvt.f16.s32 s10, s10			; CHECK-NEXT: vcvt.f16.s32 s10, s10
	; CHECK-NEXT: vcvtb.f32.f16 s6, s6			; CHECK-NEXT: vcvtb.f32.f16 s6, s6
	; CHECK-NEXT: vmul.f16 s8, s8, s10			; CHECK-NEXT: vmul.f16 s8, s8, s10
	; CHECK-NEXT: vcvtb.f32.f16 s2, s2			; CHECK-NEXT: vcvtb.f32.f16 s2, s2
	; CHECK-NEXT: vcvtb.f32.f16 s8, s8			; CHECK-NEXT: vcvtb.f32.f16 s8, s8
	; CHECK-NEXT: adds r2, #8			; CHECK-NEXT: add.w r4, r4, #8
	; CHECK-NEXT: vadd.f32 s0, s0, s8			; CHECK-NEXT: vadd.f32 s0, s0, s8
	; CHECK-NEXT: vadd.f32 s0, s0, s6			; CHECK-NEXT: vadd.f32 s0, s0, s6
	; CHECK-NEXT: vadd.f32 s0, s0, s4			; CHECK-NEXT: vadd.f32 s0, s0, s4
	; CHECK-NEXT: vadd.f32 s0, s0, s2			; CHECK-NEXT: vadd.f32 s0, s0, s2
	; CHECK-NEXT: le lr, .LBB11_5			; CHECK-NEXT: bne .LBB11_5
	; CHECK-NEXT: .LBB11_6: @ %for.cond.cleanup.loopexit.unr-lcssa			; CHECK-NEXT: .LBB11_6: @ %for.cond.cleanup.loopexit.unr-lcssa
	; CHECK-NEXT: wls lr, r6, .LBB11_9			; CHECK-NEXT: wls lr, lr, .LBB11_9
	; CHECK-NEXT: @ %bb.7: @ %for.body.epil.preheader			; CHECK-NEXT: @ %bb.7: @ %for.body.epil.preheader
	; CHECK-NEXT: add.w r0, r0, r12, lsl #1			; CHECK-NEXT: add.w r0, r0, r2, lsl #1
	; CHECK-NEXT: add.w r1, r1, r12, lsl #1			; CHECK-NEXT: add.w r1, r1, r2, lsl #1
	; CHECK-NEXT: mov lr, r6
	; CHECK-NEXT: .LBB11_8: @ %for.body.epil			; CHECK-NEXT: .LBB11_8: @ %for.body.epil
	; CHECK-NEXT: @ =>This Inner Loop Header: Depth=1			; CHECK-NEXT: @ =>This Inner Loop Header: Depth=1
	; CHECK-NEXT: ldrsh r2, [r1], #2			; CHECK-NEXT: ldrsh r2, [r1], #2
	; CHECK-NEXT: vldr.16 s2, [r0]			; CHECK-NEXT: vldr.16 s2, [r0]
	; CHECK-NEXT: adds r0, #2			; CHECK-NEXT: adds r0, #2
	; CHECK-NEXT: vmov s4, r2			; CHECK-NEXT: vmov s4, r2
	; CHECK-NEXT: vcvt.f16.s32 s4, s4			; CHECK-NEXT: vcvt.f16.s32 s4, s4
	; CHECK-NEXT: vmul.f16 s2, s2, s4			; CHECK-NEXT: vmul.f16 s2, s2, s4

test/CodeGen/Thumb2/LowOverheadLoops/mve-tail-data-types.ll

	define arm_aapcs_vfpcc void @test_vec_mul_scalar_add_char(i8* nocapture readonly %a, i8* nocapture readonly %b, i8 zeroext %c, i32* nocapture %res, i32 %N) {			define arm_aapcs_vfpcc void @test_vec_mul_scalar_add_char(i8* nocapture readonly %a, i8* nocapture readonly %b, i8 zeroext %c, i32* nocapture %res, i32 %N) {
	; CHECK-LABEL: test_vec_mul_scalar_add_char:			; CHECK-LABEL: test_vec_mul_scalar_add_char:
	; CHECK: @ %bb.0: @ %entry			; CHECK: @ %bb.0: @ %entry
	; CHECK-NEXT: push.w {r4, r5, r6, r7, r8, r9, lr}			; CHECK-NEXT: push.w {r4, r5, r6, r7, r8, r9, lr}
	; CHECK-NEXT: ldr.w r12, [sp, #28]			; CHECK-NEXT: ldr.w r12, [sp, #28]
	; CHECK-NEXT: cmp.w r12, #0			; CHECK-NEXT: cmp.w r12, #0
	; CHECK-NEXT: beq.w .LBB5_11			; CHECK-NEXT: beq.w .LBB5_11
	; CHECK-NEXT: @ %bb.1: @ %for.body.lr.ph			; CHECK-NEXT: @ %bb.1: @ %for.body.lr.ph
	; CHECK-NEXT: add.w r4, r3, r12, lsl #2			; CHECK-NEXT: add.w r5, r3, r12, lsl #2
	; CHECK-NEXT: add.w r5, r1, r12			; CHECK-NEXT: add.w r6, r1, r12
	; CHECK-NEXT: cmp r4, r1			; CHECK-NEXT: cmp r5, r1
	; CHECK-NEXT: add.w r6, r0, r12			; CHECK-NEXT: add.w r4, r0, r12
	; CHECK-NEXT: cset lr, hi			; CHECK-NEXT: cset r7, hi
	; CHECK-NEXT: cmp r5, r3
	; CHECK-NEXT: cset r5, hi
	; CHECK-NEXT: cmp r4, r0
	; CHECK-NEXT: cset r4, hi
	; CHECK-NEXT: cmp r6, r3			; CHECK-NEXT: cmp r6, r3
	; CHECK-NEXT: cset r6, hi			; CHECK-NEXT: cset r6, hi
	; CHECK-NEXT: ands r4, r6			; CHECK-NEXT: cmp r5, r0
	; CHECK-NEXT: lsls r4, r4, #31			; CHECK-NEXT: cset r5, hi
				; CHECK-NEXT: cmp r4, r3
				; CHECK-NEXT: cset r4, hi
				; CHECK-NEXT: ands r5, r4
				; CHECK-NEXT: lsls r5, r5, #31
	; CHECK-NEXT: itt eq			; CHECK-NEXT: itt eq
	; CHECK-NEXT: andeq.w r6, r5, lr			; CHECK-NEXT: andeq r7, r6
	; CHECK-NEXT: lslseq.w r6, r6, #31			; CHECK-NEXT: lslseq.w r7, r7, #31
	; CHECK-NEXT: beq .LBB5_4			; CHECK-NEXT: beq .LBB5_4
	; CHECK-NEXT: @ %bb.2: @ %for.body.preheader			; CHECK-NEXT: @ %bb.2: @ %for.body.preheader
	; CHECK-NEXT: sub.w r6, r12, #1			; CHECK-NEXT: sub.w r4, r12, #1
	; CHECK-NEXT: and r9, r12, #3			; CHECK-NEXT: and lr, r12, #3
	; CHECK-NEXT: cmp r6, #3			; CHECK-NEXT: cmp r4, #3
	; CHECK-NEXT: bhs .LBB5_6			; CHECK-NEXT: bhs .LBB5_6
	; CHECK-NEXT: @ %bb.3:			; CHECK-NEXT: @ %bb.3:
	; CHECK-NEXT: mov.w r12, #0			; CHECK-NEXT: mov.w r12, #0
	; CHECK-NEXT: b .LBB5_8			; CHECK-NEXT: b .LBB5_8
	; CHECK-NEXT: .LBB5_4: @ %vector.ph			; CHECK-NEXT: .LBB5_4: @ %vector.ph
	; CHECK-NEXT: movs r6, #0			; CHECK-NEXT: movs r7, #0
	; CHECK-NEXT: dlstp.32 lr, r12			; CHECK-NEXT: dlstp.32 lr, r12
	; CHECK-NEXT: .LBB5_5: @ %vector.body			; CHECK-NEXT: .LBB5_5: @ %vector.body
	; CHECK-NEXT: @ =>This Inner Loop Header: Depth=1			; CHECK-NEXT: @ =>This Inner Loop Header: Depth=1
	; CHECK-NEXT: adds r6, #4			; CHECK-NEXT: adds r7, #4
	; CHECK-NEXT: vldrb.u32 q0, [r0], #4			; CHECK-NEXT: vldrb.u32 q0, [r0], #4
	; CHECK-NEXT: vldrb.u32 q1, [r1], #4			; CHECK-NEXT: vldrb.u32 q1, [r1], #4
	; CHECK-NEXT: vmlas.u32 q1, q0, r2			; CHECK-NEXT: vmlas.u32 q1, q0, r2
	; CHECK-NEXT: vstrw.32 q1, [r3], #16			; CHECK-NEXT: vstrw.32 q1, [r3], #16
	; CHECK-NEXT: letp lr, .LBB5_5			; CHECK-NEXT: letp lr, .LBB5_5
	; CHECK-NEXT: b .LBB5_11			; CHECK-NEXT: b .LBB5_11
	; CHECK-NEXT: .LBB5_6: @ %for.body.preheader.new			; CHECK-NEXT: .LBB5_6: @ %for.body.preheader.new
	; CHECK-NEXT: bic r6, r12, #3			; CHECK-NEXT: sub.w r8, r12, lr
	; CHECK-NEXT: movs r5, #1			; CHECK-NEXT: add.w r5, r3, #8
	; CHECK-NEXT: subs r6, #4			; CHECK-NEXT: adds r6, r0, #3
	; CHECK-NEXT: add.w r4, r3, #8			; CHECK-NEXT: adds r7, r1, #1
	; CHECK-NEXT: mov.w r12, #0			; CHECK-NEXT: mov.w r12, #0
	; CHECK-NEXT: add.w lr, r5, r6, lsr #2
	; CHECK-NEXT: adds r5, r0, #3
	; CHECK-NEXT: dls lr, lr
	; CHECK-NEXT: adds r6, r1, #1
	; CHECK-NEXT: .LBB5_7: @ %for.body			; CHECK-NEXT: .LBB5_7: @ %for.body
	; CHECK-NEXT: @ =>This Inner Loop Header: Depth=1			; CHECK-NEXT: @ =>This Inner Loop Header: Depth=1
	; CHECK-NEXT: ldrb r8, [r5, #-3]			; CHECK-NEXT: ldrb r9, [r6, #-3]
	; CHECK-NEXT: add.w r12, r12, #4			; CHECK-NEXT: add.w r12, r12, #4
	; CHECK-NEXT: ldrb r7, [r6, #-1]			; CHECK-NEXT: ldrb r4, [r7, #-1]
	; CHECK-NEXT: smlabb r7, r7, r8, r2			; CHECK-NEXT: cmp r8, r12
	; CHECK-NEXT: str r7, [r4, #-8]			; CHECK-NEXT: smlabb r4, r4, r9, r2
	; CHECK-NEXT: ldrb r8, [r5, #-2]			; CHECK-NEXT: str r4, [r5, #-8]
	; CHECK-NEXT: ldrb r7, [r6], #4			; CHECK-NEXT: ldrb r9, [r6, #-2]
	; CHECK-NEXT: smlabb r7, r7, r8, r2			; CHECK-NEXT: ldrb r4, [r7], #4
	; CHECK-NEXT: str r7, [r4, #-4]			; CHECK-NEXT: smlabb r4, r4, r9, r2
	; CHECK-NEXT: ldrb r8, [r5, #-1]			; CHECK-NEXT: str r4, [r5, #-4]
	; CHECK-NEXT: ldrb r7, [r6, #-3]			; CHECK-NEXT: ldrb r9, [r6, #-1]
	; CHECK-NEXT: smlabb r7, r7, r8, r2			; CHECK-NEXT: ldrb r4, [r7, #-3]
	; CHECK-NEXT: str r7, [r4]			; CHECK-NEXT: smlabb r4, r4, r9, r2
	; CHECK-NEXT: ldrb r8, [r5], #4			; CHECK-NEXT: str r4, [r5]
	; CHECK-NEXT: ldrb r7, [r6, #-2]			; CHECK-NEXT: ldrb r9, [r6], #4
	; CHECK-NEXT: smlabb r7, r7, r8, r2			; CHECK-NEXT: ldrb r4, [r7, #-2]
	; CHECK-NEXT: str r7, [r4, #4]			; CHECK-NEXT: smlabb r4, r4, r9, r2
	; CHECK-NEXT: adds r4, #16			; CHECK-NEXT: str r4, [r5, #4]
	; CHECK-NEXT: le lr, .LBB5_7			; CHECK-NEXT: add.w r5, r5, #16
				; CHECK-NEXT: bne .LBB5_7
	; CHECK-NEXT: .LBB5_8: @ %for.cond.cleanup.loopexit.unr-lcssa			; CHECK-NEXT: .LBB5_8: @ %for.cond.cleanup.loopexit.unr-lcssa
	; CHECK-NEXT: wls lr, r9, .LBB5_11			; CHECK-NEXT: wls lr, lr, .LBB5_11
	; CHECK-NEXT: @ %bb.9: @ %for.body.epil.preheader			; CHECK-NEXT: @ %bb.9: @ %for.body.epil.preheader
	; CHECK-NEXT: add r0, r12			; CHECK-NEXT: add r0, r12
	; CHECK-NEXT: add r1, r12			; CHECK-NEXT: add r1, r12
	; CHECK-NEXT: add.w r3, r3, r12, lsl #2			; CHECK-NEXT: add.w r3, r3, r12, lsl #2
	; CHECK-NEXT: mov lr, r9
	; CHECK-NEXT: .LBB5_10: @ %for.body.epil			; CHECK-NEXT: .LBB5_10: @ %for.body.epil
	; CHECK-NEXT: @ =>This Inner Loop Header: Depth=1			; CHECK-NEXT: @ =>This Inner Loop Header: Depth=1
	; CHECK-NEXT: ldrb r6, [r0], #1			; CHECK-NEXT: ldrb r7, [r0], #1
	; CHECK-NEXT: ldrb r5, [r1], #1			; CHECK-NEXT: ldrb r6, [r1], #1
	; CHECK-NEXT: smlabb r6, r5, r6, r2			; CHECK-NEXT: smlabb r7, r6, r7, r2
	; CHECK-NEXT: str r6, [r3], #4			; CHECK-NEXT: str r7, [r3], #4
	; CHECK-NEXT: le lr, .LBB5_10			; CHECK-NEXT: le lr, .LBB5_10
	; CHECK-NEXT: .LBB5_11: @ %for.cond.cleanup			; CHECK-NEXT: .LBB5_11: @ %for.cond.cleanup
	; CHECK-NEXT: pop.w {r4, r5, r6, r7, r8, r9, pc}			; CHECK-NEXT: pop.w {r4, r5, r6, r7, r8, r9, pc}
	entry:			entry:
	%res12 = bitcast i32* %res to i8*			%res12 = bitcast i32* %res to i8*
	%cmp10 = icmp eq i32 %N, 0			%cmp10 = icmp eq i32 %N, 0
	br i1 %cmp10, label %for.cond.cleanup, label %for.body.lr.ph			br i1 %cmp10, label %for.cond.cleanup, label %for.body.lr.ph

	define arm_aapcs_vfpcc void @test_vec_mul_scalar_add_uchar(i8* nocapture readonly %a, i8* nocapture readonly %b, i8 zeroext %c, i32* nocapture %res, i32 %N) {			define arm_aapcs_vfpcc void @test_vec_mul_scalar_add_uchar(i8* nocapture readonly %a, i8* nocapture readonly %b, i8 zeroext %c, i32* nocapture %res, i32 %N) {
	; CHECK-LABEL: test_vec_mul_scalar_add_uchar:			; CHECK-LABEL: test_vec_mul_scalar_add_uchar:
	; CHECK: @ %bb.0: @ %entry			; CHECK: @ %bb.0: @ %entry
	; CHECK-NEXT: push.w {r4, r5, r6, r7, r8, r9, lr}			; CHECK-NEXT: push.w {r4, r5, r6, r7, r8, r9, lr}
	; CHECK-NEXT: ldr.w r12, [sp, #28]			; CHECK-NEXT: ldr.w r12, [sp, #28]
	; CHECK-NEXT: cmp.w r12, #0			; CHECK-NEXT: cmp.w r12, #0
	; CHECK-NEXT: beq.w .LBB7_11			; CHECK-NEXT: beq.w .LBB7_11
	; CHECK-NEXT: @ %bb.1: @ %for.body.lr.ph			; CHECK-NEXT: @ %bb.1: @ %for.body.lr.ph
	; CHECK-NEXT: add.w r4, r3, r12, lsl #2			; CHECK-NEXT: add.w r5, r3, r12, lsl #2
	; CHECK-NEXT: add.w r5, r1, r12			; CHECK-NEXT: add.w r6, r1, r12
	; CHECK-NEXT: cmp r4, r1			; CHECK-NEXT: cmp r5, r1
	; CHECK-NEXT: add.w r6, r0, r12			; CHECK-NEXT: add.w r4, r0, r12
	; CHECK-NEXT: cset lr, hi			; CHECK-NEXT: cset r7, hi
	; CHECK-NEXT: cmp r5, r3
	; CHECK-NEXT: cset r5, hi
	; CHECK-NEXT: cmp r4, r0
	; CHECK-NEXT: cset r4, hi
	; CHECK-NEXT: cmp r6, r3			; CHECK-NEXT: cmp r6, r3
	; CHECK-NEXT: cset r6, hi			; CHECK-NEXT: cset r6, hi
	; CHECK-NEXT: ands r4, r6			; CHECK-NEXT: cmp r5, r0
	; CHECK-NEXT: lsls r4, r4, #31			; CHECK-NEXT: cset r5, hi
				; CHECK-NEXT: cmp r4, r3
				; CHECK-NEXT: cset r4, hi
				; CHECK-NEXT: ands r5, r4
				; CHECK-NEXT: lsls r5, r5, #31
	; CHECK-NEXT: itt eq			; CHECK-NEXT: itt eq
	; CHECK-NEXT: andeq.w r6, r5, lr			; CHECK-NEXT: andeq r7, r6
	; CHECK-NEXT: lslseq.w r6, r6, #31			; CHECK-NEXT: lslseq.w r7, r7, #31
	; CHECK-NEXT: beq .LBB7_4			; CHECK-NEXT: beq .LBB7_4
	; CHECK-NEXT: @ %bb.2: @ %for.body.preheader			; CHECK-NEXT: @ %bb.2: @ %for.body.preheader
	; CHECK-NEXT: sub.w r6, r12, #1			; CHECK-NEXT: sub.w r4, r12, #1
	; CHECK-NEXT: and r9, r12, #3			; CHECK-NEXT: and lr, r12, #3
	; CHECK-NEXT: cmp r6, #3			; CHECK-NEXT: cmp r4, #3
	; CHECK-NEXT: bhs .LBB7_6			; CHECK-NEXT: bhs .LBB7_6
	; CHECK-NEXT: @ %bb.3:			; CHECK-NEXT: @ %bb.3:
	; CHECK-NEXT: mov.w r12, #0			; CHECK-NEXT: mov.w r12, #0
	; CHECK-NEXT: b .LBB7_8			; CHECK-NEXT: b .LBB7_8
	; CHECK-NEXT: .LBB7_4: @ %vector.ph			; CHECK-NEXT: .LBB7_4: @ %vector.ph
	; CHECK-NEXT: movs r6, #0			; CHECK-NEXT: movs r7, #0
	; CHECK-NEXT: dlstp.32 lr, r12			; CHECK-NEXT: dlstp.32 lr, r12
	; CHECK-NEXT: .LBB7_5: @ %vector.body			; CHECK-NEXT: .LBB7_5: @ %vector.body
	; CHECK-NEXT: @ =>This Inner Loop Header: Depth=1			; CHECK-NEXT: @ =>This Inner Loop Header: Depth=1
	; CHECK-NEXT: adds r6, #4			; CHECK-NEXT: adds r7, #4
	; CHECK-NEXT: vldrb.u32 q0, [r0], #4			; CHECK-NEXT: vldrb.u32 q0, [r0], #4
	; CHECK-NEXT: vldrb.u32 q1, [r1], #4			; CHECK-NEXT: vldrb.u32 q1, [r1], #4
	; CHECK-NEXT: vmlas.u32 q1, q0, r2			; CHECK-NEXT: vmlas.u32 q1, q0, r2
	; CHECK-NEXT: vstrw.32 q1, [r3], #16			; CHECK-NEXT: vstrw.32 q1, [r3], #16
	; CHECK-NEXT: letp lr, .LBB7_5			; CHECK-NEXT: letp lr, .LBB7_5
	; CHECK-NEXT: b .LBB7_11			; CHECK-NEXT: b .LBB7_11
	; CHECK-NEXT: .LBB7_6: @ %for.body.preheader.new			; CHECK-NEXT: .LBB7_6: @ %for.body.preheader.new
	; CHECK-NEXT: bic r6, r12, #3			; CHECK-NEXT: sub.w r8, r12, lr
	; CHECK-NEXT: movs r5, #1			; CHECK-NEXT: add.w r5, r3, #8
	; CHECK-NEXT: subs r6, #4			; CHECK-NEXT: adds r6, r0, #3
	; CHECK-NEXT: add.w r4, r3, #8			; CHECK-NEXT: adds r7, r1, #1
	; CHECK-NEXT: mov.w r12, #0			; CHECK-NEXT: mov.w r12, #0
	; CHECK-NEXT: add.w lr, r5, r6, lsr #2
	; CHECK-NEXT: adds r5, r0, #3
	; CHECK-NEXT: dls lr, lr
	; CHECK-NEXT: adds r6, r1, #1
	; CHECK-NEXT: .LBB7_7: @ %for.body			; CHECK-NEXT: .LBB7_7: @ %for.body
	; CHECK-NEXT: @ =>This Inner Loop Header: Depth=1			; CHECK-NEXT: @ =>This Inner Loop Header: Depth=1
	; CHECK-NEXT: ldrb r8, [r5, #-3]			; CHECK-NEXT: ldrb r9, [r6, #-3]
	; CHECK-NEXT: add.w r12, r12, #4			; CHECK-NEXT: add.w r12, r12, #4
	; CHECK-NEXT: ldrb r7, [r6, #-1]			; CHECK-NEXT: ldrb r4, [r7, #-1]
	; CHECK-NEXT: smlabb r7, r7, r8, r2			; CHECK-NEXT: cmp r8, r12
	; CHECK-NEXT: str r7, [r4, #-8]			; CHECK-NEXT: smlabb r4, r4, r9, r2
	; CHECK-NEXT: ldrb r8, [r5, #-2]			; CHECK-NEXT: str r4, [r5, #-8]
	; CHECK-NEXT: ldrb r7, [r6], #4			; CHECK-NEXT: ldrb r9, [r6, #-2]
	; CHECK-NEXT: smlabb r7, r7, r8, r2			; CHECK-NEXT: ldrb r4, [r7], #4
	; CHECK-NEXT: str r7, [r4, #-4]			; CHECK-NEXT: smlabb r4, r4, r9, r2
	; CHECK-NEXT: ldrb r8, [r5, #-1]			; CHECK-NEXT: str r4, [r5, #-4]
	; CHECK-NEXT: ldrb r7, [r6, #-3]			; CHECK-NEXT: ldrb r9, [r6, #-1]
	; CHECK-NEXT: smlabb r7, r7, r8, r2			; CHECK-NEXT: ldrb r4, [r7, #-3]
	; CHECK-NEXT: str r7, [r4]			; CHECK-NEXT: smlabb r4, r4, r9, r2
	; CHECK-NEXT: ldrb r8, [r5], #4			; CHECK-NEXT: str r4, [r5]
	; CHECK-NEXT: ldrb r7, [r6, #-2]			; CHECK-NEXT: ldrb r9, [r6], #4
	; CHECK-NEXT: smlabb r7, r7, r8, r2			; CHECK-NEXT: ldrb r4, [r7, #-2]
	; CHECK-NEXT: str r7, [r4, #4]			; CHECK-NEXT: smlabb r4, r4, r9, r2
	; CHECK-NEXT: adds r4, #16			; CHECK-NEXT: str r4, [r5, #4]
	; CHECK-NEXT: le lr, .LBB7_7			; CHECK-NEXT: add.w r5, r5, #16
				; CHECK-NEXT: bne .LBB7_7
	; CHECK-NEXT: .LBB7_8: @ %for.cond.cleanup.loopexit.unr-lcssa			; CHECK-NEXT: .LBB7_8: @ %for.cond.cleanup.loopexit.unr-lcssa
	; CHECK-NEXT: wls lr, r9, .LBB7_11			; CHECK-NEXT: wls lr, lr, .LBB7_11
	; CHECK-NEXT: @ %bb.9: @ %for.body.epil.preheader			; CHECK-NEXT: @ %bb.9: @ %for.body.epil.preheader
	; CHECK-NEXT: add r0, r12			; CHECK-NEXT: add r0, r12
	; CHECK-NEXT: add r1, r12			; CHECK-NEXT: add r1, r12
	; CHECK-NEXT: add.w r3, r3, r12, lsl #2			; CHECK-NEXT: add.w r3, r3, r12, lsl #2
	; CHECK-NEXT: mov lr, r9
	; CHECK-NEXT: .LBB7_10: @ %for.body.epil			; CHECK-NEXT: .LBB7_10: @ %for.body.epil
	; CHECK-NEXT: @ =>This Inner Loop Header: Depth=1			; CHECK-NEXT: @ =>This Inner Loop Header: Depth=1
	; CHECK-NEXT: ldrb r6, [r0], #1			; CHECK-NEXT: ldrb r7, [r0], #1
	; CHECK-NEXT: ldrb r5, [r1], #1			; CHECK-NEXT: ldrb r6, [r1], #1
	; CHECK-NEXT: smlabb r6, r5, r6, r2			; CHECK-NEXT: smlabb r7, r6, r7, r2
	; CHECK-NEXT: str r6, [r3], #4			; CHECK-NEXT: str r7, [r3], #4
	; CHECK-NEXT: le lr, .LBB7_10			; CHECK-NEXT: le lr, .LBB7_10
	; CHECK-NEXT: .LBB7_11: @ %for.cond.cleanup			; CHECK-NEXT: .LBB7_11: @ %for.cond.cleanup
	; CHECK-NEXT: pop.w {r4, r5, r6, r7, r8, r9, pc}			; CHECK-NEXT: pop.w {r4, r5, r6, r7, r8, r9, pc}
	entry:			entry:
	%res12 = bitcast i32* %res to i8*			%res12 = bitcast i32* %res to i8*
	%cmp10 = icmp eq i32 %N, 0			%cmp10 = icmp eq i32 %N, 0
	br i1 %cmp10, label %for.cond.cleanup, label %for.body.lr.ph			br i1 %cmp10, label %for.cond.cleanup, label %for.body.lr.ph

	define arm_aapcs_vfpcc void @test_vec_mul_scalar_add_int(i32* nocapture readonly %a, i32* nocapture readonly %b, i32 %c, i32* nocapture %res, i32 %N) {			define arm_aapcs_vfpcc void @test_vec_mul_scalar_add_int(i32* nocapture readonly %a, i32* nocapture readonly %b, i32 %c, i32* nocapture %res, i32 %N) {
	; CHECK-LABEL: test_vec_mul_scalar_add_int:			; CHECK-LABEL: test_vec_mul_scalar_add_int:
	; CHECK: @ %bb.0: @ %entry			; CHECK: @ %bb.0: @ %entry
	; CHECK-NEXT: push.w {r4, r5, r6, r7, r8, r9, lr}			; CHECK-NEXT: push.w {r4, r5, r6, r7, r8, r9, lr}
	; CHECK-NEXT: ldr.w r12, [sp, #28]			; CHECK-NEXT: ldr.w r12, [sp, #28]
	; CHECK-NEXT: cmp.w r12, #0			; CHECK-NEXT: cmp.w r12, #0
	; CHECK-NEXT: beq.w .LBB9_11			; CHECK-NEXT: beq.w .LBB9_11
	; CHECK-NEXT: @ %bb.1: @ %vector.memcheck			; CHECK-NEXT: @ %bb.1: @ %vector.memcheck
	; CHECK-NEXT: add.w r4, r3, r12, lsl #2			; CHECK-NEXT: add.w r5, r3, r12, lsl #2
	; CHECK-NEXT: add.w r5, r1, r12, lsl #2			; CHECK-NEXT: add.w r6, r1, r12, lsl #2
	; CHECK-NEXT: cmp r4, r1			; CHECK-NEXT: cmp r5, r1
	; CHECK-NEXT: add.w r6, r0, r12, lsl #2			; CHECK-NEXT: add.w r4, r0, r12, lsl #2
	; CHECK-NEXT: cset lr, hi			; CHECK-NEXT: cset r7, hi
	; CHECK-NEXT: cmp r5, r3
	; CHECK-NEXT: cset r5, hi
	; CHECK-NEXT: cmp r4, r0
	; CHECK-NEXT: cset r4, hi
	; CHECK-NEXT: cmp r6, r3			; CHECK-NEXT: cmp r6, r3
	; CHECK-NEXT: cset r6, hi			; CHECK-NEXT: cset r6, hi
	; CHECK-NEXT: ands r4, r6			; CHECK-NEXT: cmp r5, r0
	; CHECK-NEXT: lsls r4, r4, #31			; CHECK-NEXT: cset r5, hi
				; CHECK-NEXT: cmp r4, r3
				; CHECK-NEXT: cset r4, hi
				; CHECK-NEXT: ands r5, r4
				; CHECK-NEXT: lsls r5, r5, #31
	; CHECK-NEXT: itt eq			; CHECK-NEXT: itt eq
	; CHECK-NEXT: andeq.w r6, r5, lr			; CHECK-NEXT: andeq r7, r6
	; CHECK-NEXT: lslseq.w r6, r6, #31			; CHECK-NEXT: lslseq.w r7, r7, #31
	; CHECK-NEXT: beq .LBB9_4			; CHECK-NEXT: beq .LBB9_4
	; CHECK-NEXT: @ %bb.2: @ %for.body.preheader			; CHECK-NEXT: @ %bb.2: @ %for.body.preheader
	; CHECK-NEXT: sub.w r6, r12, #1			; CHECK-NEXT: sub.w r4, r12, #1
	; CHECK-NEXT: and r9, r12, #3			; CHECK-NEXT: and lr, r12, #3
	; CHECK-NEXT: cmp r6, #3			; CHECK-NEXT: cmp r4, #3
	; CHECK-NEXT: bhs .LBB9_6			; CHECK-NEXT: bhs .LBB9_6
	; CHECK-NEXT: @ %bb.3:			; CHECK-NEXT: @ %bb.3:
	; CHECK-NEXT: mov.w r12, #0			; CHECK-NEXT: mov.w r12, #0
	; CHECK-NEXT: b .LBB9_8			; CHECK-NEXT: b .LBB9_8
	; CHECK-NEXT: .LBB9_4: @ %vector.ph			; CHECK-NEXT: .LBB9_4: @ %vector.ph
	; CHECK-NEXT: movs r6, #0			; CHECK-NEXT: movs r7, #0
	; CHECK-NEXT: dlstp.32 lr, r12			; CHECK-NEXT: dlstp.32 lr, r12
	; CHECK-NEXT: .LBB9_5: @ %vector.body			; CHECK-NEXT: .LBB9_5: @ %vector.body
	; CHECK-NEXT: @ =>This Inner Loop Header: Depth=1			; CHECK-NEXT: @ =>This Inner Loop Header: Depth=1
	; CHECK-NEXT: adds r6, #4			; CHECK-NEXT: adds r7, #4
	; CHECK-NEXT: vldrw.u32 q0, [r0], #16			; CHECK-NEXT: vldrw.u32 q0, [r0], #16
	; CHECK-NEXT: vldrw.u32 q1, [r1], #16			; CHECK-NEXT: vldrw.u32 q1, [r1], #16
	; CHECK-NEXT: vmlas.u32 q1, q0, r2			; CHECK-NEXT: vmlas.u32 q1, q0, r2
	; CHECK-NEXT: vstrw.32 q1, [r3], #16			; CHECK-NEXT: vstrw.32 q1, [r3], #16
	; CHECK-NEXT: letp lr, .LBB9_5			; CHECK-NEXT: letp lr, .LBB9_5
	; CHECK-NEXT: b .LBB9_11			; CHECK-NEXT: b .LBB9_11
	; CHECK-NEXT: .LBB9_6: @ %for.body.preheader.new			; CHECK-NEXT: .LBB9_6: @ %for.body.preheader.new
	; CHECK-NEXT: bic r6, r12, #3			; CHECK-NEXT: sub.w r8, r12, lr
	; CHECK-NEXT: movs r5, #1			; CHECK-NEXT: add.w r5, r3, #8
	; CHECK-NEXT: subs r6, #4			; CHECK-NEXT: add.w r6, r0, #8
	; CHECK-NEXT: add.w r4, r3, #8			; CHECK-NEXT: add.w r7, r1, #8
	; CHECK-NEXT: mov.w r12, #0			; CHECK-NEXT: mov.w r12, #0
	; CHECK-NEXT: add.w lr, r5, r6, lsr #2
	; CHECK-NEXT: add.w r5, r0, #8
	; CHECK-NEXT: dls lr, lr
	; CHECK-NEXT: add.w r6, r1, #8
	; CHECK-NEXT: .LBB9_7: @ %for.body			; CHECK-NEXT: .LBB9_7: @ %for.body
	; CHECK-NEXT: @ =>This Inner Loop Header: Depth=1			; CHECK-NEXT: @ =>This Inner Loop Header: Depth=1
	; CHECK-NEXT: ldr r8, [r5, #-8]			; CHECK-NEXT: ldr r9, [r6, #-8]
	; CHECK-NEXT: add.w r12, r12, #4			; CHECK-NEXT: add.w r12, r12, #4
	; CHECK-NEXT: ldr r7, [r6, #-8]			; CHECK-NEXT: ldr r4, [r7, #-8]
	; CHECK-NEXT: mla r7, r7, r8, r2			; CHECK-NEXT: cmp r8, r12
	; CHECK-NEXT: str r7, [r4, #-8]			; CHECK-NEXT: mla r4, r4, r9, r2
	; CHECK-NEXT: ldr r8, [r5, #-4]			; CHECK-NEXT: str r4, [r5, #-8]
	; CHECK-NEXT: ldr r7, [r6, #-4]			; CHECK-NEXT: ldr r9, [r6, #-4]
	; CHECK-NEXT: mla r7, r7, r8, r2			; CHECK-NEXT: ldr r4, [r7, #-4]
	; CHECK-NEXT: str r7, [r4, #-4]			; CHECK-NEXT: mla r4, r4, r9, r2
	; CHECK-NEXT: ldr.w r8, [r5]			; CHECK-NEXT: str r4, [r5, #-4]
	; CHECK-NEXT: ldr r7, [r6]			; CHECK-NEXT: ldr.w r9, [r6]
	; CHECK-NEXT: mla r7, r7, r8, r2			; CHECK-NEXT: ldr r4, [r7]
	; CHECK-NEXT: str r7, [r4]			; CHECK-NEXT: mla r4, r4, r9, r2
	; CHECK-NEXT: ldr.w r8, [r5, #4]			; CHECK-NEXT: str r4, [r5]
	; CHECK-NEXT: adds r5, #16			; CHECK-NEXT: ldr.w r9, [r6, #4]
	; CHECK-NEXT: ldr r7, [r6, #4]			; CHECK-NEXT: add.w r6, r6, #16
	; CHECK-NEXT: adds r6, #16			; CHECK-NEXT: ldr r4, [r7, #4]
	; CHECK-NEXT: mla r7, r7, r8, r2			; CHECK-NEXT: add.w r7, r7, #16
	; CHECK-NEXT: str r7, [r4, #4]			; CHECK-NEXT: mla r4, r4, r9, r2
	; CHECK-NEXT: adds r4, #16			; CHECK-NEXT: str r4, [r5, #4]
	; CHECK-NEXT: le lr, .LBB9_7			; CHECK-NEXT: add.w r5, r5, #16
				; CHECK-NEXT: bne .LBB9_7
	; CHECK-NEXT: .LBB9_8: @ %for.cond.cleanup.loopexit.unr-lcssa			; CHECK-NEXT: .LBB9_8: @ %for.cond.cleanup.loopexit.unr-lcssa
	; CHECK-NEXT: wls lr, r9, .LBB9_11			; CHECK-NEXT: wls lr, lr, .LBB9_11
	; CHECK-NEXT: @ %bb.9: @ %for.body.epil.preheader			; CHECK-NEXT: @ %bb.9: @ %for.body.epil.preheader
	; CHECK-NEXT: add.w r0, r0, r12, lsl #2			; CHECK-NEXT: add.w r0, r0, r12, lsl #2
	; CHECK-NEXT: add.w r1, r1, r12, lsl #2			; CHECK-NEXT: add.w r1, r1, r12, lsl #2
	; CHECK-NEXT: add.w r3, r3, r12, lsl #2			; CHECK-NEXT: add.w r3, r3, r12, lsl #2
	; CHECK-NEXT: mov lr, r9
	; CHECK-NEXT: .LBB9_10: @ %for.body.epil			; CHECK-NEXT: .LBB9_10: @ %for.body.epil
	; CHECK-NEXT: @ =>This Inner Loop Header: Depth=1			; CHECK-NEXT: @ =>This Inner Loop Header: Depth=1
	; CHECK-NEXT: ldr r6, [r0], #4			; CHECK-NEXT: ldr r7, [r0], #4
	; CHECK-NEXT: ldr r5, [r1], #4			; CHECK-NEXT: ldr r6, [r1], #4
	; CHECK-NEXT: mla r6, r5, r6, r2			; CHECK-NEXT: mla r7, r6, r7, r2
	; CHECK-NEXT: str r6, [r3], #4			; CHECK-NEXT: str r7, [r3], #4
	; CHECK-NEXT: le lr, .LBB9_10			; CHECK-NEXT: le lr, .LBB9_10
	; CHECK-NEXT: .LBB9_11: @ %for.cond.cleanup			; CHECK-NEXT: .LBB9_11: @ %for.cond.cleanup
	; CHECK-NEXT: pop.w {r4, r5, r6, r7, r8, r9, pc}			; CHECK-NEXT: pop.w {r4, r5, r6, r7, r8, r9, pc}
	entry:			entry:
	%cmp8 = icmp eq i32 %N, 0			%cmp8 = icmp eq i32 %N, 0
	br i1 %cmp8, label %for.cond.cleanup, label %vector.memcheck			br i1 %cmp8, label %for.cond.cleanup, label %vector.memcheck

	vector.memcheck: ; preds = %entry			vector.memcheck: ; preds = %entry

test/Transforms/LoopStrengthReduce/X86/pr46943.ll

	; NOTE: Assertions have been autogenerated by utils/update_test_checks.py			; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
	; RUN: opt -S -loop-reduce < %s | FileCheck %s			; RUN: opt -S -loop-reduce < %s | FileCheck %s

	target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"			target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
	target triple = "x86_64-unknown-linux-gnu"			target triple = "x86_64-unknown-linux-gnu"

	declare void @use(i8 zeroext)			declare void @use(i8 zeroext)
	declare void @use_p(i8*)			declare void @use_p(i8*)

				; nuw needs to be dropped when switching to post-inc comparison.
	define i8 @drop_nuw() {			define i8 @drop_nuw() {
	; CHECK-LABEL: @drop_nuw(			; CHECK-LABEL: @drop_nuw(
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
	; CHECK-NEXT: br label [[LOOP:%.*]]			; CHECK-NEXT: br label [[LOOP:%.*]]
	; CHECK: loop:			; CHECK: loop:
	; CHECK-NEXT: [[IV:%.*]] = phi i8 [ 0, [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]			; CHECK-NEXT: [[IV:%.*]] = phi i8 [ 0, [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
	; CHECK-NEXT: call void @use(i8 [[IV]])			; CHECK-NEXT: call void @use(i8 [[IV]])
	; CHECK-NEXT: [[IV_NEXT]] = add nuw i8 [[IV]], 1			; CHECK-NEXT: [[IV_NEXT]] = add i8 [[IV]], 1
	; CHECK-NEXT: [[CMP:%.*]] = icmp eq i8 [[IV_NEXT]], 0			; CHECK-NEXT: [[CMP:%.*]] = icmp eq i8 [[IV_NEXT]], 0
	; CHECK-NEXT: br i1 [[CMP]], label [[EXIT:%.*]], label [[LOOP]]			; CHECK-NEXT: br i1 [[CMP]], label [[EXIT:%.*]], label [[LOOP]]
	; CHECK: exit:			; CHECK: exit:
	; CHECK-NEXT: [[TMP0:%.*]] = add i8 [[IV_NEXT]], -1			; CHECK-NEXT: [[TMP0:%.*]] = add i8 [[IV_NEXT]], -1
	; CHECK-NEXT: ret i8 [[TMP0]]			; CHECK-NEXT: ret i8 [[TMP0]]
	;			;
	entry:			entry:
	br label %loop			br label %loop

	loop:			loop:
	%iv = phi i8 [ 0, %entry ], [ %iv.next, %loop ]			%iv = phi i8 [ 0, %entry ], [ %iv.next, %loop ]
	call void @use(i8 %iv)			call void @use(i8 %iv)

	%iv.next = add nuw i8 %iv, 1			%iv.next = add nuw i8 %iv, 1
	%cmp = icmp eq i8 %iv, -1			%cmp = icmp eq i8 %iv, -1
	br i1 %cmp, label %exit, label %loop			br i1 %cmp, label %exit, label %loop

	exit:			exit:
	ret i8 %iv			ret i8 %iv
	}			}

				; nsw needs to be dropped when switching to post-inc comparison.
	define i8 @drop_nsw() {			define i8 @drop_nsw() {
	; CHECK-LABEL: @drop_nsw(			; CHECK-LABEL: @drop_nsw(
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
	; CHECK-NEXT: br label [[LOOP:%.*]]			; CHECK-NEXT: br label [[LOOP:%.*]]
	; CHECK: loop:			; CHECK: loop:
	; CHECK-NEXT: [[IV:%.*]] = phi i8 [ 127, [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]			; CHECK-NEXT: [[IV:%.*]] = phi i8 [ 127, [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
	; CHECK-NEXT: call void @use(i8 [[IV]])			; CHECK-NEXT: call void @use(i8 [[IV]])
	; CHECK-NEXT: [[IV_NEXT]] = add nsw i8 [[IV]], -1			; CHECK-NEXT: [[IV_NEXT]] = add i8 [[IV]], -1
				fhahn: Is this transform actually still helpful, if we have to drop flags to do it?
				nikic (author): As LSR runs in the backend, and the backend makes rather little use of nowrap flags, I would assume so. When we did the same change in LFTR (which runs in the middle of the pipeline where nowrap flags are more important), I don't think any performance regressions were reported.
	; CHECK-NEXT: [[CMP:%.*]] = icmp eq i8 [[IV_NEXT]], 127			; CHECK-NEXT: [[CMP:%.*]] = icmp eq i8 [[IV_NEXT]], 127
	; CHECK-NEXT: br i1 [[CMP]], label [[EXIT:%.*]], label [[LOOP]]			; CHECK-NEXT: br i1 [[CMP]], label [[EXIT:%.*]], label [[LOOP]]
	; CHECK: exit:			; CHECK: exit:
	; CHECK-NEXT: [[TMP0:%.*]] = add i8 [[IV_NEXT]], 1			; CHECK-NEXT: [[TMP0:%.*]] = add i8 [[IV_NEXT]], 1
	; CHECK-NEXT: ret i8 [[TMP0]]			; CHECK-NEXT: ret i8 [[TMP0]]
	;			;
	entry:			entry:
	br label %loop			br label %loop

	loop:			loop:
	%iv = phi i8 [ 127, %entry ], [ %iv.next, %loop ]			%iv = phi i8 [ 127, %entry ], [ %iv.next, %loop ]
	call void @use(i8 %iv)			call void @use(i8 %iv)

	%iv.next = add nsw i8 %iv, -1			%iv.next = add nsw i8 %iv, -1
	%cmp = icmp eq i8 %iv, -128			%cmp = icmp eq i8 %iv, -128
	br i1 %cmp, label %exit, label %loop			br i1 %cmp, label %exit, label %loop

	exit:			exit:
	ret i8 %iv			ret i8 %iv
	}			}
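
To make the two test comments above concrete, here is a small Python sketch (not part of the patch) that simulates the i8 arithmetic of `drop_nuw` and `drop_nsw`. The helper names `to_signed8` and `run_pre_inc_loop` are hypothetical; only the start value, step, and exit value are taken from the IR above. Under these assumptions it suggests that the increment wraps exactly on the iteration where the pre-inc comparison exits, which is why the flag only becomes a problem once the branch depends on the incremented value.

```python
def to_signed8(x):
    """Interpret the low 8 bits of x as a two's-complement i8."""
    x &= 0xFF
    return x - 256 if x >= 128 else x


def run_pre_inc_loop(start, step, exit_on):
    """Simulate an i8 loop that exits when the pre-inc IV equals exit_on.

    Returns (trip_count, wrapped_iv_next, nuw_violated, nsw_violated),
    where the last three values describe the add on the final iteration.
    """
    iv = start & 0xFF
    trips = 0
    while True:
        trips += 1
        # Compute the add without wrapping to detect flag violations.
        unsigned_sum = iv + (step & 0xFF)
        signed_sum = to_signed8(iv) + to_signed8(step)
        nuw_violated = unsigned_sum > 0xFF
        nsw_violated = not (-128 <= signed_sum <= 127)
        iv_next = unsigned_sum & 0xFF  # value of the add with flags dropped
        if iv == (exit_on & 0xFF):     # pre-inc exit test, as in the input IR
            return trips, iv_next, nuw_violated, nsw_violated
        iv = iv_next


# drop_nuw: %iv starts at 0, adds 1, exits when %iv == -1
print(run_pre_inc_loop(0, 1, -1))
# drop_nsw: %iv starts at 127, adds -1, exits when %iv == -128
print(run_pre_inc_loop(127, -1, -128))
```

For `drop_nuw` the final add computes 255 + 1, which violates `nuw` and wraps to 0; for `drop_nsw` it computes -128 + -1, which violates `nsw` and wraps to 127. Those wrapped values are the constants the post-inc `icmp` in the CHECK lines compares against, so the add must not carry the flag once the branch reads its result.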

				; Comparison already in post-inc form, no need to drop nuw.
	define i8 @already_postinc() {			define i8 @already_postinc() {
	; CHECK-LABEL: @already_postinc(			; CHECK-LABEL: @already_postinc(
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
	; CHECK-NEXT: br label [[LOOP:%.*]]			; CHECK-NEXT: br label [[LOOP:%.*]]
	; CHECK: loop:			; CHECK: loop:
	; CHECK-NEXT: [[IV:%.*]] = phi i8 [ 0, [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]			; CHECK-NEXT: [[IV:%.*]] = phi i8 [ 0, [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[LOOP]] ]
	; CHECK-NEXT: call void @use(i8 [[IV]])			; CHECK-NEXT: call void @use(i8 [[IV]])
	; CHECK-NEXT: [[IV_NEXT]] = add nuw i8 [[IV]], 1			; CHECK-NEXT: [[IV_NEXT]] = add nuw i8 [[IV]], 1

test/Transforms/LoopStrengthReduce/X86/sibling-loops.ll

	; Check there is no extra lsr.iv generated in foo.			; Check there is no extra lsr.iv generated in foo.
	define void @foo(i64 %N) local_unnamed_addr {			define void @foo(i64 %N) local_unnamed_addr {
	; CHECK-LABEL: @foo(			; CHECK-LABEL: @foo(
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
	; CHECK-NEXT: br label [[DO_BODY:%.*]]			; CHECK-NEXT: br label [[DO_BODY:%.*]]
	; CHECK: do.body:			; CHECK: do.body:
	; CHECK-NEXT: [[I_0:%.*]] = phi i64 [ 0, [[ENTRY:%.*]] ], [ [[INC:%.*]], [[DO_BODY]] ]			; CHECK-NEXT: [[I_0:%.*]] = phi i64 [ 0, [[ENTRY:%.*]] ], [ [[INC:%.*]], [[DO_BODY]] ]
	; CHECK-NEXT: tail call void @goo(i64 [[I_0]], i64 [[I_0]])			; CHECK-NEXT: tail call void @goo(i64 [[I_0]], i64 [[I_0]])
	; CHECK-NEXT: [[INC]] = add nuw nsw i64 [[I_0]], 1			; CHECK-NEXT: [[INC]] = add nuw i64 [[I_0]], 1
	; CHECK-NEXT: [[T0:%.*]] = load i64, i64* @cond, align 8			; CHECK-NEXT: [[T0:%.*]] = load i64, i64* @cond, align 8
	; CHECK-NEXT: [[TOBOOL:%.*]] = icmp eq i64 [[T0]], 0			; CHECK-NEXT: [[TOBOOL:%.*]] = icmp eq i64 [[T0]], 0
	; CHECK-NEXT: br i1 [[TOBOOL]], label [[DO_BODY2_PREHEADER:%.*]], label [[DO_BODY]]			; CHECK-NEXT: br i1 [[TOBOOL]], label [[DO_BODY2_PREHEADER:%.*]], label [[DO_BODY]]
	; CHECK: do.body2.preheader:			; CHECK: do.body2.preheader:
	; CHECK-NEXT: br label [[DO_BODY2:%.*]]			; CHECK-NEXT: br label [[DO_BODY2:%.*]]
	; CHECK: do.body2:			; CHECK: do.body2:
	; CHECK-NEXT: [[I_1:%.*]] = phi i64 [ [[INC3:%.*]], [[DO_BODY2]] ], [ 0, [[DO_BODY2_PREHEADER]] ]			; CHECK-NEXT: [[I_1:%.*]] = phi i64 [ [[INC3:%.*]], [[DO_BODY2]] ], [ 0, [[DO_BODY2_PREHEADER]] ]
	; CHECK-NEXT: [[TMP0:%.*]] = add i64 [[INC]], [[I_1]]			; CHECK-NEXT: [[TMP0:%.*]] = add i64 [[INC]], [[I_1]]
	; CHECK-NEXT: tail call void @goo(i64 [[I_1]], i64 [[TMP0]])			; CHECK-NEXT: tail call void @goo(i64 [[I_1]], i64 [[TMP0]])
	; CHECK-NEXT: [[INC3]] = add nuw nsw i64 [[I_1]], 1			; CHECK-NEXT: [[INC3]] = add nuw i64 [[I_1]], 1
	; CHECK-NEXT: [[T1:%.*]] = load i64, i64* @cond, align 8			; CHECK-NEXT: [[T1:%.*]] = load i64, i64* @cond, align 8
	; CHECK-NEXT: [[TOBOOL6:%.*]] = icmp eq i64 [[T1]], 0			; CHECK-NEXT: [[TOBOOL6:%.*]] = icmp eq i64 [[T1]], 0
	; CHECK-NEXT: br i1 [[TOBOOL6]], label [[DO_BODY8_PREHEADER:%.*]], label [[DO_BODY2]]			; CHECK-NEXT: br i1 [[TOBOOL6]], label [[DO_BODY8_PREHEADER:%.*]], label [[DO_BODY2]]
	; CHECK: do.body8.preheader:			; CHECK: do.body8.preheader:
	; CHECK-NEXT: [[TMP1:%.*]] = add i64 [[INC]], [[INC3]]			; CHECK-NEXT: [[TMP1:%.*]] = add i64 [[INC]], [[INC3]]
	; CHECK-NEXT: br label [[DO_BODY8:%.*]]			; CHECK-NEXT: br label [[DO_BODY8:%.*]]
	; CHECK: do.body8:			; CHECK: do.body8:
	; CHECK-NEXT: [[I_2:%.*]] = phi i64 [ [[INC9:%.*]], [[DO_BODY8]] ], [ 0, [[DO_BODY8_PREHEADER]] ]			; CHECK-NEXT: [[I_2:%.*]] = phi i64 [ [[INC9:%.*]], [[DO_BODY8]] ], [ 0, [[DO_BODY8_PREHEADER]] ]
	; CHECK-NEXT: [[J_2:%.*]] = phi i64 [ [[INC10:%.*]], [[DO_BODY8]] ], [ [[TMP1]], [[DO_BODY8_PREHEADER]] ]			; CHECK-NEXT: [[J_2:%.*]] = phi i64 [ [[INC10:%.*]], [[DO_BODY8]] ], [ [[TMP1]], [[DO_BODY8_PREHEADER]] ]
	; CHECK-NEXT: tail call void @goo(i64 [[I_2]], i64 [[J_2]])			; CHECK-NEXT: tail call void @goo(i64 [[I_2]], i64 [[J_2]])
	; CHECK-NEXT: [[INC9]] = add nuw nsw i64 [[I_2]], 1			; CHECK-NEXT: [[INC9]] = add nuw nsw i64 [[I_2]], 1
	; CHECK-NEXT: [[INC10]] = add nsw i64 [[J_2]], 1			; CHECK-NEXT: [[INC10]] = add i64 [[J_2]], 1
	; CHECK-NEXT: [[T2:%.*]] = load i64, i64* @cond, align 8			; CHECK-NEXT: [[T2:%.*]] = load i64, i64* @cond, align 8
	; CHECK-NEXT: [[TOBOOL12:%.*]] = icmp eq i64 [[T2]], 0			; CHECK-NEXT: [[TOBOOL12:%.*]] = icmp eq i64 [[T2]], 0
	; CHECK-NEXT: br i1 [[TOBOOL12]], label [[DO_BODY14_PREHEADER:%.*]], label [[DO_BODY8]]			; CHECK-NEXT: br i1 [[TOBOOL12]], label [[DO_BODY14_PREHEADER:%.*]], label [[DO_BODY8]]
	; CHECK: do.body14.preheader:			; CHECK: do.body14.preheader:
	; CHECK-NEXT: br label [[DO_BODY14:%.*]]			; CHECK-NEXT: br label [[DO_BODY14:%.*]]
	; CHECK: do.body14:			; CHECK: do.body14:
	; CHECK-NEXT: [[I_3:%.*]] = phi i64 [ [[INC15:%.*]], [[DO_BODY14]] ], [ 0, [[DO_BODY14_PREHEADER]] ]			; CHECK-NEXT: [[I_3:%.*]] = phi i64 [ [[INC15:%.*]], [[DO_BODY14]] ], [ 0, [[DO_BODY14_PREHEADER]] ]
	; CHECK-NEXT: [[J_3:%.*]] = phi i64 [ [[INC16:%.*]], [[DO_BODY14]] ], [ [[INC10]], [[DO_BODY14_PREHEADER]] ]			; CHECK-NEXT: [[J_3:%.*]] = phi i64 [ [[INC16:%.*]], [[DO_BODY14]] ], [ [[INC10]], [[DO_BODY14_PREHEADER]] ]
	; CHECK-NEXT: tail call void @goo(i64 [[I_3]], i64 [[J_3]])			; CHECK-NEXT: tail call void @goo(i64 [[I_3]], i64 [[J_3]])
	; CHECK-NEXT: [[INC15]] = add nuw nsw i64 [[I_3]], 1			; CHECK-NEXT: [[INC15]] = add nuw nsw i64 [[I_3]], 1
	; CHECK-NEXT: [[INC16]] = add nsw i64 [[J_3]], 1			; CHECK-NEXT: [[INC16]] = add i64 [[J_3]], 1
	; CHECK-NEXT: [[T3:%.*]] = load i64, i64* @cond, align 8			; CHECK-NEXT: [[T3:%.*]] = load i64, i64* @cond, align 8
	; CHECK-NEXT: [[TOBOOL18:%.*]] = icmp eq i64 [[T3]], 0			; CHECK-NEXT: [[TOBOOL18:%.*]] = icmp eq i64 [[T3]], 0
	; CHECK-NEXT: br i1 [[TOBOOL18]], label [[DO_BODY20_PREHEADER:%.*]], label [[DO_BODY14]]			; CHECK-NEXT: br i1 [[TOBOOL18]], label [[DO_BODY20_PREHEADER:%.*]], label [[DO_BODY14]]
	; CHECK: do.body20.preheader:			; CHECK: do.body20.preheader:
	; CHECK-NEXT: br label [[DO_BODY20:%.*]]			; CHECK-NEXT: br label [[DO_BODY20:%.*]]
	; CHECK: do.body20:			; CHECK: do.body20:
	; CHECK-NEXT: [[I_4:%.*]] = phi i64 [ [[INC21:%.*]], [[DO_BODY20]] ], [ 0, [[DO_BODY20_PREHEADER]] ]			; CHECK-NEXT: [[I_4:%.*]] = phi i64 [ [[INC21:%.*]], [[DO_BODY20]] ], [ 0, [[DO_BODY20_PREHEADER]] ]
	; CHECK-NEXT: [[J_4:%.*]] = phi i64 [ [[INC22:%.*]], [[DO_BODY20]] ], [ [[INC16]], [[DO_BODY20_PREHEADER]] ]			; CHECK-NEXT: [[J_4:%.*]] = phi i64 [ [[INC22:%.*]], [[DO_BODY20]] ], [ [[INC16]], [[DO_BODY20_PREHEADER]] ]
	; CHECK-NEXT: tail call void @goo(i64 [[I_4]], i64 [[J_4]])			; CHECK-NEXT: tail call void @goo(i64 [[I_4]], i64 [[J_4]])
	; CHECK-NEXT: [[INC21]] = add nuw nsw i64 [[I_4]], 1			; CHECK-NEXT: [[INC21]] = add nuw nsw i64 [[I_4]], 1
	; CHECK-NEXT: [[INC22]] = add nsw i64 [[J_4]], 1			; CHECK-NEXT: [[INC22]] = add i64 [[J_4]], 1
	; CHECK-NEXT: [[T4:%.*]] = load i64, i64* @cond, align 8			; CHECK-NEXT: [[T4:%.*]] = load i64, i64* @cond, align 8
	; CHECK-NEXT: [[TOBOOL24:%.*]] = icmp eq i64 [[T4]], 0			; CHECK-NEXT: [[TOBOOL24:%.*]] = icmp eq i64 [[T4]], 0
	; CHECK-NEXT: br i1 [[TOBOOL24]], label [[DO_BODY26_PREHEADER:%.*]], label [[DO_BODY20]]			; CHECK-NEXT: br i1 [[TOBOOL24]], label [[DO_BODY26_PREHEADER:%.*]], label [[DO_BODY20]]
	; CHECK: do.body26.preheader:			; CHECK: do.body26.preheader:
	; CHECK-NEXT: br label [[DO_BODY26:%.*]]			; CHECK-NEXT: br label [[DO_BODY26:%.*]]
	; CHECK: do.body26:			; CHECK: do.body26:
	; CHECK-NEXT: [[I_5:%.*]] = phi i64 [ [[INC27:%.*]], [[DO_BODY26]] ], [ 0, [[DO_BODY26_PREHEADER]] ]			; CHECK-NEXT: [[I_5:%.*]] = phi i64 [ [[INC27:%.*]], [[DO_BODY26]] ], [ 0, [[DO_BODY26_PREHEADER]] ]
	; CHECK-NEXT: [[J_5:%.*]] = phi i64 [ [[INC28:%.*]], [[DO_BODY26]] ], [ [[INC22]], [[DO_BODY26_PREHEADER]] ]			; CHECK-NEXT: [[J_5:%.*]] = phi i64 [ [[INC28:%.*]], [[DO_BODY26]] ], [ [[INC22]], [[DO_BODY26_PREHEADER]] ]