This is an archive of the discontinued LLVM Phabricator instance.

Differential D51014

[SCEV] Compare SCEVs by a complexity rank before lexic-al comparison
Needs Revision · Public

Authored by rtereshin on Aug 20 2018, 4:44 PM.

Details

Reviewers

mkazantsev
efriedma
sanjoy

Summary

This patch is planned to include 2 commits:

The first commit:

Revert "[SCEV][NFC] Check NoWrap flags before lexicographical comparison of SCEVs"

This reverts r319889.

Unfortunately, wrapping flags are not a part of SCEV's identity (they
do not participate in computing a hash value or in equality
comparisons) and in fact they can be assigned after the fact without
rebuilding a SCEV.

Grep for const_casts to see quite a few examples, apparently all
for AddRecs at the moment.

So, if 2 expressions get built in 2 slightly different ways: one with
flags set in the beginning, the other with the flags attached later
on, we may end up with 2 expressions which are exactly the same but
have their operands swapped in one of the commutative N-ary
expressions, and at least one of them will have "sorted by complexity"
invariant broken.

As a result, 2 identical SCEVs won't compare equal by pointer
comparison as they are supposed to.

A real-world reproducer is added as a regression test: the issue
described causes 2 identical SCEV expressions to have different order
of operands and therefore compare not equal, which in turn
prevents LoadStoreVectorizer from vectorizing a pair of consecutive
loads.

On a larger example (the source of the test attached, which is a
bugpoint) I have seen even weirder behavior: adding a constant to an
existing SCEV changes the order of the existing terms, for instance,
getAddExpr(1, ((A * B) + (C * D))) returns (1 + (C * D) + (A * B)).

The second commit:

[SCEV] Compare SCEVs by a complexity rank before lexic-al comparison

after "important" ordering (by type of the SCEV node and by dominance
relation on Loops for AddRec's) is done.

The exact order of operands of associative and commutative SCEV
operators (add, mul, smax, and umax) does not matter but needs to be
consistent. Currently it's resolved by a deep lexicographical
comparison which could be expensive.

This commit adds a small hash value to every SCEV node which is
supposed to have the following properties:

1. Be equal for SCEVs of the same lexicographical complexity.
2. Collide for SCEVs of different complexities as rarely as possible.
3. Increase in its numerical value for small SCEVs with the increase in their complexity in a sensible way to aid debugging and pretty-printing.

The hash is roughly a polynomial hash with one major difference: it
only adds up the hash values of operands of commutative and
associative operators. This way the complexity of such operators
does not depend on order of the operands, which is rather expected,
and the hash value does not wrap around as quickly as otherwise,
making sure the property (3) holds for larger SCEVs.

CompareSCEVComplexity is also changed so it compares SCEVs by the hash
(called complexity rank) after the "important" checks and resorts to a
full lexicographical comparison only if the ranks are the same.

Expected increase in memory usage is zero, including 32-bit
architectures.

The performance impact is measured on large-SCEVs-shallow-getSCEV-stack.ll
from https://bugs.llvm.org/show_bug.cgi?id=32731, with the
https://reviews.llvm.org/D50985 patch applied and
ScalarEvolution::checkValidity disabled, on an x86 machine (100 runs total, ms):

                        Before    After by W Flags    After by NC Rank
CompareSCEVComplexity    2,620               2,590                  84
GroupByComplexity        4,280               4,310               3,100
createSCEV              34,180              34,100              32,840

The major goal of this patch is to make sure we don't have a compile
time regression due to the revert of r319889 ("[SCEV][NFC] Check NoWrap
flags before lexicographical comparison of SCEVs").

Diff Detail

Repository: rL LLVM

Event Timeline

rtereshin created this revision. Aug 20 2018, 4:44 PM

Herald added subscribers: javed.absar, eraman. Aug 20 2018, 4:44 PM

If these are two patches, please put them on review as 2 different patches and set a dependency between them. You can always have them merged together, but they will be easier to review separately. It is not clear now which changes in tests are caused by which patch. I believe that the part with the revert can be merged unconditionally because it shows a functional problem, even if it costs us some compile time.

As for the algorithm, see comments inline. I think you should use hash_combine and collect some statistics to be sure that your assumptions about rare collisions are right.

include/llvm/Analysis/ScalarEvolution.h
92	I just wonder if 32M different hashes is enough (maybe the statistics collection will help us understand).
146	You only compare ranks when the SCEVTypes of two SCEVs match (lines 636-639):
	// Primarily, sort the SCEVs by their getSCEVType().
	unsigned LType = LHS->getSCEVType(), RType = RHS->getSCEVType();
	if (LType != RType)
	  return (int)LType - (int)RType;
	There is no point in multiplying by `(SCEVType + 1U)`.
149	Per comment above, how is that different from `getComplexityRank`?
include/llvm/Analysis/ScalarEvolutionExpressions.h
28	Please arrange includes in lexicographic order.
206	I guess what you want here is `return Acc * X + Op->getComplexityRank();`, otherwise it's not even a polynomial hash.
271	Could you please explain why you multiply by `scMulExpr`? If you need some constant for polynomial hash computation then it's better to declare it separately. It had better be a prime number to have fewer collisions (I don't even know what number it is now). Otherwise, you might be interested in using the function `hash_combine`, which is used all over LLVM. This will spare you from computing polynomials manually. I'd also suggest making a separate method for hash computation to keep things encapsulated.
lib/Analysis/ScalarEvolution.cpp
698	Could you please add statistics to check how often we have a hash collision? I.e. when hashes matched but further comparison returned something different from 0.
767	Do we really need this? They have 1 argument, so this check will be done if needed when we compare arguments, no?

This revision now requires changes to proceed. Aug 27 2018, 2:28 AM

sanjoy resigned from this revision. Jan 29 2022, 5:41 PM

Herald added a subscriber: pengfei. Jan 29 2022, 5:41 PM

Revision Contents

Path                                                                          Size
include/llvm/Analysis/ScalarEvolution.h                                       33 lines
include/llvm/Analysis/ScalarEvolutionExpressions.h                            39 lines
lib/Analysis/ScalarEvolution.cpp                                              32 lines
test/Analysis/ScalarEvolution/max-addops-inline.ll                            2 lines
test/Analysis/ScalarEvolution/min-max-exprs.ll                                2 lines
test/Analysis/ScalarEvolution/predicated-trip-count.ll                        2 lines
test/Analysis/ScalarEvolution/trip-count14.ll                                 4 lines
test/Transforms/IRCE/conjunctive-checks.ll                                    6 lines
test/Transforms/IRCE/ranges_of_different_types.ll                             8 lines
test/Transforms/IRCE/rc-negative-bound.ll                                     76 lines
test/Transforms/IRCE/single-access-no-preloop.ll                              6 lines
test/Transforms/IRCE/single-access-with-preloop.ll                            8 lines
test/Transforms/LoadStoreVectorizer/X86/compare-scev-by-complexity.ll         76 lines

Diff 161617

include/llvm/Analysis/ScalarEvolution.h

[... 76 lines hidden ...]
class SCEV : public FoldingSetNode {		class SCEV : public FoldingSetNode {
friend struct FoldingSetTrait<SCEV>;		friend struct FoldingSetTrait<SCEV>;

/// A reference to an Interned FoldingSetNodeID for this node. The		/// A reference to an Interned FoldingSetNodeID for this node. The
/// ScalarEvolution's BumpPtrAllocator holds the data.		/// ScalarEvolution's BumpPtrAllocator holds the data.
FoldingSetNodeIDRef FastID;		FoldingSetNodeIDRef FastID;

// The SCEV baseclass this node corresponds to		// The SCEV baseclass this node corresponds to
const unsigned short SCEVType;		const unsigned short SCEVType : 4;

		// A hash value for the SCEV expression rooted at this node. Represents the
		// total complexity of the SCEV sub-trees referenced from the root, excluding
		// the root itself. For small expressions increases in value with the increase
		// of the complexity in a way consistent with CompareSCEVComplexity, for large
		// expressions behaves just like a hash.
		const unsigned NestedComplexityRank : 25;
		[Inline comment, mkazantsev] I just wonder if 32M different hashes is enough (maybe the statistics collection will help us understand).
		static constexpr unsigned NCMask = 0x1FFFFFF;

protected:		protected:
/// This field is initialized to zero and may be used in subclasses to store		/// This field is initialized to zero and may be used in subclasses to store
/// miscellaneous information.		/// miscellaneous information.
unsigned short SubclassData = 0;		unsigned short SubclassData : 3;

public:		public:
/// NoWrapFlags are bitfield indices into SubclassData.		/// NoWrapFlags are bitfield indices into SubclassData.
///		///
/// Add and Mul expressions may have no-unsigned-wrap <NUW> or		/// Add and Mul expressions may have no-unsigned-wrap <NUW> or
/// no-signed-wrap <NSW> properties, which are derived from the IR		/// no-signed-wrap <NSW> properties, which are derived from the IR
/// operator. NSW is a misnomer that we use to mean no signed overflow or		/// operator. NSW is a misnomer that we use to mean no signed overflow or
/// underflow.		/// underflow.
[... 12 lines hidden ...] (context: public:)
enum NoWrapFlags {		enum NoWrapFlags {
FlagAnyWrap = 0, // No guarantee.		FlagAnyWrap = 0, // No guarantee.
FlagNW = (1 << 0), // No self-wrap.		FlagNW = (1 << 0), // No self-wrap.
FlagNUW = (1 << 1), // No unsigned wrap.		FlagNUW = (1 << 1), // No unsigned wrap.
FlagNSW = (1 << 2), // No signed wrap.		FlagNSW = (1 << 2), // No signed wrap.
NoWrapMask = (1 << 3) - 1		NoWrapMask = (1 << 3) - 1
};		};

explicit SCEV(const FoldingSetNodeIDRef ID, unsigned SCEVTy)		explicit SCEV(const FoldingSetNodeIDRef ID, unsigned SCEVTy,
: FastID(ID), SCEVType(SCEVTy) {}		unsigned NCRank = 1)
		: FastID(ID), SCEVType(SCEVTy),
		// As the actual complexity rank of the entire expression including the
		// root is computed as ((SCEVType + 1U) * NestedComplexityRank) we need
		// to avoid flushing the nested complexity rank to zero when it wraps.
		NestedComplexityRank((NCRank & NCMask) ? NCRank : 1),
		SubclassData(FlagAnyWrap) {}

SCEV(const SCEV &) = delete;		SCEV(const SCEV &) = delete;
SCEV &operator=(const SCEV &) = delete;		SCEV &operator=(const SCEV &) = delete;

unsigned getSCEVType() const { return SCEVType; }		unsigned getSCEVType() const { return SCEVType; }

		/// Return a rank used to compute the nested complexity of a parent SCEV node.
		/// Has a higher chance of wrapping around than NestedComplexityRank,
		/// therefore use getNestedComplexityRank() instead to compare SCEVs by
		/// complexity if they are known to have the same root type.
		unsigned getComplexityRank() const {
		return (SCEVType + 1U) * getNestedComplexityRank();
		[Inline comment, mkazantsev] You only compare ranks when the SCEVTypes of two SCEVs match (lines 636-639):
		    // Primarily, sort the SCEVs by their getSCEVType().
		    unsigned LType = LHS->getSCEVType(), RType = RHS->getSCEVType();
		    if (LType != RType)
		      return (int)LType - (int)RType;
		There is no point in multiplying by `(SCEVType + 1U)`.
		}
		/// Return a rank used to compare complexities of same root type SCEVs.
		unsigned getNestedComplexityRank() const { return NestedComplexityRank; }
		[Inline comment, mkazantsev] Per comment above, how is that different from `getComplexityRank`?

/// Return the LLVM type of this SCEV expression.		/// Return the LLVM type of this SCEV expression.
Type *getType() const;		Type *getType() const;

/// Return true if the expression is a constant zero.		/// Return true if the expression is a constant zero.
bool isZero() const;		bool isZero() const;

/// Return true if the expression is a constant one.		/// Return true if the expression is a constant one.
bool isOne() const;		bool isOne() const;
[... 1,881 lines hidden ...]

include/llvm/Analysis/ScalarEvolutionExpressions.h

[... 19 lines hidden ...]
#include "llvm/ADT/SmallVector.h"		#include "llvm/ADT/SmallVector.h"
#include "llvm/ADT/iterator_range.h"		#include "llvm/ADT/iterator_range.h"
#include "llvm/Analysis/ScalarEvolution.h"		#include "llvm/Analysis/ScalarEvolution.h"
#include "llvm/IR/Constants.h"		#include "llvm/IR/Constants.h"
#include "llvm/IR/Value.h"		#include "llvm/IR/Value.h"
#include "llvm/IR/ValueHandle.h"		#include "llvm/IR/ValueHandle.h"
#include "llvm/Support/Casting.h"		#include "llvm/Support/Casting.h"
#include "llvm/Support/ErrorHandling.h"		#include "llvm/Support/ErrorHandling.h"
		#include <numeric>
		[Inline comment, mkazantsev] Please arrange includes in lexicographic order.
#include <cassert>		#include <cassert>
#include <cstddef>		#include <cstddef>

namespace llvm {		namespace llvm {

class APInt;		class APInt;
class Constant;		class Constant;
class ConstantRange;		class ConstantRange;
[... 101 lines hidden ...] (context: class Type;)
protected:		protected:
// Since SCEVs are immutable, ScalarEvolution allocates operand		// Since SCEVs are immutable, ScalarEvolution allocates operand
// arrays with its SCEVAllocator, so this class just needs a simple		// arrays with its SCEVAllocator, so this class just needs a simple
// pointer rather than a more elaborate vector-like data structure.		// pointer rather than a more elaborate vector-like data structure.
// This also avoids the need for a non-trivial destructor.		// This also avoids the need for a non-trivial destructor.
const SCEV *const *Operands;		const SCEV *const *Operands;
size_t NumOperands;		size_t NumOperands;

SCEVNAryExpr(const FoldingSetNodeIDRef ID,		SCEVNAryExpr(const FoldingSetNodeIDRef ID, enum SCEVTypes T,
enum SCEVTypes T, const SCEV *const *O, size_t N)		const SCEV *const *O, size_t N, unsigned NCRank)
: SCEV(ID, T), Operands(O), NumOperands(N) {}		: SCEV(ID, T, NCRank), Operands(O), NumOperands(N) {}

public:		public:
size_t getNumOperands() const { return NumOperands; }		size_t getNumOperands() const { return NumOperands; }

const SCEV *getOperand(unsigned i) const {		const SCEV *getOperand(unsigned i) const {
assert(i < NumOperands && "Operand index out of range!");		assert(i < NumOperands && "Operand index out of range!");
return Operands[i];		return Operands[i];
}		}
[... 33 lines hidden ...] (context: static bool classof(const SCEV *S) {)
S->getSCEVType() == scUMaxExpr ||		S->getSCEVType() == scUMaxExpr ||
S->getSCEVType() == scAddRecExpr;		S->getSCEVType() == scAddRecExpr;
}		}
};		};

/// This node is the base class for n'ary commutative operators.		/// This node is the base class for n'ary commutative operators.
class SCEVCommutativeExpr : public SCEVNAryExpr {		class SCEVCommutativeExpr : public SCEVNAryExpr {
protected:		protected:
SCEVCommutativeExpr(const FoldingSetNodeIDRef ID,		SCEVCommutativeExpr(const FoldingSetNodeIDRef ID, enum SCEVTypes T,
enum SCEVTypes T, const SCEV *const *O, size_t N)		const SCEV *const *O, size_t N)
: SCEVNAryExpr(ID, T, O, N) {}		: SCEVNAryExpr(ID, T, O, N,
		// All subclasses are also associative operations, not
		// just commutative, therefore all the operands have
		// equally weighted contributions to the complexity.
		std::accumulate(O, O + N, 0U,
		[](unsigned Acc, const SCEV *const Op) {
		return Acc + Op->getComplexityRank();
		[Inline comment, mkazantsev] I guess what you want here is `return Acc * X + Op->getComplexityRank();`, otherwise it's not even a polynomial hash.
		})) {}

public:		public:
/// Methods for support type inquiry through isa, cast, and dyn_cast:		/// Methods for support type inquiry through isa, cast, and dyn_cast:
static bool classof(const SCEV *S) {		static bool classof(const SCEV *S) {
return S->getSCEVType() == scAddExpr ||		return S->getSCEVType() == scAddExpr ||
S->getSCEVType() == scMulExpr ||		S->getSCEVType() == scMulExpr ||
S->getSCEVType() == scSMaxExpr ||		S->getSCEVType() == scSMaxExpr ||
S->getSCEVType() == scUMaxExpr;		S->getSCEVType() == scUMaxExpr;
[... 45 lines hidden ...] (context: class Type;)
/// This class represents a binary unsigned division operation.		/// This class represents a binary unsigned division operation.
class SCEVUDivExpr : public SCEV {		class SCEVUDivExpr : public SCEV {
friend class ScalarEvolution;		friend class ScalarEvolution;

const SCEV *LHS;		const SCEV *LHS;
const SCEV *RHS;		const SCEV *RHS;

SCEVUDivExpr(const FoldingSetNodeIDRef ID, const SCEV *lhs, const SCEV *rhs)		SCEVUDivExpr(const FoldingSetNodeIDRef ID, const SCEV *lhs, const SCEV *rhs)
: SCEV(ID, scUDivExpr), LHS(lhs), RHS(rhs) {}		: SCEV(ID, scUDivExpr,
		// For consistency with lexicographical order weigh LHS higher.
		lhs->getComplexityRank() * scMulExpr + rhs->getComplexityRank()),
		[Inline comment, mkazantsev] Could you please explain why you multiply by `scMulExpr`? If you need some constant for polynomial hash computation then it's better to declare it separately. It had better be a prime number to have fewer collisions (I don't even know what number it is now). Otherwise, you might be interested in using the function `hash_combine`, which is used all over LLVM. This will spare you from computing polynomials manually. I'd also suggest making a separate method for hash computation to keep things encapsulated.
		LHS(lhs), RHS(rhs) {}

public:		public:
const SCEV *getLHS() const { return LHS; }		const SCEV *getLHS() const { return LHS; }
const SCEV *getRHS() const { return RHS; }		const SCEV *getRHS() const { return RHS; }

Type *getType() const {		Type *getType() const {
// In most cases the types of LHS and RHS will be the same, but in some		// In most cases the types of LHS and RHS will be the same, but in some
// crazy cases one or the other may be a pointer. ScalarEvolution doesn't		// crazy cases one or the other may be a pointer. ScalarEvolution doesn't
[... 17 lines hidden ...] (context: class Type;)
///		///
/// All operands of an AddRec are required to be loop invariant.		/// All operands of an AddRec are required to be loop invariant.
///		///
class SCEVAddRecExpr : public SCEVNAryExpr {		class SCEVAddRecExpr : public SCEVNAryExpr {
friend class ScalarEvolution;		friend class ScalarEvolution;

const Loop *L;		const Loop *L;

SCEVAddRecExpr(const FoldingSetNodeIDRef ID,		SCEVAddRecExpr(const FoldingSetNodeIDRef ID, const SCEV *const *O, size_t N,
const SCEV *const *O, size_t N, const Loop *l)		const Loop *l)
: SCEVNAryExpr(ID, scAddRecExpr, O, N), L(l) {}		: SCEVNAryExpr(ID, scAddRecExpr, O, N,
		// For consistency with lexicographical order weigh Start
		// higher than Step, and Step higher than x^2-Step.
		std::accumulate(O, O + N, 0U,
		[](unsigned Acc, const SCEV *Op) {
		return Acc * scMulExpr +
		Op->getComplexityRank();
		})),
		L(l) {}

public:		public:
const SCEV *getStart() const { return Operands[0]; }		const SCEV *getStart() const { return Operands[0]; }
const Loop *getLoop() const { return L; }		const Loop *getLoop() const { return L; }

/// Constructs and returns the recurrence indicating how much this		/// Constructs and returns the recurrence indicating how much this
/// expression steps by. If this is a polynomial of degree N, it		/// expression steps by. If this is a polynomial of degree N, it
/// returns a chrec of degree N-1. We cannot determine whether		/// returns a chrec of degree N-1. We cannot determine whether
[... 457 lines hidden ...]

lib/Analysis/ScalarEvolution.cpp

[... 408 lines hidden ...]
}		}

const SCEV *		const SCEV *
ScalarEvolution::getConstant(Type *Ty, uint64_t V, bool isSigned) {		ScalarEvolution::getConstant(Type *Ty, uint64_t V, bool isSigned) {
IntegerType *ITy = cast<IntegerType>(getEffectiveSCEVType(Ty));		IntegerType *ITy = cast<IntegerType>(getEffectiveSCEVType(Ty));
return getConstant(ConstantInt::get(ITy, V, isSigned));		return getConstant(ConstantInt::get(ITy, V, isSigned));
}		}

SCEVCastExpr::SCEVCastExpr(const FoldingSetNodeIDRef ID,		SCEVCastExpr::SCEVCastExpr(const FoldingSetNodeIDRef ID, unsigned SCEVTy,
unsigned SCEVTy, const SCEV *op, Type *ty)		const SCEV *op, Type *ty)
: SCEV(ID, SCEVTy), Op(op), Ty(ty) {}		: SCEV(ID, SCEVTy, op->getComplexityRank()), Op(op), Ty(ty) {}

SCEVTruncateExpr::SCEVTruncateExpr(const FoldingSetNodeIDRef ID,		SCEVTruncateExpr::SCEVTruncateExpr(const FoldingSetNodeIDRef ID,
const SCEV *op, Type *ty)		const SCEV *op, Type *ty)
: SCEVCastExpr(ID, scTruncate, op, ty) {		: SCEVCastExpr(ID, scTruncate, op, ty) {
assert(Op->getType()->isIntOrPtrTy() && Ty->isIntOrPtrTy() &&		assert(Op->getType()->isIntOrPtrTy() && Ty->isIntOrPtrTy() &&
"Cannot truncate non-integer value!");		"Cannot truncate non-integer value!");
}		}

[... 259 lines hidden ...] (context: if (LLoop != RLoop) {)
return -1;		return -1;
}		}

// Addrec complexity grows with operand count.		// Addrec complexity grows with operand count.
unsigned LNumOps = LA->getNumOperands(), RNumOps = RA->getNumOperands();		unsigned LNumOps = LA->getNumOperands(), RNumOps = RA->getNumOperands();
if (LNumOps != RNumOps)		if (LNumOps != RNumOps)
return (int)LNumOps - (int)RNumOps;		return (int)LNumOps - (int)RNumOps;

// Compare NoWrap flags.		const unsigned LNCRank = LA->getNestedComplexityRank();
if (LA->getNoWrapFlags() != RA->getNoWrapFlags())		const unsigned RNCRank = RA->getNestedComplexityRank();
return (int)LA->getNoWrapFlags() - (int)RA->getNoWrapFlags();		if (LNCRank != RNCRank)
		return (int)LNCRank - (int)RNCRank;
		[Inline comment, mkazantsev] Could you please add statistics to check how often we have a hash collision? I.e. when hashes matched but further comparison returned something different from 0.

// Lexicographically compare.		// Lexicographically compare.
for (unsigned i = 0; i != LNumOps; ++i) {		for (unsigned i = 0; i != LNumOps; ++i) {
int X = CompareSCEVComplexity(EqCacheSCEV, EqCacheValue, LI,		int X = CompareSCEVComplexity(EqCacheSCEV, EqCacheValue, LI,
LA->getOperand(i), RA->getOperand(i), DT,		LA->getOperand(i), RA->getOperand(i), DT,
Depth + 1);		Depth + 1);
if (X != 0)		if (X != 0)
return X;		return X;
}		}
EqCacheSCEV.unionSets(LHS, RHS);		EqCacheSCEV.unionSets(LHS, RHS);
return 0;		return 0;
}		}

case scAddExpr:		case scAddExpr:
case scMulExpr:		case scMulExpr:
case scSMaxExpr:		case scSMaxExpr:
case scUMaxExpr: {		case scUMaxExpr: {
const SCEVNAryExpr *LC = cast<SCEVNAryExpr>(LHS);		const SCEVNAryExpr *LC = cast<SCEVNAryExpr>(LHS);
const SCEVNAryExpr *RC = cast<SCEVNAryExpr>(RHS);		const SCEVNAryExpr *RC = cast<SCEVNAryExpr>(RHS);

		const unsigned LNCRank = LC->getNestedComplexityRank();
		const unsigned RNCRank = RC->getNestedComplexityRank();
		if (LNCRank != RNCRank)
		return (int)LNCRank - (int)RNCRank;

// Lexicographically compare n-ary expressions.		// Lexicographically compare n-ary expressions.
unsigned LNumOps = LC->getNumOperands(), RNumOps = RC->getNumOperands();		unsigned LNumOps = LC->getNumOperands(), RNumOps = RC->getNumOperands();
if (LNumOps != RNumOps)		if (LNumOps != RNumOps)
return (int)LNumOps - (int)RNumOps;		return (int)LNumOps - (int)RNumOps;

// Compare NoWrap flags.
if (LC->getNoWrapFlags() != RC->getNoWrapFlags())
return (int)LC->getNoWrapFlags() - (int)RC->getNoWrapFlags();

for (unsigned i = 0; i != LNumOps; ++i) {		for (unsigned i = 0; i != LNumOps; ++i) {
int X = CompareSCEVComplexity(EqCacheSCEV, EqCacheValue, LI,		int X = CompareSCEVComplexity(EqCacheSCEV, EqCacheValue, LI,
LC->getOperand(i), RC->getOperand(i), DT,		LC->getOperand(i), RC->getOperand(i), DT,
Depth + 1);		Depth + 1);
if (X != 0)		if (X != 0)
return X;		return X;
}		}
EqCacheSCEV.unionSets(LHS, RHS);		EqCacheSCEV.unionSets(LHS, RHS);
return 0;		return 0;
}		}

case scUDivExpr: {		case scUDivExpr: {
const SCEVUDivExpr *LC = cast<SCEVUDivExpr>(LHS);		const SCEVUDivExpr *LC = cast<SCEVUDivExpr>(LHS);
const SCEVUDivExpr *RC = cast<SCEVUDivExpr>(RHS);		const SCEVUDivExpr *RC = cast<SCEVUDivExpr>(RHS);

		const unsigned LNCRank = LC->getNestedComplexityRank();
		const unsigned RNCRank = RC->getNestedComplexityRank();
		if (LNCRank != RNCRank)
		return (int)LNCRank - (int)RNCRank;

// Lexicographically compare udiv expressions.		// Lexicographically compare udiv expressions.
int X = CompareSCEVComplexity(EqCacheSCEV, EqCacheValue, LI, LC->getLHS(),		int X = CompareSCEVComplexity(EqCacheSCEV, EqCacheValue, LI, LC->getLHS(),
RC->getLHS(), DT, Depth + 1);		RC->getLHS(), DT, Depth + 1);
if (X != 0)		if (X != 0)
return X;		return X;
X = CompareSCEVComplexity(EqCacheSCEV, EqCacheValue, LI, LC->getRHS(),		X = CompareSCEVComplexity(EqCacheSCEV, EqCacheValue, LI, LC->getRHS(),
RC->getRHS(), DT, Depth + 1);		RC->getRHS(), DT, Depth + 1);
if (X == 0)		if (X == 0)
EqCacheSCEV.unionSets(LHS, RHS);		EqCacheSCEV.unionSets(LHS, RHS);
return X;		return X;
}		}

case scTruncate:		case scTruncate:
case scZeroExtend:		case scZeroExtend:
case scSignExtend: {		case scSignExtend: {
const SCEVCastExpr *LC = cast<SCEVCastExpr>(LHS);		const SCEVCastExpr *LC = cast<SCEVCastExpr>(LHS);
const SCEVCastExpr *RC = cast<SCEVCastExpr>(RHS);		const SCEVCastExpr *RC = cast<SCEVCastExpr>(RHS);

		const unsigned LNCRank = LC->getNestedComplexityRank();
		[Inline comment, mkazantsev] Do we really need this? They have 1 argument, so this check will be done if needed when we compare arguments, no?
		const unsigned RNCRank = RC->getNestedComplexityRank();
		if (LNCRank != RNCRank)
		return (int)LNCRank - (int)RNCRank;

// Compare cast expressions by operand.		// Compare cast expressions by operand.
int X = CompareSCEVComplexity(EqCacheSCEV, EqCacheValue, LI,		int X = CompareSCEVComplexity(EqCacheSCEV, EqCacheValue, LI,
LC->getOperand(), RC->getOperand(), DT,		LC->getOperand(), RC->getOperand(), DT,
Depth + 1);		Depth + 1);
if (X == 0)		if (X == 0)
EqCacheSCEV.unionSets(LHS, RHS);		EqCacheSCEV.unionSets(LHS, RHS);
return X;		return X;
}		}
[... 9,991 lines hidden ...]

test/Analysis/ScalarEvolution/max-addops-inline.ll

	; RUN: opt -analyze -scalar-evolution -scev-addops-inline-threshold=1 < %s | FileCheck --check-prefix=CHECK1 %s			; RUN: opt -analyze -scalar-evolution -scev-addops-inline-threshold=1 < %s | FileCheck --check-prefix=CHECK1 %s
	; RUN: opt -analyze -scalar-evolution -scev-addops-inline-threshold=10 < %s | FileCheck --check-prefix=CHECK10 %s			; RUN: opt -analyze -scalar-evolution -scev-addops-inline-threshold=10 < %s | FileCheck --check-prefix=CHECK10 %s

	define i32 @foo(i64 %p0, i32 %p1) {			define i32 @foo(i64 %p0, i32 %p1) {
	; CHECK1: %add2 = add nsw i32 %mul1, %add			; CHECK1: %add2 = add nsw i32 %mul1, %add
	; CHECK1-NEXT: --> ((trunc i64 %p0 to i32) * (1 + (trunc i64 %p0 to i32)) * (1 + %p1))			; CHECK1-NEXT: --> ((trunc i64 %p0 to i32) * (1 + %p1) * (1 + (trunc i64 %p0 to i32)))

	; CHECK10: %add2 = add nsw i32 %mul1, %add			; CHECK10: %add2 = add nsw i32 %mul1, %add
	; CHECK10-NEXT: --> ((trunc i64 %p0 to i32) * (1 + ((trunc i64 %p0 to i32) * (1 + %p1)) + %p1))			; CHECK10-NEXT: --> ((trunc i64 %p0 to i32) * (1 + ((trunc i64 %p0 to i32) * (1 + %p1)) + %p1))
	entry:			entry:
	%tr = trunc i64 %p0 to i32			%tr = trunc i64 %p0 to i32
	%mul = mul nsw i32 %tr, %p1			%mul = mul nsw i32 %tr, %p1
	%add = add nsw i32 %mul, %tr			%add = add nsw i32 %mul, %tr
	%mul1 = mul nsw i32 %add, %tr			%mul1 = mul nsw i32 %add, %tr
	%add2 = add nsw i32 %mul1, %add			%add2 = add nsw i32 %mul1, %add
	ret i32 %add2			ret i32 %add2
	}			}

test/Analysis/ScalarEvolution/min-max-exprs.ll

	[... 27 lines hidden ...]
	bb2: ; preds = %bb1			bb2: ; preds = %bb1
	%tmp3 = add nuw nsw i32 %i.0, 3			%tmp3 = add nuw nsw i32 %i.0, 3
	%tmp4 = icmp slt i32 %tmp3, %N			%tmp4 = icmp slt i32 %tmp3, %N
	%tmp5 = sext i32 %tmp3 to i64			%tmp5 = sext i32 %tmp3 to i64
	%tmp6 = sext i32 %N to i64			%tmp6 = sext i32 %N to i64
	%tmp9 = select i1 %tmp4, i64 %tmp5, i64 %tmp6			%tmp9 = select i1 %tmp4, i64 %tmp5, i64 %tmp6
	; min(N, i+3)			; min(N, i+3)
	; CHECK: select i1 %tmp4, i64 %tmp5, i64 %tmp6			; CHECK: select i1 %tmp4, i64 %tmp5, i64 %tmp6
	; CHECK-NEXT: --> (-1 + (-1 * ((-1 + (-1 * (sext i32 {3,+,1}<nuw><%bb1> to i64))<nsw>)<nsw> smax (-1 + (-1 * (sext i32 %N to i64))<nsw>)<nsw>))<nsw>)<nsw>			; CHECK-NEXT: --> (-1 + (-1 * ((-1 + (-1 * (sext i32 %N to i64))<nsw>)<nsw> smax (-1 + (-1 * (sext i32 {3,+,1}<nuw><%bb1> to i64))<nsw>)<nsw>))<nsw>)<nsw>
	%tmp11 = getelementptr inbounds i32, i32* %A, i64 %tmp9			%tmp11 = getelementptr inbounds i32, i32* %A, i64 %tmp9
	%tmp12 = load i32, i32* %tmp11, align 4			%tmp12 = load i32, i32* %tmp11, align 4
	%tmp13 = shl nsw i32 %tmp12, 1			%tmp13 = shl nsw i32 %tmp12, 1
	%tmp14 = icmp sge i32 3, %i.0			%tmp14 = icmp sge i32 3, %i.0
	%tmp17 = add nsw i64 %i.0.1, -3			%tmp17 = add nsw i64 %i.0.1, -3
	%tmp19 = select i1 %tmp14, i64 0, i64 %tmp17			%tmp19 = select i1 %tmp14, i64 0, i64 %tmp17
	; max(0, i - 3)			; max(0, i - 3)
	; CHECK: select i1 %tmp14, i64 0, i64 %tmp17			; CHECK: select i1 %tmp14, i64 0, i64 %tmp17
	[... 9 lines hidden ...]

test/Analysis/ScalarEvolution/predicated-trip-count.ll

	[... 74 lines hidden ...]
	; true.			; true.

	; CHECK: Classifying expressions for: @test2			; CHECK: Classifying expressions for: @test2

	; CHECK: %i.0.ext = sext i16 %i.0 to i32			; CHECK: %i.0.ext = sext i16 %i.0 to i32
	; CHECK-NEXT: --> (sext i16 {%Start,+,-1}<%bb3> to i32)			; CHECK-NEXT: --> (sext i16 {%Start,+,-1}<%bb3> to i32)
	; CHECK: Loop %bb3: Unpredictable backedge-taken count.			; CHECK: Loop %bb3: Unpredictable backedge-taken count.
	; CHECK-NEXT: Loop %bb3: Unpredictable max backedge-taken count.			; CHECK-NEXT: Loop %bb3: Unpredictable max backedge-taken count.
	; CHECK-NEXT: Loop %bb3: Predicated backedge-taken count is (2 + (sext i16 %Start to i32) + ((-2 + (-1 * (sext i16 %Start to i32))<nsw>) smax (-1 + (-1 * %M))))			; CHECK-NEXT: Loop %bb3: Predicated backedge-taken count is (2 + (sext i16 %Start to i32) + ((-1 + (-1 * %M)) smax (-2 + (-1 * (sext i16 %Start to i32))<nsw>)))
	; CHECK-NEXT: Predicates:			; CHECK-NEXT: Predicates:
	; CHECK-NEXT: {%Start,+,-1}<%bb3> Added Flags: <nssw>			; CHECK-NEXT: {%Start,+,-1}<%bb3> Added Flags: <nssw>

	define void @test2(i32 %N, i32 %M, i16 %Start) {			define void @test2(i32 %N, i32 %M, i16 %Start) {
	entry:			entry:
	br label %bb3			br label %bb3

	bb: ; preds = %bb3			bb: ; preds = %bb3
	[... 18 lines hidden ...]

test/Analysis/ScalarEvolution/trip-count14.ll

	[... 75 lines hidden ...]
	if.end:			if.end:
	%arrayidx = getelementptr i32, i32* %p, i32 %i.0			%arrayidx = getelementptr i32, i32* %p, i32 %i.0
	store i32 %i.0, i32* %arrayidx, align 4			store i32 %i.0, i32* %arrayidx, align 4
	%inc = add i32 %i.0, 1			%inc = add i32 %i.0, 1
	%cmp1 = icmp slt i32 %i.0, %add			%cmp1 = icmp slt i32 %i.0, %add
	br i1 %cmp1, label %do.body, label %do.end ; taken either 0 or 2 times			br i1 %cmp1, label %do.body, label %do.end ; taken either 0 or 2 times

	; CHECK-LABEL: Determining loop execution counts for: @s32_max2_unpredictable_exit			; CHECK-LABEL: Determining loop execution counts for: @s32_max2_unpredictable_exit
	; CHECK-NEXT: Loop %do.body: <multiple exits> backedge-taken count is (-1 + (-1 * ((-1 + (-1 * ((2 + %n) smax %n)) + %n) umax (-1 + (-1 * %x) + %n))))			; CHECK-NEXT: Loop %do.body: <multiple exits> backedge-taken count is (-1 + (-1 * ((-1 + (-1 * %x) + %n) umax (-1 + (-1 * ((2 + %n) smax %n)) + %n))))
	; CHECK-NEXT: Loop %do.body: max backedge-taken count is 2{{$}}			; CHECK-NEXT: Loop %do.body: max backedge-taken count is 2{{$}}

	do.end:			do.end:
	ret void			ret void
	}			}

	define void @u32_max1(i32 %n, i32* %p) {			define void @u32_max1(i32 %n, i32* %p) {
	entry:			entry:
	(71 lines hidden)
	if.end:			if.end:
	%arrayidx = getelementptr i32, i32* %p, i32 %i.0			%arrayidx = getelementptr i32, i32* %p, i32 %i.0
	store i32 %i.0, i32* %arrayidx, align 4			store i32 %i.0, i32* %arrayidx, align 4
	%inc = add i32 %i.0, 1			%inc = add i32 %i.0, 1
	%cmp1 = icmp ult i32 %i.0, %add			%cmp1 = icmp ult i32 %i.0, %add
	br i1 %cmp1, label %do.body, label %do.end ; taken either 0 or 2 times			br i1 %cmp1, label %do.body, label %do.end ; taken either 0 or 2 times

	; CHECK-LABEL: Determining loop execution counts for: @u32_max2_unpredictable_exit			; CHECK-LABEL: Determining loop execution counts for: @u32_max2_unpredictable_exit
	; CHECK-NEXT: Loop %do.body: <multiple exits> backedge-taken count is (-1 + (-1 * ((-1 + (-1 * ((2 + %n) umax %n)) + %n) umax (-1 + (-1 * %x) + %n))))			; CHECK-NEXT: Loop %do.body: <multiple exits> backedge-taken count is (-1 + (-1 * ((-1 + (-1 * %x) + %n) umax (-1 + (-1 * ((2 + %n) umax %n)) + %n))))
	; CHECK-NEXT: Loop %do.body: max backedge-taken count is 2{{$}}			; CHECK-NEXT: Loop %do.body: max backedge-taken count is 2{{$}}

	do.end:			do.end:
	ret void			ret void
	}			}
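
The two CHECK-NEXT changes in this file only swap the operands of the outer umax. As a quick sanity check, the sketch below evaluates both printed forms of the @s32_max2_unpredictable_exit count in wrapping 32-bit arithmetic. The helper names (smax32, umax32, termN, termX, backedgeOld, backedgeNew) are made up for illustration and are not LLVM APIs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Signed max over 32-bit bit patterns (models SCEV smax on i32).
static uint32_t smax32(uint32_t A, uint32_t B) {
  return static_cast<int32_t>(A) > static_cast<int32_t>(B) ? A : B;
}

// Unsigned max (models SCEV umax on i32).
static uint32_t umax32(uint32_t A, uint32_t B) { return std::max(A, B); }

// (-1 + (-1 * ((2 + %n) smax %n)) + %n), with wrapping arithmetic.
static uint32_t termN(uint32_t N) { return N - smax32(N + 2u, N) - 1u; }

// (-1 + (-1 * %x) + %n), with wrapping arithmetic.
static uint32_t termX(uint32_t N, uint32_t X) { return N - X - 1u; }

// Operand order printed before the patch.
uint32_t backedgeOld(uint32_t N, uint32_t X) {
  return 0u - umax32(termN(N), termX(N, X)) - 1u;
}

// Operand order printed after the patch.
uint32_t backedgeNew(uint32_t N, uint32_t X) {
  return 0u - umax32(termX(N, X), termN(N)) - 1u;
}

// Compare both forms on a handful of boundary values.
void checkOperandOrderIsNeutral() {
  const uint32_t Samples[] = {0u, 1u, 2u, 7u,
                              0x7fffffffu, 0x80000000u, 0xffffffffu};
  for (uint32_t N : Samples)
    for (uint32_t X : Samples)
      assert(backedgeOld(N, X) == backedgeNew(N, X));
}
```

Since umax is commutative, the expected-output change here is a reordering of the printed expression only.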

test/Transforms/IRCE/conjunctive-checks.ll

	; RUN: opt -S -verify-loop-info -irce < %s | FileCheck %s			; RUN: opt -S -verify-loop-info -irce < %s | FileCheck %s
	; RUN: opt -S -verify-loop-info -passes='require<branch-prob>,loop(irce)' < %s | FileCheck %s			; RUN: opt -S -verify-loop-info -passes='require<branch-prob>,loop(irce)' < %s | FileCheck %s

	define void @f_0(i32 *%arr, i32 *%a_len_ptr, i32 %n, i1* %cond_buf) {			define void @f_0(i32 *%arr, i32 *%a_len_ptr, i32 %n, i1* %cond_buf) {
	; CHECK-LABEL: @f_0(			; CHECK-LABEL: @f_0(

	; CHECK: loop.preheader:			; CHECK: loop.preheader:
	; CHECK: [[not_safe_range_end:[^ ]+]] = sub i32 3, %len
	; CHECK: [[not_n:[^ ]+]] = sub i32 -1, %n			; CHECK: [[not_n:[^ ]+]] = sub i32 -1, %n
	; CHECK: [[not_exit_main_loop_at_hiclamp_cmp:[^ ]+]] = icmp sgt i32 [[not_safe_range_end]], [[not_n]]			; CHECK: [[not_safe_range_end:[^ ]+]] = sub i32 3, %len
	; CHECK: [[not_exit_main_loop_at_hiclamp:[^ ]+]] = select i1 [[not_exit_main_loop_at_hiclamp_cmp]], i32 [[not_safe_range_end]], i32 [[not_n]]			; CHECK: [[not_exit_main_loop_at_hiclamp_cmp:[^ ]+]] = icmp sgt i32 [[not_n]], [[not_safe_range_end]]
				; CHECK: [[not_exit_main_loop_at_hiclamp:[^ ]+]] = select i1 [[not_exit_main_loop_at_hiclamp_cmp]], i32 [[not_n]], i32 [[not_safe_range_end]]
	; CHECK: [[exit_main_loop_at_hiclamp:[^ ]+]] = sub i32 -1, [[not_exit_main_loop_at_hiclamp]]			; CHECK: [[exit_main_loop_at_hiclamp:[^ ]+]] = sub i32 -1, [[not_exit_main_loop_at_hiclamp]]
	; CHECK: [[exit_main_loop_at_loclamp_cmp:[^ ]+]] = icmp sgt i32 [[exit_main_loop_at_hiclamp]], 0			; CHECK: [[exit_main_loop_at_loclamp_cmp:[^ ]+]] = icmp sgt i32 [[exit_main_loop_at_hiclamp]], 0
	; CHECK: [[exit_main_loop_at_loclamp:[^ ]+]] = select i1 [[exit_main_loop_at_loclamp_cmp]], i32 [[exit_main_loop_at_hiclamp]], i32 0			; CHECK: [[exit_main_loop_at_loclamp:[^ ]+]] = select i1 [[exit_main_loop_at_loclamp_cmp]], i32 [[exit_main_loop_at_hiclamp]], i32 0
	; CHECK: [[enter_main_loop:[^ ]+]] = icmp slt i32 0, [[exit_main_loop_at_loclamp]]			; CHECK: [[enter_main_loop:[^ ]+]] = icmp slt i32 0, [[exit_main_loop_at_loclamp]]
	; CHECK: br i1 [[enter_main_loop]], label %loop.preheader2, label %main.pseudo.exit			; CHECK: br i1 [[enter_main_loop]], label %loop.preheader2, label %main.pseudo.exit

	; CHECK: loop.preheader2:			; CHECK: loop.preheader2:
	; CHECK: br label %loop			; CHECK: br label %loop
	(91 lines hidden)

test/Transforms/IRCE/ranges_of_different_types.ll

	(145 lines hidden)
	; %exit.mainloop.at = 101			; %exit.mainloop.at = 101

	define void @test_03(i32* %arr, i32* %a_len_ptr) #0 {			define void @test_03(i32* %arr, i32* %a_len_ptr) #0 {

	; CHECK-LABEL: test_03(			; CHECK-LABEL: test_03(
	; CHECK-NOT: preloop			; CHECK-NOT: preloop
	; CHECK: entry:			; CHECK: entry:
	; CHECK-NEXT: %len = load i32, i32* %a_len_ptr, !range !0			; CHECK-NEXT: %len = load i32, i32* %a_len_ptr, !range !0
	; CHECK-NEXT: [[SUB1:%[^ ]+]] = sub i32 -2, %len
	; CHECK-NEXT: [[SUB2:%[^ ]+]] = sub i32 -1, %len			; CHECK-NEXT: [[SUB2:%[^ ]+]] = sub i32 -1, %len
	; CHECK-NEXT: [[CMP1:%[^ ]+]] = icmp sgt i32 [[SUB2]], -14			; CHECK-NEXT: [[CMP1:%[^ ]+]] = icmp sgt i32 [[SUB2]], -14
	; CHECK-NEXT: [[SMAX1:%[^ ]+]] = select i1 [[CMP1]], i32 [[SUB2]], i32 -14			; CHECK-NEXT: [[SMAX1:%[^ ]+]] = select i1 [[CMP1]], i32 [[SUB2]], i32 -14
	; CHECK-NEXT: [[SUB3:%[^ ]+]] = sub i32 [[SUB1]], [[SMAX1]]			; CHECK-NEXT: [[SUB1:%[^ ]+]] = sub i32 -2, [[SMAX1]]
				; CHECK-NEXT: [[SUB3:%[^ ]+]] = sub i32 [[SUB1]], %len
	; CHECK-NEXT: [[CMP2:%[^ ]+]] = icmp ugt i32 [[SUB3]], -102			; CHECK-NEXT: [[CMP2:%[^ ]+]] = icmp ugt i32 [[SUB3]], -102
	; CHECK-NEXT: [[UMAX1:%[^ ]+]] = select i1 [[CMP2]], i32 [[SUB3]], i32 -102			; CHECK-NEXT: [[UMAX1:%[^ ]+]] = select i1 [[CMP2]], i32 [[SUB3]], i32 -102
	; CHECK-NEXT: %exit.mainloop.at = sub i32 -1, [[UMAX1]]			; CHECK-NEXT: %exit.mainloop.at = sub i32 -1, [[UMAX1]]
	; CHECK-NEXT: [[CMP3:%[^ ]+]] = icmp ult i32 0, %exit.mainloop.at			; CHECK-NEXT: [[CMP3:%[^ ]+]] = icmp ult i32 0, %exit.mainloop.at
	; CHECK-NEXT: br i1 [[CMP3]], label %loop.preheader, label %main.pseudo.exit			; CHECK-NEXT: br i1 [[CMP3]], label %loop.preheader, label %main.pseudo.exit
	; CHECK: postloop:			; CHECK: postloop:

	entry:			entry:
	(172 lines hidden)

	; Unsigned latch, Unsigned RC, negative offset. Same as test_03.			; Unsigned latch, Unsigned RC, negative offset. Same as test_03.
	define void @test_07(i32* %arr, i32* %a_len_ptr) #0 {			define void @test_07(i32* %arr, i32* %a_len_ptr) #0 {

	; CHECK-LABEL: test_07(			; CHECK-LABEL: test_07(
	; CHECK-NOT: preloop			; CHECK-NOT: preloop
	; CHECK: entry:			; CHECK: entry:
	; CHECK-NEXT: %len = load i32, i32* %a_len_ptr, !range !0			; CHECK-NEXT: %len = load i32, i32* %a_len_ptr, !range !0
	; CHECK-NEXT: [[SUB1:%[^ ]+]] = sub i32 -2, %len
	; CHECK-NEXT: [[SUB2:%[^ ]+]] = sub i32 -1, %len			; CHECK-NEXT: [[SUB2:%[^ ]+]] = sub i32 -1, %len
	; CHECK-NEXT: [[CMP1:%[^ ]+]] = icmp sgt i32 [[SUB2]], -14			; CHECK-NEXT: [[CMP1:%[^ ]+]] = icmp sgt i32 [[SUB2]], -14
	; CHECK-NEXT: [[SMAX1:%[^ ]+]] = select i1 [[CMP1]], i32 [[SUB2]], i32 -14			; CHECK-NEXT: [[SMAX1:%[^ ]+]] = select i1 [[CMP1]], i32 [[SUB2]], i32 -14
	; CHECK-NEXT: [[SUB3:%[^ ]+]] = sub i32 [[SUB1]], [[SMAX1]]			; CHECK-NEXT: [[SUB1:%[^ ]+]] = sub i32 -2, [[SMAX1]]
				; CHECK-NEXT: [[SUB3:%[^ ]+]] = sub i32 [[SUB1]], %len
	; CHECK-NEXT: [[CMP2:%[^ ]+]] = icmp ugt i32 [[SUB3]], -102			; CHECK-NEXT: [[CMP2:%[^ ]+]] = icmp ugt i32 [[SUB3]], -102
	; CHECK-NEXT: [[UMAX1:%[^ ]+]] = select i1 [[CMP2]], i32 [[SUB3]], i32 -102			; CHECK-NEXT: [[UMAX1:%[^ ]+]] = select i1 [[CMP2]], i32 [[SUB3]], i32 -102
	; CHECK-NEXT: %exit.mainloop.at = sub i32 -1, [[UMAX1]]			; CHECK-NEXT: %exit.mainloop.at = sub i32 -1, [[UMAX1]]
	; CHECK-NEXT: [[CMP3:%[^ ]+]] = icmp ult i32 0, %exit.mainloop.at			; CHECK-NEXT: [[CMP3:%[^ ]+]] = icmp ult i32 0, %exit.mainloop.at
	; CHECK-NEXT: br i1 [[CMP3]], label %loop.preheader, label %main.pseudo.exit			; CHECK-NEXT: br i1 [[CMP3]], label %loop.preheader, label %main.pseudo.exit
	; CHECK: loop			; CHECK: loop
	; CHECK: br i1 true, label %in.bounds			; CHECK: br i1 true, label %in.bounds
	; CHECK: postloop:			; CHECK: postloop:
	(69 lines hidden)

test/Transforms/IRCE/rc-negative-bound.ll

	(106 lines hidden)
	; RC against a value which is not known to be non-negative. Here we should			; RC against a value which is not known to be non-negative. Here we should
	; expand runtime checks against bound being positive or negative.			; expand runtime checks against bound being positive or negative.
	define void @test_03(i32 *%arr, i32 %n, i32 %bound) {			define void @test_03(i32 *%arr, i32 %n, i32 %bound) {
	; CHECK-LABEL: @test_03(			; CHECK-LABEL: @test_03(
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
	; CHECK-NEXT: [[FIRST_ITR_CHECK:%.*]] = icmp sgt i32 [[N:%.*]], 0			; CHECK-NEXT: [[FIRST_ITR_CHECK:%.*]] = icmp sgt i32 [[N:%.*]], 0
	; CHECK-NEXT: br i1 [[FIRST_ITR_CHECK]], label [[LOOP_PREHEADER:%.*]], label [[EXIT:%.*]]			; CHECK-NEXT: br i1 [[FIRST_ITR_CHECK]], label [[LOOP_PREHEADER:%.*]], label [[EXIT:%.*]]
	; CHECK: loop.preheader:			; CHECK: loop.preheader:
	; CHECK-NEXT: [[TMP0:%.*]] = add i32 [[BOUND:%.*]], -2147483647			; CHECK-NEXT: [[TMP0:%.*]] = sub i32 -1, [[BOUND:%.*]]
	; CHECK-NEXT: [[TMP1:%.*]] = icmp sgt i32 [[TMP0]], 0			; CHECK-NEXT: [[TMP1:%.*]] = icmp sgt i32 [[TMP0]], -1
	; CHECK-NEXT: [[SMAX:%.*]] = select i1 [[TMP1]], i32 [[TMP0]], i32 0			; CHECK-NEXT: [[SMAX:%.*]] = select i1 [[TMP1]], i32 [[TMP0]], i32 -1
	; CHECK-NEXT: [[TMP2:%.*]] = sub i32 [[BOUND]], [[SMAX]]			; CHECK-NEXT: [[TMP2:%.*]] = sub i32 -1, [[SMAX]]
	; CHECK-NEXT: [[TMP3:%.*]] = sub i32 -1, [[BOUND]]			; CHECK-NEXT: [[TMP3:%.*]] = icmp sgt i32 [[TMP2]], -1
	; CHECK-NEXT: [[TMP4:%.*]] = icmp sgt i32 [[TMP3]], -1			; CHECK-NEXT: [[SMAX1:%.*]] = select i1 [[TMP3]], i32 [[TMP2]], i32 -1
	; CHECK-NEXT: [[SMAX1:%.*]] = select i1 [[TMP4]], i32 [[TMP3]], i32 -1			; CHECK-NEXT: [[TMP4:%.*]] = add i32 [[SMAX1]], 1
	; CHECK-NEXT: [[TMP5:%.*]] = sub i32 -1, [[SMAX1]]			; CHECK-NEXT: [[TMP5:%.*]] = add i32 [[BOUND]], -2147483647
	; CHECK-NEXT: [[TMP6:%.*]] = icmp sgt i32 [[TMP5]], -1			; CHECK-NEXT: [[TMP6:%.*]] = icmp sgt i32 [[TMP5]], 0
	; CHECK-NEXT: [[SMAX2:%.*]] = select i1 [[TMP6]], i32 [[TMP5]], i32 -1			; CHECK-NEXT: [[SMAX2:%.*]] = select i1 [[TMP6]], i32 [[TMP5]], i32 0
	; CHECK-NEXT: [[TMP7:%.*]] = add i32 [[SMAX2]], 1			; CHECK-NEXT: [[TMP7:%.*]] = sub i32 [[BOUND]], [[SMAX2]]
	; CHECK-NEXT: [[TMP8:%.*]] = mul i32 [[TMP2]], [[TMP7]]			; CHECK-NEXT: [[TMP8:%.*]] = mul i32 [[TMP4]], [[TMP7]]
	; CHECK-NEXT: [[TMP9:%.*]] = sub i32 -1, [[TMP8]]			; CHECK-NEXT: [[TMP9:%.*]] = sub i32 -1, [[TMP8]]
	; CHECK-NEXT: [[TMP10:%.*]] = sub i32 -1, [[N]]			; CHECK-NEXT: [[TMP10:%.*]] = sub i32 -1, [[N]]
	; CHECK-NEXT: [[TMP11:%.*]] = icmp sgt i32 [[TMP9]], [[TMP10]]			; CHECK-NEXT: [[TMP11:%.*]] = icmp sgt i32 [[TMP9]], [[TMP10]]
	; CHECK-NEXT: [[SMAX3:%.*]] = select i1 [[TMP11]], i32 [[TMP9]], i32 [[TMP10]]			; CHECK-NEXT: [[SMAX3:%.*]] = select i1 [[TMP11]], i32 [[TMP9]], i32 [[TMP10]]
	; CHECK-NEXT: [[TMP12:%.*]] = sub i32 -1, [[SMAX3]]			; CHECK-NEXT: [[TMP12:%.*]] = sub i32 -1, [[SMAX3]]
	; CHECK-NEXT: [[TMP13:%.*]] = icmp sgt i32 [[TMP12]], 0			; CHECK-NEXT: [[TMP13:%.*]] = icmp sgt i32 [[TMP12]], 0
	; CHECK-NEXT: [[EXIT_MAINLOOP_AT:%.*]] = select i1 [[TMP13]], i32 [[TMP12]], i32 0			; CHECK-NEXT: [[EXIT_MAINLOOP_AT:%.*]] = select i1 [[TMP13]], i32 [[TMP12]], i32 0
	; CHECK-NEXT: [[TMP14:%.*]] = icmp slt i32 0, [[EXIT_MAINLOOP_AT]]			; CHECK-NEXT: [[TMP14:%.*]] = icmp slt i32 0, [[EXIT_MAINLOOP_AT]]
	(74 lines hidden)
	; CHECK-LABEL: @test_04(			; CHECK-LABEL: @test_04(
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
	; CHECK-NEXT: [[FIRST_ITR_CHECK:%.*]] = icmp sgt i32 [[N:%.*]], 0			; CHECK-NEXT: [[FIRST_ITR_CHECK:%.*]] = icmp sgt i32 [[N:%.*]], 0
	; CHECK-NEXT: br i1 [[FIRST_ITR_CHECK]], label [[LOOP_PREHEADER:%.*]], label [[EXIT:%.*]]			; CHECK-NEXT: br i1 [[FIRST_ITR_CHECK]], label [[LOOP_PREHEADER:%.*]], label [[EXIT:%.*]]
	; CHECK: loop.preheader:			; CHECK: loop.preheader:
	; CHECK-NEXT: [[TMP0:%.*]] = sub i32 -1, [[BOUND:%.*]]			; CHECK-NEXT: [[TMP0:%.*]] = sub i32 -1, [[BOUND:%.*]]
	; CHECK-NEXT: [[TMP1:%.*]] = icmp sgt i32 [[TMP0]], -1			; CHECK-NEXT: [[TMP1:%.*]] = icmp sgt i32 [[TMP0]], -1
	; CHECK-NEXT: [[SMAX:%.*]] = select i1 [[TMP1]], i32 [[TMP0]], i32 -1			; CHECK-NEXT: [[SMAX:%.*]] = select i1 [[TMP1]], i32 [[TMP0]], i32 -1
	; CHECK-NEXT: [[TMP2:%.*]] = add i32 [[BOUND]], [[SMAX]]			; CHECK-NEXT: [[TMP2:%.*]] = sub i32 -1, [[SMAX]]
	; CHECK-NEXT: [[TMP3:%.*]] = add i32 [[TMP2]], 1			; CHECK-NEXT: [[TMP3:%.*]] = icmp sgt i32 [[TMP2]], -1
	; CHECK-NEXT: [[TMP4:%.*]] = sub i32 -1, [[SMAX]]			; CHECK-NEXT: [[SMAX1:%.*]] = select i1 [[TMP3]], i32 [[TMP2]], i32 -1
	; CHECK-NEXT: [[TMP5:%.*]] = icmp sgt i32 [[TMP4]], -1			; CHECK-NEXT: [[TMP4:%.*]] = add i32 [[SMAX1]], 1
	; CHECK-NEXT: [[SMAX1:%.*]] = select i1 [[TMP5]], i32 [[TMP4]], i32 -1			; CHECK-NEXT: [[TMP5:%.*]] = add i32 [[BOUND]], [[SMAX]]
	; CHECK-NEXT: [[TMP6:%.*]] = add i32 [[SMAX1]], 1			; CHECK-NEXT: [[TMP6:%.*]] = add i32 [[TMP5]], 1
	; CHECK-NEXT: [[TMP7:%.*]] = mul i32 [[TMP3]], [[TMP6]]			; CHECK-NEXT: [[TMP7:%.*]] = mul i32 [[TMP4]], [[TMP6]]
	; CHECK-NEXT: [[TMP8:%.*]] = sub i32 -1, [[TMP7]]			; CHECK-NEXT: [[TMP8:%.*]] = sub i32 -1, [[TMP7]]
	; CHECK-NEXT: [[TMP9:%.*]] = sub i32 -1, [[N]]			; CHECK-NEXT: [[TMP9:%.*]] = sub i32 -1, [[N]]
	; CHECK-NEXT: [[TMP10:%.*]] = icmp ugt i32 [[TMP8]], [[TMP9]]			; CHECK-NEXT: [[TMP10:%.*]] = icmp ugt i32 [[TMP8]], [[TMP9]]
	; CHECK-NEXT: [[UMAX:%.*]] = select i1 [[TMP10]], i32 [[TMP8]], i32 [[TMP9]]			; CHECK-NEXT: [[UMAX:%.*]] = select i1 [[TMP10]], i32 [[TMP8]], i32 [[TMP9]]
	; CHECK-NEXT: [[EXIT_MAINLOOP_AT:%.*]] = sub i32 -1, [[UMAX]]			; CHECK-NEXT: [[EXIT_MAINLOOP_AT:%.*]] = sub i32 -1, [[UMAX]]
	; CHECK-NEXT: [[TMP11:%.*]] = icmp ult i32 0, [[EXIT_MAINLOOP_AT]]			; CHECK-NEXT: [[TMP11:%.*]] = icmp ult i32 0, [[EXIT_MAINLOOP_AT]]
	; CHECK-NEXT: br i1 [[TMP11]], label [[LOOP_PREHEADER2:%.*]], label [[MAIN_PSEUDO_EXIT:%.*]]			; CHECK-NEXT: br i1 [[TMP11]], label [[LOOP_PREHEADER2:%.*]], label [[MAIN_PSEUDO_EXIT:%.*]]
	; CHECK: loop.preheader2:			; CHECK: loop.preheader2:
	(174 lines hidden)
	; safely remove this check (see comments in the method			; safely remove this check (see comments in the method
	; computeSafeIterationSpace).			; computeSafeIterationSpace).
	define void @test_07(i32 *%arr, i32 %n, i32 %bound) {			define void @test_07(i32 *%arr, i32 %n, i32 %bound) {
	; CHECK-LABEL: @test_07(			; CHECK-LABEL: @test_07(
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
	; CHECK-NEXT: [[FIRST_ITR_CHECK:%.*]] = icmp sgt i32 [[N:%.*]], 0			; CHECK-NEXT: [[FIRST_ITR_CHECK:%.*]] = icmp sgt i32 [[N:%.*]], 0
	; CHECK-NEXT: br i1 [[FIRST_ITR_CHECK]], label [[LOOP_PREHEADER:%.*]], label [[EXIT:%.*]]			; CHECK-NEXT: br i1 [[FIRST_ITR_CHECK]], label [[LOOP_PREHEADER:%.*]], label [[EXIT:%.*]]
	; CHECK: loop.preheader:			; CHECK: loop.preheader:
	; CHECK-NEXT: [[TMP0:%.*]] = add i32 [[BOUND:%.*]], -2147483647			; CHECK-NEXT: [[TMP0:%.*]] = sub i32 -1, [[BOUND:%.*]]
	; CHECK-NEXT: [[TMP1:%.*]] = icmp sgt i32 [[TMP0]], 0			; CHECK-NEXT: [[TMP1:%.*]] = icmp sgt i32 [[TMP0]], -1
	; CHECK-NEXT: [[SMAX:%.*]] = select i1 [[TMP1]], i32 [[TMP0]], i32 0			; CHECK-NEXT: [[SMAX:%.*]] = select i1 [[TMP1]], i32 [[TMP0]], i32 -1
	; CHECK-NEXT: [[TMP2:%.*]] = sub i32 [[BOUND]], [[SMAX]]			; CHECK-NEXT: [[TMP2:%.*]] = sub i32 -1, [[SMAX]]
	; CHECK-NEXT: [[TMP3:%.*]] = sub i32 -1, [[BOUND]]			; CHECK-NEXT: [[TMP3:%.*]] = icmp sgt i32 [[TMP2]], -1
	; CHECK-NEXT: [[TMP4:%.*]] = icmp sgt i32 [[TMP3]], -1			; CHECK-NEXT: [[SMAX1:%.*]] = select i1 [[TMP3]], i32 [[TMP2]], i32 -1
	; CHECK-NEXT: [[SMAX1:%.*]] = select i1 [[TMP4]], i32 [[TMP3]], i32 -1			; CHECK-NEXT: [[TMP4:%.*]] = add i32 [[SMAX1]], 1
	; CHECK-NEXT: [[TMP5:%.*]] = sub i32 -1, [[SMAX1]]			; CHECK-NEXT: [[TMP5:%.*]] = add i32 [[BOUND]], -2147483647
	; CHECK-NEXT: [[TMP6:%.*]] = icmp sgt i32 [[TMP5]], -1			; CHECK-NEXT: [[TMP6:%.*]] = icmp sgt i32 [[TMP5]], 0
	; CHECK-NEXT: [[SMAX2:%.*]] = select i1 [[TMP6]], i32 [[TMP5]], i32 -1			; CHECK-NEXT: [[SMAX2:%.*]] = select i1 [[TMP6]], i32 [[TMP5]], i32 0
	; CHECK-NEXT: [[TMP7:%.*]] = add i32 [[SMAX2]], 1			; CHECK-NEXT: [[TMP7:%.*]] = sub i32 [[BOUND]], [[SMAX2]]
	; CHECK-NEXT: [[TMP8:%.*]] = mul i32 [[TMP2]], [[TMP7]]			; CHECK-NEXT: [[TMP8:%.*]] = mul i32 [[TMP4]], [[TMP7]]
	; CHECK-NEXT: [[TMP9:%.*]] = sub i32 -1, [[TMP8]]			; CHECK-NEXT: [[TMP9:%.*]] = sub i32 -1, [[TMP8]]
	; CHECK-NEXT: [[TMP10:%.*]] = sub i32 -1, [[N]]			; CHECK-NEXT: [[TMP10:%.*]] = sub i32 -1, [[N]]
	; CHECK-NEXT: [[TMP11:%.*]] = icmp sgt i32 [[TMP9]], [[TMP10]]			; CHECK-NEXT: [[TMP11:%.*]] = icmp sgt i32 [[TMP9]], [[TMP10]]
	; CHECK-NEXT: [[SMAX3:%.*]] = select i1 [[TMP11]], i32 [[TMP9]], i32 [[TMP10]]			; CHECK-NEXT: [[SMAX3:%.*]] = select i1 [[TMP11]], i32 [[TMP9]], i32 [[TMP10]]
	; CHECK-NEXT: [[TMP12:%.*]] = sub i32 -1, [[SMAX3]]			; CHECK-NEXT: [[TMP12:%.*]] = sub i32 -1, [[SMAX3]]
	; CHECK-NEXT: [[TMP13:%.*]] = icmp sgt i32 [[TMP12]], 0			; CHECK-NEXT: [[TMP13:%.*]] = icmp sgt i32 [[TMP12]], 0
	; CHECK-NEXT: [[EXIT_MAINLOOP_AT:%.*]] = select i1 [[TMP13]], i32 [[TMP12]], i32 0			; CHECK-NEXT: [[EXIT_MAINLOOP_AT:%.*]] = select i1 [[TMP13]], i32 [[TMP12]], i32 0
	; CHECK-NEXT: [[TMP14:%.*]] = icmp slt i32 0, [[EXIT_MAINLOOP_AT]]			; CHECK-NEXT: [[TMP14:%.*]] = icmp slt i32 0, [[EXIT_MAINLOOP_AT]]
	(76 lines hidden)
	; CHECK-LABEL: @test_08(			; CHECK-LABEL: @test_08(
	; CHECK-NEXT: entry:			; CHECK-NEXT: entry:
	; CHECK-NEXT: [[FIRST_ITR_CHECK:%.*]] = icmp sgt i32 [[N:%.*]], 0			; CHECK-NEXT: [[FIRST_ITR_CHECK:%.*]] = icmp sgt i32 [[N:%.*]], 0
	; CHECK-NEXT: br i1 [[FIRST_ITR_CHECK]], label [[LOOP_PREHEADER:%.*]], label [[EXIT:%.*]]			; CHECK-NEXT: br i1 [[FIRST_ITR_CHECK]], label [[LOOP_PREHEADER:%.*]], label [[EXIT:%.*]]
	; CHECK: loop.preheader:			; CHECK: loop.preheader:
	; CHECK-NEXT: [[TMP0:%.*]] = sub i32 -1, [[BOUND:%.*]]			; CHECK-NEXT: [[TMP0:%.*]] = sub i32 -1, [[BOUND:%.*]]
	; CHECK-NEXT: [[TMP1:%.*]] = icmp sgt i32 [[TMP0]], -1			; CHECK-NEXT: [[TMP1:%.*]] = icmp sgt i32 [[TMP0]], -1
	; CHECK-NEXT: [[SMAX:%.*]] = select i1 [[TMP1]], i32 [[TMP0]], i32 -1			; CHECK-NEXT: [[SMAX:%.*]] = select i1 [[TMP1]], i32 [[TMP0]], i32 -1
	; CHECK-NEXT: [[TMP2:%.*]] = add i32 [[BOUND]], [[SMAX]]			; CHECK-NEXT: [[TMP2:%.*]] = sub i32 -1, [[SMAX]]
	; CHECK-NEXT: [[TMP3:%.*]] = add i32 [[TMP2]], 1			; CHECK-NEXT: [[TMP3:%.*]] = icmp sgt i32 [[TMP2]], -1
	; CHECK-NEXT: [[TMP4:%.*]] = sub i32 -1, [[SMAX]]			; CHECK-NEXT: [[SMAX1:%.*]] = select i1 [[TMP3]], i32 [[TMP2]], i32 -1
	; CHECK-NEXT: [[TMP5:%.*]] = icmp sgt i32 [[TMP4]], -1			; CHECK-NEXT: [[TMP4:%.*]] = add i32 [[SMAX1]], 1
	; CHECK-NEXT: [[SMAX1:%.*]] = select i1 [[TMP5]], i32 [[TMP4]], i32 -1			; CHECK-NEXT: [[TMP5:%.*]] = add i32 [[BOUND]], [[SMAX]]
	; CHECK-NEXT: [[TMP6:%.*]] = add i32 [[SMAX1]], 1			; CHECK-NEXT: [[TMP6:%.*]] = add i32 [[TMP5]], 1
	; CHECK-NEXT: [[TMP7:%.*]] = mul i32 [[TMP3]], [[TMP6]]			; CHECK-NEXT: [[TMP7:%.*]] = mul i32 [[TMP4]], [[TMP6]]
	; CHECK-NEXT: [[TMP8:%.*]] = sub i32 -1, [[TMP7]]			; CHECK-NEXT: [[TMP8:%.*]] = sub i32 -1, [[TMP7]]
	; CHECK-NEXT: [[TMP9:%.*]] = sub i32 -1, [[N]]			; CHECK-NEXT: [[TMP9:%.*]] = sub i32 -1, [[N]]
	; CHECK-NEXT: [[TMP10:%.*]] = icmp ugt i32 [[TMP8]], [[TMP9]]			; CHECK-NEXT: [[TMP10:%.*]] = icmp ugt i32 [[TMP8]], [[TMP9]]
	; CHECK-NEXT: [[UMAX:%.*]] = select i1 [[TMP10]], i32 [[TMP8]], i32 [[TMP9]]			; CHECK-NEXT: [[UMAX:%.*]] = select i1 [[TMP10]], i32 [[TMP8]], i32 [[TMP9]]
	; CHECK-NEXT: [[EXIT_MAINLOOP_AT:%.*]] = sub i32 -1, [[UMAX]]			; CHECK-NEXT: [[EXIT_MAINLOOP_AT:%.*]] = sub i32 -1, [[UMAX]]
	; CHECK-NEXT: [[TMP11:%.*]] = icmp ult i32 0, [[EXIT_MAINLOOP_AT]]			; CHECK-NEXT: [[TMP11:%.*]] = icmp ult i32 0, [[EXIT_MAINLOOP_AT]]
	; CHECK-NEXT: br i1 [[TMP11]], label [[LOOP_PREHEADER2:%.*]], label [[MAIN_PSEUDO_EXIT:%.*]]			; CHECK-NEXT: br i1 [[TMP11]], label [[LOOP_PREHEADER2:%.*]], label [[MAIN_PSEUDO_EXIT:%.*]]
	; CHECK: loop.preheader2:			; CHECK: loop.preheader2:
	(68 lines hidden)

test/Transforms/IRCE/single-access-no-preloop.ll

	(80 lines hidden)

	exit:			exit:
	ret void			ret void
	}			}

	; CHECK-LABEL: @single_access_no_preloop_with_offset(			; CHECK-LABEL: @single_access_no_preloop_with_offset(

	; CHECK: loop.preheader:			; CHECK: loop.preheader:
	; CHECK: [[not_safe_range_end:[^ ]+]] = sub i32 3, %len
	; CHECK: [[not_n:[^ ]+]] = sub i32 -1, %n			; CHECK: [[not_n:[^ ]+]] = sub i32 -1, %n
	; CHECK: [[not_exit_main_loop_at_hiclamp_cmp:[^ ]+]] = icmp sgt i32 [[not_safe_range_end]], [[not_n]]			; CHECK: [[not_safe_range_end:[^ ]+]] = sub i32 3, %len
	; CHECK: [[not_exit_main_loop_at_hiclamp:[^ ]+]] = select i1 [[not_exit_main_loop_at_hiclamp_cmp]], i32 [[not_safe_range_end]], i32 [[not_n]]			; CHECK: [[not_exit_main_loop_at_hiclamp_cmp:[^ ]+]] = icmp sgt i32 [[not_n]], [[not_safe_range_end]]
				; CHECK: [[not_exit_main_loop_at_hiclamp:[^ ]+]] = select i1 [[not_exit_main_loop_at_hiclamp_cmp]], i32 [[not_n]], i32 [[not_safe_range_end]]
	; CHECK: [[exit_main_loop_at_hiclamp:[^ ]+]] = sub i32 -1, [[not_exit_main_loop_at_hiclamp]]			; CHECK: [[exit_main_loop_at_hiclamp:[^ ]+]] = sub i32 -1, [[not_exit_main_loop_at_hiclamp]]
	; CHECK: [[exit_main_loop_at_loclamp_cmp:[^ ]+]] = icmp sgt i32 [[exit_main_loop_at_hiclamp]], 0			; CHECK: [[exit_main_loop_at_loclamp_cmp:[^ ]+]] = icmp sgt i32 [[exit_main_loop_at_hiclamp]], 0
	; CHECK: [[exit_main_loop_at_loclamp:[^ ]+]] = select i1 [[exit_main_loop_at_loclamp_cmp]], i32 [[exit_main_loop_at_hiclamp]], i32 0			; CHECK: [[exit_main_loop_at_loclamp:[^ ]+]] = select i1 [[exit_main_loop_at_loclamp_cmp]], i32 [[exit_main_loop_at_hiclamp]], i32 0
	; CHECK: [[enter_main_loop:[^ ]+]] = icmp slt i32 0, [[exit_main_loop_at_loclamp]]			; CHECK: [[enter_main_loop:[^ ]+]] = icmp slt i32 0, [[exit_main_loop_at_loclamp]]
	; CHECK: br i1 [[enter_main_loop]], label %loop.preheader2, label %main.pseudo.exit			; CHECK: br i1 [[enter_main_loop]], label %loop.preheader2, label %main.pseudo.exit

	; CHECK: loop:			; CHECK: loop:
	; CHECK: br i1 true, label %in.bounds, label %out.of.bounds			; CHECK: br i1 true, label %in.bounds, label %out.of.bounds
	(150 lines hidden)

test/Transforms/IRCE/single-access-with-preloop.ll

	(43 lines hidden)


	; CHECK: [[len_minus_sint_max:[^ ]+]] = add i32 %len, -2147483647			; CHECK: [[len_minus_sint_max:[^ ]+]] = add i32 %len, -2147483647
	; CHECK: [[check_len_min_sint_offset:[^ ]+]] = icmp sgt i32 %offset, [[len_minus_sint_max]]			; CHECK: [[check_len_min_sint_offset:[^ ]+]] = icmp sgt i32 %offset, [[len_minus_sint_max]]
	; CHECK: [[safe_offset_mainloop:[^ ]+]] = select i1 [[check_len_min_sint_offset]], i32 %offset, i32 [[len_minus_sint_max]]			; CHECK: [[safe_offset_mainloop:[^ ]+]] = select i1 [[check_len_min_sint_offset]], i32 %offset, i32 [[len_minus_sint_max]]
	; CHECK: [[not_safe_start_2:[^ ]+]] = add i32 [[safe_offset_mainloop]], -1			; CHECK: [[not_safe_start_2:[^ ]+]] = add i32 [[safe_offset_mainloop]], -1
	; If Offset was a SINT_MIN, we could have an overflow here. That is why we calculated its safe version.			; If Offset was a SINT_MIN, we could have an overflow here. That is why we calculated its safe version.
	; CHECK: [[not_safe_upper_end:[^ ]+]] = sub i32 [[not_safe_start_2]], %len			; CHECK: [[not_safe_upper_end:[^ ]+]] = sub i32 [[not_safe_start_2]], %len
	; CHECK: [[not_exit_mainloop_at_cond_loclamp:[^ ]+]] = icmp sgt i32 [[not_safe_upper_end]], [[not_n]]
	; CHECK: [[not_exit_mainloop_at_loclamp:[^ ]+]] = select i1 [[not_exit_mainloop_at_cond_loclamp]], i32 [[not_safe_upper_end]], i32 [[not_n]]
	; CHECK: [[check_offset_mainloop_2:[^ ]+]] = icmp sgt i32 %offset, 0			; CHECK: [[check_offset_mainloop_2:[^ ]+]] = icmp sgt i32 %offset, 0
	; CHECK: [[safe_offset_mainloop_2:[^ ]+]] = select i1 [[check_offset_mainloop_2]], i32 %offset, i32 0			; CHECK: [[safe_offset_mainloop_2:[^ ]+]] = select i1 [[check_offset_mainloop_2]], i32 %offset, i32 0
	; CHECK: [[not_safe_lower_end:[^ ]+]] = add i32 [[safe_offset_mainloop_2]], -2147483648			; CHECK: [[not_safe_lower_end:[^ ]+]] = add i32 [[safe_offset_mainloop_2]], -2147483648
	; CHECK: [[not_exit_mainloop_at_cond_hiclamp:[^ ]+]] = icmp sgt i32 [[not_exit_mainloop_at_loclamp]], [[not_safe_lower_end]]			; CHECK: [[not_exit_mainloop_at_cond_loclamp:[^ ]+]] = icmp sgt i32 [[not_safe_upper_end]], [[not_safe_lower_end]]
	; CHECK: [[not_exit_mainloop_at_hiclamp:[^ ]+]] = select i1 [[not_exit_mainloop_at_cond_hiclamp]], i32 [[not_exit_mainloop_at_loclamp]], i32 [[not_safe_lower_end]]			; CHECK: [[not_exit_mainloop_at_loclamp:[^ ]+]] = select i1 [[not_exit_mainloop_at_cond_loclamp]], i32 [[not_safe_upper_end]], i32 [[not_safe_lower_end]]
				; CHECK: [[not_exit_mainloop_at_cond_hiclamp:[^ ]+]] = icmp sgt i32 [[not_exit_mainloop_at_loclamp]], [[not_n]]
				; CHECK: [[not_exit_mainloop_at_hiclamp:[^ ]+]] = select i1 [[not_exit_mainloop_at_cond_hiclamp]], i32 [[not_exit_mainloop_at_loclamp]], i32 [[not_n]]
	; CHECK: [[exit_mainloop_at_hiclamp:[^ ]+]] = sub i32 -1, [[not_exit_mainloop_at_hiclamp]]			; CHECK: [[exit_mainloop_at_hiclamp:[^ ]+]] = sub i32 -1, [[not_exit_mainloop_at_hiclamp]]
	; CHECK: [[exit_mainloop_at_cmp:[^ ]+]] = icmp sgt i32 [[exit_mainloop_at_hiclamp]], 0			; CHECK: [[exit_mainloop_at_cmp:[^ ]+]] = icmp sgt i32 [[exit_mainloop_at_hiclamp]], 0
	; CHECK: [[exit_mainloop_at:[^ ]+]] = select i1 [[exit_mainloop_at_cmp]], i32 [[exit_mainloop_at_hiclamp]], i32 0			; CHECK: [[exit_mainloop_at:[^ ]+]] = select i1 [[exit_mainloop_at_cmp]], i32 [[exit_mainloop_at_hiclamp]], i32 0

	; CHECK: mainloop:			; CHECK: mainloop:
	; CHECK: br label %loop			; CHECK: br label %loop

	; CHECK: loop:			; CHECK: loop:
	(27 lines hidden)

test/Transforms/LoadStoreVectorizer/X86/compare-scev-by-complexity.ll

This file was added.

				; RUN: opt -load-store-vectorizer %s -S | FileCheck %s

				; Check that setting wrapping flags after a SCEV node is created
				; does not invalidate "sorted by complexity" invariant for
				; operands of commutative and associative SCEV operators.

				target triple = "x86_64--"

				@global_value0 = external constant i32
				@global_value1 = external constant i32
				@other_value = external global float
				@a = external global float
				@b = external global float
				@c = external global float
				@d = external global float
				@plus1 = external global i32
				@cnd = external global i8

				; Function Attrs: nounwind
				define void @main() local_unnamed_addr #0 {
				; CHECK-LABEL: @main()
				; CHECK: [[PTR:%[0-9]+]] = bitcast float* %preheader.load0.address to <2 x float>*
				; CHECK: = load <2 x float>, <2 x float>* [[PTR]]
				; CHECK-LABEL: for.body23:
				entry:
				%tmp = load i32, i32* @global_value0, !range !0
				%tmp2 = load i32, i32* @global_value1
				%and.i.i = and i32 %tmp2, 2
				%add.nuw.nsw.i.i = add nuw nsw i32 %and.i.i, 0
				%mul.i.i = shl nuw nsw i32 %add.nuw.nsw.i.i, 1
				%and6.i.i = and i32 %tmp2, 3
				%and9.i.i = and i32 %tmp2, 4
				%add.nuw.nsw10.i.i = add nuw nsw i32 %and6.i.i, %and9.i.i
				%conv3.i42.i = add nuw nsw i32 %mul.i.i, 1
				%reass.add346.7 = add nuw nsw i32 %add.nuw.nsw10.i.i, 56
				%reass.mul347.7 = mul nuw nsw i32 %tmp, %reass.add346.7
				%add7.i.7 = add nuw nsw i32 %reass.mul347.7, 0
				%preheader.address0.idx = add nuw nsw i32 %add7.i.7, %mul.i.i
				%preheader.address0.idx.zext = zext i32 %preheader.address0.idx to i64
				%preheader.load0.address = getelementptr inbounds float, float* @other_value, i64 %preheader.address0.idx.zext
				%preheader.load0. = load float, float* %preheader.load0.address, align 4, !tbaa !1
				%common.address.idx = add nuw nsw i32 %add7.i.7, %conv3.i42.i
				%preheader.header.common.address.idx.zext = zext i32 %common.address.idx to i64
				%preheader.load1.address = getelementptr inbounds float, float* @other_value, i64 %preheader.header.common.address.idx.zext
				%preheader.load1. = load float, float* %preheader.load1.address, align 4, !tbaa !1
				br label %for.body23

				for.body23: ; preds = %for.body23, %entry
				%loop.header.load0.address = getelementptr inbounds float, float* @other_value, i64 %preheader.header.common.address.idx.zext
				%loop.header.load0. = load float, float* %loop.header.load0.address, align 4, !tbaa !1
				%reass.mul343.7 = mul nuw nsw i32 %reass.add346.7, 72
				%add7.i286.7.7 = add nuw nsw i32 %reass.mul343.7, 56
				%add9.i288.7.7 = add nuw nsw i32 %add7.i286.7.7, %mul.i.i
				%loop.header.address1.idx = add nuw nsw i32 %add9.i288.7.7, 1
				%loop.header.address1.idx.zext = zext i32 %loop.header.address1.idx to i64
				%loop.header.load1.address = getelementptr inbounds float, float* @other_value, i64 %loop.header.address1.idx.zext
				%loop.header.load1. = load float, float* %loop.header.load1.address, align 4, !tbaa !1
				store float %preheader.load0., float* @a, align 4, !tbaa !1
				store float %preheader.load1., float* @b, align 4, !tbaa !1
				store float %loop.header.load0., float* @c, align 4, !tbaa !1
				store float %loop.header.load1., float* @d, align 4, !tbaa !1
				%loaded.cnd = load i8, i8* @cnd
				%condition = trunc i8 %loaded.cnd to i1
				br i1 %condition, label %for.body23, label %exit

				exit:
				ret void
				}

				attributes #0 = { nounwind }

				!0 = !{i32 0, i32 65536}
				!1 = !{!2, !2, i64 0}
				!2 = !{!"float", !3, i64 0}
				!3 = !{!"omnipotent char", !4, i64 0}
				!4 = !{!"Simple C++ TBAA"}
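
The failure mode this regression test guards against can be illustrated with a toy model. The sketch below is not SCEV code: ToyExpr, lessWithFlags, lessIgnoringFlags, and buildOperandOrder are hypothetical names, and the comparator only loosely mimics ordering operands by a key that includes wrapping flags. It assumes flags may be attached either before the operands are sorted or after, as the summary describes for AddRecs.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <tuple>
#include <vector>

// Toy stand-in for an expression node. Flags can change after creation
// and are not part of the node's identity.
struct ToyExpr {
  int Kind;
  unsigned Flags;
  std::string Name;
};

// Ordering that consults the mutable flags (the problematic variant).
bool lessWithFlags(const ToyExpr &A, const ToyExpr &B) {
  return std::tie(A.Kind, A.Flags, A.Name) < std::tie(B.Kind, B.Flags, B.Name);
}

// Ordering that depends only on identity-defining fields.
bool lessIgnoringFlags(const ToyExpr &A, const ToyExpr &B) {
  return std::tie(A.Kind, A.Name) < std::tie(B.Kind, B.Name);
}

using Less = bool (*)(const ToyExpr &, const ToyExpr &);

// Sort two operands, then attach the flag to "a" if it was not set up front.
// Returns the operand names in their final stored order.
std::vector<std::string> buildOperandOrder(bool FlagsKnownUpFront, Less Cmp) {
  std::vector<ToyExpr> Ops = {{1, FlagsKnownUpFront ? 1u : 0u, "a"},
                              {1, 0u, "b"}};
  std::sort(Ops.begin(), Ops.end(), Cmp);
  std::vector<std::string> Order;
  for (ToyExpr &Op : Ops) {
    if (Op.Name == "a")
      Op.Flags = 1u; // late flag attachment; a no-op if already set
    Order.push_back(Op.Name);
  }
  return Order;
}

void checkToyModel() {
  // Same final operands and flags, but the stored order differs.
  assert(buildOperandOrder(true, lessWithFlags) !=
         buildOperandOrder(false, lessWithFlags));
  // Ignoring flags yields one canonical order regardless of timing.
  assert(buildOperandOrder(true, lessIgnoringFlags) ==
         buildOperandOrder(false, lessIgnoringFlags));
}
```

In the toy model, both construction paths end with identical operands and flags, yet only the flag-independent comparator gives them the same order, which is the property the summary says pointer-equality of SCEVs relies on.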
