This is an archive of the discontinued LLVM Phabricator instance.

Differential D25287

[SCEVAffinator] Make precise modular math more correct.
ClosedPublic

Authored by efriedma on Oct 5 2016, 11:15 AM.

Details

Reviewers

Meinersbur
jdoerfert

Commits

rG286c5a76ba33: [SCEVAffinator] Make precise modular math more correct.
rPLO284848: [SCEVAffinator] Make precise modular math more correct.
rL284848: [SCEVAffinator] Make precise modular math more correct.

Summary

Integer math in LLVM IR is modular. Integer math in isl is
arbitrary-precision. Modeling LLVM IR math correctly in isl requires
either adding assumptions that math doesn't actually overflow, or
explicitly wrapping the math. However, expressions with the "nsw" flag
are special; we can pretend they're arbitrary-precision because it's
undefined behavior if the result wraps. SCEV expressions based on IR
instructions with an nsw flag also carry an nsw flag (roughly; actually,
the real rule is a bit more complicated, but the details don't matter
here).
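To make "explicitly wrapping the math" concrete, here is a minimal standalone sketch of reducing an unbounded integer into the signed range of a narrow type. The helper name `wrapToSigned` is hypothetical and is not part of Polly or isl; it only illustrates the arithmetic, under the assumption that values are modelled as two's-complement signed integers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not a Polly or isl function): map an unbounded value
// into the signed range of a Width-bit integer, [-2^(Width-1), 2^(Width-1)),
// the way two's-complement arithmetic would wrap it.
int64_t wrapToSigned(int64_t Value, unsigned Width) {
  assert(Width >= 1 && Width <= 32 && "sketch only handles narrow types");
  const int64_t Modulus = int64_t{1} << Width;    // 2^Width
  const int64_t Half = int64_t{1} << (Width - 1); // 2^(Width-1)
  // Shift into [0, 2^Width). C++ '%' can yield a negative remainder for a
  // negative left operand, so correct for that case.
  int64_t Shifted = (Value + Half) % Modulus;
  if (Shifted < 0)
    Shifted += Modulus;
  return Shifted - Half;
}
```

For example, `wrapToSigned(127 + 1, 8)` yields `-128`, whereas the unbounded sum is `128`. The alternative described above is to leave the sum as `128` and record an assumption that this case never occurs.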

Before this patch, SCEV flags were also overloaded with an additional
function: the ZExt code was mutating SCEV expressions as a hack to
indicate to checkForWrapping that we don't need to add assumptions to
the operand of a ZExt; it'll add explicit wrapping itself. This kind of
works... the problem is that if anything else ever touches that SCEV
expression, it'll get confused by the incorrect flags.

Instead, with this patch, we make the decision about whether to
explicitly wrap the math a bit earlier, basing the decision purely on
the SCEV expression itself, and not its users.
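The decision described here can be sketched without any LLVM types. The struct `ExprInfo` and the function `shouldModelWrapping` below are hypothetical stand-ins, loosely modelled on `computeModuloForExpr` and the `MaxSmallBitWidth` threshold that appear in the diff on this page; this is a simplified illustration, not the committed code.

```cpp
#include <cassert>

// Hypothetical stand-in for the properties of a SCEV expression that the
// decision looks at. None of these names are LLVM API.
struct ExprInfo {
  unsigned BitWidth; // width of the expression's integer type
  bool IsNAry;       // add, mul, or add-recurrence style expression
  bool HasNSW;       // no-signed-wrap flag (only meaningful if IsNAry)
};

// Same threshold value as the constant in the diff.
constexpr unsigned MaxSmallBitWidth = 7;

// Returns true if wrapping should be modelled explicitly, false if the
// caller should instead assume the expression does not wrap.
bool shouldModelWrapping(const ExprInfo &E) {
  assert(E.BitWidth >= 1);
  // nsw expressions are assumed never to overflow.
  if (E.IsNAry && E.HasNSW)
    return false;
  // Otherwise only narrow types are tracked precisely.
  return E.BitWidth <= MaxSmallBitWidth;
}
```

The point of the design is that the answer depends only on the expression's own type and flags, so two different users of the same expression always get the same modelling.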

Diff Detail

Repository: rL LLVM

Event Timeline

efriedma updated this revision to Diff 73674. Oct 5 2016, 11:15 AM

efriedma retitled this revision to [SCEVAffinator] Make precise modular math more correct.

efriedma updated this object.

efriedma added reviewers: jdoerfert, Meinersbur.

efriedma set the repository for this revision to rL LLVM.

efriedma added subscribers: llvm-commits, pollydev.

Herald added a subscriber: sanjoy. Oct 5 2016, 11:15 AM

Hi Eli,

thanks for this patch! I can see the problem now and think your solution is the right way to fix it. Nevertheless, I added quite a few (mostly minor) comments that you might want to address or respond to.

Cheers,

Johannes

PS. Can we expect you to work on Polly more regularly now?

include/polly/Support/SCEVAffinator.h
117 ↗	(On Diff #73674)	I would rephrase this in a more direct way, e.g. True, if a possible wrap in expr is modeled, false otherwise.
lib/Support/SCEVAffinator.cpp
120 ↗	(On Diff #73674)	I think the commit message should state explicitly that this caused ScalarEvolution to use flags for SCEVs that we introduced only as a means to an end. The solution should also be mentioned more clearly: we avoid that by applying modulo semantics in the recursion now instead of afterwards.
250 ↗	(On Diff #73674)	Did you omit the `else` here on purpose? If we add modulo semantics there is no wrapping (by construction)
253 ↗	(On Diff #73674)	This is unrelated. Does isOne() evaluate to true for the value "i1 -1"?
262 ↗	(On Diff #73674)	Again this can be guarded by !isPrecise(..) can't it?
360 ↗	(On Diff #73674)	Comments are not adjusted.
372 ↗	(On Diff #73674)	These two lines look like leftover code, but I am not 100% sure.
test/ScopInfo/bool-addrec.ll
12 ↗	(On Diff #73674)	No triple pls.
test/ScopInfo/infeasible-rtc.ll
15 ↗	(On Diff #73674)	Why did you modify the original? Was it no longer rejected? Could we have both tests: the new one, and the old one with a different comment and check line, so people reading the patch later have more information?
test/ScopInfo/integers.ll
115 ↗	(On Diff #73674)	It is hard to verify this domain without the {assumed_,invalid_,}context of the scop. Could you add those to the test or at least the review? Currently all I know is that the domain shown above doesn't make sense, as it contains the following points (note the nsw for %indvars and the i0 > 3):
[n] -> { Stmt_bb[i0] : n = -1 and i0 = 3 }
[n] -> { Stmt_bb[i0] : n = -1 and i0 = 1 }
[n] -> { Stmt_bb[i0] : n = -1 and i0 = 4 }
[n] -> { Stmt_bb[i0] : n = 2 and i0 = 5 }
[n] -> { Stmt_bb[i0] : n = 1 and i0 = 6 }
[n] -> { Stmt_bb[i0] : n = 2 and i0 = 7 }
[n] -> { Stmt_bb[i0] : n = 3 and i0 = 0 }
[n] -> { Stmt_bb[i0] : n = 2 and i0 = 0 }
[n] -> { Stmt_bb[i0] : n = 1 and i0 = 0 }
[n] -> { Stmt_bb[i0] : n = 0 and i0 = 0 }
[n] -> { Stmt_bb[i0] : n = -1 and i0 = 0 }
[n] -> { Stmt_bb[i0] : n = -2 and i0 = 0 }
[n] -> { Stmt_bb[i0] : n = -3 and i0 = 0 }
[n] -> { Stmt_bb[i0] : n = -4 and i0 = 0 }
[n] -> { Stmt_bb[i0] : n = 2 and i0 = 6 }
[n] -> { Stmt_bb[i0] : n = 0 and i0 = 5 }
[n] -> { Stmt_bb[i0] : n = 1 and i0 = 5 }
[n] -> { Stmt_bb[i0] : n = 1 and i0 = 4 }
[n] -> { Stmt_bb[i0] : n = 0 and i0 = 4 }
[n] -> { Stmt_bb[i0] : n = 2 and i0 = 4 }
[n] -> { Stmt_bb[i0] : n = -2 and i0 = 3 }
[n] -> { Stmt_bb[i0] : n = 1 and i0 = 3 }
[n] -> { Stmt_bb[i0] : n = 0 and i0 = 3 }
[n] -> { Stmt_bb[i0] : n = 2 and i0 = 3 }
[n] -> { Stmt_bb[i0] : n = -1 and i0 = 2 }
[n] -> { Stmt_bb[i0] : n = -3 and i0 = 2 }
[n] -> { Stmt_bb[i0] : n = -2 and i0 = 2 }
[n] -> { Stmt_bb[i0] : n = 1 and i0 = 2 }
[n] -> { Stmt_bb[i0] : n = 0 and i0 = 2 }
[n] -> { Stmt_bb[i0] : n = 2 and i0 = 2 }
[n] -> { Stmt_bb[i0] : n = -4 and i0 = 1 }
[n] -> { Stmt_bb[i0] : n = -3 and i0 = 1 }
[n] -> { Stmt_bb[i0] : n = -2 and i0 = 1 }
[n] -> { Stmt_bb[i0] : n = 1 and i0 = 1 }
[n] -> { Stmt_bb[i0] : n = 0 and i0 = 1 }
[n] -> { Stmt_bb[i0] : n = 2 and i0 = 1 }
test/ScopInfo/zero_ext_of_truncate.ll
21 ↗	(On Diff #73674)	This looks like we could run into some regressions with this more aggressive representation of wrapping behaviour. Did you observe problems?
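On the `i1 -1` question and on zero-extension in general, the underlying issue is that one bit pattern has two readings. The sketch below uses hypothetical helpers (`readUnsigned` and `readSigned` are not LLVM functions) to show the two interpretations of the low bits of a value; it assumes two's-complement encoding.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: read the low Width bits of Bits as an unsigned number.
uint64_t readUnsigned(uint64_t Bits, unsigned Width) {
  assert(Width >= 1 && Width <= 63);
  return Bits & ((uint64_t{1} << Width) - 1);
}

// Hypothetical helper: read the low Width bits of Bits as a two's-complement
// signed number.
int64_t readSigned(uint64_t Bits, unsigned Width) {
  const uint64_t U = readUnsigned(Bits, Width);
  const uint64_t SignBit = uint64_t{1} << (Width - 1);
  if (U & SignBit) // sign bit set: subtract 2^Width to get the negative value
    return static_cast<int64_t>(U) - static_cast<int64_t>(uint64_t{1} << Width);
  return static_cast<int64_t>(U);
}
```

With a width of 1, the single bit pattern `1` reads as `1` unsigned and `-1` signed, which is why a comparison against "one" on the bit pattern also matches `i1 -1`. With a width of 7, the pattern `0x7F` reads as `-1` signed but `127` unsigned, which is the value a zero-extend has to produce.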

I am not familiar with this part of Polly. Could you summarize how SCEV flags are handled now and how it was before?

Thanks Johannes for the extensive review.

Yes, you can expect more polly patches from me in the near future. :)

Quick summary of what's going on with SCEV flags: integer math in LLVM IR is modular. Integer math in isl is arbitrary-precision. Modeling LLVM IR math correctly in isl requires either adding assumptions that math doesn't actually overflow, or explicitly wrapping the math. However, expressions with the "nsw" flag are special; we can pretend they're arbitrary-precision because it's undefined behavior if the result wraps. SCEV expressions based on IR instructions with an nsw flag also carry an nsw flag (roughly; actually, the real rule is a bit more complicated, but the details don't matter here).

Before this patch, SCEV flags were also overloaded with an additional function: the ZExt code was mutating SCEV expressions as a hack to indicate to checkForWrapping that we don't need to add assumptions to the operand of a ZExt; it'll add explicit wrapping itself. This kind of works... the problem is that if anything else ever touches that SCEV expression, it'll get confused by the incorrect flags.

Instead, with this patch, we make the decision about whether to explicitly wrap the math a bit earlier, basing the decision purely on the SCEV expression itself, and not its users.

(I'll adapt this description for the commit message.)

lib/Support/SCEVAffinator.cpp
250 ↗	(On Diff #73674)	I can. checkForWrapping is essentially a no-op on the result of addModuloSemantic, since the value can't actually be out of range.
253 ↗	(On Diff #73674)	Okay, I can separate it out. Yes, isOne() is true for i1 -1.
372 ↗	(On Diff #73674)	Yes, I think so too; I'll remove them.
test/ScopInfo/infeasible-rtc.ll
15 ↗	(On Diff #73674)	I modified it because the old test wasn't getting rejected: with this patch, we use modular math and correctly compute the trip count. Sure, I can keep around the old test, too, if you think it's appropriate.
test/ScopInfo/integers.ll
115 ↗	(On Diff #73674)	Full output follows. I think the domain is right if you ignore the nsw... but you're right, my patch isn't correctly honoring the nsw flag for very narrow addition operations. I'll fix that.
Printing analysis 'Polly - Create polyhedral description of Scops' for region: 'bb => return' in function 'f6':
Function: f6
Region: %bb---%return
Max Loop Depth: 1
Invariant Accesses: {
}
Context:
[n] -> { : -4 <= n <= 3 }
Assumed Context:
[n] -> { : }
Invalid Context:
[n] -> { : 1 = 0 }
p0: %n
Arrays {
i3 MemRef_a[]; // Element size 1
}
Arrays (Bounds as pw_affs) {
i3 MemRef_a[]; // Element size 1
}
Alias Groups (0):
n/a
Statements {
Stmt_bb
Domain :=
[n] -> { Stmt_bb[i0] : i0 >= 0 and 8*floor((5 + n)/8) <= 5 + n - i0 };
Schedule :=
[n] -> { Stmt_bb[i0] -> [i0] };
MustWriteAccess := [Reduction Type: NONE] [Scalar: 0]
[n] -> { Stmt_bb[i0] -> MemRef_a[i0] };
}
test/ScopInfo/zero_ext_of_truncate.ll
21 ↗	(On Diff #73674)	We have a related patch to avoid modeling truncates for loop-invariant expressions, which makes the complexity of truncate expressions less problematic (with or without this patch). Hopefully we'll get something posted soon. Beyond that, I probably should have tested a little more carefully before I posted this patch; this version is more aggressive than what we've been using, and it causes a problem on one of our internal tests. I might tweak the heuristic.
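The claim that the posted domain for `f6` is right "if you ignore the nsw" can be checked by brute force. The sketch below is a standalone approximation, not Polly code: `inPostedDomain` evaluates the constraint `i0 >= 0 and 8*floor((5 + n)/8) <= 5 + n - i0` from the output above, and `simulatedIterations` replays the `i3` loop from `integers.ll` with wrapping addition. All function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Floor division rounding toward negative infinity, as isl's floor() does.
int64_t floorDiv(int64_t A, int64_t B) {
  assert(B > 0);
  int64_t Q = A / B;
  if (A % B != 0 && A < 0)
    --Q;
  return Q;
}

// Membership test for the domain quoted in the review:
//   i0 >= 0 and 8*floor((5 + n)/8) <= 5 + n - i0
bool inPostedDomain(int64_t N, int64_t I0) {
  return I0 >= 0 && 8 * floorDiv(5 + N, 8) <= 5 + N - I0;
}

// Replay the f6 loop with 3-bit wrapping (nsw ignored): the store runs for
// indvar = 0, 1, ... and the loop exits once indvar == n - 3 as a 3-bit value.
int64_t simulatedIterations(int64_t N) {
  assert(N >= -4 && N <= 3); // the i3 parameter range from the Context
  const int64_t Target = ((N - 3) % 8 + 8) % 8; // n - 3 as a 3-bit pattern
  int64_t IndVar = 0, Count = 0;
  while (true) {
    ++Count;                   // the store executes
    if (IndVar == Target)      // exitcond: indvar == sub
      break;
    IndVar = (IndVar + 1) % 8; // the add wraps modulo 2^3
  }
  return Count;
}

// Number of i0 values for which (n, i0) lies in the posted domain.
int64_t domainSize(int64_t N) {
  int64_t Count = 0;
  for (int64_t I0 = 0; I0 < 16; ++I0) // 16 is a generous upper bound for i3
    if (inPostedDomain(N, I0))
      ++Count;
  return Count;
}
```

Under these assumptions the two counts agree for every `n` in `[-4, 3]`, and summing `domainSize` over that range gives 36 points, the same number of points as in the list in the review comment. Honoring the nsw flag would have to exclude the iterations with `i0 > 3`, which is the objection raised in the review.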

Thank you for the explanation. I will look at it again after the update.

lib/Support/SCEVAffinator.cpp
41 ↗	(On Diff #73674)	How is this magic number determined? Why is it constant, and why does it not depend on the type's width?
250 ↗	(On Diff #73674)	I don't understand why, when the type's bit-width is below a threshold, we always add modulo, i.e. ignore the nsw flag.

jdoerfert added inline comments. Oct 7 2016, 4:34 PM

lib/Support/SCEVAffinator.cpp
41 ↗	(On Diff #73674)	This is by definition a magic number that we compare against the type's bit-width. It was there before, and it stems from some observations about the way ScalarEvolution handles low bit-width modulo expressions, but that is basically it.
250 ↗	(On Diff #73674)	Right, we should not. addModuloSemantics should now probably be the only place where we check for the nsw flag and exit if it is present.

Updated to address comments.

I'm not particularly happy with killing off the special case for truncates, but we really need to be doing something better with them; handling them precisely is too complicated (causes performance problems).

Otherwise, I think I addressed all the review comments.
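For truncates that are not modelled precisely, the alternative is to assume the operand already fits in the narrower type and to record everything else as invalid. A minimal sketch of that range test follows; `fitsInSigned` is a hypothetical helper, not a function from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true if Value lies in the signed range of a Width-bit
// integer, [-2^(Width-1), 2^(Width-1)), so that truncating to Width bits
// leaves it unchanged.
bool fitsInSigned(int64_t Value, unsigned Width) {
  assert(Width >= 1 && Width <= 63);
  const int64_t Half = int64_t{1} << (Width - 1); // 2^(Width-1)
  return Value >= -Half && Value < Half;
}
```

For an 8-bit truncate the values that fail this test are `N <= -129` and `N >= 128`, which is the shape of the Invalid Context that the updated `truncate-1.ll` test in the diff expects.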

Looks almost good. I am not sure about the i1 thing and this one test case though. Does lnt pass?

lib/Support/SCEVAffinator.cpp
215 ↗	(On Diff #74454)	We have a helper function here getNoWrapFlags(Expr) but I do not care too much.
253 ↗	(On Diff #74454)	The fact that you can remove the guard and the tests still work is merely a coincidence, and this should not be committed this way. As soon as (for whatever reason) computeModuloForExpr(..) does not return true for an i1 expression, we will have a hard-to-find bug here.
test/ScopInfo/bool-addrec.ll
12 ↗	(On Diff #73674)	No triple pls.
test/ScopInfo/integers.ll
115 ↗	(On Diff #73674)	The domain in the test is different from the one you posted, isn't it? I'm confused, sry.
test/ScopInfo/truncate-1.ll
15 ↗	(On Diff #74454)	For now OK I guess.
test/ScopInfo/zero_ext_of_truncate.ll
12 ↗	(On Diff #74454)	True but that is not that easy to decide. If there was a second use of "%tmp" the situation might look different. Anyway, if you want to work on this we could see if we can come up with a scheme.

Updated for review comments.

efriedma updated this object. Oct 13 2016, 10:29 AM

Ping.

LGTM.

This revision is now accepted and ready to land. Oct 21 2016, 8:14 AM

Closed by commit rL284848: [SCEVAffinator] Make precise modular math more correct. (authored by efriedma). Oct 21 2016, 11:17 AM

This revision was automatically updated to reflect the committed changes.

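Revision Contents

Path	Size
polly/trunk/include/polly/Support/SCEVAffinator.h	4 lines
polly/trunk/lib/Support/SCEVAffinator.cpp	80 lines
polly/trunk/test/ScopInfo/infeasible-rtc.ll	35 lines
polly/trunk/test/ScopInfo/integers.ll	20 lines
polly/trunk/test/ScopInfo/truncate-1.ll	5 lines
polly/trunk/test/ScopInfo/truncate-2.ll	7 lines
polly/trunk/test/ScopInfo/zero_ext_of_truncate.ll	5 lines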

Diff 75453

polly/trunk/include/polly/Support/SCEVAffinator.h

Show First 20 Lines • Show All 107 Lines • ▼ Show 20 Lines	private:
/// If @p Expr might cause an integer wrap record an assumption.		/// If @p Expr might cause an integer wrap record an assumption.
///		///
/// @param Expr The SCEV expression that might wrap.		/// @param Expr The SCEV expression that might wrap.
/// @param PWAC The isl representation of @p Expr with the invalid domain.		/// @param PWAC The isl representation of @p Expr with the invalid domain.
///		///
/// @returns The isl representation @p PWAC with a posisbly adjusted domain.		/// @returns The isl representation @p PWAC with a posisbly adjusted domain.
__isl_give PWACtx checkForWrapping(const llvm::SCEV *Expr, PWACtx PWAC) const;		__isl_give PWACtx checkForWrapping(const llvm::SCEV *Expr, PWACtx PWAC) const;

		/// Whether to track the value of this expression precisely, rather than
		/// assuming it won't wrap.
		bool computeModuloForExpr(const llvm::SCEV *Expr);

__isl_give PWACtx visit(const llvm::SCEV *E);		__isl_give PWACtx visit(const llvm::SCEV *E);
__isl_give PWACtx visitConstant(const llvm::SCEVConstant *E);		__isl_give PWACtx visitConstant(const llvm::SCEVConstant *E);
__isl_give PWACtx visitTruncateExpr(const llvm::SCEVTruncateExpr *E);		__isl_give PWACtx visitTruncateExpr(const llvm::SCEVTruncateExpr *E);
__isl_give PWACtx visitZeroExtendExpr(const llvm::SCEVZeroExtendExpr *E);		__isl_give PWACtx visitZeroExtendExpr(const llvm::SCEVZeroExtendExpr *E);
__isl_give PWACtx visitSignExtendExpr(const llvm::SCEVSignExtendExpr *E);		__isl_give PWACtx visitSignExtendExpr(const llvm::SCEVSignExtendExpr *E);
__isl_give PWACtx visitAddExpr(const llvm::SCEVAddExpr *E);		__isl_give PWACtx visitAddExpr(const llvm::SCEVAddExpr *E);
__isl_give PWACtx visitMulExpr(const llvm::SCEVMulExpr *E);		__isl_give PWACtx visitMulExpr(const llvm::SCEVMulExpr *E);
__isl_give PWACtx visitUDivExpr(const llvm::SCEVUDivExpr *E);		__isl_give PWACtx visitUDivExpr(const llvm::SCEVUDivExpr *E);
Show All 12 Lines

polly/trunk/lib/Support/SCEVAffinator.cpp

Show All 30 Lines	cl::desc("Do not build run-time checks to proof absence of integer "
"wrapping"),		"wrapping"),
cl::Hidden, cl::ZeroOrMore, cl::init(false), cl::cat(PollyCategory));		cl::Hidden, cl::ZeroOrMore, cl::init(false), cl::cat(PollyCategory));

// The maximal number of basic sets we allow during the construction of a		// The maximal number of basic sets we allow during the construction of a
// piecewise affine function. More complex ones will result in very high		// piecewise affine function. More complex ones will result in very high
// compile time.		// compile time.
static int const MaxDisjunctionsInPwAff = 100;		static int const MaxDisjunctionsInPwAff = 100;

// The maximal number of bits for which a zero-extend is modeled precisely.		// The maximal number of bits for which a general expression is modeled
static unsigned const MaxZextSmallBitWidth = 7;		// precisely.
		static unsigned const MaxSmallBitWidth = 7;
// The maximal number of bits for which a truncate is modeled precisely.
static unsigned const MaxTruncateSmallBitWidth = 31;

/// Return true if a zero-extend from @p Width bits is precisely modeled.
static bool isPreciseZeroExtend(unsigned Width) {
return Width <= MaxZextSmallBitWidth;
}

/// Return true if a truncate from @p Width bits is precisely modeled.
static bool isPreciseTruncate(unsigned Width) {
return Width <= MaxTruncateSmallBitWidth;
}

/// Add the number of basic sets in @p Domain to @p User		/// Add the number of basic sets in @p Domain to @p User
static isl_stat addNumBasicSets(__isl_take isl_set *Domain,		static isl_stat addNumBasicSets(__isl_take isl_set *Domain,
__isl_take isl_aff *Aff, void *User) {		__isl_take isl_aff *Aff, void *User) {
auto *NumBasicSets = static_cast<unsigned *>(User);		auto *NumBasicSets = static_cast<unsigned *>(User);
*NumBasicSets += isl_set_n_basic_set(Domain);		*NumBasicSets += isl_set_n_basic_set(Domain);
isl_set_free(Domain);		isl_set_free(Domain);
isl_aff_free(Aff);		isl_aff_free(Aff);
Show All 32 Lines
}		}

static void combine(__isl_keep PWACtx &PWAC0, const __isl_take PWACtx &PWAC1,		static void combine(__isl_keep PWACtx &PWAC0, const __isl_take PWACtx &PWAC1,
isl_pw_aff *(Fn)(isl_pw_aff *, isl_pw_aff *)) {		isl_pw_aff *(Fn)(isl_pw_aff *, isl_pw_aff *)) {
PWAC0.first = Fn(PWAC0.first, PWAC1.first);		PWAC0.first = Fn(PWAC0.first, PWAC1.first);
PWAC0.second = isl_set_union(PWAC0.second, PWAC1.second);		PWAC0.second = isl_set_union(PWAC0.second, PWAC1.second);
}		}

/// Set the possible wrapping of @p Expr to @p Flags.
static const SCEV *setNoWrapFlags(ScalarEvolution &SE, const SCEV *Expr,
SCEV::NoWrapFlags Flags) {
auto *NAry = dyn_cast<SCEVNAryExpr>(Expr);
if (!NAry)
return Expr;

SmallVector<const SCEV *, 8> Ops(NAry->op_begin(), NAry->op_end());
switch (Expr->getSCEVType()) {
case scAddExpr:
return SE.getAddExpr(Ops, Flags);
case scMulExpr:
return SE.getMulExpr(Ops, Flags);
case scAddRecExpr:
return SE.getAddRecExpr(Ops, cast<SCEVAddRecExpr>(Expr)->getLoop(), Flags);
default:
return Expr;
}
}

static __isl_give isl_pw_aff *getWidthExpValOnDomain(unsigned Width,		static __isl_give isl_pw_aff *getWidthExpValOnDomain(unsigned Width,
__isl_take isl_set *Dom) {		__isl_take isl_set *Dom) {
auto *Ctx = isl_set_get_ctx(Dom);		auto *Ctx = isl_set_get_ctx(Dom);
auto *WidthVal = isl_val_int_from_ui(Ctx, Width);		auto *WidthVal = isl_val_int_from_ui(Ctx, Width);
auto *ExpVal = isl_val_2exp(WidthVal);		auto *ExpVal = isl_val_2exp(WidthVal);
return isl_pw_aff_val_on_domain(Dom, ExpVal);		return isl_pw_aff_val_on_domain(Dom, ExpVal);
}		}

▲ Show 20 Lines • Show All 106 Lines • ▼ Show 20 Lines	if (AddRec->getLoop() != L)
continue;		continue;
if (AddRec->getNoWrapFlags() & SCEV::FlagNSW)		if (AddRec->getNoWrapFlags() & SCEV::FlagNSW)
return true;		return true;
}		}

return false;		return false;
}		}

		bool SCEVAffinator::computeModuloForExpr(const SCEV *Expr) {
		unsigned Width = TD.getTypeSizeInBits(Expr->getType());
		// We assume nsw expressions never overflow.
		if (auto *NAry = dyn_cast<SCEVNAryExpr>(Expr))
		if (NAry->getNoWrapFlags() & SCEV::FlagNSW)
		return false;
		return Width <= MaxSmallBitWidth;
		}

__isl_give PWACtx SCEVAffinator::visit(const SCEV *Expr) {		__isl_give PWACtx SCEVAffinator::visit(const SCEV *Expr) {

auto Key = std::make_pair(Expr, BB);		auto Key = std::make_pair(Expr, BB);
PWACtx PWAC = CachedExpressions[Key];		PWACtx PWAC = CachedExpressions[Key];
if (PWAC.first)		if (PWAC.first)
return copyPWACtx(PWAC);		return copyPWACtx(PWAC);

auto ConstantAndLeftOverPair = extractConstantFactor(Expr, SE);		auto ConstantAndLeftOverPair = extractConstantFactor(Expr, SE);
Show All 10 Lines	if (isl_id *Id = S->getIdForParam(Expr)) {

isl_set *Domain = isl_set_universe(isl_space_copy(Space));		isl_set *Domain = isl_set_universe(isl_space_copy(Space));
isl_aff *Affine = isl_aff_zero_on_domain(isl_local_space_from_space(Space));		isl_aff *Affine = isl_aff_zero_on_domain(isl_local_space_from_space(Space));
Affine = isl_aff_add_coefficient_si(Affine, isl_dim_param, 0, 1);		Affine = isl_aff_add_coefficient_si(Affine, isl_dim_param, 0, 1);

PWAC = getPWACtxFromPWA(isl_pw_aff_alloc(Domain, Affine));		PWAC = getPWACtxFromPWA(isl_pw_aff_alloc(Domain, Affine));
} else {		} else {
PWAC = SCEVVisitor<SCEVAffinator, PWACtx>::visit(Expr);		PWAC = SCEVVisitor<SCEVAffinator, PWACtx>::visit(Expr);
		if (computeModuloForExpr(Expr))
		PWAC.first = addModuloSemantic(PWAC.first, Expr->getType());
		else
PWAC = checkForWrapping(Expr, PWAC);		PWAC = checkForWrapping(Expr, PWAC);
}		}

if (!Factor->getType()->isIntegerTy(1))		if (!Factor->getType()->isIntegerTy(1)) {
combine(PWAC, visitConstant(Factor), isl_pw_aff_mul);		combine(PWAC, visitConstant(Factor), isl_pw_aff_mul);
		if (computeModuloForExpr(Key.first))
		PWAC.first = addModuloSemantic(PWAC.first, Expr->getType());
		}

// For compile time reasons we need to simplify the PWAC before we cache and		// For compile time reasons we need to simplify the PWAC before we cache and
// return it.		// return it.
PWAC.first = isl_pw_aff_coalesce(PWAC.first);		PWAC.first = isl_pw_aff_coalesce(PWAC.first);
		if (!computeModuloForExpr(Key.first))
PWAC = checkForWrapping(Key.first, PWAC);		PWAC = checkForWrapping(Key.first, PWAC);

CachedExpressions[Key] = copyPWACtx(PWAC);		CachedExpressions[Key] = copyPWACtx(PWAC);
return PWAC;		return PWAC;
}		}

__isl_give PWACtx SCEVAffinator::visitConstant(const SCEVConstant *Expr) {		__isl_give PWACtx SCEVAffinator::visitConstant(const SCEVConstant *Expr) {
ConstantInt *Value = Expr->getValue();		ConstantInt *Value = Expr->getValue();
isl_val *v;		isl_val *v;
Show All 21 Lines	SCEVAffinator::visitTruncateExpr(const SCEVTruncateExpr *Expr) {
// model them that way. However, for large types we assume the operand		// model them that way. However, for large types we assume the operand
// to fit in the new type size instead of introducing a modulo with a very		// to fit in the new type size instead of introducing a modulo with a very
// large constant.		// large constant.

auto *Op = Expr->getOperand();		auto *Op = Expr->getOperand();
auto OpPWAC = visit(Op);		auto OpPWAC = visit(Op);

unsigned Width = TD.getTypeSizeInBits(Expr->getType());		unsigned Width = TD.getTypeSizeInBits(Expr->getType());
bool Precise = isPreciseTruncate(Width);

if (Precise) {		if (computeModuloForExpr(Expr))
OpPWAC.first = addModuloSemantic(OpPWAC.first, Expr->getType());
return OpPWAC;		return OpPWAC;
}

auto *Dom = isl_pw_aff_domain(isl_pw_aff_copy(OpPWAC.first));		auto *Dom = isl_pw_aff_domain(isl_pw_aff_copy(OpPWAC.first));
auto *ExpPWA = getWidthExpValOnDomain(Width - 1, Dom);		auto *ExpPWA = getWidthExpValOnDomain(Width - 1, Dom);
auto *GreaterDom =		auto *GreaterDom =
isl_pw_aff_ge_set(isl_pw_aff_copy(OpPWAC.first), isl_pw_aff_copy(ExpPWA));		isl_pw_aff_ge_set(isl_pw_aff_copy(OpPWAC.first), isl_pw_aff_copy(ExpPWA));
auto *SmallerDom =		auto *SmallerDom =
isl_pw_aff_lt_set(isl_pw_aff_copy(OpPWAC.first), isl_pw_aff_neg(ExpPWA));		isl_pw_aff_lt_set(isl_pw_aff_copy(OpPWAC.first), isl_pw_aff_neg(ExpPWA));
auto *OutOfBoundsDom = isl_set_union(SmallerDom, GreaterDom);		auto *OutOfBoundsDom = isl_set_union(SmallerDom, GreaterDom);
▲ Show 20 Lines • Show All 44 Lines • ▼ Show 20 Lines	SCEVAffinator::visitZeroExtendExpr(const SCEVZeroExtendExpr *Expr) {
//		//
// We choose to go with a hybrid solution of all modeling techniques described		// We choose to go with a hybrid solution of all modeling techniques described
// above. For small bit-widths (up to MaxZextSmallBitWidth) we will model the		// above. For small bit-widths (up to MaxZextSmallBitWidth) we will model the
// wrapping explicitly and use a piecewise defined function. However, if the		// wrapping explicitly and use a piecewise defined function. However, if the
// bit-width is bigger than MaxZextSmallBitWidth we will employ overflow		// bit-width is bigger than MaxZextSmallBitWidth we will employ overflow
// assumptions and assume the "former negative" piece will not exist.		// assumptions and assume the "former negative" piece will not exist.

auto *Op = Expr->getOperand();		auto *Op = Expr->getOperand();
unsigned Width = TD.getTypeSizeInBits(Op->getType());

bool Precise = isPreciseZeroExtend(Width);

auto Flags = getNoWrapFlags(Op);
auto NoWrapFlags = ScalarEvolution::setFlags(Flags, SCEV::FlagNSW);
bool OpCanWrap = Precise && !(Flags & SCEV::FlagNSW);
if (OpCanWrap)
Op = setNoWrapFlags(SE, Op, NoWrapFlags);

auto OpPWAC = visit(Op);		auto OpPWAC = visit(Op);
if (OpCanWrap)
OpPWAC.first = addModuloSemantic(OpPWAC.first, Op->getType());

// If the width is to big we assume the negative part does not occur.		// If the width is to big we assume the negative part does not occur.
if (!Precise) {		if (!computeModuloForExpr(Op)) {
takeNonNegativeAssumption(OpPWAC);		takeNonNegativeAssumption(OpPWAC);
return OpPWAC;		return OpPWAC;
}		}

// If the width is small build the piece for the non-negative part and		// If the width is small build the piece for the non-negative part and
// the one for the negative part and unify them.		// the one for the negative part and unify them.
		unsigned Width = TD.getTypeSizeInBits(Op->getType());
interpretAsUnsigned(OpPWAC, Width);		interpretAsUnsigned(OpPWAC, Width);
return OpPWAC;		return OpPWAC;
}		}

__isl_give PWACtx		__isl_give PWACtx
SCEVAffinator::visitSignExtendExpr(const SCEVSignExtendExpr *Expr) {		SCEVAffinator::visitSignExtendExpr(const SCEVSignExtendExpr *Expr) {
// As all values are represented as signed, a sign extension is a noop.		// As all values are represented as signed, a sign extension is a noop.
return visit(Expr->getOperand());		return visit(Expr->getOperand());
▲ Show 20 Lines • Show All 172 Lines • Show Last 20 Lines

polly/trunk/test/ScopInfo/infeasible-rtc.ll

	; RUN: opt %loadPolly -polly-detect -analyze < %s \			; RUN: opt %loadPolly -polly-detect -analyze < %s \
	; RUN: \| FileCheck %s -check-prefix=DETECT			; RUN: \| FileCheck %s -check-prefix=DETECT

	; RUN: opt %loadPolly -polly-scops -analyze < %s \			; RUN: opt %loadPolly -polly-scops -analyze < %s \
	; RUN: \| FileCheck %s -check-prefix=SCOPS			; RUN: \| FileCheck %s -check-prefix=SCOPS

	; DETECT: Valid Region for Scop: header => exit			target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"
	; SCOPS-NOT: Region: %header---%exit
				; DETECT: Valid Region for Scop: test1.header => test1.exit
				; SCOPS-NOT: Region: %test1.header---%test1.exit

	; Verify that we detect this scop, but that, due to an infeasible run-time			; Verify that we detect this scop, but that, due to an infeasible run-time
	; check, we refuse to model it.			; check, we refuse to model it.

	target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"			define void @test(i64* %a) nounwind uwtable {
				preheader:
				br label %test1.header

				test1.header:
				%i = phi i56 [ 0, %preheader ], [ %i.1, %test1.header ]
				%tmp = zext i56 %i to i64
				%A.addr = getelementptr i64, i64* %a, i64 %tmp
				%A.load = load i64, i64* %A.addr, align 4
				%A.inc = zext i56 %i to i64
				%A.val = add nsw i64 %A.load, %A.inc
				store i64 %A.val, i64* %A.addr, align 4
				%i.1 = add i56 %i, 1
				%exitcond = icmp eq i56 %i.1, 0
				br i1 %exitcond, label %test1.exit, label %test1.header

				test1.exit:
				ret void
				}

				; Old version of the previous test; make sure we compute the trip count
				; correctly.

	@A = common global [128 x i32] zeroinitializer, align 16			; SCOPS: { Stmt_header[i0] : 0 <= i0 <= 127 };

	define void @test() nounwind uwtable {			define void @test2([128 x i32]* %a) nounwind uwtable {
	preheader:			preheader:
	br label %header			br label %header

	header:			header:
	%i = phi i7 [ 0, %preheader ], [ %i.1, %header ]			%i = phi i7 [ 0, %preheader ], [ %i.1, %header ]
	%tmp = zext i7 %i to i64			%tmp = zext i7 %i to i64
	%A.addr = getelementptr [128 x i32], [128 x i32]* @A, i64 0, i64 %tmp			%A.addr = getelementptr [128 x i32], [128 x i32]* %a, i64 0, i64 %tmp
	%A.load = load i32, i32* %A.addr, align 4			%A.load = load i32, i32* %A.addr, align 4
	%A.inc = zext i7 %i to i32			%A.inc = zext i7 %i to i32
	%A.val = add nsw i32 %A.load, %A.inc			%A.val = add nsw i32 %A.load, %A.inc
	store i32 %A.val, i32* %A.addr, align 4			store i32 %A.val, i32* %A.addr, align 4
	%i.1 = add i7 %i, 1			%i.1 = add i7 %i, 1
	%exitcond = icmp eq i7 %i.1, 0			%exitcond = icmp eq i7 %i.1, 0
	br i1 %exitcond, label %exit, label %header			br i1 %exitcond, label %exit, label %header

	exit:			exit:
	ret void			ret void
	}			}
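The `SCOPS` check for `@test2` expects the domain `0 <= i0 <= 127`, i.e. 128 iterations of a loop whose `i7` counter starts at 0 and exits when the incremented value wraps back to 0. A small simulation makes that count easy to confirm. `wrappingTripCount` is a hypothetical helper and only practical for narrow counters; for the `i56` counter in the first test the same reasoning gives 2^56 iterations, which this sketch does not attempt to run.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: count iterations of
//   i = 0; do { body; i = (i + 1) mod 2^Width; } while (i != 0);
// which mirrors the counter and exit condition of the test loops.
uint64_t wrappingTripCount(unsigned Width) {
  assert(Width >= 1 && Width <= 16 &&
         "simulation is only practical for narrow counters");
  const uint64_t Mask = (uint64_t{1} << Width) - 1;
  uint64_t I = 0, Trips = 0;
  do {
    ++Trips;            // loop body executes
    I = (I + 1) & Mask; // increment wraps modulo 2^Width
  } while (I != 0);     // exitcond: incremented counter == 0
  return Trips;
}
```

`wrappingTripCount(7)` returns 128, matching the expected domain. An analysis that assumed the increment never wraps would instead conclude that the exit condition can never become true.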

polly/trunk/test/ScopInfo/integers.ll

Show First 20 Lines • Show All 105 Lines • ▼ Show 20 Lines	entry:
br label %bb		br label %bb

bb:		bb:
%indvar = phi i3 [ 0, %entry ], [ %indvar.next, %bb ]		%indvar = phi i3 [ 0, %entry ], [ %indvar.next, %bb ]
%scevgep = getelementptr i3, i3* %a, i3 %indvar		%scevgep = getelementptr i3, i3* %a, i3 %indvar
store i3 %indvar, i3* %scevgep, align 8		store i3 %indvar, i3* %scevgep, align 8
%indvar.next = add nsw i3 %indvar, 1		%indvar.next = add nsw i3 %indvar, 1
%sub = sub i3 %n, 3		%sub = sub i3 %n, 3
; CHECK: 'bb => return' in function 'f6'		; CHECK-LABEL: 'bb => return' in function 'f6'
; CHECK: [n] -> { Stmt_bb[0] : n = 3 };		; CHECK: Context:
		; CHECK-NEXT: [n] -> { : -4 <= n <= 3 }
		; CHECK-NEXT: Assumed Context:
		; CHECK-NEXT: [n] -> { : }
		; CHECK-NEXT: Invalid Context:
		; CHECK-NEXT: [n] -> { : 1 = 0 }

		; CHECK: Statements {
		; CHECK-NEXT: Stmt_bb
		; CHECK-NEXT: Domain :=
		; CHECK-NEXT: [n] -> { Stmt_bb[i0] : i0 >= 0 and 8*floor((2 - n)/8) >= -5 - n + i0 and 8*floor((2 - n)/8) <= -2 - n };
		; CHECK-NEXT: Schedule :=
		; CHECK-NEXT: [n] -> { Stmt_bb[i0] -> [i0] };
		; CHECK-NEXT: MustWriteAccess := [Reduction Type: NONE] [Scalar: 0]
		; CHECK-NEXT: [n] -> { Stmt_bb[i0] -> MemRef_a[i0] };
		; CHECK-NEXT:}

%exitcond = icmp eq i3 %indvar, %sub		%exitcond = icmp eq i3 %indvar, %sub
br i1 %exitcond, label %return, label %bb		br i1 %exitcond, label %return, label %bb

return:		return:
ret void		ret void
}		}

polly/trunk/test/ScopInfo/truncate-1.ll

	; RUN: opt %loadPolly -polly-scops -analyze < %s \| FileCheck %s			; RUN: opt %loadPolly -polly-scops -analyze < %s \| FileCheck %s
	;			;
	; void f(char *A, short N) {			; void f(char *A, short N) {
	; for (char i = 0; i < (char)N; i++)			; for (char i = 0; i < (char)N; i++)
	; A[i]++;			; A[i]++;
	; }			; }
	;			;
				; FIXME: We should the truncate precisely... or just make it a separate parameter.
	; CHECK: Assumed Context:			; CHECK: Assumed Context:
	; CHECK-NEXT: [N] -> { : }			; CHECK-NEXT: [N] -> { : }
	; CHECK-NEXT: Invalid Context:			; CHECK-NEXT: Invalid Context:
	; CHECK-NEXT: [N] -> { : 1 = 0 }			; CHECK-NEXT: [N] -> { : N <= -129 or N >= 128 }
	;			;
	; CHECK: Domain :=			; CHECK: Domain :=
	; CHECK-NEXT: [N] -> { Stmt_for_body[i0] : i0 >= 0 and 256*floor((128 + N)/256) < N - i0 };			; CHECK-NEXT: [N] -> { Stmt_for_body[i0] : 0 <= i0 < N };
	;			;
	target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"			target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"

	define void @f(i8* %A, i16 signext %N) {			define void @f(i8* %A, i16 signext %N) {
	entry:			entry:
	br label %for.cond			br label %for.cond

	for.cond: ; preds = %for.inc, %entry			for.cond: ; preds = %for.inc, %entry
	Show All 23 Lines

polly/trunk/test/ScopInfo/truncate-2.ll

	; RUN: opt %loadPolly -polly-scops -analyze < %s \| FileCheck %s			; RUN: opt %loadPolly -polly-scops -analyze < %s \| FileCheck %s
	;			;
	; void f(char *A, short N) {			; void f(char *A, short N) {
	; for (short i = 0; i < N; i++)			; for (short i = 0; i < N; i++)
	; A[(char)(N)]++;			; A[(char)(N)]++;
	; }			; }
	;			;
				; FIXME: We should the truncate precisely... or just make it a separate parameter.
	; CHECK: Assumed Context:			; CHECK: Assumed Context:
	; CHECK-NEXT: [N] -> { : }			; CHECK-NEXT: [N] -> { : }
	; CHECK-NEXT: Invalid Context:			; CHECK-NEXT: Invalid Context:
	; CHECK-NEXT: [N] -> { : 1 = 0 }			; CHECK-NEXT: [N] -> { : N >= 128 }
	;			;
	; CHECK: ReadAccess := [Reduction Type: +] [Scalar: 0]			; CHECK: ReadAccess := [Reduction Type: +] [Scalar: 0]
	; CHECK-NEXT: [N] -> { Stmt_for_body[i0] -> MemRef_A[o0] : 256*floor((-N + o0)/256) = -N + o0 and -128 <= o0 <= 127 };			; CHECK-NEXT: [N] -> { Stmt_for_body[i0] -> MemRef_A[N] };
	; CHECK-NEXT: MustWriteAccess := [Reduction Type: +] [Scalar: 0]			; CHECK-NEXT: MustWriteAccess := [Reduction Type: +] [Scalar: 0]
	; CHECK-NEXT: [N] -> { Stmt_for_body[i0] -> MemRef_A[o0] : 256*floor((-N + o0)/256) = -N + o0 and -128 <= o0 <= 127 };			; CHECK-NEXT: [N] -> { Stmt_for_body[i0] -> MemRef_A[N] };
	;			;
	target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"			target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"

	define void @f(i8* %A, i16 signext %N) {			define void @f(i8* %A, i16 signext %N) {
	entry:			entry:
	br label %for.cond			br label %for.cond

	for.cond: ; preds = %for.inc, %entry			for.cond: ; preds = %for.inc, %entry
	Show All 19 Lines

polly/trunk/test/ScopInfo/zero_ext_of_truncate.ll

	; RUN: opt %loadPolly -polly-scops -analyze \			; RUN: opt %loadPolly -polly-scops -analyze \
	; RUN: -polly-invariant-load-hoisting=true < %s \| FileCheck %s			; RUN: -polly-invariant-load-hoisting=true < %s \| FileCheck %s
	;			;
	; void f(unsigned *restrict I, unsigned *restrict A, unsigned N, unsigned M) {			; void f(unsigned *restrict I, unsigned *restrict A, unsigned N, unsigned M) {
	; for (unsigned i = 0; i < N; i++) {			; for (unsigned i = 0; i < N; i++) {
	; unsigned char V = *I;			; unsigned char V = *I;
	; if (V < M)			; if (V < M)
	; A[i]++;			; A[i]++;
	; }			; }
	; }			; }
	;			;
				; FIXME: The truncated value should be a paramter.
	; CHECK: Assumed Context:			; CHECK: Assumed Context:
	; CHECK-NEXT: [N, tmp, M] -> { : }			; CHECK-NEXT: [N, tmp, M] -> { : }
	; CHECK-NEXT: Invalid Context:			; CHECK-NEXT: Invalid Context:
	; CHECK-NEXT: [N, tmp, M] -> { : N < 0 or (N > 0 and M < 0) or (N > 0 and 256*floor((128 + tmp)/256) > tmp) }			; CHECK-NEXT: [N, tmp, M] -> { : N < 0 or (N > 0 and tmp >= 128) or (N > 0 and tmp < 0) or (N > 0 and M < 0) }
	;			;
	; CHECK: Domain :=			; CHECK: Domain :=
	; CHECK-NEXT: [N, tmp, M] -> { Stmt_if_then[i0] : 0 <= i0 < N and 256*floor((128 + tmp)/256) > tmp - M };			; CHECK-NEXT: [N, tmp, M] -> { Stmt_if_then[i0] : M > tmp and 0 <= i0 < N };
	;			;
	target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"			target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"

	define void @f(i32* noalias %I, i32* noalias %A, i32 %N, i32 %M) {			define void @f(i32* noalias %I, i32* noalias %A, i32 %N, i32 %M) {
	entry:			entry:
	br label %for.cond			br label %for.cond

	for.cond: ; preds = %for.inc, %entry			for.cond: ; preds = %for.inc, %entry
	Show All 28 Lines