Details

Reviewers

pete
atrick
sanjoy

Commits

rG99de88d1f3f0: [SCEV] Compute affine range in another way to avoid bitwidth extending.
rL297992: [SCEV] Compute affine range in another way to avoid bitwidth extending.

Summary

This approach has two major advantages over the existing one:

1) We don't need to extend bitwidth in our computations. Extending
   bitwidth is a big issue for compile time as we often end up working with
   APInts wider than 64 bits, which is a slow case for APInt.

2) When we zero extend a wrapped range, we lose some information (we
   replace the range with [0, 1 << src bit width)). Thus, avoiding such
   extensions better preserves information.
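
The information loss described in the second advantage can be illustrated without LLVM. The sketch below is only a model: `WrappedRange8` and `zextHull` are hypothetical names, not part of `ConstantRange`, and `lower == upper` is assumed to mean the full set. It zero-extends every member of a wrapped 8-bit range and reports the tightest non-wrapping range that covers the results.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical stand-in for an 8-bit half-open range [lower, upper).
// The range wraps past 255 when lower > upper; lower == upper is
// treated as the full set in this model.
struct WrappedRange8 {
  uint8_t lower;
  uint8_t upper;
};

// Zero-extend each member and return the tightest non-wrapping
// half-open range [lo, hi) that covers all of the extended values.
std::pair<uint32_t, uint32_t> zextHull(WrappedRange8 r) {
  uint32_t lo = UINT32_MAX;
  uint32_t hi = 0;
  uint8_t v = r.lower;
  do {
    uint32_t wide = v;  // zero extension of one member
    lo = std::min(lo, wide);
    hi = std::max(hi, wide + 1);
    v = static_cast<uint8_t>(v + 1);  // 8-bit increment, wraps 255 -> 0
  } while (v != r.upper);
  return {lo, hi};
}

// Small self-check of the two cases discussed in the note below.
void demoZextHull() {
  // Wrapped range [250, 5) holds only 11 values, yet its hull is [0, 256).
  assert(zextHull({250, 5}) == (std::pair<uint32_t, uint32_t>{0, 256}));
  // A non-wrapped range keeps its bounds.
  assert(zextHull({5, 250}) == (std::pair<uint32_t, uint32_t>{5, 250}));
}
```

The wrapped range `[250, 5)` contains 11 values, but after zero extension the covering range is `[0, 256)`, i.e. `[0, 1 << 8)`, which is exactly the conservative result the summary describes.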

Correctness testing:
I ran 'ninja check' with assertions that the new implementation of
getRangeForAffineAR gives the same results as the old one (this
functionality is not present in this patch). There were several failures -
I inspected them manually and found out that they all are caused by
the fact that we're returning more accurate results now (see bullet (2)
above).
Without such assertions 'ninja check' works just fine, as well as
SPEC2006.

Compile time testing:
CTMark/Os:

mafft/pairlocalalign -16.98%
tramp3d-v4/tramp3d-v4 -12.72%
lencod/lencod -11.51%
Bullet/bullet -4.36%
ClamAV/clamscan -3.66%
7zip/7zip-benchmark -3.19%
sqlite3/sqlite3 -2.95%
SPASS/SPASS -2.74%
Average -5.81%

Performance testing:
The changes are expected to be neutral for runtime performance. That
said, I plan to run some tests overnight to verify that.

Diff Detail

Repository: rL LLVM

Event Timeline

mzolotukhin created this revision. Feb 28 2017, 3:14 PM

Harbormaster completed remote builds in B4379: Diff 90092. Feb 28 2017, 3:14 PM

mzolotukhin edited the summary of this revision. Feb 28 2017, 3:16 PM

I haven't fully reviewed the code, but I'm really glad you were able to express the range without extending its width. Thanks for doing this.

lib/Analysis/ScalarEvolution.cpp
4833 ↗	(On Diff #90092)	This is probably fine: CheckBackDirection = StepSMin.isNegative();

As expected, runtime performance didn't change for SPECINT2006 (I tried -O3), but compile time improved across all the tests:

464.h264ref -11.13%
456.hmmer -7.72%
401.bzip2 -7.47%
445.gobmk -5.89%
473.astar -4.50%
462.libquantum -4.29%
429.mcf -4.24%
458.sjeng -3.38%
400.perlbench -3.28%
403.gcc -3.26%

It was not clear to me that this is safe around certain kinds of overflow -- if the questions I raised inline are invalid, please add comments / asserts making them obviously invalid. :)

lib/Analysis/ScalarEvolution.cpp
4683 ↗	(On Diff #90092)	`tripleUCompare` is not too descriptive -- what is it actually doing? Rather, what property of `A`, `B`, `C` is it testing for?
4716 ↗	(On Diff #90092)	Is it okay for this to overflow?
4762 ↗	(On Diff #90092)	Do you need to pass in `BackDirection` here? Can't you get the same information by checking the sign of `Step`?
4765 ↗	(On Diff #90092)	Same question here about the multiplication overflowing.
4842 ↗	(On Diff #90092)	I'd s/`CheckBackDirection`/`CheckBwdDirection`/ for consistency.
4844 ↗	(On Diff #90092)	Is this correct if `StepSMin` (and thus also `StepSMin.abs()`) is `INT_SMIN`?

This revision now requires changes to proceed. Mar 3 2017, 7:18 PM

Address review remarks.

Hi,

Thanks for the initial review, please find the patch updated.

Michael

lib/Analysis/ScalarEvolution.cpp
4683 ↗	(On Diff #90092)	I'm using `ConstantRange::contains` instead. I'd really appreciate more eyes here to check if I'm not missing anything.
4716 ↗	(On Diff #90092)	Fixed.
4762 ↗	(On Diff #90092)	Fixed.

Merge two similar functions together.

Mostly minor stuff. Requesting another iteration mostly because I too want to look at it again with fresh eyes to see if I missed anything.

lib/Analysis/ScalarEvolution.cpp
4691 ↗	(On Diff #90979)	How about just folding all of these together and changing the comment to match? `If either Step or MaxBECount is 0 or the start range is maximally conservative to begin with ..`
4696 ↗	(On Diff #90979)	I'd s/`BackDirection`/`Descending`/. Also, please add a justification for why this `abs` does the right thing if `Step` is `INT_SMIN` (or bail out in that case).
4702 ↗	(On Diff #90979)	Can we get the `APInt::getMaxValue(StartRange.getBitWidth()) == MaxBECount * Step` check here by changing `ult` to `ule`?
4715 ↗	(On Diff #90979)	I'd suggest not calling these `Min` or `Max` since they're not the minimum or maximum (signed or unsigned).
4717 ↗	(On Diff #90979)	Very minor stylistic thing (and I'll understand if you don't want to change it): I'd have written this as `BackDirection ? (StartMin - Offset) : (StartMax + Offset)`
4725 ↗	(On Diff #90979)	Again, these should not be called `Min` or `Max`.

This revision now requires changes to proceed. Mar 10 2017, 6:16 PM

Address review remarks.

Harbormaster completed remote builds in B4723: Diff 91487. Mar 11 2017, 8:10 PM

Add a comment about INT_SMIN.

Harbormaster completed remote builds in B4724: Diff 91488. Mar 11 2017, 8:32 PM

Requesting another iteration mostly because I too want to look at it again with fresh eyes to see if I missed anything.

Absolutely.

I updated the patch, please take a look when you are ready :)

Thanks,
Michael

lib/Analysis/ScalarEvolution.cpp
4691 ↗	(On Diff #90979)	I prefer to keep them separate for the following reason. In the first case we check whether the start range changes at all, i.e. we know for sure that the final value is the same as the original one because either Step is 0 or MaxBECount is 0. In the second case, in contrast, the value does change, but we cannot predict a new range because the original range is too conservative. The final range is then also a full range, but in some sense it's a different full range from the start range. I hope this explanation makes sense :)
4696 ↗	(On Diff #90979)	"I'd s/BackDirection/Descending/": I like `Descending` better too, thanks! "Also, please add a justification on why this abs does the right thing if Step is INT_SMIN": I tried to add a comment about it, but I'm not sure it's clear enough. I'm pretty sure we're doing the correct thing here thanks to the wrap-around behavior of APInt.
4702 ↗	(On Diff #90979)	I rechecked the formulas and I think the second one is not needed at all. If MaxBECount*Step == UINT_MAX, then we don't overflow. E.g. we can go from 0 to 255 with step=1 and MaxBECount=255 without a wrap.
4715 ↗	(On Diff #90979)	Thanks, fixed.
4717 ↗	(On Diff #90979)	Fixed.
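
The two arithmetic arguments in the replies above (the wrap-around `abs` of `INT_SMIN`, and the claim that `MaxBECount * Step == UINT_MAX` does not overflow) can be checked in plain 8-bit arithmetic. This is a sketch under the assumption that `uint8_t` behaves like an 8-bit `APInt`; `absWrapped` and `offsetOverflows` are hypothetical helper names, not LLVM functions.

```cpp
#include <cassert>
#include <cstdint>

// Two's-complement absolute value of an 8-bit pattern with wrap-around.
// For 0x80 (-128 as signed i8) the negation wraps back to 0x80, which
// reads as 128 when the result is interpreted as unsigned.
uint8_t absWrapped(uint8_t bits) {
  bool negative = (bits & 0x80) != 0;
  return negative ? static_cast<uint8_t>(0u - bits) : bits;
}

// Same shape as the check in the patch,
//   getMaxValue(BitWidth).udiv(Step).ult(MaxBECount),
// specialised to 8 bits. Returns true when step * maxBECount does not
// fit in 8 unsigned bits. Precondition: step != 0.
bool offsetOverflows(uint8_t step, uint8_t maxBECount) {
  return (UINT8_MAX / step) < maxBECount;
}

// Small self-check of the cases named in the review replies.
void demoReviewArithmetic() {
  assert(absWrapped(0x80) == 128);      // abs(INT_SMIN) is 128 as unsigned
  assert(!offsetOverflows(1, 255));     // 0 -> 255 with step 1: no wrap
  assert(offsetOverflows(2, 128));      // 2 * 128 == 256 does not fit
}
```

With `ult`, the product `1 * 255 == 255` is accepted as non-overflowing, so a separate equality check against `UINT_MAX` is not needed.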

lgtm

Do you mind adding some targeted test cases for this, especially around cases where you think our behavior may have changed?

That's fine to do later in a later commit though.

This revision is now accepted and ready to land. Mar 16 2017, 1:48 PM

Closed by commit rL297992: [SCEV] Compute affine range in another way to avoid bitwidth extending. (authored by mzolotukhin). Mar 16 2017, 2:19 PM

This revision was automatically updated to reflect the committed changes.

mzolotukhin marked 3 inline comments as done.

Diff 92055

llvm/trunk/lib/Analysis/ScalarEvolution.cpp

[4,707 lines not shown; enclosing context: if (const SCEVUnknown *U = dyn_cast<SCEVUnknown>(S)) {]
     }

     return setRange(U, SignHint, ConservativeResult);
   }

   return setRange(S, SignHint, ConservativeResult);
 }

+// Given a StartRange, Step and MaxBECount for an expression compute a range of
+// values that the expression can take. Initially, the expression has a value
+// from StartRange and then is changed by Step up to MaxBECount times. Signed
+// argument defines if we treat Step as signed or unsigned.
+static ConstantRange getRangeForAffineARHelper(APInt Step,
+                                               ConstantRange StartRange,
+                                               APInt MaxBECount,
+                                               unsigned BitWidth, bool Signed) {
+  // If either Step or MaxBECount is 0, then the expression won't change, and we
+  // just need to return the initial range.
+  if (Step == 0 || MaxBECount == 0)
+    return StartRange;
+
+  // If we don't know anything about the inital value (i.e. StartRange is
+  // FullRange), then we don't know anything about the final range either.
+  // Return FullRange.
+  if (StartRange.isFullSet())
+    return ConstantRange(BitWidth, /* isFullSet = */ true);
+
+  // If Step is signed and negative, then we use its absolute value, but we also
+  // note that we're moving in the opposite direction.
+  bool Descending = Signed && Step.isNegative();
+
+  if (Signed)
+    // This is correct even for INT_SMIN. Let's look at i8 to illustrate this:
+    // abs(INT_SMIN) = abs(-128) = abs(0x80) = -0x80 = 0x80 = 128.
+    // This equations hold true due to the well-defined wrap-around behavior of
+    // APInt.
+    Step = Step.abs();
+
+  // Check if Offset is more than full span of BitWidth. If it is, the
+  // expression is guaranteed to overflow.
+  if (APInt::getMaxValue(StartRange.getBitWidth()).udiv(Step).ult(MaxBECount))
+    return ConstantRange(BitWidth, /* isFullSet = */ true);
+
+  // Offset is by how much the expression can change. Checks above guarantee no
+  // overflow here.
+  APInt Offset = Step * MaxBECount;
+
+  // Minimum value of the final range will match the minimal value of StartRange
+  // if the expression is increasing and will be decreased by Offset otherwise.
+  // Maximum value of the final range will match the maximal value of StartRange
+  // if the expression is decreasing and will be increased by Offset otherwise.
+  APInt StartLower = StartRange.getLower();
+  APInt StartUpper = StartRange.getUpper() - 1;
+  APInt MovedBoundary =
+      Descending ? (StartLower - Offset) : (StartUpper + Offset);
+
+  // It's possible that the new minimum/maximum value will fall into the initial
+  // range (due to wrap around). This means that the expression can take any
+  // value in this bitwidth, and we have to return full range.
+  if (StartRange.contains(MovedBoundary))
+    return ConstantRange(BitWidth, /* isFullSet = */ true);
+
+  APInt NewLower, NewUpper;
+  if (Descending) {
+    NewLower = MovedBoundary;
+    NewUpper = StartUpper;
+  } else {
+    NewLower = StartLower;
+    NewUpper = MovedBoundary;
+  }
+
+  // If we end up with full range, return a proper full range.
+  if (NewLower == NewUpper + 1)
+    return ConstantRange(BitWidth, /* isFullSet = */ true);
+
+  // No overflow detected, return [StartLower, StartUpper + Offset + 1) range.
+  return ConstantRange(NewLower, NewUpper + 1);
+}
+
 ConstantRange ScalarEvolution::getRangeForAffineAR(const SCEV *Start,
                                                    const SCEV *Step,
                                                    const SCEV *MaxBECount,
                                                    unsigned BitWidth) {
   assert(!isa<SCEVCouldNotCompute>(MaxBECount) &&
          getTypeSizeInBits(MaxBECount->getType()) <= BitWidth &&
          "Precondition!");

-  ConstantRange Result(BitWidth, /* isFullSet = */ true);
-
-  // Check for overflow. This must be done with ConstantRange arithmetic
-  // because we could be called from within the ScalarEvolution overflow
-  // checking code.
-
   MaxBECount = getNoopOrZeroExtend(MaxBECount, Start->getType());
   ConstantRange MaxBECountRange = getUnsignedRange(MaxBECount);
-  ConstantRange ZExtMaxBECountRange = MaxBECountRange.zextOrTrunc(BitWidth * 2);
+  APInt MaxBECountValue = MaxBECountRange.getUnsignedMax();

-  ConstantRange StepSRange = getSignedRange(Step);
-  ConstantRange SExtStepSRange = StepSRange.sextOrTrunc(BitWidth * 2);
-
-  ConstantRange StartURange = getUnsignedRange(Start);
-  ConstantRange EndURange =
-      StartURange.add(MaxBECountRange.multiply(StepSRange));
-
-  // Check for unsigned overflow.
-  ConstantRange ZExtStartURange = StartURange.zextOrTrunc(BitWidth * 2);
-  ConstantRange ZExtEndURange = EndURange.zextOrTrunc(BitWidth * 2);
-  if (ZExtStartURange.add(ZExtMaxBECountRange.multiply(SExtStepSRange)) ==
-      ZExtEndURange) {
-    APInt Min = APIntOps::umin(StartURange.getUnsignedMin(),
-                               EndURange.getUnsignedMin());
-    APInt Max = APIntOps::umax(StartURange.getUnsignedMax(),
-                               EndURange.getUnsignedMax());
-    bool IsFullRange = Min.isMinValue() && Max.isMaxValue();
-    if (!IsFullRange)
-      Result =
-          Result.intersectWith(ConstantRange(Min, Max + 1));
-  }
-
+  // First, consider step signed.
   ConstantRange StartSRange = getSignedRange(Start);
-  ConstantRange EndSRange =
-      StartSRange.add(MaxBECountRange.multiply(StepSRange));
-
-  // Check for signed overflow. This must be done with ConstantRange
-  // arithmetic because we could be called from within the ScalarEvolution
-  // overflow checking code.
-  ConstantRange SExtStartSRange = StartSRange.sextOrTrunc(BitWidth * 2);
-  ConstantRange SExtEndSRange = EndSRange.sextOrTrunc(BitWidth * 2);
-  if (SExtStartSRange.add(ZExtMaxBECountRange.multiply(SExtStepSRange)) ==
-      SExtEndSRange) {
-    APInt Min =
-        APIntOps::smin(StartSRange.getSignedMin(), EndSRange.getSignedMin());
-    APInt Max =
-        APIntOps::smax(StartSRange.getSignedMax(), EndSRange.getSignedMax());
-    bool IsFullRange = Min.isMinSignedValue() && Max.isMaxSignedValue();
-    if (!IsFullRange)
-      Result =
-          Result.intersectWith(ConstantRange(Min, Max + 1));
-  }
-
-  return Result;
+  ConstantRange StepSRange = getSignedRange(Step);
+
+  // If Step can be both positive and negative, we need to find ranges for the
+  // maximum absolute step values in both directions and union them.
+  ConstantRange SR =
+      getRangeForAffineARHelper(StepSRange.getSignedMin(), StartSRange,
+                                MaxBECountValue, BitWidth, /* Signed = */ true);
+  SR = SR.unionWith(getRangeForAffineARHelper(StepSRange.getSignedMax(),
+                                              StartSRange, MaxBECountValue,
+                                              BitWidth, /* Signed = */ true));
+
+  // Next, consider step unsigned.
+  ConstantRange UR = getRangeForAffineARHelper(
+      getUnsignedRange(Step).getUnsignedMax(), getUnsignedRange(Start),
+      MaxBECountValue, BitWidth, /* Signed = */ false);
+
+  // Finally, intersect signed and unsigned ranges.
+  return SR.intersectWith(UR);
 }

 ConstantRange ScalarEvolution::getRangeViaFactoring(const SCEV *Start,
                                                     const SCEV *Step,
                                                     const SCEV *MaxBECount,
                                                     unsigned BitWidth) {
   // RangeOf({C?A:B,+,C?P:Q}) == RangeOf(C?{A,+,P}:{B,+,Q})
   // == RangeOf({A,+,P}) union RangeOf({B,+,Q})
[5,731 lines not shown]

llvm/trunk/test/Analysis/ScalarEvolution/zext-wrap.ll

 ; RUN: opt < %s -analyze -scalar-evolution | FileCheck %s
 ; PR4569

 define i16 @main() nounwind {
 entry:
   br label %bb.i

 bb.i: ; preds = %bb1.i, %bb.nph
+; We should be able to find the range for this expression.
+; CHECK: %l_95.0.i1 = phi i8
+; CHECK: --> {0,+,-1}<%bb.i> U: [2,1) S: [2,1){{ *}}Exits: 2
+
   %l_95.0.i1 = phi i8 [ %tmp1, %bb.i ], [ 0, %entry ]

 ; This cast shouldn't be folded into the addrec.
 ; CHECK: %tmp = zext i8 %l_95.0.i1 to i16
 ; CHECK: --> (zext i8 {0,+,-1}<nw><%bb.i> to i16){{ U: [^ ]+ S: [^ ]+}}{{ *}}Exits: 2

   %tmp = zext i8 %l_95.0.i1 to i16

   %tmp1 = add i8 %l_95.0.i1, -1
   %phitmp = icmp eq i8 %tmp1, 1
   br i1 %phitmp, label %bb1.i.func_36.exit_crit_edge, label %bb.i

 bb1.i.func_36.exit_crit_edge:
   ret i16 %tmp
 }
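
The new `CHECK` line in the test can be traced by hand through the helper. Below is a rough 8-bit port, offered as a sketch rather than the committed code: `Range8` and `affineRangeHelper8` are hypothetical names, `lower == upper` is assumed to stand for the full set (the empty set is not modelled), and the backedge-taken count of 254 is derived here from the loop in the test (the value runs 0, 255, ..., 2 before exiting), not stated in the review.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 8-bit model of a half-open range [lower, upper).
// lower == upper means "full set" in this sketch.
struct Range8 {
  uint8_t lower;
  uint8_t upper;
  bool isFull() const { return lower == upper; }
  bool contains(uint8_t v) const {
    if (isFull()) return true;
    if (lower < upper) return lower <= v && v < upper;
    return v >= lower || v < upper;  // wrapped range
  }
};

const Range8 kFull8{0, 0};

// 8-bit port of the steps of getRangeForAffineARHelper from the diff above.
Range8 affineRangeHelper8(uint8_t step, Range8 start, uint8_t maxBECount,
                          bool isSigned) {
  // Nothing changes if the step or the trip count is zero.
  if (step == 0 || maxBECount == 0) return start;
  // An unknown start gives an unknown result.
  if (start.isFull()) return kFull8;

  // Negative signed step: walk downwards by the absolute value.
  bool descending = isSigned && (step & 0x80) != 0;
  if (descending) step = static_cast<uint8_t>(0u - step);

  // step * maxBECount must fit in 8 bits, otherwise give up.
  if (UINT8_MAX / step < maxBECount) return kFull8;
  uint8_t offset = static_cast<uint8_t>(step * maxBECount);

  uint8_t startLower = start.lower;
  uint8_t startUpper = static_cast<uint8_t>(start.upper - 1);
  uint8_t moved = descending ? static_cast<uint8_t>(startLower - offset)
                             : static_cast<uint8_t>(startUpper + offset);

  // The moved boundary wrapped back into the start range: any value is possible.
  if (start.contains(moved)) return kFull8;

  uint8_t newLower = descending ? moved : startLower;
  uint8_t newUpper = descending ? startUpper : moved;
  if (newLower == static_cast<uint8_t>(newUpper + 1)) return kFull8;
  return Range8{newLower, static_cast<uint8_t>(newUpper + 1)};
}

// Trace of {0,+,-1} from zext-wrap.ll: start [0, 1), step -1 (0xFF).
void demoZextWrap() {
  Range8 sr = affineRangeHelper8(0xFF, Range8{0, 1}, 254, /*isSigned=*/true);
  assert(sr.lower == 2 && sr.upper == 1);  // matches S: [2,1)
  Range8 ur = affineRangeHelper8(0xFF, Range8{0, 1}, 254, /*isSigned=*/false);
  assert(ur.isFull());  // step 255 taken unsigned overflows immediately
}
```

In this trace the signed view yields `[2,1)` and the unsigned view yields the full set, so their intersection is `[2,1)`, which agrees with the `U: [2,1) S: [2,1)` expectation in the test.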

This is an archive of the discontinued LLVM Phabricator instance.

[SCEV] Compute affine range in another way to avoid bitwidth extending.
ClosedPublic
