This is an archive of the discontinued LLVM Phabricator instance.

Differential D34207

[IndVarSimplify] Add AShr exact flags using induction variables ranges.
Closed, Public

Authored by dmgreen on Jun 14 2017, 8:45 AM.

Details

Reviewers

sanjoy

Commits

rGb26a0a460ca1: [IndVarSimplify] Add AShr exact flags using induction variables ranges.
rL307157: [IndVarSimplify] Add AShr exact flags using induction variables ranges.

Summary

This is an attempt to fix a regression we saw in benchmarks of FFT-like code after computeKnownBitsFromShiftOperator was made stricter with respect to possibly shifting past the bitwidth. It adds exact flags to AShr/LShr instructions where we can statically prove, using the range of the induction variable, that doing so is valid. This allows further optimisations to remove extra loads.
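
To make the condition in the summary concrete: a right shift by C is exact when the C bits it shifts out are all zero. In (X << IV) >> C the low IV bits of the shifted value are zero, so the shift is exact for every iteration whenever the smallest value the IV can take is at least C. The following is a standalone C++ sketch of that reasoning, not LLVM code; the helper names `shiftIsExact`, `canMarkExact` and `exactForAllIVs` are hypothetical, and the model assumes 32-bit values with shift amounts below 32.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: true if `value >> amount` discards only zero bits,
// which is the property the `exact` flag asserts. Assumes amount < 32.
bool shiftIsExact(uint32_t value, unsigned amount) {
  uint32_t lowMask = (1u << amount) - 1u;
  return (value & lowMask) == 0;
}

// Hypothetical model of the check: (X << IV) >> C may be marked exact when
// the unsigned minimum of the IV is at least C.
bool canMarkExact(unsigned ivUnsignedMin, unsigned shrAmount) {
  return ivUnsignedMin >= shrAmount;
}

// Brute-force confirmation over every in-range IV value and a few sample X.
bool exactForAllIVs(unsigned ivMin, unsigned shrAmount) {
  const uint32_t samples[] = {1u, 3u, 0x12345678u, 0xFFFFFFFFu};
  for (unsigned iv = ivMin; iv < 32; ++iv)
    for (uint32_t x : samples)
      if (!shiftIsExact(x << iv, shrAmount))
        return false;
  return true;
}
```

With an IV starting at 3 and a shift by 2 the brute-force check agrees that every shift is exact, while an IV starting at 2 with a shift by 3 fails on the first iteration (1 << 2 has a non-zero bit among its low three bits).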

Diff Detail

Repository: rL LLVM

Event Timeline

dmgreen created this revision. Jun 14 2017, 8:45 AM

Herald added a subscriber: sanjoy. Jun 14 2017, 8:45 AM

Ping. Anyone fancy taking a look at this, or suggesting a better place for it?

The definition of shifts past the bitwidth has recently been changed (or clarified?) to poison, not undef. I'm unsure how poison plays together with computeKnownBits, and whether the original cause of the regressions (back in https://reviews.llvm.org/rL297724) is now correct or not. It can still produce end to end compilation errors if it's reverted, which I guess is proof enough that it's needed! There may well be other ways to fix this though.

sanjoy requested changes to this revision. Jun 20 2017, 10:03 AM

sanjoy added inline comments.

lib/Transforms/Utils/SimplifyIndVar.cpp
529 ↗	(On Diff #102558)	Please clang-format the change.
532 ↗	(On Diff #102558)	LLVM style is `auto User` or `auto U` (I prefer the latter since there is also a `llvm::User`).
535 ↗	(On Diff #102558)	Use `llvm::PatternMatch` here. You can then fold in the `dyn_cast<ConstantInt>(AShr->getOperand(1))` check into the same patternmatch expression. If you do the `dyn_cast<ConstantInt>(AShr->getOperand(1))` check, then the `AShr->getOperand(0) != BO` check should not be necessary.
542 ↗	(On Diff #102558)	Do you really mean `getUpper` or do you want `getUnsignedMax`? Secondly, `BO` will always be an integer (as opposed to a vector of integers); since the induction variable is an integer. Thirdly, if I'm not missing something, the `IVRange.getUpper().uge(BO->getType()->getPrimitiveSizeInBits())` check should be unnecessary. If the IV is larger than the bitwidth then `BO` is poison, and so is `AShr` (and making it exact does not make it any more poison).
545 ↗	(On Diff #102558)	In cases like this, I think a more indented version is easier to read:
    if (IVOperand == BO->getOperand(1) && IVMax.ult(BitWidth))
      if (auto *AShrCI = dyn_cast<ConstantInt>(AShr->getOperand(1)))
        if (IVMin.uge(AShrCI->getValue())) {
          AShr->setIsExact(true);
          Changed = true;
        }
546 ↗	(On Diff #102558)	Perhaps you need `IVRange.getUnsignedMin()` instead of `IVRange.getLower()`?
test/Transforms/IndVarSimplify/strengthen-overflow.ll
116 ↗	(On Diff #102558)	Add a few more tests here. Check that (1) we don't do this transform when it is illegal (e.g. `shl` by 2, `ashr` by 3), and (2) we do the transform when the `shl` amount is strictly greater than the `ashr` amount. If I'm right about `shl`ing by more than the bitwidth above, then please also add a test case that checks that situation.
123 ↗	(On Diff #102558)	I think most of the load/store stuff here can be pruned.
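
The `getLower`/`getUpper` versus `getUnsignedMin`/`getUnsignedMax` distinction raised in the comments on lines 542 and 546 matters because a range of this kind is half-open and may wrap around zero, so its lower bound is not necessarily its smallest unsigned member. The sketch below is a toy 8-bit model written for illustration only; `ToyRange` and its members are hypothetical and simplified (for instance, `lower == upper` is treated as the full set), and it is not LLVM's `ConstantRange`.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for a half-open, possibly wrapping range [lower, upper)
// over 8-bit unsigned values. Simplifying assumption: lower == upper
// denotes the full set.
struct ToyRange {
  uint8_t lower;
  uint8_t upper;

  bool isFull() const { return lower == upper; }
  bool isWrapped() const { return lower > upper; }

  // Smallest unsigned value contained in the range.
  uint8_t unsignedMin() const {
    // The full set contains 0, and so does a wrapped range unless it
    // stops exactly at 0 (upper == 0), e.g. [250, 0) = {250..255}.
    if (isFull() || (isWrapped() && upper != 0))
      return 0;
    return lower;
  }

  // Largest unsigned value contained in the range.
  uint8_t unsignedMax() const {
    // Both the full set and any wrapped range contain 255.
    if (isFull() || isWrapped())
      return 255;
    return static_cast<uint8_t>(upper - 1);
  }
};
```

For the non-wrapping range [1, 9) the two notions agree on the minimum (both give 1), but for the wrapping range [250, 3) the lower bound is 250 while the unsigned minimum is 0, so a check written against the lower bound would wrongly conclude that the IV is always at least the shift amount.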

This revision now requires changes to proceed. Jun 20 2017, 10:03 AM

Hello, Thanks for the review. It sounds like you are happy with where this is, so I will upload a cleaned up version. Thanks for the pointers.

Unfortunately it looks like something else I need for this has broken over in r305481 :( I'll have to try and investigate that to see exactly why it's stopping the full case from optimising.

To this, I've added LShr's, as well as AShr's and the extra testing. Let me know if anything else needs changing, but no need to hurry as I need to look into that other thing now too.

lib/Transforms/Utils/SimplifyIndVar.cpp
542 ↗	(On Diff #102558)	This is my understanding, yes. But I thought the same for shifts in computeKnownBits, so I thought it best to be careful. In the case for the original test it was known what the upper bound for the loop is. Removing the need for this will only make this fire for more loops, which should be good.
test/Transforms/IndVarSimplify/strengthen-overflow.ll
123 ↗	(On Diff #102558)	Sounds good. This was a bit strategic, in case I had to explain where the benefits come in ;)

sanjoy accepted this revision. Jul 3 2017, 1:52 PM

sanjoy added inline comments.

lib/Transforms/Utils/SimplifyIndVar.cpp
33 ↗	(On Diff #103376)	Don't open this namespace globally -- just open it in the function you're using it in.
539 ↗	(On Diff #103376)	The name and documentation of this function is now out of sync. I'd instead just split out a separate `strengthenRightShift`, and put that logic in there.

This revision is now accepted and ready to land. Jul 3 2017, 1:52 PM

dmgreen updated this revision to Diff 105152. Jul 4 2017, 4:23 AM

dmgreen edited the summary of this revision.

Closed by commit rL307157: [IndVarSimplify] Add AShr exact flags using induction variables ranges. (authored by dmgreen). Jul 5 2017, 6:26 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path	Size
llvm/trunk/lib/Transforms/Utils/SimplifyIndVar.cpp	36 lines
llvm/trunk/test/Transforms/IndVarSimplify/strengthen-overflow.ll	84 lines

Diff 105262

llvm/trunk/lib/Transforms/Utils/SimplifyIndVar.cpp

Show All 19 Lines
#include "llvm/Analysis/LoopInfo.h"		#include "llvm/Analysis/LoopInfo.h"
#include "llvm/Analysis/LoopPass.h"		#include "llvm/Analysis/LoopPass.h"
#include "llvm/Analysis/ScalarEvolutionExpressions.h"		#include "llvm/Analysis/ScalarEvolutionExpressions.h"
#include "llvm/IR/DataLayout.h"		#include "llvm/IR/DataLayout.h"
#include "llvm/IR/Dominators.h"		#include "llvm/IR/Dominators.h"
#include "llvm/IR/IRBuilder.h"		#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/Instructions.h"		#include "llvm/IR/Instructions.h"
#include "llvm/IR/IntrinsicInst.h"		#include "llvm/IR/IntrinsicInst.h"
		#include "llvm/IR/PatternMatch.h"
#include "llvm/Support/Debug.h"		#include "llvm/Support/Debug.h"
#include "llvm/Support/raw_ostream.h"		#include "llvm/Support/raw_ostream.h"

using namespace llvm;		using namespace llvm;

#define DEBUG_TYPE "indvars"		#define DEBUG_TYPE "indvars"

STATISTIC(NumElimIdentity, "Number of IV identities eliminated");		STATISTIC(NumElimIdentity, "Number of IV identities eliminated");
Show All 39 Lines	public:

bool eliminateOverflowIntrinsic(CallInst *CI);		bool eliminateOverflowIntrinsic(CallInst *CI);
bool eliminateIVUser(Instruction *UseInst, Instruction *IVOperand);		bool eliminateIVUser(Instruction *UseInst, Instruction *IVOperand);
void eliminateIVComparison(ICmpInst *ICmp, Value *IVOperand);		void eliminateIVComparison(ICmpInst *ICmp, Value *IVOperand);
void eliminateIVRemainder(BinaryOperator *Rem, Value *IVOperand,		void eliminateIVRemainder(BinaryOperator *Rem, Value *IVOperand,
bool IsSigned);		bool IsSigned);
bool eliminateSDiv(BinaryOperator *SDiv);		bool eliminateSDiv(BinaryOperator *SDiv);
bool strengthenOverflowingOperation(BinaryOperator *OBO, Value *IVOperand);		bool strengthenOverflowingOperation(BinaryOperator *OBO, Value *IVOperand);
		bool strengthenRightShift(BinaryOperator *BO, Value *IVOperand);
};		};
}		}

/// Fold an IV operand into its use. This removes increments of an		/// Fold an IV operand into its use. This removes increments of an
/// aligned IV when used by a instruction that ignores the low bits.		/// aligned IV when used by a instruction that ignores the low bits.
///		///
/// IVOperand is guaranteed SCEVable, but UseInst may not be.		/// IVOperand is guaranteed SCEVable, but UseInst may not be.
///		///
▲ Show 20 Lines • Show All 487 Lines • ▼ Show 20 Lines	if (ExtendAfterOp == OpAfterExtend) {
SE->forgetValue(BO);		SE->forgetValue(BO);
Changed = true;		Changed = true;
}		}
}		}

return Changed;		return Changed;
}		}

		/// Annotate the Shr in (X << IVOperand) >> C as exact using the
		/// information from the IV's range. Returns true if anything changed, false
		/// otherwise.
		bool SimplifyIndvar::strengthenRightShift(BinaryOperator *BO,
		Value *IVOperand) {
		using namespace llvm::PatternMatch;

		if (BO->getOpcode() == Instruction::Shl) {
		bool Changed = false;
		ConstantRange IVRange = SE->getUnsignedRange(SE->getSCEV(IVOperand));
		for (auto *U : BO->users()) {
		const APInt *C;
		if (match(U,
		m_AShr(m_Shl(m_Value(), m_Specific(IVOperand)), m_APInt(C))) ||
		match(U,
		m_LShr(m_Shl(m_Value(), m_Specific(IVOperand)), m_APInt(C)))) {
		BinaryOperator *Shr = cast<BinaryOperator>(U);
		if (!Shr->isExact() && IVRange.getUnsignedMin().uge(*C)) {
		Shr->setIsExact(true);
		Changed = true;
		}
		}
		}
		return Changed;
		}

		return false;
		}

/// Add all uses of Def to the current IV's worklist.		/// Add all uses of Def to the current IV's worklist.
static void pushIVUsers(		static void pushIVUsers(
Instruction *Def,		Instruction *Def,
SmallPtrSet<Instruction*,16> &Simplified,		SmallPtrSet<Instruction*,16> &Simplified,
SmallVectorImpl< std::pair<Instruction*,Instruction*> > &SimpleIVUsers) {		SmallVectorImpl< std::pair<Instruction*,Instruction*> > &SimpleIVUsers) {

for (User *U : Def->users()) {		for (User *U : Def->users()) {
Instruction *UI = cast<Instruction>(U);		Instruction *UI = cast<Instruction>(U);
▲ Show 20 Lines • Show All 76 Lines • ▼ Show 20 Lines	if (!IVOperand)
continue;		continue;

if (eliminateIVUser(UseOper.first, IVOperand)) {		if (eliminateIVUser(UseOper.first, IVOperand)) {
pushIVUsers(IVOperand, Simplified, SimpleIVUsers);		pushIVUsers(IVOperand, Simplified, SimpleIVUsers);
continue;		continue;
}		}

if (BinaryOperator *BO = dyn_cast<BinaryOperator>(UseOper.first)) {		if (BinaryOperator *BO = dyn_cast<BinaryOperator>(UseOper.first)) {
if (isa<OverflowingBinaryOperator>(BO) &&		if ((isa<OverflowingBinaryOperator>(BO) &&
strengthenOverflowingOperation(BO, IVOperand)) {		strengthenOverflowingOperation(BO, IVOperand)) ||
		(isa<ShlOperator>(BO) && strengthenRightShift(BO, IVOperand))) {
// re-queue uses of the now modified binary operator and fall		// re-queue uses of the now modified binary operator and fall
// through to the checks that remain.		// through to the checks that remain.
pushIVUsers(IVOperand, Simplified, SimpleIVUsers);		pushIVUsers(IVOperand, Simplified, SimpleIVUsers);
}		}
}		}

CastInst *Cast = dyn_cast<CastInst>(UseOper.first);		CastInst *Cast = dyn_cast<CastInst>(UseOper.first);
if (V && Cast) {		if (V && Cast) {
Show All 35 Lines

llvm/trunk/test/Transforms/IndVarSimplify/strengthen-overflow.ll

	Show First 20 Lines • Show All 98 Lines • ▼ Show 20 Lines

	break:			break:
	ret i32 %civ.inc			ret i32 %civ.inc

	exit:			exit:
	ret i32 42			ret i32 42
	}			}

				define hidden void @test.shl.exact.equal() {
				; CHECK-LABEL: @test.shl.exact.equal
				entry:
				br label %for.body

				for.body:
				; CHECK-LABEL: for.body
				%k.021 = phi i32 [ 1, %entry ], [ %inc, %for.body ]
				%shl = shl i32 1, %k.021
				%shr1 = ashr i32 %shl, 1
				; CHECK: %shr1 = ashr exact i32 %shl, 1
				%shr2 = lshr i32 %shl, 1
				; CHECK: %shr2 = lshr exact i32 %shl, 1
				%inc = add nuw nsw i32 %k.021, 1
				%exitcond = icmp eq i32 %inc, 9
				br i1 %exitcond, label %for.end, label %for.body

				for.end:
				ret void
				}

				define hidden void @test.shl.exact.greater() {
				; CHECK-LABEL: @test.shl.exact.greater
				entry:
				br label %for.body

				for.body:
				; CHECK-LABEL: for.body
				%k.021 = phi i32 [ 3, %entry ], [ %inc, %for.body ]
				%shl = shl i32 1, %k.021
				%shr1 = ashr i32 %shl, 2
				; CHECK: %shr1 = ashr exact i32 %shl, 2
				%shr2 = lshr i32 %shl, 2
				; CHECK: %shr2 = lshr exact i32 %shl, 2
				%inc = add nuw nsw i32 %k.021, 1
				%exitcond = icmp eq i32 %inc, 9
				br i1 %exitcond, label %for.end, label %for.body

				for.end:
				ret void
				}

				define hidden void @test.shl.exact.unbound(i32 %arg) {
				; CHECK-LABEL: @test.shl.exact.unbound
				entry:
				br label %for.body

				for.body:
				; CHECK-LABEL: for.body
				%k.021 = phi i32 [ 2, %entry ], [ %inc, %for.body ]
				%shl = shl i32 1, %k.021
				%shr1 = ashr i32 %shl, 2
				; CHECK: %shr1 = ashr exact i32 %shl, 2
				%shr2 = lshr i32 %shl, 2
				; CHECK: %shr2 = lshr exact i32 %shl, 2
				%inc = add nuw nsw i32 %k.021, 1
				%exitcond = icmp eq i32 %inc, %arg
				br i1 %exitcond, label %for.end, label %for.body

				for.end:
				ret void
				}

				define hidden void @test.shl.nonexact() {
				; CHECK-LABEL: @test.shl.nonexact
				entry:
				br label %for.body

				for.body:
				; CHECK-LABEL: for.body
				%k.021 = phi i32 [ 2, %entry ], [ %inc, %for.body ]
				%shl = shl i32 1, %k.021
				%shr1 = ashr i32 %shl, 3
				; CHECK: %shr1 = ashr i32 %shl, 3
				%shr2 = lshr i32 %shl, 3
				; CHECK: %shr2 = lshr i32 %shl, 3
				%inc = add nuw nsw i32 %k.021, 1
				%exitcond = icmp eq i32 %inc, 9
				br i1 %exitcond, label %for.end, label %for.body

				for.end:
				ret void
				}

	!0 = !{i32 0, i32 2}			!0 = !{i32 0, i32 2}
	!1 = !{i32 0, i32 42}			!1 = !{i32 0, i32 42}