Details

Reviewers

grosser
eli.friedman
sanjoy
hfinkel

Commits

rGbfd7c38de785: [SimplifyIndvar] Replace the sdiv used by IV if we can prove both of its…
rL299118: [SimplifyIndvar] Replace the sdiv used by IV if we can prove both of its…

Summary

SCEV has no sdiv expression, so when an sdiv is a user of an induction variable, 'udiv' is a better canonical form than 'sdiv'.
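
The rewrite is only sound when both operands are known to be non-negative. A minimal standalone sketch of that condition follows (plain C++ on 32-bit values, not LLVM code; the helper names `sdivBits`, `udivBits`, `sameQuotient` and `agreeOnNonNegativeRange` are made up for illustration):

```cpp
#include <cassert>
#include <cstdint>

// Quotient of a signed division (truncating toward zero, like IR `sdiv`),
// returned as a raw 32-bit pattern so it can be compared with `udivBits`.
uint32_t sdivBits(int32_t N, int32_t D) {
  assert(D != 0 && "division by zero");
  assert(!(N == INT32_MIN && D == -1) && "signed overflow");
  return static_cast<uint32_t>(N / D);
}

// Quotient of an unsigned division on the same operand bit patterns
// (like IR `udiv`).
uint32_t udivBits(int32_t N, int32_t D) {
  assert(D != 0 && "division by zero");
  return static_cast<uint32_t>(N) / static_cast<uint32_t>(D);
}

// True when replacing the signed division by an unsigned one would not
// change the result for these particular operands.
bool sameQuotient(int32_t N, int32_t D) {
  return sdivBits(N, D) == udivBits(N, D);
}

// Exhaustively checks all N in [0, MaxN] and D in [1, MaxD].
bool agreeOnNonNegativeRange(int32_t MaxN, int32_t MaxD) {
  for (int32_t N = 0; N <= MaxN; ++N)
    for (int32_t D = 1; D <= MaxD; ++D)
      if (!sameQuotient(N, D))
        return false;
  return true;
}
```

For non-negative operands the two quotients coincide, while a negative numerator (`sameQuotient(-8, 2)`) or a negative divisor (`sameQuotient(64, -2)`) makes them differ. This is why the patch asks SCEV to prove non-negativity of both operands before rewriting.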

Diff Detail

Repository: rL LLVM

Event Timeline

etherzhhb created this revision. Mar 29 2017, 8:38 PM

etherzhhb added inline comments.

lib/Transforms/Utils/SimplifyIndVar.cpp
291–293	maybe I should also copy the 'exact' flag?

etherzhhb added a subscriber: llvm-commits. Mar 29 2017, 8:48 PM

Hi Hongbin,

I am not sure what you are trying to achieve here. This kernel is already optimized by -instcombine to

define void @test(i32* %a) {
entry:
  br label %for.body

for.body:                                         ; preds = %for.body, %entry
  %i.01 = phi i32 [ 0, %entry ], [ %inc, %for.body ]
  %div = shl nsw i32 %i.01, 5
  %idxprom = sext i32 %div to i64
  %arrayidx = getelementptr inbounds i32, i32* %a, i64 %idxprom
  store i32 %i.01, i32* %arrayidx, align 4
  %inc = add nsw i32 %i.01, 1
  %cmp = icmp slt i32 %i.01, 63
  br i1 %cmp, label %for.body, label %for.end

for.end:                                          ; preds = %for.body
  ret void
}

So the sdiv is replaced by a shl, taking into account the information that the sdiv is working on unsigned inputs only.

If you divide by a non-power-of-two, instcombine does not change the code and leaves the sdiv in. However, if you would like to move to a udiv even in this case, this should probably be added to instcombine.

I would also be interested to learn why an 'udiv' is a better canonical form than an 'sdiv'. Is it faster to execute (non-power-of-two) udivs on FPGAs than sdivs?
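
The comment above notes that the power-of-two fold relies on the dividend being non-negative. As a general illustration of that dependence (not a reconstruction of the instcombine fold itself), here is a small standalone C++ sketch; `shiftDiv`, `signedDiv` and `shiftMatchesSDiv` are hypothetical helpers, not LLVM functions:

```cpp
#include <cassert>
#include <cstdint>

// Divide by 2^K with a logical right shift on the raw bit pattern.
uint32_t shiftDiv(int32_t X, unsigned K) {
  assert(K < 31 && "1 << K must stay a positive int32_t");
  return static_cast<uint32_t>(X) >> K;
}

// Signed division by 2^K (truncating toward zero), as a raw bit pattern.
uint32_t signedDiv(int32_t X, unsigned K) {
  assert(K < 31 && "1 << K must stay a positive int32_t");
  return static_cast<uint32_t>(X / (int32_t{1} << K));
}

// True when the shift yields the same bits as the signed division.
bool shiftMatchesSDiv(int32_t X, unsigned K) {
  return shiftDiv(X, K) == signedDiv(X, K);
}
```

The two agree for non-negative `X` (for example `shiftMatchesSDiv(63, 1)`), but not for a negative dividend such as `shiftMatchesSDiv(-7, 1)`, where the shift sees a large unsigned value.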

Hi Tobias

In D31488#713840, @grosser wrote:

Hi Hongbin,

I am not sure what you are trying to achieve here. This kernel is already optimized by -instcombine to

This is based on the discussion in https://groups.google.com/d/topic/llvm-dev/HKK9ffG-dIU/discussion

define void @test(i32* %a) {
entry:
  br label %for.body

for.body:                                         ; preds = %for.body, %entry
  %i.01 = phi i32 [ 0, %entry ], [ %inc, %for.body ]
  %div = shl nsw i32 %i.01, 5
  %idxprom = sext i32 %div to i64
  %arrayidx = getelementptr inbounds i32, i32* %a, i64 %idxprom
  store i32 %i.01, i32* %arrayidx, align 4
  %inc = add nsw i32 %i.01, 1
  %cmp = icmp slt i32 %i.01, 63
  br i1 %cmp, label %for.body, label %for.end

for.end:                                          ; preds = %for.body
  ret void
}
So the sdiv is replaced by a shl, taking into account the information that the sdiv is working on unsigned inputs only.

If you divide by a non-power-of-two, instcombine does not change the code and leaves the sdiv in. However, if you would like to move to a udiv even in this case, this should probably be added to instcombine.

Yes, see below.

I would also be interested to learn why an 'udiv' is a better canonical form than an 'sdiv'. Is it faster to execute (non-power-of-two) udivs on FPGAs than sdivs?

I am focusing on the induction variable and its users here. I am trying to get an AddRec from arrayidx.
However, there is no sdiv in SCEV, and SCEV will give up when it sees an sdiv. From this perspective udiv is better than sdiv. I think I should explain this explicitly in the summary.

Let me also look at instcombine. The original motivation for putting it here is mainly that SimplifyIndVar uses SCEV, which provides better range knowledge about induction variables.
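
To make the goal concrete: an add recurrence {Start,+,Step} denotes the value Start + i*Step on iteration i. The sketch below is standalone C++, not SCEV code, and its function names are hypothetical. It enumerates an index of the form (i * 64) / Divisor using unsigned division, the shape used in the test cases of this revision, and checks whether the values follow such a recurrence.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Values of (I * Scale) / Divisor, with unsigned division, for I in
// [0, TripCount).
std::vector<uint32_t> indexSequence(uint32_t TripCount, uint32_t Scale,
                                    uint32_t Divisor) {
  assert(Divisor != 0 && "division by zero");
  std::vector<uint32_t> Values;
  Values.reserve(TripCount);
  for (uint32_t I = 0; I < TripCount; ++I)
    Values.push_back((I * Scale) / Divisor);
  return Values;
}

// True if Values[I] == Start + I * Step for every I, i.e. the sequence
// is described by the recurrence {Start,+,Step}.
bool matchesAddRec(const std::vector<uint32_t> &Values, uint32_t Start,
                   uint32_t Step) {
  for (std::size_t I = 0; I < Values.size(); ++I)
    if (Values[I] != Start + static_cast<uint32_t>(I) * Step)
      return false;
  return true;
}
```

With a divisor of 8 the sequence is 0, 8, 16, ... and matches {0,+,8}. With a divisor of 6 the first quotients are 0, 10, 21, so no constant step fits.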

etherzhhb edited the summary of this revision. Mar 29 2017, 11:35 PM

The relevant function is visitSDiv in InstCombineMulDivRem.cpp. Maybe instcombine indeed has insufficient information.

Add more testcases

etherzhhb marked an inline comment as done. Mar 30 2017, 12:56 AM

etherzhhb added inline comments.

test/Transforms/IndVarSimplify/replace-sdiv-by-udiv.ll
11	instcombine (which uses computeKnownBits) is not able to prove the status of the sign bit of %i.01 here

Comments inline.

You can also do this same rewrite within SCEV. That is, try to have getSCEV(sdiv instruction) return a SCEVUDivExpr if legal. However, given that udiv is more canonical, this patch is good too if it solves your problem.

lib/Transforms/Utils/SimplifyIndVar.cpp
38	s/eliminated/converted to unsigned division/
277	This seems unnecessary. Why do you not care about optimizing `%val = sdiv <non-iv-known-positive>, <iv>`?
281	Use `auto *`.
286	I'd use `N` and `D` instead of `S` and `X`.
291	Use `auto *`.
299	Why not `push_back`?

This revision now requires changes to proceed. Mar 30 2017, 12:52 PM

Address sanjoy's comments

etherzhhb marked 6 inline comments as done. Mar 30 2017, 2:51 PM

lgtm with minor comments

lib/Transforms/Utils/SimplifyIndVar.cpp
39	s/NumElimSDiv/NumSimplifiedSDiv/
275	Comment is wrong. I'd actually just drop the comment. It is obvious enough what the code is doing here.

This revision is now accepted and ready to land. Mar 30 2017, 2:56 PM

In D31488#714492, @sanjoy wrote:

Comments inline.

You can also do this same rewrite within SCEV. That is, try to have getSCEV(sdiv instruction) return a SCEVUDivExpr if legal.

This also sounds reasonable and I can also look at that after this.
Just a quick question: are we supposed to use getSCEVAtScope in getSCEV?

However, given that udiv is more canonical, this patch is good too if it solves your problem.

fix the last few things before pushing the commit

etherzhhb marked 2 inline comments as done. Mar 30 2017, 3:06 PM

Closed by commit rL299118: [SimplifyIndvar] Replace the sdiv used by IV if we can prove both of its… (authored by ether). Mar 30 2017, 3:09 PM

This revision was automatically updated to reflect the committed changes.

In D31488#714672, @etherzhhb wrote:

In D31488#714492, @sanjoy wrote:

Comments inline.

You can also do this same rewrite within SCEV. That is, try to have getSCEV(sdiv instruction) return a SCEVUDivExpr if legal.

This also sounds reasonable and I can also look at that after this.
Just a quick question: are we supposed to use getSCEVAtScope in getSCEV?

Not in general. The SCEV expressions you return from getSCEV need to be context insensitive.

However, given that udiv is more canonical, this patch is good too if it solves your problem.

Diff 93550

lib/Transforms/Utils/SimplifyIndVar.cpp

[29 lines not shown]

 using namespace llvm;

 #define DEBUG_TYPE "indvars"

 STATISTIC(NumElimIdentity, "Number of IV identities eliminated");
 STATISTIC(NumElimOperand, "Number of IV operands folded into a use");
 STATISTIC(NumElimRem , "Number of IV remainder operations eliminated");
+STATISTIC(
+    NumSimplifiedSDiv,
+    "Number of IV signed division operations converted to unsigned division");
 STATISTIC(NumElimCmp , "Number of IV comparisons eliminated");

 namespace {
   /// This is a utility for simplifying induction variables
   /// based on ScalarEvolution. It is the primary instrument of the
   /// IndvarSimplify pass, but it may also be directly invoked to cleanup after
   /// other loop passes that preserve SCEV.
   class SimplifyIndvar {
[24 lines not shown; context: public:]

     bool eliminateIdentitySCEV(Instruction *UseInst, Instruction *IVOperand);

     bool eliminateOverflowIntrinsic(CallInst *CI);
     bool eliminateIVUser(Instruction *UseInst, Instruction *IVOperand);
     void eliminateIVComparison(ICmpInst *ICmp, Value *IVOperand);
     void eliminateIVRemainder(BinaryOperator *Rem, Value *IVOperand,
                               bool IsSigned);
+    bool eliminateSDiv(BinaryOperator *SDiv);
     bool strengthenOverflowingOperation(BinaryOperator *OBO, Value *IVOperand);
   };
 }

 /// Fold an IV operand into its use. This removes increments of an
 /// aligned IV when used by a instruction that ignores the low bits.
 ///
 /// IVOperand is guaranteed SCEVable, but UseInst may not be.
[174 lines not shown; context: #endif]
     ICmp->setOperand(1, NewRHS);
   } else
     return;

   ++NumElimCmp;
   Changed = true;
 }

+bool SimplifyIndvar::eliminateSDiv(BinaryOperator *SDiv) {
+  // Get the SCEVs for the ICmp operands.
+  auto *N = SE->getSCEV(SDiv->getOperand(0));
+  auto *D = SE->getSCEV(SDiv->getOperand(1));
+
+  // Simplify unnecessary loops away.
+  const Loop *L = LI->getLoopFor(SDiv->getParent());
+  N = SE->getSCEVAtScope(N, L);
+  D = SE->getSCEVAtScope(D, L);
+
+  // Replace sdiv by udiv if both of the operands are non-negative
+  if (SE->isKnownNonNegative(N) && SE->isKnownNonNegative(D)) {
+    auto *UDiv = BinaryOperator::Create(
+        BinaryOperator::UDiv, SDiv->getOperand(0), SDiv->getOperand(1),
+        SDiv->getName() + ".udiv", SDiv);
+    UDiv->setIsExact(SDiv->isExact());
+    SDiv->replaceAllUsesWith(UDiv);
+    DEBUG(dbgs() << "INDVARS: Simplified sdiv: " << *SDiv << '\n');
+    ++NumSimplifiedSDiv;
+    Changed = true;
+    DeadInsts.push_back(SDiv);
+    return true;
+  }
+
+  return false;
+}
+
 /// SimplifyIVUsers helper for eliminating useless
 /// remainder operations operating on an induction variable.
 void SimplifyIndvar::eliminateIVRemainder(BinaryOperator *Rem,
                                           Value *IVOperand,
                                           bool IsSigned) {
   // We're only interested in the case where we know something about
   // the numerator.
   if (IVOperand != Rem->getOperand(0))
     return;
[144 lines not shown]
 /// side-effect given the range of IV values. IVOperand is guaranteed SCEVable,
 /// but UseInst may not be.
 bool SimplifyIndvar::eliminateIVUser(Instruction *UseInst,
                                      Instruction *IVOperand) {
   if (ICmpInst *ICmp = dyn_cast<ICmpInst>(UseInst)) {
     eliminateIVComparison(ICmp, IVOperand);
     return true;
   }
-  if (BinaryOperator *Rem = dyn_cast<BinaryOperator>(UseInst)) {
-    bool IsSigned = Rem->getOpcode() == Instruction::SRem;
-    if (IsSigned || Rem->getOpcode() == Instruction::URem) {
-      eliminateIVRemainder(Rem, IVOperand, IsSigned);
+  if (BinaryOperator *Bin = dyn_cast<BinaryOperator>(UseInst)) {
+    bool IsSRem = Bin->getOpcode() == Instruction::SRem;
+    if (IsSRem || Bin->getOpcode() == Instruction::URem) {
+      eliminateIVRemainder(Bin, IVOperand, IsSRem);
       return true;
     }
+
+    if (Bin->getOpcode() == Instruction::SDiv)
+      return eliminateSDiv(Bin);
   }

   if (auto *CI = dyn_cast<CallInst>(UseInst))
     if (eliminateOverflowIntrinsic(CI))
       return true;

   if (eliminateIdentitySCEV(UseInst, IVOperand))
     return true;
[246 lines not shown]

test/Transforms/IndVarSimplify/replace-sdiv-by-udiv.ll

This file was added.

				; RUN: opt < %s -indvars -S | FileCheck %s

				define void @test0(i32* %a) {
				; CHECK-LABEL: @test0(
				entry:
				br label %for.body

				for.body: ; preds = %entry, %for.body
				%i.01 = phi i32 [ 0, %entry ], [ %inc, %for.body ]
				%div = sdiv i32 %i.01, 2
				; CHECK-NOT: sdiv
				; CHECK: udiv
				%idxprom = sext i32 %div to i64
				%arrayidx = getelementptr inbounds i32, i32* %a, i64 %idxprom
				store i32 %i.01, i32* %arrayidx, align 4
				%inc = add nsw i32 %i.01, 1
				%cmp = icmp slt i32 %inc, 64
				br i1 %cmp, label %for.body, label %for.end

				for.end: ; preds = %for.body
				ret void
				}

				define void @test1(i32* %a) {
				; CHECK-LABEL: @test1(
				entry:
				br label %for.body

				for.body: ; preds = %entry, %for.body
				%i.01 = phi i32 [ 0, %entry ], [ %inc, %for.body ]
				%div = sdiv exact i32 %i.01, 2
				; CHECK-NOT: sdiv
				; CHECK: udiv exact
				%idxprom = sext i32 %div to i64
				%arrayidx = getelementptr inbounds i32, i32* %a, i64 %idxprom
				store i32 %i.01, i32* %arrayidx, align 4
				%inc = add nsw i32 %i.01, 1
				%cmp = icmp slt i32 %inc, 64
				br i1 %cmp, label %for.body, label %for.end

				for.end: ; preds = %for.body
				ret void
				}

				define void @test2(i32* %a, i32 %d) {
				; CHECK-LABEL: @test2(
				entry:
				br label %for.body

				for.body: ; preds = %entry, %for.body
				%i.01 = phi i32 [ 0, %entry ], [ %inc, %for.body ]
				%mul = mul nsw i32 %i.01, 64
				%div = sdiv i32 %mul, %d
				; CHECK-NOT: udiv
				%idxprom = sext i32 %div to i64
				%arrayidx = getelementptr inbounds i32, i32* %a, i64 %idxprom
				store i32 %i.01, i32* %arrayidx, align 4
				%inc = add nsw i32 %i.01, 1
				%cmp = icmp slt i32 %inc, 64
				br i1 %cmp, label %for.body, label %for.end

				for.end: ; preds = %for.body
				ret void
				}

				define void @test3(i32* %a) {
				; CHECK-LABEL: @test3(
				entry:
				br label %for.body

				for.body: ; preds = %entry, %for.body
				%i.01 = phi i32 [ 0, %entry ], [ %inc, %for.body ]
				%div = sdiv i32 2048, %i.01
				; CHECK: udiv
				; CHECK-NOT: sdiv
				%idxprom = sext i32 %div to i64
				%arrayidx = getelementptr inbounds i32, i32* %a, i64 %idxprom
				store i32 %i.01, i32* %arrayidx, align 4
				%inc = add nsw i32 %i.01, 1
				%cmp = icmp slt i32 %inc, 64
				br i1 %cmp, label %for.body, label %for.end

				for.end: ; preds = %for.body
				ret void
				}

				define void @test4(i32* %a) {
				; CHECK-LABEL: @test4(
				entry:
				br label %for.body

				for.body: ; preds = %entry, %for.body
				%i.01 = phi i32 [ 0, %entry ], [ %inc, %for.body ]
				%mul = mul nsw i32 %i.01, 64
				%div = sdiv i32 %mul, 8
				; CHECK: udiv
				; CHECK-NOT: sdiv
				%idxprom = sext i32 %div to i64
				%arrayidx = getelementptr inbounds i32, i32* %a, i64 %idxprom
				store i32 %i.01, i32* %arrayidx, align 4
				%inc = add nsw i32 %i.01, 1
				%cmp = icmp slt i32 %inc, 64
				br i1 %cmp, label %for.body, label %for.end

				for.end: ; preds = %for.body
				ret void
				}

				define void @test5(i32* %a) {
				; CHECK-LABEL: @test5(
				entry:
				br label %for.body

				for.body: ; preds = %entry, %for.body
				%i.01 = phi i32 [ 0, %entry ], [ %inc, %for.body ]
				%mul = mul nsw i32 %i.01, 64
				%div = sdiv i32 %mul, 6
				; CHECK: udiv
				; CHECK-NOT: sdiv
				%idxprom = sext i32 %div to i64
				%arrayidx = getelementptr inbounds i32, i32* %a, i64 %idxprom
				store i32 %i.01, i32* %arrayidx, align 4
				%inc = add nsw i32 %i.01, 1
				%cmp = icmp slt i32 %inc, 64
				br i1 %cmp, label %for.body, label %for.end

				for.end: ; preds = %for.body
				ret void
				}

This is an archive of the discontinued LLVM Phabricator instance.

[SimplifyIndvar] Replace the sdiv used by IV if we can prove both of its operands are non-negative
ClosedPublic

