This is an archive of the discontinued LLVM Phabricator instance.

Differential D39228

[SCEV] Enhance SCEVFindUnsafe for division
ClosedPublic

Authored by mkazantsev on Oct 24 2017, 4:34 AM.

Details

Reviewers

sanjoy
sebpop
reames

Commits

rGb6d40067af8e: [SCEV] Enhance SCEVFindUnsafe for division
rL316568: [SCEV] Enhance SCEVFindUnsafe for division

Summary

This patch allows the SCEVFindUnsafe algorithm to treat division by any value
known to be non-zero as safe. Previously, it could only recognize non-zero constants.
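
The difference between the two policies can be sketched with a toy model. This is only an illustration under stated assumptions: `DivisorInfo`, `isSafeOld` and `isSafeNew` are hypothetical names that do not exist in LLVM, and the real patch asks ScalarEvolution (`SE.isKnownNonZero`) rather than inspecting a range by hand. The model assumes the analysis knows a non-wrapping, half-open unsigned range [Lo, Hi) for the divisor, similar in spirit to the `!range` metadata used in the new test.

```cpp
#include <cstdint>

// Hypothetical stand-in for what an analysis knows about a divisor.
// Not an LLVM type; used only to contrast the two safety policies.
struct DivisorInfo {
  bool IsConstant; // true when the divisor is a literal constant
  uint64_t Lo;     // inclusive lower bound of the known range
  uint64_t Hi;     // exclusive upper bound; assumed Lo < Hi (no wrap-around)
};

// Old policy (toy version): only a literal non-zero constant is accepted.
bool isSafeOld(const DivisorInfo &D) {
  return D.IsConstant && D.Lo != 0;
}

// New policy (toy version): any divisor whose known range excludes zero is
// accepted. For an unsigned non-wrapping range [Lo, Hi), zero is a member
// exactly when Lo == 0.
bool isSafeNew(const DivisorInfo &D) {
  return D.Lo > 0;
}
```

In this model a divisor described as `{false, 1, 10}` (a non-constant value known to lie in [1, 10)) is rejected by the old policy and accepted by the new one, while a fully unknown divisor is rejected by both.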

Event Timeline

mkazantsev created this revision. Oct 24 2017, 4:34 AM

mkazantsev added a child revision: D39229: [LoopUnrolling] Do not expand unsafe expressions in loop unrolling. Oct 24 2017, 4:44 AM

mkazantsev added a child revision: D39230: [IndVarSimplify] Do not expand unsafe expressions in IndVarSimplify. Oct 24 2017, 4:50 AM

sanjoy accepted this revision. Oct 24 2017, 9:39 AM

sanjoy added inline comments.

test/Transforms/IndVarSimplify/udiv.ll
172	Put something else instead of `undef` here for completeness, loading from `undef` is UB.

This revision is now accepted and ready to land. Oct 24 2017, 9:39 AM

mkazantsev added inline comments. Oct 24 2017, 10:46 AM

test/Transforms/IndVarSimplify/udiv.ll
172	Ok, will do before checking in.

Closed by commit rL316568: [SCEV] Enhance SCEVFindUnsafe for division (authored by mkazantsev). Oct 25 2017, 4:08 AM

This revision was automatically updated to reflect the committed changes.

Hi,
This patch caused a regression in SingleSource/Benchmarks/Shootout/shootout-sieve on the Arm public bots:

http://lnt.llvm.org/db_default/v4/nts/76950 13.43%
http://lnt.llvm.org/db_default/v4/nts/77068 10.61%

The affected pass is LSR.
I'll provide more data shortly.

Thanks,
Evgeny Astigeevich
The ARM Compiler Optimization team leader

This revision is now accepted and ready to land. Oct 26 2017, 8:56 AM

The benchmark source file: http://www.llvm.org/viewvc/llvm-project/test-suite/trunk/SingleSource/Benchmarks/Shootout/sieve.c?view=markup

I attached comp.good.log (14 KB) and comp.bad.log (15 KB).

Comparing them, I can see that the number of operations in a loop has increased.

Compiler options:
clang -c -S --target=aarch64-linux-gnueabi -DNDEBUG -O3 -DNDEBUG -mcpu=cortex-a57 -fomit-frame-pointer sieve.c

Reverted as https://reviews.llvm.org/rL316739. I will try to understand what happens.

Hi Evgeny!

I see the difference before LSR: for the last loop, the *** IR Dump Before Loop Strength Reduction *** section has two more Phis and more instructions in the body (with a repeating pattern). Most likely unrolling changed the number of unrolled iterations.

For the first two loops: in one case it just changed the iteration space from 0->len to len->0, which shouldn't have a significant performance impact; in the other case nothing changed but a variable name.

So I believe the problem lies before LSR, most likely in unrolling.

Hi Max,

Thank you for the initial investigation. I'll try to debug the unrolling.

Thanks,
Evgeny

Attachments: test.ll (9 KB), test.good.ll (10 KB), test.bad.ll (11 KB)

Hi Max,

I did some debugging. It's definitely LSR. Compare test.good.ll and test.bad.ll produced from test.ll with

opt -loop-reduce -S -o - test.ll

Thanks,
Evgeny

Closed by commit rGb6d40067af8e: [SCEV] Enhance SCEVFindUnsafe for division (authored by mkazantsev). Oct 7 2019, 5:20 AM

This revision was automatically updated to reflect the committed changes.

Herald added a project: Restricted Project. Oct 7 2019, 5:20 AM

Herald added subscribers: javed.absar, hiraditya.

Revision Contents

Path                                        Size
lib/Analysis/ScalarEvolutionExpander.cpp    10 lines
test/Transforms/IndVarSimplify/udiv.ll      40 lines

Diff 120039

lib/Analysis/ScalarEvolutionExpander.cpp

 // UDiv expressions. We don't know if the UDiv is derived from an IR divide
 // instruction, but the important thing is that we prove the denominator is
 // nonzero before expansion.
 //
 // IVUsers already checks that IV-derived expressions are safe. So this check is
 // only needed when the expression includes some subexpression that is not IV
 // derived.
 //
-// Currently, we only allow division by a nonzero constant here. If this is
-// inadequate, we could easily allow division by SCEVUnknown by using
-// ValueTracking to check isKnownNonZero().
-//
 // We cannot generally expand recurrences unless the step dominates the loop
 // header. The expander handles the special case of affine recurrences by
 // scaling the recurrence outside the loop, but this technique isn't generally
 // applicable. Expanding a nested recurrence outside a loop requires computing
 // binomial coefficients. This could be done, but the recurrence has to be in a
 // perfectly reduced form, which can't be guaranteed.
 struct SCEVFindUnsafe {
   ScalarEvolution &SE;
   bool IsUnsafe;

   SCEVFindUnsafe(ScalarEvolution &se): SE(se), IsUnsafe(false) {}

   bool follow(const SCEV *S) {
     if (const SCEVUDivExpr *D = dyn_cast<SCEVUDivExpr>(S)) {
-      const SCEVConstant *SC = dyn_cast<SCEVConstant>(D->getRHS());
-      if (!SC || SC->getValue()->isZero()) {
+      if (!SE.isKnownNonZero(D->getRHS())) {
         IsUnsafe = true;
         return false;
       }
-    }
-    if (const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(S)) {
+    } else if (const SCEVAddRecExpr *AR = dyn_cast<SCEVAddRecExpr>(S)) {
       const SCEV *Step = AR->getStepRecurrence(SE);
       if (!AR->isAffine() && !SE.dominates(Step, AR->getLoop()->getHeader())) {
         IsUnsafe = true;
         return false;
       }
     }
     return true;
   }

test/Transforms/IndVarSimplify/udiv.ll

 declare i32 @atoi(i8* nocapture) nounwind readonly

 declare i32 @printf(i8* nocapture, ...) nounwind

 ; IndVars doesn't emit a udiv in for.body.preheader since SCEVExpander::expand will
 ; find out there's already a udiv in the original code.

-; CHECK-LABEL: @foo(
+; CHECK-LABEL: @foo_01(
 ; CHECK: for.body.preheader:
 ; CHECK-NOT: udiv

-define void @foo(double* %p, i64 %n) nounwind {
+define void @foo_01(double* %p, i64 %n) nounwind {
 entry:
   %div0 = udiv i64 %n, 7 ; <i64> [#uses=1]
   %div1 = add i64 %div0, 1
   %cmp2 = icmp ult i64 0, %div1 ; <i1> [#uses=1]
   br i1 %cmp2, label %for.body.preheader, label %for.end

 for.body.preheader: ; preds = %entry
   br label %for.body
 [9 unchanged lines not shown; context: for.body: ; preds = %for.body.preheader, %for.body]
   br i1 %cmp, label %for.body, label %for.end.loopexit

 for.end.loopexit: ; preds = %for.body
   br label %for.end

 for.end: ; preds = %for.end.loopexit, %entry
   ret void
 }
+
+; Same as foo_01, but we divide by non-constant value.
+
+; CHECK-LABEL: @foo_02(
+; CHECK: for.body.preheader:
+; CHECK-NOT: udiv
+
+define void @foo_02(double* %p, i64 %n) nounwind {
+entry:
+  %denom = load i64, i64* undef, align 4, !range !0
+  %div0 = udiv i64 %n, %denom ; <i64> [#uses=1]
+  %div1 = add i64 %div0, 1
+  %cmp2 = icmp ult i64 0, %div1 ; <i1> [#uses=1]
+  br i1 %cmp2, label %for.body.preheader, label %for.end
+
+for.body.preheader: ; preds = %entry
+  br label %for.body
+
+for.body: ; preds = %for.body.preheader, %for.body
+  %i.03 = phi i64 [ %inc, %for.body ], [ 0, %for.body.preheader ] ; <i64> [#uses=2]
+  %arrayidx = getelementptr inbounds double, double* %p, i64 %i.03 ; <double*> [#uses=1]
+  store double 0.000000e+00, double* %arrayidx
+  %inc = add i64 %i.03, 1 ; <i64> [#uses=2]
+  %divx = udiv i64 %n, %denom ; <i64> [#uses=1]
+  %div = add i64 %divx, 1
+  %cmp = icmp ult i64 %inc, %div ; <i1> [#uses=1]
+  br i1 %cmp, label %for.body, label %for.end.loopexit
+
+for.end.loopexit: ; preds = %for.body
+  br label %for.end
+
+for.end: ; preds = %for.end.loopexit, %entry
+  ret void
+}
+
+!0 = !{i64 1, i64 10}