This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contentst

-
llvm/
-
lib/Analysis/
-
Analysis/
1
ScalarEvolutionNormalization.cpp
-
unittests/Analysis/
-
Analysis/
-
ScalarEvolutionTest.cpp

Differential D151796

[SCEV] Skip min/max expressions when normalizing/denormalizing SCEV expressions
Needs ReviewPublic

Authored by dmakogon on May 31 2023, 5:02 AM.

Download Raw Diff

Details

Reviewers

reames
mkazantsev
nikic
fhahn

Summary

Don't do SCEV normalization and denormalization for max/min expressions.
Generally normalizing SCEV returns an inequivalent expression in which max/min may become redundant, so scalar evolution can simplify it. Denormalizing such expression wouldn't return the missing max/min back. This also works the other way around: if we first denormalize an expression that contains min/max-s, and then normalize it.
AFAIU for any expression S, S must be equal to normalize(denormalize(S)), and also the other way around - S == denormalize(normalize(S)), so it's illegal to do normalization/denormalization on expressions containing min/max.

Consider the following example. Imagine we have the following loop:

loop:
  %iv = phi [ 11, %entry ], [ %iv.next, %loop ]
  %umax = umax(%iv, 10)
  %iv.next = add %iv, -4
  %loop.cond = %iv.next u< 7
  br i1 %loop.cond, %exit, %loop

It executes exactly two iterations.
The SCEV for %umax is (10 umax {11,+,-4}).
Normalizing it for loop '%loop' would give us (10 umax {15,+,-4}).
Now, as the loop has only two iterations, the AddRec {15,+,-4} is always greater than 10, so SCEV simplifies the expression to {15,+,-4}.
Denormalizing it back would give a wrong result - {11,+,-4}.
Obviously on the 2nd iteration the value of %umax is 10, but according to SCEV it's 11 - 4 = 7.
One can also come up with analagous examples for smax, umin and smin.

Prohibiting normalization/denormalization for min/max expressions resolves the issue.

This fixes miscompilation described in https://github.com/llvm/llvm-project/issues/62563.
LSR does request normalization when creating initial formulae for values and it denormalized them back before expansion. The example described above is a short version of the test case in the bug.

Diff Detail

Unit TestsFailed

	Time	Test
	50 ms	x64 debian > LLVM.Transforms/LoopStrengthReduce/X86::pr62563.ll

Event Timeline

dmakogon created this revision.May 31 2023, 5:02 AM

Herald added a project: Restricted Project. · View Herald TranscriptMay 31 2023, 5:02 AM

Herald added subscribers: StephenFan, javed.absar, hiraditya. · View Herald Transcript

dmakogon requested review of this revision.May 31 2023, 5:02 AM

Herald added a project: Restricted Project. · View Herald TranscriptMay 31 2023, 5:02 AM

Herald added a subscriber: llvm-commits. · View Herald Transcript

dmakogon edited the summary of this revision. (Show Details)May 31 2023, 5:04 AM

Harbormaster completed remote builds in B235545: Diff 527000.May 31 2023, 5:49 AM

First time I hear about SCEV normalization/denormalization. The entire approach looks highly dubious.

I'm concerned that this is not a complete fix and there are other cases where normalization/denormalization do not round-trip. What is the criterion that distinguishes min/max from other SCEV expressions? Could something similar affect udiv expressions for example?

I also noticed that there is a check in IVUsers for normalization/denormalization roundtrip at https://github.com/llvm/llvm-project/blob/0a3dc73e700b4a37bc435bf7c02213161b27f54a/llvm/lib/Analysis/IVUsers.cpp#L221-L231.

llvm/lib/Analysis/ScalarEvolutionNormalization.cpp
91	Presumably umin_seq also needs to be handled.
llvm/test/Transforms/LoopStrengthReduce/pr62563.ll
6 ↗	(On Diff #527000)	Please precommit this change. This test should be showing a difference in generated IR.

In D151796#4384028, @nikic wrote:

I also noticed that there is a check in IVUsers for normalization/denormalization roundtrip at https://github.com/llvm/llvm-project/blob/0a3dc73e700b4a37bc435bf7c02213161b27f54a/llvm/lib/Analysis/IVUsers.cpp#L221-L231.

For reference, that check was added in https://github.com/llvm/llvm-project/commit/66806aef1efb2425696c588298c1b52a0a4c3c14.

In D151796#4384028, @nikic wrote:

First time I hear about SCEV normalization/denormalization. The entire approach looks highly dubious.

I'm concerned that this is not a complete fix and there are other cases where normalization/denormalization do not round-trip. What is the criterion that distinguishes min/max from other SCEV expressions? Could something similar affect udiv expressions for example?

I also noticed that there is a check in IVUsers for normalization/denormalization roundtrip at https://github.com/llvm/llvm-project/blob/0a3dc73e700b4a37bc435bf7c02213161b27f54a/llvm/lib/Analysis/IVUsers.cpp#L221-L231.

Thanks for replying!

I agree that this transformation is fishy. There is no clear documentation on it. And I'm not sure whether the round-trip property must hold, but seems like in places where it was used when it was added initially authors assumed it did hold.

I looked at the check you mentioned and at the issue that check is supposed to solve (https://bugs.llvm.org/show_bug.cgi?id=17473).
And it seems to me that it was added just as a quick workaround for this exact issue that sometimes the round-trip property is not preserved.
The test case described in PR17473 has a loop with exactly 127 backedge-taken count and there's a value there with such SCEV: (sext i8 {1,+,1}<nuw><%for.body> to i32).
On the last iteration the addrec overflows and its value is -128, so sext is needed there. Normalizing would subtract 1 from the addrec start value and so it cannot overflow anymore ({0,+,1} so sext can be safely discarded. If we denormalize, we get just {1,+,1}.

So yes, you're right. The current patch doesn't cover all the cases. Clearly we have to prohibit normalization of sext and zext as well.
For udiv I think the following example works. Let S be i32 ({1,+,1}<nuw><%loop> /u -1) and the loop runs until {1,+,1} is equal -1. So it's always zero except for the last iteration when it's value is one. S_norm = ({0,+,1}<nuw><%loop> /u -1) and here SCEV doesn't turn into constant zero, but I think it could. And if it did, then obviously denormalization would give a wrong result.

And also turning the whole thing off causes many LSR tests to regress, mainly there are just more IVs left in the code after LSR. There's a considerable amount of code in that pass that depends on normalization, and I think it would take some time to review all such places.

I propose that we state this clearly somewhere that the round-trip property must be preserved and fix the current normalization/denormalization code accordingly (only allow normalization in cases when round-trip is preserved).

nikic mentioned this in D153004: [LSR] Consider post-inc form when creating extends/truncates..Jun 15 2023, 2:24 AM

dmakogon added inline comments.Jun 19 2023, 4:23 AM

llvm/test/Transforms/LoopStrengthReduce/pr62563.ll
6 ↗	(On Diff #527000)	Unfortunately, miscompilation on this test is not reproducible after 6ed152aff4aa. Before that patch, SCEV didn't initially simplify the expression it returned after normalization to just the AddRec (however it could). And denormalizing it back would return the initial expression and the bail out you mentioned in another comment here wouldn't work. Simplification actually did happen later in LSR and it lead to illegal transform. With that patch applied, SCEV performs the simplification during normalization due to more precise ranges available for the expression, and then the bail out works and normalized expression is discarded. I couldn't come up with another IR test, but I think the issue can still happen if some simplifications won't apply right after normalization and the bail out won't work.

Removed the test which is not reproducible after 6ed152aff4aa.

Harbormaster completed remote builds in B239773: Diff 532602.Jun 19 2023, 5:20 AM

We continue relying on normalization/denormalization round trip producing the original expression (e.g. D153004). Even if this patch doesn't fix all the cases where this property doesn't hold true, it is an improvement.

@nikic, @fhahn, are there any objections to moving forward with this change as a first step?

In D151796#4498796, @apilipenko wrote:

We continue relying on normalization/denormalization round trip producing the original expression (e.g. D153004). Even if this patch doesn't fix all the cases where this property doesn't hold true, it is an improvement.

@nikic, @fhahn, are there any objections to moving forward with this change as a first step?

https://reviews.llvm.org/rG7f5b15ad150e59815fbd4adc3202c8720718896e made normalizeForPostIncUse() return nullptr if it normalization/denormalization doesn't round-trip -- shouldn't that fully address this issue? Do you still see normalization related miscompiles in LSR after that change?

I missed that change. Yes, that should be sufficient to address the original issue.

Revision Contents

Path

Size

llvm/

lib/

Analysis/

ScalarEvolutionNormalization.cpp

46 lines

unittests/

Analysis/

ScalarEvolutionTest.cpp

58 lines

Diff 532602

llvm/lib/Analysis/ScalarEvolutionNormalization.cpp

Show All 35 Lines	struct NormalizeDenormalizeRewriter
// we're careful about the lifetime of NormalizeDenormalizeRewriter.		// we're careful about the lifetime of NormalizeDenormalizeRewriter.
const NormalizePredTy Pred;		const NormalizePredTy Pred;

NormalizeDenormalizeRewriter(TransformKind Kind, NormalizePredTy Pred,		NormalizeDenormalizeRewriter(TransformKind Kind, NormalizePredTy Pred,
ScalarEvolution &SE)		ScalarEvolution &SE)
: SCEVRewriteVisitor<NormalizeDenormalizeRewriter>(SE), Kind(Kind),		: SCEVRewriteVisitor<NormalizeDenormalizeRewriter>(SE), Kind(Kind),
Pred(Pred) {}		Pred(Pred) {}
const SCEV visitAddRecExpr(const SCEVAddRecExpr Expr);		const SCEV visitAddRecExpr(const SCEVAddRecExpr Expr);

		const SCEV visitSMaxExpr(const SCEVSMaxExpr Expr);

		const SCEV visitUMaxExpr(const SCEVUMaxExpr Expr);

		const SCEV visitSMinExpr(const SCEVSMinExpr Expr);

		const SCEV visitUMinExpr(const SCEVUMinExpr Expr);
};		};
} // namespace		} // namespace

		// Don't do anything for max/min expressions. Normalizing SCEV returns
		// an inequivalent expression in which max/min may become redundant, so scalar
		// evolution can simplify it. Denormalizing such expression wouldn't return the
		// missing max/min back.
		// Consider the following example. Imagine we have the following loop:
		// loop:
		// %iv = phi [ 11, %entry ], [ %iv.next, %loop ]
		// %umax = umax(%iv, 10)
		// %iv.next = add %iv, -4
		// %loop.cond = %iv.next u< 7
		// br i1 %loop.cond, %loop, %exit
		// It executes exactly two iterations. The SCEV for %umax is (10 umax
		// {11,+,-4}). Normalizing it for loop '%loop' would give us (10 umax
		// {15,+,-4}). Now, as the loop has only two iterations, the AddRec is always
		// greater than 10, so SCEV simplifies the expression to {15,+,-4}.
		// Denormalizing it back would give a wrong result - {11,+,-4}. Obviously on the
		// 2nd iteration the value of %umax is 10, but according to SCEV it's 11 - 4
		// = 7.
		const SCEV *
		NormalizeDenormalizeRewriter::visitSMaxExpr(const SCEVSMaxExpr *Expr) {
		return Expr;
		}

		const SCEV *
		NormalizeDenormalizeRewriter::visitUMaxExpr(const SCEVUMaxExpr *Expr) {
		return Expr;
		}

		const SCEV *
		NormalizeDenormalizeRewriter::visitSMinExpr(const SCEVSMinExpr *Expr) {
		return Expr;
		}

		const SCEV *
		NormalizeDenormalizeRewriter::visitUMinExpr(const SCEVUMinExpr *Expr) {
		return Expr;
		}
		nikicUnsubmitted Not Done Reply Inline Actions Presumably umin_seq also needs to be handled. nikic: Presumably umin_seq also needs to be handled.

const SCEV *		const SCEV *
NormalizeDenormalizeRewriter::visitAddRecExpr(const SCEVAddRecExpr *AR) {		NormalizeDenormalizeRewriter::visitAddRecExpr(const SCEVAddRecExpr *AR) {
SmallVector<const SCEV *, 8> Operands;		SmallVector<const SCEV *, 8> Operands;

transform(AR->operands(), std::back_inserter(Operands),		transform(AR->operands(), std::back_inserter(Operands),
[&](const SCEV *Op) { return visit(Op); });		[&](const SCEV *Op) { return visit(Op); });

if (!Pred(AR))		if (!Pred(AR))
▲ Show 20 Lines • Show All 64 Lines • Show Last 20 Lines

llvm/unittests/Analysis/ScalarEvolutionTest.cpp

Show First 20 Lines • Show All 510 Lines • ▼ Show 20 Lines	std::unique_ptr<Module> M = parseAssemblyString(
" "		" "
" "		" "
"loop_2: "		"loop_2: "
" br i1 undef, label %end, label %loop_2 "		" br i1 undef, label %end, label %loop_2 "
" "		" "
"end: "		"end: "
" ret void "		" ret void "
"} "		"} "
		" "
		"declare i32 @llvm.umax.i32(i32, i32)"
		" "
		"declare i32 @llvm.smax.i32(i32, i32)"
		" "
		"declare i32 @llvm.umin.i32(i32, i32)"
		" "
		"declare i32 @llvm.smin.i32(i32, i32)"
		" "
		"define void @f_3() "
		" local_unnamed_addr { "
		"entry: "
		" br label %loop "
		" "
		"loop: "
		" %iv = phi i32 [ 155, %entry ], [ %iv.next, %loop ]" // 155 ... 7
		" %iv.next = add i32 %iv, -4"
		" %umax = call i32 @llvm.umax.i32(i32 %iv, i32 10)"
		" %smax = call i32 @llvm.smax.i32(i32 %iv, i32 10)"
		" %umin = call i32 @llvm.umin.i32(i32 %iv, i32 10)"
		" %smin = call i32 @llvm.smin.i32(i32 %iv, i32 10)"
		" %loop.cond = icmp ult i32 %iv.next, 7"
		" br i1 %loop.cond, label %end, label %loop "
		" "
		"end: "
		" ret void "
		"} "
,		,
Err, C);		Err, C);

assert(M && "Could not parse module?");		assert(M && "Could not parse module?");
assert(!verifyModule(*M) && "Must have been well formed!");		assert(!verifyModule(*M) && "Must have been well formed!");

runWithSE(*M, "f_1", [&](Function &F, LoopInfo &LI, ScalarEvolution &SE) {		runWithSE(*M, "f_1", [&](Function &F, LoopInfo &LI, ScalarEvolution &SE) {
auto &I0 = GetInstByName(F, "iv0");		auto &I0 = GetInstByName(F, "iv0");
▲ Show 20 Lines • Show All 87 Lines • ▼ Show 20 Lines	for (const auto &LoopSet : LoopSets)
auto *N = llvm::normalizeForPostIncUse(D, LoopSet, SE);		auto *N = llvm::normalizeForPostIncUse(D, LoopSet, SE);

// Denormalization and then normalizing better give us back the same		// Denormalization and then normalizing better give us back the same
// value.		// value.
EXPECT_EQ(S, N) << "S = " << S << " N = " << N;		EXPECT_EQ(S, N) << "S = " << S << " N = " << N;
}		}
}		}
});		});

		runWithSE(*M, "f_3", [&](Function &F, LoopInfo &LI, ScalarEvolution &SE) {
		auto &UMaxInst = GetInstByName(F, "umax");
		auto &SMaxInst = GetInstByName(F, "smax");
		auto &UMinInst = GetInstByName(F, "umin");
		auto &SMinInst = GetInstByName(F, "smin");

		auto *L = LI.getLoopFor(UMaxInst.getParent());
		PostIncLoopSet Loops;
		Loops.insert(L);

		// Check that normalizing and then denormalizing (and vice versa) preserves
		// umax, smax, umin, smin expressions.
		auto *UMaxS = SE.getSCEV(&UMaxInst);
		auto *SMaxS = SE.getSCEV(&SMaxInst);
		auto *UMinS = SE.getSCEV(&UMinInst);
		auto *SMinS = SE.getSCEV(&SMinInst);

		for (auto *S : {UMaxS, SMaxS, UMinS, SMinS}) {
		{
		auto *N = normalizeForPostIncUse(S, Loops, SE);
		auto *D = denormalizeForPostIncUse(N, Loops, SE);
		EXPECT_EQ(S, D) << S << " " << D;
		}
		{
		auto *D = denormalizeForPostIncUse(S, Loops, SE);
		auto *N = normalizeForPostIncUse(D, Loops, SE);
		EXPECT_EQ(S, N) << S << " " << N;
		}
		}
		});
}		}

// Expect the call of getZeroExtendExpr will not cost exponential time.		// Expect the call of getZeroExtendExpr will not cost exponential time.
TEST_F(ScalarEvolutionsTest, SCEVZeroExtendExpr) {		TEST_F(ScalarEvolutionsTest, SCEVZeroExtendExpr) {
LLVMContext C;		LLVMContext C;
SMDiagnostic Err;		SMDiagnostic Err;

// Generate a function like below:		// Generate a function like below:
▲ Show 20 Lines • Show All 1,172 Lines • Show Last 20 Lines