This is an archive of the discontinued LLVM Phabricator instance.

[SCEV] Don't require dominance ordering of add/mul/min/max expressions
Abandoned, Public

Authored by reames on Jun 17 2021, 6:46 PM.

Details

Reviewers

efriedma
nikic
mkazantsev

Commits

rGda6384fbb9fb: Add beginning of LLVM's GettingStarted to GitHub readme

Summary

I'm a bit hesitant about this patch, and welcome ideas on other approaches.

The test case test_non_dom demonstrates a case where scev-aa crashes today (if exercised by either -aa-eval or -licm). The basic problem is that SCEV-AA expects to be able to compute a pointer difference between the SCEVs of any pair of pointers we do an alias query on. For (valid, but out of scope) reasons, we can end up asking whether expressions in different sub-loops can alias each other. This results in a subtraction expression being formed where neither operand dominates the other.

Looking at SCEV, I can't find a reason why the dominance invariant on operands is actually required. The code which references it is simply an optimization rule; the worst that happens is we fail to canonicalize a hypothetical (i.e. not corresponding to an IR Value) expression.

This does result in somewhat odd SCEV expressions becoming possible. For instance, getMinusSCEV(getSCEV(%addr1), getSCEV(%addr2)) results in "({(-3200 + (-1 * %data)),+,-8}<nw><%subloop2> + {%data,+,8}<nw><%subloop1>)". I can't find an immediate problem with that, but I am sorta wondering if I'm missing something here.
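As a rough worked check of where that expression comes from, the derivation below assumes 8-byte doubles, the induction variable start values from the test case (0 for %iv1, 400 for %iv2), and ignores wrapping. The iteration counters i and j are hypothetical names introduced here for illustration, and the intermediate addrec for %addr2 is what one would expect rather than output copied from the tool.

```latex
\begin{align*}
\texttt{\%addr1} &= \texttt{\%data} + 8i
  && \Rightarrow\ \{\texttt{\%data},+,8\}_{\texttt{subloop1}} \\
\texttt{\%addr2} &= \texttt{\%data} + 8(400 + j) = \texttt{\%data} + 3200 + 8j
  && \Rightarrow\ \{\texttt{\%data} + 3200,+,8\}_{\texttt{subloop2}} \\
-\texttt{\%addr2} &= -3200 - \texttt{\%data} - 8j
  && \Rightarrow\ \{-3200 + (-1 \cdot \texttt{\%data}),+,-8\}_{\texttt{subloop2}} \\
\texttt{\%addr1} - \texttt{\%addr2} &= 8i - 8j - 3200
\end{align*}
```

Arithmetically the %data terms cancel, but because the two addrecs belong to different loops with no dominance relation, they stay as separate addends and %data remains visible in both.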

Diff Detail

Event Timeline

reames created this revision. Jun 17 2021, 6:46 PM

Herald added subscribers: javed.absar, bollu, hiraditya, mcrosier. Jun 17 2021, 6:46 PM

reames requested review of this revision. Jun 17 2021, 6:46 PM

Herald added a project: Restricted Project. Jun 17 2021, 6:46 PM

I'm pretty sure that if you give SCEVExpander an expression whose arguments don't dominate one another, it will break.

walli99 added a commit: rGda6384fbb9fb: Add beginning of LLVM's GettingStarted to GitHub readme. Jun 18 2021, 2:36 AM

Max is correct. I think I can handle that easily, but let me make sure and then repost.

Harbormaster completed remote builds in B109845: Diff 352889. Jun 18 2021, 10:16 AM

There are two relevant questions:

1. Does SCEV do something reasonable with such an expression? I mean, you obviously can't expand it; there isn't any viable insertion point. More generally, I'm not sure it's meaningful; even if we drop the assertions, it's not clear what the result is supposed to represent. If we want to support "subtracting" two addrecs to do dependency analysis, we should probably add a dedicated method, which returns something that isn't a regular SCEV expression.
2. Why is SCEVAA trying to construct such an expression? AliasAnalysis::alias() doesn't make any sense if there isn't a dominance relationship between the two pointers.

In D104503#2829510, @efriedma wrote:

There are two relevant questions:

1. Does SCEV do something reasonable with such an expression? I mean, you obviously can't expand it; there isn't any viable insertion point. More generally, I'm not sure it's meaningful; even if we drop the assertions, it's not clear what the result is supposed to represent. If we want to support "subtracting" two addrecs to do dependency analysis, we should probably add a dedicated method, which returns something that isn't a regular SCEV expression.

2. Why is SCEVAA trying to construct such an expression? AliasAnalysis::alias() doesn't make any sense if there isn't a dominance relationship between the two pointers.

On (1), I think that point has convinced me that we should not create such SCEVs and should return SCEVCouldNotCompute instead. Will give it a bit more thought, but that's the way I'm currently leaning.

On (2), alias set tracker doesn't check dominance relations. I haven't confirmed, but I'm 90% sure that's the issue. aa-eval also does a cross product without concern for dominance.

On (2), alias set tracker doesn't check dominance relations. I haven't confirmed, but I'm 90% sure that's the issue

On trunk, LICM doesn't use AliasSetTracker by default, so this conclusion is a bit suspicious.

Do you think it's worth trying to enforce dominance relations in AA calls more generally, as opposed to trying to address this specifically in SCEVAA?

@efriedma LICM still uses AST for scalar promotion.

It's also not obvious to me why AA queries without dominance relationship are necessarily problematic. Consider something like this:

p = ...
if (...) {
    p1 = gep p, 1
    store p1
}
p2 = gep p, 1
load p2

p1 and p2 are not in a dominance relationship, but an AA query between them seems meaningful (and would answer MustAlias here). I'd expect that e.g. MemorySSA would also perform such queries when traversing MemoryPhi's.

p1 and p2 are not in a dominance relationship, but an AA query between them seems meaningful

In general, an alias() query involves two input pointers/sizes, and a position in the CFG. But in the LLVM API, the "position" is implicit. The easiest way to define this position is "any position dominated by both pointers". Anything else gets more complicated. Consider, for example:

if (cond) {
  a[getc(file1)]++;
} else {
  a[getc(file2)]++;
}

Do these accesses alias? What does it even mean for them to alias?

For your example with a GEP after an if statement, there's a sort of "obvious" rule: you can solve the position issue by implicitly hoisting the GEP before the if statement. I guess we could define some sort of "extended" dominance relationship that includes some amount of hoisting. But it's not clear how the caller would know it's dealing with a value that can be hoisted.

@efriedma For AA, the relevant concept is reachability, not dominance. Querying modref between two instructions is sensible as long as one is reachable from the other. For your example (assuming it is not part of a loop), AA is not meaningful, because the instructions are not reachable in either direction. But generally, there are many cases where AA is meaningful, but no dominance relationship holds. As such, I do not believe it makes sense to enforce dominance on the level of the AA API.
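The difference between the two notions in this comment can be sketched on toy CFGs. This is a minimal, self-contained illustration and not LLVM code: the `CFG` alias, `isReachable`, `dominates`, `ifThenCFG`, and `ifElseCFG` are all hypothetical names, and `dominates` assumes every block is reachable from the entry block 0.

```cpp
#include <vector>

// Toy CFG: G[b] lists the successors of block b. Block 0 is the entry.
using CFG = std::vector<std::vector<int>>;

// Depth-first walk that marks every block reachable from Block,
// never entering the block Skip (pass -1 to skip nothing).
static void visit(const CFG &G, int Block, int Skip, std::vector<bool> &Seen) {
  if (Block == Skip || Seen[Block])
    return;
  Seen[Block] = true;
  for (int Succ : G[Block])
    visit(G, Succ, Skip, Seen);
}

// True if To can be reached from From by following zero or more edges.
bool isReachable(const CFG &G, int From, int To) {
  std::vector<bool> Seen(G.size(), false);
  visit(G, From, /*Skip=*/-1, Seen);
  return Seen[To];
}

// True if every path from the entry to B passes through A, checked by
// asking whether B becomes unreachable once A is removed from the graph.
bool dominates(const CFG &G, int A, int B) {
  if (A == B)
    return true;
  std::vector<bool> Seen(G.size(), false);
  visit(G, /*Block=*/0, /*Skip=*/A, Seen);
  return !Seen[B];
}

// Shape of the GEP example: 0 = entry, 1 = "if" body (p1), 2 = join (p2).
CFG ifThenCFG() { return {{1, 2}, {2}, {}}; }

// Shape of the getc example: 0 = entry, 1 = then, 2 = else, 3 = join.
CFG ifElseCFG() { return {{1, 2}, {3}, {3}, {}}; }
```

On `ifThenCFG()`, blocks 1 and 2 do not dominate each other, yet block 2 is reachable from block 1, so a query between them can be meaningful. On `ifElseCFG()`, blocks 1 and 2 are unreachable from each other in both directions.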

Refresh, mostly to add test for following comment.

After giving this some more thought, I believe I've convinced myself that this is the correct approach. Why? Because it's exactly the approach we already use.

If you look at the newly added test_non_dom2 test case, you'll find an example which is identical to the test_non_dom case except that I replaced a gep with a call to an external function. Assume that external function just contains the original gep, and returns its result. The modified test case didn't crash, and we do in fact form SCEVs with non-dominating arguments.

In the example, we happily form the SCEV "((-1 * %addr1) + %addr2)" where %addr1 and %addr2 represent SCEVUnknowns, and neither dominates the other. As Max points out, it's odd that such an expression isn't expandable, but we already have that problem. This patch does not introduce a new concept after all.

(Edit - And it turns out SCEVExpander already guards against this case. In isSafeToExpandAt, we check that the SCEV dominates the insertion point. In this example, the "odd" SCEVs above will not dominate any insertion point, and are thus never expanded.)

Harbormaster completed remote builds in B111409: Diff 355069. Jun 28 2021, 5:55 PM

efriedma added inline comments. Jun 29 2021, 11:44 AM

llvm/lib/Analysis/ScalarEvolution.cpp
752–755	I'm not sure this produces a strict weak ordering suitable for sorting. We've run into issues with other code that tries to sort on dominates(). The solution is usually to use domtree DFS numbering instead.

reames added inline comments. Jul 1 2021, 10:07 AM

llvm/lib/Analysis/ScalarEvolution.cpp
752–755	I'm happy to make the change, but do you have an example? I don't see how we'd end up with anything problematic here.

efriedma added inline comments. Jul 1 2021, 10:26 AM

llvm/lib/Analysis/ScalarEvolution.cpp
752–755	For example, D103441
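The strict-weak-ordering concern in this thread can be reproduced on a four-node tree. The sketch below is an assumption-laden illustration, not the patch or LLVM's DominatorTree: `DomTree`, `lessByDominance`, `lessByDFSNumber`, `equivalent`, and `exampleTree` are hypothetical names. A comparator used for sorting must make "neither is less than the other" a transitive relation; ordering by dominance alone does not, while ordering by DFS-in number gives a total order.

```cpp
#include <vector>

// Hypothetical dominator tree stored as child lists; node 0 is the root.
struct DomTree {
  std::vector<std::vector<int>> Children;
  std::vector<int> DFSIn, DFSOut;

  explicit DomTree(const std::vector<std::vector<int>> &C)
      : Children(C), DFSIn(C.size()), DFSOut(C.size()) {
    int Clock = 0;
    number(0, Clock);
  }

  // A properly dominates B iff B's DFS interval nests strictly inside A's.
  bool properlyDominates(int A, int B) const {
    return DFSIn[A] < DFSIn[B] && DFSOut[B] < DFSOut[A];
  }

private:
  // Assign entry and exit numbers in a depth-first walk of the tree.
  void number(int N, int &Clock) {
    DFSIn[N] = Clock++;
    for (int Child : Children[N])
      number(Child, Clock);
    DFSOut[N] = Clock++;
  }
};

// Candidate comparator 1: "A sorts first if A properly dominates B".
bool lessByDominance(const DomTree &DT, int A, int B) {
  return DT.properlyDominates(A, B);
}

// Candidate comparator 2: compare DFS-in numbers, which are unique per node.
bool lessByDFSNumber(const DomTree &DT, int A, int B) {
  return DT.DFSIn[A] < DT.DFSIn[B];
}

// Two nodes are equivalent under Less when neither sorts before the other.
template <typename Less>
bool equivalent(const DomTree &DT, Less L, int A, int B) {
  return !L(DT, A, B) && !L(DT, B, A);
}

// Root 0 has children 1 and 2; node 1 has child 3.
DomTree exampleTree() {
  std::vector<std::vector<int>> Children = {{1, 2}, {3}, {}, {}};
  return DomTree(Children);
}
```

Under `lessByDominance`, nodes 1 and 2 are equivalent and nodes 2 and 3 are equivalent, but nodes 1 and 3 are not, so the equivalence is not transitive and the comparator is unsuitable for sorting. Under `lessByDFSNumber`, no two distinct nodes are equivalent, and a dominator still sorts before every node it dominates.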

Address review comment re: dominance sorting

I think you need to call updateDFSNumbers() somewhere?

Harbormaster completed remote builds in B112860: Diff 357066. Jul 7 2021, 2:30 PM

I don't see obvious problems, but I don't really know what AA does with it, so I can't give any useful input. Resigning since there has not been activity for a few months. Once you have addressed the other reviewers' concerns, fine by me.

reames mentioned this in D114112: [SCEVAA] Avoid forming malformed pointer diff expressions. Nov 17 2021, 10:58 AM

Abandoning this in favor of https://reviews.llvm.org/D114112.

With some time to reflect, the relaxed invariant proposed here kept making me more and more nervous. Thankfully, some work done since this was first posted gives us an obvious path forward with an alternate approach.

reames mentioned this in rGad69402f3e19: [SCEVAA] Avoid forming malformed pointer diff expressions. Nov 17 2021, 12:38 PM

Revision Contents

Path	Size
llvm/lib/Analysis/ScalarEvolution.cpp	22 lines
llvm/test/Analysis/ScalarEvolution/scev-aa.ll	128 lines

Diff 357066

llvm/lib/Analysis/ScalarEvolution.cpp

@@ ... @@ if (LBitWidth != RBitWidth)
         return (int)LBitWidth - (int)RBitWidth;
       return LA.ult(RA) ? -1 : 1;
     }

     case scAddRecExpr: {
       const SCEVAddRecExpr *LA = cast<SCEVAddRecExpr>(LHS);
       const SCEVAddRecExpr *RA = cast<SCEVAddRecExpr>(RHS);

-      // There is always a dominance between two recs that are used by one SCEV,
-      // so we can safely sort recs by loop header dominance. We require such
-      // order in getAddExpr.
+      // Sort by dominance order. For SCEVs corresponding to Values, there
+      // must be a strict dominance order of operands. For hypothetical SCEVs,
+      // that property need not hold.
       const Loop *LLoop = LA->getLoop(), *RLoop = RA->getLoop();
       if (LLoop != RLoop) {
         const BasicBlock *LHead = LLoop->getHeader(), *RHead = RLoop->getHeader();
         assert(LHead != RHead && "Two loops share the same header?");
-        if (DT.dominates(LHead, RHead))
-          return 1;
-        else
-          assert(DT.dominates(RHead, LHead) &&
-                 "No dominance between recurrences used by one SCEV?");
-        return -1;
+        auto NumA = DT.getNode(LHead)->getDFSNumIn();
+        auto NumB = DT.getNode(RHead)->getDFSNumIn();
+        return NumA - NumB;
       }

       // Addrec complexity grows with operand count.
       unsigned LNumOps = LA->getNumOperands(), RNumOps = RA->getNumOperands();
       if (LNumOps != RNumOps)
         return (int)LNumOps - (int)RNumOps;

       // Lexicographically compare.
@@ ... @@ for (; Idx < Ops.size() && isa<SCEVAddRecExpr>(Ops[Idx]); ++Idx) {
       }

       // Okay, if there weren't any loop invariants to be folded, check to see if
       // there are multiple AddRec's with the same loop induction variable being
       // added together. If so, we can fold them.
       for (unsigned OtherIdx = Idx+1;
            OtherIdx < Ops.size() && isa<SCEVAddRecExpr>(Ops[OtherIdx]);
            ++OtherIdx) {
-        // We expect the AddRecExpr's to be sorted in reverse dominance order,
-        // so that the 1st found AddRecExpr is dominated by all others.
-        assert(DT.dominates(
-               cast<SCEVAddRecExpr>(Ops[OtherIdx])->getLoop()->getHeader(),
-               AddRec->getLoop()->getHeader()) &&
-               "AddRecExprs are not sorted in reverse dominance order?");
         if (AddRecLoop == cast<SCEVAddRecExpr>(Ops[OtherIdx])->getLoop()) {
           // Other + {A,+,B}<L> + {C,+,D}<L> --> Other + {A+C,+,B+D}<L>
           SmallVector<const SCEV *, 4> AddRecOps(AddRec->operands());
           for (; OtherIdx != Ops.size() && isa<SCEVAddRecExpr>(Ops[OtherIdx]);
                ++OtherIdx) {
             const auto *OtherAddRec = cast<SCEVAddRecExpr>(Ops[OtherIdx]);
             if (OtherAddRec->getLoop() == AddRecLoop) {
               for (unsigned i = 0, e = OtherAddRec->getNumOperands();
[...]

llvm/test/Analysis/ScalarEvolution/scev-aa.ll

@@ ... @@ for.body: ; preds = %entry, %for.body
   %tmp6 = load i64, i64* %p ; <i64> [#uses=1]
   %cmp = icmp slt i64 %inc, %tmp6 ; <i1> [#uses=1]
   br i1 %cmp, label %for.body, label %for.end

 for.end: ; preds = %for.body, %entry
   ret void
 }

-; CHECK: 14 no alias responses
-; CHECK: 26 may alias responses
+; CHECK: Function: test_no_dom: 3 pointers, 0 call sites
+; CHECK: MayAlias: double* %addr1, double* %data
+; CHECK: NoAlias: double* %addr2, double* %data
+; CHECK: NoAlias: double* %addr1, double* %addr2
+
+; In this case, checking %addr1 and %add2 involves two addrecs in two
+; different loops where neither dominates the other. This used to crash
+; because we expected the arguments to an AddExpr to have a strict
+; dominance order.
+define void @test_no_dom(double* %data) {
+entry:
+  br label %for.body
+
+for.body:
+  %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.latch ]
+  %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
+  br i1 undef, label %subloop1, label %subloop2
+
+subloop1:
+  %iv1 = phi i32 [0, %for.body], [%iv1.next, %subloop1]
+  %iv1.next = add i32 %iv1, 1
+  %addr1 = getelementptr double, double* %data, i32 %iv1
+  store double 0.0, double* %addr1
+  %cmp1 = icmp slt i32 %iv1, 200
+  br i1 %cmp1, label %subloop1, label %for.latch
+
+subloop2:
+  %iv2 = phi i32 [400, %for.body], [%iv2.next, %subloop2]
+  %iv2.next = add i32 %iv2, 1
+  %addr2 = getelementptr double, double* %data, i32 %iv2
+  store double 0.0, double* %addr2
+  %cmp2 = icmp slt i32 %iv2, 600
+  br i1 %cmp2, label %subloop2, label %for.latch
+
+for.latch:
+  br label %for.body
+
+for.end:
+  ret void
+}
+
+declare double* @get_addr(i32 %i)
+
+; CHECK: Function: test_no_dom2: 3 pointers, 2 call sites
+; CHECK: MayAlias: double* %addr1, double* %data
+; CHECK: MayAlias: double* %addr2, double* %data
+; CHECK: MayAlias: double* %addr1, double* %addr2
+
+; In this case, checking %addr1 and %add2 involves two addrecs in two
+; different loops where neither dominates the other. This is analogous
+; to test_no_dom, but involves SCEVUnknown as opposed to SCEVAddRecExpr.
+define void @test_no_dom2(double* %data) {
+entry:
+  br label %for.body
+
+for.body:
+  %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.latch ]
+  %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
+  br i1 undef, label %subloop1, label %subloop2
+
+subloop1:
+  %iv1 = phi i32 [0, %for.body], [%iv1.next, %subloop1]
+  %iv1.next = add i32 %iv1, 1
+  %addr1 = call double* @get_addr(i32 %iv1)
+  store double 0.0, double* %addr1
+  %cmp1 = icmp slt i32 %iv1, 200
+  br i1 %cmp1, label %subloop1, label %for.latch
+
+subloop2:
+  %iv2 = phi i32 [400, %for.body], [%iv2.next, %subloop2]
+  %iv2.next = add i32 %iv2, 1
+  %addr2 = call double* @get_addr(i32 %iv2)
+  store double 0.0, double* %addr2
+  %cmp2 = icmp slt i32 %iv2, 600
+  br i1 %cmp2, label %subloop2, label %for.latch
+
+for.latch:
+  br label %for.body
+
+for.end:
+  ret void
+}
+
+
+; CHECK: Function: test_dom: 3 pointers, 0 call sites
+; CHECK: MayAlias: double* %addr1, double* %data
+; CHECK: NoAlias: double* %addr2, double* %data
+; CHECK: NoAlias: double* %addr1, double* %addr2
+
+; This is a variant of test_non_dom where the second subloop is
+; dominated by the first. As a result of that, we can nest the
+; addrecs and cancel out the %data base pointer.
+define void @test_dom(double* %data) {
+entry:
+  br label %for.body
+
+for.body:
+  %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.latch ]
+  %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
+  br label %subloop1
+
+subloop1:
+  %iv1 = phi i32 [0, %for.body], [%iv1.next, %subloop1]
+  %iv1.next = add i32 %iv1, 1
+  %addr1 = getelementptr double, double* %data, i32 %iv1
+  store double 0.0, double* %addr1
+  %cmp1 = icmp slt i32 %iv1, 200
+  br i1 %cmp1, label %subloop1, label %subloop2
+
+subloop2:
+  %iv2 = phi i32 [400, %subloop1], [%iv2.next, %subloop2]
+  %iv2.next = add i32 %iv2, 1
+  %addr2 = getelementptr double, double* %data, i32 %iv2
+  store double 0.0, double* %addr2
+  %cmp2 = icmp slt i32 %iv2, 600
+  br i1 %cmp2, label %subloop2, label %for.latch
+
+for.latch:
+  br label %for.body
+
+for.end:
+  ret void
+}
+
+; CHECK: 18 no alias responses
+; CHECK: 31 may alias responses
 ; CHECK: 18 must alias responses