This is an archive of the discontinued LLVM Phabricator instance.

[SCEVAA] Don't crash on pointers with no dominance relationship.
Changes PlannedPublic

Authored by efriedma on Jan 2 2018, 6:29 PM.

Details

Summary

If you try to compute the difference between two SCEV values which don't have a dominance relationship (so there's no point in the source code where the difference could actually be computed), SCEV will crash. This patch teaches SCEVAA not to do that.

I'm not sure this is the right fix; it seems like I'm working around the underlying issue rather than actually solving it. Does it makes sense to call AliasAnalysis::alias() in the case where HasDominanceRelation returns false? Alias analysis is normally defined in terms of memory operations, so I'm not sure what alias() means in the case where you can't construct a memory operation which refers to both pointers.

Fixes https://bugs.llvm.org/show_bug.cgi?id=33761.

Diff Detail

Repository
rL LLVM

Event Timeline

efriedma created this revision.Jan 2 2018, 6:29 PM

I haven't thought about this in depth, but it seems like SCEV theoretically should be able to cope under such circumstances.

From the comments in PR33761, SCEV is being asked to compute the difference between ((8 * %tmp2) + %tmp) and {%tmp,+,64}<%bb4>, where %tmp2 doesn't dominate the loop. It seems the result should be something like {0,+,64}<%bb4> + (-8 * %tmp2), and there is even a point in the program where the expression can be computed -- immediately after %tmp2.

there is even a point in the program where the expression can be computed -- immediately after %tmp2.

That doesn't seem right... what's the value of {0,+,64}<%bb4> in the case the loop doesn't execute? (Assuming you replace "i1 true" with an actual condition.)

First, it's definitely the case that you can't compute the difference for all SCEVs. It's equivalent to being able to symbolically executing the program with only static information :)

It is possible to compute the difference between the expressions in some cases, using symbolic execution to come up with path sensitive symbolic answers.
Given alias() is giving all-paths answers, you'd need to meet those answers over all paths anyway.

It's possible that gives you a better answer than "no idea" (IE you discover that on any path, they cannot alias).

It's almost certainly not worth it.

Dominance checking is basically shortcutting this to try to difference them only when you can prove that all paths must lead to computing the result :)

lib/Analysis/ScalarEvolutionAliasAnalysis.cpp
101

FWIW: This rediscovers the same info again and again .
But i'm not sure it's that slow.

IE you could cache scev->largest domtree dfs number interval, and only have to do this computation once per scev.
In fact, it also speeds up the other computation if you want

IE if the dom tree dfs in/out numbers for scev A are {50, 75}, when you call GetBottom on scev B, the second you see an operand with a set of dfs numbers that is either not between those or not encompassing those, there is no dominance relationship between the scevs and you can stop.

(Of course, then you can't cache the dfs number pair for that scev, but ...)

It's fine if you want to leave all this as a comment, i don't know if it's slow enough to be worth doing atm.

Ah, I overlooked that the loop doesn't dominate the exit block.

That said, it suggests another idea here, which would be to permit the formation of expressions like {0,+,64}<%bb4> + (-8 * %tmp2) in SCEV itself, and then just assert if such an expression is passed into SCEVExpander or similar.

I'm not sure this is the right fix; it seems like I'm working around the underlying issue rather than actually solving it.

I agree. This seems like working around the underlying issue.

Does it makes sense to call AliasAnalysis::alias() in the case where HasDominanceRelation returns false?

No, I don't think it does (for the reason you've specified). We only support aliasing queries between values with some dominance relation.

I couple of thoughts:

  1. If we wanted to handle this somehow in SCEVAAResult::alias, can't we just put a dominance check on the values directly instead of trying to examine the SCEVs? Meaning:
Instruction *IA = dyn_cast<Instruction>(LocA.Ptr),
                      *IB = dyn_cast<Instruction>(LocB.Ptr);
if (IA && IB &&
     (DT->dominates(IA->getParent(), IB->getParent()) ||
      DT->dominates(IB->getParent(), IA->getParent())) &&
     "Queries between values without a dominance relationship is not supported");

(or returning a conservative answer instead of asserting)

  1. I think that what AliasSetPrinter is doing is just not well defined. We should really fix it (although, frankly, I'm not sure how without increasing its asymptotic complexity by having it create a set for each block only with others with a dominance relationship, but maybe that's okay as AliasSetPrinter is a diagnostic tool).

If we wanted to handle this somehow in SCEVAAResult::alias, can't we just put a dominance check on the values directly instead of trying to examine the SCEVs?

Oh. Yes, that would be simpler. :) Maybe slightly less powerful, depending on the structure of the code.

I think that what AliasSetPrinter is doing is just not well defined

In this case, it's probably worth noting that I have a testcase which doesn't involve aa-eval; BasicAA also makes queries like this (see BasicAAResult::aliasPHI). I can reduce it if that would be interesting.

If we wanted to handle this somehow in SCEVAAResult::alias, can't we just put a dominance check on the values directly instead of trying to examine the SCEVs?

Oh. Yes, that would be simpler. :) Maybe slightly less powerful, depending on the structure of the code.

I think that what AliasSetPrinter is doing is just not well defined

In this case, it's probably worth noting that I have a testcase which doesn't involve aa-eval; BasicAA also makes queries like this (see BasicAAResult::aliasPHI). I can reduce it if that would be interesting.

I'd find that interesting.

In general, I think that we need to decide what the interface contract here is. I had thought that we required a dominance relationship because there had to be at least on well-defined point where both values could be simultaneously evaluated in order to produce a well-defined result. This might be too strict. I can imagine saying something along these lines but allowing for some kind of hypothetical PHI translation (i.e. saying that the values can be compared if there could exist some series of well-defined PHIs that would bring the values together under at least one valid path through the CFG (albeit under different names). I get a little worried here in defining how this works in the face of backedges (because I need alias (%a, %b) to, say, compare in the current loop iteration, not %b from some loop iteration against %a hypothetically PHI-translated into some other iteration). Our BasicAA::aliasPHI, as I believe we discovered when investigating some bug involving a use of ValueTracking, is somewhat-fundamentally broken in some related sense, so I find your comment unsurprising in that regard.

In any case, I'd be fine with doing a simple dominance check in SCEVAA and working on a separate patch to clean up the docs about what AA means and then trying to clean up everything else.

efriedma planned changes to this revision.Aug 22 2018, 6:41 PM
lkail added a subscriber: lkail.Feb 18 2020, 2:43 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 18 2020, 2:43 AM
sanjoy resigned from this revision.Jan 29 2022, 5:43 PM