Page MenuHomePhabricator

[AST] Set 'MayAlias' instead of 'MustAlias' in AliasSets for PHI nodes (bugzilla bug#36801)
Needs ReviewPublic

Authored by rehana on Feb 22 2019, 4:22 PM.

Details

Summary

AliasSetTracker was generating 'MustAlias' sets for PHI nodes for its all possible values. This is wrong as they can not all be aliasing at
the same time. This patch corrects the AliasSetTracker for PHI nodes by setting 'MayAlias' instead of 'MustAlias' in AliasSets.

Diff Detail

Event Timeline

rehana created this revision.Feb 22 2019, 4:22 PM
rehana set the repository for this revision to rG LLVM Github Monorepo.Feb 22 2019, 4:47 PM
rehana retitled this revision from [AST] Set 'MayAlias' instead of 'MustAlias' in AliasSets for PHI nodes to [AST] Set 'MayAlias' instead of 'MustAlias' in AliasSets for PHI nodes (bugzilla 36801).Feb 22 2019, 4:52 PM
rehana edited the summary of this revision. (Show Details)
rehana retitled this revision from [AST] Set 'MayAlias' instead of 'MustAlias' in AliasSets for PHI nodes (bugzilla 36801) to [AST] Set 'MayAlias' instead of 'MustAlias' in AliasSets for PHI nodes (bugzilla bug#36801).
rehana updated this revision to Diff 188221.Feb 25 2019, 10:54 AM

I've uploaded the diff with full context.

This is a pretty tricky one, and I may be missing some subtleties here.

[ Code description below refers to SSAUpdater::RewriteUse called from LoopRotationUtils::RewriteUsesOfClonedInstructions ]

The issue I see is that we're removing a Phi and adding another phi outside the loop. So there's now this user in the loop exit block:
%p.0.lcssa = phi i32* [ %p.0, %for.cond ], [ %p.0, %entry ], where its argument is %p.0 = phi i32* [ null, %entry ], [ %arrayidx, %for.end ]
The SSAUpdater is saying "notify values that I replaced all uses of %p.0 with null", then "notify values that I replaced all uses of %p.0 with %arrayidx" (SSAUpdater::RewriteUse:202).
In fact, only one of the entries in %p.0.lcssa is replaced each time, leading to a correct phi node: %p.0.lcssa = phi i32* [ %arrayidx, %for.cond ], [null, %entry ].
There are two distinct Uses of the same Value. Each is replaced with a different Value. I'm inclined to say calling ValueHandleBase::ValueIsRAUWd is not always correct here. But perhaps the fix should be in the VH?

Now, this example is running a sequence of LoopRotate, LICM, LoopRotate, LICM. The issue is that the info computed by the first LICM is incorrectly updated in LoopRotate and that incorrect result is used in the second LICM.

LICM builds an AliasSetTracker on the first run and caches it. Then LoopRotate calls the above magic using the SSAUpdater, "informing" the cached AliasSetTracker first that %p.0 is replaced with null and then that %p.0 is replaced with %arrayidx (the mechanism is in ASTCallbackVH). So the AliasSetTracker will happily create a MustSet with null and %arrayidx. That's clearly wrong. So, one solution is to check that phis are handled differently and recheck alias info, as this patch proposes. This way we push the fix in the VH.
But, I thought the original intent of copyValue was to guarantee the value replaced/cloned is essentially the same, hence it should MustAlias. This call is breaking that assumption. If I got this assumption wrong, then feel free to correct me.

Alternative solutions are to fix the callsite or to not cache the AliasSetTracker (fwiw, not caching resolves this issue).
I'm not sure how to fix the callsite. For the AST, a single copyValue(%p.0, %p.0.lcssa) is probably correct, but it can't be done given the current codebase. Suggestions welcome.

I don't have anything against this change, and, considering this is a current miscompile, I'm ok to be checked in as is. But AFAICT, the issue is somewhat more complex.
I'll be happy if folks more familiar with the SSAUpdater can chime in.

Then LoopRotate calls the above magic using the SSAUpdater, "informing" the cached AliasSetTracker first that %p.0 is replaced with null and then that %p.0 is replaced with %arrayidx

Calling ValueIsRAUWd (allUsesReplacedWith) when the operation isn't equivalent to an RAUW seems dubious in general; maybe we should add a new callback? Haven't really looked closely at what other implementations of allUsesReplacedWith are doing.

A potential solution at the callsite is to simply not call RAUW if the user is a PHINode. In the given example, the replaced phi is simplified to %p.0 = phi i32* [ %arrayidx, %for.end ] then replaced with %arrayidx = getelementptr inbounds [3 x i32], [3 x i32]* %gc, i64 0, i64 0, which again triggers a call in AST, updating the value.
Result ends up correct (see D58746).

I'm not sure this is correct in general. And I have no idea if there aren't other such cases in other places.

My opinion would be patch this up the best we can now and remove the caching of the AST as soon as possible.
Current patch is a perfectly fine "patch up". D58746 doesn't appear to break anything either and it should save on a needless alias call when calling AliasSetTracker::copyValue.
I don't have a strong preference either way.

I *do* prefer removing the AST caching option, but there will be a compile-time performance penalty if we remove the line doing the caching in LICM now.
Enabling MemorySSA removes the caching and it should have compile-time benefits. But this one needs more extensive testing to increase confidence.

I don't really like this even as a temporary fix. Sure, if the PHI node is inside the loop, we're not optimizing it anyway, but I would expect it isn't uncommon to have a PHI node outside the loop, which this would block optimizing. And I'm worried the SSAUpdater usage of ValueIsRAUWd could cause bugs elsewhere.

Sure, if the PHI node is inside the loop, we're not optimizing it anyway, but I would expect it isn't uncommon to have a PHI node outside the loop, which this would block optimizing

I'm sorry, I don't really understand what you mean here. What blocked optimizations are you referring to?

And I'm worried the SSAUpdater usage of ValueIsRAUWd could cause bugs elsewhere.

I agree. Both fixes proposed only address this known bug. There may be other bugs caused by this usage.
I'm not familiar enough with that piece of code to understand other potential bugs, but I think it is reasonable to try to address this issue soon since its a known miscompile.
That code in the SSAUpdater seems to not have changed for 7 years, so if that usage causes other bugs, they're very well hidden...

What blocked optimizations are you referring to?

This patch blocks store sinking/promotion for any store where the pointer operand is a PHI node, no matter where in the function that PHI node is defined, or when it was created.

That code in the SSAUpdater seems to not have changed for 7 years, so if that usage causes other bugs, they're very well hidden...

Caching bugs tend to be hard to discover... as shown by the fact nobody found this bug for years either.

Getting back to this, do you have a suggestion for making progress on fixing this mis-compile?
Is the alternative in D58746 an option?

sanjoy removed a reviewer: sanjoy.Mar 31 2019, 10:35 AM

I'll look a little more at D58746, but that seems more like the right direction

Spent a bit more time looking, and thinking, and I think I have a clearer picture of what's happening.

Given that a pointer inside a loop is getting rewritten with an arbitrary SSAUpdater transform, we can't really keep the AliasSetTracker up to date given the information from the callback. We can't tell what updates are actually relevant. And even if we could, the updates might provide additional information about the pointers in question which would allow splitting the alias set.

The PHI node thing is sort of a red herring; it does not matter at all whether the operation that produces or consumes the pointer is a PHI node, in general. Maybe for certain specific transforms that's true, but more generally it doesn't make sense; consider, for example, LICM itself. (That particular interaction doesn't actually happen, I think, because we don't run LICM twice in a row on the same loop nest, but other passes could perform similar transforms.)

Given the way this works, what information can we reasonably preserve? If want to preserve the AliasSetTracker, and we assume that loop passes don't perform any transforms that actually change the aliasing characteristics of a loop (which I'm not sure is really a safe assumption, but we have to assume it to preserve the AliasSetTracker at all), I guess this patch is essentially the right solution... except that the !(isa<PHINode>(From)) bit doesn't really make any sense, unless we're specifically assuming that the only pass which can trigger copyValue callbacks is LoopRotate. The only other reasonable alternative is, if we get any updates from SSAUpdater, we completely discard the AliasSetTracker and recompute it later.

Of course, this would all be much more straightforward if LICM used MemorySSA instead of using its crazy hack to cache the AliasSetTracker: MemorySSA is a proper analysis that other passes can accurately preserve.

If we're going to do this as a temporary fix, this needs much better comments explaining when exactly we expect the callback to be called, and why we chose this particular solution.

Following the introduction of BatchAA (https://reviews.llvm.org/D59315), this patch is certainly *not* an option, because it would block using BatchAA in the AliasSetTracker (https://reviews.llvm.org/D59438). Using BatchAA is not correct if there are subsequent alias calls after doing CFG changes. And calling addPointer with anything but true as the next to last argument may make an alias call.

My priority now is to enable MemorySSA, so that we don't need to do anything for the AliasSetTracker.

If we do a temporary fix until then, I entirely agree it needs to documented well what it's doing and why.
The options I see right now: (1) D58746, will *lots* of details about why we're doing the replacements that way. This may have problems I overlooked. (2) Discard all AST caching (remove ~LICM:447). This will lead to an increase in compile time for all code using the old pass manager (3) Discard AST on phi replacements. This involves sending a reference of DeleteAST into the promote step, and have it updated in the copyValue() callback.