This is an archive of the discontinued LLVM Phabricator instance.

[NewGVN] Split OpPHI detection and creation.
ClosedPublic

Authored by fhahn on Feb 28 2018, 3:58 AM.

Details

Summary

It also adds a check making sure PHIs for operands are all in the same
block.

Patch by Daniel Berlin <dberlin@dberlin.org>

Diff Detail

Repository
rL LLVM

Event Timeline

fhahn created this revision.Feb 28 2018, 3:58 AM

I don't understand how this can happen, since the value phi should be equivalent to an op phi in the same block.
I need to look at this more.

So, we are definitely "doing it wrong".

This code is definitely broken in the case that multiple operands are defined by phi of ops, and they are not in the same block.
In fact, i think it's now broken with all multiple phis, unless the phis are in the same block. It is just getting lucky because it is getting blocked from using them.

It also badly needs refactoring because of how long it took to notice this.

Here is what is happening:

Let me start with context. The reason it differentiates the phi block from the instruction block is to catch the case the instruction operand is dependent on a dominating phi

ie
bb0
a = phi
goto bb1

bb1:
b = add a, 1

These are to transform valid phi of ops.
Here it will evaluate add a, 1 for each predecessor block of *a*

The limitation on the code the way it is currently written, however, is that it assumes *all* phis for the instruction are in the same block.
Thus, it only evaluates the phiblock once, and generates the operation in the phi block and sees what happens.
You can see this in the code if you look hard enough. the loop over the original instruction operands will return an expression or nullptr the second it hits an operand that is a phi.
The only reason it translates each phi in the expression when they are in the same block is because it does `Op = Op->DoPHITranslation(PHIBlock, PredBB);` for each pred.
Which will handle it regardless of whether it's the original phi or not.

If you give it something like:
bb:

br label %bb1

bb1:

%tmp = phi i32 [ 0, %bb ], [ %tmp10, %bb6 ]

bb6:

%tmp7 = phi i32 [ 0, %bb1 ], [ 1, %bb4 ], [ 1, %bb5 ]
%tmp9 = or i32 %tmp, %tmp7

It will try to evaluate tmp9 in the tmp1 block as or i32 <phi pred>, %tmp7, and either

  1. fail to find anything
  2. fail safety checks
  3. get unlucky, find something, and crash

At least as far as i've been able to reproduce, the only case #3 occurs is when we've made phi of ops for the operands, as you see in this case.
There, it gets past the safety checking/etc, and all the way to the valuephi creation, but now the value phi is in the wrong block.

The correct way to handle all of this is unfortunately recursive.
We'd have to translate through each phi and phi block that occurs as an operand, and evaluate the operation in every possible predecessor of those, and merge those results.

This is unlikely to be worth it ATM, it seems incredibly uncommon.
It would be easy, however, if we refactored this code sanely.

For now, safety wise, we should verify all phis for an op are in the same block and give up if not.
I'll send along a patch to fhahn that works but is otherwise untested.

There is also a case 4 here i forgot, which is get unlucky, find something (where all blocks have the same number of preds) and probably generate wrong code :)

fhahn added a comment.Mar 6 2018, 9:37 AM

There, it gets past the safety checking/etc, and all the way to the valuephi creation, but now the value phi is in the wrong block.

The correct way to handle all of this is unfortunately recursive.
We'd have to translate through each phi and phi block that occurs as an operand, and evaluate the operation in every possible predecessor of those, and merge those results.

This is unlikely to be worth it ATM, it seems incredibly uncommon.
It would be easy, however, if we refactored this code sanely.

For now, safety wise, we should verify all phis for an op are in the same block and give up if not.
I'll send along a patch to fhahn that works but is otherwise untested.

Thanks for tracking that down and for sending along a patch. As far as I can see, it splits up analysis and the phi creation, which makes it much easier to see what going on :) It also fixes the issue by checking if all phis for on op are in the same block

Given that, I was wondering if there is anything left you want me to do or I should look into? :)

At a minimum, the patch needs bootstrap and testing, i didn't test it except on this case and test/Transforms/NewGVN.
:)

fhahn added a comment.Mar 7 2018, 9:31 AM

At a minimum, the patch needs bootstrap and testing, i didn't test it except on this case and test/Transforms/NewGVN.
:)

Done!

With and without this patch, bootstrapping clang/llvm fails with "value number changed after ..." assertions. I finally took the time and re-visited D42180. One case we seem to miss is when changes to the translated operands of the ValueOp cause us to find a leader for the ValueOp, for which we previously did not find a leader and returned early. In this case no additional users are added to trigger re-evaluation of the phi node.

Together with D42180, bootstrapping passed. And also running the test-suite.

So, just to put this on the review: the current code on this review definitely not the right fix compared to the patch i sent :)
I think we should commit that split/cleanup (not sure if you did it as part of the last patch.
It sounds like you had time to bootstrap/test it. If so, please feel free to commit it

fhahn added a comment.Apr 19 2018, 7:38 AM

So, just to put this on the review: the current code on this review definitely not the right fix compared to the patch i sent :)
I think we should commit that split/cleanup (not sure if you did it as part of the last patch.
It sounds like you had time to bootstrap/test it. If so, please feel free to commit it

Yep, the diff in Phab is out-dated. I'll commit D42180 first, and then rebase your patch and do another bootstrap round and update the diff here.

fhahn updated this revision to Diff 143314.Apr 20 2018, 7:40 AM
fhahn retitled this revision from [NewGVN] Create new ValuePHI node, if the number of operands does not match. to [NewGVN] Split OpPHI detection and creation..
fhahn edited the summary of this revision. (Show Details)

This is mostly @dberlin 's patch rebased and with the test case.

This revision was not accepted when it landed; it landed in state Needs Review.Apr 20 2018, 9:40 AM
This revision was automatically updated to reflect the committed changes.