This is an archive of the discontinued LLVM Phabricator instance.

[NewGVN] Re-evaluate phi of ops after moving an instr to new class
AbandonedPublic

Authored by fhahn on Jan 17 2018, 6:30 AM.

Details

Summary

[NewGVN] After moving an instr to a new class, re-evaluate phi of ops.

After moving an instruction to a new class, its leader could have
changed, which means we might be able to simplify phi of ops involving
it as expression.

Diff Detail

Event Timeline

fhahn created this revision.Jan 17 2018, 6:30 AM

At a glance, i don't believe this is correct.
The ClassChanged part above should have already done this.
If not, can you detail me how it is happening?

Note also that we mark all operands that are part of the expression as Deps. If the leader of those operands changed, they would already be marked as changed as Users through the addAdditionalUsers part we do.

fhahn added a comment.Jan 17 2018, 8:25 AM

Thanks for the having a look so quickly and thanks for the pointers. I'll have a look there to see why the phi-of-ops is not re-evaluated.

In the test case, the following does happen:

  1. when trying to find a phi-of-ops for add nsw i32 %inc7, -1 the first time, the symbolic expression we find is { ExpressionTypeVariable, opcode = 78, variable = %y.0.0 = call i32 @llvm.ssa.copy.i32(i32 %y.0)} , which is not suitable ( as %y.0.0 does not dominate the instruction
  2. when processing %y.0.0 = call i32 @llvm.ssa.copy.i32(i32 %y.0), move the instruction to a new class for expression { ExpressionTypeVariable, opcode = 77, variable = %y.0 = phi i32 [ 1, %entry ], [ %inc7, %for.inc6 ]}. This is != the expression we put in ExpressionToPhiOfOps, so we do not re-evaluate it.
  3. While verifying, we find { ExpressionTypeVariable, opcode = 77, variable = %y.0 = phi i32 [ 1, %entry ], [ %inc7, %for.inc6 ]} when evaluating add nsw i32 %inc7, -1 , which dominates the instruction and the phi-of-ops can be created.

At a glance, i don't believe this is correct.
The ClassChanged part above should have already done this.
If not, can you detail me how it is happening?

What follows is my initial understanding after starting having a look at NewGVN and I am probably missing a couple of things. I would greatly appreciate any thoughts and pointers :)

I think the problem is the following: ExpressionToPhiOfOps is used to re-process the phi-of-ops, in case the leader of the symbolic (gvn) expression changed. In the ClassChanged case, we currently re-process any phi-of-ops that where using expression E, as its leader has changed. But by moving I to a new class, the leader of all variable expressions of I might has changed too, e.g. because I was not the leader of the old class.

It seems like we are missing this case. So if during phi-of-ops processing, we failed with symbolic (gvn) expression E1(I), and E != E1(I), we do not re-process the phi-of-ops instruction, whereas I think we should re-process all phi-of-ops instructions, which depended on any symbolic expression involving I. "depend" here might be not the best term, I mean the symbolic expression we evaluated in findLeaderForInst.

Note also that we mark all operands that are part of the expression as Deps. If the leader of those operands changed, they would already be marked as changed as Users through the addAdditionalUsers part we do.

I think this is a symbolic dependency, rather than an additional use , as it is no operand of the new expressions.

Also, it seems like the debug output "New class .... for expression ..." could be misleading in some cases. For example, the code path also gets hit if E is a VariableExpression with an existing class and the value I is just moved to the existing EClass, without the class of E being changed.

dberlin added a comment.EditedJan 17 2018, 12:39 PM

At a glance, i don't believe this is correct.
The ClassChanged part above should have already done this.
If not, can you detail me how it is happening?

What follows is my initial understanding after starting having a look at NewGVN and I am probably missing a couple of things. I would greatly appreciate any thoughts and pointers :)

I think the problem is the following: ExpressionToPhiOfOps is used to re-process the phi-of-ops, in case the leader of the symbolic (gvn) expression changed.

Yes, that is true, but that case should only occur if we have not created a phi of ops, and need to mark what will happen later.

In the ClassChanged case, we currently re-process any phi-of-ops that where using expression E, as its leader has changed. But by moving I to a new class, the leader of all variable expressions of I might has changed too, e.g. because I was not the leader of the old class.

Yes, but this should be taken care of by the code mentioned, by adding those ops as dependencies on the instruction.

It seems like we are missing this case.

We should not be :)

So if during phi-of-ops processing, we failed with symbolic (gvn) expression E1(I), and E != E1(I), we do not re-process the phi-of-ops instruction, whereas I think we should re-process all phi-of-ops instructions, which depended on any symbolic expression involving I. "depend" here might be not the best term, I mean the symbolic expression we evaluated in findLeaderForInst.

Note also that we mark all operands that are part of the expression as Deps. If the leader of those operands changed, they would already be marked as changed as Users through the addAdditionalUsers part we do.

I think this is a symbolic dependency, rather than an additional use , as it is no operand of the new expressions.

The phi of ops can only depend on the expressions that exist for the operands, and those expressions change only if the evaluation of the instruction changes. In such a case, the instruction will move classes, and hit the class changed case. It will then mark the operands in the phi of ops as changed, as they are users of that instruction (either additional users, or regular users). If they are symbolically marked, they should get caught by the other mark call in classchanged.
What is the case where the instructions that make up those operands don't move classes, but the expression still changes anyway? Or what is the case where we have not added the dependency, but needed to.

Your patch actually just marks the phi of ops expressions as changed *even when all that happened was a leader change, and the class did not change*.
I'm unclear on why that fixes anything :)

Also, it seems like the debug output "New class .... for expression ..." could be misleading in some cases. For example, the code path also gets hit if E is a VariableExpression with an existing class and the value I is just moved to the existing EClass, without the class of E being changed.

VariableExpressions do not actually have classes on their own, and we'll never store them :)
But it still should fall into the classchanged case if it matters.

dberlin added a comment.EditedJan 17 2018, 12:55 PM

To try to help explain this (and feel free to turn it into a comment in NewGVN), for a phi of ops, let me try to cover the cases.

We are trying to transform operations on phis into phis of existing operations.

IE add phi(a, b) 5
into
phi(existing add a, 5, existing add b, 5)

there are essentially two main cases:

  1. We found existing-in-the-program operands for all cases of the phi (or they are constants). IE we found existing instructions or constant, in each predecessor, for the transformed operation. In this case, we will make a new phi, with each operand being one of those existing value or constants. We need to make sure that when those new phi operands change congruence class, we re-evaluate the phi of ops. It should not be needed for the phi of ops to be re-evaluated otherwise (because it only depends on those operands being valid members of their congruence classes).
  1. We did not find existing-in-the-program operands for all cases. In this case, we need to make sure that "if something changes where we might find them, we re-evaluate".

In this case, we mark, symbolically, what expression we were looking for that was missing. If the expression appears, we should re-evaluate the phi of ops.

Now, maybe you can think of a case i've missed here. Awesome. I don't claim the invariants are perfect, and it definitely requires a lot of thought to get right, so it's entirely possible i'm wrong :)

The code we have is attempting to maintain these invariants.
It is very easy to just over-mark things and make the bugs go away, but eventually it will just hit more bugs (or make it super slow). In particular, zhendong's fuzzer is amazingly good at sussing out fixes that are bandaids, and always seems to find the corner case we miss :)
Given that, historically, when we've hit similar problems, our solution has been to write verifiers. This is why you have the "value changed after loop completed" verifier (Old GVN cannot survive that verifier, for reference, because re-running analysis definitely changes results. We're trying to do better).

It may be worth writing a phi of ops invariant verifier and see what shakes out. I certainly don't claim this code is perfect, i just want us to push it in the direction of understanding the invariants we need to maintain, maintaining those invariants and verifying we are doing that.

I usually try to think through it a bit, then give up and write something to just make the contract we are trying to keep explicit and verify it.

fhahn added a comment.Jan 21 2018, 3:38 PM

Thank you very much for providing such a detailed response, that's really helpful and I really appreciate it :)

I'll try to add some additional checks for invariants to pin down where something went wrong. I think either of the following conditions should hold just before the end of the if (ClassChanged || LeaderChanged) { block, for all pairs (E, I) in ExpressionToPhiOfOps:

  • the leader of the class for expression E (getClassForExpression) is not changed by moveValueToNewCongruenceClass
  • I is in TouchedInstructions or any instruction with I as one of its users is in TouchedInstructions

In other words, if the leader for expression E changes, I should get re-processed. Does that make sense?

Your patch actually just marks the phi of ops expressions as changed *even when all that happened was a leader change, and the class did not change*.

It is sufficient to only have the additional markPhiOfOpsChanged(createVariableOrConstant(I)) if ClassChange, but I agree it is quite heavy handed and not complete.

fhahn added a comment.Jan 21 2018, 3:40 PM

In other words, if the leader for expression E changes, I should get re-processed. Does that make sense?

Something stronger should also hold I think: if the leader of any operand of E changes, I should get re-processed.

fhahn added a comment.Jan 29 2018, 2:40 PM

Note also that we mark all operands that are part of the expression as Deps. If the leader of those operands changed, they would already be marked as changed as Users through the addAdditionalUsers part we do.

I had a look at how performSymbolicEvaluation adds additional users in case it managed to find existing values that could be used to simplify the value we evaluate.

In findLeaderForInst, we evaluate a temporary instruction (based on an original instruction OI). If we find an existing value VE to simplify this temporary instruction, we currently do not track that OI (which the temporary is based on) depends on VE, as checkSimplificationResults only adds additional dependencies for non temporary instructions. But if VE moves classes, the symbolic evaluation could change and thus we might find a suitable leader, but we do not detect that currently. I am probably missing more again, but from your responses I *think* that in that case we should have recorded an additional dependency for OI, to cover that case. It would be great if you could let me know if that makes sense.

(The invariant checks I have still need some cleanup...)

mcrosier added a subscriber: mcrosier.
fhahn added a comment.Mar 14 2018, 4:22 PM

This approach is definitely wrong. I think I know where we miss the precise dependency in that case and will submit a different patch for it

fhahn abandoned this revision.Apr 20 2018, 10:09 AM

Thanks for all the feedback. Abandoning this change now. rL330334 covers one case where we were missing dependencies. There is at least one other case remaining I think and I will try to post a patch for that soonish.

Okay. There is definitely at least 1 cause of value numbers changing after
the main loop completes that is not phi-of-ops related
(it's SCC related)