In formLCSSAForInstructions we speculatively add new PHI
nodes, that sometimes ends up without having any uses. It
has been discovered that sometimes an added PHI node can
appear as being unused in one iteration of the Worklist,
although it can end up being used by a PHI node added in
a later iteration. We now check, a second time, that the
PHI node still is unused before we remove it. This avoids
an assert about "Trying to remove a phi with uses." for the
added test case.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
lib/Transforms/Utils/LCSSA.cpp | ||
---|---|---|
217 ↗ | (On Diff #145170) | /createe/create/ |
Thanks! This LGTM. Just one suggestion.
lib/Transforms/Utils/LCSSA.cpp | ||
---|---|---|
217 ↗ | (On Diff #145170) | Just a suggestion. Since we don't need this container until later, around line 232, can we move this declaration there, and add the unused PHIs at that point? |
@mzolotukhin can you please take a look? This broke a lot in the past and you're the one who fixed it, so you're in the best position to review the patch.
Hi,
First of all, please clean-up the testcase. The bugpoint-output tests are impossible to understand and maintain (it is impossible to say if such a test checks anything at all after some time). I'm pretty sure you can expose the same issue with much a smaller test.
Second, why don't just replace
assert (PN->use_empty() && "Trying to remove a phi with uses.");
with
if (PN->use_empty()) continue;
?
When this code was written the assumption was that it's impossible to have a Phi-node with uses in that list. If your test case proves this to be wrong, then replace the assertion with a check.
Thanks,
Michael
lib/Transforms/Utils/LCSSA.cpp | ||
---|---|---|
233–235 ↗ | (On Diff #145170) | Could we ever iterate this loop more than once? How could we have phi-nodes to be deleted on the second iteration (~why were not they removed on the first one)? |
lib/Transforms/Utils/LCSSA.cpp | ||
---|---|---|
233–235 ↗ | (On Diff #145170) | Thanks for you feedback. I'll try to explain a little bit why I picked this solution. In the past PHI-nodes were added to the PHIsToRemove set, while iterating over the worklist in the loop above, and then they were removed here after the loop. The problem was that a PHI node created speculatively in one such iteration, having no uses and having been inserted in the PHIsToRemove set, was used by another speculatively created PHI node created in a later iteration. So when doing the cleanup the PHI node that was targeted for removal suddenly had uses. I guess that use can either be by a PHI node that should be kept, or by a PHI node that is in the PHIsToRemove set (it can even be used by a PHI node, that is used by a PHI node, that is used by a PHI node, and some/all of those are in the PHIsToRemove list). I do not know why we wait with the removal until after having finished with the worklist. Do you think it is possible to remove the PHI nodes directly in the first loop? Is it guaranteed that a PHI node, without uses, being inserted in PHIsToRemove only can be used by later created PHI nodes? Another question is if it is important that we remove all unused PHI nodes? My proposed patch was based on limited knowledge about the above questions. I picked a solution that removes all speculatively created PHI nodes that are unused after the first loop, including the PHI nodes that are used by PHI nodes that we remove. This solution should work regardless of the algorithm used in the first loop (such as in which order PHI nodes are created etc.). Maybe that was too ambitous. |
Ok, I will make an attempt to reduce it further, but if there was an easy way to trigger the problem, then I guess we would have seen it a long time ago (and bugpoint would have been able to remove some more basic blocks).
If it is preferred, then I can remove all the checks and keep it as a "just check that this no longer asserts" kind of test.
Ok, I will make an attempt to reduce it further, but if there was an easy way to trigger the problem, then I guess we would have seen it a long time ago (and bugpoint would have been able to remove some more basic blocks).
If it is preferred, then I can remove all the checks and keep it as a "just check that this no longer asserts" kind of test.
Yes, it is much preferred, thank you! If we have more than one problem here, then we need more small tests, but not one huge test.
Do you think it is possible to remove the PHI nodes directly in the first loop?
I think it's possible.
Is it guaranteed that a PHI node, without uses, being inserted in PHIsToRemove only can be used by later created PHI nodes
PHI-nodes are only inserted when we call SSAUpdater.RewriteUse. I did not expected that their use lists change afterwards, so frankly, I'm surprised that inserted PHIs get more uses on later iterations - that's why I'm asking for a simpler test that could help to understand how and why this happens.
Then it might be enough to iterate in reverse insertion order...
Why do we need to reverse the order?
Another question is if it is important that we remove all unused PHI nodes?
If it is OK to do it quick and dirty, and only remove some of the speculatively created and unused PHI nodes, then we can simplify this.
I'm not sure I got this, could you elaborate on this please? What would be a quicker way?
I picked a solution that removes all speculatively created PHI nodes that are unused after the first loop, including the PHI nodes that are used by PHI nodes that we remove.
The inline comment I had was about your specific implementation, not about the general idea behind it. How is it possible to iterate that loop more than once?
Thanks,
Michael
Updated the test case (elimintated more branches/blocks, renamed variables/labels, removed checks).
Is there an intermediate IR dump you can send me of where you end up with
phis using phis that are useless?
(I'd also like to understand if you ever get cycles of that happening as
well)
With the new test case the method formLCSSAForInstructions is invoked three times (three loops?).
The third time the following PHI nodes are created (below is the order in which they are inserted in PHIsToRemoveIfNotUsed):
%value.lcssa = phi i16 [ %value.lcssa, %www ], [ %value, %gazank ] %value.lcssa1 = phi i16 [ %value, %gazink ] %value.lcssa2 = phi i16 [ %value3.lcssa, %exit1.pred ], [ %value, %gazonk ] %value3.lcssa = phi i16 [ %value3, %foo ] %value3.lcssa4 = phi i16 [ undef, %unreachable1 ], [ %value3, %bar ] %value.lcssa.lcssa = phi i16 [ %value3.lcssa4, %exit2.pred ], [ %value.lcssa, %qqq ] %value.lcssa.lcssa5 = phi i16 [ %value.lcssa, %www ]
%value3.lcssa4 is for example used by a later created PHI node %value.lcssa.lcssa = phi.
The removal will be done in two iterations.
Iteration 1 removes:
%value.lcssa2 = phi i16 [ %value3.lcssa, %exit1.pred ], [ %value, %gazonk ] %value3.lcssa = phi i16 [ %value3, %foo ] %value.lcssa.lcssa = phi i16 [ %value3.lcssa4, %exit2.pred ], [ %value.lcssa, %qqq ]
So the remaining candidates for the second iteration are:
%value.lcssa = phi i16 [ %value.lcssa, %www ], [ %value, %gazank ] %value.lcssa1 = phi i16 [ %value, %gazink ] %value3.lcssa4 = phi i16 [ undef, %unreachable1 ], [ %value3, %bar ] %value.lcssa.lcssa5 = phi i16 [ %value.lcssa, %www ]
Iteration 2 removes:
%value3.lcssa4 = phi i16 [ undef, %unreachable1 ], [ %value3, %bar ]
However, when examining that PHI node in the first iteration we could not remove it because we had not yet removed %value.lcssa.lcssa = phi.
The remaining PHI nodes that were created by formLCSSAForInstructions are used elsewhere, so they are not removed.
This is what the function looks like if dumping it just before we start to remove unused PHI:s.
The PHI nodes that are removed are marked with <--------------
define void @test() { entry: br label %gazank gazank: ; preds = %gazonk, %entry %value = phi i16 [ 0, %entry ], [ undef, %gazonk ] br i1 undef, label %gazink, label %qqq gazink: ; preds = %gazank br i1 undef, label %gazonk, label %infinite.loop.pred gazonk: ; preds = %gazink br i1 undef, label %exit1, label %gazank qqq: ; preds = %www, %gazank %value.lcssa = phi i16 [ %value.lcssa, %www ], [ %value, %gazank ] br i1 undef, label %www, label %exit2 www: ; preds = %qqq br i1 undef, label %qqq, label %foo.pred foo.pred: ; preds = %www %value.lcssa.lcssa5 = phi i16 [ %value.lcssa, %www ] br label %foo foo: ; preds = %unreachable1, %bar, %foo.pred %value3 = phi i16 [ undef, %unreachable1 ], [ %value3, %bar ], [ %value.lcssa.lcssa5, %foo.pred ] br i1 undef, label %bar, label %exit1.pred bar: ; preds = %foo br i1 undef, label %foo, label %exit2.pred unreachable1: ; No predecessors! br i1 undef, label %foo, label %exit2.pred exit1.pred: ; preds = %foo %value3.lcssa = phi i16 [ %value3, %foo ] <----------------- br label %exit1 exit1: ; preds = %exit1.pred, %gazonk %value.lcssa2 = phi i16 [ %value3.lcssa, %exit1.pred ], [ %value, %gazonk ] <----------------- ret void exit2.pred: ; preds = %unreachable1, %bar %value3.lcssa4 = phi i16 [ undef, %unreachable1 ], [ %value3, %bar ] <----------------- br label %exit2 exit2: ; preds = %exit2.pred, %qqq %value.lcssa.lcssa = phi i16 [ %value3.lcssa4, %exit2.pred ], [ %value.lcssa, %qqq ] <----------------- ret void infinite.loop.pred: ; preds = %gazink %value.lcssa1 = phi i16 [ %value, %gazink ] br label %infinite.loop infinite.loop: ; preds = %infinite.loop, %infinite.loop.pred %dead = phi i16 [ %value.lcssa1, %infinite.loop.pred ], [ 0, %infinite.loop ] br label %infinite.loop }
As seen in my earlier post we remove the following PHI nodes when using my patch:
%value.lcssa2 = phi i16 [ %value3.lcssa, %exit1.pred ], [ %value, %gazonk ] %value3.lcssa = phi i16 [ %value3, %foo ] %value3.lcssa4 = phi i16 [ undef, %unreachable1 ], [ %value3, %bar ] %value.lcssa.lcssa = phi i16 [ %value3.lcssa4, %exit2.pred ], [ %value.lcssa, %qqq ]
My idea was that if we iterated in reverse order we would remove the use of %value3.lcssa4 before trying to remove the PHI node defining that value.
But as seen above we wouldn't be able to remove the PHI node defining %value3.lcssa instead (as it is used by the first PHI node in the list).
So it will not help by changing the iteration order in case we desire to remove all those four PHI nodes.
Another question is if it is important that we remove all unused PHI nodes?
If it is OK to do it quick and dirty, and only remove some of the speculatively created and unused PHI nodes, then we can simplify this.I'm not sure I got this, could you elaborate on this please? What would be a quicker way?
Well, it would be simpler/quicker to just do:
for (PHINode *PN : PHIsToRemoveIfNotUsed) if (PN->use_empty()) PN->eraseFromParent();
if we do not care about leaving some unused PHI nodes around (for later passes to clean up).
But then I guess we could skip the cleanup completely.
No, you'd need a reverse postorder of the list of phistoremove, neither forward or backward order alone will help. Reverse postorder will give you a reversed topo sort.
Insertion order you gave:
%value.lcssa = phi i16 [ %value.lcssa, %www ], [ %value, %gazank ] %value.lcssa1 = phi i16 [ %value, %gazink ] %value.lcssa2 = phi i16 [ %value3.lcssa, %exit1.pred ], [ %value, %gazonk ] %value3.lcssa = phi i16 [ %value3, %foo ] %value3.lcssa4 = phi i16 [ undef, %unreachable1 ], [ %value3, %bar ] %value.lcssa.lcssa = phi i16 [ %value3.lcssa4, %exit2.pred ], [ %value.lcssa, %qqq ] %value.lcssa.lcssa5 = phi i16 [ %value.lcssa, %www ]
reverse postorder formed by following the operand tree of the insertion order you gave: would be (regular postorder numbers next to nodes)
%value.lcssa.lcssa5 = phi i16 [ %value.lcssa, %www ] 7 %value.lcssa.lcssa = phi i16 [ %value3.lcssa4, %exit2.pred ], [ %value.lcssa, %qqq ] 6 %value3.lcssa4 = phi i16 [ undef, %unreachable1 ], [ %value3, %bar ] 5 %value.lcssa2 = phi i16 [ %value3.lcssa, %exit1.pred ], [ %value, %gazonk ] 4 %value3.lcssa = phi i16 [ %value3, %foo ] 3 %value.lcssa1 = phi i16 [ %value, %gazink ] 2 %value.lcssa = phi i16 [ %value.lcssa, %www ], [ %value, %gazank ] 1
In a single iteration, you should remove %value.lcssa.lcssa, then remove value3.lcssa4, then remove value.lcssa2, then remove value3.lcssa
You can see even if you started with the above list order and redid postorder, it will still iterate in the same relatively right order.
(You have cycles here, which complicates guarantees)
%value.lcssa1 = phi i16 [ %value, %gazink ] 7 %value.lcssa2 = phi i16 [ %value3.lcssa, %exit1.pred ], [ %value, %gazonk ] 6 %value3.lcssa = phi i16 [ %value3, %foo ] 5 %value.lcssa.lcssa = phi i16 [ %value3.lcssa4, %exit2.pred ], [ %value.lcssa, %qqq ] 4 %value3.lcssa4 = phi i16 [ undef, %unreachable1 ], [ %value3, %bar ] 3 %value.lcssa.lcssa5 = phi i16 [ %value.lcssa, %www ] 2 %value.lcssa = phi i16 [ %value.lcssa, %www ], [ %value, %gazank ] 1
remove value.lcssa2, remove value3.lcssa, remove value.lcssa.lccsa, remove value3.lcssa4
A few notes:
- Because there is no single root here reaching all nodes, you can't guarantee the ordering will give a single iteration order for any sort order of the list.
- We are assuming no cycles (you could find sccs here to deal with the cycles, it's the same cost as finding RPO because they are done the same way)
- Off the top of my head, my intuition is that if the list is in dominator tree order (which is an RPO except for siblings) to start, the RPO generated by DFS of the operand tree should avoid the problem mentioned in 1, because it would initially order it the same way it would be ordered in the presence of a single root.
I looked deeper into this, and I think you exposed several issues here:
- The entire thing is triggered by the fact that the value we're rewriting is used in an unreachable block. We might just ignore such cases (see the loop on lines 102-110).
- The loops in this test are not in simplified form. While they are not guaranteed to be in it, it might indicate that we have a bug in some other pass that fails to preserve the loop in a simplified form. Whatever fix for the current issue we choose, it would be interesting to understand what led to this situation.
Returning back to the issue in question, here is my understanding of what happens in this case. LCSSA adds a phi-node for the %value. The phi-nodes it creates in the process happen to land into another loop, and thus we need to build LCSSA for those loop too (that's why we iterate several times in the worklist loop). When we add LCSSA phi-nodes for the second loop, we again insert phi-nodes into yet another loop. This continues until we convert all touched loops to LCSSA form. What happens in this process is that a PHI-node might have 0 uses at the time it was inserted first time, but as we keep inserting new phis, it might happen to be used in some of the latter phis - which is why the assertion fails. I think the fix should be to just replace the assert with the check (+add a comment about it). After this some redundant phi-nodes might remain in the program, but that could be fixed by adding operands of the phi to PHIsToRemove if they are not used anywhere except the phi, which we're removing.
Thanks,
Michael
Thanks!
And yes, my solution did not handle/assume cycles (nor did the old solution). I was assuming that the problem only were related to later added PHI nodes using the value from an earlier added PHI node when I originally uploaded this patch.
I think the program originates from some fuzzer (csmith?) and using a slightly modified pass-pipeline such as opt -sroa -O3.
Returning back to the issue in question, here is my understanding of what happens in this case. LCSSA adds a phi-node for the %value. The phi-nodes it creates in the process happen to land into another loop, and thus we need to build LCSSA for those loop too (that's why we iterate several times in the worklist loop). When we add LCSSA phi-nodes for the second loop, we again insert phi-nodes into yet another loop. This continues until we convert all touched loops to LCSSA form. What happens in this process is that a PHI-node might have 0 uses at the time it was inserted first time, but as we keep inserting new phis, it might happen to be used in some of the latter phis - which is why the assertion fails. I think the fix should be to just replace the assert with the check (+add a comment about it). After this some redundant phi-nodes might remain in the program, but that could be fixed by adding operands of the phi to PHIsToRemove if they are not used anywhere except the phi, which we're removing.
I checked what would happen if I erase unused PHI nodes directly at the end of loop that is inserting the PHI nodes.
Then test/Transforms/LCSSA/pr28608.ll started to fail again, unless I also started to ignore unreachable-from-entry users in the loop on lines 102-110.
With both of those fixes we get an unused %value.lcssa5 = phi i16 [ undef, %unreachable1 ], [ %value3, %bar ] in exit2.pred. That one was added by SSAUpdate.RewriteUse. So that PHI node is not even becoming a candidate for removal.
I'll update the patch with the other suggestion (simply replacing the assert with a check).
Simplified the solution, assuming that it is ok to leave some unused PHI nodes behind, at least if the input contains unreachable basic blocks.
Looks good to me now, thanks!
I think the program originates from some fuzzer (csmith?) and using a slightly modified pass-pipeline such as opt -sroa -O3.
Still might be interesting to see why we failed to simplify loops (more precisely, why did we fail to split critical edges).
Thanks,
Michael