This is an archive of the discontinued LLVM Phabricator instance.

StructurizeCFG: Set Undef for non-predecessors in setPhiValues()
ClosedPublic

Authored by ruiling on Aug 23 2022, 1:39 AM.

Details

Summary

During structurization process, we may place non-predecessor blocks
between the predecessors of a block in the structurized CFG. Take
the typical while-break case as an example:

/---A (v = ...)
|  / \
^ B   C
|  \ /|
\---L |
    \ /
     E (r = phi (v:C)...)

After structurization, the CFG would be look like:

/---A
|   |\
|   | C
|   |/
|   F1
^   |\
|   | B
|   |/
|   F2
|   |\
|   | L
\   |/
 \--F3
    |
    E

We can see that block B is placed between the predecessors(C/L) of E.
During phi reconstruction, to achieve the same sematics as before, we
are reconstructing the PHIs as:

F1: v1 = phi (v:C), (undef:A)
F3: r = phi (v1:F2), ...

But this is also saying that v1 would be live through B, which is not
quite necessary. The idea in the change is to say the incoming value
from B is Undef for the PHI in E. With this change, the reconstructed
PHI would be:

F1: v1 = phi (v:C), (undef:A)
F2: v2 = phi (v1:F1), (undef:B)
F3: r  = phi (v2:F2), ...

Diff Detail

Event Timeline

ruiling created this revision.Aug 23 2022, 1:39 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 23 2022, 1:39 AM
ruiling requested review of this revision.Aug 23 2022, 1:39 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 23 2022, 1:39 AM
foad added a comment.Aug 23 2022, 2:27 AM

I haven't read the patch carefully, but the description makes a lot of sense to me.

sameerds added inline comments.Aug 24 2022, 12:56 AM
llvm/lib/Transforms/Scalar/StructurizeCFG.cpp
754–756

This is the core idea. But it's not obvious how the code implements this. Would be good to have an explanation of the Stack and the Incomings set. For example, why do we need to erase Current from Incomings? Somewhere in the traversal, the Stack seems to ensure that undefs are added to phis at intermediate Flow blocks ... needs explanation of why it always works.

sameerds added inline comments.Aug 24 2022, 2:29 AM
llvm/lib/Transforms/Scalar/StructurizeCFG.cpp
755

On second thoughts, "this block" contributes an undefined value. That's the whole point of this change, right?

ruiling added inline comments.Aug 24 2022, 2:36 AM
llvm/lib/Transforms/Scalar/StructurizeCFG.cpp
754–756

Sorry I don't quite get your point. The reason we need to erase Current from I

755

Yes, the core idea is the block existing before structurization but not the predecessor block does not contribute any value even after structurization. I am thinking about whether I can refine the code to be less confusing.

ruiling added inline comments.Aug 24 2022, 2:40 AM
llvm/lib/Transforms/Scalar/StructurizeCFG.cpp
754–756

The reason I was erasing Current from Incomings is to try to terminate the traversal loop after finishing all the incoming predecessors. But now I think this maybe not quite necessary. The Stack is just used as traversal stack, no other meaning.

sameerds added inline comments.Aug 24 2022, 4:53 AM
llvm/lib/Transforms/Scalar/StructurizeCFG.cpp
754–756

What I mean is that the code needs comments explaining how it achieves the goal. There is a stack, so some form of DFS is happening. But it will be good to have an explanation (in the source file itself, and not this review) of why the implementation is correct. This is for future reference. It is not enough to expect that anyone reading the code will "eventually" figure out how it is all working.

ruiling updated this revision to Diff 455461.Aug 24 2022, 7:26 PM

refine the code and add some comment.

ruiling edited the summary of this revision. (Show Details)Aug 24 2022, 7:26 PM
ruiling marked an inline comment as done.Aug 24 2022, 7:29 PM
fhahn added a subscriber: fhahn.Aug 25 2022, 12:52 AM

Should this use poison instead of undef?

Should this use poison instead of undef?

If I understand Poison correctly, Poison should work the same as Undef when they are used as incoming value to PHI, right? If so, changing to poison should work, but that should be in a separate change.

any further comment?

sameerds added inline comments.Aug 29 2022, 1:46 AM
llvm/test/Transforms/StructurizeCFG/workarounds/needs-fr-ule.ll
59–61

When the value reaching along a predecessor is a constant literal, do we still need to create the extra PHI in the earlier Flow blocks? In this case, it seems the true/false values are pushed to the predecessor block Flow10. Can this be bad for performance in some case? We can skip it by checking for constants when we add predecessors to the stack.

ruiling updated this revision to Diff 456543.Aug 29 2022, 9:59 PM
ruiling edited the summary of this revision. (Show Details)

Update to help better phi insertion

ruiling marked an inline comment as done.Aug 29 2022, 10:08 PM
ruiling added inline comments.
llvm/test/Transforms/StructurizeCFG/workarounds/needs-fr-ule.ll
59–61

I am not sure whether this would regress quality of generated code, but I happened to get an idea to fix the issue. Please check whether this sounds good to you. The backward traversal is done once per basic block. So I have postponed the constant check.

ruiling marked an inline comment as done.Aug 29 2022, 10:14 PM
sameerds accepted this revision.Aug 29 2022, 11:15 PM
This revision is now accepted and ready to land.Aug 29 2022, 11:15 PM