This is an archive of the discontinued LLVM Phabricator instance.

Post-dom fix - connect virtual edges to last reverse-reachable BB
Needs ReviewPublic

Authored by grosser on Mar 13 2017, 6:40 AM.
This revision needs review, but all specified reviewers are disabled or inactive.

Details

Reviewers
dberlin
Summary

This change is an adapted version of Daniel's "Fix PR 24415 (at least), by
making our post-dominator tree behavior sane." patch, which ensures that we do not loose existing post-dominance information and can consequently correctly detect SESE regions that contain non-reverse reachable infinite loops.

To remain consistent we handle unreachable() basic blocks the same way as
infinite loops.

This is a proof-of-concept patch used in the discussion on how to properly
complete the post-dominator tree to include reverse unreachable nodes.

Event Timeline

grosser created this revision.Mar 13 2017, 6:40 AM
grosser edited the summary of this revision. (Show Details)Mar 13 2017, 6:44 AM

Here an example that illustrates the difference between the original patch Daniel proposed in https://reviews.llvm.org/D29705 and the slightly modified one we prose her:

Daniel's patch

Inorder PostDominator Tree:

[1]  <<exit node>> {0,19}
  [2] %"4" {1,4}
    [3] %"3" {2,3}
  [2] %"1" {5,8}
    [3] %"0" {6,7}
  [2] %"6" {9,18}
    [3] %"12" {10,11}
    [3] %"5" {12,15}
      [4] %"2" {13,14}
    [3] %"11" {16,17}

This patch

[1]  <<exit node>> {0,19}
  [2] %"4" {1,18}
    [3] %"3" {2,17}
      [4] %"1" {3,16}
        [5] %"0" {4,5}
        [5] %"6" {6,15}
          [6] %"12" {7,8}
          [6] %"5" {9,12}
            [7] %"2" {10,11}
          [6] %"11" {13,14}

The main difference is that %1 still post-dominates %2 and all the other basic blocks in the reverse-unreachable region. This behavior only requires

I see a couple problems with this:

  1. There isn't any easy way for passes which use PostDominatorTree to walk the CFG including these fake edges.
  2. For consistency, we would want other analysis and transform passes (DominatorTree/DominanceFrontier/LoopInfo/etc.) to reason about the CFG using the same fake edges as the PostDominatorTree; otherwise, you can't really rely on the results of the PostDominatorTree analysis. I'm not exactly sure what the total impact would be, but it seems complicated at first glance.
dberlin edited edge metadata.Mar 13 2017, 1:02 PM

I see a couple problems with this:

  1. There isn't any easy way for passes which use PostDominatorTree to walk the CFG including these fake edges.
  2. For consistency, we would want other analysis and transform passes (DominatorTree/DominanceFrontier/LoopInfo/etc.) to reason about the CFG using the same fake edges as the PostDominatorTree; otherwise, you can't really rely on the results of the PostDominatorTree analysis. I'm not exactly sure what the total impact would be, but it seems complicated at first glance.

Yes, besides the ongoing discussion of "maintain path invariant" vs "not maintain path invariant" for things, the connect to exit is easy because if there was a fake exit node, it would be the same and have the same predecessor edges, as the post-dom tree's virtual exit node.

For example, for IDF calculator, ignoring the path invariant it depends on for correctness (again, not gonna rehash the thread), you would need to add "j-edges" for these fake edges, such that they got walked at the same time the other predecessor edges did.

I'm not even sure where you store the fake edges, let alone modify the iterators to walk them.
It seems like at least right now, you'd have to hack up a bunch of stuff (IE store the edges with the PDT, and then audit everywhere that walked preds, ...)
That seems, as you said, complicated, even with the zip iterator/etc.

I'm actually a big fan of real edge structures and enabling fake edges vs what we do now in llvm, but that's also a huge change.

Hi Daniel, hi Eli,

thanks for your comments. I agree with Eli that the complexity of the approach in this patch is indeed concerning. Not saying this is the perfect solution. ;-)

@dberlin : Also thanks for your comments. I will reply in the thread.

I came up with another example that illustrates how the flattened tree "breaks" region-info, in some way.

Here one more example:

Inorder PostDominator Tree:

[1]  <<exit node>> {0,29}
  [2] %bb24 {1,2}
  [2] %bb3 {3,10}
    [3] %bb22 {4,7}
      [4] %bb21 {5,6}
    [3] %bb {8,9}
  [2] %bb5 {11,18}
    [3] %bb19 {12,15}
      [4] %bb18 {13,14}
    [3] %bb4 {16,17}
  [2] %bb7 {19,24}
    [3] %bb16 {20,21}
    [3] %bb6 {22,23}
  [2] %bb8 {25,26}
  [2] %infloop {27,28}

Connect in tree

Inorder PostDominator Tree:

[1]  <<exit node>> {0,29}
  [2] %bb24 {1,28}
    [3] %bb3 {2,27}
      [4] %bb22 {3,24}
        [5] %bb21 {4,23}
          [6] %bb5 {5,22}
            [7] %bb19 {6,19}
              [8] %bb18 {7,18}
                [9] %bb7 {8,17}
                  [10] %bb16 {9,14}
                    [11] %bb8 {10,13}
                      [12] %infloop {11,12}
                  [10] %bb6 {15,16}
            [7] %bb4 {20,21}
      [4] %bb {25,26}

You and i fundamentally disagree over whether one is correct or not :)
I don't think that will change.

You believe that everything that does not exit should not affect the precision of things that do. That would be great if possible.
I believe that is impossible to do without breaking the invariants of the dominator tree, when extended to the unreachable nodes to them .
You do not seem to disagree with that, but your response is to break those invariants anyway to produce a "better" result for the reachable nodes.
I'm unwilling to do that, because it defeats, to me, the whole purpose of having them in the post-dom tree.
As I said there, i am more than happy to produce you a postdomwithoutunreachablenodes that produces that result (note that isn't what you are getting *now*, you are getting something in the middle where it sometimes is like that and sometimes not).
I do not believe there are a lot of things you can use that for safely, but if it's what you want, and you are sure, i'm happy to give it to you.

I just think it is not what we should be doing by default. I believe the thing we should do by default is the thing that makes optimizations work correctly without special casing, which is why it is the path, as your investigation shows, every compiler to confront the issue has chosen in the end.