This is an archive of the discontinued LLVM Phabricator instance.

Mark InstCombine as not preserving CFG
AbandonedPublic

Authored by yrouban on Jun 3 2020, 6:02 AM.

Details

Summary

A fix for the bug 46098 "PostDominatorTree is different than a freshly computed one!" with opt -branch-prob -instcombine -block-freq -verify-dom-info

InstCombine invalidates PostDomTree.
See the example in the new test llvm/test/Transforms/InstCombine/infinite-loop-postdom.ll

In general InstCombine swapping successors violates the 2nd rule of PreserveCFG: do not change block terminators in any way. Doing so InstCombine also invalidates DFS numbering, which is a part of DomTree.

This patch reflects these by reporting not preserving CFG.

Diff Detail

Event Timeline

yrouban created this revision.Jun 3 2020, 6:02 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 3 2020, 6:02 AM

Which fold is at fault here?
Can it be simply taught to update domtree?

nikic added a comment.Jun 3 2020, 6:45 AM

This seems pretty fishy to me. InstCombine is CFG-preserving by design, so it really shouldn't be invalidating any domtrees.

davide added a subscriber: davide.Jun 3 2020, 11:17 AM

I really think this is wrong. If InstCombine doesn't preserve PostDOM, it should be fixed to fix it.

kuhar added a comment.EditedJun 3 2020, 3:06 PM

As a quick fix, maybe we could try to detect changes in reverse-unreachable code and update postdominators when that's detected? Not sure how much overhead that would be; InstCombine is already very expensive w.r.t. compilation times.

A couple of questions I'm asking myself:

  1. How is CFG preservation actually defined? Can we have a verification pass that checks if passes lie about preserving CFG? A similar issue was a source of many bugs in machine IR that we fixed last Fall.
  2. Making root finding insensitive to the order of successors would eliminate some needles recalculations that happen in other places too, provided the alternative ordering is cheap to compute. How can we canonicalize the order of successors? Sorting by BB names doesn't seem like the best idea to me.

I think that whatever the fix ends up being, we should make sure the compilation times are not noticeably regressed before committing changes.

I don't think the solution is to not preserve PDom either.
I think your suggestion from the description is spot-on though: "A better fix would be to improve PostDomTree to be insensitive to the order of successors in post-dom unreachable subgraphs." This is indeed an issue with the order of basic blocks when dealing with infinite loops.

bjope added a subscriber: bjope.Jun 8 2020, 5:56 AM

Which fold is at fault here?
Can it be simply taught to update domtree?

There is no way to update for successors switch. It would be a noop for the updater.

lebedev.ri requested changes to this revision.Jun 10 2020, 6:21 AM

Which fold is at fault here?
Can it be simply taught to update domtree?

There is no way to update for successors switch. It would be a noop for the updater.

Then i'd agree with @asbirlea

I don't think the solution is to not preserve PDom either.
I think your suggestion from the description is spot-on though: "A better fix would be to improve PostDomTree to be insensitive to the order of successors in post-dom unreachable subgraphs." This is indeed an issue with the order of basic blocks when dealing with infinite loops.

This revision now requires changes to proceed.Jun 10 2020, 6:21 AM

Even if we agree the way the PostDominatorTree computes roots in cases involving infinite loops shouldn't depend on the ordering of successors, I'm not sure it's correct to say that instcombine "preserves the CFG". Even for the regular DominatorTree, I think reordering the successors invalidates DFS numbering (in the sense that it changes the result of getDFSNumIn()).

In terms of how the PostDominatorTree should decide what a "root" is, I'm not sure there's any good way to make that decision. Given an SCC which doesn't have any natural parent, you can pick any node in that SCC as the "fake root", and there isn't any natural reason to favor one node over another.

Really, the natural choice is to just follow the standard dominator algorithm and say the nodes are unreachable. Then for passes which want it, we can add a transform to mutate the CFG to introduce fake roots.

Even if we agree the way the PostDominatorTree computes roots in cases involving infinite loops shouldn't depend on the ordering of successors, I'm not sure it's correct to say that instcombine "preserves the CFG". Even for the regular DominatorTree, I think reordering the successors invalidates DFS numbering (in the sense that it changes the result of getDFSNumIn()).

In terms of how the PostDominatorTree should decide what a "root" is, I'm not sure there's any good way to make that decision. Given an SCC which doesn't have any natural parent, you can pick any node in that SCC as the "fake root", and there isn't any natural reason to favor one node over another.

Really, the natural choice is to just follow the standard dominator algorithm and say the nodes are unreachable. Then for passes which want it, we can add a transform to mutate the CFG to introduce fake roots.

Good point! In other words, if DFSNum was a separate CFG base analysis (and it is as a part of DomTree), it must be invalidated.
@lebedev.ri This shows that making the PostDomTree better will not help.
I will stick to invalidating all CFG analyses in InstCombine only if successors are reordered. I believe we should do this to fix the bug. A better solution can be implemented later on.

yrouban updated this revision to Diff 270048.Jun 11 2020, 12:53 AM
yrouban retitled this revision from Mark InstCombine as not preserving PostDomTree to Mark InstCombine as not preserving CFG.
yrouban edited the summary of this revision. (Show Details)
  • Made legacy InstCombine pass to unconditionally report changing CFG
  • New Pass Manager's InstCombine is changed to report changing CFG only if branches were swapped at least once
  • Updated tests accordingly
nikic requested changes to this revision.EditedJun 11 2020, 1:24 AM

If we define that changing successor order constitutes a CFG change, then these optimizations need to be dropped from InstCombine, and moved into SimplifyCFG.

As said before, InstCombine is supposed to be CFG-preserving -- if it isn't because the definition of what "CFG-preserving" means was unclear, then we need to fix InstCombine to be in line with we new definition.

As the pipeline diffs show, not preserving CFG has a very real cost (five new domtree calculations at least). We should try to avoid that :)

This revision now requires changes to proceed.Jun 11 2020, 1:24 AM

Even if we agree the way the PostDominatorTree computes roots in cases involving infinite loops shouldn't depend on the ordering of successors, I'm not sure it's correct to say that instcombine "preserves the CFG". Even for the regular DominatorTree, I think reordering the successors invalidates DFS numbering (in the sense that it changes the result of getDFSNumIn()).

You can always preserve domtrees by marking dfsnumbers as invalid.

In terms of how the PostDominatorTree should decide what a "root" is, I'm not sure there's any good way to make that decision. Given an SCC which doesn't have any natural parent, you can pick any node in that SCC as the "fake root", and there isn't any natural reason to favor one node over another.

Really, the natural choice is to just follow the standard dominator algorithm and say the nodes are unreachable. Then for passes which want it, we can add a transform to mutate the CFG to introduce fake roots.

Daniel B., Tobias G., and myself had a very long discussion about discovering reverse-unreachable nodes or not in postdominators a few years ago. Infinite loops are fairly rare, but can cause very long path that lead to them to be reverse-unreachable as well. This is rather surprising and introduces corner cases in analyzes/optimizations, which can be avoided by shifting this complexity to postdominator construction, while not breaking algorithm that assume the textbook definition of postdominators. As a consequence, you have to chose some arbitrary way of selecting a fake-entry node from nodes in an scc. Right now this order is whatever children or inverse_children return, but could be based on the order of blocks in a function instead.

The issue I see with introducing fake exits is that postdominators is an analysis used by other analyses, and it would be probably complicated to schedule a transformation in front of any analysis that uses postdominators. Even if an analysis doesn't need postdominance info for reverse-unreachable code, it would see a postdominators analysis results for reverse-reachable nodes if a pass running before it introduced them. Note that this would affect all IR-like things that use postdominators, e.g., llvm ir, clang cfg, machine ir, callgraphs, and third-party ir, including irs outside of the monorepo.

If we define that changing successor order constitutes a CFG change, then these optimizations need to be dropped from InstCombine, and moved into SimplifyCFG.

As said before, InstCombine is supposed to be CFG-preserving -- if it isn't because the definition of what "CFG-preserving" means was unclear, then we need to fix InstCombine to be in line with we new definition.

As the pipeline diffs show, not preserving CFG has a very real cost (five new domtree calculations at least). We should try to avoid that :)

+1

As the pipeline diffs show, not preserving CFG has a very real cost (five new domtree calculations at least). We should try to avoid that :)

You can always preserve domtrees by marking dfsnumbers as invalid.

I wasn't trying to imply that instcombine couldn't keep the domtree up-to-date; we can easily do that even if instcombine says it doesn't preserve the CFG.

The issue I see with introducing fake exits is that postdominators is an analysis used by other analyses, and it would be probably complicated to schedule a transformation in front of any analysis that uses postdominators. Even if an analysis doesn't need postdominance info for reverse-unreachable code, it would see a postdominators analysis results for reverse-reachable nodes if a pass running before it introduced them. Note that this would affect all IR-like things that use postdominators, e.g., llvm ir, clang cfg, machine ir, callgraphs, and third-party ir, including irs outside of the monorepo.

Yes, this is messy. Maybe it's just too terrible. I think we could come up with some sort of transition plan if necessary; for things that don't need to update the CFG, the downsides of the heuristics are probably less terrible.

In terms of keeping the existing post-dominators working, I guess the "mark the dfsnumbers invalid" API and domtree updates can also recompute the postdominator roots. Not sure how we make that fast, though.

lebedev.ri requested changes to this revision.Aug 5 2020, 4:04 AM

Glad to see that D84763 is making it's way!