Page MenuHomePhabricator

[LoadCombine] Avoid analysing dead basic blocks

Authored by KennethH on Mar 16 2017, 7:39 AM.



Dead basic blocks may be forming a loop, for which SSA form is
fulfilled, but with a circular def-use chain. LoadCombine could
enter an infinite loop when analysing such dead code. This patch
solves the problem by simply avoiding to analyse all basic blocks
that aren't forward reachable, from function entry, in LoadCombine.


Diff Detail


Event Timeline

KennethH created this revision.Mar 16 2017, 7:39 AM
materi added a subscriber: materi.Mar 16 2017, 8:36 AM

Seems to solve:
opt /tmp/infinite.ll -jump-threading -load-combine

Ka-Ka added a subscriber: Ka-Ka.Mar 16 2017, 8:39 AM
grandinj added inline comments.
233 ↗(On Diff #91999)

presumably checking IsAlive is cheaper than calling skipBasicBlock, so maybe it should be tested first?

bjope added a subscriber: bjope.Mar 16 2017, 2:07 PM
zzheng added a subscriber: zzheng.Mar 16 2017, 4:04 PM

Can we remove such dead BBs before entering passes that would spin forever on them?

Can we remove such dead BBs before entering passes that would spin forever on them?

You don't want to :)

1283 ↗(On Diff #91999)
  1. You need to define dead.

If you mean "forward unreachable", please use that.

  1. RPOT is not the optimal order for some problems (in fact, for some, it's optimally bad), and forcing that order seems ... concerning.

I'm pretty strongly against forcing a bb order in a generic pass infrastructure.

  1. It looks like we have 3 or 4 basic block passes.

Are they getting anything from being basicblockpasses?

It looks like they could all be trivially changed to functionpasses, and those that can't handle dead code, could just be changed to do what they need to.

For example, DCE has a dominator tree around, it can tell whether things are forward unreachable without wasting time doing a reverse postorder traversal.

IMHO, that is a better solution (and helps new PM) than forcing a pass order here.

bjope added inline comments.Mar 17 2017, 2:52 AM
1283 ↗(On Diff #91999)

What do you mean by "forcing an order"?

We have not changed the order in which basic block passes are executed here. All we did was adding the IsAlive flag to tell the basic block passes if the BB it should operate on is "forward reachable" (from the function entry) or not. And then it is up to each BasicBlockPass to make a choice if it should operate on the BB or not.

The idea was that passes like the BasicBlockPassPrinter should be able to also print the dead blocks, whereas LoadCombine could avoid spending time on optimizing the code that most likely will be removed by DCE later anyway. So that is why we provide the flag instead of simply skipping to iterate over the "dead" blocks.

dberlin added inline comments.Mar 17 2017, 9:26 AM
1283 ↗(On Diff #91999)

Yes, you are right, i misread that part.
Note that RPOT is expensive and unnecessary here.
You could just use depth_first_search. Just use the version with an external visited set, and it will fill in the set for you.

I'm still strongly of the opinion to simply kill basicblockpass rather than add another O(basic blocks) walk that not everything needs.

Every other pass also properly handles dead or unreachable code themselves, without help from the pass infrastructure.

KennethH updated this revision to Diff 92485.Mar 21 2017, 7:06 AM

I have updated according to most of the comments.
I think that the re-factoring of all basic block passes to function passed should be done as a separate task.

I have updated according to most of the comments.

Note: You are still calling it "IsAlive", which again, does not really tell anyone what it is supposed to mean :)
However, ...

I think that the re-factoring of all basic block passes to function passed should be done as a separate task.

That's fine, if you want to fix it by making BB passes more expensive and doing another walk of all BB's, you are welcome to.
This, to me, feels like a hack-around to try to paper over how these passes operate.
Someone else can approve that if they are comfortable with it :)

bjope updated this revision to Diff 92617.Mar 22 2017, 3:37 AM

Solution was changed into using the DominatorTree analysis to
determine if basic blocks are "dead". This makes the patch
local to LoadCombine.cpp.

Mr Berlin had a valid point. This looks like a much cleaner solution.

bjope edited the summary of this revision. (Show Details)Mar 22 2017, 10:20 AM
bjope added a comment.Mar 23 2017, 8:36 AM

KennethH gave me permission to upload the improved solution (that uses DominatorTree analysis). This feels like a more "kosher" solution.

The currently suggested solution (for avoiding to optimize on dead code in LoadCombine) is similar to the solution used in BBVectorize. So there is really nothing new invented here.

I'm just afraid that I might have missed something fundamental about "adding a new analysis" (my experience in playing around with analysis passes is very limited).
So any feedback is welcome!

bjope edited the summary of this revision. (Show Details)Mar 28 2017, 12:18 AM
grosser edited edge metadata.Mar 28 2017, 8:16 AM

This looks pretty straightforward to me. I think if there are no other concerns, I think this should go in.

davide requested changes to this revision.Mar 30 2017, 1:24 PM
davide added inline comments.
71 ↗(On Diff #92617)

I'm not an expert on LoadCombine but I think this preserve is correct because LoadCombine doesn't do any cross-BB combining or modifies the CFG structure. That said, if what I say is correct, I expect setPreserveCFG to preserve the DominatorTree already? Can you please double check?

1 ↗(On Diff #92617)

Can you add a CHECK line, to make sure we leave the block alone?

This revision now requires changes to proceed.Mar 30 2017, 1:24 PM
bjope updated this revision to Diff 93648.Mar 31 2017, 8:44 AM
bjope edited edge metadata.

Addressed comments from davide:

  1. Removed the not needed explicit addPreserved for DominatorTree (it is included in setPreservesCFG).
  2. Generated CHECKs in the test case.

I also added a second basic block to the test case. This was useful for some manual test to see that the Dominator Tree only is created once per function, even if the LoadCombine pass is executed multiple times (for each BB). I.e. checking that the DomTree actually is preserved.

Looks good.

davide accepted this revision.Apr 8 2017, 1:30 AM

LGTM now.

This revision is now accepted and ready to land.Apr 8 2017, 1:30 AM
This revision was automatically updated to reflect the committed changes.