Page MenuHomePhabricator

[BPI] Improve unreachable/ColdCall heurstics to handle loops.

Authored by twoh on Nov 11 2019, 4:49 PM.



While updatePostDominatedByUnreachable attemps to find basic blocks that are post-domianted by unreachable blocks, it currently cannot handle loops precisely, because it doesn't use the actual post dominator tree analysis but relies on heuristics of visiting basic blocks in post-order. More precisely, when the entire loop is post-dominated by the unreachable block, current algorithm fails to detect the entire loop as post-dominated by the unreachable because when the algorithm reaches to the loop latch it fails to tell all its successors (including the loop header) will "eventually" be post-domianted by the unreachable block, because the algorithm hasn't visited the loop header yet. This makes BPI for the loop latch to assume that loop backedges are taken with 100% of probability. And because of this, block frequency info sometimes marks virtually dead loops (which are post dominated by unreachable blocks) super hot, because 100% backedge-taken probability makes the loop iteration count the max value. updatePostDominatedByColdCall has the exact same problem as well.

To address this problem, this patch makes PostDominatedByUnreachable/PostDominatedByColdCall to be computed with the actual post-dominator tree.

Diff Detail

Event Timeline

twoh created this revision.Nov 11 2019, 4:49 PM
Herald added a project: Restricted Project. · View Herald TranscriptNov 11 2019, 4:49 PM
twoh updated this revision to Diff 228783.Nov 11 2019, 4:51 PM

Add "CHECK" for the test.

Harbormaster completed remote builds in B40780: Diff 228783.
twoh added a comment.Nov 19 2019, 2:23 PM

Friendly ping. @chandlerc @skatkov Can you please suggest other reviewers if you're too busy to review? Thanks!

The implementation itself looks good to me.
I worry about tests whether fixes of the tests are expected.
If you have ready answers please let me know otherwise I'll try to dig into them deeply.


Theoretically predecessor might be already marked in TargetSet so you could potentially filter the adding list of predecessors.


Why we lost this alignment?


I have a trouble to detect whether the expected behavior of the test is still performed.

Did you ensured in it or just re-generated the expected output?

twoh updated this revision to Diff 230293.Nov 20 2019, 10:44 AM

Addressing comments from @skatkov. Thank you for your review!

For X86 asm tests, I just re-ran script. I think it should generate "correct" test based on the existing FileCheck patterns?

skatkov accepted this revision.Nov 20 2019, 7:41 PM

Addressing comments from @skatkov. Thank you for your review!

For X86 asm tests, I just re-ran script. I think it should generate "correct" test based on the existing FileCheck patterns?

It will generate checks corresponding to the current implementation, what I worry about is that it still checks that "spill is hoisted to a cold BB inside the hotter outer loop."
To ensure it that we need to understand what it did before and does now what is untrivial :)
I see many updates of this test and not sure that all of them checked that the original target of the test still make sense :)

Let add an author introduced it in 2014. If he does not react to this patch in 1-2 days. Feel free to land.

This revision is now accepted and ready to land.Nov 20 2019, 7:41 PM
skatkov added a subscriber: manmanren.

@manmanren, could you please ensure that your test still checks what it expected to do?

Closed by commit rGb19ec1eb3d0c: [BPI] Improve unreachable/ColdCall heurstics to handle loops. (authored by twoh, committed by taewookoh <>). · Explain WhyNov 27 2019, 10:36 AM
This revision was automatically updated to reflect the committed changes.

This broke tests, see . If it takes a while, please revert while you look.

twoh added a comment.Nov 27 2019, 11:13 AM

Sorry my bad. Seems that some powerpc tests are failing. Will revert soon and take a look.