This is an archive of the discontinued LLVM Phabricator instance.

[JumpThreading] Fix branch probabilities in DuplicateCondBranchOnPHIIntoPred()
ClosedPublic

Authored by yrouban on Nov 5 2020, 5:08 AM.

Details

Summary

When instructions are cloned from block BB to PredBB in the method DuplicateCondBranchOnPHIIntoPred() number of successors of PredBB changes from 1 to number of successors of BB. So we have to copy branch probabilities from BB to PredBB. I found this place when the single successor of PredBB got 1.0 probability which was left unchanged after PredBB got the instructions from its successor along with a terminator with several successors. So the first edge from ProbBB had probability 1.0 and the second resulted in 0.5 as it was unset explicitly. I cannot create an LL test since D90284. Here is a short test that failed before:

; CMD: opt -passes=jump-threading -debug-only=branch-prob -debug-only=jump-threading -S test.ll
define void @foo(i1 %f0, i1 %f1, i1 %f2) !prof !{!"function_entry_count", i64 2} {
bb1:
  br i1 %f0, label %bb3, label %bb2

bb2:
  br label %bb3

bb3:
  %ph = phi i1 [ %f1, %bb1 ], [ %f2, %bb2 ]
  br i1 %ph, label %exit1, label %unreach

exit1:
  ret void

unreach:
  unreachable
}

Here is the log with additional printing of probabilities at the end of DuplicateCondBranchOnPHIIntoPred() (compare the probs of bb3 and bb2, they must be the same):

---- Branch Probability Info : foo ----

Computing probabilities for exit1
Computing probabilities for unreach
Computing probabilities for bb3
set edge bb3 -> 0 successor probability to 0x7fffffff / 0x80000000 = 100.00%
set edge bb3 -> 1 successor probability to 0x00000001 / 0x80000000 = 0.00%
Computing probabilities for bb2
Computing probabilities for bb1
Jump threading on function 'foo'
  Duplicating block 'bb3' into end of 'bb2' to eliminate branch on phi.  Cost: 0 block is:
bb3:                                              ; preds = %bb2, %bb1
  %ph = phi i1 [ %f1, %bb1 ], [ %f2, %bb2 ]
  br i1 %ph, label %exit1, label %unreach

Probability of bb3 -> (0) = 0x7fffffff / 0x80000000 = 100.00%
Probability of bb3 -> (1) = 0x00000001 / 0x80000000 = 0.00%
Probability of bb2 -> (0) = 0x40000000 / 0x80000000 = 50.00%
Probability of bb2 -> (1) = 0x40000000 / 0x80000000 = 50.00%

------------- Result
define void @foo(i1 %f0, i1 %f1, i1 %f2) !prof !0 {
bb1:
  br i1 %f0, label %bb3, label %bb2

bb2:                                              ; preds = %bb1
  br i1 %f2, label %exit1, label %unreach

bb3:                                              ; preds = %bb1
  %ph = phi i1 [ %f1, %bb1 ]
  br i1 %ph, label %exit1, label %unreach

exit1:                                            ; preds = %bb2, %bb3
  ret void

unreach:                                          ; preds = %bb2, %bb3
  unreachable
}

!0 = !{!"function_entry_count", i64 2}
--------------------------------------------------

Copying can be done with the newly introduced method in D90839.
The change in the method ThreadThroughTwoBasicBlocks() as a small related NFC.

Diff Detail

Event Timeline

yrouban created this revision.Nov 5 2020, 5:08 AM
Herald added a project: Restricted Project. · View Herald TranscriptNov 5 2020, 5:08 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
yrouban requested review of this revision.Nov 5 2020, 5:08 AM
yrouban edited the summary of this revision. (Show Details)
kazu added a comment.Nov 5 2020, 1:32 PM

Your patch looks reasonable. Is there any way you could come up with a testcase to make sure that the pass indeed copies the probabilities in your testcase? (Note that I'm not asking for a testcase that exposed this bug before D90284.)

Your patch looks reasonable. Is there any way you could come up with a testcase to make sure that the pass indeed copies the probabilities in your testcase? (Note that I'm not asking for a testcase that exposed this bug before D90284.)

I could not create a simple test. One more option to create one is to change the usage of BPI/BFI to request analysis from AM. Then the test can print BPI to show the difference. I'm not sure this big change is worth this small patch. It is not clear why JumpThreading uses its own local version of BPI/BFI.

yrouban updated this revision to Diff 305655.Nov 16 2020, 10:22 PM

added a test

kazu accepted this revision.Nov 16 2020, 10:25 PM

Thank you for the testcase!

This revision is now accepted and ready to land.Nov 16 2020, 10:25 PM