This is an archive of the discontinued LLVM Phabricator instance.

Set machine block placement hot prob threshold for both static and runtime profile.
ClosedPublic

Authored by danielcdh on Jun 3 2016, 3:21 PM.

Details

Summary

With runtime profile, we have more confidence in branch probability, thus during basic block layout, we set a lower hot prob threshold so that blocks can be layouted optimally.

Diff Detail

Repository
rL LLVM

Event Timeline

danielcdh updated this revision to Diff 59634.Jun 3 2016, 3:21 PM
danielcdh retitled this revision from to Set machine block placement hot prob threshold for both static and runtime profile..
danielcdh updated this object.
danielcdh added reviewers: davidxl, djasper.
danielcdh added a subscriber: llvm-commits.
davidxl edited edge metadata.Jun 3 2016, 3:52 PM

The PGO related change is only half correct. In fact the trace successor probability threshold should be computed based on existing parameters misfetch cost and jump instruction cost.

In a diamond shaped CFG, the resulting succProb threshold is indeed 50% as the CFG is symmetric. For Triagular shaped CFG, the result will be very different because breaking topological order introduces more overhead compared with diamond shape case.

Given

   A
 /  |
B  |
  \ |
   C

If Prob(A,B) is greater than Prob(A,C), A B C should always be the right order. However, if if Prob(A,B) is lower than P(A,C), we need to compare the cost of A B C vs A C B

For A B C, the cost is

Prob(A,C) * Misfetch_cost due to the taken branch from A to C

For A C B, the cost is

Prob(A, B) *Misfetch_cost // taken branch from A to B

+ Prob(A,B) * misfetch_cost      // taken branch (unconditional) from B back to C
+ Prob(A,B) * jump_inst_cost    //  one more unconditional jump instruction is introduced

If jump instruction cost is assumed to be low (0), then Prob(A,C) needs to be >66% in order for A C B to be a better layout. If jump instr cost is the same as misfetch, we get 75% threshold.

We need to introduce a helper function to compute the threshold on the fly based on the CFG and parameter.

lib/CodeGen/MachineBlockPlacement.cpp
120 ↗(On Diff #59634)

hot-->likely

125 ↗(On Diff #59634)

hot --> likely

danielcdh updated this revision to Diff 60231.Jun 9 2016, 1:11 PM
danielcdh edited edge metadata.

Handle the triangle-shape CFG.

This function has gone bloated recently and needs some cleanups. I will first do a couple of NFC cleans so that it is in a form that is more friendly to inject cost based analysis first, you can then adapt your patch based on that.

lib/CodeGen/MachineBlockPlacement.cpp
472 ↗(On Diff #60231)

It should look at the predecessors of the block C/'Succ' and find a Pred/B that is a candidate to be scheduled -- simliar to the confilct detection down below.

479 ↗(On Diff #60231)

This condition does not look correct.

481 ↗(On Diff #60231)

Should it adjust threshold BB instead?

I've done refactoring/cleanup. I think the PGO logic can now be put into the new method: getLayoutSuccessorProbThreshold() -- the method needs to pass in BB and Succ. The implementation should be PGO specific.

Also please add a test case.

I'm still putting the triangle shape handling code outside as the logic is not related to PGO. Even with static profile, the triangle-shaped CFG should have a more relaxed condition to be topologically placed.

Added unittest to cover all cases.

davidxl added inline comments.Jun 14 2016, 1:10 PM
lib/CodeGen/MachineBlockPlacement.cpp
596 ↗(On Diff #60727)

There is no need to pass this parameter -- MachineFunction is now a member field in the class.

621 ↗(On Diff #60727)

It is probably not a good idea to bump the frequency of the alternate edge. Instead we should adjust the HotProb threshold computed by:

HotProb = getLayoutSuccessorProbThreshold(BB, Succ);

For instance

BB

\
Pred
/

SUcc

When Edge BB->Succ is examined, with PGO, the thresold should be adjusted to ~66% (compared to 50% default). In order words, in this shape, in order for BB->Succ to be considered, it needs at least 66% likely.

We should not change the non-PGO default (80%) now (can be done later -- but since it requires 'guessing' probbaility, a lot more perf testing is needed in order to change that)

danielcdh updated this revision to Diff 60761.Jun 14 2016, 2:46 PM

handle triangle shape CFG only when profile is available.

davidxl accepted this revision.Jun 14 2016, 3:04 PM
davidxl edited edge metadata.

Please also add a case to cover topo-order with triaglar shape (with profile) as well as a diamond case where BP is ~60%.

lgtm.

This revision is now accepted and ready to land.Jun 14 2016, 3:04 PM
danielcdh closed this revision.Jun 14 2016, 3:33 PM
This revision was automatically updated to reflect the committed changes.