This is an archive of the discontinued LLVM Phabricator instance.

Change the probability to check if a hot successor has a more important predecessor in block-placement.
Needs ReviewPublic

Authored by congh on Sep 10 2015, 3:25 PM.

Details

Reviewers
djasper
davidxl
Summary

Consider the following diamond CFG:

 A
/ \
B C
 \/
 D

Suppose A->B and A->C have probabilities 81% and 19%. In block-placement, A->B is called a hot edge and the final placement should be ABDC. However, the current implementation outputs ABCD. This is because when choosing the next block of B, it checks if Freq(C->D) > Freq(B->D) * 20%, which is true (if Freq(A) = 100, then Freq(B->D) = 81, Freq(C->D) = 19, and 19 > 81*20%=16.2). Actually, we should use 25% instead of 20% as the probability here, so that we have 19 < 81*25%=20.25, and the desired ABDC layout will be generated.

Diff Detail

Event Timeline

congh updated this revision to Diff 34496.Sep 10 2015, 3:25 PM
congh retitled this revision from to Change the probability to check if a hot successor has a more important predecessor in block-placement..
congh updated this object.
congh added reviewers: djasper, davidxl.
congh added a subscriber: llvm-commits.
davidxl added inline comments.Sep 11 2015, 12:35 PM
lib/CodeGen/MachineBlockPlacement.cpp
354

This value should be tuned when PGO is on to follow a clear cost model. I would like to see a separate patch for that.

423

For global conflict detection, do we know why we need to apply the multiplier on the successor edge frequency and then compare with the conflicting incoming edge in the first place?

congh added inline comments.Sep 17 2015, 4:50 PM
lib/CodeGen/MachineBlockPlacement.cpp
354

The patch is ready but I am waiting for this patch being approved firstly.

423

At this point the decision whether to break CFG constraint is made. Considering the diamond branch I gave in the description and assume the probabilities are 60% and 40% respectively (for B and C). After B is chosen, we now decide whether to layout the other branch C. If we don't apply the multiplier here, we can see 40% < 60% and we may choose the D instead of C, which is not right (as we are using 80% as the threshold).

djasper added inline comments.Oct 7 2015, 7:48 AM
lib/CodeGen/MachineBlockPlacement.cpp
423

I think we should not hard code dependent values here. The 1/4 is strictly dependent on HotProb. So either pull out the denominator of HotProb into a variable or calculate this directly. I think you might need to replace

HotProb.getCompl()

with

HotProb.getCompl() / HotProb

But I am still jetlagged, so my math might be off.

congh added inline comments.Oct 12 2015, 1:27 PM
lib/CodeGen/MachineBlockPlacement.cpp
423

You are right. Before we use fixed point represented branch probability, it is a little tricky to do this as we don't have / operator defined for branch probability. Now for fixed point representation, we can just build the cold probability using numerators.

congh updated this revision to Diff 37162.Oct 12 2015, 1:33 PM

Update the patch according to Daniel's comment.

davidxl edited edge metadata.Oct 28 2015, 6:34 PM
davidxl added a subscriber: davidxl.

I will have another round of review in a couple of days ..

David

While the fix seems reasonable, in reality without profile data (or with statically predicted branch probabilities), the effect of this change will be hard to predict.

On the other hand, I think we should do a better job in computing the branch costs and locality cost when profile data is available and does not rely on this magic 80/20 ratio.

davidxl edited edge metadata.Jun 3 2016, 12:00 PM
davidxl added a subscriber: danielcdh.