Page MenuHomePhabricator

Use fixed-point representation for BranchProbability
ClosedPublic

Authored by congh on Sep 3 2015, 11:08 AM.

Details

Summary

BranchProbability now is represented by its numerator and denominator in uint32_t type. This patch changes this representation into a fixed point that is represented by the numerator in uint32_t type and a constant denominator UINT32_MAX. This is quite similar to the representation of BlockMass in BlockFrequencyInfoImpl.h. There are several pros and cons of this change:

Pros:

  1. It uses only a half space of the current one.
  2. Some operations are much faster like plus, subtraction, comparison, and scaling by an integer.

Cons:

  1. Constructing a probability using arbitrary numerator and denominator needs additional calculations.
  2. It is a little less precise than before as we use a fixed denominator. For example, 1 - 1/2 may not be exactly identical to 1 / 2 (this will lead to many BranchProbability unit test failures). This should not matter when we only use it for branch probability. If we use it like a rational value for some precise calculations we may need another construct like ValueRatio.

One important reason for this change is that we propose to store branch probabilities instead of edge weights in MachineBasicBlock (see http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20150803/291610.html for the proposal and discussions). We also want clients to use probability instead of weight when adding successors to a MBB. The current BranchProbability has more space which may be a concern.

Most test cases are updated with this change, except those of BranchProbability and BlockFrequency unit tests, which fail due to imprecise representation. If this patch is approved I will update them later.

Diff Detail

Repository
rL LLVM

Event Timeline

congh updated this revision to Diff 33893.Sep 3 2015, 11:08 AM
congh retitled this revision from to Use fixed-point representation for BranchProbability.
congh updated this object.
congh added reviewers: dexonsmith, atrick, davidxl.
congh added a subscriber: llvm-commits.
chapuni added a subscriber: chapuni.Sep 4 2015, 5:14 AM

Could we introduce fixed-point arithmetic class for general use?
(It'd be not urgent)

davidxl edited edge metadata.Sep 5 2015, 10:41 PM

For block frequency print, I think it might make sense to set a fixed precision in the print. For instance, when the value is less than 1, do the following to filter noise in testing:

OS << format("%.2f", float_freq)

otherwise

OS << format("%.1f", flat_freq)

include/llvm/Support/BranchProbability.h
49 ↗(On Diff #33893)

Do we really need this method? If update is done properly, this is not needed. Similar method can be defined for verification purpose.

51 ↗(On Diff #33893)

Should be a private method.

52 ↗(On Diff #33893)

Make this private.

lib/Support/BranchProbability.cpp
35 ↗(On Diff #33893)

To make result with the right rounding, do this

Prob64 = n * (D + d/2) / d

test/Analysis/BlockFrequencyInfo/basic.ll
71 ↗(On Diff #33893)

where does the coma come from?

congh added a comment.Sep 8 2015, 10:50 AM

For block frequency print, I think it might make sense to set a fixed precision in the print. For instance, when the value is less than 1, do the following to filter noise in testing:

OS << format("%.2f", float_freq)

otherwise

OS << format("%.1f", flat_freq)

Do we really need this precision to represent subtle differences?

include/llvm/Support/BranchProbability.h
49 ↗(On Diff #33893)

This is not needed in this patch but will be needed when substituting weights with probabilities. This is a utility function that is used to normalize several probabilities assigned to all successors. I can remove it from this patch and add it later.

51 ↗(On Diff #33893)

It will be used in branch probability unit test.

52 ↗(On Diff #33893)

Ditto.

lib/Support/BranchProbability.cpp
35 ↗(On Diff #33893)

OK.

test/Analysis/BlockFrequencyInfo/basic.ll
71 ↗(On Diff #33893)

With this patch, the result is not 0.8 but 0.8000000001. Removing the comma is a quick way to capture 0.8 but not those zeros behind it. Maybe I should use regex here?

congh updated this revision to Diff 34296.Sep 8 2015, 9:31 PM
congh edited edge metadata.

Update the patch by addressing issues pointed by Duncan and David.

davidxl added inline comments.Sep 8 2015, 10:56 PM
test/CodeGen/AArch64/aarch64-deferred-spilling.ll
26 ↗(On Diff #33893)

register name change?

test/CodeGen/ARM/cmpxchg-weak.ll
11 ↗(On Diff #34296)

unrelated change?

34 ↗(On Diff #34296)

unrelated change?

test/CodeGen/ARM/ifcvt-branch-weight-bug.ll
26 ↗(On Diff #34296)

bb#4->bb#5 : unrelated change.

test/CodeGen/ARM/vector-promotion.ll
56 ↗(On Diff #34296)

unrelated change

congh updated this revision to Diff 34369.Sep 9 2015, 2:54 PM

Update the patch by considering rounding when scaling an integer by a probability. Also update test cases accordingly.

congh added inline comments.Sep 9 2015, 3:08 PM
test/CodeGen/AArch64/aarch64-deferred-spilling.ll
26 ↗(On Diff #33893)

Fixed. This test case keeps unchanged after the patch update. See explains below.

test/CodeGen/ARM/cmpxchg-weak.ll
11 ↗(On Diff #34296)

This change is caused by fixed point probability representation. After investigating the root cause, I found that it is from if-conversion. More specifically, when determine whether to do if-conversion there is a cost model measurement in which the branch probability is used to scale some cost (in integers). The current scale method does not round the result, which may be a problem. For example, ARMBaseInstrInfo::isProfitableToIfCvt() has the following statements:

unsigned TUnpredCost = Probability.scale(TCycles);
unsigned FUnpredCost = Probability.getCompl().scale(FCycles);
unsigned UnpredCost = TUnpredCost + FUnpredCost;

Now assume Probability is 0.5 and both TCycles and FCycles are 1. If scale() does not round, then TUnpredCost and FUnpredCost are 0, and hence UnpredCost = TUnpredCost + FUnpredCost is also zero. This doesn't make sense as here if we calculate the cost in floating point we should get ~1 as UnpredCost.

We we do round in scale(), it is possible to get both TUnpredCost and FUnpredCost as 1, which is also incorrect. However, in our fixed point representation, 0.5 is actually a little less than 0.5, and its complement is a little greater than 0.5. Therefore we will have TUnpredCost = 0, and FUnpredCost = 1 after rounding. This is very hacky.

So I think we do need scale() to have rounding behavior, but in ARMBaseInstrInfo::isProfitableToIfCvt() to avoid precision related issues it is better to scale TCycles/FCycles up (e.g. x1000) before scaling them by Probability. This is find as the function returns the value of

return (TCycles + FCycles + TExtra + FExtra) <= UnpredCost;

and we can scale every operands up.

What do you think?

congh added a comment.Sep 9 2015, 3:10 PM

After supporing rounding in scale(), the if-conversion is enabled for several test cases and that is why they are updated accordingly.

congh added a comment.Sep 9 2015, 4:04 PM

Could we introduce fixed-point arithmetic class for general use?
(It'd be not urgent)

Sorry for the late reply! Do we need such a class in LLVM? Except BlockMass, do we have other usages or demands of fixed-point class? If yes, it is not difficult to abstract such a class from the new BranchProbability in this patch.

davidxl added a subscriber: davidxl.Sep 9 2015, 6:24 PM

I think this can be done as a follow up patch -- having
BranchProbability derive from Ratio which is derived from FixedPoint
Rep.

David

You explanation makes sense -- I saw the other patch to fix the
rounding bug in ifcvt.

David

congh updated this revision to Diff 34474.Sep 10 2015, 12:21 PM

Update the patch by fixing block frequency / branch probability unit tests failures.

I am not sure if this is the right way to update the unit tests. The fixed-point representation has precision issues comparing to the previous rational representation and that is why we need to update the comparison right answers in unit tests.

Cong, It is more common to use power of 2 as the scaling factor in
fixed point representation for computation efficiency. For 32bit
width, 0x80000000 is common (noted as Q31). Here are the benefits

  1. it is much faster to do the scale operation with Q31 -- instead of

using the current slow version of scale, it can call a faster version
to be introduced.

This will probably help resolve bug
https://llvm.org/bugs/show_bug.cgi?id=24620 where 20% of the time is
spent in the slow version of the scale method.

  1. it allows negative fractional number to be represented
  1. it is more precise for lots of common cases. For instance,

representation of 0.5, 0.25, 0.75 (i.e. 1/2, 1/4, 3/4) etc are
actually lossless with Q31 while the current choice of using
uint32_max can not represent them precisely.

(if needed in the future, we can also extend this to be a template
class where number of bits in the scaling factor is a template
parameter to support values > 1

template <int bits> class FixedPoint {
  static const int ScalingFactor = 1 << bits;

};
)

In the IR dump, I find it is easier to read

  1. percentage,
  2. M/N form when N is less than 10.

Current dump of 0x10ffff/0xffffffff form is hard to read -- can then
be simplified to make the denominator < 10?

Duncan, what is your opinion?

thanks,

David

congh updated this revision to Diff 34607.Sep 11 2015, 5:20 PM

Update the patch by using 1<<31 instead of UINT32_MAX as the constant denominator.

davidxl added inline comments.Sep 24 2015, 11:57 AM
lib/Support/BranchProbability.cpp
98 ↗(On Diff #34607)

Should D/2 be added to ProductLow above (and check overflow and carry bit)?

Besides, independent on what scale factor to use for BP, I think this change should not be included in this patch. It changes the rounding behavior of the scale() method (instead of truncating) which is irrelevant to the main purpose of the patch. This should be done as a follow up patch if it is desirable.

congh added inline comments.Sep 24 2015, 1:51 PM
lib/Support/BranchProbability.cpp
98 ↗(On Diff #34607)

It is the division by D that may cause precision loss and I think this is the proper place to add D/2.

It is OK to separate this change to another patch (if it is necessary). This will cause many test failures and I will fix them as long as this patch is approved.

davidxl added inline comments.Sep 24 2015, 2:00 PM
lib/Support/BranchProbability.cpp
98 ↗(On Diff #34607)

I would expect more test failures with this particular change as it differs from existing implementation. How many more tests fail without this change? (note that the rounding in the BP construction is still needed and kept)

congh added inline comments.Sep 24 2015, 2:43 PM
lib/Support/BranchProbability.cpp
98 ↗(On Diff #34607)

There are three regression test failures if we remove D/2 here and two are caused by changes on test cases in this patch. That is two say, without D/2 we only have one more test failure. Other failures are branch probability unit tests.

Without modifying any test cases, how many failures are there with D/2 vs
without?

David

This revision was automatically updated to reflect the committed changes.