When choosing between blocks of approximately the same probability, we need some
other metric to choose between them. In this patch, we add the simple metric of
"Branch Factor", which is the number of independent blocks that can be reached
from a block without crossing a join point (A block with multiple successors).
This is related to the classical "Branch Factor" similarly to how a 2-3-4 tree
is related to a red-black tree. These nodes could in theory all be rolled up
into a single node like a switch. If we did that, the calculated value would be
the branching factor of that single node.
Another way of viewing this is if you don't cross join-points, you're
essentially finding a sub-tree in the CFG. We're counting the # of leaves in
that subtree.
The motivation here is that when choosing between a list of alternatives like
this:
if (v < (1<<7)) {
...
} else if (v < (1<<14)) {
...
} else if (v < (1<<21)) {
...
} else if (v < (1<<28)) {
...
} else {
...
}
That we should prefer to fall through to the next test, because those tests have
a larger number of branches they have to pass through in order to be executed.
Even if our estimates of 50% probabilities are correct for these branches,
Having the fallthrough go to the next test helps to reduce tail latency because
it reduces the # of taken branches to get to the bottom of the sub-tree of the
CFG. In the above example, each case ends up with 1 taken branch, instead of the
last case requiring 4 taken branches.
In the attached test case, the expected number of dynamic taken branches is
15/16 for both layouts. But with the patch, the maximum # of taken branches is
1 . Without the patch, it is 4.
Branch factor is a local heuristic. Ideally we would use a measure based on a
graph property like dominance, which would have a better theoretical
grounding. A good candidate would be exclusion: A excludes B if
A ∉ DomF_inf(B) && B ∉ DomF_inf(A) && A ∉ Dom(B) && B ∉ Dom(A).
This is the theoretical definition, but a good intuition is that if A executes,
then B has not, and will not, and vice versa. We are trying to use branch factor
to guess which chosen branch will have the smaller # of excluded blocks.