For chains of triangles with small join blocks that can be tail duplicated, a
simple calculation of probabilities is insufficient. Tail duplication
can be profitable in 3 different ways for these cases:
- The post-dominators marked 50% are actually taken 56% (This shrinks with longer chains)
- The chains are statically correlated. Branch probabilities have a very U-shaped distribution. [http://nrs.harvard.edu/urn-3:HUL.InstRepos:24015805] If the branches in a chain are likely to be from the same side of the distribution as their predecessor, but are independent at runtime, this transformation is profitable. (Because the cost of being wrong is a small fixed cost, unlike the standard triangle layout where the cost of being wrong scales with the # of triangles.)
- The chains are dynamically correlated. If the probability that a previous branch was taken positively influences whether the next branch will be taken
We believe that 2 and 3 are common enough to justify the small margin in 1.
The code pre-scans a function's CFG to identify this pattern and marks the edges
so that the standard layout algorithm can use the computed results.
Should the default be 3?