Propagation of profile samples through the CFG.

This adds a propagation heuristic to convert instruction samples
into branch weights. The heuristic is similar to the one
implemented by Dehao Chen in GCC.

The propagation proceeds in 3 phases:

1- Assignment of block weights. All the basic blocks in the function
are initially assigned the same weight as their most frequently
executed instruction.
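A minimal sketch of phase 1, not the code in the patch. It assumes a hypothetical model in which each basic block is just the list of sample counts attached to its instructions; the names InstructionSamples and initialBlockWeights are illustrative only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model: one sample count per instruction in a block
// (0 when the profile has no sample for that instruction).
using InstructionSamples = std::vector<std::uint64_t>;

// Phase 1: each block starts with the weight of its most frequently
// executed instruction. A block without instructions gets weight 0.
std::vector<std::uint64_t>
initialBlockWeights(const std::vector<InstructionSamples> &blocks) {
  std::vector<std::uint64_t> weights;
  weights.reserve(blocks.size());
  for (const InstructionSamples &samples : blocks) {
    std::uint64_t weight =
        samples.empty() ? 0
                        : *std::max_element(samples.begin(), samples.end());
    weights.push_back(weight);
  }
  assert(weights.size() == blocks.size()); // one weight per block
  return weights;
}
```

A block whose instructions were sampled 3, 10 and 7 times would start with weight 10 under this model.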

2- Creation of equivalence classes. Since samples may be missing from
blocks, we can fill in the gaps by setting the weights of all the
blocks in the same equivalence class to the same weight. To determine
equivalence, we use dominance and loop information. Two blocks B1
and B2 are in the same equivalence class if B1 dominates B2, B2
post-dominates B1 and both are in the same loop.
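The equivalence test can be sketched as follows. This is an illustration under several assumptions, not the patch itself: dominance, post-dominance and loop membership are taken as precomputed inputs in a hypothetical FunctionInfo struct, blocks are numbered so that a dominator is visited before the blocks it dominates, and the shared weight is taken to be the largest weight in the class (the text above does not say which weight is chosen).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical precomputed analysis results for one function.
struct FunctionInfo {
  std::vector<std::vector<bool>> dominates;     // dominates[a][b]: a dominates b
  std::vector<std::vector<bool>> postDominates; // postDominates[a][b]: a post-dominates b
  std::vector<int> loopId;                      // innermost loop of each block, -1 if none
};

// Groups blocks into equivalence classes and gives every block in a class
// the same weight (assumption: the maximum weight seen in the class).
// Returns, for each block, the index of the block heading its class.
std::vector<int> buildEquivalenceClasses(const FunctionInfo &info,
                                         std::vector<std::uint64_t> &weight) {
  const std::size_t n = weight.size();
  assert(info.dominates.size() == n && info.postDominates.size() == n &&
         info.loopId.size() == n);
  std::vector<int> classHead(n, -1);
  for (std::size_t b1 = 0; b1 < n; ++b1) {
    if (classHead[b1] != -1)
      continue; // already placed in an earlier class
    classHead[b1] = static_cast<int>(b1);
    std::uint64_t classWeight = weight[b1];
    for (std::size_t b2 = 0; b2 < n; ++b2) {
      if (b2 == b1 || classHead[b2] != -1)
        continue;
      // B1 dominates B2, B2 post-dominates B1, and both share a loop.
      if (info.dominates[b1][b2] && info.postDominates[b2][b1] &&
          info.loopId[b1] == info.loopId[b2]) {
        classHead[b2] = static_cast<int>(b1);
        classWeight = std::max(classWeight, weight[b2]);
      }
    }
    for (std::size_t b = 0; b < n; ++b)
      if (classHead[b] == static_cast<int>(b1))
        weight[b] = classWeight;
  }
  return classHead;
}
```

In a diamond-shaped CFG (entry, then, else, exit) where the entry block has no samples and the exit block has weight 100, the entry and exit blocks land in the same class, so the entry block also ends up with weight 100.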

3- Propagation of block weights into edges. This uses a simple
propagation heuristic. The following rules are applied to every
block B in the CFG:

   - If B has a single predecessor/successor, then the weight of that
     edge is the weight of the block.

   - If all the edges are known except one, and the weight of the
     block is already known, the weight of the unknown edge will be
     the weight of the block minus the sum of all the known edges. If
     the sum of all the known edges is larger than B's weight, we set
     the unknown edge weight to zero.

   - If there is a self-referential edge, and the weight of the block
     is known, the weight for that edge is set to the weight of the
     block minus the weight of the other incoming edges to that block
     (if known).

Since this propagation is not guaranteed to terminate for every CFG, we
only allow it to proceed for a limited number of iterations (controlled
by -sample-profile-max-propagate-iterations). The default is 100, the
same default GCC uses.
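A simplified sketch of the bounded propagation loop, under stated assumptions: every block weight is already known after phases 1 and 2, the predecessor and successor lists contain no duplicates, and the self-referential edge rule is left out for brevity. The single predecessor/successor rule falls out as the special case where the one unknown edge is the only edge. The names Cfg, solveOneSide and propagateWeights are hypothetical and do not come from the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

using Block = int;
using Edge = std::pair<Block, Block>; // (source, destination)
using EdgeWeights = std::map<Edge, std::uint64_t>;

// Hypothetical CFG model: unique predecessors/successors per block and a
// known weight for every block.
struct Cfg {
  std::vector<std::vector<Block>> preds;
  std::vector<std::vector<Block>> succs;
  std::vector<std::uint64_t> blockWeight;
};

// If exactly one edge in `edges` has no weight yet, assign it the block
// weight minus the sum of the known edges, clamped at zero.
// Returns true if a new edge weight was assigned.
static bool solveOneSide(const std::vector<Edge> &edges,
                         std::uint64_t blockWeight, EdgeWeights &known) {
  std::uint64_t knownSum = 0;
  const Edge *unknown = nullptr;
  int unknownCount = 0;
  for (const Edge &e : edges) {
    auto it = known.find(e);
    if (it != known.end()) {
      knownSum += it->second;
    } else {
      unknown = &e;
      ++unknownCount;
    }
  }
  if (unknownCount != 1)
    return false;
  known[*unknown] = blockWeight > knownSum ? blockWeight - knownSum : 0;
  return true;
}

// Repeats the rule over all blocks until nothing changes or the iteration
// cap is reached (100 mirrors the default mentioned above).
EdgeWeights propagateWeights(const Cfg &cfg, unsigned maxIterations = 100) {
  assert(cfg.preds.size() == cfg.blockWeight.size() &&
         cfg.succs.size() == cfg.blockWeight.size());
  EdgeWeights known;
  bool changed = true;
  for (unsigned i = 0; changed && i < maxIterations; ++i) {
    changed = false;
    for (Block b = 0; b < static_cast<Block>(cfg.blockWeight.size()); ++b) {
      std::vector<Edge> in, out;
      for (Block p : cfg.preds[b])
        in.push_back({p, b});
      for (Block s : cfg.succs[b])
        out.push_back({b, s});
      changed |= solveOneSide(in, cfg.blockWeight[b], known);
      changed |= solveOneSide(out, cfg.blockWeight[b], known);
    }
  }
  return known;
}
```

For a diamond CFG with block weights 100, 30, 70 and 100, this assigns 30 to both edges through the first arm and 70 to both edges through the second. The cap matters only for CFGs where the loop keeps finding work; in this sketch each edge is assigned at most once, so it stops on its own.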

Before propagation starts, the pass builds (for each block) a list of
unique predecessors and successors. This is necessary to handle
identical edges in multiway branches. Since we visit all blocks and all
edges of the CFG, it is cleaner to build these lists once at the start
of the pass.
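To illustrate why the lists are deduplicated: a multiway branch whose cases jump to the same target lists that target several times, and counting it more than once would skew the edge sums. A minimal sketch, where uniqueBlocks is a hypothetical helper rather than a function from the patch:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

using Block = int;

// Returns the given predecessor or successor list with repeated blocks
// dropped, keeping the order in which each block was first seen.
std::vector<Block> uniqueBlocks(const std::vector<Block> &blocks) {
  std::vector<Block> result;
  for (Block b : blocks)
    if (std::find(result.begin(), result.end(), b) == result.end())
      result.push_back(b);
  assert(result.size() <= blocks.size()); // deduplication never adds blocks
  return result;
}
```

The linear search keeps the sketch short; it is quadratic in the list length, which is acceptable for the small fan-out typical of a CFG node but is a design choice of this example only.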

Finally, the patch fixes the computation of relative line locations.
The profiler emits lines relative to the function header. To find the
header line, we traverse the compilation unit looking for the
subprogram corresponding to the function. The line number of that
subprogram is the line where the function begins. That becomes line
zero for all the relative locations.
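The offset arithmetic can be sketched as below. This assumes a hypothetical profile representation that maps a line offset from the function header to a sample count; FunctionSamples, relativeLine and samplesForLine are illustrative names, not the patch's API.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical profile body: sample counts keyed by line offset from the
// function header (the header itself is offset 0).
using FunctionSamples = std::map<unsigned, std::uint64_t>;

// Converts an absolute source line into an offset from the header line.
unsigned relativeLine(unsigned headerLine, unsigned absoluteLine) {
  assert(absoluteLine >= headerLine && "line precedes the function header");
  return absoluteLine - headerLine;
}

// Looks up the samples for an instruction at `absoluteLine`. Lines before
// the header, or lines without samples, yield 0 in this sketch.
std::uint64_t samplesForLine(const FunctionSamples &profile,
                             unsigned headerLine, unsigned absoluteLine) {
  if (absoluteLine < headerLine)
    return 0;
  auto it = profile.find(relativeLine(headerLine, absoluteLine));
  return it == profile.end() ? 0 : it->second;
}
```

With a function whose header is on line 40, an instruction on line 43 is looked up at offset 3.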