Merge sets as described in the paper below provide a fast way to answer multiple
queries for basic blocks that may need PHI nodes, by pre-computing the merge sets
for all nodes in a dominator tree in constant time.
Dibyendu Das and U. Ramakrishna. 2005. A practical and fast iterative algorithm for φ-function computation using DJ graphs. ACM Trans. Program. Lang. Syst. 27, 3 (May 2005), 426-440.
Merge sets can be used to speed up code that currently calculates
iterated dominance frontiers multiple times to find nodes that need PHI
nodes when doing SSA updates.
One example is mem2reg. With merge sets, mem2reg/SROA runs about 3 times
faster for a SROA heavy test case I could find (
https://bugs.llvm.org/show_bug.cgi?id=15412)
With IDF
Total Execution Time: 33.5696 seconds (33.5756 wall clock) ---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name --- 10.2445 ( 31.0%) 0.1748 ( 31.4%) 10.4193 ( 31.0%) 10.4207 ( 31.0%) SROA #2 4.7518 ( 14.4%) 0.0120 ( 2.1%) 4.7637 ( 14.2%) 4.7635 ( 14.2%) Induction Variable Simplification 3.1591 ( 9.6%) 0.0248 ( 4.5%) 3.1839 ( 9.5%) 3.1862 ( 9.5%) Simple Register Coalescing
With merge sets (this patch)
Total Execution Time: 25.3381 seconds (25.3400 wall clock) ---User Time--- --System Time-- --User+System-- ---Wall Time--- --- Name --- 4.8915 ( 19.8%) 0.0146 ( 2.2%) 4.9061 ( 19.4%) 4.9061 ( 19.4%) Induction Variable Simplification 2.9898 ( 12.1%) 0.2932 ( 44.3%) 3.2830 ( 13.0%) 3.2838 ( 13.0%) SROA #2 2.9581 ( 12.0%) 0.0153 ( 2.3%) 2.9733 ( 11.7%) 2.9739 ( 11.7%) Simple Register Coalescing
I haven't done any micro tuning so far and the merge sets implementation
still needs a bit of polishing, but I thought it would be good to share
this early to get some high level feedback.
This potentially could be used in SSAUpdaterBulk in combination with a
fast renamer similar to the implementation in PredicateInfo (I have a
patch for such a renamer that I will post shortly).
is it okay to use non-ASCII characters? I'm not sure but wanted to point out.