We noticed that a source of performance difference between llvm and
gcc was in dead code elimination. This diff replaces ADCE with a new
implementation which will remove control flow and, under option, also
remove may-be-infinite loops which are dead. The patch current has
"ADCE_new.h" and "ADCE_new.cpp" to clarify that this is essentially a
full rewrite of the deleted "ADCE.h" and "ADCE.cpp", When ready to
land, I plan to change these names back to the originals.
The new algorithm uses the iterated dominance frontier to
incrementally identify control dependence sources for live operations
to determine when branch operations are live. It adds a new fix-up
phase to correct update phi-nodeswhen control flow changes.
Below are compile time and performance impacts based on Facebook
internal code sets.
Baseline ADCE compared to new ADCE compile time, -O3 removing loops
Comparing compile times for ~400 source files from folly code base
"baseline" is total time in seconds for all files
"net_difference" is increase due to changes
"percent" is increase expressed aa persent of baseline
"samples" is number of distinct tests, these vary in some cases
as runs with high variance in timing are excluded from summary
baseline net_difference percent samples .
28,591.41 165.91 0.58 413 .
Baseline ADCE compared with new ADCE, also removing loops
baseline net_difference percent samples .
28,591.02 215.35 0.75 411 .
Here are the impacts on runtimes for three different internal sets of
benchmarks, again compiled -O3
baseline, percent, samples as above
Without removing loops:
baseline total samples . 105,846,951,721,178.10 1.000 85 . 28,317,122,801,690.20 0.998 2 364,771,688,928.1 0.997 64 .
With removing loops: (adding -mllvm -adce-remove-loops)
baseline total samples . 105,845,678,704,074.98 1.000 78 . 29,901,291,181,631.10 0.988 280 . 332,140,718,056.76 0.982 61 .
Exceeds 80 columns.