This patch fixes the following FIXME in visitPHI():
  // FIXME: We should potentially be tracking values through phi nodes,
  // especially when they collapse to a single value due to deleted CFG edges
  // during inlining.
To do this, I maintain two data structures: the set of blocks found to be dead during inlining, and a mapping from each block to its known successor. For example,
  define i1 @outer9(i1 %cond) {
    %C = call i1 @inner9(i32 0, i1 %cond)
    ret i1 %C
  }

  define i1 @inner9(i32 %cond1, i1 %cond2) {
  entry:
    switch i32 %cond1, label %exit [ i32 0, label %zero
                                     i32 1, label %one
                                     i32 2, label %two ]

  zero:
    br label %exit

  one:
    br label %exit

  two:
    br i1 %cond2, label %two_true, label %two_false

  two_true:
    br label %exit

  two_false:
    br label %exit

  exit:
    %phi = phi i32 [0, %zero], [1, %one], [2, %two_true], [2, %two_false], [-1, %entry] ; Simplified to 0
    %cmp = icmp eq i32 %phi, 0
    call void @pad()
    store i32 0, i32* @glbl
    ret i1 %cmp
  }
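When @inner9 is inlined into @outer9, %cond1 is the constant 0, so the switch in block entry always branches to block zero. Blocks one, two, two_true and two_false are therefore unreachable because their incoming CFG edges are deleted. Since the only successor of block entry is block zero, the edge from entry to exit is never taken, and entry is not a live predecessor of exit during inlining. The only live incoming value of the phi node in block exit is the one from block zero, so the phi collapses to the constant 0.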
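The phi collapse described above can be sketched outside of LLVM as follows. This is only an illustration under assumptions: blocks are modeled as strings, and the names Incoming and collapsePhi are hypothetical and do not come from the patch. It takes the dead-block set and the known-successor map as given inputs and ignores every incoming edge that cannot be taken.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for one phi operand: a constant and the block it
// flows in from. Not an LLVM type.
struct Incoming {
  int Value;
  std::string Pred;
};

// Returns true and sets Result if all live incoming edges of the phi in
// PhiBlock carry the same constant. Result is unspecified when false is
// returned.
bool collapsePhi(const std::string &PhiBlock,
                 const std::vector<Incoming> &Edges,
                 const std::set<std::string> &DeadBlocks,
                 const std::map<std::string, std::string> &KnownSuccessors,
                 int &Result) {
  bool Found = false;
  for (const Incoming &In : Edges) {
    // An edge from a dead predecessor can never be taken.
    if (DeadBlocks.count(In.Pred))
      continue;
    // An edge is also dead if the predecessor is known to branch elsewhere.
    std::map<std::string, std::string>::const_iterator It =
        KnownSuccessors.find(In.Pred);
    if (It != KnownSuccessors.end() && It->second != PhiBlock)
      continue;
    // Two live edges with different constants: the phi does not collapse.
    if (Found && Result != In.Value)
      return false;
    Result = In.Value;
    Found = true;
  }
  return Found;
}
```

For the phi in block exit of @inner9, with dead blocks {one, two, two_true, two_false} and the known successor entry -> zero, only the operand [0, %zero] survives, so the sketch reports the constant 0.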
The algorithm used to find dead blocks is a simplified version of the one in GVN::addDeadBlock(). Basically, if all the incoming edges of a block are dead (each predecessor is either dead itself or known to branch elsewhere), the block is dead too.
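As a rough illustration of this propagation, here is a self-contained sketch on a toy CFG. It is not the code from the patch: ToyCFG, DeadBlockTracker and recordKnownSuccessor are hypothetical names, and blocks are modeled as strings. When a block's successor becomes known, its other successors are checked, and deadness is propagated with a worklist.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical toy CFG keyed by block name; not an LLVM data structure.
struct ToyCFG {
  std::map<std::string, std::vector<std::string>> Succs;
  std::map<std::string, std::vector<std::string>> Preds;

  void addEdge(const std::string &From, const std::string &To) {
    Succs[From].push_back(To);
    Preds[To].push_back(From);
  }
};

struct DeadBlockTracker {
  const ToyCFG &Graph;
  std::set<std::string> DeadBlocks;
  std::map<std::string, std::string> KnownSuccessors;

  explicit DeadBlockTracker(const ToyCFG &G) : Graph(G) {}

  // An edge is dead if its source is dead or is known to branch elsewhere.
  bool isEdgeDead(const std::string &Pred, const std::string &Succ) const {
    if (DeadBlocks.count(Pred))
      return true;
    std::map<std::string, std::string>::const_iterator It =
        KnownSuccessors.find(Pred);
    return It != KnownSuccessors.end() && It->second != Succ;
  }

  // A block becomes dead once every incoming edge is dead. Blocks without
  // predecessors (such as the entry block) are kept live.
  bool isNewlyDead(const std::string &BB) const {
    if (DeadBlocks.count(BB))
      return false;
    std::map<std::string, std::vector<std::string>>::const_iterator It =
        Graph.Preds.find(BB);
    if (It == Graph.Preds.end())
      return false;
    for (const std::string &P : It->second)
      if (!isEdgeDead(P, BB))
        return false;
    return true;
  }

  // Record that Curr always branches to Next, then grow the dead set.
  void recordKnownSuccessor(const std::string &Curr, const std::string &Next) {
    KnownSuccessors[Curr] = Next;
    std::map<std::string, std::vector<std::string>>::const_iterator It =
        Graph.Succs.find(Curr);
    if (It == Graph.Succs.end())
      return;
    for (const std::string &Succ : It->second) {
      if (Succ == Next || !isNewlyDead(Succ))
        continue;
      std::vector<std::string> Worklist(1, Succ);
      while (!Worklist.empty()) {
        std::string Dead = Worklist.back();
        Worklist.pop_back();
        if (!DeadBlocks.insert(Dead).second)
          continue; // Already processed.
        std::map<std::string, std::vector<std::string>>::const_iterator SI =
            Graph.Succs.find(Dead);
        if (SI == Graph.Succs.end())
          continue;
        for (const std::string &S : SI->second)
          if (isNewlyDead(S))
            Worklist.push_back(S);
      }
    }
  }
};
```

Applied to the CFG of @inner9 with the known successor entry -> zero, the sketch marks one, two, two_true and two_false as dead, while zero and exit stay live because exit still has a live edge from zero.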
This patch reduces the inline cost of the problematic callee in PR34173 from 235 to 65. The impact on SPEC is as follows:
Benchmark | Code Size (%) (+ is bigger) | Perf (%) (+ is faster) |
spec2000/gcc | 0.05 | -0.39 |
spec2006/gcc | -0.04 | -0.03 |
spec2006/h264ref | -0.05 | 0.15 |
spec2006/xalancbmk | 0.05 | -0.64 |
spec2017/blender | 0.02 | 0.54 |
spec2017/gcc | 0.04 | 0.03 |
spec2017/imagick | 0.01 | 1.03 |
spec2017/perlbench | 0.01 | -0.04 |
spec2017/xalancbmk | 0.04 | 0.29 |