Page MenuHomePhabricator

[InlineCost] Tracking Values through PHI Nodes

Authored by haicheng on Oct 5 2017, 12:36 PM.



This patch fix this FIXME in visitPHI()

// FIXME: We should potentially be tracking values through phi nodes,
// especially when they collapse to a single value due to deleted CFG edges
// during inlining.

To do this, I maintain two lists: dead blocks during inlining and a mapping of the blocks to their known successors. For example,

define i1 @outer9(i1 %cond) {             
  %C = call i1 @inner9(i32 0, i1 %cond)
  ret i1 %C
define i1 @inner9(i32 %cond1, i1 %cond2) {
  switch i32 %cond1, label %exit [ i32 0, label %zero
                                   i32 1, label %one
                                   i32 2, label %two ]
  br label %exit

  br label %exit    

  br i1 %cond2, label %two_true, label %two_false
  br label %exit
  br label %exit
  %phi = phi i32 [0, %zero], [1, %one], [2, %two_true], [2, %two_false], [-1, %entry] ; Simplified to 0
  %cmp = icmp eq i32 %phi, 0
  call void @pad()
  store i32 0, i32* @glbl
  ret i1 %cmp

Blocks one, two, two_true, two_false are unreachable because of the deleted CFG edges. The successor of block entry has to be block zero so block entry must not be the predecessor of block exit during inlining. Thus, the phi node in block exit can be collapsed to a constant 0.

The algorithm used to find dead blocks is a simplified version from GVN::addDeadBlock(). Basically, if all the predecessors of a block are dead, this block is dead too.

This patch can reduce the inlinecost of the problematic callee of PR34173 from 235 to 65. The impact to spec is as follows

BenchmarkCode Size (%)Perf (%)
(+ is bigger)(+ is faster)

Diff Detail


Event Timeline

haicheng created this revision.Oct 5 2017, 12:36 PM

Kindly Ping.

eraman edited edge metadata.Oct 18 2017, 5:16 PM

Thanks for working on this!

167 ↗(On Diff #117838)

nit: Kepp -> Keep

170 ↗(On Diff #117838)

known successors -> known unique successors may be helpful here .

495 ↗(On Diff #117838)

If you have a phi with null as one incoming value and a non-null pointer with a constant offset as the other incoming argument, this code would incorrectly treat the result of the phi as null. Inside the loop, both FirstC and FirstV will be non-null and after exiting the loop , SimplifiedValues[&I] will be set to FirstC.

1626 ↗(On Diff #117838)

This doesn't consider the case where a dead block's successor S has a predecessor P that is not dead, but that predecessor P has a known successor that is not S. Why not extend 1630 below to

return DeadBlocks.count(P) || (KnownSuccessors[P] && KnownSuccessors[P] != S)

haicheng updated this revision to Diff 119762.Oct 21 2017, 10:02 AM
haicheng marked 4 inline comments as done.

Addressed Easwaran's comments and added two test cases. Thank you very much.

Kindly Ping (#2)

eraman added inline comments.Nov 11 2017, 7:20 PM
495 ↗(On Diff #117838)

This version also suffers from the same issue if the order of the phi is reversed (you first see a non-null pointer with constant offset and then see the null). I think it will be easier to first check if you have seen a reachable incoming value and then check if it is constant or constant offset and then make sure the phi you see now is the same. Add a test case as well after fixing this.

haicheng updated this revision to Diff 123340.Nov 17 2017, 7:21 AM
haicheng marked an inline comment as done.

Address Easwaran's comments. Thank you very much.

Kindly Ping.

Kindly Ping (#2)

Kindly Ping (#3)

eraman accepted this revision.Dec 11 2017, 6:29 PM


1632 ↗(On Diff #123340)

You could move this above condition !DeadBlocks.count(S) && llvm::all_of(predecessors(S)...) to a lambda IsNewlyDead(BasicBlock *B) and then use that in line 1619 above (if (Succ == NextBB) || !IsNewlyDead(Succ)) continue; I prefer that but will leave it to you.

This revision is now accepted and ready to land.Dec 11 2017, 6:29 PM
haicheng updated this revision to Diff 126882.Dec 13 2017, 6:30 PM
haicheng marked an inline comment as done.

Thank you very much, Easwaran. The new code looks much simpler. I will commit soon.

This revision was automatically updated to reflect the committed changes.