Since the analysis is described to be suitable for a forward data-flow analysis, maintaining the worklist as a queue is closer to the RPO ordering of block visits, thus reaching the fixpoint earlier.
For example,
// mlir-opt -mlir-disable-threading -allow-unregistered-dialect -pass-pipeline="builtin.func(sccp)" sccp-iterations.mlir
func @simple_control_flow(%arg0 : i32, %arg1 : i1) -> i32 {
cond_br %arg1, ^bb1, ^bb3
^bb1:
%1 = arith.constant 1 : i32
br ^bb2(%1 : i32)
^bb3:
%3 = arith.constant 2 : i32
br ^bb2(%3 : i32)
^bb2(%arg : i32):
%2 = arith.constant 3 : i32
%4 = arith.addi %arg, %2 : i32
return %4 : i32
}Running SCCP on this code results in the following visits when using a stack (currently) to maintain the worklist (I modified visitOperation in SCCP.cpp to print the operation being visited):
(SCCP): Visiting %c2_i32 = arith.constant 2 : i32
(SCCP): Visiting %c3_i32 = arith.constant 3 : i32
(SCCP): Visiting %1 = arith.addi %0, %c3_i32 : i32
(SCCP): Visiting %c1_i32 = arith.constant 1 : i32
(SCCP): Visiting %1 = arith.addi %0, %c3_i32 : i32
module {
func @simple_control_flow(%arg0: i32, %arg1: i1) -> i32 {
%c3_i32 = arith.constant 3 : i32
%c2_i32 = arith.constant 2 : i32
%c1_i32 = arith.constant 1 : i32
cond_br %arg1, ^bb1, ^bb2
^bb1: // pred: ^bb0
br ^bb3(%c1_i32 : i32)
^bb2: // pred: ^bb0
br ^bb3(%c2_i32 : i32)
^bb3(%0: i32): // 2 preds: ^bb1, ^bb2
%1 = arith.addi %0, %c3_i32 : i32
return %1 : i32
}
}But with the proposed change to use a queue instead, the visits are:
(SCCP): Visiting %c1_i32 = arith.constant 1 : i32
(SCCP): Visiting %c2_i32 = arith.constant 2 : i32
(SCCP): Visiting %c3_i32 = arith.constant 3 : i32
(SCCP): Visiting %1 = arith.addi %0, %c3_i32 : i32
module {
func @simple_control_flow(%arg0: i32, %arg1: i1) -> i32 {
%c3_i32 = arith.constant 3 : i32
%c2_i32 = arith.constant 2 : i32
%c1_i32 = arith.constant 1 : i32
cond_br %arg1, ^bb1, ^bb2
^bb1: // pred: ^bb0
br ^bb3(%c1_i32 : i32)
^bb2: // pred: ^bb0
br ^bb3(%c2_i32 : i32)
^bb3(%0: i32): // 2 preds: ^bb1, ^bb2
%1 = arith.addi %0, %c3_i32 : i32
return %1 : i32
}
}On the longer run, we could explore more complex strategies (not to mention the simple improvement of checking if a node is already in the queue before inserting it) as is described in "Iterative Data-flow Analysis - Revisited" (Keith D. Cooper, Timothy J. Harvey, and Ken Kennedy).
I'm not sure what's the right way to add a test for this. Perhaps add a statistic to count when an Operation is visited, print that and CHECK for that in the unit test?
System includes should go after the others:
https://llvm.org/docs/CodingStandards.html#include-style