The recursive implementation of CalcNodeSethiUllmanNumber may
overflow the stack on extremely long predecessor chains. This patch
replaces it with an equivalent iterative implementation.
lib/CodeGen/SelectionDAG/ScheduleDAGRRList.cpp | ||
---|---|---|
1898 | I think you can use a much simpler bit of code here. The key point is that we can't compute the current node until all of its predecessors are done, but (by assumption from the existing code) the graph is a tree. (Please add an assert that enforces that!) Also, Wikipedia helps here. :) If you do something along the lines of: while (worklist not empty) { Cur = Worklist.pop_back(); if (not all preds done) { push(Cur); push(each pred); continue; } compute answer using preds; } I think you get the same result, right? This will eliminate the need for the intermediate state. |
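The loop suggested in the comment above can be sketched as follows. This is only an illustration under stated assumptions: `Node`, `combine`, and `computeNumbers` are hypothetical stand-ins rather than LLVM's `SUnit` API, and the reduction (maximum predecessor number, plus one for each additional predecessor tied at that maximum, with a minimum of 1) is an assumption about what the original function computes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a scheduling node; this is not LLVM's SUnit.
struct Node {
  std::vector<std::size_t> Preds; // indices of predecessor nodes
};

// Assumed reduction over predecessor numbers that are already computed.
unsigned combine(const Node &N, const std::vector<unsigned> &Numbers) {
  unsigned Max = 0, Extra = 0;
  for (std::size_t P : N.Preds) {
    unsigned V = Numbers[P];
    if (V > Max) {
      Max = V;
      Extra = 0;
    } else if (V == Max) {
      ++Extra;
    }
  }
  unsigned Result = Max + Extra;
  return Result == 0 ? 1 : Result; // a node without predecessors gets 1
}

// Worklist version: a node is computed only once all its preds are done.
std::vector<unsigned> computeNumbers(const std::vector<Node> &Graph,
                                     std::size_t Root) {
  assert(Root < Graph.size() && "root index out of range");
  std::vector<unsigned> Numbers(Graph.size(), 0); // 0 means "not computed"
  std::vector<std::size_t> Worklist{Root};
  while (!Worklist.empty()) {
    std::size_t Cur = Worklist.back();
    Worklist.pop_back();
    if (Numbers[Cur] != 0)
      continue; // already computed; never triggers on a true tree
    bool AllDone = true;
    for (std::size_t P : Graph[Cur].Preds)
      if (Numbers[P] == 0)
        AllDone = false;
    if (!AllDone) {
      Worklist.push_back(Cur); // revisit after the predecessors
      for (std::size_t P : Graph[Cur].Preds)
        if (Numbers[P] == 0)
          Worklist.push_back(P);
      continue;
    }
    Numbers[Cur] = combine(Graph[Cur], Numbers);
  }
  return Numbers;
}
```

Because the pending work lives in a heap-allocated vector, the depth of the predecessor chain no longer translates into call-stack depth. Note that when a predecessor is shared by several nodes, this form may push it more than once.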
Rewritten in a clearer manner.
lib/CodeGen/SelectionDAG/ScheduleDAGRRList.cpp | ||
---|---|---|
1898 | We still need a work state that keeps track of the current pred, so that no element is pushed onto the stack more than once, but you are right, it can be done in a much more straightforward way. :) |
lib/CodeGen/SelectionDAG/ScheduleDAGRRList.cpp | ||
---|---|---|
1898 | Why do you need the predecessors in order or to track which ones have been previously processed? This is a reduction over a set of child nodes. The order shouldn't matter and the "recursive" walk is responsible for filling in the value. It shouldn't even matter if the code is DFS or BFS if I'm reading the original right. |
lib/CodeGen/SelectionDAG/ScheduleDAGRRList.cpp | ||
---|---|---|
1898 | It's not for tracking order; it's for preventing us from pushing one node more than once. Imagine the case: preds(a) = (x, y, z, b), preds(b) = (x, y, z, c), preds(c) = (x, y, z, d). I need the iterator so that x, y, z are not pushed onto the stack three times. |
lib/CodeGen/SelectionDAG/ScheduleDAGRRList.cpp | ||
---|---|---|
1898 | Please note that the Sethi-Ullman algorithm described on Wikipedia works on trees, whereas here we are dealing with SUnits, which are nodes in the scheduling DAG. So it is possible for one SUnit to be a predecessor of many others. In that case we need to preserve the calculation order of the recursive DFS, unless we want one element to be pushed many times as in the example above. I just double-checked that this actually happens (and we fail the assertion if it does). So please don't be confused by Wikipedia's description of the algorithm. This modification is a bit different. |
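The point about shared predecessors can be illustrated with a depth-first walk that keeps a per-frame predecessor index, so it mirrors the recursive order and pushes each node at most once. This is a sketch, not the patch itself: `DagNode`, `Frame`, `DfsResult`, and `computeNumbersDfs` are hypothetical names, the graph is assumed to be acyclic, and the reduction is the same assumed one (maximum predecessor number, plus one per additional tie, minimum 1).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a node in a scheduling DAG; not LLVM's SUnit.
struct DagNode {
  std::vector<std::size_t> Preds; // indices of predecessor nodes
};

// One explicit "stack frame": the node and the next pred to look at.
struct Frame {
  std::size_t NodeIdx;
  std::size_t NextPred;
};

struct DfsResult {
  std::vector<unsigned> Numbers; // computed number per node, 0 = unreached
  std::size_t Pushes;            // how many frames were pushed in total
};

DfsResult computeNumbersDfs(const std::vector<DagNode> &Graph,
                            std::size_t Root) {
  assert(Root < Graph.size() && "root index out of range");
  DfsResult R{std::vector<unsigned>(Graph.size(), 0), 0};
  std::vector<Frame> Stack;
  Stack.push_back({Root, 0});
  ++R.Pushes;
  while (!Stack.empty()) {
    Frame &Top = Stack.back();
    const std::vector<std::size_t> &Preds = Graph[Top.NodeIdx].Preds;
    bool Descended = false;
    // Resume where this frame left off, skipping preds already computed.
    while (Top.NextPred < Preds.size()) {
      std::size_t P = Preds[Top.NextPred++];
      if (R.Numbers[P] == 0) {
        Stack.push_back({P, 0}); // invalidates Top; we leave immediately
        ++R.Pushes;
        Descended = true;
        break;
      }
    }
    if (Descended)
      continue;
    // All preds are computed: apply the assumed reduction.
    unsigned Max = 0, Extra = 0;
    for (std::size_t P : Preds) {
      unsigned V = R.Numbers[P];
      if (V > Max) {
        Max = V;
        Extra = 0;
      } else if (V == Max) {
        ++Extra;
      }
    }
    unsigned Value = Max + Extra;
    R.Numbers[Top.NodeIdx] = Value == 0 ? 1 : Value;
    Stack.pop_back();
  }
  return R;
}
```

On the example from the discussion, with a, b, c, d, x, y, z mapped to indices 0 through 6, each of the seven nodes is pushed exactly once: by the time b and c are visited, x, y, and z already have numbers and are skipped.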
Argument accepted, minor comment below.
lib/CodeGen/SelectionDAG/ScheduleDAGRRList.cpp | ||
---|---|---|
1864 | Reduce nesting by inverting conditional or using impl helper pattern. |