This is an archive of the discontinued LLVM Phabricator instance.

[SelectionDAG] Get rid of recursion in CalcNodeSethiUllmanNumber
ClosedPublic

Authored by mkazantsev on Jun 1 2017, 4:40 AM.

Details

Summary

The recursive implementation of CalcNodeSethiUllmanNumber may
overflow stack on extremely long pred chains. This patch replaces it
with an equivalent iterative implementation.

Diff Detail

Repository
rL LLVM

Event Timeline

mkazantsev created this revision.Jun 1 2017, 4:40 AM
reames requested changes to this revision.Jun 8 2017, 11:07 AM
reames added inline comments.
lib/CodeGen/SelectionDAG/ScheduleDAGRRList.cpp
1864 ↗(On Diff #101004)

I think you can use a much simpler bit of code here. The key bit is that we can't compute the current node until all predecessors are done, but that (by assumption from the existing code) the graph is a tree. (Add an assert that enforces that please!) Also, Wikipedia helps here. :)

If you do something along the lines of:
while (!Worklist.empty())

Cur = Worklist.pop_back();
if (not all preds done) {
   push(Cur)
   push(each pred)
  continue
}
compute answer using preds

}

I think you get the same result right? This will eliminate the need for the intermediate state.

This revision now requires changes to proceed.Jun 8 2017, 11:07 AM
mkazantsev edited edge metadata.
mkazantsev marked an inline comment as done.

Rewritten in a more clear manner.

lib/CodeGen/SelectionDAG/ScheduleDAGRRList.cpp
1864 ↗(On Diff #101004)

We still need a workstate to keep the current pred in order to avoid pushing an element more than once into the stack, but you are right, it can be done in a much more straightforward way. :)

reames requested changes to this revision.Jun 12 2017, 9:08 AM
reames added inline comments.
lib/CodeGen/SelectionDAG/ScheduleDAGRRList.cpp
1864 ↗(On Diff #101004)

Why do you need the predecessors in order or to track which ones have been previously processed?

This is a reduction over a set of child nodes. The order shouldn't matter and the "recursive" walk is responsible for filling in the value. It shouldn't even matter if the code is DFS or BFS if I'm reading the original right.

This revision now requires changes to proceed.Jun 12 2017, 9:08 AM
mkazantsev added inline comments.Jun 12 2017, 8:02 PM
lib/CodeGen/SelectionDAG/ScheduleDAGRRList.cpp
1864 ↗(On Diff #101004)

It's not for tracking order, it's for preventing us from pushing one node more than once. Imagine the case:

preds(a) = (x, y, z, b)
preds(b) = (x, y, z, c)
preds(c) = (x, y, z, d)

I need the iterator to not push x, y, z into stack 3 times.

mkazantsev added inline comments.Jun 13 2017, 3:34 AM
lib/CodeGen/SelectionDAG/ScheduleDAGRRList.cpp
1864 ↗(On Diff #101004)

Please note that Sethi-Ullman's algorithm described in Wikipedia works on trees, and here we are dealing with SUnits which are nodes in Scheduling DAG. So it is possible that one SUnit is a predecessor of many others. In this case we need to preserve the calculation order of recursive DFS unless we want one element to be pushed many times as in the example above. I just double-checked that it actually happens (and we fail the assertion if it does). So please don't be confused by Wilipedia's description of the algorithm. This modification is a bit different.

Argument accepted, minor comment below.

lib/CodeGen/SelectionDAG/ScheduleDAGRRList.cpp
1864 ↗(On Diff #101997)

Reduce nesting by inverting conditional or using impl helper pattern.

reames accepted this revision.Jun 19 2017, 8:22 PM

LGTM subject to previously stated minor comment being fixed.

This revision is now accepted and ready to land.Jun 19 2017, 8:22 PM
mkazantsev marked an inline comment as done.
This revision was automatically updated to reflect the committed changes.