Not only is the old algorithm essentially BFS instead of DFS, it weirdly
can mark heights as dirty, which then walks preds. Very strange. I think
a direct DFS that just computes the height of everything in the DAG from
the bottom up and avoids re-computing already computed heights is much
easier to understand.
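
For illustration only, here is a minimal sketch of the kind of bottom-up DFS
described above. It is not the code in this patch: `Node`, `Succs`,
`Latencies`, `HeightValid`, and `computeHeight` are hypothetical names chosen
for the example, and it assumes a node's height is the maximum over its
successors of the successor's height plus the edge latency.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a scheduling node; this is NOT LLVM's SUnit.
struct Node {
  std::vector<std::size_t> Succs;  // indices of successor nodes
  std::vector<unsigned> Latencies; // latency of the edge to each successor
  unsigned Height = 0;
  bool HeightValid = false;        // set once Height has been computed
};

// Computes the height of Root, and of every node reachable from it, using an
// explicit-stack post-order DFS. Nodes whose height is already valid are
// never recomputed. Assumes the graph is acyclic.
unsigned computeHeight(std::vector<Node> &DAG, std::size_t Root) {
  std::vector<std::size_t> Stack{Root};
  while (!Stack.empty()) {
    std::size_t Cur = Stack.back();
    Node &N = DAG[Cur];
    if (N.HeightValid) {
      // Already computed, possibly via another path through the DAG.
      Stack.pop_back();
      continue;
    }

    bool Ready = true;
    unsigned MaxHeight = 0;
    for (std::size_t I = 0; I < N.Succs.size(); ++I) {
      const Node &S = DAG[N.Succs[I]];
      if (!S.HeightValid) {
        // Visit this successor first, then come back to Cur.
        Stack.push_back(N.Succs[I]);
        Ready = false;
      } else {
        MaxHeight = std::max(MaxHeight, S.Height + N.Latencies[I]);
      }
    }
    if (!Ready)
      continue;

    // All successors are done, so this node's height is final.
    N.Height = MaxHeight;
    N.HeightValid = true;
    Stack.pop_back();
  }
  return DAG[Root].Height;
}
```

Each node's height is finalized exactly once, after all of its successors, so
nothing is ever marked dirty and no predecessors are walked.
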
For test/CodeGen/AMDGPU/spill-scavenge-offset.ll, prior to it having any
scheduler turned off, computing the height was over 25% of the runtime.
With this patch, it is completely gone. I'm measuring improvements from
32s to 24s (25%) in debug builds and 4.5s to 3.17s (30%) in an optimized
build for this test.
Glancing at it, the depth computation probably needs the same treatment,
but I've not yet found a test that exercises this (I've not looked too
hard yet though).
As with my prior patch, this impacts both SDAG scheduling and MI
scheduling, but for different reasons -- in this case, both schedulers
call the ScheduleDAG's getHeight routine heavily and were causing it to
show up in profiles for spill-scavenge-offset.ll.