This is an archive of the discontinued LLVM Phabricator instance.

[Scheduler] Treat weak edges uniformly at entry
AcceptedPublic

Authored by Joe_Nash on Jul 16 2021, 12:17 PM.

Details

Summary

There is a bug where the assert CurCycle >= SU->getDepth() in
SchedulePostRATDList::ScheduleNodeTopDown will fire when an instruction has only
weak predecessors and would otherwise be available at the entry of a scheduling
dag. For all nodes which are not immediately available, we respect the depth
(which respects weak edges) and do not move nodes from pending to available
until the proper depth is met. This patch makes that consistent for entry
available nodes as well.

Diff Detail

Event Timeline

Joe_Nash created this revision.Jul 16 2021, 12:17 PM
Joe_Nash requested review of this revision.Jul 16 2021, 12:17 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 16 2021, 12:17 PM

I would add a test case if I could, but the case I noticed the issue with is based on weak edges added by an out-of-tree pass similar to MacroFusion.

critson accepted this revision.Jul 20 2021, 1:10 AM

LGTM

I agree manipulation of the available queue here needs to be made consistent with the scheduling loop below it.

I wonder whether we want to be selecting these instructions (leaf with remaining weak preds) to the pending queue at all?
For example, if the condition on line 536 was extended to exclude SUnits with weak predecessors as well?
That said, in the absence of other leaf nodes then we do want to break weak conditions and schedule these SUnits.
Otherwise we risk not being able to schedule anything, so this seems to be the correct solution.

I guess with this change, in the presence of a single leaf with weak preds then CurCycle will be progressively raised until the assertion in ScheduleNodeTopDown passes?
And such a case is relatively atypical?

This revision is now accepted and ready to land.Jul 20 2021, 1:10 AM
fhahn added a comment.Jul 20 2021, 2:41 AM

Is it possible to add a test case?

LGTM

I agree manipulation of the available queue here needs to be made consistent with the scheduling loop below it.

I wonder whether we want to be selecting these instructions (leaf with remaining weak preds) to the pending queue at all?
For example, if the condition on line 536 was extended to exclude SUnits with weak predecessors as well?

The only time nodes are added to pending is right at the entry here, and in ReleaseSucc, which is called when strong predecessors are scheduled. So if leaves are not put onto the pending queue here they will never be scheduled

That said, in the absence of other leaf nodes then we do want to break weak conditions and schedule these SUnits.

Unfortunately, that is not what will happen. Without this change, we would get an assertion failure. With this change, we will stall.

Otherwise we risk not being able to schedule anything, so this seems to be the correct solution.

I guess with this change, in the presence of a single leaf with weak preds then CurCycle will be progressively raised until the assertion in ScheduleNodeTopDown passes?

Yes

And such a case is relatively atypical?

I would guess the incidence is zero, since it would cause an assertion failure without this patch.

Something like this patch is necessary to create consistency and avoid breakage, but I do not think it is possible to achieve good schedule results by feeding weak edges to the ListScheduler. For instance, in X86/testb-je-fusion.ll has the same results in many cases whether or not the PostRA scheduler is enabled.
@courbet

Is it possible to add a test case?

Do you know a way to create a synthetic schedule DAG for testing? Without that, I would say no, since I don't think we want to create a new schedule mutator purely for testing purposes?

Is it possible to add a test case?

Do you know a way to create a synthetic schedule DAG for testing? Without that, I would say no, since I don't think we want to create a new schedule mutator purely for testing purposes?

It will be a little bit tedious, but you can create a new unittest under llvm/unittests/CodeGen.

One approach may be to parse some MIR (for an example see MI/LiveIntervalTest.cpp).
Then add a simple DAG mutation to a custom instance of the PostRA scheduler and apply it.