This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contentst

-
llvm/lib/CodeGen/
-
lib/
-
CodeGen/
-
PostRASchedulerList.cpp

Differential D106176

[Scheduler] Treat weak edges uniformly at entry
AcceptedPublic

Authored by Joe_Nash on Jul 16 2021, 12:17 PM.

Download Raw Diff

Details

Reviewers

sunfish
critson
fhahn

Summary

There is a bug where the assert CurCycle >= SU->getDepth() in
SchedulePostRATDList::ScheduleNodeTopDown will fire when an instruction has only
weak predecessors and would otherwise be available at the entry of a scheduling
dag. For all nodes which are not immediately available, we respect the depth
(which respects weak edges) and do not move nodes from pending to available
until the proper depth is met. This patch makes that consistent for entry
available nodes as well.

Diff Detail

Repository: rG LLVM Github Monorepo

Unit TestsFailed

	Time	Test
	2,770 ms	x64 debian > libarcher.critical::critical.c
	2,780 ms	x64 debian > libarcher.critical::lock.c
	2,840 ms	x64 debian > libarcher.races::critical-unrelated.c
	2,790 ms	x64 debian > libarcher.races::lock-nested-unrelated.c
	3,020 ms	x64 debian > libarcher.races::lock-unrelated.c
		View Full Test Results (18 Failed)

Event Timeline

Joe_Nash created this revision.Jul 16 2021, 12:17 PM

Herald added subscribers: javed.absar, hiraditya. · View Herald TranscriptJul 16 2021, 12:17 PM

Joe_Nash requested review of this revision.Jul 16 2021, 12:17 PM

Herald added a project: Restricted Project. · View Herald TranscriptJul 16 2021, 12:17 PM

Herald added a subscriber: llvm-commits. · View Herald Transcript

I would add a test case if I could, but the case I noticed the issue with is based on weak edges added by an out-of-tree pass similar to MacroFusion.

Harbormaster completed remote builds in B114566: Diff 359411.Jul 16 2021, 12:55 PM

LGTM

I agree manipulation of the available queue here needs to be made consistent with the scheduling loop below it.

I wonder whether we want to be selecting these instructions (leaf with remaining weak preds) to the pending queue at all?
For example, if the condition on line 536 was extended to exclude SUnits with weak predecessors as well?
That said, in the absence of other leaf nodes then we do want to break weak conditions and schedule these SUnits.
Otherwise we risk not being able to schedule anything, so this seems to be the correct solution.

I guess with this change, in the presence of a single leaf with weak preds then CurCycle will be progressively raised until the assertion in ScheduleNodeTopDown passes?
And such a case is relatively atypical?

This revision is now accepted and ready to land.Jul 20 2021, 1:10 AM

Is it possible to add a test case?

In D106176#2889686, @critson wrote:

LGTM

I agree manipulation of the available queue here needs to be made consistent with the scheduling loop below it.

I wonder whether we want to be selecting these instructions (leaf with remaining weak preds) to the pending queue at all?
For example, if the condition on line 536 was extended to exclude SUnits with weak predecessors as well?

The only time nodes are added to pending is right at the entry here, and in ReleaseSucc, which is called when strong predecessors are scheduled. So if leaves are not put onto the pending queue here they will never be scheduled

That said, in the absence of other leaf nodes then we do want to break weak conditions and schedule these SUnits.

Unfortunately, that is not what will happen. Without this change, we would get an assertion failure. With this change, we will stall.

Otherwise we risk not being able to schedule anything, so this seems to be the correct solution.

I guess with this change, in the presence of a single leaf with weak preds then CurCycle will be progressively raised until the assertion in ScheduleNodeTopDown passes?

Yes

And such a case is relatively atypical?

I would guess the incidence is zero, since it would cause an assertion failure without this patch.

Something like this patch is necessary to create consistency and avoid breakage, but I do not think it is possible to achieve good schedule results by feeding weak edges to the ListScheduler. For instance, in X86/testb-je-fusion.ll has the same results in many cases whether or not the PostRA scheduler is enabled.
@courbet

In D106176#2889833, @fhahn wrote:

Is it possible to add a test case?

Do you know a way to create a synthetic schedule DAG for testing? Without that, I would say no, since I don't think we want to create a new schedule mutator purely for testing purposes?

In D106176#2890256, @Joe_Nash wrote:

In D106176#2889833, @fhahn wrote:

Is it possible to add a test case?

Do you know a way to create a synthetic schedule DAG for testing? Without that, I would say no, since I don't think we want to create a new schedule mutator purely for testing purposes?

It will be a little bit tedious, but you can create a new unittest under llvm/unittests/CodeGen.

One approach may be to parse some MIR (for an example see MI/LiveIntervalTest.cpp).
Then add a simple DAG mutation to a custom instance of the PostRA scheduler and apply it.

Revision Contents

Path

Size

llvm/

lib/

CodeGen/

PostRASchedulerList.cpp

9 lines

Diff 359411

llvm/lib/CodeGen/PostRASchedulerList.cpp

Show First 20 Lines • Show All 528 Lines • ▼ Show 20 Lines	void SchedulePostRATDList::ListScheduleTopDown() {

// Release any successors of the special Entry node.		// Release any successors of the special Entry node.
ReleaseSuccessors(&EntrySU);		ReleaseSuccessors(&EntrySU);

// Add all leaves to Available queue.		// Add all leaves to Available queue.
for (unsigned i = 0, e = SUnits.size(); i != e; ++i) {		for (unsigned i = 0, e = SUnits.size(); i != e; ++i) {
// It is available if it has no predecessors.		// It is available if it has no predecessors.
if (!SUnits[i].NumPredsLeft && !SUnits[i].isAvailable) {		if (!SUnits[i].NumPredsLeft && !SUnits[i].isAvailable) {
		if (SUnits[i].getDepth() > 0) {
		assert(SUnits[i].WeakPredsLeft > 0 && "Expected a weak predecessor");
		PendingQueue.push_back(&SUnits[i]);
		} else {
AvailableQueue.push(&SUnits[i]);		AvailableQueue.push(&SUnits[i]);
SUnits[i].isAvailable = true;		SUnits[i].isAvailable = true;
}		}
}		}
		}

// In any cycle where we can't schedule any instructions, we must		// In any cycle where we can't schedule any instructions, we must
// stall or emit a noop, depending on the target.		// stall or emit a noop, depending on the target.
bool CycleHasInsts = false;		bool CycleHasInsts = false;

// While Available queue is not empty, grab the node with the highest		// While Available queue is not empty, grab the node with the highest
// priority. If it is not ready put it back. Schedule the node.		// priority. If it is not ready put it back. Schedule the node.
std::vector<SUnit*> NotReady;		std::vector<SUnit*> NotReady;
▲ Show 20 Lines • Show All 153 Lines • Show Last 20 Lines