Page MenuHomePhabricator

[SystemZ] Don't crash with -misched-cutoff
ClosedPublic

Authored by jonpa on Jan 10 2021, 6:04 PM.

Details

Summary

This is a fix for https://bugs.llvm.org/show_bug.cgi?id=45928, which reports that SystemZPostRASchedStrategy crashes with -mished-cutoff.

The reason seems to be that dangling pointers in the Available set were never cleared.

This patch clears the set if needed and then also skips advancing the HazardRecognizer through the unscheduled instructions.

Diff Detail

Event Timeline

jonpa created this revision.Jan 10 2021, 6:04 PM
jonpa requested review of this revision.Jan 10 2021, 6:04 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 10 2021, 6:04 PM

Could you elaborate why initPolicy is the correct place to clear the Available list? I'm wondering because the default implementation doesn't appear to do that either, it looks like common code only clears the list in the main "init" ...

Could you elaborate why initPolicy is the correct place to clear the Available list? I'm wondering because the default implementation doesn't appear to do that either, it looks like common code only clears the list in the main "init" ...

My main concern was that it is cleared before each region, and it seems that either SystemZPostRASchedStrategy::initialize() and SystemZPostRASchedStrategy::initPolicy() could work.

For a region containing just a single instruction (scheduling boundary), only SystemZPostRASchedStrategy::initPolicy() is called, which is why that method is used to update the hazard recognizer with those instructions as well so that it is accurate when the next region begins with actual scheduling.

My idea - based on the assumption that this is a compile-time issue - was to clear Available and also skip updating the HazardRecognizer in cases where the scheduling had reached the limit. I thought that since pickNode() is called for the first instruction in the region, that the cutoff happens for each region, but that was wrong - as soon as the cutoff has been reached, all scheduling stops. I don't understand really why the DAG is built and pickNode() is called for each region when no scheduling will occur...

I think we could probably only clear Available in either initPolicy() or initialize() - the cost of advancing the hazard recognizer should be much less than building the DAGs...?

I don't think we need to bother with skipping advancing the hazard state. I believe the main point of the cutoff is to avoid combinatorial explosion where there are many instructions to schedule and at each step there are many candidates to consider. Advancing the hazard state doesn't consider candidates and is therefore just a linear pass over instructions.

I'd simply clear the Available list once in ::initalize. That's where other MachineSchedStrategy implementations also clear their respective queues.

jonpa updated this revision to Diff 316187.Jan 12 2021, 12:01 PM

Patch updated per review.

The test "crashes now" without this patch, but I guess there is no guarantee that it is meaningful in the long-run given that it triggers asserts when accessing stale SU pointers...

This revision is now accepted and ready to land.Jan 13 2021, 1:07 AM