This is an archive of the discontinued LLVM Phabricator instance.

[ScheduleDAGRRList] Limit number of candidates to explore.
ClosedPublic

Authored by fhahn on Jul 22 2020, 7:59 AM.

Details

Reviewers

efriedma
paquette
niravd

Commits

rG2f8e6b5f3c86: [ScheduleDAGRRList] Limit number of candidates to explore.

Summary

Currently popFromQueueImpl iterates over all candidates to find the best
one. While the candidate queue is small, this is not a problem. But it
becomes a problem once the queue gets larger. For example, the snippet
below takes 330s to compile with llc -O0, but completes in 3s with this
patch.

define void @test(i4000000* %ptr) {
entry:
  store i4000000 0, i4000000* %ptr, align 4
  ret void
}

This patch limits the number of candidates to check to 1000. This limit
ensures that it never triggers for test-suite/SPEC2000/SPEC2006 on X86
and AArch64 with -O3, while still drastically limiting the compile-time
in case of very large queues.
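The bounded scan described above can be sketched outside of LLVM as follows. This is a minimal illustration, not the patch itself: `Node`, `HigherPriority`, and `popBestBounded` are hypothetical names, and the queue holds plain structs where the real scheduler holds `SUnit` pointers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a scheduling unit.
struct Node {
  int Id;
  int Priority;
};

// Hypothetical picker: returns true if B should be preferred over A.
struct HigherPriority {
  bool operator()(const Node &A, const Node &B) const {
    return A.Priority < B.Priority;
  }
};

// Removes and returns the best entry among the first Limit entries of Q.
// Entries beyond Limit are not examined, which bounds the cost per pop.
template <class Picker>
Node popBestBounded(std::vector<Node> &Q, Picker &Pick, std::size_t Limit) {
  assert(!Q.empty() && "queue must be non-empty");
  std::size_t BestIdx = 0;
  const std::size_t E = std::min(Q.size(), Limit);
  for (std::size_t I = 1; I < E; ++I)
    if (Pick(Q[BestIdx], Q[I]))
      BestIdx = I;
  Node V = Q[BestIdx];
  // Swap the winner to the back so that removal is O(1).
  if (BestIdx + 1 != Q.size())
    std::swap(Q[BestIdx], Q.back());
  Q.pop_back();
  return V;
}
```

The trade-off is that a better candidate sitting beyond the limit is not seen on that pop, so the result can differ from a full scan once the queue grows past the limit.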

It would be even better to use a binary heap to manage the queue
(D83335), but some heuristics change the score of a node in the queue
after another node has been scheduled. I plan to address this for
backends that use the MachineScheduler in the future, but that requires
a more careful evaluation. In the meantime, the limit should help users
impacted by this issue.
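To illustrate why changing scores get in the way of a heap, here is a small sketch using the standard heap algorithms. It is not taken from D83335; `ByScore` and `staleHeapDemo` are hypothetical names, and the external `Score` vector stands in for a heuristic whose result changes after another node has been scheduled.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Orders node ids by an external score table (hypothetical heuristic state).
struct ByScore {
  const std::vector<int> *Score;
  bool operator()(int A, int B) const { return (*Score)[A] < (*Score)[B]; }
};

// Returns true if the heap went stale after a score change and a rebuild
// restored the correct top element.
bool staleHeapDemo() {
  std::vector<int> Score = {10, 20, 30};
  std::vector<int> Heap = {0, 1, 2};
  ByScore Cmp{&Score};

  std::make_heap(Heap.begin(), Heap.end(), Cmp);
  assert(std::is_heap(Heap.begin(), Heap.end(), Cmp));
  bool TopWasTwo = Heap.front() == 2;

  // Simulate a heuristic lowering the score of a node still in the queue.
  Score[2] = 0;
  bool Stale = !std::is_heap(Heap.begin(), Heap.end(), Cmp);

  // Restoring the invariant needs an O(n) rebuild (or a targeted re-sift).
  std::make_heap(Heap.begin(), Heap.end(), Cmp);
  bool Fixed = Heap.front() == 1;

  return TopWasTwo && Stale && Fixed;
}
```

The standard heap functions assume the ordering of stored elements does not change between calls, so any heuristic that alters scores in place would require extra bookkeeping to keep the queue valid.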

The patch includes a slightly smaller version of the motivating example
as test case, to guard against the issue.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

fhahn created this revision. Jul 22 2020, 7:59 AM

Herald added a project: Restricted Project. Jul 22 2020, 7:59 AM

Herald added subscribers: hiraditya, kristof.beyls, MatzeB.

fhahn mentioned this in D83335: [ScheduleDAGRRList] Use std::*_heap() to keep candidate queue a heap. Jul 22 2020, 8:00 AM

Harbormaster failed remote builds in B65235: Diff 279829! Jul 22 2020, 8:04 AM

LGTM

This revision is now accepted and ready to land. Jul 22 2020, 2:34 PM

Closed by commit rG2f8e6b5f3c86: [ScheduleDAGRRList] Limit number of candidates to explore. (authored by fhahn). Jul 23 2020, 3:44 AM

This revision was automatically updated to reflect the committed changes.

RKSimon added a subscriber: RKSimon. Jul 23 2020, 10:29 AM

RKSimon added inline comments.

llvm/test/CodeGen/X86/stress-scheduledagrrlist.ll
12	@fhahn Should this be in x86 or generic codegen tests? I wouldn't have noticed it, but it's causing a notable increase in runtime for the x86 codegen tests.

fhahn marked an inline comment as done. Jul 23 2020, 10:31 AM

fhahn added inline comments.

llvm/test/CodeGen/X86/stress-scheduledagrrlist.ll
12	It's intentional for X86 to check that we don't have excessive compile-times (the patch improves compile-time by ~ 100x). On my system, it takes ~1-2 seconds, but if that's too long we can reduce the width of the type further.

RKSimon added inline comments. Jul 23 2020, 11:59 AM

llvm/test/CodeGen/X86/stress-scheduledagrrlist.ll
12	Hmm - please can you check an EXPENSIVE_CHECKS build? I'm seeing > 6min for this.

fhahn marked an inline comment as done. Jul 23 2020, 12:04 PM

fhahn added inline comments.

llvm/test/CodeGen/X86/stress-scheduledagrrlist.ll
12	Oh right, I think there are some very expensive checks somewhere in SelectionDAG, and there will be a very large number of nodes. Is there a way to disable tests for builds with expensive checks? Otherwise we should probably remove the test.

RKSimon added inline comments. Jul 25 2020, 7:38 AM

llvm/test/CodeGen/X86/stress-scheduledagrrlist.ll
12	I can't think of a way for lit to check for non EXPENSIVE_CHECKS builds - and I'd be worried that such an approach would be abused tbh.

These appear to be red due to timeouts:

http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win
http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-ubuntu

In D84328#2174045, @RKSimon wrote:

These appear to be red due to timeouts:

http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win
http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-ubuntu

OK, I removed the test in rGc09a10845b42. It seems like there's no way to work around the expensive-checks issue, and reducing the width of the store much further would defeat the actual purpose of the test.

Revision Contents

Path                                                   Size
llvm/lib/CodeGen/SelectionDAG/ScheduleDAGRRList.cpp    16 lines
llvm/test/CodeGen/X86/stress-scheduledagrrlist.ll      12 lines

Diff 280063

llvm/lib/CodeGen/SelectionDAG/ScheduleDAGRRList.cpp

 protected:
   bool canClobber(const SUnit *SU, const SUnit *Op);
   void AddPseudoTwoAddrDeps();
   void PrescheduleNodesWithMultipleUses();
   void CalculateSethiUllmanNumbers();
 };

 template<class SF>
 static SUnit *popFromQueueImpl(std::vector<SUnit *> &Q, SF &Picker) {
-  std::vector<SUnit *>::iterator Best = Q.begin();
-  for (auto I = std::next(Q.begin()), E = Q.end(); I != E; ++I)
-    if (Picker(*Best, *I))
-      Best = I;
-  SUnit *V = *Best;
-  if (Best != std::prev(Q.end()))
-    std::swap(*Best, Q.back());
+  unsigned BestIdx = 0;
+  // Only compute the cost for the first 1000 items in the queue, to avoid
+  // excessive compile-times for very large queues.
+  for (unsigned I = 1, E = std::min(Q.size(), 1000ul); I != E; I++)
+    if (Picker(Q[BestIdx], Q[I]))
+      BestIdx = I;
+  SUnit *V = Q[BestIdx];
+  if (BestIdx + 1 != Q.size())
+    std::swap(Q[BestIdx], Q.back());
   Q.pop_back();
   return V;
 }

 template<class SF>
 SUnit *popFromQueue(std::vector<SUnit *> &Q, SF &Picker, ScheduleDAG *DAG) {
 #ifndef NDEBUG
   if (DAG->StressSched) {
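A portability note on the `std::min(Q.size(), 1000ul)` bound in the diff above: `std::min` deduces a single type from both arguments, so the call only compiles where the queue's size type is `unsigned long`. On a target where `std::size_t` is `unsigned long long`, the two argument types differ and deduction fails. One way to sidestep this, sketched here with a hypothetical `scanBound` helper (not part of the patch, and assuming C++14 or later for `constexpr std::min`), is to name the template argument explicitly:

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper: how many queue entries a bounded scan would examine.
// Spelling out the template argument converts both operands to std::size_t,
// so the call does not depend on which integer type std::size_t aliases.
constexpr std::size_t scanBound(std::size_t QueueSize) {
  return std::min<std::size_t>(QueueSize, 1000);
}

static_assert(scanBound(5) == 5, "small queues are scanned in full");
static_assert(scanBound(4000000) == 1000, "large queues are capped");
```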

llvm/test/CodeGen/X86/stress-scheduledagrrlist.ll

This file was added.

; RUN: llc -O0 -mtriple=x86_64-apple-macosx %s -o %t.s

; Stress test for the list scheduler. The store will be expanded to a very
; large number of stores during isel, stressing ScheduleDAGRRList. It should
; compile in a reasonable amount of time. Run with -O0, to disable most other
; optimizations.

define void @test(i1000000* %ptr) {
entry:
  store i1000000 0, i1000000* %ptr, align 4
  ret void
}