We want to run the Machine Scheduler instead of the List Scheduler after RA.
Checked with a performance run on a Power 9 machine with SPEC 2006 and while some benchmarks improved and others degraded the geomean was slightly improved with the Machine Scheduler.
Details
Diff Detail
Event Timeline
I'm afraid we can't just do this without doing some more testing. This patch will indiscriminately replace the existing Post-RA scheduler for all PowerPC targets, yet we've only tested it on Power9.
We need to do one of two things:
- Do some more thorough testing to make sure this doesn't adversely affect at least the CPU's we know are in the field (Power7/8, A2, ??)
- Make this a decision that is based on the target CPU and thereby dependent on the function. I'm not sure how we'd actually accomplish this though. Perhaps if it's reasonable, we could schedule a pass that will getAnalysisIfAvailable<TargetPassConfig>() and then call substitutePass() based on the CPU - although this is just guesswork and I'm not sure if it would work.
Ran a series of tests on this patch.
Overall the results are positive. I'm going to list the degradations.
SPEC 2006
lbm - degrades 7.5%
overall geomean - unchanged
SPEC2017
perlbench - degrades 2.0%
lbm - degrades 2.0%
overall geomean - 1.5% improvement!
lbm
MultiSource/Benchmarks/TSVC/IndirectAddressing-dbl/IndirectAddressing-dbl - degrades 6%
MultiSource/Applications/viterbi/viterbi - degrades 5%
MultiSource/Benchmarks/TSVC/GlobalDataFlow-dbl/GlobalDataFlow-dbl - degrades 4%
overall geomean - unchanged
Other than the 7.5% degradation for lbm I was not able to find anything that was really worrying. I'm looking at that degradation now. However, that fix can be future work.
Hi Eric,
The machine scheduler is already enabled for Power PC.
We use the machine scheduler in the Pre-RA pass and so enableMachineScheduler already returns true. I'm just substituting the List Scheduler for the PostMachineScheduler in Post-RA.
Relatedly it might be a good idea to remove needsAggressiveScheduling and just update the few testcases. I don't think it's worth maintaining the difference for processors that haven't been released in the last decade.
(Current failing tests:
LLVM :: CodeGen/PowerPC/2008-10-28-f128-i32.ll LLVM :: CodeGen/PowerPC/2011-12-05-NoSpillDupCR.ll LLVM :: CodeGen/PowerPC/coalesce-ext.ll LLVM :: CodeGen/PowerPC/i1-to-double.ll LLVM :: CodeGen/PowerPC/lsa.ll LLVM :: CodeGen/PowerPC/ppc32-vacopy.ll LLVM :: CodeGen/PowerPC/save-cr-ppc32svr4.ll LLVM :: CodeGen/PowerPC/save-crbp-ppc32svr4.ll LLVM :: CodeGen/PowerPC/tls.ll
)
LGTM. I think we should go ahead with this.
I agree with the additional work Eric suggested as well. I believe we already use the MachineScheduler on all recent power cores as well as the BG cores, so we might as well make it an across-the-board setting and simplify the code.
Hi,
Sorry for the late reply. I have been distracted with something else for a little while.
I have added the patch to remove needsAggressiveScheduling() here:
https://reviews.llvm.org/D48663
I will update the tests for this item as well and commit it. Thank you both for the reviews.