This is an archive of the discontinued LLVM Phabricator instance.

[ARM] Use the Cortex-A57 sched model for Cortex-A72
ClosedPublic

Authored by samparker on Oct 23 2018, 3:43 AM.

Details

Reviewers

john.brawn
dmgreen
fhahn
javed.absar

Commits

rGa16667e79be9: [ARM] Use Cortex-A57 sched model for Cortex-A72
rL345272: [ARM] Use Cortex-A57 sched model for Cortex-A72

Summary

This mirrors what we already do for AArch64, and LNT scores improve by a geomean of 1.57%.

Diff Detail

Event Timeline

samparker created this revision. Oct 23 2018, 3:43 AM

Herald added subscribers: chrib, kristof.beyls, javed.absar. Oct 23 2018, 3:43 AM

LGTM. AFAIK cortex-a57 and cortex-a72 are close enough for this to be beneficial. AArch64 re-uses the A57 model for A72 too.

This revision is now accepted and ready to land. Oct 23 2018, 7:27 AM

Hey Florian,

It was brought to my attention that the scheduler still wasn't enabled because of the missing feature. I've now added this and the geomean improvement is 2.23%. I will shortly add a couple of tests too.

Added the a72 to a couple of scheduling tests, as well as the basic unroll one.

Herald added a subscriber: zzheng. Oct 24 2018, 12:43 AM

In D53562#1273812, @samparker wrote:

Hey Florian,

It was brought to my attention that the scheduler still wasn't enabled because of the missing feature. I've now added this and the geomean improvement is 2.23%. I will shortly add a couple of tests too.

So yes, to enable the machine scheduler for a core on ARM, FeatureUseMISched is required. I think it is worth splitting this into two changes: using the Cortex-A57 model for Cortex-A72, and enabling the machine scheduler for Cortex-A72. The A57 model should be more accurate for A72 than no model.

As for switching to the MachineScheduler, the last time I looked at it (about a year ago) I remember seeing some relatively big regressions on some benchmarks, so we might want to be a bit more cautious there, as there might be potential to tweak the scheduling heuristics for ARM. Also, I think it would be good to have numbers for a large set of benchmarks (test-suite + commercial ones).

Ok, fair enough, my LNT numbers show that the MISched results are more variable:

Regressions (%):

benchmark                   Model   Model + MISched
Stanford/FloatMM            -       40.1
McGill/queens               29.6    29.4
Stanford/Puzzle             16.05   15.6
FreeBench/mason/mason       -       9.27
Olden/power/power           -       7.61
VersaBench/ecbdes/ecbdes    -       5.25
BenchmarkGame/fannkuch      6.55    -
Fhourstones/fhourstones     5.75    -

Improvements (%):

benchmark                                             Model   Model + MISched
Stanford/Perm                                         21.7    22.53
Olden/mst/mst                                         22.05   22.21
FreeBench/fourinarow/fourinarow                       22.83   21.21
TSVC/CrossingThresholds-dbl/CrossingThresholds-dbl    14.17   19.52
TSVC/Searching-flt/Searching-flt                      16.26   19.23
TSVC/Searching-dbl/Searching-dbl                      16.28   19.15

I will remove the use of the feature for now.

cheers,
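For readers wondering how a positive geomean can coexist with a 40% regression on a single benchmark, here is a minimal Python sketch of a geometric mean over per-benchmark run-time ratios. It is not LNT's own code. The helper names `geomean` and `to_ratio` are hypothetical, and the conversion from a percent change to a new/old run-time ratio is an assumption about how the percentages in the tables were derived. The input is only the "Model + MISched" column of the rows listed above, not the full LNT run, so the result is not expected to reproduce the 2.23% figure.

```python
import math


def geomean(ratios):
    """Geometric mean of positive ratios, computed in log space."""
    if not ratios:
        raise ValueError("need at least one ratio")
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


def to_ratio(percent_change):
    """Convert a signed percent change in run time to a new/old ratio.

    Assumption: positive means slower (regression), negative means
    faster (improvement).
    """
    return 1.0 + percent_change / 100.0


# "Model + MISched" values from the tables above (listed rows only).
# Regressions are entered as positive, improvements as negative.
misched_changes = [
    40.1, 29.4, 15.6, 9.27, 7.61, 5.25,               # regressions
    -22.53, -22.21, -21.21, -19.52, -19.23, -19.15,   # improvements
]

ratio = geomean([to_ratio(p) for p in misched_changes])
print(f"geomean run-time ratio over the listed rows: {ratio:.4f}")
```

A ratio below 1.0 would mean the listed rows are faster on balance, and above 1.0 slower. Either way, a single geomean hides the spread, which is why the per-benchmark tables matter for the decision to leave FeatureUseMISched out.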

Closed by commit rL345272: [ARM] Use Cortex-A57 sched model for Cortex-A72 (authored by sam_parker). Oct 25 2018, 8:10 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path	Size
lib/Target/ARM/ARM.td	2 lines

Diff 170599

lib/Target/ARM/ARM.td

Context not available.
 	FeatureAvoidPartialCPSR,
 	FeatureCheapPredicableCPSR]>;

-	def : ProcNoItin<"cortex-a72", [ARMv8a, ProcA72,
+	def : ProcessorModel<"cortex-a72", CortexA57Model, [ARMv8a, ProcA72,
 	FeatureHWDivThumb,
 	FeatureHWDivARM,
 	FeatureCrypto,
Context not available.