This is an archive of the discontinued LLVM Phabricator instance.

[ARM] Use the Cortex-A57 sched model for Cortex-A72
Closed, Public

Authored by samparker on Oct 23 2018, 3:43 AM.

Details

Summary

This mirrors what we already do for AArch64, and LNT scores improve by a geomean of 1.57%.
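As a rough illustration of how a geomean figure like this can be derived, the sketch below assumes the geomean is taken over per-benchmark execution-time ratios (new time divided by baseline time). The function name and the sample timings are hypothetical and are not taken from this review or from LNT itself.

```python
import math


def geomean_improvement(baseline_times, new_times):
    """Return the geometric-mean improvement in percent (positive = faster).

    Assumption: each benchmark contributes the ratio new_time / baseline_time,
    and the geomean of those ratios is converted to a percentage change.
    """
    if len(baseline_times) != len(new_times) or not baseline_times:
        raise ValueError("need two non-empty lists of equal length")
    # A ratio below 1 means the new build ran faster on that benchmark.
    log_sum = sum(
        math.log(new / base) for base, new in zip(baseline_times, new_times)
    )
    geomean_ratio = math.exp(log_sum / len(baseline_times))
    return (1.0 - geomean_ratio) * 100.0


# Hypothetical execution times in seconds, for illustration only.
baseline = [10.0, 4.0, 2.5]
with_model = [9.0, 4.0, 2.75]
print(f"geomean improvement: {geomean_improvement(baseline, with_model):.2f}%")
```

Because the geomean multiplies ratios, one benchmark that halves its time and another that doubles it cancel out exactly, which is why a small geomean gain can coexist with large individual regressions.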

Diff Detail

Repository
rL LLVM

Event Timeline

samparker created this revision. Oct 23 2018, 3:43 AM
fhahn accepted this revision. Oct 23 2018, 7:27 AM
fhahn added a reviewer: javed.absar.

LGTM. AFAIK cortex-a57 and cortex-a72 are close enough for this to be beneficial. AArch64 re-uses the A57 model for A72 too.

This revision is now accepted and ready to land. Oct 23 2018, 7:27 AM

Hey Florian,

It was brought to my attention that the scheduler still wasn't enabled because of the missing feature. I've now added this and the geomean improvement is 2.23%. I will shortly add a couple of tests too.

Added the a72 to a couple of scheduling tests, as well as the basic unroll one.

fhahn added a comment. Oct 24 2018, 4:55 AM

> Hey Florian,
>
> It was brought to my attention that the scheduler still wasn't enabled because of the missing feature. I've now added this and the geomean improvement is 2.23%. I will shortly add a couple of tests too.

So yes, to enable the machine scheduler for a core on ARM, FeatureUseMISched is required. I think it is worth splitting this into two changes: using the Cortex-A57 model, and enabling the machine scheduler for Cortex-A72. The A57 model should be more accurate for A72 than no model.

As for switching to the MachineScheduler, the last time I looked at it (about a year ago), I remember seeing some relatively big regressions on some benchmarks, so we might want to be a bit more cautious there, as there might be potential to tweak the scheduling heuristics for ARM. Also, I think it would be good to have numbers for a large set of benchmarks (test-suite + commercial ones).

Ok, fair enough, my LNT numbers show that the MISched results are more variable:

Regressions (%):

| benchmark | Model | Model + MISched |
| Stanford/FloatMM | | 40.1 |
| McGill/queens | 29.6 | 29.4 |
| Stanford/Puzzle | 16.05 | 15.6 |
| FreeBench/mason/mason | | 9.27 |
| Olden/power/power | | 7.61 |
| VersaBench/ecbdes/ecbdes | | 5.25 |
| BenchmarkGame/fannkuch | | 6.55 |
| Fhourstones/fhourstones | | 5.75 |

Improvements (%):

| benchmark | Model | Model + MISched |
| Stanford/Perm | 21.7 | 22.53 |
| Olden/mst/mst | 22.05 | 22.21 |
| FreeBench/fourinarow/fourinarow | 22.83 | 21.21 |
| TSVC/CrossingThresholds-dbl/CrossingThresholds-dbl | 14.17 | 19.52 |
| TSVC/Searching-flt/Searching-flt | 16.26 | 19.23 |
| TSVC/Searching-dbl/Searching-dbl | 16.28 | 19.15 |

I will remove the use of the feature for now.

cheers,

This revision was automatically updated to reflect the committed changes.