We saw improvements in spec2000/mesa (5%) and spec2000/crafty (3%) on cortex-a53 when compiled with -Ofast -mcpu=cortex-a53. The patch enables the post-RA MI scheduler for both cortex-a53 and cortex-a57, but I haven't benchmarked cortex-a57. If compile time is a concern, we could enable it only for the in-order cores (e.g. cortex-a53).
Thanks,
Sanjin
I'll update the -march to use aarch64 before pushing.