The old CPU model only had MLA->MLA forwarding. I added some missing
forwarding according to the Cortex A57 Software Optimization Guide:
accumulate op to accumulate op, and non-accumulate op (e.g. MUL, shift)
to relevant accumulate op (e.g. MLA, SRA). This includes missing
MUL->MLA read advances and a missing absolute diff accumulator read
advance.
Also added missing schedules for ASIMD shift by immed, basic, and
ASIMD shift by register, basic, D-form.
The patch improves performance in some internal benchmarks and in the
EEMBC benchmark, and causes no significant regressions (none in SPEC).