Integer ADD and SUB instructions on Cortex-A57 have different
latencies and use different processor resources (pipelines) depending
on whether their shift amount is zero or non-zero. This patch uses a
SchedWriteVariant to capture both cases.
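The selection a scheduling variant performs can be sketched outside TableGen. The Python below is only an illustrative model, not the patch itself: the class and function names, the latency values, and the pipeline labels are all placeholders assumed for the example, and the real predicate and resource definitions live in the target's scheduling files.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class SchedWrite:
    """Hypothetical stand-in for a scheduling write resource."""
    name: str
    latency: int    # placeholder cycle count, not taken from the patch
    pipeline: str   # placeholder pipeline label


@dataclass(frozen=True)
class Instr:
    """Hypothetical stand-in for a shifted-register ADD/SUB."""
    opcode: str
    shift_amount: int


# Placeholder resources for the two cases described in the summary.
WRITE_I = SchedWrite("WriteI", latency=1, pipeline="I")
WRITE_ISREG = SchedWrite("WriteISReg", latency=2, pipeline="M")


def has_nonzero_shift(mi: Instr) -> bool:
    return mi.shift_amount != 0


def always(mi: Instr) -> bool:
    return True


# Ordered (predicate, write) pairs; the catch-all entry comes last.
Variant = List[Tuple[Callable[[Instr], bool], SchedWrite]]
SHIFTED_REG_VARIANT: Variant = [
    (has_nonzero_shift, WRITE_ISREG),
    (always, WRITE_I),
]


def resolve(mi: Instr, variant: Variant = SHIFTED_REG_VARIANT) -> SchedWrite:
    """Return the write whose predicate is the first to match."""
    for predicate, write in variant:
        if predicate(mi):
            return write
    raise ValueError("variant has no matching entry")
```

Under these assumptions, `resolve(Instr("ADDXrs", 0))` selects the plain integer write and `resolve(Instr("ADDXrs", 3))` selects the shifted-register write, so one instruction definition gets two scheduling descriptions.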
Details
Diff Detail
Event Timeline
I did not detect any performance gain in the tests I ran (an internal workload), but your modelling is more precise now. LGTM.