This update makes three changes:
- Split the Load/Store resource into two, Ld0St and Ld1, since only one of the two is capable of stores.
- Integer ADD and SUB instructions have different latencies and processor resource (pipeline) usage depending on whether their shift amount is zero or non-zero; see D8043.
- Update the throughput of the scalar DIV instruction.