For big-endian targets, when we merge two halfword loads into a word load, the order of the halfwords in the loaded value is reversed compared to little-endian, so the load-store optimiser needs to swap the destination registers.
This does not affect merging of two word loads, as we use ldp, which treats the memory as two separate 32-bit words.
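
To illustrate the halfword reordering described above, here is a minimal sketch in Python that models memory as bytes. It is not the optimiser's code. The function names `separate_halfwords` and `merged_halfwords` are hypothetical and exist only for this example, and the sketch assumes the merged value is split back into its low and high 16 bits.

```python
def separate_halfwords(mem: bytes, big_endian: bool):
    """Model two independent 16-bit loads at offsets 0 and 2."""
    order = "big" if big_endian else "little"
    first = int.from_bytes(mem[0:2], order)
    second = int.from_bytes(mem[2:4], order)
    return first, second


def merged_halfwords(mem: bytes, big_endian: bool):
    """Model one 32-bit load, then split it into (low 16 bits, high 16 bits)."""
    order = "big" if big_endian else "little"
    word = int.from_bytes(mem[0:4], order)
    return word & 0xFFFF, word >> 16


mem = bytes([0x11, 0x22, 0x33, 0x44])

# Little-endian: the low half of the merged word is the first halfword in memory.
le_match = separate_halfwords(mem, False) == merged_halfwords(mem, False)

# Big-endian: the low half of the merged word is the second halfword in memory,
# so the two results have to be swapped to match the separate loads.
lo, hi = merged_halfwords(mem, True)
be_match_after_swap = separate_halfwords(mem, True) == (hi, lo)
```

On the little-endian model the two approaches agree directly. On the big-endian model they agree only once the halves are swapped, which corresponds to the destination register swap mentioned above.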
I'm guessing the code makes Rt2MI and RtMI the *only* two possible options?