Other than PC-relative loads/store the patterns that match the various load/store addressing modes have the same complexity, so the order that they are matched is the order that they appear in the .td file.
Rearrange the instruction definitions in ARMInstrThumb.td, and make use of AddedComplexity for PC-relative loads, so that the instruction matching order is the order that results in the simplest selection logic. This also makes register-offset load/store be selected when it should, as previously it was only selected for too-large immediate offsets.