This combine previously handled only left shifts; it now handles right shifts as well. Right shifts are handled conservatively: they are only truncated to the size returned by TLI.
For instance, AMDGPU benefits from always lowering shifts to 32 bits, whereas AArch64 would rather keep them at 64 bits.
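To illustrate why right shifts need the more conservative treatment, here is a minimal standalone sketch of the underlying arithmetic. It is not the combine's code. It assumes the shift result is subsequently truncated, and the function names, the 16-bit destination, and the 32-bit intermediate width are all hypothetical choices made for the example.

```cpp
#include <cassert>
#include <cstdint>

// Left shift: the low 32 bits of (x << k) depend only on the low 32 bits
// of x, so the shift can be done at 32 bits for any k in [0, 32).
uint32_t truncOfShl64(uint64_t x, unsigned k) {
  return static_cast<uint32_t>(x << k);
}
uint32_t shl32OfTrunc(uint64_t x, unsigned k) {
  return static_cast<uint32_t>(x) << k;
}

// Logical right shift: high bits of x move down into the result, so a
// narrower shift only matches when none of the discarded high bits can
// reach the truncated destination.
uint16_t truncOfLshr64(uint64_t x, unsigned k) {
  return static_cast<uint16_t>(x >> k);
}
uint16_t lshr32ThenTrunc(uint64_t x, unsigned k) {
  return static_cast<uint16_t>(static_cast<uint32_t>(x) >> k);
}

// Hypothetical safety check: with an intermediate width of midBits and a
// destination width of dstBits, the narrowed shift is equivalent as long
// as the largest possible shift amount is at most midBits - dstBits.
bool rightShiftNarrowingIsSafe(unsigned maxShiftAmt, unsigned midBits,
                               unsigned dstBits) {
  return maxShiftAmt <= midBits - dstBits;
}
```

With these assumed widths, a shift amount of 16 is still safe, while 17 would pull bit 32 of the source into bit 15 of the result, so the 32-bit shift would produce a different value.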
Interesting - this seems like an abuse of getPreferredShiftAmountTy, but I guess it works out OK in practice for the one case we care about (converting 64-bit shifts to 32-bit shifts).