Documentation for TargetLowering::getShiftAmountTy says that LegalTypes
should generally be true during type legalization, so this patch passes
true there.
On AMDGPU the effect is that we use i32 (a sane type) instead of i64
(a pointer-sized type) for more shift amounts, which in turn allows more
formation of rotates and funnel shifts pre-legalization.
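
For readers unfamiliar with the patterns involved, here is a rough sketch
(not taken from this patch) of the source-level shift idioms that correspond
to rotates and funnel shifts (ISD::ROTL and ISD::FSHL in SelectionDAG). The
helper names rotl32 and fshl32 are hypothetical and exist only for
illustration; the shift amount is masked to 0..31 so every shift is well
defined.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: rotate left written as a pair of opposing shifts.
// Both shift amounts are masked to [0, 31], so n == 0 is handled safely.
constexpr uint32_t rotl32(uint32_t x, uint32_t n) {
  return (x << (n & 31)) | (x >> ((32 - (n & 31)) & 31));
}

// Hypothetical helper: funnel shift left. Concatenate hi:lo into 64 bits,
// shift left by n (mod 32), and keep the high 32 bits of the result.
constexpr uint32_t fshl32(uint32_t hi, uint32_t lo, uint32_t n) {
  return static_cast<uint32_t>(
      (((static_cast<uint64_t>(hi) << 32) | lo) << (n & 31)) >> 32);
}

// A rotate is a funnel shift whose two inputs are the same value.
static_assert(rotl32(0x12345678u, 8) == fshl32(0x12345678u, 0x12345678u, 8),
              "rotl(x, n) == fshl(x, x, n)");
```

Note that the shift amount here is a 32-bit value, matching the i32 shift
amount type discussed above.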
Several of the resulting regressions seem to be related to worse use of
v_perm.