This patch fixes shifts by a 128/256 bit shift amount. It also fixes
codegen for shifts of 32 by delegating to LLVM's default optimisation
instead of emitting a long shift.
Tests that used to generate long shifts of 32 are updated to check for the
more optimised codegen.
You can format these long lines