When the shift amount is a multiple of 8 bits (a whole number of bytes):
bswap (shl X, C) --> lshr (bswap X), C
bswap (lshr X, C) --> shl (bswap X), C
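
For intuition, the two folds can be checked with a small executable model. This is only a sketch in Python, not the InstCombine code from the patch: it assumes a 32-bit type, and the helper names (bswap32, shl32, lshr32, fold_holds) are made up for illustration.

```python
MASK32 = 0xFFFFFFFF  # assumption: model an i32; other widths work the same way


def bswap32(x: int) -> int:
    """Reverse the byte order of a 32-bit value (models llvm.bswap.i32)."""
    return int.from_bytes((x & MASK32).to_bytes(4, "little"), "big")


def shl32(x: int, c: int) -> int:
    """Left shift, truncated to 32 bits."""
    return (x << c) & MASK32


def lshr32(x: int, c: int) -> int:
    """Logical (zero-filling) right shift of a 32-bit value."""
    return (x & MASK32) >> c


def fold_holds(x: int, c: int) -> bool:
    """True if both rewrites preserve the value for this x and shift amount c."""
    return (
        bswap32(shl32(x, c)) == lshr32(bswap32(x), c)
        and bswap32(lshr32(x, c)) == shl32(bswap32(x), c)
    )


# Byte-multiple shift amounts: both folds hold for these sample values.
samples = [0, 1, 0x11223344, 0x80000001, 0xDEADBEEF, MASK32]
assert all(fold_holds(x, c) for x in samples for c in (0, 8, 16, 24))

# A shift that is not a whole number of bytes breaks the equivalence.
assert not fold_holds(0x11223344, 4)
```

A byte-multiple shift moves whole bytes, so reversing the bytes afterwards is the same as reversing them first and shifting in the opposite direction. With C = 4 the shift splits bytes apart, which is why the fold is restricted to byte multiples.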
This is an IR implementation of a transform suggested in D120648. The "swaps cancel" test models the motivating optimization from that proposal.
Alive2 proofs are linked below. As noted in the other review, we could use known bits to handle shifts by a variable amount, but that can be a follow-up enhancement patch:
https://alive2.llvm.org/ce/z/pXUaRf
https://alive2.llvm.org/ce/z/ZnaMLf
Worth using KnownBits to allow non-uniform shift amounts?