D117406 notes a possible codegen regression for PowerPC: we don't recognize a pattern that demands only a single byte from a bswap.
I checked whether we have that fold in IR. We do, and it has existed there since close to the beginning of LLVM:
https://github.com/llvm/llvm-project/blame/main/llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp#L794
...so this patch copies that code as much as possible and adapts it for SDAG.
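
For readers unfamiliar with the fold, here is a minimal standalone sketch of the idea on plain 32-bit integers. It is not the code from the patch or from InstCombine: `bswap32`, `countLeadingZeros32`, `countTrailingZeros32`, and `simplifyDemandedBswap` are hypothetical names invented for illustration, and the sketch assumes a fixed 32-bit width. If the demanded bits all fall within one byte of the bswap result, a single shift of the input puts the right source byte into that position, so the bswap is unnecessary.

```cpp
#include <cassert>
#include <cstdint>

// Reference 32-bit byte swap (illustrative stand-in for a bswap node).
inline uint32_t bswap32(uint32_t X) {
  return (X >> 24) | ((X >> 8) & 0x0000FF00u) |
         ((X << 8) & 0x00FF0000u) | (X << 24);
}

// Number of zero bits above the highest set bit (32 if M == 0).
inline unsigned countLeadingZeros32(uint32_t M) {
  unsigned N = 0;
  for (uint32_t Bit = 0x80000000u; Bit != 0 && (M & Bit) == 0; Bit >>= 1)
    ++N;
  return N;
}

// Number of zero bits below the lowest set bit (32 if M == 0).
inline unsigned countTrailingZeros32(uint32_t M) {
  unsigned N = 0;
  for (uint32_t Bit = 1u; Bit != 0 && (M & Bit) == 0; Bit <<= 1)
    ++N;
  return N;
}

// Hypothetical helper: if DemandedMask covers bits from exactly one byte of
// bswap32(X), store in Result a shifted X that agrees with bswap32(X) on all
// demanded bits and return true. Otherwise return false.
inline bool simplifyDemandedBswap(uint32_t X, uint32_t DemandedMask,
                                  uint32_t &Result) {
  if (DemandedMask == 0)
    return false;
  // Round both counts down to a byte boundary.
  unsigned NLZ = countLeadingZeros32(DemandedMask) & ~7u;
  unsigned NTZ = countTrailingZeros32(DemandedMask) & ~7u;
  // The fold only applies when exactly one byte is demanded.
  if (32 - NLZ - NTZ != 8)
    return false;
  // Shift the source byte into the demanded byte position.
  Result = NLZ > NTZ ? X >> (NLZ - NTZ) : X << (NTZ - NLZ);
  return true;
}
```

For example, with `X = 0x11223344` and a demanded mask of `0xFF`, the helper produces `X >> 24`, whose low byte matches the low byte of `bswap32(X)`. A mask spanning two bytes, such as `0xFFFF`, is rejected.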
The test for PowerPC that would change in D117406 is over-reduced with undefs, so I recreated it for AArch64 and x86 by passing in pointer args and renamed the values to make the logic clearer.
Why can't we always use getShiftAmountTy? A few months ago I fixed the problem where it only worked for legal types.