When given a 128-bit register, emitTestBit would always narrow it to 32 bits, which is incorrect. If the bit number is >= 32, we need the X-register form, TB(N)ZX. Narrowing to 32 bits in that case left us with the wrong register class, which caused a crash. (PR48379)
This generalizes narrowExtReg into narrowOrWidenScalarIfNeeded.
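The size decision behind this can be sketched in isolation. This is a minimal standalone model, not the code in the patch: planTestBit, TestBitPlan, and both enums are hypothetical names made up for illustration. It assumes only that the W form of TB(N)Z can test bits 0 through 31 and the X form is needed for bits 32 through 63.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for illustration only; these are not LLVM types.
enum class TestBitOpcode { TBZW, TBZX };
enum class Adjust { None, Narrow, Widen };

struct TestBitPlan {
  TestBitOpcode Opcode;      // which form of the test-bit branch to use
  unsigned TargetSizeInBits; // scalar size the tested register must have
  Adjust Action;             // how the incoming register must be adjusted
};

// Pick the opcode form from the bit number, then decide whether the
// incoming register must be narrowed or widened to match that form.
TestBitPlan planTestBit(unsigned RegSizeInBits, uint64_t Bit) {
  assert(Bit < 64 && "test-bit branches can only test bits 0..63");

  // Bits 0..31 fit the 32-bit (W) form; bits 32..63 need the 64-bit (X) form.
  const bool UseWReg = Bit < 32;
  const unsigned TargetSize = UseWReg ? 32 : 64;

  Adjust Action = Adjust::None;
  if (RegSizeInBits > TargetSize)
    Action = Adjust::Narrow; // e.g. a 128-bit source testing bit 40 -> 64 bits
  else if (RegSizeInBits < TargetSize)
    Action = Adjust::Widen;  // e.g. a 32-bit source testing bit 40 -> 64 bits

  return {UseWReg ? TestBitOpcode::TBZW : TestBitOpcode::TBZX, TargetSize,
          Action};
}
```

In this model the old behaviour corresponds to always choosing a 32-bit target, which is wrong for the (128, 40) case: the X form is required there, so the register has to be narrowed to 64 bits rather than 32.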
This also allows us to remove widenGPRBankRegIfNeeded entirely, since selectCopy correctly handles SUBREG_TO_REG etc.
This does cause some codegen changes, since selectCopy uses the "all" register class variants. However, I think these will likely be optimized away, and we can always improve the selectCopy code later.