This patch adds support for BigBitWidth -> SmallBitWidth bitcasts, splitting the DemandedBits/Elts accordingly.
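As a rough illustration of the splitting (a minimal sketch, not the patch itself): assuming little-endian element order, widths of at most 64 bits, and demanded masks held in plain integers, each demanded narrow element maps to wide element `I / Scale` at bit offset `(I % Scale) * DstEltBits`. The names `SrcDemand` and `splitDemandedForBitcast` are hypothetical and used only for this example.

```cpp
#include <cstdint>

// Demanded bits/elements of the wide (source) vector, as plain bitmasks.
struct SrcDemand {
  uint64_t Bits; // demanded bits within each wide source element
  uint64_t Elts; // one bit per demanded wide source element
};

// Hypothetical helper: translate demand on the narrow bitcast result into
// demand on the wide source. Assumes little-endian layout, that SrcEltBits
// is a multiple of DstEltBits, SrcEltBits <= 64, and NumDstElts <= 64.
SrcDemand splitDemandedForBitcast(unsigned SrcEltBits, unsigned DstEltBits,
                                  unsigned NumDstElts, uint64_t DemandedBits,
                                  uint64_t DemandedElts) {
  unsigned Scale = SrcEltBits / DstEltBits; // narrow elements per wide element
  SrcDemand R{0, 0};
  for (unsigned I = 0; I != NumDstElts; ++I) {
    if (!((DemandedElts >> I) & 1))
      continue; // this narrow element is not demanded
    // Narrow element I occupies bits [Offset, Offset + DstEltBits) of
    // wide element I / Scale.
    unsigned Offset = (I % Scale) * DstEltBits;
    R.Bits |= DemandedBits << Offset;
    R.Elts |= uint64_t(1) << (I / Scale);
  }
  return R;
}
```

For example, for a v2i64 -> v4i32 bitcast where only the low 16 bits of result element 1 are demanded, this yields source bits `0x0000FFFF00000000` and source element mask `0b01`. Note the sketch keeps a single bit mask for all source elements, so it is conservative when different elements demand different bits.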
Re: the AMDGPU regression: @arsenm, it looks like not much generates BFE U32/S32 nodes in the DAG; it's mostly done in ISEL. In this case we would need to match srl(and(shl(x,c1),c2),c1). Is this something that needs fixing first, or are you OK with this change for now?
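For reference, a small sketch of why that pattern can be read as a bitfield extract, assuming 32-bit values and c1 < 32. The shift pair only clears the top c1 bits, so the whole expression folds to `x & (c2 >> c1)`; when that mask is a run of W low bits, the value equals an unsigned extract of W bits at offset 0. `shlAndSrl` and `bfeU32` are hypothetical helpers written for this example, not the AMDGPU nodes or any existing matcher.

```cpp
#include <cstdint>

// Literal form of the DAG pattern srl(and(shl(x, c1), c2), c1) on i32.
// Requires C1 < 32.
uint32_t shlAndSrl(uint32_t X, unsigned C1, uint32_t C2) {
  return ((X << C1) & C2) >> C1;
}

// Reference unsigned bitfield extract: Width bits of X starting at Offset,
// right-justified. Requires Offset < 32 and Width >= 1.
uint32_t bfeU32(uint32_t X, unsigned Offset, unsigned Width) {
  uint32_t Mask = Width >= 32 ? ~0u : ((1u << Width) - 1u);
  return (X >> Offset) & Mask;
}
```

With `x = 0xDEADBEEF`, `c1 = 16`, `c2 = 0x00FF0000`, both `shlAndSrl(x, 16, 0x00FF0000)` and `bfeU32(x, 0, 8)` give `0xEF`.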
The X86 changes are all definite wins.