This is similar to what we already do in DAGCombiner, but this version also handles bitcasts from types with different scalar sizes, which x86 is better at handling.
I've also included a concat(broadcast(x),broadcast(x)) -> broadcast(x) combine in the patch to demonstrate the main motivation; if accepted, that combine will be committed separately.
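
For reviewers who want to sanity-check the concat(broadcast(x),broadcast(x)) -> broadcast(x) identity, here is a minimal sketch of the lane-level semantics. It is a toy model only: `Lanes`, `broadcast` and `concat` are hypothetical helpers written for illustration, not LLVM or SelectionDAG APIs, and the sketch says nothing about how the patch itself is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy stand-in for a vector value: just a list of lanes. Not an LLVM type.
using Lanes = std::vector<int32_t>;

// broadcast(x): every lane holds the same scalar X.
Lanes broadcast(int32_t X, std::size_t NumLanes) {
  assert(NumLanes > 0 && "a vector needs at least one lane");
  return Lanes(NumLanes, X);
}

// concat(lo, hi): the lanes of Lo followed by the lanes of Hi.
Lanes concat(const Lanes &Lo, const Lanes &Hi) {
  Lanes Result(Lo);
  Result.insert(Result.end(), Hi.begin(), Hi.end());
  return Result;
}
```

In this model, `concat(broadcast(7, 4), broadcast(7, 4))` compares equal to `broadcast(7, 8)`, so the wider broadcast can stand in for the concat. The identity only holds when both halves broadcast the same scalar; `concat(broadcast(7, 4), broadcast(9, 4))` is not a broadcast.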