[X86] Directly form VBROADCAST_LOAD in lowerShuffleAsBroadcast on AVX targets.
If we would emit a VBROADCAST node, we can instead directly emit
a VBROADCAST_LOAD. This allows us to get rid of the special case
to use an f64 load on 32-bit targets for vXi64.
I believe there is more cleanup we can do later in this function,
but I'll do that in follow ups.