Sister patch to D10555.
As discussed on D10555, this patch replaces the avx2.vbroadcast (float) and avx2.pbroadcast (integer) intrinsics used in avx2intrin.h with generic __builtin_shufflevector calls.
At present, all of these changes still produce the expected vbroadcast/vpbroadcast instructions in debug code. I can add a test to ensure this if people wish, but the conclusion on the discussion thread was that this shouldn't be considered vital.