This patch proposes an alternate implementation for this conversion, derived from our v2i32->v2f32 handling. We can zero extend the v2i32 to v2i64 and OR it with the bit representation of 2.0^52. Because a double's mantissa is 52 bits and the mantissa bits of 2.0^52 are all zero, the OR places the 32-bit integer in the low mantissa bits and yields exactly 2.0^52 plus the integer. Then we just need to subtract 2.0^52 as a double and let the floating point unit normalize the remaining bits into a valid double.
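For reviewers who want to check the arithmetic, here is a scalar sketch of what one lane computes. It is an illustration only, not the lowering code in this patch: the function name `u32_to_f64_via_magic` is hypothetical, and it assumes the conversion being discussed is the unsigned one (which the zero extend implies) and that `double` is IEEE 754 binary64.

```cpp
#include <cstdint>
#include <cstring>

// Scalar model of a single lane (hypothetical helper, not from the patch).
// Assumes double is IEEE 754 binary64.
double u32_to_f64_via_magic(std::uint32_t x) {
    // Bit pattern of 2.0^52: biased exponent 0x433, all 52 mantissa bits zero.
    const std::uint64_t kTwoPow52Bits = 0x4330000000000000ULL;

    // Zero extend to 64 bits and OR into the (empty) low mantissa bits.
    // At exponent 52 one mantissa ulp equals 1.0, so this is 2.0^52 + x.
    std::uint64_t bits = kTwoPow52Bits | static_cast<std::uint64_t>(x);

    // Reinterpret the integer bits as a double without violating aliasing rules.
    double biased;
    std::memcpy(&biased, &bits, sizeof biased);

    // Subtract 2.0^52; the result is exact because x fits in 32 bits.
    return biased - 4503599627370496.0;
}
```

The subtraction is exact for every 32-bit input, so the result should match a plain `static_cast<double>(x)`.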
This is fewer instructions than our previous code, but it does require a port 5 shuffle for the zero extend or unpack.
Would AVX1/AVX2 benefit here?