We already generate good code for scalar_to_vector + load on PWR8 and above: the pattern is selected to lfiwax/lfiwzx, so we don't need to go through memory to do the type change.
This patch adds a similar code-gen improvement for PWR7.
We still do not handle the i32->i64 sign-extension and zero-extension cases. The PowerPC backend fails to recognize build_vector t1, t1 as scalar_to_vector + vector_shuffle<0, 0>, so those cases are not hit by this patch. We will handle them in a later patch.
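For illustration only, the unhandled i32->i64 case might come from source shaped like the sketch below. It uses the GCC/Clang `vector_size` extension. The type and function names are hypothetical and not taken from the patch, and the exact DAG nodes formed depend on the compiler version and optimization level.

```cpp
// Hypothetical reproducer sketch; names are illustrative, not from the patch.
typedef long long v2i64 __attribute__((vector_size(16)));

// i32 loaded from memory, sign-extended to i64, then placed in both lanes.
// This is the kind of input that may become build_vector t1, t1.
v2i64 sext_splat(const int *p) {
    long long s = *p;          // sign-extension i32 -> i64
    v2i64 v = {s, s};
    return v;
}

// Same shape, but the load is zero-extended because the source is unsigned.
v2i64 zext_splat(const unsigned *p) {
    long long s = *p;          // zero-extension i32 -> i64
    v2i64 v = {s, s};
    return v;
}
```

Compiling such a file for PWR7 and inspecting the assembly would be one way to check whether the load still takes a round trip through memory.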
Is this expected to have further uses in the future? Defining a lambda with all the implementation in it and then simply calling it once seems like a very strange idiom.
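To make the concern concrete, here is a generic sketch of the two shapes I have in mind. The names and the widening logic are hypothetical stand-ins, not the code in the patch.

```cpp
#include <cstdint>

// Shape 1: all of the logic lives in a lambda that is called exactly once.
int64_t widenWithLambda(int32_t x, bool isSigned) {
    auto widen = [&]() -> int64_t {
        if (isSigned)
            return static_cast<int64_t>(x);                     // sign-extend
        return static_cast<int64_t>(static_cast<uint32_t>(x));  // zero-extend
    };
    return widen();
}

// Shape 2: the same logic as a file-local static helper.
static int64_t widenValue(int32_t x, bool isSigned) {
    if (isSigned)
        return static_cast<int64_t>(x);                         // sign-extend
    return static_cast<int64_t>(static_cast<uint32_t>(x));      // zero-extend
}

int64_t widenWithHelper(int32_t x, bool isSigned) {
    return widenValue(x, isSigned);
}
```

If no second call site is planned, the second shape (or simply inlining the body at the call site) reads more naturally to me.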