This is a follow up patch from https://reviews.llvm.org/D57857 to handle extract_subvector v4f32.
For cases where we fpext of v2f32 to v2f64 from extract_subvector we currently generate on P9 the following:
lxv 0, 0(3) xxsldwi 1, 0, 0, 1 xscvspdpn 2, 0 xxsldwi 3, 0, 0, 3 xxswapd 0, 0 xscvspdpn 1, 1 xscvspdpn 3, 3 xscvspdpn 0, 0 xxmrghd 0, 0, 3 xxmrghd 1, 2, 1 stxv 0, 0(4) stxv 1, 0(5)
This patch custom lower it to the following sequence:
lxv 0, 0(3) # load the v4f32 <w0, w1, w2, w3> xxmrghw 2, 0, 0 # Produce the following vector <w0, w0, w1, w1> xxmrglw 3, 0, 0 # Produce the following vector <w2, w2, w3, w3> xvcvspdp 2, 2 # FP-extend to <d0, d1> xvcvspdp 3, 3 # FP-extend to <d2, d3> stxv 2, 0(5) # Store <d0, d1> (%vecinit11) stxv 3, 0(4) # Store <d2, d3> (%vecinit4)