This patch fixes a bug in the shuffle lowering logic implemented by function 'lowerV2X128VectorShuffle'.
There are a few cases where function 'lowerV2X128VectorShuffle' wrongly expands a shuffle of two v4X64 vectors into a CONCAT_VECTORS of two EXTRACT_SUBVECTOR nodes.
The problematic expansion only occurs when the shuffle mask M has an 'undef' element at position 2, and M is considered equivalent to mask <0,1,4,5>.
In that case only, the algorithm propagates the wrong vector to one of the two new EXTRACT_SUBVECTOR nodes.
Example:
  define <4 x double> @test(<4 x double> %A, <4 x double> %B) {
  entry:
    %0 = shufflevector <4 x double> %A, <4 x double> %B, <4 x i32> <i32 undef, i32 1, i32 undef, i32 5>
    ret <4 x double> %0
  }
Before this patch, llc (-mattr=+avx) generated:
vinsertf128 $1, %xmm0, %ymm0, %ymm0
With this patch, llc correctly generates:
vinsertf128 $1, %xmm1, %ymm0, %ymm0
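To make the difference between the two outputs concrete, here is a small standalone model (not LLVM code). It assumes %A arrives in ymm0 and %B in ymm1, so that xmm0 and xmm1 are their low 128-bit halves; the helper names insertHigh128 and matchesShuffle are hypothetical and exist only for this illustration.

```cpp
#include <array>
#include <cassert>

using Vec4 = std::array<double, 4>;

// Model of `vinsertf128 $1, Src, Base, Dst`: keep the low 128 bits of Base
// and overwrite the high 128 bits with the low 128 bits of Src (two doubles).
Vec4 insertHigh128(const Vec4 &Base, const Vec4 &Src) {
  return {Base[0], Base[1], Src[0], Src[1]};
}

// Reference semantics of the shufflevector: indices 0..3 select from A,
// 4..7 select from B, and a negative index stands for undef (any value is
// acceptable in that lane). Returns true if Result agrees with the mask on
// every defined lane.
bool matchesShuffle(const Vec4 &Result, const Vec4 &A, const Vec4 &B,
                    const std::array<int, 4> &Mask) {
  for (int i = 0; i < 4; ++i) {
    if (Mask[i] < 0)
      continue; // undef lane: nothing to check
    double Expected = Mask[i] < 4 ? A[Mask[i]] : B[Mask[i] - 4];
    if (Result[i] != Expected)
      return false;
  }
  return true;
}
```

With the mask <undef,1,undef,5>, inserting the low half of A (the old output) puts A[1] in lane 3 where B[1] is required, while inserting the low half of B (the new output) satisfies both defined lanes.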
This bug was originally spotted by Greg Bedwell.
Added test lower-vec-shuffle-bug.ll.
Please let me know if ok to submit.
Thanks,
Andrea
I think it would be clearer to do something like:
bool UseOnlyV1 = isShuffleEquivalent(V1, V2, Mask, {0, 1, 0, 1});
if (UseOnlyV1 || isShuffleEquivalent(V1, V2, Mask, {0, 1, 4, 5})) {
<...>
SDValue HiV = DAG.getNode(ISD::EXTRACT_SUBVECTOR, DL, SubVT,
<...>
}
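As a rough standalone sketch of the idea in the snippet above (again, not the actual LLVM code): the source of the high half is decided by which full pattern the mask matches, rather than by looking at a single mask element that may be undef. The names Undef, isEquivalent and highHalfSource are hypothetical, and isEquivalent is only a simplified stand-in for isShuffleEquivalent that treats undef as matching anything.

```cpp
#include <array>
#include <cassert>

// Hypothetical stand-in for an undef mask element.
constexpr int Undef = -1;
using Mask4 = std::array<int, 4>;

// Simplified equivalence check: every defined element of Mask must equal the
// corresponding element of Expected; undef elements match anything.
bool isEquivalent(const Mask4 &Mask, const Mask4 &Expected) {
  for (int i = 0; i < 4; ++i)
    if (Mask[i] != Undef && Mask[i] != Expected[i])
      return false;
  return true;
}

// Returns 1 if the high 128-bit half may be taken from V1, 2 if it must be
// taken from V2, and 0 if the mask matches neither pattern.
int highHalfSource(const Mask4 &Mask) {
  bool UseOnlyV1 = isEquivalent(Mask, {0, 1, 0, 1});
  if (UseOnlyV1)
    return 1;
  if (isEquivalent(Mask, {0, 1, 4, 5}))
    return 2;
  return 0;
}
```

For the mask from the test case, <undef,1,undef,5>, the first pattern is rejected because element 3 is 5, so the high half comes from V2 even though element 2 is undef.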