This is an archive of the discontinued LLVM Phabricator instance.

[X86][AVX] Fix wrong lowering of v4x64 shuffles into concat_vector plus extract_subvector nodes.
ClosedPublic

Authored by andreadb on Mar 11 2015, 9:59 AM.

Details

Summary

This patch fixes a bug in the shuffle lowering logic implemented by function 'lowerV2X128VectorShuffle'.

There are few cases where function 'lowerV2X128VectorShuffle' wrongly expands a shuffle of two v4X64 vectors into a CONCAT_VECTORS of two EXTRACT_SUBVECTOR nodes.
The problematic expansion only occurs when the shuffle mask M has an 'undef' element at position 2, and M is considered equivalent to mask <0,1,4,5>.
In that case only, the algorithm propagates the wrong vector to one of the two new EXTRACT_SUBVECTOR nodes.

Example:

define <4 x double> @test(<4 x double> %A, <4 x double> %B) {
entry:
  %0 = shufflevector <4 x double> %A, <4 x double> %B, <4 x i32> <i32 undef, i32 1, i32 undef, i32 5>
  ret <4 x double> %0
}

Before this patch, llc (-mattr=+avx) generated:

vinsertf128 $1, %xmm0, %ymm0, %ymm0

With this patch, llc correctly generates:

vinsertf128 $1, %xmm1, %ymm0, %ymm0

This bug was originally spotted by Greg Bedwell.

Added test lower-vec-shuffle-bug.ll.

Please let me know if ok to submit.

Thanks,
Andrea

Diff Detail

Repository
rL LLVM

Event Timeline

andreadb updated this revision to Diff 21737.Mar 11 2015, 9:59 AM
andreadb retitled this revision from to [X86][AVX] Fix wrong lowering of v4x64 shuffles into concat_vector plus extract_subvector nodes..
andreadb updated this object.
andreadb edited the test plan for this revision. (Show Details)
andreadb added reviewers: chandlerc, qcolombet, mkuper.
andreadb added a subscriber: Unknown Object (MLST).
qcolombet edited edge metadata.Mar 12 2015, 5:02 PM

Hi Andrea,

Nice catch.

A small comment though.

Cheers,
-Quentin

lib/Target/X86/X86ISelLowering.cpp
9025 ↗(On Diff #21737)

I think it would be clearer to do something like:
bool UseOnlyV1 = isShuffleEquivalent(V1, V2, Mask, {0, 1, 0, 1});
if (UseOnlyV1 || isShuffleEquivalent(V1, V2, Mask, {0, 1, 4, 5})) {
<...>
SDValue HiV = DAG.getNode(ISD::EXTRACT_SUBVECTOR, DL, SubVT,

UseOnlyV1 ? V1 : V2, DAG.getIntPtrConstant(0));

<...>
}

Hi Quentin,

lib/Target/X86/X86ISelLowering.cpp
9025 ↗(On Diff #21737)

You are right. It would be a better fix in general. Thanks for the suggestion!
I am going to upload a new version of the patch soon.

andreadb updated this revision to Diff 21923.Mar 13 2015, 6:28 AM
andreadb edited edge metadata.

Hi Quentin,

Patch updated.
Please let me know if ok to submit.

Thanks again for your time!
-Andrea

qcolombet accepted this revision.Mar 13 2015, 10:10 AM
qcolombet edited edge metadata.

LGTM.

Thanks Andrea!
-Quentin

This revision is now accepted and ready to land.Mar 13 2015, 10:10 AM
This revision was automatically updated to reflect the committed changes.

Thanks Quentin!
Committed revision 232179.