This is an archive of the discontinued LLVM Phabricator instance.

[X86][AVX] Added support for lowering to VBROADCASTF128/VBROADCASTI128
ClosedPublic

Authored by RKSimon on Jul 18 2016, 6:17 AM.

Details

Summary

As reported in PR26235, we don't currently use the VBROADCASTF128/VBROADCASTI128 instructions to load and splat a 128-bit vector to both lanes of a 256-bit vector.

This patch enables lowering from subvector insertion/concatenation patterns and auto-upgrades the llvm.x86.avx.vbroadcastf128.pd.256 / llvm.x86.avx.vbroadcastf128.ps.256 intrinsics to match.
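As a rough illustration of the pattern being matched, the sketch below models the operation in plain scalar C: one 128-bit load whose result is concatenated with itself to form a 256-bit value. The type `vec256f` and the function `broadcast_128_model` are hypothetical names invented for this sketch. They are not part of LLVM or of the patch, and the code only mirrors the data movement, not the actual lowering code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical scalar model (not LLVM code): a 256-bit vector of 8 floats. */
typedef struct {
    float lane[8];
} vec256f;

/*
 * Models "load 128 bits, then concatenate the loaded value with itself".
 * This is the load+splat shape that the summary says should become a
 * single VBROADCASTF128 instead of a load plus a subvector insertion.
 */
static vec256f broadcast_128_model(const float *src)
{
    vec256f out;
    assert(src != NULL);
    memcpy(&out.lane[0], src, 4 * sizeof(float)); /* low 128-bit lane  */
    memcpy(&out.lane[4], src, 4 * sizeof(float)); /* high 128-bit lane */
    return out;
}
```

Calling `broadcast_128_model` on the four floats 1, 2, 3, 4 yields the eight elements 1, 2, 3, 4, 1, 2, 3, 4, which is the same element layout that _mm256_broadcast_ps produces from a 128-bit memory operand.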

Once this is in place I can update _mm256_broadcast_ps and _mm256_broadcast_pd in the headers to use generic IR and remove the clang builtins.

We could possibly investigate using VBROADCASTF128/VBROADCASTI128 to load repeated constants as well (similar to how we already do for scalar broadcasts).
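To make the repeated-constant idea concrete, here is a minimal sketch of the check such an investigation would presumably start from: whether the two 128-bit halves of a 256-bit constant are identical, so that only one half needs to be stored and then broadcast. The helper `is_128bit_splat` is a hypothetical name for this sketch. It operates on raw bytes and does not reflect how LLVM represents or inspects constants.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not an LLVM API): returns true when the 32-byte
 * constant consists of the same 16-byte pattern repeated twice, i.e. when
 * a 128-bit broadcast load of the first half would reproduce the whole
 * 256-bit value.
 */
static bool is_128bit_splat(const unsigned char *bytes)
{
    assert(bytes != NULL);
    return memcmp(bytes, bytes + 16, 16) == 0;
}
```

A constant such as <1, 2, 3, 4, 1, 2, 3, 4> passes this check, while <1, 2, 3, 4, 5, 6, 7, 8> does not and would still need a full 256-bit load.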

Diff Detail

Repository
rL LLVM

Event Timeline

RKSimon updated this revision to Diff 64310.Jul 18 2016, 6:17 AM
RKSimon retitled this revision from to [X86][AVX] Added support for lowering to VBROADCASTF128/VBROADCASTI128.
RKSimon updated this object.
RKSimon set the repository for this revision to rL LLVM.
RKSimon added a reviewer: ab.
RKSimon added a subscriber: llvm-commits.
delena added inline comments.Jul 19 2016, 12:19 AM
test/CodeGen/X86/vector-shuffle-256-v4.ll
1365 ↗(On Diff #64310)

The instruction selected here is from the AVX2 set. The same patterns should be added for AVX-512.

RKSimon updated this revision to Diff 64519.Jul 19 2016, 10:25 AM

Added AVX512 support

delena accepted this revision.Jul 20 2016, 10:56 AM
delena edited edge metadata.
This revision is now accepted and ready to land.Jul 20 2016, 10:56 AM
This revision was automatically updated to reflect the committed changes.
krasin added a subscriber: krasin.Jul 21 2016, 5:15 PM

For the record, this CL is identified as a possible cause of https://llvm.org/bugs/show_bug.cgi?id=28657