This is an archive of the discontinued LLVM Phabricator instance.

[AVX-512] Teach isel lowering that a subvector broadcast being inserted into both halves of a 512-bit vector can be combined into a larger subvector broadcast.
ClosedPublic

Authored by craig.topper on Oct 15 2016, 10:32 PM.

Details

Summary

This allows us to create broadcasts of 128-bit vector loads into 512-bit vectors.
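As a rough illustration of the equivalence this combine relies on, the sketch below models vectors as plain arrays of 32-bit elements. It is not code from the patch: the type aliases and the helpers `broadcast128` and `insert_into_both_halves` are hypothetical names chosen for this example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical array models of 128-, 256- and 512-bit vectors of 32-bit elements.
using V4  = std::array<std::uint32_t, 4>;
using V8  = std::array<std::uint32_t, 8>;
using V16 = std::array<std::uint32_t, 16>;

// Subvector broadcast: repeat a 128-bit value across an N-element vector.
template <std::size_t N>
std::array<std::uint32_t, N> broadcast128(const V4 &src) {
  std::array<std::uint32_t, N> out{};
  for (std::size_t i = 0; i < N; ++i)
    out[i] = src[i % 4];
  return out;
}

// Place the same 256-bit value in both halves of a 512-bit vector.
V16 insert_into_both_halves(const V8 &half) {
  V16 out{};
  for (std::size_t i = 0; i < 16; ++i)
    out[i] = half[i % 8];
  return out;
}
```

In this model, inserting a 128-to-256-bit broadcast into both halves yields the same value as a single 128-to-512-bit broadcast, which is why the two-step form can be replaced by the larger broadcast.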

New patterns added to support 8-bit and 16-bit vector types and v2f64/v2i64->v8f64/v8i64 without DQI instructions.

There are also fallback patterns for when the load can't be folded. These patterns are a little complex: we first insert the lower 128 bits into the second 128 bits using a zmm subvector insert instruction. We need to use a zmm insert in case VLX isn't available. We then use another zmm subvector insert to take those 256 bits and insert them into the upper 256 bits. Since we used a zmm insert to create the 256 bits, we also need an extract_subreg to get just the lower 256 bits to pass to the second insert.
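The fallback sequence can be sketched with the same kind of array model. This is only an approximation of the data flow described above, not the TableGen patterns from the patch: all function names are hypothetical, the undefined upper bits of the source register are modeled as zero, and using the first insert's result as the destination of the second insert is an assumption of this sketch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical array models of 128-, 256- and 512-bit vectors of 32-bit elements.
using V4  = std::array<std::uint32_t, 4>;
using V8  = std::array<std::uint32_t, 8>;
using V16 = std::array<std::uint32_t, 16>;

// Model of a zmm 128-bit subvector insert: overwrite 128-bit lane `lane` (0..3).
V16 insert128(V16 dst, const V4 &sub, std::size_t lane) {
  for (std::size_t i = 0; i < 4; ++i)
    dst[lane * 4 + i] = sub[i];
  return dst;
}

// Model of extract_subreg: keep only the lower 256 bits of a 512-bit value.
V8 extract_low256(const V16 &src) {
  V8 out{};
  for (std::size_t i = 0; i < 8; ++i)
    out[i] = src[i];
  return out;
}

// Model of a zmm 256-bit subvector insert: overwrite half `half` (0 or 1).
V16 insert256(V16 dst, const V8 &sub, std::size_t half) {
  for (std::size_t i = 0; i < 8; ++i)
    dst[half * 8 + i] = sub[i];
  return dst;
}

// The fallback sequence: two zmm inserts with an extract_subreg in between.
V16 fallback_broadcast(const V4 &x) {
  // x widened to 512 bits; the upper bits are don't-care (zero in this model).
  V16 wide = insert128(V16{}, x, 0);
  // Step 1: insert the lower 128 bits into the second 128-bit lane.
  V16 step1 = insert128(wide, x, 1);
  // Step 2: take just the lower 256 bits of the zmm result.
  V8 low256 = extract_low256(step1);
  // Step 3: insert those 256 bits into the upper half.
  return insert256(step1, low256, 1);
}
```

After the three steps every 128-bit lane of the result holds the original value, matching what a folded 128-to-512-bit broadcast load would produce.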

The outer insert in the fallback patterns should use the correct element type, because eventually we should also support masked operations here. So we need both a DQI and a NoDQI version of the v16f32/v16i32 patterns.
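To see why the element type of the outer insert matters once masking is involved, the sketch below models a merge-masked 256-bit insert at two mask granularities. In the AVX-512 ISA the 32x8 insert forms belong to AVX512DQ, while the 64x4 forms are part of AVX512F. The function names and the explicit pass-through operand are assumptions of this model, not code from the patch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical array models of 256- and 512-bit vectors of 32-bit elements.
using V8  = std::array<std::uint32_t, 8>;
using V16 = std::array<std::uint32_t, 16>;

// Unmasked 256-bit insert into half `half` (0 or 1) of a 512-bit vector.
V16 insert256(V16 dst, const V8 &sub, std::size_t half) {
  for (std::size_t i = 0; i < 8; ++i)
    dst[half * 8 + i] = sub[i];
  return dst;
}

// 32x8-style masking: one mask bit per 32-bit element (16 bits total).
V16 masked_insert256_by32(const V16 &dst, const V8 &sub, std::size_t half,
                          std::uint16_t mask, const V16 &passthru) {
  V16 tmp = insert256(dst, sub, half);
  V16 out{};
  for (std::size_t i = 0; i < 16; ++i)
    out[i] = ((mask >> i) & 1u) ? tmp[i] : passthru[i];
  return out;
}

// 64x4-style masking: one mask bit per 64-bit element, i.e. per pair of
// 32-bit elements in this model (8 bits total).
V16 masked_insert256_by64(const V16 &dst, const V8 &sub, std::size_t half,
                          std::uint8_t mask, const V16 &passthru) {
  V16 tmp = insert256(dst, sub, half);
  V16 out{};
  for (std::size_t i = 0; i < 16; ++i)
    out[i] = ((mask >> (i / 2)) & 1u) ? tmp[i] : passthru[i];
  return out;
}
```

With all mask bits set the two forms produce identical bits, so either works for the unmasked patterns. A mask that selects a single 32-bit element, however, cannot be expressed at 64-bit granularity, which is the situation the type-correct DQI pattern anticipates.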

Diff Detail

Repository
rL LLVM

Event Timeline

craig.topper retitled this revision to [AVX-512] Teach isel lowering that a subvector broadcast being inserted into both halves of a 512-bit vector can be combined into a larger subvector broadcast.
craig.topper updated this object.
craig.topper added reviewers: RKSimon, delena, igorb.
craig.topper added a subscriber: llvm-commits.
igorb accepted this revision. Oct 18 2016, 12:53 AM
igorb edited edge metadata.

LGTM,
Thanks for looking at this!

This revision is now accepted and ready to land. Oct 18 2016, 12:53 AM
This revision was automatically updated to reflect the committed changes.