This is an archive of the discontinued LLVM Phabricator instance.

[AVX-512] Add support for turning a 256-bit load that goes to both halves of an insert_subvector into a subvector broadcast.
ClosedPublic

Authored by craig.topper on Oct 15 2016, 10:17 PM.

Details

Summary

This builds on and generalizes the existing support, which handles 128-bit loads inserted into both halves of 256-bit vectors.

New patterns are added to support 8-bit and 16-bit elements and v8f32->v16f32 without DQI instructions, along with a fallback for when the load can't be folded.
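To illustrate the equivalence the patch relies on, here is a minimal scalar model in plain C++. It is only a sketch: the type aliases and the function names (load256, insertSubvector, loadIntoBothHalves, subvBroadcast) are hypothetical and are not LLVM APIs or the patch's actual code. It shows that inserting one 256-bit load into both halves of a 512-bit vector yields the same value as a subvector broadcast of that memory.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical model types: 256-bit and 512-bit vectors of float.
using V8F32 = std::array<float, 8>;
using V16F32 = std::array<float, 16>;

// Models a 256-bit load: reads 8 consecutive floats from memory.
V8F32 load256(const float *p) {
  V8F32 v{};
  for (std::size_t i = 0; i < v.size(); ++i)
    v[i] = p[i];
  return v;
}

// Models insert_subvector: copies `sub` into `dst` starting at element `idx`.
V16F32 insertSubvector(V16F32 dst, const V8F32 &sub, std::size_t idx) {
  assert(idx + sub.size() <= dst.size() && "subvector must fit");
  for (std::size_t i = 0; i < sub.size(); ++i)
    dst[idx + i] = sub[i];
  return dst;
}

// The shape being matched: one 256-bit load feeding both halves.
V16F32 loadIntoBothHalves(const float *p) {
  V8F32 half = load256(p);
  V16F32 result{};
  result = insertSubvector(result, half, 0); // low 256 bits
  result = insertSubvector(result, half, 8); // high 256 bits
  return result;
}

// Models the replacement: a subvector broadcast straight from memory.
V16F32 subvBroadcast(const float *p) {
  V16F32 result{};
  for (std::size_t i = 0; i < result.size(); ++i)
    result[i] = p[i % 8];
  return result;
}
```

For any 8 floats in memory, loadIntoBothHalves(p) and subvBroadcast(p) compare equal element by element, which is why the two-insert form can be replaced with a single broadcast.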

Diff Detail

Repository
rL LLVM

Event Timeline

craig.topper retitled this revision to [AVX-512] Add support for turning a 256-bit load that goes to both halves of an insert_subvector into a subvector broadcast.
craig.topper updated this object.
craig.topper added reviewers: RKSimon, delena, igorb.
craig.topper added a subscriber: llvm-commits.
igorb accepted this revision. Oct 16 2016, 1:05 AM
igorb edited edge metadata.

LGTM

This revision is now accepted and ready to land. Oct 16 2016, 1:05 AM
RKSimon edited edge metadata. Oct 16 2016, 4:41 AM

Thanks for looking at this. As a future patch, would it make sense to move this code into EltsFromConsecutiveLoads?

I've also wondered whether we should make X86ISD::SUBV_BROADCAST a memory intrinsic. I realise that AVX512 has at least partial reg-reg instruction support that would require handling by another approach.

lib/Target/X86/X86ISelLowering.cpp
12998 ↗(On Diff #74782)

Update the comment?

craig.topper edited edge metadata.

Updated comment at Simon's request.

This revision was automatically updated to reflect the committed changes.