This is an archive of the discontinued LLVM Phabricator instance.

[X86][SSE] Add INSERTPS target shuffle combines.
ClosedPublic

Authored by RKSimon on Jan 11 2016, 10:00 AM.

Details

Summary

As vector shuffles can only reference two inputs, many (V)INSERTPS patterns end up being split over two target shuffles.

This patch adds combines that attempt to fold (V)INSERTPS nodes with input/output nodes that merely zero out these additional vector elements.
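
For readers less familiar with the instruction, here is a minimal scalar sketch of the register form of INSERTPS and of why a neighbouring "zero these lanes" shuffle can be absorbed into its immediate. It is an illustration only, not the code from this patch: the names `Vec4`, `insertps`, and `foldZeroedLanes` are hypothetical, and the model assumes the documented immediate layout (bits 7:6 select the source element, bits 5:4 the destination lane, bits 3:0 the zero mask).

```cpp
#include <array>
#include <cstdint>

using Vec4 = std::array<float, 4>;

// Scalar model of register-form INSERTPS (hypothetical helper, not LLVM code).
//   imm[7:6] : index of the element taken from src
//   imm[5:4] : lane of dst that receives that element
//   imm[3:0] : lanes of the result forced to zero (applied after the insert)
Vec4 insertps(Vec4 dst, const Vec4 &src, std::uint8_t imm) {
  unsigned srcIdx = (imm >> 6) & 0x3;
  unsigned dstIdx = (imm >> 4) & 0x3;
  unsigned zeroMask = imm & 0xF;
  dst[dstIdx] = src[srcIdx];
  for (unsigned i = 0; i != 4; ++i)
    if (zeroMask & (1u << i))
      dst[i] = 0.0f;
  return dst;
}

// Hypothetical helper: merge lanes that a following shuffle would zero
// into the INSERTPS zero mask, so the separate zeroing node is not needed.
std::uint8_t foldZeroedLanes(std::uint8_t imm, unsigned extraZeroLanes) {
  return static_cast<std::uint8_t>(imm | (extraZeroLanes & 0xF));
}
```

In this model, inserting with immediate `0x90` and then zeroing lanes 0 and 3 gives the same result as a single insert with `foldZeroedLanes(0x90, 0x9)`, which is the kind of equivalence the combine relies on.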

Diff Detail

Repository
rL LLVM

Event Timeline

RKSimon updated this revision to Diff 44524. Jan 11 2016, 10:00 AM
RKSimon retitled this revision to [X86][SSE] Add INSERTPS target shuffle combines.
RKSimon updated this object.
RKSimon added reviewers: spatel, andreadb, chandlerc.
RKSimon set the repository for this revision to rL LLVM.
RKSimon added a subscriber: llvm-commits.
spatel edited edge metadata. Jan 18 2016, 8:19 AM

I applied the patch to r258047, and I get a 'make check' failure on:
FAIL: LLVM :: CodeGen/X86/merge-consecutive-loads-128.ll (6972 of 15632)

lib/Target/X86/X86ISelLowering.cpp
6938 ↗(On Diff #44524)

Why do we need this intermediate variable? I.e., couldn't we just set the appropriate elements of the mask directly, rather than filling this bitvector, which then gets copied to the mask?

23605 ↗(On Diff #44524)

"resolveTarget..."

or, if the previous inline comment applies, combine the two helper functions and call the result "setShuffleMaskZeroElements"?

23756 ↗(On Diff #44524)

warning: comparison of integers of different signs
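
As a generic illustration of this class of warning (the flagged line itself is not shown here, so this is not the code from the diff), the sketch below iterates over a shuffle mask with an unsigned index so that the loop bound comparison has matching signedness. The function name `countZeroElements` is hypothetical, and the use of `-2` as the "known zero" mask sentinel is an assumption made for the example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: count mask entries marked as "known zero".
// Assumption for this sketch: -2 is the sentinel for a zeroed element.
int countZeroElements(const std::vector<int> &mask) {
  const int zeroSentinel = -2;
  int count = 0;
  // Writing `for (int i = 0; i < mask.size(); ++i)` compares a signed int
  // with an unsigned size_t and triggers -Wsign-compare. Using size_t for
  // the index keeps both sides of the comparison unsigned.
  for (std::size_t i = 0, e = mask.size(); i != e; ++i)
    if (mask[i] == zeroSentinel)
      ++count;
  return count;
}
```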

Thanks, Sanjay. I'll get an updated patch out soon - the merge-consecutive-loads failures were due to some extra tests I added for D16217.

lib/Target/X86/X86ISelLowering.cpp
23605 ↗(On Diff #44524)

I'll merge them - computeKnownZeroShuffleElements was overzealous future-proofing on my part.

RKSimon updated this revision to Diff 45261. Jan 19 2016, 7:53 AM
RKSimon edited edge metadata.
spatel accepted this revision. Jan 19 2016, 10:29 AM
spatel edited edge metadata.

LGTM.

This revision is now accepted and ready to land. Jan 19 2016, 10:29 AM
This revision was automatically updated to reflect the committed changes.
RKSimon marked 3 inline comments as done.