This is an archive of the discontinued LLVM Phabricator instance.

[X86][SSE] Add support for 32-bit element vectors to X86ISD::VZEXT_LOAD
ClosedPublic

Authored by RKSimon on Aug 18 2016, 9:10 AM.

Details

Summary

Consecutive load matching (EltsFromConsecutiveLoads) currently uses VZEXT_LOAD (load scalar into lowest element and zero uppers) for vXi64 / vXf64 vectors only.

For vXi32 / vXf32 vectors it instead creates a scalar load, SCALAR_TO_VECTOR and finally VZEXT_MOVL (zero upper vector elements), relying on tablegen patterns to match this into an equivalent of VZEXT_LOAD.

This patch adds the VZEXT_LOAD patterns for vXi32 / vXf32 vectors directly and updates EltsFromConsecutiveLoads to use this.

This has proven necessary to allow us to easily make VZEXT_MOVL a full member of the target shuffle set - without this change the call to combineShuffle (which is the main caller of EltsFromConsecutiveLoads) tended to recursively recreate VZEXT_MOVL nodes......

Diff Detail

Repository
rL LLVM

Event Timeline

RKSimon updated this revision to Diff 68551.Aug 18 2016, 9:10 AM
RKSimon retitled this revision from to [X86][SSE] Add support for 32-bit element vectors to X86ISD::VZEXT_LOAD.
RKSimon updated this object.
RKSimon set the repository for this revision to rL LLVM.
RKSimon added a subscriber: llvm-commits.
mkuper edited edge metadata.Aug 22 2016, 5:00 PM

The code LGTM.
Regarding the rationale - I don't see why this should be different for i32 and i64. But I don't really understand shuffle lowering well enough, so take this with a grain of salt.

We still need the vzmovl patterns because of lowerVectorShuffleAsElementInsertion, right? Do we have anything them tests them now that we will no longer generate these patterns in EltsFromConsecutiveLoads?

We still need the vzmovl patterns because of lowerVectorShuffleAsElementInsertion, right? Do we have anything them tests them now that we will no longer generate these patterns in EltsFromConsecutiveLoads?

I started stripping these to find out - it appears to be very hit and miss, whether its a test coverage issue or they can't match any longer is a length investigation. Given that many still have v2i64 equivalents suggests that they can get used.

OK to commit?

mkuper accepted this revision.Aug 23 2016, 12:21 PM
mkuper edited edge metadata.

Looks OK to me.

This revision is now accepted and ready to land.Aug 23 2016, 12:21 PM
This revision was automatically updated to reflect the committed changes.