This is an archive of the discontinued LLVM Phabricator instance.

[X86][SSE] Reduce instruction/register usages for v4i32 vector shifts (PR37441)
ClosedPublic

Authored by RKSimon on May 16 2018, 9:18 AM.

Details

Summary

As suggested by Fabian on PR37441, use PSHUFLW to extend shift amount types for use with PSRAD/PSRLD to reduce register pressure.

Ideally some of this would be done by combineTargetShuffle, but it's tricky to do as most of the shuffles share inputs.
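As a rough illustration of why a PSHUFLW-style word shuffle can stand in for a zero-extension of the shift amount, here is a minimal scalar model in C++. It is a sketch under stated assumptions, not the code from this patch: it assumes the low 16-bit words of a 32-bit amount are shuffled as {0, 1, 1, 1}, and it models only the documented PSRLD behaviour that the count is read from the low 64 bits of the count operand and that counts above 31 produce zero. The helper names (psrld_lane, count_zext, count_pshuflw, lshr_v4i32) are hypothetical.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Scalar model of one lane of PSRLD xmm, xmm: the lane is shifted right by
// the 64-bit count taken from the low half of the count register, and any
// count above 31 yields zero.
inline uint32_t psrld_lane(uint32_t value, uint64_t count64) {
    return count64 > 31 ? 0u : static_cast<uint32_t>(value >> count64);
}

// Reference count: the 32-bit amount zero-extended to 64 bits.
inline uint64_t count_zext(uint32_t amt) {
    return amt;
}

// Assumed PSHUFLW-style count: the two 16-bit words of the amount are
// arranged as {lo, hi, hi, hi}. No zero register is needed to build it.
inline uint64_t count_pshuflw(uint32_t amt) {
    const uint64_t lo = amt & 0xFFFFu;
    const uint64_t hi = amt >> 16;
    return lo | (hi << 16) | (hi << 32) | (hi << 48);
}

// Per-lane logical shift right of four 32-bit lanes, using the shuffled
// count for each lane (the real lowering would blend four vector shifts).
inline std::array<uint32_t, 4> lshr_v4i32(const std::array<uint32_t, 4>& v,
                                          const std::array<uint32_t, 4>& amt) {
    std::array<uint32_t, 4> r{};
    for (std::size_t i = 0; i < 4; ++i)
        r[i] = psrld_lane(v[i], count_pshuflw(amt[i]));
    return r;
}
```

In this model, an in-range amount (below 32) has a zero high word, so both counts are identical. An out-of-range amount gives a 64-bit count above 31 either way, so the shifted result is zero in both cases.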

Diff Detail

Repository
rL LLVM

Event Timeline

RKSimon created this revision. May 16 2018, 9:18 AM
craig.topper added inline comments. May 16 2018, 11:15 AM
lib/Target/X86/X86ISelLowering.cpp
23447

Are these bitcasts here because you could fold them into this line? They are only needed on the non-AVX path from above, right?

RKSimon added inline comments. May 16 2018, 11:31 AM
lib/Target/X86/X86ISelLowering.cpp
23447

Yes, I was cheating and saving space; wrapping them around the non-AVX getVectorShuffle isn't pretty either.

This revision is now accepted and ready to land. May 16 2018, 12:16 PM
This revision was automatically updated to reflect the committed changes.