[X86][SSE] Reduce instruction/register usages for v4i32 vector shifts (PR37441)
Closed, Public

Authored by RKSimon on Wed, May 16, 9:18 AM.

Details

Summary

As suggested by Fabian on PR37441, use PSHUFLW to extend shift amount types for use with PSRAD/PSRLD to reduce register pressure.

Some of this would ideally be done by combineTargetShuffle, but it's tricky to do as most of the shuffles share inputs.
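For readers who want to see the idea outside SelectionDAG, here is a rough sketch using SSE2 intrinsics. It is not the code from this patch. The helper names `srl_v4i32_sse2` and `srl_v4i32` are hypothetical, the sketch covers only the logical right shift (PSRLD), and it assumes every shift amount is in the range [0, 31]. PSRLD with a register count reads the count from the low 64 bits of that register. With in-range amounts, the upper 16-bit word of each 32-bit amount is zero, so a single PSHUFLW that copies that word into the remaining low words yields a zero-extended 64-bit count without needing a separate zero register. The final per-lane select uses AND/OR masks purely to keep the sketch simple.

```cpp
#include <array>
#include <cstdint>
#include <emmintrin.h>  // SSE2 intrinsics

// Hypothetical helper: per-lane logical right shift of four 32-bit lanes.
// Assumption: each lane of `amt` holds a value in [0, 31].
static __m128i srl_v4i32_sse2(__m128i v, __m128i amt) {
  // Move lanes 2 and 3 of the amounts into the low 64 bits.
  __m128i amt23 = _mm_unpackhi_epi64(amt, amt);

  // PSHUFLW word patterns {0,1,1,1} and {2,3,3,3}: the low word of a lane
  // followed by copies of its (zero) high word, i.e. the amount
  // zero-extended to 64 bits in the low half of the register.
  __m128i a0 = _mm_shufflelo_epi16(amt,   _MM_SHUFFLE(1, 1, 1, 0));
  __m128i a1 = _mm_shufflelo_epi16(amt,   _MM_SHUFFLE(3, 3, 3, 2));
  __m128i a2 = _mm_shufflelo_epi16(amt23, _MM_SHUFFLE(1, 1, 1, 0));
  __m128i a3 = _mm_shufflelo_epi16(amt23, _MM_SHUFFLE(3, 3, 3, 2));

  // PSRLD shifts all four lanes by the count in the low 64 bits.
  __m128i r0 = _mm_srl_epi32(v, a0);
  __m128i r1 = _mm_srl_epi32(v, a1);
  __m128i r2 = _mm_srl_epi32(v, a2);
  __m128i r3 = _mm_srl_epi32(v, a3);

  // Keep lane i from result i (mask-based select, chosen for clarity).
  __m128i m0 = _mm_set_epi32(0, 0, 0, -1);
  __m128i m1 = _mm_set_epi32(0, 0, -1, 0);
  __m128i m2 = _mm_set_epi32(0, -1, 0, 0);
  __m128i m3 = _mm_set_epi32(-1, 0, 0, 0);
  __m128i lo = _mm_or_si128(_mm_and_si128(r0, m0), _mm_and_si128(r1, m1));
  __m128i hi = _mm_or_si128(_mm_and_si128(r2, m2), _mm_and_si128(r3, m3));
  return _mm_or_si128(lo, hi);
}

// Hypothetical array wrapper so the sketch can be exercised from scalar code.
std::array<std::uint32_t, 4> srl_v4i32(const std::array<std::uint32_t, 4>& v,
                                       const std::array<std::uint32_t, 4>& amt) {
  __m128i x = _mm_loadu_si128(reinterpret_cast<const __m128i*>(v.data()));
  __m128i s = _mm_loadu_si128(reinterpret_cast<const __m128i*>(amt.data()));
  std::array<std::uint32_t, 4> out{};
  _mm_storeu_si128(reinterpret_cast<__m128i*>(out.data()),
                   srl_v4i32_sse2(x, s));
  return out;
}
```

The same count registers would work with `_mm_sra_epi32` for the arithmetic (PSRAD) case, again under the in-range assumption.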

Diff Detail

Repository
rL LLVM
RKSimon created this revision. Wed, May 16, 9:18 AM
craig.topper added inline comments. Wed, May 16, 11:15 AM
lib/Target/X86/X86ISelLowering.cpp
23447 ↗(On Diff #147109)

Are these bitcasts here because you could fold them into this line? They are only needed on the non-AVX path from above, right?

RKSimon added inline comments. Wed, May 16, 11:31 AM
lib/Target/X86/X86ISelLowering.cpp
23447 ↗(On Diff #147109)

Yes, I was cheating and saving space; wrapping them around the non-AVX getVectorShuffle isn't pretty either.

This revision is now accepted and ready to land. Wed, May 16, 12:16 PM
This revision was automatically updated to reflect the committed changes.