
[X86][SSE] Reduce instruction/register usages for v4i32 vector shifts (PR37441)

Authored by RKSimon on May 16 2018, 9:18 AM.



As suggested by Fabian on PR37441, use PSHUFLW to extend shift amount types for use with PSRAD/PSRLD to reduce register pressure.

Some of this ideally would be done by combineTargetShuffle, but it's tricky to do as most of the shuffles share inputs.
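For readers without the diff at hand, here is a minimal sketch of the idea using SSE2 intrinsics rather than the SelectionDAG code in the patch. PSRLD/PSRAD with a register count read the whole low 64 bits as one shift amount for all lanes, so each 32-bit amount has to be widened first. The helper name `srl_v4i32` and the exact shuffle masks are assumptions made for illustration and are not taken from the revision. The sketch repeats the upper 16-bit word of each amount across the low 64 bits with PSHUFLW, which equals zero-extension whenever the amount is in range (below 32) and still produces an out-of-range count otherwise.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics
#include <cstdint>

// Hypothetical helper: logical right shift of four 32-bit lanes, each by its
// own amount, using only SSE2. Not the code from the patch.
static void srl_v4i32(const uint32_t v[4], const uint32_t amt[4],
                      uint32_t out[4]) {
  __m128i V = _mm_loadu_si128(reinterpret_cast<const __m128i *>(v));
  __m128i A = _mm_loadu_si128(reinterpret_cast<const __m128i *>(amt));

  // PSHUFD: bring amounts 2 and 3 down into the low 64 bits.
  __m128i A23 = _mm_shuffle_epi32(A, _MM_SHUFFLE(3, 2, 3, 2));

  // PSHUFLW with word masks {0,1,1,1} and {2,3,3,3}: the low 64 bits become
  // one amount with its upper word repeated. The upper 64 bits are ignored
  // by the shift instructions.
  __m128i A0 = _mm_shufflelo_epi16(A, _MM_SHUFFLE(1, 1, 1, 0));
  __m128i A1 = _mm_shufflelo_epi16(A, _MM_SHUFFLE(3, 3, 3, 2));
  __m128i A2 = _mm_shufflelo_epi16(A23, _MM_SHUFFLE(1, 1, 1, 0));
  __m128i A3 = _mm_shufflelo_epi16(A23, _MM_SHUFFLE(3, 3, 3, 2));

  // PSRLD: shift every lane by each amount in turn.
  __m128i R0 = _mm_srl_epi32(V, A0);
  __m128i R1 = _mm_srl_epi32(V, A1);
  __m128i R2 = _mm_srl_epi32(V, A2);
  __m128i R3 = _mm_srl_epi32(V, A3);

  // Keep lane i of Ri. Plain AND/OR masks are used here for clarity; a real
  // lowering would use shuffles or blends instead.
  __m128i M0 = _mm_set_epi32(0, 0, 0, -1);
  __m128i M1 = _mm_set_epi32(0, 0, -1, 0);
  __m128i M2 = _mm_set_epi32(0, -1, 0, 0);
  __m128i M3 = _mm_set_epi32(-1, 0, 0, 0);
  __m128i R = _mm_or_si128(
      _mm_or_si128(_mm_and_si128(R0, M0), _mm_and_si128(R1, M1)),
      _mm_or_si128(_mm_and_si128(R2, M2), _mm_and_si128(R3, M3)));

  _mm_storeu_si128(reinterpret_cast<__m128i *>(out), R);
}
```

Swapping `_mm_srl_epi32` for `_mm_sra_epi32` gives the PSRAD (arithmetic) variant. The register saving comes from PSHUFLW needing no separate zero vector, unlike an unpack-with-zero sequence.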



Event Timeline

RKSimon created this revision. May 16 2018, 9:18 AM
craig.topper added inline comments. May 16 2018, 11:15 AM
23447 ↗(On Diff #147109)

Are these bitcasts here because you could fold them into this line? They are only needed on the non-AVX path from above, right?

RKSimon added inline comments. May 16 2018, 11:31 AM
23447 ↗(On Diff #147109)

Yes, I was cheating and saving space; wrapping them around the non-AVX getVectorShuffle isn't pretty either.

This revision is now accepted and ready to land. May 16 2018, 12:16 PM
This revision was automatically updated to reflect the committed changes.