This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] SSE/AVX vector shifts demanded shift amount bits
ClosedPublic

Authored by RKSimon on Aug 11 2015, 6:38 AM.

Details

Summary

Most SSE/AVX (non-constant) vector shift instructions only use the lower 64-bits of the 128-bit shift amount vector operand, this patch calls SimplifyDemandedVectorElts to optimize for this.

I had to refactor some of my recent InstCombiner work on the vector shifts to avoid quite a bit of duplicate code. it means that SimplifyX86immshift now (re)decodes the type of shift.

Diff Detail

Repository
rL LLVM

Event Timeline

RKSimon updated this revision to Diff 31803.Aug 11 2015, 6:38 AM
RKSimon retitled this revision from to [InstCombine] SSE/AVX vector shifts demanded shift amount bits.
RKSimon updated this object.
RKSimon added reviewers: andreadb, mkuper, majnemer.
RKSimon set the repository for this revision to rL LLVM.
RKSimon added a subscriber: llvm-commits.
mkuper accepted this revision.Aug 12 2015, 5:35 AM
mkuper edited edge metadata.

LGTM, with a nit, and a question about unrelated code. :-)

lib/Transforms/InstCombine/InstCombineCalls.cpp
259 ↗(On Diff #31803)

Could this code also be replaced by a call to SimplifyDemandedVectorElts(), to remove duplication? Or does it do something smarter?

876 ↗(On Diff #31803)

all -> only?

This revision is now accepted and ready to land.Aug 12 2015, 5:35 AM

Thanks Michael.

lib/Transforms/InstCombine/InstCombineCalls.cpp
259 ↗(On Diff #31803)

AFAICT SimplifyDemandedVectorElts doesn't help us here - what we'd really need is a helper function that bitcasts a ConstantDataVector and we get the raw APInt values. There is similar code in ConstantFolding.cpp (and DAGCombiner::ConstantFoldBITCASTofBUILD_VECTOR) so it isn't out of the question but beyond the scope of this patch.

876 ↗(On Diff #31803)

OK.

mkuper added inline comments.Aug 12 2015, 7:37 AM
lib/Transforms/InstCombine/InstCombineCalls.cpp
259 ↗(On Diff #31803)

Of course, I definitely didn't mean that should be part of this patch.
(Also, I misread the code, it was a silly question to begin with, sorry for the noise.)

This revision was automatically updated to reflect the committed changes.