This is an archive of the discontinued LLVM Phabricator instance.

[X86][SSE] Lower BUILD_VECTOR with repeated ops as BUILD_VECTOR + VECTOR_SHUFFLE
ClosedPublic

Authored by RKSimon on Mar 26 2017, 5:58 AM.

Details

Summary

Transferring values from the GPRs to the XMM registers can be costly, and it can prevent loads from merging.

This patch splits vXi16/vXi32/vXi64 BUILD_VECTORs that use the same operand in multiple elements into a BUILD_VECTOR with only a single insertion of each of those elements, and then performs a unary shuffle to duplicate the values.
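The split described above can be sketched outside of LLVM as follows. This is only an illustration under stated assumptions, not the code from the patch: `Operand`, `SplitResult`, and `splitRepeatedBuildVector` are hypothetical names, operands are modelled as strings rather than SDValues, and each unique operand is assumed to stay in the lane where it first appears. The sketch requires C++17.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a scalar operand (an SDValue in the real code).
using Operand = std::string;

struct SplitResult {
  // Operands of the reduced BUILD_VECTOR; std::nullopt models an undef lane.
  std::vector<std::optional<Operand>> UniqueOps;
  // Unary shuffle mask: result lane I reads UniqueOps[ShuffleMask[I]].
  std::vector<int> ShuffleMask;
};

// Returns std::nullopt when no operand is repeated, so there is nothing to split.
std::optional<SplitResult>
splitRepeatedBuildVector(const std::vector<Operand> &Ops) {
  std::unordered_map<Operand, int> FirstLane;
  SplitResult R;
  R.UniqueOps.assign(Ops.size(), std::nullopt);
  R.ShuffleMask.resize(Ops.size());
  bool HasRepeatedElts = false;

  for (std::size_t I = 0; I != Ops.size(); ++I) {
    // Record the first lane in which each operand appears.
    auto [It, Inserted] = FirstLane.try_emplace(Ops[I], static_cast<int>(I));
    if (Inserted)
      R.UniqueOps[I] = Ops[I]; // Single insertion of this operand.
    else
      HasRepeatedElts = true;  // Later uses are served by the shuffle.
    R.ShuffleMask[I] = It->second;
  }

  if (!HasRepeatedElts)
    return std::nullopt;
  return R;
}
```

For the operand list (a, b, a, b), this sketch yields the reduced vector (a, b, undef, undef) and the shuffle mask <0, 1, 0, 1>, so each scalar would be inserted once and the shuffle would duplicate it.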

This patch unearths a couple of minor regressions caused by some missing MOVDDUP/BROADCAST folds, which I will address in a future patch.

Note: Now that vector shuffle lowering and combining is pretty good we should be reusing that instead of duplicating so much in LowerBUILD_VECTOR - this is the first of several patches to address this.

Diff Detail

Repository
rL LLVM

Event Timeline

RKSimon created this revision. Mar 26 2017, 5:58 AM
spatel added inline comments. Apr 3 2017, 9:57 AM
lib/Target/X86/X86ISelLowering.cpp
6112–6113 ↗(On Diff #93069)

"build vector of repeated ops" translated to "splat" in my mind when I read this. I think we guarantee that case won't make it this far, so assert that condition?
How about "build vector with repeated ops (but not a full splat)"?

6115 ↗(On Diff #93069)

I'd rather not use "Permute" in the name here since that implies one of those specific AVX instructions. "lowerBuildVectorWithRepeatedEltsUsingShuffle"?

6129 ↗(On Diff #93069)

I prefer to put a verb on these kinds of bools - "HasRepeatedElts"?

6138 ↗(On Diff #93069)

Run-on: "repeated, so don't"

RKSimon updated this revision to Diff 93905. Apr 3 2017, 11:53 AM

Updated based on Sanjay's feedback.

RKSimon marked 3 inline comments as done. Apr 3 2017, 11:55 AM
RKSimon added inline comments.
lib/Target/X86/X86ISelLowering.cpp
6112–6113 ↗(On Diff #93069)

Splats can occur here, as the buildvector broadcast lowering only handles the few cases where we have a legal BROADCAST instruction (AVX1 onwards); it doesn't even deal with MOVDDUP AFAICT.

spatel accepted this revision. Apr 3 2017, 12:52 PM

LGTM.

This revision is now accepted and ready to land. Apr 3 2017, 12:52 PM
This revision was automatically updated to reflect the committed changes.