This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] canonicalize vector select with constant vector condition to shuffle
ClosedPublic

Authored by spatel on Sep 6 2016, 1:45 PM.

Details

Summary

As discussed on llvm-dev ( http://lists.llvm.org/pipermail/llvm-dev/2016-August/104210.html ): turn a vector select into a shuffle when possible as a canonicalization step. Shuffles may be easier to reason about in conjunction with other shuffles and insert/extract.

Diff Detail

Repository
rL LLVM

Event Timeline

spatel updated this revision to Diff 70464.Sep 6 2016, 1:45 PM
spatel retitled this revision from to [InstCombine] canonicalize vector select with constant vector condition to shuffle.
spatel updated this object.
spatel added a subscriber: llvm-commits.
majnemer added inline comments.Sep 6 2016, 1:49 PM
lib/Transforms/InstCombine/InstCombineSelect.cpp
965 ↗(On Diff #70464)

I'd just reserve NumElts and push_back in the loop.

966 ↗(On Diff #70464)

Might be shorter to use Builder->getInt32Ty()

967 ↗(On Diff #70464)

I think we'd just use unsigned.

968 ↗(On Diff #70464)

Can't this return null?

spatel added inline comments.Sep 6 2016, 3:49 PM
lib/Transforms/InstCombine/InstCombineSelect.cpp
968 ↗(On Diff #70464)

Argh...yes. Let me concoct some tests from the last time I forgot about constant expressions. :)

spatel updated this revision to Diff 70485.Sep 6 2016, 4:21 PM

Patch updated:

  1. If the select condition is a constant expression - getAggregateElement returns nullptr - bail; add test.
  2. If any element of the select condition is a constant expression - ie, not {0,1,undef} - bail; add test.
  3. The code grew just enough to warrant being a helper function IMO, so...
  4. I didn't use Builder->getInt32Ty() to avoid needing the builder as a param.
  5. Fixed to use 'reserve' and 'push_back'.
  6. Fixed 'unsigned int'.
mkuper edited edge metadata.Sep 13 2016, 9:53 AM

Thanks Sanjay, as I wrote on the email thread, I think this is probably the right canonicalization.
Did you happen to check that we don't regress codegen for the IR tests that changed? At least X86? (This should probably go in even if it does regress some cases, but a PR would be good.)

test/Transforms/InstCombine/select.ll
1794 ↗(On Diff #70485)

Yikes.

Did you happen to check that we don't regress codegen for the IR tests that changed? At least X86? (This should probably go in even if it does regress some cases, but a PR would be good.)

Yes - x86 is the easy target because the x86 DAG combining/lowering already turns vselect into shuffle which then gets matched to blend or whatever shuffle instruction scraps are available before SSE4.1.

There is one existing problem for x86 that becomes slightly worse with this change. It manifests in vec_demanded_elts.ll:test_select(), and I just filed PR30371 (https://llvm.org/bugs/show_bug.cgi?id=30371) for that. That test also demonstrates a missing InstCombine fold that could fuse an insertelement with a later shuffle, but even if we had that fold, the x86 backend would screw up the codegen. :)

I filed https://llvm.org/bugs/show_bug.cgi?id=28530 for AArch64.

Thanks, Sanjay.
Anyway, this LGTM, but I'm not an InstCombine expert,so take this with the appropriate amount of salt...

Thanks, Sanjay.
Anyway, this LGTM, but I'm not an InstCombine expert,so take this with the appropriate amount of salt...

Thanks, Michael! I haven't heard any philosophical objections, so I'll commit this soon, and then I'll post back on the llvm-dev thread so people can keep an eye open for any regressions.

This revision was automatically updated to reflect the committed changes.