As discussed on llvm-dev ( http://lists.llvm.org/pipermail/llvm-dev/2016-August/104210.html ): turn a vector select into a shuffle when possible as a canonicalization step. Shuffles may be easier to reason about in conjunction with other shuffles and insert/extract.
Details
Diff Detail
Event Timeline
lib/Transforms/InstCombine/InstCombineSelect.cpp | ||
---|---|---|
968 | Argh...yes. Let me concoct some tests from the last time I forgot about constant expressions. :) |
Patch updated:
- If the select condition is a constant expression - getAggregateElement returns nullptr - bail; add test.
- If any element of the select condition is a constant expression - ie, not {0,1,undef} - bail; add test.
- The code grew just enough to warrant being a helper function IMO, so...
- I didn't use Builder->getInt32Ty() to avoid needing the builder as a param.
- Fixed to use 'reserve' and 'push_back'.
- Fixed 'unsigned int'.
Thanks Sanjay, as I wrote on the email thread, I think this is probably the right canonicalization.
Did you happen to check that we don't regress codegen for the IR tests that changed? At least X86? (This should probably go in even if it does regress some cases, but a PR would be good.)
test/Transforms/InstCombine/select.ll | ||
---|---|---|
1794 | Yikes. |
Yes - x86 is the easy target because the x86 DAG combining/lowering already turns vselect into shuffle which then gets matched to blend or whatever shuffle instruction scraps are available before SSE4.1.
There is one existing problem for x86 that becomes slightly worse with this change. It manifests in vec_demanded_elts.ll:test_select(), and I just filed PR30371 (https://llvm.org/bugs/show_bug.cgi?id=30371) for that. That test also demonstrates a missing InstCombine fold that could fuse an insertelement with a later shuffle, but even if we had that fold, the x86 backend would screw up the codegen. :)
I filed https://llvm.org/bugs/show_bug.cgi?id=28530 for AArch64.
Thanks, Sanjay.
Anyway, this LGTM, but I'm not an InstCombine expert,so take this with the appropriate amount of salt...
Thanks, Michael! I haven't heard any philosophical objections, so I'll commit this soon, and then I'll post back on the llvm-dev thread so people can keep an eye open for any regressions.
I'd just reserve NumElts and push_back in the loop.