We discussed shrinking/widening of selects in IR in D26556, and I'll try to get back to that patch eventually. But I'm hoping that this transform is less iffy in the DAG where we can check legality of the select that we want to produce.
A few things to note:
- We can't wait until after legalization and do this generically because, at least in the x86 tests from PR14657, we'll have PACKSS and bitcasts in the pattern.
- This might benefit more of the SSE codegen if we lifted the legal-or-custom requirement, but I think that requires a closer look to make sure we don't end up with worse code.
- We're missing a 'vblendv' opportunity: in some cases we produce andn/and/or where a single blend would do. I thought I'd better just post this as-is to make sure I'm not off the rails, but I could fix that first.
- I'm assuming that AVX1 offers the worst of all worlds wrt uneven ISA support with multiple legal vector sizes, but I can certainly add tests for other targets to make sure this isn't doing harm.
- There's a codegen miracle in the multi-BB tests from PR14657 (the gcc auto-vectorization tests): despite IR that is terrible for the target, this patch allows us to generate the optimal loop code because something post-ISEL is hoisting the splat extends above the vector loops.
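
For reference on the 'vblendv' note above, here is a minimal scalar model of why an andn/and/or sequence and a variable blend are interchangeable when every mask lane is all-ones or all-zeros, as a sign-extended compare result would be. This is only a sketch: `Vec4`, `selectAndnAndOr`, and `selectBlendv` are hypothetical names for illustration, not anything in the patch or in LLVM, and the blend is modeled simply as "pick by the sign bit of each 32-bit lane".

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical 4 x 32-bit vector type, used only for this illustration.
using Vec4 = std::array<std::uint32_t, 4>;

// Models the andn/and/or sequence: (mask & a) | (~mask & b), lane by lane.
Vec4 selectAndnAndOr(const Vec4 &mask, const Vec4 &a, const Vec4 &b) {
  Vec4 r{};
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = (mask[i] & a[i]) | (~mask[i] & b[i]);
  return r;
}

// Models a variable blend keyed on the sign bit of each mask lane
// (an assumption made for this sketch, not a statement about operand order
// of any particular instruction).
Vec4 selectBlendv(const Vec4 &mask, const Vec4 &a, const Vec4 &b) {
  Vec4 r{};
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = (mask[i] >> 31) ? a[i] : b[i];
  return r;
}
```

The two functions agree whenever each mask lane is 0 or 0xFFFFFFFF. They can differ for other mask values (for example a lane with only the sign bit set), which is why the equivalence depends on the mask coming from a compare.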