This is an archive of the discontinued LLVM Phabricator instance.

[x86] turn insertelement into undef with variable index into splat
Closed, Public

Authored by spatel on Aug 23 2018, 2:22 PM.

Details

Summary

I noticed this along with the patterns in D51125: when the index is variable, we don't convert an insertelement into undef into a build_vector.

For x86, that means these get expanded at legalization time into the loading/spilling code that we see in the tests. I think it's always better to avoid going to memory for these, and with a splat we get the optimal 'broadcast' instruction if it's available.
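Why a splat is a legal replacement can be sketched with a small model. This is not LLVM code: the names `Vec`, `insertIntoUndef`, `splat`, and `refines` are hypothetical, lanes are modeled as `std::optional<int>` with an empty optional standing in for undef, and the index is assumed to be in range. Inserting into undef defines only the indexed lane, so every other lane may take any value, including the inserted scalar.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <optional>

// Hypothetical model: a lane is either a concrete value or undef (nullopt).
template <std::size_t N>
using Vec = std::array<std::optional<int>, N>;

// Models an insertelement of scalar x into an all-undef vector at a
// runtime index. Assumes idx < N.
template <std::size_t N>
Vec<N> insertIntoUndef(int x, std::size_t idx) {
  Vec<N> v{};   // value-initialized: every lane starts as undef
  v[idx] = x;   // only the indexed lane becomes defined
  return v;
}

// Models a splat (broadcast) of scalar x to every lane.
template <std::size_t N>
Vec<N> splat(int x) {
  Vec<N> v;
  v.fill(x);
  return v;
}

// Returns true if `to` is an acceptable replacement for `from`:
// every defined lane of `from` keeps its value, undef lanes may be anything.
template <std::size_t N>
bool refines(const Vec<N>& from, const Vec<N>& to) {
  for (std::size_t i = 0; i < N; ++i)
    if (from[i] && from[i] != to[i])
      return false;
  return true;
}
```

In this model, `refines(insertIntoUndef<4>(x, idx), splat<4>(x))` holds for every in-range `idx`, which is why the index no longer needs to be known once the insert is replaced by a splat.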

I suspect other targets may want to look at enabling the hook. AArch64 and AMDGPU have regression tests that would be affected (although I did not check what would happen in those cases). In the most basic cases shown here, AArch64 would do much better with a splat.

Diff Detail