This is an archive of the discontinued LLVM Phabricator instance.

[X86] Fix insert_vector_elt lowering for v32i1/v64i1 with non-constant index
ClosedPublic

Authored by craig.topper on Dec 7 2017, 12:28 AM.

Details

Summary

Currently we don't handle v32i1/v64i1 insert_vector_elt correctly as we fail to look at the number of elements closely and assume it can only be v16i1 or v8i1.

We also can't type legalize v64i1 insert_vector_elt correctly on KNL due to the type not being byte addressable as required by the legalizing through memory accesses path requires.

For the first issue, the patch now tries to pick a 512-bit register with the correct number of elements and promotes to that.

For the second issue, we now extend the vector to a byte addressable type, do the stores to memory, load the two halves, and then truncate the halves back to the original type. Technically since we changed the type, we may not need two loads, but actually checking that is more work and for the v64i1 case we do need them.

I plan to commit the two pieces as separate patches, but posting them together since I need both parts to pass the new tests.

Diff Detail

Repository
rL LLVM

Event Timeline

craig.topper created this revision.Dec 7 2017, 12:28 AM
RKSimon accepted this revision.Dec 15 2017, 8:11 AM

LGTM - Is it worth adding non-legal types for testing as well e.g. <128 x i1> or <96 x i1>?

lib/Target/X86/X86ISelLowering.cpp
14687 ↗(On Diff #125895)

clang-format this (VecSize / NumElts) ?

This revision is now accepted and ready to land.Dec 15 2017, 8:11 AM
This revision was automatically updated to reflect the committed changes.