This is an archive of the discontinued LLVM Phabricator instance.

[X86][SSE] Support v16i8/v32i8 vector rotations
ClosedPublic

Authored by RKSimon on Jun 27 2018, 9:27 AM.

Details

Summary

This uses the same technique as for shifts: split the rotation into 4/2/1-bit partial rotations and select between those partials based on the corresponding bit of the rotation amount, making use of PBLENDVB if available. Compared to expanding the rotation into shifts, this halves the number of PBLENDVB instructions, which can be slow.

Unfortunately I haven't found a decent way to share much of this code with the shift equivalent.
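
For readers unfamiliar with the technique, here is a minimal scalar sketch of the 4/2/1-bit decomposition described in the summary. It is not the code from this patch: the function names (rotl8_const, rotl_v16i8_model) and the use of std::array to stand in for a v16i8 vector are assumptions made for illustration only. In the vector lowering, the per-element select would presumably be done by PBLENDVB (which selects each byte on the top bit of the corresponding mask byte) or by an AND/ANDN/OR sequence when PBLENDVB is unavailable.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Rotate a single byte left by a constant amount k, where 1 <= k <= 7.
static inline std::uint8_t rotl8_const(std::uint8_t x, unsigned k) {
  return static_cast<std::uint8_t>((x << k) | (x >> (8 - k)));
}

// Hypothetical scalar model of a v16i8 rotate-left by per-element amounts.
// Only the low 3 bits of each amount matter, since rotating a byte by 8
// is the identity.
std::array<std::uint8_t, 16>
rotl_v16i8_model(std::array<std::uint8_t, 16> v,
                 const std::array<std::uint8_t, 16> &amt) {
  const unsigned steps[] = {4, 2, 1};
  for (unsigned step : steps) {
    for (std::size_t i = 0; i < v.size(); ++i) {
      // Partial rotation by a constant: cheap to do for the whole vector.
      std::uint8_t rotated = rotl8_const(v[i], step);
      // Keep the rotated value only where this bit of the amount is set.
      // This is the per-element select that a blend would perform.
      bool take = (amt[i] & step) != 0;
      v[i] = take ? rotated : v[i];
    }
  }
  return v;
}
```

Each of the three steps needs one select per vector, so a variable rotation costs three selects in this model. Expanding the rotation into a left shift and a right shift, each lowered the same way, would need three selects per shift, which is consistent with the halving mentioned in the summary.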

Diff Detail

Repository
rL LLVM

Event Timeline

RKSimon created this revision. Jun 27 2018, 9:27 AM
This revision is now accepted and ready to land. Jun 28 2018, 8:58 PM
This revision was automatically updated to reflect the committed changes.