This is an archive of the discontinued LLVM Phabricator instance.

[X86][SSSE3] Lower vector CTLZ with PSHUFB lookups
ClosedPublic

Authored by RKSimon on May 6 2016, 5:35 AM.

Details

Summary

This patch uses PSHUFB to lower vector CTLZ and avoid (slower) scalarizations.

The leading zero count of each 4-bit nibble of the vector is determined by using a PSHUFB lookup. Pairs of results are then repeatedly combined up to the original element width.
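
The nibble-lookup-and-combine idea can be illustrated with a scalar model. This is only a sketch, not the code from the patch: the names `kNibbleLZ`, `ctlz8`, `ctlz16` and `ctlz32` are hypothetical, and the actual lowering operates on whole vector registers (with PSHUFB performing the table lookup for every byte at once) rather than on one element at a time.

```cpp
#include <array>
#include <cstdint>

// Hypothetical 16-entry table: leading-zero count of each 4-bit value.
// Entry 0 is 4 because an all-zero nibble has four leading zeros.
// In the vector version, a table like this would be the PSHUFB operand.
static constexpr std::array<uint8_t, 16> kNibbleLZ = {
    4, 3, 2, 2, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0};

// Byte CTLZ from two nibble lookups: the low nibble's count only
// contributes when the high nibble is entirely zero.
inline uint8_t ctlz8(uint8_t x) {
  uint8_t hi = kNibbleLZ[x >> 4];
  uint8_t lo = kNibbleLZ[x & 0xF];
  return (x >> 4) == 0 ? static_cast<uint8_t>(hi + lo) : hi;
}

// Combine a pair of byte results into a 16-bit result the same way.
inline uint8_t ctlz16(uint16_t x) {
  uint8_t upper = static_cast<uint8_t>(x >> 8);
  uint8_t lower = static_cast<uint8_t>(x & 0xFF);
  return upper == 0 ? static_cast<uint8_t>(8 + ctlz8(lower)) : ctlz8(upper);
}

// Repeat the pairwise combine once more to reach 32-bit elements.
inline uint8_t ctlz32(uint32_t x) {
  uint16_t upper = static_cast<uint16_t>(x >> 16);
  uint16_t lower = static_cast<uint16_t>(x & 0xFFFF);
  return upper == 0 ? static_cast<uint8_t>(16 + ctlz16(lower)) : ctlz16(upper);
}
```

In this model a zero input yields the full element width (for example, `ctlz32(0)` is 32), because each combine step adds the lower half's count whenever the upper half is zero.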

Diff Detail

Repository
rL LLVM

Event Timeline

RKSimon updated this revision to Diff 56407. May 6 2016, 5:35 AM
RKSimon retitled this revision from to [X86][SSSE3] Lower vector CTLZ with PSHUFB lookups.
RKSimon updated this object.
RKSimon set the repository for this revision to rL LLVM.
RKSimon added a subscriber: llvm-commits.
chandlerc added inline comments. May 12 2016, 3:10 PM
lib/Target/X86/X86ISelLowering.cpp
18792

I'm not following how you're using this conditional. Could you comment the algorithm in somewhat more detail here? I can probably figure it out myself, but it seems better to make the comments explain this in clear terms.

RKSimon updated this revision to Diff 57148. May 13 2016, 2:54 AM

Updated comments to hopefully describe the method more clearly.

chandlerc accepted this revision. May 15 2016, 7:47 PM
chandlerc edited edge metadata.

Much nicer, LGTM. Thanks for the added comments. See a nitpick about the style below, but submit with that addressed.

lib/Target/X86/X86ISelLowering.cpp
1012–1020

Please skip braces on clear single-line ifs and loops.

This revision is now accepted and ready to land. May 15 2016, 7:47 PM
This revision was automatically updated to reflect the committed changes.