This is an archive of the discontinued LLVM Phabricator instance.

AVX512 : i8/i16 vector CTLZ/CTLZ_ZERO_UNDEF lowering
ClosedPublic

Authored by igorb on Oct 11 2015, 12:46 AM.

Details

Summary

AVX512: Lowering i8/i16 vector CTLZ using the dword LZCNT vector instruction ( sub(trunc(lzcnt(zext32(x)))) ).

Diff Detail

Repository
rL LLVM

Event Timeline

igorb updated this revision to Diff 37049.Oct 11 2015, 12:46 AM
igorb retitled this revision from to AVX512 : i8/i16 vector CTLZ/CTLZ_ZERO_UNDEF lowering.
igorb updated this object.
igorb added reviewers: delena, RKSimon.
igorb set the repository for this revision to rL LLVM.
igorb added a subscriber: llvm-commits.
RKSimon edited edge metadata.Oct 11 2015, 6:58 AM

Thanks for looking at this - minor comments below. Elena should review the AVX512 internals.

lib/Target/X86/X86ISelLowering.cpp
1522 ↗(On Diff #37049)

There is a mix of 128-bit, 256-bit and 512-bit types here - probably best to separate them (but still under the same hasCDI() test) and add a comment on what is going on.

1537 ↗(On Diff #37049)

Is v16i8 not possible here?

17387 ↗(On Diff #37049)

Please can you add a minimal amount of hasAVX512() protection to the function (maybe as an early out)? Its likely that someone will get around to adding SSSE3 style vector CTLZ lowering at some point and would want to reuse some of the function's setup.

Alternatively rename this function LowerVectorCTLZ_AVX512 and add a hasAVX512() test to where it is called?

17390 ↗(On Diff #37049)

Fix style + grammar - correct the indentation and add a full stop.

17405 ↗(On Diff #37049)

Is VT.is512BitVector() actually supported here? Surely when you zero extend the vector will exceed 512 bits? Maybe a second assert testing that (NewVT.is256BitVector() || NewVT.is512BitVector())?

17406 ↗(On Diff #37049)

Fix style + grammar

test/CodeGen/X86/vector-lzcnt-512.ll
1 ↗(On Diff #37049)

These prefixes probably need tidying up?

igorb marked 6 inline comments as done.Oct 13 2015, 5:57 AM
igorb added inline comments.
lib/Target/X86/X86ISelLowering.cpp
1537 ↗(On Diff #37049)

CTLZ of v16i8 will be implemented using v16i32 that require hasCDI() only.

igorb updated this revision to Diff 37242.Oct 13 2015, 5:59 AM
igorb edited edge metadata.

Simon, Thank you very much for your review! I have updated the patch according to your comments.

LGTM but I'd prefer if Elena can confirm the X86InstrAVX512.td / CONCAT_VECTOR changes.

lib/Target/X86/X86ISelLowering.cpp
17393 ↗(On Diff #37242)

L0 -> Lo

delena added inline comments.Oct 14 2015, 6:56 AM
lib/Target/X86/X86ISelLowering.cpp
1528 ↗(On Diff #37242)

All these types may be "Custom" for AVX-512F and extended to 512-bit vectors.

17392 ↗(On Diff #37242)

NumElts >= 32 or NumElts > 16

igorb updated this revision to Diff 37698.Oct 18 2015, 1:43 AM
igorb marked 3 inline comments as done.

Thank for your review, I have updated the patch according to your comments.
Please take a look.

This revision was automatically updated to reflect the committed changes.