This is an archive of the discontinued LLVM Phabricator instance.

AVX512: Remove VSHRI kmask patterns from TD file
ClosedPublic

Authored by igorb on Mar 1 2016, 5:58 AM.

Details

Summary

AVX512: Remove VSHRI kmask patterns from TD file. It is incorrect to use kshiftw to implement VSHRI v4i1 , bits 15-4 is undef so the upper bits of v4i1 may not be zeroed. v4i1 should be zero_extend to v16i1 ( or any natively supported vector).
The only use of removed patterns i identified was in Insert1BitVector() , update it implementation to use natively supported shift .
Fix 2 additional bugs in Insert1BitVector implementation

IdxVal == 0 case  -- SubVec should be zero extend ,  
IdxVal + SubVecNumElems == NumElems  - Vec was used instead SubVec

I failed to create test case for IdxVal == 0 , may be you can suggest one.

Most probably in many cases insert kmask sub-vector into zero-vector don't require any instructions ( shiftl + shiftr ) to zero upper part - it already zero.
I think only exstruct_subvec (low part) in current implementation doesn't zero upper part, but i am not sure. In any case i believe this performance improvement ( if it is possible ) should be implemented as separate patch.
Thanks.

Diff Detail

Repository
rL LLVM

Event Timeline

igorb updated this revision to Diff 49474.Mar 1 2016, 5:58 AM
igorb retitled this revision from to AVX512: Remove VSHRI kmask patterns from TD file.
igorb updated this object.
igorb added a reviewer: delena.
igorb set the repository for this revision to rL LLVM.
igorb added a subscriber: llvm-commits.
delena added inline comments.Mar 2 2016, 7:13 AM
lib/Target/X86/X86ISelLowering.cpp
4696 ↗(On Diff #49474)

" " after if

4705 ↗(On Diff #49474)

if (OpVt == WideOpVT) you don't need insert

4708 ↗(On Diff #49474)

if index == 0, you don't need shift

4710 ↗(On Diff #49474)

if (OpVt == WideOpVT) you don't need extract

igorb updated this revision to Diff 49730.Mar 3 2016, 5:44 AM
igorb marked 4 inline comments as done.

Update patch according to comments.
Thanks for review.

lib/Target/X86/X86ISelLowering.cpp
4705 ↗(On Diff #49474)

only in case SubVecVT == OpVT , don't think it is possible

igorb updated this revision to Diff 49735.Mar 3 2016, 6:45 AM
delena accepted this revision.Mar 3 2016, 6:52 AM
delena edited edge metadata.

+ minor comment
LGTM

lib/Target/X86/X86InstrAVX512.td
2485 ↗(On Diff #49735)

typo - extract

Patterns for insert_subvector/extruct_subvector to/from mask vector at index 0.

This revision is now accepted and ready to land.Mar 3 2016, 6:52 AM
This revision was automatically updated to reflect the committed changes.