This is an archive of the discontinued LLVM Phabricator instance.

[X86] Optimization of inserting vxi1 sub vector into vXi1 vector
ClosedPublic

Authored by xiangzhangllvm on Dec 26 2019, 6:03 PM.

Details

Summary

After the bug fix for the undef value case here, inserting a vXi1 subvector into a vXi1 vector took more operations than before. This patch optimizes the lowering so that it uses fewer operations.

The history is at https://reviews.llvm.org/D68311

Event Timeline

xiangzhangllvm created this revision. Dec 26 2019, 6:03 PM
Herald added a project: Restricted Project. Dec 26 2019, 6:03 PM
RKSimon added inline comments. Dec 29 2019, 2:51 AM
llvm/lib/Target/X86/X86ISelLowering.cpp
5902

APInt Mask0 = APInt::getBitsSet(NumElems, IdxVal, IdxVal + SubVecNumElems) ?

5913

"if needed" ?

craig.topper added inline comments. Dec 29 2019, 2:58 AM
llvm/lib/Target/X86/X86ISelLowering.cpp
5913

That comment is taken from the end of the function. WideOpVT and OpVT might be the same. There’s a check for that case in getNode that will just return the input vector in that case.

xiangzhangllvm marked an inline comment as done. Dec 29 2019, 5:33 PM
xiangzhangllvm added inline comments.
llvm/lib/Target/X86/X86ISelLowering.cpp
5902

Yes! getBitsSet is better, I'll change it.
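
For readers without the diff at hand, the idea behind the suggested mask can be sketched with plain integers. This is only an illustration under stated assumptions, not the code from the patch: `bitsSet` is a stand-in for `APInt::getBitsSet(NumBits, LoBit, HiBit)`, which sets the bits from `LoBit` (inclusive) to `HiBit` (exclusive), and `insertMaskBits` is a hypothetical helper that treats a vXi1 vector as the low bits of a `uint64_t`.

```cpp
#include <cassert>
#include <cstdint>

// Plain-integer stand-in for APInt::getBitsSet(NumBits, LoBit, HiBit):
// returns a value with bits [LoBit, HiBit) set.
// Assumes LoBit <= HiBit <= NumBits <= 64.
inline uint64_t bitsSet(unsigned NumBits, unsigned LoBit, unsigned HiBit) {
  assert(LoBit <= HiBit && HiBit <= NumBits && NumBits <= 64);
  uint64_t AtOrAboveLo = LoBit == 64 ? 0 : (~0ULL << LoBit);
  uint64_t BelowHi = HiBit == 64 ? ~0ULL : ~(~0ULL << HiBit);
  return AtOrAboveLo & BelowHi;
}

// Hypothetical helper (not from the patch): insert the low SubVecNumElems
// bits of Sub into Vec at bit position IdxVal.
// Assumes SubVecNumElems >= 1 and IdxVal + SubVecNumElems <= NumElems.
inline uint64_t insertMaskBits(uint64_t Vec, uint64_t Sub, unsigned NumElems,
                               unsigned IdxVal, unsigned SubVecNumElems) {
  assert(SubVecNumElems >= 1 && IdxVal + SubVecNumElems <= NumElems);
  uint64_t Field = bitsSet(NumElems, IdxVal, IdxVal + SubVecNumElems);
  uint64_t Cleared = Vec & ~Field;           // clear the destination field
  uint64_t Placed = (Sub << IdxVal) & Field; // drop stray upper bits of Sub
  return Cleared | Placed;                   // merge the two pieces
}
```

With `NumElems = 16`, `IdxVal = 4` and `SubVecNumElems = 4`, the field mask is `0x00F0`, so inserting `0x5` into `0xFFFF` gives `0xFF5F`. Masking `Sub` after the shift mirrors the need to ignore any bits above the subvector's own elements.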

I updated the patch and ran it through my local performance tests; it does show a small improvement.

lebedev.ri retitled this revision from Optimization of inserting vxi1 sub vector into vXi1 vector to [X86] Optimization of inserting vxi1 sub vector into vXi1 vector. Jan 2 2020, 12:15 AM
This revision is now accepted and ready to land. Jan 2 2020, 11:34 AM
This revision was automatically updated to reflect the committed changes.
llvm/lib/Target/X86/X86ISelLowering.cpp