This is an archive of the discontinued LLVM Phabricator instance.

[X86][SSE] Use ISD::MULHU for constant/non-zero ISD::SRL lowering (PR38151)
ClosedPublic

Authored by RKSimon on Jul 19 2018, 11:26 AM.

Details

Summary

As was done for vector rotations, we can efficiently use ISD::MULHU for vXi8/vXi16 ISD::SRL lowering.

Shift-by-zero cases are still problematic: mainly on v32i8, which needs extra AND/ANDN/OR or VPBLENDVB blend masks, but v8i16/v16i16 aren't great either if PBLENDW fails. I've therefore limited this first patch to known non-zero cases if we can't easily use PBLENDW.
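As a rough illustration of the identity behind this lowering, here is a scalar model of a single 16-bit lane. It is only a sketch, not the code from the patch: the helper names `mulhu16`, `srl_via_mulhu16` and `identity_holds` are made up for this example. A logical right shift by `amt` equals the high half of a multiply by 2^(16 - amt), and that multiplier only fits in 16 bits when `amt` is non-zero.

```cpp
#include <cassert>
#include <cstdint>

// Scalar model of one 16-bit lane of an unsigned multiply-high (ISD::MULHU):
// the upper 16 bits of the full 32-bit product.
uint16_t mulhu16(uint16_t a, uint16_t b) {
    return static_cast<uint16_t>((static_cast<uint32_t>(a) * b) >> 16);
}

// Logical right shift by a constant amount in 1..15, expressed as MULHU.
// The multiplier 2^(16 - amt) is representable in 16 bits only for amt != 0;
// for amt == 0 it would be 2^16, which truncates to 0.
uint16_t srl_via_mulhu16(uint16_t x, unsigned amt) {
    assert(amt >= 1 && amt <= 15);
    uint16_t scale = static_cast<uint16_t>(1u << (16 - amt));
    return mulhu16(x, scale);
}

// Exhaustively compare against a plain shift for every 16-bit input
// and every non-zero shift amount.
bool identity_holds() {
    for (unsigned amt = 1; amt <= 15; ++amt)
        for (uint32_t x = 0; x <= 0xFFFF; ++x)
            if (srl_via_mulhu16(static_cast<uint16_t>(x), amt) != (x >> amt))
                return false;
    return true;
}
```

With a zero shift amount the truncated multiplier is 0, so the multiply-high yields 0 instead of the original value; this is why zero-amount lanes need a separate blend or select.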

Diff Detail

Repository
rL LLVM

Event Timeline

RKSimon updated this revision to Diff 156458. Jul 20 2018, 2:51 AM

Add partial support for shift-by-zero amounts - for v8i16 on SSE41+ targets where PBLENDW is available.
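The zero-amount handling can be pictured with a lane-by-lane scalar model. This is a hedged sketch rather than the patch itself: `V8i16`, `mulhu_lane` and `srl_const_v8i16` are illustrative names, and the final per-lane select merely stands in for what a PBLENDW with a constant mask would do for v8i16.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using V8i16 = std::array<uint16_t, 8>;

// High 16 bits of the unsigned 16x16 product (one MULHU lane).
static uint16_t mulhu_lane(uint16_t a, uint16_t b) {
    return static_cast<uint16_t>((static_cast<uint32_t>(a) * b) >> 16);
}

// Scalar model of a constant per-lane logical right shift on v8i16.
// Lanes with a non-zero amount use the multiply-high result; lanes with a
// zero amount are blended back from the unshifted input.
V8i16 srl_const_v8i16(const V8i16 &x, const std::array<unsigned, 8> &amt) {
    V8i16 out{};
    for (int i = 0; i < 8; ++i) {
        assert(amt[i] <= 15);
        // 2^(16 - amt) does not fit in 16 bits for amt == 0, so use 0 there;
        // that lane's product is discarded by the blend below.
        uint16_t scale =
            amt[i] == 0 ? 0 : static_cast<uint16_t>(1u << (16 - amt[i]));
        uint16_t shifted = mulhu_lane(x[i], scale);
        // Blend step: because the amounts are constants, the set of
        // zero-amount lanes is known up front and can form a fixed mask.
        out[i] = (amt[i] == 0) ? x[i] : shifted;
    }
    return out;
}
```

Because the shift amounts are constants, the set of zero-amount lanes is known at compile time, which is what makes an immediate-mask blend a plausible fit in this model.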

craig.topper added inline comments. Jul 22 2018, 2:47 PM
lib/Target/X86/X86ISelLowering.cpp
23516 ↗(On Diff #156458)

Why are we punting with hasAVX512()? We don't get a variable shift of words until hasBWI.

RKSimon added inline comments. Jul 31 2018, 6:07 AM
lib/Target/X86/X86ISelLowering.cpp
23516 ↗(On Diff #156458)

This was a leftover from the zero amount handling - AVX512 can more efficiently extend to D/Q types, shift then truncate. I'll update the patch.
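For comparison, the extend/shift/truncate approach mentioned here can be modelled on a single lane as follows. This is a minimal sketch with a hypothetical helper name (`srl_via_extend16`), not code from the patch; it shows that widening first makes a zero shift amount a non-issue.

```cpp
#include <cassert>
#include <cstdint>

// Scalar model of "extend to a wider type, shift, then truncate" for one
// 16-bit lane. A zero shift amount needs no special handling here.
uint16_t srl_via_extend16(uint16_t x, unsigned amt) {
    assert(amt <= 15);
    uint32_t wide = x;                   // zero-extend i16 -> i32
    wide >>= amt;                        // shift in the wider type
    return static_cast<uint16_t>(wide);  // truncate back to i16
}
```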

RKSimon updated this revision to Diff 158236. Jul 31 2018, 6:23 AM
RKSimon edited the summary of this revision.

Reinstated AVX512 support

This revision is now accepted and ready to land. Jul 31 2018, 9:51 AM
This revision was automatically updated to reflect the committed changes.