This is an archive of the discontinued LLVM Phabricator instance.

[x86] improve codegen for non-splat bit-masked vector compare and select (PR46531)
ClosedPublic

Authored by spatel on Jul 5 2020, 1:12 PM.

Details

Summary

vselect ((X & Pow2C) == 0), LHS, RHS --> vselect ((shl X, C') < 0), RHS, LHS

Follow-up to D83073 - the non-splat mask cases where we actually see an improvement are quite limited from what I can tell. AVX1 needs multiply and blend capabilities and AVX2 needs vector shift and blend capabilities. The intersection of those 2 constraints is only vectors with 32-bit or 64-bit elements.
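As a rough scalar illustration of the fold in the summary (not the actual DAG combine code), the sketch below models a 4 x i32 vector with a different power-of-2 mask in each lane. The names Vec4, shiftForMask, selectByMask and selectBySignBit are hypothetical and exist only for this example. The per-lane shift amount C' is assumed to be the number of leading zeros of the mask constant, so the tested bit ends up in the sign bit.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Vec4 = std::array<uint32_t, 4>; // stand-in for a 4 x i32 vector

// Hypothetical helper: shift amount C' that moves the single set bit of a
// power-of-2 mask into bit 31 (the sign bit of a 32-bit lane).
unsigned shiftForMask(uint32_t pow2Mask) {
  // Assumption: the mask is a non-zero power of 2.
  assert(pow2Mask != 0 && (pow2Mask & (pow2Mask - 1)) == 0);
  unsigned shift = 0;
  while (shift < 31 && ((pow2Mask << shift) & 0x80000000u) == 0)
    ++shift;
  return shift;
}

// Original form: vselect ((X & Pow2C) == 0), LHS, RHS
Vec4 selectByMask(const Vec4 &x, const Vec4 &masks, const Vec4 &lhs,
                  const Vec4 &rhs) {
  Vec4 out{};
  for (std::size_t i = 0, e = out.size(); i != e; ++i)
    out[i] = (x[i] & masks[i]) == 0 ? lhs[i] : rhs[i];
  return out;
}

// Folded form: vselect ((shl X, C') < 0), RHS, LHS
Vec4 selectBySignBit(const Vec4 &x, const Vec4 &masks, const Vec4 &lhs,
                     const Vec4 &rhs) {
  Vec4 out{};
  for (std::size_t i = 0, e = out.size(); i != e; ++i) {
    uint32_t shifted = x[i] << shiftForMask(masks[i]);
    // Testing bit 31 is the unsigned equivalent of a signed "< 0" compare.
    bool isNegative = (shifted & 0x80000000u) != 0;
    out[i] = isNegative ? rhs[i] : lhs[i];
  }
  return out;
}
```

The two functions should agree for any inputs as long as every mask lane is a non-zero power of 2. On real hardware the point of the fold is that a variable blend can key off the sign bit directly, so the compare against zero goes away and only a per-lane shift (or, on AVX1, a multiply by a power of 2) remains.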

Event Timeline

spatel created this revision. Jul 5 2020, 1:12 PM
RKSimon added inline comments. Jul 6 2020, 6:15 AM
llvm/lib/Target/X86/X86ISelLowering.cpp:40260

XOP has more vector shifts and vpcmov which should allow 8/16-bit cases as well - I added testing at rGd6c72bdca2f2

spatel marked an inline comment as done. Jul 6 2020, 7:20 AM
spatel added inline comments.
llvm/lib/Target/X86/X86ISelLowering.cpp:40260

Ok - I'll enable XOP for all legal types, and we can decide if we need to exclude any types based on those diffs. I don't have a good sense of what's good/bad/possible with those instructions.

spatel updated this revision to Diff 275705. Jul 6 2020, 7:23 AM
spatel marked an inline comment as done.

Patch updated:
Enable transform for XOP targets.

RKSimon accepted this revision. Jul 7 2020, 9:32 AM

LGTM - cheers

llvm/lib/Target/X86/X86ISelLowering.cpp:40269

unsigned i = 0, e = VT.getVectorNumElements(); i != e; ++i

This revision is now accepted and ready to land. Jul 7 2020, 9:32 AM
spatel marked an inline comment as done. Jul 8 2020, 5:15 AM
This revision was automatically updated to reflect the committed changes.