This is an archive of the discontinued LLVM Phabricator instance.

[X86] Widen i16 shuffle masks if vector width < 512 even with BWI
ClosedPublic

Authored by goldstein.w.n on Feb 10 2023, 3:32 PM.

Details

Summary

{v}blend{d|ps|pd} is preferable to {v}blendw, so widen the i16 shuffle
mask when possible so that we can match it.
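
To illustrate what "widening" a shuffle mask means here, the following is a minimal standalone sketch. It is not the code from this patch and not LLVM's implementation. The names `widenMaskByTwo`, `isBlendMask`, and `kUndef` are hypothetical and chosen only for this example. It assumes the usual convention that a two-input shuffle mask over N elements uses indices 0..N-1 for the first input, N..2N-1 for the second, and -1 for an undefined lane.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Sentinel for an undefined lane (assumed "-1 means undef" convention).
constexpr int kUndef = -1;

// Hypothetical helper: try to express a mask over N narrow elements as a
// mask over N/2 elements of twice the width. Each adjacent pair must select
// an aligned, consecutive pair of narrow elements (or be undefined).
std::optional<std::vector<int>> widenMaskByTwo(const std::vector<int> &Mask) {
  assert(Mask.size() % 2 == 0 && "mask must have an even number of elements");
  std::vector<int> Wide;
  Wide.reserve(Mask.size() / 2);
  for (std::size_t I = 0; I < Mask.size(); I += 2) {
    int Lo = Mask[I], Hi = Mask[I + 1];
    if (Lo == kUndef && Hi == kUndef) {
      Wide.push_back(kUndef);          // whole wide lane is undefined
    } else if (Lo != kUndef && Lo % 2 == 0 && (Hi == kUndef || Hi == Lo + 1)) {
      Wide.push_back(Lo / 2);          // aligned pair (Lo, Lo + 1)
    } else if (Lo == kUndef && Hi % 2 == 1) {
      Wide.push_back(Hi / 2);          // low half undefined, high half aligned
    } else {
      return std::nullopt;             // pair straddles wide elements
    }
  }
  return Wide;
}

// Hypothetical helper: a blend keeps every lane in place, taking lane I
// from either the first input (index I) or the second (index I + N).
bool isBlendMask(const std::vector<int> &Mask) {
  int N = static_cast<int>(Mask.size());
  for (int I = 0; I < N; ++I)
    if (Mask[I] != kUndef && Mask[I] != I && Mask[I] != I + N)
      return false;
  return true;
}
```

Under these assumptions, the 8 x i16 blend mask {0, 1, 10, 11, 4, 5, 14, 15} widens to the 4 x i32 mask {0, 5, 2, 7}, which is still a blend and could therefore be matched as a dword blend. A mask such as {0, 9, 2, 11, 4, 13, 6, 15} alternates inputs within each pair, cannot be widened, and would stay a word blend.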

Event Timeline

goldstein.w.n created this revision. Feb 10 2023, 3:32 PM
Herald added a project: Restricted Project. Feb 10 2023, 3:32 PM
goldstein.w.n requested review of this revision. Feb 10 2023, 3:32 PM

lowerVECTOR_SHUFFLE should always widen a shuffle as much as possible - my guess is canCombineAsMaskOperation is blocking it on AVX512BW targets - maybe address it there instead?

Move logic to canCombineAsMaskOperation.

goldstein.w.n retitled this revision from [X86] Try harder to convert `{v}blendw` -> `{v}blend{d|ps}` to [X86] Widen i16 shuffle masks if vector width < 512 even with BWI.
goldstein.w.n edited the summary of this revision.

lowerVECTOR_SHUFFLE should always widen a shuffle as much as possible - my guess is canCombineAsMaskOperation is blocking it on AVX512BW targets - maybe address it there instead?

Done in V2.

RKSimon accepted this revision. Feb 13 2023, 6:43 AM

LGTM

This revision is now accepted and ready to land. Feb 13 2023, 6:43 AM
goldstein.w.n edited the summary of this revision.

Rebase

This revision was landed with ongoing or failed builds. Feb 26 2023, 10:12 AM
This revision was automatically updated to reflect the committed changes.