This is an archive of the discontinued LLVM Phabricator instance.

[X86][AVX512] Support variable-index vector insertion on AVX512 targets
ClosedPublic

Authored by RKSimon on Feb 1 2021, 4:32 AM.

Details

Summary

With predicate masks, AVX512 can efficiently perform variable-index vector insertion with 2 broadcasts + 1 comparison, avoiding a lot of aliased memory traffic.
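
As a rough illustration of the idea, here is a scalar C++ model of the masked insert: broadcast the index, broadcast the new element, compare the index splat against the constant lane numbers {0,1,2,...} to form a mask, then select per lane. This is only a sketch of the semantics and not the code from this patch; the name insertViaMask and its signature are made up for the example, and on real hardware the loop would correspond to a vector compare into a mask register followed by a masked blend.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not from the patch): scalar model of
//   inselt Vec, Elt, Idx --> select (splat(Idx) == {0,1,2,...}) ? splat(Elt) : Vec
template <typename T, std::size_t N>
std::array<T, N> insertViaMask(const std::array<T, N> &vec, T elt,
                               std::uint64_t idx) {
  // The two broadcasts: one for the index, one for the inserted element.
  std::array<std::uint64_t, N> idxSplat;
  std::array<T, N> eltSplat;
  idxSplat.fill(idx);
  eltSplat.fill(elt);

  std::array<T, N> result{};
  for (std::size_t lane = 0; lane < N; ++lane) {
    // The comparison: true only in the lane whose number equals the index.
    bool mask = (idxSplat[lane] == lane);
    // The select: take the new element where the mask is set, else keep vec.
    result[lane] = mask ? eltSplat[lane] : vec[lane];
  }
  return result;
}
```

In this model an out-of-range index simply leaves the vector unchanged, since no lane matches. No lane is ever addressed through memory, which is what lets this approach avoid a store and reload of the vector.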

Event Timeline

RKSimon created this revision. Feb 1 2021, 4:32 AM
RKSimon requested review of this revision. Feb 1 2021, 4:32 AM
Herald added a project: Restricted Project. Feb 1 2021, 4:32 AM
spatel added inline comments. Feb 1 2021, 5:13 AM
llvm/test/CodeGen/X86/insertelement-var-index.ll
Lines 4–5

Worth adding a shared prefix for "AVX1or2", so we don't get so much duplication?

RKSimon updated this revision to Diff 320474. Feb 1 2021, 8:11 AM

Add AVX1OR2 check prefix

spatel accepted this revision. Feb 1 2021, 1:37 PM

LGTM - see inline for a couple of minors.

llvm/lib/Target/X86/X86ISelLowering.cpp
Lines 18829–18832

Could use DAG.getSplatBuildVector() for both of these for slightly less code.

Line 18839

We should have a code comment to describe the pattern:
// inselt N0, N1, N2 --> select (SplatN2 == {0,1,2...}) ? SplatN1 : N0

This revision is now accepted and ready to land. Feb 1 2021, 1:37 PM
This revision was landed with ongoing or failed builds. Feb 2 2021, 3:46 AM
This revision was automatically updated to reflect the committed changes.