This is an archive of the discontinued LLVM Phabricator instance.

[X86] Teach LowerBUILD_VECTOR to recognize pair-wise splats of 32-bit elements and use a 64-bit broadcast
ClosedPublic

Authored by craig.topper on Jan 15 2018, 1:57 PM.

Details

Summary

If we are splatting pairs of 32-bit elements, we can use a 64-bit broadcast to get the job done.
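
To illustrate why this works, here is a minimal standalone sketch, not the code from the patch. It models a v8i32 build vector as a plain array and checks that a pair-wise splat of 32-bit elements, reinterpreted as 64-bit lanes, is an ordinary splat. The type aliases and the function names splatPair, bitcastToI64 and isSplat64 are hypothetical and exist only for this example.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-ins for v8i32 and v4i64 (256 bits each).
using V8i32 = std::array<std::uint32_t, 8>;
using V4i64 = std::array<std::uint64_t, 4>;

// Build the pair-wise splat <a, b, a, b, a, b, a, b>.
V8i32 splatPair(std::uint32_t a, std::uint32_t b) {
  V8i32 v{};
  for (std::size_t i = 0; i < v.size(); ++i)
    v[i] = (i % 2 == 0) ? a : b;
  return v;
}

// Reinterpret the same 256 bits as four 64-bit lanes (a bitcast).
V4i64 bitcastToI64(const V8i32 &v) {
  V4i64 out{};
  static_assert(sizeof(V4i64) == sizeof(V8i32), "same bit width");
  std::memcpy(out.data(), v.data(), sizeof(out));
  return out;
}

// True if every 64-bit lane equals lane 0, i.e. one 64-bit broadcast
// of lane 0 reproduces the whole vector.
bool isSplat64(const V4i64 &v) {
  for (std::uint64_t lane : v)
    if (lane != v[0])
      return false;
  return true;
}
```

Each adjacent (a, b) pair occupies exactly one 64-bit lane, so all four lanes hold the same bit pattern regardless of byte order.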

We could probably do this with other sizes too, for example four 16-bit elements. Or we could broadcast pairs of 16-bit elements using a 32-bit element broadcast. But I've left that as a future improvement.

I've also restricted this to AVX2 only because we can only broadcast loads under AVX.

Looks like we may still need a DAG combine for VBROADCAST + VZEXT_LOAD to fold the loads in insertelement-shuffle.ll and vector-shuffle-combining-xop.ll.

Diff Detail

Repository
rL LLVM

Event Timeline

craig.topper created this revision. Jan 15 2018, 1:57 PM
spatel added inline comments. Jan 16 2018, 8:27 AM
lib/Target/X86/X86ISelLowering.cpp
8118–8127 ↗(On Diff #129904)

Initialize the first 2 elements to simplify the code?

SmallVector<SDValue, 4> Ops({Op->getOperand(0), Op->getOperand(1)});
bool IsSplatPair = true;
for (unsigned i = 2; i != NumElems; ++i) {
  if (Ops[i % 2] != Op->getOperand(i)) {
    IsSplatPair = false;
    break;
  }
}

I think this would also be easier to read if it was split off into a helper function / lambda because you could just early return when you detect that it's not a splat pair.
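
A rough standalone sketch of the helper-with-early-return shape, under the assumption that plain ints can stand in for the SDValue operands. The name isSplatPair and its signature are hypothetical, not taken from the patch.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: returns true if Operands has the form
// <a, b, a, b, ...>. int stands in for SDValue here.
bool isSplatPair(const std::vector<int> &Operands) {
  // Need at least one full pair and an even element count.
  if (Operands.size() < 2 || Operands.size() % 2 != 0)
    return false;
  // Compare every later element against element 0 or 1 by parity,
  // returning as soon as a mismatch is found.
  for (std::size_t i = 2; i != Operands.size(); ++i)
    if (Operands[i % 2] != Operands[i])
      return false;
  return true;
}
```

The early return removes the need for the IsSplatPair flag and the break. Note that this sketch also accepts an ordinary splat such as <a, a, a, a>, since that is a pair splat with a == b.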

8132 ↗(On Diff #129904)

The VTs are confusingly named. ExtVT is the current vector element VT (way back at line 7891). Can we rename things to make this clearer as a preliminary clean-up (ExtVT -> EltVT)?

RKSimon added inline comments. Jan 16 2018, 9:45 AM
test/CodeGen/X86/avx512-intrinsics-fast-isel.ll
490 ↗(On Diff #129904)

We'd gain from INSERT_VECTOR_ELT support being added to EltsFromConsecutiveLoads - merging multiple consecutive scalar loads into a single scalar load+insert into a zero/undef vector.

Address Sanjay's comments.

spatel accepted this revision. Jan 17 2018, 10:38 AM

LGTM.

This revision is now accepted and ready to land. Jan 17 2018, 10:38 AM
This revision was automatically updated to reflect the committed changes.