This is an archive of the discontinued LLVM Phabricator instance.

[PPC] Enable shuffling of VSX vectors
ClosedPublic

Authored by Carrot on Apr 11 2016, 3:42 PM.

Details

Reviewers
hfinkel
Summary

The current PPC backend enables shuffling for only a very limited set of vector types. The default behavior for a vector with more than 2 elements is to move the data through memory, which usually triggers store forwarding and is extremely slow on power.

This simple patch explicitly enables shuffling of vectors if VSX is available. For the testcase in the bug entry, the performance is improved by 6.6x on power8. The number of instructions is reduced from 756 to 406.

Event Timeline

Carrot updated this revision to Diff 53329.Apr 11 2016, 3:42 PM
Carrot retitled this revision to [PPC] Enable shuffling of VSX vectors.
Carrot updated this object.
Carrot added a reviewer: hfinkel.
Carrot added a subscriber: llvm-commits.

I forgot to mention the related bug entry is https://llvm.org/bugs/show_bug.cgi?id=27078.

hfinkel added inline comments.Apr 26 2016, 9:11 AM
lib/Target/PowerPC/PPCISelLowering.cpp
11928

I agree that this makes sense. I was going to ask that you combine this with the v2i64 logic above and add appropriate checks for VSX data types. However, since this shouldn't have any effect on scalarized types, it is not clear that the type checks here are actually useful. Maybe just simplify this to read:

if (Subtarget.hasVSX() || Subtarget.hasQPX())
  return true;

return TargetLowering::shouldExpandBuildVectorWithShuffles(VT, DefinedValues);

Carrot updated this revision to Diff 55109.Apr 26 2016, 3:28 PM
Carrot marked an inline comment as done.
hfinkel accepted this revision.Apr 26 2016, 3:30 PM
hfinkel edited edge metadata.
hfinkel added inline comments.
lib/Target/PowerPC/PPCISelLowering.cpp
11928

You can remove this too:

if (VT == MVT::v2i64)
  return Subtarget.hasDirectMove(); // Don't need stack ops with direct moves

With that, LGTM.

This revision is now accepted and ready to land.Apr 26 2016, 3:30 PM
Carrot added inline comments.Apr 27 2016, 9:10 AM
lib/Target/PowerPC/PPCISelLowering.cpp
11928

Removing this line will cause test/CodeGen/PowerPC/vsx.ll:test80 to fail. This statement uses Subtarget.hasDirectMove() to decide whether shuffling is used when building a vector from 2 integers. If a direct move instruction is available (power8), the integer values can be moved into vector registers and merged. If there is no direct move instruction (power7), shuffling is not used: the integer values can only be stored to memory and then loaded into a vector register.
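
As a rough illustration of the decision discussed in this thread, here is a minimal, self-contained C++ sketch. It is not the committed code. MockSubtarget, VecType, and defaultShouldExpand are hypothetical stand-ins for the PPC subtarget, LLVM's MVT types, and TargetLowering::shouldExpandBuildVectorWithShuffles. The fallback rule (shuffle only when fewer than 3 values are defined) is an assumption based on the summary's description of the default behavior.

```cpp
// Hypothetical stand-in for the subtarget feature queries named in the review.
struct MockSubtarget {
  bool VSX = false;
  bool QPX = false;
  bool DirectMove = false;
  bool hasVSX() const { return VSX; }
  bool hasQPX() const { return QPX; }
  bool hasDirectMove() const { return DirectMove; }
};

// Hypothetical, reduced set of vector types; the real code works on MVT/EVT.
enum class VecType { v2i64, v4i32, v4f32, v2f64 };

// Assumed model of the generic fallback: shuffles are used only for
// builds with fewer than 3 defined values.
bool defaultShouldExpand(unsigned DefinedValues) { return DefinedValues < 3; }

// Sketch of the combined logic: keep the v2i64 special case, then
// enable shuffles whenever VSX or QPX is available.
bool shouldExpandBuildVectorWithShuffles(const MockSubtarget &ST, VecType VT,
                                         unsigned DefinedValues) {
  // Two 64-bit integers can be merged in registers only with direct moves.
  if (VT == VecType::v2i64)
    return ST.hasDirectMove();

  // With VSX (or QPX), shuffling avoids the round trip through memory.
  if (ST.hasVSX() || ST.hasQPX())
    return true;

  // Otherwise defer to the (assumed) generic behavior.
  return defaultShouldExpand(DefinedValues);
}
```

Under this model, a configuration with VSX but no direct moves (as described for power7) still declines shuffles for v2i64, while accepting them for other builds such as a 4-element float vector.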

Eugene.Zelenko added a subscriber: Eugene.Zelenko.

Committed in r268064.