Page MenuHomePhabricator

[PowerPC] Fold vector merges of swapped vectors to merge the opposite half
AcceptedPublic

Authored by nemanjai on Apr 7 2020, 6:52 AM.

Details

Reviewers
hfinkel
stefanp
Group Reviewers
Restricted Project
Summary

There are quite a few sequences that will produce a swap followed by a merge (low/high). This is mainly due to the SCALAR_TO_VECTOR on little endian requiring a swap (since all PPC instructions have a big endian bias). Due to its prevalence, these sequences often appear in fairly hot code, so we want to get rid of unnecessary swaps.

This patch does two things:

  1. Convert a merge (high/low) to the opposite merge if both inputs are swapped
  2. Convert a merge (high/low) to the opposite merge if one input is swapped and the other is symmetrical about the midpoint

Diff Detail

Unit TestsFailed

TimeTest
180 mslldb-unit.Host/_/HostTests::ConnectionFileDescriptorTest.TCPGetURIv6
Note: Google Test filter = ConnectionFileDescriptorTest.TCPGetURIv6 [==========] Running 1 test from 1 test case. [----------] Global test environment set-up.

Event Timeline

nemanjai created this revision.Apr 7 2020, 6:52 AM
nemanjai marked an inline comment as done.Apr 7 2020, 8:20 AM
nemanjai added inline comments.
llvm/lib/Target/PowerPC/PPCMIPeephole.cpp
291

MTVSRWS as well as splatting loads are missing.

nemanjai updated this revision to Diff 255824.Apr 7 2020, 2:54 PM

Pulled out the NFC update to PPCInstrVSX.td and rebased on top of that commit. Added the missing symmetrical opcodes.

stefanp accepted this revision as: stefanp.Wed, May 6, 12:57 PM
stefanp added a subscriber: stefanp.

This patch LGTM.
I have a few minor comments but that is it.

llvm/lib/Target/PowerPC/PPCMIPeephole.cpp
274

What about XXSLDWIs here? It's the same as XXSLDWI but we know the two inputs are the same.

313

Question:
I assume that trivial instructions such as: XXLEQV and XXLORC are not worth adding because we will never see them with identical inputs. Basically XXLEQV with MI->getOperand(1).getReg() == MI->getOperand(2).getReg() is the same as XXLEQVOnes so we will never see that by itself.

336

Is this last line there so that the compiler does not complain about having Copy defined but not used when we don't have debug enabled?

llvm/test/CodeGen/PowerPC/merge_of_swap.ll
8

nit:
I don't think we need the #0.
Also can you add a couple of sentences to explain that this test checks that we want to remove swaps that feed merges when we know the input to the swap is symmetrical.

This revision is now accepted and ready to land.Wed, May 6, 12:57 PM
amyk added a subscriber: amyk.Wed, May 6, 9:42 PM
amyk added inline comments.
llvm/lib/Target/PowerPC/PPCMIPeephole.cpp
317

I think it would be good to document this function, as other PPCMIPeephole:: functions are also documented.

658

Just curious but does vmrg[e | o]w apply here, too?

702

Minor nit, but maybe reorder the cases the same way as how they're presented in the cases above?

lei added a subscriber: lei.Mon, May 11, 2:26 PM
lei added inline comments.
llvm/lib/Target/PowerPC/PPCMIPeephole.cpp
287

Wondering if it would be good to add a flag in the td file to mark such instructions instead of listing them here. That way when we add future instructions to the td file that will fall into this case, it'll be easier to see that is needed.

701

maybe add a default: llvm_unreachable() here?