This is an archive of the discontinued LLVM Phabricator instance.

[PowerPC] Fix MI peephole optimization for splats
ClosedPublic

Authored by vddvss on Oct 27 2019, 6:18 PM.

Details

Summary

This patch fixes an issue where the PPC MI peephole optimization pass incorrectly remove a vector swap.

Specifically, the pass can combine a splat/swap to a splat/copy. It uses TargetRegisterInfo::lookThruCopyLike to determine that the operands to the splat are the same. However, the current logic only compares the operands based on register numbers. In the case where the splat operands are ultimately feed from the same physical register, the pass can incorrectly remove a swap if the feed register for one of the operands has been clobbered.

This patch adds a check to ensure that the registers feeding are both virtual registers or the operands to the splat or swap are both the same register.

Here is an example in pseudo-MIR of what happens in the test cased added in this patch:

Before PPC MI peephole optimization:

%arg = XVADDDP %0, %1

$f1 = COPY %arg.sub_64
call double rint(double)
%res.first = COPY $f1
%vec.res.first = SUBREG_TO_REG 1, %res.first, %subreg.sub_64

%arg.swapped = XXPERMDI %arg, %arg, 2
$f1 = COPY %arg.swapped.sub_64
call double rint(double)
%res.second = COPY $f1

%vec.res.second = SUBREG_TO_REG 1, %res.second, %subreg.sub_64
%vec.res.splat = XXPERMDI %vec.res.first, %vec.res.second, 0
%vec.res = XXPERMDI %vec.res.splat, %vec.res.splat, 2
; %vec.res == [ %vec.res.second[0], %vec.res.first[0] ]

After optimization:

; ...
%vec.res.splat = XXPERMDI %vec.res.first, %vec.res.second, 0
; lookThruCopyLike(%vec.res.first) == lookThruCopyLike(%vec.res.second) == $f1
; so the pass replaces the swap with a copy:
%vec.res = COPY %vec.res.splat
; %vec.res == [ %vec.res.first[0], %vec.res.second[0] ]

As best as I can tell, this has occurred since r288152, which added support for lowering certain vector operations to direct moves in the form of a splat.

Diff Detail

Event Timeline

vddvss created this revision.Oct 27 2019, 6:18 PM
lkail added a reviewer: Restricted Project.Oct 27 2019, 7:42 PM
lkail added a subscriber: lkail.Oct 27 2019, 9:38 PM
lkail added inline comments.
llvm/lib/Target/PowerPC/PPCMIPeephole.cpp
401

Can we just simply check if FeedReg1 and FeedReg2 are virtual registers?

nemanjai requested changes to this revision.Oct 28 2019, 6:39 AM

Thank you for fixing this. I agree with Kai that we should simplify the logic here. We should not make assumptions about physical registers in general so we should just exit early here if either of the input registers are not virtual registers.

llvm/lib/Target/PowerPC/PPCMIPeephole.cpp
390

Yes as Kai pointed out, we should just add

if (!Register::isVirtualRegister(FeedReg1) ||
    !Register::isVirtualRegister(FeedReg2))
  break;
This revision now requires changes to proceed.Oct 28 2019, 6:39 AM
vddvss updated this revision to Diff 226661.Oct 28 2019, 7:30 AM
vddvss edited the summary of this revision. (Show Details)

Updated revision. Note, I also had to update test/CodeGen/PowerPC/load-shuffle-and-shuffle-store.ll since this will omit the optimization in some of those cases.

lkail added inline comments.Oct 30 2019, 2:22 AM
llvm/test/CodeGen/PowerPC/load-shuffle-and-shuffle-store.ll
399 ↗(On Diff #226661)

Maybe we can enhance it by checking if DefMI->getOperand(1).getReg() and DefMI->getOperand(2).getReg() are VRegs and are equal, before invoking lookThruCopyLike.

vddvss added inline comments.Oct 30 2019, 8:11 AM
llvm/test/CodeGen/PowerPC/load-shuffle-and-shuffle-store.ll
399 ↗(On Diff #226661)

So that does work, although at the cost of some added complexity. I will upload a new revision shortly.

vddvss updated this revision to Diff 227116.Oct 30 2019, 8:46 AM
vddvss edited the summary of this revision. (Show Details)

Per @jsji's suggestion, this checks if the operands to DefMI are both the same virtual register, which enables optimization in more cases.

lkail added a comment.Oct 30 2019, 7:06 PM

LGTM. I'll wait for @nemanjai if he has additional comments.

Hmm... sorry to be going back and forth on this a little bit. I am trying to ensure we meet competing goals (simple code, ensure safety and do not overly pessimize the optimization).

I think the semantics should be:

  • If DefReg1 == DefReg2, this should be safe. Even if they were physical registers, it is not possible for the physical register to have different definitions at one place
  • If they are not equal, we look through copies and ensure that the ultimate source is the same virtual register

So it would be good to clearly document the conditions under which we want to do the transformation. Would it work if we changed the early exit to something like:

// If the two inputs are not the same register, check to see if they originate
// from the same virtual register after only copy-like instructions.
if (DefReg1 != DefReg2) {
  unsigned FeedReg1 = TRI->lookThruCopyLike(DefReg1, MRI);
  unsigned FeedReg2 = TRI->lookThruCopyLike(DefReg2, MRI);
  if (FeedReg1 != FeedReg2 || Register::isPhysicalRegister(FeedReg1))
    break;
}
vddvss updated this revision to Diff 227450.Nov 1 2019, 8:12 AM
vddvss edited the summary of this revision. (Show Details)

See attached for an updated diff. This also changes the name of FeedImmed to DefImmed to be congruent with DefReg[12].

No worries on the continued revisions. I want to to thank you all for your suggestions--they have definitely made the original patch a lot better.

Also, I noticed in the case PPC::XXPERMDI: block there are a couple of nested ifs without elses and checks on ifs that could simplified by converted to early breaks, albeit at the cost of polluting git blame with whitespace differences. Let me know if you are interested in a separate patch to clean some of that up.

lkail accepted this revision as: Restricted Project.Nov 10 2019, 5:32 PM

LGTM.

nemanjai accepted this revision.Nov 15 2019, 1:02 PM

LGTM. Thank you for your patience.

This revision is now accepted and ready to land.Nov 15 2019, 1:02 PM

Also, I noticed in the case PPC::XXPERMDI: block there are a couple of nested ifs without elses and checks on ifs that could simplified by converted to early breaks, albeit at the cost of polluting git blame with whitespace differences. Let me know if you are interested in a separate patch to clean some of that up.

Any refactoring patch that makes the code flow better or makes it more readable is always welcome and appreciated.

lkail added a comment.Dec 5 2019, 9:10 PM

Hi @vddvss do you need help landing this patch?

vddvss added a comment.Dec 6 2019, 5:15 AM

@lkail yes, new contributor here so if you can land it or tell me how to go about that? Thanks

This revision was automatically updated to reflect the committed changes.