This is an archive of the discontinued LLVM Phabricator instance.

[PowerPC] Optimize VPERM and fix code order for swapping vector operands on LE
ClosedPublic

Authored by maryammo on Apr 24 2023, 11:07 AM.

Details

Summary

This patch reverts commit 7614ba0a5db8 to optimize VPERM when one of its
vector operands is XXSWAPD, similar to XXPERM. It also reorganizes the
little-endian swap code on LE, swapping the vector operand after
adjusting the mask operand. This ensures that the vector operand is
swapped at the correct point in the code, resulting in a valid
constant pool for the mask operand.

Diff Detail

Event Timeline

maryammo created this revision.Apr 24 2023, 11:07 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 24 2023, 11:07 AM
maryammo requested review of this revision.Apr 24 2023, 11:07 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 24 2023, 11:07 AM

Overall I think that this makes sense to me. I do have a couple of comments but mostly nits.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
10238

Now that the swap has been moved down should this also be moved down? Or at least part of it?
I ask because I've noticed that there are a number of tests where a new vmr has been introduced and I'm wondering if that's why.

10310

nit:
Since this guard is only around a smaller section of code I think you can take the isPPC64 part out and make it part of this guard.
Like:

if (isPPC64 && (V1HasXXSWAPD || V2HasXXSWAPD || Opcode == PPCISD::XXPERM))

That way if we are not PPC64 we won't need to check any of the rest.

llvm/test/CodeGen/PowerPC/xxperm-swap.ll
20

This is the kind of extra copy that I mean.

maryammo added inline comments.May 8 2023, 12:24 PM
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
10238

In the following code section (line 10286), it uses the NeedSwap to adjust the mask, so this single-use operand swap needs to happen before that.
You are right, there are new vmrs, they seem to be needed based on the new vector operand order in the xxperm instruction.

stefanp added inline comments.May 9 2023, 7:36 AM
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
10238

I see what you are saying. However, maybe something like this will work:

if ((!isLittleEndian && !V2->hasOneUse() && V1->hasOneUse()) ||
        (isLittleEndian && !V1->hasOneUse() && V2->hasOneUse())) {

I tried the above on one of the tests with the extra copy and it seems to get rid of it. I have not tested this extensively so please make sure that I have not broken anything.

maryammo added inline comments.Jul 5 2023, 12:14 PM
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
10238

I tried this by adding your code here and removing

// Only need to place items backwards in LE,
// the mask was properly calculated.
if (isLittleEndian)
  std::swap(V1, V2);

from line 10334. It regressed bunch of test cases, looking at them I realized we can not check the endianness here because this block of code is already guarded by

if (Subtarget.hasVSX() && Subtarget.hasP9Vector() &&
      (V1->hasOneUse() || V2->hasOneUse())) {

while this check if (isLittleEndian) is for all subtargets.

ping for review please.

stefanp added inline comments.Jul 20 2023, 2:05 PM
llvm/lib/Target/PowerPC/PPCISelLowering.cpp
10238

I think I didn't explain this properly.

I think that you should leave the code for:

// Only need to place items backwards in LE,
// the mask was properly calculated.
if (isLittleEndian)
  std::swap(V1, V2);

at the end. That accounts for Big vs. Little Endian and as you said should be for all subtargets.

What I was thinking is just changing this if statement here. This swap is only here to try to prevent copying registers. What I'm proposing is that you take into consideration whether or not there will be a final swap at the end when you check if you want to also do this swap here.

maryammo updated this revision to Diff 543611.Jul 24 2023, 10:08 AM

Take into account the little-endian swap when deciding on single-use operand to be second one

maryammo updated this revision to Diff 552124.Aug 21 2023, 1:54 PM

Address review comments

I think the patch mostly looks good. I have one more minor comment.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
10320

Since you have removed Opcode == PPCISD::XXPERM from other places in the code is it safe to remove it from here as well?
Sorry, I know that I suggested it be written like this but looking at how the condition has been removed everywhere else it makes sense to me to remove this one as well..

maryammo updated this revision to Diff 555388.Sep 1 2023, 8:20 AM

Address review comments

stefanp accepted this revision.Sep 12 2023, 10:58 AM

Thank you for fixing this!
LGTM!

This revision is now accepted and ready to land.Sep 12 2023, 10:58 AM
maryammo updated this revision to Diff 556692.Sep 13 2023, 12:18 PM

Rebase to top of trun, resolve conflicts and update tests

This revision was landed with ongoing or failed builds.Sep 13 2023, 1:00 PM
This revision was automatically updated to reflect the committed changes.