Page MenuHomePhabricator

[PowerPC] Fix vector_shuffle combines when inputs are scalar_to_vector of differing types.
Needs ReviewPublic

Authored by amyk on Jul 25 2022, 7:23 AM.

Details

Reviewers
power-llvm-team
nemanjai
stefanp
Group Reviewers
Restricted Project
Summary

This patch fixes the combines for vector_shuffles when either or both of its
left and right hand side inputs are scalar_to_vector nodes.

Previously, when both left and right side inputs are scalar_to_vector nodes,
the current combine could not handle this situation, as the shuffle mask was
updated incorrectly. https://reviews.llvm.org/D127818 was a temporary solution
to this issue. Now, not only does this patch aim to resolve the previous issue the
of incorrect shuffle mask adjustments respectively, it also updates any test cases
that are affected by this change.

Diff Detail

Event Timeline

amyk created this revision.Jul 25 2022, 7:23 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 25 2022, 7:23 AM
amyk requested review of this revision.Jul 25 2022, 7:23 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 25 2022, 7:23 AM
amyk updated this revision to Diff 452354.Aug 12 2022, 7:00 PM

Rebase patch based on new tests added within D130485.

Can you also comment on whether this was thoroughly tested on both little endian and big endian systems (bootstrap, test-suite, SPEC, additional internal tests).

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
14917–14918

I don't follow why we need these here. They both seem to only be needed in the respective conditions (i.e. depending on whether the LHS/RHS are scalar_to_vector nodes). And within those conditional blocks, they are reset before they're used.

So why do we need to define them here and initialize them to the width of a vector?

llvm/test/CodeGen/PowerPC/v16i8_scalar_to_vector_shuffle.ll
267

The code for this one gets worse on big endian. Do we know why?

347

The code for this one gets worse on big endian. Do we know why?

578

The code for this one gets worse on big endian. Do we know why?

659

The code for this one gets worse on big endian. Do we know why?

1431

The code for this one gets worse on big endian. Do we know why?

1658

The code for this one gets worse on big endian. Do we know why?

There are probably a bunch of other places. Can you please review what is happening there? I'll stop adding further similar comments.

llvm/test/CodeGen/PowerPC/v4i32_scalar_to_vector_shuffle.ll
123 ↗(On Diff #452354)

The code generated for this one gets worse on all subtargets. Do we know why?

amyk updated this revision to Diff 479927.Dec 4 2022, 9:15 AM

Discussed this patch with Nemanja outside of the review.

Many of the cases where we get worse codegen on BE (and some on LE as well) within the test cases occur either when the shuffle mask is explicitly asking for undefined vector elements from the original scalar_to_vector, or when the values in the vector are partially defined in the cases where the scalar_to_vector element size is smaller than the vector_shuffle element size. These cases primarily seem to appear within these contrived test cases, rather than real code. As a result, I've rebased this patch and addressed some review comments to move forward with this patch.

amyk added a comment.Dec 4 2022, 9:18 AM

@nemanjai Yeah, at the time of posting the patch, I did do little endian and big endian system tests as you mentioned. Since it's been awhile and I need update the patch, I'd like to retest these particular runs with this patch.

llvm/lib/Target/PowerPC/PPCISelLowering.cpp
14917–14918

Ya, I think I had that in the beginning and meant to remove it prior to putting up the patch but didn't realize that it was still left in there. I don't believe I need it either, so I've removed it.