This is an archive of the discontinued LLVM Phabricator instance.

[PowerPC] Fix a performance bug for PPC::XXSLDWI.
ClosedPublic

Authored by jtony on May 15 2017, 7:05 PM.

Details

Summary
There are some VectorShuffle Nodes in SDAG which can be selected to XXSLDWI instruction,  this patch recognizes them and does the selection/mapping to improve the PPC performance.

Diff Detail

Event Timeline

jtony created this revision.May 15 2017, 7:05 PM
nemanjai edited edge metadata.May 16 2017, 12:06 PM

Although none of these comments are about something major that needs to change, I'd like to see an updated patch with all the comments addressed before approving it.

lib/Target/PowerPC/PPCISelLowering.cpp
1577

The name sounds weird. Perhaps isWordShuffleMask()?

1667

Although I appreciate the effort, we can't really have comments enumerating all the possible shuffles that a specific instruction can take. As you can imagine, doing so for all the different instruction handling specific shuffle types would blow up into thousands of lines of comments.

Besides, the enumeration does seem to be missing some. For example a shuffle mask of 3,0,1,2 does not appear to be listed and should be valid (i.e. shift a vector concatenated with itself left by one word).

1731

As general guidance, do not modify output parameters unless the function succeeds.

1746

It seems kind of arbitrary to assert this for just one index. It should be all or none.

test/CodeGen/PowerPC/vec_sldwi.ll
2

If you're splitting commands on multiple lines, please keep the lines to 80 columns.

10

I don't think you should be splitting the lines of IR in the test cases. You can leave them the way they come out of clang. Also, it's much more natural to look at the test cases if the types of vectors you pass in are <4 x i32> since it's more concise and is a nice match for the instruction. I don't really understand the use of <16 x i8> with all the clutter it introduces.

nemanjai requested changes to this revision.May 16 2017, 12:07 PM
This revision now requires changes to proceed.May 16 2017, 12:07 PM
timshen added inline comments.
lib/Target/PowerPC/PPCISelLowering.cpp
1667

On the bright side, this can be turned into really good test cases.

You can use llvm/util/update_llc_test_checks.py to generate CHECK lines.

1689

FWIW GCC gives:
0 1 2 3
3 0 1 2
2 3 0 1
1 2 3 0

jtony updated this revision to Diff 99494.May 18 2017, 2:25 PM
jtony edited edge metadata.
jtony marked 8 inline comments as done.

Address the code review comments from Nemanja and Tim Shen.

lib/Target/PowerPC/PPCISelLowering.cpp
1667

Yeah, I will remove these excessive comments here. But we do have the 3,0,1,2 mask here (case 3, case 6).

1689

Yes, you are right Tim, It turns out the implementation is correct (ShiftElts = IsLE ? (4 - M0) % 4 /* Case 3 */ : M0 /* Case 6 */;), but the comments is not updated to match that.

1731

I put it there to make sure that parameter (ShiftElts) is always initialized even the check return false. I was told to always initialize a variable to avoid weird behavior, but I guess in this case it should be OK to not initialize shiftElts when checking fails, since in that case we won't do anything in the caller. I will remove that initialization thing at the beginning.

nemanjai accepted this revision.May 23 2017, 2:41 PM

The remaining comments are minor. Please address them on the commit.

LGTM but you may want to give the other reviewers a bit of time to take a look as well.

lib/Target/PowerPC/PPCISelLowering.cpp
1669

I think this can safely be an assert.

1685

This can be converted to an assert. Something like:
assert(M0 < 4 && "Indexing into an undef vector?");

1701

A comment to clarify this is needed. Perhaps something like:

// Input vectors don't need to be swapped if the leading element
// of the result is one of the 3 left elements of the second vector
// (or if there is no shift to be done at all).

Also, add similar comments to the BE code.

1704
// Input vectors need to be swapped if the leading element
// of the result is one of the 3 left elements of the first vector
// (or if we're shifting by 4 - thereby simply swapping the vectors).
This revision is now accepted and ready to land.May 23 2017, 2:41 PM
echristo accepted this revision.May 23 2017, 5:41 PM

I have one inline comment here, but it can be addressed in a follow up patch.

Thanks!

-eric

lib/Target/PowerPC/PPCISelLowering.cpp
1668

I'd like to try to pull out the bool operands from these routines. If not in this patch, but in a follow up patch if possible.

jtony closed this revision.May 25 2017, 8:47 AM
jtony marked 7 inline comments as done.
jtony added inline comments.
lib/Target/PowerPC/PPCISelLowering.cpp
1668

An issue is created for pulling out the bool operands, here is the link: https://github.ibm.com/llvm/llvm-on-power/issues/392

echristo added inline comments.May 25 2017, 4:35 PM
lib/Target/PowerPC/PPCISelLowering.cpp
1668

FWIW it's not very helpful since I don't have an IBM address :)