This is an archive of the discontinued LLVM Phabricator instance.

[PPC64] Add vector quadword add/sub instructions for POWER8
ClosedPublic

Authored by kbarton on Apr 17 2015, 10:06 AM.

Details

Summary

This patch adds support for the vector quadword add/sub instructions introduced in POWER8:

  • vadduqm
  • vaddeuqm
  • vaddcuq
  • vaddecuq
  • vsubuqm
  • vsubeuqm
  • vsubcuq
  • vsubecuq

In addition to adding the instructions themselves, it also adds support for the v1i128 type for intrinsics (Intrinsics.td, Function.cpp, and IntrinsicEmitter.cpp).

Diff Detail

Event Timeline

kbarton updated this revision to Diff 23941.Apr 17 2015, 10:06 AM
kbarton retitled this revision from to [PPC64] Add vector quadword add/sub instructions for POWER8.
kbarton updated this object.
kbarton edited the test plan for this revision. (Show Details)
kbarton added reviewers: hfinkel, wschmidt, seurer, nemanjai.
kbarton added a subscriber: Unknown Object (MLST).
wschmidt edited edge metadata.Apr 17 2015, 1:16 PM

A couple of inline notes.

lib/Target/PowerPC/PPCCallingConv.td
65 ↗(On Diff #23941)

Looks like a line-too-long.

lib/Target/PowerPC/PPCISelLowering.cpp
4925

This is a lot more complicated than that. You've just asserted you can pass a 128-bit value in a single 64-bit register, which will bring about wailing and gnashing of teeth.

An i128 needs to be passed in two GPRs, and furthermore you may need to first skip a GPR to ensure that the image of the i128 in the parameter save area is aligned on a 16-byte boundary. (If the parameter's address is taken, it will be forced to its home location in memory, and that must be properly aligned.) I've verified that in gcc we have rs6000_function_arg_advance_1 which certainly makes this adjustment, as does rs6000_function_arg.

I believe you can borrow some of the necessarily logic from the varargs handling of vector types, below; it has to pass varargs parameters in consecutive GPRs as well. However, I'm bothered that I don't see anything there for alignment. The ABI document is wishy-washy about this for varargs: "Alignment requirements such as those for vector types may require the va_list pointer to first be aligned before accessing a value." Note the "may." But GCC's varargs handling uses rs6000_function_arg_advance_1 as well, so I believe to be compatible we have to have the same alignment done here.

CCing Uli for his thoughts, since he spent a lot of his life in this ABI code. At one time this passed the gcc ABI compatibility tests, so maybe I'm missing something.

wschmidt added a subscriber: uweigand.

One more general comment -- you don't have any testing for the ABI changes. Also, the ABI changes should probably be a separate patch from the quadword instruction patch....sorry. :/

nemanjai added inline comments.Apr 17 2015, 1:34 PM
include/llvm/IR/IntrinsicsPowerPC.td
552

Do we not want these guarded for Power8 Altivec feature? It would seem that we don't want these available in the back end for CPU's that don't have the feature.

test/CodeGen/PowerPC/vec_add_sub_quadword.ll
10

I don't think all the registers are ABI-required registers. Probably ok to use a regex for the other(s). Just making the same comment I received previously and I think it works well. A regex can be something like {{[0-9]+}}.

It may or may not make sense, but you can check that the value being added is loaded into a register and that that register is used for the add (at least for the immediate). Although I don't know if that's a good idea for commutative operations. Maybe someone more knowledgeable can comment on that.

16
  1. Why the star at the end?
  2. I think that in a large test case, it helps to have CHECK-LABEL for every function so that if something fails, it is easier to find where the failure is. The tool won't go past a check if the previous one did not match.
72

Would it make sense to check that the right parameter is in the right register?

One more general comment -- you don't have any testing for the ABI changes. Also, the ABI changes should probably be a separate patch from the quadword instruction patch....sorry. :/

Agreed. I will refactor these changes and take a closer look at your comments in PPCISelLowering.cpp and submit a new patch for review.

kbarton added inline comments.Apr 20 2015, 8:43 AM
include/llvm/IR/IntrinsicsPowerPC.td
552

We don't have any other guards in this file, as far as I can tell.
I believe this is fine to define these intrinsics as is, and let other diagnostic messages determine whether they are valid for the target architecture.

test/CodeGen/PowerPC/vec_add_sub_quadword.ll
10

Agreed. Will change.

16

The star is a typo.
I've also added the check label for the first several functions. Good call.

72

I don't know how this would be done, short of actually running the test case and making sure the output is correct.

kbarton added inline comments.Apr 20 2015, 8:50 AM
test/CodeGen/PowerPC/vec_add_sub_quadword.ll
10

Actually, on second thought I think it's better to leave these tests as is.
The ABI specifies which registers the parameters to the function are passed in and given these are extremely short functions there is no reason to not reuse the same registers for the specific instructions being tested. So, by using the same registers as the parameters, we are testing to ensure no unnecessary register moves are being added to the code.

If these were more complicated test cases, then we would need to use regular expressions because we couldn't guarantee in all cases the same registers would be used.

kbarton updated this revision to Diff 26200.May 20 2015, 7:33 PM
kbarton updated this object.
kbarton edited edge metadata.

The changes related to the ABI have been refactored and put into a different patch.
This patch now contains only the 8 new instructions to be added, and the necessary support for the v1i128 type in the intrinsics code.

wschmidt accepted this revision.May 22 2015, 11:11 AM
wschmidt edited edge metadata.

LGTM. Sorry I missed this with the email server problem!

This revision is now accepted and ready to land.May 22 2015, 11:11 AM
kbarton closed this revision.May 25 2015, 8:54 AM

Committed revision 238144.