Page MenuHomePhabricator

[Power9] Implement new vsx instructions: insert, extract, test data class, min/max, reverse, permute, splat
ClosedPublic

Authored by cycheng on Feb 2 2016, 10:53 PM.

Details

Summary

This change implements the following vsx instructions:

  • Scalar Insert/Extract
    1. xsiexpdp xsiexpqp xsxexpdp xsxsigdp xsxexpqp xsxsigqp
  • Vector Insert/Extract
    1. xviexpdp xviexpsp xvxexpdp xvxexpsp xvxsigdp xvxsigsp
    2. xxextractuw xxinsertw
  • Scalar/Vector Test Data Class
    1. xststdcdp xststdcsp xststdcqp
    2. xvtstdcdp xvtstdcsp
  • Maximum/Minimum
    1. xsmaxcdp xsmaxjdp
    2. xsmincdp xsminjdp
  • Vector Byte-Reverse/Permute/Splat
    1. xxbrd xxbrh xxbrq xxbrw
    2. xxperm xxpermr
    3. xxspltib

30 instructions

Diff Detail

Event Timeline

cycheng updated this revision to Diff 46747.Feb 2 2016, 10:53 PM
cycheng retitled this revision from to [Power9] Implement new vsx instructions: insert, extract, test data class, min/max, reverse, permute, splat.
cycheng updated this object.
cycheng added reviewers: hfinkel, kbarton, nemanjai, tjablin.
cycheng added a subscriber: llvm-commits.
nemanjai edited edge metadata.Feb 4 2016, 12:14 PM

I just have a general comment about this approach we're using for implementing new instructions for assembly and disassembly only. Perhaps it would not be a terrible idea for us to add inline asm test cases for each of them. For example:

%0 = call <2 x i64> asm "xxspltib $0, $1", "=^wa,i"(i32 44) #1, !srcloc !1

should emit something like

xxspltib 0, 44

Or would this not really test anything?

lib/Target/PowerPC/PPCInstrFormats.td
750

I think these comments are quite useful in quickly identifying fields of the instruction. How come such comments do not appear consistently on all the new classes introduced?

lib/Target/PowerPC/PPCInstrVSX.td
1777

These all operate on vectors rather than scalars. This should be the vsrc register class.

1782

This should probably have the ins/outs dags parameterized since it appears to be used by both vector (xviexpdp, xviexpsp, xxperm, etc.) and scalar (xsmax.., xsmin..., etc.).
So the vector ones should probably use vsrc and scalar ones vsfrc.

1800

Perhaps I am missing something, but it seems that (at least the target) should be in the vsrc register class since the instruction can put the word element 1 (BE) of the source into the target starting at any byte (0-12 BE order)? In fact, I would argue that both operands should be vsrc register class since this type of operation makes more sense for vectors than it does for scalars (especially floating point scalars).

Also, isn't the immediate in this instruction a 4-bit immediate? I realize that this isn't necessarily relevant since we will ensure that the immediate is <=12 so it will never set the reserved bit, but it still makes the instruction definition clear and ensures that even without such checks in place, we will not generate illegal instructions.

1805

Same comment applies here as above but in reverse order.

1855

Again, this is a vector instruction and should not be using vsfrc.

lib/Target/PowerPC/README_P9.txt
3

How come this isn't populated with the instructions added here? I think that it would be pertinent to put in as much of the info at this time as possible (since you're looking at the instructions in some amount of detail). Perhaps something like this for all the new instructions:

  • Has likely SDAG match (and if there's an obvious candidate SDAG, make the suggestion)
  • Likely needs an intrinsic
  • If you know that we want to expose some builtin for it, please note that
  • Miscellaneous aspects we may or may not need to account for (i.e. instructions that modify the FPSCR)

P.S. I really think this file needs to have each instruction listed so that when we start working on full support for these (not just ASM), we can pick them off one at a time.

Thanks for your careful feedback :) I'll fix these issues you mentioned:

  • Review all using of vsfrc, vssrc, vsrc, because I misunderstand its usage. It looks like
    1. vssrc: for VSX scalar single precision
    2. vsfrc:
      • for VSX scalar double precision fp, or single word, double word int
      • move instruction use vsfrc
    3. vsrc: for VSX vector instructions
  • Make the instruction definition clear for XX2_RD6_UIM5_RS6: 5-bit immediate -> 4-bit immediate Review all other new instructions that have similar definition.
  • Add format comment for each new form
  • README_P9.txt: list each new instructions for
    1. Has likely SDAG match?
    2. Needs an intrinsic?
    3. Needs builtin?
    4. Miscellaneous notes
  • Inline assembly test case: Need discuss with Kit

Thanks for summarizing and addressing the comments. One of the key aspects of the vsfrc/vssrc/vsrc classes is whether the entire register is used or just the left 64 bits. So vector instructions operate on the entire register and we use the vsrc register class whereas scalar ones operate on the scalar portion of the register.
So your summary is correct with respect to the usage of the register classes.

P.S. There are no operations on integer scalars in vector registers - the moves are used only for:

  1. Conversion between integers and FP
  2. Bitcasting between integers and FP
  3. Freeing up a GPR by moving the value into a VSR (to be moved back when needed)

Thanks a lot :")

By the way, patch D16919 follows the approach you mentioned here, and I will review D16842, D16709, D16110 again, and update new ones later. I will leave next week, and will be back on Feb 15.

kbarton accepted this revision.Mar 24 2016, 9:32 PM
kbarton edited edge metadata.

LGTM

This revision is now accepted and ready to land.Mar 24 2016, 9:32 PM

Committed r264567

cycheng closed this revision.Mar 31 2016, 5:24 PM