This is an archive of the discontinued LLVM Phabricator instance.

[AArch64][SVE] Asm: Support for gather LD1/LDFF1 (scalar + vector) load instructions.
Closed, Public

Authored by sdesmalen on Apr 24 2018, 10:42 AM.

Event Timeline

sdesmalen created this revision. Apr 24 2018, 10:42 AM
fhahn added inline comments. Apr 24 2018, 1:50 PM
lib/Target/AArch64/SVEInstrFormats.td
764

Thanks for restructuring this, Sander, it is easier to get a complete picture now! I think there are still some repetitive classes, though. It should be possible to have a single class dealing with the 32-bit and 64-bit gather & unsized operands, like the diff below. The parameters of sve_mem_gld would need documenting (and one or two might be implied by others).

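+// Shared base class for the gather LD1/LDFF1 encoding groups. The field
+// meanings below are inferred from the instantiations that follow:
+//   opc    - opcode bits, split across Inst{24-23} and Inst{14-13};
+//            opc{0} == 1 marks the first-faulting (LDFF1) forms, which
+//            define and use the FFR register.
+//   xs     - Inst{22}; 0 for the uxtw variants, 1 for the sxtw variants.
+//   scaled - Inst{21}; 1 for the scaled-offset variants.
+//   ns     - Inst{15}; 1 only for the 64-bit-offset (lsl) variants.
+//   type   - fixed Inst{31-25} pattern of the encoding group (0b1000010
+//            for the 32-bit gathers, 0b1100010 for the 64-bit gathers).
+//   sz     - destination vector register operand (Z_s or Z_d).
+//   zprext - vector offset operand, which encodes the extend/shift mode.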
+class sve_mem_gld<bits<4> opc, bit xs, bit scaled, bit ns, bits<7> type,
+                  RegisterOperand sz, string asm, RegisterOperand zprext>
+: I<(outs sz:$Zt), (ins PPR3bAny:$Pg, GPR64sp:$Rn, zprext:$Zm),
+  asm, "\t$Zt, $Pg/z, [$Rn, $Zm]",
+  "",
+  []>, Sched<[]> {
+  bits<3> Pg;
+  bits<5> Rn;
+  bits<5> Zm;
+  bits<5> Zt;
+  let Inst{31-25} = type;
+  let Inst{24-23} = opc{3-2};
+  let Inst{22}    = xs;
+  let Inst{21}    = scaled;
+  let Inst{20-16} = Zm;
+  let Inst{15}    = ns;
+  let Inst{14-13} = opc{1-0};
+  let Inst{12-10} = Pg;
+  let Inst{9-5}   = Rn;
+  let Inst{4-0}   = Zt;
+
+  let mayLoad = 1;
+  let Defs = !if(!eq(opc{0}, 1), [FFR], []);
+  let Uses = !if(!eq(opc{0}, 1), [FFR], []);
+}
+
+//===----------------------------------------------------------------------===//
+// SVE Memory - 32-bit Gather and Unsized Contiguous Group
+//===----------------------------------------------------------------------===//
+multiclass sve_mem_32b_gld_sv_32_scaled<bits<4> opc, string asm,
+                                        RegisterOperand sxtw_opnd,
+                                        RegisterOperand uxtw_opnd> {
+  def _UXTW_SCALED_REAL : sve_mem_gld<opc, 0, 1, 0, 0b1000010, Z_s, asm, uxtw_opnd>;
+  def _SXTW_SCALED_REAL : sve_mem_gld<opc, 1, 1, 0, 0b1000010, Z_s, asm, sxtw_opnd>;
+
+  def : InstAlias<asm # "\t$Zt, $Pg/z, [$Rn, $Zm]",
+                  (!cast<Instruction>(NAME # _UXTW_SCALED_REAL) ZPR32:$Zt, PPR3bAny:$Pg, GPR64sp:$Rn, uxtw_opnd:$Zm), 0>;
+  def : InstAlias<asm # "\t$Zt, $Pg/z, [$Rn, $Zm]",
+                  (!cast<Instruction>(NAME # _SXTW_SCALED_REAL) ZPR32:$Zt, PPR3bAny:$Pg, GPR64sp:$Rn, sxtw_opnd:$Zm), 0>;
+}
+
+multiclass sve_mem_32b_gld_vs_32_unscaled<bits<4> opc, string asm,
+                                          RegisterOperand sxtw_opnd,
+                                          RegisterOperand uxtw_opnd> {
+  def _UXTW_REAL : sve_mem_gld<opc, 0, 0, 0, 0b1000010, Z_s, asm, uxtw_opnd>;
+  def _SXTW_REAL : sve_mem_gld<opc, 1, 0, 0, 0b1000010, Z_s, asm, sxtw_opnd>;
+
+  def : InstAlias<asm # "\t$Zt, $Pg/z, [$Rn, $Zm]",
+                  (!cast<Instruction>(NAME # _UXTW_REAL) ZPR32:$Zt, PPR3bAny:$Pg, GPR64sp:$Rn, uxtw_opnd:$Zm), 0>;
+  def : InstAlias<asm # "\t$Zt, $Pg/z, [$Rn, $Zm]",
+                  (!cast<Instruction>(NAME # _SXTW_REAL) ZPR32:$Zt, PPR3bAny:$Pg, GPR64sp:$Rn, sxtw_opnd:$Zm), 0>;
+}
+
+//===----------------------------------------------------------------------===//
+// SVE Memory - 64-bit Gather Group
+//===----------------------------------------------------------------------===//
+
+multiclass sve_mem_64b_gld_sv_32_scaled<bits<4> opc, string asm,
+                                        RegisterOperand sxtw_opnd,
+                                        RegisterOperand uxtw_opnd> {
+  def _UXTW_SCALED_REAL : sve_mem_gld<opc, 0, 1, 0, 0b1100010, Z_d, asm, uxtw_opnd>;
+  def _SXTW_SCALED_REAL : sve_mem_gld<opc, 1, 1, 0, 0b1100010, Z_d, asm, sxtw_opnd>;
+
+  def : InstAlias<asm # "\t$Zt, $Pg/z, [$Rn, $Zm]",
+                  (!cast<Instruction>(NAME # _UXTW_SCALED_REAL) ZPR64:$Zt, PPR3bAny:$Pg, GPR64sp:$Rn, uxtw_opnd:$Zm), 0>;
+  def : InstAlias<asm # "\t$Zt, $Pg/z, [$Rn, $Zm]",
+                  (!cast<Instruction>(NAME # _SXTW_SCALED_REAL) ZPR64:$Zt, PPR3bAny:$Pg, GPR64sp:$Rn, sxtw_opnd:$Zm), 0>;
+}
+
+multiclass sve_mem_64b_gld_vs_32_unscaled<bits<4> opc, string asm,
+                                          RegisterOperand sxtw_opnd,
+                                          RegisterOperand uxtw_opnd> {
+  def _UXTW_REAL : sve_mem_gld<opc, 0, 0, 0, 0b1100010, Z_d, asm, uxtw_opnd>;
+  def _SXTW_REAL : sve_mem_gld<opc, 1, 0, 0, 0b1100010, Z_d, asm, sxtw_opnd>;
+
+  def : InstAlias<asm # "\t$Zt, $Pg/z, [$Rn, $Zm]",
+                  (!cast<Instruction>(NAME # _UXTW_REAL) ZPR64:$Zt, PPR3bAny:$Pg, GPR64sp:$Rn, uxtw_opnd:$Zm), 0>;
+  def : InstAlias<asm # "\t$Zt, $Pg/z, [$Rn, $Zm]",
+                  (!cast<Instruction>(NAME # _SXTW_REAL) ZPR64:$Zt, PPR3bAny:$Pg, GPR64sp:$Rn, sxtw_opnd:$Zm), 0>;
+}
+
+multiclass sve_mem_64b_gld_sv2_64_scaled<bits<4> opc, string asm,
+                                         RegisterOperand zprext> {
+  def _SCALED_REAL : sve_mem_gld<opc, 1, 1, 1, 0b1100010, Z_d, asm, zprext>;
+
+  def : InstAlias<asm # "\t$Zt, $Pg/z, [$Rn, $Zm]",
+                  (!cast<Instruction>(NAME # _SCALED_REAL) ZPR64:$Zt, PPR3bAny:$Pg, GPR64sp:$Rn, zprext:$Zm), 0>;
+}
+
+multiclass sve_mem_64b_gld_vs2_64_unscaled<bits<4> opc, string asm> {
+  def _REAL : sve_mem_gld<opc, 1, 0, 1, 0b1100010, Z_d, asm, ZPR64ExtLSL8>;
+
+  def : InstAlias<asm # "\t$Zt, $Pg/z, [$Rn, $Zm]",
+                  (!cast<Instruction>(NAME # _REAL) ZPR64:$Zt, PPR3bAny:$Pg, GPR64sp:$Rn, ZPR64ExtLSL8:$Zm), 0>;
+}
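
For reference, instantiations of these multiclasses from AArch64SVEInstrInfo.td would then look roughly like this (the opcode bits and operand-class names here are illustrative guesses, not taken from the patch):

defm GLD1W   : sve_mem_32b_gld_sv_32_scaled<0b1010, "ld1w", ZPR32ExtSXTW32, ZPR32ExtUXTW32>;
defm GLD1D   : sve_mem_64b_gld_sv2_64_scaled<0b1110, "ld1d", ZPR64ExtLSL64>;
defm GLDFF1D : sve_mem_64b_gld_vs2_64_unscaled<0b1111, "ldff1d">;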
sdesmalen updated this revision to Diff 143874. Apr 25 2018, 2:08 AM
  • Merged sve_mem_64b_gld_sv and sve_mem_64b_gld_sv2 instruction classes.
lib/Target/AArch64/SVEInstrFormats.td
764

Thanks for pointing out that there is some more simplification we can do; I'll keep this in mind for future SVE patches as well. (We initially had more classes, but as the operand classes were refactored over time we never got around to simplifying them, and I hadn't realised we can indeed merge some of these classes together now.)

There is a judgement call as to how much parameterisation makes sense while keeping the file readable. Personally I think it makes sense to draw the line at having separate classes for the different encoding groups as they are defined in the architecture specification; this should also make the classes easier to cross-reference with the spec.

I have updated the patch to combine the two 64-bit gather classes (lsl vs. uxtw/sxtw), while still maintaining separate base instruction classes for the different encoding groups. Do you think that approach is sufficient?
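
Roughly, the merged class folds the 64-bit-offset (lsl) forms into the single 64-bit base class via an extra bit parameter, along the lines of the sketch below (a reconstruction for illustration only; see the updated diff for the exact code):

class sve_mem_64b_gld_sv<bits<4> opc, bit xs, bit scaled, bit lsl, string asm,
                         RegisterOperand zprext>
: I<(outs Z_d:$Zt), (ins PPR3bAny:$Pg, GPR64sp:$Rn, zprext:$Zm),
  asm, "\t$Zt, $Pg/z, [$Rn, $Zm]",
  "",
  []>, Sched<[]> {
  bits<3> Pg;
  bits<5> Rn;
  bits<5> Zm;
  bits<5> Zt;
  let Inst{31-25} = 0b1100010;
  let Inst{24-23} = opc{3-2};
  let Inst{22}    = xs;
  let Inst{21}    = scaled;
  let Inst{20-16} = Zm;
  let Inst{15}    = lsl;      // 0: 32-bit sxtw/uxtw offsets, 1: 64-bit lsl offsets
  let Inst{14-13} = opc{1-0};
  let Inst{12-10} = Pg;
  let Inst{9-5}   = Rn;
  let Inst{4-0}   = Zt;

  let mayLoad = 1;
  let Defs = !if(!eq(opc{0}, 1), [FFR], []);  // first-faulting forms touch FFR
  let Uses = !if(!eq(opc{0}, 1), [FFR], []);
}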

fhahn added a comment. Apr 25 2018, 2:29 AM

Thanks for combining the 64-bit classes! LGTM

lib/Target/AArch64/SVEInstrFormats.td
767

Please add a comment here similar to the one for sve_mem_64b_gld_sv.

fhahn accepted this revision. Apr 25 2018, 2:30 AM
This revision is now accepted and ready to land. Apr 25 2018, 2:30 AM
This revision was automatically updated to reflect the committed changes.
sdesmalen marked an inline comment as done. Apr 26 2018, 1:23 AM