This is an archive of the discontinued LLVM Phabricator instance.

[ARM] Add MVE horizontal accumulation instructions.
ClosedPublic

Authored by miyuki on May 30 2019, 8:20 AM.

Details

Summary

This is the family of vector instructions that combine all the lanes
in their input vector(s), and output a value in one or two GPRs.

Diff Detail

Repository
rL LLVM

Event Timeline

simon_tatham created this revision.May 30 2019, 8:20 AM
ostannard added inline comments.
llvm/lib/Target/ARM/ARMInstrMVE.td
367 ↗(On Diff #202205)

There seems to be a lot of repetition here, could some of these classes be combined? For example most bits of t2VMINNM, t2VMINV, t2VMAXNMV and t2VMAX are the same. Could we also use multiclasses to avoid repeating the list of types for each of the MIN and MAX versions of these instructions?

Also, is the a pattern to whether these names end in "V" which I've missed, or could they be made more consistent?

382 ↗(On Diff #202205)

What is this comment adding?

487 ↗(On Diff #202205)

Again, the t2VMLADAV and t2VMLSDAV classes share most of their bits, and could be merged.

llvm/test/MC/ARM/mve-reductions.s
14 ↗(On Diff #202205)

We could avoid repeating all of these check lines by either adding a new check-prefix shared between the two runs, or moving the FP instructions to a separate file. As noted in previous reviews, we should also be checking the error message when the FP instructions are rejected.

dmgreen added inline comments.Jun 4 2019, 12:44 PM
llvm/lib/Target/ARM/ARMInstrMVE.td
307 ↗(On Diff #202205)

As a corollary to what Oliver was saying. The VADDV (or in VADDVs8no_acc) doesn't really take a Rda_src, I believe. At least the structure you would match from llvm would not. There's a perfectly good chance it's possible and I just don't know how to represent that in a pattern. But it seems more natural to make sure that VADDVs*no_acc don't require a Rda_src.

When I looked at this downstream, I just ended up splitting this into two classes.

Remastered patch to apply cleanly against current trunk.

miyuki added a subscriber: miyuki.Jun 11 2019, 5:53 AM
miyuki added inline comments.Jun 11 2019, 10:27 AM
llvm/lib/Target/ARM/ARMInstrMVE.td
307 ↗(On Diff #202205)

But it seems more natural to make sure that VADDVs*no_acc don't require a Rda_src.

This also pertains to VMLADAV, VMLSDAV, etc, right?

miyuki commandeered this revision.Jun 12 2019, 6:03 AM
miyuki added a reviewer: simon_tatham.
miyuki updated this revision to Diff 204274.Jun 12 2019, 6:17 AM
  • Fixed naming inconsistencies
  • Removed comments about undefined instructions
  • Factored out common parts of:
    • VMINV and VMAXV
    • VMINNMV and VMAXNMV
    • VMLADAV and VMLSDAV
    • VMLALDAV, VMLSLDAV (includes VRMLSLDAVH) and VRMLALDAVH
  • Fixed inputs and constraints of non-accumulating instructions
  • Split the assembler test into integer and floating point parts
miyuki marked 6 inline comments as done.Jun 12 2019, 6:18 AM

LGTM, if Oliver is also happy

ostannard added inline comments.Jun 13 2019, 5:35 AM
llvm/lib/Target/ARM/ARMInstrMVE.td
272 ↗(On Diff #204274)

This class is only instantiated once with each value of A, so I think it would be clearer to make these parameters and move the values down to the multiclass below.

421 ↗(On Diff #204274)

Again, this is quite hard to follow, I think it would be better to move the operand lists down to the multiclass.

miyuki marked 2 inline comments as done.Jun 13 2019, 5:50 AM
miyuki added inline comments.
llvm/lib/Target/ARM/ARMInstrMVE.td
421 ↗(On Diff #204274)

But this would mean duplicating everything (i.e., input operands and constraints) 4 times for each case. And 6 times for t2VMLALDAVBase. Wouldn't it be better to create separate subclasses for accumulating and non-accumulating versions?

miyuki updated this revision to Diff 204548.Jun 13 2019, 8:15 AM
  • Rebase
  • Avoid TableGen logic in instruction operands and constraints
miyuki marked 2 inline comments as done.Jun 13 2019, 8:16 AM
This revision is now accepted and ready to land.Jun 14 2019, 7:19 AM
This revision was automatically updated to reflect the committed changes.