This is an archive of the discontinued LLVM Phabricator instance.

[ARM] Select MVE add and sub
ClosedPublic

Authored by dmgreen on Jun 13 2019, 5:36 AM.

Details

Summary

Simple patterns for MVEC integer vector add and sub instructions.

Initial code by David Sherwood

Diff Detail

Event Timeline

dmgreen created this revision.Jun 13 2019, 5:36 AM

Bit confusing this, which other patch(es) is this dependent upon?

Hello. This is probably dependent upon at least some of the uncommitted MVE instructions being added (I haven't checked if add/sub are in tree yet).

If by confusing you are referring to the changes in lib/Target/ARM/ARMISelLowering.cpp, then yes I can see that. Essentially I am breaking things so that we can work from the position that most targets are usually in - the default set of instructions are legal and we choose specific ones to mark as legal or not legal. As opposed to what the code was previously doing - marking all instructions as expand. No one will be using the MVE instructions yet (and the compiler shouldn't generate them until we enable that), so it's probably fine to not have this as 100% working at the moment.

Our downstream version looks something like this, as an illustration:

for (auto VT : iTypes) {
  addRegisterClass(VT, &ARM::QPRRegClass);
  setOperationAction(ISD::VECTOR_SHUFFLE, VT, Custom);
  setOperationAction(ISD::INSERT_VECTOR_ELT, VT, Custom);
  setOperationAction(ISD::EXTRACT_VECTOR_ELT, VT, Custom);
  setOperationAction(ISD::BUILD_VECTOR, VT, Custom);
  setOperationAction(ISD::MLOAD, VT, Legal);
  setOperationAction(ISD::MSTORE, VT, Legal);
  setOperationAction(ISD::MGATHER, VT, Legal);
  setOperationAction(ISD::SETCC, VT, Custom);
  setOperationAction(ISD::SHL, VT, Custom);
  setOperationAction(ISD::SRA, VT, Custom);
  setOperationAction(ISD::SRL, VT, Custom);
  setOperationAction(ISD::SIGN_EXTEND_INREG, VT, Legal);
  setOperationAction(ISD::ABS, VT, Legal);

  // No native support for these.
  setOperationAction(ISD::UDIV, VT, Expand);
  setOperationAction(ISD::SDIV, VT, Expand);
  setOperationAction(ISD::UREM, VT, Expand);
  setOperationAction(ISD::SREM, VT, Expand);

  // Pre and Post inc are supported
  for (unsigned im = (unsigned)ISD::PRE_INC;
       im != (unsigned)ISD::LAST_INDEXED_MODE; ++im) {
    setIndexedLoadAction(im, VT, Legal);
    setIndexedStoreAction(im, VT, Legal);
  }
}

I can change it to only mark the current supported set of nodes as legal if you things that's better. Eventually we will probably want to remove the for (every opcode) setOperationAction(Opc, VT, Expand); loops.

Also, forgot to mention. I was putting this up early to see how people thought about the general layout, before getting a lot of other patches together. The main point being that it puts all the patterns together at the end of the MVE.td file, as opposed to interleaving them with instructions or in the instruction definition patterns.

Simon already mentioned that we could put the patterns into the instructions if we can sort out things like default class parameters. That would obviously involve a bit of a rewrite, but if people think it's a better way to structure the patterns, I'm all ears.

Simon already mentioned that we could put the patterns into the instructions if we can sort out things like default class parameters.

I still haven't managed to form a definite opinion on whether that's better, though!

In favour of putting ISel patterns inside the instruction records they go with:

  • readability: all the information about an instruction is in the same place and easy to find
  • DRYness: when the instruction applies to half a dozen different types with suffixes like .f16, .s32 and the like, the pattern is automatically replicated across the same set of types, without having to write a separate repetition elsewhere (foreach or multiclass or whatever) and make sure to keep the two in sync.

The unusual problem with MVE is the predication. Almost all the vector instructions will ultimately want at least one ISel pattern corresponding to one of the ACLE intrinsics (or rather, to an LLVM intrinsic which the ACLE one will have been converted into by a combination of clang and a header file) which allows the predicated version of the instruction to be used by C code. But some instructions also want patterns corresponding to existing DAG node types, such as ordinary add on vector types – and you can't put both patterns inside the instruction record, because the Pattern field only has room for one. (It's a list<dag>, but the elements of the list are parts of the same pattern, not independent alternatives.)

So is it better to put the intrinsic patterns in the instructions? Or the patterns involving standard DAG nodes (if any)? Or neither, on the basis that if you can't have all the patterns in the instructions then you should instead keep them near each other? No idea, really.

(Also, there's that problem where operands whose default values are specified by virtue of an OperandWithDefaultOps class can't be overridden by ISel patterns. So the intrinsics for the predicated forms of the instructions can't currently be written as ISel patterns at all. But I think that's fixable: I have a draft patch to TableGen already that should remove that constraint, and I'll upload it once I find time to test it a bit more thoroughly.)

dmgreen updated this revision to Diff 206163.Jun 24 2019, 1:17 AM
dmgreen added a reviewer: SjoerdMeijer.

Rebase for new instruction names and moved the patterns to be near (but not in) the relevant instruction encodings. I think I like this way fine.

samparker added inline comments.Jun 26 2019, 7:16 AM
llvm/test/CodeGen/Thumb2/mve-simple-arith.ll
2

Is there a reason why we use mve.fp or integer tests?

The same test includes fadd and fsub later on. Changes incoming for these patches though, which tests both..

samparker accepted this revision.Jun 26 2019, 7:42 AM

Ok. LGTM

This revision is now accepted and ready to land.Jun 26 2019, 7:42 AM
dmgreen updated this revision to Diff 206670.Jun 26 2019, 7:51 AM
dmgreen removed a child revision: D63567: [ARM] Mve vector shuffles.
dmgreen edited the summary of this revision. (Show Details)

Simplified now that this sits on top of D63567. Includes +mve.fp and +mve,+fullfp16 tests.

This revision was automatically updated to reflect the committed changes.