This is an archive of the discontinued LLVM Phabricator instance.

[AArch64][SVE] Add SVE intrinsic for LD1RQ
ClosedPublic

Authored by kmclaughlin on Mar 27 2020, 8:05 AM.

Details

Summary

Adds the following intrinsic for contiguous load & replicate:

  • @llvm.aarch64.sve.ld1rq

The LD1RQ intrinsic only needs the SImmS16XForm added by this
patch. The others (SImmS2XForm, SImmS3XForm & SImmS4XForm)
were added for consistency.

Diff Detail

Event Timeline

kmclaughlin created this revision.Mar 27 2020, 8:05 AM
Herald added a project: Restricted Project. · View Herald Transcript
andwar added a comment.Apr 2 2020, 4:16 AM

Cheers for working on this @kmclaughlin !

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
1421

[nit] In AArch64ISelLowering.h you added LD1RQ as the last load instruction, here it's the first load instruction.

11687

[Nit] I think that this method could be simplified and made more explicit:

static SDValue performLD1RQCombine(SDNode *N, SelectionDAG &DAG) {
  SDLoc DL(N);
  EVT VT = N->getValueType(0);

  EVT LoadVT = VT;
  if (VT.isFloatingPoint())
    LoadVT = VT.changeTypeToInteger();

  SDValue Ops[] = {N->getOperand(0), N->getOperand(2), N->getOperand(3)};
  SDValue Load = DAG.getNode(AArch64ISD::LD1RQ, DL, {LoadVT, MVT::Other}, Ops);

  if (VT.isFloatingPoint()) {
    SDValue LoadChain = SDValue(Load.getNode(), 1);
    Load = DAG.getMergeValues(
        {DAG.getNode(ISD::BITCAST, DL, VT, Load), LoadChain}, DL);
  }

  return Load;
}

This way:

  • there's only 1 return statement
  • you are being explicit about preserving the chain when generating the bit cast
  • Load replaces a bit non-descriptive L

This is a matter of style, so please use what feels best.

llvm/lib/Target/AArch64/AArch64InstrFormats.td
495

IIUC, only this definition is need for LD1RQ. Could you add a note in the commit message to highlight that SImmS2XForm, SImmS3XForm and SImmS4XForm are added for consistency? Alternatively, only keep that one that's needed.

llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td
16

This comment should read Non-faulting & first-faulting loads - node definitions. That's my fault,, sorry!

23

The (undocumented) style here is:

// <Load type 1> - node definitions
//
<Node definitions for type 1>

// <Load type 2> - node definitions
//
<Node definitions for type 2>

Could you move LD1RQ to a dedicated block and add a comment? Ta!

andwar added a comment.Apr 3 2020, 5:13 AM

Btw, could you also add some negative tests? (e.g. out-of-range immediate)

kmclaughlin marked 4 inline comments as done.
kmclaughlin edited the summary of this revision. (Show Details)

Simplified performLD1RQCombine method & added negative tests where the immediate is out of range.

kmclaughlin marked 2 inline comments as done.Apr 14 2020, 8:30 AM
kmclaughlin added inline comments.
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
11687

Done, these changes do make it neater :)

sdesmalen added inline comments.Apr 14 2020, 8:41 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
11699

It seems like the LoadChain is lost if VT is of integer type.

kmclaughlin marked an inline comment as done.

Ensure LoadChain is always preserved in performLD1RQCombine

sdesmalen added inline comments.Apr 15 2020, 6:00 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
11700

I'd expect this to then use Load.getValue(0) ?

kmclaughlin added inline comments.Apr 15 2020, 9:37 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
11700

I think this will have the same effect, as Load just returns a single value

sdesmalen added inline comments.Apr 20 2020, 6:26 AM
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
11700

SDValue LoadChain = SDValue(Load.getNode(), 1); suggests that Load has two return values, the result of the load, and the Chain.

kmclaughlin marked an inline comment as done.
  • Use Load.getValue(0) when creating a bitcast in performLD1RQCombine
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
11700

I think you're right - I've changed this to use the result of the load as suggested

This revision is now accepted and ready to land.Apr 21 2020, 12:48 PM
This revision was automatically updated to reflect the committed changes.