This is an archive of the discontinued LLVM Phabricator instance.

[AArch64][SVE] Implement intrinsic for non-faulting loads
Changes PlannedPublic

Authored by kmclaughlin on Dec 16 2019, 9:15 AM.

Details

Summary

Adds the llvm.aarch64.sve.ldnf1 intrinsic, adding a new
flag to MachineMemOperand (MONonFaulting)

Diff Detail

Event Timeline

kmclaughlin created this revision.Dec 16 2019, 9:15 AM
Herald added a project: Restricted Project. · View Herald Transcript

I'm not sure it's legal to transform a non-faulting load to a load with a non-faulting flag? At least, we'd need to consider the implications of that very carefully. In particular, I'm concerned about the interaction with intrinsics that read/write FFR. I mean, you could specify that loads marked MONonFaulting actually write to the FFR register, but that seems confusing.

It seems simpler to preserve the intrinsic until isel, at least for now.

llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
4490

I'm not sure how this is related.

sdesmalen added inline comments.Dec 17 2019, 3:02 AM
llvm/lib/Target/AArch64/AArch64InstrInfo.td
329

This duplicates a lot of code, maybe it makes sense to combine this into a multiclass. I'm thinking of something along the lines of:

multiclass load_store_fragments<code pred> {
  def : PatFrag<(ops node:$ptr, node:$pred, node:$def),
                (masked_ld node:$ptr, undef, node:$pred, node:$def),
                pred>;

  def _i8 : PatFrag<(ops ...),
                    ((cast<SDPatternOperator>(NAME) ...),
                    [{ return cast<MaskedLoadSDNode>(N)->getMemoryVT().getScalarType() == MVT::i8; }]>;
  def _i16 : PatFrag<(ops ...),
                    ((cast<SDPatternOperator>(NAME) ....),
                    [{ return cast<MaskedLoadSDNode>(N)->getMemoryVT().getScalarType() == MVT::i16; }]>;
  def _i32 : PatFrag<(ops ...),
                    ((cast<SDPatternOperator>(NAME) ....),
                    [{ return cast<MaskedLoadSDNode>(N)->getMemoryVT().getScalarType() == MVT::i32; }]>;
  def _i64 : PatFrag<(ops ...),
                    ((cast<SDPatternOperator>(NAME) ....),
                    [{ return cast<MaskedLoadSDNode>(N)->getMemoryVT().getScalarType() == MVT::i64; }]>;
}

defm non_temporal_load : load_store_fragments<[{
   return cast<MaskedLoadSDNode>(N)->getExtensionType() == ISD::NON_EXTLOAD &&
          cast<MaskedLoadSDNode>(N)->isUnindexed() &&
          cast<MaskedLoadSDNode>(N)->isNonTemporal() &&
          !cast<MaskedLoadSDNode>(N)->isNonFaulting();
}]>;

defm sext_non_temporal_load : load_store_fragments<[{
   return cast<MaskedLoadSDNode>(N)->getExtensionType() == ISD::SEXTLOAD &&
          cast<MaskedLoadSDNode>(N)->isUnindexed() &&
          cast<MaskedLoadSDNode>(N)->isNonTemporal() &&
          !cast<MaskedLoadSDNode>(N)->isNonFaulting();
}]>;

defm zext_non_faulting_load : load_store_fragments<[{
   return cast<MaskedLoadSDNode>(N)->getExtensionType() == ISD::ZEXTLOAD &&
          cast<MaskedLoadSDNode>(N)->isUnindexed() &&
          !cast<MaskedLoadSDNode>(N)->isNonTemporal() &&
          cast<MaskedLoadSDNode>(N)->isNonFaulting();
}]>;
338

nit: we should probably add _masked_ to the name (although I realise that I forgot to spot that on the non-temporal patch)

341

Shall we support any-extend as well? (thus making asext_non_faulting_load)

I'm not sure it's legal to transform a non-faulting load to a load with a non-faulting flag? At least, we'd need to consider the implications of that very carefully. In particular, I'm concerned about the interaction with intrinsics that read/write FFR. I mean, you could specify that loads marked MONonFaulting actually write to the FFR register, but that seems confusing.

It seems simpler to preserve the intrinsic until isel, at least for now.

I missed this comment earlier, but that's a valid point. For SVE having side-effects is assumed from the non-faulting flag. We hoped to latch on to the MLOAD here to reuse code and benefit from legalization in case we want to add a more generic mechanism in the future to use such loads directly in the loop-vectorizer.

Perhaps we can clarify the intent that the non-faulting mode may have side-effects by renaming the flag to something like NonFaultingWithSideEffects? Otherwise we can stick with the intrinsics as you suggest.

Perhaps we can clarify the intent that the non-faulting mode may have side-effects by renaming the flag to something like NonFaultingWithSideEffects?

We could specify something like that... but that probably requires some changes to target-independent DAGCombine, and it's sort of a weird edge case given that no other loads have that kind of side-effect. If we are going to make a change like that to MachineMemOperand, we'd probably want to propose it on llvmdev. I'd prefer a little extra code here to avoid that rabbit hole. We're busy enough already with other substantial changes to target-independent code.

kmclaughlin planned changes to this revision.Dec 19 2019, 4:18 AM

Thanks for the feedback on this patch, @efriedma & @sdesmalen!
I think there is still value in adding a NonFaulting flag to MachineMemOperand so that we can benefit from legalisation, but as this is not a requirement for the ACLE I have created a new patch which implements the non-faulting load intrinsic explicitly: https://reviews.llvm.org/D71698
I will leave this patch in the 'plan changes' state so that it can be referred to in future discussions on the mailing list.