Adds the llvm.aarch64.sve.ldnf1 intrinsic, adding a new
flag to MachineMemOperand (MONonFaulting)
Details
Diff Detail
Event Timeline
I'm not sure it's legal to transform a non-faulting load to a load with a non-faulting flag? At least, we'd need to consider the implications of that very carefully. In particular, I'm concerned about the interaction with intrinsics that read/write FFR. I mean, you could specify that loads marked MONonFaulting actually write to the FFR register, but that seems confusing.
It seems simpler to preserve the intrinsic until isel, at least for now.
llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp | ||
---|---|---|
4490 | I'm not sure how this is related. |
llvm/lib/Target/AArch64/AArch64InstrInfo.td | ||
---|---|---|
329 | This duplicates a lot of code, maybe it makes sense to combine this into a multiclass. I'm thinking of something along the lines of: multiclass load_store_fragments<code pred> { def : PatFrag<(ops node:$ptr, node:$pred, node:$def), (masked_ld node:$ptr, undef, node:$pred, node:$def), pred>; def _i8 : PatFrag<(ops ...), ((cast<SDPatternOperator>(NAME) ...), [{ return cast<MaskedLoadSDNode>(N)->getMemoryVT().getScalarType() == MVT::i8; }]>; def _i16 : PatFrag<(ops ...), ((cast<SDPatternOperator>(NAME) ....), [{ return cast<MaskedLoadSDNode>(N)->getMemoryVT().getScalarType() == MVT::i16; }]>; def _i32 : PatFrag<(ops ...), ((cast<SDPatternOperator>(NAME) ....), [{ return cast<MaskedLoadSDNode>(N)->getMemoryVT().getScalarType() == MVT::i32; }]>; def _i64 : PatFrag<(ops ...), ((cast<SDPatternOperator>(NAME) ....), [{ return cast<MaskedLoadSDNode>(N)->getMemoryVT().getScalarType() == MVT::i64; }]>; } defm non_temporal_load : load_store_fragments<[{ return cast<MaskedLoadSDNode>(N)->getExtensionType() == ISD::NON_EXTLOAD && cast<MaskedLoadSDNode>(N)->isUnindexed() && cast<MaskedLoadSDNode>(N)->isNonTemporal() && !cast<MaskedLoadSDNode>(N)->isNonFaulting(); }]>; defm sext_non_temporal_load : load_store_fragments<[{ return cast<MaskedLoadSDNode>(N)->getExtensionType() == ISD::SEXTLOAD && cast<MaskedLoadSDNode>(N)->isUnindexed() && cast<MaskedLoadSDNode>(N)->isNonTemporal() && !cast<MaskedLoadSDNode>(N)->isNonFaulting(); }]>; defm zext_non_faulting_load : load_store_fragments<[{ return cast<MaskedLoadSDNode>(N)->getExtensionType() == ISD::ZEXTLOAD && cast<MaskedLoadSDNode>(N)->isUnindexed() && !cast<MaskedLoadSDNode>(N)->isNonTemporal() && cast<MaskedLoadSDNode>(N)->isNonFaulting(); }]>; | |
338 | nit: we should probably add _masked_ to the name (although I realise that I forgot to spot that on the non-temporal patch) | |
341 | Shall we support any-extend as well? (thus making asext_non_faulting_load) |
I missed this comment earlier, but that's a valid point. For SVE having side-effects is assumed from the non-faulting flag. We hoped to latch on to the MLOAD here to reuse code and benefit from legalization in case we want to add a more generic mechanism in the future to use such loads directly in the loop-vectorizer.
Perhaps we can clarify the intent that the non-faulting mode may have side-effects by renaming the flag to something like NonFaultingWithSideEffects? Otherwise we can stick with the intrinsics as you suggest.
Perhaps we can clarify the intent that the non-faulting mode may have side-effects by renaming the flag to something like NonFaultingWithSideEffects?
We could specify something like that... but that probably requires some changes to target-independent DAGCombine, and it's sort of a weird edge case given that no other loads have that kind of side-effect. If we are going to make a change like that to MachineMemOperand, we'd probably want to propose it on llvmdev. I'd prefer a little extra code here to avoid that rabbit hole. We're busy enough already with other substantial changes to target-independent code.
Thanks for the feedback on this patch, @efriedma & @sdesmalen!
I think there is still value in adding a NonFaulting flag to MachineMemOperand so that we can benefit from legalisation, but as this is not a requirement for the ACLE I have created a new patch which implements the non-faulting load intrinsic explicitly: https://reviews.llvm.org/D71698
I will leave this patch in the 'plan changes' state so that it can be referred to in future discussions on the mailing list.
I'm not sure how this is related.