Add the functionality to lower fceil for passthru variant
We shouldn't be calling getMachineNode in legalization. The way to get an uninitialized value before isel is getUNDEF().
Also, this looks like it's creating an FCEIL with two operands. That's a bad idea; we have a bunch of assertions in getNode() to ensure SelectionDAG nodes are well-formed. Even if those assertions don't catch this issue right now, they might in the future.
This doesn't match the naming convention we're using for these opcodes. See the comment at the beginning of this file: according to those rules, this should be named FRINTP_MERGE_PASSTHRU.
But really, probably better to actually implement FRINTP_PRED, without the extra operand. Maybe try looking at https://reviews.llvm.org/D83765 ?
As suggested by Eli you only need to implement the FRINTP_PRED variant, which doesn't set any expectation on the result of inactive lanes. Doing this means the intrinsics can remain untouched.
FYI: I'm in the process of converting the last few remaining instances of _MERGE_OP1 nodes (only the shifts and max/min remain) because we have no real need for them as yet and I'm trying to ensure we don't tie the register allocator's hands when code generating normal IR.
To make the relationship clearer between the original nodes and their predicated counterparts we just add a suffix. So in this case the predicated node should be named FCEIL_PRED.
I don't understand why you need this change. By definitions the _PRED nodes should take the form:
ISDNODE Op1, Op2...OpN -> ISDNODE_PRED Pg, Op1, Op2...OpN
I would expect this function to do what's required without any changes. I suspect any issues are likely down to mistakes within the isel patterns.
Merge is not required, see comment on function definition.
This is incorrect because the unary _PRED nodes should have 2 operands. The predicated followed by the data operand. See SDT_AArch64Arith for inspiration, where you just need to drop the stuff related to Op3.
Happy as is, but you could enhance readability if the operators are named op_merge and op_pred.
There should be no need to explicitly create the PTRUE here because the _PRED nodes already provide the predicate (which will be Op1 when LowerToPredicatedOp is restored) that should be used directly.
To be honest I'm not entirely sure how this even works since when creating FRINTP_PRED you currently set Op1 to DAG.getUNDEF().
There should also be patterns for the other legal floating point MVTs (i.e. nxv2f16, nxv4f16 and nxv2f32).
You'll need tests for the other legal floating point MVTs (i.e. nxv2f16, nxv4f16 and nxv2f32).
I've been investigating this area (although related to a different node) and I've had a change of heart. When lowering the one operand nodes to predicated versions I think we should introduce SVE specific passthru nodes much like DUP but in this case it would be FRINTP_MERGE_PASSTHRU. This can be used to lower common ISD nodes and SVE intrinsics alike. My original "use _PRED" comments are only relevant to two and three operand operations where there is an advantage to using _PRED nodes to free up some better register allocation. One operand operations do not have this problem because the predicate variants have a dedicated output register. Sorry for the initial misdirection.
Personally I think it's better to order the FCEIL_MERGE_PASSTHRU enum entry and related code alphabetically relative to FNEG_MERGE_PASSTHRU but it's not a deal breaker. So if you're happy with renaming AArch64fceil_mt before landing the patch then I'm happy.
I've tried to use the AArch64 names here, so AArch64frintp_mt, as it makes it easy to spot mismatches with the instruction definitions.