In several places we need to enumerate all constrained intrinsics or IR
nodes that should be represented by them. It is easy to miss some of
the cases. To make working with these intrinsics more convenient and
robust, this change introduces file containing definitions of all
constrained intrinsics and some of their properties. This file can be
included to generate constrained intrinsics processing code.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
- Build Status
Buildable 40733 Build 40864: arc lint + arc unit
Event Timeline
I really like that approach. Unfortunately, this part of it does sort of partially revert one of the changes I recently made:
if (Opcode == ISD::STRICT_FP_ROUND) Opers.push_back( DAG.getTargetConstant(0, sdl, TLI.getPointerTy(DAG.getDataLayout())));
The comparison patch (https://reviews.llvm.org/D69281) will introduce a lot more cases where we need an extra argument on the DAG side.
So maybe the best solution would be to extend the .def file with an additional column listing "extra arguments" for the DAG node -- then the DAG node could be constructed fully automatically without any per-opcode special cases in this routine.
Actually it was moved below in the same file to the line 6900.
The comparison patch (https://reviews.llvm.org/D69281) will introduce a lot more cases where we need an extra argument on the DAG side.
So maybe the best solution would be to extend the .def file with an additional column listing "extra arguments" for the DAG node -- then the DAG node could be constructed fully automatically without any per-opcode special cases in this routine.
It is not clear to me how to encode these "extra arguments". I would propose to limit auto generation to opcodes only, leaving attaching operands to custom code. There may be cases when building operands is complex enough and simple macro expansion implementation is not sufficient.
What I meant was: originally, there were two IR opcode checks, a first check (switch statement) to determine the DAG opcode, and a second check (just one "if" now, but needs to become another switch with the comparison patch) to determine extra DAG operands. My recent patch merged the two into a single opcode check (switch statement) -- and this patch again creates two separate checks.
The comparison patch (https://reviews.llvm.org/D69281) will introduce a lot more cases where we need an extra argument on the DAG side.
So maybe the best solution would be to extend the .def file with an additional column listing "extra arguments" for the DAG node -- then the DAG node could be constructed fully automatically without any per-opcode special cases in this routine.
It is not clear to me how to encode these "extra arguments". I would propose to limit auto generation to opcodes only, leaving attaching operands to custom code. There may be cases when building operands is complex enough and simple macro expansion implementation is not sufficient.
All currently known cases simply add one extra immediate integer argument. I'm wondering whether it wouldn't make sense to address that common case automatically.
That's true. Calculation of opcode is simple 1:1 mapping operation, which can easily be implemented using macros. Building operands can be more complicated. Specific operands are required only for Intrinsic::experimental_constrained_fptrunc. The case of compare intrinsics is considered below.
The comparison patch (https://reviews.llvm.org/D69281) will introduce a lot more cases where we need an extra argument on the DAG side.
So maybe the best solution would be to extend the .def file with an additional column listing "extra arguments" for the DAG node -- then the DAG node could be constructed fully automatically without any per-opcode special cases in this routine.
It is not clear to me how to encode these "extra arguments". I would propose to limit auto generation to opcodes only, leaving attaching operands to custom code. There may be cases when building operands is complex enough and simple macro expansion implementation is not sufficient.
All currently known cases simply add one extra immediate integer argument. I'm wondering whether it wouldn't make sense to address that common case automatically.
I would propose solution like:
diff --git a/llvm/include/llvm/IR/ConstrainedOps.def b/llvm/include/llvm/IR/ConstrainedOps.def index 219da9d111c..4e45c5a0664 100644 --- a/llvm/include/llvm/IR/ConstrainedOps.def +++ b/llvm/include/llvm/IR/ConstrainedOps.def @@ -15,6 +15,10 @@ #define INSTRUCTION(N,A,R,I,D) #endif +#ifndef CMP_INSTRUCTION +#define CMP_INSTRUCTION(C,A,R,I,D) +#endif + #ifndef FUNCTION #define FUNCTION(F,A,R,I,D) #endif @@ -36,6 +40,21 @@ INSTRUCTION(FPToSI, 1, 0, experimental_constrained_fptosi, STRICT_FP_T INSTRUCTION(FPToUI, 1, 0, experimental_constrained_fptoui, STRICT_FP_TO_UINT) INSTRUCTION(FPTrunc, 1, 1, experimental_constrained_fptrunc, STRICT_FP_ROUND) +CMP_INSTRUCTION(oeq, 2, 1, experimental_constrained_fcmpoeq, SETOEQ) +CMP_INSTRUCTION(ogt, 2, 1, experimental_constrained_fcmpogt, SETOGT) +CMP_INSTRUCTION(oge, 2, 1, experimental_constrained_fcmpoge, SETOGE) +CMP_INSTRUCTION(olt, 2, 1, experimental_constrained_fcmpolt, SETOLT) +CMP_INSTRUCTION(ole, 2, 1, experimental_constrained_fcmpole, SETOLE) +CMP_INSTRUCTION(one, 2, 1, experimental_constrained_fcmpone, SETONE) +CMP_INSTRUCTION(ord, 2, 1, experimental_constrained_fcmpord, SETO) +CMP_INSTRUCTION(ueq, 2, 1, experimental_constrained_fcmpueq, SETUEQ) +CMP_INSTRUCTION(ugt, 2, 1, experimental_constrained_fcmpugt, SETUGT) +CMP_INSTRUCTION(uge, 2, 1, experimental_constrained_fcmpuge, SETUGE) +CMP_INSTRUCTION(ult, 2, 1, experimental_constrained_fcmpult, SETULT) +CMP_INSTRUCTION(ule, 2, 1, experimental_constrained_fcmpule, SETULE) +CMP_INSTRUCTION(une, 2, 1, experimental_constrained_fcmpune, SETUNE) +CMP_INSTRUCTION(uno, 2, 1, experimental_constrained_fcmpuno, SETUO) + FUNCTION(ceil, 1, 1, experimental_constrained_ceil, STRICT_FCEIL) FUNCTION(cos, 1, 1, experimental_constrained_cos, STRICT_FCOS) FUNCTION(exp, 1, 1, experimental_constrained_exp, STRICT_FEXP) diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp index b1411f16d96..56c68e26f3b 100644 --- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp +++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp @@ -6893,6 +6893,11 @@ void SelectionDAGBuilder::visitConstrainedFPIntrinsic( case Intrinsic::INTRINSIC: \ Opcode = ISD::DAGN; \ break; +#define CMP_INSTRUCTION(CC, NARG, ROUND_MODE, INTRINSIC, DAGN) \ + case Intrinsic::INTRINSIC: \ + Opcode = ISD::STRICT_FSETCC; \ + Opers.push_back(DAG.getCondCode(ISD::DAGN)) \ + break; #define FUNCTION INSTRUCTION #include "llvm/IR/ConstrainedOps.def" }
Does it solve the problem with compare intrinsics?
Hmm, OK. I guess that way is fine with me. We'll probably still want a DAG opcode operand for CMP_INSTRUCTION, since we'll need two flavours of the intrinsics (quiet and signalling compares). But that can be done after your patch is in.
Updated patch
- Use this approach for the DAG nodes more widely,
- Updated documentation.
See inline comment. Apart from that, this looks good to me -- I think we should go for that solution, it makes the code a lot more maintainable IMO.
llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp | ||
---|---|---|
42 | This double switch is a bit weird. I'm wondering if it wouldn't be more straightforward to just pass all strictfp nodes to ScalarizeVecRes_StrictFPOp and then either just handle them there (in most cases that should be straightforward as far as I can see) or call out to special handling from within that routine. (Same for the three other cases below.) |
Updated patch according to reviewer's notes
- Got rid of DAG strict nodes processing in default case.
- Map FUNCTION to INSTRUCTION by default.
Thanks for the updates! This does address all comments I had raised previously. Patch LGTM now.
This double switch is a bit weird. I'm wondering if it wouldn't be more straightforward to just pass all strictfp nodes to ScalarizeVecRes_StrictFPOp and then either just handle them there (in most cases that should be straightforward as far as I can see) or call out to special handling from within that routine.
(Same for the three other cases below.)