We don't have any FP exception limits in the IR constant folder (apart from strict ops), so I don't think it makes sense to have them here in the DAG either. Nothing else in the backend tries to preserve those (again outside of strict ops), so I don't see how this could have ever worked for real code that cares about FP exceptions.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
LGTM; this matches the direction we're going, where we use different DAG nodes for functions which care about FP exceptions. (Please wait a couple days in case I'm missing something obvious.)
Rather than get rid of this, can we update it to handle strict FP nodes correctly? Or at least put a skeleton in place to prepare to handle it correctly?
Right now I think we're throwing away the rounding mode and exception semantics arguments when we convert constrained intrinsics to strict FP nodes, but if we still had that information here we could use it to make a smart decision. For instance, suppose we had an intrinsic like this:
%t = call double @llvm.experimental.constrained.fadd.f64(double 4.3, double 6.5, metadata !"round.upward", metadata !"fpexcept.ignore")
If that got to the selection DAG we would be able to fold it.
Yes, we could possibly constant-fold STRICT_FADD etc. using similar logic, but I'm not sure we would actually want to reuse the existing codepath in foldConstantFPMath. In any case, it'll be in the revision history if someone does end up wanting to revive the code for that.
Why wouldn't we want to reuse this code path?
We made the decision that we are initially willing to live with poor optimization in order to get correctness for the constrained FP implementation, but the plan all along has been that after we had correct results we wanted to go back to re-enable optimizations. I'm just trying to avoid making that path harder.
What I'm asking for here is something like this:
SDValue SelectionDAG::foldConstantFPMath(unsigned Opcode, const SDLoc &DL, EVT VT, SDValue N1, SDValue N2) { auto *N1CFP = dyn_cast<ConstantFPSDNode>(N1.getNode()); auto *N2CFP = dyn_cast<ConstantFPSDNode>(N2.getNode()); bool HasFPExceptions = false; APFloat::roundingMode = APFloat::round_to_nearest; // FIXME: Add support for strict FP rounding and exception semantics.
That would produce the same effect as this patch. To support it we'd just need extra arguments to provide the rounding mode and exception semantics. I think those are probably even available not far up the stack from where this is called.
You also need to pass in the input chain and probably emit a MERGE_VALUES to connect it to the expected output chain. Probably need to call this from a different getNode function with the right number of input arguments and add STRICT_FP nodes to the switch in that function to call this.
We generally try to avoid keeping dead code in-tree, but I guess I'm not strongly opposed to it here.
I honestly don't have strong feelings about this particular change either. I'm more concerned about the way we think about strict FP handling. We've still got a lot of work to do to finish strict FP handling, and I would hate to see any code that currently has logic to handle strict exception semantics lose it. On the other hand, looking closer at the current code, I see that the logic there today isn't actually correct for strict exception semantics. So I guess if we decide to go with my suggestion then every place that's currently checking "Status != APFloat::opInvalidOp" should really be checking "Status == APFloat::opOK".
BTW, in addition to what Craig pointed out above with regard to the chain, he visited me and mentioned additional complications related to the fact that we aren't keeping the rounding mode and exception semantics in the DAG. We really need a way to fix that. I'm inclined to believe constant folding will still end up going through this function, but it'll be more work to get here than I realized.
That would definitely be an improvement, but I'm not sure if that's enough. For example, if we intend to track denormals here, then we'd have to check the returned constant value rather than the status value?
It seemed to me that the strict and regular cases would end up sharing just the 1 line of code that computes the constant, so it might make more sense to group strict nodes together rather than interleave those cases with the regular ops.
BTW, in addition to what Craig pointed out above with regard to the chain, he visited me and mentioned additional complications related to the fact that we aren't keeping the rounding mode and exception semantics in the DAG. We really need a way to fix that. I'm inclined to believe constant folding will still end up going through this function, but it'll be more work to get here than I realized.
Yes, I think there's a lot of plumbing to do...
llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp | ||
---|---|---|
4819 ↗ | (On Diff #197388) | These break statements (FADD to FREM) are unreachable. |
Patch updated:
- Added a TODO comment about handling strict opcodes.
- Removed unreachable 'break' statements in the switch.
llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp | ||
---|---|---|
4233 ↗ | (On Diff #197611) | Happened to stumble across more of these exception checks. Just a heads up... |
llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp | ||
---|---|---|
4233 ↗ | (On Diff #197611) | Thanks. Also, I just noticed that we do not constant fold the FMA intrinsic in IR. Haven't looked at the unary ops in IR yet, but I'll try to make this consistent. |