When trying to constant fold an FMA in the DAG, getNode() fails to fold the FMA if an operand is not finite. In this case this patch allows the constant folding if !TLI->hasFloatingPointExceptions()
Details
Diff Detail
Event Timeline
I think you're not understanding the code, but feel free to add a comment making it more clear what is going on. When you call DAG.getNode, you end up in this code:
SDValue SelectionDAG::getNode(unsigned Opcode, SDLoc DL, EVT VT, SDValue N1, SDValue N2, SDValue N3) { // Perform various simplifications. ConstantSDNode *N1C = dyn_cast<ConstantSDNode>(N1.getNode()); switch (Opcode) { case ISD::FMA: { ConstantFPSDNode *N1CFP = dyn_cast<ConstantFPSDNode>(N1); ConstantFPSDNode *N2CFP = dyn_cast<ConstantFPSDNode>(N2); ConstantFPSDNode *N3CFP = dyn_cast<ConstantFPSDNode>(N3); if (N1CFP && N2CFP && N3CFP) { APFloat V1 = N1CFP->getValueAPF(); const APFloat &V2 = N2CFP->getValueAPF(); const APFloat &V3 = N3CFP->getValueAPF(); APFloat::opStatus s = V1.fusedMultiplyAdd(V2, V3, APFloat::rmNearestTiesToEven); if (s != APFloat::opInvalidOp) return getConstantFP(V1, VT); } break; }
and that will constant-fold the node when possible.
You're completely right, thanks :)
The reason my constant folding didn't occur is because I'm trying to constant fold a NaN.
I tried to do it in my backend, but it does not work because the Target PerformDAGCombine() is never called because even if the constant folding fails, getNode() will return something else than SDValue().
When trying to constant fold an FMA in the DAG, getNode() fails to fold the FMA if an operand is not finite. In this case this patch allows the target to perform its own combine.
Unfortunately, this won't work either. You can't modify getNode so that it will *fail* to return a node that you might want to constant-fold later. This will cause all kinds of problems.
Does your target want to combine these things because it is essentially operating in a 'fast math' mode, do you not support floating-point exception tracking, or is something else going on?
FWIW, I think your target could always install a target-specific DAG combine handler for ISD::FMA and fold them itself. Once we have fast-math flags at the SDAG level, we can also support this (perhaps).
New approach using !TLI->hasFloatingPointExceptions() ; this seems coherent with what is done in getNode() for binary floating point operations.