Implements same logis as in SelectionDAG.
G_FMINNUM_IEEE and G_FMAXNUM_IEEE are never SNaN by definition and
never NaN when one operand is known non-NaN and other known non-SNaN.
G_FMINNUM_IEEE and G_FMAXNUM_IEEE are never NaN/SNaN when one of the
operands is known non-NaN/SNaN.
Details
Diff Detail
Event Timeline
llvm/lib/CodeGen/GlobalISel/Utils.cpp | ||
---|---|---|
514–520 | Generic opcodes should not have target dependent semantics like this, especially since these already have semantics inherited from the IR/C fmin/fmax definition. They return the non-nan argument regardless of snan or not |
llvm/lib/CodeGen/GlobalISel/Utils.cpp | ||
---|---|---|
518–519 | Can this then be return true? |
llvm/lib/CodeGen/GlobalISel/Utils.cpp | ||
---|---|---|
518–519 | No, it depends on the inputs. You can refer to the existing SelectionDAG::isKnownNeverNaN: // Only one needs to be known not-nan, since it will be returned if the // other ends up being one. return isKnownNeverNaN(Op.getOperand(0), SNaN, Depth + 1) || isKnownNeverNaN(Op.getOperand(1), SNaN, Depth + 1); |
llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-fminnum.mir | ||
---|---|---|
823 | return isKnownNeverNaN(Op.getOperand(0), SNaN, Depth + 1) || isKnownNeverNaN(Op.getOperand(1), SNaN, Depth + 1); } This is legalized first. %2 is isKnownNeverSNaN with IEEE = true, but since inputs for %2:_(s32) = G_FMAXNUM %0, %1 are not yet canonicalized isKnownNeverSNaN will fail and canonicalize is inserted. However since we know what will happen with '%2:_(s32) = G_FMAXNUM %0, %1 we declare it isKnownNeverSNaN to not depend on order of legalization. |
llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-fminnum.mir | ||
---|---|---|
823 | The legalization order doesn't matter. These operations have their own independent semantics. isKnownNeverNaN needs to understand both pairs |
llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-fminnum.mir | ||
---|---|---|
818–820 | Using only that formula this will not change and we are left with unnecessary G_FCANONICALIZE(G_FMAXNUM_IEEE) which stops us from performing further combines both from td file and combiner. Do you have other suggestion how to avoid making G_FCANONICALIZE? | |
823 | How can this be done?
Formula above gives different result depending if input was legalized already or not. It would work if we would legalize uses first. |
llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-fminnum.mir | ||
---|---|---|
823 | The same way SelectionDAG handles this: case ISD::FMINNUM_IEEE: case ISD::FMAXNUM_IEEE: { if (SNaN) return true; // This can return a NaN if either operand is an sNaN, or if both operands // are NaN. return (isKnownNeverNaN(Op.getOperand(0), false, Depth + 1) && isKnownNeverSNaN(Op.getOperand(1), Depth + 1)) || (isKnownNeverNaN(Op.getOperand(1), false, Depth + 1) && isKnownNeverSNaN(Op.getOperand(0), Depth + 1)); } |
llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-fminnum.mir | ||
---|---|---|
823 | I meant for FMINNUM, writing same code as in SDAG doesn't deal with unnecessary canonicalize. FMINNUM_IEEE is the same (there is only if (SNaN) part atm). |
llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-fminnum.mir | ||
---|---|---|
823 | The lowering for fminnum to fminnum_ieee tries to avoid the canonicalize depending on the instruction/inputs. This is how SelectionDAG also does it. For this patch, it just needs to fully handle the correct semanics for the two pairs |
llvm/lib/CodeGen/GlobalISel/Utils.cpp | ||
---|---|---|
514–520 |
I would say that only "is Known Never any NaN" is given. This patch does not change that (isKnownNeverNaN(SNaN = false)).
still stands. We are just aware of the way we lower instruction and use it in isKnownNeverSNaN only. This is only used for efficiency to avoid having to quiet "known non SNaN" into instruction that is about to be lowered. | |
518–519 | I don't think return isKnownNeverNaN(DefMI->getOperand(1).getReg(), MRI, SNaN) || isKnownNeverNaN(DefMI->getOperand(2).getReg(), MRI, SNaN); is enough for globalisel since it does not look at the whole picture. The formula above works when target naturally has FMINNUM instruction and knowing the inputs gives conclusion about the output (one input is not nan result is not nan). This would be generic formula (FIXME, use this formula in isKnownNeverNaNForMI instead of return false). | |
llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-fminnum.mir | ||
823 | About examining the input: We are likely examining inputs in the middle of the legalization. With IEEE=true fminnum will never be the input but fminnum_ieee. Recursively asking for fminnum's inputs is not correct in this context since (maybe quieted)inputs will go to fminnum_ieee (its result is never SNaN) not fminnum and this is target dependent. |
llvm/lib/CodeGen/GlobalISel/Utils.cpp | ||
---|---|---|
518–519 | The single instruction is the whole picture and does not depend on the target. These have defined, universal NAN handling semantics regardless of however they are lowered. This patch should only aim to faithfully implement these semantics. Ensuring we don't have redundant quieting canonicalizes is an optimization beyond this patch. As an optimization, the expansion to the IEEE version tries to avoid inserting a quiet, which mostly works well enough. We are missing some optimization hints. I have long wished that we had separate no-qnan and no-snan fast math flags (as well as renaming the IR intrinsics to match the real fmin/fmax semantics, and introduce a new pair for the IEEE-758 2008 semantics) |
llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-fmaxnum.mir | ||
---|---|---|
721 | Thoughts on adding %0:_(s32) = G_FMINNUM_IEEE/G_FMAXNUM_IEEE ... %1:_(s32) = G_FCANONICALIZE %0 -> %0:_(s32) = G_FMINNUM_IEEE/G_FMAXNUM_IEEE ... %1:_(s32) = COPY %0 combine post legalizer? |
llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-fmaxnum.mir | ||
---|---|---|
721 | Yes, redundant canonicalization elimination belongs post legalizer |
Can this then be return true?