For now, we correct the result for sqrt if iteration > 0. This doesn't make sense as they are not strict relative. Besides, some target(i.e. AArch64 and PowerPC downstream) want to have custom sqrt soft expansion instead of the default OneConstantNR/TwoConstantNR and set the iteration as zero. We need to correct the result for them also. As AArch64 don't want to test the input with IEEE, so, I move the test implementation into the hook.
Details
- Reviewers
spatel evandro t.p.northover RKSimon dmgreen - Group Reviewers
Restricted Project - Commits
- rGffc3e800c65e: [NFC] [DAGCombine] Correct the result for sqrt even the iteration is zero
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp | ||
---|---|---|
5873 | (style) drop the else if (Mode.Input == DenormalMode::IEEE) { ... } // Test = X == 0.0 return DAG.getSetCC(DL, CCVT, Op, FPZero, ISD::SETEQ); } |
llvm/test/CodeGen/AArch64/sqrt-fastmath.ll | ||
---|---|---|
75–76 | The bif/bic difference seems logically fine and probably better overall since we're using less registers, but someone who knows AArch should comment. Also, it would be interesting to know why the code changed because we are seemingly producing the same set of SDNode ops? There might be another combine/lowering opportunity. |
The bif/bic difference seems logically fine and probably better overall since we're using less registers, but someone who knows AArch should comment.
Also, it would be interesting to know why the code changed because we are seemingly producing the same set of SDNode ops? There might be another combine/lowering opportunity.
This is a good example to show how important the code review is and thank you for pointing out this. I should have realized it as it should be a NFC patch. The reason why you see the test change is that, AArch64 don't want the zero if it is denormal input while the default implementation return zero if it is denormal input. I miss to override it.
but someone who knows AArch should comment.
Sorry, I had no idea what this was and what it was doing. I see now, and this does indeed look like an NFC.
It turns out that on AArch64 it is controlled by a subtarget feature that is never enabled, so won't be used without being forced. Not handling IEEE denomal inputs might be a mistake, I'm not sure. This patch is an NFC in that regards though.
LGTM
(style) drop the else