This review contains binary operations and is a natural progression with a small and manageable set of changes.
Details
Diff Detail
Event Timeline
lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
686 | Are we almost to the point where we could seed the fast math flags from the attributes during initial DAG creation to avoid doing this everywhere, or do too many places still not preserve them? |
lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
686 | Many places do not preserve the flags as of yet. Even when we finish here there will still be work to be done. Also, because we propagate in these scenarios to expressions we make, that seems less likely as the context will become more divergent over time at the sub flag level. |
So if I remove FP_ROUND from this patch, this entire patch becomes extension of optimization through guard testing and is no longer in any propagation context, given that, would this patch be ok?
That part is committed, so we should be propagating from IR almost completely now. So in the larger scheme:
- DAG flags match IR FMF (D45710)
- IR FMF is propagated for all FP math ops (D46854) (although there are some corner cases like libcalls where it's not done properly yet)
- We can update transforms to use and propagate the node-level flags (this patch)
We have very few codegen tests that include IR-level FMF, so there's no way to remove the global unsafe predicate checks with confidence IMO.
My preference is to split this patch up and add tests along the way. So:
- Remove the global 'unsafe' check from some transform.
- Get a regression test failure.
- Transform that test to a sibling test that has IR-level FMF, but no global/function-level unsafe-ness and commit it.
- Update the code to use node-level flags *and* the global unsafe (show the change in the new test).
When that's done, every transform will be predicated by node level flags and globals. At that point, it's safe to remove all the global checks and tests.
If we remove the global unsafe predicate usage as we go along, we run the risk of regressing perf for multi-fold scenarios that are not covered by any tests. No idea how common that would be, but I don't see much upside to taking that shortcut.
As per the discussion, I have split the review. The subset is for fadd,fsub,fdiv and fmul with tests created by removal of the unsafe flag and augmented for fmf.
Mind that the addition of the remaining components, post review, will cause changes to some of these tests.
I don't know about the other reviewers, but I'm still overwhelmed by this patch, and I see existing bugs expanding in scope.
Can you split only the first hunk in SelectionDAG.cpp into its own patch with tests for each opcode?
case ISD::FADD: case ISD::FSUB: case ISD::FMUL: case ISD::FDIV: case ISD::FREM: if (getTarget().Options.UnsafeFPMath || Flags.isFast()) {
lib/CodeGen/SelectionDAG/SelectionDAG.cpp | ||
---|---|---|
4806–4807 | This is just wrong. Let me try to clean this up before we compound the problem. |
This working model uses fast math sub flags to guide optimization and currently passes all the lit tests. A number of lit tests were expanded to dually test the new patterns.
Also float arithmetic getNode optimizations have been cleaned up as all the equivalent optimizations now happen in the DAG combiner.
Still looking for some discussion on these changes...
test/CodeGen/AMDGPU/ffloor.f64.ll | ||
---|---|---|
38 | some cleanup here, this file will disappear shortly from this review (no changes) |
lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
698–700 | This 1st transform shows why I'm not comfortable with patches that are trying to make multiple changes at once. Which test(s) correspond to this code change? I think the existing predicate is bogus, and the new ones are wrong too. Why does NaN/Inf make a difference here? Is this safe other than with -0.0? Test program to check that part: #include <stdio.h> int main() { float x,y; x = -0.0; y = -0.0; printf("-(%f + %f) = %f\n", x, y, -(x + y)); printf("(-%f) - %f = %f\n", x, y, (-x) - y); x = +0.0; y = -0.0; printf("-(%f + %f) = %f\n", x, y, -(x + y)); printf("(-%f) - %f = %f\n", x, y, (-x) - y); x = -0.0; y = +0.0; printf("-(%f + %f) = %f\n", x, y, -(x + y)); printf("(-%f) - %f = %f\n", x, y, (-x) - y); x = +0.0; y = +0.0; printf("-(%f + %f) = %f\n", x, y, -(x + y)); printf("(-%f) - %f = %f\n", x, y, (-x) - y); return 0; } |
Are we almost to the point where we could seed the fast math flags from the attributes during initial DAG creation to avoid doing this everywhere, or do too many places still not preserve them?