This is an archive of the discontinued LLVM Phabricator instance.

Utilize new SDNode flag functionality to expand current support model
AbandonedPublic

Authored by mcberg2017 on May 7 2018, 4:52 PM.

Details

Summary

This review contains binary operations and is a natural progression with a small and manageable set of changes.

Diff Detail

Event Timeline

mcberg2017 created this revision.May 7 2018, 4:52 PM
arsenm added inline comments.May 8 2018, 2:48 AM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
681

Are we almost to the point where we could seed the fast math flags from the attributes during initial DAG creation to avoid doing this everywhere, or do too many places still not preserve them?

mcberg2017 added inline comments.May 8 2018, 7:58 AM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
681

Many places do not preserve the flags as of yet. Even when we finish here there will still be work to be done. Also, because we propagate in these scenarios to expressions we make, that seems less likely as the context will become more divergent over time at the sub flag level.

So if I remove FP_ROUND from this patch, this entire patch becomes extension of optimization through guard testing and is no longer in any propagation context, given that, would this patch be ok?

I will have to update this again once D46854 is checked in.

I will have to update this again once D46854 is checked in.

That part is committed, so we should be propagating from IR almost completely now. So in the larger scheme:

  1. DAG flags match IR FMF (D45710)
  2. IR FMF is propagated for all FP math ops (D46854) (although there are some corner cases like libcalls where it's not done properly yet)
  3. We can update transforms to use and propagate the node-level flags (this patch)

We have very few codegen tests that include IR-level FMF, so there's no way to remove the global unsafe predicate checks with confidence IMO.
My preference is to split this patch up and add tests along the way. So:

  1. Remove the global 'unsafe' check from some transform.
  2. Get a regression test failure.
  3. Transform that test to a sibling test that has IR-level FMF, but no global/function-level unsafe-ness and commit it.
  4. Update the code to use node-level flags *and* the global unsafe (show the change in the new test).

When that's done, every transform will be predicated by node level flags and globals. At that point, it's safe to remove all the global checks and tests.
If we remove the global unsafe predicate usage as we go along, we run the risk of regressing perf for multi-fold scenarios that are not covered by any tests. No idea how common that would be, but I don't see much upside to taking that shortcut.

mcberg2017 updated this revision to Diff 146971.EditedMay 15 2018, 5:37 PM

As per the discussion, I have split the review. The subset is for fadd,fsub,fdiv and fmul with tests created by removal of the unsafe flag and augmented for fmf.

Mind that the addition of the remaining components, post review, will cause changes to some of these tests.

I don't know about the other reviewers, but I'm still overwhelmed by this patch, and I see existing bugs expanding in scope.

Can you split only the first hunk in SelectionDAG.cpp into its own patch with tests for each opcode?

case ISD::FADD:
case ISD::FSUB:
case ISD::FMUL:
case ISD::FDIV:
case ISD::FREM:
  if (getTarget().Options.UnsafeFPMath || Flags.isFast()) {
lib/CodeGen/SelectionDAG/SelectionDAG.cpp
4816

This is just wrong. Let me try to clean this up before we compound the problem.

mcberg2017 updated this revision to Diff 147200.EditedMay 16 2018, 3:27 PM

With code moved to D46968 and the balance here. Undef is deferred.

In process are updates to this patch for sub flag usage, on a case by case basis.

This working model uses fast math sub flags to guide optimization and currently passes all the lit tests. A number of lit tests were expanded to dually test the new patterns.

Also float arithmetic getNode optimizations have been cleaned up as all the equivalent optimizations now happen in the DAG combiner.

mcberg2017 added inline comments.May 24 2018, 1:27 PM
test/CodeGen/AMDGPU/fdiv.f16.ll
221 ↗(On Diff #148468)

Note for Matt, the numbers now used reflect 16 bit fp values.

test/CodeGen/X86/fadd-combines.ll
225 ↗(On Diff #148468)

I trimmed this down to the required essentials required to test both paths.

mcberg2017 edited the summary of this revision. (Show Details)Jun 2 2018, 9:56 AM

Still looking for some discussion on these changes...

test/CodeGen/AMDGPU/ffloor.f64.ll
38 ↗(On Diff #148468)

some cleanup here, this file will disappear shortly from this review (no changes)

spatel added inline comments.Jun 4 2018, 6:53 AM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
696–697

This 1st transform shows why I'm not comfortable with patches that are trying to make multiple changes at once. Which test(s) correspond to this code change?

I think the existing predicate is bogus, and the new ones are wrong too. Why does NaN/Inf make a difference here?

Is this safe other than with -0.0?

Test program to check that part:

#include <stdio.h>
  
int main() {
  float x,y;
  x = -0.0;
  y = -0.0;
  printf("-(%f + %f) = %f\n", x, y, -(x + y));
  printf("(-%f) - %f = %f\n", x, y, (-x) - y);
  x = +0.0;
  y = -0.0;
  printf("-(%f + %f) = %f\n", x, y, -(x + y));
  printf("(-%f) - %f = %f\n", x, y, (-x) - y);
  x = -0.0;
  y = +0.0;
  printf("-(%f + %f) = %f\n", x, y, -(x + y));
  printf("(-%f) - %f = %f\n", x, y, (-x) - y);
  x = +0.0;
  y = +0.0;
  printf("-(%f + %f) = %f\n", x, y, -(x + y));
  printf("(-%f) - %f = %f\n", x, y, (-x) - y);
  return 0;
}
mcberg2017 abandoned this revision.Jun 15 2018, 5:09 PM

This review is now covered in other checkins.