This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] allow fdiv folds with less than fully 'fast' ops
ClosedPublic

Authored by spatel on Feb 16 2018, 11:06 AM.

Details

Summary

These folds raise 2 questions for me as I'm trying to eliminate the use of 'isFast' here:

  1. Do we need both 'arcp' and 'reassoc' to do this? My understanding is that -freciprocal-math allows this:

(X / Y) / Z --> X * (1.0 / Y) * (1.0 / Z)

...which wouldn't be useful here in InstCombine, but that might be a target-specific transform for the DAG.
If we are allowed to reassociate, then we reduce the above as:
X * (1.0 / Y) * (1.0 / Z) --> X * (1.0 / (Y * Z)) --> (X * 1.0) / (Y * Z) --> X / (Y * Z)

...but gcc currently says we can do this with just -freciprocal-math:
https://godbolt.org/g/vx7JnL

Should we allow this with just 'arcp'?

  1. We're applying the intersected FMF to the new fmul op. That results in strict fmul ops in the regression tests. Is that too conservative? Strict ops inhibit further reassociation and might cause more FMF removal in early-cse (because that also does FMF intersection).

Diff Detail

Repository
rL LLVM

Event Timeline

spatel created this revision.Feb 16 2018, 11:06 AM

...but gcc currently says we can do this with just -freciprocal-math

Regardless of what gcc is doing in this case, I think both 'reassoc' and 'arcp' are needed here. So I think your code is right.

...but gcc currently says we can do this with just -freciprocal-math

Regardless of what gcc is doing in this case, I think both 'reassoc' and 'arcp' are needed here. So I think your code is right.

Thanks. I wasn't sure what weight should be given to matching gcc-based customer expectations, so I thought it was worth noting here. Either way, this gets us closer to their behavior because we at least don't have to enable all of -ffast-math.

Any thoughts about the FMF on the fmul? It can be a separate patch if that's preferred, but I think this is the only transform around here that is intersecting FMF. We always propagate flags from the last op to intermediate ops in other cases. That makes more sense to me because the new op only exists to produce an intermediate result for our transformed fdiv, so it should have the same relaxations as the fdiv.

spatel updated this revision to Diff 135242.Feb 21 2018, 6:07 AM

Patch updated:
No functional change from the last rev, but I cleaned up the surrounding folds, so this is the last use of isFast() in visitFDiv(). There's also one more test affected by this change.

Any thoughts about the FMF on the fmul? It can be a separate patch if that's preferred, but I think this is the only transform around here that is intersecting FMF. We always propagate flags from the last op to intermediate ops in other cases. That makes more sense to me because the new op only exists to produce an intermediate result for our transformed fdiv, so it should have the same relaxations as the fdiv.

That makes sense to me, too. And from my POV, it's sensible to do this in the same patch.

spatel updated this revision to Diff 135446.Feb 22 2018, 9:41 AM

Patch updated:
Propagate FMF from the last fdiv to the new fmul.

Your reasoning makes sense to me. And I think it's fine to push this given no floating point expert objects.

Thanks for the reviews. I think we have unofficial approval. Can someone click the button to 'Accept' and/or issue the official 'looks good...'?

wristow accepted this revision.Feb 25 2018, 6:22 PM

LGTM

This revision is now accepted and ready to land.Feb 25 2018, 6:22 PM
This revision was automatically updated to reflect the committed changes.