This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Add combine for fcmp sqrt(x),C --> fcmp x,C*C
AbandonedPublic

Authored by bsmith on May 23 2022, 3:37 AM.

Details

Summary

Co-Authored-by: Paul Walker <paul.walker@arm.com>

Diff Detail

Event Timeline

bsmith created this revision.May 23 2022, 3:37 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 23 2022, 3:37 AM
bsmith requested review of this revision.May 23 2022, 3:37 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 23 2022, 3:37 AM
bsmith retitled this revision from [AArch64][InstCombine] Add combine for fcmp sqrt(x),C --> fcmp x,C*C to [InstCombine] Add combine for fcmp sqrt(x),C --> fcmp x,C*C.May 23 2022, 5:13 AM
bsmith updated this revision to Diff 431375.May 23 2022, 7:52 AM
  • Be stricter around fast math flags and require 'fast' not just 'nnan'.
  • Propagate fast math flags through combine.

Is this “safe” even with fast math for all predicates?

spatel added inline comments.May 23 2022, 11:28 AM
llvm/include/llvm/IR/PatternMatch.h
2204 ↗(On Diff #431375)

This is useful already, so I added it as a preliminary step:
e8c20d995bed

llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
6781

We're (very slowly) moving away from using FMF with fcmp because fcmp doesn't usually have a logical relationship to the FMF.
What really matters for this fold is that we can't require precise results from sqrt, so we should check that the sqrt has "reassoc" or "afn".

llvm/test/Transforms/InstCombine/fcmp.ll
1234

Can we handle this kind of example as a preliminary patch? There's no fast-math required if we are checking if the result of sqrt is or is not negative (but we should test several predicates to verify that we have the correct behavior with NAN):
https://alive2.llvm.org/ce/z/EeYGLt
https://alive2.llvm.org/ce/z/X7qsg2

spatel added inline comments.May 23 2022, 11:52 AM
llvm/test/Transforms/InstCombine/fcmp.ll
1234

Maybe also worth to check gcc's issues with this transformation:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91734

bsmith updated this revision to Diff 431649.May 24 2022, 5:24 AM
  • Rebase.
  • Move fast math flag requirements from fcmp to sqrt.
  • Don't require 'fast' flag on sqrt, only 'nnan && (reassoc || afn)'

Is this “safe” even with fast math for all predicates?

With the fast math flags requirement moved to the sqrt itself, do you still see a problem here?

llvm/test/Transforms/InstCombine/fcmp.ll
1234

That optimization is orthogonal to what we are doing here, additionally I don't think I have the floating-point expertise in order to get the NaN cases in such a change right.

spatel added inline comments.May 27 2022, 10:44 AM
llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
6785

What happens if C*C overflows or underflows? Unless there's some reason to diverge from the behavior in the gcc bug link (see earlier comment from @xbolva00 ), we should try to have the same rules for when this transform is allowed.

I think I'd prefer to start with the cases that don't require fast-math flags, then expand the transform as needed to include the cases that do require them.

For example, Alive2 says the following is correct without any fast-math flags:

declare half @llvm.sqrt.f16(half)
define i1 @src(half %x) {
  %sqrt = call half @llvm.sqrt.f16(half %x)
  %cmp = fcmp ogt half %sqrt, 2.0
  ret i1 %cmp
}
define i1 @tgt(half %x) {
  %r = fcmp ogt half %x, 4.00390625
  ret i1 %r
}
bsmith abandoned this revision.Jun 7 2022, 7:27 AM

This has turned out to be far more complex to do safely than originally thought, given I don't as of yet have a compelling example where this provides a significant benefit I will abandon this for now.

spatel added a comment.Jun 7 2022, 8:05 AM

This has turned out to be far more complex to do safely than originally thought, given I don't as of yet have a compelling example where this provides a significant benefit I will abandon this for now.

No problem - I filed a bug and linked it to this review, so someone can revisit:
https://github.com/llvm/llvm-project/issues/55918