Co-Authored-by: Paul Walker <paul.walker@arm.com>
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Unit Tests
Event Timeline
- Be stricter around fast math flags and require 'fast' not just 'nnan'.
- Propagate fast math flags through combine.
llvm/include/llvm/IR/PatternMatch.h | ||
---|---|---|
2204 ↗ | (On Diff #431375) | This is useful already, so I added it as a preliminary step: |
llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp | ||
6781 | We're (very slowly) moving away from using FMF with fcmp because fcmp doesn't usually have a logical relationship to the FMF. | |
llvm/test/Transforms/InstCombine/fcmp.ll | ||
1234 | Can we handle this kind of example as a preliminary patch? There's no fast-math required if we are checking if the result of sqrt is or is not negative (but we should test several predicates to verify that we have the correct behavior with NAN): |
llvm/test/Transforms/InstCombine/fcmp.ll | ||
---|---|---|
1234 | Note that we should already reduce some compares via: |
Maybe also worth to check gcc's issues with this transformation:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91734
- Rebase.
- Move fast math flag requirements from fcmp to sqrt.
- Don't require 'fast' flag on sqrt, only 'nnan && (reassoc || afn)'
With the fast math flags requirement moved to the sqrt itself, do you still see a problem here?
llvm/test/Transforms/InstCombine/fcmp.ll | ||
---|---|---|
1234 | That optimization is orthogonal to what we are doing here, additionally I don't think I have the floating-point expertise in order to get the NaN cases in such a change right. |
I think I'd prefer to start with the cases that don't require fast-math flags, then expand the transform as needed to include the cases that do require them.
For example, Alive2 says the following is correct without any fast-math flags:
declare half @llvm.sqrt.f16(half) define i1 @src(half %x) { %sqrt = call half @llvm.sqrt.f16(half %x) %cmp = fcmp ogt half %sqrt, 2.0 ret i1 %cmp } define i1 @tgt(half %x) { %r = fcmp ogt half %x, 4.00390625 ret i1 %r }
This has turned out to be far more complex to do safely than originally thought, given I don't as of yet have a compelling example where this provides a significant benefit I will abandon this for now.
No problem - I filed a bug and linked it to this review, so someone can revisit:
https://github.com/llvm/llvm-project/issues/55918
We're (very slowly) moving away from using FMF with fcmp because fcmp doesn't usually have a logical relationship to the FMF.
What really matters for this fold is that we can't require precise results from sqrt, so we should check that the sqrt has "reassoc" or "afn".