This is an archive of the discontinued LLVM Phabricator instance.

[DCE] Eliminate no-op atan and atan2 calls
ClosedPublic

Authored by mohammed-nurulhoque on Jun 16 2022, 7:26 AM.

Details

Summary

From the opengroup specifications, atan2 can fail if the result underflows and atan can fail if the argument is subnormal.
In other cases we can eliminate the call to atan/atan2.

Diff Detail

Event Timeline

Herald added a project: Restricted Project. · View Herald TranscriptJun 16 2022, 7:26 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
mohammed-nurulhoque requested review of this revision.Jun 16 2022, 7:26 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 16 2022, 7:26 AM

The opengroup specification (https://pubs.opengroup.org/onlinepubs/9699919799/functions/atan.html) states:

If x is subnormal, a range error may occur

It does not state that error shall occur, which means this behavior is optional. So constant evaluation hardly needs changes.

nikic edited reviewers, added: efriedma, spatel; removed: eli.friedman.Jun 17 2022, 12:30 AM
nikic added a subscriber: nikic.
nikic added inline comments.
llvm/test/Transforms/InstCombine/elimAtan.ll
2 ↗(On Diff #437531)

Please run only -instcombine.

Changed the tests to only invoke -early-cse and moved the file to the earlyCSE tests folder

The opengroup specification (https://pubs.opengroup.org/onlinepubs/9699919799/functions/atan.html) states:

If x is subnormal, a range error may occur

It does not state that error shall occur, which means this behavior is optional. So constant evaluation hardly needs changes.

Yes. But before this patch, it doesn't consider any atan* call as No-Op, which means it's assuming all atan* as potentially failing.
This patch strictly limits the cases where an error might be expected

sepavloff added inline comments.Jul 6 2022, 10:08 AM
llvm/lib/Analysis/ConstantFolding.cpp
3302

Should we check for denormal argument?

C11 standard (http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf, 7.12.4.3) does not mention any error condition. GLIBC documentation (https://www.gnu.org/software/libc/manual/html_node/Inverse-Trig-Functions.html) also says nothing about it. POSIX (https://pubs.opengroup.org/onlinepubs/9699919799/functions/atan.html) says range error is optional.

Can we safely assume atan never sets error?

Updated, so atan & atan2 are always no-op. Since, the errors are optional, never raising an error is compliant. It's also consistent with how the other math functions with optional errors are currently implemented.

mohammed-nurulhoque marked an inline comment as done.Jul 12 2022, 12:05 PM
sepavloff added inline comments.Jul 13 2022, 11:10 AM
llvm/lib/Analysis/ConstantFolding.cpp
3371–3374

Both GLIBC and POSIX docs say that atan(0, 0) is 0. But C11 says (7.12.4.4p2):

... A domain error may occur if both arguments are zero.

Could you please elaborate this question?

llvm/lib/Analysis/ConstantFolding.cpp
3371–3374

The C11 still says the error is optional, so I think it's correct not to raise the error. Furthermore, POSIX docs say explicitly "If both arguments are 0, a domain error shall not occur."
Other than that, out of these 3, POSIX descriptions look the most complete here, as it enumerates a lot of corner cases, whereas GLIBC & C docs are high-level, so I'm inclined to go with the POSIX description here.

sepavloff added inline comments.Jul 15 2022, 6:16 AM
llvm/lib/Analysis/ConstantFolding.cpp
3371–3374

Please put here a comment saying that POSIX, GLIBC and MSVC assume atan2(0, 0) is 0, C11 says that a domain error may occur but does not require that.

llvm/test/Transforms/EarlyCSE/atan.ll
21

Why it is not constfolded? C11 (F.10.1.3) states:

atan(±∞) returns ±π /2.
32

Why it is not constfolded? C11 (F.10p11) states:

Functions with a NaN argument return a NaN result and raise no floating-point exception,
except where stated otherwise.
49

Why it is not removed? The value is not used and it should not have side effect, no?

mohammed-nurulhoque marked 3 inline comments as done.

Add atan2 tests for Inf & NaN

llvm/test/Transforms/EarlyCSE/atan.ll
21

infinity and NaN are not folded because in ConstantFoldScalarCall1 there's this snippet:

/// We only fold functions with finite arguments. Folding NaN and inf is
/// likely to be aborted with an exception anyway, and some host libms
/// have known errors raising exceptions.
if (!U.isFinite())
  return nullptr;

This is too conservative as your example shows, but changing it will affect most of the floating point functions, so it's probably best left for a separate commit.
Interestingly, the same wasn't done for calls with 2 arguments as atan2 with Inf&NaN arguments are folded as the newly added test cases show.

49

It's removed indeed. I accidentally uploaded the wrong diff.

There's an earlier check in canConstantFoldCallTo() that appears to prevent folding on "atanl" (and other long double calls), but it would be good to include a test like this to confirm:

define x86_fp80 @atanl_x86(x86_fp80 %x) {
  %call = call x86_fp80 @atanl(x86_fp80 noundef 0xK3FFF8CCCCCCCCCCCCCCD)
  ret x86_fp80 %call
}
llvm/test/Transforms/EarlyCSE/atan.ll
2

Unless there's a reason/difference, I'd expect tests like this to be in the folder tests/Transforms/InstSimplify/ConstProp with similar tests for mathlib calls and use "RUN: opt -passes=instsimplify..."

add a long double test

llvm/test/Transforms/EarlyCSE/atan.ll
2

instsimplify does the "replacing the use of the call return value with a constant" part, but does not do the "eliminating the now redundant call" part. This patch adds the elimination not the replacement with constant.

spatel added inline comments.Aug 5 2022, 1:27 PM
llvm/test/Transforms/EarlyCSE/atan.ll
2

Ah, I see now that the existing files are running -early-cse, but misplaced in the InstSimplify test directory. It will be easier to tell exactly what is changing if we pre-commit the tests with baseline results. Do you have commit access?

llvm/test/Transforms/EarlyCSE/atan.ll
2

Sounds good. I don't have commit access.

spatel added a comment.Aug 8 2022, 7:06 AM

Baseline tests committed:
b53d44fe474

Please rebase and add test comments to make it obvious why things are changing (or not changing).

Rebase tests on committed baseline tests

I moved the existing (misplaced) test files:
59f3b3d7963b93
...so this will need another update. Also, see my suggestion to add test comments, so the current behavior is explained.

There's an open question about what to do with denormals. We recently added denorm-attribute-aware folding with:
D116952
D128647
...and we mentioned that it is a work-in-progress to get consistent behavior for intrinsics and libcalls. Exclude denorm inputs in this patch?

There's an open question about what to do with denormals. We recently added denorm-attribute-aware folding with:
D116952
D128647
...and we mentioned that it is a work-in-progress to get consistent behavior for intrinsics and libcalls. Exclude denorm inputs in this patch?

The denormal question will affect the exact constant that will be substituted for the call, but it doesn't affect whether atan/atan2 is free from side-effects, no?
This patch only changes whether atan/atan2 is safe to remove, it doesn't change how we substitute constants.

spatel added a comment.Aug 9 2022, 5:15 AM

There's an open question about what to do with denormals. We recently added denorm-attribute-aware folding with:
D116952
D128647
...and we mentioned that it is a work-in-progress to get consistent behavior for intrinsics and libcalls. Exclude denorm inputs in this patch?

The denormal question will affect the exact constant that will be substituted for the call, but it doesn't affect whether atan/atan2 is free from side-effects, no?
This patch only changes whether atan/atan2 is safe to remove, it doesn't change how we substitute constants.

Denorms are mentioned specifically in the POSIX doc cited in an earlier comment:
https://pubs.opengroup.org/onlinepubs/9699919799/functions/atan.html

These functions may fail if
Range Error
  [MX] [Option Start] The value of x is subnormal.

But you're proposing that we take the permissive side of "may fail", so I guess that's fine. Please add test comments to make that clear.

rebase on moved tests & comment on error behaviour

spatel accepted this revision.Aug 9 2022, 7:44 AM

LGTM - see inline comment for one more suggestion.

llvm/test/Transforms/EarlyCSE/atan.ll
14

Label this test and the next one with a TODO comment?

This revision is now accepted and ready to land.Aug 9 2022, 7:44 AM

Updated TODO comments. Thank you for the review. I don't have write permissions, so please land this patch. Here are my details:
Name: Mohammed Nurul Hoque
email: mohammed.nurulhoque@imgtec.com

This revision was landed with ongoing or failed builds.Aug 10 2022, 8:01 AM
This revision was automatically updated to reflect the committed changes.
ro added a subscriber: ro.Aug 18 2022, 3:41 AM

The llvm/test/Transforms/EarlyCSE/atan.ll test FAILs on Solaris (both sparcv9 and amd64):

/vol/llvm/src/llvm-project/local/llvm/test/Transforms/EarlyCSE/atan.ll:55:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: ret float -0.000000e+00
              ^

Comparing to the Linux/x86_64 version of the output, the only difference is

--- /homes/ro/atan.ll.x86_64	2022-08-18 12:38:48.313115000 +0200
+++ /homes/ro/atan.ll.sparcv9	2022-08-18 11:50:22.976866000 +0200
@@ -25,7 +25,8 @@
 }
 
 define float @callatan2_00() {
-  ret float -0.000000e+00
+  %call = call float @atan2f(float -0.000000e+00, float 0.000000e+00)
+  ret float %call
 }
 
 define float @callatan2_x0() {

Any suggestion on how to handle this? Just XFAIL the test on Solaris?

In D127964#3731720, @ro wrote:

The llvm/test/Transforms/EarlyCSE/atan.ll test FAILs on Solaris (both sparcv9 and amd64):

/vol/llvm/src/llvm-project/local/llvm/test/Transforms/EarlyCSE/atan.ll:55:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: ret float -0.000000e+00
              ^

Comparing to the Linux/x86_64 version of the output, the only difference is

--- /homes/ro/atan.ll.x86_64	2022-08-18 12:38:48.313115000 +0200
+++ /homes/ro/atan.ll.sparcv9	2022-08-18 11:50:22.976866000 +0200
@@ -25,7 +25,8 @@
 }
 
 define float @callatan2_00() {
-  ret float -0.000000e+00
+  %call = call float @atan2f(float -0.000000e+00, float 0.000000e+00)
+  ret float %call
 }
 
 define float @callatan2_x0() {

Any suggestion on how to handle this? Just XFAIL the test on Solaris?

That would be the quick fix to get things passing. But does that mean the mathlib on Solaris is setting errno on "atan2(-0.0, 0.0)" ?

IEEE says "atan2(±0, +0) is ±0".
POSIX has that case as optional - https://pubs.opengroup.org/onlinepubs/9699919799/functions/atan2.html :
"If y is ±0 and x is +0, ±0 shall be returned."
And also says:
"If both arguments are 0, a domain error shall not occur."
...but that's in the same optional block.

So the more general fix would be to disallow constant folding on this case (and also on "atan2(+0.0, 0.0)")?

ro added a comment.Aug 18 2022, 1:34 PM

That would be the quick fix to get things passing. But does that mean the mathlib on Solaris is setting errno on "atan2(-0.0, 0.0)" ?

IEEE says "atan2(±0, +0) is ±0".
POSIX has that case as optional - https://pubs.opengroup.org/onlinepubs/9699919799/functions/atan2.html :
"If y is ±0 and x is +0, ±0 shall be returned."
And also says:
"If both arguments are 0, a domain error shall not occur."
...but that's in the same optional block.

I've just checked: the difference is between clang 11.3.0 and clang 15.0.0 (and older):

  • With gcc, I get both the expected return values and errno is 0.
  • With clang, the values remain correct, but somehow errno is set to EDOM.

However, this only happens because gcc, unlike clang, evaluates the atan2 calls at compile time. When compiling with -fno-builtin-atan2, both clang and gcc binaries behave the same.

I've found a copy of the Solaris 9 libm sources and am looking at the atan2 implementation and error handling right now.

ro added a comment.Aug 19 2022, 7:55 AM

I've looked around some more and it seems the Solaris libm acts within the C standard: all of C99, p.219, C11, p.239, and C17, p.147 state

A domain error may occur if both arguments are zero.

I've also found the atan2 docs on cppreference.com which says the same, adding

If the implementation supports IEEE floating-point arithmetic (IEC 60559),

    If x and y are both zero, domain error does not occur
    If x and y are both zero, range error does not occur either

IEC 60559 support thus seems to be lacking on Solaris and, given that's it's optional, the LLVM testsuite should cope either way.

In D127964#3735267, @ro wrote:

I've looked around some more and it seems the Solaris libm acts within the C standard: all of C99, p.219, C11, p.239, and C17, p.147 state

A domain error may occur if both arguments are zero.

I've also found the atan2 docs on cppreference.com which says the same, adding

If the implementation supports IEEE floating-point arithmetic (IEC 60559),

    If x and y are both zero, domain error does not occur
    If x and y are both zero, range error does not occur either

IEC 60559 support thus seems to be lacking on Solaris and, given that's it's optional, the LLVM testsuite should cope either way.

Thanks for checking. I added tests and avoided folding on these patterns:
4bff1037bbfc3
7f1262a322c0

In the initial review, we misinterpreted the POSIX docs because the optional behavior specifier for raising errors is easily missed. If there are other cases where this patch overstepped, we should fix those too.

ro added a comment.Aug 19 2022, 2:04 PM

Thanks for checking. I added tests and avoided folding on these patterns:
4bff1037bbfc3
7f1262a322c0

In the initial review, we misinterpreted the POSIX docs because the optional behavior specifier for raising errors is easily missed. If there are other cases where this patch overstepped, we should fix those too.

Unfortunately, the fix isn't complete yet: I still get a FAIL on Solaris/amd64 (atan.ll.x86_64 is the Linux/x86_64 output, atan.ll.amd64 the Solaris/amd64 one):

--- atan.ll.x86_64	2022-08-19 23:01:10.257646000 +0200
+++ atan.ll.amd64	2022-08-19 23:00:57.605261000 +0200
@@ -26,22 +26,22 @@
 
 define float @callatan2_00() {
   %call = call float @atan2f(float 0.000000e+00, float 0.000000e+00)
-  ret float 0.000000e+00
+  ret float %call
 }
 
 define float @callatan2_n00() {
   %call = call float @atan2f(float -0.000000e+00, float 0.000000e+00)
-  ret float -0.000000e+00
+  ret float %call
 }
 
 define float @callatan2_0n0() {
   %call = call float @atan2f(float 0.000000e+00, float -0.000000e+00)
-  ret float 0x400921FB60000000
+  ret float %call
 }
 
 define float @callatan2_n0n0() {
   %call = call float @atan2f(float -0.000000e+00, float -0.000000e+00)
-  ret float 0xC00921FB60000000
+  ret float %call
 }
 
 define float @callatan2_x0() {
In D127964#3736290, @ro wrote:

Unfortunately, the fix isn't complete yet: I still get a FAIL on Solaris/amd64 (atan.ll.x86_64 is the Linux/x86_64 output, atan.ll.amd64 the Solaris/amd64 one):

IIUC, that bug existed before this patch; it was just exposed with the additional tests. Ie, because the library raises an exception on atan2(0, 0), we don't want to assume any particular constant-folded result.

A similar corner-case came up in the earlier comments: we avoid constant folding on 1-argument libm calls that potentially raise exceptions with a NaN/Inf input, but that check is missing on the path that handles 2-argument calls. And now we need a special-case for atan2 to bail out when both inputs are zero.

Are tests passing on Solaris after 2981a9490277a7920936d287c?

ro added a comment.Aug 22 2022, 1:37 PM

Are tests passing on Solaris after 2981a9490277a7920936d287c?

They are indeed. The Solaris/amd64 is (almost) reliably green again after quite some time, and the Solaris/sparcv9 one is just one other (unrelated) failure away.

Thanks a lot.