This is an archive of the discontinued LLVM Phabricator instance.

[WIP] [RFC] Don't lower floating point intrinsics to libcalls which modify errno
AbandonedPublic

Authored by efriedma on Jan 4 2017, 4:43 PM.

Details

Summary

sqrt(), sin(), cos(), pow(), exp(), exp2(), log(), log2(), and log10() are not available in variants which don't set errno on glibc (and maybe other more obscure platforms), so don't claim they're universally available in TargetLowering.cpp. Then fix LangRef to make it obvious that they in fact have no side-effects, and get rid of the silly sqrt() special-case.

TODO:

  • Make the backend error message print out the actual name of the intrinsic rather than a number.
  • Provide libcall lowerings for these intrinsics on platforms where they're available.
  • Provide a target API to check whether these intrinsics are available, and use it (SimplifyLibCalls, vectorizer)
  • Fix ConstantFolding so it doesn't assume the operand to llvm.sqrt() is positive
  • Fix regression tests (60-ish failures at the moment).
  • Fix clang so it doesn't generate calls to llvm.sqrt() and llvm.pow() when they aren't available.
  • Fix clang so it doesn't mark calls which have side-effects readnone.

Sort of a response to https://reviews.llvm.org/D27618 and other related discussion lately. I'm not sure when I'll have time to finish this.

Diff Detail

Repository
rL LLVM

Event Timeline

efriedma updated this revision to Diff 83161.Jan 4 2017, 4:43 PM
efriedma retitled this revision from to [WIP] [RFC] Don't lower floating point intrinsics to libcalls which modify errno.
efriedma updated this object.
efriedma set the repository for this revision to rL LLVM.
efriedma added subscribers: llvm-commits, spatel, avt77 and 2 others.

sqrt(), sin(), cos(), pow(), exp(), exp2(), log(), log2(), and log10() are not available in variants which don't set errno on glibc (and maybe other more obscure platforms), so don't claim they're universally available in TargetLowering.cpp. Then fix LangRef to make it obvious that they in fact have no side-effects, and get rid of the silly sqrt() special-case.

I fully support work in this direction; we definitely need to fix this up.

We also do need to model errno, however, or provide compiler-rt wrappers,etc. so that we can really use -fno-math-errno on systems where the math functions really do set errno and not miscompile code by reordering calls to math functions with calls to other functions (e.g. open, read) that set errno and where errno is later used (e.g. by perror).

We also do need to model errno, however, or provide compiler-rt wrappers,etc. so that we can really use -fno-math-errno on systems where the math functions really do set errno and not miscompile code by reordering calls to math functions with calls to other functions (e.g. open, read) that set errno and where errno is later used (e.g. by perror).

Yes, that's what I was getting at with "Fix clang so it doesn't mark calls which have side-effects readnone." It's hard to come up with a good solution here, though. The options I can think of:

  1. Make compiler-rt implement a bunch of libm functions with sane semantics. Straightforward, but writing fast, correct libm routines is tricky.
  2. Instead of marking the math functions readnone, mark them "dead_errno_write" or something like that. That allows correct modeling, but it probably ends up blocking a substantial number of optimizations because we have to assume almost any pointer could alias errno.
  3. Some solution which involves teaching the compiler or compiler-rt how to compute the address of errno. This gets nasty fast because it's different for every libc implementation.

That's largely orthogonal to this patch, though.

RKSimon added a subscriber: davide.Jan 5 2017, 3:26 AM

We also do need to model errno, however, or provide compiler-rt wrappers,etc. so that we can really use -fno-math-errno on systems where the math functions really do set errno and not miscompile code by reordering calls to math functions with calls to other functions (e.g. open, read) that set errno and where errno is later used (e.g. by perror).

Yes, that's what I was getting at with "Fix clang so it doesn't mark calls which have side-effects readnone." It's hard to come up with a good solution here, though. The options I can think of:

  1. Make compiler-rt implement a bunch of libm functions with sane semantics. Straightforward, but writing fast, correct libm routines is tricky.

No, you just need to wrap them such that you save/restore errno around the calls to the libm implementation.

  1. Instead of marking the math functions readnone, mark them "dead_errno_write" or something like that. That allows correct modeling, but it probably ends up blocking a substantial number of optimizations because we have to assume almost any pointer could alias errno.
  2. Some solution which involves teaching the compiler or compiler-rt how to compute the address of errno. This gets nasty fast because it's different for every libc implementation.

I know. I posted an RFC at some point where I detailed how errno is implemented across a wide range of implementations. It is nasty, at least in theory, but practically, it is not as bad as you might fear.

That's largely orthogonal to this patch, though.

It is in some sense, but understanding how we want to proceed here is an important part of this overall space, and directly affects whether it is reasonable to assume that the intrinsics will have no side effects.

I know. I posted an RFC at some point where I detailed how errno is implemented across a wide range of implementations. It is nasty, at least in theory, but practically, it is not as bad as you might fear.

http://lists.llvm.org/pipermail/llvm-dev/2013-November/068154.html ? It looks like the conclusion was actually that it's worse than that: matherr() exists on both glibc and MSVC, which means sqrt(-1) can do anything.

Sort of a response to https://reviews.llvm.org/D27618 and other related discussion lately. I'm not sure when I'll have time to finish this.

Should D27618 be abandoned and we instead pursue this approach?

Should D27618 be abandoned and we instead pursue this approach?

Probably, yes...

Actually, just fixing sqrt() on its own is substantially simpler than fixing all the other functions: we can lower llvm.sqrt() to "x < 0 ? NaN : sqrt(x)" if necessary, since that never sets errno. (We probably want to try to avoid generating that sequence if possible, but it would only be relevant on soft-float Linux targets, so maybe not a big deal.)

jlebar added a subscriber: jlebar.Jan 9 2017, 11:53 PM

I haven't really been looking at this lately.

The only angle that isn't mentioned here is that it might make sense to get rid of some of these intrinsics. llvm.sin(), llvm.cos(), and llvm.pow() are lowered to libcalls for every in-tree target.

Doesn't X86 lower llvm.sin/llvm.cos to fsin/fcos instructions with unsafe math and no sse?

arsenm added a subscriber: arsenm.Aug 4 2017, 12:43 PM

I haven't really been looking at this lately.

The only angle that isn't mentioned here is that it might make sense to get rid of some of these intrinsics. llvm.sin(), llvm.cos(), and llvm.pow() are lowered to libcalls for every in-tree target.

This isn't true. On AMDGPU we select sin and cos at least to a sequence involving a hardware instruction (but we use these as the fast versions)

Huh, I guess we do use the x87 sin/cos. It's probably a terrible idea (the microcoded routines are both slow and wildly inaccurate for large inputs), but we do it.

And it looks like AMDGPU lowers llvm.pow using the identity pow(x,y) = exp2(y * log2(x)). So nevermind. :)

scanon added a subscriber: scanon.Aug 4 2017, 4:08 PM

Huh, I guess we do use the x87 sin/cos. It's probably a terrible idea (the microcoded routines are both slow and wildly inaccurate for large inputs), but we do it.

fsin/fcos can be wildly inaccurate even for relatively modest arguments (https://randomascii.wordpress.com/2014/10/09/intel-underestimates-error-bounds-by-1-3-quintillion/). We really shouldn’t ever generate them unless the user specifically asks for them.

spatel edited edge metadata.Dec 3 2017, 9:43 AM

I think we have clang doing the expected thing after:

https://reviews.llvm.org/rL317031 ( https://reviews.llvm.org/D39204 )
https://reviews.llvm.org/rL317265 ( https://reviews.llvm.org/D39481 )
https://reviews.llvm.org/rL317407 ( https://reviews.llvm.org/D39615 )
https://reviews.llvm.org/rL318093 ( https://reviews.llvm.org/D39641 )
https://reviews.llvm.org/rL318598 ( https://reviews.llvm.org/D39611 )
https://reviews.llvm.org/rL319593 ( https://reviews.llvm.org/D40044 )
https://reviews.llvm.org/rL319619

...if you still see bugs up there, let me know.

Instead of:
"Fix clang so it doesn't generate calls to llvm.sqrt() and llvm.pow() when they aren't available."
We're deferring that to LLVM to save/restore errno somehow, but that's not done yet.

efriedma abandoned this revision.Aug 30 2018, 2:13 PM