This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contents

compiler-rt/lib/builtins/
  divdc3.c
  divsc3.c
  divtc3.c
  fp_lib.h
  int_lib.h
  int_math.h
  ppc/divtc3.c
compiler-rt/test/builtins/Unit/
  compiler_rt_fmax_test.c
  compiler_rt_fmaxf_test.c
  compiler_rt_fmaxl_test.c
  compiler_rt_scalbn_test.c
  compiler_rt_scalbnf_test.c
  compiler_rt_scalbnl_test.c

Differential D91841

[builtins] Define fmax and scalbn inline
Closed · Public

Authored by rprichard on Nov 19 2020, 9:17 PM.


Details

Reviewers

compnerd
scanon
rupprecht
efriedma
sivachandra
MaskRay
srhines

Commits

rGd20220141022: Reland "[builtins] Define fmax and scalbn inline"
rG341889ee9e03: [builtins] Define fmax and scalbn inline

Summary

Define inline versions of __compiler_rt_fmax* and __compiler_rt_scalbn*
rather than depend on the versions in libm. As with
__compiler_rt_logb*, these functions are only defined for single,
double, and quad precision (binary128).

Fixes PR32279 for targets using only these FP formats (e.g. Android
on arm/arm64/x86/x86_64).

For single and double precision, on AArch64, use __builtin_fmax[f]
instead of the new inline function, because the builtin expands to the
AArch64 fmaxnm instruction.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

rprichard created this revision. Nov 19 2020, 9:17 PM

Herald added a project: Restricted Project. Nov 19 2020, 9:17 PM

Herald added subscribers: Restricted Project, pengfei, kbarton and 2 others.

rprichard requested review of this revision. Nov 19 2020, 9:17 PM

With this change, I think PR32279 is still an issue with non-standard FP formats, like the x86 80-bit extended-precision format. In principle, it seems straightforward to extend the __compiler_rt_* inline functions to handle that format.

compiler-rt/test/builtins/Unit/compiler_rt_scalbn_test.c
35	FWIW, msvc added support for C99/C++11 hex float literals somewhere between 19.10 and 19.14: https://godbolt.org/z/fchz37. I'm not sure it's OK to take this dependency, but it's also used only in a test. The C11/C++11 {FLT,DBL,LDBL}_TRUE_MIN macros are defined in msvc 19.00.23506 and up, though (verified with rextester.com).
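To make the trade-off in this inline comment concrete: a C99 hex float literal and the C11 `*_TRUE_MIN` macros can both name the smallest positive subnormal value. The following is a minimal sketch, assuming a C11 compiler and IEEE binary32/binary64 `float`/`double`; the function names are hypothetical and are not taken from the test file.

```c
#include <float.h>

// Hypothetical helpers for illustration only. Each returns 1 when the C99
// hex float literal and the C11 macro denote the same smallest positive
// subnormal value (assumes IEEE binary64 double and binary32 float).
int double_true_min_matches_hex_literal(void) {
  return 0x1p-1074 == DBL_TRUE_MIN;
}

int float_true_min_matches_hex_literal(void) {
  return 0x1p-149f == FLT_TRUE_MIN;
}
```

A test that only needs these particular constants could use the macros and avoid hex float literals entirely; other test values would still need a literal or a bit-level construction.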

Harbormaster completed remote builds in B79563: Diff 306594. Nov 19 2020, 10:34 PM

Switch scalbn calls in ppc/divtc3.c to __compiler_rt_scalbn. I verified that, after doing so, libclang_rt.builtins-powerpc64le.a has no unresolved symbols for fmax*/logb*/scalbn*.

rprichard added reviewers: compnerd, scanon, rupprecht, efriedma. Nov 19 2020, 11:29 PM

rprichard added subscribers: pirama, srhines.

The existing __compiler_rt_logb* inline functions were also added to address PR32279, in D49514. On Android, all of fmax*/logb*/scalbn* are in libm, not libc. I commented more on the bug: https://bugs.llvm.org/show_bug.cgi?id=32279#c7.

Harbormaster completed remote builds in B79568: Diff 306606. Nov 19 2020, 11:54 PM

I wrote D49514 a long time ago, and since then, llvm-libc was created -- and it looks like it implements fmax, though not scalbn. Anyway, I wonder if it would be possible to use the llvm-libc implementation of fmax instead of adding one here? And similarly, for scalbn, maybe it would be easier/better long term to add one to libc instead of here.

Siva: are there docs anywhere saying what the status of each libc function is? Which functions have been added, are mature enough to actually use, etc.?

In D91841#2409070, @rupprecht wrote:

Siva: are there docs anywhere saying what the status of each libc function is? Which functions have been added, are mature enough to actually use, etc.?

What is available can be found here: https://github.com/llvm/llvm-project/tree/master/libc/src/math
Specifically, fmax and friends are available. While scalbn is not available, ldexp, which is equivalent to scalbn on radix-2 systems, is available.

I do not know much about the compiler_rt use cases. But, it seems to me that you want to avoid dependence on a libc. So, what exactly do you mean by pulling the implementations from LLVM libc? Do you mean to suggest that compiler_rt should use [parts of] LLVM libc as a library of math functions? If yes, then that is possible with the basic functions: https://github.com/llvm/llvm-project/tree/master/libc/utils/FPUtil

FPUtil went through a few iterations, so it is not as clean as we would like it to be. But all the functions defined in there are template functions which handle float, double and long double (even for x86_64). So, for the callers, it is just a header library one can call into. No link-time dependency.

I don't think that we can use llvm-libc, but I suppose that lifting the functionality from there is possible. What does that gain over this though?

compiler-rt/lib/builtins/divdc3.c
23–24	If you want to change the prefix (which I think is probably a good idea), please do so in a separate (preliminary) change for `crt_fmax`, `crt_scalbn`, and `crt_fabs`.

The __compiler_rt_{logb,scalbn,fmax}* functions don't work with x87 extended-precision FP, and I think FPUtil would handle that. Perhaps the existing inline functions could be extended to support x87, though.

Is there an ABI issue with using FPUtil headers from the builtins? e.g. The compiler is allowed to output an out-of-line version of C++ inline functions, so if a builtins archive and an LLVM libc.a were built from different versions of LLVM, could we have incompatible versions of (say) NormalFloat's methods between the two archives? Maybe FPUtil could be in an anonymous namespace.

I think FPBits is assuming a GCC/Clang-like compiler, and the builtins have some support for MSVC. e.g. FPBits is using __attribute__((packed)) and appears to use bitfields to match an FP type's layout. The bitfields have different base types, so I think the layout with MSVC won't work as expected. (But MSVC should be OK if the base types are the same for all fields, I think.)

FPUtil logb has this code:

if (bits.isZero()) {
  // TODO(Floating point exception): Raise div-by-zero exception.
  // TODO(errno): POSIX requires setting errno to ERANGE.
  return FPBits<T>::negInf();
} else if (bits.isNaN()) {

I think the builtins are not supposed to set errno. (But they currently can: https://bugs.llvm.org/show_bug.cgi?id=32279#c8)
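As a rough illustration of a logb that cannot touch errno, here is a sketch for IEEE binary64 `double` that works purely on the bit representation, along the lines of the subnormal handling visible in `__compiler_rt_logbX`. The names `to_rep`, `from_rep`, and `sketch_logb` are hypothetical, and `__builtin_clzll` assumes a GCC- or Clang-compatible compiler.

```c
#include <stdint.h>
#include <string.h>

// Hypothetical helpers: reinterpret the bits of a double (assumes binary64).
static uint64_t to_rep(double x) { uint64_t r; memcpy(&r, &x, sizeof r); return r; }
static double from_rep(uint64_t r) { double x; memcpy(&x, &r, sizeof x); return x; }

// Sketch of logb that never calls into libm and never sets errno.
double sketch_logb(double x) {
  const uint64_t rep = to_rep(x);
  const int exp = (int)((rep >> 52) & 0x7ff); // biased exponent field

  if (exp == 0x7ff) // NaN stays NaN; +/-inf maps to +inf
    return (x != x) ? x : from_rep(UINT64_C(0x7ff0000000000000));
  if (x == 0.0) // +/-0.0 maps to -inf, built directly from its bit pattern
    return from_rep(UINT64_C(0xfff0000000000000));
  if (exp != 0) // normal number: unbias the exponent
    return exp - 1023;

  // Subnormal: count how far the leading 1 sits below the implicit-bit
  // position (bit 52) and adjust the minimum normal exponent accordingly.
  const uint64_t sig = rep & ((UINT64_C(1) << 52) - 1);
  const int shift = __builtin_clzll(sig) - 11;
  return (1 - shift) - 1023;
}
```

For example, `sketch_logb(8.0)` yields 3 and `sketch_logb(0x1p-1074)` yields -1074; no floating-point exception is raised for zero input, which differs from what POSIX specifies for the libm function.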

The builtins currently don't have any C++ code, AFAIK. I don't know that this matters, though.

I also noticed that __llvm_libc::fputil::ldexp appears to ignore the rounding mode (fesetround), whereas the ldexp/scalbn functions in glibc, musl, and bionic all respect the rounding mode. I don't know whether the compiler-rt builtins (e.g. for complex division) currently have the correct rounding-mode behavior -- for __compiler_rt_scalbn*, I was just preserving the behavior I saw in scalbn.
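One way an inline scalbn can respect the rounding mode without any libm call is to leave the overflow and subnormal cases to a hardware multiply, which rounds according to the current mode. The following is a sketch of that idea for IEEE binary64 `double`, modeled on the structure of `__compiler_rt_scalbnX` in this revision but not copied from it. All names are hypothetical, and `__builtin_clzll` assumes a GCC- or Clang-compatible compiler.

```c
#include <stdint.h>
#include <string.h>

// Hypothetical helpers: reinterpret the bits of a double (assumes binary64).
static uint64_t to_rep(double x) { uint64_t r; memcpy(&r, &x, sizeof r); return r; }
static double from_rep(uint64_t r) { double x; memcpy(&x, &r, sizeof x); return x; }

// Sketch: compute x * 2**y without libm and without setting errno.
double sketch_scalbn(double x, int y) {
  const int sig_bits = 52, max_exp = 0x7ff, bias = 1023;
  const uint64_t sig_mask = (UINT64_C(1) << sig_bits) - 1;
  const uint64_t rep = to_rep(x);
  const uint64_t sign = rep & (UINT64_C(1) << 63);
  int exp = (int)((rep >> sig_bits) & 0x7ff);

  if (x == 0.0 || exp == max_exp)
    return x; // +/- 0.0, NaN, or inf: return x unchanged

  uint64_t sig = rep & sig_mask;
  if (exp == 0) {
    // Subnormal input: shift the leading 1 up to the implicit-bit position,
    // then clear it, so the value reads as 1.sig * 2**(exp - bias).
    const int shift = __builtin_clzll(sig) - 11;
    sig = (sig << shift) & sig_mask;
    exp = 1 - shift;
  }

  // A 64-bit sum cannot overflow for any int y.
  const long long e = (long long)exp + y;

  if (e >= max_exp) {
    // Overflow: doubling the largest power of two lets the FPU pick infinity
    // or the largest finite value, depending on the rounding mode.
    return from_rep(sign | ((uint64_t)(max_exp - 1) << sig_bits)) * 2.0;
  }
  if (e <= 0) {
    // Subnormal or underflow: build 1.sig * 2**-1022 and scale it down with a
    // multiply so that the FPU performs the rounding.
    const double tmp = from_rep(sign | (UINT64_C(1) << sig_bits) | sig);
    long long scale = e + bias - 1;
    if (scale < 1)
      scale = 1; // far below the subnormal range; the product underflows
    return tmp * from_rep((uint64_t)scale << sig_bits);
  }
  return from_rep(sign | ((uint64_t)e << sig_bits) | sig); // normal result
}
```

In the default round-to-nearest mode, `sketch_scalbn(1.0, 3)` is 8.0 and `sketch_scalbn(0x1p-1074, 1074)` is 1.0. Behavior under other rounding modes depends on the target's floating-point environment support.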

compiler-rt/lib/builtins/divdc3.c
23–24	I haven't renamed the functions. My change adds `__compiler_rt_scalbn*` and `__compiler_rt_fmax*` as functions in fp_lib.h, but it doesn't remove the existing `crt_scalbn` and `crt_fmax` functions from int_math.h. The long double `crt_logbl`, `crt_scalbnl`, and `crt_fmaxl` functions are still used:
- As the fallback for non-binary128 long double FP in fp_lib.h. (I'm not sure that the `__compiler_rt_*l` functions are actually called for a non-binary128 QUAD_PRECISION mode, though...)
- In divxc3.c (x87 80-bit long double complex division).

I noticed that D49514 removed the non-long-double `crt_` functions it replaced, so I suppose I should also do that.

Remove obsoleted crt_fmax[f] and crt_scalbn[f] functions.

Harbormaster completed remote builds in B79888: Diff 307236. Nov 23 2020, 9:41 PM

In D91841#2412847, @rprichard wrote:

The __compiler_rt_{logb,scalbn,fmax}* functions don't work with x87 extended-precision FP, and I think FPUtil would handle that. Perhaps the existing inline functions could be extended to support x87, though.

Is there an ABI issue with using FPUtil headers from the builtins? e.g. The compiler is allowed to output an out-of-line version of C++ inline functions, so if a builtins archive and an LLVM libc.a were built from different versions of LLVM, could we have incompatible versions of (say) NormalFloat's methods between the two archives? Maybe FPUtil could be in an anonymous namespace.

I think FPBits is assuming a GCC/Clang-like compiler, and the builtins have some support for MSVC. e.g. FPBits is using __attribute__((packed)) and appears to use bitfields to match an FP type's layout. The bitfields have different base types, so I think the layout with MSVC won't work as expected. (But MSVC should be OK if the base types are the same for all fields, I think.)

If one wants cross-platform code out of the box, LLVM libc in general is not yet ready for it. We are working on it, and if everything goes well, end of Q1/early Q2 2021 is when I would think we will see LLVM libc slowly coming up on Windows.

I also noticed that __llvm_libc::fputil::ldexp appears to ignore the rounding mode (fesetround), whereas the ldexp/scalbn functions in glibc, musl, and bionic all respect the rounding mode. I don't know whether the compiler-rt builtins (e.g. for complex division) currently have the correct rounding-mode behavior -- for __compiler_rt_scalbn*, I was just preserving the behavior I saw in scalbn.

Yes. We are working on the floating point exception and rounding mode story currently. If everything goes well, we expect to sort this out before the end of this year.

If the concerns you raised here are actually hard requirements, I would say FPUtil is not quite ready for your use case. But, if you do want to use FPUtil, I will be happy to prioritize your use cases higher and get them out faster.

In D91841#2411686, @compnerd wrote:

I don't think that we can use llvm-libc, but I suppose that lifting the functionality from there is possible. What does that gain over this though?

It lets you reuse an already written implementation instead of adding a (possibly buggy) implementation here. In general it's better to have one implementation of things than N implementations -- code is a liability, not an asset.

Anyway, it sounds like it isn't possible, at least not yet. Thanks all for at least entertaining my idea :)

I may not be the best reviewer for this, but I'll take a pass today.

LGTM, though it'd be good to have another set of eyes on this to approve it.

compiler-rt/lib/builtins/fp_lib.h
309–310	IIUC this also handles a = -0.0, so the comment should say `// +/- 0.0, NaN, ...`
346–347	For my own curiosity, C99 says on fmax (footnote 361): "Ideally, fmax would be sensitive to the sign of zero, for example fmax(−0.0, +0.0) would return +0; however, implementation in software might be impractical." If this were exported outside of compiler-rt, I'd consider asking for that to be changed, but it looks like it doesn't matter -- the only context it's used in compiler-rt is variants of "fmax(fabs(x), fabs(y))", so sign should never be a factor.
362–368	Normally, the way to go would be to override this in `compiler-rt/lib/builtins/aarch64`. I'm not sure that's an option here though.
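The point about the sign of zero in the fmax comment above can be checked with a small sketch. It restates the comparison used by `__compiler_rt_fmaxX` for `double` and shows that the result's sign for mixed zeros depends on argument order, but not once both arguments have passed through fabs. The helper names are hypothetical, and IEEE binary64 `double` is assumed.

```c
#include <stdint.h>
#include <string.h>

// Same selection rule as the inline fmax in this revision, written for
// double: if x is NaN or x < y, pick y; otherwise pick x.
double sketch_fmax(double x, double y) {
  return (x != x || x < y) ? y : x;
}

// Hypothetical helpers, implemented on the bit pattern to avoid libm.
int sketch_signbit(double x) {
  uint64_t r;
  memcpy(&r, &x, sizeof r);
  return (int)(r >> 63); // 1 if the sign bit is set
}

double sketch_fabs(double x) {
  uint64_t r;
  memcpy(&r, &x, sizeof r);
  r &= ~(UINT64_C(1) << 63); // clear the sign bit
  memcpy(&x, &r, sizeof x);
  return x;
}

double sketch_nan(void) {
  const uint64_t r = UINT64_C(0x7ff8000000000000); // a quiet NaN
  double x;
  memcpy(&x, &r, sizeof x);
  return x;
}
```

With these definitions, `sketch_fmax(-0.0, 0.0)` returns the first argument (negative zero), whereas `sketch_fmax(sketch_fabs(-0.0), sketch_fabs(0.0))` always has a clear sign bit, which is the only pattern the complex-division builtins use.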

rprichard added inline comments. Nov 30 2020, 8:33 PM

compiler-rt/lib/builtins/fp_lib.h
362–368	I could also leave the special case AArch64 out. Maybe it's faster to use fmaxnm, but it also might not be important. Clang manages to compile the expression into 3 FP instructions. GCC manages it with 5 instructions. https://godbolt.org/z/Yc9zoK. This isn't so much an AArch64-specific version of fmax as a list of configurations where the compiler is expected to inline `__builtin_fmax*` rather than call a library function.

Rename new fmax/scalbn fp_lib.h params from a/b to x/y for consistency with the new tests and with the existing __compiler_rt_logb*.

Add +/- to __compiler_rt_scalbnX comment.

Harbormaster completed remote builds in B80602: Diff 308522. Nov 30 2020, 9:13 PM

Harbormaster completed remote builds in B80604: Diff 308524. Nov 30 2020, 9:18 PM

The runtime libraries should have a proper layering. Generally compiler-rt builtins (libgcc_s.so.1) should not depend on libc. (There are use cases where people depend on compiler-rt builtins but not libc.) abort is a known exception, but we should generally avoid adding more exceptions.

Implementing the relevant functions called by divxc3 is one choice, another choice is to emulate libgcc - don't use scalbn.

danielkiss added a subscriber: danielkiss. Jan 27 2021, 8:18 AM

In D91841#2428875, @MaskRay wrote:

Implementing the relevant functions called by divxc3 is one choice, another choice is to emulate libgcc - don't use scalbn.

The libgcc version goes about the computation in a slightly different way -- e.g. it computes an intermediate ratio of c / d or d / c (ensuring that the fabs(ratio) is at most 1.0) to avoid overflow, whereas the current compiler-rt version uses scalbn (and closely matches the reference implementation in Annex G of the C specification). The scalbn approach seems to be more accurate? e.g.:

From G.5.1 Multiplicative operators (paragraph 9):

Scaling the denominator alleviates the main overflow and underflow problem, which is more serious than for multiplication.
In the spirit of the multiplication example above, this code does not defend against overflow and underflow in the calculation
of the numerator. Scaling with the scalbn function, instead of with division, provides better roundoff characteristics.

From the libgcc source:

/* ??? We can get better behavior from logarithmic scaling instead of
   the division.  But that would mean starting to link libgcc against
   libm.  We could implement something akin to ldexp/frexp as gcc builtins
   fairly easily...  */

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59714#c8

I'm guessing we wouldn't want to reduce accuracy to break the libm dependency?
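For readers unfamiliar with the ratio approach described above, here is a sketch contrasting it with the textbook formula. It is a Smith-style division written for illustration; it is not the libgcc source and not the compiler-rt code, and the type and function names are hypothetical.

```c
// Hypothetical result type for this sketch.
typedef struct {
  double re;
  double im;
} sketch_complex;

static double sketch_abs(double v) { return v < 0 ? -v : v; }

// Textbook formula: (a + ib) / (c + id). The denominator c*c + d*d can
// overflow even when the true quotient is representable.
sketch_complex naive_div(double a, double b, double c, double d) {
  const double denom = c * c + d * d;
  sketch_complex z = {(a * c + b * d) / denom, (b * c - a * d) / denom};
  return z;
}

// Ratio approach: divide numerator and denominator by whichever of c or d
// has the larger magnitude, so the intermediate ratio is at most 1 in
// magnitude and the denominator is never squared.
sketch_complex ratio_div(double a, double b, double c, double d) {
  sketch_complex z;
  if (sketch_abs(c) < sketch_abs(d)) {
    const double ratio = c / d;
    const double denom = c * ratio + d;
    z.re = (a * ratio + b) / denom;
    z.im = (b * ratio - a) / denom;
  } else {
    const double ratio = d / c;
    const double denom = d * ratio + c;
    z.re = (b * ratio + a) / denom;
    z.im = (b - a * ratio) / denom;
  }
  return z;
}
```

On a target that evaluates `double` arithmetic in double precision, dividing 1e300 + 1e300i by itself with `naive_div` overflows the denominator, while `ratio_div` returns 1 + 0i. The scalbn-based version in compiler-rt avoids the same overflow by rescaling c and d with exact powers of two instead of an inexact division, which is the roundoff advantage that Annex G describes.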

Adding a couple of other people as possible reviewers.

Ping.

In D91841#2540868, @rprichard wrote:
In D91841#2428875, @MaskRay wrote:

Implementing the relevant functions called by divxc3 is one choice, another choice is to emulate libgcc - don't use scalbn.

The libgcc version goes about the computation in a slightly different way -- e.g. it computes an intermediate ratio of c / d or d / c (ensuring that the fabs(ratio) is at most 1.0) to avoid overflow, whereas the current compiler-rt version uses scalbn (and closely matches the reference implementation in Annex G of the C specification). The scalbn approach seems to be more accurate? e.g.:

From G.5.1 Multiplicative operators (paragraph 9):

Scaling the denominator alleviates the main overflow and underflow problem, which is more serious than for multiplication.
In the spirit of the multiplication example above, this code does not defend against overflow and underflow in the calculation
of the numerator. Scaling with the scalbn function, instead of with division, provides better roundoff characteristics.

From the libgcc source:
/* ??? We can get better behavior from logarithmic scaling instead of
   the division.  But that would mean starting to link libgcc against
   libm.  We could implement something akin to ldexp/frexp as gcc builtins
   fairly easily...  */
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59714#c8

I'm guessing we wouldn't want to reduce accuracy to break the libm dependency?

@MaskRay do you still have concerns with the approach, or are you happy with the better accuracy? I think in practice most binaries will depend on libm anyway (iirc libc++ requires it already). If you're okay with this approach, mind LGTMing? :)

LGTM.

This revision is now accepted and ready to land. Feb 19 2021, 7:07 PM

This revision was landed with ongoing or failed builds. Feb 24 2021, 2:33 PM

Closed by commit rG341889ee9e03: [builtins] Define fmax and scalbn inline (authored by rprichard).

This revision was automatically updated to reflect the committed changes.

rprichard added a commit: rG341889ee9e03: [builtins] Define fmax and scalbn inline.

rprichard added a reverting change: rG680f836c2fa7: Revert "[builtins] Define fmax and scalbn inline". Feb 24 2021, 2:49 PM

The new unit tests failed on sanitizer-windows. I reverted it for now. https://lab.llvm.org/buildbot/#builders/127/builds/6620

rprichard reopened this revision. Feb 26 2021, 4:26 AM

This revision is now accepted and ready to land. Feb 26 2021, 4:26 AM

Disable non-default-rounding-mode scalbn[f] tests when using the MSVC libraries.

MSVC's scalbn appears to ignore the current rounding mode, so the value returned from __compiler_rt_scalbn[f] disagreed with the value returned from scalbn[f]. I think it's sufficient to disable the non-default-rounding-mode tests for MSVC. With this change, the __divdc3 and __divsc3 builtins, when compiled with MSVC, should produce the same results as on Linux systems (or as with MinGW and the other Unix systems I tested).

These two builtins don't seem to have much use when LLVM is configured to use MSVC's libraries. In this configuration, complex.h doesn't define a complex macro for complex float and complex double. It's possible to use _Complex float and _Complex double directly, but they can't be passed to functions in the C library, because those functions accept MSVC's _Fcomplex and _Dcomplex struct types instead. By default, the builtins library isn't linked -- I must pass --rtlib=compiler-rt to enable it.

For reference, I wrote a test demonstrating some behavior I saw with MSVC's scalbn and ldexp functions: https://gist.github.com/rprichard/8a4e8c3edd73ffbceb187a9e8cb7e9ed#file-zzz-output-msvc-19-16-x64-txt

ldexp respected the current rounding mode on overflow, but scalbn didn't.
For rounding of subnormals:
- scalbn rounded to nearest-even and didn't set FE_INEXACT or FE_UNDERFLOW.
- ldexp rounded towards zero and set FE_INEXACT and FE_UNDERFLOW.

Harbormaster completed remote builds in B91008: Diff 326653. Feb 26 2021, 5:14 AM

+2 for relanding this.

This revision was landed with ongoing or failed builds. Feb 26 2021, 4:22 PM

Closed by commit rGd20220141022: Reland "[builtins] Define fmax and scalbn inline" (authored by rprichard).

This revision was automatically updated to reflect the committed changes.

rprichard added a commit: rGd20220141022: Reland "[builtins] Define fmax and scalbn inline".

Hello,

When integrating this change into our downstream toolchain, the new tests for scalbn and scalbnf fail:

******************** TEST 'Builtins-arm-linux :: compiler_rt_scalbn_test.c' FAILED ********************

Exit Code: 1
Command Output (stdout):

error: [FE_UPWARD] in __compiler_rt_scalbn(-0x1p+0 [BFF0000000000000], 10000) = -0x1.fffffffffffffp+1023 [FFEFFFFFFFFFFFFF] != -inf [FFF0000000000000]

******************** TEST 'Builtins-arm-linux :: compiler_rt_scalbnf_test.c' FAILED ********************

Exit Code: 1
Command Output (stdout):

error: [FE_UPWARD] in __compiler_rt_scalbnf(-0x1p+0 [BF800000], 1000) = -0x1.fffffep+127 [FF7FFFFF] != -inf [FF800000]

There seem to be some inaccuracies. Can someone provide tips on how to debug these errors?

error: [FE_UPWARD] in __compiler_rt_scalbn(-0x1p+0 [BFF0000000000000], 10000) = -0x1.fffffffffffffp+1023 [FFEFFFFFFFFFFFFF] != -inf [FFF0000000000000]

It looks like compiler-rt's inlined __compiler_rt_scalbn is respecting the floating-point rounding mode (fesetround(FE_UPWARD)), but the libm scalbn is ignoring it instead. What libc/libm and (sub)architecture is this?

The rounding mode tests are already disabled on arm if __ARM_FP isn't defined, but maybe that's not sufficient.

What libc/libm and (sub)architecture is this?

ARM32 on Linux, using -mfloat-abi=soft. We've tried the libc/libm distributed with GCC (4.9.0 and 7.2 released by the FSF, and 7.1.1 released by Linaro).

We also tried -mfloat-abi=hard with the libc/libm from GCC 10.2, released by ARM, which passed the tests. So... should these tests be disabled when using -mfloat-abi=soft?

hmm... per GCC/ARM manual about -mfloat-abi:

-mfloat-abi=name

Specifies which floating-point ABI to use. Permissible values are: ‘soft’, ‘softfp’ and ‘hard’.

Specifying ‘soft’ causes GCC to generate output containing library calls for floating-point operations. **‘softfp’ allows the generation of code using hardware floating-point instructions, but still uses the soft-float calling conventions.** ‘hard’ allows generation of floating-point instructions and uses FPU-specific calling conventions.

We build compiler-rt and these tests using 'softfp' by default in our downstream environment. We'll build & run these tests with 'soft' for now. Please let us know if this is the proper solution @rprichard.

MaskRay mentioned this in D119827: [OpenMP][cmake] Ensure linking against libm for Linux. Feb 24 2022, 3:14 PM

Revision Contents

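Path	Size
compiler-rt/lib/builtins/divdc3.c	12 lines
compiler-rt/lib/builtins/divsc3.c	11 lines
compiler-rt/lib/builtins/divtc3.c	11 lines
compiler-rt/lib/builtins/fp_lib.h	95 lines
compiler-rt/lib/builtins/int_lib.h	13 lines
compiler-rt/lib/builtins/int_math.h	8 lines
compiler-rt/lib/builtins/ppc/divtc3.c	19 lines
compiler-rt/test/builtins/Unit/compiler_rt_fmax_test.c	41 lines
compiler-rt/test/builtins/Unit/compiler_rt_fmaxf_test.c	39 lines
compiler-rt/test/builtins/Unit/compiler_rt_fmaxl_test.c	58 lines
compiler-rt/test/builtins/Unit/compiler_rt_scalbn_test.c	78 lines
compiler-rt/test/builtins/Unit/compiler_rt_scalbnf_test.c	77 lines
compiler-rt/test/builtins/Unit/compiler_rt_scalbnl_test.c	77 lines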

Diff 326835

compiler-rt/lib/builtins/divdc3.c

	Show All 14 Lines
	#include "int_lib.h"			#include "int_lib.h"
	#include "int_math.h"			#include "int_math.h"

	// Returns: the quotient of (a + ib) / (c + id)			// Returns: the quotient of (a + ib) / (c + id)

	COMPILER_RT_ABI Dcomplex __divdc3(double __a, double __b, double __c,			COMPILER_RT_ABI Dcomplex __divdc3(double __a, double __b, double __c,
	double __d) {			double __d) {
	int __ilogbw = 0;			int __ilogbw = 0;
	double __logbw = __compiler_rt_logb(crt_fmax(crt_fabs(__c), crt_fabs(__d)));			double __logbw = __compiler_rt_logb(__compiler_rt_fmax(crt_fabs(__c),
				crt_fabs(__d)));
	if (crt_isfinite(__logbw)) {			if (crt_isfinite(__logbw)) {
	__ilogbw = (int)__logbw;			__ilogbw = (int)__logbw;
	__c = crt_scalbn(__c, -__ilogbw);			__c = __compiler_rt_scalbn(__c, -__ilogbw);
	__d = crt_scalbn(__d, -__ilogbw);			__d = __compiler_rt_scalbn(__d, -__ilogbw);
	}			}
	double __denom = __c * __c + __d * __d;			double __denom = __c * __c + __d * __d;
	Dcomplex z;			Dcomplex z;
	COMPLEX_REAL(z) = crt_scalbn((__a * __c + __b * __d) / __denom, -__ilogbw);			COMPLEX_REAL(z) =
				__compiler_rt_scalbn((__a * __c + __b * __d) / __denom, -__ilogbw);
	COMPLEX_IMAGINARY(z) =			COMPLEX_IMAGINARY(z) =
	crt_scalbn((__b * __c - __a * __d) / __denom, -__ilogbw);			__compiler_rt_scalbn((__b * __c - __a * __d) / __denom, -__ilogbw);
	if (crt_isnan(COMPLEX_REAL(z)) && crt_isnan(COMPLEX_IMAGINARY(z))) {			if (crt_isnan(COMPLEX_REAL(z)) && crt_isnan(COMPLEX_IMAGINARY(z))) {
	if ((__denom == 0.0) && (!crt_isnan(__a) || !crt_isnan(__b))) {			if ((__denom == 0.0) && (!crt_isnan(__a) || !crt_isnan(__b))) {
	COMPLEX_REAL(z) = crt_copysign(CRT_INFINITY, __c) * __a;			COMPLEX_REAL(z) = crt_copysign(CRT_INFINITY, __c) * __a;
	COMPLEX_IMAGINARY(z) = crt_copysign(CRT_INFINITY, __c) * __b;			COMPLEX_IMAGINARY(z) = crt_copysign(CRT_INFINITY, __c) * __b;
	} else if ((crt_isinf(__a) || crt_isinf(__b)) && crt_isfinite(__c) &&			} else if ((crt_isinf(__a) || crt_isinf(__b)) && crt_isfinite(__c) &&
	crt_isfinite(__d)) {			crt_isfinite(__d)) {
	__a = crt_copysign(crt_isinf(__a) ? 1.0 : 0.0, __a);			__a = crt_copysign(crt_isinf(__a) ? 1.0 : 0.0, __a);
	__b = crt_copysign(crt_isinf(__b) ? 1.0 : 0.0, __b);			__b = crt_copysign(crt_isinf(__b) ? 1.0 : 0.0, __b);
	Show All 12 Lines

compiler-rt/lib/builtins/divsc3.c

	Show All 14 Lines
	#include "int_lib.h"			#include "int_lib.h"
	#include "int_math.h"			#include "int_math.h"

	// Returns: the quotient of (a + ib) / (c + id)			// Returns: the quotient of (a + ib) / (c + id)

	COMPILER_RT_ABI Fcomplex __divsc3(float __a, float __b, float __c, float __d) {			COMPILER_RT_ABI Fcomplex __divsc3(float __a, float __b, float __c, float __d) {
	int __ilogbw = 0;			int __ilogbw = 0;
	float __logbw =			float __logbw =
	__compiler_rt_logbf(crt_fmaxf(crt_fabsf(__c), crt_fabsf(__d)));			__compiler_rt_logbf(__compiler_rt_fmaxf(crt_fabsf(__c), crt_fabsf(__d)));
	if (crt_isfinite(__logbw)) {			if (crt_isfinite(__logbw)) {
	__ilogbw = (int)__logbw;			__ilogbw = (int)__logbw;
	__c = crt_scalbnf(__c, -__ilogbw);			__c = __compiler_rt_scalbnf(__c, -__ilogbw);
	__d = crt_scalbnf(__d, -__ilogbw);			__d = __compiler_rt_scalbnf(__d, -__ilogbw);
	}			}
	float __denom = __c * __c + __d * __d;			float __denom = __c * __c + __d * __d;
	Fcomplex z;			Fcomplex z;
	COMPLEX_REAL(z) = crt_scalbnf((__a * __c + __b * __d) / __denom, -__ilogbw);			COMPLEX_REAL(z) =
				__compiler_rt_scalbnf((__a * __c + __b * __d) / __denom, -__ilogbw);
	COMPLEX_IMAGINARY(z) =			COMPLEX_IMAGINARY(z) =
	crt_scalbnf((__b * __c - __a * __d) / __denom, -__ilogbw);			__compiler_rt_scalbnf((__b * __c - __a * __d) / __denom, -__ilogbw);
	if (crt_isnan(COMPLEX_REAL(z)) && crt_isnan(COMPLEX_IMAGINARY(z))) {			if (crt_isnan(COMPLEX_REAL(z)) && crt_isnan(COMPLEX_IMAGINARY(z))) {
	if ((__denom == 0) && (!crt_isnan(__a) || !crt_isnan(__b))) {			if ((__denom == 0) && (!crt_isnan(__a) || !crt_isnan(__b))) {
	COMPLEX_REAL(z) = crt_copysignf(CRT_INFINITY, __c) * __a;			COMPLEX_REAL(z) = crt_copysignf(CRT_INFINITY, __c) * __a;
	COMPLEX_IMAGINARY(z) = crt_copysignf(CRT_INFINITY, __c) * __b;			COMPLEX_IMAGINARY(z) = crt_copysignf(CRT_INFINITY, __c) * __b;
	} else if ((crt_isinf(__a) || crt_isinf(__b)) && crt_isfinite(__c) &&			} else if ((crt_isinf(__a) || crt_isinf(__b)) && crt_isfinite(__c) &&
	crt_isfinite(__d)) {			crt_isfinite(__d)) {
	__a = crt_copysignf(crt_isinf(__a) ? 1 : 0, __a);			__a = crt_copysignf(crt_isinf(__a) ? 1 : 0, __a);
	__b = crt_copysignf(crt_isinf(__b) ? 1 : 0, __b);			__b = crt_copysignf(crt_isinf(__b) ? 1 : 0, __b);
	Show All 12 Lines

compiler-rt/lib/builtins/divtc3.c

	Show All 15 Lines
	#include "int_math.h"			#include "int_math.h"

	// Returns: the quotient of (a + ib) / (c + id)			// Returns: the quotient of (a + ib) / (c + id)

	COMPILER_RT_ABI Lcomplex __divtc3(long double __a, long double __b,			COMPILER_RT_ABI Lcomplex __divtc3(long double __a, long double __b,
	long double __c, long double __d) {			long double __c, long double __d) {
	int __ilogbw = 0;			int __ilogbw = 0;
	long double __logbw =			long double __logbw =
	__compiler_rt_logbl(crt_fmaxl(crt_fabsl(__c), crt_fabsl(__d)));			__compiler_rt_logbl(__compiler_rt_fmaxl(crt_fabsl(__c), crt_fabsl(__d)));
	if (crt_isfinite(__logbw)) {			if (crt_isfinite(__logbw)) {
	__ilogbw = (int)__logbw;			__ilogbw = (int)__logbw;
	__c = crt_scalbnl(__c, -__ilogbw);			__c = __compiler_rt_scalbnl(__c, -__ilogbw);
	__d = crt_scalbnl(__d, -__ilogbw);			__d = __compiler_rt_scalbnl(__d, -__ilogbw);
	}			}
	long double __denom = __c * __c + __d * __d;			long double __denom = __c * __c + __d * __d;
	Lcomplex z;			Lcomplex z;
	COMPLEX_REAL(z) = crt_scalbnl((__a * __c + __b * __d) / __denom, -__ilogbw);			COMPLEX_REAL(z) =
				__compiler_rt_scalbnl((__a * __c + __b * __d) / __denom, -__ilogbw);
	COMPLEX_IMAGINARY(z) =			COMPLEX_IMAGINARY(z) =
	crt_scalbnl((__b * __c - __a * __d) / __denom, -__ilogbw);			__compiler_rt_scalbnl((__b * __c - __a * __d) / __denom, -__ilogbw);
	if (crt_isnan(COMPLEX_REAL(z)) && crt_isnan(COMPLEX_IMAGINARY(z))) {			if (crt_isnan(COMPLEX_REAL(z)) && crt_isnan(COMPLEX_IMAGINARY(z))) {
	if ((__denom == 0.0) && (!crt_isnan(__a) || !crt_isnan(__b))) {			if ((__denom == 0.0) && (!crt_isnan(__a) || !crt_isnan(__b))) {
	COMPLEX_REAL(z) = crt_copysignl(CRT_INFINITY, __c) * __a;			COMPLEX_REAL(z) = crt_copysignl(CRT_INFINITY, __c) * __a;
	COMPLEX_IMAGINARY(z) = crt_copysignl(CRT_INFINITY, __c) * __b;			COMPLEX_IMAGINARY(z) = crt_copysignl(CRT_INFINITY, __c) * __b;
	} else if ((crt_isinf(__a) || crt_isinf(__b)) && crt_isfinite(__c) &&			} else if ((crt_isinf(__a) || crt_isinf(__b)) && crt_isfinite(__c) &&
	crt_isfinite(__d)) {			crt_isfinite(__d)) {
	__a = crt_copysignl(crt_isinf(__a) ? 1.0 : 0.0, __a);			__a = crt_copysignl(crt_isinf(__a) ? 1.0 : 0.0, __a);
	__b = crt_copysignl(crt_isinf(__b) ? 1.0 : 0.0, __b);			__b = crt_copysignl(crt_isinf(__b) ? 1.0 : 0.0, __b);
	Show All 12 Lines

compiler-rt/lib/builtins/fp_lib.h

Show First 20 Lines • Show All 293 Lines • ▼ Show 20 Lines	static __inline fp_t __compiler_rt_logbX(fp_t x) {
} else {		} else {
// Subnormal number; normalize and repeat		// Subnormal number; normalize and repeat
rep &= absMask;		rep &= absMask;
const int shift = 1 - normalize(&rep);		const int shift = 1 - normalize(&rep);
exp = (rep & exponentMask) >> significandBits;		exp = (rep & exponentMask) >> significandBits;
return exp - exponentBias - shift; // Unbias exponent		return exp - exponentBias - shift; // Unbias exponent
}		}
}		}

		// Avoid using scalbn from libm. Unlike libc/libm scalbn, this function never
		// sets errno on underflow/overflow.
		static __inline fp_t __compiler_rt_scalbnX(fp_t x, int y) {
		const rep_t rep = toRep(x);
		int exp = (rep & exponentMask) >> significandBits;

		if (x == 0.0 || exp == maxExponent)
		return x; // +/- 0.0, NaN, or inf: return x

		// Normalize subnormal input.
		rep_t sig = rep & significandMask;
		if (exp == 0) {
		exp += normalize(&sig);
		sig &= ~implicitBit; // clear the implicit bit again
		}

		if (__builtin_sadd_overflow(exp, y, &exp)) {
		// Saturate the exponent, which will guarantee an underflow/overflow below.
		exp = (y >= 0) ? INT_MAX : INT_MIN;
		}

		// Return this value: [+/-] 1.sig * 2 ** (exp - exponentBias).
		const rep_t sign = rep & signBit;
		if (exp >= maxExponent) {
		// Overflow, which could produce infinity or the largest-magnitude value,
		// depending on the rounding mode.
		return fromRep(sign | ((rep_t)(maxExponent - 1) << significandBits)) * 2.0f;
		} else if (exp <= 0) {
		// Subnormal or underflow. Use floating-point multiply to handle truncation
		// correctly.
		fp_t tmp = fromRep(sign | (REP_C(1) << significandBits) | sig);
		exp += exponentBias - 1;
		if (exp < 1)
		exp = 1;
		tmp *= fromRep((rep_t)exp << significandBits);
		return tmp;
		} else
		return fromRep(sign | ((rep_t)exp << significandBits) | sig);
		}

		// Avoid using fmax from libm.
		static __inline fp_t __compiler_rt_fmaxX(fp_t x, fp_t y) {
		// If either argument is NaN, return the other argument. If both are NaN,
		// arbitrarily return the second one. Otherwise, if both arguments are +/-0,
		// arbitrarily return the first one.
		return (crt_isnan(x) || x < y) ? y : x;
		}

#endif		#endif

#if defined(SINGLE_PRECISION)		#if defined(SINGLE_PRECISION)

static __inline fp_t __compiler_rt_logbf(fp_t x) {		static __inline fp_t __compiler_rt_logbf(fp_t x) {
return __compiler_rt_logbX(x);		return __compiler_rt_logbX(x);
}		}
		static __inline fp_t __compiler_rt_scalbnf(fp_t x, int y) {
		return __compiler_rt_scalbnX(x, y);
		}
		static __inline fp_t __compiler_rt_fmaxf(fp_t x, fp_t y) {
		#if defined(__aarch64__)
		// Use __builtin_fmaxf which turns into an fmaxnm instruction on AArch64.
		return __builtin_fmaxf(x, y);
		#else
		// __builtin_fmaxf frequently turns into a libm call, so inline the function.
		return __compiler_rt_fmaxX(x, y);
		#endif
		rupprecht: Normally, the way to go would be to override this in `compiler-rt/lib/builtins/aarch64`. I'm not sure that's an option here though.
		rprichard (author): I could also leave the AArch64 special case out. Maybe it's faster to use fmaxnm, but it also might not be important. Clang manages to compile the expression into 3 FP instructions. GCC manages it with 5 instructions. https://godbolt.org/z/Yc9zoK. This isn't so much an AArch64-specific version of fmax as a list of configurations where the compiler is expected to inline `__builtin_fmax` rather than call a library function.
		}

#elif defined(DOUBLE_PRECISION)		#elif defined(DOUBLE_PRECISION)

static __inline fp_t __compiler_rt_logb(fp_t x) {		static __inline fp_t __compiler_rt_logb(fp_t x) {
return __compiler_rt_logbX(x);		return __compiler_rt_logbX(x);
}		}
		static __inline fp_t __compiler_rt_scalbn(fp_t x, int y) {
		return __compiler_rt_scalbnX(x, y);
		}
		static __inline fp_t __compiler_rt_fmax(fp_t x, fp_t y) {
		#if defined(__aarch64__)
		// Use __builtin_fmax which turns into an fmaxnm instruction on AArch64.
		return __builtin_fmax(x, y);
		#else
		// __builtin_fmax frequently turns into a libm call, so inline the function.
		return __compiler_rt_fmaxX(x, y);
		#endif
		}

#elif defined(QUAD_PRECISION)		#elif defined(QUAD_PRECISION)

#if defined(CRT_LDBL_128BIT)		#if defined(CRT_LDBL_128BIT)
static __inline fp_t __compiler_rt_logbl(fp_t x) {		static __inline fp_t __compiler_rt_logbl(fp_t x) {
return __compiler_rt_logbX(x);		return __compiler_rt_logbX(x);
}		}
		static __inline fp_t __compiler_rt_scalbnl(fp_t x, int y) {
		return __compiler_rt_scalbnX(x, y);
		}
		static __inline fp_t __compiler_rt_fmaxl(fp_t x, fp_t y) {
		return __compiler_rt_fmaxX(x, y);
		}
#else		#else
// The generic implementation only works for ieee754 floating point. For other		// The generic implementation only works for ieee754 floating point. For other
// floating point types, continue to rely on the libm implementation for now.		// floating point types, continue to rely on the libm implementation for now.
static __inline long double __compiler_rt_logbl(long double x) {		static __inline long double __compiler_rt_logbl(long double x) {
return crt_logbl(x);		return crt_logbl(x);
}		}
#endif		static __inline long double __compiler_rt_scalbnl(long double x, int y) {
#endif		return crt_scalbnl(x, y);
		}
		static __inline long double __compiler_rt_fmaxl(long double x, long double y) {
		return crt_fmaxl(x, y);
		}
		#endif // CRT_LDBL_128BIT

		#endif // *_PRECISION

#endif // FP_LIB_HEADER		#endif // FP_LIB_HEADER
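
For readers who want to follow the `__compiler_rt_scalbnX` logic without the `fp_lib.h` macros, here is a minimal standalone sketch specialized to IEEE-754 binary32. It is not the code from this revision: the names `sketch_to_rep`, `sketch_from_rep`, and `sketch_scalbnf` are hypothetical, the bit-field constants are written out by hand, subnormal normalization uses a plain loop instead of `normalize()`, and the saturating add uses range checks instead of `__builtin_sadd_overflow`.

```c
#include <limits.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers: reinterpret a float as its 32-bit pattern and back. */
static uint32_t sketch_to_rep(float x) {
  uint32_t r;
  memcpy(&r, &x, sizeof r);
  return r;
}
static float sketch_from_rep(uint32_t r) {
  float x;
  memcpy(&x, &r, sizeof x);
  return x;
}

/* Sketch of x * 2**y for binary32, following the structure of the diff. */
float sketch_scalbnf(float x, int y) {
  const uint32_t rep = sketch_to_rep(x);
  int exp = (int)((rep >> 23) & 0xFF);
  if (x == 0.0f || exp == 0xFF)
    return x; /* +/- 0.0, NaN, or infinity */

  uint32_t sig = rep & 0x007FFFFFu;
  if (exp == 0) {
    /* Subnormal input: shift until the implicit-bit position is set. */
    exp = 1;
    while ((sig & 0x00800000u) == 0) {
      sig <<= 1;
      --exp;
    }
    sig &= 0x007FFFFFu; /* clear the implicit bit again */
  }

  /* Saturating exp += y; exp is within [-22, 254] at this point. */
  if (y > 0 && exp > INT_MAX - y)
    exp = INT_MAX;
  else if (y < 0 && exp < INT_MIN - y)
    exp = INT_MIN;
  else
    exp += y;

  const uint32_t sign = rep & 0x80000000u;
  if (exp >= 0xFF) {
    /* Overflow: the multiply yields infinity or the largest-magnitude
       value, depending on the rounding mode. */
    return sketch_from_rep(sign | (0xFEu << 23)) * 2.0f;
  } else if (exp <= 0) {
    /* Subnormal or underflow: a floating-point multiply does the rounding. */
    float tmp = sketch_from_rep(sign | (1u << 23) | sig);
    exp += 127 - 1;
    if (exp < 1)
      exp = 1;
    return tmp * sketch_from_rep((uint32_t)exp << 23);
  }
  return sketch_from_rep(sign | ((uint32_t)exp << 23) | sig);
}
```

Routing the overflow and subnormal cases through a real multiply, as the diff does, is what makes the result follow the current rounding mode instead of always rounding to nearest.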

compiler-rt/lib/builtins/int_lib.h

Show First 20 Lines • Show All 147 Lines • ▼ Show 20 Lines	int __inline __builtin_clzll(uint64_t value) {
uint32_t lsh = (uint32_t)(value & 0xFFFFFFFF);		uint32_t lsh = (uint32_t)(value & 0xFFFFFFFF);
if (msh != 0)		if (msh != 0)
return __builtin_clz(msh);		return __builtin_clz(msh);
return 32 + __builtin_clz(lsh);		return 32 + __builtin_clz(lsh);
}		}
#endif		#endif

#define __builtin_clzl __builtin_clzll		#define __builtin_clzl __builtin_clzll

		bool __inline __builtin_sadd_overflow(int x, int y, int *result) {
		if ((x < 0) != (y < 0)) {
		*result = x + y;
		return false;
		}
		int tmp = (unsigned int)x + (unsigned int)y;
		if ((tmp < 0) != (x < 0))
		return true;
		*result = tmp;
		return false;
		}

#endif // defined(_MSC_VER) && !defined(__clang__)		#endif // defined(_MSC_VER) && !defined(__clang__)

#endif // INT_LIB_H		#endif // INT_LIB_H
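
The MSVC fallback for `__builtin_sadd_overflow` above relies on unsigned wraparound followed by a conversion back to `int`. As a rough illustration of the same contract (return `true` on overflow, otherwise store the sum), here is a sketch that uses explicit range checks instead. This is a swapped-in technique, not the one in the diff, and `sketch_sadd_overflow` is a hypothetical name. Like the fallback above, it leaves `*result` untouched on overflow, which differs from the real builtin.

```c
#include <limits.h>
#include <stdbool.h>

/* Returns true if x + y overflows int; otherwise stores the sum in *result. */
bool sketch_sadd_overflow(int x, int y, int *result) {
  if (y > 0 && x > INT_MAX - y)
    return true; /* would exceed INT_MAX */
  if (y < 0 && x < INT_MIN - y)
    return true; /* would fall below INT_MIN */
  *result = x + y;
  return false;
}
```

In `__compiler_rt_scalbnX` the caller overwrites `exp` with `INT_MAX` or `INT_MIN` whenever overflow is reported, so it should not matter there whether the helper writes a wrapped result.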

compiler-rt/lib/builtins/int_math.h

	Show First 20 Lines • Show All 72 Lines • ▼ Show 20 Lines
	#define crt_fabsl(x) fabs((x))			#define crt_fabsl(x) fabs((x))
	#else			#else
	#define crt_fabs(x) __builtin_fabs((x))			#define crt_fabs(x) __builtin_fabs((x))
	#define crt_fabsf(x) __builtin_fabsf((x))			#define crt_fabsf(x) __builtin_fabsf((x))
	#define crt_fabsl(x) __builtin_fabsl((x))			#define crt_fabsl(x) __builtin_fabsl((x))
	#endif			#endif

	#if defined(_MSC_VER) && !defined(__clang__)			#if defined(_MSC_VER) && !defined(__clang__)
	#define crt_fmax(x, y) __max((x), (y))
	#define crt_fmaxf(x, y) __max((x), (y))
	#define crt_fmaxl(x, y) __max((x), (y))			#define crt_fmaxl(x, y) __max((x), (y))
	#else			#else
	#define crt_fmax(x, y) __builtin_fmax((x), (y))
	#define crt_fmaxf(x, y) __builtin_fmaxf((x), (y))
	#define crt_fmaxl(x, y) __builtin_fmaxl((x), (y))			#define crt_fmaxl(x, y) __builtin_fmaxl((x), (y))
	#endif			#endif

	#if defined(_MSC_VER) && !defined(__clang__)			#if defined(_MSC_VER) && !defined(__clang__)
	#define crt_logbl(x) logbl((x))			#define crt_logbl(x) logbl((x))
	#else			#else
	#define crt_logbl(x) __builtin_logbl((x))			#define crt_logbl(x) __builtin_logbl((x))
	#endif			#endif

	#if defined(_MSC_VER) && !defined(__clang__)			#if defined(_MSC_VER) && !defined(__clang__)
	#define crt_scalbn(x, y) scalbn((x), (y))
	#define crt_scalbnf(x, y) scalbnf((x), (y))
	#define crt_scalbnl(x, y) scalbnl((x), (y))			#define crt_scalbnl(x, y) scalbnl((x), (y))
	#else			#else
	#define crt_scalbn(x, y) __builtin_scalbn((x), (y))
	#define crt_scalbnf(x, y) __builtin_scalbnf((x), (y))
	#define crt_scalbnl(x, y) __builtin_scalbnl((x), (y))			#define crt_scalbnl(x, y) __builtin_scalbnl((x), (y))
	#endif			#endif

	#endif // INT_MATH_H			#endif // INT_MATH_H

compiler-rt/lib/builtins/ppc/divtc3.c

	Show All 21 Lines

	long double _Complex __divtc3(long double a, long double b, long double c,			long double _Complex __divtc3(long double a, long double b, long double c,
	long double d) {			long double d) {
	DD cDD = {.ld = c};			DD cDD = {.ld = c};
	DD dDD = {.ld = d};			DD dDD = {.ld = d};

	int ilogbw = 0;			int ilogbw = 0;
	const double logbw =			const double logbw =
	__compiler_rt_logb(crt_fmax(crt_fabs(cDD.s.hi), crt_fabs(dDD.s.hi)));			__compiler_rt_logb(__compiler_rt_fmax(crt_fabs(cDD.s.hi),
				crt_fabs(dDD.s.hi)));

	if (crt_isfinite(logbw)) {			if (crt_isfinite(logbw)) {
	ilogbw = (int)logbw;			ilogbw = (int)logbw;

	cDD.s.hi = crt_scalbn(cDD.s.hi, -ilogbw);			cDD.s.hi = __compiler_rt_scalbn(cDD.s.hi, -ilogbw);
	cDD.s.lo = crt_scalbn(cDD.s.lo, -ilogbw);			cDD.s.lo = __compiler_rt_scalbn(cDD.s.lo, -ilogbw);
	dDD.s.hi = crt_scalbn(dDD.s.hi, -ilogbw);			dDD.s.hi = __compiler_rt_scalbn(dDD.s.hi, -ilogbw);
	dDD.s.lo = crt_scalbn(dDD.s.lo, -ilogbw);			dDD.s.lo = __compiler_rt_scalbn(dDD.s.lo, -ilogbw);
	}			}

	const long double denom =			const long double denom =
	__gcc_qadd(__gcc_qmul(cDD.ld, cDD.ld), __gcc_qmul(dDD.ld, dDD.ld));			__gcc_qadd(__gcc_qmul(cDD.ld, cDD.ld), __gcc_qmul(dDD.ld, dDD.ld));
	const long double realNumerator =			const long double realNumerator =
	__gcc_qadd(__gcc_qmul(a, cDD.ld), __gcc_qmul(b, dDD.ld));			__gcc_qadd(__gcc_qmul(a, cDD.ld), __gcc_qmul(b, dDD.ld));
	const long double imagNumerator =			const long double imagNumerator =
	__gcc_qsub(__gcc_qmul(b, cDD.ld), __gcc_qmul(a, dDD.ld));			__gcc_qsub(__gcc_qmul(b, cDD.ld), __gcc_qmul(a, dDD.ld));

	DD real = {.ld = __gcc_qdiv(realNumerator, denom)};			DD real = {.ld = __gcc_qdiv(realNumerator, denom)};
	DD imag = {.ld = __gcc_qdiv(imagNumerator, denom)};			DD imag = {.ld = __gcc_qdiv(imagNumerator, denom)};

	real.s.hi = crt_scalbn(real.s.hi, -ilogbw);			real.s.hi = __compiler_rt_scalbn(real.s.hi, -ilogbw);
	real.s.lo = crt_scalbn(real.s.lo, -ilogbw);			real.s.lo = __compiler_rt_scalbn(real.s.lo, -ilogbw);
	imag.s.hi = crt_scalbn(imag.s.hi, -ilogbw);			imag.s.hi = __compiler_rt_scalbn(imag.s.hi, -ilogbw);
	imag.s.lo = crt_scalbn(imag.s.lo, -ilogbw);			imag.s.lo = __compiler_rt_scalbn(imag.s.lo, -ilogbw);

	if (crt_isnan(real.s.hi) && crt_isnan(imag.s.hi)) {			if (crt_isnan(real.s.hi) && crt_isnan(imag.s.hi)) {
	DD aDD = {.ld = a};			DD aDD = {.ld = a};
	DD bDD = {.ld = b};			DD bDD = {.ld = b};
	DD rDD = {.ld = denom};			DD rDD = {.ld = denom};

	if ((rDD.s.hi == 0.0) && (!crt_isnan(aDD.s.hi) || !crt_isnan(bDD.s.hi))) {			if ((rDD.s.hi == 0.0) && (!crt_isnan(aDD.s.hi) || !crt_isnan(bDD.s.hi))) {
	real.s.hi = crt_copysign(CRT_INFINITY, cDD.s.hi) * aDD.s.hi;			real.s.hi = crt_copysign(CRT_INFINITY, cDD.s.hi) * aDD.s.hi;
	Show All 34 Lines

compiler-rt/test/builtins/Unit/compiler_rt_fmax_test.c

This file was added.

				// RUN: %clang_builtins %s %librt -o %t && %run %t

				#define DOUBLE_PRECISION
				#include <fenv.h>
				#include <math.h>
				#include <stdio.h>
				#include "fp_lib.h"

				int test__compiler_rt_fmax(fp_t x, fp_t y) {
				fp_t crt_value = __compiler_rt_fmax(x, y);
				fp_t libm_value = fmax(x, y);
				// Consider +0 and -0 equal, and also disregard the sign/payload of two NaNs.
				if (crt_value != libm_value &&
				!(crt_isnan(crt_value) && crt_isnan(libm_value))) {
				printf("error: in __compiler_rt_fmax(%a [%llX], %a [%llX]) = %a [%llX] "
				"!= %a [%llX]\n",
				x, (unsigned long long)toRep(x),
				y, (unsigned long long)toRep(y),
				crt_value, (unsigned long long)toRep(crt_value),
				libm_value, (unsigned long long)toRep(libm_value));
				return 1;
				}
				return 0;
				}

				fp_t cases[] = {
				-NAN, NAN, -INFINITY, INFINITY, -0.0, 0.0, -1, 1, -2, 2,
				-0x1.0p-1023, 0x1.0p-1023, -0x1.0p-1024, 0x1.0p-1024, // subnormals
				-1.001, 1.001, -1.002, 1.002,
				};

				int main() {
				const unsigned N = sizeof(cases) / sizeof(cases[0]);
				unsigned i, j;
				for (i = 0; i < N; ++i) {
				for (j = 0; j < N; ++j) {
				if (test__compiler_rt_fmax(cases[i], cases[j])) return 1;
				}
				}
				return 0;
				}
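
The test above treats +0 and -0 as equal and ignores which NaN comes back. To make the underlying rule concrete, here is a small standalone sketch of the `(isnan(x) || x < y) ? y : x` expression used by `__compiler_rt_fmaxX`. The name `sketch_fmax` is hypothetical, and the sketch uses `isnan` from `<math.h>` in place of `crt_isnan`.

```c
#include <math.h>

/* If x is NaN, return y. If only y is NaN, x < y is false, so return x.
   If both are +/-0, x < y is false, so the first argument is returned. */
double sketch_fmax(double x, double y) {
  return (isnan(x) || x < y) ? y : x;
}
```

With this rule, `sketch_fmax(-0.0, 0.0)` returns `-0.0`. That is the case the review comment on the C99 footnote refers to, and it is why the test compares values with `!=` rather than comparing bit patterns.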

compiler-rt/test/builtins/Unit/compiler_rt_fmaxf_test.c

This file was added.

				// RUN: %clang_builtins %s %librt -o %t && %run %t

				#define SINGLE_PRECISION
				#include <fenv.h>
				#include <math.h>
				#include <stdio.h>
				#include "fp_lib.h"

				int test__compiler_rt_fmaxf(fp_t x, fp_t y) {
				fp_t crt_value = __compiler_rt_fmaxf(x, y);
				fp_t libm_value = fmaxf(x, y);
				// Consider +0 and -0 equal, and also disregard the sign/payload of two NaNs.
				if (crt_value != libm_value &&
				!(crt_isnan(crt_value) && crt_isnan(libm_value))) {
				printf("error: in __compiler_rt_fmaxf(%a [%X], %a [%X]) = %a [%X] "
				"!= %a [%X]\n",
				x, toRep(x), y, toRep(y), crt_value, toRep(crt_value), libm_value,
				toRep(libm_value));
				return 1;
				}
				return 0;
				}

				fp_t cases[] = {
				-NAN, NAN, -INFINITY, INFINITY, -0.0, 0.0, -1, 1, -2, 2,
				-0x1.0p-127, 0x1.0p-127, -0x1.0p-128, 0x1.0p-128, // subnormals
				-1.001, 1.001, -1.002, 1.002,
				};

				int main() {
				const unsigned N = sizeof(cases) / sizeof(cases[0]);
				unsigned i, j;
				for (i = 0; i < N; ++i) {
				for (j = 0; j < N; ++j) {
				if (test__compiler_rt_fmaxf(cases[i], cases[j])) return 1;
				}
				}
				return 0;
				}

compiler-rt/test/builtins/Unit/compiler_rt_fmaxl_test.c

This file was added.

				// RUN: %clang_builtins %s %librt -o %t && %run %t

				#define QUAD_PRECISION
				#include <fenv.h>
				#include <math.h>
				#include <stdio.h>
				#include "fp_lib.h"

				#if defined(CRT_HAS_128BIT) && defined(CRT_LDBL_128BIT)

				int test__compiler_rt_fmaxl(fp_t x, fp_t y) {
				fp_t crt_value = __compiler_rt_fmaxl(x, y);
				fp_t libm_value = fmaxl(x, y);
				// Consider +0 and -0 equal, and also disregard the sign/payload of two NaNs.
				if (crt_value != libm_value &&
				!(crt_isnan(crt_value) && crt_isnan(libm_value))) {
				// Split expected values into two for printf
				twords x_t, y_t, crt_value_t, libm_value_t;
				x_t.all = toRep(x);
				y_t.all = toRep(y);
				crt_value_t.all = toRep(crt_value);
				libm_value_t.all = toRep(libm_value);
				printf(
				"error: in __compiler_rt_fmaxl([%llX %llX], [%llX %llX]) = "
				"[%llX %llX] != [%llX %llX]\n",
				(unsigned long long)x_t.s.high, (unsigned long long)x_t.s.low,
				(unsigned long long)y_t.s.high, (unsigned long long)y_t.s.low,
				(unsigned long long)crt_value_t.s.high,
				(unsigned long long)crt_value_t.s.low,
				(unsigned long long)libm_value_t.s.high,
				(unsigned long long)libm_value_t.s.low);
				return 1;
				}
				return 0;
				}

				fp_t cases[] = {
				-NAN, NAN, -INFINITY, INFINITY, -0.0, 0.0, -1, 1, -2, 2,
				-0x1.0p-16383L, 0x1.0p-16383L, -0x1.0p-16384L, 0x1.0p-16384L, // subnormals
				-1.001, 1.001, -1.002, 1.002,
				};

				#endif

				int main() {
				#if defined(CRT_HAS_128BIT) && defined(CRT_LDBL_128BIT)
				const unsigned N = sizeof(cases) / sizeof(cases[0]);
				unsigned i, j;
				for (i = 0; i < N; ++i) {
				for (j = 0; j < N; ++j) {
				if (test__compiler_rt_fmaxl(cases[i], cases[j])) return 1;
				}
				}
				#else
				printf("skipped\n");
				#endif
				return 0;
				}

compiler-rt/test/builtins/Unit/compiler_rt_scalbn_test.c

This file was added.

				// RUN: %clang_builtins %s %librt -o %t && %run %t

				#define DOUBLE_PRECISION
				#include <fenv.h>
				#include <float.h>
				#include <limits.h>
				#include <math.h>
				#include <stdio.h>
				#include "fp_lib.h"

				int test__compiler_rt_scalbn(const char *mode, fp_t x, int y) {
				fp_t crt_value = __compiler_rt_scalbn(x, y);
				fp_t libm_value = scalbn(x, y);
				// Consider +/-0 unequal, but disregard the sign/payload of NaN.
				if (toRep(crt_value) != toRep(libm_value) &&
				!(crt_isnan(crt_value) && crt_isnan(libm_value))) {
				printf("error: [%s] in __compiler_rt_scalbn(%a [%llX], %d) = %a [%llX] "
				"!= %a [%llX]\n",
				mode, x, (unsigned long long)toRep(x), y,
				crt_value, (unsigned long long)toRep(crt_value),
				libm_value, (unsigned long long)toRep(libm_value));
				return 1;
				}
				return 0;
				}

				fp_t cases[] = {
				-NAN, NAN, -INFINITY, INFINITY, -0.0, 0.0, -1, 1, -2, 2,
				DBL_TRUE_MIN, DBL_TRUE_MIN*7, DBL_MIN, DBL_MAX,
				-1.001, 1.001, -1.002, 1.002, 1.e-6, -1.e-6,
				0x1.0p-1021,
				0x1.0p-1022,
				0x1.0p-1023, // subnormal
				0x1.0p-1024, // subnormal
				};
				rprichard (author): FWIW, msvc added support for C99/C++11 hex float literals somewhere between 19.10 and 19.14: https://godbolt.org/z/fchz37. I'm not sure it's OK to take this dependency, but it's also used only in a test. The C11/C++11 {FLT,DBL,LDBL}_TRUE_MIN macros are defined in msvc 19.00.23506 and up, though (verified with rextester.com).

				int iterate_cases(const char *mode) {
				const unsigned N = sizeof(cases) / sizeof(cases[0]);
				unsigned i;
				for (i = 0; i < N; ++i) {
				int j;
				for (j = -5; j <= 5; ++j) {
				if (test__compiler_rt_scalbn(mode, cases[i], j)) return 1;
				}
				if (test__compiler_rt_scalbn(mode, cases[i], -10000)) return 1;
				if (test__compiler_rt_scalbn(mode, cases[i], 10000)) return 1;
				if (test__compiler_rt_scalbn(mode, cases[i], INT_MIN)) return 1;
				if (test__compiler_rt_scalbn(mode, cases[i], INT_MAX)) return 1;
				}
				return 0;
				}

				int main() {
				if (iterate_cases("default")) return 1;

				// Rounding mode tests on supported architectures. __compiler_rt_scalbn
				// should have the same rounding behavior as double-precision multiplication.
				#if (defined(__arm__) || defined(__aarch64__)) && defined(__ARM_FP) || \
				defined(__i386__) || defined(__x86_64__)
				// Skip these tests for MSVC because its scalbn function always behaves as if
				// the default rounding mode is set (FE_TONEAREST).
				#ifndef _MSC_VER
				fesetround(FE_UPWARD);
				if (iterate_cases("FE_UPWARD")) return 1;

				fesetround(FE_DOWNWARD);
				if (iterate_cases("FE_DOWNWARD")) return 1;

				fesetround(FE_TOWARDZERO);
				if (iterate_cases("FE_TOWARDZERO")) return 1;
				#endif

				fesetround(FE_TONEAREST);
				if (iterate_cases("FE_TONEAREST")) return 1;
				#endif

				return 0;
				}
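
The `cases[]` array above labels `0x1.0p-1023` and `0x1.0p-1024` as subnormal while `0x1.0p-1022` is not. A quick way to confirm that labeling is to classify the constants with `fpclassify`. The sketch below assumes a C99 compiler with hex float literals and IEEE-754 binary64 `double`; `sketch_is_subnormal` is a hypothetical helper and not part of the test file.

```c
#include <float.h>
#include <math.h>

/* Returns nonzero if v is a subnormal double. */
int sketch_is_subnormal(double v) {
  return fpclassify(v) == FP_SUBNORMAL;
}
```

Under those assumptions `DBL_MIN` equals `0x1.0p-1022`, so any smaller nonzero magnitude is subnormal.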

compiler-rt/test/builtins/Unit/compiler_rt_scalbnf_test.c

This file was added.

				// RUN: %clang_builtins %s %librt -o %t && %run %t

				#define SINGLE_PRECISION
				#include <fenv.h>
				#include <float.h>
				#include <limits.h>
				#include <math.h>
				#include <stdio.h>
				#include "fp_lib.h"

				int test__compiler_rt_scalbnf(const char *mode, fp_t x, int y) {
				fp_t crt_value = __compiler_rt_scalbnf(x, y);
				fp_t libm_value = scalbnf(x, y);
				// Consider +/-0 unequal, but disregard the sign/payload of NaN.
				if (toRep(crt_value) != toRep(libm_value) &&
				!(crt_isnan(crt_value) && crt_isnan(libm_value))) {
				printf("error: [%s] in __compiler_rt_scalbnf(%a [%X], %d) = %a [%X] "
				"!= %a [%X]\n",
				mode, x, toRep(x), y, crt_value, toRep(crt_value),
				libm_value, toRep(libm_value));
				return 1;
				}
				return 0;
				}

				fp_t cases[] = {
				-NAN, NAN, -INFINITY, INFINITY, -0.0, 0.0, -1, 1, -2, 2,
				FLT_TRUE_MIN, FLT_TRUE_MIN*7, FLT_MIN, FLT_MAX,
				-1.001, 1.001, -1.002, 1.002, 1.e-6, -1.e-6,
				0x1.0p-125,
				0x1.0p-126,
				0x1.0p-127, // subnormal
				0x1.0p-128, // subnormal
				};

				int iterate_cases(const char *mode) {
				const unsigned N = sizeof(cases) / sizeof(cases[0]);
				unsigned i;
				for (i = 0; i < N; ++i) {
				int j;
				for (j = -5; j <= 5; ++j) {
				if (test__compiler_rt_scalbnf(mode, cases[i], j)) return 1;
				}
				if (test__compiler_rt_scalbnf(mode, cases[i], -1000)) return 1;
				if (test__compiler_rt_scalbnf(mode, cases[i], 1000)) return 1;
				if (test__compiler_rt_scalbnf(mode, cases[i], INT_MIN)) return 1;
				if (test__compiler_rt_scalbnf(mode, cases[i], INT_MAX)) return 1;
				}
				return 0;
				}

				int main() {
				if (iterate_cases("default")) return 1;

				// Rounding mode tests on supported architectures. __compiler_rt_scalbnf
				// should have the same rounding behavior as single-precision multiplication.
				#if (defined(__arm__) || defined(__aarch64__)) && defined(__ARM_FP) || \
				defined(__i386__) || defined(__x86_64__)
				// Skip these tests for MSVC because its scalbnf function always behaves as if
				// the default rounding mode is set (FE_TONEAREST).
				#ifndef _MSC_VER
				fesetround(FE_UPWARD);
				if (iterate_cases("FE_UPWARD")) return 1;

				fesetround(FE_DOWNWARD);
				if (iterate_cases("FE_DOWNWARD")) return 1;

				fesetround(FE_TOWARDZERO);
				if (iterate_cases("FE_TOWARDZERO")) return 1;
				#endif

				fesetround(FE_TONEAREST);
				if (iterate_cases("FE_TONEAREST")) return 1;
				#endif

				return 0;
				}

compiler-rt/test/builtins/Unit/compiler_rt_scalbnl_test.c

This file was added.

				// RUN: %clang_builtins %s %librt -o %t && %run %t

				#define QUAD_PRECISION
				#include <fenv.h>
				#include <float.h>
				#include <limits.h>
				#include <math.h>
				#include <stdio.h>
				#include "fp_lib.h"

				#if defined(CRT_HAS_128BIT) && defined(CRT_LDBL_128BIT)

				int test__compiler_rt_scalbnl(const char *mode, fp_t x, int y) {
				fp_t crt_value = __compiler_rt_scalbnl(x, y);
				fp_t libm_value = scalbnl(x, y);
				// Consider +/-0 unequal, but disregard the sign/payload of NaN.
				if (toRep(crt_value) != toRep(libm_value) &&
				!(crt_isnan(crt_value) && crt_isnan(libm_value))) {
				// Split expected values into two for printf
				twords x_t, crt_value_t, libm_value_t;
				x_t.all = toRep(x);
				crt_value_t.all = toRep(crt_value);
				libm_value_t.all = toRep(libm_value);
				printf(
				"error: [%s] in __compiler_rt_scalbnl([%llX %llX], %d) = "
				"[%llX %llX] != [%llX %llX]\n",
				mode, (unsigned long long)x_t.s.high, (unsigned long long)x_t.s.low, y,
				(unsigned long long)crt_value_t.s.high,
				(unsigned long long)crt_value_t.s.low,
				(unsigned long long)libm_value_t.s.high,
				(unsigned long long)libm_value_t.s.low);
				return 1;
				}
				return 0;
				}

				fp_t cases[] = {
				-NAN, NAN, -INFINITY, INFINITY, -0.0, 0.0, -1, 1, -2, 2,
				LDBL_TRUE_MIN, LDBL_MIN, LDBL_MAX,
				-1.001, 1.001, -1.002, 1.002, 1.e-6, -1.e-6,
				0x1.0p-16381L,
				0x1.0p-16382L,
				0x1.0p-16383L, // subnormal
				0x1.0p-16384L, // subnormal
				};

				int iterate_cases(const char *mode) {
				const unsigned N = sizeof(cases) / sizeof(cases[0]);
				unsigned i;
				for (i = 0; i < N; ++i) {
				int j;
				for (j = -5; j <= 5; ++j) {
				if (test__compiler_rt_scalbnl(mode, cases[i], j)) return 1;
				}
				if (test__compiler_rt_scalbnl(mode, cases[i], -100000)) return 1;
				if (test__compiler_rt_scalbnl(mode, cases[i], 100000)) return 1;
				if (test__compiler_rt_scalbnl(mode, cases[i], INT_MIN)) return 1;
				if (test__compiler_rt_scalbnl(mode, cases[i], INT_MAX)) return 1;
				}
				return 0;
				}

				#endif

				int main() {
				#if defined(CRT_HAS_128BIT) && defined(CRT_LDBL_128BIT)
				if (iterate_cases("default")) return 1;

				// Skip rounding mode tests (fesetround) because compiler-rt's quad-precision
				// multiply also ignores the current rounding mode.

				#else
				printf("skipped\n");
				#endif

				return 0;
				}