This is an archive of the discontinued LLVM Phabricator instance.

Differential D39204

[CodeGen] map sqrt libcalls to llvm.sqrt when errno is not set
Closed (Public)

Authored by spatel on Oct 23 2017, 1:49 PM.

Details

Reviewers

hfinkel
efriedma
fhahn

Commits

rG7cb25a888ce5: [CodeGen] map sqrt libcalls to llvm.sqrt when errno is not set
rC317031: [CodeGen] map sqrt libcalls to llvm.sqrt when errno is not set
rL317031: [CodeGen] map sqrt libcalls to llvm.sqrt when errno is not set

Summary

The LLVM sqrt intrinsic definition changed with D28797, so we don't have to use any relaxed FP settings other than errno handling.

This patch sidesteps a question raised in PR27435:
https://bugs.llvm.org/show_bug.cgi?id=27435

Is a programmer using __builtin_sqrt() invoking the compiler's intrinsic definition of sqrt or the mathlib definition of sqrt?

But we have an answer now: the builtin should match the behavior of the libm function including errno handling.
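
As an editorial aside, the rule change described at the top of this summary can be modeled in a few lines of C. This is only a sketch: the function names and boolean parameters are hypothetical stand-ins for the 'const' attribute check and the UnsafeFPMath / NoNaNsFPMath options that appear in the CGBuiltin.cpp diff, not clang's actual API.

```c
#include <stdbool.h>

/* Hypothetical model of the rule before this patch: a sqrt libcall became
   llvm.sqrt only if the callee was 'const' (cannot set errno) AND a relaxed
   FP mode (unsafe math or no-NaNs) was enabled. */
static bool old_rule_uses_intrinsic(bool callee_is_const,
                                    bool unsafe_fp_math,
                                    bool no_nans_fp_math) {
    return callee_is_const && (unsafe_fp_math || no_nans_fp_math);
}

/* Hypothetical model of the rule after this patch: 'const' alone suffices,
   because the intrinsic no longer differs from libm on negative inputs. */
static bool new_rule_uses_intrinsic(bool callee_is_const) {
    return callee_is_const;
}
```

The only case where the two rules disagree is a 'const' call with no relaxed FP flags, which is the CHECK-NO configuration changed in libcalls.c.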

Diff Detail

Repository: rL LLVM

Event Timeline

spatel created this revision. Oct 23 2017, 1:49 PM

Herald added a subscriber: mcrosier. Oct 23 2017, 1:49 PM

spatel mentioned this in D39160: [CodeGen] __builtin_sqrt should map to the compiler's intrinsic sqrt function. Oct 23 2017, 1:50 PM

The gcc documentation says "GCC includes built-in versions of many of the functions in the standard C library. These functions come in two forms: one whose names start with the __builtin_ prefix, and the other without. Both forms have the same type (including prototype), the same address (when their address is taken), and the same meaning as the C library functions". And gcc specifically preserves the stated semantics. Given that, I'm not sure it makes sense for us to try to redefine __builtin_sqrt() just because it's convenient.

Note that this reasoning only applies if the user hasn't specified any fast-math flags; under -ffinite-math-only, we can assume the result isn't a NaN, and therefore we can use llvm.sqrt.*. (The definition of llvm.sqrt.* changed in https://reviews.llvm.org/D28797; I don't think we ever updated clang to take advantage of this).

If we really need a name for the never-sets-errno sqrt, we should probably use a different name, e.g. __builtin_ieee_sqrt().

In D39204#904312, @efriedma wrote:

The gcc documentation says "GCC includes built-in versions of many of the functions in the standard C library. These functions come in two forms: one whose names start with the __builtin_ prefix, and the other without. Both forms have the same type (including prototype), the same address (when their address is taken), and the same meaning as the C library functions". And gcc specifically preserves the stated semantics. Given that, I'm not sure it makes sense for us to try to redefine __builtin_sqrt() just because it's convenient.

Note that this reasoning only applies if the user hasn't specified any fast-math flags; under -ffinite-math-only, we can assume the result isn't a NaN, and therefore we can use llvm.sqrt.*. (The definition of llvm.sqrt.* changed in https://reviews.llvm.org/D28797; I don't think we ever updated clang to take advantage of this).

If we really need a name for the never-sets-errno sqrt, we should probably use a different name, e.g. __builtin_ieee_sqrt().

Thanks for the explanation and link. Let me know if I've gone wrong:

We don't want to convert clang math builtins to llvm intrinsics because builtins are supposed to be exactly equivalent to C library functions (including setting errno).
LLVM intrinsics should be equivalent to C library functions except that they don't set errno (but this is currently wrong in some cases, and D28335 would fix that).
Therefore, the existing code in this file that is converting 'pow' and other builtin calls to intrinsics is correct for now, but only because 2 wrongs made it right? :)

Working my way through the stack: does the sqrt LangRef change mean we can revert rL265521? What allows us to transform any of those libm calls that set errno to vectors in the first place?

I'm even more confused than usual, but if I can find a way to untangle this mess, I'll try to start making patches.

I think you're understanding the semantics correctly.

For r265521, look again at the implementation of llvm::checkUnaryFloatSignature; if "I.onlyReadsMemory()" is true, we somehow proved the call doesn't set errno (most likely because we know something about the target's libm).

In D39204#905860, @efriedma wrote:

I think you're understanding the semantics correctly.

For r265521, look again at the implementation of llvm::checkUnaryFloatSignature; if "I.onlyReadsMemory()" is true, we somehow proved the call doesn't set errno (most likely because we know something about the target's libm).

Right. Either by target default, or because the user passed -fno-math-errno (or something that implies it, such as -ffast-math), we mark the functions as readonly/readnone because they won't write to errno.
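
A small probe can show which of these modes a given translation unit is built in. The sketch below uses only the standard C99 macros math_errhandling and MATH_ERRNO from <math.h>; the helper name is hypothetical, and whether the macro tracks -fno-math-errno depends on the C library (an assumption, not something stated in this review).

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical helper: reports whether the C library advertises errno as a
   reporting channel for math errors in this translation unit. */
static bool math_errors_set_errno(void) {
#if defined(math_errhandling) && defined(MATH_ERRNO)
    return (math_errhandling & MATH_ERRNO) != 0;
#else
    /* Assumption: without the C99 macros, treat errno as possibly written. */
    return true;
#endif
}
```

When this returns false, a call such as sqrtf(x) has no errno side effect to preserve, which is the situation in which marking it readnone is justified.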

In D39204#905361, @spatel wrote:

Working my way through the stack: does the sqrt LangRef change mean we can revert rL265521?

Yes, maybe. We can now form the intrinsics from the library calls so long as we only care about the result (and not the value of errno), even if the input is negative or would otherwise generate a NaN (as the intrinsic no longer has UB in those situations). However, we still need to know that the potential value of errno is of no interest. I'm not sure how we know that without some particular modeling.

What allows us to transform any of those libm calls that set errno to vectors in the first place?

I'm even more confused than usual, but if I can find a way to untangle this mess, I'll try to start making patches.

Patch updated:
As suggested, I've morphed this patch to just handle sqrt libcalls based on the updated LLVM intrinsic definition.

I was going to include the builtins too, but that exposes another bug (marked here with FIXME) - the builtins are all defined 'const'. Therefore, they can never set errno (unless I'm still misunderstanding). So I think we are currently wrongly turning those into libcalls marked 'readnone'.

We could wrongly turn those into intrinsics in this patch if that seems better? :)

This does make me curious about the use-case of libm-equivalent builtins. If they are exactly identical to libm (including errno behavior), then why are they needed in the first place?

RKSimon added a subscriber: RKSimon. Oct 30 2017, 3:39 PM

I was going to include the builtins too, but that exposes another bug (marked here with FIXME) - the builtins are all defined 'const'.

Probably just need to change c->e in Builtins.def?

This does make me curious about the use-case of libm-equivalent builtins. If they are exactly identical to libm (including errno behavior), then why are they needed in the first place?

It gets treated differently under -fno-builtin/-std=c89.
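
For readers unfamiliar with the attribute letters, the c->e suggestion can be read as follows, assuming 'c' marks a builtin as always const and 'e' marks it const only when math errno is disabled. The helper below is a hypothetical model of that reading, not code from Builtins.def.

```c
#include <stdbool.h>

/* Hypothetical model of two attribute letters:
     'c' : the builtin is always treated as const (never writes errno)
     'e' : the builtin is const only when math errno is disabled
   Any other letter is treated as "not const" in this sketch. */
static bool builtin_is_const(char attr, bool math_errno_enabled) {
    switch (attr) {
    case 'c':
        return true;
    case 'e':
        return !math_errno_enabled;
    default:
        return false;
    }
}
```

Under this model, a builtin tagged 'c' is const even with -fmath-errno, which is the bug noted in the FIXME; tagging it 'e' makes it const only when errno is not in play.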

In D39204#911316, @efriedma wrote:

I was going to include the builtins too, but that exposes another bug (marked here with FIXME) - the builtins are all defined 'const'.

Probably just need to change c->e in Builtins.def?

Yes - at least that made sqrt behave like I expected. So I think it's really just a question of what order and combination we want to fix this in:

Just the sqrt libcalls in this patch.
Fix both sqrt libcalls and __builtin_sqrt in this patch.
Fix all libm libcalls and builtins simultaneously.

Let's do this one step at a time; first this patch, then fix the __builtin_* functions to use "e", then add all the missing cases to CodeGenFunction::EmitBuiltinExpr.

This patch LGTM, assuming you fix the commit message (and the title on Phabricator) to properly describe the change.

This revision is now accepted and ready to land. Oct 31 2017, 11:45 AM

In D39204#912084, @efriedma wrote:

Let's do this one step at a time; first this patch, then fix the __builtin_* functions to use "e", then add all the missing cases to CodeGenFunction::EmitBuiltinExpr.

Sounds good.

This patch LGTM, assuming you fix the commit message (and the title on Phabricator) to properly describe the change.

Sure - I didn't know if we prefer to leave the title for the sake of email thread continuity on the list or not, but I'll update that now.

spatel retitled this revision from [CodeGen] __builtin_sqrt should map to the compiler's intrinsic sqrt function to [CodeGen] map sqrt libcalls to llvm.sqrt when errno is not set. Oct 31 2017, 12:45 PM

spatel edited the summary of this revision.

Closed by commit rL317031: [CodeGen] map sqrt libcalls to llvm.sqrt when errno is not set (authored by spatel). Oct 31 2017, 1:20 PM

This revision was automatically updated to reflect the committed changes.

spatel mentioned this in D39481: [CodeGen] fix const-ness of builtin equivalents of <math.h> and <complex.h> functions that might set errno. Oct 31 2017, 3:21 PM

spatel mentioned this in rL317265: [CodeGen] fix const-ness of builtin equivalents of <math.h> and <complex.h>…. Nov 2 2017, 1:40 PM

spatel mentioned this in D39642: [ValueTracking] readnone is a requirement for converting sqrt to llvm.sqrt; nnan is not. Nov 5 2017, 8:53 AM

spatel mentioned this in rL317519: [ValueTracking] readonly (const) is a requirement for converting sqrt to llvm.. Nov 6 2017, 2:40 PM

spatel mentioned this in D43765: [InstSimplify] loosen FMF for sqrt(X) * sqrt(X) --> X. Mar 5 2018, 10:48 AM

spatel mentioned this in D57359: [GlobalISel] Introduce a G_FSQRT generic instruction. Jan 29 2019, 5:29 PM

Revision Contents

Path                                              Size
cfe/trunk/lib/CodeGen/CGBuiltin.cpp               29 lines
cfe/trunk/test/CodeGen/2005-07-20-SqrtNoErrno.c   10 lines
cfe/trunk/test/CodeGen/builtin-sqrt.c             19 lines
cfe/trunk/test/CodeGen/libcalls.c                 25 lines

Diff 121042

cfe/trunk/lib/CodeGen/CGBuiltin.cpp

[2,066 lines not shown; enclosing case: Builtin::BI__c11_atomic_signal_fence]
     Builder.CreateFence(llvm::AtomicOrdering::SequentiallyConsistent, SSID);
     Builder.CreateBr(ContBB);
     SI->addCase(Builder.getInt32(5), SeqCstBB);

     Builder.SetInsertPoint(ContBB);
     return RValue::get(nullptr);
   }

   // Library functions with special handling.
   case Builtin::BIsqrt:
   case Builtin::BIsqrtf:
-  case Builtin::BIsqrtl: {
-    // Transform a call to sqrt* into a @llvm.sqrt.* intrinsic call, but only
-    // in finite- or unsafe-math mode (the intrinsic has different semantics
-    // for handling negative numbers compared to the library function, so
-    // -fmath-errno=0 is not enough).
-    if (!FD->hasAttr<ConstAttr>())
-      break;
-    if (!(CGM.getCodeGenOpts().UnsafeFPMath ||
-          CGM.getCodeGenOpts().NoNaNsFPMath))
-      break;
-    Value *Arg0 = EmitScalarExpr(E->getArg(0));
-    llvm::Type *ArgType = Arg0->getType();
-    Value *F = CGM.getIntrinsic(Intrinsic::sqrt, ArgType);
-    return RValue::get(Builder.CreateCall(F, Arg0));
-  }
+  case Builtin::BIsqrtl:
+    // Builtins have the same semantics as library functions. The LLVM intrinsic
+    // has the same semantics as the library function except it does not set
+    // errno. Thus, we can transform either sqrt or __builtin_sqrt to @llvm.sqrt
+    // if the call is 'const' (the call must not set errno).
+    //
+    // FIXME: The builtin cases are not here because they are marked 'const' in
+    // Builtins.def. So that means they are wrongly defined to have different
+    // semantics than the library functions. If we included them here, we would
+    // turn them into LLVM intrinsics regardless of whether -fmath-errno was on.
+    if (FD->hasAttr<ConstAttr>())
+      return RValue::get(emitUnaryBuiltin(*this, E, Intrinsic::sqrt));
+    break;

   case Builtin::BI__builtin_pow:
   case Builtin::BI__builtin_powf:
   case Builtin::BI__builtin_powl:
   case Builtin::BIpow:
   case Builtin::BIpowf:
   case Builtin::BIpowl: {
     // Transform a call to pow* into a @llvm.pow.* intrinsic call.
[7,884 lines not shown]

cfe/trunk/test/CodeGen/2005-07-20-SqrtNoErrno.c

-// RUN: %clang_cc1 -triple x86_64-apple-darwin %s -emit-llvm -o - | FileCheck %s
-// llvm.sqrt has undefined behavior on negative inputs, so it is
-// inappropriate to translate C/C++ sqrt to this.
-float sqrtf(float x);
-float foo(float X) {
-  // CHECK: foo
-  // CHECK: call float @sqrtf(float %
-  // Check that this is marked readonly when errno is ignored.
-  return sqrtf(X);
-}

cfe/trunk/test/CodeGen/builtin-sqrt.c

+// RUN: %clang_cc1 -fmath-errno -triple x86_64-apple-darwin %s -emit-llvm -o - | FileCheck %s --check-prefix=HAS_ERRNO
+// RUN: %clang_cc1 -triple x86_64-apple-darwin %s -emit-llvm -o - | FileCheck %s --check-prefix=NO_ERRNO
+
+// FIXME: If a builtin is supposed to have identical semantics to its libm twin, then it
+// should not be marked "constant" in Builtins.def because that means it can't set errno.
+// Note that both runs have 'readnone' on the libcall here.
+
+float foo(float X) {
+  // HAS_ERRNO: call float @sqrtf(float
+  // NO_ERRNO: call float @sqrtf(float
+  return __builtin_sqrtf(X);
+}
+
+// HAS_ERRNO: declare float @sqrtf(float) [[ATTR:#[0-9]+]]
+// HAS_ERRNO: attributes [[ATTR]] = { nounwind readnone {{.*}}}
+
+// NO_ERRNO: declare float @sqrtf(float) [[ATTR:#[0-9]+]]
+// NO_ERRNO: attributes [[ATTR]] = { nounwind readnone {{.*}}}

cfe/trunk/test/CodeGen/libcalls.c

 // RUN: %clang_cc1 -fmath-errno -emit-llvm -o - %s -triple i386-unknown-unknown | FileCheck -check-prefix CHECK-YES %s
 // RUN: %clang_cc1 -emit-llvm -o - %s -triple i386-unknown-unknown | FileCheck -check-prefix CHECK-NO %s
 // RUN: %clang_cc1 -menable-unsafe-fp-math -emit-llvm -o - %s -triple i386-unknown-unknown | FileCheck -check-prefix CHECK-FAST %s

 // CHECK-YES-LABEL: define void @test_sqrt
 // CHECK-NO-LABEL: define void @test_sqrt
 // CHECK-FAST-LABEL: define void @test_sqrt
 void test_sqrt(float a0, double a1, long double a2) {
-  // Following llvm-gcc's lead, we never emit these as intrinsics;
-  // no-math-errno isn't good enough. We could probably use intrinsics
-  // with appropriate guards if it proves worthwhile.
-
   // CHECK-YES: call float @sqrtf
-  // CHECK-NO: call float @sqrtf
+  // CHECK-NO: call float @llvm.sqrt.f32(float
+  // CHECK-FAST: call float @llvm.sqrt.f32(float
   float l0 = sqrtf(a0);

   // CHECK-YES: call double @sqrt
-  // CHECK-NO: call double @sqrt
+  // CHECK-NO: call double @llvm.sqrt.f64(double
+  // CHECK-FAST: call double @llvm.sqrt.f64(double
   double l1 = sqrt(a1);

   // CHECK-YES: call x86_fp80 @sqrtl
-  // CHECK-NO: call x86_fp80 @sqrtl
+  // CHECK-NO: call x86_fp80 @llvm.sqrt.f80(x86_fp80
+  // CHECK-FAST: call x86_fp80 @llvm.sqrt.f80(x86_fp80
   long double l2 = sqrtl(a2);
 }

 // CHECK-YES: declare float @sqrtf(float)
 // CHECK-YES: declare double @sqrt(double)
 // CHECK-YES: declare x86_fp80 @sqrtl(x86_fp80)
-// CHECK-NO: declare float @sqrtf(float) [[NUW_RN:#[0-9]+]]
-// CHECK-NO: declare double @sqrt(double) [[NUW_RN]]
-// CHECK-NO: declare x86_fp80 @sqrtl(x86_fp80) [[NUW_RN]]
+// CHECK-NO: declare float @llvm.sqrt.f32(float)
+// CHECK-NO: declare double @llvm.sqrt.f64(double)
+// CHECK-NO: declare x86_fp80 @llvm.sqrt.f80(x86_fp80)
 // CHECK-FAST: declare float @llvm.sqrt.f32(float)
 // CHECK-FAST: declare double @llvm.sqrt.f64(double)
 // CHECK-FAST: declare x86_fp80 @llvm.sqrt.f80(x86_fp80)

 // CHECK-YES-LABEL: define void @test_pow
 // CHECK-NO-LABEL: define void @test_pow
 void test_pow(float a0, double a1, long double a2) {
   // CHECK-YES: call float @powf
[41 lines not shown]

 // Just checking to make sure these library functions are marked readnone
 void test_builtins(double d, float f, long double ld) {
   // CHECK-NO: @test_builtins
   // CHECK-YES: @test_builtins
   double atan_ = atan(d);
   long double atanl_ = atanl(ld);
   float atanf_ = atanf(f);
-  // CHECK-NO: declare double @atan(double) [[NUW_RN]]
+  // CHECK-NO: declare double @atan(double) [[NUW_RN:#[0-9]+]]
   // CHECK-NO: declare x86_fp80 @atanl(x86_fp80) [[NUW_RN]]
   // CHECK-NO: declare float @atanf(float) [[NUW_RN]]
   // CHECK-YES-NOT: declare double @atan(double) [[NUW_RN]]
   // CHECK-YES-NOT: declare x86_fp80 @atanl(x86_fp80) [[NUW_RN]]
   // CHECK-YES-NOT: declare float @atanf(float) [[NUW_RN]]

   double atan2_ = atan2(d, 2);
   long double atan2l_ = atan2l(ld, ld);
[23 lines not shown]
   // CHECK-NO: declare float @logf(float) [[NUW_RN]]
   // CHECK-YES-NOT: declare double @log(double) [[NUW_RN]]
   // CHECK-YES-NOT: declare x86_fp80 @logl(x86_fp80) [[NUW_RN]]
   // CHECK-YES-NOT: declare float @logf(float) [[NUW_RN]]
 }

 // CHECK-YES: attributes [[NUW_RN]] = { nounwind readnone speculatable }

-// CHECK-NO: attributes [[NUW_RN]] = { nounwind readnone{{.*}} }
-// CHECK-NO: attributes [[NUW_RNI]] = { nounwind readnone speculatable }
+// CHECK-NO-DAG: attributes [[NUW_RN]] = { nounwind readnone{{.*}} }
+// CHECK-NO-DAG: attributes [[NUW_RNI]] = { nounwind readnone speculatable }