This is an archive of the discontinued LLVM Phabricator instance.

CodeGen: Emit sqrt intrinsic from __builtin_sqrt
Needs Review, Public

Authored by • tstellarAMD on Mar 19 2015, 2:51 PM.


Details

Reviewers

hfinkel
doug.gregor

Summary

We need to guard the sqrt intrinsic with a check for x < -0.0, because the intrinsic has undefined behavior in this case.

Diff Detail

Repository: rL LLVM

Event Timeline

• tstellarAMD updated this revision to Diff 22308. Mar 19 2015, 2:51 PM

• tstellarAMD retitled this revision to CodeGen: Emit sqrt intrinsic from __builtin_sqrt.

• tstellarAMD updated this object.

• tstellarAMD edited the test plan for this revision.

• tstellarAMD added reviewers: doug.gregor, hfinkel.

• tstellarAMD set the repository for this revision to rL LLVM.

• tstellarAMD added a subscriber: Unknown Object (MLST).


To play devil's advocate: I don't understand what you're trying to fix here. So the intrinsic has undefined behavior if x < -0.0; that's the definition of our intrinsic.

Regardless, you don't need to insert the extra code if we're in NoNaNs mode.

In D8468#143776, @hfinkel wrote:

To play devil's advocate: I don't understand what you're trying to fix here. So the intrinsic has undefined behavior if x < -0.0; that's the definition of our intrinsic.

The check is there to avoid the undefined behavior. My understanding is that __builtin_sqrtf is supposed to return NaN for x < -0.0, so we need to handle that case specially if we emit the intrinsic.

Regardless, you don't need to insert the extra code if we're in NoNaNs mode.

In D8468#143819, @tstellarAMD wrote:

In D8468#143776, @hfinkel wrote:

To play devil's advocate: I don't understand what you're trying to fix here. So the intrinsic has undefined behavior if x < -0.0; that's the definition of our intrinsic.

The check is there to avoid the undefined behavior. My understanding is that __builtin_sqrtf is supposed to return NaN for x < -0.0, so we need to handle that case specially if we emit the intrinsic.

Right, I just want to understand what "supposed to" means? (what GCC does?)

Regardless, you don't need to insert the extra code if we're in NoNaNs mode.

In D8468#143820, @hfinkel wrote:

In D8468#143819, @tstellarAMD wrote:

In D8468#143776, @hfinkel wrote:

To play devil's advocate: I don't understand what you're trying to fix here. So the intrinsic has undefined behavior if x < -0.0; that's the definition of our intrinsic.

The check is there to avoid the undefined behavior. My understanding is that __builtin_sqrtf is supposed to return NaN for x < -0.0, so we need to handle that case specially if we emit the intrinsic.

Right, I just want to understand what "supposed to" means? (what GCC does?)

No, it's what libm does. Aren't all the __builtin prefixed library calls supposed to be identical to the library call except that they don't set errno?

Regardless, you don't need to insert the extra code if we're in NoNaNs mode.

In D8468#143838, @tstellarAMD wrote:

In D8468#143820, @hfinkel wrote:

In D8468#143819, @tstellarAMD wrote:

In D8468#143776, @hfinkel wrote:

To play devil's advocate: I don't understand what you're trying to fix here. So the intrinsic has undefined behavior if x < -0.0; that's the definition of our intrinsic.

The check is there to avoid the undefined behavior. My understanding is that __builtin_sqrtf is supposed to return NaN for x < -0.0, so we need to handle that case specially if we emit the intrinsic.

Right, I just want to understand what "supposed to" means? (what GCC does?)

No, it's what libm does. Aren't all the __builtin prefixed library calls supposed to be identical to the library call except that they don't set errno?

At least in practice, no. Here's the problem: for all of the math builtins, it is possible that the target will natively support the operation, and so you'll get the underlying operation; this is the same as what the library function would do, except that errno is untouched. However, if the target has no such support, then you do get a call to the library function, and errno might be changed. Regardless, for all such builtins, we mark the corresponding call as readonly/readnone, as if it will never set errno (and thus, if it does, we can miscompile code by reordering the call with some other call, to open() for example, that sets errno in a way we care about). Thus, the fact that the builtin call does not set errno is not a structural guarantee but a contract: the user promises never to call the builtin with an argument such that the library call would set errno. Otherwise, the behavior is undefined. In return, the compiler will optimize as though the call will never set errno.

IMHO, what we really need here is better documentation on this point.
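To make the contract above concrete, here is a minimal sketch of what the library call may do for a negative argument. The names SqrtResult and library_sqrt are hypothetical and not part of Clang or libm. Whether errno is actually written depends on the platform's math_errhandling and on flags such as -fno-math-errno, so the sketch only records the value and does not assume it.

```cpp
#include <cerrno>
#include <cmath>

// Hypothetical helper type: the sqrt result plus the errno value observed
// immediately after the call.
struct SqrtResult {
    double value;
    int err;
};

// Hypothetical helper: calls the library sqrt and captures errno.
SqrtResult library_sqrt(double x) {
    volatile double input = x;  // discourage compile-time folding of the call
    errno = 0;                  // clear any stale error code first
    double value = std::sqrt(input);
    // On implementations that report math errors through errno, a negative
    // input leaves EDOM here; on others errno stays 0.
    return {value, errno};
}
```

Calling library_sqrt(-1.0) yields a NaN value, and err is either EDOM or 0 depending on the implementation. A caller who had just seen open() fail and then evaluated sqrt before reading errno could therefore lose the original error code, which is the kind of interaction the readonly/readnone marking assumes away.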

Regardless, you don't need to insert the extra code if we're in NoNaNs mode.

Revision Contents

Path                         Size
lib/CodeGen/CGBuiltin.cpp    16 lines
test/CodeGen/builtins.c      22 lines

Diff 22308

lib/CodeGen/CGBuiltin.cpp

Context not available.
     return RValue::get(Builder.CreateCall(F, Arg0));
   }

+  case Builtin::BI__builtin_sqrt:
+  case Builtin::BI__builtin_sqrtf:
+  case Builtin::BI__builtin_sqrtl: {
+    // Lib functions with the __builtin prefix don't set errno, so we can safely
+    // use the intrinsic here. We just need to add a check for x < -0.0.
+    Value *Arg0 = EmitScalarExpr(E->getArg(0));
+    llvm::Type *ArgType = Arg0->getType();
+    Value *F = CGM.getIntrinsic(Intrinsic::sqrt, ArgType);
+    Value *Sqrt = Builder.CreateCall(F, Arg0);
+    Value *Cmp = Builder.CreateFCmpOLT(Arg0,
+                                       ConstantFP::getNegativeZero(ArgType));
+    return RValue::get(Builder.CreateSelect(Cmp, ConstantFP::getNaN(ArgType),
+                                            Sqrt));
+
+  }

   case Builtin::BI__builtin_pow:
   case Builtin::BI__builtin_powf:
   case Builtin::BI__builtin_powl:
Context not available.
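For readers following the thread, the guard added in this hunk can be modeled in scalar C++. This is only a sketch under stated assumptions: guarded_sqrt is a hypothetical name, std::sqrt stands in for the llvm.sqrt call, and double stands in for all three argument types.

```cpp
#include <cmath>
#include <limits>

// Hypothetical scalar model of the fcmp/select sequence built by the patch.
// Not part of Clang or LLVM.
double guarded_sqrt(double x) {
    // Stand-in for the llvm.sqrt call. The value is computed unconditionally,
    // just as the patch emits the intrinsic call before the select.
    double root = std::sqrt(x);
    // Mirrors `fcmp olt x, -0.0`: an ordered less-than, so it is false for
    // NaN inputs and false for -0.0 itself (-0.0 compares equal to -0.0).
    bool below_negative_zero = x < -0.0;
    // Mirrors the select: NaN for negative inputs, the root otherwise.
    return below_negative_zero ? std::numeric_limits<double>::quiet_NaN() : root;
}
```

Under this model, negative inputs give NaN, while -0.0 and NaN inputs fall through to the plain square root. Note that the select picks a result after the intrinsic has already been called with the negative argument; it does not skip the call.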

test/CodeGen/builtins.c

Context not available.
   // CHECK: call i64 @llvm.readcyclecounter()
   return __builtin_readcyclecounter();
 }

+// CHECK-LABEL: define void @test_float_builtin_libcalls
+void test_float_builtin_libcalls(float F, double D, long double LD) {
+  volatile float resf;
+  volatile double resd;
+  volatile long double resld;
+
+  resf = __builtin_sqrtf(F);
+  // CHECK-DAG: [[SQRTF:%.*]] = call float @llvm.sqrt.f32(float
+  // CHECK-DAG: [[FCMPF:%.*]] = fcmp olt float {{.*}}, -0.000000e+00
+  // CHECK: select i1 [[FCMPF]], float 0x7FF8000000000000, float [[SQRTF]]
+
+  resd = __builtin_sqrt(D);
+  // CHECK-DAG: [[SQRTD:%.*]] = call double @llvm.sqrt.f64(double
+  // CHECK-DAG: [[FCMPD:%.*]] = fcmp olt double {{.*}}, -0.000000e+00
+  // CHECK: select i1 [[FCMPD]], double 0x7FF8000000000000, double [[SQRTD]]
+
+  resld = __builtin_sqrtl(LD);
+  // CHECK-DAG: [[SQRTL:%.*]] = call x86_fp80 @llvm.sqrt.f80(x86_fp80
+  // CHECK-DAG: [[FCMPL:%.*]] = fcmp olt x86_fp80 {{.*}}, 0xK80000000000000000000
+  // CHECK: select i1 [[FCMPL]], x86_fp80 0xK7FFFC000000000000000, x86_fp80 [[SQRTL]]
+}
Context not available.
